
mavis/util

ENV_VAR_PREFIX

ENV_VAR_PREFIX = 'MAVIS_'

logger

logger = logging.getLogger('mavis')

class NullableType

cast()

Cast a value to a given type.

def cast(value, cast_func):

Args

  • value
  • cast_func

Examples

>>> cast('1', int)
1

soft_cast()

Cast a value to a given type. If the cast fails, the value is cast to null instead.

def soft_cast(value, cast_type):

Args

  • value
  • cast_type

Examples

>>> soft_cast(None, int)
None
>>> soft_cast('', int)
None
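The fallback behaviour described above can be illustrated with a minimal sketch. `soft_cast_sketch` is a hypothetical stand-in written for this page, not the library function: it assumes that "cast to null" means returning `None` whenever the cast raises `TypeError` or `ValueError`, and the real `soft_cast` may treat some inputs differently.

```python
def soft_cast_sketch(value, cast_type):
    """Hypothetical stand-in: try the cast, fall back to None on failure."""
    try:
        return cast_type(value)
    except (TypeError, ValueError):
        # Assumption: a failed cast yields None rather than propagating the error
        return None


converted = soft_cast_sketch('1', int)
from_none = soft_cast_sketch(None, int)   # int(None) raises TypeError
from_empty = soft_cast_sketch('', int)    # int('') raises ValueError
```

The design choice here is that callers get a usable null value for missing or empty fields instead of having to wrap every cast in their own `try` block.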

log_arguments()

Output the arguments to the console.

def log_arguments(args):

Args

  • args (Namespace): the namespace to print arguments for

mkdirp()

Make a directory or a path of directories. Suppresses the error that is normally raised when the directory already exists.

def mkdirp(dirname):

Args

  • dirname
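As an illustration of the behaviour described above, the following is a minimal sketch built on the standard library. `mkdirp_sketch` is a hypothetical name, and using `os.makedirs(..., exist_ok=True)` is an assumption about how the effect could be achieved, not a statement of how `mkdirp` is implemented.

```python
import os
import tempfile


def mkdirp_sketch(dirname):
    """Hypothetical stand-in: create dirname and any missing parent directories."""
    # exist_ok=True suppresses FileExistsError when the directory is already there
    os.makedirs(dirname, exist_ok=True)
    return dirname


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, 'a', 'b', 'c')
    mkdirp_sketch(target)
    mkdirp_sketch(target)  # the second call does not raise
    created = os.path.isdir(target)
```

Calling the function twice on the same path shows the point of the helper: the repeat call is harmless.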

filter_on_overlap()

Filter a set of breakpoint pairs based on overlap with a set of genomic regions.

def filter_on_overlap(
    bpps: List[BreakpointPair], regions_by_reference_name: Dict[str, List['BioInterval']]
):

Args

  • bpps (List[BreakpointPair]): list of breakpoint pairs to be filtered
  • regions_by_reference_name (Dict[str, List[BioInterval]]): regions to filter against

get_connected_components()

Returns the connected components of a graph, given a dictionary representing an adjacency matrix of undirected edges.

def get_connected_components(adj_matrix):

Args

  • adj_matrix
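The idea of extracting connected components from an adjacency dictionary can be sketched as follows. This is an illustration only: `connected_components_sketch` is a hypothetical name, and the input shape (each node mapped to the set of its neighbours, with every undirected edge listed in both directions) is an assumption rather than the documented contract of `adj_matrix`.

```python
def connected_components_sketch(adjacency):
    """Hypothetical stand-in: group nodes reachable from one another.

    adjacency maps each node to the set of its neighbours; edges are
    assumed to be listed symmetrically (a in adjacency[b] and vice versa).
    """
    visited = set()
    components = []
    for start in adjacency:
        if start in visited:
            continue
        component = set()
        stack = [start]
        # Depth-first traversal from the first unvisited node
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            component.add(node)
            stack.extend(n for n in adjacency.get(node, ()) if n not in visited)
        components.append(component)
    return components


graph = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}
components = connected_components_sketch(graph)
```

With this input the traversal yields three components: nodes 1 to 3, nodes 4 and 5, and the isolated node 6.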

generate_complete_stamp()

Writes a complete stamp, optionally including the run time if start_time is given.

def generate_complete_stamp(
    output_dir: str, prefix: str = 'MAVIS.', start_time: Optional[int] = None
) -> str:

Args

  • output_dir (str): path to the output dir the stamp should be written in
  • prefix (str): prefix for the stamp name
  • start_time (Optional[int]): the start time

Returns

  • str

Examples

>>> generate_complete_stamp('some_output_dir')
'some_output_dir/MAVIS.COMPLETE'
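To make the stamp convention concrete, here is a minimal sketch under stated assumptions. `complete_stamp_sketch` is a hypothetical stand-in: the file name (`prefix` followed by `COMPLETE`) follows the example above, while the text written into the file when `start_time` is given is an assumed format, not the library's documented output.

```python
import os
import tempfile
import time


def complete_stamp_sketch(output_dir, prefix='MAVIS.', start_time=None):
    """Hypothetical stand-in: write '<prefix>COMPLETE' inside output_dir."""
    stamp = os.path.join(output_dir, prefix + 'COMPLETE')
    with open(stamp, 'w') as fh:
        if start_time is not None:
            # Assumed content format; the real stamp layout is not documented here
            fh.write('run time (s): {}\n'.format(int(time.time()) - start_time))
    return stamp


with tempfile.TemporaryDirectory() as output_dir:
    stamp = complete_stamp_sketch(output_dir, start_time=int(time.time()))
    stamp_name = os.path.basename(stamp)
    stamp_exists = os.path.isfile(stamp)
    with open(stamp) as fh:
        stamp_text = fh.read()
```

A stamp file like this lets a later pipeline step check for completion by testing whether the file exists, without parsing any logs.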

read_bpp_from_input_file()

Reads a file using the tab module. Each row is converted to a breakpoint pair, and the remaining column data is stored in the data attribute.

def read_bpp_from_input_file(
    filename: str,
    expand_orient: bool = False,
    expand_strand: bool = False,
    expand_svtype: bool = False,
    integer_columns: Set[str] = INTEGER_COLUMNS,
    float_columns: Set[str] = FLOAT_COLUMNS,
    required_columns: Set[str] = set(),
    add_default: Dict[str, Any] = {},
    summary: bool = False,
    apply: Dict[str, Callable] = {},
    overwrite: Dict[str, Any] = {},
) -> List[BreakpointPair]:

Args

  • filename (str): path to the input file
  • expand_orient (bool)
  • expand_strand (bool)
  • expand_svtype (bool)
  • integer_columns (Set[str])
  • float_columns (Set[str])
  • required_columns (Set[str])
  • add_default (Dict[str, Any])
  • summary (bool): the input is post-summary, so some float/int columns have been merged and delimited with semicolons
  • apply (Dict[str, Callable])
  • overwrite (Dict[str, Any]): set column values for all breakpoints; if the column exists, overwrite its current value

Returns