mavis.assemble

class mavis.assemble.Contig

mavis.assemble.Contig.remap_depth()

the average depth of remapped reads over a given range of the contig sequence

def remap_depth(self, query_range=None):

Args

  • query_range (Interval): 1-based inclusive range
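As a rough illustration of what "average depth over a range" means, the sketch below computes mean per-base coverage from read alignment spans. The function name `average_depth` and the plain `(start, end)` tuples are assumptions made for this example; they are not part of the mavis API, and the real method works on the contig's own remapped reads.

```python
def average_depth(read_spans, query_start, query_end):
    """Mean per-base coverage over a 1-based inclusive range.

    read_spans is a hypothetical list of (start, end) tuples, also
    1-based inclusive, standing in for the remapped read positions.
    """
    if query_start > query_end:
        raise ValueError("query range is empty")
    covered_bases = 0
    for start, end in read_spans:
        # number of bases this read shares with the query range
        overlap = min(end, query_end) - max(start, query_start) + 1
        if overlap > 0:
            covered_bases += overlap
    return covered_bases / (query_end - query_start + 1)


print(average_depth([(1, 10), (6, 15)], 1, 10))  # 1.5
```

The first read covers all 10 queried bases and the second covers 5 of them, so 15 covered bases over a range of 10 gives an average depth of 1.5.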

class mavis.assemble.DeBruijnGraph

inherits nx.DiGraph

wrapper for a basic digraph that enforces edge weights

mavis.assemble.DeBruijnGraph.get_edge_freq()

returns the freq from the data attribute for a specified edge

def get_edge_freq(self, n1, n2):

Args

  • n1
  • n2

mavis.assemble.DeBruijnGraph.add_edge()

adds the given edge to the graph; if the edge already exists, adds freq to its existing frequency count

def add_edge(self, n1, n2, freq=1):

Args

  • n1
  • n2
  • freq
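The accumulate-on-repeat behaviour described for add_edge() and get_edge_freq() can be sketched without networkx. The class `EdgeFreqGraph` below is a hypothetical stand-in written for this example, not the DeBruijnGraph implementation; it keeps frequencies in a plain dictionary keyed by the edge.

```python
from collections import defaultdict


class EdgeFreqGraph:
    """Hypothetical stand-in for a digraph that tracks edge frequencies."""

    def __init__(self):
        self.freq = defaultdict(int)

    def add_edge(self, n1, n2, freq=1):
        # a repeated edge adds to the existing count instead of replacing it
        self.freq[(n1, n2)] += freq

    def get_edge_freq(self, n1, n2):
        if (n1, n2) not in self.freq:
            raise KeyError("missing edge", n1, n2)
        return self.freq[(n1, n2)]


graph = EdgeFreqGraph()
graph.add_edge("AC", "CG")
graph.add_edge("AC", "CG", freq=2)
print(graph.get_edge_freq("AC", "CG"))  # 3
```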

mavis.assemble.DeBruijnGraph.trim_tails_by_freq()

trims any tail paths in which every edge weight is lower than the minimum weight

def trim_tails_by_freq(self, min_weight):

Args

  • min_weight (int): the minimum weight for an edge to be retained

mavis.assemble.DeBruijnGraph.trim_forks_by_freq()

for all nodes in the graph, if the node has an out-degree > 1 and one of the outgoing edges has freq < min_weight, then that outgoing edge is deleted

def trim_forks_by_freq(self, min_weight):

Args

  • min_weight
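A literal reading of the fork-trimming rule above can be sketched on a plain dictionary of edge frequencies. The function `trim_forks` and its dictionary input are assumptions made for this example. It does not try to reproduce details of the library method that the description leaves open, such as what happens when every outgoing edge of a fork is below the threshold.

```python
def trim_forks(edge_freq, min_weight):
    """Return a copy of edge_freq with low-frequency fork edges removed.

    edge_freq is a hypothetical {(source, target): freq} mapping.
    """
    outgoing = {}
    for (src, tgt), freq in edge_freq.items():
        outgoing.setdefault(src, []).append((tgt, freq))

    kept = dict(edge_freq)
    for src, targets in outgoing.items():
        if len(targets) > 1:  # only nodes that fork are considered
            for tgt, freq in targets:
                if freq < min_weight:
                    del kept[(src, tgt)]
    return kept


edges = {("A", "B"): 5, ("A", "C"): 1, ("B", "D"): 1}
print(trim_forks(edges, min_weight=3))
# {('A', 'B'): 5, ('B', 'D'): 1}
```

The edge B -> D survives despite its low frequency because B has a single outgoing edge and is therefore not a fork.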

mavis.assemble.DeBruijnGraph.trim_noncutting_paths_by_freq()

trims any low-weight edge where another, higher-weight path exists between its source and target

def trim_noncutting_paths_by_freq(self, min_weight):

Args

  • min_weight

mavis.assemble.DeBruijnGraph.get_sinks()

returns all nodes with an outgoing degree of zero

def get_sinks(self, subgraph=None):

Args

  • subgraph

mavis.assemble.DeBruijnGraph.get_sources()

returns all nodes with an incoming degree of zero

def get_sources(self, subgraph=None):

Args

  • subgraph
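The definitions of sources (incoming degree of zero) and sinks (outgoing degree of zero) can be illustrated on a plain edge list. The helper `sources_and_sinks` is hypothetical and written for this example only; it ignores the optional subgraph argument of the real methods.

```python
def sources_and_sinks(edges):
    """Return (sources, sinks) for a list of (source, target) edges."""
    nodes, has_incoming, has_outgoing = set(), set(), set()
    for src, tgt in edges:
        nodes.update((src, tgt))
        has_outgoing.add(src)
        has_incoming.add(tgt)
    # sources have no incoming edge, sinks have no outgoing edge
    return nodes - has_incoming, nodes - has_outgoing


sources, sinks = sources_and_sinks([("A", "B"), ("B", "C"), ("D", "C")])
print(sorted(sources), sorted(sinks))  # ['A', 'D'] ['C']
```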

mavis.assemble.digraph_connected_components()

the networkx module does not support deriving connected components from digraphs (only simple graphs); this function assumes that connection != reachable, which means there is no difference between connected components in a simple graph and a digraph

def digraph_connected_components(graph, subgraph=None):

Args

  • graph (networkx.DiGraph): the input graph to gather components from
  • subgraph

Returns

  • List[List]: a list of components, each of which is a list of node names
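Treating connection as independent of edge direction amounts to finding components of the underlying undirected graph. The sketch below shows that idea with a depth-first traversal over plain Python containers. The function `weak_components` and its inputs are assumptions for this example and do not mirror the signature of digraph_connected_components().

```python
def weak_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    neighbours = {node: set() for node in nodes}
    for src, tgt in edges:
        # record the edge in both directions so direction is ignored
        neighbours.setdefault(src, set()).add(tgt)
        neighbours.setdefault(tgt, set()).add(src)

    unvisited = set(neighbours)
    components = []
    while unvisited:
        stack = [unvisited.pop()]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbours[node]:
                if other in unvisited:
                    unvisited.remove(other)
                    stack.append(other)
        components.append(sorted(component))
    return sorted(components)


print(weak_components("ABCDE", [("A", "B"), ("C", "B"), ("D", "E")]))
# [['A', 'B', 'C'], ['D', 'E']]
```

Node C is not reachable from A along directed edges, yet both land in the same component because they share the neighbour B.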

mavis.assemble.pull_contigs_from_component()

builds contigs from a connected component of the assembly DeBruijn graph

def pull_contigs_from_component(
    assembly, component, min_edge_trim_weight, assembly_max_paths, log=DEVNULL
):

Args

  • assembly (DeBruijnGraph): the assembly graph
  • component (list): list of nodes which make up the connected component
  • min_edge_trim_weight (int): the minimum weight to not remove a non cutting edge/path
  • assembly_max_paths (int): the maximum number of paths allowed before the graph is further simplified
  • log (Callable): the log function

Returns

  • Dict[str,int]: the paths/contigs and their scores

mavis.assemble.filter_contigs()

given a list of contigs, removes similar contigs, leaving only the highest-scoring contig of each group of similar contigs

def filter_contigs(contigs, assembly_min_uniq=0.01):

Args

  • contigs
  • assembly_min_uniq

mavis.assemble.assemble()

for a set of sequences, creates a DeBruijnGraph, simplifies trailing and leading paths where edges fall below a weight threshold, and then returns all possible unitigs/contigs

drops any sequences too small to fit the kmer size

def assemble(
    sequences,
    kmer_size,
    min_edge_trim_weight=3,
    assembly_max_paths=20,
    assembly_min_uniq=0.01,
    min_complexity=0,
    log=lambda *pos, **kwargs: None,
    **kwargs
):

Args

Returns

  • List[Contig]: a list of putative contigs
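The graph-building step can be sketched as follows, under a common De Bruijn convention in which each kmer contributes one edge from its prefix to its suffix (both of length kmer_size - 1). Whether mavis uses exactly this node layout is an assumption here. The function `build_edge_counts` is hypothetical, and the trimming and path-enumeration stages are not shown.

```python
from collections import Counter


def build_edge_counts(sequences, kmer_size):
    """Count prefix -> suffix edges for every kmer in the sequences."""
    edge_freq = Counter()
    for seq in sequences:
        if len(seq) < kmer_size:
            continue  # drop sequences too small to fit the kmer size
        for i in range(len(seq) - kmer_size + 1):
            kmer = seq[i:i + kmer_size]
            # assumed convention: nodes are (k-1)-mers, one edge per kmer
            edge_freq[(kmer[:-1], kmer[1:])] += 1
    return edge_freq


counts = build_edge_counts(["ACGT", "ACGA", "AC"], kmer_size=3)
print(counts[("AC", "CG")], counts[("CG", "GT")], counts[("CG", "GA")])
# 2 1 1
```

With min_edge_trim_weight=3, all three edges in this toy input would count as low weight, which is why real inputs need enough read depth for the trimming thresholds to be meaningful.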

mavis.assemble.kmers()

for a sequence, compute and return a list of all kmers of a specified size

def kmers(s, size):

Args

  • s (str): the input sequence
  • size (int): the size of the kmers

Returns

  • List[str]: the list of kmers

Examples

>>> kmers('abcdef', 2)
['ab', 'bc', 'cd', 'de', 'ef']
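
A sliding-window sketch that reproduces the example above is given below. It is one straightforward way to compute the kmers and is not necessarily how the library implements the function. A sequence of length n yields n - size + 1 kmers, and a sequence shorter than size yields none.

```python
def kmers(s, size):
    """Return every substring of length size, in order of position."""
    return [s[i:i + size] for i in range(len(s) - size + 1)]


print(kmers("abcdef", 2))  # ['ab', 'bc', 'cd', 'de', 'ef']
print(kmers("ab", 3))      # []
```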