mavis.annotate.variant

class mavis.annotate.variant.Annotation

inherits BreakpointPair

A fusion of two transcripts created by the associated breakpoint_pair. Also holds the other annotations for overlapping, encompassed, and nearest genes.

mavis.annotate.variant.Annotation.__init__()

Holds a breakpoint call and a set of transcripts. Other information is gathered relative to these.

def __init__(
    self, bpp, transcript1=None, transcript2=None, proximity=5000, data=None, **kwargs
):

Args

  • bpp (BreakpointPair): the breakpoint pair call. Will be adjusted and then stored based on the transcripts
  • transcript1 (Transcript): transcript at the first breakpoint
  • transcript2 (Transcript): transcript at the second breakpoint
  • proximity: defaults to 5000
  • data (dict): optional dictionary to hold related attributes

mavis.annotate.variant.Annotation.add_gene()

Adds an input_gene to the current set of annotations, after checking which set it should be added to.

def add_gene(self, input_gene):

Args

  • input_gene (Gene): the gene being added

mavis.annotate.variant.Annotation.flatten()

generates a dictionary of the annotation information as strings

def flatten(self):

Returns

  • Dict[str,str]: dictionary of attribute names and values
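
As an illustration of what "flattening to strings" can look like, here is a minimal sketch. It is not the MAVIS implementation: the function name `flatten_attributes`, the `;` delimiter for collections, and the sorting of collection members are all assumptions made for this example.

```python
from typing import Any, Dict


def flatten_attributes(attributes: Dict[str, Any]) -> Dict[str, str]:
    """Return a new dict with every value rendered as a string.

    Hypothetical helper for illustration; not part of mavis.
    """
    flat = {}
    for name, value in attributes.items():
        if isinstance(value, (list, tuple, set, frozenset)):
            # Assumed convention: join collection members with ';' in sorted order
            flat[name] = ";".join(sorted(str(item) for item in value))
        else:
            flat[name] = str(value)
    return flat
```

Rendering every value as a string up front means each row can be written to a tab-delimited file without further type handling.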

class mavis.annotate.variant.IndelCall

mavis.annotate.variant.IndelCall.__init__()

Given two sequences, and assuming there is a single difference between them, calls an indel that accounts for the change.

def __init__(self, refseq, mutseq):

Args

  • refseq (str): The reference (amino acid) sequence
  • mutseq (str): The mutated (amino acid) sequence
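
One common way to find a single contiguous difference between two sequences is to trim their shared prefix and shared suffix; whatever remains in the reference is the deleted segment and whatever remains in the mutant is the inserted segment. The sketch below shows that general technique only. `SimpleIndel` and `call_single_indel` are hypothetical names, and this is not claimed to be how IndelCall is implemented.

```python
from typing import NamedTuple


class SimpleIndel(NamedTuple):
    """Hypothetical result container; not the mavis IndelCall class."""

    start: int  # 1-based position of the first differing residue
    deleted: str  # reference residues that were removed
    inserted: str  # mutant residues that were added


def call_single_indel(refseq: str, mutseq: str) -> SimpleIndel:
    """Describe the one contiguous difference between two sequences."""
    limit = min(len(refseq), len(mutseq))

    # Count residues shared at the start of both sequences
    prefix = 0
    while prefix < limit and refseq[prefix] == mutseq[prefix]:
        prefix += 1

    # Count residues shared at the end, without overlapping the prefix
    suffix = 0
    while suffix < limit - prefix and refseq[-1 - suffix] == mutseq[-1 - suffix]:
        suffix += 1

    deleted = refseq[prefix : len(refseq) - suffix]
    inserted = mutseq[prefix : len(mutseq) - suffix]
    return SimpleIndel(prefix + 1, deleted, inserted)
```

For example, `call_single_indel("MKTAYIAK", "MKTGGIAK")` reports that `AY` starting at position 4 was replaced by `GG`. Because the prefix is matched first, an insertion inside a repeat is placed at the right-most possible position.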

mavis.annotate.variant.IndelCall.hgvs_protein_notation()

returns the HGVS protein notation for an indel call

def hgvs_protein_notation(self):

mavis.annotate.variant.flatten_fusion_translation()

For a given fusion product (translation), gathers the information to be output to the tabbed files.

def flatten_fusion_translation(translation):

Args

  • translation (Translation): the translation which is on the fusion transcript

Returns

  • dict: the dictionary of column names to values

mavis.annotate.variant.call_protein_indel()

Compares the fusion protein/aa sequence to the reference protein/aa sequence and returns an HGVS notation indel call.

def call_protein_indel(ref_translation, fusion_translation, reference_genome=None):

Args

  • ref_translation (Translation): the reference protein/translation
  • fusion_translation (Translation): the fusion protein/translation
  • reference_genome: the reference genome object used to fetch the reference translation AA sequence

Returns

  • str: the HGVS protein indel notation

mavis.annotate.variant.choose_more_annotated()

For a given set of annotations, if some annotations contain transcripts and others are simply intergenic regions, discards the intergenic region annotations.

Similarly, if there are annotations where both breakpoints fall in a transcript and annotations where one or more breakpoints land in an intergenic region, discards those that land in the intergenic region.

def choose_more_annotated(ann_list):

Args

  • ann_list (List[Annotation]): list of input annotations

Returns

  • List[Annotation]: the filtered list

Warning

Input annotations are assumed to be the same event (the same validation_id). The logic used would not apply to different events.
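
The filtering rule described above can be read as a tiered preference: annotations with transcripts at both breakpoints beat those with one, which beat purely intergenic ones. The following is a minimal sketch of that reading using a stand-in class. `TieredAnn` and `keep_more_annotated` are hypothetical names, and representing an intergenic breakpoint as `None` is an assumption of this example, not the MAVIS data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TieredAnn:
    """Hypothetical stand-in for an annotation of a single event."""

    transcript1: Optional[str]  # None means the breakpoint is intergenic (assumption)
    transcript2: Optional[str]


def keep_more_annotated(ann_list: List[TieredAnn]) -> List[TieredAnn]:
    """Keep only the annotations in the most annotated non-empty tier."""
    none, one, both = [], [], []
    for ann in ann_list:
        # Number of breakpoints (0, 1, or 2) that land in a transcript
        hits = sum(t is not None for t in (ann.transcript1, ann.transcript2))
        (none, one, both)[hits].append(ann)
    # Fall back to a less annotated tier only when the better ones are empty
    return both or one or none
```

As the warning notes, a preference like this only makes sense when every annotation in the list describes the same event.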

mavis.annotate.variant.choose_transcripts_by_priority()

For each set of annotations with the same combination of genes, chooses the annotation with the most "best_transcripts" or the most "alphanumeric" choice of transcript. Throws an error if they are identical.

def choose_transcripts_by_priority(ann_list):

Args

  • ann_list (List[Annotation]): input annotations

Returns

  • List[Annotation]: the filtered list

Warning

Input annotations are assumed to be the same event (the same validation_id). The logic used would not apply to different events.
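
One possible reading of the selection rule is sketched below: group annotations by gene pair, rank each group by the number of best transcripts it uses, break ties by transcript name in alphanumeric order, and raise if the top two candidates still cannot be separated. The class `GeneAnn`, its fields, and the function `pick_by_priority` are hypothetical and exist only for this example; the exact ranking used by MAVIS may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List


@dataclass
class GeneAnn:
    """Hypothetical stand-in for an annotation of a single event."""

    gene1: str
    gene2: str
    transcript1: str
    transcript2: str
    best1: bool = False  # transcript1 is flagged as a best transcript (assumption)
    best2: bool = False  # transcript2 is flagged as a best transcript (assumption)


def _priority(ann: GeneAnn):
    # More best transcripts sorts first, then alphanumeric transcript names
    return (-(ann.best1 + ann.best2), ann.transcript1, ann.transcript2)


def pick_by_priority(ann_list: List[GeneAnn]) -> List[GeneAnn]:
    """Return one annotation per gene pair, chosen by priority."""
    groups = defaultdict(list)
    for ann in ann_list:
        groups[(ann.gene1, ann.gene2)].append(ann)

    chosen = []
    for group in groups.values():
        ranked = sorted(group, key=_priority)
        if len(ranked) > 1 and _priority(ranked[0]) == _priority(ranked[1]):
            raise ValueError("annotations have identical priority")
        chosen.append(ranked[0])
    return chosen
```

Raising on an exact tie, rather than picking arbitrarily, keeps the output deterministic and surfaces duplicated input early.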