mavis/annotate/variant

class Annotation

inherits BreakpointPair

A fusion of two transcripts created by the associated breakpoint_pair. It also holds the other annotations for the overlapping, encompassed, and nearest genes.

Attributes

  • encompassed_genes (Set[Gene])
  • genes_proximal_to_break1 (Set[Gene])
  • genes_proximal_to_break2 (Set[Gene])
  • genes_overlapping_break1 (Set[Gene])
  • genes_overlapping_break2 (Set[Gene])
  • proximity (int)
  • fusion (Optional[FusionTranscript])
  • transcript1 (Optional[Transcript])
  • transcript2 (Optional[Transcript])

Annotation.__init__()

Holds a breakpoint call and a set of transcripts; other information is gathered relative to these.

def __init__(
    self,
    bpp: BreakpointPair,
    transcript1: Optional[Transcript] = None,
    transcript2: Optional[Transcript] = None,
    proximity: int = 5000,
    **kwargs,
):

Args

  • bpp (BreakpointPair): the breakpoint pair call. Will be adjusted and then stored based on the transcripts
  • transcript1 (Optional[Transcript]): transcript at the first breakpoint
  • transcript2 (Optional[Transcript]): transcript at the second breakpoint
  • proximity (int)

Annotation.add_gene()

Adds an input gene to the current set of annotations, checking which set it should be added to.

def add_gene(self, input_gene: Gene):

Args

  • input_gene (Gene): the input_gene being added
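
The set-selection step can be illustrated with a minimal sketch. It is not the library's implementation: the `Gene` and `AnnotationSketch` classes below are hypothetical stand-ins, both breakpoints are assumed to lie on the same chromosome with the first before the second, and the rule that a gene lands in a "proximal" set only when it neither overlaps a breakpoint nor sits between them is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical stand-in for a gene: a named closed interval."""
    name: str
    start: int
    end: int


def _overlaps(gene: Gene, bp: Tuple[int, int]) -> bool:
    # Closed intervals overlap when neither ends before the other starts.
    return gene.start <= bp[1] and bp[0] <= gene.end


def _distance(gene: Gene, bp: Tuple[int, int]) -> int:
    # Gap between a gene and a breakpoint interval (0 if they overlap).
    if gene.end < bp[0]:
        return bp[0] - gene.end
    if gene.start > bp[1]:
        return gene.start - bp[1]
    return 0


@dataclass
class AnnotationSketch:
    """Hypothetical, simplified model of the gene sets listed above."""
    break1: Tuple[int, int]
    break2: Tuple[int, int]
    proximity: int = 5000
    encompassed_genes: Set[Gene] = field(default_factory=set)
    genes_overlapping_break1: Set[Gene] = field(default_factory=set)
    genes_overlapping_break2: Set[Gene] = field(default_factory=set)
    genes_proximal_to_break1: Set[Gene] = field(default_factory=set)
    genes_proximal_to_break2: Set[Gene] = field(default_factory=set)

    def add_gene(self, gene: Gene) -> None:
        hit1 = _overlaps(gene, self.break1)
        hit2 = _overlaps(gene, self.break2)
        if hit1:
            self.genes_overlapping_break1.add(gene)
        if hit2:
            self.genes_overlapping_break2.add(gene)
        if hit1 or hit2:
            return
        # Entirely between the two breakpoints: encompassed by the event.
        if self.break1[1] < gene.start and gene.end < self.break2[0]:
            self.encompassed_genes.add(gene)
            return
        # Otherwise record the gene only if it lies within `proximity`.
        if _distance(gene, self.break1) <= self.proximity:
            self.genes_proximal_to_break1.add(gene)
        if _distance(gene, self.break2) <= self.proximity:
            self.genes_proximal_to_break2.add(gene)
```

A gene further than `proximity` from both breakpoints, and not between them, is simply dropped in this sketch.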

Annotation.flatten()

generates a dictionary of the annotation information as strings

def flatten(self) -> Dict:

Returns

  • Dict: dictionary of attribute names and values
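
As a rough illustration of what "attribute values as strings" can look like, the sketch below flattens a plain dictionary of attributes. The helper name `flatten_attributes`, the `;` delimiter for collections, and the rendering of `None` as the string "None" are all assumptions for this example and are not taken from the library.

```python
from typing import Any, Dict


def flatten_attributes(attributes: Dict[str, Any], delimiter: str = ";") -> Dict[str, str]:
    """Hypothetical helper: render every attribute value as a string."""
    row: Dict[str, str] = {}
    for name, value in attributes.items():
        if value is None:
            # Assumed convention: missing values become the literal "None".
            row[name] = "None"
        elif isinstance(value, (set, frozenset, list, tuple)):
            # Sort so that unordered sets give a reproducible column value.
            row[name] = delimiter.join(sorted(str(v) for v in value))
        else:
            row[name] = str(value)
    return row
```

For example, `flatten_attributes({"proximity": 5000, "encompassed_genes": {"B", "A"}})` gives `{"proximity": "5000", "encompassed_genes": "A;B"}`.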

class IndelCall

Attributes

  • nterm_aligned (int)
  • cterm_aligned (int)
  • ref_seq (str)
  • mut_seq (str)
  • ins_seq (str)
  • del_seq (str)
  • is_dup (bool)
  • terminates (bool)

IndelCall.__init__()

Given two sequences, and assuming there is a single difference between them, calls an indel which accounts for the change.

def __init__(self, refseq: str, mutseq: str):

Args

  • refseq (str): The reference (amino acid) sequence
  • mutseq (str): The mutated (amino acid) sequence
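
One common way to call a single indel between two sequences is to trim the shared prefix and the shared suffix and treat whatever remains as the deleted and inserted segments. The sketch below shows that technique under stated assumptions: `IndelSketch` and `call_single_indel` are hypothetical names, the field names are borrowed from the attribute list above, and whether the library computes its attributes this way is not confirmed by this page.

```python
from dataclasses import dataclass


@dataclass
class IndelSketch:
    """Hypothetical result type for the prefix/suffix trimming approach."""
    nterm_aligned: int  # residues matching from the start of both sequences
    cterm_aligned: int  # residues matching from the end of both sequences
    del_seq: str        # reference residues missing from the mutant
    ins_seq: str        # mutant residues missing from the reference


def call_single_indel(refseq: str, mutseq: str) -> IndelSketch:
    limit = min(len(refseq), len(mutseq))

    # Length of the common prefix.
    nterm = 0
    while nterm < limit and refseq[nterm] == mutseq[nterm]:
        nterm += 1

    # Length of the common suffix, capped so it never re-uses prefix residues
    # of the shorter sequence.
    cterm = 0
    while cterm < limit - nterm and refseq[-1 - cterm] == mutseq[-1 - cterm]:
        cterm += 1

    return IndelSketch(
        nterm_aligned=nterm,
        cterm_aligned=cterm,
        del_seq=refseq[nterm:len(refseq) - cterm],
        ins_seq=mutseq[nterm:len(mutseq) - cterm],
    )
```

With `refseq="MKTAYIAK"` and `mutseq="MKTIAK"` the shared prefix is `MKT`, the shared suffix is `IAK`, and the call is a deletion of `AY` with nothing inserted.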

IndelCall.hgvs_protein_notation()

returns the HGVS protein notation for an indel call

def hgvs_protein_notation(self) -> Optional[str]:

Returns

  • Optional[str]
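
To make the shape of the returned string concrete, here is a minimal sketch that formats a deletion, an insertion, or a deletion-insertion in HGVS-style protein notation using one-letter amino acid codes. The function `protein_indel_notation` and its parameters are hypothetical, and the sketch deliberately leaves out duplications, substitutions, frameshifts, and extensions, so its output may differ from what the library returns.

```python
from typing import Optional


def protein_indel_notation(
    refseq: str, prefix_len: int, del_seq: str, ins_seq: str
) -> Optional[str]:
    """Hypothetical formatter.

    prefix_len is the number of residues that match before the change,
    so the first changed reference residue is at 1-based position
    prefix_len + 1.
    """
    if not del_seq and not ins_seq:
        return None  # no change to describe

    if del_seq:
        first = prefix_len + 1
        last = prefix_len + len(del_seq)
        if first == last:
            span = f"{del_seq[0]}{first}"
        else:
            span = f"{del_seq[0]}{first}_{del_seq[-1]}{last}"
        suffix = f"delins{ins_seq}" if ins_seq else "del"
        return f"p.{span}{suffix}"

    # Pure insertion: described by the two reference residues that flank it.
    if prefix_len <= 0 or prefix_len >= len(refseq):
        return None  # insertion at a terminus is not handled in this sketch
    left = refseq[prefix_len - 1]
    right = refseq[prefix_len]
    return f"p.{left}{prefix_len}_{right}{prefix_len + 1}ins{ins_seq}"
```

For instance, deleting `AY` after the first three residues of `MKTAYIAK` is written `p.A4_Y5del`, and inserting `GG` after the first three residues of `MKTIAK` is written `p.T3_I4insGG`.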

flatten_fusion_translation()

For a given fusion product (translation), gathers the information to be output to the tabbed files.

def flatten_fusion_translation(translation: Translation) -> Dict:

Args

  • translation (Translation): the translation which is on the fusion transcript

Returns

  • Dict: the dictionary of column names to values

call_protein_indel()

Compares the fusion protein/aa sequence to the reference protein/aa sequence and returns an indel call in HGVS notation.

def call_protein_indel(
    ref_translation: Translation,
    fusion_translation: Translation,
    reference_genome: Optional[ReferenceGenome] = None,
) -> str:

Args

  • ref_translation (Translation): the reference protein/translation
  • fusion_translation (Translation): the fusion protein/translation
  • reference_genome (Optional[ReferenceGenome]): the reference genome object used to fetch the reference translation AA sequence

Returns

  • str: the HGVS protein indel notation

choose_more_annotated()

For a given set of annotations, if some annotations contain transcripts and others are simply intergenic regions, discards the intergenic region annotations.

Similarly, if there are annotations where both breakpoints fall in a transcript and annotations where one or more breakpoints land in an intergenic region, discards those with a breakpoint in an intergenic region.

def choose_more_annotated(ann_list: List[Annotation]) -> List[Annotation]:

Args

  • ann_list (List[Annotation]): list of input annotations

Returns

  • List[Annotation]: the annotations that remain after filtering

Warning

Input annotations are assumed to be the same event (the same validation_id). The logic used would not apply to different events.
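
The filtering rule described for choose_more_annotated() amounts to keeping only the annotations with the largest number of breakpoints that fall in a transcript. A minimal sketch, using a hypothetical `Ann` stand-in with two optional transcript names rather than the real Annotation class:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ann:
    """Hypothetical stand-in: None means the breakpoint is intergenic."""
    transcript1: Optional[str] = None
    transcript2: Optional[str] = None


def annotated_sides(ann: Ann) -> int:
    # Number of breakpoints (0, 1 or 2) that fall in a transcript.
    return sum(t is not None for t in (ann.transcript1, ann.transcript2))


def keep_most_annotated(ann_list: List[Ann]) -> List[Ann]:
    if not ann_list:
        return []
    best = max(annotated_sides(ann) for ann in ann_list)
    # Discard anything less annotated than the best candidate.
    return [ann for ann in ann_list if annotated_sides(ann) == best]
```

If every candidate is intergenic on both sides, all of them are kept, since none is more annotated than the others.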

choose_transcripts_by_priority()

For each set of annotations with the same combination of genes, chooses the annotation with the most "best_transcripts" or the most "alphanumeric" choices of transcript. Throws an error if they are identical.

def choose_transcripts_by_priority(ann_list: List[Annotation]) -> List[Annotation]:

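Args

  • ann_list (List[Annotation]): list of input annotations

Returns

  • List[Annotation]: the chosen annotations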

Warning

Input annotations are assumed to be the same event (the same validation_id). The logic used would not apply to different events.
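
One possible reading of the selection rule for choose_transcripts_by_priority() is sketched below: group annotations by their pair of genes, prefer the annotation with more transcripts flagged as "best", break ties by taking the alphabetically first transcript names, and raise if two candidates still rank the same. The `Tx` and `Ann` classes, the `is_best` flag, and the interpretation of "alphanumeric" as alphabetical ordering of names are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Tx:
    """Hypothetical transcript: a name, its gene, and a 'best' flag."""
    name: str
    gene: str
    is_best: bool = False


@dataclass(frozen=True)
class Ann:
    """Hypothetical annotation holding up to two transcripts."""
    transcript1: Optional[Tx]
    transcript2: Optional[Tx]


def priority_key(ann: Ann) -> Tuple[int, Tuple[str, str]]:
    txs = (ann.transcript1, ann.transcript2)
    best_count = sum(1 for t in txs if t is not None and t.is_best)
    names = tuple(t.name if t is not None else "" for t in txs)
    # More "best" transcripts sorts first, then alphabetical names.
    return (-best_count, names)


def pick_by_priority(ann_list: List[Ann]) -> List[Ann]:
    groups: Dict[tuple, List[Ann]] = defaultdict(list)
    for ann in ann_list:
        genes = tuple(
            t.gene if t is not None else None
            for t in (ann.transcript1, ann.transcript2)
        )
        groups[genes].append(ann)

    chosen = []
    for group in groups.values():
        ranked = sorted(group, key=priority_key)
        if len(ranked) > 1 and priority_key(ranked[0]) == priority_key(ranked[1]):
            raise ValueError("annotations are identical; cannot choose one")
        chosen.append(ranked[0])
    return chosen
```

Each gene combination contributes exactly one annotation to the result, in the order the combinations first appear in the input.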