mavis/summary/summary
filter_by_call_method()
Filters a set of breakpoint pairs to return the call with the most evidence. Prefers contig evidence over spanning, spanning over split, split over flanking, etc.
def filter_by_call_method(bpp_list):
Args
- bpp_list
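The preference order described above can be sketched as follows. This is a minimal illustration and not the MAVIS implementation: the `Call` tuple, its fields, the `METHOD_RANK` table, and `pick_preferred_calls` are hypothetical names, and the ranking covers only the four evidence types named in the description.

```python
from collections import namedtuple

# Hypothetical stand-in for a call; this is NOT the MAVIS BreakpointPair class.
Call = namedtuple("Call", ["name", "call_method"])

# Lower rank means stronger evidence, following the order in the description.
# The string labels are assumptions made for this sketch.
METHOD_RANK = {"contig": 0, "spanning": 1, "split": 2, "flanking": 3}


def pick_preferred_calls(calls):
    """Keep only the calls whose evidence type has the best (lowest) rank."""
    if not calls:
        return []
    best = min(METHOD_RANK[c.call_method] for c in calls)
    return [c for c in calls if METHOD_RANK[c.call_method] == best]


calls = [Call("a", "flanking"), Call("b", "contig"), Call("c", "split")]
preferred = pick_preferred_calls(calls)
```

Here `preferred` holds only the contig-supported call, since contig evidence outranks the other two types.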
group_events()
Groups events together and joins their data attributes.
def group_events(events):
Args
- events
group_by_distance()
Groups a set of calls based on their proximity. Returns a new list of calls where close calls have been merged.
def group_by_distance(calls, distances):
Args
- calls
- distances
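The idea of merging nearby calls can be illustrated with a simplified sketch. It assumes each call is reduced to a single integer position and that one maximum distance applies; the real function works on call objects and takes a `distances` argument whose structure is not documented here. The name `group_positions` is hypothetical.

```python
def group_positions(positions, max_distance):
    """Group sorted positions; a gap larger than max_distance starts a new group."""
    groups = []
    for pos in sorted(positions):
        # Compare against the last position already placed in the current group.
        if groups and pos - groups[-1][-1] <= max_distance:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


groups = group_positions([100, 105, 300, 104, 310], max_distance=10)
```

Each resulting group would then be collapsed into a single merged call.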
annotate_with_dgv_bpp()
Given a list of breakpoint pairs (bpps) and bpps from another reference set, annotate the events that are within the set distance of both breakpoints
def annotate_with_dgv_bpp(
bpps: List[BreakpointPair],
dgv_regions_list: List[BreakpointPair],
input_cluster_radius: int,
):
Args
- bpps (List[BreakpointPair]): the list of BreakpointPair objects
- dgv_regions_list (List[BreakpointPair]): the dgv reference regions specified by the MAVIS input file represented as BreakpointPairs
- input_cluster_radius (int): Distance used in matching input SVs to reference SVs through clustering, defined by summary.cluster_radius in the configuration file
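The matching rule, where an event is annotated only if both of its breakpoints fall within the radius of a reference event, can be sketched as below. This is an assumption-laden simplification: events are plain `(chr1, pos1, chr2, pos2)` tuples with point positions rather than `BreakpointPair` objects, and `matches_reference` is a hypothetical helper, not a MAVIS function.

```python
def matches_reference(event, reference, radius):
    """True when both breakpoints of `event` lie within `radius` of `reference`."""
    chr1, pos1, chr2, pos2 = event
    ref_chr1, ref_pos1, ref_chr2, ref_pos2 = reference
    return (
        chr1 == ref_chr1
        and chr2 == ref_chr2
        and abs(pos1 - ref_pos1) <= radius
        and abs(pos2 - ref_pos2) <= radius
    )


event = ("1", 1000, "1", 5000)
near = ("1", 1040, "1", 4980)  # both breakpoints within 50 bp
far = ("1", 1040, "1", 6000)   # second breakpoint is 1000 bp away

is_near = matches_reference(event, near, radius=50)
is_far = matches_reference(event, far, radius=50)
```

In this sketch `radius` plays the role of `input_cluster_radius`; a match on only one breakpoint is not enough.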
annotate_with_dgv_bed()
Given a list of breakpoint pairs (bpps) and bpps from another reference set, annotate the events that are within the set distance of both breakpoints
def annotate_with_dgv_bed(
bpps: List[BreakpointPair],
dgv_regions_by_reference_name: Dict[Tuple[str], List[BioInterval]],
input_cluster_radius: int,
):
Args
- bpps (List[BreakpointPair]): the list of BreakpointPair objects
- dgv_regions_by_reference_name (Dict[Tuple[str], List[BioInterval]]): the dgv reference regions specified by the bed input file represented as BioIntervals
- input_cluster_radius (int): Distance used in matching input SVs to reference SVs through clustering, defined by summary.cluster_radius in the configuration file
annotate_dgv()
Given a list of breakpoint pairs (bpps) and bpps from another reference set, annotate the events that are within the set distance of both breakpoints
def annotate_dgv(
bpps: List[BreakpointPair],
dgv_regions_by_reference_name: Dict[Tuple[str], Union[List[BreakpointPair], List[BioInterval]]],
input_cluster_radius: int,
):
Args
- bpps (List[BreakpointPair]): the list of BreakpointPair objects
- dgv_regions_by_reference_name (Dict[Tuple[str], Union[List[BreakpointPair], List[BioInterval]]]): tuple of strings of chr name and its associated list of BreakpointPair/BioInterval objects specified by the MAVIS input file
- input_cluster_radius (int): Distance used in matching input SVs to reference SVs through clustering, defined by summary.cluster_radius in the configuration file
get_pairing_state()
Given two libraries, returns the appropriate descriptor for their matched state.
def get_pairing_state(
current_protocol,
current_disease_state,
other_protocol,
other_disease_state,
is_matched=False,
inferred_is_matched=False,
):
Args
- current_protocol (PROTOCOL): the protocol of the current library
- current_disease_state (DISEASE_STATUS): the disease status of the current library
- other_protocol (PROTOCOL): protocol of the library being compared to
- other_disease_state (DISEASE_STATUS): disease status of the library being compared to
- is_matched (bool): True if the libraries are paired
- inferred_is_matched
Returns
(PAIRING_STATE): descriptor of the pairing of the two libraries