mavis/cluster/cluster
class BreakpointPairGroupKey
inherits []
merge_integer_intervals()
Merges a set of integer intervals into a single interval whose center is the weighted mean of the input interval centers. Each weight is inversely proportional to the length of its interval. The length of the merged interval is the average of the input lengths, capped so that the result never extends beyond the union of the input intervals.
def merge_integer_intervals(*intervals, weight_adjustment: int = 0) -> Interval:
Returns
- Interval
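The weighting described above can be illustrated with a minimal sketch. This is not the MAVIS implementation: it uses plain `(start, end)` tuples instead of `Interval`, the name `merge_intervals_sketch` is hypothetical, and the way `weight_adjustment` enters the weight (added to the length in the denominator) is an assumption made for illustration.

```python
from typing import Tuple

IntervalT = Tuple[int, int]  # hypothetical stand-in for Interval: inclusive (start, end)


def merge_intervals_sketch(*intervals: IntervalT, weight_adjustment: int = 0) -> IntervalT:
    """Merge intervals around a length-weighted mean center (illustrative only)."""
    if not intervals:
        raise ValueError("at least one interval is required")

    centers = [(start + end) / 2 for start, end in intervals]
    lengths = [end - start + 1 for start, end in intervals]  # inclusive integer length

    # Assumption: shorter intervals get larger weights; weight_adjustment
    # dampens the difference between short and long intervals.
    weights = [1 / (length + weight_adjustment) for length in lengths]
    center = sum(c * w for c, w in zip(centers, weights)) / sum(weights)

    # Half-width derived from the average input length.
    half = (sum(lengths) / len(lengths) - 1) / 2

    # Cap the result so it never leaves the union of the inputs.
    lowest = min(start for start, _ in intervals)
    highest = max(end for _, end in intervals)
    start = max(lowest, round(center - half))
    end = min(highest, round(center + half))
    return start, end


print(merge_intervals_sketch((1, 10), (5, 5)))
```

In this sketch the narrow interval `(5, 5)` pulls the merged center toward position 5, while the wide interval `(1, 10)` contributes mostly to the merged length.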
merge_by_union()
For a given set of breakpoint pairs, merges the union of all pairs that are within the given distance (cluster_radius).
def merge_by_union(
input_pairs: List[BreakpointPair],
group_key: BreakpointPairGroupKey,
weight_adjustment: int = 10,
cluster_radius: int = 200,
) -> Dict[BreakpointPairGroupKey, List[BreakpointPair]]:
Args
- input_pairs (List[BreakpointPair])
- group_key (BreakpointPairGroupKey)
- weight_adjustment (int)
- cluster_radius (int)
Returns
- Dict[BreakpointPairGroupKey, List[BreakpointPair]]
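Merging "by union" can be pictured as finding connected groups of pairs, where two pairs are linked when they lie within cluster_radius of each other. The sketch below is a simplified illustration, not the MAVIS code: each pair is reduced to a hypothetical `(first_position, second_position)` tuple, the distance test (both positions within the radius) is an assumption, and `group_key` and `weight_adjustment` are left out.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # hypothetical stand-in for a BreakpointPair


def union_clusters_sketch(pairs: List[Pair], cluster_radius: int = 200) -> List[List[Pair]]:
    """Group pairs into connected components using a small union-find."""
    parent = list(range(len(pairs)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            # Assumption: two pairs are "close" when both breakpoints are
            # within cluster_radius of their counterparts.
            close = all(abs(a - b) <= cluster_radius for a, b in zip(pairs[i], pairs[j]))
            if close:
                parent[find(i)] = find(j)

    groups: Dict[int, List[Pair]] = {}
    for i, pair in enumerate(pairs):
        groups.setdefault(find(i), []).append(pair)
    return sorted(groups.values())


pairs = [(100, 5000), (250, 5100), (400, 5250), (10000, 20000)]
print(union_clusters_sketch(pairs, cluster_radius=200))
```

Note that the first and third pairs end up in the same group even though they are more than 200 apart, because the second pair links them. This chaining is what distinguishes a union-style merge from a strict pairwise distance cutoff.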
merge_breakpoint_pairs()
Two-step merging process:
- Merges all 'small' events (see cluster_initial_size_limit) as the union of all events that fall within the cluster_radius.
- For all remaining events, chooses the 'best' merge for any event within cluster_radius of an existing node; otherwise the event is added as an unmerged node. Events in the second phase are processed in order of total breakpoint interval size, from smallest to largest.
def merge_breakpoint_pairs(
input_pairs: List[BreakpointPair],
cluster_radius: int = 200,
cluster_initial_size_limit: int = 25,
verbose: bool = False,
) -> Dict[BreakpointPair, List[BreakpointPair]]:
Args
- input_pairs (List[BreakpointPair]): the pairs to be merged
- cluster_radius (int): maximum distance allowed for a node to merge
- cluster_initial_size_limit (int): maximum size of breakpoint intervals allowed in the first merging phase
- verbose (bool)
Returns
- Dict[BreakpointPair, List[BreakpointPair]]: mapping of merged breakpoint pairs to the input pairs used in the merge
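The two phases can be sketched on a one-dimensional simplification. Everything here is illustrative rather than the MAVIS implementation: each event is a single hypothetical inclusive `(start, end)` interval instead of a full `BreakpointPair`, "best" is assumed to mean "nearest node center", and the function returns a list of clusters instead of a mapping keyed by merged pairs.

```python
from typing import List, Tuple

Event = Tuple[int, int]  # hypothetical stand-in: one inclusive breakpoint interval


def center(event: Event) -> float:
    return (event[0] + event[1]) / 2


def size(event: Event) -> int:
    return event[1] - event[0] + 1


def node_center(cluster: List[Event]) -> float:
    return sum(center(e) for e in cluster) / len(cluster)


def two_phase_merge_sketch(
    events: List[Event],
    cluster_radius: int = 200,
    cluster_initial_size_limit: int = 25,
) -> List[List[Event]]:
    small = sorted((e for e in events if size(e) <= cluster_initial_size_limit), key=center)
    large = sorted((e for e in events if size(e) > cluster_initial_size_limit), key=size)

    # Phase 1: union of small events. In one dimension, chaining sorted
    # centers that are within the radius yields the connected groups.
    clusters: List[List[Event]] = []
    for event in small:
        if clusters and center(event) - center(clusters[-1][-1]) <= cluster_radius:
            clusters[-1].append(event)
        else:
            clusters.append([event])

    # Phase 2: remaining events, smallest first, each joining the nearest
    # node within the radius (assumed meaning of "best") or staying unmerged.
    for event in large:
        candidates = [c for c in clusters if abs(node_center(c) - center(event)) <= cluster_radius]
        if candidates:
            best = min(candidates, key=lambda c: abs(node_center(c) - center(event)))
            best.append(event)
        else:
            clusters.append([event])
    return clusters


events = [(100, 105), (200, 210), (1000, 1005), (150, 400), (5000, 5300)]
print(two_phase_merge_sketch(events))
```

Processing the large events from smallest to largest means that better-resolved calls are placed first, so a very imprecise event cannot drag a node away before more precise events have been assigned.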