pipeline module

class mavis.schedule.pipeline.Pipeline(output_dir, scheduler, validations=None, annotations=None, pairing=None, summary=None, checker=None, batch_id='batch-tmD9NPdhkNfTgLDnVsjYu9')

Bases: object

Parameters:
  • output_dir (str) – path to main output directory for all mavis pipeline results
  • scheduler (Scheduler) – the class for interacting with a job scheduler
  • validations (list of Job) – list of validation jobs
  • annotations (list of Job) – list of annotation jobs
  • pairing (Job) – pairing job
  • summary (Job) – summary job
  • batch_id (str) – the batch id for this pipeline run. Used to avoid job name conflicts
ERROR_STATES = {'CANCELLED', 'ERROR', 'FAILED', 'NOT SUBMITTED', 'UNKNOWN'}
classmethod build(config)
Parameters: config (MavisConfig) – the main configuration. Note that this is the config after all reference inputs have been loaded
Returns: the pipeline instance, including job dependency information
Return type: Pipeline
check_status(submit=False, resubmit=False, log=<mavis.util.Log object>)

Check all jobs for completion and report any failures.

Parameters: submit (bool) – submit any pending jobs
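
As a rough illustration of the failure reporting described above, the sketch below filters job statuses against the ERROR_STATES set listed for this class. The failed_jobs helper, the plain status dictionary, and the 'COMPLETED' status string are hypothetical stand-ins for illustration only; the real method works with Job objects and a scheduler, and its internals are not shown here.

```python
# Status strings copied from Pipeline.ERROR_STATES as documented above.
ERROR_STATES = {'CANCELLED', 'ERROR', 'FAILED', 'NOT SUBMITTED', 'UNKNOWN'}


def failed_jobs(statuses):
    """Return the sorted names of jobs whose status is an error state.

    `statuses` is a hypothetical mapping of job name -> status string;
    it is not a MAVIS data structure.
    """
    return sorted(name for name, status in statuses.items() if status in ERROR_STATES)


if __name__ == '__main__':
    # 'COMPLETED' is an assumed non-error status used only for this example.
    example = {
        'validate-1': 'COMPLETED',
        'annotate-1': 'FAILED',
        'pairing': 'NOT SUBMITTED',
    }
    for name in failed_jobs(example):
        print('job needs attention:', name)
```

Keeping the error states in a set makes the membership test cheap and keeps the list of failure conditions in one place.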
classmethod format_args(subcommand, args)
classmethod read_build_file(filepath)

Read the configuration file that stores the build information about jobs and their dependencies.

Parameters: filepath (str) – path to the input config file
write_build_file(filename)

Write the build.cfg file for the current pipeline. This file is used later to reload the pipeline, check job status, and report failures.

Parameters: filename (str) – path to the output config file
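
The write/read pairing above can be pictured as a round trip through an INI-style file. The sketch below uses the standard-library configparser to show the idea; the section layout, the 'stage' and 'dependencies' keys, and the dump_jobs/load_jobs helpers are all assumptions made for illustration and are not the actual build.cfg schema.

```python
import configparser
import io


def dump_jobs(jobs):
    """Serialise a hypothetical {job_name: {'stage': str, 'dependencies': [str]}} mapping."""
    parser = configparser.ConfigParser(interpolation=None)
    for name, info in jobs.items():
        parser[name] = {
            'stage': info['stage'],
            # One dependency per line; configparser indents continuation lines on write.
            'dependencies': '\n'.join(info['dependencies']),
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_jobs(text):
    """Rebuild the mapping produced by dump_jobs from its text form."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    jobs = {}
    for name in parser.sections():
        raw = parser[name].get('dependencies', '')
        jobs[name] = {
            'stage': parser[name]['stage'],
            'dependencies': [dep for dep in raw.splitlines() if dep.strip()],
        }
    return jobs


if __name__ == '__main__':
    pipeline_jobs = {
        'validate-1': {'stage': 'validate', 'dependencies': []},
        'annotate-1': {'stage': 'annotate', 'dependencies': ['validate-1']},
    }
    print(dump_jobs(pipeline_jobs))
```

Storing dependencies by job name rather than by object reference is what lets the structure be rebuilt in a later process.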
write_submission_script(subcommand, job, args, aligner_path=None)
Parameters:
  • subcommand (SUBCOMMAND) – the pipeline step this script will run
  • job (Job) – the job the script is for
  • args (dict) – arguments for the subcommand
mavis.schedule.pipeline.annotate_args(config, libconf)

Pull arguments from the main config and the library-specific config to pass to annotate

Parameters:
mavis.schedule.pipeline.cluster_args(config, libconf)

Pull arguments from the main config and the library-specific config to pass to cluster

Parameters:
mavis.schedule.pipeline.parse_run_time(filename)

Parse the run time listed at the end of a file that follows MAVIS conventions.
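
A minimal sketch of this kind of parsing is shown below. It assumes the file ends with a line such as "run time (s): 42"; that line format, the parse_run_time_sketch name, and the choice to return None when no such line is found are assumptions for illustration, since the exact MAVIS convention is not spelled out here.

```python
import re

# Assumed log convention: a line of the form "run time (s): <integer>".
RUN_TIME_PATTERN = re.compile(r'run time \(s\): (\d+)')


def parse_run_time_sketch(filename):
    """Return the last reported run time in seconds, or None if absent."""
    with open(filename, 'r') as handle:
        lines = handle.read().splitlines()
    # Scan from the end so the most recent report wins.
    for line in reversed(lines):
        match = RUN_TIME_PATTERN.search(line)
        if match:
            return int(match.group(1))
    return None
```

Scanning backwards matters when a log has been appended to by a rerun, because the final entry is the one that reflects the latest attempt.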

mavis.schedule.pipeline.run_conversion(config, libconf, conversion_dir, assume_no_untemplated=True)

Convert any input files that have not already been converted. Returns a list of filenames.

mavis.schedule.pipeline.stringify_args_to_command(args)

Take a list of arguments and prepare them for writing to a bash script.
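
To make the idea concrete, the sketch below turns arguments into a single shell-safe string using shlex.quote from the standard library. The (flag, value) tuple layout and the stringify_args_sketch name are hypothetical; this is one plausible way to quote arguments for bash, not the MAVIS implementation.

```python
import shlex


def stringify_args_sketch(args):
    """Join (flag, value) pairs into one string that is safe to paste into bash.

    `args` is a hypothetical list of (flag, value) tuples. List or tuple
    values are expanded into several quoted tokens after the flag.
    """
    tokens = []
    for flag, value in args:
        tokens.append(flag)
        if isinstance(value, (list, tuple)):
            tokens.extend(shlex.quote(str(item)) for item in value)
        else:
            tokens.append(shlex.quote(str(value)))
    return ' '.join(tokens)


if __name__ == '__main__':
    print(stringify_args_sketch([
        ('--output', '/tmp/my dir'),
        ('--inputs', ['a.tab', 'b.tab']),
    ]))
```

Quoting each value separately keeps paths with spaces or shell metacharacters from being split or expanded when the script runs.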

mavis.schedule.pipeline.summary_args(config)

Pull arguments from the main config to pass to summary

Parameters:
mavis.schedule.pipeline.validate_args(config, libconf)

Pull arguments from the main config and the library-specific config to pass to validate

Parameters: