BCCDC-PHL Auto IRIDA Azure Upload
auto_irida_azure_upload.core
Core functionality.
- auto_irida_azure_upload.core.downsample_reads(config, run_id, samplesheet)
Downsample reads.
- Parameters
config (dict[str, object]) – Application config.
run_id (str) – Sequencing run ID.
samplesheet (list[dict[str, str]]) – Downsampling samplesheet [ { ID: str, R1: str, R2: str, GENOME_SIZE: str, COVERAGE: str } ]
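The samplesheet argument is a list of records with five string fields. The following is a minimal sketch of building and checking such a list, assuming nothing beyond the field names shown above. The helper `missing_required_keys`, the file paths, and the field values are hypothetical placeholders and are not part of the package.

```python
# Field names taken from the samplesheet description above.
REQUIRED_KEYS = {"ID", "R1", "R2", "GENOME_SIZE", "COVERAGE"}


def missing_required_keys(samplesheet):
    """Return the IDs of records that lack any required key (hypothetical helper)."""
    bad_ids = []
    for record in samplesheet:
        if not REQUIRED_KEYS.issubset(record):
            bad_ids.append(record.get("ID", "<unknown>"))
    return bad_ids


# Placeholder values only; the formats expected for GENOME_SIZE and
# COVERAGE are not documented here and are assumed for illustration.
samplesheet = [
    {
        "ID": "sample-01",
        "R1": "/path/to/sample-01_R1.fastq.gz",
        "R2": "/path/to/sample-01_R2.fastq.gz",
        "GENOME_SIZE": "5.0m",
        "COVERAGE": "100",
    },
]
```

Checking the records before passing them on makes a missing column fail early rather than partway through a run.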
- auto_irida_azure_upload.core.find_fastq(run, library_id, read_type)
Find the fastq file for a specific library on a specific run.
- Parameters
run (dict[str, str]) – Sequencing run. Keys: [‘sequencing_run_id’, ‘path’, ‘instrument_type’]
library_id (str) – Library ID
read_type (str) – Read type (‘R1’ or ‘R2’)
- Returns
Path to fastq file
- Return type
Optional[str]
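As an illustration of the Optional[str] return contract, here is a hedged sketch of a fastq lookup. It is not the package's implementation: the function name `find_fastq_sketch`, the explicit `fastq_dir` argument, and the Illumina-style filename pattern (`<library_id>_S1_L001_R1_001.fastq.gz`) are all assumptions.

```python
import glob
import os


def find_fastq_sketch(fastq_dir, library_id, read_type):
    """Return the path of the first matching fastq file, or None (hypothetical)."""
    # glob.escape guards against IDs or directories containing glob
    # metacharacters such as '[' or '*'.
    pattern = os.path.join(
        glob.escape(fastq_dir),
        f"{glob.escape(library_id)}_*_{read_type}_*.fastq.gz",
    )
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None
```

Callers should handle the None case explicitly, since a library listed on the SampleSheet may have produced no reads.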
- auto_irida_azure_upload.core.find_run_dirs(config, check_upload_complete=True)
Find sequencing run directories under the ‘run_parent_dirs’ listed in the config.
- Parameters
config (dict[str, object]) – Application config.
check_upload_complete (bool) – Check for presence of ‘upload_complete.json’ file.
- Returns
Run directory. Keys: [‘sequencing_run_id’, ‘path’, ‘instrument_type’]
- Return type
Iterator[Optional[dict[str, str]]]
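The scan described above can be sketched as a generator. This is an illustration under stated assumptions, not the package's code: the run ID regular expressions, the rule that the directory name is the run ID, and the location of 'upload_complete.json' directly inside the run directory are all assumed. Only the 'run_parent_dirs' config key and the three result keys come from the description above.

```python
import os
import re

# Assumed run ID formats; the real patterns may differ.
RUN_ID_PATTERNS = {
    "miseq": re.compile(r"^\d{6}_M\d{5}_\d+_\d{9}-[A-Z0-9]{5}$"),
    "nextseq": re.compile(r"^\d{6}_VH\d{5}_\d+_[A-Z0-9]{9}$"),
}


def find_run_dirs_sketch(config, check_upload_complete=True):
    """Yield one dict per run directory found under config['run_parent_dirs']."""
    for parent in config["run_parent_dirs"]:
        with os.scandir(parent) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if not entry.is_dir():
                continue
            instrument_type = next(
                (name for name, pat in RUN_ID_PATTERNS.items() if pat.match(entry.name)),
                None,
            )
            if instrument_type is None:
                continue  # directory name does not look like a run ID
            marker = os.path.join(entry.path, "upload_complete.json")
            if check_upload_complete and not os.path.exists(marker):
                continue  # assumed meaning: skip runs without the marker file
            yield {
                "sequencing_run_id": entry.name,
                "path": os.path.abspath(entry.path),
                "instrument_type": instrument_type,
            }
```

Because the function is a generator, runs are discovered lazily and a caller can stop after the first one it decides to process.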
- auto_irida_azure_upload.core.prepare_downsampling_samplesheet(config, run)
Prepare a SampleSheet to use for downsampling.
- Parameters
config (dict[str, object]) – Application config.
run (dict[str, str]) – Sequencing run to prepare SampleSheet.csv file for. Keys: [‘sequencing_run_id’, ‘path’, ‘instrument_type’]
- Returns
Downsampling samplesheet [ { ID: str, R1: str, R2: str, GENOME_SIZE: str, COVERAGE: str } ]
- Return type
list[dict[str, str]]
- auto_irida_azure_upload.core.prepare_samplelist(config, run, downsampled_reads={})
Prepare a SampleList for a specific run.
- Parameters
config (dict[str, object]) – Application config.
run (dict[str, str]) – Sequencing run to prepare SampleList.csv file for. Keys: [‘sequencing_run_id’, ‘path’, ‘instrument_type’]
- Returns
List of samples to upload. Keys: [‘Sample_Name’, ‘Project_ID’, ‘File_Forward’, ‘File_Forward_Absolute_Path’, ‘File_Reverse’, ‘File_Reverse_Absolute_Path’]
- Return type
list[dict[str, str]]
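To make the shape of one sample record concrete, here is a small sketch that builds a dict with the six keys listed above. The helper `make_sample_record` is hypothetical, and the choice to store the file's base name in 'File_Forward' / 'File_Reverse' is an assumption inferred from the key names, not confirmed behaviour.

```python
import os


def make_sample_record(sample_name, project_id, r1_path, r2_path):
    """Build one sample record with the documented keys (hypothetical helper)."""
    return {
        "Sample_Name": sample_name,
        "Project_ID": project_id,
        # Assumed: the short fields hold the file name only ...
        "File_Forward": os.path.basename(r1_path),
        # ... and the *_Absolute_Path fields hold the full location.
        "File_Forward_Absolute_Path": os.path.abspath(r1_path),
        "File_Reverse": os.path.basename(r2_path),
        "File_Reverse_Absolute_Path": os.path.abspath(r2_path),
    }
```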
- auto_irida_azure_upload.core.prepare_upload_dir(config, run, sample_list)
Prepare upload directory for run.
- Parameters
config (dict[str, object]) – Application config.
run (dict[str, str]) – Sequencing run to prepare upload directory for. Keys: [‘sequencing_run_id’, ‘path’, ‘instrument_type’]
sample_list (list[dict[str, str]]) – List of samples to upload. Keys: [‘Sample_Name’, ‘Project_ID’, ‘File_Forward’, ‘File_Forward_Absolute_Path’, ‘File_Reverse’, ‘File_Reverse_Absolute_Path’]
- Returns
Upload dir path
- Return type
path
- auto_irida_azure_upload.core.scan(config: dict[str, object]) -> Iterator[Optional[dict[str, object]]]
Scanning involves looking for all existing runs and storing them to the database, then looking for all existing symlinks and storing them to the database. At the end of a scan, we should be able to determine which (if any) symlinks need to be created.
- Parameters
config (dict[str, object]) – Application config.
- Returns
A run directory to analyze, or None
- Return type
Iterator[Optional[dict[str, object]]]
- auto_irida_azure_upload.core.upload_run(config, run, upload_dir)
Initiate an analysis on one directory of fastq files.
auto_irida_azure_upload.samplesheet
Functions for parsing Illumina SampleSheet files.
- auto_irida_azure_upload.samplesheet.choose_samplesheet_to_parse(samplesheet_paths: list[str], instrument_type: str, run_id: str)
A run directory may have multiple SampleSheet.csv files in it. Choose only one to parse.
- Parameters
samplesheet_paths (list[str]) – List of paths to SampleSheet.csv files
instrument_type (str) – Instrument type, should be one of: “miseq”, “nextseq”
run_id (str) – Sequencing run ID
- auto_irida_azure_upload.samplesheet.find_samplesheets(run_dir, instrument_type)
Find SampleSheets in run directory.
- Parameters
run_dir (str) – Path to sequencing run directory
instrument_type (str) – Instrument type (‘miseq’ or ‘nextseq’)
- auto_irida_azure_upload.samplesheet.parse_samplesheet(samplesheet_path: str, instrument_type: str) -> dict[str, object]
- Parameters
samplesheet_path (str) – Path to SampleSheet file.
instrument_type (str) – Instrument type (‘miseq’ or ‘nextseq’)
- auto_irida_azure_upload.samplesheet.parse_samplesheet_miseq(samplesheet_path)
Parse a MiSeq SampleSheet. Returns None if parsing fails.
- Parameters
samplesheet_path (str) – Path to SampleSheet file.
- Returns
Parsed SampleSheet
- Return type
Optional[dict[str, object]]
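Illumina SampleSheet files are CSV files divided into bracketed sections such as [Header] and [Data]. The sketch below shows one way to parse that layout while honouring the "returns None if parsing fails" contract. The function name `parse_samplesheet_sketch`, the lower-cased section keys, and the shape of the returned dict are assumptions; the real parser's output may be structured differently.

```python
import csv


def parse_samplesheet_sketch(samplesheet_path):
    """Parse a sectioned SampleSheet into a dict, or return None on failure."""
    try:
        with open(samplesheet_path, newline="") as f:
            rows = list(csv.reader(f))
    except (OSError, UnicodeDecodeError):
        return None

    sections = {}
    current = None
    for row in rows:
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines and lines of bare commas
        first = row[0].strip()
        if first.startswith("[") and first.endswith("]"):
            current = first[1:-1].lower()  # e.g. "[Data]" -> "data"
            sections[current] = []
        elif current is not None:
            sections[current].append([cell.strip() for cell in row])

    data_rows = sections.get("data", [])
    if not data_rows:
        return None  # treat a missing or empty [Data] section as a failure
    header, *records = data_rows
    # Turn each [Data] row into a dict keyed by the column header.
    sections["data"] = [dict(zip(header, record)) for record in records]
    return sections
```

Returning None instead of raising keeps a single malformed SampleSheet from stopping a scan over many run directories, at the cost of requiring callers to check the result.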
- auto_irida_azure_upload.samplesheet.parse_samplesheet_nextseq(samplesheet_path)
auto_irida_azure_upload.util
Utility functions shared by the other modules.