BCCDC-PHL TB Genomics Database
tb_db.crud
This module includes methods used to Create, Read, Update and Delete (CRUD) entities from the database.
- tb_db.crud.add_sample_to_cgmlst_cluster(db: Session, sample_id: str, cgmlst_cluster: dict[str, object], runid)
Create a cgMLST cluster for the sample specified by sample_id and runid.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
sample_id (str) – Sample ID.
cgmlst_cluster (dict[str, object]) – Dict representing a cgMLST cluster.
runid – Sequencing run ID.
- Returns
Sample with cgMLST cluster added.
- Return type
models.Sample
- tb_db.crud.add_samples_to_cgmlst_clusters(db: Session, cgmlst_cluster: list[dict[str, object]], runs: dict[str, str])
Create multiple cgMLST clusters for the samples identified in runs.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
cgmlst_cluster (list[dict[str, object]]) – List of dicts, each representing a cgMLST cluster.
runs (dict[str, str]) – Dict of sample IDs and their run IDs.
- Returns
Sample with cgMLST cluster added.
- Return type
models.Sample
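Several functions in this module take a runs argument, described as a dict of sample IDs and their run IDs. As a minimal sketch, such a mapping could be assembled from per-library records. The helper name build_runs and the example values are hypothetical, and the sample_id and sequencing_run_id key names are borrowed from the libraries description under create_libraries, on the assumption that the same names apply here.

```python
def build_runs(records: list[dict[str, object]]) -> dict[str, str]:
    """Map each sample ID to its sequencing run ID (hypothetical helper)."""
    runs: dict[str, str] = {}
    for record in records:
        sample_id = str(record["sample_id"])
        run_id = str(record["sequencing_run_id"])
        # Assumption: one run per sample; a later record overwrites an earlier one.
        runs[sample_id] = run_id
    return runs


# Made-up example records, not real data.
example_records = [
    {"sample_id": "S1", "sequencing_run_id": "RUN_A"},
    {"sample_id": "S2", "sequencing_run_id": "RUN_A"},
]
example_runs = build_runs(example_records)
```

The resulting dict has the documented dict[str, str] shape. Whether tb_db expects exactly these keys in the source records is not confirmed by this reference.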
- tb_db.crud.create_amr_summary(db: Session, amr_report: list[dict[str, object]], runs: dict[str, str])
Create the AMR profile and drug resistance records. The AMR profile has a one-to-one relationship with the sample, whereas drug resistance is one-to-many.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
amr_report (list[dict[str, object]]) – List of dicts representing the AMR profile of each sample.
runs (dict[str, str]) – Dict of sample IDs and their run IDs.
- Returns
Created AMR profile object.
- Return type
models.AmrProfile
- tb_db.crud.create_cgmlst_allele_profile(db: Session, scheme: dict, cgmlst_allele_profile: dict[str, object], runid: str)
Create a single cgMLST allele profile record.
- Parameters
db (sqlalchemy.orm.Session) – Database session
cgmlst_allele_profile (dict[str, object]) – Dictionary representing a cgMLST allele profile. Must include keys sample_id, profile, and percent_called.
runid (str) – Sequencing run ID.
- Returns
Created cgMLST allele profile.
- Return type
- tb_db.crud.create_cgmlst_allele_profiles(db: Session, scheme: dict, cgmlst_allele_profiles: list[dict[str, object]], runs: list[dict[str, str]])
Create multiple cgMLST allele profile records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
cgmlst_allele_profiles (list[dict[str, object]]) – List of dictionaries representing cgMLST allele profiles.
runs (list[dict[str, str]]) – List of samples with their sequencing run IDs.
- Returns
Created cgMLST allele profiles.
- Return type
- tb_db.crud.create_complexes(db: Session, complexes: list[dict[str, object]], runs: dict[str, str])
Create multiple TB complex assignment records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
complexes (list[dict[str, object]]) – List of dictionaries designating MTBC complex, NTM or non-mycobacteria.
- Returns
Created tb complexes.
- Return type
list[models.TbComplex]
- tb_db.crud.create_libraries(db: Session, libraries: dict[str, object])
Create library records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
libraries (dict[str, object]) – Dictionary representing sample QC. Keys include sample_id, sample_name, sequencing_run_id, most_abundant_species_name, most_abundant_species_fraction_total_reads, estimated_genome_size_bp...
- Returns
Created libraries object.
- Return type
- tb_db.crud.create_miru_profile(db: Session, sample_id: str, miru_profile: dict[str, object])
Create single MIRU profile record, for sample specified by sample_id.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
sample_id (str) – Sample ID
miru_profile (dict[str, object]) – Dict representing a MIRU profile.
- Returns
Created MIRU profile.
- Return type
- tb_db.crud.create_miru_profiles(db: Session, miru_profiles_by_sample_id: dict[str, object])
Create multiple MIRU profile records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
miru_profiles_by_sample_id (dict[str, object]) – Dict of MIRU profiles, indexed by Sample ID.
- Returns
Created MIRU profiles.
- Return type
list[models.MiruProfile]
- tb_db.crud.create_sample(db: Session, sample: dict[str, object])
Create a single sample record.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
sample (dict[str, object]) – Dictionary representing a sample. Must include keys sample_id, accession, and collection_date.
- Returns
Created sample
- Return type
models.Sample|NoneType
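The sample dictionary passed to create_sample must include the keys sample_id, accession, and collection_date. A caller might check for them before touching the database, roughly as sketched here. The helper missing_sample_keys is hypothetical and not part of tb_db, and the example values (including the date format) are made up.

```python
# Keys that the create_sample documentation lists as required.
REQUIRED_SAMPLE_KEYS = ("sample_id", "accession", "collection_date")


def missing_sample_keys(sample: dict[str, object]) -> list[str]:
    """Return the required keys absent from a sample dict (hypothetical helper)."""
    return [key for key in REQUIRED_SAMPLE_KEYS if key not in sample]


# Made-up example values; the date format is an assumption.
complete = {"sample_id": "S1", "accession": "ACC1", "collection_date": "2020-01-01"}
incomplete = {"sample_id": "S2"}
```

An empty result means the dict carries every documented key. It says nothing about whether the values themselves are valid.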
- tb_db.crud.create_samples(db: Session, samples: list[dict[str, object]])
Create multiple sample records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
samples (list[dict[str, object]]) – List of dictionaries representing samples. Must include keys sample_id and collection_date
- Returns
Created samples
- Return type
list[models.Sample]
- tb_db.crud.create_species(db: Session, species: list[dict[str, object]], runs: dict[str, str])
Create multiple TB species records.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
species (list[dict[str, object]]) – List of dictionaries designating MTBC complex, NTM or non-mycobacteria.
- Returns
Created tb species.
- Return type
list[models.TbSpecies]
- tb_db.crud.delete_sample(db: Session, sample_id: str)
Delete all database records for a sample.
- Parameters
db (sqlalchemy.orm.Session) – Database session
sample_id (str) – Sample ID
- Returns
All deleted records for sample.
- Return type
list[models.Sample]
- tb_db.crud.get_cgmlst_cluster_by_sample_id(db: Session, sample_id: str)
Get cgMLST cluster(s) for the sample specified by sample_id.
- Parameters
db (sqlalchemy.orm.Session) – Database session.
sample_id (str) – Sample ID.
- Returns
List of the cgMLST clusters this sample belongs to, as strings.
- Return type
list[str]
- tb_db.crud.get_miru_cluster_by_sample_id(db: Session, sample_id: str)
Get the MIRU cluster for a given sample.
- Parameters
db (sqlalchemy.orm.Session) – Database session
sample_id (str) – Sample ID
- Returns
MIRU cluster name for the sample.
- Return type
str
- tb_db.crud.get_sample(db: Session, sample_id: str)
Get current valid database record for a sample.
- Parameters
db (sqlalchemy.orm.Session) – Database session
sample_id (str) – Sample ID
- Returns
Current valid database record for the sample.
- Return type
models.Sample|NoneType
tb_db.models
This module defines the entities to be stored in the database, and their relationships with one another.
- tb_db.models.camel_to_snake(s: str) → str
Converts camelCase to snake_case.
- Parameters
s (str) – String to convert.
- Returns
snake_case equivalent of camelCase input.
- Return type
str
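One common way to perform this kind of conversion is a regular-expression substitution, sketched below. It is an illustration only and may differ from the actual tb_db.models.camel_to_snake; the function name camel_to_snake_sketch is hypothetical.

```python
import re


def camel_to_snake_sketch(s: str) -> str:
    """Convert camelCase (or PascalCase) to snake_case; illustrative only."""
    # Insert an underscore at each boundary where an uppercase letter follows
    # a lowercase letter or digit, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s).lower()
```

This sketch does not split runs of consecutive capitals, so an acronym-style name such as "TBComplex" becomes "tbcomplex". How the real function treats such input is not stated in this reference.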
tb_db.parsers
This module includes methods used to parse various files to prepare them for input to the database.
- tb_db.parsers.parse_cgmlst(cgmlst_path: str, uncalled='-')
Parse a cgMLST csv file.
- Parameters
cgmlst_path (str) – Path to cgMLST csv file.
- Returns
Dictionary of cgMLST profiles, indexed by Sample ID.
- Return type
dict[str, object]
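The sketch below shows how a cgMLST CSV could be turned into a dictionary indexed by Sample ID, with the uncalled marker used to work out how many loci were called. The file layout is an assumption: a header row, a column named sample_id, and one column per locus. The per-sample keys sample_id, profile, and percent_called follow the keys required by create_cgmlst_allele_profile; computing percent_called in the parser is also an assumption. Unlike parse_cgmlst, this hypothetical parse_cgmlst_sketch takes an open file handle rather than a path so that the example is self-contained.

```python
import csv
import io


def parse_cgmlst_sketch(handle, uncalled: str = "-") -> dict[str, dict[str, object]]:
    """Parse cgMLST rows from an open CSV file handle, indexed by sample ID."""
    profiles: dict[str, dict[str, object]] = {}
    for row in csv.DictReader(handle):
        # Assumption: the sample ID column is named "sample_id".
        sample_id = row.pop("sample_id")
        # Remaining columns are loci; count those not marked as uncalled.
        called = sum(1 for allele in row.values() if allele != uncalled)
        percent_called = 100.0 * called / len(row) if row else 0.0
        profiles[sample_id] = {
            "sample_id": sample_id,
            "profile": row,
            "percent_called": percent_called,
        }
    return profiles


# Made-up two-locus example, not real data.
example_csv = io.StringIO("sample_id,locus_1,locus_2\nS1,5,-\nS2,3,7\n")
example_profiles = parse_cgmlst_sketch(example_csv)
```

In this example S1 has one of two loci called and S2 has both, so their percent_called values are 50.0 and 100.0 respectively.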
- tb_db.parsers.parse_miru(miru_path: str) → dict[str, object]
Parse a MIRU csv file.
- Parameters
miru_path (str) – Path to MIRU csv file.
- Returns
Dict of MIRU profiles, indexed by Sample ID
- Return type
dict[str, object]
tb_db.util
This module includes utility functions that may be useful in other modules.