BCCDC-PHL TB Genomics Database

tb_db.crud

This module includes methods used to Create, Read, Update and Delete (CRUD) entities in the database.

tb_db.crud.add_sample_to_cgmlst_cluster(db: Session, sample_id: str, cgmlst_cluster: dict[str, object], runid)

Create a cgMLST cluster record for the sample specified by sample_id and runid.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • sample_id (str) – Sample ID.

  • cgmlst_cluster (dict[str, object]) – Dict representing a cgMLST cluster.

  • runid – Sequencing run ID.

Returns

Sample with cgMLST cluster added.

Return type

models.Sample

tb_db.crud.add_samples_to_cgmlst_clusters(db: Session, cgmlst_cluster: list[dict[str, object]], runs: dict[str, str])

Create cgMLST cluster records for multiple samples.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • cgmlst_cluster (list[dict[str, object]]) – List of dicts, each representing a cgMLST cluster.

  • runs (dict[str, str]) – Dict with sample IDs and their run IDs.

Returns

Sample with cgMLST cluster added.

Return type

models.Sample
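The relationship between the list of cluster dicts and the runs mapping can be illustrated with a minimal sketch. It does not use tb_db or a database. The helper name attach_run_ids and the keys sample_id, cluster and sequencing_run_id inside each dict are assumptions made for illustration, not the module's confirmed data layout.

```python
def attach_run_ids(cgmlst_clusters, runs):
    """Pair each cluster dict with the run ID of its sample (hypothetical helper)."""
    paired = []
    for cluster in cgmlst_clusters:
        sample_id = cluster["sample_id"]  # assumed key name
        if sample_id not in runs:
            # No run ID is known for this sample, so it is skipped here.
            continue
        paired.append({**cluster, "sequencing_run_id": runs[sample_id]})
    return paired


# Illustrative inputs; all values are made up.
clusters = [
    {"sample_id": "S1", "cluster": "C-001"},
    {"sample_id": "S2", "cluster": "C-002"},
    {"sample_id": "S3", "cluster": "C-001"},
]
runs = {"S1": "RUN-A", "S2": "RUN-B"}

paired = attach_run_ids(clusters, runs)
```

In this sketch a sample without an entry in runs is silently dropped; whether the real function skips such samples, raises, or handles them differently is not stated in this reference.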

tb_db.crud.create_amr_summary(db: Session, amr_report: list[dict[str, object]], runs: dict[str, str])

Create AMR profile and drug resistance records. An AMR profile has a one-to-one relationship with a sample, whereas drug resistance is one-to-many.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • amr_report (list[dict[str, object]]) – List of dicts representing the AMR profile for each sample.

  • runs (dict[str, str]) – Dict with sample IDs and their run IDs.

Returns

Created AMR profile.

Return type

models.AmrProfile
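The one-to-one and one-to-many relationships described above can be sketched with plain dataclasses. This is an illustration only, not the SQLAlchemy models in tb_db.models. The class fields, the helper build_amr_profiles, and the keys sample_id and resistance in each report entry are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DrugResistance:
    drug: str
    resistant: bool


@dataclass
class AmrProfile:
    sample_id: str
    run_id: str
    # One AMR profile holds many drug resistance entries (one-to-many).
    drug_resistances: List[DrugResistance] = field(default_factory=list)


def build_amr_profiles(amr_report, runs):
    """Build one AmrProfile per sample (one-to-one), keyed by sample ID."""
    profiles = {}
    for record in amr_report:
        sample_id = record["sample_id"]  # assumed key name
        profile = AmrProfile(sample_id=sample_id, run_id=runs[sample_id])
        for drug, resistant in record["resistance"].items():  # assumed key name
            profile.drug_resistances.append(DrugResistance(drug, resistant))
        profiles[sample_id] = profile
    return profiles


# Illustrative input; all values are made up.
amr_report = [
    {"sample_id": "S1", "resistance": {"rifampicin": True, "isoniazid": False}},
]
amr_profiles = build_amr_profiles(amr_report, {"S1": "RUN-A"})
```

Keying the result by sample ID makes the one-to-one constraint explicit: a second report entry for the same sample would overwrite the first rather than add a second profile.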

tb_db.crud.create_cgmlst_allele_profile(db: Session, scheme: dict, cgmlst_allele_profile: dict[str, object], runid: str)

Create a single cgMLST allele profile record.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • scheme (dict) – Dict representing the cgMLST scheme.

  • cgmlst_allele_profile (dict[str, object]) – Dictionary representing a cgMLST allele profile. Must include keys sample_id, profile, and percent_called.

  • runid (str) – Sequencing run ID.

Returns

Created cgMLST allele profile.

Return type

models.CgmlstAlleleProfile

tb_db.crud.create_cgmlst_allele_profiles(db: Session, scheme: dict, cgmlst_allele_profiles: list[dict[str, object]], runs: list[dict[str, str]])

Create multiple cgMLST allele profile records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • scheme (dict) – Dict representing the cgMLST scheme.

  • cgmlst_allele_profiles (list[dict[str, object]]) – List of dictionaries representing cgMLST allele profiles.

  • runs (list[dict[str, str]]) – List of samples with their sequencing run IDs.

Returns

Created cgMLST allele profiles.

Return type

list[models.CgmlstAlleleProfile]

tb_db.crud.create_complexes(db: Session, complexes: list[dict[str, object]], runs: dict[str, str])

Create multiple TB complex assignment records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • complexes (list[dict[str, object]]) – List of dictionaries designating MTBC complex, NTM or non-mycobacteria.

  • runs (dict[str, str]) – Dict with sample IDs and their run IDs.

Returns

Created TB complexes.

Return type

list[models.TbComplex]

tb_db.crud.create_libraries(db: Session, libraries: dict[str, object])

Create library records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • libraries (dict[str, object]) – Dictionaries representing sample QC. Keys: sample_id, sample_name, sequencing_run_id, most_abundant_species_name, most_abundant_species_fraction_total_reads, estimated_genome_size_bp...

Returns

Created libraries.

Return type

models.Library

tb_db.crud.create_miru_profile(db: Session, sample_id: str, miru_profile: dict[str, object])

Create a single MIRU profile record for the sample specified by sample_id.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • sample_id (str) – Sample ID

  • miru_profile (dict[str, object]) – Dict representing a MIRU profile.

Returns

Created MIRU profile.

Return type

models.MiruProfile

tb_db.crud.create_miru_profiles(db: Session, miru_profiles_by_sample_id: dict[str, object])

Create multiple MIRU profile records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • miru_profiles_by_sample_id (dict[str, object]) – Dict of MIRU profiles, indexed by Sample ID.

Returns

Created MIRU profiles.

Return type

list[models.MiruProfile]

tb_db.crud.create_sample(db: Session, sample: dict[str, object])

Create a single sample record.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • sample (dict[str, object]) – Dictionary representing a sample. Must include keys sample_id, accession, and collection_date.

Returns

Created sample

Return type

models.Sample|NoneType
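Because the sample dictionary must include specific keys, it can help to check it before calling create_sample. The sketch below shows one way to do so. The helper missing_sample_keys is hypothetical and not part of tb_db, and the example values (including the date format) are made up.

```python
# Keys that the documentation above says a sample dict must include.
REQUIRED_SAMPLE_KEYS = ("sample_id", "accession", "collection_date")


def missing_sample_keys(sample):
    """Return the required keys that are absent from a sample dict."""
    return [key for key in REQUIRED_SAMPLE_KEYS if key not in sample]


# Illustrative sample dict; values are placeholders.
sample = {
    "sample_id": "S1",
    "accession": "ACC-0001",
    "collection_date": "2022-01-15",
}

complete = missing_sample_keys(sample)
incomplete = missing_sample_keys({"sample_id": "S2"})
```

An empty list means the dict carries every documented key. The check says nothing about value types or formats, which this reference does not specify.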

tb_db.crud.create_samples(db: Session, samples: list[dict[str, object]])

Create multiple sample records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • samples (list[dict[str, object]]) – List of dictionaries representing samples. Must include keys sample_id and collection_date

Returns

Created samples

Return type

list[models.Sample]

tb_db.crud.create_species(db: Session, species: list[dict[str, object]], runs: dict[str, str])

Create multiple TB species records.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • species (list[dict[str, object]]) – List of dictionaries designating MTBC complex, NTM or non-mycobacteria.

  • runs (dict[str, str]) – Dict with sample IDs and their run IDs.

Returns

Created TB species.

Return type

list[models.TbSpecies]

tb_db.crud.delete_sample(db: Session, sample_id: str)

Delete all database records for a sample.

Parameters
  • db (sqlalchemy.orm.Session) – Database session

  • sample_id (str) – Sample ID

Returns

All deleted records for sample.

Return type

list[models.Sample]

tb_db.crud.get_cgmlst_cluster_by_sample_id(db: Session, sample_id: str)

Get cgMLST cluster(s) for the sample specified by sample_id.

Parameters
  • db (sqlalchemy.orm.Session) – Database session.

  • sample_id (str) – Sample ID.

Returns

List of names of the cgMLST clusters this sample belongs to.

Return type

list[str]

tb_db.crud.get_miru_cluster_by_sample_id(db: Session, sample_id: str)

Get the MIRU cluster for a given sample.

Parameters
  • db (sqlalchemy.orm.Session) – Database session

  • sample_id (str) – Sample ID

Returns

MIRU cluster name for the sample.

Return type

str

tb_db.crud.get_sample(db: Session, sample_id: str)

Get current valid database record for a sample.

Parameters
  • db (sqlalchemy.orm.Session) – Database session

  • sample_id (str) – Sample ID

Returns

Current valid database record for the sample.

Return type

models.Sample|NoneType

tb_db.models

This module defines the entities to be stored in the database, and their relationships with one another.

class tb_db.models.CgmlstAlleleProfile(**kwargs)
id
class tb_db.models.CgmlstScheme(**kwargs)
id
class tb_db.models.Library(**kwargs)
id
class tb_db.models.MiruProfile(**kwargs)
id
tb_db.models.camel_to_snake(s: str) → str

Converts camelCase to snake_case.

Parameters

s (str) – String to convert.

Returns

snake_case equivalent of camelCase input.

Return type

str
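For readers unfamiliar with the conversion, here is a minimal sketch of one common way to turn camelCase into snake_case with a regular expression. It is an independent illustration, not the implementation in tb_db.models, and the name camel_to_snake_sketch is chosen to avoid confusion with the real function. Its handling of runs of capitals such as "HTTPResponse" may differ from the module's behaviour.

```python
import re


def camel_to_snake_sketch(s: str) -> str:
    """Convert camelCase or PascalCase to snake_case (illustrative only)."""
    # Insert an underscore at every position where an uppercase letter
    # follows a lowercase letter or digit, then lowercase the result.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s).lower()


converted = [
    camel_to_snake_sketch(name)
    for name in ("sampleId", "MiruProfile", "CgmlstAlleleProfile")
]
```

With this approach, the three example names become sample_id, miru_profile and cgmlst_allele_profile.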

tb_db.parsers

This module includes methods used to parse various files to prepare them for input to the database.

tb_db.parsers.parse_cgmlst(cgmlst_path: str, uncalled='-')

Parse a cgMLST csv file.

Parameters

cgmlst_path (str) – Path to cgMLST csv file.

Returns

Dictionary of cgMLST profiles, indexed by Sample ID.

Return type

dict[str, object]
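The sketch below shows what parsing a cgMLST CSV into a dictionary indexed by Sample ID might look like. It is not the tb_db.parsers implementation. It reads from a file-like object instead of a path so that it runs without a file on disk. The column name sample_id, the locus column names, the output keys, and the way percent_called is derived from the uncalled symbol are all assumptions.

```python
import csv
import io


def parse_cgmlst_sketch(handle, uncalled="-"):
    """Parse cgMLST CSV text into a dict of profiles indexed by sample ID."""
    profiles = {}
    for row in csv.DictReader(handle):
        sample_id = row.pop("sample_id")  # remaining columns are loci
        num_loci = len(row)
        num_called = sum(1 for allele in row.values() if allele != uncalled)
        profiles[sample_id] = {
            "sample_id": sample_id,
            "profile": dict(row),
            # Percentage of loci with an allele call (assumed definition).
            "percent_called": 100.0 * num_called / num_loci if num_loci else 0.0,
        }
    return profiles


# Illustrative CSV content; all values are made up.
example_csv = io.StringIO(
    "sample_id,locus_1,locus_2,locus_3,locus_4\n"
    "S1,1,2,-,4\n"
    "S2,1,-,-,7\n"
)
cgmlst_profiles = parse_cgmlst_sketch(example_csv)
```

Each value in the result carries the keys sample_id, profile and percent_called, matching the keys that create_cgmlst_allele_profile is documented to require.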

tb_db.parsers.parse_miru(miru_path: str) → dict[str, object]

Parse a MIRU csv file.

Parameters

miru_path (str) – Path to MIRU csv file.

Returns

Dict of MIRU profiles, indexed by Sample ID

Return type

dict[str, object]

tb_db.util

This module includes utility functions that may be useful in other modules.