microbetag.tools

Functions invoking other software/tools in the microbetag pipeline.

Functions

run_seed_complementarity(→ None)

Adjusts the input files for the seed complementarity step, depending on their type, and exports seed complementarities.

hmmsearch(→ None)

Function to invoke hmmsearch software.

run_prodigal(→ None)

Function to predict ORFs using Prodigal.

kegg_annotation(→ bool)

Function to perform KEGG annotation in parallel.

phenotrex_genotype(→ None)

Runs the compute-genotype program of phenotrex to get COGs present in the list of genomes under study.

phenotrex_predict(→ None)

Runs the predict program of phenotrex to predict whether a genome has a trait.

run_manta(→ None)

Runs the manta package to perform network clustering.

run_flashweave(→ None)

Runs FlashWeave to infer co-occurrence network.

run_faprotax(→ None)

Runs the FAPROTAX collapse_table.py script to assign functions to taxa based on their taxonomy, using literature-derived annotations.

Module Contents

microbetag.tools.run_seed_complementarity(config: microbetag.config.Config) → None

Depending on the type of files used for the seed complementarity step, the internal structure of those files is adjusted first. The function then invokes ExportSeedComplementarities to get the seed and non-seed sets and, based on those, exports the seed complementarities.

Parameters:

config – Instance of the microbetag Config class

microbetag.tools.hmmsearch(params: List) → None

Function to invoke hmmsearch software.

Parameters:

params – list of parameters to be passed to the hmmsearch function.

Note

Each call uses only 1 CPU, because kegg_annotation() already runs the calls in parallel through a multiprocessing.Pool.

Cite:

HMMER 3.4 (Aug 2023); http://hmmer.org/ Copyright (C) 2023 Howard Hughes Medical Institute. Freely distributed under the BSD open source license.
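The single-CPU note can be illustrated with a minimal sketch of how such a call might be assembled. The layout of params (output table, HMM profile, protein FASTA, in that order) and the helper name build_hmmsearch_cmd are assumptions made for illustration and are not taken from microbetag; --cpu, --tblout and -o are standard hmmsearch options.

```python
import shlex
from typing import List


def build_hmmsearch_cmd(params: List[str]) -> List[str]:
    """Build an hmmsearch command line restricted to one CPU.

    Hypothetical layout of params: [tblout_path, hmm_profile, faa_path].
    """
    tblout_path, hmm_profile, faa_path = params
    return [
        "hmmsearch",
        "--cpu", "1",              # one worker per call; the pool supplies the parallelism
        "--tblout", tblout_path,   # tabular per-target hits
        "-o", "/dev/null",         # discard the verbose human-readable report
        hmm_profile,
        faa_path,
    ]


# Illustrative file names only.
cmd = build_hmmsearch_cmd(["bin1.K00001.hmmout", "K00001.hmm", "bin1.faa"])
print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would execute the search, provided hmmsearch is on the PATH.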

microbetag.tools.run_prodigal(fasta: str, basename: str, outdir: str) → None

Function to predict ORFs using Prodigal. By default, outdir is the ORFs folder.

File extensions:

  • fna – FASTA nucleic acid. Used generically to specify nucleic acids.

  • ffn – FASTA nucleotide of gene regions. Contains coding regions for a genome.

Cite:

Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.
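As a rough sketch of what an ORF-prediction call of this kind could look like, the helper below assembles a Prodigal command line from the same three arguments. The helper name build_prodigal_cmd and the output file naming (basename plus .faa, .ffn and .gff) are assumptions for illustration, not microbetag's actual behaviour; -i, -a, -d, -o and -f are standard Prodigal options.

```python
from pathlib import Path
from typing import List


def build_prodigal_cmd(fasta: str, basename: str, outdir: str = "ORFs") -> List[str]:
    """Assemble a Prodigal command; output file names are hypothetical."""
    out = Path(outdir)
    return [
        "prodigal",
        "-i", fasta,                          # input nucleotide FASTA (.fna)
        "-a", str(out / f"{basename}.faa"),   # translated protein sequences
        "-d", str(out / f"{basename}.ffn"),   # nucleotide sequences of predicted genes
        "-o", str(out / f"{basename}.gff"),   # gene coordinates
        "-f", "gff",                          # format of the -o file
    ]


# Illustrative input name only.
prodigal_cmd = build_prodigal_cmd("bin1.fna", "bin1")
print(" ".join(prodigal_cmd))
```

The .faa file written through -a is the kind of input that kegg_annotation() expects in its faa parameter.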

microbetag.tools.kegg_annotation(faa: str, basename: str, out_dir: str, db_dir: str, ko_dic: dict, threads: int) → bool

Function to perform KEGG annotation in parallel. The function invokes hmmsearch.

Parameters:
  • faa – Filepath to the .faa file of the bin in process

  • basename – Bin id

  • out_dir – Path to output directory where .hmmout files will be stored

  • db_dir – Path to KEGG database directory

  • ko_dic

  • threads – Number of threads to be used
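The parallel pattern described above, one single-CPU hmmsearch job per profile dispatched through a multiprocessing.Pool, might be sketched as follows. Several details are assumptions made only for illustration: that ko_dic is keyed by KO identifier, that db_dir holds one <KO>.hmm profile per key, the <basename>.<KO>.hmmout naming, and the helper names build_jobs and run_jobs. None of these are confirmed by the documentation above.

```python
from multiprocessing import Pool
from pathlib import Path
from typing import Callable, Dict, List


def build_jobs(faa: str, basename: str, out_dir: str, db_dir: str,
               ko_dic: Dict[str, object]) -> List[List[str]]:
    """Create one parameter list per KO profile (hypothetical layout)."""
    jobs = []
    for ko in sorted(ko_dic):
        hmmout = str(Path(out_dir) / f"{basename}.{ko}.hmmout")  # assumed naming
        profile = str(Path(db_dir) / f"{ko}.hmm")                # assumed database layout
        jobs.append([hmmout, profile, faa])
    return jobs


def run_jobs(jobs: List[List[str]], worker: Callable[[List[str]], None],
             threads: int) -> bool:
    """Run the worker on every job using `threads` processes.

    Returns True if all jobs finished without raising; what the real
    function's bool means is not documented here.
    """
    with Pool(processes=threads) as pool:
        pool.map(worker, jobs)
    return True


# Illustrative values only; run_jobs is deliberately not called here.
jobs = build_jobs("bin1.faa", "bin1", "hmmout", "kegg_db",
                  {"K00002": None, "K00001": None})
print(len(jobs), "jobs prepared")
```

Because every worker process runs its own search, capping each search at one CPU keeps the total load close to the threads value.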

microbetag.tools.phenotrex_genotype(config: microbetag.config.Config) → None

Runs the compute-genotype program of phenotrex to get COGs present in the list of genomes under study.

Cite:

Feldbauer R, Schulz F, Horn M, Rattei T. Prediction of microbial phenotypes based on comparative genomics. BMC bioinformatics. 2015 Dec;16:1-8. https://phenotrex.readthedocs.io

microbetag.tools.phenotrex_predict(config: microbetag.config.Config) → None

Runs the predict program of phenotrex to predict whether a genome has a trait.

Cite:

Feldbauer R, Schulz F, Horn M, Rattei T. Prediction of microbial phenotypes based on comparative genomics. BMC bioinformatics. 2015 Dec;16:1-8. https://phenotrex.readthedocs.io

microbetag.tools.run_manta(config: microbetag.config.Config) → None

Runs the manta package to perform network clustering.

Cite:

Röttjers L, Faust K. Manta: A clustering algorithm for weighted ecological networks. mSystems. 2020;5(1):e00903-19.

microbetag.tools.run_flashweave(config: microbetag.config.Config) → None

Runs FlashWeave to infer co-occurrence network.

Cite:

Tackmann J, Rodrigues JF, von Mering C. Rapid inference of direct interactions in large-scale ecological networks from heterogeneous microbial sequencing data. Cell systems. 2019 Sep 25;9(3):286-96.
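FlashWeave is a Julia package, so a Python wrapper typically has to launch Julia. The sketch below shows one way this could be done, by building a one-line Julia script around FlashWeave's learn_network and save_network functions. How microbetag actually invokes FlashWeave is not documented here; the helper name build_flashweave_cmd, the file names and the default keyword values are assumptions for illustration.

```python
from typing import List


def build_flashweave_cmd(abundance_table: str, network_out: str,
                         sensitive: bool = True,
                         heterogeneous: bool = False) -> List[str]:
    """Build a `julia -e` command that infers and saves a network."""
    script = (
        "using FlashWeave; "
        f'netw = learn_network("{abundance_table}", '
        f"sensitive={str(sensitive).lower()}, "          # Julia booleans are lowercase
        f"heterogeneous={str(heterogeneous).lower()}); "
        f'save_network("{network_out}", netw)'
    )
    return ["julia", "-e", script]


# Illustrative file names only.
fw_cmd = build_flashweave_cmd("abundance.tsv", "network.edgelist")
print(fw_cmd[2])
```

Passing the list to subprocess.run avoids shell quoting problems with the embedded Julia string.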

microbetag.tools.run_faprotax(config: microbetag.config.Config) → None

Runs the FAPROTAX collapse_table.py script to assign functions to taxa based on their taxonomy, using literature-derived annotations.

Cite:

Louca S, Parfrey LW, Doebeli M. Decoupling function and taxonomy in the global ocean microbiome. Science. 2016 Sep 16;353(6305):1272-7.
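For orientation, a collapse_table.py call could be assembled roughly as below. The helper name build_faprotax_cmd, the file names and the taxonomy column name are assumptions for illustration and say nothing about the values microbetag takes from its Config; -i, -g, -o, -r, -d and -v are options of the FAPROTAX script.

```python
from typing import List


def build_faprotax_cmd(script: str, otu_table: str, groups_db: str,
                       out_table: str, report: str,
                       taxonomy_column: str = "taxonomy") -> List[str]:
    """Assemble a collapse_table.py command (all paths are hypothetical)."""
    return [
        "python3", script,
        "-i", otu_table,          # input abundance table with a taxonomy column
        "-g", groups_db,          # FAPROTAX database file (FAPROTAX.txt)
        "-o", out_table,          # function-by-sample output table
        "-r", report,             # report listing which taxa matched each function
        "-d", taxonomy_column,    # column holding the taxon names
        "-v",                     # verbose output
    ]


# Illustrative paths only.
fap_cmd = build_faprotax_cmd("collapse_table.py", "otu_table.tsv",
                             "FAPROTAX.txt", "functional_table.tsv",
                             "faprotax_report.txt")
print(" ".join(fap_cmd))
```

The report written through -r is what links each taxon to the functions assigned to it.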