microbetag.tools
Functions invoking other software/tools in the microbetag pipeline.
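Functions
| run_seed_complementarity(config) | Based on the type of files to be used for the seed complementarity step, adjusts the inner architecture of the files and exports seed complementarities. |
| hmmsearch(params) | Function to invoke hmmsearch software. |
| run_prodigal(fasta, basename, outdir) | Function to predict ORFs using Prodigal. |
| kegg_annotation(faa, basename, out_dir, db_dir, ko_dic, threads) | Function to perform KEGG annotation in parallel. |
| phenotrex_genotype(config) | Runs the compute-genotype program of phenotrex to get COGs present in the list of genomes under study. |
| phenotrex_predict(config) | Runs the predict() program of phenotrex to predict whether a genome has a trait or not. |
| run_manta(config) | Runs the manta package to perform network clustering. |
| run_flashweave(config) | Runs FlashWeave to infer co-occurrence network. |
| run_faprotax(config) | Runs FAPROTAX collapse_table.py script to annotate taxa based on their taxonomy using literature. |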
Module Contents
- microbetag.tools.run_seed_complementarity(config: microbetag.config.Config) → None
Adjusts the inner architecture of the input files according to the type of files used for the seed complementarity step, then invokes ExportSeedComplementarities to get the seed and non-seed sets and, based on those, exports seed complementarities.
- Parameters:
config – Instance of the microbetag Config class
- microbetag.tools.hmmsearch(params: List) → None
Function to invoke hmmsearch software.
- Parameters:
params – list of parameters to be passed to the hmmsearch function.
Note
Each hmmsearch call uses only 1 CPU, since parallelism comes from the multiprocessing.Pool in kegg_annotation().
- Cite:
HMMER 3.4 (Aug 2023); http://hmmer.org/ Copyright (C) 2023 Howard Hughes Medical Institute. Freely distributed under the BSD open source license.
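As an illustration of the single-CPU note above, here is a minimal sketch of how one hmmsearch command line could be assembled. The helper name build_hmmsearch_cmd, the E-value default, and the choice of --tblout as the output format are assumptions made for this example; they are not taken from the microbetag source. The flags themselves (--cpu, --tblout, -E, -o) are standard hmmsearch options.

```python
import shlex
from typing import List


def build_hmmsearch_cmd(hmm_profile: str, faa: str, tblout: str,
                        evalue: float = 1e-5) -> List[str]:
    """Return the argv for one single-threaded hmmsearch run (hypothetical helper)."""
    return [
        "hmmsearch",
        "--cpu", "1",          # one worker thread per call; the pool provides parallelism
        "--tblout", tblout,    # per-sequence hits as a parseable table
        "-E", str(evalue),     # assumed E-value cutoff, not microbetag's setting
        "-o", "/dev/null",     # discard the human-readable report
        hmm_profile,
        faa,
    ]


# File names below are placeholders.
cmd = build_hmmsearch_cmd("K00001.hmm", "bin_1.faa", "bin_1.K00001.hmmout")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Returning the argument list instead of executing it keeps the sketch testable on machines where HMMER is not installed.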
- microbetag.tools.run_prodigal(fasta: str, basename: str, outdir: str) → None
Function to predict ORFs using Prodigal. By default, outdir is the ORFs folder.
File extensions:
fna – FASTA nucleic acid; used generically to specify nucleic acids.
ffn – FASTA nucleotide of gene regions; contains coding regions for a genome.
- Cite:
Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC bioinformatics. 2010 Dec;11:1-1.
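To make the file extensions above concrete, the following sketch assembles a Prodigal command that reads a genome FASTA and writes protein translations (.faa), gene nucleotide sequences (.ffn), and gene coordinates (.gff). The helper name build_prodigal_cmd, the output file naming, and the "ORFs" default are assumptions for illustration only; the actual run_prodigal may differ. The flags -i, -a, -d, -o, and -f are standard Prodigal options.

```python
from pathlib import Path
from typing import List


def build_prodigal_cmd(fasta: str, basename: str, outdir: str = "ORFs") -> List[str]:
    """Return the argv for one Prodigal run (hypothetical helper)."""
    out = Path(outdir)
    return [
        "prodigal",
        "-i", fasta,                                  # input genome (.fna)
        "-a", (out / f"{basename}.faa").as_posix(),   # protein translations
        "-d", (out / f"{basename}.ffn").as_posix(),   # nucleotide sequences of genes
        "-o", (out / f"{basename}.gff").as_posix(),   # gene coordinates
        "-f", "gff",                                  # coordinate output format
    ]


# File names below are placeholders.
prodigal_cmd = build_prodigal_cmd("bin_1.fna", "bin_1")
print(" ".join(prodigal_cmd))
# To actually run it: subprocess.run(prodigal_cmd, check=True)
```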
- microbetag.tools.kegg_annotation(faa: str, basename: str, out_dir: str, db_dir: str, ko_dic: dict, threads: int) → bool
Function to perform KEGG annotation in parallel. The function invokes hmmsearch.
- Parameters:
faa – Filepath to the .faa file of the bin in process
basename – Bin id
out_dir – Path to output directory where .hmmout files will be stored
db_dir – Path to KEGG database directory
ko_dic
threads – Number of threads to be used
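The parallel pattern described above (one single-CPU hmmsearch call per profile, distributed over a multiprocessing.Pool) could be sketched as follows. This is not the microbetag implementation: the helpers build_jobs and run_jobs are hypothetical, and the sketch assumes that the keys of ko_dic are KO identifiers, that profiles are stored as <db_dir>/<KO>.hmm, and that outputs are named <basename>_<KO>.hmmout.

```python
import multiprocessing
from pathlib import Path
from typing import Callable, Dict, List


def build_jobs(faa: str, basename: str, out_dir: str, db_dir: str,
               ko_dic: Dict[str, object]) -> List[List[str]]:
    """Build one parameter list per KO profile (hypothetical layout)."""
    jobs = []
    for ko in sorted(ko_dic):
        profile = Path(db_dir, f"{ko}.hmm").as_posix()               # assumed profile path
        hmmout = Path(out_dir, f"{basename}_{ko}.hmmout").as_posix()  # assumed output name
        jobs.append([profile, faa, hmmout])
    return jobs


def run_jobs(jobs: List[List[str]], worker: Callable[[List[str]], None],
             threads: int) -> None:
    """Distribute the jobs over a process pool; each job runs on one CPU."""
    with multiprocessing.Pool(processes=threads) as pool:
        pool.map(worker, jobs)


# Placeholder inputs; the dictionary values are not used by this sketch.
jobs = build_jobs("bin_1.faa", "bin_1", "hmmout", "kofam/profiles",
                  {"K00002": None, "K00001": None})
print(jobs[0])
```

A worker passed to run_jobs must be a module-level function so that it can be pickled, and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard.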
- microbetag.tools.phenotrex_genotype(config: microbetag.config.Config) → None
Runs the compute-genotype program of phenotrex to get COGs present in the list of genomes under study.
- Cite:
Feldbauer R, Schulz F, Horn M, Rattei T. Prediction of microbial phenotypes based on comparative genomics. BMC bioinformatics. 2015 Dec;16:1-8. https://phenotrex.readthedocs.io
- microbetag.tools.phenotrex_predict(config: microbetag.config.Config) → None
Runs the predict() program of phenotrex to predict whether a genome has a trait or not.
- Cite:
Feldbauer R, Schulz F, Horn M, Rattei T. Prediction of microbial phenotypes based on comparative genomics. BMC bioinformatics. 2015 Dec;16:1-8. https://phenotrex.readthedocs.io
- microbetag.tools.run_manta(config: microbetag.config.Config) → None
Runs the manta package to perform network clustering.
- Cite:
Röttjers L, Faust K. Manta: A clustering algorithm for weighted ecological networks. Msystems. 2020 Feb 25;5(1):10-128.
- microbetag.tools.run_flashweave(config: microbetag.config.Config) → None
Runs FlashWeave to infer co-occurrence network.
- Cite:
Tackmann J, Rodrigues JF, von Mering C. Rapid inference of direct interactions in large-scale ecological networks from heterogeneous microbial sequencing data. Cell systems. 2019 Sep 25;9(3):286-96.
- microbetag.tools.run_faprotax(config: microbetag.config.Config) → None
Runs FAPROTAX collapse_table.py script to annotate taxa based on their taxonomy using literature.
- Cite:
Louca S, Parfrey LW, Doebeli M. Decoupling function and taxonomy in the global ocean microbiome. Science. 2016 Sep 16;353(6305):1272-7.