microbetag.tools
================

.. py:module:: microbetag.tools

.. autoapi-nested-parse::

   Functions invoking other software/tools in the microbetag pipeline.


Functions
---------

.. autoapisummary::

   microbetag.tools.run_seed_complementarity
   microbetag.tools.hmmsearch
   microbetag.tools.run_prodigal
   microbetag.tools.kegg_annotation
   microbetag.tools.phenotrex_genotype
   microbetag.tools.phenotrex_predict
   microbetag.tools.run_manta
   microbetag.tools.run_flashweave
   microbetag.tools.run_faprotax


Module Contents
---------------

.. py:function:: run_seed_complementarity(config: microbetag.config.Config) -> None

   Adjusts the internal layout of the input files according to the type of files used for the seed complementarity step,
   then invokes :class:`.ExportSeedComplementarities` to get the seed and non-seed sets and, based on those, exports the seed complementarities.

   :param config: Instance of the microbetag :class:`.Config` class


.. py:function:: hmmsearch(params: List) -> None

   Invokes the hmmsearch software.

   :param params: List of parameters to be passed to hmmsearch.

   .. note::

      Only 1 CPU is used per call, since a ``multiprocessing.Pool`` is already used in :func:`kegg_annotation`.

   Cite: HMMER 3.4 (Aug 2023); http://hmmer.org/
   Copyright (C) 2023 Howard Hughes Medical Institute.
   Freely distributed under the BSD open source license.


.. py:function:: run_prodigal(fasta: str, basename: str, outdir: str) -> None

   Predicts ORFs using Prodigal.
   By default, outdir is the ORFs folder.

   File extensions:

   - fna: FASTA nucleic acid. Used generically to specify nucleic acids.
   - ffn: FASTA nucleotide of gene regions. Contains coding regions for a genome.

   Cite: Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ.
   Prodigal: prokaryotic gene recognition and translation initiation site identification.
   BMC bioinformatics. 2010 Dec;11:1-1.


.. py:function:: kegg_annotation(faa: str, basename: str, out_dir: str, db_dir: str, ko_dic: dict, threads: int) -> bool

   Performs KEGG annotation in parallel.
   The function invokes `hmmsearch`.

   :param faa: Filepath to the .faa file of the bin in process
   :param basename: Bin id
   :param out_dir: Path to the output directory where .hmmout files will be stored
   :param db_dir: Path to the KEGG database directory
   :param ko_dic:
   :param threads: Number of threads to be used


.. py:function:: phenotrex_genotype(config: microbetag.config.Config) -> None

   Runs the `compute-genotype` program of phenotrex to get the COGs present in the list of genomes under study.

   Cite: Feldbauer R, Schulz F, Horn M, Rattei T.
   Prediction of microbial phenotypes based on comparative genomics.
   BMC bioinformatics. 2015 Dec;16:1-8.
   https://phenotrex.readthedocs.io


.. py:function:: phenotrex_predict(config: microbetag.config.Config) -> None

   Runs the :func:`predict` program of `phenotrex` to predict whether a genome has a trait or not.

   Cite: Feldbauer R, Schulz F, Horn M, Rattei T.
   Prediction of microbial phenotypes based on comparative genomics.
   BMC bioinformatics. 2015 Dec;16:1-8.
   https://phenotrex.readthedocs.io


.. py:function:: run_manta(config: microbetag.config.Config) -> None

   Runs the manta package to perform network clustering.

   Cite: Röttjers L, Faust K.
   Manta: A clustering algorithm for weighted ecological networks.
   Msystems. 2020 Feb 25;5(1):10-128.


.. py:function:: run_flashweave(config: microbetag.config.Config) -> None

   Runs FlashWeave to infer a co-occurrence network.

   Cite: Tackmann J, Rodrigues JF, von Mering C.
   Rapid inference of direct interactions in large-scale ecological networks from heterogeneous microbial sequencing data.
   Cell systems. 2019 Sep 25;9(3):286-96.


.. py:function:: run_faprotax(config: microbetag.config.Config) -> None

   Runs the FAPROTAX collapse_table.py script to annotate taxa based on their taxonomy using literature.

   Cite: Louca S, Parfrey LW, Doebeli M.
   Decoupling function and taxonomy in the global ocean microbiome.
   Science. 2016 Sep 16;353(6305):1272-7.
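

The note on ``hmmsearch`` (one CPU per call, with parallelism coming from the pool in ``kegg_annotation``) can be illustrated with a minimal sketch. This is not microbetag's actual implementation: the helper names ``build_hmmsearch_cmd``, ``run_cmd`` and ``annotate_in_parallel`` are hypothetical, and writing one ``.hmmout`` table per profile via ``--tblout`` is an assumption made for illustration only.

```python
import subprocess
from multiprocessing import Pool
from pathlib import Path
from typing import List


def build_hmmsearch_cmd(hmm_file: str, faa: str, out_dir: str) -> List[str]:
    """Build one single-CPU hmmsearch command (hypothetical helper)."""
    # Assumption: one tabular output file per HMM profile, named <profile>.hmmout
    out = Path(out_dir) / (Path(hmm_file).stem + ".hmmout")
    # --cpu 1: each call stays single-threaded; the Pool provides the parallelism
    return ["hmmsearch", "--cpu", "1", "--tblout", str(out), hmm_file, faa]


def run_cmd(cmd: List[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def annotate_in_parallel(hmm_files: List[str], faa: str, out_dir: str, threads: int) -> bool:
    """Run one hmmsearch per profile across `threads` worker processes."""
    cmds = [build_hmmsearch_cmd(h, faa, out_dir) for h in hmm_files]
    with Pool(processes=threads) as pool:
        exit_codes = pool.map(run_cmd, cmds)
    # True only if every hmmsearch call exited successfully
    return all(code == 0 for code in exit_codes)
```

Keeping each call at ``--cpu 1`` avoids oversubscribing cores: with ``threads`` worker processes, at most ``threads`` single-threaded searches run at the same time.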