microbetag.config

Classes

Config

Parses a microbetag configuration file (YAML) to initialize a microbetag run.

Functions

get_value(conf, key[, default])

Retrieves the 'value' field for a given key from a nested configuration dictionary.

load_config(yaml_file)

load_abundance(abd_file) → tuple[pandas.DataFrame, str, str, str]

Load a tsv/csv format abundance table, assuming the sequence id is provided in the first column and the taxonomy in the last one.

Module Contents

class microbetag.config.Config(conf: dict, config_file: str = None)

Parses a microbetag configuration file (YAML) to initialize a microbetag run.

Parameters:
  • conf – A dictionary into which the YAML configuration file has been loaded.

  • config_file – Filepath to the configuration YAML file.

Attention

It is essential to use the configuration template file that corresponds to the microbetag version you are using. Otherwise, the Config class will fail to create an instance and microbetag will exit. Configuration templates for each microbetag version are available at: https://github.com/hariszaf/microbetag/tree/fix-phylomint/config_files

Example

>>> import yaml
>>> from microbetag.config import Config
>>> with open(args.config, "r") as yaml_file:
...     yaml_conf = yaml.safe_load(yaml_file)
>>> conf = Config(yaml_conf, args.config)

export_to_log(log_file='parameters.log')

Dumps the Config instance to a JSON file (parameters.log by default).

microbetag.config.get_value(conf, key, default=None)

Retrieves the ‘value’ field for a given key from a nested configuration dictionary.

Returns: any: The value stored at conf[key]['value'], or the provided default if the key is not found or the value is None.
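
The lookup described above can be sketched as follows. This is a minimal illustration of the documented behaviour, not the microbetag source: the function name get_value_sketch and the keys threads and output_dir are hypothetical, and the configuration layout (each key mapping to a small dictionary with a 'value' field) is assumed from the description.

```python
def get_value_sketch(conf, key, default=None):
    """Hypothetical stand-in that mirrors the documented behaviour of get_value."""
    entry = conf.get(key)
    # A missing key, or an entry that is not a mapping, falls back to the default.
    if not isinstance(entry, dict):
        return default
    value = entry.get("value")
    # A missing or None 'value' field also falls back to the default.
    return default if value is None else value


# Assumed layout: every parameter is a mapping that holds a 'value' field.
# The key names below are made up for illustration.
conf = {
    "threads": {"value": 4},
    "output_dir": {"value": None},
}

threads = get_value_sketch(conf, "threads", default=1)
output_dir = get_value_sketch(conf, "output_dir", default="results")
missing = get_value_sketch(conf, "not_there", default="fallback")
```

Treating None the same as a missing key lets a template leave a parameter empty without the caller having to distinguish the two cases.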

microbetag.config.load_config(yaml_file)

microbetag.config.load_abundance(abd_file: str) → tuple[pandas.DataFrame, str, str, str]

Load a tsv/csv format abundance table, assuming the sequence id is provided in the first column and the taxonomy in the last one.

Parameters:

abd_file – Filepath to abundance table file.

Returns:

A tuple including:

  • seq_id2tax: A pandas.DataFrame with the sequence ids and their corresponding taxonomy

  • seq_id_col: The name of the column with the sequence identifier (e.g. seqId)

  • tax_col: The name of the column with the taxonomy

Return type:

tuple[pandas.DataFrame, str, str, str]
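
The column convention that load_abundance relies on (sequence id first, taxonomy last) can be illustrated with a small standard-library sketch. It is not the microbetag implementation: the helper name split_abundance_columns, the sample table, and the use of a plain dictionary in place of a pandas.DataFrame are all assumptions made to keep the example self-contained.

```python
import csv
import io


def split_abundance_columns(handle, delimiter="\t"):
    """Hypothetical helper: return (seq_id -> taxonomy, seq_id_col, tax_col)."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    # The first column names the sequence id, the last one the taxonomy.
    seq_id_col, tax_col = header[0], header[-1]
    # Skip empty rows; keep only the first and last field of each record.
    seq_id2tax = {row[0]: row[-1] for row in reader if row}
    return seq_id2tax, seq_id_col, tax_col


# Made-up table: sequence id first, two sample columns, taxonomy last.
table = (
    "seqId\tsample_1\tsample_2\ttaxonomy\n"
    "ASV_1\t10\t0\tBacteria;Firmicutes\n"
    "ASV_2\t3\t7\tBacteria;Proteobacteria\n"
)

seq_id2tax, seq_id_col, tax_col = split_abundance_columns(io.StringIO(table))
```

A table that places the taxonomy anywhere other than the last column would be misread under this convention, so it is worth checking the column order before loading.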