parse_module_definitions

author: Haris Zafeiropoulos

package: microbetag

description: This script builds all the unique sets of KO terms that can be used to build up each KEGG module (https://www.genome.jp/brite/ko00002).

output: A two-level .json file. The first level denotes the steps of a module, and the second level shows the alternative combinations of terms. All the terms of a combination are necessary for the module to be complete.

notes: The pathway module is defined by the logical expression of K numbers, and the signature module is defined by the logical expression of K numbers and M numbers. A SPACE ( ) or a PLUS (+) sign, representing a connection in the pathway or the molecular complex, is treated as an AND operator, and a COMMA (,), used for alternatives, is treated as an OR operator. A MINUS (-) sign designates an optional item in the complex. This script was inspired by two functions from the following MicrobeAnnotator script: https://github.com/cruizperez/MicrobeAnnotator/tree/master/microbeannotator/data/01.KEGG_DB/00.KEGG_Data_Scrapper.py
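The operator rules in the notes can be illustrated with a small recursive expander. This is only a sketch and not the code of this module: `split_top_level` and `expand` are hypothetical names, and the sketch assumes that, within whatever expression it is given, a comma binds more loosely than a space or a plus, and that everything after a top-level minus is optional and can be dropped.

```python
from itertools import product

# Hypothetical helpers for illustration; they are not part of parse_module_definitions.


def split_top_level(expr, separators):
    """Split expr at separator characters that sit outside all parentheses."""
    parts, depth, current = [], 0, ""
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if depth == 0 and ch in separators:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts


def expand(expr):
    """Return the KO sets (as frozensets) that each satisfy expr."""
    expr = expr.strip()
    if not expr:
        return [frozenset()]
    alternatives = split_top_level(expr, ",")
    if len(alternatives) > 1:  # COMMA = OR: pool the alternatives
        return [combo for alt in alternatives for combo in expand(alt)]
    factors = split_top_level(expr, " +")
    if len(factors) > 1:  # SPACE / PLUS = AND: one choice from every factor
        choices = product(*(expand(f) for f in factors))
        return [frozenset().union(*choice) for choice in choices]
    required = split_top_level(expr, "-")
    if len(required) > 1:  # MINUS = optional: keep only the required part
        return expand(required[0])
    if expr.startswith("(") and expr.endswith(")"):
        return expand(expr[1:-1])  # unwrap one pair of parentheses
    return [frozenset([expr])]  # a single K (or M) number


for combo in expand("K00941 (K00788,K21220)"):
    print(sorted(combo))
# ['K00788', 'K00941']
# ['K00941', 'K21220']
```

Each returned set is one way to satisfy the expression, so a definition with an OR yields several sets while a pure AND chain yields exactly one.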

Attributes

Functions

flatten(lis)

Takes a nested list and returns its contents as a flat list

parse_commas_on_pre_and_post_character(string)

Takes a string and returns the independent scenarios separated by commas (,)

check_if_all_in_one_par(string)

Tells whether a whole string is enclosed in a single pair of parentheses

get_independent_step_alternatives(step_as_a_list)

Takes a complete step and recursively returns its unique independent parts

split_to_independent_chunks(string)

Takes a string and returns the indices where it can be split into parts that can be combined

parse(my_string)

Parses a module's definition into its main steps

parse_regular_module_dictionary(module_components_raw, ...)

Breaks down a module to its steps using the parse() function

create_final_regular_dictionary(module_steps_parsed)

Returns all the possible combinations of KOs that make a KEGG module complete

Module Contents

parse_module_definitions.structurals = ['M00144', 'M00149', 'M00151', 'M00152', 'M00154', 'M00155', 'M00153', 'M00156', 'M00158', 'M00160']

parse_module_definitions.flatten(lis)

Takes a nested list and returns its contents as a flat list, e.g. [[a,b,c],[d,e]] -> [a,b,c,d,e]
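A minimal sketch of this behaviour, assuming the nesting can be arbitrarily deep and that only `list` objects count as nested containers. The name `flatten_sketch` is hypothetical and this is not necessarily how the module implements `flatten`:

```python
def flatten_sketch(lis):
    """Recursively yield the leaf items of a nested list, in order."""
    for item in lis:
        if isinstance(item, list):
            # Descend into the sub-list and yield its leaves one by one.
            yield from flatten_sketch(item)
        else:
            yield item


print(list(flatten_sketch([["a", "b", "c"], ["d", "e"]])))
# ['a', 'b', 'c', 'd', 'e']
```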

parse_module_definitions.parse_commas_on_pre_and_post_character(string)

Takes a string and returns the independent scenarios separated by commas (,), e.g. K02304,(K24866+K03794) -> ['K02304', '(K24866+K03794)']
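One way to obtain this kind of split is to track the parenthesis depth and cut only at commas found at depth zero. The sketch below reproduces the documented example under that assumption; `split_on_top_level_commas` is a hypothetical name, not the module's function:

```python
def split_on_top_level_commas(string):
    """Split at commas that are not nested inside any parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(string):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(string[start:i])
            start = i + 1  # next scenario begins after the comma
    parts.append(string[start:])
    return parts


print(split_on_top_level_commas("K02304,(K24866+K03794)"))
# ['K02304', '(K24866+K03794)']
```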

parse_module_definitions.check_if_all_in_one_par(string)

Tells whether a whole string is enclosed in a single pair of parentheses, e.g. ((K00705,K22451)_(K02438,K01200)), or not, e.g. K00975_(K00703,K13679,K20812)
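The check can be sketched by following the parenthesis depth: the string is fully enclosed only if the opening parenthesis at position 0 is closed by the very last character. The function name below is hypothetical and the sketch assumes balanced parentheses in the input:

```python
def is_wrapped_in_one_parenthesis(string):
    """True if the first '(' is matched by the final ')' of the string."""
    if not (string.startswith("(") and string.endswith(")")):
        return False
    depth = 0
    for i, ch in enumerate(string):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # Closing the outermost group before the end means there are
            # several top-level groups, e.g. "(A)_(B)".
            if depth == 0 and i < len(string) - 1:
                return False
    return depth == 0


print(is_wrapped_in_one_parenthesis("((K00705,K22451)_(K02438,K01200))"))  # True
print(is_wrapped_in_one_parenthesis("K00975_(K00703,K13679,K20812)"))      # False
```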

parse_module_definitions.get_independent_step_alternatives(step_as_a_list)

Takes a complete step and recursively returns its unique independent parts. For example, for ((K13939,(K13940,K01633 K00950) K00796),(K01633 K13941)) the first round yields (K01633 K13941) as an independent alternative, while the second round yields K13939.
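The two rounds described above can be reproduced by repeatedly removing one enclosing pair of parentheses and splitting at the top-level commas. This is a sketch of the idea only: `wrapped` and `alternatives` are hypothetical helpers that operate on a string, whereas the documented function takes a list and may work differently.

```python
def wrapped(expr):
    """True if the first '(' closes only at the very last character."""
    if not expr.startswith("("):
        return False
    depth = 0
    for i, ch in enumerate(expr):
        depth += (ch == "(") - (ch == ")")
        if depth == 0:
            return i == len(expr) - 1
    return False


def alternatives(expr):
    """One round: strip an enclosing pair of parentheses, split at top-level commas."""
    if wrapped(expr):
        expr = expr[1:-1]
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        depth += (ch == "(") - (ch == ")")
        if ch == "," and depth == 0:
            parts.append(expr[start:i])
            start = i + 1
    parts.append(expr[start:])
    return parts


step = "((K13939,(K13940,K01633 K00950) K00796),(K01633 K13941))"
first_round = alternatives(step)
print(first_round)
# ['(K13939,(K13940,K01633 K00950) K00796)', '(K01633 K13941)']
print(alternatives(first_round[0]))
# ['K13939', '(K13940,K01633 K00950) K00796']
```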

parse_module_definitions.split_to_independent_chunks(string)

Takes a string and returns the indices where it can be split into parts that can be combined independently to get the corresponding part of the KEGG module definition, e.g. "K00941_(K00788,K21220)" -> [0, 7, 22] or "((K03831,K03638)_K03750)" -> [0, 24]
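Both documented examples are consistent with placing a boundary right after every underscore that lies outside all parentheses, plus the start and the end of the string. The sketch below works under that assumption; `chunk_boundaries` is a hypothetical name and the real function may consider other separators as well.

```python
def chunk_boundaries(definition, sep="_"):
    """Return split indices: 0, the position after each top-level sep, and the length."""
    boundaries, depth = [0], 0
    for i, ch in enumerate(definition):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            boundaries.append(i + 1)  # the next chunk starts after the separator
    boundaries.append(len(definition))
    return boundaries


print(chunk_boundaries("K00941_(K00788,K21220)"))    # [0, 7, 22]
print(chunk_boundaries("((K03831,K03638)_K03750)"))  # [0, 24]
```

Consecutive pairs of indices can then be used as slice bounds, e.g. `definition[7:22]` gives `(K00788,K21220)` for the first example.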

parse_module_definitions.parse(my_string)

Parses a module's definition into its main steps, e.g. the definition

(K02303,K13542) (K03394,K13540) K02229 (K05934,K13540,K13541) K05936 K02228 K05895 K00595 K06042 K02224 K02230+K09882+K09883

is parsed into

['(K02303,K13542)', '(K03394,K13540)', 'K02229', '(K05934,K13540,K13541)', 'K05936', 'K02228', 'K05895', 'K00595', 'K06042', 'K02224', 'K02230+K09882+K09883']
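The example above corresponds to splitting the definition at every space that is not nested inside parentheses. A minimal sketch under that assumption follows; `split_into_steps` is a hypothetical name and not the module's `parse()`:

```python
def split_into_steps(definition):
    """Split a module definition at spaces that lie outside all parentheses."""
    steps, depth, current = [], 0, ""
    for ch in definition:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == " " and depth == 0:
            if current:  # ignore repeated spaces
                steps.append(current)
            current = ""
        else:
            current += ch
    if current:
        steps.append(current)
    return steps


definition = (
    "(K02303,K13542) (K03394,K13540) K02229 (K05934,K13540,K13541) "
    "K05936 K02228 K05895 K00595 K06042 K02224 K02230+K09882+K09883"
)
steps = split_into_steps(definition)
print(len(steps))  # 11
print(steps[0], steps[-1])
```

A space inside parentheses, as in `(K13940,K01633 K00950)`, stays within its step because the depth is non-zero there.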

parse_module_definitions.parse_regular_module_dictionary(module_components_raw, structural_list)

Breaks down a module to its steps using the parse() function

parse_module_definitions.create_final_regular_dictionary(module_steps_parsed)

Returns all the possible combinations of KOs that make a KEGG module complete
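Since every step must be satisfied and each step offers one or more alternatives, the complete combinations are the Cartesian product of the per-step alternatives. The sketch below illustrates this with `itertools.product`; the input layout (a dict mapping step names to lists of alternatives, each a list of KOs) and the name `module_combinations` are assumptions made for the example, not the module's actual data structure.

```python
from itertools import product


def module_combinations(steps):
    """Pick one alternative per step and merge them into one sorted KO list."""
    per_step = [steps[name] for name in steps]  # keep the steps in insertion order
    return [sorted(set().union(*choice)) for choice in product(*per_step)]


# Hypothetical two-step module: step_2 can be completed in two alternative ways.
example_steps = {
    "step_1": [["K00941"]],
    "step_2": [["K00788"], ["K21220"]],
}
print(module_combinations(example_steps))
# [['K00788', 'K00941'], ['K00941', 'K21220']]
```

The number of combinations is the product of the number of alternatives per step, so modules with many alternative-rich steps can yield large outputs.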

parse_module_definitions.modules
parse_module_definitions.module_components_raw
parse_module_definitions.definition
parse_module_definitions.module_steps_parsed
parse_module_definitions.P
parse_module_definitions.q
parse_module_definitions.module