parse_module_definitions
========================

.. py:module:: parse_module_definitions

.. autoapi-nested-parse::

   author: Haris Zafeiropoulos

   package: microbetag

   description:
   The aim of this script is to build all the unique sets of KO terms that can be used to build up each KEGG module
   (https://www.genome.jp/brite/ko00002).

   output:
   A two-level .json file. The first level denotes the steps of a module and the second level lists the alternative
   combinations of terms for each step.
   All the terms of a combination are necessary for the module to be complete.

   notes:
   The pathway module is defined by the logical expression of K numbers,
   and the signature module is defined by the logical expression of K numbers and M numbers.
   A SPACE ( ) or a PLUS (+) sign, representing a connection in the pathway or the molecular complex,
   is treated as an AND operator and a COMMA (,), used for alternatives, is treated as an OR operator.
   A MINUS (-) sign designates an optional item in the complex.

   This script was inspired by 2 functions from the following script of MicrobeAnnotator:
   https://github.com/cruizperez/MicrobeAnnotator/tree/master/microbeannotator/data/01.KEGG_DB/00.KEGG_Data_Scrapper.py


Attributes
----------

.. autoapisummary::

   parse_module_definitions.structurals
   parse_module_definitions.modules
   parse_module_definitions.module_components_raw
   parse_module_definitions.definition
   parse_module_definitions.module_steps_parsed
   parse_module_definitions.P
   parse_module_definitions.q
   parse_module_definitions.module


Functions
---------

.. autoapisummary::

   parse_module_definitions.flatten
   parse_module_definitions.parse_commas_on_pre_and_post_character
   parse_module_definitions.check_if_all_in_one_par
   parse_module_definitions.get_independent_step_alternatives
   parse_module_definitions.split_to_independent_chunks
   parse_module_definitions.parse
   parse_module_definitions.parse_regular_module_dictionary
   parse_module_definitions.create_final_regular_dictionary


Module Contents
---------------

.. py:data:: structurals
   :value: ['M00144', 'M00149', 'M00151', 'M00152', 'M00154', 'M00155', 'M00153', 'M00156', 'M00158', 'M00160']

.. py:function:: flatten(lis)

   Takes a nested list and returns its contents as a single flat list,
   e.g. ``[[a, b, c], [d, e]]`` --> ``[a, b, c, d, e]``


.. py:function:: parse_commas_on_pre_and_post_character(string)

   Takes a string and returns the independent scenarios separated by commas (,),
   e.g. ``K02304,(K24866+K03794)`` --> ``['K02304', '(K24866+K03794)']``


.. py:function:: check_if_all_in_one_par(string)

   Tells whether a string is enclosed in a single pair of parentheses,
   e.g. ``((K00705,K22451)_(K02438,K01200))``,
   or not, e.g. ``K00975_(K00703,K13679,K20812)``


.. py:function:: get_independent_step_alternatives(step_as_a_list)

   Takes a complete step and returns its unique independent parts recursively,
   e.g. in the first round, for ``((K13939,(K13940,K01633 K00950) K00796),(K01633 K13941))``
   we get ``(K01633 K13941))`` as an independent alternative,
   while in the second round we get ``K13939``


.. py:function:: split_to_independent_chunks(string)

   Takes a string and returns the indices where it can be split into parts that can be combined independently
   to rebuild the corresponding part of the KEGG module definition,
   e.g. ``"K00941_(K00788,K21220)"`` --> ``[0, 7, 22]``
   or ``"((K03831,K03638)_K03750)"`` --> ``[0, 24]``


.. py:function:: parse(my_string)

   Parses a module's definition into its main steps,
   e.g. the definition
   ``(K02303,K13542) (K03394,K13540) K02229 (K05934,K13540,K13541) K05936 K02228 K05895 K00595 K06042 K02224 K02230+K09882+K09883``
   gives
   ``['(K02303,K13542)', '(K03394,K13540)', 'K02229', '(K05934,K13540,K13541)', 'K05936', 'K02228', 'K05895', 'K00595', 'K06042', 'K02224', 'K02230+K09882+K09883']``


.. py:function:: parse_regular_module_dictionary(module_components_raw, structural_list)

   Breaks down a module into its steps using the parse() function


.. py:function:: create_final_regular_dictionary(module_steps_parsed)

   Returns all the possible combinations of KOs that make a KEGG module complete


.. py:data:: modules

.. py:data:: module_components_raw

.. py:data:: definition

.. py:data:: module_steps_parsed

.. py:data:: P

.. py:data:: q

.. py:data:: module
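
The AND/OR reading of a module definition described in the notes can be illustrated with a small, self-contained sketch.
It is not the code of this module: ``split_top_level``, ``is_wrapped`` and ``expand`` are hypothetical helpers written only for this example.
The sketch assumes that a COMMA binds more loosely than a SPACE or a PLUS, does not handle the optional MINUS items, and treats M numbers like any other term.

.. code-block:: python

   from itertools import product


   def split_top_level(expression, separators):
       """Split on separator characters that sit outside any parentheses."""
       parts, current, depth = [], [], 0
       for char in expression:
           if char == "(":
               depth += 1
           elif char == ")":
               depth -= 1
           if depth == 0 and char in separators:
               parts.append("".join(current))
               current = []
           else:
               current.append(char)
       parts.append("".join(current))
       return [part for part in parts if part]  # drop empty pieces


   def is_wrapped(expression):
       """True if a single pair of parentheses encloses the whole expression."""
       if not (expression.startswith("(") and expression.endswith(")")):
           return False
       depth = 0
       for index, char in enumerate(expression):
           if char == "(":
               depth += 1
           elif char == ")":
               depth -= 1
           # Closing the first parenthesis before the end means it is not a wrapper.
           if depth == 0 and index < len(expression) - 1:
               return False
       return True


   def expand(expression):
       """Return every alternative set of terms that satisfies the expression."""
       expression = expression.strip()
       while is_wrapped(expression):
           expression = expression[1:-1].strip()
       # COMMA = OR: each alternative contributes its own combinations.
       alternatives = split_top_level(expression, ",")
       if len(alternatives) > 1:
           return [combo for alt in alternatives for combo in expand(alt)]
       # SPACE or PLUS = AND: pick one combination per member and merge them.
       members = split_top_level(expression, " +")
       if len(members) > 1:
           expanded = [expand(member) for member in members]
           return [sorted(set().union(*choice)) for choice in product(*expanded)]
       return [[expression]]


   print(expand("K02304,(K24866+K03794)"))
   # [['K02304'], ['K03794', 'K24866']]

Each inner list is one combination whose terms are all required, which mirrors the second level of the output described above.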