microbetagDB API

The microbetagDB API provides programmatic access to the data. Using the Application Programming Interface (API), you can query microbetagDB directly to get information about genome-based predicted traits of a specific taxon, potential pathway complementarities of a pair of taxa, etc.

The base URL of the API is https://msysbio.gbiomed.kuleuven.be/.

Below you may find the syntax to retrieve the various data and/or annotations included.

Remember that microbetag is an NCBI Taxonomy-oriented resource: a “species” of interest is identified by its NCBI Taxonomy ID. For example, if you are interested in Bifidobacterium animalis, you first need to look up its ID in the NCBI Taxonomy portal. There you will also find a list of the available subspecies and strains. In your queries you can use either the species ID or the ID of a specific strain.

Get genome IDs for an NCBI Taxonomy ID

To check whether a species is present on microbetag, one may find its corresponding NCBI Taxonomy ID and search for related genomes present on the microbetagDB.

For example, assuming we are interested in the Blautia hansenii DSM 20583 strain, we find from NCBI Taxonomy that its corresponding ID is 537007.

Using the ncbiTaxId-to-genomeId route we may get the related genomes on microbetagDB:

curl https://msysbio.gbiomed.kuleuven.be/ncbiTaxId-to-genomeId?ncbiTaxId=537007

that returns a list of genomes used in microbetag annotations:

{
  "537007": [
    "GCF_002222595.2"
  ]
}

In this case there is a genome available for this NCBI Taxonomy ID (GCF_002222595.2).

If no genome is available, then you will get a NotFoundError. For example, for the Bifidobacterium animalis subsp. animalis IM386 (NCBI Tax ID: 1402194) there is no genome in microbetagDB:

curl https://msysbio.gbiomed.kuleuven.be/ncbiTaxId-to-genomeId?ncbiTaxId=1402194
{
  "error": "NCBI Taxonomy Id 1402194 was not found in microbetagDB."
}
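
The same lookup can be scripted. Below is a minimal Python sketch, not part of microbetag itself: the helper names `genome_ids_url` and `parse_genome_ids` are hypothetical, and only the route, the parameter name, and the two response shapes are taken from the examples above.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://msysbio.gbiomed.kuleuven.be"  # base URL given on this page


def genome_ids_url(ncbi_tax_id):
    """Build the ncbiTaxId-to-genomeId request URL (hypothetical helper)."""
    query = urlencode({"ncbiTaxId": ncbi_tax_id})
    return f"{BASE_URL}/ncbiTaxId-to-genomeId?{query}"


def parse_genome_ids(body, ncbi_tax_id):
    """Return the genome IDs for a taxon, or [] if the API reported an error."""
    payload = json.loads(body)
    if "error" in payload:
        return []
    return payload.get(str(ncbi_tax_id), [])


# Response bodies copied from the two curl examples above.
found = '{"537007": ["GCF_002222595.2"]}'
missing = '{"error": "NCBI Taxonomy Id 1402194 was not found in microbetagDB."}'

print(genome_ids_url(537007))
print(parse_genome_ids(found, 537007))
print(parse_genome_ids(missing, 1402194))
```

To query the live service, the URL can be passed to `urllib.request.urlopen`. This page does not state which HTTP status accompanies the not-found message, so a real client should be prepared for a non-200 response as well.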

🧬 Genome-based predicted phenotypic traits

Once you have identified the genomes related to the NCBI Taxonomy ID under study using the ncbiTaxId-to-genomeId route, you can get the phenotypic traits of each genome using the phen-traits route and the corresponding genome ID.

For example, in case of Blautia hansenii DSM 20583 (NCBI Taxonomy ID: 537007) we saw there is a genome on microbetagDB; to get its phenotypic traits we can simply run:

curl https://msysbio.gbiomed.kuleuven.be/phen-traits?ncbiAssemblyId=GCF_002222595.2

This returns (only part of the output is shown):

{
  "NOB": "NO",
  "NOBScore": "0.8078",
  "T3SS": "NO",
  "T3SSScore": "0.8391",
  "T6SS": "NO",
  "T6SSScore": "0.7836",
  "aSaccharolytic": "NO",
  "aSaccharolyticScore": "0.8672",
  ...
}

Note

For a thorough description of the abbreviations used, see microbetag’s Modules tab.

Warning

  1. Currently, microbetag has annotations only for the GTDB representative genomes. Thus, genomes returned by the ncbiTaxId-to-genomeId route that come from other resources (e.g., MGnify, KEGG) do not have phenotypic traits.

  2. All genomes have a 3-letter prefix that is either GCA or GCF. If a GTDB genome you are querying returns an “Internal Server Error”, try again with the other prefix; e.g. if you initially had “GCA_002222595.2”, try again with “GCF_002222595.2”.

In case a genome ID is provided for which there are no phenotypic traits on microbetagDB, you will get a message explaining this:

No Phen traits for the genome id asked.         
Make sure you are asking for a GTDB v202 representative genome.
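
The trait response pairs each trait with a `<trait>Score` entry, and the warning above suggests retrying with the other accession prefix. The sketch below illustrates both steps in Python. The helper names are hypothetical, and it assumes that every key ending in `Score` belongs to the trait with the same name, as in the partial output shown above.

```python
def pair_traits(traits):
    """Group each trait with its '<trait>Score' entry: {trait: (call, score)}."""
    paired = {}
    for key, value in traits.items():
        if key.endswith("Score"):
            continue  # assumed to be the score of another key, handled below
        score = traits.get(key + "Score")
        paired[key] = (value, float(score) if score is not None else None)
    return paired


def swap_prefix(assembly_id):
    """Return the same accession with the GCA and GCF prefixes swapped."""
    if assembly_id.startswith("GCA_"):
        return "GCF_" + assembly_id[4:]
    if assembly_id.startswith("GCF_"):
        return "GCA_" + assembly_id[4:]
    raise ValueError(f"unexpected accession prefix: {assembly_id}")


# Part of the phen-traits output shown above.
sample = {
    "NOB": "NO",
    "NOBScore": "0.8078",
    "T3SS": "NO",
    "T3SSScore": "0.8391",
}

print(pair_traits(sample))
print(swap_prefix("GCA_002222595.2"))
```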

🧩 Get pathway complementarities

If you are interested in the complementarities of a specific pair of GTDB representative genomes, where \(species_A\) is the beneficiary and \(species_B\) the donor, use the pathway-complements route, providing the two IDs and the type of the IDs provided.

  • 🔧 Query Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| beneficiaryId | string | ✅ | ID of the potential beneficiary (e.g. GCA_011050685.1) |
| donorId | string | ✅ | ID of the potential donor (e.g. 1402194) |
| typeCategory | string | ✅ | One of ncbiGenomeIds, ncbiTaxonomyIds, patricGenomeIds |

Warning

Both IDs need to be of the same type.
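
As an illustration of how the three parameters fit together, here is a small Python sketch that assembles a query URL and rejects an unknown ID type before any request is sent. The function name `pair_query_url` is hypothetical; the route name and the parameter names are taken from the example requests on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://msysbio.gbiomed.kuleuven.be"
ID_TYPES = {"ncbiGenomeIds", "ncbiTaxonomyIds", "patricGenomeIds"}


def pair_query_url(route, beneficiary_id, donor_id, type_category):
    """Build a beneficiary/donor query URL for the given route."""
    if type_category not in ID_TYPES:
        raise ValueError(f"typeCategory must be one of {sorted(ID_TYPES)}")
    query = urlencode(
        {
            "beneficiaryId": beneficiary_id,
            "donorId": donor_id,
            "typeCategory": type_category,
        }
    )
    return f"{BASE_URL}/{route}?{query}"


url = pair_query_url(
    "pathway-complements", "GCA_011050685.1", "GCA_002289575.1", "ncbiGenomeIds"
)
print(url)
```

The helper only checks that the ID type is a known value. It cannot tell whether both IDs really are of that type, so that remains the caller's responsibility.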

  • 🔍 Example Request

Here is an example where a Desulfurococcaceae archaeon genome (GCA_011364525.1) is the beneficiary taxon and Malikia spinosa (GCA_002980625.1) is its potential donor:

curl "https://msysbio.gbiomed.kuleuven.be/pathway-complements?beneficiaryId=GCA_011050685.1&donorId=GCA_002289575.1&typeCategory=ncbiGenomeIds"

Warning

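Please make sure you put quotation marks at the beginning and the end of your URL. Without them, the shell treats each & as a command separator and the query is cut off after the first parameter.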

Here is its partial output:

{
  "0": {
    "beneficiary-ncbi-tax-id": 182270,
    "complements": [
      {
        "beneficiary-genome": "GCA_011050685.1",
        "complements": [
          [
            [
              "M00001",
              "K21071",
              "K00845;K01810;K21071;K01624;K01803;K00134;K00927;K01834;K01689;K00873",
              "https://www.kegg.jp/kegg-bin/show_pathway?map00010/K00845%09%23EAD1DC/K01810%09%23EAD1DC/K01624%09%23EAD1DC/K01803%09%23EAD1DC/K00134%09%23EAD1DC/K00927%09%23EAD1DC/K01834%09%23EAD1DC/K01689%09%23EAD1DC/K00873%09%23EAD1DC/K21071%09%2300A898/"
            ]
          ],
        #....  MORE COMPLEMENT CASES 
        ],
        "donor-genome": "GCA_002289575.1"
      }
    ],
    "donor-ncbi-tax-id": 161920
  }
}

Let us now describe its content.

  • The response returns an entry for each beneficiary entry ID.

  • Each entry consists of the beneficiary-ncbi-tax-id and the donor-ncbi-tax-id, the NCBI Taxonomy IDs of the two taxa.

  • For each such pair there is a complements list. Each entry of that list stands for
    a combination of beneficiary-genome and donor-genome IDs and has its own complements list with the actual pathway complementarities found between the two genomes.

  • Each entry in that list has 4 elements:

    • the KEGG module the complement is part of

    • the KO term(s) to be exchanged

    • the complete alternative (set of KOs that ensure the module)

    • a URL to a colored KEGG map that includes the module and highlights KOs of the beneficiary and those to get from the donor

  • Under the inner complements key, you can find the different potential complements between those two genomes. In the above chunk only the first one is shown. Each complement refers to a specific KEGG module, given in its first element. The second element lists the exact KO terms that the donor would have to provide so that the beneficiary has a complete alternative of the module. The third element, the complete alternative, lists all the KOs (both those the beneficiary carries on its own and those it would get from the donor) needed for a complete alternative of the module under study. Last, the fourth element is the link to the KEGG map showing the module under study and the KO terms to be used; the beneficiary’s KOs are coloured pink and those to get from the donor green.
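
The nesting described in the list above can be flattened into one record per complement. The following Python sketch assumes the layout of the partial output shown earlier and assumes that multiple KO terms are separated by `;` in both KO fields, as they are in the complete alternative. The `Complement` type and `iter_complements` are hypothetical names, and the KEGG URL in the sample is shortened.

```python
from typing import Iterator, List, NamedTuple


class Complement(NamedTuple):
    beneficiary_genome: str
    donor_genome: str
    module: str
    kos_from_donor: List[str]
    complete_alternative: List[str]
    kegg_map_url: str


def iter_complements(response: dict) -> Iterator[Complement]:
    """Yield one record per 4-element complement entry in the response."""
    for taxon_pair in response.values():
        for genome_pair in taxon_pair["complements"]:
            for group in genome_pair["complements"]:
                for module, kos, alternative, url in group:
                    yield Complement(
                        genome_pair["beneficiary-genome"],
                        genome_pair["donor-genome"],
                        module,
                        kos.split(";"),
                        alternative.split(";"),
                        url,
                    )


# Trimmed copy of the example output (map URL shortened for readability).
sample = {
    "0": {
        "beneficiary-ncbi-tax-id": 182270,
        "complements": [
            {
                "beneficiary-genome": "GCA_011050685.1",
                "complements": [
                    [
                        [
                            "M00001",
                            "K21071",
                            "K00845;K01810;K21071;K01624;K01803;"
                            "K00134;K00927;K01834;K01689;K00873",
                            "https://www.kegg.jp/kegg-bin/show_pathway?map00010/",
                        ]
                    ]
                ],
                "donor-genome": "GCA_002289575.1",
            }
        ],
        "donor-ncbi-tax-id": 161920,
    }
}

for item in iter_complements(sample):
    print(item.module, item.kos_from_donor, len(item.complete_alternative))
```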

Warning

When using curl, make sure you put double quotes ("") around your URL.

Get seed scores and complements

Get competition and complementarity seed scores

When we calculate the seed scores, we consider each taxon in turn as \(species_A\) and as \(species_B\)
(see the Modules tab for more).

Seed scores can be retrieved in a way similar to the pathway complements. For example, let’s check the scores between Streptomyces sp. AW19M42 (NCBI Taxonomy ID: 1379686) and Afipia clevelandensis ATCC 49720 (NCBI Taxonomy ID: 883079). By running:

curl "https://msysbio.gbiomed.kuleuven.be/seed-scores?beneficiaryId=1379686&donorId=883079&typeCategory=ncbiTaxonomyIds"

we get

{
  "beneficiary_maps": {
    "gc_and_patric_ids": {
      "GCF_000470535.1": "1379686.4"
    },
    "ncbi_tax_id": "1379686"
  },
  "donor_maps": {
    "gc_and_patric_ids": {
      "GCF_000336555.1": "883079.3"
    },
    "ncbi_tax_id": "883079"
  },
  "seed-scores": {
    "0": {
      "CompetitionScore": "0.51",
      "CooperationScore": "0.33",
      "PATRIC_A": "1379686.4",
      "PATRIC_B": "883079.3"
    },
    "1": {
      "CompetitionScore": "0.67",
      "CooperationScore": "0.2",
      "PATRIC_A": "883079.3",
      "PATRIC_B": "1379686.4"
    }
  }
}

where, in the first case ["seed-scores"]["0"], Streptomyces is considered as \(species_A\) and Afipia as \(species_B\), and in case ["1"] the other way around.

Running the same query with the Taxonomy IDs in reverse order returns the same scores, only in a different order:

curl "https://msysbio.gbiomed.kuleuven.be/seed-scores?beneficiaryId=883079&donorId=1379686&typeCategory=ncbiTaxonomyIds"

Important

The function returns two pairs of seed scores; in each pair:

  • the first genome listed (PATRIC_A) is considered as \(species_A\) for the seed score indices,

  • and the second one (PATRIC_B) as \(species_B\).
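
Because the order of the two entries depends on the order of the IDs in the query, it can be convenient to index the scores by the (PATRIC_A, PATRIC_B) pair instead of by position. A minimal Python sketch, using the output shown above; `index_seed_scores` is a hypothetical helper name.

```python
def index_seed_scores(response):
    """Map (PATRIC_A, PATRIC_B) to (competition, cooperation) as floats."""
    scores = {}
    for entry in response["seed-scores"].values():
        key = (entry["PATRIC_A"], entry["PATRIC_B"])
        scores[key] = (
            float(entry["CompetitionScore"]),
            float(entry["CooperationScore"]),
        )
    return scores


# The "seed-scores" part of the example output above.
sample = {
    "seed-scores": {
        "0": {
            "CompetitionScore": "0.51",
            "CooperationScore": "0.33",
            "PATRIC_A": "1379686.4",
            "PATRIC_B": "883079.3",
        },
        "1": {
            "CompetitionScore": "0.67",
            "CooperationScore": "0.2",
            "PATRIC_A": "883079.3",
            "PATRIC_B": "1379686.4",
        },
    }
}

scores = index_seed_scores(sample)
# Streptomyces (1379686.4) as species_A, Afipia (883079.3) as species_B
print(scores[("1379686.4", "883079.3")])
```

With this index, the lookup gives the same result whichever of the two IDs was passed as beneficiaryId.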

Get seed complements

Seed complements can be retrieved for pairs of 3 different categories of IDs:

  • NCBI Taxonomy IDs (type_of_ids: ncbiTaxonomyIds)

  • NCBI Genome accession IDs (type_of_ids: ncbiGenomeIds) and

  • PATRIC IDs (type_of_ids: patricGenomeIds)

The main route of this feature is

https://msysbio.gbiomed.kuleuven.be/seed-complements?beneficiaryId=<beneficiary_id>&donorId=<donor_id>&typeCategory=<type_of_ids>

For example, one might get seed complements for a pair of NCBI Taxonomy IDs like this:

curl "https://msysbio.gbiomed.kuleuven.be/seed-complements?beneficiaryId=1379686&donorId=883079&typeCategory=ncbiTaxonomyIds"

Note

This route has no default ID type. If the type is not provided, the request will fail.

Common errors

There are two common types of client errors on API calls:

  • 400 Bad Request.

In this case, your request most probably has malformed syntax, invalid message framing, or deceptive routing. Check your query again and make sure you are using the right syntax.

  • 404 Not Found.

In this case, the server cannot find the requested resource.

Such errors can also occur when you ask for a genome, a species, or a pair of them that is not part of microbetagDB.
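
In a script, these two status codes can be turned into readable messages. The sketch below uses only the Python standard library. `fetch_json` and `explain_status` are hypothetical helpers, and the hints simply restate the explanations above. The `opener` argument exists so the function can be exercised without network access.

```python
import json
import urllib.error
import urllib.request

HINTS = {
    400: "400 Bad Request: check the syntax of the query",
    404: "404 Not Found: the resource, genome, or taxon may not be in microbetagDB",
}


def explain_status(status):
    """Return a short hint for an HTTP status code."""
    return HINTS.get(status, f"unexpected HTTP status {status}")


def fetch_json(url, opener=urllib.request.urlopen):
    """Fetch a URL and decode its JSON body, with a readable error on failure."""
    try:
        with opener(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        raise RuntimeError(f"{url}: {explain_status(err.code)}") from err


print(explain_status(404))
```

A call such as `fetch_json("https://msysbio.gbiomed.kuleuven.be/ncbiTaxId-to-genomeId?ncbiTaxId=537007")` would then either return the decoded response or raise a `RuntimeError` carrying one of the hints.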

Keep in mind that you can always contact us through our Matrix community for more.