KEGG Database — Biological Pathway & Molecular Network Queries

Overview

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a comprehensive bioinformatics resource for biological pathway analysis, molecular interaction networks, and cross-database ID conversion. Access is through a REST API that needs no authentication: every operation is a plain HTTP GET request. The list, find, conv, link, and ddi operations return tab-delimited text, while get returns flat-file entries or the requested format (FASTA, MOL, PNG, and so on).

When to Use

Mapping genes to biological pathways (e.g., "which pathways involve TP53?")
Retrieving metabolic pathway details, gene lists, or compound structures
Converting identifiers between KEGG, NCBI Gene, UniProt, and PubChem
Checking drug-drug interactions from KEGG's pharmacological database
Building pathway enrichment context (all genes per pathway for an organism)
Cross-referencing compounds, reactions, enzymes, and pathways
For Python-native multi-database queries (KEGG + UniProt + Ensembl in one script), prefer bioservices instead
For pathway visualization, use KEGG Mapper (https://www.kegg.jp/kegg/mapper/) directly

Prerequisites

pip install requests

API constraints:

Academic use only — commercial use requires a separate KEGG license
Max 10 entries per get/list/conv/link/ddi call (image/kgml/json: 1 entry only)
No explicit rate limit, but add time.sleep(0.5) between batch requests to avoid server-side throttling
Base URL: https://rest.kegg.jp/
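
Because of the 10-entry limit, longer ID lists have to be split before they are sent. A minimal sketch of that step, assuming a hypothetical helper named batch_entries (it is not part of KEGG or requests) that chunks the IDs and joins each chunk with +:

```python
def batch_entries(entry_ids, size=10):
    """Split entry IDs into '+'-joined batches of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    ids = list(entry_ids)
    return ["+".join(ids[i:i + size]) for i in range(0, len(ids), size)]


# Illustrative IDs only: 23 entries are split into batches of 10, 10, and 3.
example_ids = [f"hsa:{n}" for n in range(1, 24)]
batches = batch_entries(example_ids)
print(len(batches))
```

Each batch string can then be placed in the URL path of one request, with a time.sleep(0.5) pause between requests.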

Quick Start

import requests
import time

BASE = "https://rest.kegg.jp"

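def kegg_get(operation, *args):
    """Generic KEGG REST API caller: GET {BASE}/{operation}/{arg1}/{arg2}/..."""
    url = f"{BASE}/{operation}/{'/'.join(args)}"
    resp = requests.get(url, timeout=30)  # fail instead of hanging on a stalled connection
    resp.raise_for_status()  # raise on HTTP 4xx/5xx responses
    return resp.text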

# Find pathways linked to human gene TP53
pathways = kegg_get("link", "pathway", "hsa:7157")
print(pathways[:200])
# hsa:7157	path:hsa04010
# hsa:7157	path:hsa04110
# ...

# Get pathway details
detail = kegg_get("get", "hsa04110")
print(detail[:300])

Core API

1. Database Information — `kegg_info`

Retrieve metadata and statistics about KEGG databases.

import requests

BASE = "https://rest.kegg.jp"

# Database-level info
info = requests.get(f"{BASE}/info/pathway").text
print(info[:200])
# pathway          Pathway
#                  Release <version>, <date>  (changes with each KEGG release)
#                  Kanehisa Laboratories
#                  ...

# Organism-level info
hsa_info = requests.get(f"{BASE}/info/hsa").text
print(hsa_info[:200])

Common databases: kegg, pathway, module, brite, genes, genome, compound, glycan, reaction, enzyme, disease, drug

2. Listing Entries — `kegg_list`

List entry identifiers and names from any KEGG database.

import requests

BASE = "https://rest.kegg.jp"

# All human pathways
hsa_pathways = requests.get(f"{BASE}/list/pathway/hsa").text
for line in hsa_pathways.strip().split("\n")[:5]:
    pathway_id, name = line.split("\t")
    print(f"{pathway_id}: {name}")
# hsa00010: Glycolysis / Gluconeogenesis - Homo sapiens (human)
# ...

# Specific entries (max 10, joined with +)
genes = requests.get(f"{BASE}/list/hsa:10458+hsa:10459").text
print(genes)

Common organism codes: hsa (human), mmu (mouse), dme (fruit fly), sce (yeast), eco (E. coli)
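
List output is easier to work with once it is loaded into a dictionary. The sketch below assumes a hypothetical helper named list_to_dict and uses two illustrative sample rows in place of a live response. It splits on the first tab only, so rows with extra columns (as some gene listings have) keep the remainder as one string.

```python
def list_to_dict(text):
    """Map each entry ID (first column) to the rest of its line."""
    entries = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        entry_id, rest = line.split("\t", 1)
        entries[entry_id] = rest
    return entries


# Illustrative sample rows, not a live API response.
sample = (
    "hsa00010\tGlycolysis / Gluconeogenesis - Homo sapiens (human)\n"
    "hsa00020\tCitrate cycle (TCA cycle) - Homo sapiens (human)\n"
)
pathway_names = list_to_dict(sample)
print(pathway_names["hsa00010"])
```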

3. Keyword Search — `kegg_find`

Search databases by keywords or molecular properties.

import requests
import time

BASE = "https://rest.kegg.jp"

# Keyword search in genes
results = requests.get(f"{BASE}/find/genes/p53").text
print(f"Found {len(results.strip().splitlines())} entries")  # 0 when the response is empty
time.sleep(0.5)

# Chemical formula search (partial match, independent of atom order)
compounds = requests.get(f"{BASE}/find/compound/C7H10N4O2/formula").text
print(compounds[:200])
time.sleep(0.5)

# Exact mass range search (use /mol_weight to search by molecular weight instead)
drugs = requests.get(f"{BASE}/find/drug/300-310/exact_mass").text
print(drugs[:200])

Search options: append /formula (partial match, atom order ignored), /exact_mass, or /mol_weight to compound/drug queries. The two mass options accept either a single value or a min-max range such as 300-310.
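
Building find URLs by hand invites typos in the option segment. A small sketch, assuming a hypothetical build_find_url helper that joins multiple keywords with + and only allows the three options above on compound and drug queries:

```python
from urllib.parse import quote

BASE = "https://rest.kegg.jp"
FIND_OPTIONS = {"formula", "exact_mass", "mol_weight"}


def build_find_url(database, query, option=None):
    """Build a /find URL; keywords separated by spaces are joined with '+'."""
    if option is not None:
        if option not in FIND_OPTIONS:
            raise ValueError(f"unknown option: {option}")
        if database not in {"compound", "drug"}:
            raise ValueError("options apply to compound/drug queries only")
    # Percent-encode each keyword separately so the '+' separators survive.
    keywords = "+".join(quote(word, safe="") for word in str(query).split())
    parts = [BASE, "find", database, keywords]
    if option is not None:
        parts.append(option)
    return "/".join(parts)


print(build_find_url("genes", "shiga toxin"))
print(build_find_url("drug", "300-310", "exact_mass"))
```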

4. Entry Retrieval — `kegg_get`

Retrieve complete database entries or specific data formats.

import requests
import time

BASE = "https://rest.kegg.jp"

# Full pathway entry (text format)
pathway = requests.get(f"{BASE}/get/hsa00010").text
print(pathway[:500])
time.sleep(0.5)

# Multiple entries (max 10, joined with +)
genes = requests.get(f"{BASE}/get/hsa:10458+hsa:10459").text

# Protein sequence (FASTA)
fasta = requests.get(f"{BASE}/get/hsa:10458/aaseq").text
print(fasta[:200])
time.sleep(0.5)

# Compound structure (MOL format)
mol = requests.get(f"{BASE}/get/cpd:C00002/mol").text  # ATP

# Pathway image (PNG, single entry only)
img_resp = requests.get(f"{BASE}/get/hsa05130/image")
with open("pathway.png", "wb") as f:
    f.write(img_resp.content)
print(f"Saved pathway image: {len(img_resp.content)} bytes")

Output formats: aaseq (protein FASTA), ntseq (nucleotide FASTA), mol (MOL), kcf (KCF), image (PNG), kgml (pathway XML), json (BRITE hierarchy JSON). Image/KGML/JSON accept one entry only.
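
Text-format get responses are KEGG flat files. The sketch below is a rough parser, assuming the usual layout: field name in the first 12 columns, continuation lines left blank in those columns, and /// closing the entry. The helper name parse_flat_entry and the sample entry are illustrative, and real entries contain many more fields.

```python
def parse_flat_entry(text):
    """Parse one KEGG flat-file entry into {field name: [value lines]}."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):
            break  # end-of-entry marker
        if not line.strip():
            continue
        key, value = line[:12].strip(), line[12:].strip()
        if key:  # a new field starts; otherwise this line continues the last one
            current = key
            fields.setdefault(current, [])
        if current is not None and value:
            fields[current].append(value)
    return fields


# Illustrative, heavily shortened entry; ljust(12) pads the field-name column.
sample_entry = "\n".join([
    "ENTRY".ljust(12) + "hsa00010                    Pathway",
    "NAME".ljust(12) + "Glycolysis / Gluconeogenesis - Homo sapiens (human)",
    "GENE".ljust(12) + "3101  HK3; hexokinase 3",
    "".ljust(12) + "3098  HK1; hexokinase 1",
    "///",
])
entry = parse_flat_entry(sample_entry)
print(entry["NAME"][0])
print(len(entry["GENE"]))
```

When several entries are fetched in one call, splitting the response on the /// lines first and parsing each piece separately keeps the fields apart.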

5. ID Conversion — `kegg_conv`

Convert identifiers between KEGG and external databases.

import requests
import time

BASE = "https://rest.kegg.jp"

# KEGG gene → NCBI Gene ID (specific gene)
ncbi = requests.get(f"{BASE}/conv/ncbi-geneid/hsa:10458").text
print(ncbi.strip())
# hsa:10458	ncbi-geneid:10458
time.sleep(0.5)

# KEGG gene → UniProt
uniprot = requests.get(f"{BASE}/conv/uniprot/hsa:10458").text
print(uniprot.strip())
time.sleep(0.5)

# Bulk conversion: all human genes → NCBI Gene IDs
all_conv = requests.get(f"{BASE}/conv/ncbi-geneid/hsa").text
lines = all_conv.strip().split("\n")
print(f"Total conversions: {len(lines)}")

# Reverse: NCBI Gene ID → KEGG
reverse = requests.get(f"{BASE}/conv/hsa/ncbi-geneid:7157").text
print(reverse.strip())  # TP53

Supported external databases: ncbi-geneid, ncbi-proteinid, uniprot, pubchem, chebi
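
Conversion output is a two-column table, and one KEGG entry can map to several external IDs. A minimal sketch that collects those pairs, assuming a hypothetical parse_conv helper and sample lines in the format shown in the examples above:

```python
def parse_conv(text, strip_prefix=True):
    """Map each first-column ID to the list of second-column IDs."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        first, second = parts
        if strip_prefix:
            # Drop the database prefix, e.g. "ncbi-geneid:7157" -> "7157"
            second = second.split(":", 1)[-1]
        mapping.setdefault(first, []).append(second)
    return mapping


# Sample lines in the format shown above, not a live API response.
sample = "hsa:10458\tncbi-geneid:10458\nhsa:7157\tncbi-geneid:7157\n"
gene_ids = parse_conv(sample)
print(gene_ids["hsa:7157"])
```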

6. Cross-Referencing — `kegg_link`

Find related entries within and between KEGG databases.

import requests
import time

BASE = "https://rest.kegg.jp"

# Genes in glycolysis pathway
genes = requests.get(f"{BASE}/link/genes/hsa00010").text
gene_list = [line.split("\t")[1] for line in genes.strip().split("\n") if line]
print(f"Glycolysis genes: {len(gene_list)}")
time.sleep(0.5)

# Pathways containing a specific gene
pathways = requests.get(f"{BASE}/link/pathway/hsa:7157").text  # TP53
print(pathways[:300])
time.sleep(0.5)

# Compounds in a pathway
compounds = requests.get(f"{BASE}/link/compound/hsa00010").text
print(f"Compounds in glycolysis: {len(compounds.strip().splitlines())}")

# Map genes to KO (orthology) groups
ko = requests.get(f"{BASE}/link/ko/hsa:10458").text
print(ko.strip())

Common links: genes ↔ pathway, pathway ↔ compound, pathway ↔ enzyme, genes ↔ ko (orthology)
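
For enrichment context, link output usually needs to be regrouped so that each pathway points at its set of genes. A sketch of that regrouping, assuming a hypothetical group_links helper; the sample rows are illustrative and follow the gene-to-pathway format shown in Quick Start:

```python
from collections import defaultdict


def group_links(text, key_col=1, value_col=0):
    """Group two-column link output into {key: set of values}."""
    groups = defaultdict(set)
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) < 2:
            continue  # skip blank or malformed lines
        groups[cols[key_col]].add(cols[value_col])
    return dict(groups)


# Illustrative rows (gene ID, pathway ID), not a live API response.
sample = (
    "hsa:7157\tpath:hsa04010\n"
    "hsa:7157\tpath:hsa04110\n"
    "hsa:1026\tpath:hsa04110\n"
)
genes_by_pathway = group_links(sample)
print(sorted(genes_by_pathway["path:hsa04110"]))
```

Swapping key_col and value_col gives the reverse view (pathways per gene) from the same response.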

7. Drug-Drug Interactions — `kegg_ddi`

Check pharmacological interactions between drugs.

import requests

BASE = "https://rest.kegg.jp"

# Single drug — all known interactions
interactions = requests.get(f"{BASE}/ddi/D00001").text
print(f"Interactions: {len(interactions.strip().splitlines())}")  # 0 when none are listed

# Pairwise check (max 10 drugs, joined with +)
pair = requests.get(f"{BASE}/ddi/D00001+D00002+D00003").text
print(pair[:300])

Key Concepts

Identifier Formats

Type	Format	Example
Reference pathway	`map#####`	`map00010` (Glycolysis, generic)
Organism pathway	`{org}#####`	`hsa00010` (Glycolysis, human)
Gene	`{org}:{number}`	`hsa:7157` (TP53)
Compound	`cpd:C#####`	`cpd:C00002` (ATP)
Drug	`dr:D#####`	`dr:D00001`
Enzyme	`ec:{EC_number}`	`ec:1.1.1.1`
KO (orthology)	`ko:K#####`	`ko:K00001`
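
The formats in the table can be checked before a request is sent. The patterns below are a sketch derived from the table only: they assume organism codes of three or four lowercase letters and treat the database prefixes as optional where bare IDs are common. classify_id is a hypothetical helper, not a KEGG function.

```python
import re

# Order matters: more specific patterns are tried before the generic gene pattern.
ID_PATTERNS = [
    ("reference pathway", re.compile(r"^(path:)?map\d{5}$")),
    ("organism pathway", re.compile(r"^(path:)?[a-z]{3,4}\d{5}$")),
    ("compound", re.compile(r"^(cpd:)?C\d{5}$")),
    ("drug", re.compile(r"^(dr:)?D\d{5}$")),
    ("ko", re.compile(r"^(ko:)?K\d{5}$")),
    ("enzyme", re.compile(r"^ec:\d+(\.(\d+|-)){3}$")),
    ("gene", re.compile(r"^[a-z]{3,4}:\S+$")),
]


def classify_id(identifier):
    """Return the first matching identifier type, or 'unknown'."""
    for name, pattern in ID_PATTERNS:
        if pattern.match(identifier):
            return name
    return "unknown"


for example in ["map00010", "hsa00010", "hsa:7157", "cpd:C00002", "ec:1.1.1.1"]:
    print(example, "->", classify_id(example))
```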

Pathway Categories

KEGG organizes pathways into seven major categories:

Metabolism — map00xxx and map01xxx (Glycolysis, TCA cycle, amino acid metabolism)
Genetic Information Processing — map030xx (Ribosome, Spliceosome, DN
