Open Targets Database
Overview
The Open Targets Platform is a comprehensive resource for systematic identification and prioritization of potential therapeutic drug targets. It integrates publicly available datasets including human genetics, omics, literature, and chemical data to build and score target-disease associations.
Key capabilities:
- Query target (gene) annotations including tractability, safety, expression
- Search for disease-target associations with evidence scores
- Retrieve evidence from multiple data types (genetics, pathways, literature, etc.)
- Find known drugs for diseases and their mechanisms
- Access drug information including clinical trial phases and adverse events
- Evaluate target druggability and therapeutic potential
Data access: The platform provides a GraphQL API, web interface, data downloads, and Google BigQuery access. This skill focuses on the GraphQL API for programmatic access.
When to Use This Skill
Use this skill for:
- Target discovery: Finding potential therapeutic targets for a disease
- Target assessment: Evaluating tractability, safety, and druggability of genes
- Evidence gathering: Retrieving supporting evidence for target-disease associations
- Drug repurposing: Identifying existing drugs that could be repurposed for new indications
- Competitive intelligence: Understanding clinical precedence and drug development landscape
- Target prioritization: Ranking targets based on genetic evidence and other data types
- Mechanism research: Investigating biological pathways and gene functions
- Biomarker discovery: Finding genes differentially expressed in disease
- Safety assessment: Identifying potential toxicity concerns for drug targets
Core Workflow
1. Search for Entities
Start by finding the identifiers for targets, diseases, or drugs of interest.
For targets (genes):
from scripts.query_opentargets import search_entities
# Search by gene symbol or name
results = search_entities("BRCA1", entity_types=["target"])
# Returns: [{"id": "ENSG00000012048", "name": "BRCA1", ...}]
For diseases:
# Search by disease name
results = search_entities("alzheimer", entity_types=["disease"])
# Returns: [{"id": "EFO_0000249", "name": "Alzheimer disease", ...}]
For drugs:
# Search by drug name
results = search_entities("aspirin", entity_types=["drug"])
# Returns: [{"id": "CHEMBL25", "name": "ASPIRIN", ...}]
Identifiers used:
- Targets: Ensembl gene IDs (e.g., ENSG00000157764)
- Diseases: EFO (Experimental Factor Ontology) IDs (e.g., EFO_0000249)
- Drugs: ChEMBL IDs (e.g., CHEMBL25)
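Because each entity type has its own identifier scheme, it can help to check which kind of ID you are holding before calling a helper. The following is a minimal sketch, not part of scripts/query_opentargets.py: the function name `classify_identifier` and the regular expressions are assumptions inferred from the example IDs above, and real identifiers may not always fit them.

```python
import re

# Assumed ID shapes, inferred from the example identifiers in this document.
# These patterns are illustrative and may not cover every valid identifier.
ID_PATTERNS = {
    "target": re.compile(r"ENSG\d{11}"),       # e.g. ENSG00000157764
    "disease": re.compile(r"[A-Za-z]+_\d+"),   # e.g. EFO_0000249
    "drug": re.compile(r"CHEMBL\d+"),          # e.g. CHEMBL25
}


def classify_identifier(identifier):
    """Return 'target', 'disease', 'drug', or None if no pattern matches."""
    for entity_type, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(identifier):
            return entity_type
    return None
```

A gene symbol such as "BRCA1" matches none of the patterns and returns None, which signals that it should go through search_entities first.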
2. Query Target Information
Retrieve comprehensive target annotations to assess druggability and biology.
from scripts.query_opentargets import get_target_info
target_info = get_target_info("ENSG00000157764", include_diseases=True)
# Access key fields:
# - approvedSymbol: HGNC gene symbol
# - approvedName: Full gene name
# - tractability: Druggability assessments across modalities
# - safetyLiabilities: Known safety concerns
# - geneticConstraint: Constraint scores from gnomAD
# - associatedDiseases: Top disease associations with scores
Key annotations to review:
- Tractability: Small molecule, antibody, PROTAC druggability predictions
- Safety: Known toxicity concerns from multiple databases
- Genetic constraint: pLI and LOEUF scores indicating intolerance to loss-of-function variation (a proxy for essentiality)
- Disease associations: Diseases linked to the target with evidence scores
Refer to references/target_annotations.md for detailed information about all target features.
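To review tractability quickly, the assessments can be collapsed into one list of positive labels per modality. This sketch assumes each `tractability` entry is a dict with `modality`, `label`, and `value` keys, which is how the field is commonly queried in GraphQL; the helper's actual return shape may differ. The function name `tractable_modalities` and the example record are hypothetical.

```python
def tractable_modalities(target_info):
    """Map each modality to the tractability labels whose value is True."""
    summary = {}
    for entry in target_info.get("tractability") or []:
        if entry.get("value"):
            summary.setdefault(entry["modality"], []).append(entry["label"])
    return summary


# Illustrative record only; not real platform data.
example_target = {
    "approvedSymbol": "EXAMPLE1",
    "tractability": [
        {"modality": "SM", "label": "Approved Drug", "value": True},
        {"modality": "SM", "label": "Structure with Ligand", "value": True},
        {"modality": "AB", "label": "Approved Drug", "value": False},
    ],
}

positive = tractable_modalities(example_target)
```

Modalities with no positive assessment are left out of the result, so an empty dict means no tractability evidence was found in the record.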
3. Query Disease Information
Get disease details and associated targets/drugs.
from scripts.query_opentargets import get_disease_info
disease_info = get_disease_info("EFO_0000249", include_targets=True)
# Access fields:
# - name: Disease name
# - description: Disease description
# - therapeuticAreas: High-level disease categories
# - associatedTargets: Top targets with association scores
4. Retrieve Target-Disease Evidence
Get detailed evidence supporting a target-disease association.
from scripts.query_opentargets import get_target_disease_evidence
# Get all evidence
evidence = get_target_disease_evidence(
    ensembl_id="ENSG00000157764",
    efo_id="EFO_0000249"
)
# Filter by evidence type
genetic_evidence = get_target_disease_evidence(
    ensembl_id="ENSG00000157764",
    efo_id="EFO_0000249",
    data_types=["genetic_association"]
)
# Each evidence record contains:
# - datasourceId: Specific data source (e.g., "gwas_catalog", "chembl")
# - datatypeId: Evidence category (e.g., "genetic_association", "known_drug")
# - score: Evidence strength (0-1)
# - studyId: Original study identifier
# - literature: Associated publications
Major evidence types:
- genetic_association: GWAS, rare variants, ClinVar, gene burden
- somatic_mutation: Cancer Gene Census, IntOGen, cancer biomarkers
- known_drug: Clinical precedence from approved/clinical drugs
- affected_pathway: CRISPR screens, pathway analyses, gene signatures
- rna_expression: Differential expression from Expression Atlas
- animal_model: Mouse phenotypes from IMPC
- literature: Text-mining from Europe PMC
Refer to references/evidence_types.md for detailed descriptions of all evidence types and interpretation guidelines.
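Once evidence records are retrieved, a per-type summary shows which categories carry the association. The sketch below assumes each record is a dict with the `datatypeId` and `score` fields described above. The function name `summarize_evidence` is hypothetical and the sample scores are invented for illustration.

```python
from collections import defaultdict


def summarize_evidence(records):
    """Group records by datatypeId; report the count and max score per type."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["datatypeId"]].append(record["score"])
    return {
        datatype: {"count": len(scores), "max_score": max(scores)}
        for datatype, scores in grouped.items()
    }


# Made-up records that mimic the fields listed above.
sample_records = [
    {"datasourceId": "gwas_catalog", "datatypeId": "genetic_association", "score": 0.62},
    {"datasourceId": "gwas_catalog", "datatypeId": "genetic_association", "score": 0.90},
    {"datasourceId": "chembl", "datatypeId": "known_drug", "score": 0.70},
]

summary = summarize_evidence(sample_records)
```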
5. Find Known Drugs
Identify drugs used for a disease and their targets.
from scripts.query_opentargets import get_known_drugs_for_disease
drugs = get_known_drugs_for_disease("EFO_0000249")
# drugs contains:
# - uniqueDrugs: Total number of unique drugs
# - uniqueTargets: Total number of unique targets
# - rows: List of drug-target-indication records with:
#     - drug: {name, drugType, maximumClinicalTrialPhase}
#     - targets: Genes targeted by the drug
#     - phase: Clinical trial phase for this indication
#     - status: Trial status (active, completed, etc.)
#     - mechanismOfAction: How drug works
Clinical phases:
- Phase 4: Approved drug
- Phase 3: Late-stage clinical trials
- Phase 2: Mid-stage trials
- Phase 1: Early safety trials
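The phase values make it straightforward to narrow the rows to late-stage or approved drugs. This is a minimal sketch assuming `phase` is numeric and each row nests the drug name under `drug`, as outlined above. The function name `drugs_at_or_above_phase` and the placeholder drug names are hypothetical.

```python
# Labels taken from the phase list above.
PHASE_LABELS = {
    4: "Approved drug",
    3: "Late-stage clinical trials",
    2: "Mid-stage trials",
    1: "Early safety trials",
}


def drugs_at_or_above_phase(rows, min_phase):
    """Return sorted unique drug names whose indication phase is >= min_phase."""
    names = {
        row["drug"]["name"]
        for row in rows
        if row.get("phase") is not None and row["phase"] >= min_phase
    }
    return sorted(names)


# Placeholder rows; the drug names are not real.
sample_rows = [
    {"drug": {"name": "DRUG_A"}, "phase": 4},
    {"drug": {"name": "DRUG_B"}, "phase": 2},
    {"drug": {"name": "DRUG_A"}, "phase": 3},
    {"drug": {"name": "DRUG_C"}, "phase": 3},
]

late_stage = drugs_at_or_above_phase(sample_rows, min_phase=3)
```

A drug can appear in several rows (one per target and indication), so the names are collected in a set before sorting.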
6. Get Drug Information
Retrieve detailed drug information including mechanisms and indications.
from scripts.query_opentargets import get_drug_info
drug_info = get_drug_info("CHEMBL25")
# Access:
# - name, synonyms: Drug identifiers
# - drugType: Small molecule, antibody, etc.
# - maximumClinicalTrialPhase: Development stage
# - mechanismsOfAction: Target and action type
# - indications: Diseases with trial phases
# - withdrawnNotice: If withdrawn, reasons and countries
7. Get All Associations for a Target
Find all diseases associated with a target, optionally filtering by score.
from scripts.query_opentargets import get_target_associations
# Get associations with score >= 0.5
associations = get_target_associations(
    ensembl_id="ENSG00000157764",
    min_score=0.5
)
# Each association contains:
# - disease: {id, name}
# - score: Overall association score (0-1)
# - datatypeScores: Breakdown by evidence type
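The per-type breakdown can be used to see which evidence category drives each association. The sketch below assumes `datatypeScores` is a list of dicts with `id` and `score` keys; the helper's actual output may be shaped differently. The function names and sample values are hypothetical.

```python
def rank_associations(associations, min_score=0.5):
    """Keep associations at or above min_score, strongest first."""
    kept = [a for a in associations if a["score"] >= min_score]
    return sorted(kept, key=lambda a: a["score"], reverse=True)


def strongest_datatype(association):
    """Return the id of the highest-scoring data type, or None if absent."""
    scores = association.get("datatypeScores") or []
    if not scores:
        return None
    return max(scores, key=lambda s: s["score"])["id"]


# Invented associations for illustration.
sample_associations = [
    {"disease": {"id": "EXAMPLE_1", "name": "Disease A"}, "score": 0.45,
     "datatypeScores": [{"id": "literature", "score": 0.40}]},
    {"disease": {"id": "EXAMPLE_2", "name": "Disease B"}, "score": 0.80,
     "datatypeScores": [{"id": "genetic_association", "score": 0.75},
                        {"id": "known_drug", "score": 0.95}]},
]

ranked = rank_associations(sample_associations)
```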
Association scores:
- Range: 0-1 (higher = stronger evidence)
- Aggregate evidence across all data types using harmonic sum
- NOT confidence scores but relative ranking metrics
- Under-studied diseases may have lower scores despite good evidence
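To make the harmonic sum concrete, the sketch below implements its commonly described form: scores are sorted in descending order and the i-th score is divided by i squared, so additional weaker evidence adds progressively less. Dividing by π²/6 (the limit of the sum of 1/i²) to bring the result into the 0-1 range is an assumption made here, and the platform's data source weights are omitted, so this should not be read as the platform's exact scoring code.

```python
import math


def harmonic_sum(scores):
    """Sum of score_i / i**2 over scores sorted from highest to lowest."""
    ranked = sorted(scores, reverse=True)
    return sum(score / rank ** 2 for rank, score in enumerate(ranked, start=1))


def normalized_harmonic_sum(scores):
    """Scale the harmonic sum into 0-1 (assumed ceiling: pi**2 / 6)."""
    return harmonic_sum(scores) / (math.pi ** 2 / 6)


# One strong record outweighs ten weak ones under this scheme.
one_strong = harmonic_sum([0.9])
many_weak = harmonic_sum([0.3] * 10)
```

For the scores 0.9, 0.5, and 0.2 the raw sum is 0.9 + 0.5/4 + 0.2/9. Because the top-ranked record dominates, the result behaves as a ranking metric rather than a probability.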
GraphQL API Details
For custom queries beyond the provided helper functions, use the GraphQL API directly or modify scripts/query_opentargets.py.
Key information:
- Endpoint: https://api.platform.opentargets.org/api/v4/graphql
- Interactive browser: https://api.platform.opentargets.org/api/v4/graphql/browser
- No authentication required
- Request only needed fields to minimize response size
- Use pagination for large result sets: page: {size: N, index: M}
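The points above can be combined into a direct query that requests only a few fields and walks through pages. This is a sketch using only the standard library. The function names `post_graphql` and `iter_associated_targets` are hypothetical, and the query assumes the `disease(efoId:)` and `associatedTargets(page:)` fields with `rows`, `target`, and `score`; check the schema in the interactive browser before relying on it.

```python
import json
import urllib.request

ENDPOINT = "https://api.platform.opentargets.org/api/v4/graphql"

# Request only the fields needed; field names are assumed from the public schema.
QUERY = """
query associatedTargets($efoId: String!, $size: Int!, $index: Int!) {
  disease(efoId: $efoId) {
    associatedTargets(page: {size: $size, index: $index}) {
      count
      rows {
        target { id approvedSymbol }
        score
      }
    }
  }
}
"""


def post_graphql(query, variables):
    """POST a GraphQL query and return the decoded JSON response."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_associated_targets(efo_id, page_size=50, fetch=post_graphql):
    """Yield association rows page by page until a short or empty page."""
    index = 0
    while True:
        variables = {"efoId": efo_id, "size": page_size, "index": index}
        payload = fetch(QUERY, variables)
        rows = payload["data"]["disease"]["associatedTargets"]["rows"]
        if not rows:
            break
        yield from rows
        if len(rows) < page_size:
            break
        index += 1
```

The `fetch` parameter is injectable so the paging logic can be exercised offline with a stub; calling `list(iter_associated_targets("EFO_0000249"))` with the default would issue real network requests.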
Refer to `references/api_reference.