🦖 Semantic Similarity Index
Measure how isolated or connected disease research is across the global biomedical literature, using PubMedBERT embeddings of PubMed abstracts for 175 Global Burden of Disease (GBD) diseases.
What it does
- Takes a disease list (GBD taxonomy) as input
- Retrieves PubMed abstracts (2000-2025) for each disease with quality filtering
- Generates 768-dimensional PubMedBERT embeddings for every abstract
- Computes four semantic equity metrics per disease:
- **Semantic Isolation I
[Description truncated. See the full README on GitHub.]
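
As a rough illustration of how per-abstract embeddings could be compared across diseases, the sketch below averages each disease's vectors into a centroid and takes the mean cosine similarity to every other disease's centroid. This is an assumption made for illustration only: the project's actual metric definitions are not given in this description, and the function names (`centroid`, `cosine_similarity`, `mean_similarity_to_others`) and the toy 3-dimensional vectors are hypothetical stand-ins for the 768-dimensional PubMedBERT embeddings.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; assumes neither is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_similarity_to_others(embeddings_by_disease):
    """Hypothetical score: mean cosine similarity between one disease's
    centroid and the centroids of all other diseases. Lower values suggest
    a more isolated literature. Requires at least two diseases.
    Not the project's formula."""
    centroids = {d: centroid(v) for d, v in embeddings_by_disease.items()}
    scores = {}
    for disease, c in centroids.items():
        sims = [
            cosine_similarity(c, other_c)
            for other, other_c in centroids.items()
            if other != disease
        ]
        scores[disease] = sum(sims) / len(sims)
    return scores


# Toy data: 3-d vectors standing in for 768-d abstract embeddings.
toy_embeddings = {
    "disease_a": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "disease_b": [[0.9, 0.1, 0.0]],
    "disease_c": [[0.0, 0.0, 1.0]],
}
scores = mean_similarity_to_others(toy_embeddings)
```

In this toy data, `disease_c` points in a direction orthogonal to the other two, so its score is the lowest of the three. Averaging into a centroid is only one possible design; it discards the spread of abstracts within a disease, which a fuller metric might want to keep.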