Hypogenic
Overview
Hypogenic provides automated hypothesis generation and testing using large language models to accelerate scientific discovery. The framework supports three approaches: HypoGeniC (data-driven hypothesis generation), HypoRefine (synergistic literature and data integration), and Union methods (mechanistic combination of literature and data-driven hypotheses).
Quick Start
Get started with Hypogenic in minutes:
# Install the package
uv pip install hypogenic
# Clone example datasets
git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
# Run basic hypothesis generation
hypogenic_generation --config ./data/your_task/config.yaml --method hypogenic --num_hypotheses 20
# Run inference on generated hypotheses
hypogenic_inference --config ./data/your_task/config.yaml --hypotheses output/hypotheses.json
Or use the Python API:
from hypogenic import BaseTask
# Create task with your configuration
task = BaseTask(config_path="./data/your_task/config.yaml")
# Generate hypotheses
task.generate_hypotheses(method="hypogenic", num_hypotheses=20)
# Run inference
results = task.inference(hypothesis_bank="./output/hypotheses.json")
When to Use This Skill
Use this skill when working on:
- Generating scientific hypotheses from observational datasets
- Testing multiple competing hypotheses systematically
- Combining literature insights with empirical patterns
- Accelerating research discovery through automated hypothesis ideation
- Domains requiring hypothesis-driven analysis: deception detection, AI-generated content identification, mental health indicators, predictive modeling, or other empirical research
Key Features
Automated Hypothesis Generation
- Generate 10-20+ testable hypotheses from data in minutes
- Iterative refinement based on validation performance
- Support for both API-based (OpenAI, Anthropic) and local LLMs
Literature Integration
- Extract insights from research papers via PDF processing
- Combine theoretical foundations with empirical patterns
- Systematic literature-to-hypothesis pipeline with GROBID
Performance Optimization
- Redis caching reduces API costs for repeated experiments
- Parallel processing for large-scale hypothesis testing
- Adaptive refinement focuses on challenging examples
Flexible Configuration
- Template-based prompt engineering with variable injection
- Custom label extraction for domain-specific tasks
- Modular architecture for easy extension
Proven Results
- 8.97% improvement over few-shot baselines
- 15.75% improvement over literature-only approaches
- 80-84% hypothesis diversity (non-redundant insights)
- Human evaluators report significant decision-making improvements
Core Capabilities
1. HypoGeniC: Data-Driven Hypothesis Generation
Generate hypotheses solely from observational data through iterative refinement.
Process:
- Initialize with a small data subset to generate candidate hypotheses
- Iteratively refine hypotheses based on performance
- Replace poorly-performing hypotheses with new ones from challenging examples
Best for: Exploratory research without existing literature, pattern discovery in novel datasets
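The refinement process above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not Hypogenic's implementation: `predict` and `propose` are hypothetical stand-ins for the LLM inference and generation calls, hypotheses are scored by plain accuracy on the examples seen so far, and an example counts as "challenging" when no hypothesis in the bank predicts it correctly.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (input text, ground-truth label)


def refine_bank(
    bank: List[str],
    examples: List[Example],
    predict: Callable[[str, str], str],              # hypothetical: (hypothesis, text) -> label
    propose: Callable[[List[Example]], List[str]],   # hypothetical: hard examples -> new hypotheses
    max_bank: int = 3,
    wrong_pool_size: int = 2,
) -> List[str]:
    """Toy refinement loop: score hypotheses, regenerate from hard examples."""
    correct: Dict[str, int] = {h: 0 for h in bank}
    seen: Dict[str, int] = {h: 0 for h in bank}
    wrong_pool: List[Example] = []

    def score(h: str) -> float:
        # Untested hypotheses get an optimistic score so they are tried at least once.
        return correct[h] / seen[h] if seen[h] else 1.0

    for text, label in examples:
        any_correct = False
        for h in correct:
            seen[h] += 1
            if predict(h, text) == label:
                correct[h] += 1
                any_correct = True
        if not any_correct:
            wrong_pool.append((text, label))

        # Once enough hard examples accumulate, ask for new hypotheses
        # and drop the lowest-scoring ones to keep the bank bounded.
        if len(wrong_pool) >= wrong_pool_size:
            for h in propose(wrong_pool):
                correct.setdefault(h, 0)
                seen.setdefault(h, 0)
            wrong_pool = []
            ranked = sorted(correct, key=score, reverse=True)
            for h in ranked[max_bank:]:
                del correct[h]
                del seen[h]

    return sorted(correct, key=score, reverse=True)
```

In practice `predict` and `propose` would wrap LLM calls; keeping them as plain callables makes the loop easy to test with deterministic stubs.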
2. HypoRefine: Literature and Data Integration
Synergistically combine existing literature with empirical data through an agentic framework.
Process:
- Extract insights from relevant research papers (typically 10 papers)
- Generate theory-grounded hypotheses from literature
- Generate data-driven hypotheses from observational patterns
- Refine both hypothesis banks through iterative improvement
Best for: Research with established theoretical foundations, validating or extending existing theories
3. Union Methods
Mechanistically combine literature-only hypotheses with framework outputs.
Variants:
- Literature ∪ HypoGeniC: Combines literature hypotheses with data-driven generation
- Literature ∪ HypoRefine: Combines literature hypotheses with integrated approach
Best for: Comprehensive hypothesis coverage, eliminating redundancy while maintaining diverse perspectives
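A minimal sketch of the union step, assuming each hypothesis bank is a list of strings. Redundancy is detected here by exact match on normalized text, which is a deliberately crude stand-in: the framework may use a semantic or LLM-based redundancy check instead. The function names are illustrative and are not part of the Hypogenic API.

```python
from typing import Iterable, List


def normalize(hypothesis: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(hypothesis.lower().split()).rstrip(".")


def union_hypotheses(literature: Iterable[str], generated: Iterable[str]) -> List[str]:
    """Merge two hypothesis banks, keeping the first copy of each duplicate."""
    seen = set()
    merged: List[str] = []
    for hypothesis in list(literature) + list(generated):
        key = normalize(hypothesis)
        if key and key not in seen:
            seen.add(key)
            merged.append(hypothesis.strip())
    return merged
```

Literature hypotheses are listed first, so when a duplicate appears the literature wording is the one retained.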
Installation
Install from PyPI (shown here with uv; plain pip install hypogenic also works):
uv pip install hypogenic
Optional dependencies:
- Redis server (port 6832): Enables caching of LLM responses to significantly reduce API costs during iterative hypothesis generation
- s2orc-doc2json: Required for processing literature PDFs in HypoRefine workflows
- GROBID: Required for PDF preprocessing (see Literature Processing section)
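To illustrate why caching reduces API costs, here is a minimal sketch of response caching keyed by a hash of the model name and prompt. A plain dict stands in for Redis so the example runs without a server; `call_llm` is a hypothetical callable, and the key layout is an assumption that does not reflect how Hypogenic stores its cache.

```python
import hashlib
import json
from typing import Callable, MutableMapping


def cache_key(model: str, prompt: str) -> str:
    """Derive a stable key from everything that determines the response."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_completion(
    model: str,
    prompt: str,
    call_llm: Callable[[str, str], str],   # hypothetical: (model, prompt) -> response
    cache: MutableMapping[str, str],       # a dict here; a Redis client in practice
) -> str:
    """Return a cached response if present, otherwise call the model once."""
    key = cache_key(model, prompt)
    if key in cache:
        return cache[key]
    response = call_llm(model, prompt)
    cache[key] = response
    return response
```

Iterative hypothesis generation tends to resend identical prompts across runs, so every cache hit is one API call that is not billed again.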
Clone example datasets:
# For HypoGeniC examples
git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data
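# For HypoRefine/Union examples
# (git refuses to clone into a non-empty directory, so pick a different target if ./data already exists)
git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data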
Dataset Format
Datasets must follow HuggingFace datasets format with specific naming conventions:
Required files:
- <TASK>_train.json: Training data
- <TASK>_val.json: Validation data
- <TASK>_test.json: Test data
Required keys in JSON:
- text_features_1 through text_features_n: Lists of strings containing feature values
- label: List of strings containing ground truth labels
Example (headline click prediction):
{
  "headline_1": [
    "What Up, Comet? You Just Got *PROBED*",
    "Scientists Made a Breakthrough in Quantum Computing"
  ],
  "headline_2": [
    "Scientists Everywhere Were Holding Their Breath Today. Here's Why.",
    "New Quantum Computer Achieves Milestone"
  ],
  "label": [
    "Headline 2 has more clicks than Headline 1",
    "Headline 1 has more clicks than Headline 2"
  ]
}
Important notes:
- All lists must have the same length
- Label format must match the output format of your extract_label() function
- Feature keys can be customized to match your domain (e.g., review_text, post_content)
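The format rules above are easy to check before starting a run. The following is a minimal sketch of such a check; `validate_dataset` is an illustrative helper, not a function provided by Hypogenic, and it assumes the label key is named `label` as in the example.

```python
from typing import Dict, List


def validate_dataset(data: Dict[str, List[str]], label_key: str = "label") -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems: List[str] = []

    if label_key not in data:
        problems.append(f"missing required key: {label_key!r}")
    if not [key for key in data if key != label_key]:
        problems.append("no feature keys found")

    lengths = {}
    for key, column in data.items():
        if not isinstance(column, list):
            problems.append(f"{key!r} is not a list")
            continue
        if not all(isinstance(value, str) for value in column):
            problems.append(f"{key!r} contains non-string values")
        lengths[key] = len(column)

    # Every feature list and the label list must line up row by row.
    if len(set(lengths.values())) > 1:
        problems.append(f"lists differ in length: {lengths}")
    return problems
```

A typical use is to load each of the train, validation, and test files with `json.load` and stop if any of them returns a non-empty list.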
Configuration
Each task requires a config.yaml file specifying:
Required elements:
- Dataset paths (train/val/test)
- Prompt templates for:
  - Observations generation
  - Batched hypothesis generation
  - Hypothesis inference
  - Relevance checking
  - Adaptive methods (for HypoRefine)
Template capabilities:
- Dataset placeholders for dynamic variable injection (e.g., ${text_features_1}, ${num_hypotheses})
- Custom label extraction functions for domain-specific parsing
- Role-based prompt structure (system, user, assistant roles)
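As an illustration of a custom label extraction function, the sketch below maps free-form model output onto the label strings from the headline example. Three things here are assumptions: the signature (one string in, one label string out), the convention that the inference prompt asks the model to end with "Final answer: Headline N", and the "other" sentinel for unparseable output. Check the library's expected signature before adapting it.

```python
import re

# Label strings copied from the headline click prediction example.
LABELS = {
    "1": "Headline 1 has more clicks than Headline 2",
    "2": "Headline 2 has more clicks than Headline 1",
}

# Assumed output convention: the model ends with "Final answer: Headline N".
_FINAL_ANSWER = re.compile(r"final answer:\s*headline\s*([12])", re.IGNORECASE)


def extract_label(text: str) -> str:
    """Map free-form model output to one of the dataset's label strings."""
    match = _FINAL_ANSWER.search(text or "")
    if match is None:
        return "other"  # sentinel for unparseable output (an assumption)
    return LABELS[match.group(1)]
```

Returning the exact strings used in the dataset's label list is what allows predictions to be compared directly against ground truth.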
Configuration structure:
task_name: your_task_name

train_data_path: ./your_task_train.json
val_data_path: ./your_task_val.json
test_data_path: ./your_task_test.json

prompt_templates:
  # Extra keys for reusable prompt components
  observations: |
    Feature 1: ${text_features_1}
    Feature 2: ${text_features_2}
    Observation: ${label}

  # Required templates
  batched_generation:
    system: "Your system prompt here"
    user: "Your user prompt with ${num_hypotheses} placeholder"

  inference:
    system: "Your inference system prompt"
    user: "Your inference user prompt"

  # Optional templates for advanced features
  few_shot_baseline: {...}
  is_relevant: {...}
  adaptive_inference: {...}
  adaptive_selection: {...}
Refer to references/config_template.yaml for a complete example configuration.
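To make the placeholder mechanics concrete, the sketch below fills the observations template for one dataset row using Python's standard `string.Template`, which happens to use the same `${name}` syntax. Whether Hypogenic uses `string.Template` internally is not stated here, so treat this as an illustration of variable injection rather than the library's code path; `render_observation` is a hypothetical helper.

```python
from string import Template
from typing import Dict, List

# Same text as the observations template in the configuration structure.
OBSERVATION_TEMPLATE = Template(
    "Feature 1: ${text_features_1}\n"
    "Feature 2: ${text_features_2}\n"
    "Observation: ${label}"
)


def render_observation(data: Dict[str, List[str]], index: int) -> str:
    """Fill the template with the values from one row of the dataset."""
    row = {key: column[index] for key, column in data.items()}
    # substitute() raises KeyError if the template names a key the row lacks,
    # which surfaces a mismatch between config and dataset early.
    return OBSERVATION_TEMPLATE.substitute(row)
```

If the dataset uses custom feature keys such as `headline_1`, the placeholders in the template must be renamed to match.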
Literature Processing (HypoRefine/Union Methods)
To use literature-based hypothesis generation, you must preprocess PDF papers.
Note: Run the commands below from the root of the cloned Hypogenic repository, not from this skill directory.
Step 1: Setup GROBID (first time only)
bash ./modules/setup_grobid.sh
Step 2: Add PDF files
Place research papers in literature/YOUR_TASK_NAME/raw/
Step 3: Process PDFs
# Start GROBID service
bash ./modules/run_grobid.sh
# Process PDFs for your task
cd examples
python pdf_preprocess.py --task_name YOUR_TASK_NAME
This converts the PDFs into a structured format for hypothesis extraction.
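Before starting GROBID, it can help to confirm that the papers are where Step 2 expects them. The helper below is a hypothetical convenience, not part of the repository; it assumes the literature/YOUR_TASK_NAME/raw/ layout described above, relative to the current directory.

```python
from pathlib import Path
from typing import List


def list_raw_pdfs(task_name: str, root: str = "literature") -> List[Path]:
    """Return the PDF files found in <root>/<task_name>/raw, sorted by name."""
    raw_dir = Path(root) / task_name / "raw"
    if not raw_dir.is_dir():
        raise FileNotFoundError(f"expected directory {raw_dir}")
    return sorted(p for p in raw_dir.iterdir() if p.suffix.lower() == ".pdf")
```

An empty result means there is nothing for the preprocessing script to convert, which is cheaper to discover before the GROBID service is running.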