Orchestration Log: When this skill is activated, append a log entry to outputs/orchestration_log.md:

### Skill Activation: Verification Engine
**Timestamp:** [current date/time]
**Actor:** AI Agent (verification-engine)
**Input:** [brief description of the verification request]
**Output:** [brief description of results — e.g., "Verified 42 citations: 35 VERIFIED, 5 PLAUSIBLE, 2 MISMATCH"]
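
A minimal sketch of how the log entry above could be appended, assuming a Python helper is available to the agent. The function names `format_log_entry` and `append_log_entry` are hypothetical, and the ISO 8601 timestamp format is an assumption, since the template only asks for the current date/time.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_log_entry(input_desc, output_desc, now=None):
    """Render one log entry using the template fields from this document."""
    now = now or datetime.now(timezone.utc)
    return (
        "### Skill Activation: Verification Engine\n"
        f"**Timestamp:** {now.isoformat(timespec='seconds')}\n"  # assumed format
        "**Actor:** AI Agent (verification-engine)\n"
        f"**Input:** {input_desc}\n"
        f"**Output:** {output_desc}\n"
    )


def append_log_entry(log_path, input_desc, output_desc):
    """Append an entry to the log file, creating parent folders if needed."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write("\n" + format_log_entry(input_desc, output_desc))
```

Calling `append_log_entry("outputs/orchestration_log.md", "Verify citations in paper.tex", "Verified 42 citations: 35 VERIFIED, 5 PLAUSIBLE, 2 MISMATCH")` would add one entry without touching earlier ones.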

Verification Engine

Core Principle

A citation is only as good as its accuracy. This engine systematically checks whether the papers you cite actually say what you claim they say. It fetches real source material — abstracts at minimum, full text when available — and compares each attribution claim against the actual content.

This addresses the dominant failure mode of LLM-generated academic writing: citation hallucination and misattribution. Even when citations point to real papers (no fabricated DOIs), the attributed claims may not match what the source actually says.

When to Activate

User says "verify citations", "check my references", "validate sources"
Before final submission of any paper with 20+ references
After major revisions that added new citations
When a reviewer questions citation accuracy
As Phase 7 in the paper-machine pipeline (optional quality gate)

Step 1: Extract Citation Claims

From LaTeX (`paper.tex`)

Scan the .tex file for all citation commands and extract the surrounding context:

\citep{key}         → parenthetical: "... as shown previously (Author, Year)."
\citet{key}         → textual: "Author (Year) demonstrated that ..."
\citeauthor{key}    → author name reference
\citeyear{key}      → year reference

For each citation occurrence, extract:

Citation key (the BibTeX key)
Claim context — the full sentence containing the citation, plus the preceding sentence if needed for meaning. This is the "attributed claim."
Section — which section of the paper contains this citation
Claim type — classify as:
- Specific finding ("X found that Y increases Z by 78%") — highest verification priority
- General attribution ("X surveys this space") — medium priority
- Methodological reference ("following X's framework") — lower priority
- Existence citation ("see X for a review") — lowest priority
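
The extraction described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's actual implementation: the function name `extract_latex_citations` is hypothetical, sentences are split naively on end punctuation (so abbreviations such as "et al." will split early), and only `\section` headings are tracked.

```python
import re

# The four natbib-style commands listed above, with an optional star and
# optional [..] arguments, e.g. \citep[p.~3]{key1,key2}.
CITE_RE = re.compile(
    r"\\(citep|citet|citeauthor|citeyear)\*?(?:\[[^\]]*\])*\{([^}]+)\}"
)
SECTION_RE = re.compile(r"\\section\*?\{([^}]*)\}")


def extract_latex_citations(tex):
    """Return one record per cited key with its sentence and section."""
    records = []
    # Splitting on a pattern with one capture group yields
    # [preamble, title1, body1, title2, body2, ...].
    parts = SECTION_RE.split(tex)
    chunks = [(None, parts[0])] + list(zip(parts[1::2], parts[2::2]))
    for section, body in chunks:
        # Naive sentence split: break after . ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", body.strip()):
            for match in CITE_RE.finditer(sentence):
                for key in match.group(2).split(","):
                    records.append({
                        "key": key.strip(),
                        "command": match.group(1),
                        "claim": sentence.strip(),
                        "section": section,
                    })
    return records
```

Claim type is left out of the sketch on purpose: classifying a sentence as a specific finding or an existence citation needs judgment that a regular expression cannot supply.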

From Markdown (`draft.md`)

Same logic, but scan for (Author, Year) and Author (Year) patterns instead of LaTeX citation commands.
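
As a rough sketch of that pattern scan, assuming single-citation parentheses and simple surnames: the regular expressions and the name `extract_markdown_citations` are illustrative only. Multi-source parentheticals such as "(Wang, 2023; Smith, 2020)" and three-author lists would need extra patterns.

```python
import re

NAME = r"\b[A-Z][A-Za-z'\-]+"
# One surname, optionally followed by "et al." or "and/& Surname".
AUTHORS = rf"{NAME}(?:\s+et\s+al\.?|\s+(?:and|&)\s+{NAME})?"
YEAR = r"(?:19|20)\d{2}[a-z]?"

PAREN_RE = re.compile(rf"\(({AUTHORS}),?\s+({YEAR})\)")  # (Author, Year)
TEXT_RE = re.compile(rf"({AUTHORS})\s+\(({YEAR})\)")     # Author (Year)


def extract_markdown_citations(text):
    """Find author-year citations and return them in document order."""
    found = []
    for style, pattern in (("parenthetical", PAREN_RE), ("textual", TEXT_RE)):
        for m in pattern.finditer(text):
            found.append({
                "author": m.group(1),
                "year": m.group(2),
                "style": style,
                "pos": m.start(),
            })
    return sorted(found, key=lambda r: r["pos"])
```

Because Markdown drafts carry no BibTeX key, the author and year pair would still have to be mapped to an entry in the reference list before Step 2.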

Group by Source

Multiple citations of the same paper should be grouped. One paper may be cited 5 times with 5 different claims — each claim needs independent verification.

Output: A list of {key, claim, section, claim_type, priority} tuples.
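
One possible shape for these tuples, with grouping by source, is sketched below. The claim type labels and the numeric priority values (1 = highest) are assumptions made for illustration; the document only fixes their relative order.

```python
from collections import defaultdict
from typing import NamedTuple

# Assumed numeric encoding of the four claim types, highest priority first.
PRIORITY = {
    "specific_finding": 1,
    "general_attribution": 2,
    "methodological_reference": 3,
    "existence_citation": 4,
}


class CitationClaim(NamedTuple):
    key: str
    claim: str
    section: str
    claim_type: str
    priority: int


def make_claim(key, claim, section, claim_type):
    """Build a claim tuple, deriving priority from the claim type."""
    return CitationClaim(key, claim, section, claim_type, PRIORITY[claim_type])


def group_by_source(claims):
    """Group claims by BibTeX key; each claim stays a separate item."""
    grouped = defaultdict(list)
    for c in claims:
        grouped[c.key].append(c)
    for items in grouped.values():
        items.sort(key=lambda c: c.priority)  # most important claim first
    return dict(grouped)
```

Grouping keeps every claim, so a paper cited five times still yields five verification tasks while its source text only has to be fetched once.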

Step 2: Match to BibTeX Entries

For each citation key, look up the entry in references.bib:

Extract: title, author, year, doi, journal, note
If DOI exists: this is the primary lookup key for Step 3
If no DOI: use title + first author as search query
Flag any citation keys that have NO matching BibTeX entry (orphan citations)

Output: Enriched list with DOI and metadata for each citation.
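
The matching rules above might look like this in code. The sketch assumes references.bib has already been parsed into a dict keyed by BibTeX key (for example by a BibTeX parsing library), with lowercase field names; the function names are hypothetical.

```python
def first_author_surname(author_field):
    """Take the surname of the first author from a BibTeX author field."""
    first = author_field.split(" and ")[0].strip()
    if "," in first:                      # "Surname, Given" form
        return first.split(",")[0].strip()
    return first.split()[-1] if first else ""  # "Given Surname" form


def build_lookup(entry):
    """Prefer the DOI; otherwise fall back to title plus first author."""
    doi = entry.get("doi", "").strip()
    if doi:
        return {"method": "doi", "query": doi}
    title = entry.get("title", "").strip()
    surname = first_author_surname(entry.get("author", ""))
    return {"method": "title_author", "query": f"{title} {surname}".strip()}


def match_to_bib(cited_keys, bib_entries):
    """Return (enriched entries, orphan keys) for the cited keys."""
    enriched, orphans = [], []
    for key in dict.fromkeys(cited_keys):  # deduplicate, keep first-seen order
        entry = bib_entries.get(key)
        if entry is None:
            orphans.append(key)            # cited but missing from the .bib file
            continue
        enriched.append({"key": key, **entry, "lookup": build_lookup(entry)})
    return enriched, orphans
```

Orphan keys are returned separately rather than dropped, so they can be reported to the author instead of silently skipped.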

Step 3: Fetch Source Material (3-Tier Retrieval)

For each unique referenced paper (not each citation — deduplicate by BibTeX key):

Tier A — Abstract Retrieval (always attempt)

This is the baseline. Fast, reliable, works for any paper with a DOI or indexed title.

Strategy (try in order, stop at first success):

Search Semantic Scholar by title (use academic_search_semantic_scholar MCP tool):
- Set max_results: 3 (to find the best match)
- Enable full abstract retrieval to get the complete abstract text
- Semantic Scholar also provides TLDR summaries — use both
Search OpenAlex by title (use academic_search_openalex MCP tool):
- Broader coverage (474M+ works), good for non-CS papers
- CC0 data, reliable abstracts
Search CrossRef by title (use academic_search_crossref MCP tool):
- Best for DOI verification
- Abstracts sometimes available

What you get: Title confirmation, abstract (50-300 words), TLDR (1-2 sentences), citation count, open access status, and PDF URL if available.
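
The "try in order, stop at first success" strategy can be sketched as a simple fallback chain. The MCP tools are not called directly here; each provider is passed in as a plain function, and both that calling convention and the `abstract` result field are assumptions for illustration.

```python
def fetch_abstract(title, providers):
    """Try each provider in order and return the first usable result.

    providers: ordered list of (name, search_fn) pairs, where
    search_fn(title) returns a dict with an "abstract" key, or None.
    """
    for name, search in providers:
        try:
            result = search(title)
        except Exception:
            continue  # treat a failing provider as a miss and move on
        if result and result.get("abstract"):
            return {"provider": name, **result}
    return None  # no provider returned an abstract
```

In use, the list would hold wrappers around the three tools in the order given above, for example `[("semantic_scholar", s2_search), ("openalex", openalex_search), ("crossref", crossref_search)]`, where the three wrapper names are placeholders.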

Tier B — Full-Text Retrieval (when available)

For open-access papers, go beyond the abstract:

Check for open-access PDF URL in the API response metadata
- Semantic Scholar: openAccessPdf.url field
- OpenAlex: open_access.oa_url field
arXiv preprints: If DOI starts with 10.48550/arxiv. or BibTeX key suggests arXiv, construct the PDF URL: https://arxiv.org/pdf/{arxiv_id}
Fetch the PDF:
- Use WebFetch with the PDF URL to get content, OR
- Download to /tmp/verify_papers/{bib_key}.pdf and use the Read tool (supports PDFs up to 100 pages)
Extract relevant sections: Don't read the entire paper. Search for:
- The abstract (always)
- The introduction (usually contains the paper's key claims)
- The results/findings section (for empirical papers)
- The conclusion

When to use Tier B:

Paper is a load-bearing citation (Tier 1 priority)
Abstract alone is insufficient to verify the specific claim
Paper is open-access (arXiv, DOAJ, PLoS, MDPI, Frontiers, etc.)
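
A small sketch of how a PDF URL could be resolved for Tier B, using only the fields and the URL template named above. The function names are hypothetical, and the records are assumed to be the API responses already decoded into dicts.

```python
ARXIV_DOI_PREFIX = "10.48550/arxiv."


def arxiv_pdf_url(doi):
    """Build an arXiv PDF URL from an arXiv DOI, or return None."""
    if not doi:
        return None
    doi = doi.strip()
    # DOIs are case-insensitive, so compare the prefix in lowercase.
    if not doi.lower().startswith(ARXIV_DOI_PREFIX):
        return None
    arxiv_id = doi[len(ARXIV_DOI_PREFIX):]
    return f"https://arxiv.org/pdf/{arxiv_id}" if arxiv_id else None


def open_access_pdf_url(s2_record=None, openalex_record=None, doi=None):
    """Check Semantic Scholar, then OpenAlex, then fall back to arXiv."""
    s2_pdf = (s2_record or {}).get("openAccessPdf") or {}
    if s2_pdf.get("url"):
        return s2_pdf["url"]
    oa = (openalex_record or {}).get("open_access") or {}
    if oa.get("oa_url"):
        return oa["oa_url"]
    return arxiv_pdf_url(doi)
```

A `None` result means no open-access copy was found, in which case verification falls back to the Tier A abstract.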

Tier C — Extension Point (Future)

For large-scale verification (100+ papers) or complex documents with tables/figures:

LlamaParse MCP: Add llamacloud-mcp server to plugin.json for high-fidelity PDF parsing with table extraction and semantic chunking
SemTools CLI: Install semtools for parse and search commands over local PDF collections
Local vector store: Index all retrieved papers for semantic search across the entire reference corpus

Not implemented in v5.1.0 — documented here as the upgrade path.

Step 4: Verify Claims Against Sources

For each {claim, source_content} pair, perform the comparison:

Classification Rubric

| Status | Criteria | Evidence Required |
| --- | --- | --- |
| VERIFIED | Claim directly supported by source | Quote or close paraphrase found in abstract/text |
| PLAUSIBLE | Abstract consistent, claim reasonable | Topic match, no contradiction, but specific claim not explicit |
| MISMATCH | Claim contradicts or misrepresents source | Source says something different from what's attributed |
| UNVERIFIABLE | Couldn't access source content | No abstract found, paywalled, API returned nothing |
| NOT FOUND | Paper doesn't appear to exist | DOI doesn't resolve, title search returns nothing |
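
The five statuses in the rubric could be represented as an enumeration, with a helper that produces the summary line used in the orchestration log. This is a sketch; the names `Status` and `summarize` are illustrative, and the summary wording simply mirrors the example in the log template.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    VERIFIED = "VERIFIED"
    PLAUSIBLE = "PLAUSIBLE"
    MISMATCH = "MISMATCH"
    UNVERIFIABLE = "UNVERIFIABLE"
    NOT_FOUND = "NOT FOUND"


def summarize(statuses):
    """Build a one-line tally, listing statuses in rubric order."""
    counts = Counter(statuses)
    parts = [f"{counts[s]} {s.value}" for s in Status if counts[s]]
    return f"Verified {len(statuses)} citations: " + ", ".join(parts)
```

Statuses with a zero count are omitted, which keeps the log line short when most citations verify cleanly.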

Verification Process

For each citation claim:

Read the claim carefully. What exactly is being attributed to this source?
- A specific number or statistic? → must match exactly
- A general finding or argument? → abstract should clearly support it
- A methodological approach? → source should describe that method
- A classification or taxonomy? → source should contain it
Read the source material. What does the paper actually say?
- Start with TLDR/abstract
- For specific claims, search full text if available (Tier B)
Compare. Does the claim match the source?
- VERIFIED: "Asai et al. found GPT-4o fabricates citations 78-90% of the time" → Abstract states: "Without retrieval, GPT-4o generates citations that are 78-90% fabricated" → Direct match
- PLAUSIBLE: "Wang et al. survey autonomous agents" → Abstract is about LLM-based autonomous agents → Topic matches, general claim
- MISMATCH: "Smith et al. found productivity increased by 55%" → Abstract states: "14% increase in productivity" → Wrong number
Record evidence. For each classification, note:
- The specific claim text
- The relevant source quote (from abstract or full text)
- Brief justification for the classification
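
For the "specific number or statistic" case, a cheap pre-check can flag claims whose numbers never appear in the source before the closer reading described above. The sketch below is only a heuristic under stated assumptions: it compares digit strings literally, so digits inside names such as "GPT-4o" count as numbers, and a missing number is a prompt to look closer rather than proof of a MISMATCH. The function name is hypothetical.

```python
import re

NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def missing_numbers(claim, source_text):
    """Return numbers that appear in the claim but not in the source."""
    claimed = set(NUMBER_RE.findall(claim))
    present = set(NUMBER_RE.findall(source_text))
    return sorted(claimed - present)


# The two numeric examples from the comparison step above.
mismatch = missing_numbers(
    "Smith et al. found productivity increased by 55%",
    "14% increase in productivity",
)
match = missing_numbers(
    "Asai et al. found GPT-4o fabricates citations 78-90% of the time",
    "Without retrieval, GPT-4o generates citations that are 78-90% fabricated",
)
```

An empty result does not establish VERIFIED on its own, since the number may appear in the source in a different context; the evidence quote still has to be recorded.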

Batch Processing

Process citations in order of priority tier (see Prioritization Strategy below). After com
