biblio-check
Verifies academic citations against authoritative metadata APIs, classifies each entry into a three-tier confidence system, and re-emits references in the user's chosen style. Treats reference verification as a structured-data problem: the language model orchestrates, but the verdicts come from CrossRef, OpenAlex, PubMed, Semantic Scholar, and arXiv.
When to use this skill
Use it whenever the user wants to:
- check whether the references in their paper are real and correctly cited
- detect hallucinated or fabricated citations (especially in LLM-assisted drafts)
- reformat a bibliography into APA, MLA, AMA, Chicago, Vancouver, IEEE, or Harvard
- convert between bibliography formats
- audit a .bib or .ris file before submission
- find candidate supporting sources for a paper that has no references yet
If the user mentions any of citations, references, bibliography, works cited, DOI lookup, journal articles, in-text citations, or expresses worry about hallucinated sources, this is the right skill.
What this skill does NOT do
Reference verification is not the same as fact-checking. This skill confirms that a cited work exists and that its metadata (title, authors, year, journal, DOI) is correct. It does not confirm that the cited work actually supports the claim it is attached to in the body text. Make this distinction clear to the user when reporting results.
The three-tier classification
Every reference lands in one of three tiers (the first tier has three sub-flavors):
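- VERIFIED -- all substantive components (authors, title, year, journal, volume, issue, pages, DOI) match the canonical record returned by at least two independent verification APIs, or by a single API when the user supplied an exact identifier (DOI, arXiv ID, PMID) that resolves to that record. The citation can stay as written.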
- VERIFIED (style edits suggested) -- substantive content is correct, but cosmetic edits would bring it into strict conformity with the requested style (e.g. add a missing middle initial, italicize the journal name, convert hyphen to en-dash in page ranges). The original is still accurate; the edits are polish.
- VERIFIED (grey literature / non-academic source) -- not indexed in any academic database, which is normal for government reports, NGO publications, agency websites, and standards documents. The tool either confirmed existence via URL HEAD-resolution OR detected an institutional author plus named publisher (e.g. 'National Committee for Quality Assurance. ... Washington, DC: NCQA; 2025.'). Kept verbatim; no canonical reformat is attempted because there is no authoritative metadata to draw from. Adding the publication URL to citations of this kind makes verification more direct on future runs.
- PARTIAL -- a real paper matching most fields was found, but the citation as written has substantive errors (wrong year, wrong author surname, wrong DOI, wrong volume, etc.). The report shows a field-by-field diff table with severity and proposes a corrected form.
- HALLUCINATED -- no verification API returned a record that confidently matches the citation. This is the strongest signal of fabrication.
Single-source matches are treated as HALLUCINATED unless backed by an exact identifier (DOI, arXiv ID, PMID) the user provided. We require this because every individual API has data quality issues; consensus across sources is the only thing we trust.
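The decision rules above can be sketched roughly as follows. This is an illustrative reading of the tiers, not the tool's actual code: the Evidence fields, the classify function, and the order in which the checks run are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    VERIFIED = "VERIFIED"
    VERIFIED_STYLE = "VERIFIED (style edits suggested)"
    VERIFIED_GREY = "VERIFIED (grey literature / non-academic source)"
    PARTIAL = "PARTIAL"
    HALLUCINATED = "HALLUCINATED"


@dataclass
class Evidence:
    """Hypothetical summary of what the lookups found for one reference."""
    agreeing_sources: int = 0      # independent APIs returning a confident match
    exact_id_match: bool = False   # user-supplied DOI / arXiv ID / PMID resolved
    substantive_diffs: int = 0     # e.g. wrong year, surname, DOI, volume
    cosmetic_diffs: int = 0        # e.g. missing middle initial, hyphen vs en-dash
    grey_literature: bool = False  # URL resolved, or institutional author + publisher


def classify(ev: Evidence) -> Tier:
    # Consensus rule: two sources, or one source backed by an exact identifier.
    corroborated = ev.agreeing_sources >= 2 or (
        ev.agreeing_sources >= 1 and ev.exact_id_match
    )
    if corroborated:
        if ev.substantive_diffs:
            return Tier.PARTIAL
        if ev.cosmetic_diffs:
            return Tier.VERIFIED_STYLE
        return Tier.VERIFIED
    # Assumed ordering: grey-literature signals are checked only when the
    # academic databases did not corroborate the reference.
    if ev.grey_literature:
        return Tier.VERIFIED_GREY
    return Tier.HALLUCINATED
```

Under this reading, a lone match from one API with no identifier falls through to HALLUCINATED, which is the conservative behavior the single-source rule describes.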
Usage
The skill bundles a Python module scripts.main with two subcommands.
Audit an existing bibliography
python -m scripts.main audit <input> --style <style> --format <format> --out <dir>
- <input> is a path to a .pdf, .docx, .txt, .md, .bib, or .ris file
- <style> is one of apa, mla, ama, chicago, vancouver, ieee, harvard
- <format> controls which output artifacts are produced. Defaults to auto, which picks based on the input:
  - .docx input -> annotated .docx with tracked changes
  - .bib input -> corrected .bib
  - .ris input -> corrected .ris
  - .pdf, .txt, .md input -> annotated .docx with tracked changes
- You can request multiple formats: --format markdown docx. --format all emits everything.
- Available formats: markdown, json, docx, bibtex, ris.
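To make the auto default concrete, here is a minimal sketch of how the format selection described above might be resolved. The names AUTO_FORMAT, ALL_FORMATS, and resolve_formats are hypothetical and chosen for this example; only the extension-to-format mapping and the format names come from the text.

```python
from pathlib import Path

# Mapping from the list above: input extension -> default output format.
AUTO_FORMAT = {
    ".docx": "docx",
    ".bib": "bibtex",
    ".ris": "ris",
    ".pdf": "docx",
    ".txt": "docx",
    ".md": "docx",
}
ALL_FORMATS = ["markdown", "json", "docx", "bibtex", "ris"]


def resolve_formats(input_path: str, requested: list[str]) -> list[str]:
    """Hypothetical helper: expand 'auto' and 'all' into concrete format names."""
    if requested == ["all"]:
        return list(ALL_FORMATS)
    if not requested or requested == ["auto"]:
        suffix = Path(input_path).suffix.lower()
        if suffix not in AUTO_FORMAT:
            raise ValueError(f"unsupported input type: {suffix!r}")
        return [AUTO_FORMAT[suffix]]
    unknown = [name for name in requested if name not in ALL_FORMATS]
    if unknown:
        raise ValueError(f"unknown format(s): {unknown}")
    return list(requested)
```

For example, resolve_formats("thesis.docx", ["auto"]) yields ["docx"], while an explicit ["markdown", "docx"] is passed through unchanged.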
Suggest sources for a paper that has no references
python -m scripts.main suggest <input> --style apa --out suggestions.json
Finds candidate sources for factual claims in the paper. NEVER auto-inserts. The user must review each candidate and decide whether it actually supports the claim.
Installation
Before first use, install Python dependencies:
pip install -r requirements.txt
How Claude should drive this skill
When a user wants their bibliography checked:
- Confirm the style. If they haven't specified, ask. The styles are apa, mla, ama, chicago, vancouver, ieee, harvard.
- Confirm the output format. If the input is .docx, .bib, or .ris, the default is the same format back, and you can usually proceed without asking. If the input is .pdf, .txt, or .md, the default is an annotated .docx with tracked changes, but offer the markdown report as an alternative if the user prefers a read-only summary. Always honor an explicit request like "give me just a markdown report" or "I want a bibtex file".
- Locate the input. If the user pasted a bibliography in chat, save it to a .txt first.
- Run the audit.
  python -m scripts.main audit <input> --style <style> --format <chosen> --out <dir>
- Read audit.md (if produced) or audit.json and report a short summary: total checked, count per tier, and call out by name anything classified HALLUCINATED or PARTIAL.
- Present the output files using the present_files tool. Do NOT enumerate the contents of every entry inline; the user will read the report themselves.
Style choice when the user is undecided
If the user doesn't know which style they need:
- Medical or biomedical paper -> AMA or Vancouver
- Psychology, education, social sciences -> APA
- Humanities, literature -> MLA
- History, some social sciences -> Chicago
- Engineering, computer science -> IEEE
- UK / Anglo-Australian context -> Harvard
When in doubt, ask which journal or institution the user is targeting and look up its required style.
Examples of when to invoke
Invoke:
- "Can you check the references in my dissertation? It's a .docx. I'm worried some of them might be hallucinated, I used an LLM to draft."
- "Audit this bibliography for me, MLA style, and tell me which ones are fake."
- "Convert these references from APA to AMA."
- "Here's my works cited from my paper. Are these real?"
- "I have a .bib file. Verify every entry."
- "I have a paper draft with no citations yet. Suggest sources for the claims about CRISPR off-target effects."
Do not invoke:
- "Write me an essay on climate change." (drafting, not citation work)
- "What's the difference between APA 6 and APA 7?" (a knowledge question; just answer directly)
- "Fix the grammar in my paper." (proofreading, not citations)
How verification works internally
For each reference, the tool queries multiple authoritative APIs in parallel:
- CrossRef (api.crossref.org) for DOIs and general scholarly metadata
- OpenAlex (api.openalex.org) for broad coverage including non-DOI works
- PubMed E-utilities for biomedical literature
- Semantic Scholar for CS/AI coverage and citation graph
- arXiv for preprints
- scholarly (Google Scholar scraper) only when --use-scholarly is set, as a last-resort fallback
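The parallel fan-out over the sources listed above could look something like the sketch below. It assumes each source is wrapped in a callable that takes the citation text and returns a list of candidate records; the Lookup type and query_all function are hypothetical, and the real per-API clients are not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Assumed interface: a lookup takes the raw citation text and returns
# candidate records as plain dicts.
Lookup = Callable[[str], list[dict]]


def query_all(
    citation: str, sources: dict[str, Lookup], timeout: float = 10.0
) -> dict[str, list[dict]]:
    """Run every source lookup concurrently; a failing source yields []."""
    results: dict[str, list[dict]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fn, citation) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except Exception:
                # An outage or timeout in one API should not sink the audit;
                # that source simply contributes no candidates.
                results[name] = []
    return results
```

Treating a failed source as "no candidates" keeps the consensus rule intact: an unreachable API can never count as agreement, it can only reduce the number of sources available to agree.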
Candidates are scored by a weighted formula:
0.55 * title_fuzz + 0.25 * author_surname_overlap + 0.15 * year_score + 0.05 * doi_bonus
A score of 88+ from the best candidate AND agreement from at least two independent sources (or an exact ID match) is the bar for VERIFIED. Substantive vs cosmetic differences are then separated: substantive ones move the entry to PARTIAL; cosmetic ones leave it under VERIFIED with style-edit suggestions.
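As a worked illustration of the weighted formula, the sketch below assumes every component is scaled to 0-100, which is what makes the weights (summing to 1.0) compatible with a threshold of 88. The component definitions are stand-ins: difflib is used for title similarity, and the year and surname scoring rules are guesses, so the tool's real scorers may differ. The sample record is made up.

```python
from difflib import SequenceMatcher


def title_fuzz(a: str, b: str) -> float:
    """0-100 similarity of case- and whitespace-normalized titles (stand-in metric)."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return 100.0 * SequenceMatcher(None, norm(a), norm(b)).ratio()


def surname_overlap(cited: list[str], found: list[str]) -> float:
    """Assumed rule: percentage of cited surnames present in the candidate record."""
    cited_set = {s.lower() for s in cited}
    found_set = {s.lower() for s in found}
    if not cited_set:
        return 0.0
    return 100.0 * len(cited_set & found_set) / len(cited_set)


def year_score(cited: int, found: int) -> float:
    """Assumed rule: full credit for an exact year, half credit when off by one."""
    gap = abs(cited - found)
    return 100.0 if gap == 0 else 50.0 if gap == 1 else 0.0


def match_score(cited: dict, found: dict) -> float:
    cited_doi = (cited.get("doi") or "").lower()
    found_doi = (found.get("doi") or "").lower()
    doi_bonus = 100.0 if cited_doi and cited_doi == found_doi else 0.0
    return (
        0.55 * title_fuzz(cited["title"], found["title"])
        + 0.25 * surname_overlap(cited["surnames"], found["surnames"])
        + 0.15 * year_score(cited["year"], found["year"])
        + 0.05 * doi_bonus
    )


# Made-up record, for illustration only.
cited = {"title": "An Example Title", "surnames": ["Doe", "Roe"], "year": 2020, "doi": None}
found = dict(cited, year=2021)
score = match_score(cited, found)  # 55 + 25 + 7.5 + 0 = 87.5 under these assumptions
```

Under these assumed component rules, a citation with a perfect title and author match but a year off by one and no DOI scores 87.5, just under the 88 bar, which shows how little slack the threshold leaves once the title match is less than perfect.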