Citation Audit

🔒 Do not wrap this skill in /loop, /schedule, or CronCreate. It is verdict-bearing — it judges bibliographic correctness. Re-running that verdict on a timer adds no new signal (it changes only when the bibliography changes). Schedule the external wait that precedes it — bibliography finalized → then audit once. See shared-references/external-cadence.md.

Verify every \cite{...} in a paper against three independent layers:

Existence — the cited paper actually exists at the claimed arXiv ID / DOI / venue.
Metadata correctness — author names, year, venue, and title match canonical sources (DBLP, arXiv, ACL Anthology, Nature, OpenReview, etc.).
Context appropriateness — the cited paper actually supports the claim it is being used to support in the manuscript.

This skill is the fourth layer of ARIS's evidence-and-claim assurance, complementing experiment-audit (code), result-to-claim (science verdict), and paper-claim-audit (numerical claims). Together they form a bottom-up integrity stack from raw evaluation code to manuscript bibliography.

When to Use This Skill

Run before submission. The right gating point is:

After paper-write has produced the LaTeX draft and bib file
After paper-claim-audit has verified numerical claims
Before final paper-compile for submission

Do not run this on a half-written draft — most of the work is in cross-checking each \cite against context, which is wasted on placeholder text.

What This Skill Catches

The dangerous citation problems are not the wildly fake citations, which are easy to spot. The dangerous ones are:

Wrong-context citations: real paper, but the cited claim is not what that paper actually establishes (e.g., citing Self-Refine to support "self-feedback produces correlated errors" — Self-Refine actually argues the opposite).
Author hallucinations: anonymous-author placeholders that slipped through, missing co-authors, wrong order.
Title drift: arXiv v1 vs v3 with different titles silently merged.
Venue confusion: arXiv preprint cited although the paper now has an official venue (CVPR/ICML/NeurIPS), so the entry uses the wrong record.
Year mismatch: arXiv 2023 preprint with 2024 conference acceptance, year reported inconsistently.
Phantom DOIs: DOI looks real but does not resolve.
Self-citation drift: your own prior work cited with year off by one.

Constants

REVIEWER_MODEL = gpt-5.5 — Used via Codex MCP. Default for cross-model review with web access.
CONTEXT_POLICY = fresh — Each audit run uses a new reviewer thread (REVIEWER_BIAS_GUARD). Never codex-reply.
WEB_SEARCH = required — The reviewer must perform real web/DBLP/arXiv lookups, not pattern-match from memory.
OUTPUT = CITATION_AUDIT.md — Human-readable per-entry verdict report.
STATE = CITATION_AUDIT.json — Machine-readable verdict ledger consumable by downstream tools.
SOFT_ONLY = false — When true (set via — soft-only / — soft_only flag), the audit runs all three layers normally but forbids any .bib file mutation. Findings that would otherwise mutate the bib (FIX / REPLACE / REMOVE) are translated into per-occurrence sentence-rewrite proposals against the citing *.tex files. Used by /resubmit-pipeline Phase 1 to honor the user's hard "freeze the bib" constraint.
RENDER_HTML = true — When true (default), auto-render CITATION_AUDIT.md to HTML after writing the report. Uses full Codex review gate (audit-class artifact — render-fidelity check matches the skill's cross-model audit invariant). Set false to skip, or pass — render html: false.

Workflow

Step 1: Discover bib file and section files

Locate:

references.bib (or paper.bib / similar) under the paper directory
All *.tex files containing \cite{...} calls (typically sec/ or sections/)

If multiple bib files exist, audit each separately.
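The discovery step could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name `discover_inputs` is hypothetical, and matching natbib-style variants such as `\citep` or `\citet` in addition to plain `\cite` is an assumption.

```python
import re
from pathlib import Path

# Matches \cite{...} plus common variants (\citep, \citet, ...), with optional
# [..] arguments. Covering the variants is an assumption, not part of the spec.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{[^}]+\}")

def discover_inputs(paper_dir):
    """Return (bib_files, tex_files_with_cites) found under paper_dir, sorted."""
    root = Path(paper_dir)
    bib_files = sorted(root.rglob("*.bib"))
    tex_files = sorted(
        p for p in root.rglob("*.tex")
        if CITE_RE.search(p.read_text(encoding="utf-8", errors="replace"))
    )
    return bib_files, tex_files
```

Each path in the returned bib list would then be audited separately, as described above.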

Step 2: Extract all (cite-key, context) pairs

For each \cite{key1,key2,...} invocation in the paper:

Record the cite key
Record the file + line number
Record the surrounding sentence (≥ 1 full sentence around the cite, for context check)

Output a flat list of (key, file, line, surrounding_sentence) tuples.

Also build the inverse: for each bib entry, the list of all places it is cited.
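A rough sketch of the extraction and the inverse index, under stated assumptions: the helper names are hypothetical, the sentence splitter is deliberately naive (it breaks on `.`, `!`, or `?` followed by whitespace, so abbreviations such as "et al." will split early), and handling of `\citep`-style variants is an assumption.

```python
import re
from collections import defaultdict

CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]+)\}")
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")

def surrounding_sentence(text, start, end):
    """Return the (naively delimited) sentence containing text[start:end]."""
    left = 0
    for brk in SENTENCE_BREAK.finditer(text, 0, start):
        left = brk.end()  # keep the last break before the cite
    nxt = SENTENCE_BREAK.search(text, end)
    right = nxt.start() if nxt else len(text)
    return " ".join(text[left:right].split())  # collapse whitespace

def extract_contexts(tex_sources):
    """tex_sources: dict of file name -> LaTeX text.
    Returns a flat list of (key, file, line, surrounding_sentence) tuples."""
    rows = []
    for fname, text in tex_sources.items():
        for m in CITE_RE.finditer(text):
            line = text.count("\n", 0, m.start()) + 1  # 1-indexed line number
            sentence = surrounding_sentence(text, m.start(), m.end())
            for key in (k.strip() for k in m.group(1).split(",")):
                if key:
                    rows.append((key, fname, line, sentence))
    return rows

def build_inverse(rows):
    """Map each cite key to the list of (file, line) places where it is cited."""
    by_key = defaultdict(list)
    for key, fname, line, _sentence in rows:
        by_key[key].append((fname, line))
    return dict(by_key)
```

A multi-key invocation such as `\cite{a,b}` yields one tuple per key, all sharing the same file, line, and sentence.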

Define two protocol sets used throughout the rest of the workflow: cited_keys is the set of unique cite keys appearing in any \cite{...} invocation across the audited *.tex files (de-duplicated), and bib_keys is the set of keys parsed from the audited bib file(s). cited_keys drives Step 3 (audit only cited entries); bib_keys \ cited_keys is the uncited residual surfaced by the --uncited opt-in.

If the user passed --uncited, also compute the set difference bib_keys \ cited_keys here and stash it for use in Step 5 and the JSON aggregation; see "Uncited Entry Detection (opt-in)" below for the protocol. The set-diff is a string operation only and does not consume reviewer budget.
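The set difference can be illustrated with a small sketch. The regex-based key parser and both function names are assumptions made for illustration; a real run might prefer a proper BibTeX parser.

```python
import re

# "@type{key," at the start of an entry. Regex parsing is an assumption made
# for brevity; it does not handle every BibTeX edge case.
BIB_ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
NON_ENTRY_TYPES = {"string", "preamble", "comment"}

def parse_bib_keys(bib_text):
    """Return the set of entry keys (bib_keys) found in a .bib file's text."""
    return {
        key
        for entry_type, key in BIB_ENTRY_RE.findall(bib_text)
        if entry_type.lower() not in NON_ENTRY_TYPES
    }

def uncited_residual(bib_keys, cited_keys):
    """bib_keys \\ cited_keys: entries present in the bib but never cited."""
    return sorted(set(bib_keys) - set(cited_keys))
```

Because this is pure string and set work, it can run locally before any reviewer call is made.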

Save the extracted contexts to paper/.aris/citation-audit/contexts.txt so the reviewer can read it directly. Use the paper-dir-relative path .aris/citation-audit/contexts.txt when recording the file in audited_input_hashes; do not stage under /tmp or other transient locations that the verifier cannot rehash later.

Step 3: Send each entry to fresh cross-model reviewer

For each cited bib entry — i.e., each key in cited_keys with at least one extracted citation context — invoke mcp__codex__codex (NOT codex-reply — fresh thread per entry, or batch with explicit per-entry isolation). Do not send entries in bib_keys \ cited_keys to the reviewer; those are detect-only and surface only when --uncited is explicitly enabled (see "Uncited Entry Detection" below).

mcp__codex__codex:
  model: gpt-5.5
  config: {"model_reasoning_effort": "xhigh"}
  sandbox: read-only
  prompt: |
    You are auditing a bibliographic entry. Use web/DBLP/arXiv search.

    ## Bib entry
    @article{key2024example,
      author = {...}, title = {...}, journal = {...}, year = {...}, ...
    }

    ## Where this entry is cited in the paper
    [paste extracted contexts]

    For this entry, verify:
    1. EXISTENCE: does this paper exist at the claimed arXiv ID / DOI / venue?
       Output: YES / NO / UNCERTAIN, with the verifying URL.
    2. METADATA: are author names, year, venue, title correct?
       For each, output: correct / wrong: should be ... / typo: ...
    3. CONTEXT: for each use, does the cited paper actually support the surrounding claim?
       Output per-use: SUPPORTS / WEAK / WRONG, with one-sentence reasoning.

    VERDICT: KEEP / FIX / REPLACE / REMOVE
    - KEEP: entry is clean, all uses are appropriate
    - FIX: metadata needs correction; uses are appropriate
    - REPLACE: cite is wrong-context, find a different paper that actually supports the claim
    - REMOVE: entry is hallucinated or unsupportable

    Be honest. If you cannot verify online, say UNCERTAIN; do not guess.

Save the response to .aris/traces/citation-audit/<date>_runNN/<key>.md per the review-tracing protocol.
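One way to read the verdict back out of a saved response is sketched below. It assumes the reviewer echoes a line of the form `VERDICT: <value>` as the prompt requests; the function name is hypothetical, and returning `None` for a missing or ambiguous verdict is a design choice in the spirit of "do not guess", not something the skill prescribes.

```python
import re

ALLOWED_VERDICTS = {"KEEP", "FIX", "REPLACE", "REMOVE"}
# Tolerates optional markdown bold around the label, e.g. "**VERDICT:** FIX".
VERDICT_RE = re.compile(r"^\s*\**VERDICT\**\s*:\s*\**\s*([A-Z]+)", re.MULTILINE)

def parse_verdict(response_text):
    """Return the single verdict in the response, or None if missing/ambiguous."""
    found = VERDICT_RE.findall(response_text)
    verdicts = {v for v in found if v in ALLOWED_VERDICTS}
    # Exactly one distinct, recognized verdict is required; anything else
    # is left for a human to resolve rather than guessed.
    return verdicts.pop() if len(verdicts) == 1 else None
```

Entries that come back as `None` would need to be re-read manually before aggregation.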

Step 4: Aggregate verdicts

Build CITATION_AUDIT.json following the schema defined in "Submission Artifact Emission" below (single authoritative schema for this file). Per-entry ledger data goes under details.per_entry, not under a top-level entries field. The top-level verdict is a single overall value (PASS / WARN / FAIL / NOT_APPLICABLE / BLOCKED / ERROR) derived from per-entry verdicts per the decision table in "Submission Artifact Emission"; the top-level summary is a one-line human-readable string.
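As a rough illustration of the tallying part of this step only, the `total_entries` and `counts` fields of details could be computed as sketched here. The function name is hypothetical, emitting zero counts for unused verdicts is an assumption, and the overall PASS / WARN / FAIL derivation is intentionally left out because it follows the decision table in "Submission Artifact Emission".

```python
from collections import Counter

VERDICTS = ("KEEP", "FIX", "REPLACE", "REMOVE")

def tally(per_entry_verdicts):
    """per_entry_verdicts: dict of cite key -> per-entry verdict string."""
    unknown = {k: v for k, v in per_entry_verdicts.items() if v not in VERDICTS}
    if unknown:
        raise ValueError(f"unrecognized verdicts: {unknown}")
    counts = Counter(per_entry_verdicts.values())
    return {
        "total_entries": len(per_entry_verdicts),
        # Zero-filled so every verdict class appears (an assumption).
        "counts": {v: counts.get(v, 0) for v in VERDICTS},
    }
```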

Concretely, details carries the per-entry ledger:

"details": {
  "total_entries": 29,
  "counts": { "KEEP": 11, "FIX": 14, "REPLACE": 3, "REMOVE": 1 },
  "per_entry
