Paper Pre-Submission Review (Lite)
Heritage and scope
This is the in-session, Claude-Code-native counterpart to presubmit — our port of the reviewer2 adversarial peer-review pipeline to Anthropic Claude. The design inherits two things from that lineage:
- A Critical-Reviewer posture. Review sub-agents adopt the persona of a rigorous, epistemically humble reviewer who is brutally honest about weaknesses but impervious to prestige (reputation, journal status, citation counts, prior peer review) and grounds every finding in a quote from the manuscript.
- A verification cascade. Red Team findings are cross-checked against the source before they enter the final report; claims that cannot be pinned to a quoted passage are dropped as likely hallucinations.
What this skill is: a ~11-sub-agent review that runs inside a Claude Code session, no extra install, billable against your Claude Code plan. Fast feedback during writing.
What it is NOT: the full reviewer2/presubmit pipeline. That tool runs ~30 stages with a dedicated Red Team (Breaker, Butcher, Shredder, Collector, Void), Blue Team defence, numbers/fact-check/citation-verification cascades, and a legal pass — and it is resumable and cost-tracked. Reach for presubmit when you want deeper adversarial pressure, need a standalone deliverable (a review report file), or are preparing a manuscript for final external peer review. This skill is the fast in-flow check.
Instructions
Run a comprehensive pre-submission review of the academic paper using parallel review agents. Each agent examines a different dimension; cross-check agents audit the Phase 2 findings for false positives and missed issues. The final output is a structured pre-submit report with severity-ranked findings and a journal-readiness checklist.
Critical-Reviewer posture (required for every sub-agent). Every review sub-agent must (a) cite a direct quote from the manuscript for every [CRITICAL] and [RECOMMENDED] finding — a short verbatim span is sufficient, and (b) attack the argument or the data, not the authors. Framing like "fraudulent" or "incompetent" is out of scope; "the claim on line X is not supported by the evidence on line Y" is in scope. The standard is brutally honest on the work, fair to the people.
This review can be re-run after fixes to verify issues are resolved.
1. Orientation (do this yourself before launching agents)
Read the paper yourself to understand its structure before writing agent prompts. Determine:
- Where the paper source lives (LaTeX .tex vs Pandoc .md vs Word) and what the build command is
- Whether a Supplementary Information file exists and where it lives
- Where figures are stored and how they are referenced (relative paths, figure directories)
- Whether a replication archive exists (look for replication/, archive/, data/, README files)
- The paper's rough structure: section names, approximate page count, key claims in the abstract
- The bibliography format and location (.bib, inline, etc.)
- Design family. Is this a conjoint/factorial-vignette paper, a list experiment, a topic-modeling or LLM-classification study, or a VLM-OCR corpus paper? If so, also invoke the relevant sibling skill (conjoint-diagnostics, list-experiment, topic-modeling, text-classification, vlm-ocr-pipeline) and fold its domain-specific checklist into Agent 9's deliverable. For any experimental manuscript, also run methods-reporting in audit mode so its 45-item checklist becomes the baseline for Agents 1, 2, 6, 7, and 8.
Use this knowledge to write specific agent prompts that reference actual file paths, section names, and relevant files. Generic prompts produce shallow results.
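The file-level part of this orientation can be gathered mechanically before you read the paper. The sketch below is one possible helper, not part of the skill itself: the function name `orient`, the extension-to-format mapping, and the returned dictionary layout are all assumptions made for illustration. It only locates files; the build command, section structure, and design family still need a human read.

```python
from pathlib import Path

# Assumed mapping from file extension to source format; extend as needed.
SOURCE_TYPES = {".tex": "LaTeX", ".md": "Pandoc Markdown", ".docx": "Word"}
# Directory names the orientation checklist says to look for.
ARCHIVE_DIRS = ("replication", "archive", "data")


def orient(paper_dir):
    """Collect the file locations that the orientation step asks about."""
    root = Path(paper_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    readmes = [p for p in files if p.name.lower().startswith("readme")]

    sources = {}
    for p in files:
        kind = SOURCE_TYPES.get(p.suffix.lower())
        # README.md is documentation, not a manuscript source, so skip it here.
        if kind and p not in readmes:
            sources.setdefault(kind, []).append(p.relative_to(root).as_posix())

    return {
        "sources": {k: sorted(v) for k, v in sources.items()},
        "bibliography": sorted(
            p.relative_to(root).as_posix() for p in files if p.suffix.lower() == ".bib"
        ),
        "archive_dirs": [d for d in ARCHIVE_DIRS if (root / d).is_dir()],
        "readmes": sorted(p.relative_to(root).as_posix() for p in readmes),
    }
```

The returned paths are the concrete values to paste into each agent prompt, for example the bibliography path for Agents 3 and 4.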
Orchestration contract. Before Phase 2, create a scratch directory .review-tmp/ in the paper's working directory. Each Phase 2 agent writes its structured findings to a dedicated file (agent-1-content.md, agent-2-numbers.md, ..., agent-9-archive.md) using the output format specified below. Phase 3 cross-checkers read these files directly; you do not need to paste findings back into their prompts. Launch all Phase 2 agents in a single message with parallel sub-agent tool calls (one call per agent) so they run concurrently; launch Phase 3 only after the output file of every launched agent exists (all nine for experimental manuscripts). For experimental manuscripts, Agents 6 (CONSORT/randomization-and-flow) and 7 (pre-registration verification) are mandatory; for non-experimental manuscripts they can be skipped and their checklist rows marked NA.
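The gate between Phase 2 and Phase 3 can be sketched as a small file check. This is a minimal illustration under stated assumptions: `prepare_scratch` and `missing_outputs` are hypothetical helper names, files are matched by the pattern agent-N-*.md because only three of the nine file names are spelled out above, and treating an empty file as missing is an added assumption rather than part of the contract.

```python
from pathlib import Path

SCRATCH = ".review-tmp"
ALL_AGENTS = range(1, 10)  # Agents 1 through 9


def prepare_scratch(paper_dir):
    """Create the scratch directory before Phase 2 and return its path."""
    scratch = Path(paper_dir) / SCRATCH
    scratch.mkdir(exist_ok=True)
    return scratch


def missing_outputs(scratch, skipped=()):
    """Return the agent numbers that have no non-empty agent-N-*.md file yet.

    `skipped` lists agents that were deliberately not launched, e.g. (6, 7)
    for a non-experimental manuscript.
    """
    missing = []
    for n in ALL_AGENTS:
        if n in skipped:
            continue
        # Assumption: an empty file means the agent has not finished writing.
        done = [p for p in Path(scratch).glob(f"agent-{n}-*.md") if p.stat().st_size > 0]
        if not done:
            missing.append(n)
    return missing
```

Phase 3 would start only once `missing_outputs(scratch, skipped=...)` returns an empty list.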
2. Parallel Deep Review (launch all 9 agents simultaneously)
Agent 1 — Content & Argument (Red Team primary): Read the full paper. Your posture is adversarial but fair: find every place the argument is weaker than the paper presents it to be. Check logical flow from introduction through conclusion. Identify unsupported claims, logical gaps, missing caveats, and places where the argument is unclear or circular. Flag any claims in the abstract not backed up in the body. Note missing discussion of limitations. Check whether the framing accurately positions the contribution relative to cited prior work. Required: support every [CRITICAL] and [RECOMMENDED] finding with a direct quote from the manuscript (verbatim span, ≤ 2 sentences) plus the file path + line number. Unsupported findings will be dropped by the cross-checker.
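The quote requirement is what lets the cross-checker drop ungrounded findings mechanically. A minimal sketch of that filter follows; the finding layout (a dict with `severity` and `quote` keys) and the function names are hypothetical, since the actual output format is specified elsewhere in this skill. Whitespace is normalized so that line wrapping in the source does not cause false mismatches, but the quote must otherwise match verbatim.

```python
import re


def _normalize(text):
    """Collapse runs of whitespace so wrapped lines compare equal."""
    return re.sub(r"\s+", " ", text).strip()


def quote_is_grounded(quote, manuscript_text):
    """True if the quote appears verbatim (modulo whitespace) in the manuscript."""
    q = _normalize(quote)
    return bool(q) and q in _normalize(manuscript_text)


def drop_unsupported(findings, manuscript_text):
    """Keep a finding unless it requires a quote and the quote cannot be located."""
    kept = []
    for finding in findings:
        needs_quote = finding["severity"] in ("CRITICAL", "RECOMMENDED")
        if needs_quote and not quote_is_grounded(finding.get("quote", ""), manuscript_text):
            continue  # likely hallucinated or paraphrased; drop it
        kept.append(finding)
    return kept
```

A paraphrased quote fails this check by design, which pushes agents toward short verbatim spans.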
Agent 2 — Numbers & Internal Consistency: Check every quantitative claim against JARS-Quant reporting expectations (Appelbaum et al. 2018). Do numbers in the abstract match the body? Do table values match in-text references? Do SI cross-references point to the right appendices/tables? Are confidence intervals, p-values, N counts, and effect sizes reported consistently throughout? Flag multiple-comparisons issues (many tests without correction or discussion). Verify that significance thresholds are defined and used consistently. For experimental papers, verify denominator consistency across ITT and any complier/compliance-adjusted analyses and flag any manipulation-check that is present in the design but missing from the results. Flag forking-paths risks explicitly: DV switching between primary and secondary outcomes, covariate-set changes across models, transformation or subsetting decisions not traceable to a pre-registration, and any analysis whose choice was visibly made after seeing outcome data (Wicherts et al. 2016; Gelman & Loken 2014; Simmons et al. 2011). Do NOT audit the CONSORT flow, baseline balance, attrition-by-arm, or PAP-to-paper mapping — those are Agents 6 and 7.
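The abstract-versus-body comparison can be given a crude first pass in code. The sketch below is an illustrative aid, not part of Agent 2's specification: the regular expression and function names are assumptions, and the check only produces candidates for the agent to inspect. It compares number tokens literally after removing thousands separators, so .05 and 0.05 count as different tokens.

```python
import re

# Matches integers, decimals (with or without a leading zero), thousands
# separators, and an optional trailing percent sign.
NUMBER = re.compile(r"(?<![\w.])(?:\d+(?:,\d{3})*(?:\.\d+)?|\.\d+)%?")


def numbers_in(text):
    """Return the set of number tokens in the text, thousands separators removed."""
    return {m.group(0).replace(",", "") for m in NUMBER.finditer(text)}


def abstract_numbers_missing_from_body(abstract, body):
    """List numbers quoted in the abstract that never appear in the body."""
    return sorted(numbers_in(abstract) - numbers_in(body))
```

Each returned token is a lead, not a finding: the agent still has to confirm that the mismatch is real and quote both passages.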
Agent 3 — References & Citations: Audit the bibliography file. Are all cited works present? Are there uncited entries? Check for stale working papers (2025+) that may now be published — flag entries that need author verification. Check formatting consistency (journal names, author encoding, entry types). Do NOT check DOIs — that is Agent 4's job.
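For a LaTeX manuscript with a BibTeX file, the cited-versus-present comparison reduces to a set difference. The following sketch assumes `\cite`-family commands and standard `@type{key,` entries; the regular expressions and function names are illustrative, and a Pandoc manuscript with `[@key]` citations would need a different pattern.

```python
import re

# \cite, \citep, \citet, \citeauthor, ... with optional [..] arguments.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# @article{key, ... captures the entry type and the citation key.
BIB_ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
NON_ENTRIES = ("comment", "string", "preamble")


def cited_keys(tex):
    """Collect every key used in a \\cite-style command."""
    keys = set()
    for group in CITE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib):
    """Collect every entry key defined in the bibliography text."""
    return {key for kind, key in BIB_ENTRY.findall(bib) if kind.lower() not in NON_ENTRIES}


def audit_citations(tex, bib):
    """Report keys cited but undefined, and entries defined but never cited."""
    cited, present = cited_keys(tex), bib_keys(bib)
    return {
        "missing_from_bib": sorted(cited - present),
        "uncited": sorted(present - cited),
    }
```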
Agent 4 — DOI Audit: Check every bibliography entry for a DOI. For entries missing a DOI, attempt to locate one via web search (title + author + "doi"). Report which entries are missing DOIs and, where found, provide the correct DOI. Verify that existing DOIs resolve to the correct paper — wrong-paper DOIs are a common copy-paste error. This agent runs separately because DOI lookup is slow.
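Because the lookup itself is slow, it helps to triage the bibliography offline first. The sketch below covers only that triage and assumes a BibTeX file: it separates entries with no DOI, entries whose DOI does not look like one, and entries whose DOI still has to be resolved and compared against the title. The function name, report layout, and the loose shape check (prefix `10.`, four to nine digits, a slash, then a suffix) are assumptions for illustration; the wrong-paper check still requires actually resolving each DOI.

```python
import re

ENTRY_START = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
DOI_FIELD = re.compile(r'\bdoi\s*=\s*[{"]\s*([^}"]*?)\s*[}"]', re.IGNORECASE)
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")  # loose syntactic check only


def doi_report(bib):
    """Sort bibliography entries into missing, malformed, and to-resolve DOIs."""
    starts = list(ENTRY_START.finditer(bib))
    report = {"missing": [], "malformed": [], "to_resolve": {}}
    for i, match in enumerate(starts):
        # An entry's body runs until the next entry starts (or end of file).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib)
        key, body = match.group(2), bib[match.end():end]
        field = DOI_FIELD.search(body)
        if field is None or not field.group(1):
            report["missing"].append(key)
            continue
        # Accept DOIs stored as full resolver URLs by stripping the prefix.
        doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", field.group(1), flags=re.IGNORECASE)
        if DOI_SHAPE.match(doi):
            report["to_resolve"][key] = doi
        else:
            report["malformed"].append(key)
    return report
```

Entries under `missing` go to the web-search step; entries under `to_resolve` go to the resolution check.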
Agent 5 — Writing Quality & Journal Compliance: Check for redundancy, passive voice overuse, unclear antecedents, jargon without definition on first use, overly long sentences (60+ words), and inconsistent terminology for the same concept. Audit journal-level transparency compliance against the TOP Guidelines (Nosek et al. 2015): data citation, data/code/materials transparency, design and analysis transparency, preregistration of studies and analysis plans, replication standards. Check reporting conformance against JARS-Quant (Appelbaum et