Kill Argument Exercise: Adversarial Attack-Defense Review

Stress-test the headline claims of a paper against the strongest possible rejection argument: $ARGUMENTS

Why This Exists

Standard score-based reviews (/research-review, /auto-paper-improvement-loop) tend to produce balanced weakness lists. Each weakness gets roughly equal attention, ranked CRITICAL > MAJOR > MINOR. Empirically, this misses one specific failure mode: the single most damaging argument a reviewer would write in a rejection paragraph, the one sentence that kills the paper if a senior area chair reads it.

A balanced reviewer might list "scope-overclaim risk" as MAJOR alongside 3-5 other MAJORs, never quite committing. An adversarial reviewer must commit: their entire job is to convince the area chair to reject in 200 words.

This skill runs that adversarial pass deliberately, then forces a second, fresh reviewer to defend point by point, classify each rejection point as answered_by_current_text / partially_answered / still_unresolved, and surface what is actually load-bearing.

Empirical motivation: in a real submission run, after several rounds of standard improvement (score 7-8/10), the kill-argument exercise surfaced framing weaknesses that no prior review caught (e.g., a setting being mostly conditional rather than truly general, or a baseline being irrelevant to real systems). Author rebuttal forced explicit scope qualifications in abstract and discussion that weren't visible from the score-based reviews alone.

How This Differs From Other Review Skills

| Skill | What it asks the reviewer | Output |
|---|---|---|
| Standard peer review | "Score this paper, list weaknesses by severity" | balanced weakness list |
| `/research-review` | "Deep technical review of methods + claims" | structured deep critique |
| `/proof-checker` | "Is this theorem actually proved?" | per-step proof obligation audit |
| `/paper-claim-audit` | "Does the paper report numbers truthfully?" | per-claim evidence verification |
| `/citation-audit` | "Are citations real and used in correct context?" | per-entry KEEP/FIX/REPLACE/REMOVE |
| `/kill-argument` | "Write the single strongest rejection paragraph; then defend it." | attack memo + per-point defense + unresolved surfaced |

This skill is complementary, not a replacement. Run after standard reviews when you want to know what the worst-case reviewer paragraph would look like, before camera-ready or rebuttal preparation.

When To Use

After 1-2 rounds of /auto-paper-improvement-loop have settled at a stable score, but before submission. Surfaces which additional fixes would close the headline-attack gap.
During rebuttal preparation, to predict reviewer-2's strongest objection so you can prepare the response in advance.
For theory papers with a high-level title that may oversimplify the actual theorem (the most common reject-attack pattern).
For papers where a reviewer might attack scope, assumption-vs-claim mismatch, missing proof obligations, or evidence-vs-headline gaps.

This skill is most valuable for theory papers with ≥5 theorem-class environments (so the headline depends on real proof obligations). For empirical papers without theorems, use /research-review instead.
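
The "≥5 theorem-class environments" rule of thumb can be checked mechanically before choosing a skill. The sketch below is a rough heuristic, not part of the skill itself: the list of environment names in `THEOREM_ENVS` and the helper names `count_theorem_envs` and `suggest_skill` are assumptions made for illustration. It also counts environments inside commented-out LaTeX, so treat the number as an upper bound.

```shell
#!/usr/bin/env bash
# ASSUMPTION: "theorem-class" means these environment names (starred variants
# included). The skill does not define the list; adjust it to the paper's macros.
THEOREM_ENVS='theorem|lemma|proposition|corollary'

# Print the number of theorem-class \begin{...} occurrences under a directory.
count_theorem_envs() {
  local paper_dir="$1"
  # grep -o emits one line per match, so wc -l counts occurrences, not files.
  find "$paper_dir" -name '*.tex' -not -path '*/.git/*' -exec cat {} + 2>/dev/null \
    | { grep -oE "\\\\begin\{(${THEOREM_ENVS})\*?\}" || true; } \
    | wc -l \
    | tr -d ' '
}

# Print the skill suggested by the threshold stated above.
suggest_skill() {
  local n
  n=$(count_theorem_envs "$1")
  if [ "$n" -ge 5 ]; then
    echo "/kill-argument"
  else
    echo "/research-review"
  fi
}
```

Calling `suggest_skill paper/` prints one of the two skill names; the threshold is only a first filter, and a paper with few but heavy theorems may still benefit from the adversarial pass.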

Constants

REVIEWER_MODEL = gpt-5.5 (default; specify gpt-5.4 if you want to fall back to the legacy default). Reviewer reasoning effort = xhigh.
CONTEXT_POLICY = fresh (REVIEWER_BIAS_GUARD). Each thread is a fresh spawn_agent call. Never use send_input. No prior review summary, fix list, or executor explanation enters either prompt.
ATTACK_LENGTH = approximately 200 words (do not exceed 250). Single coherent argument, not a list.
DEFENSE_DECOMPOSITION = 3-7 atomic rejection points extracted from the attack memo. Each gets its own classification.
CLASSIFICATION = answered_by_current_text / partially_answered / still_unresolved. (Names chosen so the adjudicator does not assume "fixed" implies prior history of patching — they read the paper as a fresh reviewer would.)
OUTPUT = KILL_ARGUMENT.md (human-readable) + KILL_ARGUMENT.json (machine-readable) in the paper directory.
RENDER_HTML = true — When true (default), auto-render KILL_ARGUMENT.md to HTML after writing the report via /render-html "<paper-dir>/KILL_ARGUMENT.md" --json "<paper-dir>/KILL_ARGUMENT.json". Uses full review gate (audit-class artifact). Set false to skip, or pass — render html: false. Non-blocking: failures don't invalidate the kill-argument verdict.
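
The DEFENSE_DECOMPOSITION and CLASSIFICATION constraints lend themselves to a small sanity check on the defense output. The following is a minimal sketch under one loud assumption: the per-point labels arrive one per line on stdin. The actual layout of KILL_ARGUMENT.json is not specified here, so the function names and input format are hypothetical.

```shell
#!/usr/bin/env bash
# Accept only the three labels named in the CLASSIFICATION constant.
is_valid_classification() {
  case "$1" in
    answered_by_current_text|partially_answered|still_unresolved) return 0 ;;
    *) return 1 ;;
  esac
}

# DEFENSE_DECOMPOSITION allows between 3 and 7 atomic rejection points.
is_valid_point_count() {
  [ "$1" -ge 3 ] && [ "$1" -le 7 ]
}

# Read one label per line from stdin (ASSUMED input format).
# Prints the number of still_unresolved points; returns 1 on any violation.
validate_defense() {
  local count=0 unresolved=0 label
  while IFS= read -r label; do
    [ -z "$label" ] && continue          # ignore blank lines
    if ! is_valid_classification "$label"; then
      echo "invalid classification: $label" >&2
      return 1
    fi
    count=$((count + 1))
    if [ "$label" = "still_unresolved" ]; then
      unresolved=$((unresolved + 1))
    fi
  done
  if ! is_valid_point_count "$count"; then
    echo "expected 3-7 rejection points, got $count" >&2
    return 1
  fi
  echo "$unresolved"
}
```

A label such as `fixed` is rejected on purpose: the constant's naming avoids any wording that implies a patch history.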

Workflow

Step 1: Discover paper files

Locate the paper directory and inventory the source.

PAPER_DIR="$ARGUMENTS"   # e.g., paper-overleaf/ or paper/
cd "$PAPER_DIR" || exit 1   # stop here rather than inventory the wrong directory

# Find the LaTeX entry point (first .tex file with a line starting \documentclass)
ENTRY=$(grep -lE '^\\documentclass' *.tex 2>/dev/null | head -n 1)
echo "Entry: $ENTRY"

# Find all source files codex should read
find . -name "*.tex" -not -path "./.git/*" 2>/dev/null
find . -name "*.bib" -not -path "./.git/*" 2>/dev/null
# Group the -o alternatives explicitly so both patterns are matched
find figures/ \( -name "*.pdf" -o -name "*.png" \) 2>/dev/null
ls -la *.pdf 2>/dev/null  # compiled PDF

If a compiled PDF is missing, the skill should still run on .tex source alone, but the prompt should mention this so the reviewer doesn't waste cycles trying to extract from a non-existent PDF.
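
One way to carry the missing-PDF case into the prompt is to generate the "Compiled PDF" line from what is actually on disk. This is a sketch only: the helper name `pdf_prompt_line` is hypothetical, and it assumes the compiled PDF shares the entry file's basename (for example `main.tex` and `main.pdf`), which the skill does not guarantee.

```shell
#!/usr/bin/env bash
# Print the "Compiled PDF" line for the reviewer prompt.
# $1 = paper directory, $2 = entry file name relative to it (e.g. main.tex).
# ASSUMPTION: the PDF is named after the entry file (main.tex -> main.pdf).
pdf_prompt_line() {
  local paper_dir="$1" entry="$2"
  local abs_dir pdf
  # Resolve to an absolute path so the prompt does not depend on the cwd.
  abs_dir=$(cd "$paper_dir" && pwd -P) || return 1
  pdf="$abs_dir/${entry%.tex}.pdf"
  if [ -f "$pdf" ]; then
    echo "- Compiled PDF: $pdf"
  else
    # Tell the reviewer explicitly, so no effort goes into a missing file.
    echo "- Compiled PDF: not available; review the .tex source only"
  fi
}
```

The returned line can be substituted for the PDF entry in the prompt's file list, so the reviewer sees either a concrete path or an explicit statement that only source is available.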

Step 2: Attack memo (Thread 1, fresh codex)

Invoke spawn_agent (NOT send_input) with the following prompt structure. Use absolute or paper-directory-relative paths inside the prompt; do not rely on a cwd parameter.

spawn_agent:
  model: gpt-5.5
  reasoning_effort: xhigh
  message: |
    You are simulating a hostile NeurIPS / ICLR / ICML reviewer for a paper.
    This is a kill-argument adversarial check — your task is NOT to give a
    balanced review but to construct the **single strongest argument for
    rejecting this paper**.

    ## Files to read
    - LaTeX entry: <ENTRY>
    - All section files under sections/ or wherever they live
    - Macro files (math_commands.tex, etc.)
    - Compiled PDF: <main.pdf> (if available)

    Read the source carefully. Do not consult any prior reviews, fix lists,
    or summaries; this must be a fresh, zero-context adversarial pass.

    ## Your task
    Construct the single best argument to reject this paper in approximately
    200 words. Your goal is to write the worst-case rejection memo a senior
    NeurIPS area chair would produce after reading the paper.

    Focus on these axes (pick the most damaging combination, do not list all):
    1. Theorem validity: are central theorems actually proved as stated?
    2. Assumption-vs-claim mismatch: does the body silently retreat to a
       narrower object than the title/abstract advertise?
    3. Missing proof obligations: is a fundamental lemma invoked but not
       proved (e.g., concentration, generic position, prefactor envelope)
       that the headline depends on?
    4. Limit-order ambiguity: are limits in K/n/d/eps composed in a way the
       paper does not commit to?
    5. Claim-vs-evidence gap: is the empirical/numerical evidence too narrow
       to support the breadth of the stated theorem or take-away?
    6. Scope overclaim: does the title or abstract sell a result substantially
       broader than what the body proves?

    ## Constraints
    - Approximately 200 words total (do NOT exceed 250).
    - Single argument, not a list — pick the most damaging line of attack
      and develop it.
    - Cite specific file:line locations or equation numbers when accusing.
    - Tone: dispassionate but uncompromising. Do NOT hedge. Do NOT acknowledge
      mitigations the paper might have made elsewhere. This is the rejection
      paragraph; the defense gets the next pass.
    - Do NOT reference prior review rounds, fix lists, or any context outside
      the current paper files.

    Output: just the rejection memo, nothing else.
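
The 250-word ceiling in the constraints above can be verified on the returned memo before it is used. A minimal sketch, assuming the memo text is piped in on stdin; the function names are hypothetical, and `wc -w` counts whitespace-separated tokens, so markdown markers and file:line citations each count as words.

```shell
#!/usr/bin/env bash
# Count whitespace-separated words on stdin (tr strips the padding some
# wc implementations add).
memo_word_count() {
  wc -w | tr -d ' '
}

# Enforce the hard cap from ATTACK_LENGTH: the memo must not exceed 250 words.
check_attack_length() {
  local words
  words=$(memo_word_count)
  if [ "$words" -gt 250 ]; then
    echo "reject: $words words exceeds the 250-word cap" >&2
    return 1
  fi
  echo "ok: $words words"
}
```

Only the upper bound is enforced here; "approximately 200 words" is a target rather than a limit, so a shorter memo passes.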

Save the
