Auto Paper Improvement Loop: Review → Fix → Recompile
🔒 Do not wrap this skill in /loop, /schedule, or CronCreate. It already loops internally (review → fix → recompile) with its own round structure and a deliberate fresh-reviewer bias guard each round (no codex-reply). Re-asking it to "improve the paper" on a wall-clock timer produces no new signal (quality changes when the review changes, not when the clock ticks), and a timed re-run that also accepts its own output to decide when to stop crosses into self-acquittal (acceptance-gate.md). Schedule the external wait that precedes it, not the improvement loop. See shared-references/external-cadence.md.
Autonomously improve the paper at: $ARGUMENTS
Context
This skill is designed to run after Workflow 3 (/paper-plan → /paper-figure → /paper-write → /paper-compile). It takes a compiled paper and iteratively improves it through external LLM review.
Unlike /auto-review-loop (which iterates on research — running experiments, collecting data, rewriting narrative), this skill iterates on paper writing quality — fixing theoretical inconsistencies, softening overclaims, adding missing content, and improving presentation.
Constants
- MAX_ROUNDS = 2 — Two rounds of review→fix→recompile. Empirically, Round 1 catches structural issues (4→6/10), Round 2 catches remaining presentation issues (6→7/10). Diminishing returns beyond 2 rounds for writing-only improvements.
- REVIEWER_MODEL = gpt-5.5 — Model used via Codex MCP for paper review.
- REVIEWER_BIAS_GUARD = true — When true, every review round uses a fresh mcp__codex__codex thread with no prior review context. Never use mcp__codex__codex-reply for review rounds. Set to false only for deliberate debugging of the legacy behavior. Empirical evidence: running the same paper with codex-reply + "since last round we did X" prompts inflated scores from real 3/10 → fake 8/10 across multiple rounds; switching to fresh threads recovered the true 3/10 assessment.
- REVIEW_LOG = PAPER_IMPROVEMENT_LOG.md — Cumulative log of all rounds, stored in the paper directory.
- HUMAN_CHECKPOINT = false — When true, pause after each round's review and present score + weaknesses to the user. The user can approve fixes, provide custom modification instructions, skip specific fixes, or stop early. When false (default), runs fully autonomously.
- EDIT_WHITELIST = null — Optional path to a YAML/JSON whitelist file constraining which paths and operations the fix-implementation step may touch. When null (default), all edits proceed unconstrained. When set via — edit-whitelist <path> (also accepts — edit_whitelist <path>), the loop loads the file at startup and consults it before each edit; rejected edits are logged to PAPER_IMPROVEMENT_LOG.md rather than silently dropped. See "Optional: Edit Whitelist" below.
💡 Override: /auto-paper-improvement-loop "paper/" — human checkpoint: true
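The interaction between MAX_ROUNDS and REVIEWER_BIAS_GUARD can be sketched as a small loop. This is a minimal illustration, not the skill's actual implementation: the `Review` and `LoopResult` types and the `review` and `fix` callables are hypothetical stand-ins for the Codex MCP review call and the fix-and-recompile step. The point it shows is that, with the guard on, the prompt for each round is built from the current paper only and never from earlier rounds.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ROUNDS = 2
REVIEWER_BIAS_GUARD = True


@dataclass
class Review:
    score: float           # reviewer score out of 10
    weaknesses: List[str]  # issues to address in this round


@dataclass
class LoopResult:
    scores: List[float] = field(default_factory=list)
    prompts: List[str] = field(default_factory=list)


def build_review_prompt(paper_text: str, history: List[Review]) -> str:
    """Build the reviewer prompt for one round.

    With the bias guard on, `history` is deliberately ignored, so the reviewer
    never sees earlier scores or a "since last round we did X" summary.
    """
    prompt = "Review this paper and score it out of 10.\n\n" + paper_text
    if not REVIEWER_BIAS_GUARD and history:
        # Legacy behavior, kept only for deliberate debugging.
        prompt += "\n\nPrevious scores: " + ", ".join(str(r.score) for r in history)
    return prompt


def improvement_loop(
    paper_text: str,
    review: Callable[[str], Review],    # hypothetical: opens a NEW reviewer thread per call
    fix: Callable[[str, Review], str],  # hypothetical: applies fixes and recompiles
    max_rounds: int = MAX_ROUNDS,
) -> LoopResult:
    result = LoopResult()
    history: List[Review] = []
    for _ in range(max_rounds):
        prompt = build_review_prompt(paper_text, history)
        report = review(prompt)
        paper_text = fix(paper_text, report)
        history.append(report)
        result.scores.append(report.score)
        result.prompts.append(prompt)
    return result
```

Keeping the history out of the prompt builder, rather than trusting each caller to omit it, makes the guard hard to bypass by accident.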
Optional: Style reference (— style-ref: <source>, opt-in)
Lets the user steer structural fixes only during improvement (section reordering hints, paragraph length nudges, figure density adjustments) toward a reference paper. Default OFF — when the user does not pass — style-ref, do nothing differently from before.
Only when — style-ref: <source> appears in $ARGUMENTS, run the helper FIRST, before the loop starts:
# Resolve $STYLE_HELPER via the canonical strict-safe chain (see
# shared-references/integration-contract.md §2). Policy A — gate:
# unresolved helper means --style-ref cannot be satisfied, so abort.
cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
if [ -z "${ARIS_REPO:-}" ] && [ -f .aris/installed-skills.txt ]; then
    ARIS_REPO=$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null) || true
fi
STYLE_HELPER=".aris/tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || STYLE_HELPER="tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || { [ -n "${ARIS_REPO:-}" ] && STYLE_HELPER="$ARIS_REPO/tools/extract_paper_style.py"; }
[ -f "$STYLE_HELPER" ] || {
    echo "ERROR: extract_paper_style.py not resolved at .aris/tools/, tools/, or \$ARIS_REPO/tools/." >&2
    echo " Fix: rerun bash tools/install_aris.sh, export ARIS_REPO, or copy the helper to tools/." >&2
    echo " --style-ref cannot be satisfied; aborting." >&2
    exit 1
}
STYLE_STATUS=0
CACHE=$(python3 "$STYLE_HELPER" --source "<source>") || STYLE_STATUS=$?
case "$STYLE_STATUS" in
    0) ;;  # use $CACHE/style_profile.md as structural guidance for the FIX phase only
    2) echo "warning: style-ref skipped (missing optional dep)" >&2 ;;
    3) echo "error: --style-ref source failed; aborting loop" >&2 ; exit 1 ;;
    *) echo "error: helper failed unexpectedly; aborting loop" >&2 ; exit 1 ;;
esac
Sources accepted: local TeX dir / file, local PDF, arXiv id, http(s) URL. Overleaf URLs/IDs are rejected — clone via /overleaf-sync setup <id> first and pass the local clone path.
Strict rules (full contract in tools/extract_paper_style.py docstring):
- Use style_profile.md only during the fix-implementation phase, to nudge structural choices when applying reviewer feedback. Reviewer feedback always takes precedence; the style ref is a tie-breaker for how to apply a fix, not whether to apply it.
- Never copy prose, claims, examples, or terminology from anything reachable through the cache when implementing fixes.
- Never pass — style-ref (or the cache contents) to the GPT-5.5 reviewer sub-agent. The Reviewer Independence Protocol below requires that reviewers see only the artifact and the user's prompt; leaking the style ref would contaminate the review with author-side context. This is the most critical invariant in this skill.
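One way to make the reviewer-independence rule hard to violate is to keep the style profile out of the reviewer-side function signature entirely. The sketch below assumes two hypothetical helpers, `reviewer_input` and `fix_guidance`; neither name comes from the skill itself, and the prompt wording is illustrative only.

```python
from typing import Optional


def reviewer_input(artifact: str, user_prompt: str) -> str:
    """Everything the reviewer sees: the user's prompt and the artifact, nothing else.

    There is no style-profile parameter, so author-side context cannot leak
    into the review through this function.
    """
    return f"{user_prompt}\n\n{artifact}"


def fix_guidance(reviewer_feedback: str, style_profile: Optional[str] = None) -> str:
    """Guidance for the fix phase; the style profile is only an optional tie-breaker."""
    guidance = f"Apply this reviewer feedback:\n{reviewer_feedback}"
    if style_profile:
        guidance += (
            "\n\nWhere the feedback leaves the structure open, prefer the "
            "structural habits below. Do not copy wording from them.\n"
            + style_profile
        )
    return guidance
```

The feedback comes first and the profile is appended as a preference, which mirrors the precedence rule: the profile shapes how a fix is applied, never whether it is applied.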
Optional: Edit Whitelist (— edit-whitelist <path>, opt-in)
Lets the caller hard-constrain which files and operations the fix-implementation step (Step 3 / Step 6) is allowed to touch. Default OFF — when the user does not pass — edit-whitelist (or the alias — edit_whitelist), the loop applies all reviewer-driven edits without restriction, exactly as before.
This is the parameter that upstream pipelines (e.g. /resubmit-pipeline Phase 2) use to enforce text-only resubmit microedits: no .bib mutations, no .sty / .bst mutations, no edits to prior-submission directories, no new \cite{...}, no new theorem environments, no new numerical claims.
Schema
The whitelist file is YAML or JSON. All sections are optional:
allowed_paths:
- sec/*.tex
- main.tex
- figures/*.tex
forbidden_paths:
- "**/*.bib"
- "**/*.sty"
- "**/*.bst"
- "../OldSubmission/**"
forbidden_operations:
- new_cite # blocks \cite{...}, \citep{...}, \citet{...}, \citeauthor{...} additions
- new_bibitem # blocks \bibitem{...} additions
- new_theorem_env # blocks \begin{theorem|lemma|proposition|corollary} additions
- numerical_claim # blocks adding new numbers / percentages / metrics
forbidden_deletions: # operations that block REMOVALS, not additions
- delete_existing_cite # blocks removal of \cite{...} from the body (use citation-audit --soft-only instead)
- delete_theorem_env # blocks removal of an existing \begin{theorem|...} block
requires_user_approval_for: # operations that don't auto-reject but pause for explicit user OK
- rewrite_abstract # paraphrasing the entire abstract triggers a checkpoint
- rewrite_intro_first_para
- delete_section
max_edits_per_round: 30 # hard cap on number of accepted edits per round (rejections are not counted; if cap is hit, remaining proposed edits are deferred to the next round with a warning)
rationale: "Resubmit mode: text-only microedits, paper structure frozen by user constraint."
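A rough sketch of how a whitelist like the one above might be enforced follows. It makes several assumptions that are not stated in the schema: the whitelist is already loaded into a dict, paths are relative to the paper directory with forward slashes, `**/` spans zero or more directories while `*` stays within one path segment, and a new citation is detected by comparing counts of cite commands before and after the edit. The names `glob_to_regex`, `check_edit`, and `triage_edits`, and the `(path, old_text, new_text)` edit shape, are hypothetical. Only the path rules, `new_cite`, and `max_edits_per_round` are covered.

```python
import re
from typing import Dict, List, Tuple

Edit = Tuple[str, str, str]  # (path, old_text, new_text); hypothetical edit shape

# Matches \cite, \citep, \citet, \citeauthor, with optional * and [..] arguments.
CITE_RE = re.compile(r"\\cite(?:p|t|author)?\*?(?:\[[^\]]*\])*\{")


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into a regex (assumed semantics, see lead-in)."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")     # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def check_edit(path: str, old: str, new: str, whitelist: Dict) -> Tuple[bool, str]:
    """Return (accepted, reason) for one proposed edit."""
    for pat in whitelist.get("forbidden_paths", []):
        if glob_to_regex(pat).fullmatch(path):
            return False, f"forbidden path: {pat}"
    allowed = whitelist.get("allowed_paths")
    if allowed and not any(glob_to_regex(p).fullmatch(path) for p in allowed):
        return False, "path not in allowed_paths"
    ops = whitelist.get("forbidden_operations", [])
    # Crude proxy: more cite commands after the edit than before it.
    if "new_cite" in ops and len(CITE_RE.findall(new)) > len(CITE_RE.findall(old)):
        return False, "forbidden operation: new_cite"
    return True, "ok"


def triage_edits(edits: List[Edit], whitelist: Dict):
    """Split edits into accepted, rejected (with reason), and deferred by the cap."""
    cap = whitelist.get("max_edits_per_round")
    accepted: List[Edit] = []
    rejected: List[Tuple[Edit, str]] = []
    deferred: List[Edit] = []
    for edit in edits:
        ok, reason = check_edit(*edit, whitelist)
        if not ok:
            rejected.append((edit, reason))  # rejections do not count toward the cap
        elif cap is not None and len(accepted) >= cap:
            deferred.append(edit)            # carried over to the next round
        else:
            accepted.append(edit)
    return accepted, rejected, deferred
```

Forbidden paths are checked before allowed paths so that a broad allow pattern cannot re-admit a file the caller explicitly excluded. The count-based citation check would miss an edit that swaps one citation for another, so a stricter variant might compare the cited keys instead.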