Published skills
debug-agent
Step through a simulation agent replay to find where and why it failed — timeout, invalid move, crash, or strategic collapse. Use when the user says "debug the agent", "why did it lose", "find the failing turn", "analyze the replay", or after a regression in local eval.
recall-learnings
Surface prior learnings relevant to the current task. Use before starting any non-trivial work — new experiment, new feature, new submission strategy, debugging a pattern — to check if we've been here before. Also use when the user says "what have we learned about X", "any past notes on Y", "recall", or "check learnings".
retrospect-session
Session-end retrospective that extracts 0-3 durable learnings from the conversation, updates the LEARNINGS.md digest, and flags recurring scars. Use when the user says "retro", "retrospect", "wrap up", "end of session", "what did we learn today", or when a long working session is winding down. The Stop hook also nudges this skill.
run-local-eval
Run the local kaggle-environments evaluation harness for a simulation/agent competition. Use when the user says "local eval", "test agent", "run simulation", "self-play", or before submitting an agent.
log-experiment
Record a scientific-method entry for an experiment — hypothesis, setup, outcome, lesson. Use before running any non-trivial experiment (new feature, new model, new CV strategy, new agent tactic) so the hypothesis is written down BEFORE the result. Also use when the user says "log experiment", "track this test", or "record the hypothesis".
post-submission-review
After any kaggle submission, diff against the prior best submission, fetch the new public score, write the outcome to the competition's submissions/LOG.md, and capture a learning if the delta is notable. Triggered automatically by a PostToolUse hook on `kaggle competitions submit`. Also use when the user says "review that submission" or "what did that submission do".
eda-audit
Structured EDA audit for tabular competitions — target distribution, missing values, leakage scan, class imbalance, feature types, outliers, train/test shift. Use when the user says "do EDA", "audit the data", "explore the dataset", "check for leakage", or when starting a new tabular competition.
ensemble-blend
Blend multiple Kaggle submission CSVs via weighted average (regression) or rank average (classification/ranking). Use when the user says "blend submissions", "ensemble these CSVs", "rank average", "weighted blend", or "combine my models".
improve-agent
Use when asked to improve, iterate, or build a better version of an existing Kaggle simulation/agent competition bot. Triggers on "improve the agent", "make a better version", "vN", "iterate on agent", "what should we try next", or any request to advance beyond the current best submission.
capture-learning
Capture a single learning to .learnings/ when something surprising, painful, or validation-worthy happens during Kaggle work. Use proactively whenever an experiment fails for a non-obvious reason, a submission regresses, you hit a tool/API quirk, or a non-obvious approach is confirmed to work. Also use when the user says "remember this", "save this learning", "log this", or describes a mistake the
new-competition
Scaffold a new Kaggle competition subfolder under competitions/<slug>/ with CLAUDE.md, README, gitignore, submission log, and standard subdirs. Use when the user says "new competition", "start <slug>", "scaffold <slug>", or provides a Kaggle competition URL.
preflight-consult
Pre-flight sanity check before any major move — new model choice, new feature, new submission, merging a big change. Pulls relevant learnings, runs a self-critique, and asks the user to confirm. Use when about to commit time to a non-trivial direction, OR when the user says "preflight", "sanity check this", "before we go".
leaderboard-check
Pull the current Kaggle leaderboard for a competition and compare against the user's best score. Use when the user says "leaderboard", "how are we doing", "score vs top", or "where do we rank".
submit-competition
Submits a file to a Kaggle competition with pre-flight validation and post-submit logging. Use when the user says "submit", "send submission", "push to Kaggle", or "make a submission".
Category alert