a* (autostar)

A generalised autonomous optimisation loop: soft RLVR (reinforcement learning with verifiable rewards) for the masses. The user defines a goal; the system runs structured experiments, evaluates progress across independent tracks, reflects at strategic checkpoints, and learns from every attempt, including how to learn better the next time.

If you can measure it, you can improve it.

Experimental-first principle

a* is an experimental optimisation loop. Do not reach for external mathematical optimisers or solvers (e.g. scipy.optimize, cvxpy, linear/quadratic programming solvers, evolutionary algorithm libraries, Bayesian optimisation frameworks, or any other off-the-shelf optimisation package) as a shortcut to improving the artifact. The value of a* is in the structured explore-evaluate-reflect cycle, not in delegating the search to a solver.

If at any point during onboarding, pre-run analysis, or execution you believe the problem is well suited to a closed-form or mathematical optimisation approach, you must ask the user before pursuing it. Present it as an alternative:

"This problem looks like it could be approached with a mathematical optimiser (e.g. [specific method]). Would you like me to try that instead of running the experimental loop, or would you prefer to proceed with a*?"

Do not silently install, import, or invoke an external optimiser. Do not reframe the a* loop as a wrapper around a solver. If the user explicitly opts for a mathematical approach, that is a different workflow — not an a* run.

Concepts

Before running, ensure you understand these terms precisely:

| Term | Meaning |
| --- | --- |
| Step | One execution with one parameter set. Atomic unit of work. |
| Play | A named bundle of parameters that move together (optional; disable with `plays: false`). |
| Lap | A set of steps sharing the same parameter family. Establishes statistical confidence in a direction. |
| Round | A set of laps. Ends with a mandatory reflection: worth pursuing? ask user? pivot? |
| Run | One user-initiated process. Lasts until budget is exhausted or goal is met. |
| Track | One independently verifiable sub-goal. Has its own verifier and ratchet. |
| Disposition | A learned prior on how to approach a (problem class, action intent) pair. Stored in long-term memory; conditions all significant actions. |
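The containment hierarchy in the table (a run holds rounds, a round holds laps, a lap holds steps) can be sketched as plain data structures. This is a minimal illustration, not the a* implementation: the class and field names (`params`, `score`, `family`, `mean_score`, `best_lap`, `total_steps`) are assumptions chosen to mirror the table, and a single numeric score per step is a simplification.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Step:
    """One execution with one parameter set (the atomic unit of work)."""
    params: dict
    score: float  # hypothetical: one verifier score for this step


@dataclass
class Lap:
    """Steps that share one parameter family."""
    family: str
    steps: list = field(default_factory=list)

    def mean_score(self) -> float:
        # Averaging over steps is one simple way to gain confidence in a direction.
        return mean(step.score for step in self.steps)


@dataclass
class Round:
    """A set of laps; a reflection is due once the round closes."""
    laps: list = field(default_factory=list)

    def best_lap(self) -> Lap:
        return max(self.laps, key=lambda lap: lap.mean_score())


@dataclass
class Run:
    """One user-initiated process made of rounds."""
    rounds: list = field(default_factory=list)

    def total_steps(self) -> int:
        return sum(len(lap.steps) for rnd in self.rounds for lap in rnd.laps)
```

Tracks and dispositions are left out on purpose; they cut across this hierarchy rather than nest inside it.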

Runtime capability contract

Before Phase 1, detect the host runtime's capabilities and map them onto the abstract adapter contract in references/runtime-capabilities.md.

Use abstract capabilities first:

structured_choice for bounded approvals
freeform_input for open-ended elicitation
file_presentation / local_html for rubric builder and visualiser
subprocess for external-tool verifiers and render scripts
pause_resume for human gates and round escalations

Claude-specific tools are examples of adapters, not the specification:

Claude Code: ask_user + shell + browser/file paths
Claude.ai: structured chat + present_files

If a capability is missing, follow the fallback policy in references/runtime-capabilities.md before onboarding the mission.
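One way to picture the adapter contract is a lookup from abstract capability to host tool. The sketch below is hypothetical: the adapter tables only restate the Claude Code and Claude.ai examples above (treating `shell` as the `subprocess` adapter is an assumption), and the `resolve` and `missing` helpers are illustrations, not the contents of references/runtime-capabilities.md.

```python
from typing import Optional

ABSTRACT_CAPABILITIES = (
    "structured_choice",
    "freeform_input",
    "file_presentation",
    "local_html",
    "subprocess",
    "pause_resume",
)

# Hypothetical adapter tables: only mappings suggested by the text are filled in.
ADAPTERS = {
    "claude-code": {
        "structured_choice": "ask_user",
        "subprocess": "shell",  # assumption: shell backs the subprocess capability
    },
    "claude-ai": {"file_presentation": "present_files"},
}


def resolve(runtime: str, capability: str) -> Optional[str]:
    """Return the host tool for an abstract capability, or None if unmapped."""
    if capability not in ABSTRACT_CAPABILITIES:
        raise ValueError(f"unknown capability: {capability}")
    return ADAPTERS.get(runtime, {}).get(capability)


def missing(runtime: str, needed: list) -> list:
    """List the needed capabilities for which the runtime has no adapter entry."""
    return [cap for cap in needed if resolve(runtime, cap) is None]
```

Anything returned by `missing` would be handled by the fallback policy rather than silently ignored.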

Concrete runtime profiles and adapters live in:

runtime-profiles/claude-code.json
runtime-profiles/codex.json
runtime-profiles/gemini.json
runtime-profiles/claude-ai.json
runtime-profiles/pi.json
runtime-profiles/chat-only.json
runtime-profiles/template.json
references/adapter-claude-code.md
references/adapter-codex.md
references/adapter-gemini.md
references/adapter-claude-ai.md
references/adapter-pi.md
references/adapter-chat-only.md
references/adapter-template.md
scripts/runtime_profile.py

Before detailed verifier/rubric work, check that the active runtime can support the proposed mission. Use scripts/runtime_profile.py check-mission with the current runtime profile and planned verifier types. If it fails, stop and reconfigure before proceeding.
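Conceptually, this pre-flight check compares what each planned verifier needs with what the runtime profile offers. The sketch below is an illustration under stated assumptions: it does not reproduce scripts/runtime_profile.py, whose interface is not shown here, and the requirement table is inferred from the capability list above (the empty requirement for `deterministic` is a guess).

```python
# Hypothetical requirement table, inferred from the capability list:
# external-tool verifiers need subprocess, human gates need pause_resume.
REQUIRED_BY_VERIFIER = {
    "external_tool": {"subprocess"},
    "human_gate": {"pause_resume"},
    "deterministic": set(),  # assumption: no special capability needed
}


def check_mission(profile_capabilities: set, verifier_types: list) -> dict:
    """Map each unsupported verifier type to the capabilities it lacks."""
    problems = {}
    for verifier in verifier_types:
        needed = REQUIRED_BY_VERIFIER.get(verifier)
        if needed is None:
            # Unknown types are flagged instead of being waved through.
            problems[verifier] = {"<unknown verifier type>"}
            continue
        lacking = needed - profile_capabilities
        if lacking:
            problems[verifier] = lacking
    return problems
```

An empty result means the mission can proceed; any entry corresponds to the "stop and reconfigure" branch.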

Phase 1: Onboarding

Do not begin execution until onboarding is complete and the user has approved the mission.

Onboarding is an interactive dialogue, not a monologue. At every decision point you must stop and ask the user rather than inferring and proceeding. Use the host runtime's structured_choice capability for bounded decisions; in Claude Code this maps to ask_user. Use open prose questions for genuinely open-ended inputs (e.g. goal description, rubric wording).

The mandatory user-confirmation checkpoints are:

Goal decomposition confirmed — present inferred tracks as choices; user approves, removes, or adds before proceeding
Required vs preferred — for each track, explicitly ask; do not infer
Verifier type per track — present options; user selects
Hard constraints confirmed — present inferred list; user amends
Budget — present three concrete options; user selects
Plays — enabled/disabled, and approval of proposed bundles
Final mission confirmation — full summary; explicit go/no-go before any step runs

Never skip a checkpoint. If the user's initial message contained enough information to pre-populate an answer, present it as a pre-selected option and ask them to confirm or change it. Do not silently accept it.
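The rule that no checkpoint may be skipped or silently accepted can be sketched as a small gate object. The names (`OnboardingGate`, `confirm`, `may_execute`) are hypothetical, and enforcing strict order is an assumption; the text only requires that every checkpoint is confirmed by the user.

```python
CHECKPOINTS = (
    "goal_decomposition",
    "required_vs_preferred",
    "verifier_type",
    "hard_constraints",
    "budget",
    "plays",
    "final_confirmation",
)


class OnboardingGate:
    """Tracks which checkpoints the user has explicitly confirmed."""

    def __init__(self):
        self.confirmed = {}

    def next_checkpoint(self):
        """Return the first unconfirmed checkpoint, or None when all are done."""
        for name in CHECKPOINTS:
            if name not in self.confirmed:
                return name
        return None

    def confirm(self, name, answer, user_confirmed):
        # A pre-populated answer still needs an explicit yes from the user.
        if not user_confirmed:
            raise PermissionError(f"{name}: inferred answer not confirmed by user")
        if name != self.next_checkpoint():
            raise ValueError(f"{name}: checkpoints cannot be skipped or reordered")
        self.confirmed[name] = answer

    def may_execute(self):
        """Execution is allowed only after the final go/no-go is recorded."""
        return self.next_checkpoint() is None
```

The `user_confirmed` flag is the important part: an answer inferred from the user's first message is stored only after the user has seen and approved it.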

Rubric builder: When configuring LLM judge tracks (onboarding checkpoint 2+), surface the bundled rubric builder through the runtime's local_html or file_presentation capability so the user can describe score anchors interactively and get a generated rubric they can edit and confirm:

# Claude Code / terminal
open assets/rubric-builder.html          # macOS
xdg-open assets/rubric-builder.html     # Linux
start assets/rubric-builder.html         # Windows

If running in Claude.ai, use present_files on assets/rubric-builder.html instead. If the runtime cannot surface local HTML, fall back to manual rubric elicitation as defined in references/runtime-capabilities.md. The user exports a tracks.md from the tool; load that as the confirmed track configuration. Only fall back to manual elicitation for tracks the tool did not cover (external_tool, deterministic, human_gate types do not need a rubric).
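For terminal hosts, the three launch commands above can be selected programmatically. This is a minimal sketch; `opener_command` is a hypothetical helper, and treating every non-macOS, non-Windows platform as `xdg-open` is an assumption.

```python
import sys


def opener_command(path: str, platform: str = sys.platform) -> list:
    """Pick the launcher shown above for the given platform string."""
    if platform == "darwin":
        return ["open", path]
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin, so it has to run through cmd;
        # the empty string fills start's window-title argument.
        return ["cmd", "/c", "start", "", path]
    return ["xdg-open", path]  # assumed default for Linux and other Unix hosts
```

The returned list can be passed to `subprocess.run(..., check=True)`; if that call fails, the runtime cannot surface local HTML and the manual elicitation fallback applies.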

Read references/onboarding.md for the full dialogue flow, question wording, and decision trees at each checkpoint. Read references/runtime-capabilities.md before adapting this flow to a non-Claude host.

Rubric builder UI: When Phase B (verifier elicitation) reaches an llm_judge or hybrid track, present assets/rubric-builder.html to the user before configuring that track. The builder calls Claude to generate the rubric from the user's anchor descriptions, lets them review and edit it inline, and exports a tracks.md file you can use directly. Tell the user:

"I'm opening the rubric builder for the [track name] track. Describe the score anchors, and it will draft the rubric for you to review and confirm."

After the user exports tracks.md from the builder, read it and use it as the track configuration. Do not re-elicit rubrics that are already confirmed there.

The onboarding produces four documents, all stored in the run directory:

`mission.md`

GOAL:               [plain language description of success]
ARTIFACT:           [what is being mutated and where it lives]
PLAYS:              enabled | disabled
BUDGET:             [strategy + ceiling — see references/budgeting.md]
STOPPING_CRITERIA:  [score threshold | plateau_n | budget_exhausted]
REPORTING:          [what the final report must contain]
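A loader for this template might look like the sketch below. It assumes the file really is one `KEY: value` pair per line, which the template suggests but does not guarantee; `parse_mission` and `REQUIRED_FIELDS` are hypothetical names.

```python
REQUIRED_FIELDS = (
    "GOAL", "ARTIFACT", "PLAYS", "BUDGET", "STOPPING_CRITERIA", "REPORTING",
)


def parse_mission(text: str) -> dict:
    """Parse `KEY: value` lines and check that every template field is present."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not a field
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    absent = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if absent:
        raise ValueError(f"mission.md is missing: {', '.join(absent)}")
    if fields["PLAYS"] not in ("enabled", "disabled"):
        raise ValueError("PLAYS must be 'enabled' or 'disabled'")
    return fields
```

Splitting on the first colon only keeps values such as paths or descriptions that contain colons intact.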

`tracks.md`

One block per track. See Verification taxonomy below for verifier types.

TRACK: <name>
required:     true | false
weight:       0.0–1.0  (weights across non-required tracks must sum to 1.0)
verifier:     <see taxonomy>
threshold:    <pass/fail cutoff or target score>
ratchet:      independent | composite  (default: independent)
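The weight rule in the block above (weights across non-required tracks sum to 1.0) is easy to get wrong when tracks are added or removed during onboarding. A minimal validation sketch follows; it assumes tracks have already been loaded as dictionaries with `name`, `required`, and `weight` keys, and that a configuration with no non-required tracks needs no weights. Both are assumptions, as is the helper name `check_weights`.

```python
import math


def check_weights(tracks: list) -> None:
    """Raise if the weights of non-required tracks do not sum to 1.0."""
    preferred = [t for t in tracks if not t["required"]]
    if not preferred:
        return  # assumption: nothing to weigh when every track is required
    for t in preferred:
        if not 0.0 <= t["weight"] <= 1.0:
            raise ValueError(f"{t['name']}: weight must lie between 0.0 and 1.0")
    total = sum(t["weight"] for t in preferred)
    # Compare with a tolerance because floating-point sums are rarely exact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"non-required weights sum to {total}, expected 1.0")
```

Running this after every change to tracks.md catches the common case where a removed track leaves the remaining weights short of 1.0.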

`constraints.md`

HARD:
