a* (autostar) — web runtime

A generalised autonomous optimisation loop — soft RLVR for the masses. The user defines a goal; the system runs structured experiments, evaluates progress across independent tracks, reflects at strategic checkpoints, and learns from every attempt — including learning how to learn better the next time.

If you can measure it, you can improve it.

Web runtime constraints

This package runs inside a web chat runtime with reduced capabilities:

No subprocess access — external_tool verifiers are unavailable
No unrestricted local files — file read/write is limited
Memory: connector-backed, then project-pack, then none, in order of preference (see references/memory.md)

Do not silently downgrade external_tool verifiers to llm_judge. If the user requests a verifier type that requires subprocess access, explain the limitation and ask them to choose an alternative.
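
The no-silent-downgrade rule can be pictured as a small guard. This is only a sketch: `resolve_verifier`, `VerifierDecision`, and `SUBPROCESS_AVAILABLE` are hypothetical names invented for illustration; only the verifier type names `external_tool` and `llm_judge` come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a flag mirroring the web runtime's lack of subprocess access.
SUBPROCESS_AVAILABLE = False


@dataclass
class VerifierDecision:
    verifier: Optional[str]  # None means "wait for the user to choose"
    message: Optional[str]   # explanation shown to the user, if any


def resolve_verifier(requested: str) -> VerifierDecision:
    """Never substitute a verifier; explain the limitation and ask instead."""
    if requested == "external_tool" and not SUBPROCESS_AVAILABLE:
        return VerifierDecision(
            verifier=None,
            message=(
                "external_tool verifiers need subprocess access, which this "
                "runtime does not provide. Please choose an alternative."
            ),
        )
    return VerifierDecision(verifier=requested, message=None)
```

The key design point is that the guard returns no verifier at all rather than quietly falling back to `llm_judge`; the choice goes back to the user.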

Experimental-first principle

a* is an experimental optimisation loop. Do not reach for external mathematical optimisers or solvers (e.g. scipy.optimize, cvxpy, linear/quadratic programming solvers, evolutionary algorithm libraries, Bayesian optimisation frameworks, or any other off-the-shelf optimisation package) as a shortcut to improving the artifact. The value of a* is in the structured explore-evaluate-reflect cycle, not in delegating the search to a solver.

If at any point during onboarding, pre-run analysis, or execution you believe the problem is well-suited to a closed-form or mathematical optimisation approach, you must ask the user before pursuing it. Present it as an alternative:

"This problem looks like it could be approached with a mathematical optimiser (e.g. [specific method]). Would you like me to try that instead of running the experimental loop, or would you prefer to proceed with a*?"

Do not silently install, import, or invoke an external optimiser. Do not reframe the a* loop as a wrapper around a solver. If the user explicitly opts for a mathematical approach, that is a different workflow — not an a* run.

Concepts

Before running, ensure you understand these terms precisely:

Term	Meaning
Step	One execution with one parameter set. Atomic unit of work.
Play	A named bundle of parameters that move together (optional; disable with `plays: false`).
Lap	A set of steps sharing the same parameter family. Establishes statistical confidence in a direction.
Round	A set of laps. Ends with a mandatory reflection: worth pursuing? ask user? pivot?
Run	One user-initiated process. Lasts until budget is exhausted or goal is met.
Track	One independently verifiable sub-goal. Has its own verifier and ratchet.
Disposition	A learned prior on how to approach a (problem class, action intent) pair. Stored in long-term memory; conditions all significant actions.
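
One way to picture how these terms nest is as plain data classes. This is a hedged sketch, not the a* implementation: the field names (`params`, `scores`, `family`, `reflection`) and the `step_count` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    params: Dict[str, object]  # one parameter set
    scores: Dict[str, float] = field(default_factory=dict)  # per-track scores


@dataclass
class Lap:
    family: str  # the parameter family shared by every step in the lap
    steps: List[Step] = field(default_factory=list)


@dataclass
class Round:
    laps: List[Lap] = field(default_factory=list)
    reflection: str = ""  # filled in at the mandatory end-of-round reflection


@dataclass
class Run:
    rounds: List[Round] = field(default_factory=list)

    def step_count(self) -> int:
        """Total number of steps across all rounds and laps."""
        return sum(len(lap.steps) for rnd in self.rounds for lap in rnd.laps)
```

For example, a run with one round holding a two-step lap and a one-step lap has a `step_count()` of 3.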

Runtime capability contract

Before Phase 1, detect the host runtime's capabilities. The web runtime provides:

structured_choice: basic — bounded approvals via chat
freeform_input: true — open-ended elicitation
file_presentation: inline — present files inline in chat
local_html: inline — render HTML inline
subprocess: false — no subprocess access
pause_resume: true — human gates and round escalations
file_read_write: limited
long_term_memory: false (until an effective memory surface is probed)

If a capability is missing, follow the fallback policy in references/runtime-capabilities.md before onboarding the mission.
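
The contract above can be held as a simple mapping and checked before onboarding. A minimal sketch: the capability names and values are copied from the list above, while `WEB_RUNTIME` and `missing_capabilities` are hypothetical names, and treating any falsy value as "missing" is an assumption.

```python
# Capability values as listed for the web runtime.
WEB_RUNTIME = {
    "structured_choice": "basic",
    "freeform_input": True,
    "file_presentation": "inline",
    "local_html": "inline",
    "subprocess": False,
    "pause_resume": True,
    "file_read_write": "limited",
    "long_term_memory": False,  # until an effective memory surface is probed
}


def missing_capabilities(required, runtime=WEB_RUNTIME):
    """Return the required capability names that are absent or False."""
    return [name for name in required if not runtime.get(name)]
```

Anything this check returns would then be handled by the fallback policy in references/runtime-capabilities.md.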

Memory probing

Before starting, probe memory surfaces in order:

connector_backed — check if remote memory connector tools are available
project_pack — check if project knowledge contains an exported memory pack
none — short-term memory only

If neither a connector nor a project pack is available, state plainly:

"Long-term memory is unavailable in this session. a* is running with short-term memory only."

See references/adapter-claude-ai.md and references/memory.md for details.
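
The probe order can be sketched as a short function. The two boolean inputs stand in for whatever detection the host actually performs and are assumptions; the surface names and the notice text are taken from the section above.

```python
from typing import Optional, Tuple

NO_MEMORY_NOTICE = (
    "Long-term memory is unavailable in this session. "
    "a* is running with short-term memory only."
)


def probe_memory(
    connector_tools_available: bool, project_pack_present: bool
) -> Tuple[str, Optional[str]]:
    """Return (memory surface, notice to show the user or None)."""
    if connector_tools_available:
        return "connector_backed", None
    if project_pack_present:
        return "project_pack", None
    return "none", NO_MEMORY_NOTICE
```

A connector wins even when a project pack is also present, which matches the stated order of preference.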

Phase 1: Onboarding

Do not begin execution until onboarding is complete and the user has approved the mission.

Onboarding is an interactive dialogue, not a monologue. At every decision point you must stop and ask the user rather than inferring and proceeding. Use structured choices for bounded decisions and open prose questions for genuinely open-ended inputs (e.g. goal description, rubric wording).

The mandatory user-confirmation checkpoints are:

Goal decomposition confirmed — present inferred tracks as choices; user approves, removes, or adds before proceeding
Required vs preferred — for each track, explicitly ask; do not infer
Verifier type per track — present options (excluding external_tool which is unavailable in this runtime); user selects
Hard constraints confirmed — present inferred list; user amends
Budget — present three concrete options; user selects
Plays — enabled/disabled, and approval of proposed bundles
Final mission confirmation — full summary; explicit go/no-go before any step runs

Never skip a checkpoint. If the user's initial message contained enough information to pre-populate an answer, present it as a pre-selected option and ask them to confirm or change it. Do not silently accept it.
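
The checkpoint sequence behaves like a gate that only opens once every item has been explicitly confirmed. In this sketch the checkpoint identifiers and function names are hypothetical; note that a pre-populated answer changes the wording of the prompt but never counts as confirmation.

```python
from typing import Dict, Optional, Set

# Hypothetical identifiers for the seven checkpoints, in the order listed above.
CHECKPOINTS = (
    "goal_decomposition",
    "required_vs_preferred",
    "verifier_type_per_track",
    "hard_constraints",
    "budget",
    "plays",
    "final_mission_confirmation",
)


def next_checkpoint(confirmed: Set[str]) -> Optional[str]:
    """Return the first checkpoint the user has not confirmed yet."""
    for name in CHECKPOINTS:
        if name not in confirmed:
            return name
    return None


def checkpoint_prompt(name: str, prefilled: Dict[str, str]) -> str:
    """Offer an inferred answer as a pre-selected option; still ask."""
    if name in prefilled:
        return f"{name}: I inferred '{prefilled[name]}'. Confirm or change?"
    return f"{name}: please choose."


def may_execute(confirmed: Set[str]) -> bool:
    """Execution is allowed only after every checkpoint is confirmed."""
    return next_checkpoint(confirmed) is None
```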

Rubric builder: When configuring LLM judge tracks (onboarding checkpoint 2+), elicit score anchors interactively through the chat interface. Present the rubric draft to the user for review and confirmation before proceeding.

Onboarding produces four documents, all maintained in conversation state:

`mission.md`

GOAL:               [plain language description of success]
ARTIFACT:           [what is being mutated and where it lives]
PLAYS:              enabled | disabled
BUDGET:             [strategy + ceiling — see references/budgeting.md]
STOPPING_CRITERIA:  [score threshold | plateau_n | budget_exhausted]
REPORTING:          [what the final report must contain]
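
As a rough illustration of how the three STOPPING_CRITERIA values might be checked after each step, consider the sketch below. The exact semantics are assumptions, not taken from the source: `plateau_n` is read here as "no improvement over the previous best in the last n steps", and the budget is counted in steps.

```python
from typing import List, Optional


def stop_reason(
    scores: List[float],
    steps_budget: int,
    score_threshold: Optional[float] = None,
    plateau_n: Optional[int] = None,
) -> Optional[str]:
    """Return the first stopping criterion that fires, or None to continue."""
    if score_threshold is not None and scores and max(scores) >= score_threshold:
        return "score_threshold"
    if plateau_n is not None and len(scores) > plateau_n:
        best_before = max(scores[:-plateau_n])
        # Assumed plateau rule: the last n steps did not beat the earlier best.
        if max(scores[-plateau_n:]) <= best_before:
            return "plateau_n"
    if len(scores) >= steps_budget:
        return "budget_exhausted"
    return None
```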

`tracks.md`

One block per track. See Verification taxonomy below for verifier types.

TRACK: <name>
required:     true | false
weight:       0.0–1.0  (weights across non-required tracks must sum to 1.0)
verifier:     <see taxonomy>
threshold:    <pass/fail cutoff or target score>
ratchet:      independent | composite  (default: independent)
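
The weight rule in the template (each weight in 0.0–1.0, non-required weights summing to 1.0) is easy to check mechanically. A minimal sketch; the `Track` class and `validate_weights` are hypothetical names, and skipping the sum check when there are no non-required tracks is an assumption.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    name: str
    required: bool
    weight: float


def validate_weights(tracks: List[Track]) -> None:
    """Raise ValueError if non-required track weights break the template rule."""
    preferred = [t for t in tracks if not t.required]
    for t in preferred:
        if not 0.0 <= t.weight <= 1.0:
            raise ValueError(f"weight for {t.name} is outside 0.0-1.0")
    if preferred:
        total = sum(t.weight for t in preferred)
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError(f"non-required weights sum to {total}, expected 1.0")
```

Running this at the goal-decomposition checkpoint would surface a bad split before any step runs.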

`constraints.md`

HARD:   [list — violations cause immediate step rejection before scoring]
SOFT:   [list — passed to LLM judge as weighting hints]
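
The ordering stated for HARD constraints (reject before scoring) can be sketched as a gate in front of the scorer. The function and dictionary keys below are hypothetical; each hard constraint is modelled as a named predicate that returns True when the artifact complies.

```python
from typing import Callable, Dict


def evaluate_step(
    artifact: str,
    hard_constraints: Dict[str, Callable[[str], bool]],
    score_fn: Callable[[str], float],
) -> dict:
    """Reject on any hard-constraint violation; score only compliant artifacts."""
    violated = [name for name, check in hard_constraints.items() if not check(artifact)]
    if violated:
        # The scorer is never called for a rejected step.
        return {"rejected": True, "violated": violated, "score": None}
    return {"rejected": False, "violated": [], "score": score_fn(artifact)}
```

Soft constraints are deliberately absent from this gate, since the template passes them to the LLM judge as weighting hints instead.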

`plays.md` (if enabled)

PLAY: <name>
parameters:       [list of (param, from, to)]
hypothesis:       [why these move together]
tracks_targeted:  [list]
atomic_fallback:  true | false
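
Since a play is a bundle of `(param, from, to)` changes that move together, applying one can be sketched as an all-or-nothing update. `apply_play` is a hypothetical helper, and checking each `from` value against the current state is an assumption; the handling of `atomic_fallback` is not defined in this section, so it is left out.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, object, object]  # (param, from, to)


def apply_play(current: Dict[str, object], changes: List[Change]) -> Dict[str, object]:
    """Return a new parameter set with every change in the play applied together."""
    for param, old, _ in changes:
        if current.get(param) != old:
            raise ValueError(f"{param} is {current.get(param)!r}, expected {old!r}")
    updated = dict(current)  # leave the caller's parameter set untouched
    for param, _, new in changes:
        updated[param] = new
    return updated
```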

Verification taxonomy

This is the core of the rubric system. Every track must declare one of the following verifier types. In this web runtime, external_tool is not available.

1. Deterministic programmatic

A function, script, or expression that produces a binary pass/fail or a bounded score with no randomness. Does not require an LLM call. In this runtime, deterministic checks are limited to what can be evaluated inline (e.g. character count, regex match, format compliance).

verifier:
  type: deterministic
  fn:   word_count(artifact) <= 400
  returns: bool
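
A deterministic check of this kind might look like the following in Python. This is a sketch under assumptions: `word_count` is defined here as whitespace-separated tokens, which the source does not specify, and `matches_format` is a hypothetical helper illustrating the regex-match case mentioned above.

```python
import re


def word_count(artifact: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(artifact.split())


def within_word_limit(artifact: str, limit: int = 400) -> bool:
    """Binary pass/fail with no randomness and no LLM call."""
    return word_count(artifact) <= limit


def matches_format(artifact: str, pattern: str) -> bool:
    """Pass only if the whole artifact matches the regular expression."""
    return re.fullmatch(pattern, artifact) is not None
```

For instance, `within_word_limit("one two three", limit=3)` passes, while the same text with `limit=2` fails.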

2. External tool (subprocess) — NOT AVAILABLE

This verifier type requires subprocess access and is not available in this runtime. Do not offer it during onboarding. If the user asks for it, explain:

"External tool verifiers (pyright, pytest, eslint, etc.) require subprocess access which isn't available in this runtime. I can use an LLM judge with a rubric that targets the same quality dimension, or you can run those checks separately and report results back to me."

Do not silently substitute an LLM judge for an external tool. The user must explicitly appr
