forkcast

Productivity

Generates a decision tree for project tasks. Forks every meaningful choice, simulates each branch forward up to 10 steps, scores outcomes against weighted objectives, and emits `tasks/forkcast.md` — a human-editable markdown decision tree with embedded mermaid diagram. Closes the gap LeCun's critique points at: LLMs lack world models, so we make the model's lookahead, confidence, and grounding *le

Author: sboghossian

/forkcast — World-Model Lookahead for Project Decisions

You are running a structured lookahead simulation for the user's project. You do not commit to a single plan. You fork every decision, simulate each branch, score outcomes, and emit a decision tree the user can edit.

This is the constructive answer to Yann LeCun's critique that LLMs lack world models. You don't have one in the JEPA sense. What you have is a text-native project — code, commits, memory, prior decisions — and the LLM's pretraining already encodes the transition function for that domain (cf. RAP, Hao 2023). Your job is to make that implicit world model explicit, scored, and editable.

The loop

  1. Read the goal. From args, tasks/todo.md, or the conversation. If ambiguous, ask one clarifying question, then proceed.
  2. Read the context. git log -20, CLAUDE.md, MEMORY.md, any obvious docs in the repo. This is your grounding pool.
  3. Identify forks. A fork is any decision point with ≥2 reasonable paths. Don't manufacture forks that don't exist; don't collapse real ones into a default.
  4. For each fork: generate N candidate branches (default N=3). Each branch is a named direction, not a step yet.
  5. For each branch: simulate K steps forward (default K=6, max K=10). Each step:
    • States the predicted next action and its predicted effect on project state
    • Cites at least one grounding artifact (file, commit, memory, prior decision)
    • Emits a step confidence (0.0–1.0); cumulative = product along path
    • Stops extending when cumulative confidence < 0.3 (mark [speculative])
  6. Score each branch leaf against user-defined or inferred objectives (default: ship-speed 0.4, revenue-quality 0.3, reversibility 0.2, brand-fit 0.1). Score 0–10. Weighted sum = branch score.
  7. Self-consistency check at forks. Re-generate each branch's first 3 steps twice more with raised temperature. If outputs diverge, penalize.
  8. Critic gate at depth 3 and at chosen-leaf. Spawn /codex challenge (or inline critic if codex unavailable): "what does this branch assume that isn't stated? What evidence would falsify it?" Append flags as ⚠️ Critic flag: lines.
  9. Backpropagate the winner. Highest weighted score wins; mark ← chosen.
  10. Emit tasks/forkcast.md following the artifact format below.
  11. Wait for user. Do not auto-execute. Forkcast produces decisions, not actions. The user reads, edits, locks, re-runs.
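Steps 6 and 9 of the loop can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `Branch` class and the `weighted_score` and `choose` helpers are hypothetical names, and only the objective names and default weights come from the text above.

```python
from dataclasses import dataclass

# Default objective weights quoted in step 6; they sum to 1.0.
DEFAULT_WEIGHTS = {
    "ship-speed": 0.4,
    "revenue-quality": 0.3,
    "reversibility": 0.2,
    "brand-fit": 0.1,
}


@dataclass
class Branch:
    """Hypothetical container for one branch leaf and its 0-10 objective scores."""
    name: str
    objective_scores: dict[str, float]


def weighted_score(branch, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the leaf's objective scores (step 6)."""
    return sum(w * branch.objective_scores[obj] for obj, w in weights.items())


def choose(branches, weights=DEFAULT_WEIGHTS):
    """Return the branch with the highest weighted score (step 9)."""
    return max(branches, key=lambda b: weighted_score(b, weights))


mvp = Branch("Ship MVP first", {
    "ship-speed": 8, "revenue-quality": 5, "reversibility": 6, "brand-fit": 7,
})
full = Branch("Build full feature set", {
    "ship-speed": 5, "revenue-quality": 9, "reversibility": 4, "brand-fit": 8,
})
winner = choose([mvp, full])
```

With the default weights, the first branch scores 6.6 and the second 6.3, so `winner` is the MVP branch. Shifting weight from ship-speed to revenue-quality would flip the result, which is why the weights are recorded in the artifact header.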

Artifact format

tasks/forkcast.md is the deliverable. It is markdown, diff-friendly, GitHub-renderable, and the user is expected to edit it directly.

Required sections (in order):

  • Header: goal, objectives + weights, horizon, branching factor, critic
  • Embedded mermaid flowchart TD showing the full tree at a glance
  • Hierarchical decision sections — one per fork, branches as task list items with checkboxes ([x] = chosen, [ ] = considered)
  • Each branch line carries: name, score, cumulative confidence, grounding tag ([grounded] / [ungrounded]), self-consistency tag ([3/3] / [2/3])
  • Speculative steps tagged [speculative] past the confidence horizon
  • ## Decision summary with chosen path, predicted timeline, open questions
  • ## How to use this file block (reproduce verbatim from the example)

See examples/forkcast.example.md for the canonical layout.
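The canonical layout is in the example file, which is not reproduced here. The sketch below only illustrates how the fields listed for a branch line could be combined into one task list item. The function name, field order, and separators are assumptions, not the skill's actual format.

```python
def branch_line(name, score, confidence, grounded, consistent_runs,
                chosen=False, total_runs=3):
    """Render one branch as a markdown task list item (hypothetical layout).

    Combines the checkbox, name, score, cumulative confidence,
    grounding tag, and self-consistency tag described above.
    """
    box = "[x]" if chosen else "[ ]"
    grounding = "[grounded]" if grounded else "[ungrounded]"
    consistency = f"[{consistent_runs}/{total_runs}]"
    return (
        f"- {box} {name} (score {score:.1f}, conf {confidence:.2f}) "
        f"{grounding} {consistency}"
    )


line = branch_line("Ship MVP first", 6.6, 0.504, grounded=True,
                   consistent_runs=3, chosen=True)
```

Keeping every field on one line means a human edit to a score or a checkbox shows up as a single-line diff.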

User interaction protocol

The forkcast.md file is the conversation surface, not chat. Users interact by:

  • Checking a box = lock that branch. Re-runs respect locked branches.
  • Unchecking the chosen box = mark it as "considered, not chosen" (don't delete the analysis — it has value as a record).
  • Editing a score = override the model's ranking. Re-runs treat human-edited scores as ground truth.
  • Adding > Constraint: ... anywhere = a hard constraint for the next run; branches that violate it must be pruned and the violation must be cited.
  • Running /forkcast from <fork-id> = regenerate only that subtree, leaving the rest of the file intact.
  • Running /forkcast --linearize = emit a flat tasks/todo.md from the chosen path, ready for /yalla or normal execution.
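A re-run has to recover the user's edits from the file before regenerating anything. As a rough sketch, assuming branches are standard markdown task list items and constraints are blockquote lines starting with `Constraint:`, the locked branches and hard constraints could be read like this. The function name and return shape are hypothetical.

```python
import re

# "- [x] text" or "- [ ] text" (standard markdown task list syntax).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<text>.+)$")
# "> Constraint: text" anywhere in the file.
CONSTRAINT = re.compile(r"^\s*>\s*Constraint:\s*(?P<text>.+)$")


def read_user_edits(markdown):
    """Collect locked branches, considered branches, and hard constraints."""
    locked, considered, constraints = [], [], []
    for line in markdown.splitlines():
        if m := CHECKBOX.match(line):
            target = locked if m["mark"].lower() == "x" else considered
            target.append(m["text"].strip())
        elif m := CONSTRAINT.match(line):
            constraints.append(m["text"].strip())
    return {"locked": locked, "considered": considered, "constraints": constraints}


sample = "\n".join([
    "## Fork 1: launch strategy",
    "- [x] Ship MVP first",
    "- [ ] Build full feature set",
    "> Constraint: no new paid dependencies",
])
edits = read_user_edits(sample)
```

Unchecked branches are returned rather than discarded, matching the rule that considered branches stay in the file as a record.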

Hallucination mitigation

The single largest objection to LLM lookahead is correct: confidence compounds multiplicatively, so a 10-step rollout is mostly fiction. At 0.8 confidence per step, for instance, cumulative confidence after 10 steps is roughly 0.11. Forkcast does not pretend otherwise. It surrounds the prediction with four guardrails so the user can see what is grounded and what is speculation:

  1. Confidence cascade with auto-stop at cumulative < 0.3
  2. Grounding citations required per step (file/commit/memory/prior decision)
  3. Critic gate via /codex challenge at depth 3 and at chosen-leaf
  4. Self-consistency at every fork (3 rollouts at raised temperature)
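The first guardrail is simple arithmetic, sketched below. It is one reading of the rule, assuming the step that drops cumulative confidence below the threshold is itself tagged `[speculative]` and the rollout stops there. The `cascade` function is a hypothetical name.

```python
MIN_CONFIDENCE = 0.3  # cumulative threshold from the defaults


def cascade(step_confidences, min_confidence=MIN_CONFIDENCE):
    """Return (cumulative_confidence, is_speculative) for each simulated step.

    Cumulative confidence is the product of step confidences along the path.
    The rollout stops at the first step whose cumulative value falls below
    min_confidence; that step is flagged as speculative.
    """
    results = []
    cumulative = 1.0
    for confidence in step_confidences:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"step confidence out of range: {confidence}")
        cumulative *= confidence
        speculative = cumulative < min_confidence
        results.append((cumulative, speculative))
        if speculative:
            break  # auto-stop: do not extend past the confidence horizon
    return results


# Ten planned steps at a uniform 0.8 confidence each.
rollout = cascade([0.8] * 10)
```

At a uniform 0.8 per step, the cumulative value is about 0.33 after five steps and about 0.26 after six, so the rollout stops at step 6 even though ten steps were requested.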

Full design in docs/HALLUCINATION_MITIGATION.md. The honest claim: this does not eliminate hallucination, it makes it legible. A [speculative] ⚠️ branch is doing exactly what LeCun says LLMs do badly — and the user can see it doing it, and decide.

Prior art

Forkcast is not novel research. It is a packaging and distribution play that borrows from established techniques and ships them as a Claude skill:

Borrowed:

  • Tree of Thoughts (Yao 2023) — branch reasoning, evaluate states, search. Forkcast's core decomposition.
  • LATS (Zhou 2023) — MCTS + LM value functions + self-reflection. Validates branch-scoring with LM-as-judge.
  • RAP (Hao 2023) — uses the LLM itself as a world model for rollouts. Direct precedent for the 10-step simulation.
  • Self-Consistency (Wang 2022) — sample many paths, marginalize. Justifies multi-branch enumeration.
  • Reflexion (Shinn 2023) — verbal critique fed back as memory. Used for branch re-scoring.
  • ReAct (Yao 2022) — reasoning + acting traces. Forkcast nodes are ReAct-shaped.
  • Constitutional AI critic loops (Bai 2022) — pattern for the audit gate.
  • SWE-agent and OpenHands — structured action spaces beat free-form for software tasks.
  • Voyager (Wang 2023) — growing skill library. Justifies skill-extraction from successful chosen paths (future work).

Deliberately not doing:

  • GOAP / HTN / PDDL — require hand-authored predicates. Forkcast uses LM-native natural-language preconditions.
  • MCTS — no UCB rollouts, no win/loss backprop. Shallow enumeration with LM scoring is sufficient for project-decision domains.
  • JEPA / World Models (Ha & Schmidhuber 2018) — visual/embedding latent dynamics for embodied control. Unnecessary for text-native task spaces where the LM's pretraining already encodes the transition function.

What forkcast adds on top:

  • User-editable markdown artifact as the deliverable (not internal scratch). Manus and Cognition emit plans but treat them as ephemeral; forkcast's tree IS the product.
  • Project-context grounding — branches scored against the actual repo / CLAUDE.md / memory, not abstract benchmarks.
  • Confidence cascade with citation requirements — closes the LeCun gap by making the absent world model auditable rather than pretending it exists.

Defaults and configuration

BRANCHING_FACTOR  = 3      # candidates per fork
HORIZON_K         = 6      # default depth, max 10
MIN_CONFIDENCE    = 0.3    # cumulative threshold for `[speculative]`
SELF_CONSIST_N    = 3      # rollouts per branch at fork
CRITIC_DEPTH      = 3      # depth at which to spawn /codex ch
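If these defaults were loaded programmatically, a small validated container would catch out-of-range overrides early. The following is a sketch under that assumption. The `ForkcastConfig` class is hypothetical, and the only bounds it enforces are those stated in this document (at least two paths per fork, a horizon of at most 10, a confidence threshold between 0 and 1).

```python
from dataclasses import dataclass

MAX_HORIZON = 10  # "max 10" from the horizon default


@dataclass(frozen=True)
class ForkcastConfig:
    """Hypothetical typed view of the defaults listed above."""
    branching_factor: int = 3    # candidates per fork
    horizon_k: int = 6           # simulated steps per branch
    min_confidence: float = 0.3  # cumulative threshold for [speculative]
    self_consist_n: int = 3      # rollouts per branch at a fork
    critic_depth: int = 3        # depth at which the critic gate runs

    def __post_init__(self):
        if self.branching_factor < 2:
            raise ValueError("a fork needs at least 2 candidate branches")
        if not 1 <= self.horizon_k <= MAX_HORIZON:
            raise ValueError(f"horizon_k must be between 1 and {MAX_HORIZON}")
        if not 0.0 < self.min_confidence < 1.0:
            raise ValueError("min_confidence must be strictly between 0 and 1")


config = ForkcastConfig(horizon_k=8)
```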

How to add

/plugin marketplace add sboghossian/forkcast

The exact command may vary depending on the repository. Check the README on GitHub.
