Think-ACH - Analysis of Competing Hypotheses
Systematically narrows a set of competing hypotheses against evidence using Richards Heuer's Analysis of Competing Hypotheses (ACH) technique. Generates hypotheses (parallel, isolated), enumerates evidence (parallel, isolated), builds an explicit matrix mapping each piece of evidence against each hypothesis, focuses on disconfirming evidence to rank hypotheses, and reports the surviving leader along with sensitivity analysis and falsification milestones.
This skill produces no tangible artifacts. It is a consultant, not an implementer. No code, no tickets, no commits. The output is a structured analysis the user can act on — a hypothesis leaderboard with the matrix that supports it.
The technique
ACH was developed by Richards J. Heuer Jr. for the CIA Directorate of Intelligence and is documented in his book Psychology of Intelligence Analysis (1999). It was designed specifically to counter the cognitive failure modes that intelligence analysts (and everyone else reasoning under uncertainty) habitually exhibit:
- Confirmation bias — seeking evidence that confirms a preferred hypothesis rather than evidence that disconfirms competing ones
- Premature closure — locking in on the first plausible hypothesis and stopping the search
- Anchoring — letting the leading candidate dominate subsequent reasoning
- Cherry-picking evidence — emphasizing convenient evidence and rationalizing away inconvenient evidence
- Failure to consider alternatives — never enumerating the full hypothesis space
ACH's structural countermeasures:
- Force enumeration of all plausible hypotheses upfront (anti-anchoring; anti-premature-closure)
- Build an explicit matrix of evidence × hypothesis (anti-cherry-picking; makes the analysis legible)
- Focus on disconfirmation — the central insight: a hypothesis cannot be proven, only disconfirmed. The surviving hypothesis is the one with the least disconfirming evidence, not the most confirming. (Anti-confirmation-bias.)
- Identify diagnosticity — surface which evidence actually discriminates among hypotheses; drop evidence consistent with all
- Sensitivity analysis — for each load-bearing piece of evidence, ask "what if this is wrong?" and watch how the conclusion changes
- Report all hypotheses, not just the leader — preserve the alternatives so the user knows what's still in play
- Identify falsification milestones — what future observation would distinguish the top candidates?
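The matrix-centred countermeasures above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's implementation: the three-letter rating scheme ("C", "I", "N"), the function names, and the sample data are all assumptions made for the example, and "what if this is wrong?" is approximated by simply dropping one piece of evidence at a time.

```python
from typing import Dict, List

# Assumed rating scheme: "C" consistent, "I" inconsistent, "N" neutral.
# matrix[evidence][hypothesis] -> rating
Matrix = Dict[str, Dict[str, str]]

def inconsistency_scores(matrix: Matrix, hypotheses: List[str]) -> Dict[str, int]:
    """Count disconfirming ("I") cells per hypothesis; lower is better."""
    return {h: sum(1 for row in matrix.values() if row[h] == "I") for h in hypotheses}

def diagnostic_evidence(matrix: Matrix) -> List[str]:
    """Keep only evidence whose ratings differ across hypotheses."""
    return [e for e, row in matrix.items() if len(set(row.values())) > 1]

def leaders(matrix: Matrix, hypotheses: List[str]) -> List[str]:
    """Hypotheses with the least disconfirming evidence (ties are kept)."""
    scores = inconsistency_scores(matrix, hypotheses)
    best = min(scores.values())
    return [h for h in hypotheses if scores[h] == best]

def sensitivity(matrix: Matrix, hypotheses: List[str]) -> Dict[str, List[str]]:
    """Drop each piece of evidence in turn; report those that change the leader set."""
    baseline = leaders(matrix, hypotheses)
    changed = {}
    for e in matrix:
        reduced = {k: row for k, row in matrix.items() if k != e}
        now = leaders(reduced, hypotheses)
        if now != baseline:
            changed[e] = now
    return changed

# Hypothetical sample data, for illustration only.
HYPOTHESES = ["H1", "H2", "H3"]
matrix: Matrix = {
    "E1": {"H1": "I", "H2": "C", "H3": "C"},
    "E2": {"H1": "C", "H2": "C", "H3": "C"},  # consistent with all: not diagnostic
    "E3": {"H1": "I", "H2": "C", "H3": "I"},
    "E4": {"H1": "C", "H2": "I", "H3": "I"},
}

print(inconsistency_scores(matrix, HYPOTHESES))
print(diagnostic_evidence(matrix))
print(leaders(matrix, HYPOTHESES))
print(sensitivity(matrix, HYPOTHESES))
```

In this sample, H2 leads because it has the fewest "I" cells, not because it has the most "C" cells, and the sensitivity pass flags E1 and E3 as load-bearing: removing either one leaves H2 tied with a rival.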
When to use vs /think-diagnose
These two skills overlap in problem domain (multi-candidate evaluation under uncertainty) but are structurally distinct.
- /think-diagnose: open-ended causal exploration. Generative + evaluative. Lens-driven brainstorming of candidate causes (technical, human-factors, environmental, measurement-artifact, etc.) plus narrative evidence evaluation. Use when the user has a phenomenon and wants to understand its causes broadly. Output: leading candidates with distinguishing evidence needed.
- /think-ach: rigorous narrowing among hypotheses. Primarily evaluative, with explicit matrix structure and disconfirmation focus. Use when the user has competing hypotheses (provided or just-generated) and wants to systematically narrow among them. ACH is broader than diagnosis: it applies to causal attribution, forecasting, attribution-of-responsibility, strategic assessment, and similar multi-hypothesis questions.
Natural workflow when both apply: /think-diagnose generates candidate causes; /think-ach rigorously narrows among them. They are complementary, not duplicative.
ACH also stands alone for non-causal questions ("which of these scenarios is most likely?", "which actor is most likely responsible?", "which interpretation of the data is most defensible?").
Roles
Judge (you, running this skill):
- Receive the question and any seed hypotheses
- Validate the question is ACH-shaped
- Spawn hypothesizers in isolation across angles
- Spawn evidence-gatherers in isolation across evidence classes
- Build the matrix (evaluating each cell independently)
- Run diagnosticity, disconfirmation-focused ranking, sensitivity analysis, and falsification-milestone identification
- Synthesize the report
Hypothesizers (THK - ACH Hypothesizer): Each receives the question and an assigned angle (leading, alternative, adversarial, null, deceptive, surprise). Generates hypotheses from that angle in isolation.
Evidence-gatherers (THK - ACH Evidence Gatherer): Each receives the question and an assigned evidence class (direct-observational, documentary-historical, structural, behavioral, absent, anomalous). Enumerates relevant evidence in that class in isolation.
Workflow
1. Receive the Question and Any Seed Hypotheses
The question may arrive as:
- Conversation context — summarize back, confirm
- A document — read the file (incident report, design analysis, intelligence brief)
- Fresh user input — capture verbatim
The user may also provide seed hypotheses they already have in mind. Capture them as inputs to step 3 (they don't replace the parallel hypothesizers — they augment).
Produce a written brief of the question. A good brief includes:
- The question — what is being analyzed (a phenomenon, a forecast, an attribution claim, a scenario assessment)?
- Scope — what's in, what's out
- Available evidence — what evidence is available in principle (what records, observations, sources can be drawn on)?
- Seed hypotheses — any hypotheses the user has already articulated
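One way to keep the brief honest is to give it a fixed shape. The sketch below is an assumption about how that could look, not part of the skill: the `QuestionBrief` class and its field names are hypothetical, and seed hypotheses are treated as optional because the text says the user "may" provide them.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QuestionBrief:
    """Hypothetical container for the four parts of a good brief."""
    question: str
    in_scope: List[str] = field(default_factory=list)
    out_of_scope: List[str] = field(default_factory=list)
    available_evidence: List[str] = field(default_factory=list)
    seed_hypotheses: List[str] = field(default_factory=list)  # optional input to step 3

    def missing_parts(self) -> List[str]:
        """Name the required parts of the brief that are still empty."""
        missing = []
        if not self.question.strip():
            missing.append("question")
        if not (self.in_scope or self.out_of_scope):
            missing.append("scope")
        if not self.available_evidence:
            missing.append("available evidence")
        return missing

# Illustrative values only.
brief = QuestionBrief(
    question="Why did checkout latency double on Tuesday?",
    in_scope=["checkout service"],
    seed_hypotheses=["bad deploy"],
)
print(brief.missing_parts())
```

Here the brief still lacks a statement of available evidence, which is worth resolving before moving on to validation.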
2. Validate the Question Is ACH-Shaped
ACH applies when:
- Multiple plausible hypotheses exist — at least 3, ideally 4-7. With only 1-2 hypotheses, ACH is overkill; with 10+, the matrix becomes unwieldy and hypotheses are usually too granular.
- Evidence is available — there's enough material to discriminate among hypotheses. Pure speculation is not ACH territory.
- Hypotheses are roughly mutually exclusive — they should make different predictions about evidence. Hypotheses that all predict the same things cannot be discriminated.
- The user wants rigorous narrowing, not exploratory ideation. If the user wants to generate hypotheses, route to /think-diagnose (for causal questions) or /think-brainstorm (for action options) first.
If the question fails any check, say so plainly and offer the alternative:
- Too few hypotheses or too vague → /think-diagnose to generate causes, or /think-brainstorm to generate options
- Too thin evidence → narrow the question, or wait until more evidence is available
- Hypotheses not mutually exclusive → reframe so they make distinguishable predictions
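The validation checks can be expressed as a small function that returns the failures together with the suggested alternative. This is a sketch under stated assumptions: the function name and signature are hypothetical, each hypothesis is represented by the set of observations it predicts, and "not mutually exclusive" is approximated as two hypotheses predicting exactly the same set.

```python
from typing import Dict, FrozenSet, List

def validate_ach_shape(
    predictions: Dict[str, FrozenSet[str]],  # hypothesis -> observations it predicts
    evidence: List[str],
    wants_narrowing: bool = True,
) -> List[str]:
    """Return a list of failed checks; an empty list means the question is ACH-shaped."""
    problems = []
    n = len(predictions)
    if n < 3:
        problems.append("too few hypotheses: route to /think-diagnose or /think-brainstorm")
    elif n >= 10:
        problems.append("too many hypotheses: matrix is unwieldy, hypotheses likely too granular")
    if not evidence:
        problems.append("too thin evidence: narrow the question or wait for more evidence")
    names = list(predictions)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if predictions[a] == predictions[b]:
                problems.append(f"{a} and {b} make the same predictions: reframe them")
    if not wants_narrowing:
        problems.append("user wants ideation: route to /think-diagnose or /think-brainstorm")
    return problems

# Illustrative input: H2 and H3 cannot be told apart by any observation.
example = {
    "H1": frozenset({"error spike", "recent deploy"}),
    "H2": frozenset({"error spike", "slow queries"}),
    "H3": frozenset({"error spike", "slow queries"}),
}
print(validate_ach_shape(example, evidence=["deploy log", "query log"]))
```

Reporting every failed check at once, rather than stopping at the first, lets the judge state plainly what is wrong and offer the matching alternative in one pass.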
3. Enumerate Hypotheses (Parallel, Isolated)
Spawn 4-6 THK - ACH Hypothesizer agents in parallel, each with a different angle.
Hypothesis-generation angles:
- leading — the obvious, popular, or most-favored hypothesis
- alternative — hypotheses that contradict the leading candidate
- adversarial — someone benefits from a specific outcome; intentional action by an actor
- null — nothing unusual is happening; appearances are normal; the boring hypothesis
- deceptive — appearances are intentionally misleading; someone is covering up
- surprise — an unexpected hypothesis that fits the evidence; the one nobody volunteered
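The "parallel, isolated" fan-out over angles can be sketched as follows. The `hypothesizer` function is a hypothetical stand-in for a THK - ACH Hypothesizer agent (the real spawning mechanism is not described here), and a thread pool is just one assumed way to run the calls concurrently. The point of the sketch is the isolation: each call sees only the question and its own angle, never another hypothesizer's output.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

ANGLES = ["leading", "alternative", "adversarial", "null", "deceptive", "surprise"]

def hypothesizer(question: str, angle: str) -> List[str]:
    """Hypothetical stand-in for one agent; returns placeholder text only."""
    return [f"[{angle}] placeholder hypothesis for: {question}"]

def enumerate_hypotheses(question: str, angles: List[str]) -> Dict[str, List[str]]:
    """Run one hypothesizer per angle in parallel and collect results by angle."""
    with ThreadPoolExecutor(max_workers=len(angles)) as pool:
        # Each task receives only (question, angle): no shared state between angles.
        futures = {a: pool.submit(hypothesizer, question, a) for a in angles}
        return {a: f.result() for a, f in futures.items()}

# Illustrative run with four of the six angles.
results = enumerate_hypotheses(
    "Why did checkout latency double on Tuesday?",
    ["leading", "alternative", "null", "surprise"],
)
for angle, hyps in results.items():
    print(angle, hyps)
```

Collecting results only after every task has finished keeps one angle's output from anchoring another, which is the reason for isolating the hypothesizers in the first place.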
Selection heuristics:
- Always include leading and alternative — these establish the basic competition
- Include null unless the phenomenon being analyzed is structurally non-null (i.e., something has demonstrably happened that requires explanation)
- Include adversarial when the question involves actors with motivations
- Include deceptive when the question involves trust, intelligence, security, or signals that could be manipulated
- Include surprise when the question is novel or the user is c