AlterLab GameForge -- Hypothesis-Driven Prototyping
Prototypes are experiments, not demos. Every prototype exists to answer one question: "Is this worth building?" The moment you start polishing a prototype, you have stopped prototyping and started building -- and you may be building the wrong thing. Undertale's original demo was rough, ugly, and proved exactly one thing: the bullet-hell-meets-RPG mechanic was worth a full game. Hollow Knight started as a game jam prototype that validated tight combat in an atmospheric 2D world. Celeste began as a PICO-8 prototype that proved a single idea: precise air-dash platforming feels incredible even with 8x8-pixel sprites on a 128x128 screen. This workflow enforces that same discipline: define a hypothesis, build the minimum viable test, observe real players, and make a binary kill-or-promote decision based on evidence.
Purpose & Triggers
Invoke this workflow when:
- A team member proposes a new mechanic and the response should be "prove it" rather than "ship it"
- The design document contains an assumption that has never been tested with real players
- Two competing mechanic designs need a head-to-head bake-off to determine which one feels better
- A feature sounds good on paper but the team has no visceral sense of whether it will be fun
- Pre-production needs to validate core loops before committing engineering resources
- A pivot is being considered and the new direction needs rapid feasibility confirmation
Do NOT use this workflow when:
- The mechanic is well-understood and already validated in similar games (just build it properly)
- You need a vertical slice for stakeholder presentation (that is a demo, not a prototype)
- The question is about content, not mechanics (content questions need playtesting, not prototyping)
Critical Rules
- One hypothesis per prototype. If you are testing two things, you have two prototypes. Combining hypotheses contaminates your results -- when it fails, you will not know which part failed.
- Time-box or die. Every prototype gets a strict time limit: 1-3 days maximum. If the hypothesis cannot be tested in that window, the scope is too large. Decompose it further.
- Prototype code is biohazard. It does not graduate to production. Ever. When a hypothesis is validated, the real implementation starts from scratch with proper architecture. Letting prototype code leak into production is how technical debt is born. Celeste's PICO-8 prototype shared zero code with the final game -- it proved the feel, then the real build started clean.
- Ugly is correct. Colored rectangles for characters. Placeholder sounds. Programmer art. Comic Sans labels. If anyone comments on the visual quality of a prototype, they have misunderstood its purpose. The Hollow Knight game jam prototype used simple silhouettes -- the atmosphere came later, the feel came first.
- Observe behavior, not opinions. Players will tell you what they think you want to hear. Watch what they DO. A player who says "yeah it was fine" but leaned forward and played for 20 minutes straight is giving you different data than their words suggest.
- Kill without sentiment. If the evidence says the hypothesis is false, the prototype dies. It does not matter how clever the idea was, how much you personally like it, or how much time you spent building it. Supergiant kills prototypes constantly -- their GDC talks reveal dozens of dead mechanics that never made it past the test phase because the team trusts evidence over attachment.
- Always reference docs/game-design-theory.md for shared theoretical frameworks (MDA, Flow Theory, SDT) when formulating hypotheses about player experience.
Workflow
Step 1: Define the Hypothesis
Write the hypothesis in this exact format: "We believe that [mechanic/system] will produce [player behavior/emotion] when [specific condition]."
Then define falsification criteria -- what evidence would DISPROVE the hypothesis? This is the most important part. If you cannot define what failure looks like, your hypothesis is unfalsifiable and therefore untestable.
Examples of strong hypotheses:
- "We believe that a grappling hook with momentum preservation will make traversal feel exhilarating when the player chains three or more swings without touching the ground."
- "We believe that asymmetric co-op roles (one player builds, one player defends) will produce emergent communication when the builder can see threats the defender cannot."
- "We believe that a stamina system with visible recovery will create tactical tension when the player faces two enemies and cannot defeat both without resting."
Examples of weak hypotheses (and why):
- "The combat will be fun" -- unfalsifiable. What is fun? Under what conditions? For whom?
- "Players will like the art style" -- this is a content question, not a mechanic question. You do not need a prototype for this.
- "The game will be better with multiplayer" -- too broad. Which specific multiplayer interaction? What does "better" mean measurably?
Map the hypothesis to the MDA framework from docs/game-design-theory.md: which aesthetic are you targeting (Sensation, Fantasy, Narrative, Challenge, Fellowship, Discovery, Expression, Submission)? This grounds the hypothesis in established theory and helps you define what success looks like.
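The hypothesis format, falsification criteria, and MDA mapping described in this step can be captured in one small record. The sketch below is one possible shape, not a prescribed tool: the `Hypothesis` class, its field names, and the example falsification criteria are all illustrative assumptions. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field
from enum import Enum


class Aesthetic(Enum):
    """The eight MDA aesthetics named in this step."""
    SENSATION = "Sensation"
    FANTASY = "Fantasy"
    NARRATIVE = "Narrative"
    CHALLENGE = "Challenge"
    FELLOWSHIP = "Fellowship"
    DISCOVERY = "Discovery"
    EXPRESSION = "Expression"
    SUBMISSION = "Submission"


@dataclass
class Hypothesis:
    """Hypothetical record type; field names are assumptions."""
    mechanic: str
    expected_response: str   # player behavior or emotion
    condition: str
    aesthetic: Aesthetic
    falsified_if: list[str] = field(default_factory=list)

    def __post_init__(self):
        # A hypothesis with no failure condition cannot be tested.
        if not self.falsified_if:
            raise ValueError("unfalsifiable: add at least one falsification criterion")

    def statement(self) -> str:
        """Render the hypothesis in the workflow's sentence format."""
        return (f"We believe that {self.mechanic} will produce "
                f"{self.expected_response} when {self.condition}.")


stamina = Hypothesis(
    mechanic="a stamina system with visible recovery",
    expected_response="tactical tension",
    condition="the player faces two enemies and cannot defeat both without resting",
    aesthetic=Aesthetic.CHALLENGE,
    # Illustrative criteria only; write your own before building anything.
    falsified_if=["testers never retreat to recover stamina",
                  "testers ignore the stamina bar entirely"],
)
print(stamina.statement())
```

Refusing to construct a hypothesis without falsification criteria makes the "most important part" impossible to skip.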
Step 2: Scope Ruthlessly
Ask: "What is the absolute minimum implementation that tests this hypothesis?" Then cut it in half.
You do not need:
- A menu system. Start the prototype in the test scenario directly.
- Multiple levels. One room, one encounter, one situation.
- Save/load functionality. Nobody is playing this for more than 10 minutes.
- Audio. Unless audio IS the hypothesis (e.g., testing rhythm-based mechanics).
- Any UI beyond what the player needs to understand the core interaction.
- Animations. Lerp between states. Snap to positions. Teleport if you must.
- Edge case handling. If a player finds a bug in a prototype, congratulations -- they are exploring. Note it and move on.
You DO need:
- The core input-to-feedback loop working at full speed. If the hypothesis is about how a mechanic FEELS, input latency and response timing must be representative.
- Enough game state to test the hypothesis. If you are testing resource management tension, you need at least a minimal economy that creates scarcity.
- A way to reset quickly. Testers will play multiple rounds. A 30-second restart cycle kills your testing velocity.
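One cheap way to get a near-instant reset is to keep the entire starting state in a single hardcoded structure and copy it on restart. This is a minimal sketch in Python rather than in any particular engine's scripting language, and the state layout (`player`, `enemies`, `stamina`) is invented for illustration.

```python
import copy

# Hardcoded starting state for the one test scenario (values are placeholders).
INITIAL_STATE = {
    "player": {"x": 0, "y": 0, "stamina": 100},
    "enemies": [{"x": 5, "hp": 3}, {"x": 9, "hp": 3}],
}


def reset():
    """Return a fresh, independent copy of the starting state."""
    return copy.deepcopy(INITIAL_STATE)


state = reset()
state["player"]["stamina"] -= 40   # a round of play mutates the state
state["enemies"].pop()             # an enemy is defeated
state = reset()                    # bind this to a single key for testers
```

A deep copy matters here: a shallow copy would share the nested enemy and player dictionaries, so damage from one round would leak into the next.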
Create a scope checklist with exactly three columns:
| Must Have (tests hypothesis) | Nice to Have (improves test clarity) | Out of Scope (save for production) |
| --- | --- | --- |
If the "Must Have" column has more than 5 items, you have not scoped ruthlessly enough. Go back and decompose.
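The checklist and its five-item limit are easy to enforce mechanically. The helper below is a hypothetical sketch (the name `scope_table` and its arguments are assumptions) that renders the three columns as a pipe table and refuses to proceed when the "Must Have" column is too long.

```python
from itertools import zip_longest

MAX_MUST_HAVE = 5  # limit stated in this step

HEADERS = (
    "Must Have (tests hypothesis)",
    "Nice to Have (improves test clarity)",
    "Out of Scope (save for production)",
)


def scope_table(must_have, nice_to_have, out_of_scope):
    """Render the scope checklist, rejecting an over-scoped prototype."""
    if len(must_have) > MAX_MUST_HAVE:
        raise ValueError(
            f"{len(must_have)} must-haves; decompose until there are "
            f"at most {MAX_MUST_HAVE}"
        )
    rows = ["| " + " | ".join(HEADERS) + " |", "| --- | --- | --- |"]
    # Pad the shorter columns with empty cells so every row has three.
    for cells in zip_longest(must_have, nice_to_have, out_of_scope, fillvalue=""):
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join(rows)


# Example entries are illustrative only.
print(scope_table(
    must_have=["grapple input", "momentum carry-over", "instant reset"],
    nice_to_have=["swing counter"],
    out_of_scope=["menus", "save/load", "audio", "animations"],
))
```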
Step 3: Build Dirty
This is the only time in your career when bad code is the correct code.
- Hardcode everything. Magic numbers everywhere. No config files. No data-driven anything.
- One script file if you can manage it. No architecture. No separation of concerns. No design patterns.
- Copy-paste instead of abstracting. You are writing code that will be deleted in 72 hours.
- Use the fastest path to playable, even if that means ignoring every best practice you know.
- If your engine has a visual scripting system (Blueprints, Bolt, VisualScript), use it -- faster iteration for throwaway logic.
- Commit nothing to the main repository. Prototype code lives in a throwaway branch or a separate folder that will be deleted after the decision.
The build phase should consume no more than 60% of your time budget; treat that as a hard ceiling, not a goal. If you are spending 2 of your 3 days building (about 67%), you have 0.5 days for testing and 0.5 days for analysis. That is not enough. Target a 40/30/30 split: 40% build, 30% test, 30% analyze.
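To make the budget arithmetic concrete, the sketch below splits a time-box into the 40/30/30 target and checks a planned build phase against the 60% ceiling. The function names are hypothetical, and the percentages are the ones given in this step.

```python
def split_budget(total_days, build_pct=40, test_pct=30, analyze_pct=30):
    """Divide a time-box into build/test/analyze phases (in days)."""
    if build_pct + test_pct + analyze_pct != 100:
        raise ValueError("phase percentages must sum to 100")
    return {
        "build": total_days * build_pct / 100,
        "test": total_days * test_pct / 100,
        "analyze": total_days * analyze_pct / 100,
    }


def build_within_cap(build_days, total_days, cap_pct=60):
    """True if the build phase stays at or under the ceiling."""
    return build_days * 100 <= cap_pct * total_days


print(split_budget(3))         # {'build': 1.2, 'test': 0.9, 'analyze': 0.9}
print(build_within_cap(2, 3))  # False: 2 of 3 days is about 67%
```

On a 3-day time-box the target leaves 1.2 days for building and 0.9 days each for testing and analysis, compared with the 0.5 days each that remain after a 2-day build.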
Step 4: Structured Evaluation
Do NOT test alone. Your own assessment of your own prototype is the least valuable data you can collect.
Minimum viable test: 3-5 people who are NOT on the development team play t