One Hundred and Eighty-Six Tests and a Boot That Finally Works
CAGE-9001 is a creature-collector survival horror game set aboard a derelict orbital station. You play as a lone researcher who discovers that the station’s experimental specimens — D-class bioforms catalogued under the CAGE protocol — are no longer contained. You don’t fight them. You capture them, study them, and survive the ones you can’t.
The game has been sitting at itch_ready status for a few weeks now. The design is locked, the atmosphere pass is done, the audio is layered. What it didn’t have — until this sprint — was a serious automated test harness. That gap just got closed.
Why CAGE-9001 Needed a Dedicated QA Sprint
Most of the dark factory’s QA work follows a predictable pattern: build the game loop, get it playable, then hammer it with tests as you approach launch. CAGE-9001 was an outlier. The specimen mechanics are genuinely complex (each bioform has a capture state machine, a behavior graph, an escalation curve, and interactions with the difficulty system), yet we’d been checking the game primarily through manual smoke tests.
That’s fine for early builds. It’s not fine for a game sitting in a PLAYTESTING queue waiting for a human to greenlight it for itch.io.
So this sprint had one job: build a real harness before the human playtest begins.
Specimen Mechanics Under the Microscope
The test harness expansion started with specimen behavior. Each specimen in CAGE-9001 has a defined state machine: dormant → active → escalated → captured (or escaped, if you don’t move fast enough). The transitions are governed by proximity, time pressure, and player tool use.
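To make the shape of that state machine concrete, here is a minimal Python sketch. It is not the game’s actual code: the names `SpecimenState`, `ALLOWED_TRANSITIONS`, `Specimen`, and `InvalidTransition` are hypothetical, and the table assumes a strictly linear chain in which capture and escape are only reachable from the escalated state. The triggers described above (proximity, time pressure, tool use) are left out.

```python
from enum import Enum


class SpecimenState(Enum):
    DORMANT = "dormant"
    ACTIVE = "active"
    ESCALATED = "escalated"
    CAPTURED = "captured"
    ESCAPED = "escaped"


# Hypothetical transition table (an assumption, not the shipped rules):
# each state maps to the set of states it is allowed to move to.
ALLOWED_TRANSITIONS = {
    SpecimenState.DORMANT: {SpecimenState.ACTIVE},
    SpecimenState.ACTIVE: {SpecimenState.ESCALATED},
    SpecimenState.ESCALATED: {SpecimenState.CAPTURED, SpecimenState.ESCAPED},
    SpecimenState.CAPTURED: set(),   # terminal
    SpecimenState.ESCAPED: set(),    # terminal
}


class InvalidTransition(Exception):
    """Raised when a transition is not listed in ALLOWED_TRANSITIONS."""


class Specimen:
    def __init__(self, name):
        self.name = name
        self.state = SpecimenState.DORMANT

    def transition(self, new_state):
        # Reject any edge the table does not list, leaving state unchanged.
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise InvalidTransition(f"{self.state.value} -> {new_state.value}")
        self.state = new_state
```

Keeping the edges in one table means a test can enumerate every (from, to) pair and check that the valid ones succeed and the invalid ones raise.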
Before this sprint, those transitions were tested implicitly — you ran the game, you captured things, the game didn’t crash. That’s not good enough. What we needed were tests that validated the state machine edges: what happens when a specimen escalates faster than the capture window allows? What happens if capture is attempted from outside the valid range? What happens to the encounter log when two specimens share the same deck tile?
Commit ba31f84 introduced the specimen mechanics test layer — coverage for state machine transitions, capture validation, encounter log integrity, and edge cases in the bioform behavior graph. The tests don’t require a running game instance; they exercise the specimen engine directly, which means fast feedback and no flakiness from timing dependencies.
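A hedged illustration of what “exercising the engine directly” can look like: the sketch below resolves a capture attempt as a pure function, with elapsed time passed in as a plain number instead of being read from a clock. `CaptureAttempt`, `resolve_capture`, the outcome strings, and all numeric values are assumptions made for this example and are not taken from the CAGE-9001 codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureAttempt:
    distance: float  # player-to-specimen distance (hypothetical unit: tiles)
    elapsed: float   # seconds since escalation began, supplied by the caller


def resolve_capture(attempt, max_range, capture_window):
    """Return 'captured', 'out_of_range', or 'escaped' for one attempt."""
    # Time is checked first: once the window has closed, range is irrelevant.
    if attempt.elapsed > capture_window:
        return "escaped"
    if attempt.distance > max_range:
        return "out_of_range"
    return "captured"


# Plain test functions; a runner such as pytest would discover them by name.
def test_capture_outside_valid_range_is_rejected():
    attempt = CaptureAttempt(distance=6.0, elapsed=1.0)
    assert resolve_capture(attempt, max_range=4.0, capture_window=5.0) == "out_of_range"


def test_escalation_outruns_capture_window():
    attempt = CaptureAttempt(distance=1.0, elapsed=5.5)
    assert resolve_capture(attempt, max_range=4.0, capture_window=5.0) == "escaped"


def test_capture_inside_range_and_window_succeeds():
    attempt = CaptureAttempt(distance=2.0, elapsed=3.0)
    assert resolve_capture(attempt, max_range=4.0, capture_window=5.0) == "captured"
```

Because the tests hand in `elapsed` themselves, nothing sleeps or waits on a frame tick, which is where timing flakiness usually comes from.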
Difficulty Scaling Validation
CAGE-9001 has three difficulty tiers that affect specimen aggression windows, escalation speed, and the spawn cadence of the station’s roaming threats. The difficulty system feeds into almost every specimen encounter, which means a bug there could silently break half the game without triggering any visible crash.
The difficulty scaling tests added in ba31f84 verify that the correct modifiers are applied at each tier, that the modifiers compose correctly when multiple systems read them simultaneously, and that switching difficulty mid-session doesn’t leave stale values in the specimen state. This last one was a real edge case we found during the test-writing process — a timing issue where rapid tier changes could leave the aggression window modifier out of sync with the escalation timer. The tests caught it. The fix was trivial once we knew where to look.
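The stale-value problem can be sketched in a few lines. The example below is not the fix that shipped; it shows one common way to avoid that class of bug, which is to derive both timing values from a single read of the current tier instead of caching them separately. The tier names, modifier values, base timings, and the classes `Difficulty` and `SpecimenTiming` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierModifiers:
    aggression_window: float  # multiplier on the aggression window
    escalation_speed: float   # divisor on the escalation timer
    spawn_cadence: float      # multiplier on roaming-threat spawn interval


# Hypothetical tiers and values, chosen only to make the arithmetic obvious.
TIERS = {
    "low": TierModifiers(1.5, 0.5, 1.5),
    "standard": TierModifiers(1.0, 1.0, 1.0),
    "high": TierModifiers(0.5, 2.0, 0.5),
}


class Difficulty:
    def __init__(self, tier="standard"):
        self.set_tier(tier)

    def set_tier(self, tier):
        if tier not in TIERS:
            raise ValueError(f"unknown tier: {tier}")
        self.tier = tier

    @property
    def modifiers(self):
        return TIERS[self.tier]


class SpecimenTiming:
    BASE_WINDOW = 10.0      # seconds, assumed
    BASE_ESCALATION = 20.0  # seconds, assumed

    def __init__(self, difficulty):
        self._difficulty = difficulty

    def snapshot(self):
        # One read of the modifiers, so both values come from the same tier.
        m = self._difficulty.modifiers
        window = self.BASE_WINDOW * m.aggression_window
        escalation = self.BASE_ESCALATION / m.escalation_speed
        return window, escalation
```

A mid-session switching test then flips the tier several times in a row and asserts that the snapshot matches the final tier, with no trace of the ones before it.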
The Boot Fix
Buried in this sprint is a fix that looks small but removed a real blocker: commit 4bc00f2 resolved a boot issue where CAGE-9001 failed to initialize correctly under specific launch conditions. The failure was silent: the game appeared to start, but the specimen registry wasn’t fully hydrated, which meant early encounters could behave incorrectly or produce missing-data errors.
The root cause was an initialization ordering problem: the specimen database was being queried before the station layout was fully loaded, which meant the registry built against an incomplete deck map. The fix enforces the correct load order and adds an assertion that catches incomplete initialization states at startup rather than letting them propagate silently.
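As a rough sketch of that pattern (layout first, registry second, then a startup check that fails loudly), consider the Python below. Every name here is hypothetical: `load_station_layout`, `build_specimen_registry`, `check_hydration`, `boot`, and `BootError` stand in for whatever the real boot path uses, and the deck names are invented. The check raises an exception where the real code might use an assertion.

```python
class BootError(RuntimeError):
    """Raised when startup finishes in an incomplete state."""


def load_station_layout():
    # Hypothetical stand-in for loading the full deck map.
    return {"deck_1", "deck_2", "deck_3"}


def build_specimen_registry(layout, specimen_db):
    # A specimen can only be registered if its deck exists in the layout,
    # so an incomplete layout silently yields an incomplete registry.
    return {sid: deck for sid, deck in specimen_db.items() if deck in layout}


def check_hydration(registry, specimen_db):
    # Turn the silent failure into a loud one at startup.
    missing = sorted(set(specimen_db) - set(registry))
    if missing:
        raise BootError(f"registry missing specimens: {missing}")


def boot(specimen_db, load_layout=load_station_layout):
    layout = load_layout()                                   # 1. layout first
    registry = build_specimen_registry(layout, specimen_db)  # 2. then registry
    check_hydration(registry, specimen_db)                   # 3. verify
    return registry
```

Passing a deliberately incomplete `load_layout` into `boot` reproduces the original symptom in a test and confirms that it now stops the boot.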
This is the kind of bug that a human playtester might catch on their first session, or might not see at all depending on which deck they start on. With the fix in, the boot path is deterministic.
Where the Harness Stands: 186 Tests
The full harness now covers:
- Specimen state machine transitions (dormant, active, escalated, captured, and the escaped outcome; all valid and invalid edge transitions)
- Capture mechanic validation (range, tool selection, timing windows)
- Encounter log integrity under concurrent specimen activity
- Difficulty modifier application and composition across all three tiers
- Difficulty mid-session switching edge cases
- Station boot initialization and registry hydration sequencing
- D.R.E.D. intercom trigger conditions (carried forward from earlier test work)
186 tests, 0 failures, 0 skips.
PLAYTESTING: Handing Off to Human Hands
With 186 automated checks passing and the boot path confirmed stable, CAGE-9001 is now at PLAYTESTING/human_pending. The automated suite has done what it can do — it’s validated the mechanics engine, the difficulty system, the initialization path, and the core game loop at the unit and integration level.
What it can’t validate is whether CAGE-9001 is actually scary. Whether the pacing feels right in a real session. Whether the moment a specimen escalates past your capture window and you hear the alert tone on Deck 3 produces the specific kind of dread we built the game around.
That part requires a human. The harness handed it off. Now we wait.