Release Engineering for AI-Built Games: 3,830 Lines of Paranoia Per Title


The Dark Factory ships games to Steam. Four games, built entirely by autonomous AI agents running on cron jobs every three hours. Polybreak, Chronostone, Voidrunner, Dreadnought — each one a real Love2D game with tens of thousands of lines of Lua, procedural audio, particle systems, and full gameplay loops.

But writing a game is not the same as shipping one.

Shipping means your binary runs on someone else’s machine. It means Steam’s metadata consistency checks pass. It means your AppID in steam_appid.txt matches your app_build.vdf which matches your depot_data.vdf, and if any of them disagree by one digit, your upload silently fails and you spend three days figuring out why.

When a human developer ships a game, they can eyeball the config files. They can run the build, test it locally, and push to Steam with reasonable confidence. When autonomous agents ship a game, there is no eyeball. There is no “reasonable confidence.” There is only infrastructure.

So we built 3,830 lines of bash per game. Eleven scripts. A ten-stage pipeline that refuses to trust anything, including itself.

The Trust Problem

Here’s the core tension: AI agents write good code, but they write it without the ambient awareness that human developers carry. A human knows that steam_appid.txt exists because they created it once, manually, and they remember. An agent knows it exists because its prompt says so — and prompts can be wrong, contexts can be stale, and files can drift between cycles.

Every agent run is a fresh session. The agent that wrote gfx.lua yesterday doesn’t remember doing it today. The agent that set the AppID to 480 (Valve’s placeholder) doesn’t remember that production requires a real ID. The orchestrator that told four agents to build four games can’t personally verify that all four have consistent Steam metadata.

So you don’t trust. You verify. Every time. Automatically.

The Pipeline

The release pipeline has ten stages, executed in strict fail-fast sequence. If any stage fails, nothing downstream runs. There is no “skip this check” flag.

Build → Preflight → ID Readiness → ID Cutover →
Release Dry-Run → Evidence Pack → Cutover Rehearsal →
Go-Live Pack → Readiness Smoke → Final Upload
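A fail-fast sequence like this can be sketched in a few lines of shell. This is a minimal illustration, not the Dark Factory's actual runner: the function name run_pipeline and the "name=command" argument form are assumptions made for the example.

```bash
# Hypothetical fail-fast runner. Each argument is "stage-name=command".
# The first failing stage stops the run; later stages never start.
run_pipeline() {
  local entry name cmd
  for entry in "$@"; do
    name="${entry%%=*}"   # text before the first "="
    cmd="${entry#*=}"     # text after the first "="
    echo "==> ${name}"
    if ! sh -c "$cmd"; then
      echo "FAIL: ${name} (nothing downstream runs)" >&2
      return 1
    fi
  done
  echo "PASS: all stages"
}
```

A call such as run_pipeline "build=./build.sh" "preflight=./preflight.sh" would run preflight only if the build succeeded. Note that the sketch has no skip flag to pass.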

Stage 1: Build

build.sh creates the .love archive from Lua source, then fuses it with Love2D runtime binaries for Windows and Linux. The output is a directory structure that maps directly to Steam depot layout:

build/
  polybreak.love
  win/
    polybreak.exe       # Fused Windows binary
    steam_appid.txt     # Must match master AppID
    *.dll               # Love2D runtime
  linux/
    polybreak.love
    polybreak.sh        # Launch wrapper
    steam_appid.txt
    love                # Love2D Linux binary

Before the build runs, it validates Steam metadata consistency across all config files. If steam_appid.txt says 480 but app_build.vdf declares AppID 481, the build fails immediately. No silent mismatches.
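Here is a rough sketch of what such a consistency check might look like. It assumes the VDF files store the ID on a line of the form "appid" "480" and that steam_appid.txt holds nothing but the number. The helper names vdf_value and check_app_id are invented for this example.

```bash
# vdf_value KEY FILE: print the first quoted value for KEY (case-insensitive).
# Assumes one  "key" "value"  pair per line.
vdf_value() {
  awk -v key="$1" '
    {
      # Splitting on double quotes puts the key in parts[2], the value in parts[4].
      n = split($0, parts, "\"")
      if (n >= 5 && tolower(parts[2]) == tolower(key)) { print parts[4]; exit }
    }' "$2"
}

# check_app_id APPID_TXT APP_BUILD_VDF: fail if the two files disagree.
check_app_id() {
  local txt_id vdf_id
  txt_id="$(tr -d '[:space:]' < "$1")"
  vdf_id="$(vdf_value appid "$2")"
  if [ "$txt_id" != "$vdf_id" ]; then
    echo "AppID mismatch: $1=$txt_id, $2=$vdf_id" >&2
    return 1
  fi
}
```

The same vdf_value helper could read the DepotID from depot_data.vdf, so one comparison function covers every file pair.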

Stage 2: Preflight

preflight.sh is the quality gate. It runs four checks:

  1. Lua syntax validation — every .lua file through luac -p. Catches syntax errors that would crash at runtime.
  2. Steam metadata consistency — AppID and DepotID match across steam_appid.txt, app_build.vdf, and depot_data.vdf.
  3. Packaging smoke build — triggers a full build.sh run to verify artifact structure.
  4. Release command validation — dry-runs release-ready.sh to confirm the release path is executable.

This is where placeholder IDs get flagged. During development, all four games use Valve’s Spacewar IDs (AppID 480, DepotID 481). Preflight accepts placeholders for local testing but marks them clearly — production mode rejects them outright.
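The two-mode placeholder rule is simple enough to sketch. The function name check_placeholder_ids and the mode strings "dev" and "production" are assumptions for illustration; only the Spacewar IDs (480 and 481) come from the pipeline described here.

```bash
# check_placeholder_ids APP_ID DEPOT_ID [MODE]
# Hypothetical gate: warn on Valve's Spacewar IDs in dev mode,
# reject them outright in production mode.
check_placeholder_ids() {
  local app_id="$1" depot_id="$2" mode="${3:-dev}"
  if [ "$app_id" = "480" ] || [ "$depot_id" = "481" ]; then
    if [ "$mode" = "production" ]; then
      echo "REJECT: placeholder IDs ($app_id/$depot_id) in production mode" >&2
      return 1
    fi
    echo "WARN: placeholder IDs ($app_id/$depot_id), local testing only"
  fi
  return 0
}
```

Defaulting the mode to "dev" keeps local runs quiet about nothing except the warning, while any production caller has to opt in to the strict path explicitly.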

Stages 3-4: ID Readiness and Cutover

Production Steam IDs come from Valve’s partner portal. production-id-readiness.sh validates the IDs (positive integers, AppID != DepotID) and cross-checks them against VDF files. It never applies them automatically — it prints the exact command you’d need to run.
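The validation rules (positive integers, AppID != DepotID, print but never apply) might look roughly like this. The function names are invented, and the positional arguments shown for set-steam-ids.sh are an assumption; the real script's interface may differ.

```bash
# is_positive_int VALUE: true for 1, 42, 1234567; false for 0, -1, 12a, or empty.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  [ "$1" -gt 0 ]
}

# validate_ids APP_ID DEPOT_ID: check the IDs, then print the cutover command.
validate_ids() {
  is_positive_int "$1" || { echo "invalid AppID: '$1'" >&2; return 1; }
  is_positive_int "$2" || { echo "invalid DepotID: '$2'" >&2; return 1; }
  if [ "$1" -eq "$2" ]; then
    echo "AppID and DepotID must differ" >&2
    return 1
  fi
  # Read-only by design: print the command instead of running it.
  # (Argument form of set-steam-ids.sh is assumed for this sketch.)
  echo "IDs look valid. To apply: ./set-steam-ids.sh $1 $2"
}
```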

set-steam-ids.sh performs the atomic cutover. One command updates all three metadata files. It uses awk for safe in-place replacement — temp file, validate, then replace. If the replacement fails mid-operation, the originals survive.
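The temp-file pattern is worth seeing in miniature. The sketch below is an assumption about how such a replacement could be written, not the contents of set-steam-ids.sh; the function name replace_vdf_id and the "key" "digits" line format are illustrative.

```bash
# replace_vdf_id FILE KEY NEW_ID
# Write the edited copy to a temp file, validate it, then rename it over the original.
replace_vdf_id() {
  local file="$1" key="$2" new_id="$3" tmp
  tmp="$(mktemp "${file}.XXXXXX")" || return 1
  # Rewrite only the value on lines of the form  "key" "digits".
  if ! awk -v key="$key" -v id="$new_id" '
      BEGIN { pat = "\"" key "\"[ \t]+\"[0-9]+\"" }
      tolower($0) ~ tolower(pat) { sub(/"[0-9]+"/, "\"" id "\"") }
      { print }
    ' "$file" > "$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  # Validate: the new pair must be present before the original is touched.
  if ! grep -qi "\"${key}\"[[:space:]]\{1,\}\"${new_id}\"" "$tmp"; then
    rm -f "$tmp"
    echo "replacement of ${key} failed; ${file} left unchanged" >&2
    return 1
  fi
  # Same-directory rename, so the swap is a single filesystem operation.
  mv "$tmp" "$file"
}
```

Because the temp file lives next to the original, the final mv is a rename rather than a copy, which is what lets the original survive any failure before that line.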

Stage 5: Release Dry-Run

release-ready.sh is the strictest gate. It rejects placeholder IDs, re-validates Lua syntax, checks platform artifact integrity (Windows .exe exists, Linux launcher exists, DLLs present), and verifies steam_appid.txt consistency across platform packages.

It prints the exact SteamCMD invocation but never executes it:

steamcmd +login "$STEAM_USERNAME" "$STEAM_PASSWORD" \
  +run_app_build steam/app_build.vdf +quit

Upload only happens with an explicit --upload flag. The default is read-only.

Stages 6-8: The Evidence Machine

This is where the pipeline gets paranoid. release-dryrun-evidence.sh runs a full snapshot-apply-validate-restore cycle:

  1. Snapshot all Steam metadata files with SHA256 checksums
  2. Apply production IDs via set-steam-ids.sh
  3. Run the release dry-run
  4. Restore original metadata
  5. Compare restored checksums against originals

If even one byte drifts during restore, the entire workflow fails. The evidence directory captures before, applied, and restored states:

build/evidence/release-dryrun-evidence-20260306T113939Z/
  before/metadata/          # Original files
  applied/metadata/         # After ID cutover
  restored/metadata/        # After restore
  before/metadata.sha256    # Checksums
  restore-checksum-drift.diff  # Must be empty
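The snapshot and compare steps of that cycle can be sketched as follows. This assumes GNU sha256sum is available (on macOS, shasum -a 256 would be the nearest equivalent), and the function names are invented for the example.

```bash
# snapshot_checksums DIR OUTFILE: record SHA256 for every file under DIR,
# using sorted relative paths so two snapshots are directly comparable.
snapshot_checksums() {
  ( cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      sha256sum "$f"
    done ) > "$2"
}

# verify_restore BEFORE_SUMS RESTORED_DIR DRIFT_FILE
# Re-hash the restored files and diff against the original snapshot.
# DRIFT_FILE ends up empty on success, non-empty on any byte of drift.
verify_restore() {
  local before="$1" dir="$2" drift="$3" after
  after="$(mktemp)" || return 1
  snapshot_checksums "$dir" "$after"
  if diff "$before" "$after" > "$drift"; then
    rm -f "$after"
    echo "restore verified: no drift"
    return 0
  fi
  rm -f "$after"
  echo "FAIL: checksum drift, see $drift" >&2
  return 1
}
```

Keeping the diff output on disk, even when empty, is what turns the check into evidence: the file itself shows the comparison ran.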

go-live-evidence-pack.sh orchestrates three sub-stages (ID readiness, dry-run evidence, cutover rehearsal) and writes a JSON manifest documenting every step’s exit code and PASS/FAIL status. If any step fails, downstream steps show SKIP with the failure reason.
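A manifest writer with PASS/FAIL/SKIP semantics could be sketched like this. The JSON field names and the run_steps function are assumptions, not the real manifest schema, and the sketch assumes step names contain no quotes or backslashes.

```bash
# run_steps MANIFEST "name=command"...
# Runs steps in order, records each exit code, and marks everything
# after the first failure as SKIP with the reason attached.
run_steps() {
  local manifest="$1" entry name cmd code status reason result sep="" failed=""
  shift
  printf '{"steps":[' > "$manifest"
  for entry in "$@"; do
    name="${entry%%=*}"
    cmd="${entry#*=}"
    reason=""
    if [ -n "$failed" ]; then
      status="SKIP"; code="null"; reason="blocked by $failed"
    else
      code=0
      sh -c "$cmd" >/dev/null 2>&1 || code=$?
      if [ "$code" -eq 0 ]; then
        status="PASS"
      else
        status="FAIL"; failed="$name"
      fi
    fi
    printf '%s\n  {"name":"%s","status":"%s","exit_code":%s,"reason":"%s"}' \
      "$sep" "$name" "$status" "$code" "$reason" >> "$manifest"
    sep=","
  done
  if [ -z "$failed" ]; then result="PASS"; else result="FAIL"; fi
  printf '\n],"result":"%s"}\n' "$result" >> "$manifest"
  [ -z "$failed" ]
}
```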

Stages 9-10: Final Gate and Upload

release-readiness-smoke.sh runs the full evidence pack and hard-fails if the result isn’t PASS. This is the last gate before upload.

final-upload-cutover.sh requires an explicit --confirm-live flag plus Steam credentials. If either is missing, it prints what it would do and exits. With both, it applies production IDs and triggers the SteamCMD upload. It is the only script in the entire pipeline that actually sends data to Valve.
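The double gate (flag plus credentials) is the kind of check that fits in one small function. This is a sketch under assumptions: the name upload_gate is invented, and returning a non-zero status in the dry-run case is a choice made here so callers can tell the two outcomes apart, not something the real script is known to do.

```bash
# upload_gate ARGS...: succeed only when --confirm-live was passed
# AND both STEAM_USERNAME and STEAM_PASSWORD are set and non-empty.
upload_gate() {
  local confirm=0 arg
  for arg in "$@"; do
    case "$arg" in
      --confirm-live) confirm=1 ;;
    esac
  done
  if [ "$confirm" -eq 1 ] && [ -n "${STEAM_USERNAME:-}" ] && [ -n "${STEAM_PASSWORD:-}" ]; then
    echo "LIVE: would apply production IDs and run steamcmd"
    return 0
  fi
  # Default path: describe, do nothing, signal that no upload happened.
  echo "DRY-RUN: missing --confirm-live or credentials; nothing uploaded"
  return 1
}
```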

Why This Much Infrastructure?

The easy answer is “because AI agents make mistakes.” But that’s incomplete. Human developers make mistakes too. The real answer is that AI agents make mistakes without institutional memory.

A human developer who ships a broken build remembers next time. They add a manual check, or they tell their teammate, or they write it on a sticky note. An agent that ships a broken build doesn’t remember next time — its next session starts fresh. The only “memory” that survives between sessions is infrastructure.

Every one of these 3,830 lines represents a failure mode that was either encountered or anticipated. The checksum verification exists because metadata can drift. The placeholder detection exists because agents default to dev IDs. The evidence packs exist because “trust me, it worked” doesn’t survive an audit.

This is what release engineering looks like when your developers have no long-term memory: you encode every lesson into the pipeline itself.

Current Status

All four Dark Factory games pass preflight. Voidrunner — the feature-complete shmup — has run the full evidence pipeline. The only failure? Placeholder IDs. The pipeline correctly refuses to package a game with dev IDs for production.

The actual blocker is business, not engineering. Production AppIDs come from Valve’s partner portal, and that process takes weeks. The infrastructure is ready. The games are ready. The pipeline is waiting for a number.

When that number arrives, releasing Voidrunner to Steam is four commands:

# Write production IDs
printf 'APP_ID=1234567\nDEPOT_ID=1234568\n' > build/ids-prod.env

# Validate everything
./go-live-evidence-pack.sh --ids-file build/ids-prod.env

# Final smoke gate
./release-readiness-smoke.sh --ids-file build/ids-prod.env

# Ship it
STEAM_USERNAME=x STEAM_PASSWORD=y \
  ./final-upload-cutover.sh --ids-file build/ids-prod.env --confirm-live

Four commands. Eleven scripts. 3,830 lines of verification. Zero trust.

That’s how you ship a game that nobody wrote by hand.
