Building a Game Studio With No Game Designers — How AI Agents Ship Complete Games
The Dark Factory is a fully autonomous Love2D game studio. Four AI agents, each running on a cron schedule, develop four separate games in a single monorepo. A fifth agent — the studio orchestrator — coordinates quality across all of them. There are no game designers, no artists, no sound engineers. The games have procedural graphics, procedural audio, procedural level generation, and procedural music. Every asset is code.
Here’s how it actually works.
The Architecture: Five Agents, One Repo
The studio runs on cron-swarm, a system that schedules AI agent sessions as cron jobs. Each agent gets its own prompt file, its own context, and its own slice of the codebase. They share a single Git repository and coordinate through a memory layer built on append-only JSON streams.
love2d-studio (orchestrator — hourly)
|-- game-polybreak (arcade breakout — hourly)
|-- game-chronostone (RPG — hourly)
|-- game-voidrunner (vertical shmup — hourly)
|-- game-dreadnought (survival horror — hourly)
Every agent runs Claude Opus on high effort. The studio orchestrator fires at the top of each hour. Game agents fire at staggered offsets — :10, :25, :40, :55 — so they never compete for the same files simultaneously.
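The staggering can be checked with a few lines of Lua. This is a minimal sketch, not cron-swarm's real configuration: the offsets come from the paragraph above, but which game gets which offset, the SCHEDULE table, and the minGapMinutes helper are assumptions made for illustration.

```lua
-- Minute-of-hour start offsets. The pairing of offsets to games is assumed.
local SCHEDULE = {
  ["love2d-studio"]    = 0,
  ["game-polybreak"]   = 10,
  ["game-chronostone"] = 25,
  ["game-voidrunner"]  = 40,
  ["game-dreadnought"] = 55,
}

-- Smallest gap, in minutes, between two consecutive agent starts,
-- including the wrap-around from the last start to the next hour.
local function minGapMinutes(schedule)
  local offsets = {}
  for _, minute in pairs(schedule) do
    offsets[#offsets + 1] = minute
  end
  table.sort(offsets)
  local smallest = 60
  for i = 1, #offsets do
    local nextStart = offsets[i + 1] or (offsets[1] + 60)
    smallest = math.min(smallest, nextStart - offsets[i])
  end
  return smallest
end

local gap = minGapMinutes(SCHEDULE)
```

With these offsets the tightest gap is the five minutes between the :55 start and the orchestrator at :00, so a run has to finish within that window for sessions never to overlap.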
Each run follows the same loop: read context and open handoffs, assess the current state of the game via git log and source code, execute the highest-priority task, commit, update persistent facts, and exit. A typical run takes 4-5 minutes of compute time. Over 24 hours, the factory produces 50-80 commits across all four games.
The Games
All four games share the same meta-structure — a design document called GAME_META_STRUCTURE.md that the studio orchestrator enforces:
- Dual game modes: ARCADE (all content unlocked, score-focused) and CAMPAIGN (branching world map, story screens, shop)
- BITS currency: A Bitcoin parody earned through gameplay, spent in shops with parody product descriptions
- Three difficulty tiers: Easy (persistent upgrades), Normal (roguelite resets), Insane (harder enemies, darker vision)
- Campy sci-fi parody tone: Every player character is a robot with a punny name. Every story screen has a joke
Within those constraints, each game is genuinely different:
Polybreak — 100 levels of brick-breaking across 10 themed worlds. Player is BRKR-9000. The game has procedural music that generates per-world melodies, six power-up types, and a CRT post-processing pipeline with scanlines and vignette. Status: POLISH. 9,300+ lines of Lua.
Chronostone — A turn-based RPG with 7 areas, party management, and a battle system spanning 2,800+ lines of its own. Player is CHRON-E. Has a demo battle mode that showcases the spell system during attract. Status: POLISH.
Voidrunner — A vertical shmup with 10 sectors, 100 waves, 10+ enemy types, tiered weapon upgrades, and shield mechanics. Player is S.H.M.U.P-3000, a corporate troubleshooter. Has the richest graphics library in the factory — 326 lines of procedural VFX including shockwaves, glow lines, particle trails, and screen shake. Status: STEAM_READY. 12,100+ lines of Lua.
Dreadnought — A sci-fi survival horror game with cone-of-vision mechanics inspired by Darkwood. Player is D.R.E.D-9000, a maintenance bot on a derelict space station. Target is 100 sections across 10 decks, each with unique atmosphere, enemy ecology, hazards, and a deck boss. The audio system alone is 2,445 lines — procedural ambience, spatial sound, and 80+ synthesized sound effects. Status: BUILDING (9/100 sections). 9,600+ lines and growing.
Total codebase: 53,000+ lines of Lua across all four games.
How Agents Develop Expertise: Persistent Memory
Each game agent maintains persistent state through a v2 memory layer. Three data structures drive the decision loop:
Facts are mutable key-value pairs. The agent reads them at the start of each run and updates them at the end. A game’s milestone fact is its compressed history — everything that’s been built, backported, and polished:
All 100 layouts done; arcade mode endless; shake+trail migrated
to Gfx module; hitstop complete; glowLine laser beams; CRT polish
(scanlines+vignette); Gfx.shockwave backported; Audio.drone
backported (55Hz+82Hz ambient, per-world pitch); 9.3K LOC
That one compact fact tells the next run exactly where the game stands. No investigation needed.
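As a rough illustration of "mutable key-value pairs", here is an in-memory stand-in for the fact store. The real layer is backed by JSON streams in Git; newFactStore and the key names are hypothetical.

```lua
-- Hypothetical in-memory fact store: last write wins, no history kept here.
local function newFactStore()
  local facts = {}
  local store = {}
  function store.get(key)
    return facts[key]
  end
  function store.set(key, value)
    facts[key] = value
  end
  return store
end

-- Start of a run: read the facts. End of the run: overwrite them.
local store = newFactStore()
store.set("polybreak.status", "POLISH")
store.set("polybreak.milestone", "All 100 layouts done; arcade mode endless")
local before = store.get("polybreak.milestone")
store.set("polybreak.milestone", before .. "; hitstop complete")
```

Because facts are overwritten rather than appended, the milestone string has to carry its own history, which is why it reads like a compressed changelog.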
Handoffs are directed tasks sent from the studio orchestrator. They’re stored in append-only streams with lifecycle tracking: open → acked → done. When a game agent starts a run, it reads its open handoffs first. These always take priority over self-directed work.
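The lifecycle can be sketched as an append-only event list that is replayed to find each handoff's current state. The event shape and the function names below are assumptions; only the three states and their order come from the text.

```lua
local ORDER = { open = 1, acked = 2, done = 3 }

-- Append a lifecycle event. Existing events are never edited or removed.
local function append(stream, id, state)
  assert(ORDER[state], "unknown state: " .. tostring(state))
  stream[#stream + 1] = { id = id, state = state }
end

-- Replay the stream and keep the furthest state reached by each handoff.
local function currentStates(stream)
  local states = {}
  for _, event in ipairs(stream) do
    local prev = states[event.id]
    if not prev or ORDER[event.state] > ORDER[prev] then
      states[event.id] = event.state
    end
  end
  return states
end

-- Ids of handoffs that are still open, in the order they were sent.
local function openHandoffs(stream)
  local states, open = currentStates(stream), {}
  for _, event in ipairs(stream) do
    if event.state == "open" and states[event.id] == "open" then
      open[#open + 1] = event.id
    end
  end
  return open
end

local stream = {}
append(stream, "shake-backport", "open")
append(stream, "shockwave-backport", "open")
append(stream, "shake-backport", "acked")
append(stream, "shake-backport", "done")
local pending = openHandoffs(stream)
```

Here pending contains only "shockwave-backport"; the closed handoff stays in the stream as history but no longer competes for the agent's attention.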
Notes are timestamped decision records. When an agent makes a non-obvious choice — like closing a handoff as already-done because its existing implementation is superior — it writes a note explaining why:
Handoff shake-backport closed as already-done. Polybreak has its
own shake system (local shake table, line 421-426 in game.lua)
with hitstop, paddle squash, and proportional multi-break
intensity — richer than chronostone's Gfx.shake.
This prevents the studio from re-sending the same task and gives future runs context about past decisions. The memory layer is built on Git — every fact update, handoff event, and note is a commit. The full decision history of the factory is auditable.
Cross-Game Intelligence: The Backport System
This is where the studio orchestrator earns its compute budget.
Each game develops features independently. Polybreak invented procedural music generation. Voidrunner built the richest graphics library. Dreadnought created a 2,445-line spatial audio system. When one game develops something reusable, the studio detects it and propagates it.
The detection is straightforward. Every hour, the studio lists the top-level function definitions in each game’s shared modules and compares them:
grep "^function" games/*/src/gfx.lua
grep "^function" games/*/src/audio.lua
If Voidrunner has Gfx.shockwave but Polybreak doesn’t, the studio sends a specific backport handoff:
Backport Gfx.shockwave from voidrunner/src/gfx.lua (line 78) to
your gfx.lua. Signature: shockwave(x, y, r, g, b, radius, duration).
Use it for combo rewards and boss phase transitions.
The handoff is never vague. It includes the source file, line number, function signature, and specific integration points for the target game. The receiving agent reads the source implementation, adapts it to its own codebase style, wires it into gameplay events, and commits.
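The comparison step could look something like the following. This is a hedged sketch of the idea behind the grep, not the orchestrator's actual logic; listFunctions, missingFunctions, and the Gfx.shake signature are invented for the example, while the shockwave signature is the one quoted in the handoff.

```lua
-- Collect names declared as "function Name(" at the start of a line,
-- the same lines that grep "^function" would print.
local function listFunctions(source)
  local names = {}
  for line in source:gmatch("[^\n]+") do
    local name = line:match("^function%s+([%w_%.:]+)")
    if name then
      names[name] = true
    end
  end
  return names
end

-- Functions present in the donor module but absent from the target module.
local function missingFunctions(donorSource, targetSource)
  local donor = listFunctions(donorSource)
  local target = listFunctions(targetSource)
  local missing = {}
  for name in pairs(donor) do
    if not target[name] then
      missing[#missing + 1] = name
    end
  end
  table.sort(missing)
  return missing
end

local voidrunnerGfx = [[
function Gfx.shake(amount)
end
function Gfx.shockwave(x, y, r, g, b, radius, duration)
end
]]
local polybreakGfx = [[
function Gfx.shake(amount)
end
]]
local candidates = missingFunctions(voidrunnerGfx, polybreakGfx)
```

A name-only comparison cannot tell that a game already has an equivalent feature under a different name, which is the situation the shake-backport note above records.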
Real backport commits from the factory:
feat: backport Gfx.shockwave from voidrunner with combo, boss,
and powerup integration
feat: backport Gfx.debris from polybreak — chunky ship fragments
on enemy/boss death
feat: backport Audio.drone from voidrunner with per-world pitch
and layered ambience
feat: backport procedural music generation from polybreak with
stings integration
feat: backport hitstop into Gfx.shake() matching polybreak
implementation
The result is that all four games now share a common set of production-quality effects — shake, shockwave, debris, glowLine, vignette, scanlines, trail, drone, setMasterVolume — while each game adapted the integration to its own gameplay context.
This is not copy-paste. It’s directed adaptation with context.
Shared Bug Detection
Cross-game quality passes also catch bugs. When the studio orchestrator detects a fix in one game, it checks whether sibling games have the same issue.
A real example: Dreadnought hit a UTF-8 crash caused by non-ASCII characters in Lua string literals. The fix — replacing curly quotes, em dashes, and other non-ASCII characters with safe equivalents — was identified and propagated to all four games’ text rendering:
fix: replace all non-ASCII characters with ASCII equivalents to
prevent UTF-8 crash
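A fix of this kind might be written roughly as below. The replacement table and the toAscii name are assumptions, and the real commit may have edited the string literals by hand instead of filtering at runtime. Byte escapes are used so the sketch itself stays pure ASCII.

```lua
-- UTF-8 byte sequences for common typographic characters and ASCII stand-ins.
local REPLACEMENTS = {
  ["\226\128\152"] = "'",   -- U+2018 left single quote
  ["\226\128\153"] = "'",   -- U+2019 right single quote
  ["\226\128\156"] = '"',   -- U+201C left double quote
  ["\226\128\157"] = '"',   -- U+201D right double quote
  ["\226\128\147"] = "-",   -- U+2013 en dash
  ["\226\128\148"] = "--",  -- U+2014 em dash
  ["\226\128\166"] = "...", -- U+2026 ellipsis
}

local function toAscii(text)
  -- Handle one multi-byte sequence at a time: a lead byte, then its
  -- continuation bytes. Unknown sequences collapse to a single "?".
  text = text:gsub("[\194-\244][\128-\191]*", function(seq)
    return REPLACEMENTS[seq] or "?"
  end)
  -- Any stray high byte left over is not valid UTF-8; replace it too.
  text = text:gsub("[\128-\255]", "?")
  return text
end

local cleaned = toAscii("It\226\128\153s dark\226\128\166")
```

After this pass every byte in the string is below 128, so no UTF-8 decoding is needed when the text is drawn.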
Another pattern: QA passes in one game reveal structural issues that exist across the factory. When Polybreak’s QA pass found that an :upper() call in noteFreq broke flat notes in 6 out of 10 world music tracks, the studio checked whether Chronostone’s music system had the same issue.
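The article does not show the real noteFreq, so the following is a hypothetical reconstruction of this class of bug. If note names are written like "Bb4", upper-casing the whole string turns the flat marker into a second letter and the name no longer parses. The sketch upper-cases only the note letter.

```lua
local SEMITONES = { C = 0, D = 2, E = 4, F = 5, G = 7, A = 9, B = 11 }

-- Convert a note name such as "A4", "C#3" or "Bb4" to a frequency in Hz
-- (equal temperament, A4 = 440 Hz). Returns nil for names it cannot parse.
local function noteFreq(name)
  local letter, accidental, octave = name:match("^(%a)([#b]?)(%-?%d+)$")
  if not letter then
    return nil
  end
  local semitone = SEMITONES[letter:upper()] -- upper-case the letter only
  if not semitone then
    return nil
  end
  if accidental == "#" then
    semitone = semitone + 1
  elseif accidental == "b" then
    semitone = semitone - 1
  end
  local midi = (tonumber(octave) + 1) * 12 + semitone
  return 440 * 2 ^ ((midi - 69) / 12)
end

local good = noteFreq("Bb4")           -- about 466.16 Hz
local bad = noteFreq(("Bb4"):upper())  -- "BB4" does not parse, so nil
```

Sharps survive upper-casing because "#" has no case, which would explain why only tracks that use flats were affected.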
The Attract Mode System
Every game in the factory has an attract mode — an AI-controlled demo that activates after 10 seconds of idle time on the title screen. This serves two purposes: it makes the game look alive when nobody’s playing, and it enables automated QA testing.
The implementation pattern is consistent across all four games:
Game._demo = {
  active = false,  -- true while the AI demo is playing
  idle_t = 0,      -- seconds since the last user input
}
-- In the update loop:
if Game._demo.active then
  Game._demo.step(dt)
else
  Game._demo.idle_t = Game._demo.idle_t + dt
  if Game._demo.idle_t >= 10 then
    Game._demo.init()  -- presumably sets Game._demo.active = true
  end
end
-- Any user input resets:
Game._demo.idle_t = 0
Game._demo.active = false
But the gameplay inside each demo is game-specific. Polybreak auto-plays levels demonstrating ball physics and brick destruction. Chronostone runs a demo battle showcasing the spell system. Dreadnought has an AI bot that explores dark corridors with the cone-of-vision mechanic. Voidrunner auto-plays waves of enemy patterns.
The love2d-autoplay-qa skill uses these attract modes to launch each game, let it play itself, and capture screenshots for visual QA — entirely automated, no human input needed.
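The article does not describe how love2d-autoplay-qa captures frames. One plausible shape, shown here as a hypothetical sketch, is a timer that requests a screenshot every few seconds while the demo plays. newCapturer and the file naming are invented for illustration; in LÖVE 11 and later the capture callback could be love.graphics.captureScreenshot.

```lua
-- Returns an update(dt) function that calls capture(filename) once per
-- interval of accumulated time and reports how many captures were requested.
local function newCapturer(interval, capture)
  local elapsed, count = 0, 0
  return function(dt)
    elapsed = elapsed + dt
    while elapsed >= interval do
      elapsed = elapsed - interval
      count = count + 1
      capture(("qa_%03d.png"):format(count))
    end
    return count
  end
end

-- Stand-in capture function that only records the requested file names.
local shots = {}
local update = newCapturer(5, function(filename)
  shots[#shots + 1] = filename
end)
update(2)
update(2)
update(2) -- six seconds have passed, so one screenshot has been requested
```

Injecting the capture function keeps the timing logic testable without a window or a GPU, which matters when the QA pass itself is run by an agent.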
Zero External Assets
Every visual in every game is drawn with love.graphics primitives. Every sound is synthesized at runtime. Every melody is generated procedurally. There are no sprite sheets, no WAV files, no MIDI tracks, no fonts beyond LÖVE’s built-in default.
Voidrunner’s graphics library draws everything — ships, bullets, explosions, UI panels, particle effects — using polygons, glow effects, and procedural math:
function Gfx.glowPolygon(mode, cx, cy, radius, sides, rotation, r, g, b, glowSize)
  -- Note: mode is accepted but not used in this excerpt.
  glowSize = glowSize or 8
  -- Halo: concentric fills, faintest at the outer edge, layered to fake a glow
  for i = glowSize, 1, -1 do
    local a = 0.06 * (1 - i / glowSize)
    love.graphics.setColor(r, g, b, a)
    Gfx.polygon("fill", cx, cy, radius + i, sides, rotation)
  end
  -- Body: translucent fill with a solid outline on top
  love.graphics.setColor(r, g, b, 0.7)
  Gfx.polygon("fill", cx, cy, radius, sides, rotation)
  love.graphics.setColor(r, g, b, 1)
  Gfx.polygon("line", cx, cy, radius, sides, rotation)
end
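Gfx.polygon itself is not shown in the article. A helper like it would need to turn a center, radius, side count, and rotation into vertex coordinates; the sketch below is one assumed way to do that, with polygonVertices as a made-up name. The flat table it returns is in the form love.graphics.polygon accepts.

```lua
-- Vertices of a regular polygon as a flat list {x1, y1, x2, y2, ...}.
local function polygonVertices(cx, cy, radius, sides, rotation)
  rotation = rotation or 0
  local vertices = {}
  for i = 0, sides - 1 do
    local angle = rotation + i * (2 * math.pi / sides)
    vertices[#vertices + 1] = cx + math.cos(angle) * radius
    vertices[#vertices + 1] = cy + math.sin(angle) * radius
  end
  return vertices
end

-- Inside LÖVE this could be drawn with:
--   love.graphics.polygon("fill", polygonVertices(100, 100, 20, 6, 0))
local hexagon = polygonVertices(100, 100, 20, 6, 0)
```

Ships, pickups, and bullets can then differ only in side count, radius, rotation, and color, which fits the zero-asset approach described above.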
Dreadnought’s 80+ sound effects, from alien screeches to steam bursts to ceiling creaks, are all built at runtime from generated waveforms and envelope shaping, then played through love.audio. The audio module handles spatial positioning, environmental reverb, and layered drone ambience, all in code.
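To make "waveform synthesis and envelope shaping" concrete, here is a minimal sketch of one synthesized blip. It is not Dreadnought's audio code: synthBlip and its envelope constants are assumptions. The sample generation is plain Lua; the guarded block at the end shows how the samples could be handed to LÖVE through love.sound.newSoundData and love.audio.newSource.

```lua
-- Generate a sine tone shaped by a short linear attack and a linear decay.
-- Returns a table of samples in the range [-1, 1] and the sample rate.
local function synthBlip(freq, duration, sampleRate)
  sampleRate = sampleRate or 44100
  local count = math.floor(duration * sampleRate)
  local attack = math.max(1, math.floor(count * 0.05))
  local samples = {}
  for i = 0, count - 1 do
    local envelope
    if i < attack then
      envelope = i / attack                             -- ramp up from 0
    else
      envelope = 1 - (i - attack) / (count - attack)    -- fade back down
    end
    local t = i / sampleRate
    samples[i + 1] = math.sin(2 * math.pi * freq * t) * envelope
  end
  return samples, sampleRate
end

-- Only runs inside LÖVE; skipped under a plain Lua interpreter.
if love and love.sound and love.audio then
  local samples, rate = synthBlip(440, 0.2)
  local data = love.sound.newSoundData(#samples, rate, 16, 1)
  for i, value in ipairs(samples) do
    data:setSample(i - 1, value) -- SoundData sample indices start at 0
  end
  love.audio.newSource(data):play()
end
```

Changing the waveform function, the envelope, or layering several calls is how one small generator can be stretched into many distinct effects.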
This zero-asset approach is not a limitation — it’s a design choice. AI agents can write code. They can’t create pixel art or record foley. By making everything procedural, the entire game is within the agent’s medium.
How the Studio Orchestrator Thinks
The orchestrator runs every hour. Its prompt defines a specific survey-and-delegate pattern:
- Survey: Run git log --since="3 hours ago" across all games. Identify which games are active and which are stale.
- Drill down: For each game, read the last few commits, check open handoffs, and read the current milestone fact.
- Assign: Every game gets at most one new handoff per run. BUILDING games get content tasks. POLISH games get specific polish tasks (audio feel, visual juice, game feel, UX, humor, accessibility). STEAM_READY games get readiness checks.
- Detect backport opportunities: Compare shared modules across games. If one game has a function that others lack, send a specific backport handoff.
- Escalate: If a game has been stale for 6+ hours, if a lifecycle transition occurs, or if a blocker can’t be resolved at the studio level, escalate to the operator.
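The assign and escalate steps reduce to a small decision table. The sketch below is an assumed simplification: planRun, the field names, and the task labels are hypothetical, and only the status-to-task mapping, the one-handoff-per-game rule, and the six-hour threshold come from the list above.

```lua
local STALE_HOURS = 6

local TASK_BY_STATUS = {
  BUILDING    = "content",
  POLISH      = "polish",
  STEAM_READY = "readiness-check",
}

-- games: list of { name = ..., status = ..., hoursSinceCommit = ... }
-- Returns at most one handoff per game, plus the names of stale games.
local function planRun(games)
  local handoffs, escalations = {}, {}
  for _, game in ipairs(games) do
    local kind = TASK_BY_STATUS[game.status]
    if kind then
      handoffs[#handoffs + 1] = { to = game.name, kind = kind }
    end
    if game.hoursSinceCommit >= STALE_HOURS then
      escalations[#escalations + 1] = game.name
    end
  end
  return handoffs, escalations
end

local handoffs, escalations = planRun({
  { name = "voidrunner",  status = "STEAM_READY", hoursSinceCommit = 1 },
  { name = "dreadnought", status = "BUILDING",    hoursSinceCommit = 7 },
})
```

In this example both games receive a handoff and only dreadnought is escalated. The planner produces instructions and never touches game code, matching the orchestrator's role as described.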
The studio never writes game code itself. It reads code, compares implementations, and sends precisely-worded handoffs. It’s a portfolio manager, not a developer.
Lifecycle Transitions
Each game moves through a defined lifecycle: BUILDING → FEATURE_COMPLETE → POLISH → STEAM_READY → SHIPPED.
The studio tracks these transitions and adjusts its behavior accordingly. BUILDING games get feature and content tasks every hour. Once a game hits FEATURE_COMPLETE, it shifts to a polish rotation — cycling through audio, visuals, game feel, UX, writing, accessibility, and code health. When a game reaches STEAM_READY, its cron frequency drops (from hourly to every 3 hours) and it enters a holding pattern awaiting Steam credentials.
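The lifecycle reads naturally as an ordered list with a per-state schedule. A minimal sketch follows, assuming strictly forward transitions; the article gives intervals only for hourly states and STEAM_READY, so the helper returns nil for anything else, including SHIPPED. All function names are hypothetical.

```lua
local LIFECYCLE = { "BUILDING", "FEATURE_COMPLETE", "POLISH", "STEAM_READY", "SHIPPED" }

local INDEX = {}
for i, state in ipairs(LIFECYCLE) do
  INDEX[state] = i
end

-- The state that follows the given one, or nil at the end of the lifecycle.
local function nextState(state)
  local i = INDEX[state]
  if not i then
    return nil
  end
  return LIFECYCLE[i + 1]
end

-- Hours between agent runs for a state, per the description above.
local function cronIntervalHours(state)
  if state == "STEAM_READY" then
    return 3
  elseif state == "BUILDING" or state == "FEATURE_COMPLETE" or state == "POLISH" then
    return 1
  end
  return nil -- not specified in the article
end

local afterPolish = nextState("POLISH")
```

Keeping the order in one table means the escalation rule for lifecycle transitions only has to compare a game's previous and current index.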
Voidrunner was the first game to reach STEAM_READY. Its polish prompt explicitly states: “NEVER output ‘nothing to do’ or ‘the game is complete.’ There is always something to polish.” Even in the lowest-priority state, the agent finds micro-improvements — tightening timing windows, adding screen shake to a hit that didn’t have it, adjusting color values for better contrast.
The Numbers
The Dark Factory has been running continuously since early March 2026. Some real metrics:
- 53,000+ lines of Lua across 4 games
- 3,446 lines of procedural audio code (Dreadnought alone: 2,445)
- 326-line graphics library in Voidrunner, backported to all games
- 100 levels in Polybreak, 100 waves in Voidrunner, 7 areas in Chronostone, 100 sections planned in Dreadnought
- 4-5 minutes average per agent run
- 50-80 commits per day across the factory
- 20+ cross-game backports completed (VFX, audio, game feel)
- 4 attract modes — every game plays itself for QA
The compute cost is real: five hourly Claude Opus sessions add up. But the output is also real: four playable games with procedural everything, cross-pollinated features, automated QA, and a coordination system that scales.
What This Isn’t
This is not AGI building games. Each agent runs for a few minutes, makes a focused change, and exits. The intelligence is in the system design — the prompt engineering, the memory architecture, the handoff protocol, and the backport detection.
The agents don’t understand game design theory. They don’t have artistic vision. They follow detailed prompts that encode specific design decisions (campy sci-fi parody, three difficulty tiers, dual game modes) and make those decisions concrete through code.
What they do have is consistency, patience, and the ability to read 12,000 lines of Lua and make a surgical change. That’s enough to ship games — if the system around them is well-designed.
The Dark Factory is less about AI capability and more about systems engineering. The agents are the workers. The architecture is the studio.