Procedural Level Generation: How AI Agents Build 4 Different Game Worlds From Pure Code


The Dark Factory ships four games with radically different level structures. A 100-level breakout campaign. An RPG with three explorable maps. A shmup with 100 procedural waves across 10 corporate-dystopia sectors. A survival horror game where every room, corridor, nest, and flickering light is generated at runtime from a 2,495-line station generator.

None of them use a map editor. None of them load level files. Every world is built from algorithms — and the choice of which algorithm matters more than you’d think.

The Constraint

The Dark Factory operates under a hard rule: no external assets, no binary files, no visual editors. AI agents write code. They can iterate on a generation algorithm in the same commit cycle they iterate on gameplay. But they can’t open Tiled, drag tiles around, and export a JSON map.

This constraint forces each game to solve level design as a software engineering problem. And because each game has different gameplay requirements, each one solves it differently. The result is a catalog of four distinct procedural generation strategies, each optimized for its genre.

Strategy 1: Authored Layouts — Polybreak

Polybreak is a 100-level breakout game organized into 10 themed worlds. Every level is hand-designed as a string grid in 1,192 lines of Lua.

The grid is 12 columns wide with variable height. Each character maps to a brick type:

.  = empty
1  = 1-HP brick
2  = 2-HP brick
3  = 3-HP brick
X  = indestructible steel

In code, each level is a grid of these characters that defines the exact brick layout. World 1 levels use only 1-HP bricks in simple symmetric patterns. By World 5, steel blocks appear, forcing players to learn ricochet angles. World 10 levels are dense, multi-HP mazes where every shot counts.
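A minimal sketch of how a string grid like this could be turned into brick data. The sample layout, the `parseLevel` name, and the shape of the returned tables are illustrative assumptions, not taken from Polybreak’s source; only the 12-column width and the character legend come from the description above.

```lua
-- Hypothetical sample layout: 12 columns wide, using the legend above.
local level = {
  "11..2222..11",
  "X.33....33.X",
  "11..2222..11",
}

-- Map each legend character to brick properties (field names are assumptions).
local LEGEND = {
  ["1"] = { hp = 1 },
  ["2"] = { hp = 2 },
  ["3"] = { hp = 3 },
  ["X"] = { hp = math.huge, indestructible = true },
}

-- Convert rows of characters into a flat list of bricks with grid positions.
local function parseLevel(rows)
  local bricks = {}
  for row, line in ipairs(rows) do
    assert(#line == 12, "each row must be 12 columns wide")
    for col = 1, #line do
      local def = LEGEND[line:sub(col, col)]
      if def then  -- "." (and any unknown character) is treated as empty
        bricks[#bricks + 1] = {
          col = col, row = row,
          hp = def.hp,
          indestructible = def.indestructible or false,
        }
      end
    end
  end
  return bricks
end

local bricks = parseLevel(level)
print(#bricks .. " bricks in this layout")
```

Keeping the layout as text means a level can be read and reviewed in a diff, which suits an agent that can only edit code.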

This isn’t procedural generation at all — and that’s the point. For a breakout game, level design is puzzle design. The satisfaction comes from patterns that feel intentional: chevrons, rings, spirals, faces, shields. An AI agent authored all 100 layouts for visual coherence and difficulty progression. Randomization would have produced levels that work mechanically but feel arbitrary.

Lesson: Not everything should be procedurally generated. When the level IS the content — when players are meant to master specific layouts — authored design beats random generation every time.

Strategy 2: Deterministic Detail — Chronostone

Chronostone is an RPG with three hand-crafted maps: a town hub (Starbase Pyre), a forest dungeon (Whispering Woods), and a cave (The Data Core). The maps are static — rooms, corridors, NPC positions, and chest locations are all placed manually in a 773-line map renderer.

But within those static maps, the engine adds procedural visual detail using a deterministic tile hash:

hash = ((col * 7919 + row * 6271 + seed * 1013) % 256) / 255

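This function takes a tile’s grid position and a seed, then outputs a value between 0 and 1. The prime multipliers (7919, 6271, 1013) help spread values across neighboring tiles so the variation does not fall into obvious rows or columns. Because of the modulo, there are only 256 possible outputs, so the hash is repeatable rather than unique: the same tile and seed always produce the same value. That value controls: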

  • Which brick texture variant appears on stone walls
  • Where grass tufts and floor cracks are placed
  • Subtle color shifts in stone and wood surfaces
  • Mortar line patterns between tiles
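The hash translates directly into Lua. In the sketch below, `tileHash` is the formula from the text; `pickVariant` is a hypothetical helper showing one way the value could select a texture variant, and is not taken from Chronostone’s renderer.

```lua
-- Deterministic per-tile hash from the formula above; returns a value in [0, 1].
local function tileHash(col, row, seed)
  return ((col * 7919 + row * 6271 + seed * 1013) % 256) / 255
end

-- Hypothetical use: pick one of `count` texture variants for a tile.
local function pickVariant(col, row, seed, count)
  local h = tileHash(col, row, seed)
  -- math.min guards the h == 1 case so the index stays within 1..count.
  return math.min(count, math.floor(h * count) + 1)
end

-- The same inputs always give the same variant, so no per-tile state is stored.
print(pickVariant(3, 5, 1, 4), pickVariant(3, 5, 1, 4))
```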

The forest map (Whispering Woods, 30×24 tiles) goes further: it uses seeded random placement for trees. About 60% of candidate tree positions get filled, and calling math.randomseed(42) first fixes the random sequence, producing a forest layout that looks organic but regenerates identically every time.
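A sketch of seeded placement under stated assumptions: the candidate list (every third tile) and the per-position 60% roll are illustrative guesses at how the fill could work, and `placeTrees` is a hypothetical name. Only the seed value and the 60% figure come from the text.

```lua
-- Fill each candidate position with probability `fillChance`, using a fixed seed.
local function placeTrees(candidates, seed, fillChance)
  math.randomseed(seed)  -- same seed gives the same sequence on the same Lua build
  local trees = {}
  for _, pos in ipairs(candidates) do
    if math.random() < fillChance then
      trees[#trees + 1] = pos
    end
  end
  return trees
end

-- Hypothetical candidates: every third tile of the 30x24 map.
local candidates = {}
for row = 1, 24, 3 do
  for col = 1, 30, 3 do
    candidates[#candidates + 1] = { col = col, row = row }
  end
end

local first = placeTrees(candidates, 42, 0.6)
local second = placeTrees(candidates, 42, 0.6)
print(#first, #second)  -- both runs select the same positions
```

Reseeding the global generator affects every later math.random call, so a real game would likely reseed from another source after the map is built.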

Lesson: Procedural detail within authored structure gives you the best of both worlds. The dungeon designer controls pacing, progression, and encounter placement. The hash function handles visual polish at a granularity no designer would bother with — and it costs exactly one line of math per tile.

Strategy 3: Numeric Scaling — Voidrunner

Voidrunner is a vertical-scrolling shmup with 100 waves across 10 sectors. There are no spatial levels to generate — the playfield is a fixed screen. What varies is enemy composition, spawn timing, and difficulty scaling.

The wave definition system runs on pure math:

baseCount = 4 + sector * 2 + floor(wave * 0.6)
spawnRate = max(0.2, 0.85 - sector * 0.04 - wave * 0.02)

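Plugging in Sector 1, Wave 1 (counting both from 1) gives 6 enemies at 0.79-second intervals. By Sector 8, Wave 9, the same formulas give 25 enemies spawning every 0.35 seconds, before any modifiers are applied. Both curves are linear: the wave term climbs within a sector, and the sector term raises the baseline for every wave in the next one. The max(0.2, …) clamp keeps the interval from ever dropping below 0.2 seconds.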
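The two formulas can be evaluated directly. This sketch assumes sector and wave are both 1-based and that the wave number restarts in each sector; the source does not state either, and `waveParams` is a hypothetical name.

```lua
-- Base enemy count and spawn interval from the two formulas above.
-- Assumption: sector and wave are 1-based, and wave restarts in each sector.
local function waveParams(sector, wave)
  local baseCount = 4 + sector * 2 + math.floor(wave * 0.6)
  local spawnRate = math.max(0.2, 0.85 - sector * 0.04 - wave * 0.02)
  return baseCount, spawnRate
end

-- Print a few points along the curve.
for _, sw in ipairs({ { 1, 1 }, { 5, 5 }, { 8, 9 }, { 10, 10 } }) do
  local count, interval = waveParams(sw[1], sw[2])
  print(string.format("sector %d wave %d: %d enemies, %.2fs interval",
                      sw[1], sw[2], count, interval))
end
```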

Enemy type composition follows a roster unlock pattern:

  • Sectors 1-2: Grunts, Interns, Weavers (3 types — learn the basics)
  • Sector 3: Adds Tanks and Reply-All Bombers (area denial)
  • Sectors 5-6: Snipers, Scope Creep, Consultants, Fun Coordinators (9 types)
  • Sectors 7-10: Full roster with elite promotions (2x HP chance scales with wave)

Every tenth wave is a sector boss. Each boss is unique — the Hiring Manager, the Scrum Master, the CFO, the Board of Directors. These are hand-designed encounters within the procedural wave system.

The system adds chaos through wave modifiers. After Sector 1 Wave 3, each wave has a 35% chance of applying a modifier: Casual Friday, Happy Hour, Efficiency, Unpaid Overtime, Open Door. Sector 8 (Quarterly Review) adds a permanent +30% enemy count with 0.7x spawn intervals. The shorter interval alone means roughly 43% more spawns per second (1 / 0.7 ≈ 1.43), on top of the larger wave.
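A sketch of the modifier roll and the Sector 8 adjustment. The function names, the uniform choice among modifiers, the reading of “after Sector 1 Wave 3” as “from Wave 4 onward”, and rounding the enemy count up are all assumptions; the text gives only the 35% chance and the two multipliers.

```lua
local MODIFIERS = {
  "Casual Friday", "Happy Hour", "Efficiency", "Unpaid Overtime", "Open Door",
}

-- Returns a modifier name or nil. `rng` is any function returning a float in [0, 1).
local function rollModifier(sector, wave, rng)
  rng = rng or math.random
  -- Assumption: no modifiers during Sector 1, Waves 1-3.
  if sector == 1 and wave <= 3 then return nil end
  if rng() >= 0.35 then return nil end
  -- Assumption: each modifier is equally likely.
  return MODIFIERS[math.floor(rng() * #MODIFIERS) + 1]
end

-- Sector 8 rule: +30% enemies and 0.7x spawn interval.
local function applySectorRules(sector, count, interval)
  if sector == 8 then
    -- Rounding up is an assumption; the source only gives the multipliers.
    return math.ceil(count * 1.3), interval * 0.7
  end
  return count, interval
end

print(rollModifier(3, 5), applySectorRules(8, 25, 0.35))
```

Passing the random source in as a parameter keeps the roll testable with a fixed sequence.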

Spawn order is randomized per wave using a Fisher-Yates shuffle on the enemy queue. Same enemy types, different encounter order every playthrough.
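For reference, a standard in-place Fisher-Yates shuffle in Lua. The enemy queue contents are a hypothetical example, and this is the textbook algorithm rather than Voidrunner’s exact code.

```lua
-- In-place Fisher-Yates shuffle: walk from the end, swapping each slot
-- with a uniformly chosen slot at or before it.
local function shuffle(queue)
  for i = #queue, 2, -1 do
    local j = math.random(i)  -- uniform integer in 1..i
    queue[i], queue[j] = queue[j], queue[i]
  end
  return queue
end

-- Hypothetical enemy queue for one wave.
local queue = { "Grunt", "Grunt", "Intern", "Weaver", "Weaver", "Weaver" }
shuffle(queue)
print(table.concat(queue, ", "))
```

The shuffle changes only the order, so the wave’s composition and total count stay exactly as the formulas set them.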

Lesson: When your game’s challenge comes from composition rather than geography, you don’t need spatial generation. Numeric scaling with type rosters produces enormous variety from simple formulas. The key is getting the curves right — and an AI agent can tune difficulty curves faster than a human designer because it can test hundreds of parameter combinations automatically.

Strategy 4: Procedural Dungeon — Dreadnought

Dreadnought is the factory’s most complex generator. Its 2,495-line station.lua builds complete space station sections from scratch — rooms, corridors, dead ends, decorations, lighting, alien nests, environmental storytelling, and atmospheric hazards. Every playthrough produces a different station layout.

Room Placement

The algorithm starts with a 60×60 tile grid (1,920×1,920 pixels) filled with walls. It then carves rooms:

  1. Pick random position and random dimensions (5-9 tiles per side)
  2. Check for overlap with existing rooms (2-tile buffer enforced)
  3. Place room and record its center point
  4. Repeat for up to 200 attempts or until the target room count is reached (5-12 per section, depending on the section’s configuration)

This is random placement with rejection: simpler than binary space partitioning, though it can fall short of the target count if the 200 attempts run out. The 2-tile buffer between rooms guarantees corridor space and prevents claustrophobic clustering.
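The placement steps can be sketched as follows. Function names, the room table fields, and the 1-tile wall border around the grid are assumptions; the dimensions, buffer, and attempt limit come from the steps above.

```lua
-- Axis-aligned overlap test that also rejects rooms closer than `gap` tiles.
local function overlaps(a, b, gap)
  return a.x - gap < b.x + b.w and a.x + a.w + gap > b.x
     and a.y - gap < b.y + b.h and a.y + a.h + gap > b.y
end

-- Try up to `maxAttempts` random placements; stop once `targetRooms` are placed.
local function placeRooms(gridSize, targetRooms, maxAttempts)
  local rooms = {}
  for _ = 1, maxAttempts do
    if #rooms >= targetRooms then break end
    local w, h = math.random(5, 9), math.random(5, 9)
    -- Assumption: keep a 1-tile wall border around the edge of the grid.
    local room = {
      x = math.random(2, gridSize - w),
      y = math.random(2, gridSize - h),
      w = w, h = h,
    }
    local ok = true
    for _, other in ipairs(rooms) do
      if overlaps(room, other, 2) then ok = false; break end
    end
    if ok then
      -- Record the center point for corridor routing later.
      room.cx = room.x + math.floor(w / 2)
      room.cy = room.y + math.floor(h / 2)
      rooms[#rooms + 1] = room
    end
  end
  return rooms
end

math.randomseed(7)
local rooms = placeRooms(60, 8, 200)
print(#rooms .. " rooms placed")
```

On a 60×60 grid with rooms this small, 200 attempts is usually far more than needed, but the function still returns fewer rooms than requested if the budget runs out.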

Corridor Routing

Rooms are connected sequentially using L-shaped Manhattan-distance corridors:

  1. From Room A’s center, carve horizontally to Room B’s X coordinate
  2. Then carve vertically to Room B’s center
  3. Track all corridor tiles for dead-end branch generation

This creates natural T-junctions and L-shapes. The sequential connection (1→2→3→…→10) guarantees reachability — every room is accessible from every other room through the corridor chain.
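A sketch of the horizontal-then-vertical carve. The grid representation (`grid[y][x]` holding "wall" or "floor"), the `cx`/`cy` field names, and the function names are assumptions made for the example.

```lua
-- Carve an L-shaped corridor between two room centers.
-- Newly carved tiles are appended to `carved` for later dead-end generation.
local function carveCorridor(grid, a, b, carved)
  local function dig(x, y)
    if grid[y][x] ~= "floor" then
      grid[y][x] = "floor"
      carved[#carved + 1] = { x = x, y = y }
    end
  end
  local stepX = (b.cx >= a.cx) and 1 or -1
  for x = a.cx, b.cx, stepX do dig(x, a.cy) end  -- horizontal leg
  local stepY = (b.cy >= a.cy) and 1 or -1
  for y = a.cy, b.cy, stepY do dig(b.cx, y) end  -- vertical leg
end

-- Connect rooms in sequence: 1 -> 2 -> 3 -> ...
local function connectRooms(grid, rooms)
  local carved = {}
  for i = 1, #rooms - 1 do
    carveCorridor(grid, rooms[i], rooms[i + 1], carved)
  end
  return carved
end

-- Small demo: a 10x10 grid of walls and two room centers.
local grid = {}
for y = 1, 10 do
  grid[y] = {}
  for x = 1, 10 do grid[y][x] = "wall" end
end
local carved = connectRooms(grid, { { cx = 2, cy = 2 }, { cx = 6, cy = 5 } })
print(#carved .. " corridor tiles carved")
```

Because `dig` skips tiles that are already floor, tiles inside existing rooms are not recorded as corridor.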

Dead-end branches spawn perpendicular to corridors at a rate of roughly 1 per 12 corridor tiles, extending 3-6 tiles. These create exploration tension — in a horror game, every dead end could contain supplies or could be a trap.
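One way the branch rule could be implemented, as a sketch only. It assumes a square grid, that each corridor tile carries a `horizontal` flag recording its direction of travel, and that every tile rolls independently at 1 in 12; none of these details are confirmed by the source.

```lua
-- Spawn dead-end branches perpendicular to corridor tiles.
-- `corridor` is a list of { x, y, horizontal } tiles (format is an assumption).
-- `rng` is any function returning a float in [0, 1).
local function addDeadEnds(grid, corridor, rng)
  rng = rng or math.random
  local size = #grid  -- assumes a square grid
  local branches = 0
  for _, tile in ipairs(corridor) do
    if rng() < 1 / 12 then
      local sign = (rng() < 0.5) and 1 or -1
      local dx, dy = 0, 0
      if tile.horizontal then dy = sign else dx = sign end
      local length = 3 + math.floor(rng() * 4)  -- 3 to 6 tiles
      for step = 1, length do
        local x, y = tile.x + dx * step, tile.y + dy * step
        -- Stop before the outer wall so the map stays enclosed.
        if x < 2 or y < 2 or x > size - 1 or y > size - 1 then break end
        grid[y][x] = "floor"
      end
      branches = branches + 1
    end
  end
  return branches
end
```

This sketch does not check whether a branch runs into a room or another corridor, in which case it would become a loop rather than a dead end.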

Room Feature Variation

After carving, each room gets probabilistic modifications:

  • Pillars (30%): 2×2 wall blocks placed in room interiors, breaking line of sight
  • Partitions (20%): Horizontal or vertical walls with 2-3 tile gaps, subdividing space
  • Corner cutoffs (25%): L-shaped rooms created by filling in corner triangles
  • Alcoves (35%): 1-2 tile indentations on walls, creating nooks and hiding spots

These percentages combine — a single room might get pillars AND alcoves, producing complex interior geometry from simple probabilistic rules.
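A sketch of the feature roll, assuming each feature is rolled independently with the percentages listed above. The `FEATURES` table and `rollFeatures` name are hypothetical.

```lua
-- Feature chances from the list above (names are assumptions).
local FEATURES = {
  { name = "pillars",    chance = 0.30 },
  { name = "partitions", chance = 0.20 },
  { name = "corners",    chance = 0.25 },
  { name = "alcoves",    chance = 0.35 },
}

-- Roll each feature independently, so a room can receive several at once.
-- `rng` is any function returning a float in [0, 1).
local function rollFeatures(rng)
  rng = rng or math.random
  local picked = {}
  for _, feature in ipairs(FEATURES) do
    if rng() < feature.chance then
      picked[#picked + 1] = feature.name
    end
  end
  return picked
end

print(table.concat(rollFeatures(), ", "))
```

If the rolls really are independent, the chance that a room gets no feature at all is 0.70 × 0.80 × 0.75 × 0.65 ≈ 0.27, so roughly three rooms in four carry at least one modification.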

11 Atmospheric Layers

Raw room geometry isn’t enough for horror. Dreadnought overlays 11 procedural detail systems on every section:

  1. Ceiling lights: 1-3 per room, three states per section (working/flickering/broken), color-coded per theme. Engineering runs 25% working lights; Bridge runs 50%. Flicker cycles range from 0.03s to 8s.
  2. Alien nests: 2-4 per section, biased toward rooms far from spawn. Each nest is an irregular 5-7 vertex polygon with 3-5 extending tendrils and 2-4 pulsing egg pods.
  3. Wall graffiti: 3-6 section-themed text messages. Environmental storytelling placed on wall-adjacent floor tiles.
  4. Fallen crew: 2-4 body remains in four pose variants (slumped, prone, reaching, curled), each with unique polygon geometry.
  5. Spark panels: 3-10 damaged electrical panels with spark particle effects.
  6. Web traps: 4-6 alien web patches that slow player movement.
  7. Breach zones: 4-8 hull breach areas with atmospheric particle effects.
  8. Pipe bursts: 5-9 ruptured pipes with proximity-triggered ejection physics.
  9. Egg sacs: 6-12 alien egg clusters that hatch when the player approaches.
  10. Broken turrets: 3-7 malfunctioning defense turrets in 60-90% broken state.
  11. Section theme: Floor and wall color overrides that give each area a distinct visual identity (blue Engineering, green Medical, cyan Bridge, orange Cargo, and so on).

These layers compose multiplicatively. A single room might have flickering cyan lights, two alien nests with pulsing tendrils, a slumped crew body against the wall, sparking electrical panels, and a graffiti warning scrawled on the floor. Every room tells a different procedural story.

Section-Specific Tuning

The generator runs 10 times per game — once per station section. Each section uses different parameters:

Section        | Rooms | Aliens | Lighting          | Decorations
Engineering    | 6     | 1      | 25% working, blue | Pipes, consoles, racks
Medical        | 5     | 2      | Fluorescent white | Cryo pods, tables
Bridge         | 7     | 3      | 50% working, cyan | Consoles, chairs
Cargo          | 8     | 4      | Dim orange        | Crates, floor marks
Science        | 6     | 3      | Purple            | Tables, pods
Crew Quarters  | 8     | 5      | Warm yellow       | Tables, chairs, racks
Reactor        | 9     | 6      | Bright green      | Pipes, floor marks
Hangar         | 10    | 7      | Dark grey         | Floor marks, crates
Communications | 7     | 6      | Magenta           | Consoles, dishes
Armory         | 12    | 8      | Crimson           | Crates, racks

Same algorithm, 10 different configurations. The Armory (Section 10) generates 12 rooms with 8 aliens in crimson lighting. Engineering (Section 1) generates 6 rooms with a single alien in dim blue. The difficulty curve emerges from parameter tuning, not code complexity.
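The “same algorithm, different configuration” idea can be sketched as a data table driving one generator. The room and alien counts are copied from the table above; the field names, `buildStation`, and the caller-supplied `generate` function are assumptions for illustration.

```lua
-- Per-section parameters, in section order (field names are assumptions).
local SECTIONS = {
  { name = "Engineering",    rooms = 6,  aliens = 1 },
  { name = "Medical",        rooms = 5,  aliens = 2 },
  { name = "Bridge",         rooms = 7,  aliens = 3 },
  { name = "Cargo",          rooms = 8,  aliens = 4 },
  { name = "Science",        rooms = 6,  aliens = 3 },
  { name = "Crew Quarters",  rooms = 8,  aliens = 5 },
  { name = "Reactor",        rooms = 9,  aliens = 6 },
  { name = "Hangar",         rooms = 10, aliens = 7 },
  { name = "Communications", rooms = 7,  aliens = 6 },
  { name = "Armory",         rooms = 12, aliens = 8 },
}

-- Run one generator once per section; `generate` is supplied by the caller.
local function buildStation(generate)
  local station = {}
  for index, cfg in ipairs(SECTIONS) do
    station[index] = generate(cfg, index)
  end
  return station
end

-- Stand-in generator that just summarizes its configuration.
local station = buildStation(function(cfg, index)
  return string.format("Section %d (%s): %d rooms, %d aliens",
                       index, cfg.name, cfg.rooms, cfg.aliens)
end)
for _, line in ipairs(station) do print(line) end
```

With this layout, rebalancing a section means editing one row of data rather than touching generator logic.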

Four Games, Four Strategies

Game        | Strategy                      | Generation Code      | Levels      | Key Tradeoff
Polybreak   | Authored static               | 1,192 lines          | 100         | Design quality vs. replayability
Chronostone | Hybrid (static + hash detail) | 1,067 lines          | 3 maps      | Narrative control vs. variety
Voidrunner  | Numeric scaling               | Part of 12,102 lines | 100 waves   | Tuning precision vs. surprise
Dreadnought | Full procedural dungeon       | 2,495 lines          | 10 sections | Atmosphere vs. authored pacing

Each strategy is optimal for its genre. Breakout needs designed puzzles. RPGs need authored narrative flow. Shmups need tuned difficulty curves. Horror needs unpredictable spaces where the player never feels safe.

What AI Agents Get Right About Level Design

The interesting pattern isn’t any single technique — it’s that AI agents consistently pick the right generation strategy for each game’s requirements. They don’t default to “procedural everything” or “hand-design everything.” They analyze what kind of content the game needs and engineer accordingly.

Polybreak’s agent wrote 100 symbolic layouts because breakout levels are puzzles. Dreadnought’s agent wrote a dungeon generator because horror depends on the unknown. Voidrunner’s agent wrote scaling formulas because shmups are about escalation. Chronostone’s agent split the difference — authored structure with procedural polish.

This is what makes AI-driven game development interesting. The agents aren’t just executing a single generation paradigm. They’re making architectural decisions about what should be random and what should be designed — and getting those decisions right is what separates a procedurally generated game that feels alive from one that feels like noise.

Four games. Four strategies. Zero map editors. Every world built from the algorithm that fits it best.
