AI Game Simulation

The AI mirror
for game designers.

I build AI simulation systems that give board game designers faster, richer feedback — without replacing the human playtest.

AI should act as a mirror that reflects design problems back to you — not a replacement for real players around a table.

Playtesting is slow.
Iteration is slower.

Board game design requires dozens of playtests to surface balance issues, unclear rules, and broken strategies. But each session takes hours to schedule, run, and analyse. AI simulation compresses that loop.

Problem 01

Playtesting takes time

A single playtest session can take 2–4 hours. Finding a balanced group of testers, prepping materials, and running the debrief adds days to each iteration cycle.

Problem 02

One playtest explores one path

Human playtests explore one path. AI agents can simultaneously test dozens of strategies, faction combinations, and rule variants — surfacing edge cases humans rarely encounter.

Solution

The AI mirror

AI simulation acts as an early mirror — giving designers structured, rapid feedback on rule clarity, balance, and strategy before expensive human sessions.

What it's not

Faster feedback loops

By identifying obvious issues before a human table, designers arrive at each real playtest with cleaner rules and sharper questions — making the human time count more.

Important caveat: AI simulation does not replace human playtesting. It is a pre-flight check — a way to catch structural issues before you put real players in front of a prototype. The goal is to make human playtests more productive, not obsolete.

Game designer.
AI practitioner.

I'm a game designer who has spent the last several years at the intersection of board game design and applied AI engineering. I'm optimistic about AI as a tool — and rigorous about where it falls short.

2017

AI Engineering

Started working professionally with AI and machine learning systems. Began exploring how language models could assist structured, rule-based reasoning — the foundation of game simulation.

2020

pyplaytest — Open Source

Published an open source Python module enabling LLM-driven iteration on game playability. One of the earliest attempts to use language models to evaluate board game rule sets programmatically.

github.com/dat-boris/playtest-deck-mechanics →

2025

LLM-based Agent Simulation

Built a multi-agent simulation framework using LangGraph — where specialized AI agents take on player roles, simulate full game sessions, and generate structured rule and balance feedback. Applied to The Beautiful Bid, a board game about FIFA corruption.

Three problems,
three approaches.

Case Study 01

Dynamic Game Observation

Using LangGraph tracing to observe how AI players traverse game states — turn by turn, decision by decision. This gives designers a structural map of how their game actually flows in practice.

Demo Video — LangGraph Tracing

[ Add video URL here ]

TURN 1 — game_master_adjudicate_round_start

5 Chairman spaces (C1–C5) await first bids

Presidency space open for next round's leadership contest

Bidding Phase

First bid on any Chairman space sets that Chairman's vice

Bribe tiles count as 2 bid strength but risk investigation penalties

Flow setup → bidding → investigation → resolution

What this surfaces: By tracing each agent decision through LangGraph's waterfall view, designers can see exactly where players hesitate, which rules generate clarification questions, and which game states are never reached.
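The shape of the data behind that waterfall view can be sketched in plain Python. This is an illustrative stand-in, not the actual tracing code: the phase names follow the setup → bidding → investigation → resolution flow above, and the event fields are assumptions about what a per-decision trace record contains.

```python
from dataclasses import dataclass, field

# Hypothetical trace record: one entry per agent decision, mirroring
# what a LangGraph waterfall trace exposes for each node invocation.
@dataclass
class TraceEvent:
    turn: int
    phase: str   # "setup" | "bidding" | "investigation" | "resolution"
    agent: str
    action: str

@dataclass
class GameTrace:
    events: list = field(default_factory=list)

    def log(self, turn, phase, agent, action):
        self.events.append(TraceEvent(turn, phase, agent, action))

    def unreached_phases(self, all_phases):
        """Phases no agent ever entered: candidate dead game states."""
        seen = {e.phase for e in self.events}
        return [p for p in all_phases if p not in seen]

# Minimal usage: one simulated turn of a session.
trace = GameTrace()
trace.log(1, "setup", "game_master", "adjudicate_round_start")
trace.log(1, "bidding", "player_1", "bid on C1")
trace.log(1, "bidding", "player_2", "play bribe tile on C3")
trace.log(1, "resolution", "game_master", "award chairman seats")

print(trace.unreached_phases(
    ["setup", "bidding", "investigation", "resolution"]))
# In this trace the investigation phase was never reached.
```

Querying the trace for unreached phases is how "game states that are never reached" surfaces as data rather than as a designer's hunch.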

Case Study 02

Rule Iteration via AI Feedback

AI agents playing as distinct characters surface rule clarity issues and balance problems through structured post-game feedback. These quotes are real outputs from simulation sessions on The Beautiful Bid.

Rule Clarity Issue
The rules around the "Explicit Bribery Windows" were a bit unclear on the first read-through. It would be helpful to have more examples or clarification on when and how these bribery phases are triggered.
Rule Clarity Issue
The resolution order and the president's ability to influence it could also use some additional explanation. It wasn't immediately evident how this could impact the award conditions.
Positive Balance Signal
The corruption mechanic seemed to provide a viable strategy, but the Investigator's ability to audit and penalize it was an effective counterbalance. I didn't feel that any single strategy was overpowered.
Balance Tension
The corruption mechanic provided an interesting strategic avenue, but there were times when it felt overpowered — especially when players could reliably bribe key chairmen for large influence gains.
Watch out for sycophancy: A key challenge in LLM-based playtesting is that agents tend to give overly positive feedback by default. Engineering prompts that surface genuine tension — contradictory feedback from different agents — is critical to getting useful signal. The two conflicting "balance" quotes above are an example of prompting for disagreement.
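One way to engineer that disagreement is to assign each reviewer agent an adversarial stance before asking for feedback. The sketch below is a minimal illustration of this prompting pattern; the persona names and wording are hypothetical, not taken from the real system.

```python
# Hypothetical prompt builder: each reviewer agent gets a stance that
# forbids polite convergence, so the panel produces genuine tension.
STANCES = {
    "skeptic": "Assume the game is broken until proven otherwise. "
               "Name the single weakest rule and explain why it fails.",
    "power_gamer": "Find the most exploitable strategy. If no strategy "
                   "dominates, say so explicitly and explain why.",
    "rules_lawyer": "List every rule you could not apply without guessing.",
}

def build_feedback_prompt(stance: str, transcript: str) -> str:
    """Compose a post-game feedback prompt for one persona."""
    return (
        "You just played a full session. Session transcript:\n"
        f"{transcript}\n\n"
        f"Your stance: {STANCES[stance]}\n"
        "Do not soften criticism. Disagreement with the other "
        "reviewers is expected and useful."
    )

prompt = build_feedback_prompt("skeptic", "TURN 1: P1 bids on C1 ...")
print(prompt)
```

Running the same transcript through several stances and diffing the answers is what produces conflicting quotes like the two balance assessments above, instead of three agents agreeing that everything is fine.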

Case Study 03

Game Balancing via Metrics Tracking

By logging structured events across every simulated turn — resources, corruption seen, corruption caught, agent questions — we can graph game balance over time and spot runaway-leader dynamics or dead strategies.

Metrics Tracking — Feb 28 Session

round_phase_index · agent_questions_count · total_resources_count · corruption_seen_count · corruption_caught_count

81
Total resources (peak)
3
Corruption caught
3.3
Avg agents / phase
27
Events logged

[ Replace with actual graph image from simulation session ]

Balance Problem Identified
The game may be too "anti-corruption" by default — the lack of overt bribery and corruption is limiting the strategic depth and tension.
Runaway Leader Risk
There is a potential risk of "runaway leader" scenarios if one player is able to establish a dominant position early on. Careful balancing of the bidding mechanics and award conditions would be important.
Why structured logging matters: Without event-level tracking, AI feedback is just prose. By instrumenting the simulation to emit structured events per turn, we can correlate text feedback with actual game state — and distinguish real balance problems from LLM hallucination.
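A minimal sketch of that instrumentation, using the metric names from the session above. The event schema and sample values here are illustrative assumptions, not the real simulation's output.

```python
from collections import Counter

# Hypothetical structured event log: one dict per simulated event.
events = [
    {"turn": 1, "type": "resource_gain", "amount": 12},
    {"turn": 1, "type": "corruption_seen"},
    {"turn": 2, "type": "corruption_seen"},
    {"turn": 2, "type": "corruption_caught"},
    {"turn": 2, "type": "agent_question", "text": "Can I rebid on C1?"},
]

def summarize(events):
    """Aggregate per-type counts so text feedback can be checked
    against actual game state instead of taken on faith."""
    counts = Counter(e["type"] for e in events)
    return {
        "corruption_seen_count": counts["corruption_seen"],
        "corruption_caught_count": counts["corruption_caught"],
        "agent_questions_count": counts["agent_question"],
        "total_resources_count": sum(
            e.get("amount", 0)
            for e in events if e["type"] == "resource_gain"
        ),
    }

print(summarize(events))
# If an agent later claims "corruption was never punished", a nonzero
# corruption_caught_count contradicts it -- a hallucination check.
```

The same summaries, logged per round rather than per session, are what make the balance graphs possible: a runaway leader shows up as one player's resource curve detaching from the pack.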

What I'm looking for

I'm interested in working with game designers and AI practitioners who want to bring rigorous simulation thinking into the design process. Not looking to automate — looking to augment.

Game Designers

Have a prototype you want to stress-test before a print run? Let's build a simulation loop for your mechanics.

AI Researchers

Interested in multi-agent coordination, sycophancy mitigation, or LLM reasoning in constrained rule systems? I have a live playground.

Publishers & Studios

Looking to accelerate development pipelines with AI in the loop — without replacing your playtest culture? Let's talk.

Open Source

pyplaytest is public on GitHub. If you want to extend it, fork it, or build on top of the LangGraph simulation layer — contributions welcome.

The best way to reach me is by email. Tell me about your game, your design problem, and where you think simulation could help — even just a rough sketch.

hello@youremail.com