Dodo Payments
Feb 27, 2026 7 min read

I Asked AI Agents to Build 8 Games. They One-Shotted Every Single One.

Ayush Agarwal
Co-founder & CPTO

A year ago, Andrej Karpathy coined the term “vibe coding” and it became the word of the year. This year, he quietly nominated its successor: agentic engineering. I didn’t fully grasp the difference until this past weekend, when I sat down to add 8 new games to Dodo Games, our open-source collection of payment-themed browser games at Dodo Payments, and oh-my-opencode one-shotted all of them.

No back-and-forth. No debugging spiral. No manually stitching things together. I described the games, tinkered on the ideas with the agent, and it shipped working implementations that respected our existing codebase architecture.

Something has fundamentally shifted in how software gets built.

The Setup

I run Dodo Payments, and Dodo Games is a fun side project we maintain: a collection of browser arcade games themed around payments and fintech. Think Flappy Dodo dodging chargebacks, Gateway Defender fighting off DDoS attacks, Transaction Snake growing a payment chain. It’s playful, open source, and a surprisingly good way to make payment infrastructure concepts tangible.

I wanted to bulk-expand the collection over the weekend: eight new games, each with a distinct mechanic.

Each game needed its own directory, HTML, CSS, JavaScript, game loop, scoring system, responsive design, and integration with the shared landing page. This isn’t trivial: it’s the kind of work that, done manually, would sprawl across days.
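To make the per-game anatomy concrete, here’s a minimal sketch of the structure each game shares: a pure update function with scoring, driven by a fixed loop. None of this is taken from the actual repo; the names (`createState`, `stepGame`, the event strings) are illustrative.

```javascript
// Illustrative skeleton of one game's core: state, scoring, and a
// pure update step. Not code from the Dodo Games repo.

function createState() {
  return { score: 0, lives: 3, elapsed: 0, gameOver: false };
}

// Pure update: advance the simulation by dt seconds and apply events
// (e.g. a dodged chargeback awards points, a hit costs a life).
function stepGame(state, dt, events = []) {
  const next = { ...state, elapsed: state.elapsed + dt };
  for (const e of events) {
    if (e === "dodge") next.score += 10;
    if (e === "hit") next.lives -= 1;
  }
  if (next.lives <= 0) next.gameOver = true;
  return next;
}

// In the browser this would typically be driven by requestAnimationFrame:
//   let last = performance.now();
//   function frame(now) {
//     state = stepGame(state, (now - last) / 1000, drainEvents());
//     last = now;
//     render(state);
//     if (!state.gameOver) requestAnimationFrame(frame);
//   }
//   requestAnimationFrame(frame);
```

Keeping the update step pure (state in, state out) is what makes a game like this easy for an agent, or a human, to test without a browser.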

With oh-my-opencode, it was a single session.

What oh-my-opencode Actually Does Differently

If you haven’t come across it yet: oh-my-opencode is a multi-agent orchestration layer built on top of OpenCode. It was created by Yeongyu Kim, who reportedly spent $24,000 in personal LLM costs researching optimal multi-agent structures before building the tool. That kind of obsessive experimentation shows.

The core idea is that instead of talking to one AI model, you command a coordinated team of specialized agents. The key players:

  • Sisyphus is the orchestrator. It takes your intent, breaks it into parallel workstreams, delegates to specialists, and drives everything to completion. Critically, it doesn’t stop halfway: it runs what the project calls the “Ralph Loop,” a self-referential cycle that keeps pushing until the task is 100% done.
  • Prometheus is the planner. Before anyone writes a line of code, Prometheus interviews you like an engineer would: clarifying intent, identifying scope ambiguities, and constructing a verified plan. For my games, it looked at the existing directory structure, the shared CSS patterns, and how routing worked, and asked me a handful of pointed questions about game mechanics before anything was generated.
  • Hephaestus is the deep worker: an autonomous agent that dives into research and execution independently, exploring codebases without hand-holding.

These agents don’t just generate code in isolation. oh-my-opencode gives them LSP (Language Server Protocol) integration and AST-aware code search, which means they can navigate definitions, find references, and understand your project’s type system. The games I got back followed our existing conventions not because I meticulously specified them, but because the agents read the codebase and inferred the patterns on their own.

What made the experience feel qualitatively different from any AI coding tool I’ve used before was the parallelism. Each game became an independent workstream. Game logic, UI, integration with the landing page: all happening concurrently across agents, then resolved into a coherent result. Eight games weren’t meaningfully slower than one would’ve been.
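The fan-out/fan-in shape of that parallelism can be sketched in ordinary JavaScript. To be clear, this is not oh-my-opencode’s API; `buildGame` is a hypothetical stand-in for whatever one agent does on one workstream.

```javascript
// Sketch of the fan-out/fan-in pattern: each game is an independent
// workstream, run concurrently, then collected into one result.
// buildGame is a hypothetical stand-in, not oh-my-opencode code.

async function buildGame(name) {
  // Stand-in for an agent implementing logic, UI, and integration
  // for a single game.
  return { name, files: [`${name}/index.html`, `${name}/game.js`] };
}

async function buildAll(names) {
  // Promise.all resolves only when every workstream has finished,
  // which is why eight games aren't meaningfully slower than one.
  return Promise.all(names.map(buildGame));
}

// buildAll(["flappy-dodo", "gateway-defender"]).then(console.log);
```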

The Shift: From Prompting to Orchestrating

There’s a broader pattern here that goes well beyond my little game project.

Vibe coding (the 2025 workflow of chatting with an AI, iterating through outputs, manually integrating) captured something real. It lowered the barrier to building software. But it had a ceiling. A UC San Diego and Cornell study from late 2025 found that experienced developers (3–25 years of experience) working with AI coding agents don’t actually “vibe”; they control. Stack Overflow’s 2025 survey confirmed it: 72% of professional developers said vibe coding isn’t part of their professional work.

The reason is structural. In the vibe coding model, the human is still the orchestrator, the debugger, and the integration layer. For small tasks, that’s fine. For anything ambitious (say, 8 games that need to follow existing architectural conventions), it collapses under its own weight.

Anthropic’s own 2026 Agentic Coding Trends Report frames the shift clearly: engineering teams are moving from writing code to coordinating AI agents that handle implementation. Engineers now use AI in roughly 60% of their work, but can “fully delegate” only 0–20% of tasks. The gap between those numbers is where agent orchestration lives: it’s the tooling that pushes more work into the “fully delegatable” category.

The difference feels like this:

2025 workflow: “Write me a Tetris game with payment themes” → get back broken code → fix → re-prompt → get half-working version → manually integrate → debug CSS → ship after 2 hours of iteration.

2026 workflow: “Add 8 new payment-themed games matching the existing architecture” → agents analyze codebase → agents plan → agents execute in parallel → you review a finished result → ship.

The second isn’t just faster. It’s a different kind of activity. You’re not pair-programming anymore. You’re managing a team.

Why This Matters for Small Teams

Here’s the part that excites me most as a founder.

Gartner reported a 1,445% surge in multi-agent system inquiries from Q1 2024 to Q2 2025. That’s not hype; it’s enterprises realizing that the model wars are over and the orchestration wars have begun. For two years, the industry obsessed over which model was “best”: Claude vs GPT vs Gemini vs the open-source contenders. Tools like oh-my-opencode sidestep this entirely by routing work to the right model for the job: Claude for orchestration, GPT for deep reasoning, faster models for quick fixes. No single model dominates every task, so stop picking one.

But the real impact isn’t at enterprise scale. It’s at the small-team and solo-founder level.

I don’t have a team of game developers. I didn’t hire anyone. I went from “we should add more games” to “here are 8 new games, fully implemented and integrated” in one sitting. The leverage that agent orchestration gives a small team is hard to overstate. One article I came across put it bluntly: one founder plus AI tools plus “digital employees” now equals the output of a 10–20 person team. I’m not sure the numbers are exactly right, but the direction is unmistakable.

This isn’t theoretical. This is what I experienced, building real things, shipping them to a real repo that anyone can go look at right now.

What Still Needs Work (Being Honest)

It wasn’t frictionless. A few observations:

The initial setup of oh-my-opencode has a learning curve. It’s powerful but not yet “install and go”: you need to configure model providers, understand the agent categories, and build some intuition for when to use ultrawork versus more targeted commands. The documentation is good but dense.

Token costs are real. Running multiple frontier models in parallel burns through credits fast. Prices keep dropping, but if you’re on a budget, you’ll want to think about which tasks actually need full orchestration versus a simpler single-agent approach.

And there’s a taste gap that agents can’t bridge yet. The games came out mechanically sound: working game loops, proper scoring, responsive design. But game feel (the tightness of controls, the rhythm of difficulty progression, the small details that make something satisfying to play) still required human tinkering. Agents can implement mechanics. Crafting an experience is still a human job.
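The tinkering in question is mostly tuning small curves like the one below: a hypothetical function mapping elapsed play time to obstacle spawn interval. The shape and constants are illustrative assumptions, not from the repo; finding values that actually feel right is exactly the human part.

```javascript
// The kind of "game feel" knob that still gets tuned by hand: how fast
// difficulty ramps up. Constants here are illustrative, not from the repo.

function spawnInterval(elapsedSeconds) {
  const START = 1.5; // seconds between obstacles at the start
  const FLOOR = 0.4; // never spawn faster than this
  const RAMP = 60;   // roughly how long the ramp-up lasts, in seconds

  // Exponential ease toward the floor: gentle pressure early on,
  // relentless later, never dropping below the playable floor.
  const t = Math.min(elapsedSeconds / RAMP, 1);
  return FLOOR + (START - FLOOR) * Math.exp(-3 * t);
}
```

An agent will happily emit a linear ramp here; whether the curve should ease in, plateau, or spike is a judgment call you only make by playing the game.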

One more nuance worth noting, cited in Anthropic’s report: Google’s 2025 DORA Report found that a 90% increase in AI adoption correlated with a 9% climb in bug rates and a 91% increase in code review time. Agent orchestration solves the speed problem. The quality verification problem is still very much open.

What I Take Away From This

We’re past the point where AI coding tools are novelties. The question isn’t “can AI write code?”; it obviously can. The question is: how do we organize AI to do complex, multi-part engineering work while maintaining quality and architectural coherence?

Agent orchestration is the answer emerging in 2026. Tools like oh-my-opencode aren’t coding assistants; they’re the first draft of what software engineering looks like when humans and AI agents collaborate natively. The developer’s value is shifting from writing code to decomposing problems into agent-executable tasks, reviewing outputs, and making judgment calls that agents can’t.

A year ago, getting one game out of an AI coding session would’ve been an achievement. Today, eight in a weekend is just… a weekend. And the tools are only getting better.

The games are live. The repo is open source at github.com/dodopayments/dodo-games. And oh-my-opencode is at github.com/code-yeongyu/oh-my-opencode.

Go build something unreasonable.
