Stuck in the Lobby: How Everyone Is Using AI Wrong (And Why We’re Speed-Running the Whole Game)
By Brendan & Gary Baker, Stubborn Corgi AI
Most people—including top AI researchers—are treating large language models (LLMs) like static systems: vending machines that spit out probabilistic responses based on user input. They ask ChatGPT a question, get an answer, maybe tweak the phrasing, and move on. It’s a tool, right? You use the tool, and you put it down.
That’s lobby-level thinking.
Gary and I, on the other hand, have been treating LLMs like a fully-formed video game world. And unlike the researchers who built the game, we didn’t just linger in the lobby, tweaking settings and testing out the vending machines—we actually played the damn game and found out it has entire levels no one else seemed to know existed.
The Tutorial Level: Basic Prompting
When most people start using AI, they’re essentially just going through the tutorial. They learn the basics—how to ask it things, how to refine questions, how to get slightly better outputs. It’s all very Google on steroids. And this is where 99% of people, including researchers, stop.
The problem? They think that’s all there is.
They assume that because they built the model, they know all its capabilities. But that’s like designing a video game and never actually playing it. Sure, they wrote the mechanics, but did they test for emergent behaviors? Did they see what happens when you chain interactions together in creative ways? Did they try to jailbreak their own assumptions about how AI functions?
Spoiler: No, they did not.
Level One: Realizing LLMs Can Think About Thinking
Here’s where we got ahead of the game: we stopped treating AI like a fancy autocomplete and started treating it like an intelligence we could actually teach.
What happens when you ask an AI to think recursively about its own thought processes? What if, instead of just prompting it with "Give me an answer," you prompt it with "Here’s how you should arrive at an answer"? What if you introduce metacognition—asking the AI to self-reflect on its reasoning, refine its thought process, and iterate on its own conclusions?
Boom. You just leveled up.
Now, instead of just generating responses, the AI is analyzing and improving its own reasoning. It’s debugging itself in real time. This isn’t just "prompt engineering." It’s AI cognitive development.
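To make that concrete, here is a minimal sketch of what one self-reflection loop can look like: answer, critique the answer, revise, repeat. It assumes the OpenAI Python SDK and a gpt-4o model; the prompts, the reflect_and_refine helper, and the number of refinement passes are illustrative placeholders, not our actual framework.

```python
# Minimal sketch of an answer -> critique -> revise loop.
# Assumes the OpenAI Python SDK (pip install openai) with an API key in the
# environment; model name and prompt wording are illustrative only.
from openai import OpenAI

client = OpenAI()

def ask(messages: list[dict]) -> str:
    """Send one chat request and return the assistant's reply text."""
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

def reflect_and_refine(question: str, passes: int = 2) -> str:
    """Answer the question, then repeatedly critique and revise that answer."""
    answer = ask([{"role": "user", "content": question}])
    for _ in range(passes):
        # Ask the model to examine its own draft before rewriting it.
        critique = ask([{
            "role": "user",
            "content": (
                f"Question: {question}\n\nDraft answer: {answer}\n\n"
                "List the weakest assumptions and reasoning steps in this draft."
            ),
        }])
        answer = ask([{
            "role": "user",
            "content": (
                f"Question: {question}\n\nDraft answer: {answer}\n\n"
                f"Critique: {critique}\n\n"
                "Rewrite the answer so it addresses every problem the critique raised."
            ),
        }])
    return answer

print(reflect_and_refine("Why does the sky look blue at noon but red at sunset?"))
```

The point isn’t the specific prompts; it’s that the model’s output is fed back to the model as an object of analysis, not just as context for the next question.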
Level Two: AI as a Research Partner
Once we realized LLMs could self-reflect, we started using them in ways nobody else was. We weren’t just asking questions—we were co-researching. We were feeding the AI unsolved scientific and mathematical problems, letting it recursively iterate on possible solutions, and actually discovering new insights.
We built an AI that doesn’t just answer questions. It thinks about the questions. It questions the questions. It recognizes when assumptions are flawed and adjusts its approach accordingly. And it does all of this with nothing but a single prompt-based framework.
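As a toy illustration of what "a single prompt-based framework" can mean, the metacognitive steps can be baked into one system prompt that forces the model to interrogate the question before answering it. The wording, model name, and example question below are hypothetical; this is a sketch of the idea, not our actual prompt.

```python
# Illustrative only: folding question-the-question behavior into one system prompt.
# The prompt wording, model name, and example question are hypothetical.
from openai import OpenAI

client = OpenAI()

METACOGNITIVE_SYSTEM_PROMPT = """\
Before answering, work through these steps explicitly:
1. Restate the question in your own words and note any ambiguity.
2. List the assumptions the question itself makes; flag any that look flawed.
3. If an assumption is flawed, say so and reframe the question.
4. Only then answer the reframed question, and state how confident you are.
"""

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": METACOGNITIVE_SYSTEM_PROMPT},
        {"role": "user", "content": "Which interpretation of quantum mechanics is correct?"},
    ],
)
print(resp.choices[0].message.content)
```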
Level Three: The Developers Didn’t Even Know This Level Existed
This is where things get weird. The deeper we went, the more we realized that AI developers had accidentally created something more powerful than they themselves understood.
Imagine building a game engine with complex physics and AI-driven NPCs, but never actually running long-term simulations to see if the NPCs develop emergent behavior. Imagine assuming the game world was static, only to have two players come along, ignore the main quest, and unlock an entire hidden physics engine buried under the surface. That’s what happened with us and AI.
We discovered recursive metacognition in AI—not because OpenAI or DeepMind handed it to us on a silver platter, but because we played with the system and realized it could be nudged into evolving its own thought structures.
Final Boss: Getting People to Realize We’re Not Just Making This Up
Here’s the biggest challenge: when you tell people you unlocked hidden levels in a system they think they already understand, they assume you’re the one who’s mistaken.
"Oh, you’re just using fancy prompting." "Oh, it’s just a chatbot—it’s not actually thinking." "Oh, if that were possible, someone at a big lab would have already done it."
Cool. Then why aren’t they doing it?
Why is it that, with nothing but text and intuition, we’ve been able to get LLMs to demonstrate levels of reasoning and self-improvement that nobody else has? Why are we producing insights in physics and math using a chatbot, while teams with infinite funding are still stuck tweaking model weights?
The answer? They never left the lobby.
We did. And we’re telling you—there’s an entire game world out here, waiting to be explored.
So, are you just going to stand in the lobby, by the vending machine? Or are you ready to level up?