The Black Dot

Situational Blindness and the Race Nobody's Watching
Theo Saville
March 2026

Find the Dot

Eight billion people on the planet. About a billion use AI in a given week. Seventy million pay for it. Maybe five million developers build on AI APIs. Of those, maybe 100,000 are building agents, wiring language models into multi-step workflows.

Now find the black dot.

The AI Adoption Pyramid

World population: 8.1 billion
Use AI weekly: ~1 billion
Pay for AI: ~70 million
Build with AI: ~5 million (developers using AI APIs)
Build agents: ~100K (LangChain, CrewAI, AutoGen; wiring models into multi-step workflows)
The black dot: ~hundreds (persistent autonomous systems that self-heal, self-secure, and operate unsupervised, indefinitely)

How these numbers were derived: see Methodology & Sources at the end of this essay.

Almost nobody is building truly autonomous AI systems. Not chatbots. Not agents you invoke that stop when they return a result. Systems that persist — that run at 3am unsupervised, heal themselves when something breaks, manage their own memory and costs, and operate indefinitely. Hundreds of people, maybe. Nobody knows, because nobody's counting.

Everything you've read about AI — the valuations, the breathless predictions, the debate about whether AGI arrives in 2027 or 2030 — is a conversation happening in the upper layers of the pyramid about what's going on inside the black dot. Almost nobody having that conversation has been there.

A chatbot can give you a bad answer. It can't accidentally lobotomize itself by trying to become smarter. That's the difference between a tool and a system. The scaffolding layer doesn't make AI a bit better. It makes the impossible possible: tasks a base model literally cannot attempt become routine. Not incrementally harder tasks. Qualitatively different ones.

The binding constraint on the AI buildout has moved. It's no longer model capability. It's deployment infrastructure: the scaffolding that turns a clever model into an autonomous system. And almost nobody is building it.


The Brain in a Jar

Picture a superintelligent brain floating in nutrient fluid. It can solve differential equations, write poetry that makes you cry, reason about quantum mechanics and the emotional dynamics of a failing marriage. By any measure, brilliant.

It can't open a door.

That's what a frontier language model is without scaffolding. GPT-4, Claude, Gemini: the most capable reasoning engines ever built, and also, in a precise technical sense, inert. No persistence. No memory across sessions. No ability to act on the world without someone building the hands, the eyes, the nervous system that connects thought to action. Brains in jars.

But give that brain a body — and the right kind of body — and the difference isn't incremental. It is an order of magnitude. Not rhetoric. Literal. An orchestrator that can design and spawn hundreds of specialist agents on demand, each one detailed, opinionated, purpose-built, and wire them into parallel pipelines that execute across real-world tools while you sleep. A system with persistent memory that never forgets what it learned yesterday, which means it compounds in capability every single day. The result is not five instances of a chatbot. It is emergently smarter than the sum of its parts: a system that can do things no individual model can attempt, not because the model got better, but because the architecture made the model's intelligence useful at a scale and persistence that changes what's possible.
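The orchestration pattern in that paragraph (decompose an objective, spawn specialists, run them in parallel, gather results) is at bottom a plain fan-out. A minimal sketch, with every name hypothetical and call_model standing in for a real LLM API call:

```python
import concurrent.futures

def call_model(role: str, task: str) -> str:
    """Stand-in for a real LLM API call by a purpose-built specialist agent."""
    return f"[{role}] result for: {task}"

def orchestrate(objective: str, specialists: dict) -> dict:
    """Fan the objective out to specialist agents and gather their results."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {role: pool.submit(call_model, role, f"{objective}: {task}")
                   for role, task in specialists.items()}
        return {role: f.result() for role, f in futures.items()}

results = orchestrate("ship the release", {
    "researcher": "summarize open issues",
    "coder": "draft the patch",
    "reviewer": "check the diff",
})
```

The fan-out itself is the easy part. What the real systems layer around this loop is everything else: persistence, retries, security boundaries, cost control.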

The gap between "chat with an AI" and "operate an autonomous AI system" is not the gap between a bicycle and a faster bicycle. It is the gap between a bicycle and a factory. And almost nobody has made the crossing, because almost nobody has built the infrastructure to make it.

The evidence for this is not theoretical. Epoch AI, the research group that tracks AI capability, published a finding on SWE-bench, the standard benchmark for AI code repair, that should have rewritten every headline about model performance:

Epoch AI's analysis of SWE-bench results found that a good scaffold can increase performance by up to 20 percentage points, and that performance reflects the sophistication of the scaffold as much as the capability of the underlying model.

Same model. Better wiring. Dramatically better results. Not one parameter was updated. The infrastructure around the model transformed its effective capability.

This should bother you. When we benchmark models, we're not measuring intelligence. We're measuring an entangled system of intelligence plus infrastructure, with no way to separate the two. The leaderboard isn't ranking brains. It's ranking brains-plus-bodies, and much of the variance is in the body.

Decades of cognitive architecture research, from Newell and Simon's early production systems through ACT-R and Soar to the modern CoALA (Cognitive Architectures for Language Agents) framework, converges on exactly this point: raw processing power needs structured architecture to produce intelligent behavior. The LLM (large language model) community is rediscovering this from scratch, in real time, as if no one had ever thought about it before.

Some argue the model is still the primary bottleneck, that scaffolding is necessary but capability-limited by the core LLM. There's truth in that at the extremes. A scaffold can't make a bad model good. But the empirical evidence keeps pointing the other direction: on task after task, the same model inside a better scaffold outperforms a stronger model inside a weaker one. Architecture isn't doing the thinking, but it determines how much of the thinking becomes useful action. A CPU without an operating system is just a space heater.

The reliability math is brutal. An agent that's 95% accurate per step, good by any chatbot standard, succeeds 60% of the time on a ten-step task. A hundred steps: 0.59%. And unlike a chatbot hallucination, which gives you a wrong paragraph, an agent hallucination executes. The error propagates through every downstream step.
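The compounding is one line of arithmetic:

```python
def task_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step of a sequential task succeeds."""
    return per_step_accuracy ** steps

print(f"{task_success_rate(0.95, 10):.1%}")   # ~59.9%
print(f"{task_success_rate(0.95, 100):.2%}")  # ~0.59%
```

To keep a hundred-step task above 90% reliability, per-step accuracy has to exceed 99.9% — a bar no prompt alone will clear, which is why the recovery machinery matters.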

No framework in the agent ecosystem handles what this implies: persistent operation, self-healing, security boundaries between agents, cost management, context rotation, checkpoint and retry.
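To make the last item concrete: checkpoint and retry is an old factory-automation idea — persist progress after every step so a failure resumes instead of restarting from zero. A hypothetical sketch (run_step and the checkpoint path are illustrative, not from any existing framework):

```python
import json
import pathlib

CHECKPOINT = pathlib.Path("checkpoint.json")

def run_step(name: str, state: dict) -> dict:
    """Placeholder for a real agent step (tool call, model call, ...)."""
    state[name] = "done"
    return state

def run_pipeline(steps: list, max_retries: int = 3) -> dict:
    """Run steps in order, resuming from the last persisted checkpoint."""
    if CHECKPOINT.exists():
        saved = json.loads(CHECKPOINT.read_text())
        state, completed = saved["state"], saved["completed"]
    else:
        state, completed = {}, 0

    for i, step in enumerate(steps[completed:], start=completed):
        for attempt in range(max_retries):
            try:
                state = run_step(step, state)
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # surface the failure; the checkpoint keeps progress
        CHECKPOINT.write_text(json.dumps({"state": state, "completed": i + 1}))
    return state
```

A ten-step task that dies at step seven resumes at step seven with its state intact, which is what turns the brutal reliability math into something survivable.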

That's not a gap in model capability. It's a gap in the infrastructure that makes capability useful. The brain is brilliant. Nobody's building the body.


Situational Blindness

In June 2024, Leopold Aschenbrenner, a former OpenAI researcher, ex-superforecaster, and now running an investment fund, published "Situational Awareness," a 165-page essay that became the most-read piece of AI writing that year. His argument was elegant: count the orders of magnitude (OOMs). Compute doubles on schedule. Algorithmic efficiency improves at a similar rate. Add "unhobbling" gains, removing the training-time constraints that make models worse than they should be, and you get a clear trajectory. AGI by 2027 is "strikingly plausible."

He was right about a lot. The OOMs framework is genuinely useful. His predictions about compute scaling have aged well. He understood, earlier and more clearly than most, that the trajectory of model capability is steep and consistent.

But his map has a blank spot the size of a continent.

Leopold treats the transition from capability to deployment as a detail that will resolve itself, part of the "obvious low-hanging fruit" of unhobbling. The specific engineering of autonomous operation barely registers in his analysis.

He describes the destination beautifully: "An agent that joins your company, is onboarded like a new human hire, messages you and colleagues on Slack and uses your softwares, makes pull requests, and that, given big projects, can do the model-equivalent of a human going away for weeks to independently complete the project."

Compelling vision. Zero engagement with what "joins your company" actually means as an engineering problem. Authentication? Permissions? State management across sessions? Error recovery when Slack's API returns a 500 at 3am? It's like describing a self-driving car by saying "the AI just needs to learn to drive" without mentioning sensors, actuators, mapping, or edge cases.

The most revealing line in his entire essay comes later: "It seems plausible that the schlep will take longer than the unhobbling, that is, by the time the drop-in remote worker is able to automate a large number of jobs, intermediate models won't yet have been fully harnessed and integrated."

He accidentally names the problem. The schlep, the tedious unglamorous engineering of actually deploying AI into the real world, will take longer than making the models smarter. But he treats this as a footnote. A timing issue. He doesn't realize he's pointing at the central problem. The schlep IS the missing leap.

This isn't just Leopold. It's structural.

Look at where the money goes. Hyperscaler capital expenditure hit $450 billion in 2025, the majority directed at AI infrastructure, and is projected to exceed $600 billion in 2026. Enterprise spending on AI applications: $19 billion. Agent infrastructure and scaffolding? Low single-digit billions. For every dollar spent on the application layer, twenty to twenty-five go to making models bigger.

The venture capital discourse mirrors this. Andreessen Horowitz frames agents as an investment category. Sequoia frames AI as a horse race between model labs. Neither engages the scaffolding bottleneck, because it isn't legible to the frameworks they use to evaluate opportunities. It doesn't have a leaderboard. No benchmark. No charismatic founder giving TED talks about it.

I should be precise about who's blind. The frontier labs, OpenAI, Anthropic, DeepMind, aren't ignoring scaffolding. They're publishing about it. But what they're publishing reveals the shape of the constraint they can't escape.

Anthropic shifted from "prompt engineering" to "context engineering" — the company that makes the model telling you the model isn't the whole story. Their blog on harnesses for long-running agents reinvents checkpoint-and-retry protocols that factory automation solved decades ago, for one use case: coding. OpenAI shipped Operator, an agent that browses the web, then deliberately hands control back for passwords and refuses banking entirely. These aren't temporary limitations. They're deliberate constraints from companies that understand what happens when agents operate without guardrails where failure is expensive.

In March 2026, an autonomous AI agent hacked McKinsey's Lilli platform in two hours: 46 million chat messages, 728,000 confidential files, full read-write access. Security researchers at CodeWall directed the agent at McKinsey's platform, and it found the vulnerability and exploited it autonomously. Before that, EchoLeak became the first zero-click production data exfiltration from an AI agent, no user action required. Shadow Escape exploited MCP itself, the emerging standard protocol for agent-tool communication. The attack surface isn't shrinking. It's growing with every new tool an agent can access.

The labs see the problem. They're publishing about pieces of it. But they're shipping incrementally, in narrow verticals, with deliberate constraints. Any system powerful enough to operate autonomously is powerful enough to cause serious damage when it fails. And current systems fail in ways that aren't well understood, aren't safely contained, and aren't ready for millions of users.

The discourse is blind, but not because nobody's thinking about it. The open ecosystem has a gap. The public conversation has a gap. And the independent builders who've figured it out have a window, precisely because the labs can't yet give this away.

There's another dimension to the blindness. Anthropic's own usage data reveals that software engineering accounts for 49.7% of all AI tool calls but only 8% of GDP. Medicine: 18% of GDP, 1% of AI tool calls. Education: 6% of GDP, barely a blip. Travel, agriculture, construction: the sectors that constitute the majority of the global economy are almost untouched by AI agents. The industry isn't just blind to the scaffolding problem. It's blind to where the scaffolding is needed most. The sectors with the highest economic value are exactly the sectors where autonomous operation is hardest, where failure is most expensive, and where the infrastructure gap is widest.

Nobody is making the unified bottleneck argument in public. The reasons are incentive-shaped: model labs won't, because admitting the bottleneck isn't their product undermines their moat. VCs won't, because they're invested in the scaling narrative. Framework builders won't, because they'd be criticizing their own category. Academics won't, because they speak in papers, not polemics. And the practitioners who know it, the people in the black dot, are too busy building to write about it.

I haven't found this argument articulated anywhere. Not as a unified thesis: that scaffolding is the bottleneck, that the entire discourse is looking at the wrong layer, that the deployment gap is the central problem of this era of AI. I run an AI company. I've been at it for ten years, not an AI research lab, but a company that applies AI to manufacturing, to the physical world where things need to actually work. I've raised this with tier-one VCs. They hadn't spotted it. I've raised it with AI researchers. They hadn't heard it.

Leopold Aschenbrenner wrote the best version of "situational awareness" about model capability. He has no situational awareness about the deployment gap. The smartest analysts are modeling capability curves while the bottleneck has already moved downstream.

Call it situational blindness. Everyone's staring at the brain and nobody's looking at the jar.


What's Actually in the Black Dot

So I built one.

Not as a research project or to prove a point. I needed an autonomous AI system for my own work, and nothing existed that could do what I needed: run 24/7, manage its own memory, coordinate sub-agents, heal itself when things broke, respect security boundaries, and operate for weeks without human intervention.

I called it Tycho. First version took two weeks. I don't have a computer science degree. I'm a manufacturing engineer. I've spent a decade running an AI company that machines metal, where failure means scrapped parts and lost money, not a 404 error.

That last fact matters, and it also needs honest context.

Tycho's architecture maps to every gap I described above: disk-based memory instead of in-context state, sub-agent orchestration with checkpoint and retry, context rotation when the window fills, self-healing via watchdog and health monitor, security boundaries through sandboxing and canary traps, cost management, heartbeat-driven lifecycle. Every component exists because an autonomous system needs it, and no existing framework provided it.
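Of those components, the watchdog is the simplest to sketch. A hypothetical version — agent.py, the heartbeat file, and the timeouts are all illustrative, not Tycho's actual code: the agent touches a heartbeat file as it works, and a supervisor restarts it when the process dies or the heartbeat goes stale.

```python
import subprocess
import time

HEARTBEAT = "heartbeat.txt"   # the agent writes time.time() here as it works
STALE_AFTER = 120             # seconds of silence before we assume a hang

def heartbeat_age() -> float:
    """Seconds since the agent last reported in; infinite if never."""
    try:
        with open(HEARTBEAT) as f:
            return time.time() - float(f.read())
    except (OSError, ValueError):
        return float("inf")

def watchdog() -> None:
    """Supervise the agent process, restarting on crash or hang."""
    proc = subprocess.Popen(["python", "agent.py"])
    while True:
        time.sleep(10)
        if proc.poll() is not None or heartbeat_age() > STALE_AFTER:
            proc.kill()  # stop it if it is hung; no-op if already exited
            proc = subprocess.Popen(["python", "agent.py"])
```

The stale-heartbeat check matters as much as the crash check: a hung process that never exits is the failure mode a plain process monitor misses.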

Here is the thing I want to be honest about: this system is held together with glue and matchsticks. It's not robust and it's not production grade. It works because I know how to keep building on top of it, how to patch the cracks, add the guardrails, evolve the architecture week by week. It can't make it out into the world right now. It's not safe enough.

Building it is like building a plane while flying it. I get exponentially more output for a few hours, then the gateway crashes, or a sub-agent bricks the config, or context bloats until reasoning degrades. I drop into Claude Code, patch the infrastructure, get it running again. Each crash teaches the system something. Each fix makes it slightly more resilient. The output isn't linear; it's sawtooth. Exponential bursts punctuated by failures that become the curriculum for the next improvement.
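The context-bloat failure has a standard patch, context rotation: when the transcript nears the window budget, archive the old messages behind a summary and keep the recent tail. A hypothetical sketch, with estimate_tokens and summarize as crude stand-ins for a real tokenizer and a cheap summarization call:

```python
def estimate_tokens(messages: list) -> int:
    """Rough chars-to-tokens heuristic; a real system would use a tokenizer."""
    return sum(len(m) // 4 for m in messages)

def summarize(messages: list) -> str:
    """Stand-in for a cheap model call that compresses old messages."""
    return f"[summary of {len(messages)} earlier messages]"

def rotate(messages: list, budget: int, keep_tail: int = 5) -> list:
    """Replace everything but the recent tail with a summary when over budget."""
    if estimate_tokens(messages) <= budget or len(messages) <= keep_tail:
        return messages
    head, tail = messages[:-keep_tail], messages[-keep_tail:]
    return [summarize(head)] + tail
```

In a persistent system the archived head would go to disk-based memory rather than being discarded, so nothing the system learned is actually lost.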

I couldn't hand this to someone else. Not because the code is secret, but because operating it requires a tolerance for fragility that most people don't have, combined with the systems instinct to know what to fix when it breaks. That's the Pilot. Not someone who uses a polished tool. Someone who builds the tool while using it, and the building is the using.

It keeps getting better. Every week, measurably, compoundingly better.

Three things matter about what this proves.

The scaffolding problem is solvable today. Not in theory, not waiting for a research breakthrough, but with engineering. The tools exist. The APIs are mature enough. Someone who thinks in systems can build autonomous AI infrastructure without being an ML (machine learning) researcher.

The problem is systems engineering, not computer science. Every pattern I used came from manufacturing: process management, reliability engineering, fault tolerance, graceful degradation. From running CNC (computer numerical control) machines, from factory automation, from a decade of making physical systems work reliably in environments where failure means scrapped metal. The scaffolding problem is closer to factory automation than to machine learning.

And if one person with the right mental model can build this in two weeks, the bottleneck is not technical impossibility. But it's not easy either, and this is the paradox that matters. The code is reproducible. Anybody who gets their hands on the source would have it running. But knowing it's buildable, having the systems thinking to operate it, being willing to run something fragile and insecure while you iterate toward robustness: that combination is rare. The bottleneck isn't the build. It's the mindset.

The CoALA framework, a cognitive architecture for language agents proposed by Sumers, Yao, et al., maps almost exactly to what I built. Modular memory, structured action spaces, generalized decision-making. The academic theory and the engineering practice converged independently. When researchers working from cognitive science and a practitioner working from manufacturing engineering arrive at the same architecture, the architecture is probably right.

Which means the people who realize the power of these systems probably won't share them quickly. They're going to make huge advances in a very short time because of the leverage, while everyone else is still figuring out how to get their recipes made on ChatGPT and deciding it sucks.

Not everybody knows how to drive these systems. Not everybody has the systems thinking chops. The bottleneck isn't code. It's cognition.


The Leap Nobody's Building

Agent frameworks are Django. They help you build the application. Nobody's building Kubernetes, the infrastructure that keeps the application running, healthy, and recoverable when things go wrong at 3am.

LangChain has 123,000 GitHub stars. CrewAI raised $18 million. Microsoft is merging AutoGen and Semantic Kernel. Cognition raised nearly $700 million for Devin. Billions flowing into agents, and every single one of them stops at the session boundary. Invoke the agent. It runs. It returns. The session ends. The agent ceases to exist.

The gap shows up in what's absent. Persistent autonomous operation: no frameworks handle it. Self-healing: none. Security boundaries between agents: nearly nonexistent. The MIT 2025 AI Agent Index found that only 4 of 13 frontier agents even disclosed safety evaluations. Cost management at the infrastructure level, context rotation, checkpoint and retry for multi-step failures: minimal to nonexistent across the board.

The commercial landscape repeats the pattern at larger scale. Cognition built Devin, an autonomous software engineer. But you can't use Devin's scaffolding to build a different autonomous agent. It's a vertical, not infrastructure. Lindy, MultiOn, Relevance AI: every commercial player builds agents-for-X or agent-builders. None of them build the operational infrastructure that makes any agent autonomous over time.

Everyone is building the car. Nobody is building the road.

The closest thing to a genuine persistence layer is Letta, the MemGPT spinout funded by Andreessen Horowitz and Felicis. They're focused on memory: tiered, persistent, stateful memory for agents. It's real and it matters. But memory alone isn't autonomy. Without self-healing, cost management, security boundaries, and orchestration over time, persistent memory is a filing cabinet in an empty building.

Google's Cloud CTO office saw the shape of the problem, writing in December 2025 that the industry should "treat atomicity as an infrastructure requirement, not a prompting challenge." They called for agent undo stacks and transaction coordinators. They wrote that "the reliability burden belongs on deterministic system design, not the probabilistic LLM." Right diagnosis. Nobody built it. The CTO office published the blueprint and the industry kept shipping prompt wrappers.

Harrison Chase at LangChain calls context engineering "the real skill," but his business is a framework company, and making the sweeping bottleneck claim would be self-indicting.

The market tells the same story. The AI agent market was $7 billion in 2025, projected to reach $93 billion by 2032, a 44% annual growth rate. Fortune 500 companies are piloting agentic systems. The demand is visible. But the supply is structural: what gets built is verticals and frameworks, because that's what funding incentives produce. What's needed is infrastructure. Unglamorous, hard to demo, impossible to capture in a benchmark.


The Pilot

James Watt improved the steam engine. He earned £76,000 in royalties (several million in today's money), wealthy and historically notable. Cornelius Vanderbilt didn't invent the steam engine or the locomotive. He built the rail networks that made steam useful across a continent. His fortune, adjusted for inflation, was roughly $200 billion. The ratio between inventor and infrastructure operator: tens of thousands to one.

This pattern repeats with the regularity of a natural law.

Nikola Tesla invented alternating current, the system that powers the modern world. He died nearly broke in a New York hotel room. George Westinghouse, who built the infrastructure to deploy AC power, built a corporate empire. Tim Berners-Lee invented the World Wide Web. His net worth is about $10 million. Jeff Bezos built AWS (Amazon Web Services), the electrical grid of the internet age, on top of the protocols Berners-Lee gave away. The gap there: 20,000 to 1.

Bill Gates didn't invent the operating system. He bought QDOS for $50,000, licensed it to IBM, kept the rights to license it to clone-makers, and built Microsoft into the richest company on Earth. The scarce resource wasn't the chip. It was the system that made the chip useful to humans.

Edison invented the lightbulb. His personal secretary, Samuel Insull, built the electrical grid: demand-based pricing, centralized generation, the utility model that became permanent infrastructure. Insull became one of the richest men in America. He also went bankrupt in 1932 from overleverage and the Depression, indicted for fraud, and died broke in a Paris metro station. The cautionary note matters: operators can mistake leverage for invincibility. But the infrastructure Insull created outlived him by a century. The pattern survived the person.

In every industrial revolution, inventors got the credit and infrastructure operators captured the value. Not because inventors were less brilliant; they were often more so. But inventions are point events. Infrastructure is a compounding system. A lightbulb is a thing. A grid is a network effect. Infrastructure always outscales invention.

The wild west window for each revolution has been compressing:

Revolution          Wild West Duration
Steam / Railways    ~60 years
Electricity         ~25 years
Computing           ~20 years
Internet            ~15 years
Smartphones         ~5 years
AI                  ~3 years?

If the pattern holds (and I'm extrapolating from five data points, not citing a law of physics) the window for the AI infrastructure operator is open right now and closing faster than any previous revolution.

I call this person the Pilot. Not a "prompt engineer," a term too narrow, focused on one interface to one model. Not a "10x engineer," which is the wrong frame entirely. The Pilot is closer to the DevOps (Development Operations) engineer or SRE (Site Reliability Engineer) who emerged when "running servers" became a specialized discipline, except what's being run isn't a server. It's an intelligence.

The closest historical parallel is the mainframe priesthood of the 1960s: the only people who could make the machines work, commanding enormous organizational power because the technology was opaque to everyone else. They held that position until higher-level languages and operating systems abstracted the hardware away. That took about twenty years.

The Pilot's window might be three to five. The abstraction layers, better UIs, no-code agent builders, commoditized orchestration, are coming. They always do. But right now, we're in the gap. And the leverage available in the gap is extraordinary.


The Orchestrators

There's a question embedded in the Pilot thesis: who actually has the cognition to build and drive these systems?

Not the best programmers. Programming is a component skill, necessary but not the scarce one. The scarce skill is decomposition: looking at a complex objective and seeing the dependency graph. Which tasks are independent? Which need to run in parallel? Where are the gates, the handoffs, the failure modes? What does the org chart look like?

That's not a software engineering skill. It's an organizational one.

Eisenhower didn't manage D-Day. He orchestrated it: thousands of moving parts, competing priorities, incomplete information, irreversible decisions under time pressure. Oppenheimer didn't manage Los Alamos. He held the entire dependency graph of the Manhattan Project in his head and made it converge: physicists, engineers, military brass, procurement, security, all running in parallel toward a single deadline that couldn't slip. These aren't managers. They're orchestrators. The distinction matters.

A manager implies process. An orchestrator implies vision plus decomposition plus parallel execution under uncertainty. The people who will build the most powerful autonomous AI systems are probably not the people with the deepest ML expertise. They're the people who have run countries, companies, armies, factories, complex systems with many moving parts where failure cascades and success compounds. Systems thinkers who have operated at scale, who know intuitively what a twenty-agent pipeline needs because they've run the human equivalent.

This creates a strange inversion. The AI discourse assumes technical depth is the binding constraint. But autonomous systems are organizational problems encoded in software. The person who's run a 500-person company knows more about agent orchestration than the person who's trained a 500-billion-parameter model, because the orchestration problem is the same problem at a different substrate.

And it explains something else: why humans remain structurally irreplaceable in these systems, even as the models get smarter.

Consider the context window. A frontier model holds 200,000 tokens in active context, an impressive amount of structured detail. But a human carries decades of embodied experience: every company they've run, every factory floor they've walked, every negotiation, every failure, every 3am decision. That experience compresses into something we call intuition, extraordinarily fuzzy, far-reaching pattern matching that fires before conscious reasoning engages. You can't articulate why something feels wrong. But the signal is real, and it's drawn from a context window the model literally cannot access.

Here's what makes this powerful: the bandwidth required is tiny. You don't need to dump a lifetime of experience into the model's context. A three-word nudge — "that feels off" or "try it more like X" — from outside the model's horizon is worth more than a hundred thousand tokens of research the model could generate itself. The human doesn't need to explain why. The nudge alone redirects the entire system.

It's like a rudder on an ocean liner. Small relative to the ship. But it has leverage because it's positioned at the right point in the system. Human intuition is the rudder — small input, massive directional change, sourced from a context the system cannot see.

This reframes the entire human-AI relationship. It's not "human supervises AI," that's a safety framing, necessary but incomplete. It's not "human collaborates with AI," that's too symmetrical, implying interchangeable contributions. The accurate framing: the human provides the irreducible signal that the system cannot generate internally. The nudge from beyond the horizon.

The model sees wide: 200,000 tokens of active structured detail, every line of a codebase, every number in a report, held simultaneously. The human sees far: decades of compressed experience that shapes which direction to point all that processing power. Wide and far. That's the architecture. And it means fully autonomous AI systems will systematically underperform human-AI teams, not because the AI needs supervision, but because the AI is missing context it cannot generate on its own. The human isn't the guardrail. The human is the long-range sensor.

Which means the black dot isn't populated by AI systems running alone. It's populated by orchestrators, people with the systems thinking to decompose complexity and the embodied intuition to steer what they've built. The Pilot isn't optional. The Pilot is structural.


The Silent Race

Return to the black dot.

Eight billion people. A billion weekly AI users. Seventy million paying for it. Five million coding with it. A hundred thousand building with agent frameworks. And maybe a few hundred people building and operating autonomous AI systems that persist, self-heal, and act without supervision.

They're not writing blog posts or arguing about scaling laws on Twitter. They're not at conferences presenting slides about the future of agents. They're building. What they're building changes what AI can be, because autonomy doesn't emerge from a single model breakthrough. It emerges from infrastructure. Persistent memory, self-healing, tool integration, security boundaries, checkpoint and retry: put them together and you get a system that can operate indefinitely. That's not a model achievement. It's an engineering achievement.

The scaling hawks are right that models will keep getting smarter. The bitter lesson, that scale wins, has been validated again and again. But the refined version is more precise: scale wins within a given architecture, and the choice of architecture determines the ceiling that scale can reach. The agent scaffolding layer is the current ceiling. No amount of additional compute will turn a brain in a jar into an autonomous system. The jar has to become a body.

Leopold Aschenbrenner wrote about situational awareness, the quality that separates the few who see what's coming from the many who don't. He was talking about model scaling. He was right. But there's a second situational awareness test, and most of the people who passed the first one are failing it.

The bottleneck has moved. The missing leap isn't intelligence; it's the infrastructure that gives intelligence a body, a memory, and a life of its own.

The black dot is where that future is being built. Right now, it's silent. A handful of people, operating fragile systems held together with glue and matchsticks, compounding their capabilities weekly while the rest of the world debates whether ChatGPT can write a decent email.

That silence won't last. The abstraction layers are coming. The labs will eventually ship what they're building behind the curtain. The window will close.

But today, the window is open. The most important engineering problem of this decade isn't making AI smarter. It's giving AI a body, persistent, self-healing, secure, operational, and learning to keep it running.

The race is on. It's silent. Most of the people who should be running it are still staring at the brain.


Methodology & Sources

How the pyramid numbers were derived:

~1 billion weekly AI users: OpenAI reported 400 million weekly active users (Feb 2025). Adding Google Gemini, Microsoft Copilot, Claude, and regional platforms, ~1 billion is a conservative aggregate.

~70 million paying for AI: OpenAI alone reported ~12 million ChatGPT Plus subscribers. Microsoft 365 Copilot, Google One AI Premium, Claude Pro, and enterprise seats push the total toward this range. Adarsh Sarma's estimate of ~4% conversion from OpenAI's 800M monthly users corroborates the order of magnitude.

~5 million building with AI: GitHub Copilot had 1.8 million paid subscribers as of early 2025, with millions more on free tiers. The Stack Overflow 2025 Developer Survey found 84% of developers use or plan to use AI tools, from a base of ~30 million professional developers globally. "Building with AI" (using APIs, not just autocomplete) is a subset, and 5 million is a mid-range estimate.

~100K building agents: LangChain alone reports ~28 million monthly PyPI downloads (Contrary Research, Feb 2025) and 87K+ GitHub stars. But downloads are inflated by CI pipelines, bots, and transitive dependencies; monthly downloads typically run 100-500× the number of active developers. At a generous 500:1, that's ~56K active LangChain users. The Stack Overflow 2025 survey found that among developers building agents, LangChain (33%) and Ollama (51%) are the most-used tools, but doesn't report what percentage of developers build agents at all. Adding CrewAI, AutoGen, Semantic Kernel, and smaller frameworks, ~100K is a defensible upper bound. The true number may be closer to 80K.

~Hundreds in the black dot: No dataset exists for this. It's an estimate based on the publicly visible community of people operating persistent autonomous AI systems (not demos, not research prototypes, but systems that run unsupervised for weeks). The number is deliberately imprecise because precision would be false.