The Machine That Stopped Waiting
AI, Data Engineering, AdTech, Agents
A Blinking Cursor and Twenty-Five Years
The first computer I used properly had a black screen and a blinking cursor, and a game called Dave where a little guy ran across the screen while you sat there at eight or nine years old wondering what else this thing was capable of. That question never really went away.
Windows arrived and the machine got a face. You could point at things, click on things, drag them around. It felt friendlier, which mostly meant you now broke things with a mouse instead of a keyboard. Word, for essays that lost their formatting every time you opened them in a slightly different version. Excel, for whatever Excel was supposed to be for, since nobody I knew in Dehradun in the 90s was running a household spreadsheet. You learned by doing things wrong, undoing them, and doing them slightly less wrong the second time.
Then the internet arrived and everything changed shape. The dial-up modem made a sound like a small animal negotiating a trade deal, and then you were connected to something. In the early days that something wasn't much: slow pages, broken links, a web that was more promise than reality. But even then, you could feel it. A kid in Dehradun could read the same thing as a kid in London, on the same day, for the price of an hour-long phone call. That was access in a way that hadn't existed before.
Wikipedia was the moment it became real for me. It wasn't always right, which anyone who quoted it in a class presentation found out in front of everyone faster than they'd have liked, but the fact that it existed at all felt like someone had taken the question "what if you could know anything?" seriously for the first time. You looked something up, you found it, and that itself was remarkable.
Broadband arrived and the wait disappeared, which sounds small but wasn't. When information loaded instantly, the barrier between question and answer collapsed from minutes to seconds. BSNL, the Indian telecom, had a plan: twenty dollars a month, with limits, but after 9pm the capacity was supposedly unlimited. That felt like a gift. A month later, a two-hundred-dollar bill arrived. The investigation into how it happened was brief. Half the usage was mine. The other half belonged to my mother, who had quietly discovered YouTube and was catching up on every TV serial she'd ever missed, one episode at a time, from 9pm until she felt like stopping. Nobody was wrong. We'd just both assumed we were the only one staying up past nine.
The forums showed up in earnest around the same time: Stack Overflow for coding problems, tech blogs for understanding what was actually happening under the hood. The rule of that era was: whatever problem you have, someone has already had it, written about it, and argued about it in a comment thread. Your job was to find the right thread.
Mobile took it the rest of the way. The machine went from a room to your pocket, which meant the question "what does this thing know?" became nearly inseparable from "what do I know?" You stopped distinguishing between things you remembered and things you could look up in ten seconds, which is either worrying or efficient depending on who you ask.
My daughter is ten months old and can operate a laptop in locked mode. What that means in practice: she smashes both hands on the keyboard with no particular agenda and ends up playing music on Spotify. Every single time. I cannot explain how she finds it. The machine has become intuitive enough that someone with a vocabulary of one word and no concept of what a laptop is can accidentally open a playlist.
That whole arc, from the blinking cursor to a baby accidentally opening Spotify, took about twenty-five years. Every step felt significant at the time, and every step required you to meet the machine partway: to learn its language, its commands, its interfaces, its search syntax. The machine kept getting better, but it was still fundamentally waiting for you to know what to ask and how to ask it.
Then, in late 2022, something changed. The machine stopped waiting.
What Changed, and What Didn't
There's a version of AI most people know: you type something, it writes back. You ask it to explain a concept, it explains the concept. You ask it to draft an email, it drafts the email. Useful, no question, but still fundamentally a conversation, and like any conversation, the moment it ends, nothing happened except that words were exchanged.
An AI agent does something different. You ask it to check whether your data pipeline has been dropping records over the last 72 hours, and instead of explaining how you might go about checking that, it actually runs the query, reads the result, notices the anomaly, traces it back to a schema change two days ago, and files a ticket. You come back to a ticket with the diagnosis already attached. The conversation never happened. The work did.
That's the shift that everyone in AI is talking about, even if they're using different words for it.
A Timeline That Actually Matters
The first AI systems I encountered professionally weren't really intelligent at all. They were rule engines, long chains of if/else logic that someone had spent months encoding by hand. If the user matches this segment, show this ad. If the bid price crosses this threshold, pull back spend. They worked exactly as well as the person who wrote the rules, which is to say they worked well for the situations they'd anticipated and catastrophically for everything else. The moment reality drifted from the assumptions baked in, the whole thing fell apart quietly, in ways that showed up much later as unexplained revenue drops or misfired targeting.
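To make the brittleness concrete, here is a minimal sketch of what such a rule engine tends to look like. The segment names, the `MAX_BID` threshold, and the action labels are all invented for illustration, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class BidContext:
    segment: str
    bid_price: float


# Hypothetical threshold, chosen only for this example.
MAX_BID = 2.50


def decide(ctx: BidContext) -> str:
    """Hand-written rules: every branch is a situation someone anticipated."""
    if ctx.bid_price > MAX_BID:
        return "pull_back_spend"
    if ctx.segment == "sports_fans":
        return "show_sports_ad"
    if ctx.segment == "new_parents":
        return "show_baby_ad"
    # Anything nobody anticipated falls through to a default, silently.
    return "show_default_ad"
```

The last line is where the quiet failures live: a segment nobody wrote a rule for still gets an answer, just not a considered one.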
Machine learning flipped the premise. Instead of telling a system what to do, you showed it examples, gave it outcomes, and let it figure out the pattern. A model trained on a year of campaign performance data could learn things about bid optimization that no human would have thought to encode as a rule. The catch was that these models were essentially black boxes: they improved measurably, but asking them why they'd made a particular decision got you a shrug expressed in floating-point numbers. Good for predictions, not helpful for understanding.
What broke the field open was a 2017 paper from researchers at Google called "Attention Is All You Need," which introduced the transformer architecture that underlies nearly every major language model today. It didn't make noise outside of research circles at the time. But the mechanism it described, the ability for a model to weigh relationships between every word in a sequence simultaneously rather than processing things one at a time, turned out to scale remarkably well with data and compute. The bigger you made the model and the more text you fed it, the better it got, in ways that didn't plateau the way earlier approaches had.
GPT-3 landed in 2020 and made it obvious that something had shifted. By the time ChatGPT launched in November 2022, there was no arguing with it: by most estimates it reached a hundred million users within about two months. For the first time, the average person had direct access to a system that could hold a coherent conversation, write functional code, explain complex ideas, and generally operate at a level that made it useful outside of a lab or a research paper.
That was the moment most people clock as the AI revolution, but the more significant shift happened a bit later and more quietly.
From Answering to Acting
The thing that makes a language model a language model is also what limits it: it generates text. Extraordinarily capable text, text that can describe a solution in precise detail, but text nonetheless. No action follows; nothing changes in the world.
Agents change that by giving the model real tools: the ability to run code, query a database, call an API, read a file, write a file, send a request. The model doesn't just describe what it would do; it does it, reads what comes back, decides what to do next, and keeps going until the task is done or it hits a wall it can't get past.
What makes this feel different from ordinary automation isn't the actions themselves; it's the judgment applied to them. A traditional script runs a fixed sequence of steps. An agent reasons about what sequence of steps to take, and adapts when something unexpected comes back. If the query returns an error, it reads the error and decides whether to rewrite the query, try a different approach, or escalate. If the data looks wrong, it pauses and investigates rather than proceeding on bad information.
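The act, observe, decide loop described above can be sketched in a few lines. This is a toy under stated assumptions: `choose_next_action` is a hard-coded stand-in for the language model, `run_query` is a fake tool that simulates a typo'd table name, and none of the names come from a real agent framework.

```python
from typing import Callable


# Hypothetical tool. A real agent would wrap a database client here.
def run_query(sql: str) -> str:
    if "recrods" in sql:  # simulate a misspelled table name
        return "ERROR: table 'recrods' does not exist"
    return "rows=42"


TOOLS: dict[str, Callable[[str], str]] = {"run_query": run_query}


def choose_next_action(history: list[tuple[str, str, str]]) -> tuple[str, str]:
    """Stand-in for the model: looks at what came back and picks the next step."""
    if not history:
        return ("run_query", "SELECT count(*) FROM recrods")
    _, _, last_result = history[-1]
    if last_result.startswith("ERROR"):
        if len(history) >= 2:
            # Already retried once; hand the problem to a human.
            return ("escalate", last_result)
        # Read the error and rewrite the query.
        return ("run_query", "SELECT count(*) FROM records")
    return ("finish", last_result)


def run_agent(max_steps: int = 5) -> str:
    history: list[tuple[str, str, str]] = []
    for _ in range(max_steps):
        action, arg = choose_next_action(history)
        if action in ("finish", "escalate"):
            return f"{action}: {arg}"
        result = TOOLS[action](arg)  # act
        history.append((action, arg, result))  # observe
    return "escalate: step budget exhausted"
```

The step budget matters as much as the loop itself: an agent that hits a wall it can't get past should stop and say so rather than keep trying.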
In data engineering, a pipeline breaking at 3am is not a new problem. The new part is an agent that monitors the pipeline, notices it's broken, traces the failure through the DAG, identifies that a source table schema changed without notice, rolls back the downstream transformation to the last clean run, files a ticket, posts the context in Slack, and then opens a draft pull request with the fix. By the time anyone is awake, there's a ticket, a Slack message with the full context, and a proposed solution waiting for a human to approve. That's not a hypothetical anymore.
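The step where the agent spots an unannounced schema change is the easiest one to show. A minimal sketch, assuming schemas are available as plain column-to-type mappings; the column names and types are made up, and a real check would read them from the warehouse's catalog.

```python
def diff_schema(
    expected: dict[str, str], observed: dict[str, str]
) -> dict[str, list[str]]:
    """Compare the schema a pipeline was built against with what the source has now."""
    shared = expected.keys() & observed.keys()
    return {
        "missing": sorted(expected.keys() - observed.keys()),
        "added": sorted(observed.keys() - expected.keys()),
        "retyped": sorted(c for c in shared if expected[c] != observed[c]),
    }


# Hypothetical schemas for illustration.
expected = {"user_id": "BIGINT", "event_ts": "TIMESTAMP", "amount": "DECIMAL"}
observed = {"user_id": "STRING", "event_ts": "TIMESTAMP", "amount_usd": "DECIMAL"}

drift = diff_schema(expected, observed)
```

Here `drift` reports `amount` as missing, `amount_usd` as added, and `user_id` as retyped, which is the kind of context worth putting in the Slack message.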
In adtech, the equivalent is audience maintenance. Segment logic decays: a behavioral signal that was a strong predictor of purchase intent six months ago may be noise now if the market has shifted or the underlying data sources have changed. An agent watching segment performance can notice the decay, test whether the segment logic still holds against current data, and propose updated criteria, in a loop that runs continuously rather than waiting for someone to schedule a quarterly review.
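One simple way to express "does the segment logic still hold" is lift: how much better the segment converts than the general population. The sketch below is an assumption about how such a check might be framed, not a description of any particular platform; the `MIN_LIFT` threshold and all the counts are invented.

```python
def conversion_rate(conversions: int, audience_size: int) -> float:
    return conversions / audience_size if audience_size else 0.0


def lift(seg_conv: int, seg_size: int, base_conv: int, base_size: int) -> float:
    """Ratio of the segment's conversion rate to the baseline population's."""
    base_rate = conversion_rate(base_conv, base_size)
    if base_rate == 0:
        return 0.0
    return conversion_rate(seg_conv, seg_size) / base_rate


# Hypothetical cutoff: below this, the segment is barely beating the baseline.
MIN_LIFT = 1.5


def has_decayed(current_lift: float, min_lift: float = MIN_LIFT) -> bool:
    return current_lift < min_lift


# Invented numbers: the same segment six months ago and today.
then = lift(300, 10_000, 1_000, 100_000)
now = lift(110, 10_000, 1_000, 100_000)
```

With these numbers the segment converted at roughly three times the baseline six months ago and only slightly above it now, so `has_decayed(now)` is true and the agent has something to propose a fix for.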
When Agents Work Together
A single agent is useful, but multiple agents working together is when things get interesting, and also when the mental model most people have of AI starts to break down.
The pattern that's emerged in practice looks a lot like a good team at work. There's an orchestrator agent whose job is to break a large task into parts and hand them off to specialist agents. Those specialists have narrower focus, better tools for their particular domain, and less noise to work through. A specialist that does nothing but write and debug SQL is better at that job than a generalist juggling ten things at once.
Consider a campaign audit. An orchestrator gets the brief: something is wrong with ROAS on this advertiser, figure out what. It breaks the investigation into parallel threads. One agent pulls and analyzes the bid log data. Another checks the audience segment membership counts against historical norms. A third reviews creative fatigue metrics. Each works independently, returns structured findings, and the orchestrator synthesizes those findings into a coherent picture of what's actually happening. If one agent's findings implicate something in another agent's territory, the orchestrator routes a follow-up question there.
The agents communicate through structured outputs, essentially contracts about what information will come back in what format. This is less about agents literally talking to each other the way people do, and more about a system where each agent's output is designed to be usable by the next one in the chain. The orchestrator can also spin up a verification agent whose only job is to check whether a claim made by a specialist actually holds, which means errors get caught before they compound.
That last part matters more than it sounds. The failure mode in most complex investigations isn't that nobody noticed something was wrong. It's that a plausible early finding narrowed the search and the real cause was never looked at. When every specialist's conclusion goes through an independent check before being acted on, you get a system that's harder to fool and harder to send down the wrong path.
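Here is a rough sketch of that pattern: specialists returning a structured finding, an orchestrator running them in parallel, and an independent check before anything is acted on. The three specialists return canned numbers, and the `Finding` fields and the 10% tolerance are assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Finding:
    """The contract every specialist returns."""
    agent: str
    claim: str
    metric: float    # value observed now
    baseline: float  # historical norm
    verified: bool = False


# Hypothetical specialists with canned results.
def bid_log_agent() -> Finding:
    return Finding("bid_logs", "win rate dropped", metric=0.08, baseline=0.15)


def audience_agent() -> Finding:
    return Finding("audience", "segment size shrank", metric=98_000, baseline=100_000)


def creative_agent() -> Finding:
    return Finding("creative", "CTR fell from fatigue", metric=0.004, baseline=0.009)


def verify(finding: Finding, tolerance: float = 0.10) -> Finding:
    """Independent check: the claim holds only if the metric moved more than the tolerance."""
    deviation = abs(finding.metric - finding.baseline) / finding.baseline
    return replace(finding, verified=deviation > tolerance)


def orchestrate(specialists: list[Callable[[], Finding]]) -> list[Finding]:
    # Run the specialists in parallel, then keep only findings that survive the check.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(s) for s in specialists]
        findings = [f.result() for f in futures]
    return [f for f in map(verify, findings) if f.verified]


confirmed = orchestrate([bid_log_agent, audience_agent, creative_agent])
```

In this run the audience finding is dropped, because a 2% change is within normal variation, while the bid log and creative findings survive. A plausible but weak early signal never gets the chance to narrow the search.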
What Actually Comes Next
The honest answer is that we're early. Agents are capable enough now to handle well-defined tasks with real data and real consequences, but they still make mistakes that a competent junior engineer wouldn't make, and they don't always know when they've hit the edge of their competence. The human in the loop is still load-bearing in most production systems, not because AI can't act autonomously, but because the cost of a confident wrong action in a live pipeline is higher than the cost of a one-second human approval.
What's changing is the shape of that supervision. A year ago, you'd review every action before it executed. Now, for well-understood tasks in constrained environments, agents run autonomously and surface exceptions for review rather than every step. The trust is building, task by task, as the track record accumulates.
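In code, supervising by exception can be as simple as an allowlist. This is a deliberately small sketch; the action names and the `AUTO_APPROVED` set are hypothetical, and a production gate would also log who approved what and when.

```python
# Hypothetical allowlist of well-understood, low-risk actions.
AUTO_APPROVED = {"rerun_failed_task", "refresh_cache"}


def route(action: str, review_queue: list[str]) -> str:
    """Execute trusted actions directly; queue everything else for a human."""
    if action in AUTO_APPROVED:
        return "executed"
    review_queue.append(action)
    return "needs_review"


queue: list[str] = []
results = [route(a, queue) for a in ("rerun_failed_task", "drop_table", "refresh_cache")]
```

The allowlist is where the track record shows up: as a task proves itself, it moves from the review queue into the set.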
The shift that's actually coming isn't AI that thinks faster than humans. It's AI that removes the category of work where a capable person spends most of their time doing things they're overqualified to do. The diagnosis stays human. The hundred steps between the diagnosis and the resolution get handled.
I've spent my career on the systems behind data pipelines and audience platforms, and the bottleneck has never been understanding what needs to happen next. It's been everything between knowing what to do and the point where it's done. That gap is what agents are closing, one task at a time.