The Closing Loop
Everyone agrees: agents will make observability faster. Query telemetry, get an insight, take an action — same loop we’ve always had, just with a quicker operator. The dashboard becomes an API, the human becomes an LLM, and the time between “something happened” and “we understand it” compresses from hours to seconds. That’s real, and it’s happening now.
But “faster” is the wrong frame. What’s actually happening is that loops are closing. The human isn’t speeding up. The human is moving from inside the loop to outside it.
That’s not acceleration. That’s a phase change.
two modes
Agents interact with telemetry in two fundamentally different ways. The conversation only covers one of them.
Analysis mode: an agent queries telemetry, reasons about it, produces an insight or an action. AI-driven investigation already does this, and it genuinely compresses the incident response cycle. But structurally it's the same loop — observe, interpret, decide — just with a faster operator. The paradigm doesn't change.
Continuous control mode: an agent is persistently coupled to live telemetry and steering the system in real time. It doesn’t wait for an incident. It doesn’t “investigate.” It reads signals and adjusts parameters in a tight, ongoing feedback loop — managing latency, error rates, capacity, traffic distribution, all the time, without a human in the loop.
Not “check the dashboard every 5 minutes” — always-on steering. Homeostasis.
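A continuous controller of this kind is structurally tiny. Here's a minimal sketch in Python: the plant model, the target, and the knob (a connection-pool size) are all invented for illustration, not taken from any real system.

```python
def p99_latency_ms(pool_size):
    """Toy plant model (assumption): p99 latency falls as the pool grows."""
    return 9000.0 / pool_size

def steer(pool_size=30.0, target_ms=150.0, gain=0.05, steps=200):
    """Always-on steering: every tick reads the signal, computes the
    deviation from the desired state, and nudges the knob against it.
    No incident, no investigation, just homeostasis."""
    for _ in range(steps):
        error = p99_latency_ms(pool_size) - target_ms
        pool_size += gain * error  # negative feedback: correction tracks the error
    return pool_size
```

With these numbers the loop settles where latency meets the target. The arithmetic isn't the point; the shape is. No human reads anything. The signal is a control input.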
This is already emerging. Deploy pipelines that monitor health signals after each release and roll back automatically when anomaly detection trips. Agents that instrument code, watch runtime logs, and iterate on fixes without waiting for a human to triage. These are early, crude versions of something bigger: agents that maintain persistent connections to telemetry streams and continuously adjust the systems they’re responsible for. Not in a year. Now.
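The rollback variant is the same loop with a binary actuator. A crude version of the anomaly check might look like this; the z-score heuristic and all names are mine, not from any particular pipeline.

```python
from statistics import mean, stdev

def should_roll_back(baseline_error_rates, post_deploy_error_rates, z_threshold=3.0):
    """Trip the rollback when the post-deploy error rate sits more than
    z_threshold standard deviations above the pre-deploy baseline."""
    mu = mean(baseline_error_rates)
    sigma = max(stdev(baseline_error_rates), 1e-9)  # guard against a flat baseline
    z = (mean(post_deploy_error_rates) - mu) / sigma
    return z > z_threshold
```

A real pipeline would use windowed streams and a better detector, but the structure is the same: telemetry in, actuation out, no triage step in between.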
cybernetics, originally
This second mode is what Norbert Wiener was actually describing when he coined the term “cybernetics” in 1948, in Cybernetics: Or Control and Communication in the Animal and the Machine — from the Greek kybernetes, steersman. Not analysis of data after the fact, but negative feedback: the output of a system fed back as input to minimize deviation from a desired state. Information, not energy, flowing through feedback loops to enable self-correcting behaviour. A steersman holds a course by swinging the rudder to offset deviations — the correction depends on the error, and the error depends on the correction.
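The steersman compresses into one equation (the notation here is mine, not Wiener's):

```latex
\frac{d\theta}{dt} = d(t) - k\,\theta(t)
```

where θ is the deviation from the desired course, d(t) is the disturbance (wind, current), and −kθ is the rudder's correction. For a constant disturbance the deviation settles at d/k instead of growing without bound. The correction is a function of the error, and the error a function of the correction; the whole discipline sits inside that circularity.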
Observability has always been cybernetic in this original sense. Real-time signals feeding back into real-time decisions. We just assumed the human was always the steersman. That assumption is breaking.
And when it breaks, the requirements change completely. A control loop that gets stale data steers wrong. Sub-second, streaming telemetry stops being a nice-to-have and becomes the product.
An agent in a control loop can’t afford to second-guess whether a metric is anomalous or just noisy — it needs calibrated signals it can trust without doing its own analysis. And when your observability layer is a control input, an outage doesn’t just mean you can’t see what’s happening. It means your autonomous systems go blind and start making bad decisions. The reliability bar is categorically different when telemetry is steering, not informing.
no observer
In a continuous control loop, there is no observer. There’s a system coupled to its own outputs. The data isn’t being “observed” — it’s being consumed as an input to a control function.
The word “observability” itself starts to feel like a misnomer. We named the field after the human in the loop. Remove the human, and what you have isn’t observation. It’s sensing. A nervous system for a machine that steers itself.
Nobody truly understands a large production system top to bottom. What teams actually want is help steering them — and increasingly, they want agents to do that steering. All the infrastructure built over the past decade to serve human observers — streaming data, time-series databases, real-time dashboards — turns out to be exactly what machine controllers need. Just not in the way anyone expected.
code review is the same pattern
Observability isn’t the only loop closing. Look at code review.
The pull request workflow is a human-in-the-loop ritual. Someone writes code, someone else reads it, they discuss, it gets merged. This made sense when humans were the verification mechanism — the fastest, most reliable way to catch bugs, share knowledge, maintain standards.
I know this is uncomfortable. Code review is sacred — reviewing code is what engineers do.
But an agent generating hundreds of changes a day breaks this, and not in the way a bottleneck breaks. A review queue that backs up because you can review ten PRs a day while the machine produces a hundred isn't a scaling problem. It's a category error. You're forcing a human ritual onto a machine workflow.
That backed-up queue isn’t telling you to hire more reviewers. It’s telling you the entire flow is wrong.
The code isn’t being “reviewed” any more than the telemetry is being “observed.” It’s the same structural shift, showing up in a different domain. The human was in the loop because the human was the best available feedback mechanism. That’s no longer true, and clinging to the ritual isn’t rigour. It’s reflex.
the instability nobody’s talking about
A hard problem lurks in both domains. When agents continuously adjust systems, their actions change the signals they steer by. Telemetry shifts, codebases drift, and baselines become unstable.
How do you distinguish organic system behaviour from agent-induced behaviour? If an agent tunes a connection pool and latency drops, did the system improve or did the agent just mask a deeper problem? If another agent is simultaneously adjusting rate limits based on the same latency signal, do they converge or oscillate?
Multiple agents steering on the same signals is a coordination problem that control theory has studied for decades — and it’s hard. In classical control, you design the feedback loops together, prove stability mathematically, and test extensively. In a world where different teams deploy different agents watching overlapping telemetry, nobody’s doing that proof.
This is an unsolved problem. And it applies equally to agents steering infrastructure and agents steering code. An agent that “fixes” a test failure by changing the code might be masking a design problem that a second agent will trip over tomorrow. The feedback loops interact in ways nobody planned for.
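The oscillation risk is easy to demonstrate. In the toy model below (all numbers invented), each controller's gain is stable on its own, but stacking two of them on the same signal doubles the effective gain, pushes the loop past its stability boundary, and the system diverges.

```python
def simulate(gain_a, gain_b, steps=40):
    """Two controllers steering on one latency signal.
    Toy model (assumption): latency responds linearly to both knobs."""
    latency, target = 300.0, 150.0
    knob_a = knob_b = 0.0
    trace = []
    for _ in range(steps):
        error = latency - target
        knob_a += gain_a * error           # agent A, say a pool tuner
        knob_b += gain_b * error           # agent B, say a rate limiter
        latency = 300.0 - knob_a - knob_b  # both corrections hit the same signal
        trace.append(latency)
    return trace
```

Run simulate(1.5, 0.0) and latency settles on target. Run simulate(1.5, 1.5) and it oscillates with growing amplitude. Neither agent is misconfigured in isolation; the instability lives in the composition, which is exactly why classical control theory insists on designing the loops together.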
the human at the boundary
None of this means humans disappear. Organizations will live in both modes for years — humans steering some things, agents steering others, with messy handoffs between them.
The dashboard doesn’t become obsolete. It becomes the break-glass interface — the escalation surface when autonomous control hits its limits. A control loop runs until it encounters a situation it can’t handle, then surfaces the problem to a human in a way they can quickly understand and take over. That’s a deeply valuable role. Just not the primary one.
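In code, the break-glass boundary is a confidence check rather than a dashboard refresh. A sketch, with the confidence score and both handlers standing in for whatever a real controller would use:

```python
def control_tick(signal, confidence, act, escalate, threshold=0.8):
    """One tick of the loop: steer autonomously while confident,
    hand off state to a human when the situation is out of bounds."""
    if confidence >= threshold:
        act(signal)        # default path: the loop keeps steering
        return "handled"
    escalate(signal)       # break-glass: surface context for human takeover
    return "escalated"
```

Everything interesting hides in how escalate packages state so a human can take over quickly; that packaging is what the dashboard becomes.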
Code review goes the same way. Exception-triggered, not default-path. A human gets pulled in when automated verification can’t resolve a conflict, or when the change touches something genuinely novel. Not because every change needs a pair of eyes. Because this one does.
The human moves from the loop to its edge.
what’s left
This is an identity shift, not just a technical one.
Reviewing code is what engineers do. Watching dashboards is what ops does. Interpreting telemetry is what SREs do. These aren’t just workflows. They’re how people describe themselves, how they understand their role, how they know they’re doing their job.
When those rituals leave the loop, what’s left?