As I’ve discussed before, I am not terribly concerned about how autonomous my AI agents are. Most of what you read online focuses on the autonomy aspect of Agentic AI, and I really think that’s the wrong approach.
My background is in enterprise-class software. Specifically, enterprise infrastructure. I’ve been working on enterprise-grade automation since 1996. In the end, an agent is simply an automation platform. You are asking the agent to do work for you. The advent of LLMs means that there are entire classes of work that computers can do now that we couldn’t dream of in 1996, but the core business problem of ensuring that the computer does the work for you remains.
If you think about any automation project, the first question is always the same. Will the system be accurate? That is to say, will it achieve the business result?
The very first production system I developed and deployed was a system that automated email accounts. The business result was that everyone who worked for the company had to have a working email address and that email address had to be mapped to the correct server where their mail was provisioned. Simple to say, but difficult to do for 100,000 people. Later, I built a system that provisioned Windows Servers at scale. Automated provisioning wasn’t really a thing back then and we had to build a complete running Windows Server host from bare metal in just an hour. This used to be manual work.
As a PM, I worked on systems like DRS, which automatically places VMs inside an ESXi cluster, and HashiCorp Cloud, which automatically deploys customer environments.
Etc. Etc. Etc.
Over time, technologies change. The techniques we use change. But the business goals, the process and the underlying issues remain evergreen. The system must solve the problem, and it must solve the correct problem at the correct time. An agent, by implementing a business process, is simply another, more modern, automation platform. It’s no different in concept than software that deploys servers or places VMs correctly. Thus, the underlying problems are the same even though the implementation is completely different.
For a modern LLM-based agent, there are two primary concerns:
Context. The agent must have the correct context. When solving a business problem for the user, the context of that problem is critical.
Accuracy. If the agent claims to have solved the problem, that problem must be solved for the user a significant percentage of the time (probably 95% or better).
Yes, but what about autonomy? Does the agent solve problems on its own?
It turns out that autonomy is a byproduct of context and accuracy. If the agent is very accurate and has the proper context, then you will allow the agent to solve the problem. However, this only occurs AFTER you have confidence in the accuracy and context of the solution.
Let’s take a hypothetical example. Say you are running a business and you decide to buy an agent that approves home loans. The purpose of this software is to evaluate each loan, apply the company’s loan standards, and either approve or reject it. There are two vendors who sell loan approval agents; you have to decide which one to buy.
Company A has a “master agent” loan system that takes each loan and automatically approves or rejects the loan. You give it a document describing your policies and it takes all further action.
Company B has a “loan automation” system that investigates your current process, documents it where necessary and then makes loan recommendations. Those recommendations can either be manually approved by a loan officer or automatically approved. The default is manual approval.
Which company do you hire?
Of course, you hire Company B. Company A carries too much risk, and there is no way to manually intervene. Company A may have an amazing system, but you don’t know for sure how well it will work in your environment. On the other hand, Company B allows you to start out manual and automate later. Company B also has a way to discover your actual process, which may differ from what’s documented.
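Company B’s “recommend first, automate later” pattern can be sketched in a few lines. All of the names below are hypothetical — this is a minimal illustration of the human-in-the-loop design, not any real vendor’s API, and the toy policy stands in for whatever loan standards the business actually uses.

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    MANUAL = "manual"   # agent recommends; a loan officer decides (the default)
    AUTO = "auto"       # agent's recommendation is applied directly

@dataclass
class Loan:
    applicant: str
    amount: float
    credit_score: int

def recommend(loan: Loan) -> str:
    """Toy stand-in for the company's real loan standards."""
    if loan.credit_score >= 680 and loan.amount <= 500_000:
        return "approve"
    return "reject"

def process(loan: Loan, mode: Mode, officer_decision=None) -> str:
    rec = recommend(loan)
    if mode is Mode.AUTO:
        return rec  # full autonomy: only enabled after trust is earned
    # Manual mode: the recommendation is advisory; a human makes the call.
    return officer_decision(loan, rec)

# Usage: start in MANUAL, flip to AUTO once accuracy is proven in your environment.
loan = Loan("A. Borrower", 350_000, 710)
decision = process(loan, Mode.MANUAL,
                   officer_decision=lambda l, rec: rec)  # officer accepts the recommendation
```

The key design choice is that autonomy is a one-line configuration change, not a different product: the same recommendation logic runs in both modes, so the accuracy you observe in manual mode is exactly what you get when you automate.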
And here’s the thing. When I was a vSphere PM working on the DRS feature, we had the EXACT SAME PROBLEM. When DRS was initially released, we were very confident that the VM placement decisions that the system made were correct. We had done YEARS of testing and we knew that we were better at placing VMs using this system than when humans placed VMs. We had papers about this, we had patents—all kinds of stuff.
And what happened? Customers balked. They didn’t know what was happening, so they didn’t turn the system on. So we led with a “Manual” mode where the system would make recommendations but not actually make changes. Today, there are actually three modes: “Partial” for initial placement only, “Full” for complete automated placement, and “Manual” for recommendations only. The vast majority of customers start with Manual, and most of them eventually move to Full (automated). DRS today is one of the most widely adopted vSphere features. vSphere also introduced the idea of VM overrides and host affinity. This is context that allows the system to make better decisions by letting it know that VM1 and VM2 need to be on the same physical machine or that VM3 cannot be vMotion’d.
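To make the role of that context concrete, here is a deliberately naive placement sketch, loosely modeled on the idea above. The function names, the load-balancing rule, and the data are all invented for illustration — real DRS weighs resource demand, not VM counts — but it shows how affinity pairs and pinned (non-migratable) VMs constrain an otherwise automatic decision.

```python
def place(vms, hosts, affinity=(), pinned=()):
    """Assign each VM a host. Affinity pairs must share a host;
    pinned VMs keep their current host (they cannot be moved)."""
    pins = dict(pinned)
    placement = {}
    for vm in vms:
        if vm in pins:
            placement[vm] = pins[vm]           # context: this VM can't be vMotion'd
            continue
        # Find an affinity partner, if any, that is already placed.
        partner = next((b if a == vm else a
                        for a, b in affinity if vm in (a, b)), None)
        if partner in placement:
            placement[vm] = placement[partner]  # context: keep the pair together
        else:
            # Naive balancing: pick the host with the fewest VMs placed so far.
            placement[vm] = min(hosts,
                                key=lambda h: sum(1 for v in placement.values() if v == h))
    return placement

hosts = ["esx1", "esx2"]
result = place(["VM1", "VM2", "VM3"], hosts,
               affinity=[("VM1", "VM2")], pinned=[("VM3", "esx2")])
# VM1 and VM2 land on the same host; VM3 stays put on esx2.
```

Strip out the `affinity` and `pinned` inputs and the algorithm still runs, but it will happily separate VMs that must be together and move VMs that must not move — same accuracy problem, less context.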
The details of how vSphere works aren’t terribly important here. The point is that these types of accuracy and context issues have been around for a very long time. We can look back at these systems and understand how they used context to improve accuracy and how those two factors led to customer adoption. It’s easy to think “GenAI changes everything” and ignore the last thirty years of enterprise automation, but that would probably be a mistake. We know how to solve these problems; we just need to look at them in the abstract and pay less attention to the implementation details, which change over time.
This takes us to context.
The lesson of the last 30 years of automation is that context is king. If the system knows what is happening and it knows what’s supposed to happen, the odds are higher that the system will take the correct action. Yes, context leads to accuracy, which leads to autonomy. This is yet another software virtuous circle.
As you plan your AI agents, think about context. Does the context that the agent needs exist already in an online system? Is that context correct? Are there secret rules that your business actually uses that aren’t written down? Start there. If I am a very junior employee and I know nothing, will I do the right thing if I just follow the documentation? If not, your agents don’t have the correct context and won’t reach the correct result.