The Reality of LLM-Based Enterprise Workflows in 2025
Two observations about the current state of LLM-based AI. First, for most enterprise workflows, AI in 2H2025 is not yet at a point where it can be set loose on a task without human and software guardrails (we discussed software guardrails in a previous post). And second, the underlying probabilistic, non-deterministic nature of LLMs necessitates new ways of building products, with humans closely validating LLM output.
Both of these limitations translate into exciting opportunities for engineers to build outside the staid SaaS frameworks. Client-server architecture (distributed systems, APIs, SaaS) changed how software was built ~20 years ago. Today we are entering a new era of software development that could be even more dramatic.
The Limits of MCP Servers
Drowning an LLM in a pool of MCP servers and asking it to leverage those servers as "Tools" to accomplish a high-level task (say, completing an ERP or Billing migration) doesn't work today. It's possible the technology will get there over the coming years. But today, complex tasks like data migrations demand guided workflows. The trick is to give the AI enough flexibility within those workflows to effectively automate some of the engineering tasks. We like to think of it as widening the aperture for the LLM.
Determining the right aperture, and the right supervision paradigms, given the current state of LLMs is a large part of what LLM application development entails today.
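To make the aperture idea concrete, here is a minimal sketch of a guided workflow: the pipeline steps are deterministic, and the LLM is invoked only for one well-scoped subtask (proposing a field mapping) whose output passes a software guardrail before the workflow proceeds. Every name here (propose_mapping, ALLOWED_TARGET_FIELDS, and so on) is illustrative, not a real API, and the LLM call is stubbed out.

```python
# Hypothetical guided-workflow sketch: the LLM's aperture is limited
# to proposing a mapping; deterministic code validates it.

ALLOWED_TARGET_FIELDS = {"customer_id", "invoice_date", "amount_cents"}

def propose_mapping(source_fields):
    """Stand-in for an LLM call that suggests source -> target mappings."""
    # A real implementation would prompt an LLM here; we hard-code a guess.
    guesses = {"cust_no": "customer_id",
               "inv_dt": "invoice_date",
               "amt": "amount_cents"}
    return {s: guesses.get(s) for s in source_fields}

def validate_mapping(mapping):
    """Software guardrail: reject anything outside the allowed schema."""
    return [s for s, t in mapping.items() if t not in ALLOWED_TARGET_FIELDS]

def run_migration_step(source_fields):
    mapping = propose_mapping(source_fields)   # the LLM's aperture
    errors = validate_mapping(mapping)         # deterministic check
    if errors:
        # Escalate to a human reviewer instead of loading bad data.
        return {"status": "needs_review", "unmapped": errors}
    return {"status": "ok", "mapping": mapping}

result = run_migration_step(["cust_no", "inv_dt", "amt"])
```

Widening the aperture in this sketch would mean letting the LLM own more of the pipeline (say, generating the validation rules too), which is exactly the dial the next section describes.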
The New Software Development Spectrum
One useful model when considering LLM-based software products is to place software development on a spectrum; Andrej Karpathy used a similar model, the Autonomy Slider, in a recent YC presentation. On one end we have a traditional product built from deterministic code. On the other we have our "Pool of MCP Servers," where a multi-agent orchestration LLM is given a high-level task and automates all the intermediate steps needed to accomplish it. Today's state of the art sits somewhere in the middle of this spectrum; let's call it AI-Assisted enterprise workflows. It requires not only software to guide the LLM but also human interaction to adjust LLM behavior.
The Role of Human Feedback for Enterprise Workflows
One of the most exciting areas for engineers to explore is optimal paradigms for human interaction with LLMs. Ideally these would make it easy for the human to adjust LLM behavior while also providing the highest-bitrate communication channel back to the LLM.
Let's explore some of these feedback loops in the area of enterprise ERP and Billing migrations (an area we know well at Doyen). Interfaces we think about, and have developed, for human-in-the-loop work include:
- Interfaces for quickly inspecting complex data mappings generated by an LLM. Not only should a mapping be easy to visualize, but it should also be easy to understand how each mapped variable was created.
- Interfaces for enabling humans to easily adjust LLM behavior. At Doyen this takes the form of Instructions which translate natural language to code.
- Interfaces that comprehensively surface loading errors and enable humans to adjust the mapping Instructions to address them.
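The third loop above, load, surface errors, adjust, retry, can be sketched in a few lines. The "Instruction" here is a hypothetical human-supplied transform (in a real system it might be generated from natural language); nothing below is Doyen's actual API.

```python
# Hypothetical load -> surface errors -> adjust -> retry loop.

def load_rows(rows, transform):
    """Try to load each row, collecting errors instead of failing silently."""
    loaded, errors = [], []
    for i, row in enumerate(rows):
        try:
            loaded.append(transform(row))
        except (KeyError, ValueError) as exc:
            errors.append({"row": i, "error": str(exc)})  # surfaced to a human
    return loaded, errors

rows = [{"amt": "12.50"}, {"amt": "N/A"}]

# First pass: naive transform from an initial LLM-generated mapping.
naive = lambda r: {"amount_cents": int(float(r["amt"]) * 100)}
_, errors = load_rows(rows, naive)          # second row fails on "N/A"

# A human reviews the surfaced errors and adds an Instruction:
# "treat N/A amounts as zero".
adjusted = lambda r: {"amount_cents": 0 if r["amt"] == "N/A"
                      else int(float(r["amt"]) * 100)}
loaded, errors = load_rows(rows, adjusted)  # now both rows load cleanly
```

The design point is that errors are collected and shown rather than thrown: the human sees every failing row at once, adjusts one Instruction, and reruns, which is a far higher-bitrate correction channel than re-prompting from scratch.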
Looking Ahead
We can expect LLM technology to continue improving. As it does, software developers will become more comfortable widening the aperture for the LLM to explore and ultimately automate. In turn, each aperture increase will necessitate new interfaces for visualizing and validating LLM output. It's an exciting time to be in the software business!