Human in the Loop

The Case for Human-in-the-Loop Automation

The AI debate swings between hype and fear. This article cuts through both, examining why the most effective organisations design systems where human judgement and AI work together – and how to decide where to draw the line.

There are two broad perspectives on AI in business: breathless excitement or existential fear. This article takes a more balanced stance: the smartest organisations aren’t replacing human judgement with AI; they’re building systems that combine the two, known as ‘human-in-the-loop’ systems. Knowing when and where to do that isn’t just safer; it’s smarter.

The Fallacy That Is Costing Businesses Money

The dialogue surrounding AI in business has an odd tendency to skip the interesting middle ground and leap straight to one of two extremes. On one side are the enthusiasts who believe the complete automation of everything is both inevitable and desirable. On the other are the sceptics who see AI as a “liability waiting to explode.” Both groups miss the point: AI adoption is not an either/or proposition.

The reality is far less clear-cut. Fewer than one-third of organisations have seen the expected performance improvements from AI deployments at scale. A recent large survey found that 41% of organisations experienced an AI-related privacy breach or security incident in the past year, and the majority of those incidents involved automated processes without adequate human review. Research into high-performing AI adopters, meanwhile, finds they are significantly more likely to have developed human oversight protocols than their lower-performing peers.

The organisations that are successfully implementing AI are not necessarily the ones that have automated the most. They are the ones that have automated the right things.

Where AI Genuinely Excels

If you want to create successful human-in-the-loop systems, you first need an honest assessment of what AI is actually good at. And what AI is actually good at is quite a lot.

AI excels in environments defined by high volume, repetition, and rules: processing invoices, routing support tickets, detecting anomalies in financial data, generating first drafts, transcribing and summarising meetings, monitoring system performance. At these tasks AI is not merely good; it is often substantially better than humans in both speed and consistency, and automating this kind of routine cognitive work can free up to 30% of a knowledge worker’s time. AI is also exceptionally strong at pattern recognition at a scale humans simply cannot match, and it tends to deliver its clearest, most measurable return on investment in customer experience and operational efficiency.

The catch is that being extremely good at these things is not the same as being good at everything. AI does a narrow set of things extremely well, at enormous scale and speed. The moment a task moves outside the boundaries of what the system was trained on, that advantage evaporates.

Where AI Falls Short, and Why

The reason is that AI does not understand context the way humans do. This may sound obvious, but it still catches many organisations by surprise.

Consider a few examples. An expense-approval AI flags a legitimate client dinner as suspicious because it has never seen that restaurant, or because that employee has never claimed that kind of expense before. A content generator asked to draft a sensitive customer communication produces something technically accurate but utterly inappropriate in tone. A predictive hiring tool quietly narrows its shortlists to candidates from groups that have historically been overrepresented. None of these is a technology failure. All are failures to understand context.

Research has identified three failure modes common to AI decision-making: training data that reflects historical bias; the inability to handle novel situations outside the training data; and the difficulty of encoding ethical judgement into rule-based systems. AI systems fail not because they are not working well enough, but because the world is more complicated than the data used to train them. Then there is the problem of confidence without calibration, which remains one of the largest unsolved issues: most AI systems will produce a confident answer even when that answer is unreliable.

For leaders, this is not a technical issue; it is a governance issue.

The Real Cost of Getting It Wrong

The cost of getting automation design wrong is not just financial; it’s reputational, regulatory, and human.

In regulated environments, automated decisions without adequate human oversight can create compliance exposure. The EU AI Act, which came into effect in 2024, explicitly requires human oversight in the design of high-risk AI systems, in areas such as hiring, credit risk assessment, healthcare, and critical infrastructure. UK businesses operating in these spaces face significant legal risk if their automation design does not ensure human oversight, and the UK’s Information Commissioner’s Office has already stated that automated decisions with significant effects on individuals must involve human oversight.

Then there is the less tangible risk to reputation and customer relationships. Customers who receive an obviously incorrect automated decision and cannot reach a human to correct it do not simply shrug and move on. They complain. They leave. Research shows that perceived accountability is one of the strongest drivers of customer trust in automated services, and when something goes wrong with no human to hold accountable, the erosion of trust accelerates.

Getting automation design wrong is not an easily correctable problem if you have also removed the human oversight that would catch and correct it.

What Human-in-the-Loop Actually Means

The term “human-in-the-loop” is used loosely, which causes ambiguity, so it is worth being precise about what it actually means.

At its most basic, human-in-the-loop (HITL) is an automation architecture that requires a human to approve decisions at specific points within an automated process. This does not mean “a human can override the system if they notice something wrong.” That interpretation is passive: it relies on humans remaining permanently vigilant over systems they have been encouraged to trust. Genuine human-in-the-loop design is active: the process stops and waits for a human decision.
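To make the distinction concrete, here is a minimal sketch of an active checkpoint in Python. Every name in it (Decision, request_human_approval, execute) is illustrative rather than any particular platform’s API; the point is simply that nothing executes until a person decides.

    from dataclasses import dataclass

    @dataclass
    class Decision:
        action: str    # what the automation proposes to do
        context: dict  # the information a reviewer needs to judge it

    def request_human_approval(decision: Decision) -> bool:
        # Stand-in for a real review queue (dashboard, ticket, approval email).
        answer = input(f"Approve '{decision.action}'? [y/N] ")
        return answer.strip().lower() == "y"

    def execute(decision: Decision) -> None:
        print(f"Executing: {decision.action}")

    def run_with_checkpoint(decision: Decision) -> None:
        # Active HITL: the process blocks here until a human says yes.
        if request_human_approval(decision):
            execute(decision)
        else:
            print(f"Held for review: {decision.action}")

    run_with_checkpoint(Decision("refund £480 to customer 1042", {"reason": "duplicate charge"}))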

Three categories of oversight are worth distinguishing: full automation, where the AI acts and the human is notified after the fact; human-on-the-loop, where the AI acts and the human can intervene if something looks wrong; and human-in-the-loop, where the AI recommends or prepares and the human decides. Which model is appropriate depends on the consequences of an error, how often edge cases occur, and how reversible the action is. Research into human-AI teaming makes the underlying point well: the objective is not to minimise human involvement but to optimise it – to design collaborative oversight that actually works, rather than oversight that exists in name only.
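Operationally, the difference between the three models is where execution sits relative to the human. A rough Python sketch (again illustrative, not a production design):

    from enum import Enum, auto

    class Oversight(Enum):
        FULL_AUTOMATION = auto()    # AI acts; human is notified after the fact
        HUMAN_ON_THE_LOOP = auto()  # AI acts; human can intervene if something looks wrong
        HUMAN_IN_THE_LOOP = auto()  # AI recommends; human decides

    def handle(action: str, mode: Oversight) -> None:
        # Sketch of how the three oversight models differ in practice.
        if mode is Oversight.FULL_AUTOMATION:
            print(f"Executed: {action} (notification logged)")
        elif mode is Oversight.HUMAN_ON_THE_LOOP:
            print(f"Executed: {action} (visible on a dashboard; a human may reverse it)")
        else:
            print(f"Queued for approval: {action} (nothing happens until a human decides)")

    handle("route ticket #8841 to tier-2 support", Oversight.FULL_AUTOMATION)
    handle("issue £2,000 goodwill credit", Oversight.HUMAN_IN_THE_LOOP)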

In practice, this means escalation paths that are clear and intelligent, and feedback loops that train the AI over time to become more accurate within its specific context.
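A feedback loop can start as something very simple: record every escalated case alongside the human’s verdict. The sketch below shows one hypothetical way to do it – the field names and file format are ours, not a standard – and over time the log becomes labelled data showing exactly where the AI and its reviewers disagree.

    import json
    from datetime import datetime, timezone

    def record_escalation_outcome(case_id: str, ai_recommendation: str,
                                  human_decision: str,
                                  log_path: str = "hitl_feedback.jsonl") -> None:
        # Append each escalated case and its human verdict to a feedback log.
        # Disagreements are the most valuable rows: they show where the model
        # needs retraining or its thresholds need retuning.
        record = {
            "case_id": case_id,
            "ai_recommendation": ai_recommendation,
            "human_decision": human_decision,
            "agreed": ai_recommendation == human_decision,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        with open(log_path, "a") as f:
            f.write(json.dumps(record) + "\n")

    record_escalation_outcome("exp-2217", ai_recommendation="flag as suspicious",
                              human_decision="approve")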

Designing the Right Handoff Points

This is where most organisations struggle. It is also where the real strategic work happens.

The natural tendency is to design the automation first and then decide where to inject human checkpoints. This almost always leads to poor outcomes, because the checkpoints end up determined by the technology’s architecture rather than the business’s risk profile. The approach we have seen work in practice starts with the consequences.

For every process being considered for automation, answer three questions. What is the worst plausible outcome if this decision is wrong? How likely is the AI to face a situation it has never been trained on? And how readily can a wrong decision be reversed? A risk tiering framework is the most rigorous way to work through this: decisions with high consequences and low reversibility should always remain under human control, regardless of how confident the AI is. Research on AI implementation also finds that organisations that define explicit decision rights achieve better outcomes than those that make ad hoc configuration decisions.
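One hypothetical way to encode that tiering in Python – the 1-to-5 scoring scale and the thresholds are illustrative, and should be calibrated to your own risk appetite:

    from enum import Enum

    class Tier(Enum):
        LOW = "full automation"
        MEDIUM = "human-on-the-loop"
        HIGH = "human-in-the-loop"

    def risk_tier(consequence: int, novelty: int, reversibility: int) -> Tier:
        # Score each question from 1 to 5:
        #   consequence   -- how bad is the worst plausible wrong decision?
        #   novelty       -- how likely is the AI to face situations outside its training?
        #   reversibility -- 1 = effectively permanent, 5 = trivially undone.
        # High consequence plus low reversibility always stays under human
        # control, regardless of how confident the model claims to be.
        if consequence >= 4 and reversibility <= 2:
            return Tier.HIGH
        if consequence >= 3 or novelty >= 4:
            return Tier.MEDIUM
        return Tier.LOW

    print(risk_tier(consequence=5, novelty=3, reversibility=1))  # Tier.HIGH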

The output of this work is a process diagram whose decision points are annotated with their risk tier and their oversight model. It is not glamorous work, but it is the foundation of automation that actually stands up under pressure. Platforms such as monday.com and Make make it easy to build conditional escalations; the design decisions behind those escalations still need to be business-driven rather than technology-driven.
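In code form, that artefact can be as plain as an annotated list. The workflow below is invented purely for illustration:

    from dataclasses import dataclass

    @dataclass
    class DecisionPoint:
        name: str
        tier: str       # "low" / "medium" / "high"
        oversight: str  # which of the three oversight models applies

    # A hypothetical invoice-processing workflow, annotated the way the
    # risk-tiering exercise should leave it. Conditional escalation rules
    # built in a platform such as Make or monday.com would mirror this map.
    INVOICE_WORKFLOW = [
        DecisionPoint("extract and code line items", "low", "full automation"),
        DecisionPoint("match invoice to purchase order", "medium", "human-on-the-loop"),
        DecisionPoint("approve payment over £10,000", "high", "human-in-the-loop"),
    ]

    for p in INVOICE_WORKFLOW:
        print(f"{p.name:32} tier={p.tier:7} oversight={p.oversight}")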

Augmentation, Not Replacement

The assumption that AI will simply replace human work is not only unsupported by the data; it is strategically foolish.

The organisations making the most significant progress with AI have fundamentally changed the question. They are not asking “how do we automate this job?” They are asking “how do we make the person who does this job dramatically more effective?” Of course, some jobs will see significant automation. But the majority will be changed rather than eliminated, and the greatest benefit comes from human-AI collaboration rather than substitution: on complex, context-dependent tasks, human-AI teams outperform either humans or AI working alone.

There is another dimension to this that is often overlooked. Humans working alongside AI on high-volume tasks are not made redundant; they become experts in the exceptions. Over time, the team working with the automation develops a sophisticated understanding of everything the AI does not handle – a capability that is enormously valuable and very difficult to replicate. This is not about creating a workforce of AI managers overseeing automated systems. It is about freeing people’s cognitive capacity from the mundane so it can be applied to judgement, creativity, and relationships. That is a future worth contemplating, for leaders and for the people who work with them.

Conclusion: The Smartest Automation Is Selective

The organisations that build lasting competitive advantage through AI will not be those that automate the most; they will be those that automate most precisely – the ones that understand clearly where AI adds unambiguous value, that build robust human oversight into every element of the business that carries meaningful risk, and that design the two to work together seamlessly rather than awkwardly.

This is not a cautionary note; it is a note of clarity. AI is a genuinely powerful technology, and applied well, it will change what is possible for organisations. The “well” is the bit that requires design, assessment, and leadership that is willing to make considered decisions rather than follow the hype.

If you are reviewing your strategy for automation and would like to gain some structured thoughts on how the boundary between human and machine should be positioned for your organisation, we’d be delighted to work through that with you. Book a consultation to talk through your needs.
