This article is published in collaboration with Binaire, the blog for understanding digital issues.
An AI system must always be supervised by a human, but it is crucial that this person can distinguish the moments when they genuinely understand what the machine proposes from the moments when they are merely being influenced by it.
Contemporary governance frameworks for artificial intelligence (AI) rest on an assumption that is rarely made explicit: when a human operator receives the output of an AI system, they must be able to evaluate it meaningfully. The provisions of the European AI Act on high-risk systems require transparency, explainability, and human oversight.
The Act explicitly targets systems used in recruitment and worker evaluation, access to social benefits, credit decisions, border control, the administration of justice, and critical healthcare.
The U.S. AI Action Plan calls for meaningful human control over important AI decisions. The OECD's AI Principles place human-centricity at the core of their commitments.
These commitments are necessary but insufficient. They focus on what AI systems should provide to human operators and completely overlook what those operators need in order to act on what they receive. This gap is not accidental. It is a structural blind spot in the current architecture of AI governance.
The implicit model of human oversight in most regulatory texts is that of a competent and attentive professional who, when faced with clear and readable outputs, makes clear judgments. This is a plausible assumption in stable, low-stakes, and well-understood environments, but a fragile one in high-stakes, time-pressured, and technically opaque settings—precisely the contexts in which AI systems are increasingly deployed.
Consider the emergency-room nurse in charge of triage who receives a triage score from an AI system without the explanations needed to question it. The bank advisor who must decide within minutes whether to block an account on the strength of an automated fraud alert may be working with a proprietary model they cannot interrogate. The administrative officer who approves the allocation of social housing or benefits prioritized by an algorithm generally cannot explain why one case was ranked ahead of another. The teacher who co-signs an automated exam grade has no access to the criteria behind the score. In each of these cases, human oversight is formally present but substantively impossible.
Operators with Advanced Metacognition
Metacognition, the ability to monitor and regulate one's own cognitive processes, is the psychological basis for effective supervision. A metacognitively aware operator knows when they understand something, when they are speculating, and when their judgment is being influenced by factors they have not consciously registered. This capacity cannot be presumed; it varies significantly across individuals and with training and situational pressures.
Research on human-automation interaction has documented a set of failure modes that emerge specifically when humans supervise automated or AI-powered systems. Automation bias, the tendency to give machine recommendations more weight than one's own judgment, is one of the most robust findings in the field. In a frequently cited 1997 study, researchers Parasuraman and Riley showed that humans systematically misuse automation, relying on it where it is unreliable and disregarding it where it would be beneficial: two types of error that reflect a lack of metacognitive calibration rather than a lack of information. For example, in flight-simulator experiments cited by these authors, pilots equipped with an automatic alert system shut down an engine in response to a false alert, a decision they themselves had vowed, before the experiment, never to make solely on the basis of an automated alert.
The challenge is compounded by the characteristics of contemporary AI systems. Kahneman's work on dual-process cognition, better known as System 1 and System 2, the two speeds of thought, illuminates the mechanism. Faced with an AI system that produces output fluently and confidently, the human mind tends to activate the fast, intuitive process (the one used for familiar, low-risk tasks) rather than engage in the slower, more reflective and logical, and therefore more cognitively demanding, analysis the situation requires.
Concretely, an explanation that merely seems plausible triggers much the same cognitive response as one that genuinely is. When AI systems' explanations are fluently written, numerically precise, and visually formatted as authoritative outputs, they suppress precisely the skepticism that meaningful supervision requires.
Counterintuitively, providing more explanations does not reliably improve human judgment of AI outputs. In a rigorous experimental study, a research team found that AI-provided explanations did not systematically improve human-AI team performance and, in several conditions, degraded it, especially when the explanations were technically correct but cognitively incompatible with the way operators formed their own judgments.
Specifically, in sentiment analysis tasks, the AI explained its judgment by highlighting words it identified as positive or negative. However, human participants evaluated the tone of a text globally, taking into account context and overall coherence—a process that the highlighting of individual words cannot capture. Here, AI and humans do not arrive at their judgment in the same way: AI identifies local elements (a word, a phrase) while humans construct a holistic judgment (the entire text, context, internal coherence). When the provided explanation reflects the machine’s logic rather than human reasoning, it does not equip the operator with the tools to assess the recommendation’s reliability—it simply convinces them to follow it.
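To make the contrast concrete, here is a minimal toy sketch in Python (a hypothetical illustration, not code from the cited study): it mimics the kind of word-by-word highlighting such explanations produce and shows how it can diverge from a holistic human reading of the same sentence.

```python
# Toy sketch of a word-level "explanation" for a sentiment judgment,
# using a hypothetical hand-written lexicon. Real systems learn these
# weights; the point here is only the shape of the explanation.

LEXICON = {"great": +1.0, "love": +1.0, "boring": -1.0, "waste": -1.0}

def word_level_explanation(text):
    """Score each word in isolation, the way a saliency-style explanation
    highlights individual words as positive or negative evidence."""
    return [(word, LEXICON.get(word.lower().strip(".,!?"), 0.0))
            for word in text.split()]

review = "Not a great film, and I love nothing about it."
for word, score in word_level_explanation(review):
    marker = "+" if score > 0 else ("-" if score < 0 else " ")
    print(f"{marker} {word:<8} {score:+.1f}")

# The word-level view flags "great" and "love" as positive evidence,
# whereas a human reading the whole sentence registers the negations and
# judges it negative: the local highlights and the holistic reading diverge.
```

The sketch is deliberately crude, but it captures the mismatch described above: the explanation faithfully reports the machine's local evidence while saying nothing about whether the overall judgment can be trusted.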
Explainability is thus a necessary but insufficient condition for effective supervision. What narrows the gap between the two is metacognitive maturity.
Three Implications for AI Governance
If metacognitive maturity is a genuine and variable attribute of human operators, then governance frameworks that mandate explainability without considering operators' metacognition are simply incomplete. Drawing on the scientific literature, including explainable AI, human-automation interaction, cognitive science, psychology, and the humanities and social sciences, three implications can be stated:
– Transparency focused on documentation is insufficient. This is not mere intuition: research has demonstrated it for thirty years. Documenting and explaining a system's behavior is not enough to ensure sound human decisions unless the people who will use those explanations are involved in designing them and the business needs of the moment are taken into account. Controlled studies have even shown that "too much explanation" can degrade human-AI team performance by drowning the relevant information in noise.
– The metacognitive qualification of operators should be treated as a component of AI governance. This is a gap that research has begun to identify, but for which no standards have yet been formalized.
Specifically, regulatory texts like the AI Act require that human supervisors be "competent," but without ever defining what that means. In particular, no framework evaluates what researchers call metacognitive competence: the ability to detect failures in one's own reasoning when faced with an opaque system, a competence that comes from training and context, not from raw intelligence. An important clarification is needed here. Discussing the metacognitive qualification of operators does not call into question the value or intelligence of the people supervising AI systems. Nor does it rank humans by their ability to "think well." Metacognition is neither a personality trait nor an indicator of worth. It is a situational competence, sensitive to context, training, cognitive load, and working conditions. For example, an experienced surgeon may show excellent metacognitive calibration in their own field and yet be just as susceptible to automation bias as a novice when facing an opaque AI system in a context for which they have received no specific training.




