The enterprise risk nobody is modeling: AI is replacing the very experts it needs to learn from
The AI industry's focus on autonomous self-improvement overlooks a critical vulnerability: the dwindling pool of human experts needed to evaluate and correct AI errors.

For AI systems to continue improving in knowledge work, they require either a reliable mechanism for autonomous self-improvement or human evaluators capable of catching errors and generating high-quality feedback. While the industry has heavily invested in the former, it has largely neglected the latter. This oversight could have severe consequences, as AI systems are increasingly replacing the very experts they need to learn from.
Self-improvement has clear limits in knowledge work. Reinforcement learning (RL), which enabled AlphaZero to master Go, chess, and shogi, relies on a stable environment and an unambiguous reward signal: the rules of a game never change, and every game ends in a win, a loss, or a draw. Knowledge work offers neither, because rules and regulations keep evolving and the correctness of an outcome is often uncertain.
Without such a reward signal, AI systems in these domains cannot improve unless human evaluators supply the feedback. The problem is compounded by the decline of entry-level jobs, which once let people build the expertise needed to spot AI errors. New-graduate hiring at major tech companies has dropped by half since 2019, and AI systems now handle tasks such as document review, research, data cleaning, and code review.
Companies tout this as efficiency, while economists warn of displacement. The consequence is a shrinking pool of future experts, which threatens the very foundation of AI development.
History shows that knowledge can die when external shocks, such as war or plague, disrupt the transmission of expertise. The current crisis, however, is self-inflicted, the product of a thousand individually rational economic decisions. As fields like advanced mathematics, theoretical computer science, and complex systems architecture lose practitioners, the capacity for novel insight and innovation quietly collapses.
Rubric-based evaluation, touted as a solution, has limits. Techniques like Constitutional AI and reinforcement learning from AI feedback (RLAIF) reduce dependence on human evaluators, but they cannot capture the deeper aspects of judgment, such as instinct and intuition. A rubric can only measure what its creator knows to measure, leaving AI systems vulnerable to the errors it does not anticipate. Rather than slowing development, we must treat the evaluation gap as an open research problem, with the same urgency we bring to capability gains.
The responsible approach is to acknowledge the risk and invest in preserving the human infrastructure that currently fills the gap. As Ahmad Al-Dahle, CTO of Airbnb, notes, the thing AI most needs from humans is the thing we're least focused on preserving.
Source: VentureBeat