The Judgment Gap: Why Human Judgment Is the Scarcest Resource in the Age of AI | Prasad — Valence

In this session from the 2026 Valence AI & The Workforce Summit, Prasad — former head of People Analytics at Google and Stanford researcher — presents one of the summit's most intellectually rigorous frameworks: the Judgment Gap. Drawing on 15 years of Google research, back-of-envelope math showing AI is 2,000 times cheaper than human cognitive processing, and three behavioral science foundations (Gary Klein on pattern recognition, Phil Tetlock on calibrated confidence, and Michael Polanyi on tacit knowledge), Prasad argues that the default AI playbook — automate the routine, upskill the workforce — is creating a dangerous gap between the high-stakes judgment required of tomorrow's leaders and the developmental pathways available to build it. He closes with three concrete organizational shifts and a single litmus test question: Can your people tell when AI is wrong?

Video Transcript

Key Takeaways

  • AI is approximately 2,000 times cheaper than human cognitive processing — and that cost pressure will reshape organizations completely: A typical knowledge worker processes and produces roughly 125,000 words of information per day. When you compare the fully loaded cost of a global knowledge worker against the cost of AI token processing for equivalent volume, the ratio is approximately 2,000 to 1. Cost pressures of this magnitude have historically restructured entire industries — shipping, trade, mobile commerce — and will do the same to knowledge work.
  • The default AI playbook is creating a judgment gap: The standard enterprise AI approach — automate the routine, upskill the workforce, expand AI scope over time — carries a dangerous implicit assumption: that human value is residual, limited to whatever AI hasn't touched yet. As AI touches more, human value in this framing becomes increasingly marginal. More critically, the routine, repetitive work that AI is automating is exactly the work through which humans built confidence, pattern recognition, and professional judgment. Remove it, and the developmental pipeline for senior decision-making collapses.
  • Productive cognition is the real organizational asset — and judgment is now the binding constraint: Every organization is a system for converting thinking into value. At Google, productive cognition was the product of intellectual capacity, information abundance, and effective collaboration. In an AI-powered world, that formula changes: productive cognition becomes judgment quality amplified by AI, plus independent AI agent work. Judgment quality — not AI access — is now the differentiating factor.
  • Judgment forms through three mechanisms: contextual reps, outcome ownership, and apprenticeship: Research from Gary Klein (firefighters and ER nurses), Phil Tetlock (superforecasters), and Michael Polanyi (tacit knowledge transfer) converges on the same conclusion. Judgment requires varied experience in high-stakes contexts, calibrated confidence built through owning outcomes, and apprenticeship-based transfer of knowledge that cannot be codified. Organizations must actively design all three — they will not emerge automatically from AI-assisted work.
  • Three concrete shifts to the default AI playbook: close the loop, design the gradient, separate review from correction: Close the loop means creating structured retrospectives so people see the outcomes of their decisions and reflect on why they made them. Design the gradient means building deliberate developmental progressions — simulations, case studies, and AI-assisted practice scenarios — so that new professionals build judgment reps before facing high-stakes decisions. Separate review from correction means using the review of AI-generated work as an apprenticeship moment, not a compliance check.
  • Every AI interaction either sets people up for dependency or development — and the litmus test is one question: Can your people tell when AI is wrong? Two years ago, most knowledge workers could. Today, some can, sometimes. In the future, it depends entirely on deliberate organizational design. If the default is 'ask AI and act on the answer,' judgment atrophies. If AI interactions are designed as development moments, judgment compounds.

Full Transcript

The Alia Decision: What High-Stakes Judgment Requires

[00:00:01.056]

Prasad: Alia Jones is a VP. She's in one of your organizations, and she has to make a critical promotion decision. She has an important role to fill, and she's down to two candidates. Her gut says Molly — her protege, someone she's worked with for two years. Alia knows exactly how Molly thinks and how she responds to pressure. There's a new algorithm that her HR department has come up with, and the algorithm says she should go with Ed. Ed is a 96% match for this role compared to Molly's 83%. The algorithm also says Ed has a much wider cross-functional network — critical for this role — but also that Ed is a higher flight risk than Molly.

That is the context the Harvard Business Review came up with for a case study they asked me to respond to. The question is: how does Alia make this decision? What is her capacity to make it, and how did she form that capacity? Whether it's an important people decision like this or a business decision, we all want our leaders to be excellent at making good calls when the stakes are high.

In my 15 years at Google, these are exactly the kinds of questions we were asking ourselves. What enables organizations to make great people decisions, and what enables people to make great business decisions? You may have heard of some of the work my team did — the science of hiring, Project Oxygen on the role of managers, Project Aristotle on psychological safety and team effectiveness. What we brought was curiosity and rigor to important questions. For every project you've heard about, there were at least a few shared only internally at Google, and several others that went nowhere — questions we asked without finding satisfactory answers. We didn't get bogged down by failure; we persisted in thinking about important questions.

▶ How Human Judgment Is Tested in High-Stakes Promotion Decisions

A Harvard Business Review case study frames the challenge of human judgment under AI: a VP must decide between promoting a trusted protege (83% algorithmic match) and a higher-scoring candidate (96% match) with a stronger cross-functional network but higher flight risk. The case illustrates what judgment requires — calibrated confidence, contextual pattern recognition, and the willingness to own an outcome — and why those capacities cannot be delegated to an algorithm. This is the kind of decision that will define leadership value in an AI-assisted organization.

Productive Cognition: What Organizations Are Really Optimizing For

At Google, we came to believe that all of this research pointed toward one underlying concept. I want to introduce a term I think of as productive cognition. Every organization, regardless of industry, is a knowledge organization. Everything is getting more technical, more knowledge-intensive. Every knowledge organization is a system for converting thinking into value. That is where productive cognition comes in.

Productive cognition is the cumulative intellectual capacity across all of your people — embodied in your products, services, and customer relationships. The 'productive' part matters: it starts from intellectual capacity, then subtracts every place where you have friction, misuse, underutilization, or suppressed voice. That is the full picture.

Based on what I've told you about the Google organizational model, this is implicitly what we were solving for. Productive cognition at Google was the product of three factors: intellectual capacity, information abundance, and effective collaboration. These were multiplicative — not additive. They build on each other. If one term is weak, the system collapses. Every organization has its own recipe. What matters is internal consistency and that multiplicative effect.
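As a minimal numeric sketch of that multiplicative structure (the 0-to-1 scores and the Python framing are illustrative assumptions, not from the talk):

```python
# Illustrative sketch: productive cognition as a multiplicative product.
# The 0-to-1 scores are hypothetical; the talk gives no units or scale.

def productive_cognition(capacity: float, information: float, collaboration: float) -> float:
    """Multiplicative model: one weak factor drags down the whole system."""
    return capacity * information * collaboration

strong_team = productive_cognition(0.9, 0.9, 0.9)  # ~0.73
weak_link = productive_cognition(0.9, 0.9, 0.2)    # ~0.16 -- one weak term collapses the product

# Under an additive model, the same weak link would look far less costly:
additive = (0.9 + 0.9 + 0.2) / 3                   # ~0.67
print(strong_team, weak_link, additive)
```

The point of the multiplicative form is exactly what the weak-link case shows: averaging hides a broken factor, while a product does not.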

▶ What Is Productive Cognition? A Framework for Measuring Organizational Thinking

Productive cognition is the cumulative intellectual capacity of an organization's people, as embodied in its products, services, and customer relationships — minus every point of friction, misuse, or suppression. Developed by former Google People Analytics head Prasad, the concept treats every organization as a knowledge organization: a system whose core function is converting thinking into value. At Google, productive cognition was the multiplicative product of intellectual capacity, information abundance, and effective collaboration. AI changes the equation by making thinking abundant — shifting the binding constraint from capacity to judgment quality.

AI Is 2,000 Times Cheaper Than Human Cognitive Processing — and That Changes Everything

Here's where the formula changes. AI is making thinking abundant, and that changes the dynamics completely. Here is some back-of-envelope math I did a few weeks back. Take a typical knowledge worker — regardless of role or domain. A workday consists of responding to emails and Slack threads, initiating some, composing documents, crunching data, attending meetings. If you look at all the information processed and produced through those interactions, it comes to roughly 125,000 words per day.

Here's the punch line. When you look at the fully loaded cost of a global knowledge worker and compare it to the cost of AI token processing for equivalent volume, the ratio is approximately 2,000 to 1. AI is 2,000 times cheaper than human cognitive processing. I want to acknowledge this is a simplistic calculation — human cognition is much richer, because it includes problem-solving, not just output generation. But cost pressures like this have repeatedly changed economies completely. When we had shipping containers. When we had mobile phones. Those cost pressures ensured that existing, incumbent, expensive structures gave way. The same will happen here.
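A minimal sketch of how that back-of-envelope math can be assembled. Only the 125,000 words per day and the ~2,000:1 conclusion come from the talk; the salary, workdays, tokens-per-word ratio, and token price below are illustrative assumptions:

```python
# Back-of-envelope sketch of the ~2,000x cost comparison.
# Only words_per_day and the ~2,000:1 conclusion come from the talk;
# every other number below is an illustrative assumption.

words_per_day = 125_000          # from the talk: words processed/produced per day

# Assumed human side: fully loaded cost of a global knowledge worker.
annual_cost_usd = 160_000        # assumption: salary + benefits + overhead
workdays_per_year = 245          # assumption
human_cost_per_day = annual_cost_usd / workdays_per_year          # ~$653/day

# Assumed AI side: token pricing for equivalent volume.
tokens_per_word = 1.3            # assumption: rough English tokenization ratio
usd_per_million_tokens = 2.00    # assumption: blended LLM API price
ai_cost_per_day = words_per_day * tokens_per_word * usd_per_million_tokens / 1_000_000  # ~$0.33/day

ratio = human_cost_per_day / ai_cost_per_day
print(f"human ${human_cost_per_day:,.0f}/day vs AI ${ai_cost_per_day:.2f}/day -> ~{ratio:,.0f}:1")
# With these assumptions the ratio lands near 2,000:1, matching the talk's estimate.
```

Different salary or token-price assumptions move the ratio, but it takes order-of-magnitude changes in the inputs to escape the thousands-to-one range, which is the structural point.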

▶ Why AI Being 2,000 Times Cheaper Than Human Cognition Is a Structural Economic Force

A back-of-envelope calculation by former Google People Analytics head Prasad estimates that a typical knowledge worker processes and produces approximately 125,000 words of information per day. When the fully loaded cost of that cognitive output is compared to the cost of equivalent AI token processing, the ratio is approximately 2,000 to 1 — AI is 2,000 times cheaper. While human cognition is richer than this calculation captures, cost pressures of this magnitude have historically restructured entire industries. Organizations should treat this as a structural economic force, not a marginal efficiency improvement.

The Judgment Gap: Why the Default AI Playbook Is Dangerous

What I'm seeing is organizations all around adopting a default playbook because they implicitly understand some of this. The typical playbook: automate the routine — not just robotic process automation, but deeper because now we have AI. Upskill all our people to use AI tools. Keep expanding the scope over time, since AI will only get better.

There's a substrate to this thinking that isn't bad in itself, but it is, in some ways, dangerous. The implicit assumption is that the value of humans should be restricted to things that AI hasn't touched yet. As AI keeps touching more and more, the value of humans in this framing becomes increasingly residual. And that I find very unnerving.

Where will human work persist? The AI can generate 21 PowerPoints — but your leadership team, your CEO, your board still has the same 24 hours. Someone still has to filter all of this and make calls. Second, we still need to take accountability for high-stakes decisions. Organizations have tried to blame AI tools for bad outcomes — courts and public opinion have firmly rejected that. Third, there are many high-stakes decisions, like Alia's, where we want humans to wrestle with what good looks like. AI can generate possible outputs. But we want humans to wrestle with them and have the conviction of owning what they decide.

A lot of the work immediately ripe for automation beyond the old RPA is work with few degrees of freedom — repetitive transactional work where you can make slight deviations around rules and processes. But here's what breaks when that routine work disappears. That work built confidence. We all worked through imposter syndrome by doing the work, knowing the work, and knowing that we knew it. Repetition built recognition — by doing things many times, we developed instinct for where failure might occur. And when you specialize, you develop your distinctive signature of taste. All of that repetitive work, if automated by AI, creates a gap: the demands on tomorrow's leaders are going to be higher, but the preparation gets weaker. That is what I call the judgment gap.

▶ What Is the Judgment Gap and Why Should HR Leaders Be Concerned?

The judgment gap is the widening distance between what high-stakes leadership decisions require — pattern recognition, calibrated confidence, ownership of outcomes — and what the AI-assisted workplace actually develops in people. As AI automates the routine, repetitive work through which professionals historically built confidence, recognition, and taste, the developmental pipeline for senior judgment collapses. Coined by former Google People Analytics head Prasad, the judgment gap describes an organizational risk that compounds silently: leaders are asked to make increasingly consequential decisions with decreasing experiential preparation.

How Judgment Forms: Three Research Foundations

If judgment becomes the binding constraint, then we need to understand how it forms. There is substantial research here, and I'll summarize three foundations that are useful as we think about how to design for judgment.

Gary Klein: Pattern Recognition Requires Varied Contextual Exposure

Gary Klein studied firefighters and ER nurses who respond to critical situations with very little time — no opportunity to run a spreadsheet and think through alternatives. What he found is that they all develop an instinct for what to do by having exposure to varied contexts in very different environments until recognition becomes instinct. You need those reps, but they have to be varied. Exposure to enough varied contexts is what allows people to recognize patterns quickly under pressure.

Phil Tetlock: Calibrated Confidence Is More Valuable Than Certainty

Phil Tetlock, a psychologist at the University of Pennsylvania's Wharton School, studied forecasters who predict geopolitical events. The best forecasters were not the people with high confidence in specific outcomes — they were the people who knew the limits of their confidence. His example: asked to predict the Super Bowl, Tom Brady said, 'Both teams are good. On any given day, either could win. But if they played ten times, I think the Seahawks would win six times and the Patriots four.' That is calibrated confidence. You don't want salespeople saying they'll win 100% of bids. You want them to say: 'I know which bids are higher probability, and I know where to put more work into the proposal.'
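One standard way to score calibrated confidence is the Brier score, the metric used in forecasting tournaments like Tetlock's. The talk itself does not name a metric, and the probabilities below are made up for illustration:

```python
# Sketch: scoring calibrated confidence with the Brier score.
# Forecasts and outcomes below are hypothetical.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes.
    Lower is better; always guessing 50% scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A calibrated forecaster ("they'd win 6 out of 10") vs. an overconfident one,
# for an event that actually happened 6 times out of 10.
outcomes = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
calibrated = [0.6] * 10      # says 60% every time
overconfident = [1.0] * 10   # claims certainty every time

print(brier_score(calibrated, outcomes))     # 0.24
print(brier_score(overconfident, outcomes))  # 0.40 -- certainty is penalized hard when wrong
```

The calibrated forecaster is never "right" with certainty, yet scores better than the confident one, which is the Tom Brady point in numbers.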

Michael Polanyi: Tacit Knowledge Transfers Through Apprenticeship

Michael Polanyi studied how expert craftsmen transfer knowledge. His work on tacit learning is probably familiar to many here. His core insight: we all know more than we can tell. The surgeon's hands know more than the surgeon can articulate. Apprenticeship — observing someone at work, having them shape your experience — is the primary mechanism for transferring that tacit knowledge. It cannot be codified or taught from a slide deck.

Where all of this converges: Gary Klein's contextual embedding helps with pattern recognition and fast decisions. Phil Tetlock's outcome ownership helps with calibrated confidence. And Polanyi's apprenticeship is the primary vehicle for tacit knowledge transfer. These are the three pillars of judgment development — and none of them happen automatically in an AI-first workplace.

▶ Three Research-Backed Foundations for Developing Human Judgment at Work

Research from three leading scholars converges on how professional judgment actually develops. Gary Klein's studies of firefighters and ER nurses show that pattern recognition requires varied exposure to high-stakes contexts until recognition becomes instinct — reps matter, but variety is essential. Phil Tetlock's work on superforecasters shows that the best judgment comes from calibrated confidence: knowing the limits of what you know, not certainty. Michael Polanyi's work on tacit knowledge shows that the most critical professional knowledge — the kind that experts know but cannot articulate — transfers primarily through apprenticeship. Organizations designing for AI must actively engineer all three.

Three Concrete Shifts to the Default AI Playbook

Shift 1: Close the Loop

People need to know the outcomes of the decisions they make. In complex organizations, this is genuinely difficult — outcomes are delayed, attribution is unclear, information is hard to collect and return. But outcome transparency alone is not sufficient. What you need in addition is structured reflection: when you make a decision, capture why you made it, what risks you considered, what your confidence was in the outcome. A lightweight decision journal. Then, months later — with your manager or an HR partner — revisit the calls you made. Which ones worked? Which didn't? Why? What did you learn? AI coaching tools like Nadia can help with every one of these structured reflection processes.
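As a minimal sketch of what such a lightweight decision journal entry could capture (the field names and Python framing are illustrative assumptions, not a Valence or Nadia format):

```python
# Illustrative decision-journal entry for the "close the loop" shift.
# Field names are hypothetical; the talk specifies only capturing the decision,
# the reasoning, the risks considered, confidence, and a later outcome review.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class DecisionEntry:
    decided_on: date
    decision: str                    # what was decided
    rationale: str                   # why it was made
    risks_considered: list[str]      # risks weighed at the time
    confidence: float                # 0.0-1.0 estimate that it works out
    outcome: str | None = None       # filled in at the retrospective, months later
    lessons: list[str] = field(default_factory=list)

entry = DecisionEntry(
    decided_on=date(2026, 3, 2),
    decision="Promote the internal candidate into the open VP role",
    rationale="Two years of direct observation of how she handles pressure",
    risks_considered=["narrower cross-functional network", "ramp-up time"],
    confidence=0.7,
)

# Months later, close the loop with a manager or coach:
entry.outcome = "Strong first two quarters; network gap closed via a rotation"
entry.lessons.append("Weight observed behavior under pressure over match scores")
```

The value is less in the storage format than in forcing confidence and risks to be written down before the outcome is known, so the later review has something honest to calibrate against.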

Shift 2: Design the Gradient

You cannot thrust someone into VP-level decisions from day one. You cannot say, 'I automated all the routine work — good news, new graduate, come be a VP now.' You have to design the developmental progression deliberately. Use AI to create simulations, case studies, and scenarios that build judgment reps. Design those reps to transfer not only the explicit knowledge codified in your business processes, but the implicit knowledge of how your organization works and the tacit knowledge that is even harder to articulate. This is the challenge — and where a partnership with Valence can genuinely help your people.

Shift 3: Separate Review from Correction

Imagine that everyone in your organization is going to be reviewing AI-assisted work. There are two approaches a leader can take. One: 'This isn't compliant with our format, let me correct it' — a textbook approach focused on right answers. The other: use the review as an apprenticeship moment, asking 'what information did you take into account? What risks did you consider?' That second approach transfers the judgment, not just the correction. That is the addition to the default playbook that I would have you think about.

▶ Three Organizational Shifts to Develop Human Judgment in an AI-Assisted Workplace

Former Google People Analytics head Prasad identifies three concrete departures from the standard enterprise AI playbook that are necessary to develop human judgment alongside AI capability. First, close the loop: create structured retrospectives so people see the outcomes of their decisions and reflect on why they made them — lightweight decision journals reviewed months later with a manager or coach. Second, design the gradient: build deliberate developmental progressions using AI-generated simulations and scenarios, not just on-the-job experience. Third, separate review from correction: use the review of AI-generated work as an apprenticeship moment, not a compliance check. AI coaching tools like Nadia can support all three shifts.

Measuring Judgment Development — and the One Litmus Test Question

Measurement matters here. You can look at activities — how much work is going into pure coordination versus judgment-oriented work. You can look at reps and data from AI coaching tools. You can assess human calibration using what is called a revealed preferences survey: ask people, 'When you're facing a difficult decision, whose judgment do you rely on?' Measure the evolution of that answer over time. And look at outcomes themselves — how quickly are people course correcting? How quickly are they transferring learning from one situation to another across high-stakes business decisions, not just people decisions?
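As a small sketch of tracking that revealed-preferences survey over time (the response categories and quarterly cadence are illustrative assumptions; the talk specifies only the question and measuring how answers evolve):

```python
# Sketch: tracking a revealed-preferences survey over time.
# Hypothetical responses to "When you're facing a difficult decision,
# whose judgment do you rely on?"

from collections import Counter

responses = {
    "2026-Q1": ["my manager", "a senior peer", "AI", "my own", "a senior peer"],
    "2026-Q3": ["AI", "AI", "my manager", "AI", "my own"],
}

for quarter, answers in responses.items():
    counts = Counter(answers)
    ai_share = counts["AI"] / len(answers)
    print(f"{quarter}: {dict(counts)} -> {ai_share:.0%} default to AI")
# A rising AI share with no corresponding rise in calibration is one signal
# that dependency, not development, is compounding.
```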

Let's get back to Alia. There's a default playbook version that concerns me: go all-in on AI implementation, and future leaders like Alia are left with no basis for making these calls — exposed only to AI recommendations, acting as signatories to decisions they don't understand, getting no outcome feedback, receiving weak coaching. Or you could have a judgment-enhanced version where leaders are getting to exercise real stakes, making decisions, receiving the right mentorship and coaching — set up to succeed and thrive in an AI-enhanced world.

Here is my closing thought. We always talk about automation versus amplification. But I want you to think about another duality that comes with AI. Every AI interaction either sets your people up for dependency or for development. There is work we want to offload cognitively to AI — that is appropriate. But if people default to 'go to AI, get the answer,' judgment atrophies. Design AI interactions as development moments, and efficiency will follow.

And here is the single litmus test question you can ask your people tomorrow, six months from now, a year from now: Can you tell when AI is wrong? Two years ago, most of us could. Today, some of us can, sometimes, for some situations. In the future — it depends entirely on us. Thank you.

▶ The Single Litmus Test for Whether Your Organization Is Developing or Eroding Human Judgment

Former Google People Analytics head Prasad closes with one question that reveals whether an organization is building or eroding human judgment in the age of AI: 'Can your people tell when AI is wrong?' Two years ago, most knowledge workers could answer yes. Today, the answer depends on the individual and the situation. In the future, the answer will depend on deliberate organizational design choices made now — whether AI interactions are designed as development moments or dependency moments. Every AI interaction, according to Prasad, sets people up for one or the other.

[00:28:07.570]

Moderator: Thank you, Prasad.