Same Industry, Same Tools, Radically Different Outcomes

In 2024, Klarna, the Swedish fintech, deployed OpenAI-powered chatbots to replace customer service staff, cutting headcount from 5,527 to approximately 3,400: a reduction of nearly 40%. The initial results looked promising. The AI agent handled two-thirds of customer inquiries. But internal reviews revealed that the AI lacked the nuance and empathy that experienced service agents carried. Customer satisfaction suffered. Quality dropped. CEO Sebastian Siemiatkowski made a public admission that few technology leaders are willing to make: "We went too far." Klarna began quietly rehiring human staff, as reported by Bloomberg.

At the same time, JPMorgan was running more than 450 active AI use cases in daily production — fraud detection, investment analysis, compliance reporting, document verification, customer service — with structured human oversight embedded throughout. The outcome: 4.2× ROI, the highest documented AI return in financial services, and no public reversal. Both companies operate in financial services. Both had access to the same foundation models, the same APIs, the same AI tools. The difference was not the technology.

This is the most instructive AI case study of 2025-2026 — not because Klarna failed technically, but because Klarna succeeded technically and failed strategically. The AI worked. The operating model was wrong. And understanding what made it wrong is more valuable than any AI implementation guide, because the same strategic error is being made at scale across enterprise deployments in 2026.

⚠️ Cautionary Tale
Klarna — AI-First Gone Wrong
  • Headcount cut by nearly 40%: 5,527 → ~3,400 employees
  • AI handled two-thirds of customer inquiries initially
  • AI lacked nuance and empathy for complex, emotional queries
  • Customer satisfaction declined; quality degraded
  • CEO publicly admitted: "We went too far" (WEF Davos, Jan 2026)
  • Bloomberg reported quiet rehiring of human staff began
  • Strategy: AI replaced human judgment in judgment-dependent work
❌ Outcome: Strategy reversal, reputational cost, rehiring

✅ Reference Model
JPMorgan — AI-Augmented at Scale
  • 450+ active AI use cases in production daily
  • Fraud detection in real time across millions of transactions
  • AI generates investment memos and analyses; humans validate
  • Compliance checking automated; human review on flagged items
  • Structured human oversight embedded at judgment-critical points
  • $18B technology budget deployed across AI-augmented workflows
  • Strategy: AI handles execution; humans handle judgment and exceptions
✅ Outcome: 4.2× ROI — highest in financial services, no reversal

88% of organisations use AI in at least one function (McKinsey). Only 1% describe their rollouts as mature. And only 5% are achieving AI value at scale (BCG). The gap is operating model, not tooling.

55% of AI high performers fundamentally redesign workflows around AI, versus only 20% of others (McKinsey/Stanford). Workflow redesign is the differentiator, not model selection.

77% of enterprise API usage is full automation (Anthropic Economic Index). Individual Claude.ai usage: 52% human-AI collaboration. Enterprises automate more aggressively, and must design accordingly.

What Each Strategy Actually Is — Precise Definitions

Before mapping workflows or choosing between strategies, the terms have to be defined precisely — because most of the confusion in enterprise AI strategy conversations comes from using "AI-first" and "AI-augmented" as loose labels rather than specific operating-model commitments. Both describe a complete philosophy of how AI integrates into business operations, who is the primary actor in each workflow, and where human judgment enters the system.

AI-First Strategy

Redesigning business operations and products with AI as the primary executor of workflows. Humans are repositioned above or selectively within the loop — defining objectives, overseeing outputs, and handling exceptions — while AI handles the structured execution of work end-to-end.

The design question: "How should this process work if AI is the primary worker?" McKinsey's agentic organisation research describes this as "AI-first workflows, with humans and IT systems selectively reintroduced in AI-native design."

  • Products are designed for AI execution from the first architectural decision
  • Workflows are reverse-engineered around AI capabilities, not adapted from human processes
  • Human roles shift to orchestration, exception management, and quality oversight
  • Appropriate for: new AI-native products, high-volume structured processes with recoverable errors, and competitive contexts where speed and cost are the primary variables

AI-Augmented Strategy

Deploying AI to enhance human professionals working within existing or redesigned workflows. Humans remain the primary decision-makers and relationship owners; AI handles specific subtasks — research, drafting, analysis, data processing — that were previously manual.

The design question: "How can AI make our people meaningfully better at what they do?" Microsoft CEO Satya Nadella frames the business case: "Firm sovereignty — a company's ability to embed its tacit knowledge in models it controls — matters more than model access."

  • Existing workflows are preserved or improved, not replaced
  • AI tools are deployed at the task level, not the workflow level
  • Human professional capabilities and relationships remain central to value delivery
  • Appropriate for: judgment-intensive work, relationship-dependent services, high-stakes decision-making, and contexts where trust and expertise are the competitive differentiators
📌 The Critical Distinction

AI-first is not about the degree of automation. AI-augmented is not about going slow with AI. The distinction is about workflow design philosophy: who is the primary actor — AI or human — and what role does the other play? Both can achieve high levels of AI integration. The difference is in whether the workflow was designed for AI execution (AI-first) or enhanced for human execution (AI-augmented). A company can run AI-first in document processing and AI-augmented in client advisory simultaneously — and the best enterprise AI deployments do exactly this.

Above the Loop vs In the Loop — How Human Oversight Is Designed

The most precise way to distinguish AI-first from AI-augmented at an operational level is through the concept of "human in the loop" — specifically, where the human sits relative to the AI execution. McKinsey's agentic organisation research offers the clearest framework: humans positioned "above the loop" in AI-first designs, and "selectively within the loop where human contact matters."

AI-First Operating Model

Human Above the Loop

  1. Human defines the goal, success criteria, and boundaries of acceptable output
  2. AI agents receive the goal and execute autonomously: research, planning, implementation, quality check
  3. AI submits completed work for human review
  4. Human validates, approves, or redirects, intervening at outcome level, not task level
  5. Human selectively re-enters when AI encounters edge cases, novel situations, or exceptions

AI-Augmented Operating Model

Human In the Loop

  1. Human professional receives task and plans approach, applying their judgment and expertise
  2. AI tools handle specific subtasks: research compilation, draft generation, data analysis, document review
  3. Human reviews AI output, applies professional judgment, and integrates into their work product
  4. Human delivers final output, maintaining full professional responsibility for the outcome
  5. Human remains the relationship owner, accountable for quality, and custodian of client trust
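
The two operating models differ mainly in who drives the control flow. The sketch below is one illustrative way to express that contrast in Python, not a reference implementation. Every name in it (`Outcome`, `run_above_the_loop`, `run_in_the_loop`, and the callables passed in) is hypothetical, and the human and AI actors are stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Outcome:
    output: str
    approved: bool
    escalated: bool = False


def run_above_the_loop(
    goal: str,
    agent: Callable[[str], str],
    is_exception: Callable[[str], bool],
    human_review: Callable[[str], bool],
) -> Outcome:
    """AI-first: the human sets the goal, the agent executes end to end."""
    work = agent(goal)  # step 2: autonomous execution
    if is_exception(work):
        # step 5: edge case, so a human re-enters instead of approving blindly
        return Outcome(work, approved=False, escalated=True)
    # steps 3-4: the human judges the finished outcome, not each task
    return Outcome(work, approved=human_review(work))


def run_in_the_loop(
    task: str,
    human_plan: Callable[[str], List[str]],
    ai_subtask: Callable[[str], str],
    human_integrate: Callable[[str, List[str]], str],
) -> Outcome:
    """AI-augmented: the human plans and owns the output; AI does subtasks."""
    subtasks = human_plan(task)                  # step 1: human plans the approach
    drafts = [ai_subtask(s) for s in subtasks]   # step 2: AI handles subtasks
    final = human_integrate(task, drafts)        # steps 3-4: human reviews and delivers
    return Outcome(final, approved=True)         # step 5: human stays accountable
```

In the first function the human appears only at the boundaries (goal, review, exception). In the second, the human call wraps every AI call. That structural difference, rather than the amount of automation, is what the two models describe.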

The Anthropic Economic Index provides data consistent with this distinction in practice: 52% of individual Claude.ai usage involves human-AI collaboration (the AI-augmented pattern), versus 45% full automation. Enterprise API usage shows the inverse: 77% automation, suggesting enterprises are deploying AI in full-automation patterns far more aggressively than individual users. Stanford's Enterprise AI Playbook analysis of 51 successful deployments finds that "structured human oversight correlates with success". Separately, McKinsey reports that 65% of AI high performers have senior leaders demonstrating active ownership and commitment to oversight architecture.

The Two Failure Modes — What Goes Wrong With Each Strategy

Understanding the failure mode of each strategy is as important as understanding when to use it — because both strategies have characteristic failure patterns that are predictable and preventable.

⚠️ AI-First Failure Mode: Replacing Judgment That Cannot Be Replaced

Klarna is the reference case, but it is not unique. AI-first strategy fails when it is applied to work that requires empathy, contextual judgment, or nuanced relationship reading: the qualities that experienced professionals develop over years of human interaction. The AI handles the structured, volume-driven layer well. It fails at the edge cases, emotional complexity, and novel situations that require genuine understanding of context rather than pattern matching. The failure is not the model. It is applying AI-first design to judgment-intensive work where AI-augmented (human primary, AI accelerating) would have produced better outcomes. The symptom: initially impressive metrics followed by quality degradation, customer satisfaction decline, and ultimately strategy reversal.

⚠️ AI-Augmented Failure Mode: Activity Without Transformation

EY's 2025 Work Reimagined Survey documented the failure mode of AI-augmented strategies deployed without proficiency investment: 88% daily AI usage, but only 5% using AI in advanced ways. Deloitte's State of AI in the Enterprise 2026 found that fewer than 60% of employees with approved AI tools use them regularly — despite broad adoption metrics. This is the failure pattern where organisations can demonstrate AI adoption numbers but cannot demonstrate AI impact. The symptom: impressive adoption statistics, GenAI tools purchased across the organisation, individual productivity gains that do not aggregate into enterprise-level EBIT impact. This is "activity without transformation" — the exact failure mode that BCG identifies as leaving 95% of organisations unable to achieve AI value at scale.

Designing your enterprise AI operating model and want an expert assessment of which workflows should be AI-first vs AI-augmented — before committing to architecture that is difficult and expensive to reverse? Automely provides this assessment free.

Free 45-minute enterprise AI operating model consultation. We map your specific workflows against the decision framework in this guide and recommend the architecture that avoids both failure modes.


The Four-Variable Decision Framework — Choosing the Right Strategy

The right enterprise AI operating model is not a universal choice. It is a per-workflow decision determined by four variables. Most enterprises should be running both strategies simultaneously across different parts of their operation — AI-first in high-volume structured execution workflows, AI-augmented in judgment-intensive professional work.

1. Workflow Nature: Is This Work Structured Execution or Contextual Judgment?

AI-First Indicator

High-volume, rules-governed, structured inputs with well-defined outputs. The work follows patterns and can be specified with clear decision criteria. Invoice processing, fraud detection, compliance checking, document classification, code generation for standard patterns.

AI-Augmented Indicator

Judgment-intensive, contextual, novel, or relationship-dependent. The work requires reading between the lines, adapting to unstated context, or exercising professional expertise that accumulated over time. Client advisory, strategic analysis, complex negotiations, sensitive customer conversations.

2. Error Consequence: What Happens When AI Gets It Wrong?

AI-First Indicator

Errors are recoverable, detectable before consequential impact, and correctable without significant damage. A misclassified document can be reclassified. A draft that misses the mark can be revised. A flagged transaction that was not fraud can be cleared. Low-stakes, high-reversibility errors favour AI-first with robust monitoring.

AI-Augmented Indicator

Errors are high-stakes, difficult to detect before damage occurs, or irreversible. A misdiagnosed medical condition. A legally incorrect contract clause. An investment recommendation that damages a client relationship. A compliance failure that triggers regulatory action. High-stakes, low-reversibility errors require human judgment in the loop.

3. Competitive Context: What Is the Primary Source of Competitive Advantage?

AI-First Indicator

Speed, cost, and scale are the primary competitive variables. The competitor who processes more applications faster at lower cost wins. This is the context where AI-first architecture produces the largest competitive advantage — because it unlocks throughput levels that human-operated competitors structurally cannot match. Fintech volume processing, e-commerce fulfilment, content at scale.

AI-Augmented Indicator

Trust, expertise, relationships, and professional judgment are the primary competitive variables. The competitor who provides better advice, deeper insight, and more trusted relationships wins. AI-first in these contexts commoditises the human element that is the actual product — as Klarna discovered. Professional services, advisory, premium customer service.

4. Data Readiness: Is Your Data Clean, Sufficient, and Governed Enough?

AI-First Indicator

Clean, well-labelled, consistently structured data is available in sufficient volume for the specific use case. The data governance is established: quality monitoring, data lineage, and access controls are in place. AI-first workflows on inadequate data amplify data quality problems at production scale, which is in line with the figure that 85% of AI project failures trace to data issues, not model issues.

AI-Augmented Indicator

Data is incomplete, inconsistent, or insufficient for autonomous AI execution — but usable by AI as a research and analysis tool under human oversight. AI-augmented is the appropriate strategy while data infrastructure is being built, or in domains where proprietary data volume is inherently limited. Human judgment compensates for data gaps that AI-first cannot tolerate.
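
The four variables can be read as a per-workflow decision procedure. The guide does not say how to combine them when they disagree, so the sketch below makes an explicit assumption: judgment-heavy work or unready data forces AI-augmented, structured work with recoverable errors in a speed-and-cost market is AI-first, and any other mix is treated as hybrid. The names `Strategy`, `WorkflowProfile`, and `recommend` are hypothetical, and the combination rule is illustrative only.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    AI_FIRST = "AI-first"
    AI_AUGMENTED = "AI-augmented"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class WorkflowProfile:
    structured_execution: bool    # variable 1: structured work vs contextual judgment
    errors_recoverable: bool      # variable 2: errors detectable and reversible
    competes_on_speed_cost: bool  # variable 3: speed/cost/scale vs trust/expertise
    data_ready: bool              # variable 4: clean, sufficient, governed data


def recommend(profile: WorkflowProfile) -> Strategy:
    """Assumed combination rule; the source defines the variables, not this logic."""
    # Judgment-intensive work or unready data keeps the human as primary actor.
    if not profile.structured_execution or not profile.data_ready:
        return Strategy.AI_AUGMENTED
    # Structured, data-ready, low-stakes, throughput-driven work suits AI-first.
    if profile.errors_recoverable and profile.competes_on_speed_cost:
        return Strategy.AI_FIRST
    # Structured work with high stakes or trust-based competition: AI processes,
    # a human validates.
    return Strategy.HYBRID


invoice_processing = WorkflowProfile(True, True, True, True)
client_advisory = WorkflowProfile(False, False, False, True)
medical_coding = WorkflowProfile(True, False, True, True)
```

Under these assumptions, `recommend(invoice_processing)` yields AI-first, `recommend(client_advisory)` yields AI-augmented, and `recommend(medical_coding)` yields hybrid. A real audit would likely score each variable on a scale rather than as a boolean.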

Workflow Mapping — Which Enterprise Functions Suit Each Strategy

The decision framework becomes concrete when applied to specific enterprise functions. The table below maps 12 common enterprise workflows to the recommended strategy (AI-first, AI-augmented, or hybrid), the rationale, and an example deployment outcome. Use it as a starting point for your own workflow audit — locate the functions that matter most in your operation, compare the recommended strategy against your current approach, and identify where reallocation is most likely to improve outcomes.

| Business Function / Workflow | Recommended Strategy | Rationale | Example Deployment |
| --- | --- | --- | --- |
| Invoice processing and accounts payable | AI-First | Structured, high-volume, rules-governed, recoverable errors | 80% time reduction; AI processes end-to-end, human reviews exceptions |
| Fraud detection (transaction monitoring) | AI-First | Real-time, pattern-based, human review on flagged items | JPMorgan: AI detects across millions of transactions; human analysts investigate alerts |
| Customer service: simple/structured queries | AI-First | FAQ, order status, account access; structured, recoverable | 65%+ of queries resolved autonomously; $0.50 vs $6.00 per interaction |
| Customer service: complaints, disputes, sensitive | AI-Augmented | Emotional, judgment-dependent, relationship-critical (Klarna's warning) | AI provides agent briefing; human owns the conversation |
| Legal document review (high volume) | AI-First | Structured, pattern-based document analysis; human validates findings | 240h/year/professional recovered; Salesforce $5M savings |
| Legal counsel and client advisory | AI-Augmented | Professional judgment, liability, and relationship; human primary | AI handles research; lawyer owns advice and client relationship |
| Software development: standard features | AI-First | Well-defined patterns, automated testing, human reviews output | 376% ROI; Claude Code agents implement features; engineers validate |
| Software architecture and novel design | AI-Augmented | Judgment-intensive, complex trade-offs; senior developer leads | AI accelerates research and prototyping; architect makes decisions |
| Sales outreach and lead qualification | AI-First | High-volume, structured research and personalisation | 25× outreach volume increase with agentic AI; human owns relationship conversations |
| Strategic account management | AI-Augmented | Relationship and judgment central; AI provides intelligence | AI prepares briefings and analysis; salesperson owns relationship |
| Medical documentation and coding | Hybrid | High volume + high stakes; AI processes, clinician validates | 68% document handling automated; clinician reviews and signs off |
| Clinical diagnosis and treatment planning | AI-Augmented | Highest stakes, irreversible consequences; physician leads with AI as decision support | AI surfaces relevant data and guidelines; physician makes diagnosis |
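
The per-interaction figures in the customer service row can be turned into a blended cost. The sketch below assumes that $0.50 is the cost of an AI-resolved interaction, that $6.00 is the cost of a human-handled one, and that every query not resolved autonomously goes to a human at the full human cost. The source does not state those assumptions, and the function name `blended_cost_per_interaction` is hypothetical.

```python
def blended_cost_per_interaction(
    ai_share: float, ai_cost: float, human_cost: float
) -> float:
    """Weighted average cost when ai_share of queries is resolved by AI."""
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0 and 1")
    return ai_share * ai_cost + (1.0 - ai_share) * human_cost


# Figures from the table row; their interpretation is an assumption.
blended = blended_cost_per_interaction(ai_share=0.65, ai_cost=0.50, human_cost=6.00)
saving_vs_all_human = 1.0 - blended / 6.00
```

With these inputs the blended cost is about $2.4 per interaction, roughly 60% below an all-human baseline. The human-handled 35% still accounts for most of the remaining cost, which is why routing complaints and disputes to people does not erase the savings on simple queries.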

The Enterprise Hybrid Model — AI-First at Execution, AI-Augmented at Judgment

The most successful enterprise AI deployments in 2026 do not choose between AI-first and AI-augmented. They deploy both simultaneously, with deliberate workflow mapping that assigns each approach to the functions where it is appropriate. McKinsey's description of the agentic organisation articulates the target state: "Humans will be mostly positioned above the loop to steer and direct outcomes and selectively within the loop where human contact matters."

This is precisely what JPMorgan has built: AI-first in transaction processing, fraud detection, and document handling — workflows where volume, speed, and consistency are what matter. AI-augmented in investment advisory, client relationship management, and strategic analysis — where professional judgment, accountability, and trust are what matter. The result is 450+ AI use cases in production without the reversal Klarna experienced, because the operating model was designed with workflow clarity rather than wholesale automation philosophy.

✅ The Satya Nadella Framing — What the Sustainable Model Looks Like

Microsoft CEO Satya Nadella at Davos 2026: "The future belongs to companies that treat models as components, and treat orchestration, context, and proprietary knowledge as their true differentiators." This is the enterprise AI operating model described in one sentence: models (AI) handle the execution layer, while orchestration (workflow design), context, and proprietary knowledge (including proprietary data) are the competitive differentiators that humans embed and maintain. Neither strategy alone captures this. The hybrid does, because it deploys AI where models create the most throughput advantage, and invests in the human-embedded proprietary knowledge that the models run on.

PwC's 2026 AI predictions describe the practical implementation of this hybrid: "Spell it out. As you design a new agentic workflow, map it step-by-step, specifying where agents own the work, where people do, where people and agents collaborate, and how oversight can take place for each step." This mapping — explicit, deliberate, per-step — is what separates the enterprise AI implementations that succeed from those that either over-automate (Klarna's path) or under-automate (the 88% adoption, 5% proficiency failure). For the governance architecture that makes this hybrid model sustainable, see our guide to why most AI projects fail and what successful ones do differently, particularly the pre-approval success-metric discipline.
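
The step-by-step mapping PwC describes can be kept as a small, checkable record rather than a slide. The sketch below is one possible shape for it, under stated assumptions: the `Owner` categories mirror the quote (agents, people, collaboration), while the `Step` fields, the `steps_missing_oversight` helper, and the example invoice workflow are hypothetical illustrations, not taken from PwC.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Owner(Enum):
    AGENT = "agent owns the work"
    PERSON = "person owns the work"
    SHARED = "person and agent collaborate"


@dataclass(frozen=True)
class Step:
    name: str
    owner: Owner
    oversight: str  # how a human can check this step; empty if not yet specified


def steps_missing_oversight(steps: List[Step]) -> List[str]:
    """Names of agent-involved steps with no oversight mechanism recorded."""
    return [
        s.name
        for s in steps
        if s.owner is not Owner.PERSON and not s.oversight.strip()
    ]


# Hypothetical example workflow, for illustration only.
invoice_flow = [
    Step("Extract invoice fields", Owner.AGENT, "sampled human audit"),
    Step("Match to purchase order", Owner.AGENT, ""),
    Step("Resolve mismatches", Owner.SHARED, "human approves each resolution"),
    Step("Approve payment run", Owner.PERSON, ""),
]

gaps = steps_missing_oversight(invoice_flow)
```

Here `gaps` contains only "Match to purchase order": an agent-owned step where nobody has written down how oversight happens. Running a check like this before launch is one way to make the per-step mapping explicit rather than implied.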


Hamid Khan

CEO & Co-Founder, Automely

Hamid leads Automely's enterprise AI practice, designing operating models that combine AI-first execution with AI-augmented judgment for businesses across the US, UK, and EU. Sources: McKinsey agentic organisation research, BCG AI at scale studies, Anthropic Economic Index, EY 2025 Work Reimagined Survey, Deloitte State of AI in the Enterprise 2026, Stanford Enterprise AI Playbook, PwC 2026 AI predictions, Bloomberg coverage of Klarna, public statements by Sebastian Siemiatkowski and Satya Nadella at WEF Davos 2026. 4.9★ Clutch. 120+ AI projects.