"We Went Too Far" -- Klarna and the Cost of Replacing Human Judgment
Twin Ladder Casebook Series | Twin Ladder | February 2026
The Hook
In May 2025, Sebastian Siemiatkowski sat across from a Bloomberg journalist and said the four words that no CEO who has staked a company's identity on artificial intelligence wants to say: "We went too far."
Eighteen months earlier, the Klarna chief executive had stood on the opposite side of this admission. He had declared that AI "can already do all of the jobs that we, as humans, do." He had frozen hiring across the entire company. He had replaced approximately 700 customer service agents with an OpenAI-powered chatbot and announced the results to the world: 2.3 million conversations handled in the first month, resolution times compressed from eleven minutes to two, customer service costs cut by 40 percent. The numbers were extraordinary. The narrative was irresistible. Klarna was not merely adopting AI. It was becoming AI.
And then it was not. By the time Siemiatkowski spoke to Bloomberg, the company was hiring human agents again. The reason, stated with uncommon directness for a technology executive, was that "cost unfortunately seems to have been a too predominant evaluation factor when organizing this, what you end up having is lower quality." The most celebrated AI deployment in European fintech had produced a reversal that would reshape how the industry thinks about the boundary between automation and human judgment.
The Story
The Promise
Klarna was already one of Europe's most valuable private technology companies when the AI pivot began. Founded in Stockholm in 2005, the buy-now-pay-later pioneer had reached a peak valuation of $45.6 billion in 2021, built on the premise that consumer credit could be made seamless, transparent, and embedded directly into the checkout experience. By late 2022, the company employed approximately 5,500 people across multiple continents.
The AI strategy emerged from a specific observation. Klarna's customer service operation -- like most financial services support functions -- was dominated by repetitive, pattern-matching inquiries. Order status. Refund requests. Payment schedule adjustments. Account access. These interactions consumed enormous labour resources and followed predictable scripts. They were, in the language of automation economics, the low-hanging fruit.
In 2023, Siemiatkowski made the decision that would define the company's next two years. Klarna implemented a hiring freeze and began deploying an AI assistant built in collaboration with OpenAI. The system would not supplement human agents. It would replace them. Natural attrition -- which runs at 15 to 20 percent annually in customer service operations -- would handle the transition. No mass layoffs. Just a gradual, seemingly painless substitution of human labour with machine capability.
The Metrics Success
The early numbers were, by any conventional measure, remarkable. When Klarna launched its AI assistant globally in February 2024, the company reported that the system had handled two-thirds of all customer service chats within its first month -- the equivalent workload of 700 full-time agents. Resolution time fell from an average of eleven minutes to two. Repeat inquiries dropped by 25 percent. The company projected a $40 million profit improvement for 2024 attributable to the AI deployment alone.
Siemiatkowski promoted these results aggressively. In interviews, investor presentations, and public appearances, he positioned Klarna as proof that AI could deliver on the replacement thesis -- not in theory, not in a pilot, but at scale across a global financial services operation. OpenAI featured Klarna as a showcase case study on its own website. The fintech press amplified the narrative. When Klarna's headcount fell from 5,527 to approximately 3,400 over two years -- a reduction of nearly 40 percent -- the CEO attributed it to AI and natural attrition, and described the savings as enabling a nearly 60 percent increase in remaining employee salaries.
The story was clean. The metrics were strong. And the metrics were hiding everything that mattered.
The Hidden Costs
The first signs appeared in customer feedback channels. Complaints about generic, formulaic responses. Reports of being trapped in loops -- explaining an issue to the chatbot, failing to reach resolution, starting again from scratch. Customers with disputes that required judgment, not pattern-matching, found themselves facing a system incapable of distinguishing between a routine refund and a genuinely complex financial grievance.
The empathy gap surfaced most visibly in the interactions that mattered most. A customer whose payment dispute involved extenuating circumstances -- a medical emergency, a merchant fraud claim, a billing error compounded by an account lock -- encountered a system that could process the words but could not read the situation. The chatbot could resolve a straightforward return in two minutes. It could not recognize when two minutes of scripted responses would drive a loyal customer to a competitor.
The Better Business Bureau logged over 900 complaints against Klarna in a three-year period, predominantly focused on refunds and billing issues -- precisely the categories where the gap between automated processing and human judgment is widest. Internal reviews revealed what the headline metrics had obscured: service quality had declined materially following the replacement of human agents with chatbots -- a reality the CEO himself would later acknowledge when he told Bloomberg that cost had become "a too predominant evaluation factor," resulting in "lower quality." The speed metric was up. The trust metric was down. And in financial services, trust is the product.
By early 2025, the pattern was unmistakable. Klarna's AI-first customer service strategy had optimized for the variable that was easiest to measure -- resolution time -- while degrading the variable that was hardest to measure but most consequential: customer confidence that the company would exercise judgment on their behalf.
The Reversal
Siemiatkowski's Bloomberg admission in May 2025 was not a pivot. It was a correction. The company announced that it would rehire human agents, piloting what it described as an "Uber-style" workforce model -- remote agents with flexible schedules, targeting students, parents, and workers in underserved labour markets. The system would be hybrid: AI for straightforward queries, human agents for situations requiring empathy, discretion, or escalation.
"From a brand perspective, a company perspective, I just think it is so critical that you are clear to your customer that there will always be a human if you want," Siemiatkowski told Bloomberg. The statement was notable not for what it conceded but for what it implied. The AI had been deployed to reduce cost. The human was being rehired to restore trust. And trust, once eroded, is more expensive to rebuild than it was to maintain.
In September 2025, Klarna completed its United States IPO, raising $1.37 billion at a valuation of approximately $17.4 billion. The shares rose 15 percent on the first day of trading. The market valued the company handsomely. But the $17.4 billion figure sat well below the $45.6 billion peak of 2021 -- a gap that reflected, among other factors, the lingering questions about whether the company's efficiency gains had come at the cost of the customer relationships that drove its growth.
Through the Twin Ladder Lens
The Klarna case is a textbook illustration of what happens when an organization tries to operate at a high level of AI integration without climbing the ladder that makes such integration sustainable.
The Twin Ladder framework defines four progressive levels of AI competence. Level 0 is AI Literacy -- the baseline ability to critically evaluate what AI produces. Level 1 is the Professional Twin -- the practice of mirroring individual roles with AI agents, not to replace professionals but to create the conditions for comparison, learning, and judgment-building. Level 2 is the Operational Twin -- digital replicas of business functions that enable testing before committing. Level 3 is the Ecosystem Twin -- modelling entire value chains so that systemic effects become visible before they become damage.
The ladder is designed to be climbed, not skipped. Klarna skipped it entirely.
Before the AI deployment, Klarna's 700 customer service agents constituted a Level 1 asset that the company did not recognize as such. Those agents carried tacit knowledge: the ability to read emotional subtext in a customer's message, the judgment to distinguish a routine complaint from a brewing churn risk, the experience to know when a strict policy should yield to a pragmatic resolution. They knew which refund disputes signalled merchant fraud patterns. They knew which tone of complaint predicted social media escalation. They understood the difference between a customer who wanted an answer and a customer who wanted to be heard.
This knowledge was not documented in any training manual. It was not captured in any dataset. It was the accumulated residue of thousands of human interactions, carried in memory, intuition, and professional judgment. In the language of the Twin Ladder, these agents represented a latent Professional Twin capability -- human expertise that, properly mirrored by AI, could have produced an augmented service function more capable than either humans or AI alone.
Instead, Klarna eliminated the human side of the mirror. It did not move its agents up the ladder -- from routine query handlers to complex case specialists, from front-line responders to customer intelligence analysts, from script followers to judgment exercisers. It moved them off the ladder entirely. The 700 agents were not reskilled. They were replaced.
The result was a regression. Klarna had the potential to operate at Level 1 -- human professionals augmented by AI twins, each making the other more effective. Instead, it regressed to a pre-ladder state: a system that could process queries but could not exercise judgment. The AI handled volume. It could not handle meaning. And the humans who could have supplied that meaning were gone.
The Twin Ladder framework predicts this outcome with precision. The ladder exists because AI capability without human competence produces exactly what Klarna experienced -- impressive metrics masking eroding trust, speed without understanding, efficiency without judgment. The ladder is climbed, not skipped, because each level builds the human capacity to direct the AI capability above it. Without Level 1 professionals who understand the domain deeply enough to challenge AI output, there is no foundation for anything that follows.
The Pattern
Klarna is not an isolated case. It is the most visible instance of a pattern now emerging across industries -- and survey data suggests the pattern is spreading.
In February 2026, Gartner published a prediction that by 2027, 50 percent of companies that reduced customer service headcount due to AI will rehire staff to perform similar functions, albeit under different job titles. The prediction was grounded in a survey of 321 customer service leaders conducted in October 2025, which found that only 20 percent had actually reduced agent staffing due to AI. The majority reported that headcount remained steady, even as they supported more customers. The companies that did cut had begun to discover what Klarna discovered: AI handles transactions, but customers want relationships.
Forrester's Predictions 2026 report quantified the regret: 55 percent of employers who laid off workers in favour of AI now report regretting the decision. The core finding is stark: organizations are making staffing decisions based on AI capabilities that do not yet exist, betting on a future promise while destroying present capacity. Forrester's analysts noted that "many firms are so focused on chasing AI-fuelled efficiencies that they have not determined what AI can actually offer."
The measurement blind spot at the centre of this pattern is the confusion of speed with trust. Klarna measured resolution time -- two minutes, down from eleven. It did not measure, or did not weight heavily enough, the quality of resolution, the emotional experience of the customer, or the downstream effect on repeat purchase behaviour. This is the classic trap of quantitative optimization: the variables that are easiest to measure become the variables that drive decisions, while the variables that matter most -- trust, loyalty, judgment, empathy -- resist measurement and therefore resist prioritization.
The empathy gap is real, and it is structural. AI systems process language. They do not process meaning. They can identify that a customer is using words associated with frustration. They cannot feel the weight of that frustration or calibrate their response to the specific human situation producing it. When the stakes are low -- a routine order inquiry, a simple status check -- this gap is invisible. When the stakes are high -- a disputed charge during a financial hardship, an account lockout during a time-sensitive transaction -- the gap becomes the entire experience.
The Lesson
The lesson from Klarna is not that AI should be avoided in customer service. It is that AI augments human judgment; it does not replace it. The distinction is not semantic. It is architectural.
IKEA understood this distinction and built accordingly. When the Swedish retailer deployed its AI chatbot "Billie" in 2021, the system took over 47 percent of routine customer queries. IKEA did not use the efficiency gain to eliminate its 8,500 customer service employees. It reskilled them as interior design advisors -- a higher-value role that used their accumulated knowledge of customer needs, product capabilities, and home environments to deliver a service that AI could not replicate. The result was 1.3 billion euros in revenue generated through IKEA's remote interior design channel in 2022 alone. The AI handled the routine. The humans handled the judgment. Both became more valuable.
The contrast with Klarna is instructive. Both companies are Swedish. Both deployed AI chatbots in their customer service operations. Both achieved significant efficiency gains. But IKEA moved its people up the ladder -- from routine handlers to expert advisors -- while Klarna moved its people off the ladder entirely. IKEA built a Level 1 implementation: human professionals augmented by AI, each operating in the domain where they add the most value. Klarna attempted to skip to a fully automated state and discovered, expensively, that the state does not exist.
The Twin Ladder approach to this challenge is straightforward. Do not eliminate the roles that carry tacit knowledge. Transform them. Use AI to handle the volume, the repetition, the pattern-matching that consumes human attention without developing human judgment. Use the freed capacity to move professionals into roles where their judgment, empathy, and contextual understanding create value that AI cannot. Measure not only what the AI produces but what the humans operating alongside it are learning. The metric that matters is not resolution time. It is whether your people are becoming more capable or more dependent.
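The triage logic described above -- AI for volume and pattern-matching, humans for judgment, empathy, and escalation -- can be sketched as a simple routing rule. This is a minimal illustrative sketch, not Klarna's actual system; every signal name, category, and threshold below is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Query:
    """A customer service query with hypothetical triage signals."""
    category: str           # e.g. "order_status", "refund_dispute"
    sentiment: float        # -1.0 (distressed) .. 1.0 (calm), from an upstream classifier
    prior_contacts: int     # times the customer has already raised this issue
    amount_disputed: float  # money at stake, in the account currency

# Categories that are pure pattern-matching and safe to automate (illustrative).
ROUTINE = {"order_status", "delivery_eta", "password_reset"}

def route(q: Query) -> str:
    """Route a query to the AI assistant or a human agent.

    Encodes the casebook's lesson: volume goes to the AI, judgment goes
    to a person. All thresholds here are invented for illustration.
    """
    # First-time routine queries are the low-hanging fruit for automation.
    if q.category in ROUTINE and q.prior_contacts == 0:
        return "ai_assistant"
    # Distress, repeat loops, or meaningful money at stake need a human.
    if q.sentiment < -0.5 or q.prior_contacts >= 2 or q.amount_disputed > 200:
        return "human_agent"
    # Default to automation, but a real system should always offer a human opt-out.
    return "ai_assistant"

# A repeat refund dispute from an upset customer is exactly the case
# the chatbot-only model mishandled:
print(route(Query("refund_dispute", -0.8, 2, 350.0)))
```

The design point is that the escalation branch is checked before the automation default: the system must actively prove a query is routine, rather than treating every query as routine until the customer complains.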
The organizations that will lead the next decade are not the ones that replaced the most humans with AI. They are the ones that used AI to make their humans irreplaceable.
Monday Morning Question: If your AI systems went offline tomorrow, would your team have the judgment and domain knowledge to serve your customers -- or did you automate away the expertise you will need to recover?
Sources
- Bloomberg -- "Klarna Turns From AI to Real Person Customer Service" (May 2025): https://www.bloomberg.com/news/articles/2025-05-08/klarna-turns-from-ai-to-real-person-customer-service
- Gartner -- "Gartner Predicts Half of Companies That Cut Customer Service Staff Due to AI Will Rehire by 2027" (February 2026): https://www.gartner.com/en/newsroom/press-releases/2026-02-03-gartner-predicts-half-of-companies-that-cut-customer-service-staff-due-to-ai-will-rehire-by-2027
- Klarna International -- "Klarna AI Assistant Handles Two-Thirds of Customer Service Chats in Its First Month" (February 2024): https://www.klarna.com/international/press/klarna-ai-assistant-handles-two-thirds-of-customer-service-chats-in-its-first-month/
- CNBC -- "Klarna CEO Says AI Helped Company Shrink Workforce by 40%" (May 2025): https://www.cnbc.com/2025/05/14/klarna-ceo-says-ai-helped-company-shrink-workforce-by-40percent.html
- Fortune -- "Klarna Plans to Hire Humans Again, as New Landmark Survey Reveals Most AI Projects Fail to Deliver" (May 2025): https://fortune.com/2025/05/09/klarna-ai-humans-return-on-investment/
- Forrester Research -- "Predictions 2026: The Future of Work" -- cited via https://www.hcamag.com/us/specialization/recruitment/when-ai-redundancies-backfire-employers-now-scrambling-to-rehire-humans/564000
- PYMNTS -- "IKEA Uses AI to Transform Call Center Employees Into Interior Design Advisors" (2023): https://www.pymnts.com/news/retail/2023/ikea-uses-artificial-intelligence-transform-call-center-employees-into-interior-design-advisors/
- Entrepreneur -- "Klarna CEO Reverses Course By Hiring More Humans, Not AI" (2025): https://www.entrepreneur.com/business-news/klarna-ceo-reverses-course-by-hiring-more-humans-not-ai/491396

