The Conversation Lab: Does agent empathy move customer satisfaction?

Author: Alexander Dunn
Reading time: 6 mins
Last updated: May 6 2026

We analyzed 35 million conversations to find out if agent empathy actually moves customer satisfaction

This article is the first in a series we’re calling Level AI’s Conversation Lab, where we publish short data stories from customer conversations. We analyze interaction patterns across our customer base, using our platform’s native capabilities, and share what we find with the rest of the CX industry.

Empathy is one of those things everyone in CX agrees matters and almost nobody measures with any rigor. It shows up in training materials, gets referenced loosely in QA reviews, and rarely gets tied back to satisfaction in a way that's specific enough to coach against.

We wanted to know if that was a measurement failure or a real ambiguity, so we ran the analysis. We looked at 35.1 million conversations that happened year to date, across 85 customers. The question we sought to answer: does explicit emotional validation (an agent directly acknowledging a customer's frustration, stress, or concern) ultimately change satisfaction scores?

Across the full dataset, conversations with explicit emotional validation scored an average iCSAT of 3.94. Conversations without it scored 3.52. That is an absolute lift of +0.42 points and a relative improvement of +11.84%.
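For readers who want to reproduce the arithmetic, here is a minimal sketch of how the absolute and relative lift follow from the two averages. The function name `lift` is ours, chosen for illustration. Note that plugging in the rounded means (3.94 and 3.52) yields roughly +11.9% rather than +11.84%; that gap is consistent with the published figure being computed from unrounded averages, though the article does not state this.

```python
def lift(with_validation: float, without_validation: float) -> tuple[float, float]:
    """Return (absolute lift in points, relative lift in percent)."""
    absolute = with_validation - without_validation
    # Relative lift is measured against the baseline (no validation) group.
    relative = absolute / without_validation * 100
    return absolute, relative


# Rounded averages quoted in the article.
absolute, relative = lift(3.94, 3.52)
print(f"absolute lift: {absolute:+.2f} points")
print(f"relative lift: {relative:+.2f}%")
```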

Figure: iCSAT lift when agents explicitly acknowledged customer emotion

A note on iCSAT

Throughout this report, our primary outcome measure is iCSAT (Inferred Customer Satisfaction), scored on a 1–5 scale where higher is better. iCSAT scores every interaction on three dimensions: sentiment, resolution, and customer effort. Traditional CSAT captures only the small percentage of customers who fill out a survey, while iCSAT measures the full population.

How we measured it

We used empathy- and sympathy-related Instascore questions as a proxy for explicit emotional validation. In practice, that means we looked for conversations where the agent acknowledged the customer’s frustration, concern, stress, or emotional state directly. We then compared those conversations against conversations where that validation did not appear, using iCSAT as the outcome measure. The proxy definition stayed constant across the analysis: Instascore questions containing “empathy” or “sympathy” counted as positive when QUESTION_SCORE > 0. Conversations were deduplicated at the ASR log level before aggregation.
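The proxy rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production pipeline: the `QuestionScore` record, the `asr_log_id` key, and the `flag_validated` helper are hypothetical names, and only `QUESTION_SCORE > 0` and the "empathy"/"sympathy" keywords come from the description above. The sketch also assumes that a conversation with no matching question counts as not validated, which the article does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionScore:
    asr_log_id: str        # hypothetical key identifying one conversation
    question_text: str     # text of the Instascore question
    question_score: float  # QUESTION_SCORE in the description above


KEYWORDS = ("empathy", "sympathy")


def flag_validated(rows):
    """Map each asr_log_id to True if any empathy/sympathy question scored > 0."""
    flags = {}
    for row in rows:
        is_proxy = any(k in row.question_text.lower() for k in KEYWORDS)
        hit = is_proxy and row.question_score > 0
        # Keying by asr_log_id collapses duplicate rows to one entry per conversation.
        flags[row.asr_log_id] = flags.get(row.asr_log_id, False) or hit
    return flags


# Made-up rows for illustration only.
rows = [
    QuestionScore("a", "Did the agent show empathy?", 1.0),
    QuestionScore("a", "Did the agent show empathy?", 1.0),  # duplicate row
    QuestionScore("b", "Did the agent express sympathy?", 0.0),
    QuestionScore("c", "Did the agent verify identity?", 1.0),
]
flags = flag_validated(rows)
print(flags)
```

Deduplicating before aggregation matters because a conversation logged twice would otherwise be counted twice in its group's average.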

The lift depends on the type of friction

The +11.84% aggregate matters, but the category view is where the finding becomes more useful. Across the major categories in the dataset, explicit emotional validation correlated with higher customer satisfaction in nearly every case, and the size of the lift changed meaningfully by category.

Category	iCSAT lift
Technology & SaaS	+14.8%
Financial Services & Insurance	+14.3%
Food & Hospitality	+12.6%
Retail & E-Commerce	+8.4%
Home & Real Estate	+4.7%
Utilities & Energy	+2.7%
Healthcare & Wellness	+1.8%
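A per-category lift like the ones in the table can be computed by splitting each category into validated and non-validated conversations and comparing the two means. The sketch below assumes a simple record shape of `(category, validated, icsat)`, which is our own illustration rather than the actual schema, and the toy records are invented solely to show the mechanics.

```python
from collections import defaultdict
from statistics import mean


def lift_by_category(records):
    """records: iterable of (category, validated: bool, icsat: float).

    Returns {category: relative lift in percent}, skipping any category
    that is missing either the validated or the non-validated group.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for category, validated, icsat in records:
        groups[category][validated].append(icsat)

    lifts = {}
    for category, scores in groups.items():
        if scores[True] and scores[False]:
            base = mean(scores[False])
            lifts[category] = (mean(scores[True]) - base) / base * 100
    return lifts


# Toy records, not drawn from the dataset.
toy = [
    ("A", True, 4.0), ("A", True, 5.0), ("A", False, 3.0), ("A", False, 3.0),
    ("B", True, 4.0), ("B", False, 4.0),
]
print(lift_by_category(toy))
```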

In uncertain situations, validation carries more weight

Technology & SaaS showed a +14.8% lift in iCSAT when agents explicitly acknowledged customer emotion. Financial Services & Insurance showed a +14.3% lift. Those are the two strongest category-level lifts in the dataset.

What makes that useful is the range of interactions sitting inside both categories. In Technology & SaaS, the work likely includes account access, payment friction, onboarding issues, implementation questions, product support, identity verification, scheduling problems, and service interruptions. In Financial Services & Insurance, it likely includes claims, disputes, payment issues, account access, debt conversations, policy questions, and money movement. These are different service environments with different expectations and different resolution paths. The positive relationship still appears in both.

That strengthens the broader finding. Explicit emotional validation correlated with higher satisfaction across categories with very different forms of customer work.

Disrupted plans also benefit from acknowledgment

Food & Hospitality showed a +12.6% lift across 442,610 empathy-flagged conversations.

This category includes food ordering, reservation-related support, lodging, and travel-adjacent service interactions where timing, disruption, service quality, and customer expectations are already visible in the conversation. Acknowledgment carries more weight when the customer is dealing with a broken plan, a delayed experience, or a service issue that already feels immediate.

The result adds useful depth to the report. Validation correlated with higher satisfaction in categories shaped by urgency and service disruption, and it also appeared in categories built around more routine operational work.

Even transactional interactions respond to acknowledgment

Retail & E-Commerce showed an +8.4% lift in iCSAT when agents explicitly acknowledged customer emotion. That result came from 3.1 million empathy-flagged conversations, making it one of the clearest category-level signals in the dataset.

Orders, returns, damaged shipments, subscription issues, billing problems, and fulfillment delays may look operational on paper. They still create friction for the customer. The data shows that acknowledgment changes how those interactions land.

This is one of the more useful findings in the report. Explicit emotional validation did not only correlate with higher satisfaction in categories defined by claims, uncertainty, or service disruption. It also showed up in a category built around repeatable, high-volume customer work.

In an industry full of edge cases, the signal gets muddied

Healthcare & Wellness showed a +1.8% lift across 429,834 empathy-flagged conversations.

That is a smaller effect than in the top categories, but it is still positive.

Many health-related interactions carry constraints an agent cannot resolve in the moment. Coverage decisions, care timelines, transportation logistics, clinical uncertainty, staffing limitations, and product or treatment constraints can shape satisfaction in ways that acknowledgment alone cannot materially lift. In that environment, validation still matters, but it has less room to move the score.

That makes Healthcare & Wellness a lower-lift category within the overall pattern.

Figure: Agent empathy across 35.1 million conversations

4 out of 5 conversations skip validation entirely

One number in this dataset matters as much as the lift itself: only 18.2% of conversations across the full 35.1 million included explicit emotional validation. The remaining 81.8% did not.
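To put those percentages in absolute terms, a quick back-of-the-envelope calculation helps. The counts below are derived from the rounded figures quoted above (35.1 million and 18.2%), so they are approximations rather than exact totals from the dataset.

```python
TOTAL_CONVERSATIONS = 35_100_000  # rounded total quoted in the article
VALIDATION_RATE = 0.182           # share with explicit emotional validation

validated = round(TOTAL_CONVERSATIONS * VALIDATION_RATE)
not_validated = TOTAL_CONVERSATIONS - validated

print(f"with validation:    ~{validated / 1e6:.1f}M")
print(f"without validation: ~{not_validated / 1e6:.1f}M")
```

On these rounded inputs, roughly 6.4 million conversations included validation and roughly 28.7 million did not.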

A behavior becomes operational when teams define it clearly, score it consistently, and coach against that standard. Explicit emotional validation needs that level of rigor. Otherwise it stays subjective, uneven, and difficult to operationalize.

This is where rubric design matters: teams have to agree on what counts and review it across the full conversation set.

What to do with this data

Three operational moves come directly out of this analysis.

1. Score validation as a behavior

Define explicit emotional validation as an observable action. Did the agent directly acknowledge the customer’s emotional state or not? Score that consistently across conversations, the same way you score resolution or compliance.

2. Coach by category

The effect size changes by category. Technology & SaaS, Financial Services & Insurance, Food & Hospitality, and Retail & E-Commerce all show enough lift to support targeted coaching around acknowledgment. The coaching standard should reflect the operating environment instead of flattening every queue into the same rubric.

3. Score every conversation

Sampled QA can catch missed moments on individual calls. It does not reliably surface a behavioral gap that runs across a queue, shift, or manager cohort. Full-population measurement is what turns validation into a measurable indicator.
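One way to see why sampling struggles here is to look at how uncertain an observed validation rate is at different sample sizes. The sketch below uses the standard error of a proportion under simple random sampling with a normal approximation, which is rough at very small sample sizes. The sample sizes are hypothetical, and the 18.2% rate is the overall figure from this analysis, used only as a plausible baseline.

```python
import math


def standard_error(rate: float, sample_size: int) -> float:
    """Standard error of an observed proportion under simple random sampling."""
    return math.sqrt(rate * (1 - rate) / sample_size)


RATE = 0.182  # overall share of conversations with validation, from the article

# Hypothetical numbers of reviewed conversations per queue or cohort.
for n in (5, 50, 500, 5000):
    half_width = 1.96 * standard_error(RATE, n)  # approximate 95% interval
    print(f"n={n:>5}: {RATE:.1%} +/- {half_width:.1%}")
```

The interval narrows with the square root of the sample size, so a handful of reviewed calls per agent leaves too much noise to tell a genuine behavioral gap from chance.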

What this report says

This analysis confirms that explicit emotional validation is a measurable behavior. Across 35.1 million conversations, it correlated with higher customer satisfaction. The size of that lift changed by category. Technology & SaaS and Financial Services & Insurance showed the strongest lifts in the dataset. Food & Hospitality added another strong result. Retail & E-Commerce showed the pattern in a high-volume service environment. Healthcare & Wellness showed a smaller but still positive lift.

That is enough to support a more operational view of empathy.
