
Sentiment Analysis and Emotion Recognition: What’s the Difference?

Sentiment analysis and emotion recognition are two of the hottest topics in speech understanding today. But they're often confused with one another—so much so that people often say "sentiment analysis" when they're referring to emotion recognition. In this post, we'll explain what both sentiment analysis and emotion recognition are, how they are used in business, and some of the limitations and challenges of each.

1. What is Sentiment Analysis?

Sentiment analysis is typically a text-based machine learning classification task. It might operate on single sentences, paragraphs, or even entire articles. The typical goal of sentiment analysis is to determine whether the author of a text has a positive or a negative opinion about whatever the topic of the text is. To this end, the typical training sets for sentiment analysis models are things like IMDb movie reviews and Amazon product reviews, where it's easy to tell how someone felt about a topic (that is, their star ratings can be used as part of the training data).
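To make the classification task concrete, here's a minimal sketch of the review-based training idea using a Naive Bayes word-count model. The "reviews" and labels below are invented toy data standing in for the star-rated review corpora mentioned above; a production system would use a far larger dataset and a learned model rather than this bare-bones approach.

```python
# Toy text-based sentiment classifier: Naive Bayes over word counts,
# trained on invented "review" snippets labeled by their (hypothetical) star rating.
from collections import Counter
import math

train = [
    ("a wonderful film with a great cast", "positive"),
    ("i loved every minute of this movie", "positive"),
    ("terrible plot and boring characters", "negative"),
    ("a waste of time, truly awful", "negative"),
]

# Count words per class and how often each class occurs.
word_counts = {"positive": Counter(), "negative": Counter()}
class_counts = Counter()
for text, label in train:
    word_counts[label].update(text.split())
    class_counts[label] += 1

vocab = {w for counts in word_counts.values() for w in counts}

def classify(text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    scores = {}
    for label in word_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("a great movie, loved it"))
```

Even this toy model illustrates the core mechanic: the classifier learns which words co-occur with positive versus negative labels, then scores new text against those learned associations.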

Sentiment analysis has a variety of uses, including analyzing customer feedback, monitoring social media conversations, tracking brand reputation, gauging public opinion on a topic or issue, and evaluating customer satisfaction levels.

Limitations and Challenges of Sentiment Analysis

There are, of course, limitations to systems like this. Sarcasm, for example, can be hard for sentiment analysis to detect (which isn't surprising, since humans also struggle to correctly identify sarcasm in written language). That might be less of a problem during training, when you have the ground truth of someone's rating, but in the real world, the goal of sentiment analysis is to determine how someone felt in the absence of a rating.


2. What is Emotion Recognition?

The second way that people use the term sentiment analysis is to refer to what is more appropriately known as emotion recognition (sometimes called emotion detection, or incorrectly called emotional recognition). Unlike sentiment analysis, emotion recognition typically relies on audio data rather than text, and uses things like intonation, volume, and speed to determine what emotion a speaker is feeling, usually coded as one of several categories like happy, sad, angry, etc.
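To illustrate the kinds of acoustic signals involved, here's a sketch that computes two of them from raw audio: volume (RMS energy) and a rough pitch proxy (zero-crossing rate). The threshold values and the rule-based "classifier" are invented purely for illustration; real emotion recognition systems learn these mappings from large sets of labeled audio rather than using hand-picked rules.

```python
# Sketch of acoustic features an emotion recognition model might draw on.
# The thresholds and label rules below are hypothetical, for illustration only.
import numpy as np

SAMPLE_RATE = 16_000  # samples per second

def frame_features(audio, frame_len=400):
    """Split audio into frames; compute per-frame loudness and a pitch proxy."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames**2, axis=1))  # loudness (RMS energy)
    # Zero-crossing rate: fraction of adjacent samples that change sign.
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return rms, zcr

def guess_emotion(audio):
    """Toy rule-based mapping from acoustic features to an emotion label."""
    rms, zcr = frame_features(audio)
    loud = rms.mean() > 0.3   # hypothetical threshold
    fast = zcr.mean() > 0.2   # hypothetical threshold
    if loud and fast:
        return "angry"
    if loud:
        return "happy"
    return "calm"

# Synthetic audio: a loud, high-frequency tone stands in for agitated speech.
t = np.linspace(0, 1, SAMPLE_RATE, endpoint=False)
agitated = 0.8 * np.sin(2 * np.pi * 3000 * t)
print(guess_emotion(agitated))
```

The point of the sketch is the feature extraction step: loudness and pitch-like measures are exactly the kind of information (alongside speed and intonation contours) that trained models consume.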

It’s important to note that this doesn’t automatically correlate with how someone feels about a topic. Someone can be happy talking about how they don’t like something (who doesn’t love to vent?), so ultimately, emotion recognition is trying to get at different information than sentiment analysis.

Uses of emotion recognition include helping call center agents understand how a caller is feeling, monitoring hospital patients for stress and pain, and even tracking responses to advertisements.

Limitations and Challenges of Emotion Recognition

If you’ve ever communicated with another human being—and we hope you have—you know that even with all of our experience with social interactions, it can still be tricky to determine someone’s emotional state just from talking to them. This is doubly problematic for attempts at emotion recognition. Not only are you trying to build a system to do something that’s tricky for humans, but you’re also doing so using a dataset that humans have labeled based on the emotion that they think is present, even though they might not agree with one another, and even though their labels might not accurately match the emotion the speaker was actually feeling.

This is further complicated by the fact that the audio used to train the model might be acted data, and not people actually expressing the emotion that they’re experiencing. Plus, most emotion recognition systems only look at audio data, and don’t include other things that could help make a determination, such as body language or facial expressions.

It’s also the case that we do more with our voice than express emotion—for example, sarcasm in English carries a particular kind of intonation that’s recognizable, but sarcasm isn’t an emotion. This creates an added complication for emotion recognition systems.

3. A Combination of Approaches

Because of the challenges of sentiment analysis and emotion recognition, some people have tried combining the two systems to better understand how people feel about a topic. If you have audio of someone speaking, in addition to conducting emotion recognition on that audio, you can also use a product like Deepgram to transcribe the audio into text, and then apply a text-based sentiment analysis model. This approach obviously only works when you have audio, and so isn’t appropriate for all use cases, but it can provide additional insight when working from spoken data.
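The combined pipeline described above can be sketched as follows. Here `transcribe()`, `audio_emotion()`, and `text_sentiment()` are hypothetical stand-ins: in practice, `transcribe()` would call a speech-to-text service such as Deepgram, and the other two would be trained models like those discussed earlier.

```python
# Sketch of the combined approach: emotion recognition on the audio,
# sentiment analysis on the transcript, both reported together.
# All three helper functions are hypothetical stand-ins for real models/APIs.

def transcribe(audio_path):
    """Stand-in for a speech-to-text call; returns a fixed transcript."""
    return "honestly, I can't stand how slow the app has gotten"

def audio_emotion(audio_path):
    """Stand-in for an audio-based emotion recognition model."""
    return "happy"  # e.g. the speaker is cheerfully venting

def text_sentiment(text):
    """Stand-in lexicon check for a text-based sentiment model."""
    negative_cues = {"can't stand", "slow", "awful", "hate"}
    return "negative" if any(cue in text for cue in negative_cues) else "positive"

def analyze_call(audio_path):
    """Combine both signals: how the speaker sounds vs. what they say."""
    transcript = transcribe(audio_path)
    return {
        "emotion": audio_emotion(audio_path),     # from the audio signal
        "sentiment": text_sentiment(transcript),  # from the transcript
    }

print(analyze_call("call.wav"))
```

Note how the two signals can disagree: a "happy" emotion paired with "negative" sentiment is exactly the venting scenario described earlier, which is the extra insight the combined approach offers.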

Concluding Thoughts

Ultimately, sentiment analysis, emotion recognition, or some combination of the two systems can help drive improvements in customer service and retention. By harnessing audio and text data to determine how customers (and employees) are feeling and communicating, you can recommend early steps to help customer service agents, improve retention, and extract valuable insights from unstructured data.

If you want to learn more about what the future of voice tech looks like for customer experience, check out our recent webinar Importance of Voice Technology for Customer Experiences, which highlights some of the ways that voice tech tools like sentiment analysis and emotion recognition are being used today to power incredible customer experiences.

