Sentiment Analysis: Harnessing the Power of Machine Learning

Explore the world of sentiment analysis and machine learning: how algorithms analyse the emotions expressed in text data, which techniques they rely on, and where sentiment analysis is applied across industries.

Sentiment analysis is a technique that has grown rapidly alongside social networks, where users express themselves on a massive scale and constantly share their feelings. It aims to determine the emotional tone of a text or speech by classifying it into categories such as positive, negative or neutral.

It is popular with a wide range of players, from politicians in the middle of an election campaign to companies ready to launch a new product, to name but a few.

A politician will want to gauge their popularity with the electorate, while a company will want to assess how well its product is received by the public.

fig: example of a sentence to be analysed

But what does Sentiment Analysis actually involve, and how do Data Scientists use Machine Learning techniques to decipher the emotional tone of a message?

Sentiment can be expressed in any form of communication, written or spoken, so Data Scientists may work with either text or audio data. The format of the data determines which Machine Learning technique is used.

How do you analyse a spoken sentence?

In this case, the data to be analysed is a recording of the brain's electrical activity, called an electroencephalogram (or EEG). Overall, it looks like this:

fig: example of an electrical signal produced by the brain [1].

To collect this data, which is then analysed, electrodes are placed on the scalp. If we carried out the experiment on you, you would look something like this:

fig: a participant wearing EEG electrodes

Once the signals have been collected, the features representing the information contained in the signal need to be extracted. These features are a more readable format for the Machine Learning algorithm that will classify the signals. Features are extracted by applying various transformations, such as filters, to the electrical signal.
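As a rough illustration of this feature extraction step, the sketch below computes the average power of a signal in a few frequency bands. It is a minimal example under stated assumptions, not the method used in the research cited above: the band names and limits (theta, alpha, beta) are conventional EEG ranges that the article does not mention, the helper names `band_power` and `extract_features` are hypothetical, and the input is a synthetic sine wave rather than a real recording.

```python
import math


def band_power(signal, sample_rate, low_hz, high_hz):
    """Average power of `signal` in the band [low_hz, high_hz), using a naive DFT."""
    n = len(signal)
    power, count = 0.0, 0
    for k in range(1, n // 2):
        freq = k * sample_rate / n  # frequency of DFT bin k, in Hz
        if low_hz <= freq < high_hz:
            re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
            im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
            power += (re * re + im * im) / (n * n)
            count += 1
    return power / count if count else 0.0


def extract_features(signal, sample_rate):
    """Turn a raw signal into a small dictionary of band-power features."""
    # Assumed band limits in Hz (conventional EEG ranges, not taken from the article).
    bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
    return {name: band_power(signal, sample_rate, lo, hi) for name, (lo, hi) in bands.items()}


# Synthetic stand-in for a recording: one second of a 10 Hz sine sampled at 128 Hz.
sample_rate = 128
signal = [math.sin(2 * math.pi * 10 * t / sample_rate) for t in range(sample_rate)]
features = extract_features(signal, sample_rate)
print(features)
```

Because the synthetic signal oscillates at 10 Hz, almost all of its power ends up in the assumed alpha band. A real pipeline would typically rely on a signal-processing library and on carefully chosen filters rather than a hand-written transform.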

Once the features have been extracted, we give them as input to our algorithm, such as a Neural Network, so that it can classify the signals into different categories: positive/negative/neutral.
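The classification step can be sketched as the forward pass of a very small neural network: a feature vector goes in, and a probability for each of the three categories comes out. Everything specific here is an assumption made for illustration: the weights in `params` are hand-picked placeholders rather than trained values, the three input features are arbitrary numbers, and the function names are hypothetical.

```python
import math

LABELS = ["negative", "neutral", "positive"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def classify(features, params):
    """One hidden layer with ReLU, followed by a softmax over the three labels."""
    hidden = [max(0.0, h) for h in dense(features, params["w1"], params["b1"])]
    probs = softmax(dense(hidden, params["w2"], params["b2"]))
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs


# Placeholder weights chosen by hand; a real model would learn them from labelled signals.
params = {
    "w1": [[1.0, -1.0, 0.5], [-0.5, 1.0, 0.2]],
    "b1": [0.0, 0.1],
    "w2": [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]],
    "b2": [0.0, 0.0, 0.0],
}

label, probs = classify([0.2, 0.9, 0.1], params)
print(label, probs)
```

In practice the weights would be fitted on labelled examples, and the quality of the extracted features matters as much as the choice of network.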

In reality, this technique of recording a brain signal and then analysing it to deduce a polarity (positive/negative/neutral) is rarely used in everyday life. It is mainly confined to research, particularly among researchers working at the intersection of Artificial Intelligence and neuroscience.

How is a comment written on Facebook classified?

To solve this problem, Data Scientists use classic Natural Language Processing methods (for more details, please refer to the Introduction to NLP – Natural Language Processing article on the site).

These methods analyse words directly and must take into account the contextual and linguistic aspects of the data.

In short, the sentence to be analysed will be treated as a sequence that defines a context and whose words are dependent on each other, i.e. they will be analysed in relation to the words that precede them in the sentence.

To process these sentences, Data Scientists use Recurrent Neural Networks (RNNs), which are neural networks specialised in sequence processing.

A (highly simplified) RNN architecture for sentiment analysis would therefore look something like this pipeline:

fig: architecture of an RNN for sentiment analysis

If we take the sentence “You are polite”, the word “polite”, once analysed by the algorithm, returns one of the classes (in this case, positive). The word “polite” was analysed after the word “are” (in other words, in the context of “are”), which was itself analysed after the word “You” (in other words, in the context of “You”).

This recurrence makes it possible to define a context, which is essential for analysing sentiment.
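The recurrence can be sketched with a minimal RNN forward pass in which the hidden state computed for each word depends on the hidden state of the words before it. This is an illustrative sketch only: the weights and word embeddings are random and untrained, so the predicted class is meaningless, and the sizes and names (`HIDDEN`, `EMBED`, `run_rnn`, `predict`) are assumptions rather than anything specified in the article.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible
HIDDEN, EMBED = 4, 3
LABELS = ["negative", "neutral", "positive"]


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Untrained, randomly initialised parameters (placeholders for learned values).
embeddings = {w: [rng.uniform(-1, 1) for _ in range(EMBED)] for w in ["you", "are", "polite"]}
W_x = rand_matrix(HIDDEN, EMBED)     # input-to-hidden weights
W_h = rand_matrix(HIDDEN, HIDDEN)    # hidden-to-hidden (recurrent) weights
bias = [0.0] * HIDDEN
W_out = rand_matrix(len(LABELS), HIDDEN)


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(x, h_prev):
    """h_t = tanh(W_x x_t + W_h h_{t-1} + b): the new state mixes the word and its context."""
    wx, wh = matvec(W_x, x), matvec(W_h, h_prev)
    return [math.tanh(a + b + c) for a, b, c in zip(wx, wh, bias)]


def run_rnn(words):
    """Read the words in order and return the final hidden state."""
    h = [0.0] * HIDDEN
    for word in words:
        h = rnn_step(embeddings[word], h)
    return h


def predict(words):
    """Softmax over the three labels, computed from the final hidden state."""
    scores = matvec(W_out, run_rnn(words))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    return [e / sum(exps) for e in exps]


in_context = run_rnn(["you", "are", "polite"])
alone = run_rnn(["polite"])
probs = predict(["you", "are", "polite"])
print(in_context, alone, probs)
```

The hidden state obtained for “polite” after reading “you are” differs from the one obtained for “polite” on its own, which is the sense in which the recurrence carries context. Production systems would normally use a deep learning framework and trained weights instead of this hand-rolled loop.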

Of course, some sentences are subtle and difficult for machines to analyse. Take “This perfume smells extremely good, it’s addictive”. The word “extremely” is an intensifier that can carry either positive or negative connotations depending on what it modifies, while “addictive” is generally associated with a negative feeling, even though here it is meant as praise.
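To see why such a sentence is hard, consider a deliberately naive scorer that looks up each word in a polarity lexicon and ignores context. This is a sketch of a simple word-level baseline, not the RNN approach described above, and the lexicon entries, the intensifier weight and the function name `lexicon_score` are all hypothetical values chosen for illustration.

```python
import re

# Hypothetical word polarities and intensifier weights, chosen for illustration only.
LEXICON = {"good": 1.0, "bad": -1.0, "addictive": -1.0}
INTENSIFIERS = {"extremely": 2.0}


def lexicon_score(text):
    """Sum word polarities; an intensifier multiplies the polarity of the next word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score, boost = 0.0, 1.0
    for token in tokens:
        if token in INTENSIFIERS:
            boost = INTENSIFIERS[token]
            continue
        score += boost * LEXICON.get(token, 0.0)
        boost = 1.0  # the intensifier only applies to the word right after it
    return score


print(lexicon_score("extremely good"))   # intensifier amplifies a positive word
print(lexicon_score("extremely bad"))    # the same intensifier amplifies a negative word
print(lexicon_score("it's addictive"))   # scored as negative out of context
print(lexicon_score("This perfume smells extremely good, it's addictive"))
```

With these assumed values, “addictive” lowers the score of a sentence that a human reader would take as entirely positive, which is the kind of error that context-aware models aim to avoid.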

Although recurrent neural networks (RNNs) have been around for a number of years, it is only recently that scientists have obtained very promising results with them, thanks in particular to constantly improving computing capacity. As a result, the technology is used more and more by companies seeking feedback from users on a product, and by anyone with access to a large volume of messages who wants to derive a general sentiment from them.
