
Machine Learning : Performance and interpretability

25 May 2020


Data Science

Raphael Kassel

Performance and interpretability: A necessary trade-off?

Over the last decade, a growing number of companies have embraced digital transformation by incorporating Artificial Intelligence methods into the way they design their products and define their processes. Gathering, analyzing, and harnessing data are now considered essential drivers of business growth in an increasing number of fields.

However, as these tasks have become common practice for most companies in the digital era, concerns are increasingly raised about the application of AI for specific purposes or in particular sectors. Let’s take an example from day-to-day life to illustrate this. Imagine your loan application is refused on the basis of a scoring algorithm newly implemented by your bank. Although such algorithms may be efficient at assessing people’s risk profiles, their lack of transparency puts bank advisors in a difficult position: they are unable to justify the bank’s decision.

The opacity of Artificial Intelligence is one of the major obstacles to its adoption by companies and organisations. It is a permanent challenge for data scientists, who have to ensure that their models are highly accurate while remaining sufficiently comprehensible.

In this context, understanding the importance of model interpretability to solve concrete problems is the first step towards successful Data Science projects.

Explicability vs. Interpretability

Let’s start with essential definitions.
Explicability. An algorithmic decision is said to be explainable if it can be explicitly accounted for on the basis of known data and characteristics of the situation. In other words, it must be possible to relate the values taken by certain variables (the characteristics) to their consequences on the prediction (for example, a score) and thus on the decision.

To use the example of the bank loan, the model can be considered as explainable if it explicitly indicates the relationship between the values taken by the variables (age, family situation, salary, etc.) and the final score.
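As a minimal sketch of what such an explicit relationship could look like, the snippet below scores an applicant with a simple additive rule. The intercept, the weights, and the variable names are hypothetical values chosen for illustration; they do not come from any real scoring model.

```python
# Hypothetical scorecard: every number below is made up for illustration.
INTERCEPT = 300.0
WEIGHTS = {
    "age": 1.5,              # points per year of age
    "dependents": -20.0,     # points per dependent (family situation)
    "monthly_salary": 0.05,  # points per euro of monthly salary
}


def explain_score(applicant):
    """Return the final score and the contribution of each variable."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = INTERCEPT + sum(contributions.values())
    return score, contributions


applicant = {"age": 40, "dependents": 2, "monthly_salary": 3000}
score, contributions = explain_score(applicant)

print(f"score = {score:.1f}")
for name, points in contributions.items():
    print(f"  {name}: {points:+.1f} points")
```

Because the score is the intercept plus the sum of the contributions, an advisor could tell the applicant exactly how each variable moved the result.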

Interpretability. An algorithmic decision is said to be interpretable if it is possible to identify the characteristics or variables that contribute most to the decision, or even to quantify their importance.

For instance, the model used for loan scoring is interpretable if it is able to measure the relative importance of the variables used (age, family situation, salary, etc.) in the determination of the final score.
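One common way to quantify the relative importance of variables is permutation importance: shuffle one column at a time and measure how much the predictions degrade. The sketch below implements the idea by hand with NumPy. The data are synthetic and `black_box_score` is a hypothetical stand-in for a trained model, so the numbers only illustrate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_names = ["age", "family_situation", "salary"]

# Synthetic, standardized applicant data (illustrative only).
X = rng.normal(size=(500, 3))


def black_box_score(X):
    """Stand-in for any trained model: we only call it, never inspect it."""
    # This hypothetical model happens to ignore family_situation.
    return np.tanh(0.5 * X[:, 0]) + 3.0 * X[:, 2]


# Reference output: here, the model's own predictions on the intact data.
y = black_box_score(X)


def permutation_importance(predict, X, y, rng, n_repeats=10):
    """Mean increase in squared error when one column is shuffled."""
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            importances[j] += np.mean((predict(X_perm) - y) ** 2) - baseline
    return importances / n_repeats


importances = permutation_importance(black_box_score, X, y, rng)
ranking = [feature_names[j] for j in np.argsort(importances)[::-1]]

for name, value in zip(feature_names, importances):
    print(f"{name}: {value:.3f}")
print("most to least important:", ranking)
```

The result ranks the variables without revealing how the score is computed, which is what separates interpretability from explicability in the definitions above.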

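Note that an explainable decision is also interpretable: when the relationship between each variable and the score is explicit, the variables that matter most can be read directly from it. Now that the definitions are clear, how do they apply in practice?
The graph below illustrates the case of the loan application.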

Why is it important?

A Classification of Machine Learning Algorithms

So, how can Machine Learning algorithms be divided between these two categories?
Broadly speaking, machine learning algorithms can be divided into two groups according to whether they lead, by construction, to an explicit model (white box) or to a black box.

This classification can be further refined by grouping the current models into three groups:

The first group contains regression algorithms, decision trees, and traditional classification rules. They are close to monotonic linear functions and are commonly used in economics and sociology;
The second group includes more advanced algorithms such as graphical models;
The third group consists of advanced machine learning techniques such as SVM, ensemble learning, and deep learning methods. To explain the model, they can only provide information on the importance of the variables.
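To make the contrast between the first and third groups concrete, here is a short sketch assuming scikit-learn is installed. A shallow decision tree can be printed as explicit rules, whereas a random forest (an ensemble method) only exposes aggregate feature importances. The dataset is synthetic and the feature names are placeholders borrowed from the loan example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data; the feature names are placeholders, not real loan data.
feature_names = ["age", "family_situation", "salary"]
X, y = make_classification(
    n_samples=400,
    n_features=3,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# White box: a shallow tree whose decision rules can be read line by line.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=feature_names)
print(rules)

# Black box: a forest of 100 trees; only variable importances are available.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = dict(zip(feature_names, forest.feature_importances_))
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

The tree is limited to depth 2 so that its rules stay readable; a deeper tree would usually predict better but would gradually lose that readability.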

Choosing the right model

What is the right trade-off between model predictive power and interpretability? Let’s review the main questions you should explore:

Do you need intrinsic or post-hoc interpretability?
Depending on the level of interpretability required, you might choose a model that is easy to explain by construction (intrinsic interpretability), or train a black-box model and explain its results afterwards, for example through feature importance (post-hoc interpretability).
Do you need model-specific or model-agnostic interpretability?
There is a distinction between interpretability that is specific to a model, such as linear regression, where the weights can be interpreted directly, and interpretability that comes from general tools applied after model training (such as correlation, the confusion matrix, or model-agnostic metrics).
Do you need local or global interpretability?
The global explicability requirement is intended to make the decision-making process transparent for all data, while the local explicability criterion is intended to provide an explanation for a single decision, within a restricted neighborhood of the data.
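The difference between local and global interpretability can be sketched with a post-hoc, model-agnostic linear surrogate: fit a straight line to the black box's predictions, once over all the data and once in a small neighborhood of a single point. The function `black_box`, the point `x_star`, and the neighborhood width of 0.1 are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def black_box(X):
    """Stand-in for a trained model; nonlinear in the first feature."""
    return X[:, 0] ** 2 + 3.0 * X[:, 1]


def linear_surrogate(X, predictions):
    """Least-squares fit of predictions on X with an intercept; returns slopes."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, predictions, rcond=None)
    return coef[1:]


# Global view: one linear approximation over the whole data distribution.
X_all = rng.normal(size=(5000, 2))
global_slopes = linear_surrogate(X_all, black_box(X_all))

# Local view: the same approximation, restricted to points near one decision.
x_star = np.array([2.0, 0.0])
X_near = x_star + 0.1 * rng.normal(size=(500, 2))
local_slopes = linear_surrogate(X_near, black_box(X_near))

print("global slopes:", np.round(global_slopes, 2))
print("local slopes near x_star:", np.round(local_slopes, 2))
```

Globally the first feature appears to have almost no linear effect, yet around `x_star` it is the stronger driver of the prediction. A global explanation can therefore hide what matters for an individual decision.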

Therefore, before starting a Data Science project, it is essential to first determine the degree of model interpretability we wish to achieve.

Do you want to learn more on Machine Learning algorithms?
Feel free to get in touch with us and learn more about our programmes!

References

[1] C. Molnar, Interpretable Machine Learning
https://christophm.github.io/interpretable-ml-book/
[2] Problématiques juridiques et analyse automatique des données, Machine Learning and the Law
https://perso.math.univ-toulouse.fr/mllaw/home/statisticien/explicabilite-des-decisions-algorithmiques/
[3] A. Veriné, S. Mir, L’interprétabilité du Machine learning: quels défis à l’ère des processus de la décisions automatisés ? Wavestone report
https://www.wavestone.com/app/uploads/2019/09/Wavestone_Interpretabilite_Machine_learning.pdf
[4] J. Cupe, L’interprétabilité de l’IA – Le nouveau défi des data scientists, October 2018
https://www.actuia.com/contribution/jean-cupe/linterpretabilite-de-lia-le-nouveau-defi-des-data-scientists/
