Data cleaning: Definition, methods and relevance in Data Science
Data cleaning is an essential step in Data Science and Machine Learning. It consists of resolving problems in data sets, to be able to exploit
In the era of Big Data, companies are collecting ever larger amounts of data. But not all companies are making the most of it. And
The GitHub platform allows computer programmers to freely collaborate on code projects. Find out everything you need to know about this widely used service in
Hadoop is an Open Source framework dedicated to Big Data storage and processing. Discover everything you need to know: definition, history, functioning, advantages, training… For
Python is one of the most widely used languages, so it has a large number of frameworks, and many of them are developed exclusively for Data Science.
The Deep Neural Network imitates the functioning of the human brain. Find out everything you need to know about it: definition, functioning, use cases, training. A
This article will be divided into two parts: the first focuses on the choice of metrics specific to this type of data, the second details
Classification of unbalanced data is a classification problem where the training sample contains a strong disparity between the classes to be predicted. This problem is
According to the statistics of the Workflow Management Coalition, workflows are at the heart of every company’s IT management. Whether you identify, monitor and manage
This is a word we often use in IT, but what exactly is it? A framework is a conceptual or real structure that you
NumPy is a very popular Python library that is mainly used to perform mathematical and scientific calculations. It offers many features and tools that can
Unbalanced data is very common in Machine Learning. Unfortunately, it complicates predictive analysis. To balance such data sets, several methods have been developed. How