Apache Pig: everything you need to know about the Hadoop programming language

Apache Pig is a high-level platform for writing Hadoop MapReduce programs in its own language, Pig Latin. Find out all you need to know: presentation, use cases, benefits, training… The MapReduce programming model of the Apache Hadoop framework makes it possible to process large volumes of Big Data. However, Data Analysts are not always familiar with this paradigm. This is why an […]

Data Quality: 10 mistakes not to make

Today, we’re living in the golden age of data. Every e-mail received, every application downloaded, every click to check the weather generates data. However, as the famous IT expression goes: Garbage In, Garbage Out. The information a company can derive from data is only as good as the data itself. Poor-quality data […]

Redis: The favorite NoSQL database for developers

With the advent of NoSQL databases, Redis is coming into its own, offering in-memory data management. Created in 2009 by Salvatore Sanfilippo, Redis has become one of the most popular NoSQL databases. Named the database most loved by developers for five years in a row according to the annual “Stack Overflow Developer Survey”, Redis is currently […]

Power BI Tooltips: Discovering the tooltip and its use cases

A Power BI tooltip lets you add contextual information and detailed data to the visuals in a report. Simply hover the mouse cursor over a visual to display its tooltip. Find out all you need to know: presentation, operation, benefits, tutorial, training… A Power BI report is very useful for presenting data in […]

Pillow: How to process images with Python

Pillow, a fork of the original PIL, is an open-source library specifically designed for image processing in Python. It is a veritable goldmine for image file manipulation, so let’s take a look at some of the basic features of this benchmark library. What is Pillow, the Python library? Originally designed by Fredrik Lundh in 1995, Python Imaging Library abbreviated […]

Growth Marketing: definition and differences from growth hacking

If you’re familiar with the world of start-ups, you’ve probably already heard of Growth Marketing. This concept encompasses the marketing techniques needed to bring about rapid growth within a company. Although it is increasingly popular, its scope remains misunderstood by many professionals, including marketers. So what is Growth Marketing? What levers can you use to bring rapid growth […]

Power BI for Retail: what’s at stake and what’s the outlook?

Power BI is Microsoft’s cloud-based Business Intelligence tool for managing and analyzing your data. Thanks to its data visualization features, you can see all your key indicators across all your activities at a glance, and easily interpret them. With its technology, ETLs and computing power, Microsoft’s Power BI gives you incredible speed to connect hundreds of […]

CodeSquire: All about AI as a programming assistant

To meet increasingly complex coding challenges, data scientists need tools to help them in their daily tasks. CodeSquire, the new AI code assistant, has been specially designed to improve their productivity and efficiency. Let’s find out more about its features. What is CodeSquire? For some time now, we’ve been noticing ChatGPT’s performance as an aid […]

AWS Cloud Quest: A role-playing game for cloud computing training

AWS Cloud Quest is a 3D role-playing game created by Amazon to make learning the Amazon Web Services cloud fun. Find out everything you need to know: presentation, benefits, how it works… It’s a proven fact: you learn better when you’re having fun. According to a study conducted by TalentLMS, 83% of […]

AWS Kinesis: what is this service? How much does it cost?

Amazon Kinesis is a managed service for collecting, processing and analyzing data streams in real time and at scale. The service can be used to collect large volumes of data consumed by application processes running on Amazon EC2 instances (a server rental service for running web applications). Initially launched in 2013 at the re:Invent conference, […]

Amazon EMR: A cluster management tool managed by AWS

Amazon EMR (Elastic MapReduce) is a data processing service managed by Amazon Web Services (AWS). It enables the management of large amounts of data, in the petabyte range, using popular tools such as Apache Hadoop, Hive, Spark and HBase, to name but a few. Amazon EMR has been designed to offer great flexibility and scalability, […]

AWS Glue: What is it? What’s it for?

AWS Glue is a fully managed, scalable data processing service that enables users to run serverless ETL (Extract, Transform, Load) workflows, freeing them from the need to manage the underlying infrastructure. A reminder about ETL processes: ETL is a process designed to guarantee data quality and availability. It is divided into three phases. Extraction: recovery […]