🚀 Think you’ve got what it takes for a career in Data? Find out in just one minute!

SQL GROUP BY: Everything you need to know about this query

4 Jan 2024

m de lecture

Data Science

Melanie

One of the most useful SQL commands for data analysts is GROUP BY. It allows you to easily generate advanced statistics from categories. Suffice to say, it's one of the 10 commands you'll use the most, so you might as well know what it's all about.

SQL’s strengths are well established. This language offers an unrivalled level of abstraction for analyzing vast volumes of information. Whatever the database software used (MySQL, Ingres, Oracle Database, Microsoft SQL Server, PostgreSQL, SQLite, Firebird…), the data analyst knows he’ll get the desired result without having to program the precise logic.

💡Related articles:

SQL DELETE: How do I use this query?

SQL Developer: What is it? What’s it for?

SQL Tutorial: Top 5 Most Useful Methods

SQL joins: everything you need to know about table associations

SQL vs NoSQL: differences, uses, advantages and disadvantages

How do you group information in a database?

If there’s one particularly important command you’ll appreciate, it’s SQL GROUP BY. It groups information from a database according to a particular column, from which it is possible to obtain statistical information: sum, maximum, minimum, average, etc.

To better understand this concept, it’s best to start with an example. Here’s an extract from a table called Staff

Name	Department	Salary
Paul	Finance	$3,000
Ray	Sales	$2,500
Julia	Finance	$3,200
Dan	Marketing	$4,300
Josie	Sales	$2,200
Donna	Finance	$2,700

If we wanted to obtain the sum of salaries by department, we’d simply use a sequence like the following:

SELECT Department, SUM(Salary)

FROM Staff

GROUP BY Service

ORDER BY Service;

In this example, we group the table according to the Service column and ask for the sum of salaries for each of them.

The answer would be – based on the sample shown above :

How do I use the HAVING BY clause?

HAVING BY completes the SQL GROUP BY command by opening up the possibility of specifying a condition. In the example below, we’d like to display the average salary (the AVG function), but only if this average is greater than 2500:

SELECT Department, AVG(Salary)

FROM Personnel

GROUP BY Service

HAVING AVG(Salary) > 2500

ORDER BY Service;

The result is :

Finances	8 900
Marketing	4 300
Sales	4 700

Finances	2 967
Marketing	4 300

And that’s it! It’s that simple. The ease with which such operations can be implemented is a testament to the qualities of the SQL language.

DataScientest News

You are not available?

Leave us your e-mail, so that we can send you your new articles when they are published!

Data Analyst

Analytics Engineer

Data Scientist

AI / Machine Learning Engineer

Data Engineer

Cloud Engineer

DevOps Engineer

Deep Learning

MLOps

ETL Developer

Data Ops Engineer

Student Course

Amazon Web Services (AWS)

Microsoft Power BI

Data Management

Data Product Manager

SQL GROUP BY: Everything you need to know about this query

One of the most useful SQL commands for data analysts is GROUP BY. It allows you to easily generate advanced statistics from categories. Suffice to say, it's one of the 10 commands you'll use the most, so you might as well know what it's all about.

How do you group information in a database?

How do I use the HAVING BY clause?

You are not available?

Related articles

Hashcat: Description, Operation, and Usage

SAP Concur: What is it? How to use it?

SAP Customer Experience (SAP CX): What is it?

OSINT (Open Source Intelligence): What is it? Who can use it?

Data Analyst

Analytics Engineer

Data Scientist

AI / Machine Learning Engineer

Data Engineer

Cloud Engineer

DevOps Engineer

Deep Learning

MLOps

ETL Developer

Data Ops Engineer

Student Course

Amazon Web Services (AWS)

Microsoft Power BI

Data Management

Data Product Manager

SQL GROUP BY: Everything you need to know about this query

One of the most useful SQL commands for data analysts is GROUP BY. It allows you to easily generate advanced statistics from categories. Suffice to say, it's one of the 10 commands you'll use the most, so you might as well know what it's all about.

How do you group information in a database?

How do I use the HAVING BY clause?

You are not available?

Related articles

Hashcat: Description, Operation, and Usage

SAP Concur: What is it? How to use it?

SAP Customer Experience (SAP CX): What is it?

OSINT (Open Source Intelligence): What is it? Who can use it?

DataNews