🚀 Think you’ve got what it takes for a career in Data? Find out in just one minute!

SQL GROUP BY: Everything you need to know about this query

-
2
 m de lecture
-
sql group by

One of the most useful SQL commands for data analysts is GROUP BY. It allows you to easily generate advanced statistics from categories. Suffice to say, it's one of the 10 commands you'll use the most, so you might as well know what it's all about.

SQL’s strengths are well established. This language offers an unrivalled level of abstraction for analyzing vast volumes of information. Whatever the database software used (MySQL, Ingres, Oracle Database, Microsoft SQL Server, PostgreSQL, SQLite, Firebird…), the data analyst knows he’ll get the desired result without having to program the precise logic.

 

💡Related articles:

SQL DELETE: How do I use this query?
SQL Developer: What is it? What’s it for?
SQL Tutorial: Top 5 Most Useful Methods
SQL joins: everything you need to know about table associations
SQL vs NoSQL: differences, uses, advantages and disadvantages

How do you group information in a database?

If there’s one particularly important command you’ll appreciate, it’s SQL GROUP BY. It groups information from a database according to a particular column, from which it is possible to obtain statistical information: sum, maximum, minimum, average, etc.

To better understand this concept, it’s best to start with an example. Here’s an extract from a table called Staff

Name Department Salary
Paul Finance $3,000
Ray Sales $2,500
Julia Finance $3,200
Dan Marketing $4,300
Josie Sales $2,200
Donna Finance $2,700

If we wanted to obtain the sum of salaries by department, we’d simply use a sequence like the following:

SELECT Department, SUM(Salary)

FROM Staff

GROUP BY Service

ORDER BY Service;

In this example, we group the table according to the Service column and ask for the sum of salaries for each of them.

The answer would be – based on the sample shown above :

How do I use the HAVING BY clause?

HAVING BY completes the SQL GROUP BY command by opening up the possibility of specifying a condition. In the example below, we’d like to display the average salary (the AVG function), but only if this average is greater than 2500:

SELECT Department, AVG(Salary)

FROM Personnel

GROUP BY Service

HAVING AVG(Salary) > 2500

ORDER BY Service;

The result is :

Finances 8 900
Marketing 4 300
Sales 4 700
Finances 2 967
Marketing 4 300

And that’s it! It’s that simple. The ease with which such operations can be implemented is a testament to the qualities of the SQL language.

Facebook
Twitter
LinkedIn

DataScientest News

Sign up for our Newsletter to receive our guides, tutorials, events, and the latest news directly in your inbox.

You are not available?

Leave us your e-mail, so that we can send you your new articles when they are published!
icon newsletter

DataNews

Get monthly insider insights from experts directly in your mailbox