Jump to content

Everything You Need to Know About the SQL GROUP BY Statement - Other Helpful Tutorials - InviteHawk - The #1 Trusted Source for Free Tracker Invites

Buy, Sell, Trade, or Find Free Invites for top private trackers like redacted, blutopia, losslessclub, femdomcult, filelist, Chdbits, Uhdbits, empornium, iptorrents, hdbits, gazellegames, animebytes, privatehd, myspleen, torrentleech, morethantv, bibliotik, alpharatio, blady, passthepopcorn, brokenstones, pornbay, cgpeers, cinemageddon, broadcasthenet, learnbits, torrentseeds, beyondhd, cinemaz, u2.dmhy, Karagarga, PTerclub, Nyaa.si, Polishtracker, and many more.

Recommended Posts


  • Member ID:  53,629
  • Followers:  0
  • Topic Count:  3,075
  • Topics Per Day:  1.55
  • Content Count:  3,094
  • Content Per Day:  1.56
  • Reputation:   105
  • Achievement Points:  4,020
  • Days Won:  0
  • Joined:  01/24/2021
  • Status:  Offline
  • Last Seen:  

Much of the power of relational databases comes from filtering data and joining tables together. This is why we represent those relations in the first place. But modern database systems provide another valuable technique: grouping.

Grouping allows you to extract summary information from a database. It lets you combine results to create useful statistical data. Grouping saves you from writing code for common cases such as averaging lists of figures. And it can make for more efficient systems.

What Does the GROUP BY Clause Do?

GROUP BY, as the name suggests, groups results into a smaller set. The results consist of one row for each distinct value of the grouped column. We can show its usage by looking at some sample data with rows that share some common values.

The following is a very simple database with two tables representing record albums. You can set up such a database by writing a basic schema for your chosen database system. The albums table has nine rows with a primary key id column and columns for name, artist, year of release, and sales:

+----+---------------------------+-----------+--------------+-------+ | id | name | artist_id | release_year | sales | +----+---------------------------+-----------+--------------+-------+ | 1 | Abbey Road | 1 | 1969 | 14 | | 2 | The Dark Side of the Moon | 2 | 1973 | 24 | | 3 | Rumours | 3 | 1977 | 28 | | 4 | Nevermind | 4 | 1991 | 17 | | 5 | Animals | 2 | 1977 | 6 | | 6 | Goodbye Yellow Brick Road | 5 | 1973 | 8 | | 7 | 21 | 6 | 2011 | 25 | | 8 | 25 | 6 | 2015 | 22 | | 9 | Bat Out of Hell | 7 | 1977 | 28 | +----+---------------------------+-----------+--------------+-------+

The artists table is even simpler. It has seven rows with id and name columns:

+----+---------------+ | id | name | +----+---------------+ | 1 | The Beatles | | 2 | Pink Floyd | | 3 | Fleetwood Mac | | 4 | Nirvana | | 5 | Elton John | | 6 | Adele | | 7 | Meat Loaf | +----+---------------+

You can understand various aspects of GROUP BY with just a simple data set such as this. Of course, a real-life data set would have many, many more rows, but the principles remain the same.

Grouping by a Single Column

Let’s say we want to find out how many albums we have for each artist. Start with a typical SELECT query to fetch the artist_id column:

SELECT artist_id FROM albums

This returns all nine rows, as expected:

+-----------+ | artist_id | +-----------+ | 1 | | 2 | | 3 | | 4 | | 2 | | 5 | | 6 | | 6 | | 7 | +-----------+

To group these results by the artist, append the phrase GROUP BY artist_id:

SELECT artist_id FROM albums GROUP BY artist_id

Which gives the following results:

+-----------+ | artist_id | +-----------+ | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | +-----------+

There are seven rows in the result set, reduced from the total nine in the albums table. Each unique artist_id has a single row. Finally, to get the actual counts, add COUNT(*) to the columns selected:

SELECT artist_id, COUNT(*) FROM albums GROUP BY artist_id +-----------+----------+ | artist_id | COUNT(*) | +-----------+----------+ | 1 | 1 | | 2 | 2 | | 3 | 1 | | 4 | 1 | | 5 | 1 | | 6 | 2 | | 7 | 1 | +-----------+----------+

The results group two pairs of rows for the artists with ids 2 and 6. Each has two albums in our database.

RELATED: The Essential SQL Commands Cheat Sheet For Beginners

How to Access Grouped Data With an Aggregate Function

You may have used the COUNT function before, particularly in the COUNT(*) form as seen above. It fetches the number of results in a set. You can use it to get the total number of records in a table:

SELECT COUNT(*) FROM albums +----------+ | COUNT(*) | +----------+ | 9 | +----------+

COUNT is an aggregate function. This term refers to functions that translate values from multiple rows into a single value. They are often used in conjunction with the GROUP BY statement.

Rather than just count the number of rows, we can apply an aggregate function to grouped values:

SELECT artist_id, SUM(sales) FROM albums GROUP BY artist_id +-----------+------------+ | artist_id | SUM(sales) | +-----------+------------+ | 1 | 14 | | 2 | 30 | | 3 | 28 | | 4 | 17 | | 5 | 8 | | 6 | 47 | | 7 | 28 | +-----------+------------+

The total sales shown above for artists 2 and 6 are their multiple albums’ sales combined:

SELECT artist_id, sales FROM albums WHERE artist_id IN (2, 6) +-----------+-------+ | artist_id | sales | +-----------+-------+ | 2 | 24 | | 2 | 6 | | 6 | 25 | | 6 | 22 | +-----------+-------+

Grouping by Multiple Columns

You can group by more than one column. Just include multiple columns or expressions, separated by commas. The results will group according to the combination of these columns.

SELECT release_year, sales, count(*) FROM albums GROUP BY release_year, sales

This will typically produce more results than grouping by a single column:

+--------------+-------+----------+ | release_year | sales | count(*) | +--------------+-------+----------+ | 1969 | 14 | 1 | | 1973 | 24 | 1 | | 1977 | 28 | 2 | | 1991 | 17 | 1 | | 1977 | 6 | 1 | | 1973 | 8 | 1 | | 2011 | 25 | 1 | | 2015 | 22 | 1 | +--------------+-------+----------+

Note that, in our small example, just two albums have the same release year and sales count (28 in 1977).

Useful Aggregate Functions

Aside from COUNT, several functions work well with GROUP. Each function returns a value based on the records belonging to each result group.

COUNT() returns the total number of matching records.

SUM() returns the total of all values in the given column added up.

MIN() returns the smallest value in a given column.

MAX() returns the largest value in a given column.

AVG() returns the mean average. It’s the equivalent of SUM() / COUNT().

You can also use these functions without a GROUP clause:

SELECT AVG(sales) FROM albums +------------+ | AVG(sales) | +------------+ | 19.1111 | +------------+

Using GROUP BY With a WHERE Clause

Just as with a normal SELECT, you can still use WHERE to filter the result set:

SELECT artist_id, COUNT(*) FROM albums WHERE release_year > 1990 GROUP BY artist_id

+-----------+----------+ | artist_id | COUNT(*) | +-----------+----------+ | 4 | 1 | | 6 | 2 | +-----------+----------+

Now you have only those albums released after 1990, grouped by artist. You can also use a join with the WHERE clause, independently from the GROUP BY:

SELECT r.name, COUNT(*) AS albums FROM albums l, artists r WHERE artist_id=r.id AND release_year > 1990 GROUP BY artist_id

+---------+--------+ | name | albums | +---------+--------+ | Nirvana | 1 | | Adele | 2 | +---------+--------+

Note, however, that if you try to filter based on an aggregated column:

SELECT r.name, COUNT(*) AS albums FROM albums l, artists r WHERE artist_id=r.id AND albums > 2 GROUP BY artist_id;

You’ll get an error:

ERROR 1054 (42S22): Unknown column 'albums' in 'where clause'

Columns based on aggregate data are not available to the WHERE clause.

Using the HAVING Clause

So, how do you filter the result set after a grouping has taken place? The HAVING clause deals with this need:

SELECT r.name, COUNT(*) AS albums FROM albums l, artists r WHERE artist_id=r.id GROUP BY artist_id HAVING albums > 1;

Note that the HAVING clause comes after the GROUP BY. Otherwise, it’s essentially a simple replacement of the WHERE with HAVING. The results are:

+------------+--------+ | name | albums | +------------+--------+ | Pink Floyd | 2 | | Adele | 2 | +------------+--------+

You can still use a WHERE condition to filter the results before the grouping. It will work together with a HAVING clause for filtering after the grouping:

SELECT r.name, COUNT(*) AS albums FROM albums l, artists r WHERE artist_id=r.id AND release_year > 1990 GROUP BY artist_id HAVING albums > 1;

Only one artist in our database released more than one album after 1990:

+-------+--------+ | name | albums | +-------+--------+ | Adele | 2 | +-------+--------+

Combining Results With GROUP BY

The GROUP BY statement is an incredibly useful part of the SQL language. It can provide summary information of data, for a contents page, for example. It is an excellent alternative to fetching large quantities of data. The database handles this extra workload well since its very design makes it optimal for the job.

Once you understand grouping and how to join multiple tables, you’ll be able to utilize most of the power of a relational database.


  • Member ID:  48,211
  • Followers:  60
  • Topic Count:  144
  • Topics Per Day:  0.06
  • Content Count:  30,039
  • Content Per Day:  13.22
  • Reputation:   2,202
  • Achievement Points:  38,180
  • Days Won:  11
  • Joined:  04/09/2020
  • Status:  Offline
  • Last Seen:  

Avoid unnecessary posts such as 'Thank you', 'Welcome', etc. Such posts will be deleted and user will be warned if it happens again. If caught spamming, the following actions are applicable -

  • First time - Warning
  • Second time - 5000 Points will be deducted
  • Third time - Ban for 7 days
  • Fourth time - Permanent Ban

If the post helped you, reward the user by reacting to the post like this -

1.jpg

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Read this before posting -
  • Only post if you have something valuable to contribute.
  • Avoid unnecessary posts such as 'Thank you', 'Welcome', etc. Such posts will be deleted and you will be warned if it happens again.
  • If the post helped you, reward the user by reacting to the post like this -                      1.jpg
Guest
Unfortunately, your content contains terms that we do not allow. Please edit your content to remove the highlighted words below.
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Customer Reviews

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.