Skip to main content

Group by

The GROUP BY clause groups rows that have the same values in specified columns into summary rows. For example, you can group sales data by product_id to calculate totals or averages for each product.

1. Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1;
  • column1: The column by which rows are grouped.
  • aggregate_function(column2): The function applied to each group (e.g., SUM, COUNT, etc.).
  • Every column in the SELECT list that is not aggregated must be included in the GROUP BY clause.

2. How It Works

  • The GROUP BY clause divides the rows of a table into groups based on the unique values in one or more columns.
  • For each group, aggregate functions are applied to calculate summary values like totals or averages.

3. Example

Table: sales

product_idregionquantityprice
1North1050
2South2030
1North1550
3East540
2South1030

Query:

SELECT product_id, SUM(quantity) AS total_quantity
FROM sales
GROUP BY product_id;

Result:

product_idtotal_quantity
125
230
35

Explanation:

  • Rows with the same product_id are grouped together.
  • The SUM(quantity) calculates the total quantity for each product.

4. Example with Multiple Columns

Query:

SELECT region, product_id, SUM(quantity) AS total_quantity
FROM sales
GROUP BY region, product_id;

Result:

regionproduct_idtotal_quantity
North125
South230
East35

Explanation:

  • The rows are grouped by region and product_id.
  • The SUM(quantity) calculates the total for each unique combination of region and product_id.

5. Common Mistakes

  1. Not Including Columns in GROUP BY:
    All non-aggregated columns in the SELECT list must appear in the GROUP BY clause.

    Incorrect:

    SELECT product_id, region, SUM(quantity)
    FROM sales
    GROUP BY product_id;

    Correct:

    SELECT product_id, region, SUM(quantity)
    FROM sales
    GROUP BY product_id, region;
  2. Using WHERE Instead of HAVING:
    Use HAVING to filter aggregated results, not WHERE.

    Incorrect:

    SELECT product_id, SUM(quantity)
    FROM sales
    GROUP BY product_id
    WHERE SUM(quantity) > 10; -- Invalid

    Correct:

    SELECT product_id, SUM(quantity)
    FROM sales
    GROUP BY product_id
    HAVING SUM(quantity) > 10;