SQL WHERE and HAVING clause (with examples)

In SQL, there is two clause used for filtering: WHERE and HAVING. Most of the time WHERE clause is sufficient. However, the reason why HAVING clause exist is because WHERE clause cannot filter aggregates.

To show an example of how each clause can be used, let's say we are running an ice cream company. We produce several flavours of ice cream. We sell these ice creams to stores in multiple countries. Each store would have different stocks of our ice cream varieties. Our users are from all over the world.

From that example, we have the following schema of the database.

In the following section, I will show some examples for both clauses in PostgreSQL. I have created the database dump so you can experiment with it yourself.

WHERE clause

We can use the WHERE clause to search stores that is available in a county with the following query.

SELECT * FROM stores WHERE country = 'ID';

We can also use it to find users who cannot access our tasty ice cream with the following query.

SELECT users.name
FROM users LEFT JOIN stores ON users.country = stores.country
WHERE stores.id IS NULL;

I can list a lot more example for WHERE use case but, as you have known or noticed, all of it do not involves aggregation.

HAVING clause

Well, that is where the HAVING clause comes into play. Let's say we want to know which store have more than 50 stock of our ice creams. We can find out with the following query.

SELECT stores.name, SUM(stocks.amount) AS total_amount
FROM stores INNER JOIN stocks ON stores.id = stocks.store_id
GROUP BY stores.id
HAVING SUM(stocks.amount) > 50;

The result would be something like this:

                   name                | total_amount
---------------------------------------+--------------
 Curabitur Dictum Incorporated         |          200
 Feugiat LLP                           |          111
 Eget Volutpat Corporation             |          159
 Nisl Sem LLC                          |          153
 Sed Eu Nibh Industries                |          242
 Et Magnis Dis Industries              |           74
 Amet Consulting                       |          248
 Sit Ltd                               |          245
 Nunc Mauris Ltd                       |          242
 Lacinia Company                       |          209
 Sem Eget Massa LLC                    |          282
 Eros Nam Associates                   |          401
...

Note that in HAVING clause, we cannot use total_amount because the SELECT clause is performed after the HAVING clause.

What about if we use HAVING without GROUP BY? PostgreSQL will throw an error that something must be used in GROUP BY clause.

Using WHERE and HAVING clause together

Something to remember: WHERE clause is executed before the GROUP BY clause is executed whereas HAVING clause is executed after the GROUP BY clause.

We can use this to our advantage by using both clause.

Let's say we want to know the stores in Indonesia that have less than 50 stocks of ice cream so we can send them some more.

We can use WHERE and HAVING clause in this example.

SELECT stores.name, SUM(stocks.amount) AS total_amount
FROM stores INNER JOIN stocks ON stores.id = stocks.store_id
WHERE stores.country = 'Indonesia'
GROUP BY stores.id
HAVING SUM(stocks.amount) < 50;

The result is the following:

        name         | total_amount
---------------------+--------------
 Ultrices Industries |           15

We can actually use HAVING clause only in the query like the following. The result would be the same.

SELECT stores.name, SUM(stocks.amount) AS total_amount
FROM stores INNER JOIN stocks ON stores.id = stocks.store_id
GROUP BY stores.id
HAVING stores.country = 'Indonesia' AND SUM(stocks.amount) < 50;

Only using HAVING clause

However, there are some cases where we can only use HAVING clause.

For example, our ice cream application have a search bar where we can search the store's name or the name of our ice cream flavours.

Let's say we want to find all stores name and flavours' name that have "caramel" in it. This case uses two HAVING condition as show in the following query. This query results in 106 rows.

SELECT stores.name, string_agg(stocks.flavour, ', ') AS flavours
FROM stores INNER JOIN stocks ON stores.id = stocks.store_id
GROUP BY stores.id
HAVING stores.name ILIKE '%caramel%' OR string_agg(stocks.flavour, ', ') ILIKE '%caramel%';
string_agg is an aggregate function that collects several rows into a string in PostgreSQL.

Let us try to use both WHERE and HAVING clause.

SELECT stores.name, string_agg(stocks.flavour, ', ') AS flavours
FROM stores INNER JOIN stocks ON stores.id = stocks.store_id
WHERE stores.name ILIKE '%caramel%'
GROUP BY stores.id
HAVING string_agg(stocks.flavour, ', ') ILIKE '%caramel%';

When we execute the query, it results in 0 rows! So, what happened?

Remember that WHERE clause is executed before GROUP BY clause which means it is executed before HAVING clause.

In this case, we are filtering stores' name (the condition using WHERE) before we get to filter flavours' name (the condition using HAVING). Since we don't have any store that have "caramel" in its name, there is nothing to filter by the HAVING clause.

Conclusions

  • WHERE and HAVING clause is used for filtering.
  • WHERE clause is executed before GROUP BY clause (if present).
  • HAVING clause is executed after GROUP BY clause (must be present).

Comments

Popular posts from this blog

Experiment #1: Can JavaScript run something like a concurrent thread/process? No.

Ruby on Rails Time, Date, DateTime Cheatsheet