sqlcmd.net validated sql reference
intermediate subqueries MySQL MariaDB SQL Server PostgreSQL SQLite

Filter Rows Using A Correlated Subquery

Reference the outer query's current row inside a subquery to compute a per-row comparison value.

Docker-validated Not currently validation-green

Find products priced above the average for their own category

For each product, the subquery computes the average price of all products in the same category. The Widgets average is 15: Widget A (10) falls below it and is excluded, Widget B (20) exceeds it and is returned. The Gadgets average is 40: Gadget Y (30) is excluded, Gadget X (50) is returned.

Rows loaded before the example query runs.
Setup
CREATE TABLE products (
  id INT,
  name VARCHAR(50),
  category VARCHAR(50),
  price INT
);

INSERT INTO
  products
VALUES
  (1, 'Widget A', 'Widgets', 10),
  (2, 'Widget B', 'Widgets', 20),
  (3, 'Gadget X', 'Gadgets', 50),
  (4, 'Gadget Y', 'Gadgets', 30);
Shared across supported engines.
SQL
SELECT
  name,
  price
FROM
  products p
WHERE
  price > (
    SELECT
      AVG(price)
    FROM
      products
    WHERE
      category = p.category
  )
ORDER BY
  name;
Returned rows for the shared example.
nameprice
Gadget X50
Widget B20

Identical syntax and result across all engines.

Where this command helps.

  • keeping rows whose value exceeds the average for their group
  • filtering rows based on the value of a related row in another table

What the command is doing.

A correlated subquery is a subquery that references a column from the outer query. It re-evaluates for every row the outer query processes. The most common form appears in a WHERE clause comparing each row against a group aggregate — for example, WHERE price > (SELECT AVG(price) FROM t WHERE category = outer.category). This is something awkward to express as a plain join. The performance cost scales with row count; replacing a correlated subquery with a window function (AVG() OVER (PARTITION BY ...)) achieves the same result much faster on large tables.