sqlcmd.net validated sql reference
advanced aggregation MariaDB SQL Server PostgreSQL

Detect Outliers Using the IQR Method

Flag values that fall outside the interquartile fence `[Q1 − 1.5 × IQR, Q3 + 1.5 × IQR]` using `PERCENTILE_CONT` in PostgreSQL, MariaDB, and SQL Server.

Docker-validated Not currently validation-green

Find response times that are statistical outliers in an API log

The nine values sorted are 45, 50, 52, 55, 60, 62, 65, 70, 500. PERCENTILE_CONT(0.25) interpolates at position 1 + 0.25 × (9−1) = 3.0, which lands exactly on the 3rd sorted value: Q1 = 52. PERCENTILE_CONT(0.75) lands at position 1 + 0.75 × 8 = 7.0: Q3 = 65. IQR = 65 − 52 = 13. Upper fence = 65 + 1.5 × 13 = 84.5. Lower fence = 52 − 1.5 × 13 = 32.5. Only 500 exceeds the upper fence, so it is the single outlier returned. The value 70 falls below 84.5 and is therefore not flagged. SQL Server's PERCENTILE_CONT emits the boundary value once per input row; DISTINCT in the CTE collapses the nine identical copies into a single row so the outer join produces nine result candidates rather than eighty-one.

MariaDB PostgreSQL
Engine-specific syntax
Setup
CREATE TABLE response_times (ms INT);

INSERT INTO
  response_times
VALUES
  (45),
  (50),
  (52),
  (55),
  (60),
  (62),
  (65),
  (70),
  (500);
SQL
WITH
  bounds AS (
    SELECT
      PERCENTILE_CONT(0.25) WITHIN GROUP (
        ORDER BY
          ms
      ) AS q1,
      PERCENTILE_CONT(0.75) WITHIN GROUP (
        ORDER BY
          ms
      ) AS q3
    FROM
      response_times
  )
SELECT
  r.ms
FROM
  response_times r,
  bounds
WHERE
  r.ms < bounds.q1 - 1.5 * (bounds.q3 - bounds.q1)
  OR r.ms > bounds.q3 + 1.5 * (bounds.q3 - bounds.q1)
ORDER BY
  r.ms;
ms
500
SQL Server
Engine-specific syntax
Setup
CREATE TABLE response_times (ms INT);

INSERT INTO
  response_times
VALUES
  (45),
  (50),
  (52),
  (55),
  (60),
  (62),
  (65),
  (70),
  (500);
SQL
WITH
  bounds AS (
    SELECT DISTINCT
      PERCENTILE_CONT(0.25) WITHIN GROUP (
        ORDER BY
          ms
      ) OVER () AS q1,
      PERCENTILE_CONT(0.75) WITHIN GROUP (
        ORDER BY
          ms
      ) OVER () AS q3
    FROM
      response_times
  )
SELECT
  r.ms
FROM
  response_times r,
  bounds
WHERE
  r.ms < bounds.q1 - 1.5 * (bounds.q3 - bounds.q1)
  OR r.ms > bounds.q3 + 1.5 * (bounds.q3 - bounds.q1)
ORDER BY
  r.ms;
ms
500

PostgreSQL and MariaDB use PERCENTILE_CONT as a plain aggregate. SQL Server requires OVER() and DISTINCT to produce a single boundary row. MySQL and SQLite lack PERCENTILE_CONT and require a ROW_NUMBER workaround.

Where this command helps.

  • flagging API response times that are statistical outliers for investigation before computing SLA metrics
  • removing anomalous sensor readings from a dataset before fitting a regression model

What the command is doing.

The interquartile range (IQR) method is a robust outlier-detection technique that is insensitive to the very outliers it is trying to find, unlike mean- and standard-deviation-based approaches. The procedure is: compute the 25th percentile (Q1) and 75th percentile (Q3), calculate IQR = Q3 − Q1, then define fences as lower = Q1 − 1.5 × IQR and upper = Q3 + 1.5 × IQR. Any value below the lower fence or above the upper fence is a potential outlier (the 1.5 multiplier is John Tukey's rule of thumb; use 3.0 for extreme outliers only). PostgreSQL and MariaDB expose PERCENTILE_CONT(fraction) WITHIN GROUP (ORDER BY column) as an ordered-set aggregate. SQL Server implements the same function as a window function that requires OVER() and DISTINCT to collapse the per-row output into a single boundary row. MySQL and SQLite lack PERCENTILE_CONT; the quartiles must be approximated using ROW_NUMBER() to locate the relevant rows.