Delete Duplicate Rows While Keeping One

Use `ROW_NUMBER()` to identify duplicate rows and delete every copy except the one you want to keep.

Created Apr 12, 2026. Last updated Apr 12, 2026. 5/5 supported engines validation-green. 1 example, 2 scenarios.

Example 1

Delete duplicate email rows and keep the earliest id

Rows are partitioned by email and ordered by id within each partition, so the earliest row in each group gets rn = 1. Rows with rn > 1 are treated as duplicates and deleted. That keeps Ada One, Bob One, and Cara One and removes Ada Two and Bob Two.
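Before deleting anything, it can help to look at the row numbers themselves. The sketch below is one way to do that, assuming Python's built-in sqlite3 module and an in-memory SQLite database (3.25 or newer, for window functions). The email addresses are placeholder values chosen for illustration; only the grouping matters.

```python
import sqlite3

# In-memory SQLite database; nothing here touches a real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50))")
conn.executemany(
    "INSERT INTO contacts (id, email, name) VALUES (?, ?, ?)",
    [
        # Placeholder addresses (assumed): Ada and Bob each appear twice.
        (1, "ada@example.com", "Ada One"),
        (2, "bob@example.com", "Bob One"),
        (3, "ada@example.com", "Ada Two"),
        (4, "cara@example.com", "Cara One"),
        (5, "bob@example.com", "Bob Two"),
    ],
)

# Same ranking the delete uses, but as a read-only SELECT.
preview = conn.execute(
    """
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
    FROM contacts
    ORDER BY id
    """
).fetchall()

# Rows with rn > 1 are the ones the delete would remove.
doomed_ids = [row_id for (row_id, _name, rn) in preview if rn > 1]

for row in preview:
    print(row)
print("would delete ids:", doomed_ids)
```

Running the ranking as a SELECT first is a cheap dry run: if the ids flagged with rn > 1 are not the ones you expect, adjust the PARTITION BY or ORDER BY before switching to DELETE.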

MySQL, MariaDB, SQLite

Engine-specific syntax

Source table data

Setup

CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50));

INSERT INTO
  contacts (id, email, name)
VALUES
  (1, 'ada@example.com', 'Ada One'),
  (2, 'bob@example.com', 'Bob One'),
  (3, 'ada@example.com', 'Ada Two'),
  (4, 'cara@example.com', 'Cara One'),
  (5, 'bob@example.com', 'Bob Two');

Validated query

SQL

DELETE FROM contacts
WHERE
  id IN (
    SELECT
      id
    FROM
      (
        SELECT
          id,
          ROW_NUMBER() OVER (
            PARTITION BY
              email
            ORDER BY
              id
          ) AS rn
        FROM
          contacts
      ) ranked
    WHERE
      rn > 1
  );

SELECT
  id,
  email,
  name
FROM
  contacts
ORDER BY
  id;

Expected result

id	email	name
1	ada@example.com	Ada One
2	bob@example.com	Bob One
4	cara@example.com	Cara One

SQL Server

Engine-specific syntax

Source table data

Setup

CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50));

INSERT INTO
  contacts (id, email, name)
VALUES
  (1, 'ada@example.com', 'Ada One'),
  (2, 'bob@example.com', 'Bob One'),
  (3, 'ada@example.com', 'Ada Two'),
  (4, 'cara@example.com', 'Cara One'),
  (5, 'bob@example.com', 'Bob Two');

Validated query

SQL

WITH
  ranked AS (
    SELECT
      id,
      ROW_NUMBER() OVER (
        PARTITION BY
          email
        ORDER BY
          id
      ) AS rn
    FROM
      contacts
  )
DELETE FROM ranked
WHERE
  rn > 1;

SELECT
  id,
  email,
  name
FROM
  contacts
ORDER BY
  id;

Expected result

id	email	name
1	ada@example.com	Ada One
2	bob@example.com	Bob One
4	cara@example.com	Cara One

PostgreSQL

Engine-specific syntax

Source table data

Setup

CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50));

INSERT INTO
  contacts (id, email, name)
VALUES
  (1, 'ada@example.com', 'Ada One'),
  (2, 'bob@example.com', 'Bob One'),
  (3, 'ada@example.com', 'Ada Two'),
  (4, 'cara@example.com', 'Cara One'),
  (5, 'bob@example.com', 'Bob Two');

Validated query

SQL

WITH
  ranked AS (
    SELECT
      id,
      ROW_NUMBER() OVER (
        PARTITION BY
          email
        ORDER BY
          id
      ) AS rn
    FROM
      contacts
  )
DELETE FROM contacts USING ranked
WHERE
  contacts.id = ranked.id
  AND ranked.rn > 1;

SELECT
  id,
  email,
  name
FROM
  contacts
ORDER BY
  id;

Expected result

id	email	name
1	ada@example.com	Ada One
2	bob@example.com	Bob One
4	cara@example.com	Cara One

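The deduplication logic is the same across engines, but the shape of the DELETE statement differs, so each engine gets its own SQL. The MySQL, MariaDB, and SQLite version filters with id IN (...) and wraps the ranking in a derived table, which works around MySQL's rule against referencing the delete target directly in a subquery. SQL Server can delete through the CTE itself, and PostgreSQL joins the CTE back to the table with DELETE ... USING.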

Useful when

Where this command helps.

removing repeated imported rows while preserving the earliest record
cleaning up duplicate emails or external ids before adding a unique constraint
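
For the second scenario, the cleanup and the new constraint can go in one transaction, so the table is never left deduplicated but unprotected. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database; the index name ux_contacts_email and the email addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50))")
conn.executemany(
    "INSERT INTO contacts (id, email, name) VALUES (?, ?, ?)",
    [
        # Placeholder addresses (assumed) for illustration.
        (1, "ada@example.com", "Ada One"),
        (2, "bob@example.com", "Bob One"),
        (3, "ada@example.com", "Ada Two"),
        (4, "cara@example.com", "Cara One"),
        (5, "bob@example.com", "Bob Two"),
    ],
)
conn.commit()

# "with conn" commits if the block succeeds and rolls back if it raises,
# so a failed index build also undoes the delete.
with conn:
    conn.execute(
        """
        DELETE FROM contacts
        WHERE id IN (
            SELECT id FROM (
                SELECT id,
                       ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
                FROM contacts
            ) ranked
            WHERE rn > 1
        )
        """
    )
    # Hypothetical index name; fails if any duplicate email survived.
    conn.execute("CREATE UNIQUE INDEX ux_contacts_email ON contacts (email)")

remaining_ids = [row_id for (row_id,) in conn.execute("SELECT id FROM contacts ORDER BY id")]

# A later duplicate insert should now be rejected by the unique index.
try:
    conn.execute(
        "INSERT INTO contacts (id, email, name) VALUES (6, 'ada@example.com', 'Ada Three')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(remaining_ids, duplicate_rejected)
```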

Explanation

What the command is doing.

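After finding duplicate values, the next step is usually cleanup. A common pattern is to partition rows by the duplicate key, assign ROW_NUMBER() ordered so that the row you want to keep comes first, then delete every row whose row number is greater than 1. This keeps one canonical row per duplicate group and removes the rest. The MySQL, MariaDB, SQLite, and PostgreSQL forms match rows back by id, so they assume id is unique; the example table declares no primary key, so verify that on real data first.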
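
The ORDER BY inside the window decides which row survives. The sketch below illustrates that by flipping the sort direction, assuming Python's built-in sqlite3 module and an in-memory SQLite database. The helper names (dedupe_contacts, fresh_db, remaining_names) and the email addresses are hypothetical, introduced only for this example.

```python
import sqlite3

ROWS = [
    # Placeholder addresses (assumed) for illustration.
    (1, "ada@example.com", "Ada One"),
    (2, "bob@example.com", "Bob One"),
    (3, "ada@example.com", "Ada Two"),
    (4, "cara@example.com", "Cara One"),
    (5, "bob@example.com", "Bob Two"),
]


def fresh_db():
    """Build a new in-memory contacts table with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (id INT, email VARCHAR(100), name VARCHAR(50))")
    conn.executemany("INSERT INTO contacts (id, email, name) VALUES (?, ?, ?)", ROWS)
    conn.commit()
    return conn


def dedupe_contacts(conn, keep="earliest"):
    """Delete duplicate emails, keeping the lowest or highest id per email."""
    # The direction comes from a fixed mapping, never from raw user input.
    direction = {"earliest": "ASC", "latest": "DESC"}[keep]
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            f"""
            DELETE FROM contacts
            WHERE id IN (
                SELECT id FROM (
                    SELECT id,
                           ROW_NUMBER() OVER (
                               PARTITION BY email ORDER BY id {direction}
                           ) AS rn
                    FROM contacts
                ) ranked
                WHERE rn > 1
            )
            """
        )
    return cur.rowcount  # number of rows deleted


def remaining_names(conn):
    return [name for (name,) in conn.execute("SELECT name FROM contacts ORDER BY id")]


conn = fresh_db()
deleted_earliest = dedupe_contacts(conn, keep="earliest")
names_earliest = remaining_names(conn)

conn = fresh_db()
deleted_latest = dedupe_contacts(conn, keep="latest")
names_latest = remaining_names(conn)

print(deleted_earliest, names_earliest)
print(deleted_latest, names_latest)
```

With ascending order the first row in each group is kept; with descending order the most recent one is. On real data the ordering column is often a timestamp rather than id, in which case a unique tiebreaker in the ORDER BY keeps the choice deterministic.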
