Fit a Linear Regression With REGR_SLOPE and REGR_INTERCEPT

Compute the least-squares slope, intercept, and R² of a linear regression directly in SQL using `REGR_SLOPE(y, x)`, `REGR_INTERCEPT(y, x)`, and `REGR_R2(y, x)` in PostgreSQL.

Created May 5, 2026 Last updated May 5, 2026 1/1 supported engines validation-green 1 example 2 scenarios

Docker-validated Not currently validation-green

Example 1

Find the slope and intercept of the study-hours vs test-score relationship

The three data points (1,3), (2,5), (3,7) lie perfectly on the line score = 2 × hours + 1, so the slope is 2.0, the intercept is 1.0, and R² = 1.0 (perfect fit — no residual variance). In practice, real data is noisy and R² will be less than 1. The slope 2.0 means each additional hour of study predicts an increase of 2 points. To predict a score for a new hours value, multiply slope by hours and add intercept: 4 hours → 2 × 4 + 1 = 9. To compute slope manually in engines without REGR_SLOPE: (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²) where n = COUNT(*), which for this data gives (3×34 − 6×15) / (3×14 − 36) = 12/6 = 2.0.

Source table data Rows loaded before the example query runs.

Setup

CREATE TABLE study_data (hours_studied INT, test_score INT);

INSERT INTO
  study_data
VALUES
  (1, 3),
  (2, 5),
  (3, 7);

Validated query Shared across supported engines.

SQL

SELECT
  REGR_SLOPE(test_score, hours_studied) AS slope,
  REGR_INTERCEPT(test_score, hours_studied) AS intercept,
  REGR_R2(test_score, hours_studied) AS r_squared
FROM
  study_data;

Expected result Returned rows for the shared example.

slope	intercept	r_squared
2	1	1

Only PostgreSQL supports REGR_SLOPE, REGR_INTERCEPT, and REGR_R2 as built-in aggregates.

Useful when

Where this command helps.

finding the linear trend line through time-series data to quantify the rate of change per period
computing the price elasticity of demand by regressing log(quantity) on log(price)

Explanation

What the command is doing.

PostgreSQL exposes the full set of SQL:2003 linear regression aggregates. REGR_SLOPE(y, x) returns the slope b and REGR_INTERCEPT(y, x) returns the intercept a of the ordinary least-squares line ŷ = a + b·x that best fits the (x, y) pairs. REGR_R2(y, x) returns the coefficient of determination R², the square of the Pearson correlation, which measures what proportion of the variance in y is explained by x — a value of 1.0 indicates a perfect linear fit. Additional functions cover the count of non-null pairs (REGR_COUNT), averages of both variables (REGR_AVGX, REGR_AVGY), and variance of x (REGR_SXX, REGR_SYY, REGR_SXY). MySQL, MariaDB, SQL Server, and SQLite lack these built-ins; the slope can be computed manually as (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²) and the intercept as (AVG(y) − slope·AVG(x)). These aggregates also work as window functions for computing local regressions per partition.

Fit a Linear Regression With REGR_SLOPE and REGR_INTERCEPT

Find the slope and intercept of the study-hours vs test-score relationship

Where this command helps.

What the command is doing.

Keep moving through nearby tasks.