Statistics By Jim

Making statistics intuitive

Coefficient of Determination

By Jim Frost

What is the Coefficient of Determination?

The coefficient of determination measures how well a linear regression model explains the variation in the outcome variable. It is a goodness-of-fit measure that evaluates how closely the model’s predicted values match the actual data. A higher value means the model accounts for more of the outcome’s variability.

The coefficient of determination is commonly referred to as R-squared or R². For least squares models that include an intercept, its value ranges from 0 to 1:

  • 0 means the model does not explain any of the variation in the outcome.
  • 1 means the model perfectly predicts the outcome.

In the two graphs below, the coefficient of determination for the regression model on the left is 15%, and for the model on the right it is 85%. When a linear model accounts for more of the variance, the data points fall closer to the regression line. In practice, you’ll almost never see a model with an R² of 100%. That would mean the fitted values equal the observed data values exactly, and all observations lie perfectly on the regression line.

Graph that illustrates a regression model with a low R-squared.
Graph that illustrates a model with a high R-squared.

In general, a higher coefficient of determination indicates that, for a given dataset, the predicted values are closer to the actual values. However, a high value does not necessarily mean the model is appropriate. For instance, the model might be overfit, inflating the coefficient of determination by capturing noise rather than signal. Alternatively, it might miss curvature or interaction effects, or omit relevant predictors, even when the R² appears high. In short, you still need to check the assumptions for least squares regression, for example by examining residual plots, even when the value is high.
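To illustrate how a high R² can hide a poor fit, here is a minimal sketch that fits a straight line to data that are actually curved. The data (y = x² for x from 1 to 10) and all variable names are made up for illustration; they are not from a real study.

```python
# Hypothetical data: a purely quadratic relationship, y = x^2.
x = list(range(1, 11))
y = [v ** 2 for v in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and intercept for a straight line.
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fitted values and residuals (observed minus fitted).
fitted = [intercept + slope * a for a in x]
residuals = [b - f for b, f in zip(y, fitted)]

# R-squared from the residual and total sums of squares.
ss_residual = sum(e ** 2 for e in residuals)
ss_total = sum((b - mean_y) ** 2 for b in y)
r2 = 1 - ss_residual / ss_total

print(f"R-squared: {r2:.3f}")
print("Residual signs:", "".join("+" if e > 0 else "-" for e in residuals))
```

The R² comes out at about 0.95, yet the residuals are positive at both ends of the x range and negative in the middle. That systematic pattern suggests the straight line is missing curvature, which the R² value alone does not reveal.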

Coefficient of Determination Formula

There are two coefficient of determination formulas, depending on the type of regression.

In simple linear regression, it is the square of Pearson’s correlation coefficient r:

$$R^2 = r^2$$
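As a quick check of this relationship, the sketch below computes Pearson’s r by hand, squares it, and compares the result with the R² obtained from a least squares line fitted to the same data. The study hours and test scores are made-up numbers, and the function names are hypothetical helpers written for this example.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_regression_r_squared(x, y):
    """R-squared of a least squares line (with intercept) of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

# Made-up illustration data: study hours and test scores.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 61, 70, 72, 80]

r = pearson_r(hours, scores)
r2_fit = simple_regression_r_squared(hours, scores)

print(f"r squared:         {r ** 2:.6f}")
print(f"R-squared from fit: {r2_fit:.6f}")
```

The two printed values agree up to floating-point rounding. This equivalence holds for simple linear regression fitted by least squares with an intercept.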

In multiple linear regression, the coefficient of determination formula uses the regression model’s sums of squares values:

$$R^2 = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}$$

where SSresidual is the residual sum of squares (RSS) and SStotal is the total sum of squares of the dependent variable around its mean.

The SSresidual / SStotal ratio represents the proportion of variation in the outcome that the model does not explain. Hence, subtracting this ratio from 1 gives the proportion that the model explains.
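The calculation described above can be sketched in a few lines. The function `r_squared` and the small data lists are hypothetical, chosen only to show the two endpoints of the scale and one case in between.

```python
def r_squared(observed, predicted):
    """Return 1 - SS_residual / SS_total for paired observed and predicted values."""
    mean_observed = sum(observed) / len(observed)
    # Variation the model leaves unexplained.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation of the outcome around its mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Made-up outcome values with a mean of 6.
observed = [3.0, 5.0, 7.0, 9.0]

# Predicting the mean for every observation explains none of the variation.
r2_mean_only = r_squared(observed, [6.0, 6.0, 6.0, 6.0])

# Predictions that match the data exactly explain all of it.
r2_perfect = r_squared(observed, list(observed))

# Predictions that are close, but not exact, fall in between.
r2_close = r_squared(observed, [3.5, 4.5, 7.5, 8.5])

print(r2_mean_only, r2_perfect, round(r2_close, 2))
```

Here the mean-only predictions give 0, the exact predictions give 1, and the close predictions give about 0.95, because their residual sum of squares (1.0) is small relative to the total sum of squares (20.0).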

Example

Suppose a model predicting test scores from study hours yields a coefficient of determination of 0.72. This means the model explains 72% of the variation in test scores, while the remaining 28% is unexplained, reflecting factors not included in the model and random variation.

Related Articles:
  • How To Interpret R-squared in Regression Analysis
  • Glossary: R-squared