Statistics By Jim

Making statistics intuitive

Confusion Matrix

By Jim Frost

What is a Confusion Matrix?

A confusion matrix is a 2×2 table that summarizes the accuracy of a classification model or diagnostic test by comparing predicted outcomes to actual outcomes. It shows where the model or test made correct predictions and where it was wrong. This summary reveals whether the model confuses positives and negatives, and in which direction, so you can target improvements to its accuracy.

The matrix organizes outcomes into four mutually exclusive categories based on whether the results were positive or negative and whether they were correct or incorrect:

Image displaying the confusion matrix.

Each cell in the confusion matrix represents the following:

  • True Positive (TP): The model correctly predicts a positive case.
  • False Positive (FP): The model incorrectly predicts a positive case when it is actually negative.
  • False Negative (FN): The model incorrectly predicts a negative case when it is actually positive.
  • True Negative (TN): The model correctly predicts a negative case.
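The four cells can be tallied directly from paired actual and predicted labels. The following is a minimal sketch, not a definitive implementation: the function name `tally_confusion_matrix`, the `positive` parameter, and the label lists are hypothetical and chosen only for illustration.

```python
def tally_confusion_matrix(actual, predicted, positive=1):
    """Count TP, FP, FN, and TN from paired actual and predicted labels."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1  # predicted positive, actually positive
        elif p == positive:
            counts["FP"] += 1  # predicted positive, actually negative
        elif a == positive:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1  # predicted negative, actually negative
    return counts


# Hypothetical labels: 1 = positive, 0 = negative.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

cells = tally_confusion_matrix(actual, predicted)
print(cells)
```

Because every case lands in exactly one cell, the four counts always sum to the number of cases.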

Common Metrics from a Confusion Matrix

A confusion matrix is the foundation for calculating many useful performance metrics in both diagnostic testing and classification models. Each metric describes a different way of evaluating how well the model performs based on the values in the matrix.

Sensitivity (True Positive Rate)

Sensitivity measures how well the model or test captures actual positives. It is the proportion of true positives out of all cases that are actually positive, such as all people who actually have the condition.

Sensitivity = TP / (TP + FN)

For example, a COVID-19 test with a sensitivity of 90% correctly identifies 90% of infected individuals. High sensitivity is important when missing positive cases is costly or dangerous.
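As a rough illustration, the sensitivity formula can be computed from the two cells in the actual-positive row. The counts below are hypothetical and were picked to reproduce the 90% figure; the function name is also an assumption.

```python
def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN). Returns None if there are no actual positives."""
    actual_positives = tp + fn
    if actual_positives == 0:
        return None  # undefined without any actual positive cases
    return tp / actual_positives


# Hypothetical counts: 100 infected people, 90 of whom test positive.
sens = sensitivity(tp=90, fn=10)
print(f"Sensitivity: {sens:.0%}")
```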

Specificity (True Negative Rate)

Specificity measures how well the model or test identifies actual negatives. It is the proportion of true negatives among all cases that are actually negative, such as all people who do not have the condition.

Specificity = TN / (TN + FP)

A cancer screening test with 95% specificity correctly rules out cancer in 95% of healthy individuals, reducing false positives and unnecessary follow-up procedures.
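A similar sketch applies to specificity, which uses the two cells in the actual-negative row. The counts are hypothetical and chosen to match the 95% example; the function name is an assumption.

```python
def specificity(tn, fp):
    """Specificity = TN / (TN + FP). Returns None if there are no actual negatives."""
    actual_negatives = tn + fp
    if actual_negatives == 0:
        return None  # undefined without any actual negative cases
    return tn / actual_negatives


# Hypothetical counts: 1,000 healthy people, 950 of whom test negative.
spec = specificity(tn=950, fp=50)
print(f"Specificity: {spec:.0%}")
print(f"False positive rate: {1 - spec:.0%}")  # the complement of specificity
```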

Learn in-depth about Sensitivity and Specificity: Definition, Formulas & Interpreting.

Positive Predictive Value (Precision)

Positive predictive value, also known as precision, is the proportion of positive predictions that are actually correct.

PPV = TP / (TP + FP)

If an email spam filter has a PPV of 80%, then 80% of emails it flags as spam truly are spam. This metric helps evaluate how trustworthy a positive result is.
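Unlike sensitivity and specificity, PPV is computed down the predicted-positive column of the matrix. The sketch below uses hypothetical counts that reproduce the 80% spam filter example; the function name is an assumption.

```python
def positive_predictive_value(tp, fp):
    """PPV = TP / (TP + FP). Returns None if nothing was predicted positive."""
    predicted_positives = tp + fp
    if predicted_positives == 0:
        return None  # undefined when the model makes no positive predictions
    return tp / predicted_positives


# Hypothetical counts: the filter flags 100 emails, 80 of which are truly spam.
ppv = positive_predictive_value(tp=80, fp=20)
print(f"PPV (precision): {ppv:.0%}")
```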

Learn in-depth about Positive Predictive Value: Meaning, Formula, & Interpreting.

Negative Predictive Value (NPV)

Negative predictive value shows how often a negative prediction from the model or test is accurate. It’s the proportion of true negatives among all negative predictions.

NPV = TN / (TN + FN)

If a pregnancy test has an NPV of 92%, then 92% of its negative results are correct. A high NPV builds confidence when the test result is negative.
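NPV mirrors PPV but uses the predicted-negative column. This is a minimal sketch with hypothetical counts chosen to give 92%; the function name is an assumption.

```python
def negative_predictive_value(tn, fn):
    """NPV = TN / (TN + FN). Returns None if nothing was predicted negative."""
    predicted_negatives = tn + fn
    if predicted_negatives == 0:
        return None  # undefined when the test produces no negative results
    return tn / predicted_negatives


# Hypothetical counts: 100 negative test results, 92 of which are correct.
npv = negative_predictive_value(tn=92, fn=8)
print(f"NPV: {npv:.0%}")
```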

Accuracy

Accuracy is the overall proportion of correct predictions, both positive and negative, based on the full confusion matrix.

Accuracy = (TP + TN) / (TP + FP + FN + TN)

If a classification model has 94% accuracy, it gets the right answer 94% of the time. However, accuracy can be misleading if most outcomes fall into one category. The confusion matrix helps reveal whether high accuracy reflects balanced performance or a skewed dataset.
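To see how accuracy can mislead on a skewed dataset, consider a sketch with hypothetical counts: 1,000 cases of which only 50 are positive, and a model that predicts negative for every case. The numbers and function names are assumptions for illustration only.

```python
def accuracy(tp, fp, fn, tn):
    """Accuracy = (TP + TN) / (TP + FP + FN + TN)."""
    return (tp + tn) / (tp + fp + fn + tn)


def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


# Hypothetical imbalanced data: 50 positives, 950 negatives.
# The model predicts "negative" for everyone, so it never finds a positive.
tp, fp, fn, tn = 0, 0, 50, 950

acc = accuracy(tp, fp, fn, tn)
sens = sensitivity(tp, fn)
print(f"Accuracy: {acc:.0%}")      # high, driven entirely by the majority class
print(f"Sensitivity: {sens:.0%}")  # the model misses every actual positive
```

The accuracy looks strong even though the model is useless for detecting positives, which is why it helps to inspect the individual cells rather than accuracy alone.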

F1 Score

The F1 score combines sensitivity and precision into one metric by taking their harmonic mean. It’s especially useful when there’s class imbalance or when both false positives and false negatives matter.

F1 Score = 2 × (Precision × Sensitivity) / (Precision + Sensitivity)

A fraud detection model with an F1 score of 0.75 strikes a reasonable balance between catching fraudulent transactions and avoiding false alarms, although whether that score is good enough depends on the application. The confusion matrix provides the values needed to calculate this score and assess that balance.
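The F1 formula can be sketched from the matrix cells as follows. The counts are hypothetical and chosen so the score comes out to 0.75. The sketch also checks the algebraically equivalent count form, 2TP / (2TP + FP + FN), which shows that F1 ignores true negatives.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * (precision * sensitivity) / (precision + sensitivity)."""
    if tp == 0:
        return 0.0  # no true positives: treat F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return 2 * (precision * sensitivity) / (precision + sensitivity)


# Hypothetical counts: 60 frauds caught, 20 false alarms, 20 frauds missed.
tp, fp, fn = 60, 20, 20

f1 = f1_score(tp, fp, fn)
f1_from_counts = 2 * tp / (2 * tp + fp + fn)  # equivalent form using counts only
print(f"F1 score: {f1:.2f}")
print(f"F1 from counts: {f1_from_counts:.2f}")
```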
