Use scatterplots to show relationships between pairs of continuous variables. These graphs display symbols at the X, Y coordinates of the data points for the paired variables. Scatterplots are also known as scattergrams and scatter charts. [Read more…] about Scatterplots: Using, Examples, and Interpreting
Use pie charts to compare the sizes of categories to the entire dataset. To create a pie chart, you must have a categorical variable that divides your data into groups. These graphs consist of a circle (i.e., the pie) with slices representing subgroups. The size of each slice is proportional to the relative size of each category out of the whole. [Read more…] about Pie Charts: Using, Examples, and Interpreting
Use bar charts to compare categories when you have at least one categorical or discrete variable. Each bar represents a summary value for one discrete level, where longer bars indicate higher values. Types of summary values include counts, sums, means, and standard deviations. Bar charts are also known as bar graphs. [Read more…] about Bar Charts: Using, Examples, and Interpreting
Use line charts to display a series of data points that are connected by lines. Analysts use line charts to emphasize changes in a metric on the vertical Y-axis by another variable on the horizontal X-axis. Often, the X-axis reflects time, but not always. Line charts are also known as line plots. [Read more…] about Line Charts: Using, Examples, and Interpreting
Use dot plots to display the distribution of your sample data when you have continuous variables. These graphs stack dots along the horizontal X-axis to represent the frequencies of different values. More dots indicate greater frequency. Each dot represents a set number of observations. [Read more…] about Dot Plots: Using, Examples, and Interpreting
Use an empirical cumulative distribution function plot to display the data points in your sample from lowest to highest against their percentiles. These graphs require continuous variables and allow you to derive percentiles and other distribution properties. This function is also known as the empirical CDF or ECDF. [Read more…] about Empirical Cumulative Distribution Function (CDF) Plots
Use contour plots to display the relationship between two independent variables and a dependent variable. The graph shows values of the Z variable for combinations of the X and Y variables. The X and Y values are displayed along the X and Y-axes, while contour lines and bands represent the Z value. The contour lines connect combinations of the X and Y variables that produce equal values of Z. [Read more…] about Contour Plots: Using, Examples, and Interpreting
Spearman’s correlation in statistics is a nonparametric alternative to Pearson’s correlation. Use Spearman’s correlation for data that follow curvilinear, monotonic relationships and for ordinal data. Statisticians also refer to Spearman’s rank order correlation coefficient as Spearman’s ρ (rho).
In this post, I’ll cover what all that means so you know when and why you should use Spearman’s correlation instead of the more common Pearson’s correlation. [Read more…] about Spearman’s Correlation Explained
Time series analysis tracks characteristics of a process at regular time intervals. It’s a fundamental method for understanding how a metric changes over time and forecasting future values. Analysts use time series methods in a wide variety of contexts. [Read more…] about Time Series Analysis Introduction
Histograms are graphs that display the distribution of your continuous data. They are fantastic exploratory tools because they reveal properties about your sample data in ways that summary statistics cannot. For instance, while the mean and standard deviation can numerically summarize your data, histograms bring your sample data to life.
In this blog post, I’ll show you how histograms reveal the shape of the distribution, its central tendency, and the spread of values in your sample data. You’ll also learn how to identify outliers, how histograms relate to probability distribution functions, and why you might need to use hypothesis tests with them.
[Read more…] about Using Histograms to Understand Your Data
Graphing your data before performing statistical analysis is a crucial step. Graphs bring your data to life in a way that statistical measures do not because they display the relationships and patterns. In this blog post, you’ll learn about using boxplots and individual value plots to compare distributions of continuous measurements between groups. You’ll also learn why you need to pair these plots with hypothesis tests when you want to make inferences about a population. [Read more…] about Boxplots vs. Individual Value Plots: Graphing Continuous Data by Groups
A probability distribution is a function that describes the likelihood of obtaining the possible values that a random variable can assume. In other words, the values of the variable vary based on the underlying probability distribution.
Suppose you draw a random sample and measure the heights of the subjects. As you measure heights, you can create a distribution of heights. This type of distribution is useful when you need to know which outcomes are most likely, the spread of potential values, and the likelihood of different results.
In this blog post, you’ll learn about probability distributions for both discrete and continuous variables. I’ll show you how they work and examples of how to use them. [Read more…] about Understanding Probability Distributions
In the field of statistics, data are vital. Data are the information that you collect to learn, draw conclusions, and test hypotheses. After all, statistics is the science of learning from data. However, there are different types of variables, and they record various kinds of information. Crucially, the type of information determines what you can learn from it, and, importantly, what you cannot learn from it. Consequently, it’s essential that you understand the different types of data. [Read more…] about Guide to Data Types and How to Graph Them in Statistics
In a previous blog post, I introduced the basic concepts of hypothesis testing and explained the need for performing these tests. In this post, I’ll build on that and compare various types of hypothesis tests that you can use with different types of data, explore some of the options, and explain how to interpret the results. Along the way, I’ll point out important planning considerations, related analyses, and pitfalls to avoid. [Read more…] about Comparing Hypothesis Tests for Continuous, Binary, and Count Data
Regression analysis mathematically describes the relationship between a set of independent variables and a dependent variable. There are numerous types of regression models that you can use. This choice often depends on the kind of data you have for the dependent variable and the type of model that provides the best fit. In this post, I cover the more common types of regression analyses and how to decide which one is right for your data. [Read more…] about Choosing the Correct Type of Regression Analysis