A linear regression equation describes the relationship between the independent variables (IVs) and the dependent variable (DV). It can also predict new values of the DV for the IV values you specify. [Read more…] about Linear Regression Equation Explained
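As a minimal sketch of both uses, the Python below fits a one-IV equation by ordinary least squares and then predicts the DV for a new IV value. The data and the function names `fit_line` and `predict` are made up for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Return the least-squares (slope, intercept) for a single IV."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(slope, intercept, new_x):
    """Plug a new IV value into the fitted equation."""
    return intercept + slope * new_x

# Hypothetical data that follow DV = 1 + 2 * IV exactly
iv = [1, 2, 3, 4, 5]
dv = [3, 5, 7, 9, 11]
slope, intercept = fit_line(iv, dv)
prediction = predict(slope, intercept, 6)
print(f"DV = {intercept:.2f} + {slope:.2f} * IV; prediction at IV=6: {prediction:.2f}")
```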
What is Relative Risk?
Relative risk is the probability of an adverse outcome in an exposed group divided by its probability in an unexposed group. This statistic indicates whether exposure corresponds to an increase, a decrease, or no change in the probability of the adverse outcome. Use relative risk to measure the strength of the association between exposure and the outcome. Analysts also refer to this statistic as the risk ratio. [Read more…] about Relative Risk: Definition, Formula & Interpretation
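The definition translates directly into a short calculation. This sketch uses hypothetical counts, and `relative_risk` is an illustrative name rather than a library function.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical study: 20 of 100 exposed and 10 of 100 unexposed subjects had the outcome
rr = relative_risk(20, 100, 10, 100)
print(rr)  # values above 1 suggest higher risk with exposure, below 1 lower risk
```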
What is Factor Analysis?
Factor analysis uses the correlation structure amongst observed variables to model a smaller number of unobserved, latent variables known as factors. Researchers use this statistical method when subject-area knowledge suggests that latent factors cause observable variables to covary. Use factor analysis to identify the hidden variables. [Read more…] about Factor Analysis Guide with an Example
What is K Means Clustering?
The K means clustering algorithm divides a set of n observations into k clusters. Use K means clustering when you don’t have existing group labels and want to assign similar data points to the number of groups you specify (K). [Read more…] about What is K Means Clustering? With an Example
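A bare-bones version of the algorithm can be sketched as follows. It assumes the first k points seed the centroids, which keeps the example deterministic; practical implementations usually choose starting points more carefully. The function name and data are hypothetical.

```python
import math

def k_means(points, k, max_iterations=100):
    """Assign each point to the nearest of k centroids, then update the centroids."""
    # Assumption: seed the centroids with the first k points
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: nearest centroid by Euclidean distance
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
labels, centroids = k_means(points, k=2)
print(labels)
```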
What is Cronbach’s Alpha?
Cronbach’s alpha coefficient measures the internal consistency, or reliability, of a set of survey items. Use this statistic to help determine whether a collection of items consistently measures the same characteristic. Cronbach’s alpha quantifies the level of agreement on a standardized 0 to 1 scale. Higher values indicate higher agreement between items. [Read more…] about Cronbach’s Alpha: Definition, Calculations & Example
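One common way to compute the coefficient compares the sum of the item variances with the variance of the respondents' total scores. The sketch below assumes that formula, with rows as respondents and columns as items; the scores and the function name are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per survey item."""
    k = len(scores[0])  # number of items
    item_variances = [variance(item) for item in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses from five people to three items
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```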
The chi-square goodness of fit test evaluates whether proportions of categorical or discrete outcomes in a sample follow a population distribution with hypothesized proportions. In other words, when you draw a random sample, do the observed proportions follow the values that theory suggests? [Read more…] about Chi-Square Goodness of Fit Test: Uses & Examples
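The test statistic behind this comparison sums the squared gaps between observed and expected counts, each scaled by the expected count. This sketch computes only that statistic; turning it into a p-value needs the chi-square distribution, which the Python standard library does not provide. The counts and proportions are hypothetical.

```python
def chi_square_statistic(observed_counts, hypothesized_proportions):
    """Sum of (observed - expected)^2 / expected across categories."""
    n = sum(observed_counts)
    expected_counts = [n * p for p in hypothesized_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))

# Hypothetical sample of 100 outcomes across three categories
observed = [50, 30, 20]
hypothesized = [0.5, 0.25, 0.25]
chi2 = chi_square_statistic(observed, hypothesized)
print(chi2)  # degrees of freedom = number of categories - 1
```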
What is Inter-Rater Reliability?
Inter-rater reliability measures the agreement between subjective ratings by multiple raters, inspectors, judges, or appraisers. It answers the question, is the rating system consistent? High inter-rater reliability indicates that multiple raters’ ratings for the same item are consistent. Conversely, low reliability means they are inconsistent. [Read more…] about Inter-Rater Reliability: Definition, Examples & Assessing
What is the Margin of Error?
The margin of error (MOE) for a survey tells you how close you can expect the survey results to be to the true population value. For example, a survey indicates that 72% of respondents favor Brand A over Brand B with a 3% margin of error. In this case, the actual population percentage that prefers Brand A likely falls within the range of 72% ± 3%, or 69% to 75%. [Read more…] about Margin of Error: Formula and Interpreting
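For a proportion from a simple random sample, the usual formula is the critical z-value times the standard error of the proportion. The sketch below assumes that formula and a 95% confidence level. The sample size of 860 is hypothetical, chosen because it yields roughly the 3% margin in the example above.

```python
import math
from statistics import NormalDist

def margin_of_error(proportion, sample_size, confidence=0.95):
    """MOE for a sample proportion, assuming a simple random sample."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error

p_hat = 0.72
moe = margin_of_error(p_hat, sample_size=860)  # hypothetical sample size
print(f"{p_hat:.0%} ± {moe:.1%}")
```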
What is a Confidence Interval?
A confidence interval (CI) is a range of values that is likely to contain the value of an unknown population parameter. These intervals represent a plausible domain for the parameter given the characteristics of your sample data. Confidence intervals are derived from sample statistics and are calculated using a specified confidence level. [Read more…] about Confidence Intervals: Interpreting, Finding & Formulas
What is a Test Statistic?
A test statistic assesses how consistent your sample data are with the null hypothesis in a hypothesis test. Test statistic calculations take your sample data and boil them down to a single number that quantifies how much your sample diverges from the null hypothesis. As a test statistic value becomes more extreme, it indicates larger differences between your sample data and the null hypothesis. [Read more…] about Test Statistic: Definition, Types & Formulas
What is an Odds Ratio?
An odds ratio (OR) quantifies the relationship between a variable and the likelihood of an event occurring. A common use for odds ratios is identifying risk factors by assessing the relationship between exposure to a risk factor and a medical outcome. For example, is there an association between exposure to a chemical and a disease? [Read more…] about Odds Ratio: Formula, Calculating & Interpreting
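For the chemical exposure question, the calculation divides the odds of disease among the exposed by the odds among the unexposed. The counts below are hypothetical and `odds_ratio` is an illustrative name.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the event with exposure divided by odds without exposure."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Hypothetical 2x2 table: 30 of 100 exposed and 10 of 100 unexposed developed the disease
result = odds_ratio(30, 70, 10, 90)
print(round(result, 2))  # above 1 suggests higher odds with exposure
```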
What is a Case Control Study?
A case control study is a retrospective, observational study that compares two existing groups. Researchers form these groups based on the existence of a condition in the case group and the lack of that condition in the control group. They evaluate the differences in the histories between these two groups looking for factors that might cause a disease. [Read more…] about Case Control Study: Definition, Benefits & Examples
What is the 5 Number Summary?
The 5 number summary is an exploratory data analysis tool that provides insight into the distribution of values for one variable. Collectively, this set of statistics describes where data values occur, their central tendency, variability, and the general shape of their distribution. [Read more…] about 5 Number Summary: Definition, Finding & Using
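The five values are the minimum, first quartile, median, third quartile, and maximum. This sketch uses Python's `statistics.quantiles` with the "inclusive" method; quartile conventions differ between software packages, so other tools may report slightly different Q1 and Q3 values for the same data.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

# Hypothetical data
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```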
Variance is a measure of variability in statistics. It assesses the average squared difference between data values and the mean. Unlike some other statistical measures of variability, it incorporates all data points in its calculations by contrasting each value to the mean. [Read more…] about Variance: Definition, Formulas & Calculations
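The calculation can be sketched in a few lines. The population version divides the sum of squared differences by n, while the sample version divides by n - 1. The data are hypothetical.

```python
from statistics import mean

def variance(values, sample=True):
    """Average squared difference from the mean (n - 1 denominator for samples)."""
    center = mean(values)
    squared_differences = [(v - center) ** 2 for v in values]
    denominator = len(values) - 1 if sample else len(values)
    return sum(squared_differences) / denominator

data = [2, 4, 4, 4, 5, 5, 7, 9]
population_variance = variance(data, sample=False)
sample_variance = variance(data)
print(population_variance, round(sample_variance, 3))
```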
Mean squared error (MSE) measures the amount of error in statistical models. It assesses the average squared difference between the observed and predicted values. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD). [Read more…] about Mean Squared Error (MSE)
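In code, the definition is a single average. The observed and predicted values here are hypothetical.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)

observed = [3, 5, 7]
predicted = [2, 5, 9]
mse = mean_squared_error(observed, predicted)
perfect = mean_squared_error(observed, observed)  # a model with no error
print(round(mse, 3), perfect)
```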
What is a Paired T Test?
Use a paired t-test when each subject has a pair of measurements, such as a before and after score. A paired t-test determines whether the mean change for these pairs is significantly different from zero. This test is an inferential statistics procedure because it uses samples to draw conclusions about populations.
The paired t test is also known as a paired sample t test or a dependent samples t test. These names reflect the fact that the two samples are paired or dependent because they contain the same subjects. Conversely, an independent samples t test contains different subjects in the two samples. [Read more…] about Paired T Test: Definition & When to Use It
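The test works on the per-subject differences: the t-value is the mean difference divided by its standard error. This sketch computes only the t-value and degrees of freedom, since the p-value requires the t distribution, which is outside the Python standard library. The before and after scores are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t-value for the mean of the paired differences against zero."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1  # (t-value, degrees of freedom)

before = [10, 12, 9, 11]
after = [12, 14, 10, 14]
t_value, df = paired_t_statistic(before, after)
print(round(t_value, 3), df)
```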
What is an Independent Samples T Test?
Use an independent samples t test when you want to compare the means of precisely two groups—no more and no less! Typically, you perform this test to determine whether two population means are different. This procedure is an inferential statistical hypothesis test, meaning it uses samples to draw conclusions about populations. The independent samples t test is also known as the two sample t test. [Read more…] about Independent Samples T Test: Definition, Using & Interpreting
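As a rough sketch, the t-value is the difference between the two sample means divided by the standard error of that difference. The version below assumes the unequal-variances (Welch) form of the standard error; the equal-variances form pools the two variances instead. The group data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t_statistic(group_a, group_b):
    """Difference in means divided by its standard error (unequal variances assumed)."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

group_a = [5, 7, 9]
group_b = [1, 2, 3]
t_stat = welch_t_statistic(group_a, group_b)
print(round(t_stat, 3))
```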
What is a Stem and Leaf Plot?
Stem and leaf plots display the shape and spread of a continuous data distribution. These graphs are similar to histograms, but instead of using bars, they show digits. They are particularly valuable during exploratory data analysis because they can help you identify the central tendency, variability, and skewness of your distribution, along with any outliers. Stem and leaf plots are also known as stemplots. [Read more…] about Stem and Leaf Plot: Making, Reading & Examples
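A simple text version can be sketched as follows. It assumes non-negative whole numbers, with the tens portion as the stem and the ones digit as the leaf; the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by stem (tens) and collect the leaves (ones digits)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(sorted(stems.items()))

plot = stem_and_leaf([23, 12, 30, 15, 21, 23])
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```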
What is a Pareto Chart?
A Pareto chart is a specialized bar chart that displays categories in descending order and a line chart representing the cumulative amount. The chart effectively communicates the categories that contribute the most to the total. Frequently, quality analysts use Pareto charts to identify the most common types of defects or other problems.
Learn how to use and read Pareto charts and understand the Pareto principle, also known as the 80/20 rule, behind them. I’ll also show you how to create them using Excel. [Read more…] about Pareto Chart: Making, Reading & Examples
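The numbers behind a Pareto chart are the category counts sorted in descending order plus a running cumulative percentage. This sketch builds that table in Python rather than Excel; the defect categories and counts are hypothetical.

```python
def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    total = sum(counts.values())
    rows, running = [], 0
    for category, count in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        running += count
        rows.append((category, count, 100 * running / total))
    return rows

defects = {"Dent": 30, "Other": 5, "Scratch": 50, "Chip": 15}
rows = pareto_table(defects)
for category, count, cumulative in rows:
    print(f"{category:<8} {count:>3} {cumulative:>6.1f}%")
```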
The range of a data set is the difference between the maximum and the minimum values. It measures variability using the same units as the data. Larger values represent greater variability.
The range is the easiest measure of dispersion to calculate and interpret in statistics, but it has some limitations. In this post, I’ll show you how to find the range mathematically and graphically, interpret it, explain its limitations, and clarify when to use it. [Read more…] about Range of a Data Set
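Mathematically, the calculation is a single subtraction. The values below are hypothetical.

```python
def data_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)

heights_cm = [152, 170, 165, 181, 159]
spread = data_range(heights_cm)
print(spread)  # reported in the same units as the data
```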