Random assignment uses chance to assign subjects to the control and treatment groups in an experiment. This process helps ensure that the groups are equivalent at the beginning of the study, which makes it safer to assume the treatments caused any differences between groups that the experimenters observe at the end of the study. [Read more…] about Random Assignment in Experiments
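Here is a minimal sketch of random assignment, assuming a simple two-group design with equal group sizes. The function name `randomly_assign` and the numeric subject IDs are hypothetical and chosen only for illustration.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects by chance and split them into two groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # chance alone decides the order
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


# Hypothetical subject IDs 1 through 20.
groups = randomly_assign(range(1, 21), seed=42)
print(groups["control"])
print(groups["treatment"])
```

Because the shuffle ignores every characteristic of the subjects, neither group is systematically favored on any variable, measured or not.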
The scientific method is a proven procedure for expanding knowledge through experimentation and analysis. It is a process that uses careful planning, rigorous methodology, and thorough assessment. Statistical analysis plays an essential role in this process.
In an experiment that includes statistical analysis, the analysis is at the end of a long series of events. To obtain valid results, it’s crucial that you carefully plan and conduct every step of a scientific study, up to and including the analysis. In this blog post, I map out five steps for scientific studies that include statistical analyses. [Read more…] about 5 Steps for Conducting Scientific Studies with Statistical Analyses
Percentiles indicate the percentage of scores that fall below a particular value. They tell you where a score stands relative to other scores. For example, a person with an IQ of 120 is at the 91st percentile, which indicates that their IQ is higher than 91 percent of other scores.
Percentiles are a great tool to use when you need to know the relative standing of a value. Where does a value fall within a distribution of values? While the concept behind percentiles is straightforward, there are different mathematical methods for calculating them. In this post, learn about percentiles, special percentiles and their surprisingly flexible uses, and the various procedures for calculating them. [Read more…] about Percentiles: Interpretations and Calculations
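The sketch below illustrates both ideas under stated assumptions. It assumes IQ scores are normally distributed with a mean of 100 and a standard deviation of 15 (a common convention, not stated above), and it uses a small hypothetical sample to show that two interpolation methods in Python's `statistics.quantiles` can give different quartiles for the same data.

```python
from statistics import NormalDist, quantiles

# Assumption: IQ ~ Normal(mean=100, SD=15).
iq = NormalDist(mu=100, sigma=15)
percentile_of_120 = iq.cdf(120) * 100   # percent of scores below 120
print(f"IQ 120 percentile: {percentile_of_120:.1f}")

# Hypothetical sample, used only to compare calculation methods.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]
q_exclusive = quantiles(scores, n=4, method="exclusive")
q_inclusive = quantiles(scores, n=4, method="inclusive")
print("Quartiles (exclusive):", q_exclusive)
print("Quartiles (inclusive):", q_inclusive)
```

The two methods differ in how they interpolate between neighboring observations, so small samples are where the discrepancy is most visible.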
I’m thrilled to announce the release of my first ebook! Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models.
If you like the clear writing style I use on this website, you’ll love this book! The end of the post displays the entire table of contents! You can also download a Free Sample that includes the complete Table of Contents and the first two chapters. Go to My Store to download the ebook sample. [Read more…] about New eBook Release! Regression Analysis: An Intuitive Guide
To determine whether the difference between two means is statistically significant, analysts often compare the confidence intervals for those groups. If those intervals overlap, they conclude that the difference between groups is not statistically significant. If there is no overlap, the difference is significant.
While this visual method of assessing the overlap is easy to perform, regrettably it comes at the cost of reducing your ability to detect differences. Fortunately, there is a straightforward solution that still lets you perform a quick visual assessment without diminishing the power of your analysis.
In this post, I’ll start by showing you the problem in action and explain why it happens. Then, we’ll proceed to an easy alternative method that avoids this problem. [Read more…] about Using Confidence Intervals to Compare Means
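To see why overlap can mislead, here is a small sketch with hypothetical summary statistics. It assumes a normal approximation (a z critical value rather than a t value) and independent groups, and it compares the two individual confidence intervals with the confidence interval for the difference between means. This is an illustration of the general issue, not necessarily the exact method the post describes.

```python
from math import sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

# Hypothetical group means and standard errors.
mean_a, se_a = 10.0, 1.0
mean_b, se_b = 13.0, 1.0

# Individual 95% confidence intervals for each group mean.
ci_a = (mean_a - z * se_a, mean_a + z * se_a)
ci_b = (mean_b - z * se_b, mean_b + z * se_b)
intervals_overlap = ci_a[1] >= ci_b[0] and ci_b[1] >= ci_a[0]

# 95% confidence interval for the difference between the means.
diff = mean_b - mean_a
se_diff = sqrt(se_a**2 + se_b**2)
ci_diff = (diff - z * se_diff, diff + z * se_diff)
difference_significant = not (ci_diff[0] <= 0 <= ci_diff[1])

print("Group CIs overlap:", intervals_overlap)
print("CI of the difference excludes zero:", difference_significant)
```

With these numbers the group intervals overlap, yet the interval for the difference excludes zero, because the standard error of the difference is smaller than the sum of the two individual standard errors.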
Can high p-values be helpful? What do high p-values mean?
Typically, when you perform a hypothesis test, you want to obtain low p-values that are statistically significant. Low p-values are sexy. They represent exciting findings and can help you get articles published.
However, you might be surprised to learn that higher p-values, the ones that are not statistically significant, are also valuable. In this post, I’ll show you the potential value of a p-value that is greater than 0.05, or whatever significance level you’re using. [Read more…] about Can High P-values Be Meaningful?
Histograms are graphs that display the distribution of your continuous data. They are fantastic exploratory tools because they reveal properties about your sample data in ways that summary statistics cannot. For instance, while the mean and standard deviation can numerically summarize your data, histograms bring your sample data to life.
In this blog post, I’ll show you how histograms reveal the shape of the distribution, its central tendency, and the spread of values in your sample data. You’ll also learn how to identify outliers, how histograms relate to probability distribution functions, and why you might need to use hypothesis tests with them.
[Read more…] about Using Histograms to Understand Your Data
Graphing your data before performing statistical analysis is a crucial step. Graphs bring your data to life in a way that statistical measures do not because they display the relationships and patterns. In this blog post, you’ll learn about using boxplots and individual value plots to compare distributions of continuous measurements between groups. You’ll also learn why you need to pair these plots with hypothesis tests when you want to make inferences about a population. [Read more…] about Boxplots vs. Individual Value Plots: Graphing Continuous Data by Groups
Post hoc tests are an integral part of ANOVA. When you use ANOVA to test the equality of at least three group means, statistically significant results indicate that not all of the group means are equal. However, ANOVA results do not identify which particular differences between pairs of means are significant. Use post hoc tests to explore differences between multiple group means while controlling the experiment-wise error rate.
In this post, I’ll explain what post hoc analyses are, describe the critical benefits they provide, and help you choose the correct one for your study. Additionally, I’ll show why failure to control the experiment-wise error rate will cause you to have severe doubts about your results. [Read more…] about Using Post Hoc Tests with ANOVA
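As a rough sketch of how the experiment-wise error rate grows, the code below uses the formula 1 - (1 - alpha)^c for c pairwise comparisons. It assumes the comparisons are independent, which is only an approximation for pairwise tests on shared groups; the function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all
    pairwise comparisons, assuming independent comparisons."""
    n_comparisons = comb(n_groups, 2)   # number of distinct pairs
    return 1 - (1 - alpha) ** n_comparisons


for k in (3, 4, 5):
    rate = familywise_error_rate(k)
    print(f"{k} groups, {comb(k, 2)} comparisons: {rate:.3f}")
```

Even with only four groups, the approximate error rate is several times the 0.05 level used for each individual comparison.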
One-tailed hypothesis tests offer the promise of more statistical power compared to an equivalent two-tailed design. While there is some debate about when you can use a one-tailed test, the general consensus among statisticians is that you should use two-tailed tests unless you have concrete reasons for using a one-tailed test.
In this post, I discuss when you should and should not use one-tailed tests. I’ll cover the different schools of thought and offer my own opinion. [Read more…] about When Can I Use One-Tailed Hypothesis Tests?
Choosing whether to perform a one-tailed or a two-tailed hypothesis test is one of the methodology decisions you might need to make for your statistical analysis. This choice can have critical implications for the types of effects it can detect, the statistical power of the test, and potential errors.
In this post, you’ll learn about the differences between one-tailed and two-tailed hypothesis tests and their advantages and disadvantages. I include examples of both types of statistical tests. In my next post, I cover the decision between one- and two-tailed tests in more detail.
[Read more…] about One-Tailed and Two-Tailed Hypothesis Tests Explained
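A short sketch of the difference, assuming a z-test and a hypothetical test statistic of 1.8. The one-tailed p-value here is for an upper-tailed alternative; the function name is made up for illustration.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return (upper-tailed p-value, two-tailed p-value) for a z statistic."""
    std_normal = NormalDist()
    upper_tail = 1 - std_normal.cdf(z)
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    return upper_tail, two_tailed


one_tailed_p, two_tailed_p = z_test_p_values(1.8)
print(f"one-tailed p = {one_tailed_p:.4f}")
print(f"two-tailed p = {two_tailed_p:.4f}")
```

For this statistic the one-tailed test is significant at the 0.05 level while the two-tailed test is not, which shows both the extra power and the reason the choice has to be made before looking at the data.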
The central limit theorem in statistics states that, given a sufficiently large sample size, the sampling distribution of the mean for a variable will approximate a normal distribution regardless of that variable’s distribution in the population.
Unpacking the meaning from that complex definition can be difficult. That’s the topic for this post! I’ll walk you through the various aspects of the central limit theorem (CLT) definition, and show you why it is vital in statistics. [Read more…] about Central Limit Theorem Explained
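A quick simulation can make the definition concrete. This sketch assumes an exponential population with mean 1 and standard deviation 1 (strongly skewed, so clearly not normal), a sample size of 50, and 5,000 repeated samples; all of these settings are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)     # fixed seed so the run is reproducible
sample_size = 50
n_samples = 5000

# Draw many samples from a skewed population and record each sample mean.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_samples)
]

center = mean(sample_means)
spread = stdev(sample_means)
expected_spread = 1 / sqrt(sample_size)   # sigma / sqrt(n), with sigma = 1

print(f"mean of sample means: {center:.3f} (population mean is 1)")
print(f"SD of sample means:   {spread:.3f} (theory: {expected_spread:.3f})")
```

The sample means cluster around the population mean with a spread close to sigma divided by the square root of n, and a histogram of `sample_means` would look roughly bell-shaped even though the population is not.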
Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. This process allows you to calculate standard errors, construct confidence intervals, and perform hypothesis testing for numerous types of sample statistics. Bootstrap methods are alternative approaches to traditional hypothesis testing and are notable for being easier to understand and valid for more conditions.
In this blog post, I explain bootstrapping basics, compare bootstrapping to conventional statistical methods, and explain when it can be the better method. Additionally, I’ll work through an example using real data to create bootstrapped confidence intervals. [Read more…] about Introduction to Bootstrapping in Statistics with an Example
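The resampling idea can be sketched in a few lines. This example builds a percentile bootstrap confidence interval for the mean; the data values are hypothetical (not the real data from the post), and the function name and defaults are assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_ci_mean(data, n_resamples=5000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, same size as the original sample.
    boot_means = sorted(
        mean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = boot_means[int(tail * n_resamples)]
    upper = boot_means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements.
data = [23.1, 25.4, 19.8, 30.2, 27.5, 22.0, 26.3, 24.9, 21.7, 28.8, 20.5, 29.1]
lower, upper = bootstrap_ci_mean(data)
print(f"sample mean = {mean(data):.2f}")
print(f"95% bootstrap CI = ({lower:.2f}, {upper:.2f})")
```

The same function works for other statistics if you swap `mean` for, say, `median`, which is part of what makes the approach flexible.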
Omitted variable bias occurs when a regression model leaves out relevant independent variables, which are known as confounding variables. This condition forces the model to attribute the effects of omitted variables to variables that are in the model, which biases the coefficient estimates. [Read more…] about Confounding Variables Can Bias Your Results
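Here is a small simulation of that effect, using made-up coefficients. It assumes the true model is y = 1·x + 2·z + noise, with z correlated with x (z = 0.5·x + noise). Regressing y on x alone then gives a slope near 1 + 2 × 0.5 = 2 rather than the true value of 1.

```python
import random

rng = random.Random(1)
n = 20000

x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.5 * xi + rng.gauss(0, 1) for xi in x]            # confounder tied to x
y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]


def simple_slope(xs, ys):
    """Least-squares slope for a one-predictor regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


biased_slope = simple_slope(x, y)    # z is left out of the model
print(f"true effect of x: 1.0, estimated with z omitted: {biased_slope:.2f}")
```

The model has nowhere else to put the effect of z, so it credits that effect to x.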
Because histograms display the shape and spread of distributions, you might think they’re the best type of graph for determining whether your data are normally distributed. However, I’ll show you how histograms can trick you! Normal probability plots are a better choice for this task and they are easy to use.
[Read more…] about Assessing Normality: Histograms vs. Normal Probability Plots
Here’s some shocking information for you—sample statistics are always wrong! When you use samples to estimate the properties of populations, you never obtain the correct values exactly. Don’t worry. I’ll help you navigate this issue using a simple statistical tool! [Read more…] about Sample Statistics Are Always Wrong (to Some Extent)!
Luck, statistics, and probabilities go hand in hand. Clint Eastwood, playing Dirty Harry, famously asked a bad guy who was about to reach for his rifle whether he felt lucky. I’m quite sure that the crook carefully pondered the nature of luck, probabilities, and expected outcomes before deciding not to grab his rifle!
A while ago, I did something shocking . . . something that I hadn’t done for several decades. Just like the thief in the Dirty Harry movie, I started thinking about luck. Yes, you guessed it: I bought a lottery ticket for the record-breaking Mega Millions Jackpot. This purchase is shocking for someone like me who knows statistics and is fully aware of how unlikely it is to win. Did I feel lucky? Or was I just a punk? [Read more…] about Luck and Statistics: Do You Feel Lucky, Punk?
Inferential statistics let you draw conclusions about populations by using small samples. Consequently, inferential statistics provide enormous benefits because typically you can’t measure an entire population.
However, to gain these benefits, you must understand the relationship between populations, subpopulations, population parameters, samples, and sample statistics.
In this blog post, I discuss these concepts, and how to obtain representative samples using random sampling.
Hypothesis tests use sample data to make inferences about the properties of a population. You gain tremendous benefits by working with random samples because it is usually impossible to measure the entire population.
However, there are tradeoffs when you use samples. The samples we use are typically a minuscule percentage of the entire population. Consequently, they occasionally misrepresent the population severely enough to cause hypothesis tests to make errors.
In this blog post, you will learn about the two types of errors in hypothesis testing, their causes, and how to manage them. [Read more…] about Types of Errors in Hypothesis Testing
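One of those errors, the false positive (Type I error), is easy to simulate. The sketch below assumes a two-sided z-test with a known standard deviation of 1, a true null hypothesis (population mean of 0), and arbitrary simulation settings. Every rejection it counts is an error caused by an unlucky sample.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

rng = random.Random(0)
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
n, n_experiments = 30, 4000

false_positives = 0
for _ in range(n_experiments):
    # The null hypothesis is true: the population mean really is 0.
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = mean(sample) / (1 / sqrt(n))     # known sigma = 1
    if abs(z) > critical_z:
        false_positives += 1             # rejected a true null

type_i_rate = false_positives / n_experiments
print(f"observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```

The observed rate should land close to the significance level, which is what alpha means in practice.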
You’ve just performed a hypothesis test and your results are statistically significant. Hurray! These results are important, right? Not so fast. Statistical significance does not necessarily mean that the results are practically significant in a real-world sense of importance.
In this blog post, I’ll talk about the differences between practical significance and statistical significance, and how to determine if your results are meaningful in the real world.
[Read more…] about Practical vs. Statistical Significance
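A worked example with hypothetical numbers shows how the two can diverge. It assumes two groups of 100,000 people each, a mean difference of 0.5 points on a scale with a standard deviation of 15, and a two-sided z-test.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical study: very large samples, very small effect.
mean_diff = 0.5
sd = 15.0
n_per_group = 100_000

se_diff = sd * sqrt(2 / n_per_group)          # SE of the difference in means
z = mean_diff / se_diff
p_value = 2 * (1 - NormalDist().cdf(z))       # two-sided p-value
cohens_d = mean_diff / sd                     # standardized effect size

print(f"z = {z:.2f}, p = {p_value:.2e}")
print(f"Cohen's d = {cohens_d:.3f}")
```

The p-value is far below 0.05, yet the effect is a small fraction of one standard deviation. Whether a difference that small matters is a subject-area judgment that the hypothesis test cannot make for you.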