What is the Mann Whitney U Test?
The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test. [Read more…] about Mann Whitney U Test Explained
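As a rough illustration of what the test statistic measures, here is a minimal sketch that computes U by directly counting pairs. The function name and the two small samples are made up for illustration, and the sketch stops at the statistic rather than producing a p-value.

```python
def mann_whitney_u(a, b):
    """Count pairs (x, y) with x from a and y from b where x > y; ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical data for two independent groups.
group_a = [1.1, 2.3, 3.8, 4.0]
group_b = [2.9, 5.2, 6.1, 7.4]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b)  # the two U values always sum to len(group_a) * len(group_b)
```

A U value far from half the number of pairs suggests that one group tends to have larger values than the other.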
Making statistics intuitive
A box plot, sometimes called a box and whisker plot, provides a snapshot of your continuous variable’s distribution. Box plots particularly excel at comparing the distributions of groups within your dataset. A box plot displays a ton of information in a simplified format. Analysts frequently use them during exploratory data analysis because they display your dataset’s central tendency, skewness, and spread, and they highlight outliers. [Read more…] about Box Plot Explained with Examples
The trimmed mean is a statistical measure that calculates a dataset’s average after removing a certain percentage of extreme values from both ends of the distribution. By excluding outliers, this statistic can provide a more accurate representation of a dataset’s typical or central values. Usually, you’ll trim a percentage of values, such as 10% or 20%. [Read more…] about Trimmed Mean: Definition, Calculating & Benefits
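The calculation can be sketched in a few lines. This is a minimal version under one assumption: the number of values trimmed from each end is rounded down. The function name and data are hypothetical.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end (rounded down)."""
    data = sorted(values)
    k = int(len(data) * proportion)  # values to drop from each tail
    kept = data[k:len(data) - k]
    if not kept:
        raise ValueError("proportion trims away every value")
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value.
data = [2, 4, 5, 5, 6, 6, 7, 7, 8, 100]

plain_mean = sum(data) / len(data)
trimmed = trimmed_mean(data, 0.10)
print(plain_mean, trimmed)  # 15.0 6.0
```

Trimming 10% from each end removes the 2 and the 100, so the result sits much closer to the bulk of the data.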
The range rule of thumb allows you to estimate the standard deviation of a dataset quickly. This process is not as accurate as the actual calculation for the standard deviation, but it’s so simple you can do it in your head. [Read more…] about Range Rule of Thumb: Overview and Formula
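The usual form of this rule divides the range by four. A short sketch, using made-up data, compares that estimate with the calculated sample standard deviation:

```python
import statistics

# Hypothetical data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range rule of thumb: standard deviation is roughly range / 4.
estimate = (max(data) - min(data)) / 4
actual = statistics.stdev(data)  # sample standard deviation

print(estimate, round(actual, 3))
```

The two values will not match exactly; the rule is only meant as a quick approximation.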
A unimodal distribution in statistics refers to a frequency distribution that has only one peak. Unimodality means that a single value in the distribution occurs more frequently than any other value. The peak represents the most common value, also known as the mode. [Read more…] about Unimodal Distribution Definition & Examples
A random variable is a variable whose value is determined by chance. Random variables can take on either discrete or continuous values, and understanding the properties of each type is essential in many statistical applications. They are a key concept in statistics and probability theory. [Read more…] about Random Variable: Discrete & Continuous
A probability mass function (PMF) is a mathematical function that calculates the probability a discrete random variable will be a specific value. PMFs also describe the probability distribution for the full range of values for a discrete variable. A discrete random variable can take on a finite or countably infinite number of possible values, such as the number of heads in a series of coin flips or the number of customers who visit a store on a given day. [Read more…] about Probability Mass Function: Definition, Uses & Example
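For the coin flip case, the binomial PMF is one concrete example. The sketch below assumes a fair coin and four flips; both choices are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Number of heads in 4 flips of a fair coin (assumed values).
n_flips, p_heads = 4, 0.5
pmf = [binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, prob in enumerate(pmf):
    print(k, prob)
```

The probabilities across all possible values sum to 1, which is what makes the function a full probability distribution.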
A cumulative distribution function (CDF) describes the probabilities of a random variable having values less than or equal to x. It is a cumulative function because it sums the total likelihood up to that point. Its output always ranges between 0 and 1. [Read more…] about Cumulative Distribution Function (CDF): Uses, Graphs & vs PDF
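For a discrete variable, the cumulative idea amounts to a running total of the individual probabilities. A minimal sketch, assuming a fair six-sided die:

```python
from fractions import Fraction
from itertools import accumulate

# Fair six-sided die (assumed example): each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# CDF(x) = P(X <= x), the running sum of the probabilities up to x.
cdf = dict(zip(faces, accumulate(pmf)))

for x, prob in cdf.items():
    print(x, prob)
```

The final value is always 1 because the variable must take one of its possible values.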
Monte Carlo simulation uses random sampling to produce simulated outcomes of a process or system. This method generates simulated input data at random and enters them into a mathematical model that describes the system. The simulation produces a distribution of outcomes that analysts can use to derive probabilities. [Read more…] about Monte Carlo Simulation: Make Better Decisions
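As a toy sketch of the idea, the code below simulates rolling two dice and estimates the probability that the total is at least 10. The "system" here is deliberately simple so the estimate can be checked against the exact answer; the function name, trial count, and seed are arbitrary choices.

```python
import random

def estimate_probability(trials, seed=0):
    """Estimate P(sum of two dice >= 10) by repeated random sampling."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total >= 10:
            hits += 1
    return hits / trials

estimate = estimate_probability(100_000)
exact = 6 / 36  # six of the 36 equally likely outcomes total 10 or more
print(estimate, exact)
```

With more trials, the simulated value tends to move closer to the exact probability.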
The hypergeometric distribution is a discrete probability distribution that calculates the likelihood an event happens k times in n trials when you are sampling from a small population without replacement. [Read more…] about Hypergeometric Distribution: Uses, Calculator & Formula
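A small sketch of the probability calculation, using the standard formula with combinations. The card example (drawing exactly one ace in a five-card hand) and the parameter names are illustrative assumptions.

```python
from math import comb

def hypergeometric_pmf(k, population, successes, draws):
    """P(exactly k successes in `draws` draws without replacement)."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )

# 52 cards, 4 aces, 5 cards drawn: chance of exactly one ace.
p_one_ace = hypergeometric_pmf(1, population=52, successes=4, draws=5)
print(round(p_one_ace, 4))
```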
The negative binomial distribution describes the number of trials required to generate an event a particular number of times. When you provide an event probability and the number of successes (r), this distribution calculates the likelihood of observing the Rth success on the Nth attempt. Statisticians also refer to this discrete probability distribution as the Pascal distribution. [Read more…] about Negative Binomial Distribution: Uses, Calculator & Formula
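The likelihood of the Rth success landing on the Nth attempt can be sketched with the standard formula: the first N - 1 attempts must contain exactly R - 1 successes, and the Nth attempt must succeed. The example values below are arbitrary.

```python
from math import comb

def negative_binomial_pmf(n, r, p):
    """P(the r-th success occurs on the n-th trial), success probability p."""
    if n < r:
        return 0.0
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

# Chance that the 3rd head appears on the 5th flip of a fair coin.
p_third_on_fifth = negative_binomial_pmf(n=5, r=3, p=0.5)
print(p_third_on_fifth)  # 0.1875
```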
Benford’s law describes the relative frequency distribution for leading digits of numbers in datasets. Leading digits with smaller values occur more frequently than larger values. This law states that approximately 30% of numbers start with a 1 while less than 5% start with a 9. According to this law, leading 1s appear 6.5 times as often as leading 9s! Benford’s law is also known as the First Digit Law. [Read more…] about Benford’s Law Explained with Examples
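The percentages above follow from the usual statement of the law, where the probability of leading digit d is log10(1 + 1/d). A short sketch reproduces them:

```python
import math

def benford_probability(d):
    """Expected relative frequency of leading digit d (1 through 9)."""
    return math.log10(1 + 1 / d)

probs = {d: benford_probability(d) for d in range(1, 10)}
ratio = probs[1] / probs[9]

for d, p in probs.items():
    print(d, round(p, 3))
print(round(ratio, 2))
```

The ratio of leading 1s to leading 9s comes out at a little over 6.5.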
A probability density function (PDF) describes a probability distribution for a random, continuous variable. Use a probability density function to find the chances that the value of a random variable will occur within a range of values that you specify. More specifically, a PDF is a function where its integral for an interval provides the probability of a value occurring in that interval. For example, what are the chances that the next IQ score you measure will fall between 120 and 140? [Read more…] about Probability Density Function: Definition & Uses
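To make the IQ question concrete, here is a sketch that assumes IQ scores follow a normal distribution with a mean of 100 and a standard deviation of 15. Those parameters are an assumption for illustration. The integral of the density over an interval equals the difference in the cumulative probabilities at its endpoints.

```python
from statistics import NormalDist

# Assumed model: IQ ~ Normal(mean=100, standard deviation=15).
iq = NormalDist(mu=100, sigma=15)

# Area under the density curve between 120 and 140.
p_between = iq.cdf(140) - iq.cdf(120)
print(round(p_between, 4))
```

Under these assumptions, somewhat under 9% of scores fall in that range.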
The t distribution is a continuous probability distribution that is symmetric and bell-shaped like the normal distribution but with a shorter peak and thicker tails. It was designed to factor in the greater uncertainty associated with small sample sizes.
The t distribution describes the variability of the distances between sample means and the population mean when the population standard deviation is unknown and the data approximately follow the normal distribution. This distribution has only one parameter, the degrees of freedom, based on (but not equal to) the sample size. [Read more…] about T Distribution: Definition & Uses
The difference between a standard deviation and a standard error can seem murky. Let’s clear that up in this post!
Standard deviation (SD) and standard error (SE) both measure variability. High values of either statistic indicate more dispersion. However, that’s where the similarities end. The standard deviation is not the same as the standard error. [Read more…] about Difference Between Standard Deviation and Standard Error
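A quick sketch of the difference for a single sample, using the standard relationship that the standard error of the mean equals the standard deviation divided by the square root of the sample size. The data are made up.

```python
import math
import statistics

# Hypothetical sample.
sample = [12, 15, 11, 14, 13, 16, 10, 13]

sd = statistics.stdev(sample)        # spread of the individual values
se = sd / math.sqrt(len(sample))     # precision of the sample mean

print(round(sd, 3), round(se, 3))
```

The standard deviation describes how much individual values vary, while the standard error shrinks as the sample size grows.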
The beta distribution is a continuous probability distribution that models random variables with values falling inside a finite interval. Use it to model subject areas with both an upper and lower bound for possible values. Analysts commonly use it to model the time to complete a task, the distribution of order statistics, and the prior distribution for binomial proportions in Bayesian analysis. [Read more…] about Beta Distribution: Uses, Parameters & Examples
The geometric distribution is a discrete probability distribution that calculates the probability of the first success occurring during a specific trial. In other words, during a series of attempts, what is the probability of success first occurring during each attempt? Use this distribution when you need to understand how many attempts are necessary to produce the first successful outcome. [Read more…] about Geometric Distribution: Uses, Calculator & Formula
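The probability of the first success on a given trial can be sketched with the standard formula: every earlier attempt fails, then the current attempt succeeds. The die example is an illustrative assumption.

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(first success occurs on trial k), with success probability p."""
    return (1 - p) ** (k - 1) * p

# Chance that the first six appears on the 3rd roll of a fair die.
p_six = Fraction(1, 6)
p_first_six_on_third = geometric_pmf(3, p_six)
print(p_first_six_on_third)  # 25/216
```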
A conditional distribution is a distribution of values for one variable that exists when you specify the values of other variables. This type of distribution allows you to assess the dispersal of your variable of interest under specific conditions, hence the name. [Read more…] about Conditional Distribution: Definition & Finding
A marginal distribution is a distribution of values for one variable that ignores a more extensive set of related variables in a dataset.
That definition sounds a bit convoluted, but the concept is simple. The idea is that when you have a larger set of related variables that you collected for a study, you might want to focus on one of them to answer a specific question. [Read more…] about Marginal Distribution: Definition & Finding
A contingency table displays frequencies for combinations of two categorical variables. Analysts also refer to contingency tables as crosstabulation and two-way tables. [Read more…] about Contingency Table: Definition, Examples & Interpreting
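A minimal sketch of building such a table by counting each combination of two categorical variables. The group and outcome labels and the observations are hypothetical.

```python
from collections import Counter

# Hypothetical observations: (group, outcome) pairs.
observations = [
    ("A", "Pass"), ("A", "Pass"), ("A", "Fail"),
    ("B", "Pass"), ("B", "Fail"), ("B", "Fail"), ("B", "Fail"),
]

# Frequency of each combination of the two variables.
table = Counter(observations)

groups = sorted({g for g, _ in observations})
outcomes = sorted({o for _, o in observations})

print("group", *outcomes)
for g in groups:
    print(g, *(table[(g, o)] for o in outcomes))
```

Each cell holds the number of observations that share that pair of categories.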