What is the Mann Whitney U Test?
The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test. [Read more…] about Mann Whitney U Test Explained
Making statistics intuitive
The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test. [Read more…] about Mann Whitney U Test Explained
Covariance in statistics measures the extent to which two variables vary linearly. It reveals whether two variables move in the same or opposite directions. [Read more…] about Covariance: Definition, Formula & Example
The root mean square error (RMSE) measures the average difference between a statistical model’s predicted values and the actual values. Mathematically, it is the standard deviation of the residuals. Residuals represent the distance between the regression line and the data points. [Read more…] about Root Mean Square Error (RMSE)
A least squares regression line represents the relationship between variables in a scatterplot. The procedure fits the line to the data points in a way that minimizes the sum of the squared vertical distances between the line and the points. It is also known as a line of best fit or a trend line. [Read more…] about Least Squares Regression: Definition, Formulas & Example
ANCOVA, or the analysis of covariance, is a powerful statistical method that analyzes the differences between three or more group means while controlling for the effects of at least one continuous covariate. [Read more…] about ANCOVA: Uses, Assumptions & Example
A cumulative distribution function (CDF) describes the probabilities of a random variable having values less than or equal to x. It is a cumulative function because it sums the total likelihood up to that point. Its output always ranges between 0 and 1. [Read more…] about Cumulative Distribution Function (CDF): Uses, Graphs & vs PDF
The slope intercept form of linear equations is an algebraic representation of straight lines: y = mx + b. [Read more…] about Slope Intercept Form of Linear Equations: A Guide
Monte Carlo simulation uses random sampling to produce simulated outcomes of a process or system. This method uses random sampling to generate simulated input data and enters them into a mathematical model that describes the system. The simulation produces a distribution of outcomes that analysts can use to derive probabilities. [Read more…] about Monte Carlo Simulation: Make Better Decisions
Principal Component Analysis (PCA) takes a large data set with many variables per observation and reduces them to a smaller set of summary indices. These indices retain most of the information in the original set of variables. Analysts refer to these new values as principal components. [Read more…] about Principal Component Analysis Guide & Example
Use a Z test when you need to compare group means. Use the 1-sample analysis to determine whether a population mean is different from a hypothesized value. Or use the 2-sample version to determine whether two population means differ. [Read more…] about Z Test: Uses, Formula & Examples
A linear regression equation describes the relationship between the independent variables (IVs) and the dependent variable (DV). It can also predict new values of the DV for the IV values you specify. [Read more…] about Linear Regression Equation Explained
Relative risk is the ratio of the probability of an adverse outcome in an exposure group divided by its likelihood in an unexposed group. This statistic indicates whether exposure corresponds to increases, decreases, or no change in the probability of the adverse outcome. Use relative risk to measure the strength of the association between exposure and the outcome. Analysts also refer to this statistic as the risk ratio. [Read more…] about Relative Risk: Definition, Formula & Interpretation
Factor analysis uses the correlation structure amongst observed variables to model a smaller number of unobserved, latent variables known as factors. Researchers use this statistical method when subject-area knowledge suggests that latent factors cause observable variables to covary. Use factor analysis to identify the hidden variables. [Read more…] about Factor Analysis Guide with an Example
The K means clustering algorithm divides a set of n observations into k clusters. Use K means clustering when you don’t have existing group labels and want to assign similar data points to the number of groups you specify (K). [Read more…] about What is K Means Clustering? With an Example
Cronbach’s alpha coefficient measures the internal consistency, or reliability, of a set of survey items. Use this statistic to help determine whether a collection of items consistently measures the same characteristic. Cronbach’s alpha quantifies the level of agreement on a standardized 0 to 1 scale. Higher values indicate higher agreement between items. [Read more…] about Cronbach’s Alpha: Definition, Calculations & Example
The chi-square goodness of fit test evaluates whether proportions of categorical or discrete outcomes in a sample follow a population distribution with hypothesized proportions. In other words, when you draw a random sample, do the observed proportions follow the values that theory suggests. [Read more…] about Chi-Square Goodness of Fit Test: Uses & Examples
Inter-rater reliability measures the agreement between subjective ratings by multiple raters, inspectors, judges, or appraisers. It answers the question, is the rating system consistent? High inter-rater reliability indicates that multiple raters’ ratings for the same item are consistent. Conversely, low reliability means they are inconsistent. [Read more…] about Inter-Rater Reliability: Definition, Examples & Assessing
The margin of error (MOE) for a survey tells you how near you can expect the survey results to be to the correct population value. For example, a survey indicates that 72% of respondents favor Brand A over Brand B with a 3% margin of error. In this case, the actual population percentage that prefers Brand A likely falls within the range of 72% ± 3%, or 69 – 75%. [Read more…] about Margin of Error: Formula and Interpreting
A confidence interval (CI) is a range of values that is likely to contain the value of an unknown population parameter. These intervals represent a plausible domain for the parameter given the characteristics of your sample data. Confidence intervals are derived from sample statistics and are calculated using a specified confidence level. [Read more…] about Confidence Intervals: Interpreting, Finding & Formulas
A test statistic assesses how consistent your sample data are with the null hypothesis in a hypothesis test. Test statistic calculations take your sample data and boil them down to a single number that quantifies how much your sample diverges from the null hypothesis. As a test statistic value becomes more extreme, it indicates larger differences between your sample data and the null hypothesis. [Read more…] about Test Statistic: Definition, Types & Formulas