## Reliability vs Validity

Reliability and validity are the criteria by which researchers assess measurement quality. Measuring a person or item involves assigning scores to represent an attribute, and this process creates the data we analyze. However, to produce meaningful research results, those data must be good. And not all data are good! [Read more…] about Reliability vs Validity

# Conceptual

## Nominal, Ordinal, Interval, and Ratio Scales

The nominal, ordinal, interval, and ratio scales are levels of measurement in statistics. These scales are broad classifications describing the type of information recorded within the values of your variables. Variables take on different values in your data set. For example, you can measure height, gender, and class ranking. Each of these variables uses a distinct level of measurement. [Read more…] about Nominal, Ordinal, Interval, and Ratio Scales

## Odds Ratio

An odds ratio (OR) quantifies the relationship between a variable and the likelihood of an event occurring. A common use for odds ratios is identifying risk factors by assessing the relationship between exposure to a risk factor and a medical outcome. For example, is there an association between exposure to a chemical and a disease? [Read more…] about Odds Ratio
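As a minimal sketch of the calculation, here is an odds ratio computed from a hypothetical 2×2 table of exposure versus disease; all counts are made up for illustration.

```python
# Hypothetical 2x2 table (counts are illustrative only):
#               Disease   No disease
# Exposed           a=20        b=80
# Unexposed         c=10        d=90
a, b, c, d = 20, 80, 10, 90

odds_exposed = a / b       # odds of disease among the exposed
odds_unexposed = c / d     # odds of disease among the unexposed
odds_ratio = odds_exposed / odds_unexposed

print(round(odds_ratio, 2))  # 2.25
```

An odds ratio above 1 suggests the exposed group has higher odds of the outcome; here the exposed group's odds are 2.25 times the unexposed group's.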

## Case-Control Study

A case-control study is a retrospective, observational study that compares two existing groups. Researchers form these groups based on the existence of a condition in the case group and the lack of that condition in the control group. They evaluate the differences in the histories between these two groups looking for factors that might cause a disease. [Read more…] about Case-Control Study

## Simple Random Sampling

Simple random sampling (SRS) is a probability sampling method where researchers randomly choose participants from a population. All population members have an equal probability of being selected. This method tends to produce representative, unbiased samples. [Read more…] about Simple Random Sampling
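A quick sketch of the idea, using Python's standard library and an illustrative population of ID numbers:

```python
import random

# Illustrative population: members numbered 1..100.
population = list(range(1, 101))

random.seed(42)  # seeded only so this sketch is reproducible
# Draw 10 members without replacement; each member is equally likely.
sample = random.sample(population, k=10)

print(sample)
```

`random.sample` draws without replacement, so no member can appear twice, and every member has the same chance of selection.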

## Convenience Sampling

Convenience sampling is a non-probability sampling method where researchers use subjects who are easy to contact and obtain their participation. Researchers find participants in the most accessible places, and they impose no inclusion requirements. Convenience sampling is also known as opportunity or availability sampling. [Read more…] about Convenience Sampling

## Systematic Sampling

Systematic sampling is a probability sampling method for obtaining a representative sample from a population. To use this method, researchers start at a random point and then select subjects at regular intervals, taking every *n*th member of the population. Like other probability sampling methods, this one requires researchers to identify their population of interest before sampling from it. [Read more…] about Systematic Sampling
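The "random start, then every *n*th member" procedure can be sketched as follows; the population here is just illustrative ID numbers.

```python
import random

population = list(range(1, 101))  # illustrative sampling frame of 100 members
n = 10                            # desired sample size
k = len(population) // n          # sampling interval: every k-th member

random.seed(1)                    # seeded only for reproducibility
start = random.randrange(k)       # random starting point within the first interval
sample = population[start::k]     # then select at regular intervals

print(len(sample))  # 10
```

Because the start is random but the interval is fixed, each member still has a known, equal chance of selection, which is what makes this a probability method.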

## Variance

Variance is a measure of variability in statistics. It assesses the average squared difference between data values and the mean. Unlike some other statistical measures of variability, it incorporates all data points in its calculations by contrasting each value to the mean. [Read more…] about Variance
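The "average squared difference from the mean" can be written directly; this sketch uses made-up values and shows both the sample form (dividing by n − 1) and the population form (dividing by n).

```python
def variance(data, sample=True):
    """Average squared difference between each value and the mean.

    Divides by n - 1 for a sample (Bessel's correction) or n for
    a full population.
    """
    n = len(data)
    mean = sum(data) / n
    squared_diffs = [(x - mean) ** 2 for x in data]
    return sum(squared_diffs) / (n - 1 if sample else n)

values = [4, 7, 10, 13]                # made-up values; mean is 8.5
print(variance(values))                # 15.0  (sample variance)
print(variance(values, sample=False))  # 11.25 (population variance)
```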

## Mean Squared Error (MSE)

Mean squared error (MSE) measures the amount of error in statistical models. It assesses the average squared difference between the observed and predicted values. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD). [Read more…] about Mean Squared Error (MSE)
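The definition translates directly into code; the observed and predicted values below are made up for illustration.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Illustrative observations and model predictions.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]

print(mean_squared_error(observed, predicted))  # 0.375
print(mean_squared_error(observed, observed))   # 0.0 for a model with no error
```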

## Validity

Validity in research, statistics, psychology, and testing evaluates how well test scores reflect what they’re supposed to measure. Does the instrument measure what it claims to measure? Do the measurements reflect the underlying reality? Or, do they quantify something else? [Read more…] about Validity

## Internal and External Validity

Internal and external validity relate to the findings of studies and experiments. [Read more…] about Internal and External Validity

## Uniform Distribution

The uniform distribution is a symmetric probability distribution where all outcomes have an equal likelihood of occurring. All values in the distribution have a constant probability. This distribution is also known as the rectangular distribution because of its shape in probability distribution plots, as I’ll show you below. [Read more…] about Uniform Distribution
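For the continuous case, the density is a constant 1/(b − a) everywhere inside the interval [a, b], which is what gives the plot its rectangular shape. A minimal sketch, with an arbitrary interval chosen for illustration:

```python
import random

a, b = 2.0, 6.0  # illustrative interval endpoints

def uniform_pdf(x, a, b):
    """Density is the same constant for every x in [a, b], and zero outside."""
    return 1 / (b - a) if a <= x <= b else 0.0

print(uniform_pdf(3.0, a, b))  # 0.25 -- same for any x between 2 and 6
print(uniform_pdf(7.0, a, b))  # 0.0  -- outside the interval

random.seed(0)
draw = random.uniform(a, b)    # one random draw from the interval
print(a <= draw <= b)          # True
```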

## Frequency Table

Frequency is the number of times a specific data value occurs in your dataset. A frequency table lists a set of values and how often each one appears. These tables help you understand which data values are common and which are rare. They organize your data and are an effective way to present the results to others. Frequency tables are also known as frequency distributions because they allow you to understand the distribution of values in your dataset. [Read more…] about Frequency Table
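Building a frequency table is a one-liner with Python's standard library; the survey responses below are made up for illustration.

```python
from collections import Counter

# Made-up categorical responses.
responses = ["red", "blue", "red", "green", "blue", "red"]

freq = Counter(responses)  # maps each value to its number of occurrences

for value, count in freq.most_common():  # most frequent value first
    print(f"{value}: {count}")
# red: 3
# blue: 2
# green: 1
```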

## Mean Absolute Deviation

The mean absolute deviation (MAD) is a measure of variability that indicates the average distance between observations and their mean. MAD uses the original units of the data, which simplifies interpretation. Larger values signify that the data points spread out further from the average. Conversely, lower values correspond to data points bunching closer to it. The mean absolute deviation is also known as the mean deviation and average absolute deviation. [Read more…] about Mean Absolute Deviation
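The "average distance between observations and their mean" can be sketched in a few lines, using made-up values:

```python
def mean_absolute_deviation(data):
    """Average absolute distance between each observation and the mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)

values = [2, 4, 6, 8]  # made-up values; mean is 5
# Distances from the mean are 3, 1, 1, 3, so MAD = 8 / 4 = 2.0.
print(mean_absolute_deviation(values))  # 2.0
```

Because no squaring is involved, the result stays in the original units of the data, which is the interpretability advantage the excerpt mentions.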

## Conditional Probability

A conditional probability is the likelihood of an event occurring given that another event has already happened. Conditional probabilities allow you to evaluate how prior information affects probabilities. When you incorporate existing facts into the calculations, it can change the probability of an outcome. [Read more…] about Conditional Probability
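The defining formula P(A | B) = P(A and B) / P(B) can be checked by enumeration. This sketch uses a made-up two-dice example: A = "the sum is 8", B = "the first die shows at least 4".

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all 36 equally likely rolls

event_b = [(d1, d2) for d1, d2 in outcomes if d1 >= 4]          # 18 outcomes
event_a_and_b = [(d1, d2) for d1, d2 in event_b if d1 + d2 == 8]  # (4,4),(5,3),(6,2)

p_b = len(event_b) / len(outcomes)
p_a_and_b = len(event_a_and_b) / len(outcomes)

p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 1/6, versus 5/36 without the prior information
```

Knowing the first die is at least 4 raises the probability of rolling an 8 from 5/36 to 1/6, which is exactly the effect of incorporating prior information.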

## Cluster Sampling

Cluster sampling is a method of obtaining a representative sample from a population that researchers have divided into groups. An individual cluster is a subgroup that mirrors the diversity of the whole population, while the clusters themselves are similar to one another. Typically, researchers use this approach when studying large, geographically dispersed populations because it is a cost-controlling measure. [Read more…] about Cluster Sampling
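A minimal sketch of one-stage cluster sampling, where whole clusters are chosen at random and then every member of each chosen cluster is surveyed; the cluster names and sizes are made up for illustration.

```python
import random

# Hypothetical population divided into 10 clusters (e.g., city blocks),
# each with 20 members. Names are illustrative only.
clusters = {f"block_{i}": [f"block_{i}_resident_{j}" for j in range(20)]
            for i in range(10)}

random.seed(7)  # seeded only for reproducibility
chosen = random.sample(list(clusters), k=3)  # randomly select whole clusters

# One-stage cluster sampling: survey every member of each selected cluster.
sample = [member for name in chosen for member in clusters[name]]
print(len(sample))  # 60
```

The cost saving comes from visiting only 3 of the 10 locations while still surveying 60 people.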

## Stratified Sampling

Stratified sampling is a method of obtaining a representative sample from a population that researchers have divided into relatively similar subpopulations (strata). Researchers use stratified sampling to ensure specific subgroups are present in their sample. It also helps them obtain precise estimates of each group’s characteristics. Many surveys use this method to understand differences between subpopulations better. Stratified sampling is also known as stratified random sampling. [Read more…] about Stratified Sampling
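A sketch of proportional stratified sampling, drawing the same fraction from every stratum so each subgroup is guaranteed to appear in the sample; the strata and sizes are made up for illustration.

```python
import random

# Hypothetical population split into strata (e.g., by age group).
strata = {
    "18-34": [f"person_{i}" for i in range(50)],
    "35-54": [f"person_{i}" for i in range(50, 80)],
    "55+":   [f"person_{i}" for i in range(80, 100)],
}

random.seed(3)   # seeded only for reproducibility
fraction = 0.10  # proportional allocation: sample 10% of every stratum

sample = []
for name, members in strata.items():
    k = round(len(members) * fraction)      # 5, 3, and 2 members respectively
    sample.extend(random.sample(members, k))  # simple random sample per stratum

print(len(sample))  # 10
```

Unlike simple random sampling, this guarantees the smallest stratum contributes members to the sample rather than possibly being missed by chance.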

## Skewed Distribution

A skewed distribution occurs when one tail is longer than the other. Skewness defines the asymmetry of a distribution. Unlike the familiar normal distribution with its bell-shaped curve, these distributions are asymmetric. The two halves of the distribution are not mirror images because the data are not distributed equally on both sides of the distribution’s peak. [Read more…] about Skewed Distribution

## Heterogeneity

Heterogeneity is dissimilarity between the elements that comprise a whole. When heterogeneity is present, there is diversity in the characteristic under study: the parts of the whole are different, not the same. It is an essential concept in science and statistics. Heterogeneous is the opposite of homogeneous. [Read more…] about Heterogeneity

## Control Variables

Control variables are properties that researchers hold constant for all observations in an experiment. While these variables are not the primary focus of the research, keeping their values consistent helps the study establish the true relationships between the independent and dependent variables. Control variables are different from control groups. [Read more…] about Control Variables