Cohen’s kappa is a statistical measure of inter-rater reliability that assesses the agreement between two raters or judges while accounting for agreement occurring by chance. It ranges from -1 to 1, where 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. Values from about 0.4 to 0.6 are often described as moderate agreement and values above 0.6 as substantial agreement, though interpretation can vary by field.
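The chance correction described above is usually written as the following formula. The symbols are notation introduced here for illustration: p_o is the observed proportion of items on which the raters agree, p_e is the proportion of agreement expected by chance, and p_{A,k} and p_{B,k} are the proportions of items that rater A and rater B assign to category k.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = \sum_{k} p_{A,k}\, p_{B,k}
```

When p_o equals p_e the numerator is zero, which gives the "no better than chance" value of 0; when the raters agree on every item, p_o is 1 and kappa is 1.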
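Cohen’s kappa works with binary, categorical, and ordinal data, such as yes/no decisions or rating scales. For ordinal scales, the standard (unweighted) form counts every disagreement as equally severe, so a weighted kappa is often preferred when the size of a disagreement matters. The measure assumes that the raters are independent, the categories are mutually exclusive, and all observations are rated by both raters.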
Cohen’s kappa is designed specifically for two raters; for situations with more than two raters, consider using Fleiss’ kappa.
For example, if two doctors independently diagnose the same group of patients, Cohen’s kappa can evaluate how consistently they classify patients, adjusting for chance agreement. If the kappa value is 0.75, this result indicates substantial agreement between the doctors, giving researchers confidence in the reliability of their assessments.
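As a minimal sketch of how such a value could be computed, the Python snippet below implements the standard unweighted kappa from scratch. The function name `cohens_kappa` and the diagnosis data are hypothetical, made up purely for illustration, and are not taken from any real study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, multiply the two raters'
    # marginal proportions, then sum over all categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses for 20 patients (illustrative data only).
doctor_a = ["sick"] * 9 + ["healthy"] * 8 + ["sick"] * 2 + ["healthy"] * 1
doctor_b = ["sick"] * 9 + ["healthy"] * 8 + ["healthy"] * 2 + ["sick"] * 1

kappa = cohens_kappa(doctor_a, doctor_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up data the doctors agree on 17 of 20 patients (85%), while chance agreement is 50%, so kappa is (0.85 - 0.50) / (1 - 0.50) = 0.70. In practice, an established implementation such as `cohen_kappa_score` in scikit-learn's `sklearn.metrics` module may be preferable to hand-written code.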