Introduction
Measures of dispersion, also called measures of variability, describe how far the data points in a dataset deviate from a central value such as the mean or median. They show how tightly or loosely the observations cluster, which indicates how consistent the data are. Common measures of dispersion include the range, variance, standard deviation, and interquartile range.
Dispersion can be categorized into two broad types: absolute measures and relative measures.
I. Absolute Measures of Dispersion
Absolute measures of dispersion express variability in the units of the original data (or, in the case of variance, in squared units). They indicate how widely the data points are spread.
1. Range
The range is the simplest measure of dispersion, calculated as the difference between the highest and lowest values in a dataset.
Example:
For the dataset {2, 5, 8, 11},
Range = 11 − 2 = 9
2. Variance
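Variance is the average of the squared deviations from the mean. For a full population of N values, the sum of squared deviations is divided by N; for a sample, it is usually divided by n − 1. Variance quantifies how data points spread around the mean, but its unit is the square of the original unit.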
3. Standard Deviation
Standard deviation is the square root of the variance, which makes it easier to interpret because it expresses dispersion in the same units as the data. For this reason it is widely used in practice.
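As a minimal sketch, the variance and standard deviation of the dataset {2, 5, 8, 11} from the range example can be computed with Python's standard statistics module. Whether the population or the sample formula applies is an assumption that depends on the data, so both are shown.

```python
import statistics

data = [2, 5, 8, 11]  # dataset reused from the range example; mean is 6.5

# Population formulas: sum of squared deviations divided by N.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(f"{pop_variance:.2f} {pop_sd:.2f}")        # 11.25 3.35
print(f"{sample_variance:.2f} {sample_sd:.2f}")  # 15.00 3.87
```

The squared deviations sum to 45, so dividing by 4 gives 11.25 and dividing by 3 gives 15.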
4. Quartile Deviation (Interquartile Range / 2)
Also known as the semi-interquartile range, quartile deviation measures the spread of the middle 50% of a dataset. It is computed as half the difference between the third quartile (Q3) and the first quartile (Q1).
Formula:
Quartile Deviation (QD) = (Q3 − Q1) / 2
Example:
If Q3 = 30 and Q1 = 10, then
QD = (30 − 10) / 2 = 10
Interpretation: A higher quartile deviation signifies greater dispersion in the central half of the dataset.
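The quartile deviation can be sketched in Python as follows. The nine-value dataset is hypothetical, chosen so that Q1 = 10 and Q3 = 30 as in the example above. Quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method of statistics.quantiles, and other methods may give different quartiles for small datasets.

```python
import statistics

def quartile_deviation(values):
    """Return half the interquartile range, (Q3 - Q1) / 2."""
    # Assumption: "inclusive" quartiles, which treat the data as the
    # whole population. The default "exclusive" method may differ.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2

# Hypothetical data with Q1 = 10 and Q3 = 30 under the inclusive method.
data = [2, 6, 10, 14, 20, 26, 30, 34, 40]
qd = quartile_deviation(data)
print(qd)  # 10.0
```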
5. Mean Deviation
Mean deviation (also called mean absolute deviation) is the average of the absolute deviations of data points from the mean. Because the deviations are not squared, it stays in the original units and gives an intuitive measure of average spread.
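A short sketch of the calculation, again using the dataset {2, 5, 8, 11}. The function name mean_deviation is illustrative only and is not a library function.

```python
import statistics

def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    center = statistics.mean(values)
    return sum(abs(x - center) for x in values) / len(values)

data = [2, 5, 8, 11]  # mean is 6.5
md = mean_deviation(data)
print(md)  # 3.0
```

The absolute deviations are 4.5, 1.5, 1.5, and 4.5, which sum to 12; dividing by 4 gives 3.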
Illustrative Example (Conceptual only):
Consider exam scores: 60, 65, 70, 75, 80
- Range: 80 − 60 = 20
- The variance, standard deviation, quartile deviation, and mean deviation each require further calculation, and each measures variability in a different way.
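For readers who want numbers for the exam scores, the sketch below computes all five absolute measures. It assumes the scores form a complete population (so the variance divides by N) and uses the "inclusive" quartile method; other conventions would change the variance and the quartile deviation.

```python
import statistics

scores = [60, 65, 70, 75, 80]
mean = statistics.mean(scores)  # 70

# Assumption: "inclusive" quartiles; other conventions may differ.
q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

summary = {
    "range": max(scores) - min(scores),
    "variance": statistics.pvariance(scores),       # population formula
    "standard_deviation": statistics.pstdev(scores),
    "quartile_deviation": (q3 - q1) / 2,
    "mean_deviation": sum(abs(x - mean) for x in scores) / len(scores),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
# range: 20.00
# variance: 50.00
# standard_deviation: 7.07
# quartile_deviation: 5.00
# mean_deviation: 6.00
```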
II. Relative Measures of Dispersion
Relative measures of dispersion are used for comparing the variability of datasets with different units or means. These measures are expressed as ratios or percentages, enabling standardized comparison.
1. Coefficient of Range
The coefficient of range is the ratio of the difference between the largest and smallest values to their sum.
Formula:
Coefficient of Range = (L − S) / (L + S)
Where L = Largest value, S = Smallest value
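The formula can be sketched in Python as below, applied to the dataset {2, 5, 8, 11} used earlier. The function name is illustrative, and the sketch assumes L + S is not zero, since the ratio is undefined in that case.

```python
def coefficient_of_range(values):
    """(L - S) / (L + S), where L is the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    # Assumption: largest + smallest != 0, otherwise the ratio is undefined.
    return (largest - smallest) / (largest + smallest)

cr = coefficient_of_range([2, 5, 8, 11])
print(round(cr, 4))  # 0.6923
```

Here the result is 9 / 13, a unit-free number.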
2. Coefficient of Quartile Deviation
The coefficient of quartile deviation divides the difference between the third and first quartiles by their sum, giving a unit-free value that can be compared across datasets.
Formula:
Coefficient of QD = (Q3 − Q1) / (Q3 + Q1)
Example:
If Q3 = 30 and Q1 = 10, then
Coefficient of QD = (30 − 10) / (30 + 10) = 0.5
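The same arithmetic as a small Python sketch. The function takes the quartiles directly, so it does not depend on any particular quartile convention; the function name is illustrative.

```python
def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1) for already-computed quartiles."""
    return (q3 - q1) / (q3 + q1)

cqd = coefficient_of_quartile_deviation(q1=10, q3=30)
print(cqd)  # 0.5
```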
3. Coefficient of Variation (CV)
The coefficient of variation is the ratio of the standard deviation to the mean, expressed as a percentage. It is useful when comparing the relative variability of datasets with different means or units.
Formula:
CV = (Standard Deviation / Mean) × 100
Example:
Dataset A: Mean = 50, SD = 10 → CV = (10/50) × 100 = 20%
Dataset B: Mean = 100, SD = 15 → CV = (15/100) × 100 = 15%
Although Dataset B has a higher standard deviation, Dataset A has greater relative variability.
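The comparison of Dataset A and Dataset B can be reproduced with a short sketch. The function name and the zero-mean check are illustrative additions; the CV is undefined when the mean is zero.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean

cv_a = coefficient_of_variation(mean=50, sd=10)
cv_b = coefficient_of_variation(mean=100, sd=15)
print(cv_a, cv_b)   # 20.0 15.0
print(cv_a > cv_b)  # True: Dataset A varies more relative to its mean
```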
4. Coefficient of Mean Deviation
The coefficient of mean deviation is the ratio of the mean deviation to the mean, offering another unit-free basis for comparison.
Formula:
Coefficient of Mean Deviation = Mean Deviation / Mean
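As a sketch, the ratio can be computed for the exam scores 60, 65, 70, 75, 80 from the illustrative example. The function name is illustrative, and the sketch assumes deviations are taken from the mean and that the mean is not zero.

```python
import statistics

def coefficient_of_mean_deviation(values):
    """Mean absolute deviation from the mean, divided by the mean."""
    center = statistics.mean(values)
    mean_dev = sum(abs(x - center) for x in values) / len(values)
    # Assumption: the mean is nonzero, otherwise the ratio is undefined.
    return mean_dev / center

cmd = coefficient_of_mean_deviation([60, 65, 70, 75, 80])
print(round(cmd, 4))  # 0.0857
```

The mean deviation is 6 and the mean is 70, so the coefficient is 6 / 70.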
When to Use Relative Measures
Relative measures are particularly useful when:
- Comparing datasets with different units (e.g., height in centimeters vs. weight in kilograms).
- Comparing datasets with significantly different averages (e.g., salaries across different companies or industries).
Conclusion
Measures of dispersion are essential for understanding the spread and consistency of data. Absolute measures report variability in the data's own units (squared units for variance), while relative measures are unit-free ratios that allow meaningful comparison across datasets with different units or averages. Used together, they show how much observations vary and support sound, data-driven decisions.