A quick one today. I thought we should remind everyone of the difference between Mean and Median to avoid any confusion, as the issue arose in one of our recent SPSS training classes.
Both Mean and Median are types of average. The mean is based on every value in the data and so includes any outliers. In small datasets a single outlier can have a large impact. It is calculated as the sum of all of the values in the dataset divided by the number of items in the dataset.
So, for example, imagine our dataset is 1, 11, 12, 13, 14. The sum of these is 51, which divided by 5 gives a mean of 10.2. This is lower than 4 of the 5 values, i.e. 80% of our dataset (chosen to illustrate the point!), so in this case the outlier has made the mean a less useful statistic.
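To check the arithmetic, here is a minimal Python sketch of the same calculation. It is only an illustration, not SPSS syntax, and the variable names (`data`, `average`, `share_above`) are our own choices for this example.

```python
from statistics import mean

# The example dataset from above; 1 is the outlier.
data = [1, 11, 12, 13, 14]

# Mean = sum of the values divided by the number of values (51 / 5).
average = mean(data)

# Share of the values that sit above the mean.
above = [x for x in data if x > average]
share_above = len(above) / len(data)

print(average)      # 10.2
print(share_above)  # 0.8
```

Four of the five values are above the mean, which is why it is a poor summary of this particular dataset.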
The Median, however, is the middle number in an ordered dataset. So imagine our dataset was actually 11, 13, 1, 14, 12; we would first order the data into 1, 11, 12, 13, 14 and then choose the middle number, so 12 is our median. As you can see, where you have an outlier in the data the median is a far more useful and representative figure, as it is largely unaffected by the outlier.
The median is easy to calculate for datasets with an odd number of values. For datasets with an even number of values we calculate the mean of the middle 2 values. So assuming that our dataset had an additional value in it, 15, the ordered data would be 1, 11, 12, 13, 14, 15 and our middle 2 values would be 12 and 13. Taking the mean of them gives a median of 12.5.
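The odd and even cases can be sketched in a few lines of Python. This is an illustrative version written for this post rather than how SPSS computes it internally; the function name `median` and the variables `odd_result` and `even_result` are our own.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)   # step 1: order the data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: take the single middle value.
        return ordered[mid]
    # Even number of values: take the mean of the middle two.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median([11, 13, 1, 14, 12])        # ordered: 1, 11, 12, 13, 14
even_result = median([11, 13, 1, 14, 12, 15])   # ordered: 1, 11, 12, 13, 14, 15

print(odd_result)   # 12
print(even_result)  # 12.5
```

In practice Python's built-in `statistics.median` does the same job; writing it out here just makes the two cases visible.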