In my previous introduction-to-SPSS post about the normal distribution, I said that I would come back and explain a little more about Gaussian distributions, or normal distributions, which are often known colloquially as 'the bell curve'.

Normal distributions are enormously useful and used very frequently, particularly in scientific research. The normal distribution is a continuous probability distribution, and the key idea that makes it so useful is something called the Central Limit Theorem. This states that the mean of a sufficiently large sample of independent values drawn at random from the same distribution will be approximately normally distributed. This holds regardless of the shape of the underlying distribution, provided that distribution has a finite variance. The sample mean will be centred on the mean of the underlying population, while its variance will be equal to the population variance divided by the sample size. The normal approximation improves as the sample size gets larger.
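If you would like to see this for yourself, here is a small simulation sketch in Python (not SPSS) using only the standard library. The exponential distribution, the sample size of 50 and the helper name `sample_means` are my own choices for illustration; any skewed distribution with a finite variance should behave similarly.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the means of n_samples independent samples of sample_size draws each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential(rng):
    # Exponential with rate 1: population mean 1, population variance 1,
    # and a strongly skewed (decidedly non-normal) shape.
    return rng.expovariate(1.0)


sample_size = 50
means = sample_means(draw_exponential, sample_size, n_samples=20_000)

mean_of_means = statistics.fmean(means)
variance_of_means = statistics.pvariance(means)

print(f"mean of sample means:     {mean_of_means:.4f} (population mean is 1)")
print(f"variance of sample means: {variance_of_means:.4f} "
      f"(population variance / n is {1 / sample_size:.4f})")
```

The mean of the sample means should land close to 1, and their variance close to 1/50, even though the individual values come from a heavily skewed distribution.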

So the next question is: why is this so useful? The most useful thing about this is that it allows us to test hypotheses about data without knowing the underlying distribution of that data.

The second reason it's so useful is that normal distributions are everywhere. This is mainly because so many quantities in nature are not driven by just one factor but are themselves the sum of many independent factors. Something like an individual's height, for example, is the combined result of many different, and sometimes opposing, factors such as genes, diet etc.
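As a rough illustration of that idea, the sketch below builds a made-up 'trait' for each individual by adding together 40 small independent effects, each of which can push the value up or down. The number of factors, the uniform effects and the name `simulated_trait` are assumptions purely for the demonstration, not a model of real heights. It then checks how much of the data lies within one and two standard deviations of the mean, which for a normal distribution is roughly 68% and 95%.

```python
import random
import statistics

N_FACTORS = 40          # hypothetical number of independent influences
N_INDIVIDUALS = 20_000  # size of the simulated population

rng = random.Random(1)  # fixed seed so the run is repeatable


def simulated_trait(rng):
    # Each factor nudges the trait up or down by a small uniform amount,
    # so no single factor dominates the total.
    return sum(rng.uniform(-1.0, 1.0) for _ in range(N_FACTORS))


traits = [simulated_trait(rng) for _ in range(N_INDIVIDUALS)]

mu = statistics.fmean(traits)
sigma = statistics.pstdev(traits)

# Fraction of individuals within 1 and 2 standard deviations of the mean.
within_1 = sum(abs(t - mu) <= sigma for t in traits) / len(traits)
within_2 = sum(abs(t - mu) <= 2 * sigma for t in traits) / len(traits)

print(f"within 1 sd: {within_1:.3f} (normal distribution: about 0.683)")
print(f"within 2 sd: {within_2:.3f} (normal distribution: about 0.954)")
```

None of the individual effects is normally distributed, yet their sum should come out very close to the bell curve proportions.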

While normal distributions are very useful, the key 'gotcha' to look out for is how likely you are to get an outlier. Under a normal distribution, results far from the mean (many multiples of the standard deviation) are exceedingly unlikely, so if you expect distant outliers with any regularity at all, the normal distribution is probably the wrong one to use.
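To put some numbers on 'exceedingly unlikely', the short Python sketch below computes the probability of landing more than k standard deviations from the mean (in either direction) under a normal distribution. It uses the standard identity P(|Z| > k) = erfc(k / sqrt(2)); the function name `two_sided_tail_probability` is just my label for it.

```python
import math


def two_sided_tail_probability(k):
    """P(|X - mean| > k standard deviations) for a normally distributed X."""
    return math.erfc(k / math.sqrt(2))


for k in (1, 2, 3, 4, 5, 6):
    p = two_sided_tail_probability(k)
    print(f"more than {k} sd from the mean: p = {p:.3e} (about 1 in {1 / p:,.0f})")
```

At 3 standard deviations the probability is already only about 0.27%, and it shrinks extremely quickly after that. If your data throws up 5 or 6 standard deviation events every so often, that is a strong hint that a normal model does not fit.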

Hopefully that adds a little more detail to your understanding of Normal or Gaussian distributions and how