Standard deviations and confidence intervals in spreadsheets




Introduction

In connection with average and median values, you also have an interest in knowing the level of uncertainty for your calculations. For this you use standard deviations, which is an evaluation of the variations in the measurements, and the confidence interval, which is an evaluation of the uncertainty of the calculated average/median value.





Standard deviations

Standard deviations is a measurement of the natural variation in a set of observations. Here you should be aware, that there is several ways of calculating standard deviations, depending on a number of conditions, and you should be aware, that not all methods are available for all file formats. Also you should be aware, that the same functions may have different names, depending on the file formats.

For Microsoft Excel you have the following functions:
In the earlier versions of Excel, when using the file format .xls, you only had the function STDEV(), which is equivalent to STDEV.P().

To illustrate the difference between the four types of standard deviation, the image below shows an observation set consisting of 5 legal values + 1 illegal value (text):

The four types of standard deviations in Excel spreadsheets.


For OpenOffice Calc you have the following functions:
To illustrate the difference between the four types of standard deviation, the image below shows an observation set consisting of 5 legal values + 1 illegal value (text):

The four types of standard deviations in Calc spreadsheets.





Confidence interval

In the same way as you want to know the natural variation in your observations, you may also want some measurement on the expected accuracy of your average/median value. This is called a confidence interval. An interval from the average/median value, where you can find a specified percentage of your observations. In reality, it is a qualification of your standard deviation, where you look at the size of the standard deviation in relation to the number of observations on which it is calculated.

Like the standard deviations, there is some differences in the available functions, depending on the file formats.

For Microsoft Excel you have the following functions:
The functions uses three variables:
  1. Alpha (also written α): The significance level. The confidence level is equal to 100*(1-α), so a 95% confidence interval is equal to α=0,05 and a 99% confidence interval is equal to α=0,01 etc.
  2. Standard deviation: Usually you write the name of the cell, where this is calculated, but writing a fixed value is perfectly fine.
  3. Size: The size of the observation set. Here you have to be very aware whether you have illegal values in the observation set, and if you do, are you using a standard deviation that includes illegal values or not!
In the earlier versions of Excel, when using the file format .xls, you only had the function CONFIDENCE(), which is equivalent to CONFIDENCE.NORM().

For at illustrere forskellen på de to typer konfidensinterval, ses her to beregninger af 95% konfidensintervaller, baseret på en STDEV.P()-værdi:

The two types of confidence intervals in Excel spreadsheets.


For OpenOffice Calc you only have the function CONFIDENCE(), which is the confidence interval, under the assumption that you have a normal distribution of the observations.
The function uses three variables:
  1. Alpha (also written α): The significance level. The confidence level is equal to 100*(1-α), so a 95% confidence interval is equal to α=0,05 and a 99% confidence interval is equal to α=0,01 etc.
  2. Standard deviation: Usually you write the name of the cell, where this is calculated, but writing a fixed value is perfectly fine.
  3. Size: The size of the observation set. Here you have to be very aware whether you have illegal values in the observation set, and if you do, are you using a standard deviation that includes illegal values or not!
For at illustrere beregningen af et konfidensinterval, ses her beregningen af et 95% konfidensinterval, baseret på en STDEVP()-værdi:

Confidence intervals in Calc spreadsheets.