Doing Standard Deviation with UltimaCalc

Given a set of data, UltimaCalc can calculate its mean, the median value, the standard deviation of the data, the estimated standard deviation of the population of which the data is a sample, and a whole lot more.

What is Standard Deviation?

The problem: given a set of data, we can estimate the mean value, or the average, of the items in the set simply enough by adding up all the items and dividing by the number of them. But how can we estimate the variation between the data items and their mean?

The standard deviation is a measure of the scatter within a set of measurements. It helps to answer questions like, "How much variability is there in my results?"

One solution is to find the largest and smallest values. This gives the range in which the values lie, but this measure is very sensitive to the occurrence of extreme cases.

A better approach is to calculate the sum of the absolute deviations from the mean (that is, ignoring whether each item lies above or below the mean), and divide the sum by the number of items to give a figure for the average deviation. This is simple enough to do by hand, but the method is inconvenient from a mathematical point of view, because it requires taking the absolute value of each deviation.
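The two simple measures described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are made up for this example and have nothing to do with how UltimaCalc works internally.

```python
def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


def mean_absolute_deviation(values):
    """Average distance of each item from the mean, ignoring sign."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Made-up illustrative data; the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(value_range(data))              # 7
print(mean_absolute_deviation(data))  # 1.5
```

Note that without the call to abs() the deviations above and below the mean would cancel out, and the sum would always be zero.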

Standard Deviation Definition

The preferred method avoids the mathematical difficulties by squaring each deviation. The average of the squared deviations is then calculated, giving a result called the variance. The standard deviation is the square root of this variance.

Thus we come to the standard deviation formula, the basic method for calculating standard deviation for a set of items: calculate the square root of the average value of the squares of the distances of each item from the mean for the whole set.

The standard deviation equation is expressed mathematically as:

std deviation = sqrt(sum((value - mean)^2) / N) or $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2}$

where N is the number of items, and the mean is calculated first as:

mean = sum(value) / N or $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$
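As a minimal sketch, the two formulas above translate almost word for word into Python. The function names and the data are invented for illustration only.

```python
import math


def mean(values):
    """mean = sum(value) / N"""
    return sum(values) / len(values)


def std_deviation(values):
    """std deviation = sqrt(sum((value - mean)^2) / N)"""
    m = mean(values)
    variance = sum((v - m) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


# Made-up illustrative data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(mean(data))           # 5.0
print(std_deviation(data))  # 2.0
```

Here the squared deviations add up to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.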

How can we make use of standard deviation? I said at the start that it is a measure of variability. For data that follow a normal distribution, it can be shown that about 68% (say, two thirds) of all items in a reasonably large sample will be within one standard deviation of the mean, 95.4% will be within two standard deviations, and more than 99.7% will be within three standard deviations of the mean.
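These percentages can be checked with a short calculation. The sketch below assumes the data follow a normal distribution, in which case the fraction of items within k standard deviations of the mean is erf(k / sqrt(2)). The function name is made up for this example.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean (assumes normally distributed data)."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```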

Population Standard Deviation

The discussion so far has focussed on how to calculate standard deviation for a sample of items. Can we estimate the standard deviation of the values found in the population from which our sample was taken? Well, yes: it can be shown that an estimate of the population standard deviation is found by doing a similar calculation to that for the sample standard deviation, but instead of dividing by the number of items in the sample, we divide by one less than this number.
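The only change from the earlier formula is the divisor, as this small sketch shows. The function name and data are invented for illustration. Be aware that naming differs between tools: in Python's standard statistics module, stdev() divides by N - 1 and pstdev() divides by N.

```python
import math
import statistics


def estimated_population_sd(values):
    """Like the basic formula, but dividing by N - 1 instead of N."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))


# Made-up illustrative data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(estimated_population_sd(data), 4))  # 2.1381 (dividing by 7)
print(statistics.pstdev(data))                  # 2.0    (dividing by 8)
```

The estimate is always a little larger than the figure obtained by dividing by N, and the difference shrinks as the sample grows.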

Standard Errors

Now that we have an estimate for the variation of the sample items from their mean value, can we estimate the standard error of this mean value? In other words, given our sample, how confident can we be in our estimate of the mean for the population as a whole? It can be shown that the standard error of the mean is equal to the standard deviation of the sample divided by the square root of the number of items in that sample.

For example, if we measure the heights of 100 young adults and find the mean height to be 178 centimetres with a standard deviation of 7 cm, then the standard error of the mean is 7/sqrt(100)=0.7cm. We can be very confident that the average height of young adults generally is within 3 x 0.7 = 2.1cm of 178cm.
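The arithmetic in this example can be reproduced as follows. This is only a sketch, with a made-up function name, using the figures quoted above.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard deviation divided by the square root of the sample size."""
    return sd / math.sqrt(n)


sem = standard_error_of_mean(7.0, 100)
print(sem)  # 0.7

# Three standard errors either side of the measured mean of 178 cm.
low = round(178 - 3 * sem, 1)
high = round(178 + 3 * sem, 1)
print(low, high)  # 175.9 180.1
```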

How accurate is our calculated standard deviation? It can be shown that the degree of uncertainty in this value, or the standard error of the standard deviation, is given approximately by dividing the standard deviation by the square root of double the number of items. So for our 100 young adults, the standard error of the standard deviation is 7/sqrt(2x100) or near enough 0.5cm. We can therefore be very confident that the true standard deviation lies within 3 x 0.5 = 1.5cm of 7cm, that is, between 5.5cm and 8.5cm.
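The same figures can be worked through in code. This sketch uses the approximation described above, with a made-up function name. It keeps the unrounded standard error, so the limits come out very slightly inside the rounded figures of 5.5cm and 8.5cm.

```python
import math


def standard_error_of_sd(sd, n):
    """Approximate standard error of the standard deviation: sd / sqrt(2n)."""
    return sd / math.sqrt(2 * n)


se = standard_error_of_sd(7.0, 100)
print(round(se, 3))  # 0.495

# Three standard errors either side of the measured 7 cm.
print(round(7.0 - 3 * se, 2), round(7.0 + 3 * se, 2))  # 5.52 8.48
```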

The Solution: UltimaCalc and Standard Deviation

UltimaCalc will calculate the mean, median, sample standard deviation and population standard deviation, sample variance, coefficient of variation, interquartile range, skewness, excess kurtosis, and mean deviation. It will also calculate the standard errors of the mean, median, standard deviation, variance, and of the coefficient of variation.

The tool also displays the number of items and their total. This feature is handy for the times you simply want to add up a long list of numbers, and be able to easily correct any errors.

You can choose which of these items are to be shown, and resize the window so that the other items are not shown, as in the image below. You can also select a transformation to be applied to the data before analysing it.

You can quickly enter data by using the numeric keypad. Type a value into the box in the upper left and hit the 'Enter' key. The value will be added to the list and the entry box will be cleared, ready to accept more data. Every time you enter an item of data, UltimaCalc will try to perform the calculations. The results can be copied to the Windows clipboard.

If you right click on an item in the list, a menu will pop up to allow you to edit or delete the item. You can write notes about the data, and save the data and notes to a file.

Because a calculation is attempted every time you enter an item of data, automatic logging of the calculation is not done. Instead, click on the **Log** button if you want the calculation to be included in the log file.

You can write notes to remind yourself what the data is about, and save the data with these notes to a plain text file for future reference.