Algorithms For Calculating Variance - Naïve Algorithm

Naïve Algorithm

A formula for calculating the variance of an entire population of size N is:

A formula for calculating an unbiased estimate of the population variance from a finite sample of n observations is:

Therefore a naive algorithm to calculate the estimated variance is given by the following:

def naive_variance(data): n = 0 Sum = 0 Sum_sqr = 0 for x in data: n = n + 1 Sum = Sum + x Sum_sqr = Sum_sqr + x*x variance = (Sum_sqr - ((Sum*Sum)/n))/(n - 1) return variance

This algorithm can easily be adapted to compute the variance of a finite population: simply divide by N instead of n − 1 on the last line.

Because sum_sqr and sum * mean can be very similar numbers, the precision of the result can be much less than the inherent precision of the floating-point arithmetic used to perform the computation. This is particularly bad if the standard deviation is small relative to the mean. However, the algorithm can be improved by adopting the method of the assumed mean.

Read more about this topic:  Algorithms For Calculating Variance