Basic statistics

Posted February 18, 2013 at 09:00 AM | categories: statistics | tags:

Updated February 27, 2013 at 02:35 PM

Given several measurements of a single quantity, determine the average value of the measurements, the standard deviation of the measurements and the 95% confidence interval for the average.

This is a recipe for computing the confidence interval. The strategy is:

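1. Compute the average of your data.
2. Compute the standard deviation of your data.
3. Define the confidence level, e.g. 95% = 0.95.
4. Compute the Student-t multiplier. This is a function of the confidence level you specify and the number of data points you have minus 1. You subtract 1 because one degree of freedom is lost from calculating the average.

The confidence interval is defined as ybar +- T_multiplier*std/sqrt(n), where std is the sample standard deviation (computed with n - 1 in the denominator) and n is the number of data points. Equivalently, if std is the population standard deviation (the default of np.std), divide by sqrt(n - 1) instead.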

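import numpy as np
from scipy.stats.distributions import t

y = [8.1, 8.0, 8.1]

ybar = np.mean(y)
s = np.std(y)  # population standard deviation (ddof=0 is the default)

ci = 0.95
alpha = 1.0 - ci

n = len(y)
# two-sided Student-t multiplier with n - 1 degrees of freedom
T_multiplier = t.ppf(1 - alpha / 2.0, n - 1)

# with ddof=0, s / sqrt(n - 1) equals the sample standard deviation / sqrt(n)
ci95 = T_multiplier * s / np.sqrt(n - 1)

print([ybar - ci95, ybar + ci95])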

[7.9232449090029595, 8.210088424330376]

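We are 95% confident that the true average lies in the interval above. Note that this is a confidence interval for the average, not a prediction interval for the next individual measurement, which would be wider.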
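The recipe above can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original post: the function name mean_confidence_interval and its confidence parameter are hypothetical, and it uses the sample standard deviation (ddof=1) divided by sqrt(n), which is algebraically the same as the population standard deviation divided by sqrt(n - 1). As a cross-check, it also computes the interval with scipy.stats.t.interval and scipy.stats.sem; the two results should agree to within floating-point rounding.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(y, confidence=0.95):
    """Return (lower, upper) bounds of the two-sided CI for the mean of y.

    Hypothetical helper for illustration; assumes y has at least 2 points.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()
    # Sample standard deviation (ddof=1), so the standard error is s / sqrt(n).
    s = y.std(ddof=1)
    # Two-sided Student-t multiplier with n - 1 degrees of freedom.
    t_multiplier = stats.t.ppf(1 - (1 - confidence) / 2.0, n - 1)
    half_width = t_multiplier * s / np.sqrt(n)
    return ybar - half_width, ybar + half_width


y = [8.1, 8.0, 8.1]
lower, upper = mean_confidence_interval(y)

# Cross-check with SciPy's built-in interval; stats.sem uses ddof=1 by default.
lower_ref, upper_ref = stats.t.interval(
    0.95, len(y) - 1, loc=np.mean(y), scale=stats.sem(y)
)

print(lower, upper)
print(lower_ref, upper_ref)
```

Passing a different confidence value, e.g. mean_confidence_interval(y, confidence=0.99), widens the interval accordingly.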


The Kitchin Research Group

Chemical Engineering at Carnegie Mellon University
