Looking at data we are often tempted to make statements about it whose validity may be in doubt. To supplement the visual analysis we look at some basic statistical tests.
In testing we usually define a null hypothesis and calculate a test statistic. We then look up a critical value in a table for a given significance level and, based on the comparison, either accept or reject the null hypothesis.
Let's assume that the company announced a sales figure of 22 per person over all departments. It turns out that our department's sales were a little below that value:
sales <- c(20, 17, 24, 19, 24, 24, 21, 29, 13, 9)
mean(sales)
[1] 20
One question to ask is whether the difference is 'significant', i.e. not just a result of the random fluctuation that is always present in data of this kind; in other words, how likely are we to get a sample mean of 20 when the population mean is 22?
By calculating the mean we are summing values, and we already know that in this case the distribution of the result is approximately normal. The division by n does not matter in this respect.
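This can be illustrated by simulation; a minimal sketch (the exponential population and the seed are arbitrary choices for illustration):

```r
set.seed(42)                      # arbitrary seed, for reproducibility
# draw 10000 samples of size 10 from a decidedly non-normal
# (exponential) population with mean 1 and standard deviation 1
means <- replicate(10000, mean(rexp(10, rate = 1)))
mean(means)                       # close to the population mean 1
sd(means)                         # close to 1 / sqrt(10) ~ 0.316
# hist(means)                     # roughly bell-shaped
```

Even though the individual values are strongly skewed, the distribution of the sample means is already close to normal at n = 10.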
Therefore, in order to answer our question we can use a t-test, formulating the null hypothesis that the population mean is μ = 22. We also choose a significance level, such as α = 0.05.
The test statistic is
t = (m - μ) / SE with standard error SE = s / √ n
where s is the sample standard deviation and n is the sample size.
Note that the sample standard deviation divides by n-1 and uses the sample mean m: sd(x) = √ (∑ (x-m)2 / (n-1) )
sqrt(sum(sapply(c(20, 17, 24, 19, 24, 24, 21, 29, 13, 9),
                function(x) (x - 20)^2)) / (10 - 1))
[1] 5.868939
For our data,
sd(sales)
[1] 5.868939

(20 - 22) / (5.868939 / sqrt(10))
[1] -1.077632
In the old days, before cheap desktop computers, we would have to consult a table and look up the critical value for the given degrees of freedom (n-1) and the significance level; based on that value and the test statistic we would either accept or reject the null hypothesis.
1T     0.1    0.05   0.025   0.01    0.005
2T     0.2    0.1    0.05    0.02    0.01
----------------------------------------------
 1    3.078  6.314  12.706  31.821  63.657
 2    1.886  2.920   4.303   6.965   9.925
 3    1.638  2.353   3.182   4.541   5.841
 4    1.533  2.132   2.776   3.747   4.604
 5    1.476  2.015   2.571   3.365   4.032
 6    1.440  1.943   2.447   3.143   3.707
 7    1.415  1.895   2.365   2.998   3.499
 8    1.397  1.860   2.306   2.896   3.355
 9    1.383  1.833   2.262   2.821   3.250
10    1.372  1.812   2.228   2.764   3.169
11    1.363  1.796   2.201   2.718   3.106
12    1.356  1.782   2.179   2.681   3.055
13    1.350  1.771   2.160   2.650   3.012
14    1.345  1.761   2.145   2.624   2.977
15    1.341  1.753   2.131   2.602   2.947
There are n-1 = 9 degrees of freedom: if we still want to arrive at a given mean we can change 9 values freely, but not the 10th one, since this last value must compensate for (i.e. negate) our changes.
We are doing a two-tailed test i.e. we are testing whether the mean is different from the given value, accounting for both directions. A one-tailed test would be appropriate if e.g. we only test for m < μ.
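In R the one-tailed variant can be requested via the alternative argument of t.test; a quick sketch using our sales data:

```r
sales <- c(20, 17, 24, 19, 24, 24, 21, 29, 13, 9)
# one-tailed test: the alternative hypothesis is that the true
# mean is less than 22 (lower tail only)
res <- t.test(sales, mu = 22, alternative = "less")
res$p.value    # half the two-tailed p-value, since t is negative
```

Since our test statistic is negative, the one-tailed p-value is exactly half the two-tailed one.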
Today we can easily compute the p-value directly up to several decimal places:
t.test(sales, mu = 22)

	One Sample t-test

data:  sales
t = -1.0776, df = 9, p-value = 0.3092
alternative hypothesis: true mean is not equal to 22
95 percent confidence interval:
 15.80161 24.19839
sample estimates:
mean of x 
       20 
→ The p-value is well above α and therefore we cannot reject the null hypothesis (which states that the true mean is 22).
Another interpretation of the t.test result: under the null hypothesis, the probability of arriving at a test statistic of -1.0776 (or one of larger absolute value) is 0.3092, i.e. quite likely.
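The reported p-value can be reproduced directly from the t distribution with pt():

```r
sales  <- c(20, 17, 24, 19, 24, 24, 21, 29, 13, 9)
t_stat <- (mean(sales) - 22) / (sd(sales) / sqrt(length(sales)))
# two-tailed p-value: probability of a statistic at least as
# extreme as |t| in either tail, under the null hypothesis
p <- 2 * pt(-abs(t_stat), df = length(sales) - 1)
p    # matches the p-value of 0.3092 reported by t.test
```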
For the confidence interval we compute the standard error
SE <- sd(sales) / sqrt(length(sales))
SE
[1] 1.855921
The quantile for α = 0.05 and n-1 = 9 degrees of freedom is not the familiar normal-distribution value of 1.96 but
qt(1 - 0.05/2, 9)
[1] 2.262157
which is of course the critical value from the table, and results in the confidence interval given by the t.test:
20 - 2.262157 * 1.855921
[1] 15.80162

20 + 2.262157 * 1.855921
[1] 24.19839
The value of mu = 22 lies well within the confidence interval.
For a value just outside the confidence interval, such as mu = 25, the result would have been different:
t.test(sales, mu = 25)

	One Sample t-test

data:  sales
t = -2.6941, df = 9, p-value = 0.02463
alternative hypothesis: true mean is not equal to 25
95 percent confidence interval:
 15.80161 24.19839
sample estimates:
mean of x 
       20 
In this case with α = 0.05 we would reject the null hypothesis.
Note that for μ = 0 the t-test statistic becomes simply
t = m / SE
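This is the default in R: t.test(x) tests against mu = 0. A small sketch with made-up data:

```r
x  <- c(2, -1, 3, 1, 2)           # hypothetical data, mean 1.4
m  <- mean(x)
SE <- sd(x) / sqrt(length(x))
m / SE                            # the t statistic for mu = 0
t.test(x)$statistic               # same value
```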
The t.test function in R can also be used to perform a two-sample t-test, i.e. to compare two means:
t.test(c(2,3,7,11,13,17,19), c(7,11,13,17,19,23,29,31))

	Welch Two Sample t-test

data:  c(2, 3, 7, 11, 13, 17, 19) and c(7, 11, 13, 17, 19, 23, 29, 31)
t = -2.1649, df = 12.847, p-value = 0.04983
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -16.921242403  -0.007329025
sample estimates:
mean of x mean of y 
 10.28571  18.75000 
Here the null hypothesis is that the two population means are identical; with α=0.05 this would be rejected in this case since the p-value is (slightly) below α.
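By default t.test performs Welch's test, which does not assume equal variances in the two groups (hence the fractional degrees of freedom above). With var.equal = TRUE we get the classic pooled-variance (Student) two-sample test instead:

```r
x <- c(2, 3, 7, 11, 13, 17, 19)
y <- c(7, 11, 13, 17, 19, 23, 29, 31)
# pooled-variance two-sample t-test; degrees of freedom are
# n1 + n2 - 2 = 7 + 8 - 2 = 13 (an integer, unlike Welch's)
res <- t.test(x, y, var.equal = TRUE)
res$parameter                     # df = 13
```

Which variant is appropriate depends on whether the equal-variance assumption is plausible; Welch's test is the safer default.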
Other Sources