Adam's stuff...: November 2010

What I'll do is walk through how to do a simple t-test in Excel - how and where to put in the different functions.

Let's start with an examnple data set. You have two strains of algae (Strain A and Strain B) and have measured the lipid content for 3 replicates of each.

That data might be be laid out in Excel as shown below, with the data series in columns. To the right, I've typed in labels for "mean" and "SD" (standard deviation) that we'll now calculate for each data series. I've clicked on cell J2, where I want to put the mean value for Strain A, and have gone on the tab menu to "Insert" and then "Function":

Select "average" from the menu. If it's not on the "recently used" list simply type "average" into the search box and Excel will bring the average option up for you:

Click "ok" and this will then bring up a menu to select the cells from which to calculate the average:

Highlight the cells you wish to calculate the average for (in this case C2 to C4) and hit "ok".

Cell J2 now displays the mean of 54, with the formula bar giving the function that cell J2 is performing "=AVERAGE(C2:C4)":

Clicking on the formula brings up a colour box highlighting the cells that you are analysing. This is a useful double that you have the right cells under analysis. You can change the cells under analysis simply by dragging the corners of the highlight box.

To calculate the standard deviation of Series A, do the same as for the mean but with selecting the "stdev" option from the pop up menu:

A shortcut to repeating the mean and SD for Series B would be to highlight and copy cells J2 and J3, then past them in K2 and K3. But because you have pasted one column along, Excel assumes that you want to analyse the column one along from that for Series A. Cells K2 and K3 actually display the error message "#DIV/0!" because there is no data in cells D2:D4.

Click on the formula bar for K2 and the cells it is analysing the mean for are highlighted (D2:D4).

The data for Series B is actually in cells G2:G4 - all you need to do to correct that is to drag the highlight box over to G2:G4.

Repeat that for cell K3 to ensure that it is analysing cells G2:G4 for the Series B standard deviation.

Now for the t-test, which is a bit more complicated. Again, go to "insert", "function", and select (or search for and then select) "ttest":

For this function, you need to highlight the two sets of data you wish to compare. For the first series (Series A), activate the "Array 1" tab and highlight cells C2:C4. For the second series (Series B), activate the "Array 2" tab and highlight cells G2:G4.

Now click on the "Tails" tab. This isn't really something to worry about for undergraduate level. By default, use "2" for two-tailed. This is the test for any difference between the mean of your two data sets. One-tailed is a specific case if you only want to look for a change in a certain direction and is not something that is ever really used, so don't worry about it.

This data set is an example of an unpaired test. Comparing two groups of different subjects is unpaired. Paired is when you are comparing a measure of the same subjects taken at different points (e.g. lipid content of Strain A measured on Day X compared to lipid content of Strain A measured on Day Y). Variance is again not something to worry about at this stage, assume equal variance and so select "2" for "Type".

This returns a P value of 0.12, meaning the difference has a 12 % likelihood of being due to chance alone. That means we do not take the difference to be significant - statistical significance of a difference is taken as needing a less than 5 % likelihood of the difference being due to chance alone (P<0.05). In answer to "signficant?", I'll just type in for my own information the answer "no".

Doing a simple t-test in Microsoft Excel