Doing a simple t-test in Microsoft Excel

What I'll do is walk through how to do a simple t-test in Excel - how and where to put in the different functions.

Let's start with an examnple data set. You have two strains of algae (Strain A and Strain B) and have measured the lipid content for 3 replicates of each.

That data might be be laid out in Excel as shown below, with the data series in columns. To the right, I've typed in labels for "mean" and "SD" (standard deviation) that we'll now calculate for each data series. I've clicked on cell J2, where I want to put the mean value for Strain A, and have gone on the tab menu to "Insert" and then "Function":


Select "average" from the menu. If it's not on the "recently used" list simply type "average" into the search box and Excel will bring the average option up for you:


Click "ok" and this will then bring up a menu to select the cells from which to calculate the average:

 

Highlight the cells you wish to calculate the average for (in this case C2 to C4) and hit "ok".


Cell J2 now displays the mean of 54, with the formula bar giving the function that cell J2 is performing "=AVERAGE(C2:C4)":

Clicking on the formula brings up a colour box highlighting the cells that you are analysing. This is a useful double that you have the right cells under analysis. You can change the cells under analysis simply by dragging the corners of the highlight box.

To calculate the standard deviation of Series A, do the same as for the mean but with selecting the "stdev" option from the pop up menu:



A shortcut to repeating the mean and SD for Series B would be to highlight and copy cells J2 and J3, then past them in K2 and K3. But because you have pasted one column along, Excel assumes that you want to analyse the column one along from that for Series A. Cells K2 and K3 actually display the error message "#DIV/0!" because there is no data in cells D2:D4.



Click on the formula bar for K2 and the cells it is analysing the mean for are highlighted (D2:D4).


The data for Series B is actually in cells G2:G4 - all you need to do to correct that is to drag the highlight box over to G2:G4.


Repeat that for cell K3 to ensure that it is analysing cells G2:G4 for the Series B standard deviation.


Now for the t-test, which is a bit more complicated. Again, go to "insert", "function", and select (or search for and then select) "ttest":


For this function, you need to highlight the two sets of data you wish to compare. For the first series (Series A), activate the "Array 1" tab and highlight cells C2:C4. For the second series (Series B), activate the "Array 2" tab and highlight cells G2:G4.


Now click on the "Tails" tab. This isn't really something to worry about for undergraduate level. By default, use "2" for two-tailed. This is the test for any difference between the mean of your two data sets. One-tailed is a specific case if you only want to look for a change in a certain direction and is not something that is ever really used, so don't worry about it.

This data set is an example of an unpaired test. Comparing two groups of different subjects is unpaired. Paired is when you are comparing a measure of the same subjects taken at different points (e.g. lipid content of Strain A measured on Day X compared to lipid content of Strain A measured on Day Y). Variance is again not something to worry about at this stage, assume equal variance and so select "2" for "Type".



This returns a P value of 0.12, meaning the difference has a 12 % likelihood of being due to chance alone. That means we do not take the difference to be significant - statistical significance of a difference is taken as needing a less than 5 % likelihood of the difference being due to chance alone (P<0.05). In answer to "signficant?", I'll just type in for my own information the answer "no".

Journal articles; primary literature

Research gets published as papers in journals, along with review essays and various other articles. It's important that you know how to cite journal articles, so that other people can easily find their way to the paper that you are refering to.

As an example, let's take one of Dr Keith White's research papers on aluminium toxicity in crayfish. If you want to refer to this article in the text of an essay, you might write something like:
Researchers concerned with aluminium toxicity have investigated the metal's bioavailability to crayfish (Alexopoulos et al., 2003)
The "et al." bit means "and others". In your reference list you would then give full details as:
Alexopoulos, A., McCrohan, C.R., Powell, J.J., Jugdaohsingh, R. & White, K.N. (2003). Bioavailability and toxicity of freshly neutralized aluminium to the freshwater crayfish Pacifastacas leniusculus. Archives of Environmental Contamination & Toxicology, 45:509-514
The journal name (in italics; Archives of Environmental Contamination & Toxicology) tells you in which journal to look for the article. In the journal listing online you can then simply click to the year 2003, open the listings for volume 45, and then scroll down to the article by Alexopoulos et al. on pages 509-514 inclusive.

To find journal articles on a subject you're interested in, try the following databases:
The kind of search terms you use can also be important. Try to use "scientific" names and words. For example, if you're looking for the enzyme "FNR", you might want to try "ferredoxin FNR" or "ferredoxin FNR leaf". "FNR" on Google comes up with a tractor company in Cambridge.