This is just a quick post to describe how to calculate p-values for two-variable correlations in Excel. Annoyingly, there is no direct way of doing this. Excel will give you the correlation, but not its associated p-value. It can be done, however, in a slightly roundabout way.
First, calculate the correlation between your groups:
=correl(variable1, variable2)
This gives you the sample test statistic r, which can be converted to t with the following formula:
where r is the correlation obtained above and n is your number of observations. Say you have 30 samples for two groups, and a r of 0.5. The calculation to obtain t is then (in excel terms):
=(0.5*sqrt(30-2))/(sqrt(1-0.5^2))
=3.05505
Then to assess the significance value associated with this t, simply use the tdist function (Student T distribution output):
=t.dist.2t(t, degrees of freedom)
This gives us results for a two-tailed distribution. Alternatively, the old tdist function can still be used, which requires the user to specify the number of tails (=tdist(t, degrees of freedom, #tails)).
Our calculation thus looks like this:
=t.dist.2t(3.05505, 30-2)
=0.0049
Which is the p-value for the correlation. Done!
Pingback: P-values for correlations in Excel — physiology and coffee | SutoCom Solutions
Thanks for this!
LikeLike
I realise I am being too simple minded here, but does this 0.0049 value for t.dist.2t mean that the two variables are significantly correlated – or not?
I ask because I thought if they are, then the significance should be above, say, 0.95?
LikeLike
Hi Ian. p=0.0049 is a significant correlation, yes. Typically, p<0.05 (or lower) is needed to claim significance. The threshold of 0.05 (called the alpha) is an arbitrary one, but remains the convention. Sometimes a 'stricter' alpha of 0.01 is used.
The alpha level is related to the confidence level, which may be what you are referring to. An alpha of 0.05 corresponds to a confidence level of 0.95 (95%) as follows: 1.0-0.95=0.05. So as our p-value is 0.0049, it meets the conventional alpha threshold (0.05), and we can say that the correlation is significant at a 95% confidence level. We could, of course, choose a different alpha threshold. For example, if we chose a 0.01 threshold instead, this would correspond to a 99% confidence level (1.0-0.99=0.01).
I hope this makes sense.
LikeLike
Hi Mariherigstad,
Thanks for that; and apologies for my not getting back sooner – I have been away.
Yes, that is helpful.
But can you confirm what I am doing is OK to test both lower and upper limits? –
I do this by three steps, using a boundary limit of 5% – let us call that B:
Having initially calculated the correlation using Correl r of Array1:Array2 of n events,
I then calculate the t-test function: r x SQRT(n-2)/SQRT(1 – r^2) = t of this array
Next, I calculate T.DIST(t,n-2,2) calling this result L
Finally, I put this value as my lower and upper limits, so:
IF(OR(L 1-B), SLOPE(Array1:Array2),””)
Have I got this right please?
LikeLike
Thanks for this – very helpful. Having a slight problem when I have a negative r value. It’s making my t value negative and then its unable to calculate my p value. Do I need to add in another step for a negative t value or am I doing something wrong? I’ve checked the equations and they’re working fine with positive r values. Thanks
LikeLike
Hi Sara, you are absolutely right, it doesn’t compute for negative values. So if you have a negative, just use the absolute number (in effect, change your negative to a positive).
The command for doing it is “abs”: for e.g. -5, you could write abs(-5) and that would tell excel to use the value 5 instead of -5. Or you can just do it manually if that is easier. Hope this helps.
LikeLike
Great, thanks very much for your help!
LikeLike
Hi,
Please correct me if I’m wrong, but can’t the significance of a correlation be found using the regression analysis tool in the Data Analysis tool-pack?
LikeLike
Hi. Hope you could help me.
Can I use the pearson correlation as well as p-value for the correlation of measurement method between our company and our suppliers? Thanks.
LikeLike
Pingback: Gesamtsterbefallzahl korreliert jetzt noch signifikanter positiv mit Impfquote beim Vergleich aller Deutscher Bundesländer - Agenda-leben