P-values for correlations in Excel

This is just a quick post to describe how to calculate p-values for two-variable correlations in Excel. Annoyingly, there is no direct way of doing this. Excel will give you the correlation, but not its associated p-value. It can be done, however, in a slightly roundabout way.

First, calculate the correlation between your groups:

=correl(variable1, variable2)

This gives you the sample test statistic r, which can be converted to t with the following formula:

where r is the correlation obtained above and n is your number of observations. Say you have 30 samples for two groups, and a r of 0.5. The calculation to obtain t is then (in excel terms):

=(0.5*sqrt(30-2))/(sqrt(1-0.5^2))
=3.05505

Then to assess the significance value associated with this t, simply use the tdist function (Student T distribution output):

=t.dist.2t(t, degrees of freedom)

This gives us results for a two-tailed distribution. Alternatively, the old tdist function can still be used, which requires the user to specify the number of tails (=tdist(t, degrees of freedom, #tails)).

Our calculation thus looks like this:

=t.dist.2t(3.05505, 30-2)
=0.0049

Which is the p-value for the correlation. Done!

10 thoughts on “P-values for correlations in Excel”

1. I realise I am being too simple minded here, but does this 0.0049 value for t.dist.2t mean that the two variables are significantly correlated – or not?
I ask because I thought if they are, then the significance should be above, say, 0.95?

Like

• Hi Ian. p=0.0049 is a significant correlation, yes. Typically, p<0.05 (or lower) is needed to claim significance. The threshold of 0.05 (called the alpha) is an arbitrary one, but remains the convention. Sometimes a 'stricter' alpha of 0.01 is used.

The alpha level is related to the confidence level, which may be what you are referring to. An alpha of 0.05 corresponds to a confidence level of 0.95 (95%) as follows: 1.0-0.95=0.05. So as our p-value is 0.0049, it meets the conventional alpha threshold (0.05), and we can say that the correlation is significant at a 95% confidence level. We could, of course, choose a different alpha threshold. For example, if we chose a 0.01 threshold instead, this would correspond to a 99% confidence level (1.0-0.99=0.01).

I hope this makes sense.

Like

Thanks for that; and apologies for my not getting back sooner – I have been away.
But can you confirm what I am doing is OK to test both lower and upper limits? –
I do this by three steps, using a boundary limit of 5% – let us call that B:
Having initially calculated the correlation using Correl r of Array1:Array2 of n events,
I then calculate the t-test function: r x SQRT(n-2)/SQRT(1 – r^2) = t of this array
Next, I calculate T.DIST(t,n-2,2) calling this result L
Finally, I put this value as my lower and upper limits, so:
IF(OR(L 1-B), SLOPE(Array1:Array2),””)
Have I got this right please?

Like

2. Thanks for this – very helpful. Having a slight problem when I have a negative r value. It’s making my t value negative and then its unable to calculate my p value. Do I need to add in another step for a negative t value or am I doing something wrong? I’ve checked the equations and they’re working fine with positive r values. Thanks

Like

• Hi Sara, you are absolutely right, it doesn’t compute for negative values. So if you have a negative, just use the absolute number (in effect, change your negative to a positive).

The command for doing it is “abs”: for e.g. -5, you could write abs(-5) and that would tell excel to use the value 5 instead of -5. Or you can just do it manually if that is easier. Hope this helps.

Like

3. Hi,
Please correct me if I’m wrong, but can’t the significance of a correlation be found using the regression analysis tool in the Data Analysis tool-pack?

Like

4. Hi. Hope you could help me.
Can I use the pearson correlation as well as p-value for the correlation of measurement method between our company and our suppliers? Thanks.

Like