P-values for correlations in Excel

This is just a quick post to describe how to calculate p-values for two-variable correlations in Excel. Annoyingly, there is no direct way of doing this. Excel will give you the correlation, but not its associated p-value. It can be done, however, in a slightly roundabout way.

First, calculate the correlation between your two variables:

=correl(variable1, variable2)
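If you want to sanity-check Excel's number outside the spreadsheet, here is a minimal Python sketch of the Pearson correlation coefficient, which is what correl returns. The function name pearson_r and the sample numbers are made up for illustration; they are not data from this post.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical example data, purely illustrative
variable1 = [1, 2, 3, 4, 5]
variable2 = [2, 1, 4, 3, 5]
print(pearson_r(variable1, variable2))  # 0.8
```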

This gives you the sample correlation coefficient r, which can be converted to a t statistic with the following formula:

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$$

where r is the correlation obtained above and n is your number of observations. Say you have 30 paired observations of your two variables, and an r of 0.5. The calculation to obtain t is then (in Excel terms):

=(0.5*sqrt(30-2))/(sqrt(1-0.5^2))
=3.05505
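The same conversion can be written as a small Python function, which is handy for checking the arithmetic. This is only a sketch: the name t_from_r and the input checks are my own additions, not anything Excel provides.

```python
import math


def t_from_r(r, n):
    """Convert a correlation r from n paired observations to a t statistic."""
    if n < 3:
        raise ValueError("need at least 3 observations (n - 2 degrees of freedom)")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# The worked example: r = 0.5 with n = 30
print(round(t_from_r(0.5, 30), 5))  # 3.05505
```

Note that a negative r gives a negative t of the same size.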

Then, to get the p-value associated with this t, use the t.dist.2t function (the two-tailed Student's t distribution), with n-2 degrees of freedom:

=t.dist.2t(t, degrees of freedom)

This returns the p-value for a two-tailed test. Alternatively, the older tdist function still works; it requires you to specify the number of tails (=tdist(t, degrees of freedom, #tails)).

Our calculation thus looks like this:

=t.dist.2t(3.05505, 30-2)
=0.0049

Which is the p-value for the correlation. Done!
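As a rough cross-check of the Excel result, the two-tailed p-value can also be approximated in plain Python. This sketch integrates the Student's t density numerically with Simpson's rule instead of calling a statistics library; the function names t_pdf and two_tailed_p and the default step count are my own assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value for a t statistic via Simpson's rule."""
    upper = abs(t)  # the two-tailed test is symmetric, so the sign is dropped
    if upper == 0:
        return 1.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


# The worked example: t = 3.05505 with 30 - 2 degrees of freedom
print(round(two_tailed_p(3.05505, 30 - 2), 4))
```

For the worked example this should agree with the 0.0049 that Excel reports, to about four decimal places.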

7 thoughts on “P-values for correlations in Excel”

  2. I realise I am being too simple minded here, but does this 0.0049 value for t.dist.2t mean that the two variables are significantly correlated – or not?
    I ask because I thought if they are, then the significance should be above, say, 0.95?

    • Hi Ian. p=0.0049 is a significant correlation, yes. Typically, p<0.05 (or lower) is needed to claim significance. The threshold of 0.05 (called the alpha) is an arbitrary one, but remains the convention. Sometimes a 'stricter' alpha of 0.01 is used.

      The alpha level is related to the confidence level, which may be what you are referring to. An alpha of 0.05 corresponds to a confidence level of 0.95 (95%) as follows: 1.0-0.95=0.05. So as our p-value is 0.0049, it meets the conventional alpha threshold (0.05), and we can say that the correlation is significant at a 95% confidence level. We could, of course, choose a different alpha threshold. For example, if we chose a 0.01 threshold instead, this would correspond to a 99% confidence level (1.0-0.99=0.01).

      I hope this makes sense.

  3. Thanks for this – very helpful. Having a slight problem when I have a negative r value. It’s making my t value negative and then it’s unable to calculate my p value. Do I need to add in another step for a negative t value or am I doing something wrong? I’ve checked the equations and they’re working fine with positive r values. Thanks

    • Hi Sara, you are absolutely right: t.dist.2t returns an error when t is negative. The two-tailed test is symmetric, so the sign of t does not affect the p-value. If you have a negative t, just use its absolute value (in effect, change your negative to a positive).

      The function for doing it is “abs”: for example, abs(-5) tells Excel to use the value 5 instead of -5, so you can write =t.dist.2t(abs(t), degrees of freedom). Or you can just change the sign manually if that is easier. Hope this helps.
