Pearson Correlation Coefficient

Correlation is a statistical technique that investigates the relationship between two quantitative, continuous variables, for instance the relationship between blood pressure and age. The Pearson correlation coefficient, also called the Pearson product-moment correlation coefficient and denoted by r, measures the strength and direction of the linear relationship between the two variables in question. It can be understood in terms of the line of best fit drawn through the data: the value of r indicates how far the data points lie from that line, and therefore how well a straight line describes the relationship between the variables.
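For reference, the sample coefficient is commonly written as shown here. The notation is an assumption introduced for this illustration: x_i and y_i are the n paired observations of the two variables, and x̄ and ȳ are their sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to be above or below their means together and negative when they tend to move in opposite directions. The denominator rescales the result so that r always falls between -1 and +1.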

Values assumed by the Pearson correlation coefficient

The Pearson correlation coefficient, r, can take any value from -1 to +1. A value of r greater than 0 indicates a positive relationship between the variables: as one variable increases, the other also tends to increase. A value of r less than 0 indicates a negative relationship: as one variable increases, the other tends to decrease. A value of 0 indicates no linear relationship between the variables. These values are illustrated in the diagrams below.
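The sign of r can be checked with a short calculation. The following is a minimal Python sketch, not a definitive implementation: the function name `pearson_r` and all of the data values are hypothetical, invented purely for illustration and not taken from any real study.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (the two terms under the square root).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (made up for this example).
age = [25, 35, 45, 55, 65]
blood_pressure = [118, 121, 127, 133, 140]  # tends to rise with age
reaction_speed = [9.1, 8.7, 8.0, 7.2, 6.5]   # tends to fall with age

r_pos = pearson_r(age, blood_pressure)
r_neg = pearson_r(age, reaction_speed)
print(f"age vs blood pressure: r = {r_pos:.3f}")  # positive relationship
print(f"age vs reaction speed: r = {r_neg:.3f}")  # negative relationship
```

In this sketch the first pair of variables increases together, so r comes out above 0, while the second pair moves in opposite directions, so r comes out below 0.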

Determining the strength of association using the Pearson correlation coefficient

The closer the Pearson correlation coefficient, r, is to either +1 or -1, the stronger the relationship between the two variables. If r is exactly +1 or -1, all the data points lie on the line of best fit, and no point deviates from it. If r lies strictly between -1 and +1, for instance r = 0.4 or r = -0.6, the points are scattered around the line. The closer r is to 0, the greater the scatter of the data points about the line of best fit. This variation is depicted in the figures shown below, which show scatter plots of the two variables together with a fitted line of best fit. The line of best fit gives a clear indication of the direction of the relationship: an upward-sloping line shows a positive relationship, while a downward-sloping line shows a negative relationship. The upper row shows positive correlations, while the lower row shows negative correlations.
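The link between r and the distance of points from the line of best fit can be sketched in Python. This is an illustration under stated assumptions: the helper names `fit_line_and_r` and `max_abs_residual` are hypothetical, the line is fitted by ordinary least squares, and the data are invented.

```python
import math

def fit_line_and_r(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def max_abs_residual(xs, ys):
    """Largest vertical distance of any point from the fitted line."""
    slope, intercept, _ = fit_line_and_r(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))

# Hypothetical data (made up for this example).
xs = [1, 2, 3, 4, 5]
on_line = [2 * x + 1 for x in xs]        # every point lies exactly on y = 2x + 1
scattered = [3.5, 4.0, 8.5, 7.0, 12.0]   # same upward trend, with deviations

for label, ys in [("on_line", on_line), ("scattered", scattered)]:
    _, _, r = fit_line_and_r(xs, ys)
    print(f"{label}: r = {r:.3f}, largest residual = {max_abs_residual(xs, ys):.3f}")
```

For the points that sit exactly on a line, r equals 1 up to floating-point rounding and the largest residual is zero. For the scattered points, r falls below 1 and some points lie a measurable distance from the fitted line.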

The following table can be useful when interpreting the Pearson correlation coefficient.

However, these values should be treated only as a guideline. What counts as a strong or weak association depends on the variables being measured.