Correlation Analysis

Correlation analysis is a method for measuring the strength of the linear relationship between two variables. When more than two variables are involved, the relationship is measured for each pair of variables.

It is used to identify the key input variables of a process, i.e. the input variables that are most strongly related to the process output.

Correlation analysis can be done in two ways: 1) using a scatter plot, or 2) using a correlation matrix.

Scatter Plot

A scatter plot shows the relationship between two variables graphically. Reading it is subjective, though, and it cannot give the magnitude of the relationship. On the other hand, a scatter plot can reveal a non-linear relationship between the two variables, which the correlation matrix cannot. So it is recommended that both methods be used when assessing the correlation between two variables.
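To see why a scatter plot is still needed, here is a minimal Python sketch (standard library only). The data and the helper name pearson_r are made up for illustration. It computes the correlation coefficient for a straight line and for a parabola over the same x values. The parabola is a perfectly predictable relationship, yet its coefficient comes out as zero because the relationship is not linear.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (an assumption, not taken from any real process).
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # straight line
y_curved = [v ** 2 for v in x]      # parabola, symmetric around x = 0

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print("linear:", r_linear)
print("curved:", r_curved)
```

Plotting y_curved against x would show the U-shape immediately, which is exactly the kind of pattern the coefficient alone misses.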


Figure 1: Scatter Plot

Correlation Matrix


A correlation matrix gives a quantitative measure of the strength of the linear relationship for each pair of variables. Each entry in the matrix is the ‘correlation coefficient’, also called the ‘Pearson coefficient’, ‘r’.

The value of ‘r’ ranges from -1 to +1. A value of +1 or -1 indicates a perfect positive or negative linear correlation, while 0 indicates no linear correlation between the variables.

Pearson Correlation (r-value, ignoring the sign)
  • 0: no relationship
  • +1 or -1: perfect relationship
  • between 0 and 0.3: little relationship
  • between 0.3 and 0.5: low relationship
  • between 0.5 and 0.7: moderate relationship
  • between 0.7 and 0.9: strong relationship
  • between 0.9 and 1.0: very strong relationship
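The bands above can be turned into a small lookup. This is only a sketch: the function name classify_strength is hypothetical, and since the list does not say which band a boundary value such as 0.3 belongs to, the code assumes it goes to the higher band.

```python
def classify_strength(r):
    """Map a Pearson r to the qualitative bands listed above.

    The sign is ignored. Boundary values (0.3, 0.5, 0.7, 0.9) are
    assigned to the higher band, which is an assumption of this sketch.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    a = abs(r)
    if a == 0.0:
        return "no relationship"
    if a == 1.0:
        return "perfect relationship"
    if a < 0.3:
        return "little relationship"
    if a < 0.5:
        return "low relationship"
    if a < 0.7:
        return "moderate relationship"
    if a < 0.9:
        return "strong relationship"
    return "very strong relationship"

for value in (0.0, 0.2, -0.45, 0.65, -0.85, 0.95, 1.0):
    print(value, "->", classify_strength(value))
```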

P-value

While doing the correlation analysis, we also need to know the significance of the correlation. This is done by using the p-value.

Null Hypothesis: There is no correlation between the variables
Alternate Hypothesis: There is correlation

Let's say the confidence level is set at 95%, i.e. 0.95, so the significance level is 1 - 0.95 = 0.05.

If the p-value is < 0.05, reject the null hypothesis, i.e. there is a statistically significant correlation between the two variables.

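A low p-value indicates that the correlation between the two variables is statistically significant, i.e. a correlation this strong would be unlikely to occur by chance alone if the variables were actually uncorrelated.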
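To make the idea of a p-value concrete, here is a rough Python sketch using only the standard library. Note that it uses a permutation test, not the t-distribution based calculation that statistical packages typically report, so the number will not match a package's output exactly. The data, variable names and function names are all made up for illustration. The idea is to shuffle one variable many times and count how often the shuffled data gives a correlation at least as strong as the observed one.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Approximate two-sided p-value for H0: no correlation.

    Shuffling y breaks any real link with x, so the shuffled
    correlations show what chance alone would produce.
    """
    rng = random.Random(seed)      # fixed seed so the result is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # The +1 counts the observed arrangement itself and avoids p = 0.
    return (hits + 1) / (n_perm + 1)

# Hypothetical process data: input temperature vs. output yield.
temperature = [20, 22, 25, 27, 30, 32, 35, 37]
process_yield = [51, 54, 58, 59, 65, 66, 71, 73]

r = pearson_r(temperature, process_yield)
p = permutation_p_value(temperature, process_yield)
alpha = 0.05                       # 1 - confidence level of 0.95
print("r =", round(r, 3), "p =", round(p, 4))
print("reject null hypothesis" if p < alpha else "fail to reject null hypothesis")
```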

Correlation Analysis in Minitab

Minitab>> Stat >> Basic Statistics >> Correlation

Choose the columns containing the variables you need to correlate. Minitab calculates the correlation for every possible pair in the list of columns selected.

By default, Minitab also displays the p-value for each correlation.
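For readers without Minitab, the "every possible pair of columns" idea can be sketched in plain Python. This is not how Minitab works internally. It is just an illustration with made-up column names and data, and it reports only the r-values, not the p-values.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical worksheet: each key is a column name, each list a column.
columns = {
    "temp": [20, 22, 25, 27, 30],
    "pressure": [5.1, 4.9, 5.0, 5.2, 4.8],
    "yield": [51, 54, 58, 59, 65],
}

# One r-value for every possible pair of the selected columns.
pairwise_r = {
    (a, b): pearson_r(columns[a], columns[b])
    for a, b in combinations(columns, 2)
}

for (a, b), r in pairwise_r.items():
    print(a, "vs", b, ":", round(r, 3))
```

With three columns there are three pairs. In general, k columns give k(k-1)/2 pairs.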
