Academic year: 2025

(1)

Chapter 10: Correlation and Regression Chapter 13: Nonparametric Statistics

Objectives:

❑ Learn how to draw a scatter plot for a set of ordered pairs.

❑ Learn how to compute the correlation coefficient.

❑ Learn how to compute the equation of the regression line.

❑ Learn how to compute the Spearman rank correlation coefficient.

(2)

Overview of Chapters 10 and 13

Sec. # | Title | Page(s)
10 – 1 | Scatter Plots and Correlation | 369 – 385
13 – 6 | The Spearman Rank Correlation Coefficient | 459 – 461
10 – 2 | Regression | 386 – 393

(3)

Remember?

The independent variable influences the dependent variable.

(4)

At a Glance!

❑ Are two or more variables linearly related? (Scatter plot and/or correlation coefficient)

❑ If so, what is the strength of the relationship? (Scatter plot and/or correlation coefficient)

❑ What type of relationship exists? (Scatter plot, correlation coefficient and/or regression)

❑ What kind of predictions can be made from the relationship? (Regression)

(5)

10 – 1: Scatter Plots and Correlation

❑ A scatter plot is a graph of the ordered pairs of numbers (x, y) consisting of the independent variable x and the dependent variable y.

(6)

10 – 1: Scatter Plots and Correlation (cont.)

❑ It is a visual way to describe the nature of the relationship between x and y. It may show:

  ○ a positive linear relationship,
  ○ a negative linear relationship,
  ○ a curvilinear relationship,
  ○ or no relationship.

❑ See Example 10 – 1 (page 372), Example 10 – 2 (pages 372 – 373), and Example 10 – 3 (page 373).

(7)

Examples of scatter plot patterns

(8)

Correlation

❑ Pearson’s linear correlation coefficient, denoted by 𝑟, measures the strength and the direction of a linear relationship between two quantitative variables.

(9)

Calculating 𝒓

❑ The linear correlation coefficient is given by

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

where 𝑛 is the number of (x, y) data pairs.

❑ This coefficient is also known as the Pearson product moment correlation coefficient (PPMC).

(10)

Properties of 𝒓

❑ The correlation coefficient ranges from -1 to +1.

❑ If the value of 𝑟 is close to +1, then there is a strong positive linear relationship between the variables.

❑ If the value of 𝑟 is close to -1, then there is a strong negative linear relationship between the variables.

❑ If the value of 𝑟 is close to 0, then there is either a weak linear relationship or no linear relationship between the variables.
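The properties above can be checked on small made-up data sets. The sketch below is illustrative only: the helper name `pearson_r` and the three toy data sets are assumptions, not taken from the textbook. It uses the sums formula for 𝑟 from this chapter and shows that points lying exactly on a rising line give +1, points on a falling line give -1, and a symmetric curvilinear pattern can give 0 even though the variables are clearly related.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from the sums formula (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

xs = [-2, -1, 0, 1, 2]                             # toy x values (assumed)
r_up = pearson_r(xs, [2 * x + 1 for x in xs])      # points on a rising line
r_down = pearson_r(xs, [-3 * x + 4 for x in xs])   # points on a falling line
r_curve = pearson_r(xs, [x ** 2 for x in xs])      # symmetric parabola

print(r_up, r_down, r_curve)  # 1.0 -1.0 0.0
```

The last case is why a value of 𝑟 near 0 only rules out a linear relationship; a scatter plot is still needed to spot a curvilinear one.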

(11)


(12)

Example 10 – 4: Car Rental Companies

# of Cars (x, in ten thousands) | Revenue (y)
63 | 7
29 | 3.9
20.8 | 2.1
19.1 | 2.8
13.4 | 1.4
8.5 | 1.5

❑ From the table, with 𝑛 = 6 data pairs, we obtain:

$$ \sum x = 153.8, \quad \sum y = 18.7, \quad \sum xy = 682.77, \quad \sum x^2 = 5859.26, \quad \sum y^2 = 80.67 $$

(13)

Example 10 – 4 (cont.)

$$ r = \frac{6(682.77) - (153.8)(18.7)}{\sqrt{\left[6(5859.26) - (153.8)^2\right]\left[6(80.67) - (18.7)^2\right]}} = 0.982 $$

❑ Hence, there is a strong positive linear relationship between the number of rented cars and revenue.

❑ See also Example 10 – 5, page 377 (negative correlation), and Example 10 – 6, page 378 (weak positive correlation).
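The calculation in Example 10 – 4 can be reproduced in a few lines. This is a minimal sketch that recomputes the five sums from the car rental data and applies the sums formula for 𝑟; the variable names are illustrative assumptions.

```python
from math import sqrt

# Car rental data from Example 10 - 4
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]    # x
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]    # y

n = len(cars)
sum_x = sum(cars)                                     # 153.8
sum_y = sum(revenue)                                  # 18.7
sum_xy = sum(x * y for x, y in zip(cars, revenue))    # 682.77
sum_x2 = sum(x * x for x in cars)                     # 5859.26
sum_y2 = sum(y * y for y in revenue)                  # 80.67

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))  # 0.982
```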

(14)

13 – 6: The Spearman Rank Correlation Coefficient

❑ If 𝑛 is the sample size and 𝑑 is the difference in ranks, then the Spearman rank correlation coefficient is calculated as

$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} $$

(15)

Example 13 – 7: Bank Branches and Deposits (page 459)

# of branches (X) | Deposits (Y) | Rank (X) | Rank (Y)
209 | 23 | 4 | 4
353 | 31 | 2 | 1
19 | 7 | 8 | 6
201 | 12 | 5 | 5
344 | 26 | 3 | 2
132 | 5 | 6 | 7
401 | 24 | 1 | 3
126 | 4 | 7 | 8

❑ Within each variable, rank 1 is assigned to the largest value.

(16)

Example 13 – 7 (cont.)

Rank (X) | Rank (Y) | 𝒅 | 𝒅²
4 | 4 | 0 | 0
2 | 1 | 1 | 1
8 | 6 | 2 | 4
5 | 5 | 0 | 0
3 | 2 | 1 | 1
6 | 7 | -1 | 1
1 | 3 | -2 | 4
7 | 8 | -1 | 1
∑ | | 0 | 12

(17)

Example 13 – 7 (cont.)

$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \cdot 12}{8(64 - 1)} = 1 - \frac{72}{504} = 0.857 $$

❑ The above value indicates a strong positive correlation.

❑ Spearman’s correlation can also be calculated when the data are ordinal-level qualitative data.
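The steps of Example 13 – 7 (rank each variable, take the differences in ranks, square and sum them, then apply the formula) can be sketched as follows. Two assumptions are made here: the helper `rank_descending` is a hypothetical name and only handles data without ties, and the deposit for the bank with 126 branches is taken as 4, the value consistent with the ranks listed in the example.

```python
def rank_descending(values):
    """Rank 1 = largest value. Assumes no ties (hypothetical helper)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

branches = [209, 353, 19, 201, 344, 132, 401, 126]   # X
deposits = [23, 31, 7, 12, 26, 5, 24, 4]             # Y

rank_x = rank_descending(branches)   # [4, 2, 8, 5, 3, 6, 1, 7]
rank_y = rank_descending(deposits)   # [4, 1, 6, 5, 2, 7, 3, 8]

d = [rx - ry for rx, ry in zip(rank_x, rank_y)]   # differences in ranks
sum_d2 = sum(di ** 2 for di in d)                 # 12

n = len(branches)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(round(r_s, 3))  # 0.857
```

With tied values the ranks would have to be averaged, which this sketch deliberately leaves out.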

(18)

10 – 2: Regression

❑ If the value of the correlation coefficient is significant, the next step is to determine the equation of the regression line, which is the data’s line of best fit.

❑ Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.

(19)

Line of best fit

(20)

Line of best fit (cont.)

(21)

Determination of the Regression Line Equation

❑ The equation of the regression line is:

$$ y' = a + b \cdot x $$

❑ Here, 𝑎 is the intercept (the regression constant), 𝑏 is the slope (the regression coefficient), and 𝑥 is the observed independent variable. Together they are used to calculate 𝑦′, the predicted value of the dependent variable.
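As a sketch of what "best fit" means in this notation: the line is chosen so that the sum of the squared vertical distances between the observed values and the predicted values is as small as possible. The subscript i (indexing the n data points) and the label SSE are notational assumptions added here for illustration.

$$ \mathrm{SSE}(a, b) = \sum_{i=1}^{n} \left(y_i - y'_i\right)^2 = \sum_{i=1}^{n} \left(y_i - a - b\,x_i\right)^2 $$

The regression constants 𝑎 and 𝑏 are the values that minimize this quantity.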

(22)

Determination of the Regression Line Equation (cont.)

$$ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2} $$

$$ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} $$

(23)

Example 10 – 9 (page 388)

❑ The number of rented cars is the independent variable 𝑥, while the revenue is the dependent variable 𝑦. The regression line is found to be:

$$ y' = 0.396 + 0.106 \cdot x $$

❑ This means that when the number of rented cars increases by 1 unit, the revenue increases by 0.106 on average.
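The intercept and slope in Example 10 – 9 can be reproduced from the car rental data of Example 10 – 4. This is a minimal sketch using the standard least-squares sums formulas for 𝑎 and 𝑏; the variable names are illustrative assumptions.

```python
# Car rental data from Example 10 - 4
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]    # x
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]    # y

n = len(cars)
sum_x = sum(cars)
sum_y = sum(revenue)
sum_xy = sum(x * y for x, y in zip(cars, revenue))
sum_x2 = sum(x * x for x in cars)

denominator = n * sum_x2 - sum_x ** 2                 # shared by a and b
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # intercept
b = (n * sum_xy - sum_x * sum_y) / denominator        # slope

print(round(a, 3), round(b, 3))  # 0.396 0.106
```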

(24)

Example 10 – 10 (page 389)

❑ The number of absences is the independent variable 𝑥, while the final grade is the dependent variable 𝑦. The regression line is found to be:

$$ y' = 102.493 - 3.622 \cdot x $$

❑ This means that when the number of absences increases by 1, the final grade decreases by 3.622 on average.

(25)

Example 10 – 11 (page 391)

❑ Predict the income of a car rental agency (y) that has 200,000 automobiles (x).

❑ Note that in Example 10 – 1 the number of rented automobiles is measured in ten thousands. Therefore, 200,000 automobiles corresponds to x = 20. Hence,

$$ y' = 0.396 + 0.106(20) = 2.516 $$
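The prediction step, including the unit conversion, can be sketched as below. The function name `predict_revenue` is a hypothetical helper; the coefficients are the rounded values from Example 10 – 9.

```python
def predict_revenue(cars_in_ten_thousands, a=0.396, b=0.106):
    """Predicted revenue y' = a + b * x, with x in ten thousands of cars."""
    return a + b * cars_in_ten_thousands

automobiles = 200_000
x = automobiles / 10_000        # convert to ten thousands: x = 20
y_pred = predict_revenue(x)

print(round(y_pred, 3))  # 2.516
```

Passing the raw count of 200,000 instead of 20 would overstate the prediction by several orders of magnitude, so the conversion is kept as an explicit step.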

(26)

Important Rule!

❑ Q. Is there any relationship between Pearson’s correlation coefficient and the regression coefficient 𝒃?

❑ A. Yes. The sign of the correlation coefficient and the sign of the slope of the regression line are always the same, because 𝑟 and 𝑏 share the numerator n∑xy − (∑x)(∑y) and both have positive denominators.

(27)

Application Summary

Measure Excel only Excel + MegaStat

Scatter plot ✓

Pearson’s linear correlation coefficient ✓

Spearman’s correlation coefficient ✓

Regression equation ✓
