
Chapter 9 - SIMPLE LINEAR REGRESSION AND CORRELATION


Academic year: 2025


(1)

© FAROUQ MOHAMMAD A. ALAM

Chapter 9

SIMPLE LINEAR REGRESSION AND CORRELATION

(2)

Learning Outcomes

› After studying this chapter, the student will:

1. be able to obtain a simple linear regression model and use it to make predictions.

2. be able to calculate the coefficient of determination and to interpret tests of regression coefficients.

3. be able to calculate correlations among variables.

4. understand how regression and correlation differ and when the use of each is appropriate.

(3)

9.1 INTRODUCTION

9.2 THE REGRESSION MODEL


(4)

Regression vs. Correlation

› Regression is an inferential method that is usually employed to predict or estimate the value of one variable corresponding to a given value of another variable.

› Correlation is an inferential method concerned with measuring the strength of the relationship between variables.

› The fundamentals of regression analysis are based on the theory of conditional probability.

(5)

Types of Variables in Regression Analysis

› An independent variable or predictor variable (X) is the variable controlled by the investigator.

› A dependent variable or response variable (Y) is the variable influenced by the independent variable.


(6)

Independent variable vs. dependent variable

Independent variable → influences → Dependent variable

(7)

9.3 THE SAMPLE REGRESSION EQUATION

9.4 EVALUATING THE REGRESSION EQUATION

9.5 USING THE REGRESSION EQUATION


(8)

The Scatter Diagram

› A scatter diagram consists of points plotted by assigning values of the independent variable X to the horizontal axis and values of the dependent variable Y to the vertical axis.

(9)

The Least-Squares Line

› The method of least squares is used to obtain the least-squares regression line, which has the following form:

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

› Here, $\hat{y}$ is the predicted value of the dependent variable, $x$ is the corresponding value of the independent variable on which the prediction is based, $\hat{\beta}_0$ is the point where the line crosses the vertical axis (i.e., the intercept), and $\hat{\beta}_1$ shows the amount by which $\hat{y}$ changes for each unit change in $x$ (i.e., the slope).


(10)

The Least-Squares Line (cont.)

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

› The above equation (with $\hat{\beta}_1 > 0$) means that as the independent variable increases by 1 unit, the dependent variable increases by $\hat{\beta}_1$ on average.

$\hat{y} = \hat{\beta}_0 - \hat{\beta}_1 x$

› The above equation (with $\hat{\beta}_1 > 0$) means that as the independent variable increases by 1 unit, the dependent variable decreases by $\hat{\beta}_1$ on average.

(11)

The Least-Squares Line (cont.)

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
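
As a minimal sketch of how the slope and intercept formulas can be applied, the Python snippet below computes $\hat{\beta}_1$ and $\hat{\beta}_0$ from the sums of deviations about the means. The function name `least_squares_fit` and the five data points are hypothetical, chosen only for illustration; they are not taken from the chapter's examples.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of cross-products of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of squared deviations of x from its mean
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical illustration data (not from Table 9.3.1)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_fit(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

For this toy data set the means are 3 and 4, the cross-product sum is 6, and the squared-deviation sum is 10, so the slope is 0.6 and the intercept is 2.2.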


(12)

Types of Relations from Scatter Diagram

(13)

Types of Relations from Scatter Diagram (cont.)

› A positive $\hat{\beta}_1$ indicates that values of Y tend to increase as values of X increase, and we say that there is a direct (positive) linear relationship between X and Y.

› A negative $\hat{\beta}_1$ indicates that values of Y tend to decrease as values of X increase, and we say that there is an inverse (negative) linear relationship between X and Y.

› When there is no linear relationship between X and Y, $\hat{\beta}_1 = 0$.


(14)

The Least-Squares Line (cont.)

› The least-squares regression line is called the "best fit" line for describing the relationship between our two variables, since the sum of the squared vertical deviations of the observed data points ($y_i$) from the least-squares regression line is smaller than the sum of the squared vertical deviations of the data points from any other line.
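
The "best fit" property can be checked numerically. The sketch below, under the assumption of a small hypothetical data set, fits the least-squares line and compares its sum of squared vertical deviations with that of a few arbitrarily chosen competing lines. The helper names `fit_line` and `sum_squared_errors` are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical deviations of the points from a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical illustration data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
sse_best = sum_squared_errors(xs, ys, b0, b1)

# Arbitrary competing lines (intercept, slope), chosen only for comparison
competitors = [(b0 + 0.5, b1), (b0, b1 + 0.1), (b0 - 0.3, b1 - 0.05), (0.0, 1.0)]
sse_others = [sum_squared_errors(xs, ys, a, b) for a, b in competitors]

print("least-squares SSE:", round(sse_best, 4))
print("competing SSEs:   ", [round(s, 4) for s in sse_others])
```

Every competing line gives a larger sum of squared deviations than the fitted line, which is what the least-squares criterion guarantees.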

(15)

The Least-Squares Line (cont.)

(16)

The Least-Squares Line (cont.)

(17)

Using the Fitted Equation

› The fitted equation can be used to obtain a prediction for the value of Y given a value of X.


(18)

MegaStat Application

(19)

Example 9.3.1

› Table 9.3.1 shows the measurements taken on each subject: deep abdominal adipose tissue (AT) obtained by CT, and waist circumference (in cm). Construct a scatter plot and perform a regression analysis, given that deep abdominal AT is the dependent variable and waist circumference is the independent variable.


(20)
(21)


(22)

Example 9.4.2

› We wish to know if we can conclude that the slope of the population regression line describing the relationship between X and Y is zero. Also, predict Y and estimate the mean of Y for a waist circumference of 1 m.

› Important note: Recall that X is measured in cm. Before predicting or estimating the mean, check the measurement units. Here, convert 1 m to cm by multiplying by 100, then predict Y or estimate its mean based on X = 100.
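
The slides do not spell out the test statistic for the hypothesis that the population slope is zero. One common choice, assumed here rather than taken from the source, is $t = \hat{\beta}_1 / SE(\hat{\beta}_1)$ with $n - 2$ degrees of freedom, where $SE(\hat{\beta}_1) = \sqrt{MSE / \sum (x_i - \bar{x})^2}$ and $MSE = SSE / (n - 2)$. The sketch below computes this statistic for a small hypothetical data set; the function name `slope_t_statistic` is illustrative.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (t statistic, degrees of freedom) for testing slope = 0."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual (error) sum of squares and mean square error
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)
    # Estimated standard error of the slope
    se_slope = math.sqrt(mse / s_xx)
    return slope / se_slope, n - 2


# Hypothetical illustration data (not the waist / AT measurements)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

t_stat, df = slope_t_statistic(xs, ys)
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```

The resulting statistic would then be compared with a critical value of the t distribution, or converted to a p-value by statistical software, to decide whether the slope differs from zero.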

(23)

Regression Analysis (Scatter Diagram) (MegaStat Application)

1. In Data Ribbon, click on MegaStat icon, then select Correlation / Regression.

2. Select Scatterplot.


(24)

3. Input the range of X.

4. Input the range of Y.

5. Uncheck the option if you do not want to include the regression line in the plot.

(25)

› The scatter diagram indicates a direct linear relationship between X and Y.

› The equation of the line is:

$\hat{y} = -215.981 + 3.459x$

› The regression equation means that as the waist measurement increases by 1 unit (1 cm), the deep abdominal AT increases by 3.459 units on average.


(26)

Regression Analysis (MegaStat Application)

1. In Data Ribbon, click on MegaStat icon, then select Correlation / Regression.

2. Select Regression.

(27)


3. Input the range of X.

4. Input the range of Y.

5. Change the option in the drop-down box to "Type in predictor values", add the values of X, and then press OK.

(28)

› The coefficient of determination is equal to 0.670.
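
A coefficient of determination of 0.670 is usually read as follows: about 67% of the variation in Y is explained by its linear relationship with X. The slides take this value from the MegaStat output; as a hedged sketch of what lies behind it, the snippet below uses the standard definition $R^2 = 1 - SSE/SST$ (an assumption here, since the slides do not show the formula) on a small hypothetical data set. The function name `r_squared_for` is illustrative.

```python
def r_squared_for(xs, ys):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared deviations from the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared deviations from the mean of y
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical illustration data (not from Table 9.3.1)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_squared = r_squared_for(xs, ys)
print(f"R^2 = {r_squared:.3f}")
```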

(29)


› The intercept and the slope of the regression equation.

(30)

› The predicted value of Y given that X = 100 is $\hat{y} \approx 130$.
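
The arithmetic behind this prediction can be sketched in a few lines of Python. The intercept (-215.981) and slope (3.459) are the values reported for the fitted line in this example; the constant and function names are hypothetical. The unit conversion from metres to centimetres is done explicitly, since X is measured in cm.

```python
# Coefficients of the fitted line reported in the slides
INTERCEPT = -215.981
SLOPE = 3.459


def predict_deep_abdominal_at(waist_cm):
    """Predicted deep abdominal AT for a waist circumference given in cm."""
    return INTERCEPT + SLOPE * waist_cm


waist_m = 1.0
waist_cm = waist_m * 100  # convert metres to centimetres before predicting

y_hat = predict_deep_abdominal_at(waist_cm)
print(f"predicted Y at X = {waist_cm:.0f} cm: {y_hat:.3f}")
```

Plugging X = 1 into the equation without converting units would give a meaningless negative value, which is why the unit check matters.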

(31)

9.6. THE CORRELATION MODEL

9.7 THE CORRELATION COEFFICIENT


(32)

Regression Analysis vs. Correlation Analysis

› Regression analysis describes the relationship between the dependent (Y) and independent (X) variables for the purposes of prediction.

› Correlation analysis is used to determine the strength and direction of the relationship between Y and X.

› The population correlation coefficient $\rho$ measures the direction and strength of the linear relationship between X and Y. It is known as Pearson's correlation coefficient.

(33)

Properties of ANY Correlation Coefficient

› The correlation coefficient ranges from -1 to +1.

› If the value of the correlation coefficient is close to +1, then there is a strong direct linear relationship between the variables.

› If the value of the correlation coefficient is equal to +1, then there is a perfect direct linear relationship between the variables.

(34)

Properties of ANY Correlation Coefficient (cont.)

› If the value of the correlation coefficient is close to -1, then there is a strong inverse linear relationship between the variables.

› If the value of the correlation coefficient is equal to -1, then there is a perfect inverse linear relationship between the variables.

› If the value of the correlation coefficient is close to 0 (from the positive side), then there is a weak direct linear relationship between the variables.

(35)

Properties of ANY Correlation Coefficient (cont.)

› If the value of the correlation coefficient is close to 0 (from the negative side), then there is a weak inverse linear relationship between the variables.

› If the value of the correlation coefficient is equal to 0, then there is no linear relationship between the variables.

› The sign of the correlation coefficient is the same as the sign of the slope of the regression equation.
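
The agreement in sign follows because the slope and the correlation coefficient share the same numerator, the sum of cross-products of deviations, while their denominators are always positive. As a small sketch under hypothetical data, the snippet below computes both quantities for one rising and one falling pattern; the function name `slope_and_r` is illustrative.

```python
import math


def slope_and_r(xs, ys):
    """Return (least-squares slope, sample correlation) for paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    slope = s_xy / s_xx                # same numerator as r
    r = s_xy / math.sqrt(s_xx * s_yy)  # positive denominator
    return slope, r


# Hypothetical illustration data
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]
falling = [5, 4, 5, 4, 2]

for label, ys in (("rising", rising), ("falling", falling)):
    slope, r = slope_and_r(xs, ys)
    print(f"{label}: slope = {slope:+.3f}, r = {r:+.3f}")
```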

(36)

The Sample Correlation Coefficient

› The sample correlation coefficient $r$ describes the linear relationship between the sample observations of the two variables in the same way as the population correlation coefficient $\rho$.

› The sample correlation coefficient is calculated using the following formula:

$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
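
As a minimal sketch, the computational formula for the sample correlation coefficient can be evaluated from five running sums. The function name `sample_correlation` and the data are hypothetical and serve only to illustrate the arithmetic; they are not the height and Cv measurements of Table 9.7.1.

```python
import math


def sample_correlation(xs, ys):
    """Pearson's r from the computational (raw sums) formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical illustration data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = sample_correlation(xs, ys)
print(f"r = {r:.4f}")
```

Here the sums are 15, 20, 66, 55, and 86, so the numerator is 30 and the denominator is the square root of 50 times 30, giving r of about 0.7746.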

(37)

MegaStat Application


(38)

Example 9.7.1 and Example 9.7.2

› Table 9.7.1 shows a subject's height (cm) and the peak spinal latency (Cv) of the SEP (a type of electrical activity of the brain). Investigate the relationship between a subject's height and the Cv of the SEP. Use the sample correlation coefficient to check whether it is of sufficient magnitude to indicate that, in the population, height and Cv SEP levels are correlated.

(39)

Correlation Analysis (MegaStat Application)

1. In Data Ribbon, click on MegaStat icon, then select Correlation / Regression.

2. Select Correlation Matrix.


(40)

3. Input the range of information, then

6. Choose "greater than".

(41)


› MegaStat indicates that the correlation coefficient is equal to 0.848 (strong positive linear relationship).
