APPLIED REGRESSION ANALYSIS PRACTICUM REPORT

BY:

Gempur Safar

06/193137/PA/10877

Assistants:

Jim Oklahoma (10419)

Indri Rivani Purwanti (10990)

Lecturer:

Drs. Zulaela, M.Si.

MATHEMATICS AND STATISTICS COMPUTING LABORATORY

DEPARTMENT OF MATHEMATICS

FACULTY OF MATHEMATICS AND NATURAL SCIENCES

UNIVERSITAS GADJAH MADA

YOGYAKARTA

2008


Problems

a. Using the data below, answer the following questions:

Location            Birth Rate   Infant Mortality
American Samoa      24.88        10.36
CNMI                20.6         5.7
Fiji                23.33        14.08
French Polynesia    18.6         9.12
FSM                 27.09        33.48
Guam                25.07        6.71
Kiribati            31.98        54.0
Marshall Islands    45.07        39.82
Nauru               27.22        10.71
Palau               19.64        16.67
Papua New Guinea    32.15        58.21
Samoa               15.59        31.75
Solomon Islands     34.05        24.47
Tuvalu              21.56        22.65
Vanuatu             25.4         61.05

a. Calculate the slope of the least squares line for the data.

b. Calculate the y-intercept of the least squares line.

c. Is the correlation positive, negative, or neutral?

d. Use the equation of the best fit line to calculate the expected infant mortality rate for a Pacific basin location with a birth rate of 30.

e. Use the inverse of the best fit equation to calculate the expected birth rate for a Pacific basin location with an infant mortality rate of 40.

f. Calculate the linear correlation coefficient r for the data.

g. Is the correlation none, low, moderate, high, or perfect?

h. Calculate the coefficient of determination.

i. What percent of the variation in the birth rate explains the variation in the infant mortality data?

j. Is there a relationship between birth rate and infant mortality?

k. Would a policy that decreases the birth rate also likely decrease the infant mortality data?

l. Given the data and correlation above, what types of social policy might you recommend

m. The FSM has a population of 110,000. Given the noted birth rate of 27.09 babies (live births) per 1000 people in the FSM in 2001, how many babies can be expected to be born in the FSM this year?

n. Infant mortality refers to the death of a baby in its first year of life. Given the FSM's infant mortality rate of 33.48 infants per 1000 live births, how many infants can be expected to die before they reach their first birthday in the FSM this year?

b. A researcher wants to compare three methods (A, B, and C) of treating patients with severe depression. The researcher also wants to know the relationship between age and the effectiveness of the treatment methods. For this purpose, 36 patients were selected at random, 12 for each treatment group. The results are summarized in the table below, where the dependent variable Y is treatment effectiveness, the independent variable X1 is the patient's age, and X2 is the treatment method.

Y      X1     X2      Y      X1     X2
56.0   21.0   A       65.0   43.0   A
41.0   23.0   B       55.0   45.0   B
40.0   30.0   B       57.0   48.0   B
28.0   19.0   C       59.0   47.0   C
55.0   28.0   A       64.0   48.0   A
25.0   23.0   C       61.0   53.0   A
46.0   33.0   B       62.0   58.0   B
71.0   67.0   C       36.0   29.0   C
48.0   42.0   B       69.0   53.0   A
63.0   33.0   A       47.0   29.0   B
52.0   33.0   A       73.0   58.0   A
62.0   56.0   C       64.0   66.0   B
50.0   45.0   C       60.0   67.0   B
45.0   43.0   B       62.0   63.0   A
58.0   38.0   A       71.0   59.0   C
46.0   37.0   C       62.0   51.0   C
58.0   43.0   B       70.0   67.0   A
34.0   27.0   C       71.0   63.0   C


Answers

1. Linearity Test

[Scatter plot of infant mortality against birth rate]

The scatter plot above shows that there is a linear relation between the independent variable and the dependent variable.

Output:

Correlations

                                        Infant_mortality   Birth_rate
Pearson Correlation   Infant_mortality  1.000              .455
                      Birth_rate        .455               1.000

Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .455   .207       .146                17.66795

Coefficients

                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1 (Constant)     -4.123     17.300                                         -.238    .815
  Birth_Rate     1.174      .638               .455                        1.840    .089

a. The slope of the least squares line for the data is 1.174 (the Birth_Rate coefficient).

b. The y-intercept of the least squares line is -4.123 (the constant coefficient).
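The slope and intercept read from the SPSS output can be cross-checked by hand. The following is a minimal sketch in plain Python, assuming the 15 rows of the problem table are transcribed correctly; the function name `least_squares` is only illustrative and is not part of the original report.

```python
# Birth rate (x) and infant mortality (y) for the 15 locations in the problem table.
birth_rate = [24.88, 20.6, 23.33, 18.6, 27.09, 25.07, 31.98, 45.07,
              27.22, 19.64, 32.15, 15.59, 34.05, 21.56, 25.4]
infant_mortality = [10.36, 5.7, 14.08, 9.12, 33.48, 6.71, 54.0, 39.82,
                    10.71, 16.67, 58.21, 31.75, 24.47, 22.65, 61.05]


def least_squares(x, y):
    """Return (slope, intercept) of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(birth_rate, infant_mortality)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The printed values should agree with the Coefficients table to about three decimal places.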


d. Use the equation of the best fit line to calculate the expected infant mortality rate for a Pacific basin location with a birth rate of 30.

Given X = 30:

Y (infant mortality) = -4.123 + 1.174(30) = -4.123 + 35.22 = 31.097

e. Use the inverse of the best fit equation to calculate the expected birth rate for a Pacific basin location with an infant mortality rate of 40.

Given Y = 40:

Y = -4.123 + 1.174X

40 = -4.123 + 1.174X

44.123 = 1.174X

X = 37.583
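The two calculations in d and e can be expressed as a pair of small functions. This is a sketch that takes the rounded coefficients from the output as given; the names `predict_mortality` and `birth_rate_for` are hypothetical helpers introduced for illustration.

```python
# Rounded coefficients taken from the Coefficients table.
INTERCEPT = -4.123
SLOPE = 1.174


def predict_mortality(birth_rate):
    """Expected infant mortality for a given birth rate (part d)."""
    return INTERCEPT + SLOPE * birth_rate


def birth_rate_for(mortality):
    """Invert the fitted line: birth rate that gives the stated mortality (part e)."""
    return (mortality - INTERCEPT) / SLOPE


mortality_at_30 = predict_mortality(30)
birth_rate_at_40 = birth_rate_for(40)
print(round(mortality_at_30, 3), round(birth_rate_at_40, 3))
```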

f. The linear correlation coefficient for the data is 0.455.

g. The correlation is low because the correlation coefficient lies between 0 and 0.5.

h. The coefficient of determination is 0.207.

i. The variation in the birth rate explains 20.7% of the variation in the infant mortality data.
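The correlation coefficient and the coefficient of determination can also be recomputed from the raw data. The sketch below assumes the same 15 data rows as the problem table; `pearson_r` is an illustrative name.

```python
from math import sqrt

birth_rate = [24.88, 20.6, 23.33, 18.6, 27.09, 25.07, 31.98, 45.07,
              27.22, 19.64, 32.15, 15.59, 34.05, 21.56, 25.4]
infant_mortality = [10.36, 5.7, 14.08, 9.12, 33.48, 6.71, 54.0, 39.82,
                    10.71, 16.67, 58.21, 31.75, 24.47, 22.65, 61.05]


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(birth_rate, infant_mortality)
r_squared = r ** 2  # coefficient of determination for simple linear regression
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

In simple linear regression the coefficient of determination is the square of r, which is why 0.455 squared gives roughly 0.207.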

j. Yes, there is a relationship between birth rate and infant mortality, though it is not strong (r = 0.455).

k. Yes, a policy that decreases the birth rate would also likely decrease infant mortality, because the correlation between the two variables is positive.

l. The social policy I would recommend to reduce infant mortality in the FSM is one that targets the health of pregnant mothers, for example whether or not they smoke.

m. Given the population of the FSM = 110,000 and birth rate = 27.09 per 1000 people:

The number of babies expected to be born in the FSM this year = (27.09/1000) × 110,000 = 2979.9 → 2980 babies

n. Given the infant mortality rate of the FSM = 33.48 per 1000 live births:

The number of infants expected to die before they reach their first birthday = (33.48/1000) × 2980 = 99.77 → 100 infants
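The arithmetic in m and n is a pair of rate conversions, sketched below. The variable names are illustrative; the figures are the ones stated in the problem.

```python
population = 110_000
birth_rate_per_1000 = 27.09          # live births per 1000 people
infant_mortality_per_1000 = 33.48    # infant deaths per 1000 live births

# Part m: expected live births this year, rounded to whole babies.
expected_births = birth_rate_per_1000 / 1000 * population
rounded_births = round(expected_births)

# Part n: expected infant deaths among those births.
expected_deaths = infant_mortality_per_1000 / 1000 * rounded_births
rounded_deaths = round(expected_deaths)

print(rounded_births, rounded_deaths)
```

Note that the mortality rate is applied to live births, not to the whole population.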


2. Linearity Test

The scatter plot shows that the relationship between Y and X1 is linear and positive, meaning that Y increases as X1 increases. X2 is not included in the scatter plot because it is a qualitative variable.

Because the independent variable X2 is a qualitative variable with three categories (methods A, B, and C), two dummy variables D1 and D2 are created, with method C as the reference category:

• D1 = 1 for method A, 0 otherwise (B and C)
• D2 = 1 for method B, 0 otherwise (A and C)
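The coding rule above can be written as a small function. This is a sketch only; `dummy_code` is a hypothetical helper name and not something produced by SPSS.

```python
def dummy_code(method):
    """Return (D1, D2) for a treatment method, with method C as the reference category."""
    if method not in ("A", "B", "C"):
        raise ValueError(f"unknown method: {method!r}")
    d1 = 1 if method == "A" else 0
    d2 = 1 if method == "B" else 0
    return d1, d2


for method in "ABC":
    print(method, dummy_code(method))
```

Three categories need only two dummy variables: method C is identified by D1 = 0 and D2 = 0.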

A multiple linear regression analysis that includes D1 and D2 was then carried out, with the following output:

Model Summary

Model R R Square Adjusted R Square

1 .885 .784 .764


From the table above:

• R = 0.885, which indicates a fairly strong relationship between the independent variables and the dependent variable.

• R Square = 0.784, which shows that the regression model with three independent variables (X1, D1, and D2) explains 78.4% of the variation in the dependent variable (treatment effectiveness).

Overall Test:

ANOVA

Model           Sum of Squares   df   Mean Square   F        Sig.
1 Regression    4229.425         3    1409.808      38.705   .000
  Residual      1165.575         32   36.424
  Total         5395.000         35

Hypotheses:
H0: β1 = β2 = β3 = 0 (the model is not adequate)
H1: at least one βj ≠ 0, j = 1, 2, 3 (the model is adequate)

Significance level: α

Test statistic: p-value = 0.000

Rejection region: H0 is rejected if p-value < α

Conclusion: Because p-value = 0.000 < α, H0 is rejected, which means that the model is adequate for use.
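The F statistic and the R Square values in the output follow directly from the sums of squares in the ANOVA table. The sketch below reproduces them from the printed figures; the variable names are illustrative.

```python
# Figures taken from the ANOVA table.
ss_regression = 4229.425
ss_residual = 1165.575
df_regression = 3
df_residual = 32

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of observations
r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

print(f"F = {f_statistic:.3f}")
print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
```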

Partial Test:

Coefficients

                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1 (Constant)     22.291     3.505                                          6.359    .000
  X1             .664       .070               .783                        9.522    .000
  dummy1         10.253     2.465              .395                        4.159    .000
  dummy2         .445       2.464              .017                        .181     .858
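The coefficients in the table above can be cross-checked without SPSS by solving the normal equations (X'X)b = X'y. The following is a minimal plain-Python sketch, assuming the 36 observations are transcribed correctly from the problem table and that method C is the reference category; `solve` is an illustrative helper, not a library function.

```python
# (Y, X1, method) for the 36 patients in the problem table.
patients = [
    (56, 21, "A"), (65, 43, "A"), (41, 23, "B"), (55, 45, "B"),
    (40, 30, "B"), (57, 48, "B"), (28, 19, "C"), (59, 47, "C"),
    (55, 28, "A"), (64, 48, "A"), (25, 23, "C"), (61, 53, "A"),
    (46, 33, "B"), (62, 58, "B"), (71, 67, "C"), (36, 29, "C"),
    (48, 42, "B"), (69, 53, "A"), (63, 33, "A"), (47, 29, "B"),
    (52, 33, "A"), (73, 58, "A"), (62, 56, "C"), (64, 66, "B"),
    (50, 45, "C"), (60, 67, "B"), (45, 43, "B"), (62, 63, "A"),
    (58, 38, "A"), (71, 59, "C"), (46, 37, "C"), (62, 51, "C"),
    (58, 43, "B"), (70, 67, "A"), (34, 27, "C"), (71, 63, "C"),
]


def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Design matrix rows: [1, X1, D1, D2].
rows = [[1.0, float(x1), 1.0 if g == "A" else 0.0, 1.0 if g == "B" else 0.0]
        for _, x1, g in patients]
ys = [float(y) for y, _, _ in patients]

k = 4
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]

b0, b1, b_d1, b_d2 = solve(xtx, xty)
print(f"constant = {b0:.3f}, X1 = {b1:.3f}, D1 = {b_d1:.3f}, D2 = {b_d2:.3f}")
```

For a model this small the normal equations are adequate; a statistics package would normally use a more numerically stable decomposition.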


• Hypotheses (each coefficient is tested separately):
H0: β0 = 0, β1 = 0, β2 = 0, β3 = 0
H1: β0 ≠ 0, β1 ≠ 0, β2 ≠ 0, β3 ≠ 0

Significance level: α

Test statistic: the p-value for each regression coefficient.

Rejection region: H0 is rejected if p-value < α

Conclusions:

• Because p-value = 0.000 < α, H0 is rejected, which means that the constant belongs in the regression model.

• Because p-value = 0.000 < α, H0 is rejected, which means that the regression coefficient of the independent variable X1 belongs in the regression model.

• Because p-value = 0.000 < α, H0 is rejected, which means that the regression coefficient of D1 belongs in the regression model.

• Because p-value = 0.858 > α, H0 is not rejected, which means that the regression coefficient of D2 is not significant in the regression model.

However, although the coefficient of D2 is not significant while that of D1 is, the coefficient of D2 is still kept in the regression equation because both dummy variables come from the same variable, X2.
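Each t value in the Coefficients table is the coefficient divided by its standard error, and each decision compares the p-value with α. The sketch below recomputes the ratios from the rounded table entries, so small differences from the printed t values are expected. The value `ALPHA = 0.05` is an assumption, since the significance level is not legible in the report.

```python
ALPHA = 0.05  # assumption: the report's significance level is not legible

# name: (B, Std. Error, Sig.) as printed in the Coefficients table.
coefficients = {
    "(Constant)": (22.291, 3.505, 0.000),
    "X1": (0.664, 0.070, 0.000),
    "dummy1": (10.253, 2.465, 0.000),
    "dummy2": (0.445, 2.464, 0.858),
}

# t ratio = coefficient / standard error.
t_ratios = {name: b / se for name, (b, se, _) in coefficients.items()}

# H0 (coefficient equals zero) is rejected when the p-value is below ALPHA.
rejected = {name: p < ALPHA for name, (_, _, p) in coefficients.items()}

for name in coefficients:
    decision = "reject H0" if rejected[name] else "do not reject H0"
    print(f"{name:12s} t = {t_ratios[name]:6.3f}  {decision}")
```

The ratio for X1 comes out near 9.49 rather than the printed 9.522 because the standard error is shown rounded to .070.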

The estimated regression equation is therefore:

Y = 22.291 + 0.664 X1 + 10.253 D1 + 0.445 D2

To see the effect of each of the three methods in this equation, we examine the estimated regression model for each method:

• For method A, with D1 = 1 and D2 = 0:

Y = 22.291 + 0.664 X1 + 10.253 + 0

Y = 32.544 + 0.664 X1

• For method B, with D1 = 0 and D2 = 1:

Y = 22.291 + 0.664 X1 + 0 + 0.445

Y = 22.736 + 0.664 X1

• For method C, with D1 = 0 and D2 = 0:

Y = 22.291 + 0.664 X1 + 0 + 0

Y = 22.291 + 0.664 X1

These results show that the three methods share the same slope, 0.664, but have different intercepts: treatment method A has the largest intercept, 32.544, followed by method B at 22.736 and method C at 22.291.

Because the reference category is C, we compare method A with C and method B with C. To determine which method gives the greatest treatment effectiveness, we also compare methods A and B.

For methods A and C, the difference is 10.253 units, which indicates that method A is more effective than C by 10.253.

For methods B and C, the difference is 0.445 units, which indicates that method B is more effective than C by 0.445, although this difference is not statistically significant (p-value = 0.858).

For methods A and B, the difference is (32.544 - 22.736) = 9.808 units, which indicates that method A is more effective than B by 9.808.
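The per-method intercepts and the pairwise differences can be computed directly from the fitted coefficients. This is a sketch using the rounded coefficients from the output; `intercept_for` and `predict` are hypothetical helper names.

```python
# Rounded coefficients from the Coefficients table.
B0, B1, B_D1, B_D2 = 22.291, 0.664, 10.253, 0.445

# Dummy coding with method C as the reference category.
DUMMIES = {"A": (1, 0), "B": (0, 1), "C": (0, 0)}


def intercept_for(method):
    """Intercept of the fitted line for one treatment method."""
    d1, d2 = DUMMIES[method]
    return B0 + B_D1 * d1 + B_D2 * d2


def predict(age, method):
    """Predicted effectiveness for a patient of the given age and method."""
    return intercept_for(method) + B1 * age


intercepts = {m: round(intercept_for(m), 3) for m in "ABC"}
a_minus_c = round(intercept_for("A") - intercept_for("C"), 3)
b_minus_c = round(intercept_for("B") - intercept_for("C"), 3)
a_minus_b = round(intercept_for("A") - intercept_for("B"), 3)

print(intercepts)
print(a_minus_c, b_minus_c, a_minus_b)
```

Because the slope is common to all three methods, each difference between intercepts is also the difference in predicted effectiveness at any fixed age, and the A versus B difference equals 10.253 - 0.445.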

In other words, for the same (fixed) value of X1, the patient's age, treatment method A has the highest level of effectiveness compared with the two