
Business Statistics 1 (Statistik Bisnis 1). Week 11: Sampling and Sampling Method


Academic year: 2021


Learning Objectives

In this chapter, you learn:

• To distinguish between different sampling methods

• The concept of the sampling distribution

• To compute probabilities related to the sample mean and the sample proportion

Why Sample?

• Selecting a sample is less time-consuming than selecting every item in the population (census).

• Selecting a sample is less costly than selecting every item in the population.

• An analysis of a sample is less cumbersome and more practical than an analysis of the entire population.

A Sampling Process Begins With A Sampling Frame

• The sampling frame is a listing of items that make up the population

• Frames are data sources such as population lists, directories, or maps

• Inaccurate or biased results can occur if a frame excludes certain portions of the population

• Using different frames to generate data can lead to dissimilar conclusions

Types of Samples

Samples

• Nonprobability samples: judgment, convenience

• Probability samples: simple random, systematic, stratified, cluster

Types of Samples:

Nonprobability Sample

• In a nonprobability sample, items included are chosen without regard to their probability of occurrence.

– In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.

– In a judgment sample, you get the opinions of pre-selected experts in the subject matter.


Types of Samples:

Probability Sample

• In a probability sample, items in the sample are chosen on the basis of known probabilities.

Probability samples: simple random, systematic, stratified, cluster

Probability Sample:

Simple Random Sample

• Every individual or item from the frame has an equal chance of being selected

• Selection may be with replacement (selected individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).

• Samples are obtained from a table of random numbers or from computer random number generators.
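The two selection modes described above (with and without replacement) can be sketched with a computer random number generator. This is a minimal illustration, not part of the original slides: the frame of 850 numbered items, the seed, and all variable names are assumptions.

```python
import random

# Hypothetical frame: items numbered 1..850 (an assumption for illustration).
frame = list(range(1, 851))

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Without replacement: a selected item is not returned to the frame,
# so no item can appear twice in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: a selected item is returned to the frame,
# so the same item may be selected again.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```

Every item has the same chance of selection in both calls; only the possibility of repeats differs.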

Selecting a Simple Random Sample Using A Random Number Table

Sampling frame for a population with 850 items:

Item Name    Item #
Bev R.       001
Ulan X.      002
...          ...
Joann P.     849
Paul F.      850

Portion of a random number table:

49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401

The first 5 items in a simple random sample (reading three digits at a time):

Item # 492
Item # 808
Item # 892 -- does not exist so ignore
Item # 435
Item # 779
Item # 002
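The table-reading procedure above can be reproduced in a few lines. The sketch below is an illustration only: the function name `select_from_table` is hypothetical, and skipping already-selected items (sampling without replacement) is an assumption the slide does not state.

```python
# Portion of the random number table from the slide, read left to right.
table = ("49280 88924 35779 00283 81163 07275 11100 02340 12860 "
         "74697 96644 89439 09893 23997 20048 49420 88872 08401")

def select_from_table(table_text, frame_size, sample_size):
    """Read three-digit codes from the table and keep valid, unused item numbers."""
    digits = "".join(table_text.split())
    selected = []
    for start in range(0, len(digits) - 2, 3):
        item = int(digits[start:start + 3])
        # Skip codes with no matching item (000 or above the frame size)
        # and items already chosen (assumed: sampling without replacement).
        if 1 <= item <= frame_size and item not in selected:
            selected.append(item)
        if len(selected) == sample_size:
            break
    return selected

first_five = select_from_table(table, frame_size=850, sample_size=5)
print(first_five)  # [492, 808, 435, 779, 2]
```

Code 892 is skipped because the frame ends at item 850, which matches the slide.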

Probability Sample:

Systematic Sample

• Decide on sample size: n

• Divide frame of N individuals into groups of k individuals: k=N/n

• Randomly select one individual from the 1st group

• Select every kth individual thereafter

Example: N = 40, n = 4, k = 10

[Figure: frame of N = 40 individuals, with the first group of k = 10 labeled]
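The systematic sampling steps above can be sketched as follows, using the slide's values N = 40 and n = 4. The function name, the seed, and the assumption that N is an exact multiple of n are illustrative choices, not part of the slides.

```python
import random

def systematic_sample(frame, n, rng=random):
    """Pick one random start in the first group, then every k-th item after it."""
    N = len(frame)
    k = N // n                      # group size, k = N / n (assumes n divides N)
    start = rng.randrange(k)        # random position within the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 41))          # N = 40 individuals, numbered 1..40
sample = systematic_sample(frame, n=4, rng=random.Random(3))
print(sample)
```

Whatever the random start, consecutive selections are exactly k = 10 positions apart.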

Probability Sample:

Stratified Sample

• Divide population into two or more subgroups (called strata) according to some common characteristic

• A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes

• Samples from subgroups are combined into one

• This is a common technique when sampling populations of voters, stratifying across racial or socio-economic lines.

[Figure: population divided into 4 strata]
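A proportional stratified sample, as described above, might be drawn like this. The strata names and sizes are hypothetical, and rounding the per-stratum sample sizes is a simplifying assumption (with other sizes the total could differ slightly from n).

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion to it."""
    total = sum(len(items) for items in strata.values())
    combined = []
    for name, items in strata.items():
        # Proportional allocation; rounding may shift the total slightly.
        size = round(n * len(items) / total)
        combined.extend(rng.sample(items, size))
    return combined

# Hypothetical population of 100 people in 4 strata (sizes are illustrative).
strata = {
    "A": [f"A{i}" for i in range(40)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(20)],
    "D": [f"D{i}" for i in range(10)],
}
sample = stratified_sample(strata, n=20, rng=random.Random(5))
print(len(sample))
```

With these sizes the allocation is 8, 6, 4, and 2 items, so every stratum is represented in the combined sample.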

Probability Sample:

Cluster Sample

• Population is divided into several “clusters,” each representative of the population

• A simple random sample of clusters is selected

• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique

• A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.

[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
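Cluster sampling as described above can be sketched by sampling cluster labels rather than individual items. The 16 clusters match the figure; the cluster contents, the number of clusters selected, and the function name are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every item in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    items = [item for name in chosen for item in clusters[name]]
    return chosen, items

# Hypothetical population: 16 clusters of 5 items each (sizes are illustrative).
clusters = {c: [f"cluster{c}-item{i}" for i in range(5)] for c in range(1, 17)}
chosen, items = cluster_sample(clusters, n_clusters=4, rng=random.Random(7))
print(chosen, len(items))
```

This version keeps all items in each selected cluster; a second-stage sample within each cluster would be the alternative mentioned in the slide.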

Probability Sample:

Comparing Sampling Methods

• Simple random sample and Systematic sample

– Simple to use

– May not be a good representation of the population’s underlying characteristics

• Stratified sample

– Ensures representation of individuals across the entire population

• Cluster sample

– More cost effective

– Less efficient (need larger sample to acquire the same level of precision)


Evaluating Survey Worthiness

• What is the purpose of the survey?

• Is the survey based on a probability sample?

• Coverage error – appropriate frame?

• Nonresponse error – follow up

• Measurement error – good questions elicit good responses

Types of Survey Errors

• Coverage error or selection bias

– Exists if some groups are excluded from the frame and have no chance of being selected

• Nonresponse error or bias

– People who do not respond may be different from those who do respond

• Sampling error

– Variation from sample to sample will always exist

• Measurement error

– Due to weaknesses in question design, respondent error, and interviewer’s effects on the respondent (“Hawthorne effect”)

Types of Survey Errors (continued)

• Coverage error: excluded from frame

• Nonresponse error: follow up on nonresponses

• Sampling error: random differences from sample to sample

• Measurement error: bad or leading question

Sampling Distributions

• A sampling distribution is a distribution of all of the possible values of a sample statistic for a given size sample selected from a population.

• For example, suppose you sample 50 students from your college regarding their mean GPA. If you obtained many different samples of 50, you would compute a different mean for each sample. We are interested in the distribution of all the potential mean GPAs we might calculate for samples of 50 students.

Developing a Sampling Distribution

• Assume there is a population …

• Population size N = 4

• Random variable, X, is age of individuals

• Values of X: 18, 20, 22, 24 (years) for individuals A, B, C, D

Developing a Sampling Distribution (continued)

Summary measures for the population distribution:

$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$

[Figure: uniform distribution, P(x) versus x, for A, B, C, D at x = 18, 20, 22, 24]

Developing a Sampling Distribution (continued)

16 possible samples (sampling with replacement):

1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24

16 sample means:

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24

Developing a Sampling Distribution (continued)

Sampling distribution of all sample means: the 16 sample means take the values 18, 19, 20, 21, 22, 23, and 24.

[Figure: sample means distribution, P($\bar{X}$) versus $\bar{X}$ from 18 to 24 (no longer uniform)]

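Developing a Sampling Distribution (continued)

Summary measures of this sampling distribution (here N = 16, the number of sample means):

$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$

$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$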
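The summary measures above can be verified by enumerating all 16 samples of size n = 2 from the four ages. This is a small check using only the slide's values; the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pstdev

population = [18, 20, 22, 24]            # ages of A, B, C, D

# All 16 ordered samples of size n = 2, sampling with replacement.
samples = list(product(population, repeat=2))
sample_means = [fmean(s) for s in samples]

mu = fmean(population)                   # population mean
sigma = pstdev(population)               # population standard deviation
mu_xbar = fmean(sample_means)            # mean of the 16 sample means
sigma_xbar = pstdev(sample_means)        # standard deviation of the sample means

print(mu, round(sigma, 3))               # 21.0 2.236
print(mu_xbar, round(sigma_xbar, 2))     # 21.0 1.58
```

The sample means are centered on the population mean, and their standard deviation equals σ/√2, smaller than σ.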

Comparing the Population Distribution to the Sample Means Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$

Sample means distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) versus X for the population (A, B, C, D at 18, 20, 22, 24) beside P($\bar{X}$) versus $\bar{X}$ for the sample means (18 to 24)]

Sample Mean Sampling Distribution:

Standard Error of the Mean

• Different samples of the same size from the same population will yield different sample means

• A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

(This assumes that sampling is with replacement or sampling is without replacement from an infinite population)

• Note that the standard error of the mean decreases as the sample size increases
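The effect of sample size on the standard error of the mean, σ divided by the square root of n, can be seen with a few values. The function name and σ = 10 are assumptions chosen only for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

# Illustrative values: sigma = 10 is an assumption, not taken from the slides.
for n in (4, 25, 100, 400):
    print(n, standard_error(10, n))   # standard errors: 5.0, 2.0, 1.0, 0.5
```

Quadrupling the sample size halves the standard error.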

Sample Mean Sampling Distribution:

If the Population is Normal

• If a population is normally distributed with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed with

$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Z-value for Sampling Distribution of the Mean

• Z-value for the sampling distribution of $\bar{X}$:

$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

where: $\bar{X}$ = sample mean

μ = population mean

σ = population standard deviation

n = sample size

Sampling Distribution Properties

• $\mu_{\bar{X}} = \mu$ (i.e. $\bar{X}$ is unbiased)

[Figure: normal population distribution centered at μ, above a normal sampling distribution of $\bar{X}$ that has the same mean, $\mu_{\bar{X}}$]

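Sampling Distribution Properties (continued)

• As n increases, $\sigma_{\bar{X}}$ decreases

[Figure: sampling distributions of $\bar{X}$ centered at μ, narrower for the larger sample size and wider for the smaller sample size]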

Example

Oxford Cereals fills thousands of boxes of cereal during an eight-hour shift. As the operations manager, you are responsible for monitoring the amount of cereal placed in each box. To be consistent with the label on the box, the boxes should contain a mean of 368 grams of cereal. Because of the speed of the process, the cereal weight varies from box to box, so some boxes are underfilled and others are overfilled. If the process is not working properly, the mean weight of the boxes could deviate too much from the label weight of 368 grams.

Example

Because weighing every box would be too time-consuming, costly, and inefficient, you must take a sample. For each sample you select, you plan to weigh the individual boxes and calculate the sample mean. You need to determine the probability that such a sample mean could have come from a population whose mean is 368 grams. Based on your analysis, you must decide whether to maintain, adjust, or shut down the cereal-filling process.

Example

a. If you randomly select 25 boxes without replacement from the thousands of boxes filled during a shift, the sample is far smaller than 5% of the population. Given that the standard deviation of the cereal-filling process is 15 grams, compute the standard error of the mean.

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{25}} = 3$

Example

b. How is the standard error of the mean affected by increasing the sample size from 25 to 100 boxes?

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{100}} = 1.5$

Example

c. If you select 100 boxes, what is the probability that the sample mean is below 365 grams?

$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{365 - 368}{15 / \sqrt{100}} = \frac{-3}{1.5} = -2$

$P(\bar{X} < 365) = P(Z < -2) = 0.0228$
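The probability in part (c) can be checked numerically instead of with a Z table. This sketch uses the example's values and Python's standard-library normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 100          # values from the example
x_bar = 365

standard_error = sigma / sqrt(n)     # 15 / 10 = 1.5
z = (x_bar - mu) / standard_error    # (365 - 368) / 1.5 = -2.0
prob = NormalDist().cdf(z)           # P(Z < -2)

print(z, round(prob, 4))             # -2.0 0.0228
```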

Example

d. Find an interval symmetrically distributed around the population mean that includes 95% of the sample means, for samples of 25 boxes.

Thus, with lower and upper limits $\bar{X}_L$ and $\bar{X}_U$ on either side of 368 enclosing 95% of the sample means:

$P(\bar{X} < \bar{X}_L) = 0.025$

$P(\bar{X} < \bar{X}_U) = 0.975$

Example

$P(\bar{X} < \bar{X}_L) = 0.025$, so $Z_L = -1.96$; $P(\bar{X} < \bar{X}_U) = 0.975$, so $Z_U = 1.96$

$-1.96 = \frac{\bar{X}_L - 368}{15 / \sqrt{25}}$, so $\bar{X}_L = 368 - 1.96 \cdot 3 = 362.12$

$1.96 = \frac{\bar{X}_U - 368}{15 / \sqrt{25}}$, so $\bar{X}_U = 368 + 1.96 \cdot 3 = 373.88$
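The 95% interval in part (d) can be reproduced by looking up the 0.975 quantile of the standard normal distribution and scaling it by the standard error. The values come from the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25            # values from the example
standard_error = sigma / sqrt(n)      # 15 / 5 = 3.0

z_upper = NormalDist().inv_cdf(0.975) # about 1.96
lower = mu - z_upper * standard_error
upper = mu + z_upper * standard_error

print(round(z_upper, 2), round(lower, 2), round(upper, 2))  # 1.96 362.12 373.88
```

By symmetry the lower limit uses the same Z value with the opposite sign.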

Exercise 1

The U.S. Census Bureau announced that the median sales price of new houses in 2009 was $215,600 and the mean sales price was $270,100 (www.census.gov/newhomesales, March 30, 2010). Assume that the standard deviation of the sales prices is $90,000.

a. If you select samples of n = 2, what is the shape of the sampling distribution of $\bar{X}$?

b. If you select samples of n = 100, what is the shape of the sampling distribution of $\bar{X}$?

c. If you select a sample of n = 100, what is the probability that the sample mean will be less than $300,000?

d. If you select a sample of n = 100, what is the probability that the sample mean will be between $275,000 and $290,000?

Exercise 2

The time spent using e-mail per session is normally distributed, with μ = 8 minutes and σ = 2 minutes. If you select a random sample of 25 sessions,

a. What is the probability that the sample mean is between 7.8 and 8.2 minutes?

b. What is the probability that the sample mean is between 7.5 and 8 minutes?

c. If you select a random sample of 100 sessions, what is the probability that the sample mean is between 7.8 and 8.2 minutes?

Exercise 3

The amount of time a bank teller spends serving each customer has a mean of μ = 3.10 minutes and a standard deviation of σ = 0.40 minutes. If you select a random sample of 16 customers,

a. What is the probability that the mean time spent per customer is at least 3 minutes?

b. There is an 85% chance that the sample mean is less than how many minutes?

c. What assumption must you make in order to solve (a) and (b)?

d. If you select a random sample of 64 customers, there is an 85% chance that the sample mean is less than how many minutes?

Central Limit Theorem

As the sample size gets large enough (n↑), the sampling distribution of $\bar{X}$ becomes almost normal regardless of the shape of the population

Sample Mean Sampling Distribution:

If the Population is not Normal


How Large is Large Enough?

• For most distributions, n > 30 will give a sampling distribution that is nearly normal

• For fairly symmetric distributions, n > 15 will usually give a sampling distribution that is almost normal

• For normal population distributions, the sampling distribution of the mean is always normally distributed
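The Central Limit Theorem guideline above can be explored with a small simulation. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1, a sample size of n = 30, and 5000 repetitions; all of these choices are illustrative and not from the slides.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(2021)             # fixed seed so the sketch is repeatable
n, repetitions = 30, 5000

# Right-skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

print(round(fmean(sample_means), 3))   # close to the population mean, 1
print(round(pstdev(sample_means), 3))  # close to sigma / sqrt(n)
print(round(1 / sqrt(n), 3))           # 0.183
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.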

Population Proportions

π = the proportion of the population having some characteristic

Sample proportion ( p ) provides an estimate of π:

$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$

• 0 ≤ p ≤ 1

• p is approximately distributed as a normal distribution when n is large

• (assuming sampling with replacement from a finite population or without replacement from an infinite population)

Sampling Distribution of p

• Approximated by a normal distribution if:

$n\pi \geq 5$ and $n(1 - \pi) \geq 5$

where

$\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$

(where π = population proportion)

[Figure: sampling distribution, P(p) versus p from 0 to 1]
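The normal-approximation condition and the standard error of the sample proportion can be written as two small helpers. The function names and the input values below are assumptions for illustration only.

```python
from math import sqrt

def normal_approximation_ok(n, pi):
    """Rule of thumb from the slide: n*pi >= 5 and n*(1 - pi) >= 5."""
    return n * pi >= 5 and n * (1 - pi) >= 5

def proportion_standard_error(n, pi):
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)

# Illustrative values (assumptions, not taken from the slides).
print(normal_approximation_ok(50, 0.5))               # True: 25 and 25
print(normal_approximation_ok(50, 0.05))              # False: n*pi = 2.5
print(round(proportion_standard_error(100, 0.5), 3))  # 0.05
```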

Z-Value for Proportions

$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$

Example

• The manager of a local bank determines that 40% of its customers have more than one account.

• If you select a random sample of 200 customers, because nπ = 200(0.40) = 80 ≥ 5 and n(1 – π) = 200(0.60) = 120 ≥ 5, the sample size is large enough to assume that the sampling distribution of the proportion is approximately normal

• Compute the probability that the sample proportion of customers with more than one account is less than 0.30.

Example

$Z = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.30 - 0.40}{\sqrt{\frac{(0.40)(0.60)}{200}}} = \frac{-0.10}{\sqrt{\frac{0.24}{200}}} = -2.89$

P(Z < -2.89) = 0.0019

If the population proportion is 0.40, only 0.19% of samples of size n = 200 will have a sample proportion less than 0.30.
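The bank example above can be checked numerically. This sketch uses the example's values; rounding Z to two decimals before the lookup is an assumption that mirrors reading a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

pi, n, p = 0.40, 200, 0.30               # values from the example

sigma_p = sqrt(pi * (1 - pi) / n)        # sqrt(0.24 / 200)
z = (p - pi) / sigma_p                   # about -2.89
prob = NormalDist().cdf(round(z, 2))     # table lookup uses Z rounded to two decimals

print(round(z, 2), round(prob, 4))       # -2.89 0.0019
```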

Exercise 4

An independent polling agency conducts a quick count of election results. Suppose there are two candidates. If a candidate receives at least 55% of the vote in the sample, that candidate will be forecast as the winner of the election. If you select a random sample of 100 voters, what is the probability that a candidate will be forecast as the winner when

a. The population percentage of the candidate's vote is 50.1%?

b. The population percentage of the candidate's vote is 60%?

c. The population percentage of the candidate's vote is 49% (and the candidate actually loses the election)?

d. If the sample size is increased to 400,

Exercise 5

In a recent survey of full-time female workers ages 22 to 35 years, 46% said that they would rather have their salary reduced in exchange for more free time. (Data extracted from “I’d Rather Give Up,” USA Today, March 4, 2010, p. 1B.) Suppose you select a sample of 100 full-time female workers ages 22 to 35 years.

a. What is the probability that, in the sample, fewer than 50% would rather have their salary reduced in exchange for more free time?

b. What is the probability that, in the sample, between 40% and 50% would rather have their salary reduced in exchange for more free time?

c. What is the probability that, in the sample, more than 40% would rather have their salary reduced in exchange for more free time?

d. If the sample size becomes 400, what is the change in
