Small Area Estimation Research Activity at Bogor Agricultural University 1)
Khairil A. Notodiputro, Anang Kurnia
Department of Statistics, Bogor Agricultural University, Jl. Meranti, Wing 22 Level 4, Kampus IPB Darmaga, Bogor
1) Paper presented at the Seminar on Use, Analysis and Application of Small Area Statistics, March 7, 2007, BPS-JICA.
Definition:
A sub-population is small if the domain-specific sample size is not large enough to support direct estimates of adequate precision.
Small area, small domain, local area:
Small geographical area
Domain: age-sex-race, poverty status
Direct estimates: Use area-specific sample data only.
Indirect estimates: Borrow strength from sample observations of related areas through auxiliary data (recent census and current administrative records) to increase the effective sample size.
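The contrast between the two kinds of estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data, the area names, and the function names are hypothetical, and the indirect estimate shown is a simple ratio-synthetic estimator, chosen as one possible way to borrow strength, not necessarily the method used in the studies described here.

```python
from statistics import mean

# Hypothetical survey data: area -> list of (x, y) pairs for the sampled units.
sample = {
    "A": [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.0)],
    "B": [(2.0, 4.5)],  # a single sampled unit: a "small" area
}
# Hypothetical known population means of the auxiliary variable x
# (for example, taken from a recent census).
x_pop_mean = {"A": 2.5, "B": 3.0}


def direct_estimate(area):
    """Mean of y using only the units sampled in this area."""
    return mean(y for _, y in sample[area])


def synthetic_estimate(area):
    """Indirect estimate: a y/x ratio pooled over all areas,
    applied to this area's known population mean of x."""
    all_units = [unit for units in sample.values() for unit in units]
    ratio = sum(y for _, y in all_units) / sum(x for x, _ in all_units)
    return ratio * x_pop_mean[area]


for area in sample:
    print(area, direct_estimate(area), synthetic_estimate(area))
```

For area "B" the direct estimate rests on one observation, while the synthetic estimate uses all five sampled units plus the auxiliary mean, at the price of assuming that the pooled ratio holds in every area.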
What and why "small area"
Can we minimize or even eliminate the use of indirect estimates?
Attention to SAE has grown along with increasing demand from government and the private sector for accurate and timely information, not only at the national level (large domains) but also for small domains such as sub-districts.
In Indonesia, developing SAE is important because public policy decision-making is moving from centralization to decentralization, so that local governments can manage their districts, allocate their funds, and carry out regional planning well. Decision makers in local government will therefore require statistics for their own districts.
Statistics Indonesia (BPS) regularly conducts surveys such as SUSENAS, but these are designed for national-level estimates.
There are essentially two types of SAE models:
Basic area level model, which relates small area direct estimators to area-specific auxiliary data.
Basic unit level model, in which information is available at the sampling unit level and modeling is based on individual data.
Consider the following Fay-Herriot (1979) model for the area level:
$y_i = x_i'\beta + \upsilon_i + e_i$,
where $\upsilon_i$ and $e_i$ are independent with $\upsilon_i \sim N(0, A)$ and $e_i \sim N(0, D_i)$ for $i = 1, 2, \ldots, k$.
We assume that $\beta$ and $A$ are unknown but the $D_i$ are known.
Small Area Model
Model description:
1. $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ → auxiliary data
2. $\theta_i = x_i'\beta + \upsilon_i$ → the parameter, a function of the auxiliary data and the random effect $\upsilon_i$
3. $y_i = \theta_i + e_i$ → direct estimate with sampling error
4. $y_i = x_i'\beta + \upsilon_i + e_i$ → a special case of the GLMM
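To make the four steps of the model description concrete, the following sketch draws data from the area level model using only the Python standard library. The function name `simulate_fay_herriot` and all input values are hypothetical; the code simply follows the model as stated (normal random effects with variance A, normal sampling errors with known variances D_i).

```python
import random


def simulate_fay_herriot(x, beta, A, D, seed=0):
    """Draw (theta, y) from y_i = x_i'beta + v_i + e_i,
    with v_i ~ N(0, A) and e_i ~ N(0, D_i).

    x    : list of covariate vectors, one per area
    beta : list of regression coefficients
    A    : variance of the area random effect
    D    : list of known sampling variances, one per area
    """
    rng = random.Random(seed)
    theta, y = [], []
    for x_i, D_i in zip(x, D):
        xb_i = sum(a * b for a, b in zip(x_i, beta))  # x_i' beta
        v_i = rng.gauss(0.0, A ** 0.5)                # area random effect
        e_i = rng.gauss(0.0, D_i ** 0.5)              # sampling error
        theta.append(xb_i + v_i)                      # step 2: theta_i
        y.append(xb_i + v_i + e_i)                    # steps 3 and 4: y_i
    return theta, y


# Hypothetical example: five areas, an intercept and one covariate.
x = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9], [1.0, 1.3], [1.0, 1.8]]
theta, y = simulate_fay_herriot(x, beta=[2.0, 1.5], A=1.0,
                                D=[0.5, 1.0, 1.5, 2.0, 2.5])
```

Areas with larger D_i get noisier direct estimates y_i around the same kind of underlying theta_i, which is the situation the indirect estimators are meant to address.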
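The best predictor (BP) of $\theta_i = x_i'\beta + \upsilon_i$, if $\beta$ and $A = \sigma^2_\upsilon$ are known, is given by:
$\hat\theta_i^{BP} = x_i'\beta + (1 - B_i)(y_i - x_i'\beta)$
where $B_i = \sigma^2_{e_i} / (\sigma^2_\upsilon + \sigma^2_{e_i})$.
The best predictor is equivalent to the empirical Bayes approach for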
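The role of B_i can be illustrated with a short sketch. It assumes the usual shrinkage form of the best predictor under the normal area level model, in which the predictor is a weighted average of the direct estimate y_i (weight 1 - B_i) and the regression part x_i'β (weight B_i), with σ²_υ = A and σ²_ei = D_i. The function name and the numbers are hypothetical.

```python
def best_predictor(y_i, xb_i, A, D_i):
    """Best predictor of theta_i when beta and A are known.

    y_i  : direct estimate for area i
    xb_i : regression part x_i' beta
    A    : variance of the random effect
    D_i  : known sampling variance of y_i
    """
    B_i = D_i / (A + D_i)  # shrinkage factor, between 0 and 1
    # Weighted average: (1 - B_i) on the direct estimate, B_i on x_i' beta.
    return (1.0 - B_i) * y_i + B_i * xb_i


# Equal variances: the predictor sits halfway between 10 and 4.
print(best_predictor(10.0, 4.0, A=1.0, D_i=1.0))
# Large sampling variance: the predictor moves most of the way to x_i' beta.
print(best_predictor(10.0, 4.0, A=1.0, D_i=9.0))
```

When D_i is small relative to A the direct estimate is trusted; when D_i is large the predictor leans on the auxiliary data.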
1. Estimates of small area characteristics based on fixed effect models are referred to as synthetic estimators (Levy and French, 1977), composite estimators (Schaible et al., 1977), and prediction estimators (Holt et al., 1979; Sarndal, 1984; Marker, 1999).
2. Mixed models have been used to improve estimation of small area characteristics based on survey sampling or census data by Fay and Herriot (1979), Ghosh and Rao (1994), Rao (1999) and Pfeffermann (1999).
3. In addition to EBLUP, empirical Bayes (EB) and hierarchical Bayes (HB) estimation and inference methods have also been applied to small area estimation.
4. Ghosh and Rao (1994) review the application of these estimation methods. Maiti (1998) used non-informative priors for hyperparameters in HB methods, and You and Rao (2000) used HB methods to estimate small area means under random effect models.
A Brief Review of SAE Techniques
6. A general approach to SAE based on GLM is described in Ghosh et al. (1998) and Malec et al. (1999). Farrell et al. (1997) extended the mixed logistic model, and Moura and Migon (2001) extended it further by introducing a component to account for spatially correlated structure in binary response data.
7. Measures of uncertainty for EBLUP and EB have been developed in recent years. Rao (2003) described the simulation study of Jiang, Lahiri and Wan (2002), who reported results on the relative performance of MSE estimators under the simple model.
8. Authors concerned with measures of uncertainty include Butar and Lahiri (2001, 2003) on bootstrap methods; Wang and Fuller (2003) and Rivest and Vandal (2003) on unknown sampling variances; and Rao (2003) and Jiang, Lahiri and Wan (2003) on the jackknife.
1. Empirical Best Linear Unbiased Predictor (EBLUP)
→ Estimation of the variance component
2. Empirical Bayes (EB)
→ Mean of the posterior distribution, with parameters estimated from the empirical data
3. Hierarchical Bayes (HB)
→ Mean of the posterior distribution, given a prior distribution
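As an illustration of the first approach, the sketch below computes an EBLUP for the simplest possible area level model, one with an intercept only (x_i = 1 for every area). This restriction is an assumption made to keep the example short. The variance component A is estimated with a method-of-moments estimator based on OLS residuals, in the spirit of Prasad and Rao (1990), and truncated at zero; other estimators (ML, REML, the Fay-Herriot iterative method) could be swapped in. The function name and data are hypothetical, and all D_i are assumed to be strictly positive.

```python
def eblup_intercept_only(y, D):
    """EBLUP for the area level model y_i = beta + v_i + e_i.

    y : direct estimates, one per area
    D : known sampling variances (all > 0), one per area
    Returns (A_hat, beta_hat, list of EBLUPs).
    """
    k = len(y)
    y_bar = sum(y) / k  # OLS estimate of beta in the intercept-only model

    # Moment estimator of A: E[RSS] = (k - 1) * A + (1 - 1/k) * sum(D),
    # so solve for A and truncate at zero.
    rss = sum((y_i - y_bar) ** 2 for y_i in y)
    A_hat = max(0.0, (rss - (1.0 - 1.0 / k) * sum(D)) / (k - 1))

    # Weighted least squares estimate of beta given A_hat.
    w = [1.0 / (A_hat + D_i) for D_i in D]
    beta_hat = sum(w_i * y_i for w_i, y_i in zip(w, y)) / sum(w)

    # Plug the estimates into the shrinkage form of the best predictor.
    B_hat = [D_i / (A_hat + D_i) for D_i in D]
    theta_hat = [beta_hat + (1.0 - b) * (y_i - beta_hat)
                 for b, y_i in zip(B_hat, y)]
    return A_hat, beta_hat, theta_hat


# Hypothetical direct estimates and sampling variances for four areas.
A_hat, beta_hat, theta_hat = eblup_intercept_only(
    y=[3.1, 5.4, 4.2, 7.9], D=[0.5, 1.0, 1.5, 2.0])
```

The "empirical" part is the plug-in step: the unknown A and β are replaced by estimates, which is also the source of the extra uncertainty that MSE estimators must account for.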
Inference of Area Level Model
One of the current problems in SAE is measuring uncertainty, in particular estimating the MSE.
Ghosh and Rao (1994), Prasad and Rao (1990), Butar and Lahiri (2001, 2003), Jiang, Lahiri and Wan (2002), Chen and Lahiri (2001, 2005), and Hall and Maiti (2005) have contributed to this problem.
The approximations proposed by some authors can eliminate the underestimation problem, especially for the case A = D_i = 1 and Xβ = 0. However, Kurnia and Notodiputro (2006) showed that when the D_i (sampling error variances) are heterogeneous and all model parameters must be estimated, the MSE is still underestimated by a sizeable margin, about 13% to 19%.
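For orientation, the source of the underestimation can be written out under the area level model. The expressions below follow the decomposition usually attributed to Prasad and Rao (1990); they are given here as a sketch in the notation of this document (with B_i = D_i / (A + D_i)), not as the exact formulas used in the studies cited above.

```latex
% MSE of the best predictor when beta and A are known:
\mathrm{MSE}\bigl(\hat\theta_i^{BP}\bigr) = g_{1i}(A) = \frac{A D_i}{A + D_i} = (1 - B_i) D_i

% Second-order approximation when beta and A are estimated (EBLUP):
\mathrm{MSE}\bigl(\hat\theta_i^{EBLUP}\bigr) \approx g_{1i}(A) + g_{2i}(A) + g_{3i}(A)

% g_2: uncertainty from estimating beta; g_3: uncertainty from estimating A
g_{2i}(A) = B_i^2 \, x_i' \Bigl( \sum_{j=1}^{k} \frac{x_j x_j'}{A + D_j} \Bigr)^{-1} x_i ,
\qquad
g_{3i}(A) = \frac{D_i^2}{(A + D_i)^3} \, \mathrm{Var}(\hat A)
```

In the benchmark case A = D_i = 1 the leading term is g_{1i} = 1/2. A naive estimator that reports only g_{1i} evaluated at the estimated A ignores the g_{2i} and g_{3i} terms, which is one way the MSE ends up underestimated.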
The Chronology
1. A discussion in the "Forum Masyarakat Statistik" held in Solo on December 3, 2004 identified the need for quality research on small area estimation models for the Indonesian case.
2. In 2003, the SMERU Research Institute developed small area statistics maps for several provinces.
3. Kurnia and Notodiputro (2005) carried out a study of the generalized linear mixed model approach and hierarchical Bayes for SAE, applied to BPS data.
The Development of SAE Research in IPB
Research on Small Area Estimation at IPB has been carried out through support from DGHE:
Developing Small Area Estimation Models for BPS Data
Development Stages of SAE Research at IPB
[Roadmap diagram spanning Year I, Year II and Year III. Boxes: methodology review; selection of survey variables for the (superior) method; software design; trial run; study for comparing methods; evaluation of methods on simulated data following BPS sampling; updating of BPS data.]
Roadmap
Papers in conference proceeding or journals:
Kurnia, A. and Notodiputro, K.A. 2006. The Jackknife Method in Small Area Estimation. Forum Statistika dan Komputasi, Vol. 11 No. 1, pp. 12-15.
Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation Based on Random Walk Models. Forum Statistika dan Komputasi, Vol. 11 No. 1.
Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation with Time and Area Effects Using Two Stage Estimation. ICoMS-1: Bandung.
Kurnia, A. and Notodiputro, K.A. 2006. EB-EBLUP MSE Estimator on Small Area Estimation with Application to BPS Data. ICoMS-1: Bandung.
Kismiantini, Kurnia, A. and Notodiputro, K.A. 2006. Risk of Dengue Haemorrhagic Fever in Bekasi Municipality with Small Area Approach. ICoMS-1: Bandung.
Papers in conference proceedings or journals (continued):
Indahwati and Notodiputro, K.A. 2006. Effect of Inappropriate Sampling Design on Reliability of Small Area Estimates. ICoMS-1: Bandung.
Sadik, K. and Notodiputro, K.A. P-Spline M-Quantile Approach in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 1, pp. 142-147.
Kurnia, A. and Notodiputro, K.A. Effects of Sampling Variance Estimation in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.
Handayani, D. and Kurnia, A. 2006. Empirical Bayes Approach to Estimate Finite Population Mean in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.
Research plan for the 2nd year:
The research will focus on developing small area estimation methods to increase the accuracy