Impact Evaluation Workshop
IRSA Annual Conference, Solo, 2018
OLS and Matching
Dr. Sarah Dong
Fellow, Indonesia Project
Australian National University
The intuition
β’ Assumption for both OLS and Matching:
β Everything that determines program allocation is observed
β’ Then after controlling for observed variables, the assignment into treatment and control is random
β’ OLS compares the averages of all treated with average of all untreated
β’ Matching calculates the average of the
difference between each treated and its chosen comparison in untreated
2 Sarah Dong (ANU) IRSA Workshop July 21-22, 2018
Beneficiary Clone
Intuition of Matching: The Perfect Clone
3 Sarah Dong (ANU) IRSA Workshop July 21-22, 2018
Treatment Comparison
ο Matching identifies a control group that is as similar as possible to the treatment group!
Intuition of Matching: The Perfect Clone
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 4
Indiv 1:
π1
Given that π = 1
Indiv 2:
π2
Given that π = 1
Indiv NT:
πππ
Given that π = 1
β¦
Match for Indiv 1:
ππ1
Given that π = 0
Match for Indiv 2
ππ2
Given that π =0
Match for Indiv NT:
ππππ
Given that π = 0
β¦
Principle of matching
Sarah Dong IRSA Workshop July 21-22, 2018 5
Matching treated to controls
β’ If we know the selection criteria (π), then we know who is eligible
β’ If π consists of discrete variables we can match treated and non- treated with similar eligibility (according to π):
1. Assign (or match) observations into cells with similar π, 2. drop the cells where we have only treated or controls, 3. and simply do a non-parametric difference in each cell 4. to retrieve ATE for respective cells
β’ However, matching suffers from the curse of dimensionality!
β If selection criteria π includes 1 or two 2 variables: no problem β But as the dimensions of selection increase β matching
becomes impossible very quickly
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 6
β’ An alternative approach: do not match on observed
characteristics, but on the probability of being treated.
β’ Propensity score matching (PSM) is a method that summarizes π to a single indicator: the probability of selection as a function of π
π π = Pr π = 1 π = πΉ( π½π)
β’ This is the propensity score: the conditional probability of receiving treatment given π
β’ This greatly reduces dimensionality: now only match on 1 variable!
Propensity score matching (1)
7 IDEC8026 Lecture 3 - Spring 2015
Insured
Not insured
Treatment group
Control group Initial
sample
Similar Propensity
Score
OOP budget share
Example: effect of health insurance on health spending
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 8
Indiv 1:
Pr π = 1 π1 = π1 Given that π = 1
Indiv 2:
Pr π = 1 π2 = π2 Given that π = 1
Indiv N:
Pr π = 1 ππ = ππ Given that π = 1
β¦
Match for Indiv 1:
Pr π = 1 ππ1 β π1 Given that π = 0
Match for Indiv 2:
Pr π = 1 ππ2 β π2 Given that π = 0
Match for Indiv N:
Pr π = 1 πππ β ππ Given that π = 0
β¦
π·πππππππππ ππ ππ’π‘ππππ π·πππππππππ ππ ππ’π‘ππππ π·πππππππππ ππ ππ’π‘ππππ π΄ππ
average
Propensity score matching: overview
9 IDEC8026 Lecture 3 - Spring 2015
β’ Estimate π(π) by means of logit or probit estimation and then predicting the probability of selection
π π = Pr π = 1 π = πΉ( π½π)
β’ Choice of π:
β Covariates must include determinants of participation β Exogenous data
β’ Measured prior to program exposure
β’ Unaffected by the program
β Include trends prior to program exposure if possible
β Knowledge about the program, setting and selection process
β’ Qualitative field work
Estimating the propensity score
10 IDEC8026 Lecture 3 - Spring 2015
β’ Common support:
β The distribution of π(π) may differ for treated and controls
β Especially for controls it can be hard to find high values of π(π) β Matching is only possible if there is a similar range of π(π) for
both treated and control units!
β Restrict the match to the range of common support!
β’ Look where distributions of the propensity score overlap
β’ Plot π(π) for the treated and non-treated
β’ Drop non-treated who fall outside of the region of common support
Range of common support
11 IDEC8026 Lecture 3 - Spring 2015
Range of common support
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 12
Source: Khandker et al. (2010)
Matching methods
β’ Once we have estimated π(π) there are several methods for propensity score matching
1. Nearest neighbour matching 2. Calliper matching
3. Kernel matching
β’ Weighting by the propensity score
β’ Basically, these propensity score matching methods aim to recreate an experimental design ex-post, by re-
weighting the data based on π(π)
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 13
Summarize: how to do PSM
1. Find data with observed π
π, π
πand π
πfor treated and non-treated
2. Understand the selection process
3. Estimate propensity score π π = Pr π = 1 π) 4. Check the range of common support
5. Choose matching method
6. Match units within range of common support 7. Check balancing properties π β₯ π | π(π)
8. Estimate treatment effects for matched sample
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 14
Comparing PSM and OLS
β’ Identifying assumption
β Both methods assume that π is exogenous to π, conditional on π
β’ Functional form
β PSM does not impose functional form (semi-parametric estimation) β OLS typically imposes a linear specification and assumes constant
treatment effects
β’ Sample
β PSM only considers observations within range of common support β OLS typically uses full sample
β’ Use of control variables
β PSM only considers determinants of the selection process (π) β OLS should control for determinants of the outcome (π)
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 15
Exercise: Askeskin
using Matching.dta
Sparrow, Suryahadi and Widyanti (2013) βSocial
health insurance for the poor: Targeting and impact of Indonesiaβs Askeskin programmeβ Social
Science & Medicine 96, pp. 264-271
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 16
Outcome variables
β’ Total outpatient utilization (outpat)
β’ Total outpatient utilization by public provider (outpu)
β’ Total outpatient utilization by private provider (outpr)
β’ Out-of-pocket health spending per capita (pcexph)
β’ Household budget share (oop)
β’ Incidence of catastrophic health spending (chs10, chs15)
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 17
Which control variables?
β’ Consider selection process and targeting variables:
- Household size and composition
- Education of the head of households
- Participation in other insurance schemes
- Housing conditions (size, quality and ownership of house) - Access to piped water
- Regional targeting is captured by districts coverage - Province effects
β’ Health: illness of household last month?
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 18
Useful Stata commands
β’ NaΓ―ve comparison
β summarize, table, ttest
β’ OLS, logit, probit regressions
β’ Propensity score matching
β Will use user written ado files: psmatch2, pstest, psgraph
β May search in the web using Stata interface (type net search psmatch2)
β Follow the instructions on screen to install
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 19
Exercise
1. Estimate the PS: Select the covariates and a specification 2. Choose a matching procedure
3. Are covariates balanced for matched treated & controls?
4. Assess the range of common support
5. Estimate the impact of Askeskin on the outcome variables
6. Does the choice of matching procedures affect impact estimates?
Sarah Dong (ANU) IRSA Workshop July 21-22, 2018 20