Marianne Beninato, DPT, PhD MGH Institute of Health Professions
George Fulk, PT, PhD Clarkson University
• Define important psychometric properties of outcome measures
• Apply the ICF framework to categorize outcome measures
• Compare and contrast the Minimal Detectable Change (MDC) and Minimal Clinically Important Difference (MCID)
• Apply concepts of MDC and MCID to the interpretation of change scores on outcome measures
• Understand the limitations of the interpretation of outcome measures in various settings and patient subgroups
• Discuss the ways that the proper interpretation of change scores can inform patient management and clinical decision making
The authors have nothing to disclose
Introduction (MB)
Overview of Measurement (MB)
◦International Classification of Functioning, Disability and Health (ICF)
◦Review of Measurement Properties
Error, reliability, Standard Error of Measurement
◦Measures of Change and Their Interpretation
Minimal Detectable Change
Minimal Clinically Important Difference
Distribution-based methods
Diagnostic Test (Anchor) Method
Case Presentations (GF)
Limitations, cautionary notes (MB)
Future Research (MB)
Questions (MB and GF)
Pt is a 66-year-old male with a first-time right MCA stroke. Prior to his stroke he was working full time as an architect. He is married with 3 grown children, 2 of whom live nearby.
What will you decide to assess?
Which clinical assessment tools will you use?
How will you interpret the scores?
How will you know if your patient is getting better?
◦“Significantly” better?
◦Better than what?
◦Based on what reference point?
Between groups vs Within Patient change
Essential to evidence-based practice
Guides clinical decision making
Frameworks for assessing health and disease
◦The ICF (WHO, 2001)
[Figure: ICF model linking Health Condition; Body Function and Structure, Activity, and Participation; and Contextual Factors (Environmental Factors, Personal Factors)]
Replaces ICIDH and Nagi Model
A meaningful and practical system that can be used by various consumers for health policy, quality assurance and outcome evaluation in different cultures
Aims
◦To provide scientific basis for understanding and studying health and health-related states, outcomes and determinants
◦To establish a common language for describing health and health-related states in order to improve communication between different users.
Body Function and Structure
◦Physiologic functions or anatomical parts of the body
◦Negative aspect: Impairment
Activity
◦Execution of task or action
◦Negative aspect: Limitation
Participation
◦Involvement in life situations
◦Negative aspect: Restriction
Environmental Factors
◦External influences on functioning
◦Physical, social and attitudinal environment in which people live
Personal Factors
◦Internal influences on functioning
◦Particular background of an individual’s life and living, comprising features of the individual that are not part of the health condition
•Relationships among components are not unidirectional or linear or proportional
‣ DTR’s, Ashworth Scale
‣ NIH Stroke Scale
‣ Fugl-Meyer Assessment of Motor Function
‣ Fugl-Meyer Sensory Assessment
‣ Chedoke-McMaster Stroke Assessment
‣ Dynamometry
‣ Motricity Index
‣ Nottingham Assessment of Somatosensation
‣ Orpington Prognostic Scale
‣Rate of Perceived Exertion
‣Rivermead Assessment of Somatosensory Performance
‣Rivermead Motor Assessment
‣Semmes Weinstein Monofilaments
‣Stroke Rehabilitation Assessment of Movement – Limb Movement Subscales
‣Tardieu Spasticity Scale
‣VO2 Max
5 times Sit to Stand
6 Minute Walk Test
9 Hole Peg Test
10 Meter Walk Test
Action Research Arm Test
Activity-specific Balance Confidence Scale**
Arm Motor Ability Test
Berg Balance Scale
Balance Evaluation Systems Test (BEST)
Box and Block Test
Brunel Balance Assessment
Canadian Occupational Performance Measure
‣Chedoke Arm and Hand Activity Inventory
‣Dynamic Gait Index
‣Falls Efficacy Scale**
‣Functional Ambulation Categories
‣Functional Gait Assessment
‣Functional Independence Measure
‣Functional Reach
‣HiMAT
‣Jebsen Taylor Hand Function Test
‣Motor Activity Log
‣Mobility Scale for Acute Stroke
‣Postural Assessment Scale for Stroke Patients
‣Stroke Rehabilitation Assessment of Movement – Mobility Subscale
‣Timed Up and Go
‣Tinetti POMA
‣Trunk Control Test
‣Trunk Impairment Scale
‣Wolf Motor Function Test
BOLD – not included in StrokEdge
Assessment of Life Habits
EuroQOL
Goal Attainment Scale
Modified Fatigue Impact Scale
Modified Rankin Scale
Reintegration to Normal Living
‣Satisfaction with Life Scale
‣Stroke Adapted Sickness Impact Scale 30
‣SF-36
‣Stroke Impact Scale**
‣Stroke-Specific Quality of Life
‣Frenchay Index
‣Adelaide Activities Profile
BOLD – not included in StrokEdge
Some measures are Hybrid
Include items from more than one ICF component
Example: Stroke Impact Scale
◦BSF: “How would you rate the strength of your leg affected by your stroke?”
◦Activity: “How difficult was it to bathe yourself?”
ABC scale (Powell and Myers, J Gerontol A Biol Med Sci 1995; 50:28-34)
Falls Efficacy Scale for Stroke (Hellstrom and Lindmark, Clin Rehabil 1999;13:509-17)
Root questions are not about how well or how often the activities are performed but how the person feels about doing them
“How confident are you that you could…without losing your balance?”
If possible, measure in various domains of ICF
If possible, include measures of personal factors
This is not always possible or appropriate
Decide what you will be using OM for
◦Measuring change
◦Prediction
Are reference psychometrics available?
◦Reliability
◦Validity
Avoid floor or ceiling effect (≥20 % floor or ceiling effect)
Match with
◦health condition (diagnosis)
◦practice setting
◦patient subgroup (e.g., stroke severity)
◦stage of recovery
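The ≥20% floor or ceiling criterion can be checked directly on a set of scores. The following is a minimal Python sketch, assuming a floor or ceiling effect means the proportion of patients scoring exactly at the scale minimum or maximum; the function name and the scores are hypothetical and for illustration only.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.20):
    """Proportion of scores at the scale minimum and maximum, with threshold flags."""
    n = len(scores)
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor >= threshold,
        "ceiling_effect": ceiling >= threshold,
    }

# Hypothetical Berg Balance Scale scores (scale range 0-56), not study data
scores = [56, 56, 54, 50, 56, 41, 33, 56, 48, 0]
result = floor_ceiling(scores, min_score=0, max_score=56)
print(result)
```

In this made-up sample, 4 of 10 patients score at the maximum, so the measure would be flagged for a ceiling effect in this group.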
Responsiveness
◦Aspect of validity
◦Small but relevant change
◦Meaningful
Group Comparisons
◦t-tests, ANOVA
◦Limits of interpretability
Why do we take measurements?
◦Descriptive
◦Differentiation
◦Detect change
Sources of Error: Examples
Patient Variability
◦Normal variability in patient performance related to factors such as fatigue
◦Disease state is more or less stable
◦Patient’s cognitive state
Rater Variability
◦Familiarity, expertise with the instrument
◦Practiced, standardized technique
Measurement Instrument
◦Scoring not clearly defined
◦Instrument not stable
Random
◦Scores taken at different times in a truly unchanging person will be bell shaped, i.e., normally distributed
Systematic
◦Scores will be skewed to one side, greater than or less than the mean
Differentiating among patients
◦Will people with more impairment consistently have lower scores and vice versa?
◦Intraclass correlation coefficient (ICC)
◦Unitless measure
◦Scored 0 to 1
◦Higher score is better
http://en.wikipedia.org/wiki/File:Intraclass_correlation_coefficient_graph.png
Consistency of measured values from a truly unchanged patient
◦Standard Error of Measurement (SEM)
◦$SEM = s\sqrt{1 - r_{xx}}$
s is the pooled SD of 2 sets of stable scores
$r_{xx}$ is the reliability coefficient (ICC)
SEM quantifies random error taking into account stability at baseline and test re-test reliability
In same units as outcome measure
Study                                                Mean BBS   ICC   SD T1   SD T2   Pooled SD   √(1 − ICC)   SEM
Flansbjer, PM R 2012;4:165-170                       52.0       .88   4.3     3.8     4.05        .3464        1.40
Hiengkaew, Arch Phys Med Rehabil 2012;93:1201-1208   46.2       .95   7.64    7.87    7.76        .2236        1.73
Pooled SD = $\sqrt{(SD_{T1}^2 + SD_{T2}^2)/2}$
$SEM = \sqrt{(SD_{T1}^2 + SD_{T2}^2)/2} \times \sqrt{1 - ICC}$
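As a minimal sketch of the SEM arithmetic, the pooled SD and SEM can be reproduced in Python from each study's reported SDs and ICC. The function names are illustrative and not from the source. Results agree with the slide values to within rounding.

```python
import math

def pooled_sd(sd_t1: float, sd_t2: float) -> float:
    """Pooled SD of two sets of stable scores: sqrt((SD_T1^2 + SD_T2^2) / 2)."""
    return math.sqrt((sd_t1 ** 2 + sd_t2 ** 2) / 2)

def sem(sd_t1: float, sd_t2: float, icc: float) -> float:
    """Standard error of measurement: pooled SD x sqrt(1 - ICC)."""
    return pooled_sd(sd_t1, sd_t2) * math.sqrt(1 - icc)

# SDs and ICCs reported on the slide for the Berg Balance Scale
sem_flansbjer = sem(4.3, 3.8, 0.88)
sem_hiengkaew = sem(7.64, 7.87, 0.95)
print(f"Flansbjer 2012 SEM: {sem_flansbjer:.2f}")
print(f"Hiengkaew 2012 SEM: {sem_hiengkaew:.2f}")
```

Because the SEM is in the same units as the outcome measure, both values here are in Berg Balance Scale points.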
SEM assumptions:
◦A truly stable group of individuals
◦The means between T1 and T2 should not be substantially different
◦Normal distribution of the difference in scores between T1 and T2
Interpreting reliability studies
◦The sample studied should resemble your patient
◦Your actual error could vary depending on your reliability
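[Figure: Initial exam Berg 45, follow-up Berg 49. Improvement? With small measurement error, repeated scores are 44, 45, 46 at the initial exam and 48, 49, 50 at follow-up, so the error ranges do not overlap.]
[Figure: Initial exam Berg 45, follow-up Berg 49. Improvement? With larger measurement error, repeated scores are 42, 45, 48 at the initial exam and 47, 49, 52 at follow-up, so the error ranges overlap.]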
Thanks to P. Levangie and D. Gross for image idea
MDC
Smallest amount of change that can be considered above measurement error
Or the smallest amount of change that is REAL change
Quantifies the variability of responses in truly unchanged patients
Assumes
◦Normal distribution of difference scores reflecting only random error
◦Patients’ true values do not change over the measurement period
$MDC = SD_{diff} \times z$
Or
$MDC = SEM \times \sqrt{2} \times z$
◦$SEM \times \sqrt{2} = SD_{diff}$
◦z indicates level of confidence
◦For MDC usually 90% (z = 1.65) or 95% (z = 1.96, often rounded to 2.0) confidence
◦Nomenclature: MDC90, MDC95
Interpreting MDC
◦Based on change in unchanged people
$MDC_{90} = SEM \times \sqrt{2} \times 1.65$
MDC90 of BBS in people with Chronic Stroke:
• Flansbjer 2012
◦SEM = 1.40
◦MDC90 = 3.27
• Hiengkaew 2012
◦SEM = 1.73
◦MDC90 = 4.04
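The MDC90 values for the two Berg Balance Scale studies follow directly from the SEM. A minimal Python sketch, with an illustrative function name that is not from the source:

```python
import math

def mdc(sem: float, z: float = 1.65) -> float:
    """Minimal detectable change: SEM x sqrt(2) x z (z = 1.65 gives MDC90)."""
    return sem * math.sqrt(2) * z

# SEM values reported on the slide
mdc90_flansbjer = mdc(1.40)
mdc90_hiengkaew = mdc(1.73)
print(f"Flansbjer 2012 MDC90: {mdc90_flansbjer:.2f}")
print(f"Hiengkaew 2012 MDC90: {mdc90_hiengkaew:.2f}")
```

Passing a different z (for example 1.96) gives the MDC at another confidence level from the same SEM.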
Beyond the threshold of measurement error is the threshold for important change
Commonly known as MCID
Definition: “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient’s management” (Jaeschke et al., Control Clin Trials 1989;10:407-15)
OR
“The smallest difference in a score that is considered worthwhile or important” (Hayes and Woolley, Pharmacoeconomics 2000;18:419-23)
Distribution-based methods
◦Effect size
◦$ES = (M_1 - M_2) / SD_{baseline}$
◦Standardized response mean
◦$SRM = (M_1 - M_2) / SD_{diff}$
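The two distribution-based indices differ only in the denominator. A minimal Python sketch of both, using hypothetical scores for five patients (not study data) and assuming change is computed as follow-up minus baseline, since the slide does not state the sign convention:

```python
import statistics

def effect_size(baseline, follow_up):
    """ES = mean change / SD of the baseline scores."""
    change = statistics.mean(follow_up) - statistics.mean(baseline)
    return change / statistics.stdev(baseline)

def standardized_response_mean(baseline, follow_up):
    """SRM = mean change / SD of the individual change scores."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

# Hypothetical paired scores, for illustration only
baseline = [30, 35, 40, 45, 50]
follow_up = [36, 39, 47, 49, 59]
es = effect_size(baseline, follow_up)
srm = standardized_response_mean(baseline, follow_up)
print(f"ES = {es:.2f}, SRM = {srm:.2f}")
```

In this made-up sample the SRM is much larger than the ES because the patients change by similar amounts (small SD of change scores) while their baseline scores vary widely.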
‣Important? Who says so?
‣Anchor-based methods
‣External anchor used to define clinical importance
◦Achievement of goal, discharge home etc
◦Direct survey
◦Global Rating of Change Scale (GROC) often used
15-Point Global Rating of Change Scale (GROC) (Jaeschke, 1989)
7: A very great deal better
6: A great deal better
5: A good deal better
4: Moderately better
3: Somewhat better
2: A little better
1: About the same, hardly any better at all
0: No change
−1: About the same, hardly any worse at all
−2: A little worse
−3: Somewhat worse
−4: Moderately worse
−5: A good deal worse
−6: A great deal worse
−7: A very great deal worse
[Figure labels: MCID, No MCID]
People categorized as having achieved MCID or No MCID
Identify change score on outcome measure that best categorizes people as achieving MCID or No MCID
Apply Diagnostic Test Methods
◦Sensitivity
◦Specificity
◦Positive and Negative Predictive Values
◦Likelihood ratios
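One way to identify the change score that "best categorizes" patients is to scan candidate cutoffs and keep the one with the best combined sensitivity and specificity. The sketch below is an assumption, not the authors' stated method: it uses Youden's J (SN + SP − 1) as the criterion, and the change scores and anchor labels are hypothetical.

```python
def best_cutoff(change_scores, improved):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J = SN + SP - 1.

    Assumes at least one improved and one not-improved patient.
    """
    pairs = list(zip(change_scores, improved))
    best = None
    for cutoff in sorted(set(change_scores)):
        tp = sum(1 for s, y in pairs if s >= cutoff and y)
        fn = sum(1 for s, y in pairs if s < cutoff and y)
        tn = sum(1 for s, y in pairs if s < cutoff and not y)
        fp = sum(1 for s, y in pairs if s >= cutoff and not y)
        sn = tp / (tp + fn)
        sp = tn / (tn + fp)
        # Keep the first cutoff with the strictly highest J
        if best is None or sn + sp > best[1] + best[2]:
            best = (cutoff, sn, sp)
    return best

# Hypothetical data: True = patient met the anchor criterion (e.g. a chosen GROC level)
change_scores = [5, 8, 10, 12, 15, 18, 20, 22, 25, 30]
improved = [False, False, False, True, False, True, True, True, True, True]
cutoff, sn, sp = best_cutoff(change_scores, improved)
print(cutoff, round(sn, 2), round(sp, 2))
```

Other criteria (for example, the point on the ROC curve closest to the upper left corner) can select a different cutoff from the same data.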
MCID of FIM
n = 113
Used GROC +3 as MCID indicator
Cutoff score from ROC curve = 22
AUC = .85
Derived from Beninato et al., Arch Phys Med Rehabil. 2006;87:32-9
Sensitivity (SN) = a/(a+c)
Specificity (SP) = d/(b+d)
Likelihood ratios: +LR = SN/(1 − SP); −LR = (1 − SN)/SP
Positive Predictive Value = a/(a+b)
Negative Predictive Value = d/(c+d)

                        MCID based on the GROC
                        ≥3         <3
Change score ≥ cutoff   a (TP)     b (FP)     a+b
Change score < cutoff   c (FN)     d (TN)     c+d
                        a+c        b+d        a+b+c+d
Sensitivity (SN) = 77/100 = .77
Specificity (SP) = 10/13 = .77
+LR = SN/(1 − SP) = .77/.23 = 3.3
−LR = (1 − SN)/SP = .23/.77 = .30
Positive Predictive Value = 77/80 = .96
Negative Predictive Value = 10/33 = .30

                     MCID based on the GROC
                     ≥3      <3
FIM change ≥22       77      3       80
FIM change <22       23      10      33
                     100     13      113
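The diagnostic test statistics can be recomputed from the four cell counts of the FIM table. A minimal Python sketch, with an illustrative function name and dictionary keys that are not from the source:

```python
def diagnostic_stats(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic accuracy statistics from the cells of a 2x2 table."""
    sn = tp / (tp + fn)
    sp = tn / (fp + tn)
    return {
        "sensitivity": sn,
        "specificity": sp,
        "positive_lr": sn / (1 - sp),
        "negative_lr": (1 - sn) / sp,
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }

# Cell counts from the FIM table: change score >=22 vs <22, GROC >=3 vs <3
stats = diagnostic_stats(tp=77, fp=3, fn=23, tn=10)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Keeping the raw counts as inputs avoids compounding rounding error when the likelihood ratios are derived from sensitivity and specificity.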
If my patient achieves a change score greater than the MCID, then that change reflects important change
Need to be aware of the anchor that was used
Important? Who says so?
Does my patient share characteristics with the study sample?
The SEM is an estimate of error that takes into account variability in a stable group of patients
The MDC and MCID are useful and informative threshold values for interpreting patient change scores
MDC is an indication of achieving real change (beyond measurement error)
MCID tells us that important change has taken place
66 y/o male: JR
Acute care hospital
Inpatient rehabilitation hospital
Outpatient rehabilitation center
Chronic
Stroke Rehabilitation Assessment of Movement (STREAM)
◦Our patient’s score
Total: 70 UE subscale: 68 LE subscale: 70 Mobility: 69
30 items across 3 domains
UE and LE: (Body Structure/Function)
◦0: unable to perform
◦1:
A: part of movement, marked deviation
B: part of movement, comparable to unaffected
C: full movement with marked deviation
◦2: able to complete movement comparable to unaffected side
Mobility (Activity)
◦0: unable to perform
◦1:
A: requires partial assistance with deviation
B: requires partial assistance, grossly normal
C: independently but abnormal movement pattern
◦2: independent, grossly normal pattern with assistive device
◦3: independent, grossly normal pattern without assistive device
How do I interpret the score, what are norms for this time frame/time in the continuum of care?
How much change necessary to be reasonably confident that my patient really changed?
Important change?
Predictive ability?
Mean (SD)
STREAM Total 75 (26.7)
LE subscale 73 (33.3)
UE subscale 75 (28.9)
Mobility subscale 74 (25.9)
Gait speed (m/s) 0.55 (0.38)
Our patient: Total: 70 UE subscale: 68 LE subscale: 70 Mobility: 69
Hsueh et al.
◦MDC
◦UE subscale: 14
◦LE subscale: 12.6
Hsueh et al. Neurorehabil Neural Repair. 2008; 22:737-744.
Our patient: Total: 70
UE subscale: 68
◦Need to increase to 82
LE subscale: 70
◦Need to increase to 83
Mobility: 69
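The target scores for JR come from adding the MDC to his current subscale score and rounding up to the next whole point. A minimal Python sketch, assuming whole-number scores and that a change equal to the MDC counts as real change; the function name is illustrative:

```python
import math

def target_score(baseline: float, mdc: float) -> int:
    """Smallest whole-number score whose change from baseline meets or exceeds the MDC."""
    return math.ceil(baseline + mdc)

# JR's STREAM subscale scores and the MDC values attributed to Hsueh et al. on the slide
ue_target = target_score(68, 14)
le_target = target_score(70, 12.6)
print(f"UE target: {ue_target}, LE target: {le_target}")
```

If a change must strictly exceed the MDC rather than equal it, the UE target would be one point higher.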
MCID?
63 individuals a mean of 8 (SD=3) days post stroke
Initial scores of subjects <63
JR: initial total STREAM: 70
~20% probability he will be discharged home.
Ahmed et al. PHYS THER. 2003; 83:617-630.
8 days post stroke
Average LOS: 18 days
JR’s Outcome Measures
◦Berg Balance Scale: 30/56
◦Fugl Meyer: UE: 35/66, LE: 18/34
Dobrez D, et al. Am J Phys Med Rehabil. 2010;89:198-204
How do I interpret the score, what are norms for this time frame/time in the continuum of care?
How much change necessary to be reasonably confident that my patient really changed?
Important change?
Predictive ability?
Study                                   Days post stroke   BBS score                  N
Mao et al. Stroke. 2002;33:1022-1027    14                 22.3 (22.2)                123
O’Dell et al. PM&R. 2013;5:392-399      9.2 (6.8)          19.6 (16.6), range 0-54    55
Stevenson et al
◦30.3 (23.3) days post stroke
◦All subjects: 43.0
◦Assist: 35.5
MDC90: 5.8
◦Independent: 5.3, Standby: 5.0, Assist: 6.8
MDC95: 6.9
◦Independent: 6.3, Standby: 6.0, Assist: 8.1
Stevenson et al. Aust J Physiother. 2001;47:29-38.
Our Patient: 30/56
Needs to improve to ≥39 to be 95% confident a real change occurred (consistent with the Assist subgroup MDC95: 30 + 8.1 = 38.1).
MCID?
Rehabil. 2003;84:731-735.
Our patient: admission score: 30
~65% D/C home
Family support: ~95% D/C home
Duncan et al
◦105 subjects
◦Initial total motor: 57.1 (33.4)
Within 24 hours of stroke
Stratified
◦Severe: 0-35
◦Mod severe: 36-55
◦Moderate: 56-79
◦Mild: >80
Sanford et al
Sanford et al. Phys Ther. 1993;73:447-454.
Wagner et al. Phys Ther. 2008;88:652-663.
See et al. Neurorehabil Neural Rep. 2013.
Our Patient: UE: 35/66 LE: 18/34
FM MCID: JR: UE: 35/66, LE: 18/34
Shelton et al 2001
◦Time post stroke: 17 days
◦MCID: 10 points = 1.5 points D/C FIM self care; 10 points = 1.9 points D/C FIM mobility
◦Anchor: FIM Self Care, FIM Mobility
Page et al 2012 (UE motor)
◦Time post stroke: 60 months
◦MCID: 4.25-7.5
◦Anchor: Therapists’ perception of different UE movements/function
◦Accuracy: AUC: 0.61-0.70 Sens:
3 months post stroke
◦Subacute stage of recovery
Gait Speed
◦0.56 m/s
Stroke Impact Scale (SIS)
◦Communication: 45
◦Social Participation: 52
◦SIS 16: 62
How do I interpret the score, what are norms for this time frame/time in the continuum of care?
How much change necessary to be reasonably confident that my patient really changed?
Important change?
Normative data with healthy individuals (M/F): post stroke
Mean: 0.39 (0.22) m/s
Bohannon 1997
Tilson et al. Phys Ther. 2010 90:196–208. Our patient: 0.56 m/s
How much change in gait speed needs to occur to be confident that it is real?
Study: time post stroke; mean GS; MDC
Stephens et al 1999: 112 days; 0.80 m/s; 95% CI of change: -0.10 to 0.12 m/s
Flansbjer et al 2005: 16 months; 0.89 m/s 1st session, 0.94 m/s 2nd session; Smallest Real Difference: -0.15 to 0.25 m/s
Assistance: 0.07 m/s
Used AD: 0.18 m/s
Stephens et al. Clin Rehabil. 1999;13:171-181
Flansbjer et al. J Rehabil Med. 2005;37:75-82.
Fulk et al. J Neurol Phys Ther. 2008;32:8-13.
Our patient: 0.56 m/s
Important Change in Gait Speed?
Time post stroke
Initial Gait Speed
MCID Anchor Accuracy
Fulk et al 2011
56 to 139 days post stroke
0.56 (0.22) m/s
0.17 m/s 0.19 m/s
Patient GROC Therapist post stroke
0.18
Our patient: 0.56 m/s Fulk et al. J Neurol Phys Ther. 2011;35:82-89. Tilson et al. Phys Ther. 2010:90.
5-point Likert scale
◦1 could not do it at all
◦2 very difficult
◦3 somewhat difficult
◦4 a little difficult
◦5 not difficult at all
SIS-16
Stroke Impact Scale: 8 domains
◦Strength
◦Hand Function
◦Mobility
◦ADLs
◦Emotion
◦Memory
◦Communication
Duncan et al 90-120 days post stroke
Huang et al 18 months post stroke
Our Patient
Total 65
Duncan et al. Stroke. 2002;33:2593-2599.
Huang et al. Neurorehabil Neural Repair.2010;24:559-566
Chronic, 17.7 months post stroke ◦Strength = 24.0
Lin et al. Neurorehabil Neural Rep. 2010;24:486-492.
Time post stroke
MCID Anchor Accuracy
Fulk et al SIS-16
2 months 9.4 14.1
Patient GROC Therapist GROC
Patient: AUC: 0.72
Strength: 9.2 ADL: 5.9 Mobility: 4.5 Hand: 17.8
Mean score of subjects that reported 10-15% on overall change
N/A
Fulk et al. Top Stroke Rehabil. 2010;17:477-483. Lin et al. Neurorehabil Neural Rep. 2010. 24:486-492
MDC Acute/ Subacute
MDC Chronic
MCID Acute/ Subacute my patient?
◦Cautiously interpret the values available
SEM and MDC depend on reliability
◦Only scores are reliable, not outcome instruments
◦Reliability is not transferable
MDC derived from research studies with strict methodology
◦Establish for your own practice group
Riddle and Stratford. Is This Change Real, 2013, F.A. Davis
Revicki D, et al. J Clin Epidemiol. 2008; 61:102-109
Wells G, et al. J Rheumatol. 2001; 28:406-412
MCID depends on Anchor used
◦Motor FIM using GROC ratings MCID = 17 points (Beninato et al 2006)
◦Motor FIM using change in mRS = 11 points
(Wallace et al 2002)
Anchor should be closely related to construct being measured
◦Gait speed by GROC survey: .175 m/sec, SN .81, SP .81 (Fulk et al 2011)
◦Gait speed by change in mRS: .16 m/sec, SN .74, SP .57 (Tilson et al)
Beninato et al. Arch Phys Med Rehabil. 2006;87:32-9
Wallace et al. J Clin Epidemiol. 2002;55:922-928
Baseline scores
◦Lower baseline requires more change to achieve MCID
◦Example (Beninato et al 2006)
Admission FIM scores 10-40 required 27 point change
Admission FIM scores 41-60 required 23 point change
Whether considering improvement versus decline
Beninato et al. Arch Phys Med Rehabil. 2006;87:32-9
Wang et al. Phys Ther. 2011; 91:675-688
Beninato M, Portney LG, JNPT, 2011;35:75-81
Use of Diagnostic Test Methods to determine MDC
◦Additional information on accuracy of estimates
◦Little research available on this
Riddle and Stratford. Is This Change Real, 2013, F.A. Davis

                        Gold Standard of Change
                        Yes        No
Change score ≥ cutoff   a (TP)     b (FP)     a+b
Change score < cutoff   c (FN)     d (TN)     c+d
                        a+c        b+d        a+b+c+d
MCID estimates needed for more outcome measures
◦Only 6 of 24 recommended by EDGE task force have established MCID scores
MCID estimates for OMs at different
◦stages of recovery
◦severity levels
◦settings
◦different anchors
MCID using different anchors
◦Any one value of MCID is an estimate
◦Need to consider different perspectives
Stroke Edge Resources
◦http://www.neuropt.org/professional-resources/neurology-section-outcome-measures-recommendations/stroke
◦www.rehabmeasures.org
◦Internet Stroke Center