Estimating the Scale Factor - Rationale and Procedure in Examining the Stability of Item Bank

Chapter 3 Methods

3.2 Rationale and Procedure in Examining the Stability of Item Bank Parameters

3.2.4 Estimating the Scale Factor

An estimate of the scaling factor is obtained from the ratio of the standard deviations of the estimates for common items or common persons across frames of reference. Equation 3.24 is applied to obtain the ratio estimate from common items, while Equation 3.25 is used to obtain the ratio estimate from common persons.

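$$\hat{\rho}_s = \sqrt{\frac{V[\hat{\delta}_s]}{V[\hat{\delta}^{*}]}} \tag{3.24}$$

$$\hat{\rho}_s = \sqrt{\frac{V[\hat{\beta}_s]}{V[\hat{\beta}^{*}]}} \tag{3.25}$$

3.2.5 The Relationship between the Unit and Discrimination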

The scale factor, ρ, can also be regarded as a discrimination parameter (Humphry, 2010). This is because the value of the scale factor determines the rate of change of the probability of a correct response. It is clear from Equation 3.21 that the same probability of a correct response can be obtained under different values of ρ. However, different values of ρ do not yield that probability at the same distance between person location n and item location i, that is, β_n − δ_i. The greater the value of ρ, the smaller the distance β_n − δ_i required to generate the same probability. In other words, the greater the value of ρ, the greater the rate of change of the probability and, subsequently, the more discriminating the item.
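The inverse relationship between ρ and the required distance can be illustrated numerically. The sketch below assumes that Equation 3.21, which is not reproduced in this section, takes the logistic form P = exp(ρ(β_n − δ_i)) / (1 + exp(ρ(β_n − δ_i))). The function names and the chosen probability of 0.75 are hypothetical and serve only as an illustration.

```python
import math


def p_correct(beta, delta, rho):
    """Probability of a correct response under the ASSUMED form of Equation 3.21."""
    return 1.0 / (1.0 + math.exp(-rho * (beta - delta)))


def distance_for_probability(p, rho):
    """Distance beta - delta needed to reach probability p for a given rho."""
    return math.log(p / (1.0 - p)) / rho


# Illustrative values only: the same target probability under two scale factors.
target = 0.75
d_rho1 = distance_for_probability(target, rho=1.0)
d_rho2 = distance_for_probability(target, rho=2.0)

print("distance needed with rho = 1.0:", d_rho1)
print("distance needed with rho = 2.0:", d_rho2)
```

Under this assumed form, doubling ρ halves the distance needed to reach the same probability, which is the sense in which a larger ρ means a more discriminating set of items.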

Humphry (2010) points out that incorporating a discrimination parameter in a frame of reference in the Rasch model makes it possible to account for the effect of person factors on discrimination. Discrimination in a frame of reference that arises from empirical factors, such as item characteristics, person characteristics and testing circumstances, is acknowledged and taken into account. The magnitude of a discrimination parameter is determined by those empirical factors (Humphry, 2010).

However, as mentioned previously, Humphry and Andrich (2008) show that incorporating a scaling factor or discrimination parameter does not destroy the sufficiency of the Rasch model. This is because the discrimination parameter in the Rasch model is defined within a specified frame of reference, implying that the discrimination parameter is associated with a set of items rather than a single item. Accordingly, a discrimination parameter is not estimated for each item. This differs from the 2PL model, where discrimination is estimated for each item so that sufficiency is not preserved (Humphry and Andrich, 2008).

3.2.6 Procedures in Comparing Item Parameter Estimates in this Study: Adjustment of the Origin and the Unit

As stated earlier, in comparing measurements from different frames of reference, the origin and the unit of each frame of reference need to be taken into account. In terms of the origin, a constraint imposed in every analysis in this study sets the sum of the item estimates from the postgraduate/undergraduate analysis to 0.00, whereas the sum of the item bank values is not necessarily 0. The item locations being compared therefore do not share the same origin. To give them the same origin, the means of the two sets of item locations are made equal. This can be achieved by adjusting either the postgraduate/undergraduate estimates or the item bank values. For convenience, the item bank locations are shifted so that their mean matches the postgraduate/undergraduate mean of 0.00. This is done by subtracting the mean of the item bank values from each item bank value.

In terms of the unit, the ratio of the standard deviation of the item bank values to that of the postgraduate/undergraduate estimates is calculated. This ratio is used as the scaling factor to establish the common unit of the item bank and of the postgraduate/undergraduate locations. Consistent with the adjustment of the origin using the postgraduate/undergraduate data as the standard, the common unit is also based on the postgraduate/undergraduate data.

The transformation of each item location of the item bank into the postgraduate/undergraduate unit follows Equation 3.26.

$$\delta_{bi}^{*} = \frac{\delta_{bi}}{\rho} \tag{3.26}$$

where δ_bi^* is the location of item i of the item bank in the postgraduate/undergraduate unit; δ_bi is the location of item i in the item bank after the adjustment of the origin; and ρ is the ratio of the standard deviation of the item bank values to that of the postgraduate/undergraduate estimates.
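The two adjustments described above, origin first and then unit, can be sketched as follows. This is a minimal illustration rather than the procedure actually run in the study: the function name and the item values are hypothetical, and the postgraduate/undergraduate estimates are assumed to already have a mean of 0.00 because of the sum-to-zero constraint.

```python
import statistics


def adjust_bank_to_reference(bank, reference):
    """Express item bank locations in the origin and unit of the reference set.

    bank      : item bank locations for the common items
    reference : postgraduate/undergraduate estimates for the same items,
                ASSUMED to have mean 0.00 (sum-to-zero constraint)
    """
    # Origin: subtract the item bank mean so that the bank mean becomes 0.00.
    bank_mean = statistics.mean(bank)
    centred = [d - bank_mean for d in bank]

    # Unit: ratio of the two standard deviations (bank over reference).
    # Both sets contain the same items, so sample vs population SD gives
    # the same ratio.
    rho = statistics.stdev(bank) / statistics.stdev(reference)

    # Divide the centred bank locations by the ratio.
    rescaled = [d / rho for d in centred]
    return rescaled, rho


# Hypothetical values, not taken from the study.
bank_values = [1.0, 3.0, 5.0, 7.0]
reference_values = [-1.5, -0.5, 0.5, 1.5]

adjusted, rho = adjust_bank_to_reference(bank_values, reference_values)
print("scaling factor:", rho)
print("adjusted bank locations:", adjusted)
```

After the adjustment, the item bank locations have the same mean and the same standard deviation as the reference estimates, so any remaining differences are differences in relative item location.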

After taking into account the origin and the unit, the item values of the item bank and of the postgraduate/undergraduate analysis are compared by performing a t-test for each item. A t value is obtained by applying Equation 3.27 as follows.

$$t = \frac{\delta_{bi}^{*} - \delta_{pi}}{\sqrt{\sigma_{bi}^{2} + \sigma_{pi}^{2}}} \tag{3.27}$$

where δ_bi^* is the location of item i of the item bank in the postgraduate/undergraduate unit; δ_pi is the location of item i from the postgraduate/undergraduate analysis; σ_bi is the standard error estimate of item i of the item bank; and σ_pi is the standard error estimate of item i from the postgraduate/undergraduate analysis.
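As a worked illustration of the item-level comparison, the sketch below computes a t value as the difference between the two locations divided by the square root of the sum of the squared standard errors, which is the form implied by the definitions above. The function name and the numbers are hypothetical. The critical value used to flag an item is not stated in this section, so it is left to the reader.

```python
import math


def item_t_value(delta_bank_adj, delta_pg, se_bank, se_pg):
    """t value for one item: location difference over the pooled standard error."""
    pooled_se = math.sqrt(se_bank ** 2 + se_pg ** 2)
    return (delta_bank_adj - delta_pg) / pooled_se


# Hypothetical item: adjusted bank location 0.50, analysis location 0.20,
# standard errors 0.12 (bank) and 0.09 (analysis).
t_example = item_t_value(0.50, 0.20, 0.12, 0.09)
print("t value:", t_example)
```

An item whose absolute t value exceeds the chosen critical value would be treated as having an unstable location.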

Although in principle the unit needs to be taken into account in comparing item or person estimates from different frames of reference, the size of the scaling ratio, as defined in Equations 3.24 and 3.25, appears to determine the effect of adjusting the unit. When the ratio is considerably greater than 1.0, indicating a large difference in units, adjusting the unit seems to be necessary. In contrast, when the ratio is close to 1.0, the small difference in units is not expected to result in significant differences between estimates.

In the context of examining the stability of the item parameters in this study, the effect of taking the unit into account also depends on the correlation between the item locations from the item bank and those from this study’s analysis. For example, if the correlation is high and the unit ratio is only slightly greater than 1, adjusting the units is not likely to change the number of observed differences in item estimates between the item bank and the sample data. This is because, with a high correlation and a ratio close to 1.0, the estimates are very similar for all items, or only a small number of items have unstable estimates.

In another situation where the correlation is low and the unit ratio is high, the effect of adjusting the unit may be marginal because a low correlation indicates that the difference between locations is relatively large, manifesting in a large number of items identified as unstable. Therefore, the effect of adjusting the units becomes smaller when the correlation is lower.

In contrast, when the unit ratio is close to 1.0 indicating that the unit difference is not great, regardless of the correlation between the item locations in different frames of reference, it is expected that adjusting the unit will not have any effect. The number of unstable items identified is determined solely by the size of the item location correlation. When the correlation is higher, the number of unstable items identified is anticipated to be less than when the correlation is lower.

It is clear that adjusting the units is expected to have an effect when the difference between the units is large. The size of the unit ratio therefore needs to be considered in adjusting the unit. Humphry (2010) considers that a ratio of approximately 1.1 or greater (a difference of about 10%) is worth investigating. However, it is not clear how large the difference would have to be to generate an effect on the person estimates. Testing the significance of the difference between the units is therefore considered useful. For this purpose, an F-test is used to test the significance of the difference between the variances of the two distributions of persons (Guilford & Fruchter, 1978). Because a ratio of standard deviations represents a unit ratio, a ratio of variances can also indicate a unit ratio.

Based on the earlier exposition, in this study the units are adjusted either when there is a 10% difference in the ratio, that is, when the unit ratio is ≥ 1.1 or ≤ 0.9, or when there is a significant difference between the variances of the groups of persons.
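The decision rule can be sketched as a small function. This is an illustration only: the function names are hypothetical, the F statistic is formed by the common convention of placing the larger variance in the numerator, and the critical F value is assumed to be looked up by the reader for the relevant degrees of freedom, since no significance level is stated in this section.

```python
def variance_ratio_f(var_a, var_b):
    """F statistic for two person variances, larger variance in the numerator."""
    return max(var_a, var_b) / min(var_a, var_b)


def needs_unit_adjustment(unit_ratio, f_stat, f_critical):
    """Apply the rule: ratio >= 1.1 or <= 0.9, or a significant variance difference.

    f_critical is supplied by the caller (for example, from an F table).
    """
    ratio_rule = unit_ratio >= 1.1 or unit_ratio <= 0.9
    f_rule = f_stat > f_critical
    return ratio_rule or f_rule


# Hypothetical values, not taken from the study.
f_stat = variance_ratio_f(1.44, 1.00)
print(needs_unit_adjustment(unit_ratio=1.02, f_stat=f_stat, f_critical=1.30))
print(needs_unit_adjustment(unit_ratio=1.02, f_stat=f_stat, f_critical=1.60))
print(needs_unit_adjustment(unit_ratio=0.85, f_stat=f_stat, f_critical=1.60))
```

The two criteria are combined with a logical "or", so a significant F-test triggers the adjustment even when the ratio itself lies between 0.9 and 1.1.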

In addition, to test the hypothesis that the effect of adjusting units on the number of unstable items is a function of the item location correlation and the unit ratio, item locations are compared both with and without adjusting the unit. This comparison is performed in particular on the sets of tests where both the item location correlation and the unit ratio are high.

3.2.7 Procedures to Assess the Effect of Unstable Item Parameters on Person Measurement

Because the instability of item parameters may have an effect on person measurement, the procedure to assess this possible effect is now presented. The effect is examined by (i) comparing the means of person locations derived from both sets of item parameters; and (ii) correlating the person locations derived from both sets of item parameters.

It is clear that the person estimates derived from the set of postgraduate/undergraduate item parameters are immediately available from the analyses performed. However, person estimates using item bank parameters are not immediately available because there is no response data associated with the item bank parameters. To obtain person estimates with item bank parameters, an anchored analysis is carried out. This is an analysis using person responses from the postgraduate/undergraduate data with the item bank values.

The person estimates derived from the anchored analysis can then be compared and correlated with those derived from the postgraduate/undergraduate analysis. In this comparison, the origin and the units in both sets of data are taken into account.
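The two summaries named in (i) and (ii) can be sketched as below. The sketch assumes the two lists hold estimates for the same persons in the same order and that origin and unit have already been taken into account. The function names and the values are hypothetical, and the significance test for the mean difference is not specified in this section, so only the raw difference is reported.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of estimates."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_person_estimates(anchored, free):
    """Return (difference in means, correlation) for two sets of person locations."""
    mean_diff = statistics.mean(anchored) - statistics.mean(free)
    return mean_diff, pearson_r(anchored, free)


# Hypothetical person locations, not taken from the study.
anchored_estimates = [-0.8, 0.1, 0.7, 1.6]   # from the anchored analysis
free_estimates = [-1.0, 0.0, 0.5, 1.5]       # from the postgraduate/undergraduate analysis

mean_diff, r = compare_person_estimates(anchored_estimates, free_estimates)
print("difference in means:", mean_diff)
print("correlation:", r)
```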

From the document Evaluation of the Indonesian Scholastic Aptitude Test According to the Rasch Model and Its Paradigm (pages 118-124)