The Characteristics and Properties of the Threshold and Squared-Error Criterion-Referenced Agreement Indices
Description: Educators who use criterion-referenced measurement to classify an examinee as either a master or a nonmaster on the basis of current performance need to know how accurate and consistent those mastery decisions are. This study examined the sampling distribution characteristics of two reliability indices that use the squared-error agreement function: Livingston's k^2(X,Tx) and Brennan and Kane's M(C). It also examined the sampling distribution characteristics of five indices that use the threshold agreement function: Subkoviak's Pc, Huynh's p and k, and Swaminathan's p and k. The seven methods of calculating reliability were compared under varying conditions of sample size, test length, and criterion (cutoff) score. Computer-generated data provided randomly parallel test forms for a population of N = 2000 cases. From this population, 1000 samples were drawn with replacement, and each of the seven reliability indices was calculated for each sample. Descriptive statistics were collected for each sample set and examined for distribution characteristics. In addition, the mean value of each index was compared to the population parameter value of consistent mastery/nonmastery classifications. The results indicated that the sampling distributions of all seven reliability indices approach normality as sample size increases. The results also indicated that Huynh's p was the most accurate estimate of the population parameter, with the smallest degree of negative bias. Swaminathan's p was the next best estimate of the population parameter, but it has the disadvantage of requiring two test administrations, whereas Huynh's p requires only one.
Date: May 1988
Creator: Dutschke, Cynthia F. (Cynthia Fleming)
Partner: UNT Libraries
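
The threshold-agreement idea and the resampling procedure in the description can be illustrated with a minimal sketch. It is not the study's code. It assumes the two-administration indices are the observed proportion of consistent mastery/nonmastery classifications (p) and Cohen's kappa (k), and it assumes an examinee is a master when the score is at or above the cutoff. The function names, the default sample size, and the seed are hypothetical choices made for illustration.

```python
import random


def classify(scores, cutoff):
    """Return True (master) for each score at or above the cutoff (assumed rule)."""
    return [s >= cutoff for s in scores]


def agreement_indices(form_a, form_b, cutoff):
    """Proportion of consistent classifications and Cohen's kappa for two forms."""
    a = classify(form_a, cutoff)
    b = classify(form_b, cutoff)
    n = len(a)
    # Observed agreement: master on both forms, or nonmaster on both forms.
    p_obs = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement from the marginal mastery rates on each form.
    rate_a = sum(a) / n
    rate_b = sum(b) / n
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    # Kappa is undefined when chance agreement equals 1.
    kappa = (p_obs - p_chance) / (1 - p_chance) if p_chance < 1 else float("nan")
    return p_obs, kappa


def resample_indices(form_a, form_b, cutoff, n_samples=1000, sample_size=50, seed=0):
    """Draw examinee pairs with replacement and compute both indices per sample."""
    rng = random.Random(seed)
    pairs = list(zip(form_a, form_b))
    results = []
    for _ in range(n_samples):
        sample = [rng.choice(pairs) for _ in range(sample_size)]
        sample_a = [pair[0] for pair in sample]
        sample_b = [pair[1] for pair in sample]
        results.append(agreement_indices(sample_a, sample_b, cutoff))
    return results


if __name__ == "__main__":
    # Toy scores for six examinees on two forms (made-up illustration data).
    scores_a = [8, 9, 3, 4, 7, 2]
    scores_b = [9, 7, 4, 8, 6, 3]
    print(agreement_indices(scores_a, scores_b, cutoff=7))
```

The mean of the resampled p values can then be compared with the value computed on the full population to gauge bias, which mirrors the comparison the description reports. The single-administration indices (Huynh's and Subkoviak's) rely on a score model and are not covered by this sketch.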