Search Results

A Comparison of Traditional Norming and Rasch Quick Norming Methods

Description: The simplicity and ease of use of the Rasch procedure is a decided advantage. The test user needs only two numbers: the frequency of persons who answered each item correctly and the Rasch-calibrated item difficulty, usually a part of an existing item bank. Norms can be computed quickly for any specific group of interest. In addition, once the selected items from the calibrated bank are normed, any test, built from the item bank, is automatically norm-referenced. Thus, it was concluded that the Rasch quick norm procedure is a meaningful alternative to traditional classical true score norming for test users who desire normative data.
Date: August 1993
Creator: Bush, Joan Spooner
Partner: UNT Libraries

The Effect of Training in Test Item Writing on Test Performance of Junior High Students

Description: Students in an inner city junior high school in North Central Texas participated in a study whose purpose was to examine the effect of training in test item construction on their later test performance. The experimental group underwent twelve weeks of instruction using the Test Item Construction Method (TICM). In these sessions students learned to develop test items similar to those on which they were tested annually by the state via the Texas Assessment of Academic Skills (TAAS). The TICM aligned with state mandated test specifications.
Date: May 1997
Creator: Tunks, Jeanne L.
Partner: UNT Libraries

Early Childhood Educators' Beliefs and Practices about Assessment

Description: Standardized tests are being administered to young children in greater numbers in recent years than ever before. Many more important educational decisions about children are being based on the results of these tests. This practice continues to escalate despite early childhood professional organizations' calls for a ban of standardized testing for children eight years of age and younger. Many early childhood educators have become dissatisfied with multiple-choice testing as a measure of student learning and are increasingly using various forms of alternative assessment to replace the more traditional testing formats. Teachers seem to be caught in the middle of the controversy between standardized testing and alternative assessment. This research examined what early childhood educators in one north Texas school district believe about assessment of young children and what assessment methods they report using in their classrooms, as well as factors which influence those beliefs and practices. The sample for this study was 84 teachers who taught prekindergarten through third grade. An eight-page questionnaire provided quantitative data and interviews and the researcher's journal provided qualitative data.
Date: May 1994
Creator: Diffily, Deborah
Partner: UNT Libraries

Testing in American Schools: Asking the Right Questions

Description: In this report, OTA places testing in its historical and policy context, examines the reasons for testing and the ways it is done, and identifies particular ways Federal policy affects the picture. The report also explores new approaches to testing that derive from modern technology and cognitive research.
Date: February 1992
Creator: United States. Congress. Office of Technology Assessment.
Partner: UNT Libraries Government Documents Department

Using regression analysis to investigate relationships of ASVAB selector composites to end-of-course grades for students in aircraft maintenance training programs in the Air Force

Description: Aircraft maintenance training programs in the Air Force have evolved from an almost exclusively mechanical orientation to one that is largely electronic. The Armed Services Vocational Aptitude Battery (ASVAB) with its four selector composites (Mechanical, Administration, General, and Electronic) has been in use for over 20 years. The mechanical (M) composite score is used to identify those who will be trained in aircraft maintenance.
Date: December 1995
Creator: Byrd, John L. (John Luclon)
Partner: UNT Libraries

The Effect on Group IQ Test Performance of Modification of Verbal Repertoires Related to Motivation, Anxiety, and Test-Wiseness

Description: To investigate the efficacy of a cognitive approach applied to problems of motivation, anxiety, and test-wiseness in a group test situation, programmed texts were used to condition a repertoire of verbal responses relevant to each of these problems. Five sixth-grade classes comprising 118 students in total were administered Otis-Lennon Mental Ability Tests in a pretest-posttest design. For the five groups, ANCOVA demonstrated a significant effect on raw scores, but not on IQ. Significant IQ and raw score gains were found for the combination group over the control group. Due to treatment, lower IQ level students of the combination group made greater raw score gains than upper IQ level students.
Date: December 1975
Creator: Petty, Nancy E.
Partner: UNT Libraries

The Effect of Criterion-Referenced Tests on the Acquisition of Mathematical Skills and the Mastery of Objectives in Fifth-Grade Students

Description: This study is a description and analysis of the effect of criterion-referenced test data on the acquisition of math skills and the mastery of selected objectives in fifth-grade students. The first chapter includes the introduction, statement of the problem, purposes of the study, statement of the hypotheses, background and significance, definition of terms, limitations, basic assumptions, and procedures for collecting data. The second chapter is a review of the literature pertaining to criterion-referenced testing and also includes a review of studies utilizing criterion-referenced test material. The third chapter describes the population being studied, the instruments used to measure achievement, and procedures for treatment of the data. The fourth chapter presents an analysis of the data collected for the study and a discussion of the findings. The fifth and final chapter presents a summary of the study, findings, conclusions, and recommendations pertaining to future research in the utilization of criterion-referenced testing. The subjects in this study were sixty fifth-grade students attending Lakeland Elementary in the Lewisville Public School System, who comprised the experimental group, and sixty fifth-grade students attending Central Elementary in the same district, who comprised the control group. The Comprehensive Test of Basic Skills (Form G Level 2) and the Prescriptive Mathematics Inventory (Aqua Level) were administered to both groups, with the pretest occurring in September 1973 and the posttest being administered in April 1974. Analysis of covariance and chi-square goodness of fit were the techniques used to analyze the data statistically. Significant change was found to take place in the experimental group in mastering a greater proportion of the objectives selected for this study.
The socio-economic level and educational background of the parents of the subjects in this study proved to be a significant factor in mastering the objectives selected for this study. The hypotheses utilizing ...
Date: August 1974
Creator: Downing, Clayton W.
Partner: UNT Libraries

A Comparison of Two Criterion-Referenced Item-Selection Techniques Utilizing Simulated Data with Item Pools that Vary in Degrees of Item Difficulty

Description: The problem of this study was to examine the equivalency of two different types of criterion-referenced item-selection techniques on simulated data as item pools varied in degrees of item difficulty. A pretest-posttest design was employed in which pass-fail scores were randomly generated for item pools of twenty-five items. From the item pools, the two techniques determined which items were to be used to make up twelve-item criterion-referenced tests. The twenty-five items also were rank ordered according to the discrimination power of the two techniques.
Date: May 1974
Creator: Davis, Robbie G.
Partner: UNT Libraries

Ability Estimation Under Different Item Parameterization and Scoring Models

Description: A Monte Carlo simulation study investigated the effect of scoring format, item parameterization, threshold configuration, and prior ability distribution on the accuracy of ability estimation given various IRT models. Item response data on 30 items from 1,000 examinees was simulated using known item parameters and ability estimates. The item response data sets were submitted to seven dichotomous or polytomous IRT models with different item parameterization to estimate examinee ability. The accuracy of the ability estimation for a given IRT model was assessed by the recovery rate and the root mean square errors. The results indicated that polytomous models produced more accurate ability estimates than the dichotomous models, under all combinations of research conditions, as indicated by higher recovery rates and lower root mean square errors. For the item parameterization models, the one-parameter model out-performed the two-parameter and three-parameter models under all research conditions. Among the polytomous models, the partial credit model had more accurate ability estimation than the other three polytomous models. The nominal categories model performed better than the general partial credit model and the multiple-choice model with the multiple-choice model the least accurate. The results further indicated that certain prior ability distributions had an effect on the accuracy of ability estimation; however, no clear order of accuracy among the four prior distribution groups was identified due to an interaction between prior ability distribution and threshold configuration. The recovery rate was lower when the test items had categories with unequal threshold distances, were close at one end of the ability/difficulty continuum, and were administered to a sample of examinees whose population ability distribution was skewed to the same end of the ability continuum.
Date: May 2002
Creator: Si, Ching-Fung B.
Partner: UNT Libraries

A Comparison of IRT and Rasch Procedures in a Mixed-Item Format Test

Description: This study investigated the effects of test length (10, 20 and 30 items), scoring schema (proportion of dichotomous and polytomous scoring) and item analysis model (IRT and Rasch) on the ability estimates, test information levels and optimization criteria of mixed item format tests. Polytomous item responses to 30 items for 1000 examinees were simulated using the generalized partial-credit model and SAS software. Portions of the data were re-coded dichotomously over 11 structured proportions to create 33 sets of test responses including mixed item format tests. MULTILOG software was used to calculate the examinee ability estimates, standard errors, item and test information, reliability and fit indices. A comparison of IRT and Rasch item analysis procedures was made using SPSS software across ability estimates and standard errors of ability estimates using a 3 x 11 x 2 fixed factorial ANOVA. Effect sizes and power were reported for each procedure. Scheffe post hoc procedures were conducted on significant factors. Test information was analyzed and compared across the range of ability levels for all 66 design combinations. The results indicated that both test length and the proportion of items scored polytomously had a significant impact on the amount of test information produced by mixed item format tests. Generally, tests with 100% of the items scored polytomously produced the highest overall information. This seemed to be especially true for examinees with lower ability estimates. Optimality comparisons were made between IRT and Rasch procedures based on standard error rates for the ability estimates, marginal reliabilities and fit indices (-2LL). The only significant differences reported involved the standard error rates for both the IRT and Rasch procedures. This result must be viewed in light of the fact that the effect size reported was negligible. Optimality was found to be highest when longer tests and higher proportions of polytomous ...
Date: August 2003
Creator: Kinsey, Tari L.
Partner: UNT Libraries

An Empirical Investigation of Matrix Sampling Involving Multiple Item Samples in a Two-Factor Analysis of Variance Design

Description: The primary purposes of this study were: (1) to study empirically differences that might occur among item-samples; (2) to compare empirically the effect of test item samples on matrix sampling estimates of the mean and variance of a population of test scores; and (3) to study empirically an analysis of variance design through multiple matrix sampling.
Date: December 1971
Creator: Newell, James Archie
Partner: UNT Libraries

A Study of some Relationships between Level of Self-Concept, Academic Achievement and Classroom Adjustment

Description: The purpose of this study is two-fold: (1) to evaluate an instrument for measuring the self-concept of middle grade children; and (2) to determine the relationship of a middle-grade child's self-concept to his peer status, his classification by the teacher as a problem in behavior or classroom management, and to his academic achievement.
Date: August 1953
Creator: Reeder, Thelma Adams
Partner: UNT Libraries

An Investigation of Factors Affecting Test Equating in Latent Trait Theory

Description: The study investigated five factors which can affect the equating of scores from two tests onto a common score scale. The five factors studied were: (a) distribution type (i.e., normal versus uniform); (b) standard deviation of item difficulties (i.e., .68, .95, .99); (c) test length or number of test items (i.e., 50, 100, 200); (d) number of common items (i.e., 10, 20, 30); and (e) sample size (i.e., 100, 300, 500). The significant two-way interaction effects were for common item length and test length, standard deviation of item difficulties and distribution type, and standard deviation of item difficulties and sample size.
Date: August 1998
Creator: Suanthong, Surintorn
Partner: UNT Libraries

The Effects of Computer Performance Assessment on Student Scores in a Computer Applications Course

Description: The goal of this study was to determine if performance-based tests should be routinely administered to students in computer application courses. The purpose was to determine the most appropriate mode of testing for individuals taking a computer applications course. The study is divided into areas of assessment, personality traits, and computer attitudes.
Date: July 1994
Creator: Casey, Sue Hartness
Partner: UNT Libraries

The impact of selected school factors on the test performance of African-American economically disadvantaged elementary students.

Description: In order for America to retain its superior position in a global economy it is imperative that all students receive educational opportunities that will prepare them for the future. Currently, African-American economically disadvantaged students in the United States perform lower on standardized tests than their grade and age-level peers. Educators must find ways to improve the performance of students in this group in order to maximize future opportunities. Through a mixed-methodology approach, the current study finds three school factors that may positively impact the performance of African-American economically disadvantaged students: high expectations, student-teacher relationships and teacher effectiveness. Quantitative and qualitative analysis provides perspectives from principals primarily from a large urban school district on the impact of these factors on student performance.
Date: May 2006
Creator: Griffin, Wynette O.
Partner: UNT Libraries

An examination of computer anxiety related to achievement on paper-and-pencil and computer-based aircraft maintenance knowledge testing of United States Air Force technical training students.

Description: The purpose of this study was to determine whether varying levels of computer anxiety have an effect on computer-based testing of United States Air Force technical training students. The first chapter presents an overview of computer-based testing, defines key terms, and identifies questions addressed in the research. The rationale for conducting this study was that little research had been done in this area. The second chapter contains a review of the pertinent literature related to computer-based testing, computer anxiety, test reliability, validity, and gender differences in computer use. Due to the lack of understanding concerning any effects of computer anxiety on computer-based testing, this has been a worthwhile topic to explore, and it makes a significant contribution to the training field. The third chapter describes the qualitative research methodology used to conduct the study. The primary methodology was an analysis of variance comparison for groups of individuals who displayed high or low computer anxiety to their respective mean computer-based or paper-based aircraft maintenance knowledge testing scores. The research population consisted of United States Air Force aircraft maintenance craftsmen students attending training at Sheppard Air Force Base, Texas. The fourth chapter details the findings of the study. The findings indicate that there was no significant difference between the groups of students rated with high computer anxiety and low computer anxiety while testing with computers. Additionally, no significant differences were detected while testing alternative hypotheses covering differences between groups of students rated with high computer anxiety and low computer anxiety testing by traditional paper-and-pencil methods. Finally, a reference section identifying the literature used in the preparation of this dissertation is also included.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: May 2002
Creator: McVay, Richard B.
Partner: UNT Libraries

CT3 as an Index of Knowledge Domain Structure: Distributions for Order Analysis and Information Hierarchies

Description: The problem with which this study is concerned is articulating all possible CT3 and KR21 reliability measures for every case of a 5x5 binary matrix (32,996,500 possible matrices). The study has three purposes. The first purpose is to calculate CT3 for every matrix and compare the results to the proposed optimum range of .3 to .5. The second purpose is to compare the results from the calculation of KR21 and CT3 reliability measures. The third purpose is to calculate CT3 and KR21 on every strand of a class test whose item set has been reduced using the difficulty strata identified by Order Analysis. The study was conducted by writing a computer program to articulate all possible 5 x 5 matrices. The program also calculated CT3 and KR21 reliability measures for each matrix. The nonparametric technique of Order Analysis was applied to two sections of test items to stratify the items into difficulty levels. The difficulty levels were used to reduce the item set from 22 to 9 items. All possible strands or chains of these items were identified so that both reliability measures (CT3 and KR21) could be calculated. One major finding of this study indicates that .3 to .5 is a desirable range for CT3 (cumulative p=.86 to p=.98) if cumulative frequencies are measured. A second major finding is that the KR21 reliability measure produced an invalid result more than half the time. The last major finding is that CT3, rescaled to range between 0 and 1, supports De Vellis' guidelines for reliability measures. The major conclusion is that CT3 is a better measure of reliability since it considers both inter- and intra-item variances.
Date: December 2002
Creator: Swartz Horn, Rebecca
Partner: UNT Libraries

A Comparison of a Computer-Administered Test and a Paper and Pencil Test Using Normally Achieving and Mathematically Disabled Young Children

Description: This study investigated whether a computer-administered mathematics test can provide equivalent results for normal and mathematically disabled students while retaining similar psychometric characteristics of an equivalent paper and pencil version of the test. The overall purpose of the study was twofold. First, the viability of using computer administered assessment with elementary school children was examined. Second, by investigating items on the computer administered mathematics test for potential bias between normally achieving and mathematically disabled populations, it was possible to determine whether certain mathematical concepts consistently distinguish between the two ability groups.
Date: May 1997
Creator: Swain, Colleen R. (Colleen Ruth)
Partner: UNT Libraries

Would You Do Your Homework for a Chance to Improve Your Quiz Score?

Description: Students who complete homework generally do better on measures of academic performance such as quizzes, exams, and overall course grades. We examined the effects of contingent access to second quiz attempts on the percentage of undergraduate students completing homework to mastery. The study was conducted in an Introduction to Behavior Analysis course that, historically, had only 70% of students on average completing homework. An adapted multiple baseline design across sections was used for four sections of the course. Students could access a second quiz attempt by meeting either of the following criteria: receiving a 16 out of 20 on the first quiz attempt or meeting the mastery criterion of the homework (45 out of 50). We also examined the relation between homework accuracy and scores on first quiz attempts. Two sections did not show a difference in homework completion with and without the second quiz attempt contingency. One section showed more sensitivity toward the contingency once it was withdrawn, and one section never had the removal of the contingency and had the highest percentages of students completing their homework. When analyzing the relation of homework accuracy to the corresponding first quiz attempts, homework accuracy appeared to be related to higher scores on first quiz attempts across all sections. Quiz scores were typically a letter grade higher for students who completed homework to mastery compared to students who did not. Although there are limitations to the current study, the results suggest the second quiz contingency may impact homework completion.
Date: August 2014
Creator: Zimmerman, Karl J.
Partner: UNT Libraries

An Analysis of the Relationship of the Scores Made by Students on Aptitudes "G" and "V" and Parts "H" and "I" of the General Aptitude Test Battery and the Academic Grades Made in Industrial Arts

Description: This study analyzes the converted scores made on Aptitudes "G" (intelligence) and "V" (verbal) and the raw scores made on Part "H" (three-dimensional space) and Part "I" of the General Aptitude Test Battery by students enrolled in beginning industrial arts courses, advanced industrial arts courses, and beginning English at North Texas State College, Denton, Texas, and the academic grades made by these same students in order to determine what relationship exists between both the converted and raw scores made on the foregoing parts of the GATB and academic grades.
Date: August 1952
Creator: Gray, Noel Oren
Partner: UNT Libraries