UNT Theses and Dissertations - Browse

Comparisons of Improvement-Over-Chance Effect Sizes for Two Groups Under Variance Heterogeneity and Prior Probabilities

Description: The distributional properties of improvement-over-chance, I, effect sizes derived from linear and quadratic predictive discriminant analysis (PDA) and from logistic regression analysis (LRA) for the two-group univariate classification problem were examined. Data were generated under varying levels of four data conditions: population separation, variance pattern, sample size, and prior probabilities. None of the indices provided acceptable estimates of effect for all the conditions examined. There were only a small number of conditions under which both accuracy and precision were acceptable. The results indicate that the decision of which method to choose is primarily determined by variance pattern and prior probabilities. Under variance homogeneity, any of the methods may be recommended. However, LRA is recommended when priors are equal or extreme, and linear PDA is recommended when priors are moderate. Under variance heterogeneity, selecting a recommended method is more complex. In many cases, more than one method could be used appropriately.
Date: May 2003
Creator: Alexander, Erika D.
Partner: UNT Libraries
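
The abstract does not reproduce the index itself, so the sketch below assumes the common definition of the improvement-over-chance effect size, I = (Ho - He)/(1 - He), where Ho is the observed hit rate and He the hit rate expected by chance; the simulated groups, priors, and linear classification rule are illustrative assumptions, not the study's actual design.

```python
# Sketch of an improvement-over-chance effect size for two-group univariate
# classification, assuming I = (Ho - He) / (1 - He) with a proportional-chance
# criterion; all parameters below are illustrative.
import numpy as np

rng = np.random.default_rng(2003)

# Two univariate normal groups (unequal variances allowed).
n1, n2 = 120, 80
x1 = rng.normal(loc=0.0, scale=1.0, size=n1)
x2 = rng.normal(loc=1.0, scale=2.0, size=n2)
x = np.concatenate([x1, x2])
y = np.concatenate([np.zeros(n1, dtype=int), np.ones(n2, dtype=int)])

# Linear classification rule: pooled variance, cutoff shifted by the priors.
p1, p2 = n1 / (n1 + n2), n2 / (n1 + n2)
m1, m2 = x1.mean(), x2.mean()
s2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
cutoff = 0.5 * (m1 + m2) + s2 * np.log(p1 / p2) / (m2 - m1)
predicted = (x > cutoff).astype(int)

# Observed hit rate versus the proportional-chance hit rate.
h_obs = np.mean(predicted == y)
h_chance = p1 ** 2 + p2 ** 2
I = (h_obs - h_chance) / (1.0 - h_chance)
print(f"hit rate = {h_obs:.3f}, chance = {h_chance:.3f}, I = {I:.3f}")
```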

The Effectiveness of a Mediating Structure for Writing Analysis Level Test Items From Text Based Instruction

Description: This study is concerned with the effect of placing text into a mediated structure form upon the generation of test items for analysis level domain referenced test construction. The item writing methodology used is the linguistic (operationally defined) item writing technology developed by Bormuth, Finn, Roid, Haladyna and others. This item writing methodology is compared to (1) the intuitive method based on Bloom's definition of analysis level test questions and (2) the intuitive method of item writing with keywords identified. A mediated structure was developed by coordinating or subordinating sentences in an essay by following five simple grammatical rules. Three test writers each composed a ten-item test using each of the three methodologies based on a common essay. Tests were administered to 102 Composition 1 community college students. Students were asked to read the essay and complete one test form. Test forms by writer and method were randomly delivered. Analysis of variance showed no significant differences among methods or writers. Item analysis showed that no method of item writing resulted in items of consistent difficulty among test item writers. While the results of this study show no significant difference from the intuitive, traditional methods of item writing, analysis level test item generation using a mediating structure may yet prove useful to the classroom teacher with access to a computer. All three test writers agreed that test items were easier to write using the generative rules and mediated structure. Also, some relief was felt by the writers in that the method theoretically assured that an analysis level item was written.
Date: August 1989
Creator: Brasel, Michael D. (Michael David)
Partner: UNT Libraries

Effect of Rater Training and Scale Type on Leniency and Halo Error in Student Ratings of Faculty

Description: The purpose of this study was to determine if leniency and halo error in student ratings could be reduced by training the student raters and by using a Behaviorally Anchored Rating Scale (BARS) rather than a Likert scale. Two hypotheses were proposed. First, the ratings collected from the trained raters would contain less halo and leniency error than those collected from the untrained raters. Second, within the group of trained raters the BARS would contain less halo and leniency error than the Likert instrument.
Date: May 1987
Creator: Cook, Stuart S. (Stuart Sheldon)
Partner: UNT Libraries

A Comparison of Two Criterion-Referenced Item-Selection Techniques Utilizing Simulated Data with Item Pools that Vary in Degrees of Item Difficulty

Description: The problem of this study was to examine the equivalency of two different types of criterion-referenced item-selection techniques on simulated data as item pools varied in degrees of item difficulty. A pretest-posttest design was employed in which pass-fail scores were randomly generated for item pools of twenty-five items. From the item pools, the two techniques determined which items were to be used to make up twelve-item criterion-referenced tests. The twenty-five items also were rank ordered according to the discrimination power of the two techniques.
Date: May 1974
Creator: Davis, Robbie G.
Partner: UNT Libraries
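
The description does not name the two item-selection techniques, so the sketch below is purely illustrative: it generates pretest and posttest pass-fail data, computes two classic criterion-referenced discrimination indices (a pretest-posttest difference index and Brennan's B index, both assumptions here), and lets each index pick a twelve-item test from a twenty-five-item pool.

```python
# Illustrative criterion-referenced item selection from a 25-item pool;
# the two indices used are assumptions, not the dissertation's techniques.
import numpy as np

rng = np.random.default_rng(1974)
n_examinees, n_items = 60, 25

# Simulated 0/1 responses before and after instruction, plus a pass/fail
# mastery decision based on the posttest total score.
pre = (rng.random((n_examinees, n_items)) < 0.35).astype(int)
post = (rng.random((n_examinees, n_items)) < 0.70).astype(int)
mastery = post.sum(axis=1) >= 18          # arbitrary cut score

# Pretest-posttest difference index: p(correct, post) - p(correct, pre).
ppdi = post.mean(axis=0) - pre.mean(axis=0)

# Brennan's B index: p(correct | masters) - p(correct | nonmasters).
b_index = post[mastery].mean(axis=0) - post[~mastery].mean(axis=0)

# Each technique selects the twelve items it ranks as most discriminating.
pick_ppdi = sorted(np.argsort(ppdi)[::-1][:12].tolist())
pick_b = sorted(np.argsort(b_index)[::-1][:12].tolist())
print("difference-index picks:", pick_ppdi)
print("B-index picks:", pick_b)
```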

A State-Wide Survey on the Utilization of Instructional Technology by Public School Districts in Texas

Description: Effective utilization of instructional technology can provide a valuable method for the delivery of a school program, and enable a teacher to individualize according to student needs. Implementation of such a program is costly and requires careful planning and adequate staff development for school personnel. This study examined the degree of commitment by Texas school districts to the use of the latest technologies in their efforts to revolutionize education. Quantitative data were collected by using a survey that included five informational areas: (1) school district background, (2) funding for budget, (3) staff, (4) technology hardware, and (5) staff development. The study included 137 school districts representing the 5 University Interscholastic League (UIL) classifications (A through AAAAA). The survey was mailed to the school superintendents requesting that the persons most familiar with instructional technology be responsible for completing the questionnaires. Analysis of data examined the relationship between UIL classification and the amount of money expended on instructional technology. Correlation coefficients were determined between teachers receiving training in the use of technology and total personnel assigned to technology positions. Coefficients were also calculated between a district providing a plan for technology and employment of a coordinator for instructional technology. Significance was established at the .05 level. A significant relationship was determined between the total district budget and the amount of money allocated to instructional technology. There was a significant relationship between the number of teachers receiving training in technology and the number of personnel assigned to technology positions. A significant negative relationship was determined between the district having a long-range plan for technology and the employment of a full-time coordinator for one of the subgroups. An attempt was made to provide information concerning the effort by local school districts to provide technology for instructional purposes. Progress has been made, although additional funds will be ...
Date: May 1990
Creator: Hiett, Elmer D. (Elmer Donald)
Partner: UNT Libraries

A Comparison of IRT and Rasch Procedures in a Mixed-Item Format Test

Description: This study investigated the effects of test length (10, 20 and 30 items), scoring schema (proportion of dichotomous and polytomous scoring) and item analysis model (IRT and Rasch) on the ability estimates, test information levels and optimization criteria of mixed item format tests. Polytomous item responses to 30 items for 1000 examinees were simulated using the generalized partial-credit model and SAS software. Portions of the data were re-coded dichotomously over 11 structured proportions to create 33 sets of test responses including mixed item format tests. MULTILOG software was used to calculate the examinee ability estimates, standard errors, item and test information, reliability and fit indices. A comparison of IRT and Rasch item analysis procedures was made using SPSS software across ability estimates and standard errors of ability estimates using a 3 x 11 x 2 fixed factorial ANOVA. Effect sizes and power were reported for each procedure. Scheffe post hoc procedures were conducted on significant factors. Test information was analyzed and compared across the range of ability levels for all 66 design combinations. The results indicated that both test length and the proportion of items scored polytomously had a significant impact on the amount of test information produced by mixed item format tests. Generally, tests with 100% of the items scored polytomously produced the highest overall information. This seemed to be especially true for examinees with lower ability estimates. Optimality comparisons were made between IRT and Rasch procedures based on standard error rates for the ability estimates, marginal reliabilities and fit indices (-2LL). The only significant differences reported involved the standard error rates for both the IRT and Rasch procedures. This result must be viewed in light of the fact that the effect size reported was negligible. Optimality was found to be highest when longer tests and higher proportions of polytomous ...
Date: August 2003
Creator: Kinsey, Tari L.
Partner: UNT Libraries
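
As a rough illustration of the data-generation step described above (the study used the generalized partial-credit model with SAS), the following sketch simulates polytomous responses under that model in numpy; the item parameters, category counts, and the dichotomous re-coding rule are assumptions.

```python
# Simulate polytomous responses under the generalized partial-credit model;
# parameter values and the re-coding rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2003)
n_examinees, n_items, n_cats = 1000, 30, 4       # categories scored 0..3

theta = rng.normal(size=n_examinees)                          # abilities
a = rng.uniform(0.7, 1.5, size=n_items)                       # discriminations
b = np.sort(rng.normal(size=(n_items, n_cats - 1)), axis=1)   # step difficulties

def gpcm_probs(theta_i, a_j, b_j):
    """Category probabilities for one examinee on one item."""
    # Cumulative sums of a*(theta - b_v); category 0 contributes an empty sum.
    steps = a_j * (theta_i - b_j)
    numer = np.exp(np.concatenate(([0.0], np.cumsum(steps))))
    return numer / numer.sum()

responses = np.empty((n_examinees, n_items), dtype=int)
for i in range(n_examinees):
    for j in range(n_items):
        responses[i, j] = rng.choice(n_cats, p=gpcm_probs(theta[i], a[j], b[j]))

# Re-coding a proportion of the items dichotomously (here, correct = top
# category on the first 15 items) yields one mixed-format test form.
dichotomized = (responses[:, :15] == n_cats - 1).astype(int)
print(responses.shape, dichotomized.mean().round(3))
```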

Bias and Precision of the Squared Canonical Correlation Coefficient under Nonnormal Data Conditions

Description: This dissertation: (a) investigated the degree to which the squared canonical correlation coefficient is biased in multivariate nonnormal distributions and (b) identified formulae that adjust the squared canonical correlation coefficient (Rc2) such that it most closely approximates the true population effect under normal and nonnormal data conditions. Five conditions were manipulated in a fully-crossed design to determine the degree of bias associated with Rc2: distribution shape, variable sets, sample size to variable ratios, and within- and between-set correlations. Very few of the condition combinations produced acceptable amounts of bias in Rc2, but those that did were all found with first function results. The sample size to variable ratio (n:v) was determined to have the greatest impact on the bias associated with the Rc2 for the first, second, and third functions. The variable set condition also affected the accuracy of Rc2, but for the second and third functions only. The kurtosis levels of the marginal distributions (b2), and the between- and within-set correlations demonstrated little or no impact on the bias associated with Rc2. Therefore, it is recommended that researchers use n:v ratios of at least 10:1 in canonical analyses, although greater n:v ratios have the potential to produce even less bias. Furthermore, because it was determined that b2 did not impact the accuracy of Rc2, one can be somewhat confident that, with marginal distributions possessing homogeneous kurtosis levels ranging anywhere from -1 to 8, Rc2 will likely be as accurate as that resulting from a normal distribution. Because the majority of Rc2 estimates were extremely biased, it is recommended that all Rc2 effects, regardless of the function from which they result, be adjusted using an appropriate adjustment formula. If no rationale exists for the use of another formula, the Rozeboom-2 would likely be a safe choice given that it produced the greatest ...
Date: August 2006
Creator: Leach, Lesley Ann Freeny
Partner: UNT Libraries
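
A minimal sketch of computing the squared canonical correlation, Rc2, for each function from sample data follows; the simulated population, variable counts, and n:v ratio are assumptions, and the adjustment formulas compared in the study (such as Rozeboom-2) are not reproduced here.

```python
# Squared canonical correlations from sample data via the SVD of the
# whitened cross-correlation matrix; the simulated population is illustrative.
import numpy as np

rng = np.random.default_rng(2006)
n, p, q = 200, 3, 3                              # roughly a 33:1 n:v ratio

# Two correlated variable sets drawn from a multivariate normal population.
cov = np.full((p + q, p + q), 0.3) + 0.7 * np.eye(p + q)
data = rng.multivariate_normal(np.zeros(p + q), cov, size=n)
X, Y = data[:, :p], data[:, p:]

def squared_canonical_correlations(X, Y):
    """Rc2 for each canonical function."""
    k = X.shape[1]
    R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    Rxx, Ryy, Rxy = R[:k, :k], R[k:, k:], R[:k, k:]

    def inv_sqrt(M):                 # symmetric inverse square root
        vals, vecs = np.linalg.eigh(M)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    K = inv_sqrt(Rxx) @ Rxy @ inv_sqrt(Ryy)
    return np.linalg.svd(K, compute_uv=False) ** 2

print("Rc2 by function:", np.round(squared_canonical_correlations(X, Y), 3))
```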

An Investigation of the Effect of Violating the Assumption of Homogeneity of Regression Slopes in the Analysis of Covariance Model upon the F-Statistic

Description: The study seeks to determine the effect upon the F-statistic of violating the assumption of homogeneity of regression slopes in the one-way, fixed-effects analysis of covariance model. The study employs a Monte Carlo simulation technique to vary the degree of heterogeneity of regression slopes with varied sample sizes within experiments to determine the effect of such conditions. One hundred and eighty-three simulations were used.
Date: August 1972
Creator: McClaran, Virgil Rutledge
Partner: UNT Libraries
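
A hedged sketch of one way to run this kind of Monte Carlo is shown below: with no true group effect but deliberately heterogeneous within-group slopes, the usual common-slope, one-way fixed-effects ANCOVA F test is applied and the empirical Type I error rate is tallied. The slopes, sample sizes, and replication count are assumptions, not the study's 183 simulated conditions.

```python
# Empirical Type I error of the ANCOVA group-effect F test when the
# homogeneity-of-slopes assumption is violated; design values are assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1972)
slopes = [0.2, 0.6, 1.0]            # deliberately heterogeneous slopes
n_per_group, n_reps, alpha = 20, 2000, 0.05
k = len(slopes)

def ancova_f(y, x, g, k):
    """F and p for the group effect in y ~ group + x with one pooled slope."""
    n = y.size
    G = np.zeros((n, k))
    G[np.arange(n), g] = 1.0                        # group indicator columns
    full = np.column_stack([G, x])                  # cell means + common slope
    reduced = np.column_stack([np.ones(n), x])      # intercept + slope only
    sse = lambda M: np.sum((y - M @ np.linalg.lstsq(M, y, rcond=None)[0]) ** 2)
    df1, df2 = k - 1, n - k - 1
    F = ((sse(reduced) - sse(full)) / df1) / (sse(full) / df2)
    return F, stats.f.sf(F, df1, df2)

rejections = 0
for _ in range(n_reps):
    x = rng.normal(size=k * n_per_group)
    g = np.repeat(np.arange(k), n_per_group)
    # True null: group means of y differ only through the unequal slopes.
    y = np.array(slopes)[g] * x + rng.normal(size=x.size)
    rejections += ancova_f(y, x, g, k)[1] < alpha

print("empirical Type I error rate:", rejections / n_reps)
```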

A Comparison of Some Continuity Corrections for the Chi-Squared Test in 3 x 3, 3 x 4, and 3 x 5 Tables

Description: This study was designed to determine whether chi-squared based tests for independence give reliable estimates (as compared to the exact values provided by Fisher's exact probabilities test) of the probability of a relationship between the variables in 3 X 3, 3 X 4, and 3 X 5 contingency tables when the sample size is 10, 20, or 30. In addition to the classical (uncorrected) chi-squared test, four methods for continuity correction were compared to Fisher's exact probabilities test. The four methods were Yates' correction, two corrections attributed to Cochran, and Mantel's correction. The study was modeled after a similar comparison conducted on 2 X 2 contingency tables and published by Michael Haber.
Date: May 1987
Creator: Mullen, Jerry D. (Jerry Davis)
Partner: UNT Libraries
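
For reference, the sketch below computes the classical (uncorrected) chi-squared statistic for a small 3 x 3 table together with a Yates-style continuity correction (0.5 subtracted from each |observed - expected|); applying that correction to 3 x c tables is an illustrative assumption here, and the Cochran and Mantel corrections and Fisher's exact probabilities test are not reproduced.

```python
# Uncorrected chi-squared test of independence and a Yates-style continuity
# correction for a small r x c table; the example table is illustrative.
import numpy as np
from scipy import stats

observed = np.array([[3, 1, 2],
                     [2, 4, 1],
                     [1, 2, 4]])                  # 3 x 3 table, n = 20

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()
expected = row_totals @ col_totals / n
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)

chi2_uncorrected = np.sum((observed - expected) ** 2 / expected)
adj = np.maximum(np.abs(observed - expected) - 0.5, 0.0)   # continuity correction
chi2_corrected = np.sum(adj ** 2 / expected)

for name, stat in [("uncorrected", chi2_uncorrected), ("Yates-type", chi2_corrected)]:
    print(f"{name}: chi2 = {stat:.3f}, p = {stats.chi2.sf(stat, df):.3f}")
```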

A Comparison of the Effects of Different Sizes of Ceiling Rules on the Estimates of Reliability of a Mathematics Achievement Test

Description: This study compared the estimates of reliability made using one, two, three, four, five, and unlimited consecutive failures as ceiling rules in scoring a mathematics achievement test which is part of the Iowa Tests of Basic Skills (ITBS), Form 8. There were 700 students randomly selected from a population (N=2640) of students enrolled in the eight grades in a large urban school district in the southwestern United States. These 700 students were randomly divided into seven subgroups so that each subgroup had 100 students. The responses of all those students to three subtests of the mathematics achievement battery, which included mathematical concepts (44 items), problem solving (32 items), and computation (45 items), were analyzed to obtain the item difficulties and a total score for each student. The items in each subtest then were rearranged based on the item difficulties from the highest to the lowest value. In each subgroup, the methods using one, two, three, four, five, and unlimited consecutive failures as the ceiling rules were applied to score the individual responses. The total score for each individual was the sum of the correct responses prior to the point described by the ceiling rule. The correct responses after the ceiling rule were not part of the total score. The estimate of reliability for each method was computed with the alpha coefficient procedure of SPSS-X. The results of this study indicated that the estimates of reliability using two, three, four, and five consecutive failures as the ceiling rules were an improvement over those using one and unlimited consecutive failures.
Date: May 1987
Creator: Somboon Suriyawongse
Partner: UNT Libraries
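
The scoring procedure lends itself to a short sketch: items are ordered from easiest to hardest, everything after a run of k consecutive failures is scored zero, and coefficient alpha is computed on the resulting item scores. The simulated responses and the direct alpha computation (in place of the SPSS-X routine cited above) are assumptions for illustration.

```python
# Ceiling-rule scoring of difficulty-ordered items followed by coefficient
# alpha; the simulated responses are illustrative, not ITBS data.
import numpy as np

rng = np.random.default_rng(1987)
n_students, n_items, k_fail = 100, 44, 3

# Dichotomous responses with columns already ordered easiest -> hardest.
p_correct = np.linspace(0.9, 0.2, n_items)
responses = (rng.random((n_students, n_items)) < p_correct).astype(int)

def apply_ceiling(row, k_fail):
    """Score 0 for every item after the first run of k_fail consecutive misses."""
    out, misses = row.copy(), 0
    for j, r in enumerate(row):
        misses = misses + 1 if r == 0 else 0
        if misses == k_fail:
            out[j + 1:] = 0
            break
    return out

scored = np.array([apply_ceiling(row, k_fail) for row in responses])

def cronbach_alpha(items):
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

print("alpha, unlimited rule:", round(cronbach_alpha(responses), 3))
print(f"alpha, {k_fail}-failure ceiling:", round(cronbach_alpha(scored), 3))
```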

A Monte Carlo Analysis of Experimentwise and Comparisonwise Type I Error Rate of Six Specified Multiple Comparison Procedures When Applied to Small k's and Equal and Unequal Sample Sizes

Description: The problem of this study was to determine the differences in experimentwise and comparisonwise Type I error rate among six multiple comparison procedures when applied to twenty-eight combinations of normally distributed data. These were the Least Significant Difference, the Fisher-protected Least Significant Difference, the Student-Newman-Keuls Test, the Duncan Multiple Range Test, the Tukey Honestly Significant Difference, and the Scheffe Significant Difference. The Spjøtvoll-Stoline and Tukey-Kramer HSD modifications were used for unequal n conditions. A Monte Carlo simulation was used for twenty-eight combinations of k and n. The scores were normally distributed (µ=100; σ=10). Specified multiple comparison procedures were applied under two conditions: (a) all experiments and (b) experiments in which the F-ratio was significant (0.05). Error counts were maintained over 1000 repetitions. The FLSD held experimentwise Type I error rate to nominal alpha for the complete null hypothesis. The FLSD was more sensitive to sample mean differences than the HSD while protecting against experimentwise error. The unprotected LSD was the only procedure to yield comparisonwise Type I error rate at nominal alpha. The SNK and MRT error rates fell between the FLSD and HSD rates. The SSD error rate was the most conservative. Use of the harmonic mean of the two unequal sample n's (HSD-TK) yielded uniformly better results than use of the minimum n (HSD-SS). Bernhardson's formulas controlled the experimentwise Type I error rate of the LSD and MRT to nominal alpha, but pushed the HSD below the 0.95 confidence interval. Use of the unprotected HSD produced fewer significant departures from nominal alpha. The formulas had no effect on the SSD.
Date: December 1985
Creator: Yount, William R.
Partner: UNT Libraries
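
The error-counting logic can be illustrated for two of the procedures, the unprotected LSD and the Fisher-protected LSD, under the complete null hypothesis; the sketch below assumes k = 4 groups, n = 10 per group, and 1000 repetitions, and the other procedures differ only in the critical value applied to each pairwise comparison.

```python
# Experimentwise and comparisonwise Type I error rates for the unprotected
# and Fisher-protected LSD under a complete null; k, n, and reps are assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1985)
k, n, alpha, n_reps = 4, 10, 0.05, 1000
pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]

ew_lsd = ew_flsd = cw_errors = 0
for _ in range(n_reps):
    groups = rng.normal(loc=100, scale=10, size=(k, n))   # equal true means
    msw = groups.var(axis=1, ddof=1).mean()               # pooled error variance
    df_error = k * (n - 1)
    f_p = stats.f_oneway(*groups).pvalue                  # omnibus ANOVA F test

    sig_pairs = 0
    for i, j in pairs:
        t = (groups[i].mean() - groups[j].mean()) / np.sqrt(msw * 2 / n)
        if 2 * stats.t.sf(abs(t), df_error) < alpha:
            sig_pairs += 1

    cw_errors += sig_pairs               # comparisonwise: every false rejection
    ew_lsd += sig_pairs > 0              # experimentwise: any false rejection
    ew_flsd += (f_p < alpha) and (sig_pairs > 0)   # protected by the omnibus F

print("comparisonwise rate, LSD :", cw_errors / (n_reps * len(pairs)))
print("experimentwise rate, LSD :", ew_lsd / n_reps)
print("experimentwise rate, FLSD:", ew_flsd / n_reps)
```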