UNT Theses and Dissertations - 48 Matching Results

Search Results

An Investigation of the Effect of Violating the Assumption of Homogeneity of Regression Slopes in the Analysis of Covariance Model upon the F-Statistic

Description: The study seeks to determine the effect upon the F-statistic of violating the assumption of homogeneity of regression slopes in the one-way, fixed-effects analysis of covariance model. The study employs a Monte Carlo simulation technique, varying the degree of heterogeneity of regression slopes and the sample sizes within experiments to determine the effect of such conditions. One hundred eighty-three simulations were used.
Date: August 1972
Creator: McClaran, Virgil Rutledge
Partner: UNT Libraries
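
A minimal sketch of the kind of Monte Carlo experiment this entry describes, in Python with numpy (an assumption; the study predates it). The group count, slopes, sample sizes, and critical value are illustrative, not the study's conditions: groups share a mean but differ in regression slope, and the empirical rejection rate of the ANCOVA F for the group effect is tallied.

```python
# Monte Carlo check of the one-way, fixed-effects ANCOVA F-statistic when
# regression slopes differ across groups (all values illustrative).
import numpy as np

rng = np.random.default_rng(0)

def ancova_f(y, x, g, k):
    """F for the group effect: full model (covariate + group dummies)
    versus reduced model (covariate only)."""
    n = len(y)
    full = np.column_stack([np.ones(n), x] + [(g == j).astype(float) for j in range(1, k)])
    red = np.column_stack([np.ones(n), x])
    sse = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    return ((sse(red) - sse(full)) / (k - 1)) / (sse(full) / (n - k - 1))

k, n_per, slopes = 3, 20, [0.2, 0.5, 0.8]   # heterogeneous slopes, null group effect
rejections, reps, crit = 0, 1000, 3.16      # crit ~ F(2, 56) at alpha = .05
for _ in range(reps):
    g = np.repeat(np.arange(k), n_per)
    x = rng.normal(size=k * n_per)
    y = np.array([slopes[j] for j in g]) * x + rng.normal(size=k * n_per)
    rejections += ancova_f(y, x, g, k) > crit
print("empirical Type I error rate:", rejections / reps)
```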

A Comparison of Two Criterion-Referenced Item-Selection Techniques Utilizing Simulated Data with Item Pools that Vary in Degrees of Item Difficulty

Description: The problem of this study was to examine the equivalency of two different types of criterion-referenced item-selection techniques on simulated data as item pools varied in degrees of item difficulty. A pretest-posttest design was employed in which pass-fail scores were randomly generated for item pools of twenty-five items. From the item pools, the two techniques determined which items were to be used to make up twelve-item criterion-referenced tests. The twenty-five items also were rank ordered according to the discrimination power of the two techniques.
Date: May 1974
Creator: Davis, Robbie G.
Partner: UNT Libraries

The Effects of the Ratio of Utilized Predictors to Original Predictors on the Shrinkage of Multiple Correlation Coefficients

Description: This study dealt with the shrinkage in multiple correlation coefficients computed for sample data, relative to the multiple correlation coefficients for populations, and with the effect of the ratio of utilized predictors to original predictors on the shrinkage in R square. The study sought to provide a rationale for selecting a shrinkage formula when the correlations between the predictors and the criterion are known, and to determine which of three shrinkage formulas (Browne, Darlington, or Wherry) yields the R square from sample data that is closest to the R square for the population data.
Date: August 1983
Creator: Petcharat, Prataung Parn
Partner: UNT Libraries
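
A sketch of two of the shrinkage estimates named above, following one common statement of the formulas (an assumption; Darlington's variant is omitted here). Wherry adjusts the sample R square for capitalization on chance; Browne's formula estimates the cross-validated R square expected in new samples.

```python
def wherry(r2, n, p):
    """Wherry's adjusted R^2 for a sample R^2, n cases, p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def browne(r2, n, p):
    """Browne's estimate of the population cross-validated R^2, using the
    Wherry-adjusted value as the population R^2 estimate (one common usage)."""
    rho2 = max(wherry(r2, n, p), 0.0)
    return ((n - p - 3) * rho2**2 + rho2) / ((n - 2 * p - 2) * rho2 + p)

print(wherry(0.50, 60, 5))   # ~0.454: shrunken estimate of the population R^2
print(browne(0.50, 60, 5))   # smaller still: expected R^2 on fresh samples
```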

Short-to-Medium Term Enrollment Projection Based on Cycle Regression Analysis

Description: Short-to-medium term projections were made of student semester credit hour enrollments for North Texas State University and the Texas Public and Senior Colleges and Universities (as defined by the Coordinating Board, Texas College and University System). Undergraduate, Graduate, Doctorate, Total, Education, Liberal Arts, and Business enrollments were projected for four time periods: Fall + Spring, Fall, Summer I + Summer II, and Summer I. A new regression analysis called "cycle regression," which employs nonlinear regression techniques to extract multifrequential phenomena from time-series data, was employed for the analysis of the enrollment data. The heuristic steps employed in cycle regression analysis are similar to those used in fitting polynomial models. A trend line and one or more sine waves (cycles) are simultaneously estimated using a partial F test. The process of adding cycle(s) to the model continues until no more significant terms can be estimated.
Date: August 1983
Creator: Chizari, Mohammad
Partner: UNT Libraries
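
An illustrative sketch of the cycle-regression idea described above, not the authors' algorithm: fit a linear trend, add one sine cycle (linear in sin/cos terms once the frequency is fixed), and keep it only if the partial F test is significant. The frequency grid, data, and alpha are assumptions; numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(48, dtype=float)
y = 100 + 0.5 * t + 8 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, t.size)

def sse(X, y):
    return np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)

trend = np.column_stack([np.ones_like(t), t])
best = None
for period in np.arange(4.0, 24.5, 0.5):        # grid search over cycle length
    w = 2 * np.pi / period
    X = np.column_stack([trend, np.sin(w * t), np.cos(w * t)])
    if best is None or sse(X, y) < best[0]:
        best = (sse(X, y), period, X)

sse_r, sse_f = sse(trend, y), best[0]
df2 = t.size - best[2].shape[1]
partial_F = ((sse_r - sse_f) / 2) / (sse_f / df2)  # 2 extra terms (sin + cos)
print(f"best period ~ {best[1]}, partial F = {partial_F:.1f} (F(2,{df2}) crit ~ 3.2)")
```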

Willingness of Educators to Participate in a Descriptive Research Study as a Function of a Monetary Incentive

Description: The problem considered involved assessing the willingness of educators to participate in a study offering monetary incentives. Willingness was determined by sending educators a packet requesting the return of a postcard indicating willingness to participate. The purpose was twofold: to determine the effect of a monetary incentive upon the willingness of educators to participate in a research study, and to analyze the implications for mail questionnaire studies. A sample of 600 educators was chosen from directories of eleven public schools in north Texas. It included equal numbers of male and female teachers and male and female administrators. Subjects were assigned to one of twelve groups; no two subjects from the same school were assigned to different levels of the inducement variable.
Date: May 1984
Creator: Pittman, Doyle
Partner: UNT Libraries

A Monte Carlo Study of the Robustness and Power of Analysis of Covariance Using Rank Transformation to Violation of Normality with Restricted Score Ranges for Selected Group Sizes

Description: The study seeks to determine the robustness and power of parametric analysis of covariance and analysis of covariance using rank transformation to violation of the assumption of normality. The study employs a Monte Carlo simulation procedure with varying conditions of population distribution, group size, equality of group size, scale length, regression slope, and Y-intercept. The procedure was performed on raw data and ranked data with untied ranks and tied ranks.
Date: December 1984
Creator: Wongla, Ruangdet
Partner: UNT Libraries
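
A minimal sketch of the rank-transformation variant compared above: replace Y and the covariate X by their ranks (midranks for ties) and compute the ordinary parametric ANCOVA F on the ranked data. scipy and numpy are assumed; the F computation mirrors the earlier ANCOVA sketch.

```python
import numpy as np
from scipy.stats import rankdata

def ancova_f(y, x, g, k):
    # F for group effect: (covariate + groups) vs. (covariate only)
    n = len(y)
    full = np.column_stack([np.ones(n), x] + [(g == j).astype(float) for j in range(1, k)])
    red = np.column_stack([np.ones(n), x])
    sse = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    return ((sse(red) - sse(full)) / (k - 1)) / (sse(full) / (n - k - 1))

def rank_ancova_f(y, x, g, k):
    """Rank-transform ANCOVA: the same F, computed on ranked Y and X."""
    return ancova_f(rankdata(y), rankdata(x), g, k)
```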

A Comparison of Three Methods of Detecting Test Item Bias

Description: This study compared three methods of detecting test item bias: the chi-square approach, the transformed item difficulties approach, and the Linn-Harnish three-parameter item response approach, which is the only Item Response Theory (IRT) method that can be utilized with relatively small minority samples. The items on two tests which measured writing and reading skills were examined for evidence of sex and ethnic bias. Eight sets of samples, four from each test, were randomly selected from the population (N=7287) of sixth, seventh, and eighth grade students enrolled in a large, urban school district in the southwestern United States. Each set of samples, male/female, White/Hispanic, White/Black, and White/White, contained 800 examinees in the majority group and 200 in the minority group. In an attempt to control differences in ability that may have existed between the various population groups, examinees with scores more than two standard deviations from their group's mean were eliminated. Ethnic samples contained equal numbers of each sex. The White/White sets of samples were utilized to provide baseline bias estimates because the tests could not logically be biased against these groups. Bias indices were then calculated for each set of samples with each of the three methods. Findings of this study indicate that the percent agreement between the Linn-Harnish IRT method and the chi-square and transformed difficulties methods is similar to that found in previous studies comparing the latter approaches with other IRT methods requiring large minority samples. Therefore, it appears that the Linn-Harnish IRT approach can be used in lieu of other more restrictive IRT methods. Ethnic bias appears to exist in the two tests as measured by the large mean bias indices for the White/Hispanic and White/Black samples. Little sex bias was found as evidenced by the low mean bias indices of the male/ ...
Date: May 1985
Creator: Monaco, Linda Gokey
Partner: UNT Libraries
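
The transformed item difficulties approach is usually credited to Angoff's delta plot; the following is a sketch under that reading (an assumption, not confirmed by the abstract): convert each group's proportion-correct to the ETS delta scale, fit the major (principal) axis of the scatter, and use each item's perpendicular distance from that line as its bias index. scipy and numpy are assumed; the data are made up.

```python
import numpy as np
from scipy.stats import norm

def delta(p):
    """ETS delta scale: harder items get larger deltas."""
    return 13.0 + 4.0 * norm.ppf(1.0 - np.asarray(p))

def delta_plot_indices(p_majority, p_minority):
    x, y = delta(p_majority), delta(p_minority)
    sx2, sy2 = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    # slope and intercept of the major (principal) axis
    b = (sy2 - sx2 + np.sqrt((sy2 - sx2) ** 2 + 4 * sxy**2)) / (2 * sxy)
    a = np.mean(y) - b * np.mean(x)
    return (b * x - y + a) / np.sqrt(b**2 + 1)   # signed distances = bias indices

p_maj = [0.80, 0.65, 0.50, 0.40]
p_min = [0.70, 0.55, 0.45, 0.10]                 # last item stands off the line
print(delta_plot_indices(p_maj, p_min).round(2))
```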

The Robustness of O'Brien's r Transformation to Non-Normality

Description: A Monte Carlo simulation technique was employed in this study to determine if the r transformation, a test of homogeneity of variance, affords adequate protection against Type I error over a range of equal sample sizes and numbers of groups when samples are obtained from normal and non-normal distributions. Additionally, this study sought to determine if the r transformation is more robust than Bartlett's chi-square to deviations from normality. Four populations were generated representing normal, uniform, symmetric leptokurtic, and skewed leptokurtic distributions. For each sample size (6, 12, 24, 48), number of groups (3, 4, 5, 7), and population distribution condition, the r transformation and Bartlett's chi-square were calculated. This procedure was replicated 1,000 times; the actual significance level was determined and compared to the nominal significance level of .05. On the basis of the analysis of the generated data, the following conclusions are drawn. First, the r transformation is generally robust to violations of normality when the size of the samples tested is twelve or larger. Second, in the instances where a significant difference occurred between the actual and nominal significance levels, the r transformation produced (a) conservative Type I error rates if the kurtosis of the parent population was 1.414 or less and (b) an inflated Type I error rate when the index of kurtosis was three. Third, the r transformation should not be used if sample size is smaller than twelve. Fourth, the r transformation is in all instances more robust to non-normality, but the Bartlett test is superior in controlling Type I error when samples are from a population with a normal distribution. In light of these conclusions, the r transformation may be used as a general utility test of homogeneity of variances when either the distribution of the parent population is unknown or is known ...
Date: August 1985
Creator: Gordon, Carol J. (Carol Jean)
Partner: UNT Libraries
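
A sketch of O'Brien's r transformation as commonly stated: each score is transformed so that the group means of the transformed values equal the group variances, after which an ordinary one-way ANOVA on the transformed scores serves as the homogeneity-of-variance test. scipy and numpy are assumed; group sizes and variances are illustrative.

```python
import numpy as np
from scipy.stats import f_oneway

def obrien_r(group):
    """O'Brien's r: group means of r reproduce the group sample variances."""
    g = np.asarray(group, dtype=float)
    n, m, v = g.size, g.mean(), g.var(ddof=1)
    return ((n - 1.5) * n * (g - m) ** 2 - 0.5 * v * (n - 1)) / ((n - 1) * (n - 2))

rng = np.random.default_rng(2)
groups = [rng.normal(0, s, 24) for s in (1.0, 1.0, 2.0)]   # third group more variable
r_groups = [obrien_r(g) for g in groups]
print([round(g.var(ddof=1), 2) for g in groups])
print([round(r.mean(), 2) for r in r_groups])   # means reproduce the variances
print(f_oneway(*r_groups))                      # ANOVA F on r = homogeneity test
```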

A Monte Carlo Analysis of Experimentwise and Comparisonwise Type I Error Rate of Six Specified Multiple Comparison Procedures When Applied to Small k's and Equal and Unequal Sample Sizes

Description: The problem of this study was to determine the differences in experimentwise and comparisonwise Type I error rate among six multiple comparison procedures when applied to twenty-eight combinations of normally distributed data. These were the Least Significant Difference, the Fisher-protected Least Significant Difference, the Student-Newman-Keuls Test, the Duncan Multiple Range Test, the Tukey Honestly Significant Difference, and the Scheffé Significant Difference. The Spjøtvoll-Stoline and Tukey-Kramer HSD modifications were used for unequal n conditions. A Monte Carlo simulation was used for twenty-eight combinations of k and n. The scores were normally distributed (µ=100; σ=10). The specified multiple comparison procedures were applied under two conditions: (a) all experiments and (b) experiments in which the F-ratio was significant at the 0.05 level. Error counts were maintained over 1000 repetitions. The FLSD held the experimentwise Type I error rate to nominal alpha for the complete null hypothesis. The FLSD was more sensitive to sample mean differences than the HSD while protecting against experimentwise error. The unprotected LSD was the only procedure to yield a comparisonwise Type I error rate at nominal alpha. The SNK and MRT error rates fell between the FLSD and HSD rates. The SSD error rate was the most conservative. Use of the harmonic mean of the two unequal sample n's (HSD-TK) yielded uniformly better results than use of the minimum n (HSD-SS). Bernhardson's formulas controlled the experimentwise Type I error rate of the LSD and MRT to nominal alpha, but pushed the HSD below the 0.95 confidence interval. Use of the unprotected HSD produced fewer significant departures from nominal alpha. The formulas had no effect on the SSD.
Date: December 1985
Creator: Yount, William R.
Partner: UNT Libraries
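
A compact sketch of the kind of experiment tallied above: empirical experimentwise Type I error of the unprotected LSD versus the F-protected LSD under a complete null (all means equal). scipy and numpy are assumed; k, n, and the replication count are illustrative, not the study's twenty-eight combinations.

```python
import numpy as np
from itertools import combinations
from scipy import stats

rng = np.random.default_rng(3)
k, n, reps, alpha = 4, 15, 1000, 0.05
hits_lsd = hits_flsd = 0
for _ in range(reps):
    g = rng.normal(100, 10, (k, n))               # mu=100, sigma=10, as above
    msw = g.var(axis=1, ddof=1).mean()            # pooled within-group MS
    f = n * g.mean(axis=1).var(ddof=1) / msw      # one-way ANOVA F
    crit_t = stats.t.ppf(1 - alpha / 2, k * (n - 1))
    any_pair = any(
        abs(g[i].mean() - g[j].mean()) / np.sqrt(2 * msw / n) > crit_t
        for i, j in combinations(range(k), 2)
    )
    hits_lsd += any_pair
    hits_flsd += any_pair and f > stats.f.ppf(1 - alpha, k - 1, k * (n - 1))
print("LSD experimentwise error:", hits_lsd / reps)    # well above .05
print("FLSD experimentwise error:", hits_flsd / reps)  # near .05
```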

A Comparison of Some Continuity Corrections for the Chi-Squared Test in 3 x 3, 3 x 4, and 3 x 5 Tables

Description: This study was designed to determine whether chi-squared based tests for independence give reliable estimates (as compared to the exact values provided by Fisher's exact probabilities test) of the probability of a relationship between the variables in 3 x 3, 3 x 4, and 3 x 5 contingency tables when the sample size is 10, 20, or 30. In addition to the classical (uncorrected) chi-squared test, four methods of continuity correction were compared to Fisher's exact probabilities test. The four methods were Yates' correction, two corrections attributed to Cochran, and Mantel's correction. The study was modeled after a similar comparison conducted on 2 x 2 contingency tables and published by Michael Haber.
Date: May 1987
Creator: Mullen, Jerry D. (Jerry Davis)
Partner: UNT Libraries
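
A sketch of the uncorrected and Yates-corrected chi-squared statistics for a contingency table, with the Yates correction applied cellwise (|O − E| reduced by 0.5); the Cochran and Mantel variants compared in the study are not shown. numpy is assumed and the table is made up.

```python
import numpy as np

def chi2_stats(table):
    o = np.asarray(table, dtype=float)
    e = np.outer(o.sum(1), o.sum(0)) / o.sum()     # expected counts
    plain = ((o - e) ** 2 / e).sum()
    yates = ((np.maximum(np.abs(o - e) - 0.5, 0) ** 2) / e).sum()
    return plain, yates

print(chi2_stats([[3, 1, 2], [1, 4, 1], [2, 1, 5]]))   # a 3 x 3 table, n = 20
```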

A Comparison of the Effects of Different Sizes of Ceiling Rules on the Estimates of Reliability of a Mathematics Achievement Test

Description: This study compared the estimates of reliability made using one, two, three, four, five, and unlimited consecutive failures as ceiling rules in scoring a mathematics achievement test which is part of the Iowa Tests of Basic Skills (ITBS), Form 8. There were 700 students randomly selected from a population (N=2640) of students enrolled in the eighth grade in a large urban school district in the southwestern United States. These 700 students were randomly divided into seven subgroups so that each subgroup had 100 students. The responses of all these students to three subtests of the mathematics achievement battery, which included mathematical concepts (44 items), problem solving (32 items), and computation (45 items), were analyzed to obtain the item difficulties and a total score for each student. The items in each subtest were then rearranged from the highest to the lowest item difficulty value. In each subgroup, the methods using one, two, three, four, five, and unlimited consecutive failures as ceiling rules were applied to score the individual responses. The total score for each individual was the sum of the correct responses prior to the point described by the ceiling rule; correct responses after the ceiling rule were not part of the total score. The estimate of reliability for each method was computed with the alpha coefficient of SPSS-X. The results of this study indicated that the estimates of reliability using two, three, four, and five consecutive failures as ceiling rules were an improvement over the methods using one and unlimited consecutive failures.
Date: May 1987
Creator: Somboon Suriyawongse
Partner: UNT Libraries
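
A sketch of the scoring rule described above: with items ordered from easiest to hardest, responses from the first run of k consecutive failures onward are zeroed out, the total is the number correct before that ceiling, and Cronbach's alpha is computed on the rescored items. numpy is assumed; the response matrix and k are made up.

```python
import numpy as np

def rescore(resp, k):
    """Zero out all responses from the point where k consecutive failures
    occur (the ceiling); resp is 0/1, ordered from easiest to hardest item."""
    out, run = np.array(resp), 0
    for i, r in enumerate(resp):
        run = run + 1 if r == 0 else 0
        if run == k:
            out[i + 1:] = 0
            return out
    return out

def cronbach_alpha(items):
    n = items.shape[1]
    return n / (n - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

resp = np.array([[1, 1, 0, 1, 0, 0, 1, 1],
                 [1, 0, 1, 1, 1, 0, 0, 0],
                 [1, 1, 1, 0, 1, 1, 0, 1],
                 [1, 1, 1, 1, 0, 1, 1, 0]])
scored = np.array([rescore(r, k=2) for r in resp])
print(scored.sum(axis=1))             # totals = correct answers before the ceiling
print(round(cronbach_alpha(scored.astype(float)), 3))
```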

Effect of Rater Training and Scale Type on Leniency and Halo Error in Student Ratings of Faculty

Description: The purpose of this study was to determine if leniency and halo error in student ratings could be reduced by training the student raters and by using a Behaviorally Anchored Rating Scale (BARS) rather than a Likert scale. Two hypotheses were proposed. First, ratings collected from trained raters would contain less halo and leniency error than those collected from untrained raters. Second, within the group of trained raters, ratings on the BARS would contain less halo and leniency error than ratings on the Likert instrument.
Date: May 1987
Creator: Cook, Stuart S. (Stuart Sheldon)
Partner: UNT Libraries

Comparison of Methods for Computation and Cumulation of Effect Sizes in Meta-Analysis

Description: This study examined the statistical consequences of employing various methods of computing and cumulating effect sizes in meta-analysis. Six methods of computing effect size, and three techniques for combining study outcomes, were compared. Effect size metrics were calculated with one-group and pooled standardizing denominators, corrected for bias and for unreliability of measurement, and weighted by sample size and by sample variance. Cumulating techniques employed as units of analysis the effect size, the study, and an average study effect. In order to determine whether outcomes might vary with the size of the meta-analysis, mean effect sizes were also compared for two smaller subsets of studies. An existing meta-analysis of 60 studies examining the effectiveness of computer-based instruction was used as a database for this investigation. Recomputation of the original study data under the six different effect size formulas showed no significant difference among the metrics. Maintaining the independence of the data by using only one effect size per study, whether a single or averaged effect, produced a higher mean effect size than averaging all effect sizes together, although the difference did not reach statistical significance. The sampling distribution of effect size means approached that of the population of 60 studies for subsets consisting of 40 studies, but not for subsets of 20 studies. Results of this study indicated that the researcher may choose any of the methods for effect size calculation or cumulation without fear of biasing the outcome of the meta-analysis. If weighted effect sizes are to be used, care must be taken to avoid giving undue influence to studies which may have large sample sizes but not necessarily be the most meaningful, theoretically representative, or elegantly designed. It is important for the researcher to locate all relevant studies on the topic under investigation, since selective or even random ...
Date: December 1987
Creator: Ronco, Sharron L. (Sharron Lee)
Partner: UNT Libraries
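
A sketch of two of the effect-size choices compared above: a standardized mean difference computed with a one-group (control) denominator versus a pooled denominator, plus a small-sample bias correction of the Hedges type and a simple sample-size weight. numpy is assumed and the summary statistics are made up.

```python
import numpy as np

def effect_sizes(m_e, sd_e, n_e, m_c, sd_c, n_c):
    d_control = (m_e - m_c) / sd_c                          # one-group denominator
    sd_pool = np.sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2)
                      / (n_e + n_c - 2))
    d_pooled = (m_e - m_c) / sd_pool                        # pooled denominator
    j = 1 - 3 / (4 * (n_e + n_c) - 9)                       # small-sample correction
    weight = n_e + n_c                                      # simple n-weight
    return d_control, d_pooled, j * d_pooled, weight

print(effect_sizes(54.0, 10.0, 30, 50.0, 9.0, 30))
```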

The Characteristics and Properties of the Threshold and Squared-Error Criterion-Referenced Agreement Indices

Description: Educators who use criterion-referenced measurement to ascertain the current level of performance of an examinee, in order that the examinee may be classified as either a master or a nonmaster, need to know the accuracy and consistency of their decisions regarding assignment of mastery states. This study examined the sampling distribution characteristics of two reliability indices that use the squared-error agreement function: Livingston's k^2(X,Tx) and Brennan and Kane's M(C). The sampling distribution characteristics of five indices that use the threshold agreement function were also examined: Subkoviak's Pc, Huynh's p and k, and Swaminathan's p and k. These seven methods of calculating reliability were also compared under varying conditions of sample size, test length, and criterion or cutoff score. Computer-generated data provided randomly parallel test forms for N = 2000 cases. From this population, 1000 samples were drawn, with replacement, and each of the seven reliability indices was calculated. Descriptive statistics were collected for each sample set and examined for distribution characteristics. In addition, the mean value for each index was compared to the population parameter value of consistent mastery/nonmastery classifications. The results indicated that the sampling distributions of all seven reliability indices approach normality with increased sample size. The results also indicated that Huynh's p was the most accurate estimate of the population parameter, with the smallest degree of negative bias. Swaminathan's p was the next best estimate of the population parameter, but it has the disadvantage of requiring two test administrations, while Huynh's p index requires only one administration.
Date: May 1988
Creator: Dutschke, Cynthia F. (Cynthia Fleming)
Partner: UNT Libraries
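
A sketch of a threshold agreement index of the kind examined above: the proportion of examinees classified the same way (master/nonmaster) on two randomly parallel forms, together with the chance-corrected kappa. numpy is assumed; the simulated scores and cutoff are illustrative.

```python
import numpy as np

def threshold_agreement(form1, form2, cutoff):
    m1, m2 = np.asarray(form1) >= cutoff, np.asarray(form2) >= cutoff
    p0 = np.mean(m1 == m2)                          # raw classification agreement
    pc = m1.mean() * m2.mean() + (1 - m1.mean()) * (1 - m2.mean())  # chance
    return p0, (p0 - pc) / (1 - pc)                 # p and kappa

rng = np.random.default_rng(4)
true = rng.normal(25, 5, 500)                       # true scores
f1 = true + rng.normal(0, 3, 500)                   # randomly parallel form 1
f2 = true + rng.normal(0, 3, 500)                   # randomly parallel form 2
print(threshold_agreement(f1, f2, cutoff=27))
```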

A Comparison of Three Item Selection Methods in Criterion-Referenced Tests

Description: This study compared three methods of selecting the best discriminating test items and the resultant test reliability of mastery/nonmastery classifications. These three methods were (a) the agreement approach, (b) the phi coefficient approach, and (c) the random selection approach. Test responses from 1,836 students on a 50-item physical science test were used, from which 90 distinct data sets were generated for analysis. These 90 data sets contained 10 replications of the combination of three different sample sizes (75, 150, and 300) and three different numbers of test items (15, 25, and 35). The results of this study indicated that the agreement approach was an appropriate method to be used for selecting criterion-referenced test items at the classroom level, while the phi coefficient approach was an appropriate method to be used at the district and/or state levels. The random selection method did not have similar characteristics in selecting test items and produced the lowest reliabilities, when compared with the agreement and the phi coefficient approaches.
Date: August 1988
Creator: Lin, Hui-Fen
Partner: UNT Libraries
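
A sketch of the phi coefficient approach named above: correlate the 0/1 item response with the 0/1 mastery classification and retain the highest-phi items. numpy is assumed; the ten-examinee vectors are made up.

```python
import numpy as np

def phi(item, mastery):
    """Phi coefficient between a dichotomous item and mastery status."""
    item, mastery = np.asarray(item), np.asarray(mastery)
    p11 = np.mean(item & mastery)        # pass item and classified as master
    p_i, p_m = item.mean(), mastery.mean()
    return (p11 - p_i * p_m) / np.sqrt(p_i * (1 - p_i) * p_m * (1 - p_m))

item = np.array([1, 1, 1, 0, 1, 0, 0, 0, 1, 0])
mastery = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
print(round(phi(item, mastery), 3))      # 0.6: a strongly discriminating item
```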

The Analysis of the Accumulation of Type II Error in Multiple Comparisons for Specified Levels of Power to Violation of Normality with the Dunn-Bonferroni Procedure: a Monte Carlo Study

Description: The study seeks to determine the degree of accumulation of Type II error rates, while violating the assumptions of normality, for different specified levels of power among sample means. The study employs a Monte Carlo simulation procedure with three different specified levels of power, methodologies, and population distributions. On the basis of the comparisons of actual and observed error rates, the following conclusions appear to be appropriate. 1. Under the strict criteria for evaluation of the hypotheses, Type II experimentwise error does accumulate, at such a rate that the probability of accepting at least one null hypothesis in a family of tests, when in theory all of the alternate hypotheses are true, is high enough to preclude valid tests at the outset of the study. 2. The Dunn-Bonferroni procedure of setting the critical value based on the beta value per contrast did not significantly reduce the probability of committing a Type II error in a family of tests. 3. The use of an adequate sample size and orthogonal contrasts, or limiting the number of pairwise comparisons to the number of means, is the best method to control the accumulation of Type II errors. 4. The accumulation of Type II error occurs irrespective of the distribution.
Date: August 1989
Creator: Powers-Prather, Bonnie Ann
Partner: UNT Libraries
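
A worked version of the accumulation being measured above, under an independence assumption the study does not necessarily make: if each of c contrasts has Type II rate beta, the chance of at least one Type II error in the family is 1 − (1 − beta)^c, which grows quickly with c.

```python
# Familywise Type II error under independent contrasts (illustrative betas).
for beta, c in [(0.2, 3), (0.2, 10), (0.1, 10)]:
    print(f"beta={beta}, contrasts={c}: familywise beta = {1 - (1 - beta) ** c:.3f}")
```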

The Effectiveness of a Mediating Structure for Writing Analysis Level Test Items From Text Based Instruction

Description: This study is concerned with the effect of placing text into a mediated structure form upon the generation of test items for analysis-level domain-referenced test construction. The item writing methodology used is the linguistic (operationally defined) item writing technology developed by Bormuth, Finn, Roid, Haladyna, and others. This methodology is compared to (1) the intuitive method, based on Bloom's definition of analysis-level test questions, and (2) the intuitive method with keywords identified. A mediated structure was developed by coordinating or subordinating sentences in an essay according to five simple grammatical rules. Three test writers each composed a ten-item test using each of the three methodologies, based on a common essay. Tests were administered to 102 Composition 1 community college students. Students were asked to read the essay and complete one test form. Test forms were randomly distributed by writer and method. Analysis of variance showed no significant differences among either methods or writers. Item analysis showed that no method of item writing resulted in items of consistent difficulty among test item writers. While the results of this study show no significant difference from the intuitive, traditional methods of item writing, analysis-level test item generation using a mediating structure may yet prove useful to the classroom teacher with access to a computer. All three test writers agreed that test items were easier to write using the generative rules and mediated structure. Also, the writers felt some relief in that the method theoretically assured that an analysis-level item was written.
Date: August 1989
Creator: Brasel, Michael D. (Michael David)
Partner: UNT Libraries

A State-Wide Survey on the Utilization of Instructional Technology by Public School Districts in Texas

Description: Effective utilization of instructional technology can provide a valuable method for the delivery of a school program and enable a teacher to individualize according to student needs. Implementation of such a program is costly and requires careful planning and adequate staff development for school personnel. This study examined the degree of commitment by Texas school districts to the use of the latest technologies in their efforts to revolutionize education. Quantitative data were collected using a survey that included five informational areas: (1) school district background, (2) funding for budget, (3) staff, (4) technology hardware, and (5) staff development. The study included 137 school districts representing the 5 University Interscholastic League (UIL) classifications (A through AAAAA). The survey was mailed to the school superintendents with the request that the persons most familiar with instructional technology complete the questionnaires. Analysis of data examined the relationship between UIL classification and the amount of money expended on instructional technology. Correlation coefficients were determined between teachers receiving training in the use of technology and total personnel assigned to technology positions, and between a district's having a plan for technology and its employment of a coordinator for instructional technology. Significance was established at the .05 level. A significant relationship was determined between the total district budget and the amount of money allocated to instructional technology. There was a significant relationship between the number of teachers receiving training in technology and the number of personnel assigned to technology positions. A significant negative relationship was determined between the district having a long-range plan for technology and the employment of a full-time coordinator for one of the subgroups. An attempt was made to provide information concerning the effort by local school districts to provide technology for instructional purposes. Progress has been made, although additional funds will be ...
Date: May 1990
Creator: Hiett, Elmer D. (Elmer Donald)
Partner: UNT Libraries

Outliers and Regression Models

Description: Mitigating outliers can increase the estimated strength of a relationship between variables. This study defined outliers in three different ways and used five regression procedures to describe the effects of outliers on 50 data sets. The study also examined the relationship among the shape of the distribution, skewness, and outliers.
Date: May 1992
Creator: Mitchell, Napoleon
Partner: UNT Libraries

A Comparison of Traditional Norming and Rasch Quick Norming Methods

Description: The simplicity and ease of use of the Rasch procedure are a decided advantage. The test user needs only two numbers: the frequency of persons who answered each item correctly and the Rasch-calibrated item difficulty, usually part of an existing item bank. Norms can be computed quickly for any specific group of interest. In addition, once the selected items from the calibrated bank are normed, any test built from the item bank is automatically norm-referenced. Thus, it was concluded that the Rasch quick norm procedure is a meaningful alternative to traditional classical true score norming for test users who desire normative data.
Date: August 1993
Creator: Bush, Joan Spooner
Partner: UNT Libraries
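
A sketch of the Rasch side of the quick-norm idea above, not the author's procedure: with item difficulties already calibrated (from an item bank), a raw score converts to an ability estimate by solving sum_i P_i(theta) = raw score, where P_i is the Rasch item characteristic curve. numpy is assumed; the bank difficulties are made up.

```python
import numpy as np

def rasch_theta(raw, difficulties, iters=20):
    """Ability (logits) whose expected raw score equals the observed one."""
    b = np.asarray(difficulties, dtype=float)
    theta = 0.0
    for _ in range(iters):                       # Newton-Raphson
        p = 1 / (1 + np.exp(-(theta - b)))       # P(correct) per item
        theta += (raw - p.sum()) / (p * (1 - p)).sum()
    return theta

bank_items = [-1.2, -0.5, 0.0, 0.3, 0.9, 1.5]    # calibrated difficulties (logits)
for raw in (1, 3, 5):
    print(raw, round(rasch_theta(raw, bank_items), 2))
```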

The Generalization of the Logistic Discriminant Function Analysis and Mantel Score Test Procedures to Detection of Differential Testlet Functioning

Description: Two procedures for detection of differential item functioning (DIF) for polytomous items were generalized to detection of differential testlet functioning (DTLF). The methods compared were the logistic discriminant function analysis procedure for uniform and non-uniform DTLF (LDFA-U and LDFA-N) and the Mantel score test procedure. Further analysis compared the results of DTLF analysis using the Mantel procedure with DIF analysis of individual testlet items using the Mantel-Haenszel (MH) procedure. Over 600 chi-squares were analyzed and compared for rejection of null hypotheses. Samples of 500, 1,000, and 2,000 were drawn by gender subgroups from the NELS:88 data set, which contains demographic and test data from over 25,000 eighth graders. Three types of testlets (totaling 29) from the NELS:88 test were analyzed for DTLF. The first type, the common passage testlet, followed the conventional testlet definition: items grouped together by a common reading passage, figure, or graph. The other two types were based upon common content and common process, as outlined in the NELS test specification.
Date: August 1994
Creator: Kinard, Mary E.
Partner: UNT Libraries

A Comparison of Three Criteria Employed in the Selection of Regression Models Using Simulated and Real Data

Description: Researchers who make predictions from educational data are interested in choosing the best regression model possible. Many criteria have been devised for choosing a full or restricted model, and also for selecting the best subset from an all-possible-subsets regression. This study compared the relative practical usefulness of three of the criteria used in selecting a regression model: (a) Mallows' C_p, (b) Amemiya's prediction criterion, and (c) Hagerty and Srinivasan's method involving predictive power. Target correlation matrices with 10,000 cases were simulated so that the matrices had varying degrees of effect sizes. The amount of power for each matrix was calculated after one or two predictors were dropped from the full regression model, for sample sizes ranging from n = 25 to n = 150. The null case, when one predictor was uncorrelated with the other predictors, was also considered. In addition, comparisons for regression models selected using C_p and the prediction criterion were performed using data from the National Educational Longitudinal Study of 1988.
Date: December 1994
Creator: Graham, D. Scott
Partner: UNT Libraries
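
A sketch of the first criterion compared above, Mallows' C_p = SSE_p / MSE_full − (n − 2p), with p counting estimated coefficients (intercept included); subsets with C_p near p are favored. numpy is assumed; the simulated predictors are illustrative.

```python
import numpy as np
from itertools import combinations

def mallows_cp(X, y):
    """C_p for every non-empty predictor subset of X (n x k)."""
    n, k = X.shape
    sse = lambda Xs: np.sum((y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]) ** 2)
    ones = np.ones((n, 1))
    mse_full = sse(np.hstack([ones, X])) / (n - k - 1)
    out = {}
    for r in range(1, k + 1):
        for cols in combinations(range(k), r):
            p = r + 1                              # coefficients incl. intercept
            out[cols] = sse(np.hstack([ones, X[:, cols]])) / mse_full - (n - 2 * p)
    return out

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)    # X[:, 2] is pure noise
for cols, cp in sorted(mallows_cp(X, y).items(), key=lambda kv: kv[1])[:3]:
    print(cols, round(cp, 2))                     # (0, 1) should score near p = 3
```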

A Comparison of Two Differential Item Functioning Detection Methods: Logistic Regression and an Analysis of Variance Approach Using Rasch Estimation

Description: Differential item functioning (DIF) detection rates were examined for the logistic regression and analysis of variance (ANOVA) DIF detection methods. The methods were applied to simulated data sets of varying test length (20, 40, and 60 items) and sample size (200, 400, and 600 examinees) for both equal and unequal underlying ability between groups as well as for both fixed and varying item discrimination parameters. Each test contained 5% uniform DIF items, 5% non-uniform DIF items, and 5% combination DIF (simultaneous uniform and non-uniform DIF) items. The factors were completely crossed, and each experiment was replicated 100 times. For both methods and all DIF types, a test length of 20 was sufficient for satisfactory DIF detection. The detection rate increased significantly with sample size for each method. With the ANOVA DIF method and uniform DIF, there was a difference in detection rates between discrimination parameter types, which favored varying discrimination and decreased with increased sample size. The detection rate of non-uniform DIF using the ANOVA DIF method was higher with fixed discrimination parameters than with varying discrimination parameters when relative underlying ability was unequal. In the combination DIF case, there was a three-way interaction among the experimental factors discrimination type, relative ability, and sample size for both detection methods. The error rate for the ANOVA DIF detection method decreased as test length increased and increased as sample size increased. For both methods, the error rate was slightly higher with varying discrimination parameters than with fixed. For logistic regression, the error rate increased with sample size when relative underlying ability was unequal between groups. The logistic regression method detected uniform and non-uniform DIF at a higher rate than the ANOVA DIF method. Because the type of DIF present in real data is rarely known, the logistic regression method is recommended for ...
Date: August 1995
Creator: Whitmore, Marjorie Lee Threet
Partner: UNT Libraries
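
A sketch of the logistic regression DIF test described above, in its usual Swaminathan-Rogers form (an assumed reading of the abstract): nested models with total score, group, and their interaction are compared via likelihood-ratio chi-squares, where the group term signals uniform DIF and the interaction signals non-uniform DIF. statsmodels and numpy are assumed; the simulated item is illustrative.

```python
import numpy as np
import statsmodels.api as sm

def lr_dif(item, total, group):
    def loglik(cols):
        X = sm.add_constant(np.column_stack(cols))
        return sm.Logit(item, X).fit(disp=0).llf
    ll1 = loglik([total])                           # score only
    ll2 = loglik([total, group])                    # + group (uniform DIF)
    ll3 = loglik([total, group, total * group])     # + interaction (non-uniform)
    return 2 * (ll2 - ll1), 2 * (ll3 - ll2)         # each ~ chi-square(1)

rng = np.random.default_rng(6)
n = 1000
group = rng.integers(0, 2, n).astype(float)
theta = rng.normal(size=n)
total = theta + rng.normal(0, 0.5, n)               # observed proxy for ability
p = 1 / (1 + np.exp(-(theta - 0.8 * group)))        # uniform DIF against group 1
item = rng.binomial(1, p).astype(float)
print(lr_dif(item, total, group))                   # first statistic large
```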

An Empirical Comparison of Random Number Generators: Period, Structure, Correlation, Density, and Efficiency

Description: Random number generators (RNGs) are widely used in conducting Monte Carlo simulation studies, which are important in the field of statistics for comparing power, mean differences, or distribution shapes between statistical approaches. Statistical results, however, may differ when different random number generators are used. Older methods have often been used blindly, with no understanding of their limitations, and many random functions supplied with computers today have been found to be comparatively unsatisfactory. In this study, five multiplicative linear congruential generators (MLCGs) were chosen which are provided in the following statistical packages: RANDU (IBM), RNUN (IMSL), RANUNI (SAS), UNIFORM (SPSS), and RANDOM (BMDP). Using a personal computer (PC), an empirical investigation was performed using five criteria: period length before repeating random numbers, distribution shape, correlation between adjacent numbers, density of distributions, and behavior when the generator is used to approximate a normal distribution. All RNG FORTRAN programs were rewritten in Pascal, a more efficient language for the PC. Sets of random numbers were generated using different starting values. A good RNG should have the following properties: a long enough period; a well-structured pattern in distribution; independence between random number sequences; a random and uniform distribution; and a good approximation to the normal distribution. Findings in this study suggested that the above five criteria need to be examined when conducting a simulation study with large enough sample sizes and various starting values, because the RNG selected can affect the statistical results. Furthermore, a study intended to be reproducible and valid should indicate the source of the RNG, the type of RNG used, evaluation results of the RNG, and any pertinent information related to the computer used in the study. Recommendations for future research are suggested in the area of other RNGs and methods not used in this study, such as additive, combined, ...
Date: August 1995
Creator: Bang, Jung Woong
Partner: UNT Libraries
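
A sketch of the oldest generator on the list above, IBM's RANDU, a multiplicative linear congruential generator with x_{n+1} = 65539 x_n mod 2^31, together with a quick adjacent-pair correlation check of the kind the study applied. numpy is assumed; the seed and sample size are illustrative.

```python
import numpy as np

def randu(seed, n):
    """RANDU: multiplier 65539, modulus 2^31, scaled to (0, 1)."""
    out, x = np.empty(n), seed
    for i in range(n):
        x = (65539 * x) % 2**31
        out[i] = x / 2**31
    return out

u = randu(seed=1, n=10000)
print("mean:", round(u.mean(), 4))                          # ~0.5 if uniform
print("lag-1 correlation:", round(np.corrcoef(u[:-1], u[1:])[0, 1], 4))
# RANDU's notorious failure is three-dimensional: triples (u[i], u[i+1], u[i+2])
# fall on only 15 planes, which simple 1-D and lag-1 checks do not reveal.
```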