Under its new editorial team of co-editors Dianne Henderson, Ph.D., of ACT, and Holly Garner, of Kaplan Test Prep, The Journal of Applied Testing Technology (JATT), ATP's peer-reviewed online scholarly journal, published its first 2020 issue in January.

Three articles are included in the issue: "Taming the Firehose: Unsupervised Machine Learning for Syntactic Partitioning of Large Volumes of Automatically Generated Items to Assist Automated Test Assembly" by Brian S. Cole, Elia Lima-Walton, Kim Brunnert, Winona Burt Vesey, and Kaushik Raha; "Educational Test Approaches: The Suitability of Computer-Based Test Types for Assessment and Evaluation in Formative and Summative Contexts" by Maaike M. van Groen and Theo J. H. M. Eggen; and "Proficiency Classification and Violated Local Independence: An Examination of Pass/Fail Decision Accuracy under Competing Rasch Models" by Kari J. Hodge and Grant B. Morgan.

Abstracts appear below.

Taming the Firehose:

Automatic item generation can rapidly generate large volumes of exam items, but this creates challenges for assembling exams that aim to include syntactically diverse items. First, we demonstrate a diminishing marginal syntactic return for automatic item generation using a saturation detection approach. This analysis can help users of automatic item generation to generate more diverse item banks. We then develop a pipeline that uses an unsupervised machine learning method to partition a large, automatically generated item bank into syntactically distinct clusters. We explore applications to test assembly and conclude that machine learning methods can provide utility in harnessing the large datasets achievable by automatic item generation.
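The saturation detection idea in this abstract can be illustrated with a minimal sketch: track how often newly generated items introduce a previously unseen syntactic template, and stop once the rate of new templates in a sliding window falls below a threshold. Everything here is hypothetical (the template pool, the skewed weights, the window and threshold values); the authors' actual pipeline parses real generated items, which this toy generator only mimics.

```python
import random

def generate_item_template(rng):
    # Hypothetical stand-in for automatic item generation: each "item"
    # is reduced to its syntactic template. A skewed weight distribution
    # mimics the diminishing marginal syntactic return the authors describe
    # (a few templates dominate, rare ones appear slowly).
    templates = [f"T{i}" for i in range(50)]
    weights = [1.0 / (i + 1) for i in range(50)]
    return rng.choices(templates, weights=weights, k=1)[0]

def detect_saturation(max_items, window=100, threshold=0.02, seed=0):
    """Generate items until the fraction of items in the last `window`
    that introduced a new syntactic template drops below `threshold`.
    Returns (items generated, distinct templates seen)."""
    rng = random.Random(seed)
    seen = set()
    new_flags = []
    for i in range(max_items):
        tpl = generate_item_template(rng)
        new_flags.append(tpl not in seen)
        seen.add(tpl)
        if i + 1 >= window:
            rate = sum(new_flags[-window:]) / window
            if rate < threshold:
                return i + 1, len(seen)
    return max_items, len(seen)

if __name__ == "__main__":
    n_generated, n_distinct = detect_saturation(10_000)
    print(n_generated, n_distinct)
```

Past the saturation point, further generation adds volume but little syntactic diversity, which is the motivation the abstract gives for clustering the bank before test assembly.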

Educational Test Approaches:

When developing a digital test, one of the first decisions is which type of Computer-Based Test (CBT) to develop. Six different CBT types are considered here: linear tests, automatically generated tests, computerized adaptive tests, adaptive learning environments, educational simulations, and educational games. The selection of a CBT type needs to be guided by the intended purposes of the test. The test approach determines which purposes can be achieved by using a particular test. Four different test approaches are discussed here: formative assessment, formative evaluation, summative assessment, and summative evaluation. The suitability of each CBT type to measure performance for the different test approaches is evaluated based on four test characteristics: test purpose, test length, level of interest for measurement (student, class, school, system), and test report. This article aims to provide some guidance in the selection of the most appropriate type of CBT.

Proficiency Classification and Violated Local Independence:

The purpose of this study was to examine the use of a misspecified calibration model and its impact on proficiency classification. Monte Carlo simulation methods were employed to compare competing models when the true structure of the data is known (i.e., testlet conditions). The conditions used in the design (e.g., number of items, testlet-to-item ratio, testlet variance, proportion of items that are testlet-based, and sample size) reflect those found in the applied educational literature. Decision Consistency (DC) was high between the models, ranging from 91.5% to 100%. Testlet variance had the greatest effect on DC. An empirical example using PISA data with nine testlets is also provided to illustrate the consistency of pass/fail decisions between the competing models.
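The Decision Consistency (DC) statistic reported in this abstract can be sketched as the proportion of examinees who receive the same pass/fail decision under two competing calibration models. The cut score and the proficiency estimates below are made-up illustrative numbers, not results from the study.

```python
def decision_consistency(cut_score, scores_model_a, scores_model_b):
    """Proportion of examinees receiving the same pass/fail decision
    under two competing scoring models (Decision Consistency, DC)."""
    assert len(scores_model_a) == len(scores_model_b)
    same = sum(
        (a >= cut_score) == (b >= cut_score)
        for a, b in zip(scores_model_a, scores_model_b)
    )
    return same / len(scores_model_a)

# Illustrative data: proficiency estimates for five examinees under a
# standard Rasch model vs. a testlet-based alternative (made-up numbers).
rasch = [-1.2, 0.3, 0.8, -0.1, 1.5]
testlet = [-1.0, 0.1, 0.9, 0.2, 1.4]
print(decision_consistency(0.0, rasch, testlet))  # prints 0.8
```

In this toy case one examinee (the fourth) flips from fail to pass when the testlet structure is modeled, giving DC = 4/5; the study's simulated conditions produced DC between 91.5% and 100%.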