Shin, D., & Lee, J. H. (2023). Can ChatGPT make reading comprehension testing items on par with human experts? Language Learning & Technology, 27(3), 27–40. https://hdl.handle.net/10125/73530
ABSTRACT
Given the recent increase in interest in ChatGPT in the L2 teaching and learning community, the present study examined ChatGPT’s potential as a resource for generating L2 assessment materials on par with those created by human experts. To this end, we extracted five reading passages and their multiple-choice testing items from the English section of the College Scholastic Ability Test (CSAT) in South Korea. Additionally, we used ChatGPT to generate another set of readings and testing items in the same format. Next, we developed a survey consisting of Likert-scale and open-ended questions that asked about participants’ perceptions of diverse aspects of the target readings and testing elements. The participants were 50 pre- and in-service teachers, who were not informed of the target materials’ source or the study’s purpose. The survey results revealed that the CSAT and ChatGPT-developed readings were perceived as similar in terms of the naturalness of the passages’ flow and expressions. However, the CSAT materials were judged to include more attractive multiple-choice options and to have a higher completion level for testing items. Based on these outcomes, we present implications for L2 teaching and future research.
Keywords: Artificial Intelligence, Automated Item Generation, ChatGPT, Content Generation