Syntactic And Lexical Comparison Between AI-generated Reading Passages And Japanese Universities' National Test

Authors

Keywords:

Generative AI, Natural language processing, Text coverage, Syntactic complexity, University entrance examination

Abstract

This research aimed to develop a methodology for creating mock English reading tests from AI-generated passages, in order to reduce the workload of EFL teachers who struggle to create high-quality mock English reading tests for university entrance examinations. To achieve this goal, this paper examined the lexical and syntactic differences between the English Subject of the Center Test for Japanese university entrance examinations (ESCT) and AI-generated passages. Three generative AI models were used: OpenAI ChatGPT version 4, Google Gemini version 1.5 Flash, and DeepSeek-V3. To make the comparison of vocabulary coverage between AI-generated passages and ESCTs meaningful, topics based on ESCTs were used to create 11 prompts for AI-generated passages. This paper examined text coverage using a CEFR-based wordlist and syntactic complexity using the Python library spaCy. Findings revealed that the proportion of A2-level tokens does not differ greatly between AI-generated passages (ChatGPT: 19.1%; Gemini: 17.4%; DeepSeek: 17.3%) and ESCT (15.6%); that ESCT had more complex but shorter sentences compared to AI-generated passages; and that personal pronouns accounted for 1.3% of ESCT tokens but less than 1% of the tokens in AI-generated passages (ChatGPT: 0.39%; Gemini: 0.71%; DeepSeek: 0.45%). Few wh-pronouns and few instances of the existential "there" were found in AI-generated passages. This study concluded that when EFL teachers convert AI-generated text into a mock ESCT, they should (a) add a wider variety of A1-level lemmas, (b) rewrite sentences to be more complex and shorter, (c) increase personal pronouns and determiners, and (d) reduce adjective modifiers, conjuncts, and coordination.
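The text coverage measure mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny wordlist, the function name coverage_by_level, and the sample tokens are all hypothetical, and the input is assumed to be already lemmatized (the study used spaCy for parsing, which is omitted here to keep the example dependency-free).

```python
from collections import Counter

# Hypothetical miniature wordlist mapping lemmas to CEFR levels.
# A real analysis would load a full CEFR-based wordlist instead.
TOY_WORDLIST = {
    "the": "A1", "be": "A1", "student": "A1", "read": "A1",
    "a": "A1", "passage": "A2", "test": "A2", "analyze": "B1",
}


def coverage_by_level(lemmas, wordlist):
    """Return the share of tokens at each CEFR level.

    Lemmas missing from the wordlist are counted under "unknown".
    """
    if not lemmas:
        return {}
    counts = Counter(wordlist.get(lemma.lower(), "unknown") for lemma in lemmas)
    total = len(lemmas)
    return {level: count / total for level, count in counts.items()}


# Illustrative, made-up token sequence (assumed to be lemmatized already).
sample = ["the", "student", "read", "a", "passage",
          "the", "test", "be", "novel", "analyze"]

shares = coverage_by_level(sample, TOY_WORDLIST)
for level in sorted(shares):
    print(f"{level}: {shares[level]:.1%}")
```

Running the same function over an ESCT passage and an AI-generated passage on the same topic would give per-level token shares that can be compared directly, which is the kind of comparison the abstract reports for A2-level tokens.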

Published

2025-05-25

How to Cite

Kurokawa, S., & Salingre, M. (2025). Syntactic And Lexical Comparison Between AI-generated Reading Passages And Japanese Universities’ National Test. The Journal of Studies in Language, Culture, and Society, 8(1), 295–309. Retrieved from https://univ-bejaia.dz/revue/jslcs/article/view/590