Using AI to Analyze Language Learners' Discourse: A Corpus-Based Study of Learner Language Development
DOI:
https://doi.org/10.63056/ACAD.004.04.1132
Keywords:
Patterns, language development, spoken discourse, natural language processing, machine learning tools, lexical variability, grammatical precision, syntactic intricacy, coherence
Abstract
This research examined patterns of language development among Pakistani university students using AI-driven corpus tools. Over six months, the researchers gathered written and spoken discourse samples from 150 students in Punjab and Sindh provinces and built a small learner corpus of around 500,000 words. They used natural language processing and machine learning tools to evaluate the samples for lexical variability, grammatical precision, syntactic intricacy, and coherence. The study followed a mixed-method design that combined quantitative frequency analysis with qualitative thematic analysis. The analysis distinguished advanced from less advanced learners, with advanced learners showing greater lexical sophistication and syntactic complexity. The researchers found patterns of common errors involving articles, prepositions, and subject-verb agreement. The AI tools identified interlanguage gaps and tracked learners' developmental level. Qualitative analysis produced five themes: L1 transfer, rule overgeneralization, lexical discourse fossilization, organization problems, and code-switching. The study demonstrated that a corpus-based approach can detect a language learner's level of development. This research contributes to the understanding of second language acquisition processes among South Asian learners and illustrates how AI technology can support learning and teaching research aimed at improving language education.
License
Copyright (c) 2025 Arfa Maham, Muniba Saleem, Muhammad Ismail Rahu, Sohail Ahmad (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.







