Saturday, July 19, 2025

Unpacking Reading Comprehension: Neurocomputational Insights from L1 and L2 Readers Using Large Language Models

Understanding the Reading Experience: An Exploration of L1 and L2 Readers

Participants

In a study of reading comprehension across linguistic backgrounds, the participant pool consisted of 108 individuals: 52 native English speakers (L1 readers) and 56 Chinese-speaking learners of English (L2 readers). The L1 group included 24 males and had a mean age of 22.85 years (±4.66), while the L2 group included 26 males and had a mean age of 25.14 years (±4.74). All L2 participants had passed the College English Test Band 6 (CET-6), which ensured a baseline level of English proficiency.

All participants in both groups were right-handed and had normal or corrected-to-normal vision, which reduces handedness-related variability in cognitive processing. The research was approved by the Pennsylvania State University Institutional Review Board (IRB; Study ID: STUDY00002823), and participants provided written informed consent before taking part. The L2 readers were recruited from Pennsylvania State University and from universities in China, including Beijing Normal University and Peking University. After data checks, the analysis proceeded with 51 L1 and 55 L2 readers; the exclusions were limited to cases of missing eye movement data.

Stimuli

The study used five short expository texts on STEM (Science, Technology, Engineering, and Mathematics) topics. The texts covered Mars exploration, supertankers, mathematics, the Global Positioning System (GPS), and electric circuits. Each connected text was balanced on characteristics such as text length and sentence complexity so that the five texts were comparable.

Further controls covered psycholinguistic variables that affect reading comprehension, including age of acquisition, familiarity with key vocabulary, word frequency, and lexical properties as assessed by Coh-Metrix. The texts were adopted from prior studies, which keeps the materials comparable with that earlier work.

Behavioral Measurements

The researchers used standardized tools to evaluate how participants’ comprehension ability affected their reading. The Gray Silent Reading Test (GSRT) was used to gauge each participant’s general reading ability. This test included 13 narratives, each accompanied by five multiple-choice questions.

Every participant started the GSRT at the eighth narrative, and difficulty was then adjusted incrementally based on performance. Scores ranged from 0 to 65 (13 narratives × 5 questions), with raw scores converted into standardized quotients and percentile ranks. The GSRT has been normed on a sample of 1,400 individuals, with reliability coefficients of 0.97 or higher.

English receptive vocabulary size was assessed with the Peabody Picture Vocabulary Test (PPVT IV), featuring 228 items across 19 sets. Participants matched spoken words to visual images, with total scores spanning from 0 to 228.

The Attention Network Test (ANT) was used to measure three attention networks: alerting, orienting, and executive control. The test used visual and spatial cues in a series of trials in which participants reported the direction of a central arrow. The resulting scores described how attention mechanisms varied between participants.
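The three network scores are conventionally derived as reaction-time differences between cue and flanker conditions. The sketch below follows that common ANT scoring convention; the article does not state the exact scoring the study used, and the trial keys (`cue`, `flanker`, `rt`, `correct`) are hypothetical names chosen for illustration.

```python
from statistics import mean

def ant_network_scores(trials):
    """Return alerting, orienting, and executive effects in milliseconds.

    `trials` is a list of dicts with hypothetical keys:
      'cue'     : 'none', 'double', 'center', or 'spatial'
      'flanker' : 'congruent' or 'incongruent'
      'rt'      : reaction time in ms
      'correct' : True if the arrow direction was reported correctly
    """
    # Conventional scoring uses correct trials only.
    ok = [t for t in trials if t["correct"]]

    def mean_rt(key, value):
        return mean(t["rt"] for t in ok if t[key] == value)

    return {
        # Benefit of a warning cue that carries no spatial information.
        "alerting": mean_rt("cue", "none") - mean_rt("cue", "double"),
        # Benefit of knowing where the target will appear.
        "orienting": mean_rt("cue", "center") - mean_rt("cue", "spatial"),
        # Cost of resolving conflict from incongruent flankers.
        "executive": mean_rt("flanker", "incongruent") - mean_rt("flanker", "congruent"),
    }
```

Larger alerting and orienting scores indicate a greater benefit from cues, while a larger executive score indicates a greater cost from conflicting flankers.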

Language Dominance Assessment

Language experience plays a crucial role in comprehension and processing. The Language History Questionnaire (LHQ 2.0) was used to measure language dominance, giving insight into participants’ daily language use. Specifically, language dominance for reading activities was computed, to contextualize how bilingual experience affects reading behavior. Participants with incomplete or erroneous information were excluded.

MRI Task Procedure and Image Acquisition

The study used a fixation-related fMRI paradigm to observe brain activity while participants read the texts. Participants lay in a 3-T Siemens MRI scanner and read sentences presented one at a time on a screen while their eye movements were continuously monitored. Each text was followed by comprehension questions to evaluate reading performance.

Image acquisition included T1-weighted structural scans, T2*-weighted functional scans, and diffusion tensor imaging. Together these acquisitions supported the analysis of reading comprehension across linguistic backgrounds.

Eye-Tracking Data Acquisition

Eye movements during reading were recorded with an EyeLink 1000 Plus eye tracker on a long-range mount, at a sampling rate of 1000 Hz. The procedure included calibration routines and took into account the visual angle of the text on screen.

Reading Performance Assessments

Reading comprehension was evaluated through reading accuracy (ACC), the proportion of questions answered correctly after reading. There were 50 questions across all texts. Eye-movement metrics included total reading time, fixation counts, and mean fixation durations, and the L1 and L2 groups were compared with two-sample t-tests to identify significant differences.
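As a rough illustration of these measures, the sketch below computes accuracy as a proportion and a two-sample t statistic. The article does not say which variant of the t-test was used; Welch's unequal-variance form is assumed here, and the function names are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def accuracy(responses, answer_key):
    """Proportion of questions answered correctly (ACC)."""
    correct = sum(r == k for r, k in zip(responses, answer_key))
    return correct / len(answer_key)

def welch_t(a, b):
    """Welch's two-sample t statistic and its degrees of freedom.

    Assumption: unequal variances between groups; the study may
    instead have used the pooled-variance (Student) form.
    """
    # Squared standard error of each group mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The same `welch_t` call would apply to any per-participant metric, such as total reading time or mean fixation duration, with one list per group. A p-value then follows from the t distribution with `df` degrees of freedom.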

fMRI Data Preprocessing

The MRI data were preprocessed with software including SPM12 and FSL 5.0.11. Steps included slice-time correction, motion and distortion correction, normalization, and smoothing. Head motion was monitored throughout to protect the accuracy of the results.

ROI-based Analyses

To examine regional responses, the study ran region of interest (ROI) analyses on critical language regions, including the inferior frontal gyrus, middle frontal gyrus, and posterior temporal gyrus. This targeted approach allowed a detailed look at how specific brain areas responded during reading.

Embeddings Extracted from LLMs

The researchers extracted contextual embeddings from GPT-2, a transformer-based language model. Using this model supported an analysis of linguistic structure in both L1 and L2 readers and also allowed comparison with previous research.

Model-Brain Alignment

The alignment between brain activity and the language model was a central part of the analysis. Using a ridge regression model, the researchers computed correlations between observed brain responses and model predictions across different reading runs. A higher correlation indicates that neural patterns correspond more closely to the model’s representation of the text.
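A minimal sketch of this kind of alignment analysis is given below, using NumPy. It assumes that embeddings are the predictors, that brain responses are the targets, and that evaluation is leave-one-run-out; the article confirms only ridge regression and correlation across runs, so the cross-validation scheme, the fixed `alpha`, and all function names are assumptions.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights: solve (X'X + alpha*I) W = X'Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

def model_brain_alignment(X_runs, Y_runs, alpha=1.0):
    """Leave-one-run-out alignment score per voxel (or ROI).

    X_runs: list of (n_samples, n_features) embedding matrices, one per run.
    Y_runs: list of (n_samples, n_voxels) brain response matrices.
    """
    scores = []
    for held_out in range(len(X_runs)):
        X_train = np.vstack([x for i, x in enumerate(X_runs) if i != held_out])
        Y_train = np.vstack([y for i, y in enumerate(Y_runs) if i != held_out])
        # Standardize with training statistics only, to avoid leakage.
        mu_x = X_train.mean(axis=0)
        sd_x = X_train.std(axis=0) + 1e-8
        mu_y = Y_train.mean(axis=0)
        W = ridge_fit((X_train - mu_x) / sd_x, Y_train - mu_y, alpha)
        Y_pred = ((X_runs[held_out] - mu_x) / sd_x) @ W + mu_y
        scores.append(columnwise_corr(Y_pred, Y_runs[held_out]))
    # Average the held-out correlations over runs.
    return np.mean(scores, axis=0)
```

In practice the penalty `alpha` is usually tuned by nested cross-validation rather than fixed, and embeddings are typically aligned to the fMRI time course before fitting; both steps are omitted here for brevity.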

Association Between Reading Performance and Model-Brain Alignment

The study also tested whether reading accuracy was related to model-brain alignment in L1 and L2 readers. Regression techniques were used to adjust for potential confounding factors, so that any association reflected individual differences in reading performance rather than those confounds.
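One common way to adjust for confounds is to include them as covariates in an ordinary least squares regression and read off the coefficient for alignment. The sketch below assumes that approach; the article does not name the covariates or the exact model, so `covariates` is a hypothetical placeholder.

```python
import numpy as np

def adjusted_slope(alignment, accuracy, covariates):
    """OLS coefficient of alignment predicting accuracy, given covariates.

    alignment : (n,) model-brain alignment score per participant
    accuracy  : (n,) reading accuracy (ACC) per participant
    covariates: (n,) or (n, k) hypothetical confounds, e.g. age
    """
    n = len(accuracy)
    # Design matrix: intercept, predictor of interest, then confounds.
    X = np.column_stack([np.ones(n), alignment, covariates])
    beta, *_ = np.linalg.lstsq(X, accuracy, rcond=None)
    return beta[1]
```

Fitting the model separately for the L1 and L2 groups would give one adjusted slope per group.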

Individual Differences Impact on Model-Brain Alignment

Lastly, to understand individual variability in language processing, the researchers constructed regression models incorporating linguistic and attentional abilities. This analysis examined how bilingual experience and cognitive abilities jointly relate to model-brain alignment.

Through these multi-dimensional analyses, the study advanced our understanding of the cognitive mechanisms underpinning reading across different language backgrounds, paving the way for future research in the field of bilingual education and cognitive neuroscience.
