Modeling Sentence Comprehension Deficits in Aphasia: A Computational Evaluation of the Direct-access Model of Retrieval

Several researchers have argued that sentence comprehension is mediated via a content-addressable retrieval mechanism that allows fast and direct access to memory items. Initially failed retrievals can result in backtracking, which leads to correct retrieval. We present an augmented version of the direct-access model that allows backtracking to fail. Based on self-paced listening data from individuals with aphasia, we compare the augmented model to the base model without backtracking failures. The augmented model shows quantitatively similar performance to the base model, but only the augmented model can account for slow incorrect responses. We argue that the modified direct-access model is theoretically better suited to fit data from impaired populations.


Introduction
Comprehending a sentence involves building linguistic dependencies between words. In the sentence processing literature, several researchers have argued that linguistic dependency resolution is carried out via a cue-based retrieval mechanism (Van Dyke and McElree, 2006; Lewis and Vasishth, 2005). Cue-based retrieval theory assumes that word representations are retrieved from working memory via their syntactic and semantic features. Consider the following sentences:
(1) a. The boy who tickled the girl greeted the teacher.
    b. The boy who the girl tickled greeted the teacher.
In (1a), the noun boy would be encoded in memory with features such as [+animate, +subj]. When the reader reaches the verb tickled, a retrieval is triggered with retrieval cues that match the features of boy. At this point in time, boy is the only element that matches the retrieval cues of the verb. By contrast, in (1b), another noun intervenes between tickled and boy that partially matches the cues set at the retrieval: girl [+animate, -subj]. The partial feature overlap causes similarity-based interference between the two items, making the dependency more difficult to resolve in (1b) compared to (1a).
Interference effects have been attested in multiple studies; see, for example, Jäger et al. (2020); Gordon et al. (2006); Jäger et al. (2017); Van Dyke (2007). One model of cue-based retrieval that predicts these interference effects is the direct-access model developed by McElree and colleagues (McElree, 2000; McElree et al., 2003; Martin and McElree, 2008). The direct-access model (DA) assumes that retrieval cues allow parallel access to candidate items in memory, as opposed to a serial search mechanism. Due to the parallelism assumption, the speed of retrieval is predicted to be constant across items (aside from individual differences and stochastic noise in the retrieval process).
Factors such as increased distance between the target and the retrieval point and the presence of distractor items can lower the probability of retrieving the correct dependent (also known as availability). Low availability of the target dependent can lead to failures in parsing or to misretrievals of competitor items. When such errors occur, a backtracking process can be initiated, which by assumption leads to the correct retrieval of the target (McElree, 1993). The backtracking process requires additional time that is independent of the retrieval time. According to the direct-access model, (1a) should have shorter processing times than (1b) on average, because in (1b) some trials require costly backtracking due to lower availability of the target item boy.
The direct-access model can be adapted to explain impaired sentence comprehension in individuals with aphasia (IWA; Lissón et al., 2021). However, there is one crucial aspect of the direct-access model that is at odds with the aphasia literature, specifically with the finding that IWA have longer processing times for incorrect than for correct responses (e.g., Hanne et al., 2015; Pregla et al., 2021). The direct-access model assumes that some percentage of correct interpretations are only obtained after costly backtracking, and thus predicts that the average processing time for incorrect responses should be shorter than for correct responses. To address this issue, we implement a modified version of the direct-access model that is specifically relevant for sentence processing in IWA. In this model, backtracking can lead to correct retrieval of the target, as in the base model, but can also result in misretrieval and parsing failure.

Sentence Comprehension in Aphasia
Aphasia is an acquired neurological disorder that causes language production and comprehension impairments. In the aphasia literature, there are several theories that aim to explain the source of these impairments in language comprehension. One possibility is that IWA carry out syntactic operations at a slower-than-normal pace, which could cause failures in parsing. This is the slow syntax theory (Burkhardt et al., 2008). By contrast, Ferrill et al. (2012) claim that the underlying cause of slowed sentence processing in IWA is delayed lexical access, which cannot keep up with structure building. Another theory, resource reduction, assumes that IWA experience a reduction in the resources used for parsing (Caplan, 2012), such as working memory. Finally, Caplan et al. (2013) claim that IWA suffer from intermittent deficiencies in their parsing system that lead to parsing failures. Previous computational modeling work has shown that these theories may be complementary (Patil et al., 2016; Lissón et al., 2021), and that IWA may experience a combination of all of these deficits (Mätzig et al., 2018).
Assuming that a direct-access mechanism of retrieval subserves sentence comprehension, this mechanism could interact with one or more of the proposed processing deficits in IWA. One way to assess whether these deficits are plausible under a direct-access model is the computational modeling of experimental data. Lissón et al. (2021) tested the direct-access model against self-paced listening data from individuals with aphasia, finding the model to be in line with multiple theories of processing deficits in aphasia. Despite this encouraging result, the model could not fit slow incorrect responses, due to its assumptions about backtracking and its consequences.
In what follows, we present our implementation of the original direct-access model and the modified version with backtracking failures. We fit the two models to data from individuals with aphasia and compare their quantitative performances. In order to assess the role of the different proposed deficits of IWA in sentence comprehension, we also map the models' parameters onto theories of processing deficits in aphasia.

Data
The data that we model come from a self-paced listening task in German (Pregla et al., 2021). 50 control participants and 21 IWA completed the experiment. Sentences were presented auditorily, word by word. Participants paced the presentation themselves, choosing to hear the next word by pressing a computer key. The time between key presses (here called listening time) was recorded. At the end of the sentence, two images (target and foil) were presented, and participants had to select which image matched the meaning of the sentence they had just heard. Accuracies for the picture-selection task were also recorded. To assess test-retest reliability, each subject completed the task twice, with a break of two months in between. Our modeling is based on the pooled data of both sessions.

Items
We investigate interference effects in a linguistic construction that is understudied in IWA: control constructions. In control constructions, the subject of an infinitival clause is not overtly specified, but understood to be coreferential with one of the overt noun phrases in the matrix clause of the same sentence (e.g., Brian promises Martha to take out the trash → Brian takes out the trash). In linguistic theory, it is assumed that a phonologically empty element (PRO) occupies the subject position of take out (Chomsky, 1981). PRO is co-indexed with a noun phrase in the matrix clause that acts as its antecedent. The verb in the matrix clause specifies, according to its semantic and syntactic properties, which noun phrase in the matrix clause triggers the interpretation of PRO in the subclause.
In sentence (2a) below, the verb verspricht (promises) is lexically specified as a subject-control verb, and the subject noun phrase of the main clause, Peter, is chosen as the antecedent of PRO. By contrast, in (2b), the object-control verb erlaubt (allows) specifies that the object noun phrase of the main clause, Lisa, is the antecedent of PRO.
(2) a. Subject control
       Peter_i verspricht nun Lisa_j, PRO_i das kleine Lamm zu streicheln und zu kraulen.
       'Peter now promises Lisa to pet and to ruffle the little lamb'
    b. Object control
       Peter_i erlaubt nun Lisa_j, PRO_j das kleine Lamm zu streicheln und zu kraulen.
       'Peter now allows Lisa to pet and to ruffle the little lamb'
Cue-based retrieval theory assumes that control clauses require completion of the PRO dependency through memory access to the correct noun phrase. The direct-access model would predict (2b) to be easier to process than (2a), because the target (Lisa) is linearly closer to the retrieval site at PRO, and thus more available. Therefore, at PRO, the probability of retrieval of the target should be higher in (2b) relative to (2a). In line with this prediction, unimpaired subjects show a processing advantage for object control over subject control (Kwon and Sturt, 2016). Similarly, IWA exhibit more difficulties understanding subject control conditions in acting-out tasks (Caplan and Hildebrandt, 1988; Caplan et al., 1996). However, the object control advantage in IWA has not been previously tested using online methods.
Our experimental items were 20 sentences (10 per condition) similar to (2a) and (2b). The corresponding pictures for the picture-selection task are shown in Figure (1). The top picture is the target picture for (2a), whereas the bottom picture is the target for (2b). We assume that trials where the foil picture has been selected (i.e., the picture that shows the distractor noun as the agent of the action) correspond to a misretrieval.

Dependent Variables
The dependent variables used for modeling were the listening times (henceforth, LT) at the retrieval site (PRO) and the accuracy of the picture-selection task. Given that PRO is phonologically empty, we assumed that the retrieval process takes place at some point between the second and the third noun phrase (Lisa and das kleine Lamm in (2a)). We therefore summed the listening times of these regions within each trial. In order to evaluate the slowed lexical access hypothesis (Ferrill et al., 2012), we also used data from an auditory lexical decision task that participants performed in addition to the experiment. This task was based on LEMO 2.0 (Stadie et al., 2013). Participants had to decide whether an auditorily presented item was a word or a neologism, and the response times were recorded. For each participant, we computed the mean response times for correct responses. These were then centered and scaled within groups and used as continuous predictors in the models. We will refer to the scaled lexical decision task reaction times as the LDT predictor.
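The construction of the LDT predictor can be illustrated with a minimal sketch. It assumes that "centered and scaled within groups" means subtracting the group mean and dividing by the group's sample standard deviation; the function name and the response times below are hypothetical and not taken from the study.

```python
from statistics import mean, stdev

def scale_within_groups(mean_rts, groups):
    """Center and scale each participant's mean RT within their own group.

    Assumption: scaling uses the sample standard deviation (n - 1).
    """
    scaled = [0.0] * len(mean_rts)
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        values = [mean_rts[i] for i in idx]
        m, s = mean(values), stdev(values)
        for i in idx:
            scaled[i] = (mean_rts[i] - m) / s
    return scaled

# Hypothetical mean lexical-decision RTs (ms) for six participants.
rts = [900.0, 1000.0, 1100.0, 1400.0, 1600.0, 1800.0]
grp = ["control", "control", "control", "IWA", "IWA", "IWA"]
ldt = scale_within_groups(rts, grp)
print(ldt)
```

Because the scaling is done within groups, a value of the predictor expresses how slow a participant is relative to their own group, not relative to the whole sample.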

Direct-Access Model
Our implementation of the direct-access model follows Nicenboim and Vasishth (2018). The model assumes that listening times for correct responses come from a mixture distribution, given that there are trials with backtracking, where an additional processing cost δ is added, and trials without backtracking, where no such cost is added. By contrast, incorrect responses never involve backtracking, and the average listening time should be the same as for correct responses without backtracking. A graphical representation of the model is displayed in Figure (2). The three possible cases are as follows:
(a) Retrieval of the target succeeds at first attempt, with probability θ: LT ∼ lognormal(µ, σ)
(b) Retrieval fails at first attempt, backtracking is initiated, and the target is retrieved, with probability (1 − θ) · P_b: LT ∼ lognormal(µ + δ, σ)
(c) Retrieval fails, no backtracking, and a misretrieval occurs, with probability (1 − θ) · (1 − P_b): LT ∼ lognormal(µ, σ)
The model includes both fixed and random effects in order to account for sentence complexity, group differences, and individual variability. The hierarchical structure is shown in Equation (1). All parameters have an adjustment by group (IWA versus control), because we expect IWA to have different parameter estimates from control participants. Since DA assumes that retrieval times are not affected by sentence complexity, the average listening times (µ) do not have an adjustment for condition. By contrast, the probability of retrieval of the target, θ, includes a condition adjustment. This parameter can be thought of as indexing memory availability. The probability of backtracking P_b, the cost of backtracking δ, and σ do not depend on sentence complexity, but may vary between IWA and controls. The hierarchical structure is embedded within the parameters when possible (we report the maximal hierarchical structure that could be fit). In Equation (1), the terms u and w are the by-participant and by-item adjustments to the fixed effects, respectively. These are assumed to come from two multivariate normal distributions. All parameters had regularizing priors, listed in Appendix B.
(1) [Equation: hierarchical structure of the model parameters]
The model was implemented in the probabilistic programming language Stan (Stan Development Team, 2020), and fit via the rstan package (Carpenter et al., 2017) in R (R Core Team, 2020). The model was fit with 3 chains and 8,000 iterations, half of which were used as warm-up.
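The generative assumptions of the base model can be illustrated with a small simulation. This is a sketch in Python rather than the Stan implementation used for fitting; it ignores the hierarchical structure, assumes that the backtracking cost δ is added to µ on the log scale, and uses hypothetical parameter values chosen only for illustration.

```python
import random
from statistics import mean

def simulate_da_trial(theta, p_b, mu, delta, sigma, rng):
    """Return (listening_time, correct) for one trial of the base model."""
    if rng.random() < theta:
        # Target retrieved at the first attempt.
        return rng.lognormvariate(mu, sigma), True
    if rng.random() < p_b:
        # Initial failure followed by backtracking, which always succeeds
        # in the base model. Assumption: the cost delta is added on the log scale.
        return rng.lognormvariate(mu + delta, sigma), True
    # Initial failure without backtracking: misretrieval.
    return rng.lognormvariate(mu, sigma), False

rng = random.Random(1)
# Hypothetical parameter values, not estimates from the paper.
trials = [
    simulate_da_trial(theta=0.6, p_b=0.5, mu=7.5, delta=0.5, sigma=0.3, rng=rng)
    for _ in range(20000)
]
correct_lts = [lt for lt, ok in trials if ok]
incorrect_lts = [lt for lt, ok in trials if not ok]
accuracy = len(correct_lts) / len(trials)

print(f"accuracy: {accuracy:.2f}")
print(f"mean LT, correct:   {mean(correct_lts):.0f} ms")
print(f"mean LT, incorrect: {mean(incorrect_lts):.0f} ms")
```

Since only correct responses can include the backtracking cost, the simulated mean listening time for correct responses exceeds that for incorrect responses, which is the property of the base model that conflicts with the slow incorrect responses observed in IWA.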

Predictions
Based on the theories of processing deficits in aphasia discussed in Section (1.1), and on the findings in Lissón et al. (2021), we make the following predictions:
1. IWA's µ and δ values should be higher than controls'. This would be in line with slow syntax, assuming that both the initial retrieval and the backtracked retrieval are accompanied by appropriate structure-building processes.
2. The probability of initial retrieval of the target θ should be lower for IWA relative to controls, across conditions.
3. Object control conditions should have a larger θ, relative to subject control. In addition, IWA should have a bigger interference effect, i.e., the difference in θ between the two conditions should be larger in IWA than in controls. This pattern would be expected under the resource reduction theory, which states that IWA should have greater difficulties in more complex sentences.
4. Slower lexical decision (LDT) should be associated with a decrease in θ across groups. Strong support for delayed lexical access would come from an interaction between LDT and group, such that an increase in LDT predicts a greater decrease in θ for IWA than for controls: slow lexical access could cause parsing problems for controls, but if delayed lexical access is the main cause of difficulty in IWA, parsing failures should occur more often in this group for individuals whose lexical access is particularly slow.
5. The probability of backtracking should be lower for IWA, which would be in line with resource reduction.
6. Finally, the dispersion parameter σ of the listening-time distribution should be larger for IWA, which would indicate that IWA have more noise in their parsing system. This would be in line with intermittent deficiencies, since more noise could be due to more breakdowns in parsing.
These predictions build on the previous work by Lissón et al. (2021), but other options for the mapping between parameters and theories of comprehension deficits in aphasia are possible, see Mätzig et al. (2018); Patil et al. (2016).

Results
We begin by assessing the posterior distribution of the probability of retrieval of the target, θ, shown in Figure (3). Controls are estimated to retrieve the target at the first retrieval attempt in more than 90% of trials in both conditions. The mean for the subject-control condition is slightly lower than the mean for the object-control condition. By contrast, IWA display a greater effect of interference: In object-control sentences, where the antecedent is close to PRO, IWA are estimated to correctly retrieve the target at the first attempt 85% of the time, compared to 60% for subject-control. An increase in LDT leads to a decrease in θ of −6% (CrI: [−11%, −2%]), but there was no evidence of a group × LDT interaction (−2%, CrI: [−6%, 2%]).
The credible intervals for the remaining parameters are reported in the accompanying table. As expected under the slow syntax theory, IWA's mean listening times (µ) and the time needed for backtracking (δ) are higher than controls'. Similarly, σ is also higher for IWA, as predicted by intermittent deficiencies. Finally, the probability of backtracking is much lower for IWA than for controls. Assuming that backtracking uses general parsing resources, this estimate is in line with resource reduction.

Posterior Predictive Checks
One way to assess the behavior of the model is to check the posterior distribution of data generated by the model against the empirical data. If the mean of the empirical data falls within the range of predicted values of the model, the model could have generated the empirical data. By contrast, if the empirical data are outside of the range of the generated values, this indicates a suboptimal fit. Figure (4) shows the posterior predictive distributions of the direct-access model across groups and conditions. Overall, correct responses are modeled reasonably well, except in the object-control condition for IWA. The model also underestimates the listening times for incorrect responses, except for IWA in the subject-control condition. In all other design cells, the observed incorrect responses are slower than the observed correct responses, contrary to the model's assumption that slow backtracking responses are always correct.

Modified Direct-Access Model
In terms of implementation, the main difference between the models is a newly introduced parameter θ_b, which is the probability of correct retrieval after backtracking. Figure (5) displays a graphical representation of this new model: After backtracking, the target is retrieved with probability θ_b, and a misretrieval occurs with probability 1 − θ_b. The hierarchical structure is the same as in the original DA model, except for θ_b, whose adjustments are shown in Equation (2).
Figure 5: Graphical representation of the modified direct-access model.
The model was run with 10,000 iterations, half of which were used as warm-up.
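The effect of allowing backtracking to fail can be illustrated with a small simulation. This is a sketch in Python, not the Stan implementation; it ignores the hierarchical structure, assumes that the backtracking cost δ is added to µ on the log scale, and uses hypothetical parameter values chosen only for illustration.

```python
import random
from statistics import mean

def simulate_mda_trial(theta, p_b, theta_b, mu, delta, sigma, rng):
    """Return (listening_time, correct) for one trial of the modified model."""
    if rng.random() < theta:
        # Target retrieved at the first attempt.
        return rng.lognormvariate(mu, sigma), True
    if rng.random() < p_b:
        # Backtracking is costly and succeeds only with probability theta_b.
        # Assumption: the cost delta is added on the log scale.
        lt = rng.lognormvariate(mu + delta, sigma)
        return lt, rng.random() < theta_b
    # Initial failure without backtracking: misretrieval.
    return rng.lognormvariate(mu, sigma), False

rng = random.Random(1)
# Hypothetical parameter values, not estimates from the paper.
trials = [
    simulate_mda_trial(theta=0.6, p_b=0.5, theta_b=0.5,
                       mu=7.5, delta=0.5, sigma=0.3, rng=rng)
    for _ in range(20000)
]
correct_lts = [lt for lt, ok in trials if ok]
incorrect_lts = [lt for lt, ok in trials if not ok]
accuracy = len(correct_lts) / len(trials)

print(f"accuracy: {accuracy:.2f}")
print(f"mean LT, correct:   {mean(correct_lts):.0f} ms")
print(f"mean LT, incorrect: {mean(incorrect_lts):.0f} ms")
```

With these illustrative values, a larger share of incorrect than of correct responses includes the backtracking cost, so incorrect responses are slower on average. Setting theta_b to 1 recovers the base model, in which every backtracked trial is correct.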

Predictions
All predictions are carried over from the base DA model. In addition, the probability of retrieval of the target after backtracking, θ_b, should be lower for IWA than for controls. This would indicate that IWA are more likely to experience parsing failure or misretrieval even after backtracking.

Results
We begin by assessing the probability of first correct retrieval, θ. The posterior distribution across groups and conditions is shown in Figure (6). The estimates are quite similar to the ones in the original DA model: Controls have a very high probability of initial correct retrieval across conditions, and IWA display a greater interference effect. As in the base model, IWA have a low probability of backtracking in this model (7%, CrI: [4%, 12%]) relative to controls (80%, CrI: [72%, 86%]). The probability of correct retrieval after backtracking, θ_b, determines the proportion of slow incorrect responses. The posterior distribution of θ_b is shown in Figure (7). After backtracking, controls are estimated to retrieve the target 90% of the time, compared to around 70% for IWA.
The rest of the estimates are also similar to those in the original DA model: IWA's µ is higher than controls' (2751 ms, CrI: [2477

Posterior Predictive Checks
The posterior predictive checks for the modified direct-access model (MDA) are shown in the corresponding figure. Like the base model, the MDA estimates listening times for correct responses reasonably well across the board. The fits for incorrect responses appear to have improved, except for object-control in IWA, where the predicted listening times are still faster than the observed listening times.
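The logic of such a check can be sketched as follows. The sketch assumes that each simulated data set was generated from one posterior draw of the fitted model; the function name and all listening times are hypothetical and serve only to illustrate the comparison.

```python
from statistics import mean

def observed_mean_in_predicted_range(observed, simulated_datasets):
    """Check whether the observed mean lies within the range of simulated means.

    Returns a (bool, (lowest, highest)) pair.
    """
    sim_means = [mean(d) for d in simulated_datasets]
    low, high = min(sim_means), max(sim_means)
    return low <= mean(observed) <= high, (low, high)

# Hypothetical listening times (ms) for one design cell.
observed = [2100.0, 2500.0, 2300.0]
# Hypothetical data sets, standing in for draws from the posterior predictive distribution.
simulated = [[2000.0, 2200.0], [2400.0, 2600.0], [2250.0, 2350.0]]

inside, bounds = observed_mean_in_predicted_range(observed, simulated)
print(inside, bounds)
```

In practice, the comparison is made separately for each group, condition, and response type, since a model can fit correct responses well while missing incorrect ones.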

Model Comparison
In order to quantitatively compare the performance of the models, we computed Bayes factors. We chose Bayes factors over other alternatives (e.g., cross-validation) because the two models seem to predict similar distributions, and Bayes factors are especially suited for nested models, or models that make very similar predictions. The hypothesis being tested is whether there is a non-zero parameter θ_b that indexes the probability of successful backtracking, assumed by the MDA model, or whether backtracking is always successful, as assumed by the base DA model.
In order to perform the comparison, the models were run for 40,000 iterations, of which 3,000 were used for warm-up. Bayes factors were computed using the bridgesampling package (Gronau et al., 2020) in R. The Bayes factor of DA over MDA was estimated to be 2. This result is inconclusive, and indicates that the models provide similar quantitative fit to the data.
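To see why a Bayes factor of 2 is inconclusive, it can be converted into a posterior model probability. The sketch below assumes equal prior probabilities for the two models, which is an assumption made here for illustration and is not stated in the analysis above; the function name is hypothetical.

```python
def posterior_model_probability(bayes_factor, prior_probability=0.5):
    """Posterior probability of the model favored by the Bayes factor.

    Posterior odds = Bayes factor * prior odds.
    Assumption: equal prior model probabilities unless specified otherwise.
    """
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Bayes factor of DA over MDA reported above.
p_da = posterior_model_probability(2.0)
print(round(p_da, 3))
```

Under equal prior odds, a Bayes factor of 2 corresponds to a posterior probability of about two thirds for the base model, which leaves substantial probability for the modified model.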

Discussion and Conclusion
In the present paper, we implemented and tested two versions of the direct-access model of cue-based retrieval and evaluated their predictive performance on data from individuals with aphasia and control participants. Specifically, we modeled interference in an under-studied linguistic construction, namely control structures. Both the base model and the modified model are in line with a combination of processing deficits in IWA: slow syntax, resource reduction, and intermittent deficiencies. Neither of the two models showed support for delayed lexical access as a source of retrieval difficulty specifically for IWA. Although a delay in LDT was connected to a decrease in the probability of correct retrieval, the effect of LDT was similar for IWA and control participants. In general, our results are consistent with other studies showing that a combination of processing deficits may be the source of impairments in sentence comprehension in IWA (Caplan et al., 2015; Mätzig et al., 2018; Lissón et al., 2021).
Unlike the base direct-access model, our modified DA model (MDA) assumes that backtracking can fail, resulting in slow, incorrect retrievals. However, this added assumption does not result in a decisive advantage in fit for the MDA model, as shown by the posterior predictive checks and the Bayes factor analysis. This result is unexpected, and leads us to think that the MDA model may be overparametrized. In MDA, all of the main parameters include a group adjustment. As a consequence, for instance, the mean listening times, µ, are estimated to be higher for IWA than for controls. The cost of backtracking, which is only added to µ if backtracking is performed, accounts for slower responses. However, because IWA's µ is estimated to be higher than controls' µ, the model may not need to rely on backtracking in order to account for slow responses in IWA. This could be the reason why the probability of backtracking for IWA is very low (7%) relative to controls (80%). In addition, IWA's θ_b has to be estimated from the 7% of trials that include backtracking. Given the size of the IWA group (21 participants) and the small number of trials that include backtracking, the model may not be able to estimate the θ_b parameter correctly. This could be investigated in several ways. One possibility would be to remove the group adjustments from µ, P_b, δ, and θ_b one at a time, and see which of these models shows a better quantitative fit to the data (see Lissón et al., 2021). Another possibility would be to evaluate how these parameters interact with and without group adjustments (e.g., do P_b and/or δ for IWA increase if there is no group adjustment in µ?). We will address these questions in future work.
The present paper contributes to the aphasia literature by proposing a modification of the direct-access model that can account for slow incorrect responses. Despite our inconclusive results, we believe that the modified direct-access model offers a more appropriate set of assumptions for individuals with aphasia than the base direct-access model. The modified direct-access model can account for slow incorrect responses, which are frequently found in studies on sentence processing in IWA (e.g., Hanne et al., 2015; Lissón et al., 2021; Pregla et al., 2021). It remains to be seen, by testing the modified direct-access model against more data from individuals with aphasia, whether there is a difference in predictive performance between the two models.