Meta-Semantic Representation for Early Detection of Alzheimer’s Disease

This paper presents a new task-oriented meaning representation called meta-semantics, that is designed to detect patients with early symptoms of Alzheimer’s disease by analyzing their language beyond a syntactic or semantic level. Meta-semantic representation consists of three parts, entities, predicate argument structures, and discourse attributes, that derive rich knowledge graphs. For this study, 50 controls and 50 patients with mild cognitive impairment (MCI) are selected, and meta-semantic representation is annotated on their speeches transcribed in text. Inter-annotator agreement scores of 88%, 82%, and 89% are achieved for the three types of annotation, respectively. Five analyses are made using this annotation, depicting clear distinctions between the control and MCI groups. Finally, a neural model is trained on features extracted from those analyses to classify MCI patients from normal controls, showing a high accuracy of 82% that is very promising.


Introduction
Our understanding of Alzheimer’s disease (AD) has evolved over the last few decades. Most notable is the discovery that AD has long latent preclinical and mild cognitive impairment (MCI) stages (Karr et al., 2018; Steenland et al., 2018). These stages are the focus of many prevention and therapeutic interventions. A key limitation in identifying these pre-dementia stages for clinical trial recruitment is the need for expensive or invasive testing such as positron emission tomography or cerebrospinal fluid (CSF) analysis. Traditional cognitive testing is time-consuming and can be biased by literacy and test-taking skills (Fyffe et al., 2011). Recent advances in natural language processing (NLP) offer a unique opportunity to explore previously undetectable changes in the cognitive process of semantics in a way that can be automated in clinical artificial intelligence (Beam and Kohane, 2016).
A limited number of prior studies have suggested the feasibility of detecting AD by analyzing language variations. One approach uses linguistically motivated analysis that extracts lexical, grammatical, and syntactic features to detect language deficits in AD patients (Fraser et al., 2016; Orimaye et al., 2017). The other approach uses deep learning models to extract features from the language used by AD patients (Orimaye et al., 2016; Karlekar et al., 2018). The limitation of these studies is that most were developed on dementia cases, so their ability to detect pre-dementia is still unknown. The impact of these methods is highest in cases where traditional cognitive measures are unable to clarify the patient’s cognitive status. Hence, we focus on these early MCI stages in this study.
We suggest a new meaning representation called meta-semantics that derives a knowledge graph reflecting semantic, pragmatic, and discourse aspects of language spoken by MCI patients. The objective of this representation is not to design yet another structure to capture more information but to sense aspects beyond the syntactic and semantic levels that are essential for the early detection of MCI patients. We hypothesize that patients in the pre-dementia stage do not necessarily make many more grammatical mistakes than normal people but often have difficulties in elaborating or articulating their thoughts in language. To verify our hypothesis, we collect speeches from 50 normal controls and 50 MCI patients that standardized cognition tests fail to distinguish (Section 2), annotate meta-semantic representation on the transcripts of those speeches (Section 3), make several analyses to comprehend linguistic differences between the control and the MCI groups (Section 4), then develop a neural network model to detect MCI patients from normal controls (Section 5). To the best of our knowledge, this is the first time that a dedicated meaning representation is proposed for the detection of MCI.

Data Preparation
We analyzed data from 100 subjects collected as part of B-SHARP, the Brain, Stress, Hypertension, and Aging Research Program. Fifty cognitively normal controls and 50 patients with mild cognitive impairment (MCI) were selected based on neuropsychological and clinical assessments performed by a trained physician and a neuropsychologist. The two groups were matched on overall cognitive scores to examine how well our new meta-semantic indices would perform in a setting where standardized tests such as the Montreal Cognitive Assessment (Nasreddine et al., 2005) and the Boston Naming Test (Kaplan et al., 1983) failed to distinguish them. Table 1 shows demographics and clinical features of the control and the MCI groups. No significant group differences were found in age, race, sex, or education between these two groups.
The MCI group performed significantly worse on the Clinical Dementia Rating (Morris, 1994), but did not differ as much on the Function Assessment Questionnaire (Pfeffer et al., 1982) assessing instrumental activities of daily living.

Speech Task Protocol
We conducted a speech task protocol that evaluated subjects' language abilities on 1) natural speech, 2) fluency, and 3) picture description, and collected audio recordings for all three tasks from each subject. For this study, the audio recordings from the third task, picture description, were used to demonstrate the effectiveness of the meta-semantic analysis in detecting MCI. All subjects were shown the picture in Figure 1, The Circus Procession, copyrighted by McLoughlin Brothers as part of the Juvenile Collection, and given the same instruction to describe the picture for one minute. Visual abilities of the subjects were confirmed before recording.
Figure 1: The image of "The Circus Procession" used for the picture description task.

Transcription
Audio recordings for the picture description task (Section 2.1) from the 100 subjects in Table 1 were automatically transcribed by the online tool Temi, then manually corrected. Table 3 shows transcripts from a normal control and an MCI patient whose MoCA scores are matched to 29 (out of 30 points).
For the annotation of meta-semantic representation in Section 3, all transcripts were tokenized by the open-source NLP toolkit called ELIT. Table 2 shows general statistics of these transcripts from the output automatically generated by the part-of-speech tagger and the dependency parser in ELIT. No significant group differences were found in text-level counts (tokens and sentences), grammatical categories (nouns and verbs), or syntactic structures (conjuncts, clausal modifiers or complements), except for the relative clauses and non-finite modifiers whose p-values are less than 0.05. The MCI group used notably fewer verbs, although the difference from the control group was not significant.
Control transcript (Table 3): This is a what looks like a circus poster. The title is the Circus Procession. There's an off color green background. On the lefthand side is elephant in a costume peddling a tricycle, operating a tricycle. On the right side is another elephant with holding a fan. He's dressed in an outfit with a hat and a cane. There are two people in the background and they could be either men or women. And then there are three, I'll take that back. And then the foreground is a clown in a white suit with red trim. It was copyrighted in 1988 by the McLoughlin Brothers, New York or NY. Um, there's a black border. Um, the, there are shadows represented by some brown color at the bottom.
MCI transcript (Table 3): It's a circus poster. Going left to right is an elephant standing on its side legs, and a, um, vest, a tie and a red Tuxedo coat, and um yellow cap with a black band holding what appears to be a fan in its trunk. The elephant has glasses and a cane. Um, the top, says the Circus Procession. To the left of the elephant is a clown in a white and red costume with red and black paint on his face, red hair or shoes. And there appear to be three like soldiers, um gray suits, yellow trim, um, um, red hair. To the left of them, there's another elephant, riding a bicycle. This elephant has pants to red bicycle. He's got a regular coat of his and a red bow tie.

Meta-Semantic Representation
We organized a team of two undergraduate students in Linguistics to annotate meta-semantic representation on the transcripts from Section 2.2 such that every transcript was annotated by two people and adjudicated by an expert. The web-based annotation tool called BRAT was used for this annotation (Stenetorp et al., 2012), where the entire content of each transcript was displayed at a time. Figure 2 shows a screenshot of our annotation interface using BRAT on the control example in Table 3.
Meta-semantic representation involves three types of annotation, entities (Section 3.1), predicate argument structures (Section 3.2), and discourse attributes (Section 3.3), as well as a few other miscellaneous components (Section 3.4). The following sections give a brief overview of our annotation guidelines.

Entities
To analyze which objects in the picture are described by individual subjects and how, every object mentioned in the transcript is identified as either a predefined entity or an unknown entity. All nominals including pronouns, proper nouns, common nouns, and noun phrases are considered potential mentions. Table 4 shows the list of 50 predefined entities that are frequently mentioned in the transcripts.
Table 4: Predefined entities, where the main entities indicate the 5 conspicuous objects in Figure 1 and the sub entities indicate objects that belong to the main entities.
In the example below, five mentions are found and can be linked to four entities as follows: An elephant_1 is holding a fan_2. To the left side of him_3, another elephant_4 is riding a tricycle_5.
The entity Men is a group of three people including Man L, Man M, and Man R (man on the left, middle, and right) as its sub entities. Such a group entity is defined because subjects regularly describe them together as one unit. Picture refers to the type of picture that subjects perceive it to be (e.g., poster in Figure 2). Special kinds of entities, Title and Copyright, are also defined and annotated on the literals (e.g., the Circus Procession in Figure 2, McLoughlin Brothers, 1888, N.Y.) to see if subjects indeed recognize them correctly. Any object that is either ambiguous or not predefined is annotated as an unknown entity.
It is worth mentioning that unlike mention annotation for coreference resolution in OntoNotes (Pradhan et al., 2012), where whole noun phrases are annotated as mentions, function words such as articles or determiners and modifiers such as adverbs or adjectives are not considered part of mentions in our annotation, which is similar to abstract meaning representation (Banarescu et al., 2013). Such abstraction is more suitable for spoken data where the usage of these function words and modifiers is not so consistent.

Predicate Argument Structures
To analyze the semantics of the entities as well as their relations to one another, predicate argument structures are annotated. Note that meta-semantic representation is entity-centric such that expressions that do not describe the picture are discarded from the annotation (e.g., When I was young, circus came to my town all the time). Such expressions do not help analyze subjects' capabilities in describing the picture, although they can be used for other kinds of analyses which we will explore in the future.
Following the latest guidelines of PropBank (Bonial et al., 2017), both verbal predicates, excluding auxiliary and modal verbs, and nominal predicates, including eventive nouns and nouns from light-verb constructions, are considered in our representation.
Once predicates are identified, arguments are annotated with the following thematic roles (in the examples, predicates are in italic, arguments are in brackets, and thematic roles are in subscripts):
• agent: Prototypical agents, e.g., An [elephant]_agent is holding a fan.
• theme: Prototypical patients or themes, e.g., An elephant is holding a [fan]_theme.
• dative: Recipients or beneficiaries, e.g., The soldier is bringing a flag to the [circus]_dative.
• dir: Directional modifiers, e.g., Feathers are coming out of the [hat]_dir.
• loc: Locative modifiers, e.g., The clown is dancing in between the [elephants]_loc.
• prp: Purpose or clausal modifiers, e.g., The clown is dancing to [tease]_prp the elephants.
• tmp: Temporal modifiers, e.g., This seemed to be a poster made in the early [1900s]_tmp.
If an argument is a preposition phrase, the thematic role is annotated on the preposition object such that in the dir example above, only the head noun [hat] is annotated as dir instead of the entire preposition phrase "out of the hat". As shown in the prp example, a predicate can be an argument of another predicate. Note that modifiers can attach not only to predicates but also to entities (e.g., the elephant on the [tricycle]_loc, a poster from way back in [1990s]_tmp). The choice of these thematic roles is based on observation of the transcripts. No instance of dative is found in our dataset, but the role is still kept in the guidelines for future annotation.

Discourse Attributes
To analyze discourse aspects of the transcripts, six labels and one relation are annotated as follows (in the examples, attributes are indicated in brackets):
ambiguous: Objects contextually ambiguous to identify are annotated with this label. For example, both [elephant] and [something] are annotated as ambiguous because it is unclear which elephant and object they refer to. Also, [blue] likely refers to the vest of Elephant R, but this is not specified in this context; thus, it is also annotated as ambiguous.
more: There are elephants, two [elephants]_more, here. This is the Circus Profession, [Procession]_more. That one is holding an umbrella, or a [fan]_more.
[elephants] is an apposition that adds more information to elephants. [Procession] is a prototypical repair case that fixes the prior mention of Profession. [fan] may not be considered a repair in some analyses, but it is in ours because it attempts to fix the earlier mention of umbrella in a speech setting.

Miscellaneous
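Two additional modifiers, Nmod and Xmod, are annotated. Nmod are modifiers of nominals that modify entities with the attr relation: A [polka dot]_attr dress. Very [big]_attr [red and yellow]_attr pants. Xmod are any other types of modifiers, mostly adverbials and prepositions. If adverbials, they are annotated with the adv relation in Section 3.2. If prepositions, they are annotated with the case relation: There is a [seemingly]_adv dancing clown. Feathers are coming [out of]_case the hat. Finally, possessions of entities are annotated with the with relation regardless of verbs such as have or get, for consistency across different structures. In both of the following sentences, [jacket] has the with relation to the elephant: The elephant with a blue [jacket]_with. The elephant has a blue [jacket]_with.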

Meta-Semantic Analysis
Given the annotation in Section 3, several analyses are made to observe how effectively meta-semantic representation distinguishes the control (C) and MCI (M) groups.

Entity Coverage Analysis
We anticipate that most subjects in C and M would recognize the main entities, whereas fewer sub entities would be commonly recognized by M than by C. For each entity $e_i$, that is, the i'th entity in Table 4, two counts $c^c_i$ and $c^m_i$ are measured such that they are the numbers of subjects in C and M whose transcripts include at least one mention of $e_i$. For instance, the entity $e_7$ = Title is mentioned by $c^c_7 = 37$ subjects in C and $c^m_7 = 40$ subjects in M in our annotation.
Figure 4 shows how many entities are commonly mentioned by each percentage range of the subjects in C and M. For example, six entities are commonly mentioned by 55∼75% of the subjects in C whereas only three entities are commonly mentioned by the same range of the subjects in M. These percentage ranges are analyzed as follows:
Mid range (35∼75%): Subjects in M begin to miss certain entities that are recognized by subjects in C in this range. 14 entities are commonly mentioned by C whereas 10 entities are mentioned by M. When the range is narrowed to 45∼75%, the difference becomes even more pronounced such that 10 entities are commonly mentioned by C whereas only 5 entities are mentioned by M in that range.
Low range (15∼35%): Similar to the high range, no significant difference is found between the two groups. 11 and 13 entities are commonly recognized by C and M, respectively, in this range.
For the whole range of 15∼75%, the plot from C can be well fitted to a line with $R^2 = 0.9524$, whereas the one from M cannot, resulting in a much lower $R^2 = 0.5924$. The plot from M rather shows an inverted Gaussian distribution, implying that the majority of M tends not to mention entities that are not immediately conspicuous, which is not necessarily the case for subjects in C.
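The coverage counts, the percentage ranges, and the linear fit described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each transcript is reduced to the set of entity names it mentions, and the function names and bin edges are hypothetical.

```python
import numpy as np

def coverage_counts(transcripts, entities):
    """For each entity, count the transcripts that mention it at least once.

    `transcripts` is assumed to be a list of sets of entity names.
    """
    return {e: sum(1 for t in transcripts if e in t) for e in entities}

def entities_per_range(counts, n_subjects, edges):
    """Count the entities whose coverage percentage falls in each range.

    Ranges are half-open [lo, hi), except the last one, which also
    includes its upper edge (np.histogram convention).
    """
    pct = np.array([100.0 * c / n_subjects for c in counts.values()])
    hist, _ = np.histogram(pct, bins=edges)
    return hist

def linear_r2(x, y):
    """Coefficient of determination of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot
```

Calling `entities_per_range` once per group with the same edges gives the two series that are compared per range, and `linear_r2` applied to each series gives the kind of goodness-of-fit value reported for C and M.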

Entity Focus Analysis
This analysis shows which entities are more frequently mentioned (focused on) by which subject group.
For each entity $e_i$ and its counts $c^c_i$ and $c^m_i$ in Section 4.1, the proportions $p^c_i$ and $p^m_i$ are measured such that $p^c_i = c^c_i / |C|$ and $p^m_i = c^m_i / |M|$, where $|C| = |M| = 50$ (Table 1). Then, the relative difference $d^r_i$ between the two proportions is measured for the i'th entity such that if $d^r_i$ is greater than 0, $e_i$ is more commonly mentioned by C; otherwise, it is more commonly mentioned by M. Figure 5 shows the entities that are mentioned notably more by C (blue) and M (red), where $|d^r_i| \geq 0.2$. Nine entities, CL Pants, M Boots, ER Glasses, EL Collar, ER Trunk, M Flag, EL Pants, ER Vest, and EL Jacket, are noticeably more mentioned by C, whereas only 2 entities, EL Tie and EL Hat, are more mentioned by M, whose focus is on those two small parts of the left elephant. Additionally, M mentions the Background more, which is not a specific object but an abstract environment.

Entity Density Analysis
This analysis shows the proportion of the description used for each object in the transcript. Meta-semantic representation forms a graph comprising many isolated subgraphs. In Figure 3, there are 5 subgraphs, where the largest subgraph has 7 vertices (the one with Elephant L) and the smallest subgraph has only 1 vertex (the one with Title).
Let $G^t$ be a graph derived from meta-semantic representation annotated on the t'th transcript. $G^t$ can be represented by a list of its subgraphs sorted in descending order with respect to their sizes such that $G^t = [g^t_1, \ldots, g^t_k]$ where $|g^t_i| \geq |g^t_j|$ for all $0 < i < j \leq k$. The size of a subgraph is determined by the number of vertices. For the graph in Figure 3, $G = [g_1, \ldots, g_5]$ such that $|G| = k = 5$, $|g_1| = 7$, and $|g_5| = 1$. Given $G^t$, the size list $L^t$ can be derived such that $L^t = [|g^t_1|, \ldots, |g^t_k|]$. Figure 6 shows plots of the size lists from the graphs derived by meta-semantic representation annotated on the control and MCI examples in Table 3. The control plot can be well fitted to a line with $R^2 = 0.9312$, whereas the MCI plot is better fitted to an exponential curve with $R^2 = 0.9206$.
The control plots fit lower-degree functions more reliably than the MCI plots, although the difference is not statistically significant, implying that subjects in C distribute their time more evenly across different entities than subjects in M, who tend to spend most of their time describing a couple of entities and little on the rest.
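The size list and the polynomial fitting error used in this analysis can be sketched as below. The sketch assumes the annotation graph is given as a vertex list and an undirected edge list, finds the isolated subgraphs as connected components, and takes the ranked indices 1..k as the x values; these input conventions and the function names are illustrative assumptions rather than details from the annotation tool.

```python
import numpy as np

def size_list(vertices, edges):
    """Sizes of the connected subgraphs, sorted in descending order."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the component root, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)  # merge the two components

    sizes = {}
    for v in vertices:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

def poly_sse(sizes, degree):
    """Sum of squared errors between a size list and its least-squares
    polynomial of the given degree (x = ranked index, y = subgraph size).

    The fit is only meaningful when len(sizes) > degree.
    """
    x = np.arange(1, len(sizes) + 1, dtype=float)
    y = np.asarray(sizes, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

def sse_by_degree(sizes, degrees=(1, 2, 3)):
    """Map each polynomial degree to its fitting error for one size list."""
    return {d: poly_sse(sizes, d) for d in degrees}
```

A size list that decreases evenly yields a small degree-1 error, while a list dominated by one or two large subgraphs needs a higher degree to fit well, which is the contrast the analysis draws between the two groups.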

Predicate Argument Analysis
Figure 7 shows the average percentages of predicates and their thematic arguments annotated on the transcripts. Subjects in C generally form sentences with more predicate argument structures, although the differences are not statistically significant. Not enough instances of the modifiers (e.g., mnr, loc) are found to make a meaningful analysis for those roles. Although predicate argument structures may not appear useful, these structures make it possible to perform the entity density analysis in Section 4.3 and potentially other types of analyses, which we will conduct in the future.

Discourse Attribute Analysis
Figure 8 shows the average percentages of discourse attributes. Notice that M makes more than twice as many ambiguous mentions as C, implying that MCI patients do not elaborate as well. Moreover, M makes more fuzzy expressions and uses more relations to repair, which makes their speeches less articulate. On the other hand, C expresses more subjective opinions and makes more certain expressions with emphasis, which makes their speeches sound more confident. These are essential features for distinguishing M from C, which is what makes this analysis "meta-semantic".

Inter-Annotator Agreement
The annotation guidelines summarized in Section 3 are established through multiple rounds of double annotation and adjudication. During the final round, the entity annotation, the predicate argument annotation, and the discourse attribute annotation reach F1 scores of 88%, 82%, and 89%, respectively, for the inter-annotator agreement, yielding high-quality data ready for training statistical models.
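One common way to compute an F1-based agreement between two annotators is to treat one annotator's items as the reference and the other's as the prediction. The sketch below follows that convention; the exact matching criteria behind the reported scores are not given here, so the representation of an annotation as a hashable (start, end, label) tuple and the exact-match requirement are assumptions.

```python
def agreement_f1(annotations_a, annotations_b):
    """F1 agreement between two annotators.

    Each argument is an iterable of hashable annotation items, assumed
    here to be (start, end, label) tuples. Annotator A is treated as the
    reference; F1 is symmetric, so swapping the roles gives the same value.
    """
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        return 1.0  # both annotated nothing: perfect agreement
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(b)
    recall = overlap / len(a)
    return 2 * precision * recall / (precision + recall)
```

Under this scheme a span that both annotators marked but labeled differently counts as a disagreement for both of them.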

Data Split
The 100 transcripts from Section 2 are split into 5 folds where each fold contains 10 transcripts from the control group and another 10 transcripts from the MCI group (20 transcripts in total). To evaluate our model, which takes a transcript annotated with meta-semantic representation as input and predicts whether or not it is from the MCI group, 5-fold cross-validation is used, which is suitable for experimenting with such a small dataset.
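A balanced split of this kind can be sketched as follows. How transcripts were actually assigned to folds is not stated, so the seeded shuffle, the integer labels (0 for control, 1 for MCI), and the function names are assumptions made for illustration.

```python
import random

def stratified_folds(control_ids, mci_ids, n_folds=5, seed=0):
    """Split the two groups into folds with equal numbers from each group.

    Returns a list of folds; each fold is a list of (transcript_id, label)
    pairs with label 0 for control and 1 for MCI.
    """
    rng = random.Random(seed)
    control, mci = list(control_ids), list(mci_ids)
    rng.shuffle(control)
    rng.shuffle(mci)
    folds = []
    for k in range(n_folds):
        # Every n_folds-th item starting at k goes to fold k.
        fold = [(i, 0) for i in control[k::n_folds]]
        fold += [(i, 1) for i in mci[k::n_folds]]
        folds.append(fold)
    return folds

def cross_validation_splits(folds):
    """Yield (train, test) pairs, holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != k for item in fold]
        yield train, test
```

With 50 transcripts per group and 5 folds, each held-out fold has 10 control and 10 MCI transcripts, and the remaining 80 transcripts are used for training.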

Features
For each transcript, three types of features are extracted from the meta-semantic analysis in Section 4 for the classification of Control vs. MCI:
• Entity Types: A vector $e \in \mathbb{R}^{1 \times |E|}$ is created where $|E| = 50$ is the total number of predefined entities in Table 4, and each dimension i of e represents the occurrence of the corresponding entity such that $e_i = 1$ if the i'th entity appears in the transcript; otherwise, $e_i = 0$.
• Entity Densities: A vector $d \in \mathbb{R}^{1 \times |P|}$ is created where $P = \{1, 2, 3\}$ ($|P| = 3$) consists of the degrees used for the entity density analysis in Section 4.3 (in this case, the polynomial functions with degrees 1, 2, and 3 are used) such that $d_i$ is the sum of the squared errors measured by comparing the size list L of this transcript to the fitted polynomial function of degree i.
• Labels: A vector $b \in \mathbb{R}^{1 \times |N|}$ is created where N comprises the predicates, thematic roles, and discourse attributes in Sections 3.2 and 3.3 ($|N| = 16$) such that $b_i$ is the count of the corresponding component in the transcript.
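The three feature vectors above can be assembled as in the following sketch. It assumes the per-degree fitting errors have already been computed for the transcript and are passed in as a dictionary; the entity and label names in the usage are placeholders, not the actual inventories of 50 entities and 16 labels.

```python
import numpy as np

def entity_type_vector(mentioned, entity_names):
    """Binary vector e: 1 if the entity appears in the transcript, else 0."""
    return np.array([1.0 if name in mentioned else 0.0 for name in entity_names])

def entity_density_vector(sse_by_degree, degrees=(1, 2, 3)):
    """Vector d: sum of squared errors of the size-list fit for each degree."""
    return np.array([float(sse_by_degree[deg]) for deg in degrees])

def label_vector(label_counts, label_names):
    """Vector b: count of each predicate, role, or attribute label."""
    return np.array([float(label_counts.get(name, 0)) for name in label_names])

def feature_vector(mentioned, entity_names, sse_by_degree, label_counts, label_names):
    """Concatenate e, d, and b into one row vector of shape (1, s)."""
    parts = [
        entity_type_vector(mentioned, entity_names),
        entity_density_vector(sse_by_degree),
        label_vector(label_counts, label_names),
    ]
    return np.concatenate(parts).reshape(1, -1)
```

With the sizes given in the text, the resulting vector has 50 + 3 + 16 = 69 dimensions.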

Classification
The feature vector $x = e \oplus d \oplus b$ is created by concatenating e, d, and b, and is fed into a classifier. Figure 9 illustrates the feed-forward neural network used for the classification between the control and the MCI groups. Let the size of the feature vector x be $s = |E| + |P| + |N|$. Then, the input vector $x \in \mathbb{R}^{1 \times s}$ is multiplied by the weight matrix $W_0 \in \mathbb{R}^{s \times d_0}$ and generates the first hidden vector $h_1 = x \cdot W_0$. The hidden vector $h_1 \in \mathbb{R}^{1 \times d_0}$ is multiplied by another weight matrix $W_1 \in \mathbb{R}^{d_0 \times d_1}$ and generates the second hidden vector $h_2 = h_1 \cdot W_1$. Finally, $h_2 \in \mathbb{R}^{1 \times d_1}$ is multiplied by the last weight matrix $W_2 \in \mathbb{R}^{d_1 \times d_2}$, where $d_2$ is the number of classes to be predicted, and generates the output vector $y = h_2 \cdot W_2 \in \mathbb{R}^{1 \times d_2}$. In our case, the sizes of the hidden vectors are $d_0 = 200$ and $d_1 = 100$, and the size of the output vector is $d_2 = 2$. Note that we have experimented with simpler networks comprising only one or no hidden layer, but the one with two hidden layers shows the best results. The two dimensions $y_c$ and $y_m$ in the output vector are optimized for the likelihoods of the subject being control or MCI, respectively. An average accuracy of 82% is achieved by the 5-fold cross-validation (Section 5.2) with this model. Considering that these are subjects whom standardized tests such as MoCA or the Boston Naming Test could not distinguish (Table 1), this result is very promising.
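The dimensions of this network can be checked with a short forward-pass sketch. It reproduces only the three matrix multiplications stated above; activation functions, bias terms, the loss, and the training procedure are not described in the text, so none are included, and the random initialization and the output order (index 0 for control, index 1 for MCI) are assumptions.

```python
import numpy as np

def init_weights(s=69, d0=200, d1=100, d2=2, seed=0):
    """Random weight matrices W0, W1, W2 with the shapes given in the text.

    s = 50 + 3 + 16 = 69 follows from the feature sizes; the small Gaussian
    initialization is an arbitrary choice for this sketch.
    """
    rng = np.random.default_rng(seed)
    shapes = [(s, d0), (d0, d1), (d1, d2)]
    return [rng.normal(0.0, 0.01, size=shape) for shape in shapes]

def forward(x, weights):
    """Compute y = ((x . W0) . W1) . W2 for a row vector x of shape (1, s)."""
    w0, w1, w2 = weights
    h1 = x @ w0   # first hidden vector, shape (1, d0)
    h2 = h1 @ w1  # second hidden vector, shape (1, d1)
    return h2 @ w2  # output vector, shape (1, d2)

def predict(x, weights, classes=("control", "MCI")):
    """Return the class whose output dimension has the largest score."""
    y = forward(x, weights)
    return classes[int(np.argmax(y[0]))]
```

A trained model would replace the random weights with learned ones; a nonlinearity between the layers is a typical addition in such networks, but which one, if any, was used is not specified here.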

Related Work
Reilly et al. (2010) found that neurodegenerative disorders could deteriorate nerve cells controlling cognitive, speech, and language processes. Verma and Howard (2012) reported that language impairment in AD could affect verbal fluency and naming, which require integrity of semantic concepts, before breaking down in other facets of the brain. Tillas (2015) showed that linguistic clues captured from verbal utterances could indicate symptoms of AD. Toledo et al. (2018) investigated the significance of lexical and syntactic features from verbal narratives of AD patients by performing several statistical tests based on 121 elderly participants consisting of 60 patients with AD and 61 control subjects. In that work, immediate word repetitions, word revisions, and coordination structures could be used to distinguish patients with AD from the control group. Mueller et al. (2018) recently found that AD patients often depicted less informative discourse, greater impairment in global coherence, greater modularization, and inferior narrative structure compared to the normal control group.

Figure 2: A screenshot of our annotation interface using the web-based tool BRAT on the first five sentences of the control example in Table 3.

Figure 3: Visualization of meta-semantic representation on the first 5 sentences of the control example in Table 3.

A [polka dot]_attr dress. Very [big]_attr [red and yellow]_attr pants. Xmod are any other types of modifiers, mostly adverbials and prepositions. If adverbials, they are annotated with the adv relation in Section 3.2. If prepositions, they are annotated with the case relation: There is a [seemingly]_adv dancing clown. Feathers are coming [out of]_case the hat. Finally, possessions of entities are annotated with the with relation regardless of verbs such as have or get, for consistency across different structures. In both of the following sentences, [jacket] has the with relation to the elephant.

Figure 5: Entity focus analysis. Entities focused by C and M are colored in blue and red, respectively.

Figure 6: Plots of size lists derived from meta-semantic representation annotated on the control and MCI examples in Table 3, where the x and y axes are ranked indices and sizes of the subgraphs, respectively.
, $G = [g_1, \ldots, g_5]$ such that $|G| = k = 5$, $|g_1| = 7$, and $|g_5| = 1$. Given $G^t$, the size list $L^t$ can be derived such that $L^t = [|g^t_1|, \ldots, |g^t_k|]$. Figure 6 shows plots of the size lists from the graphs derived by meta-semantic representation annotated on the control and MCI examples in Table

Figure 9: Feed-forward neural network used for the classification of the control vs. MCI group.

Table 3: Transcripts from a normal control and an MCI patient whose MoCA scores are 29 points.

Table 5: The average sums of squared errors by fitting each size list to degrees 1-5 of polynomial functions.

Table 5 shows the average sums of squared errors $SSE_d$ obtained by fitting each size list $L^t = [l^t_1, \ldots, l^t_k]$ to polynomial functions $f_d(x)$ of degrees $d \in \{1, \ldots, 5\}$, where $n = 50$ for both C and M.