Quantitative Semantic Variation in the Contexts of Concrete and Abstract Words

Across disciplines, researchers are eager to gain insight into empirical features of abstract vs. concrete concepts. In this work, we provide a detailed characterisation of the distributional nature of abstract and concrete words across 16,620 English nouns, verbs and adjectives. Specifically, we investigate the following questions: (1) What is the distribution of concreteness in the contexts of concrete and abstract target words? (2) What are the differences between concrete and abstract words in terms of contextual semantic diversity? (3) How does the entropy of concrete and abstract word contexts differ? Overall, our studies show consistent differences in the distributional representation of concrete and abstract words, thus challenging existing theories of cognition and providing a more fine-grained description of their nature.


Introduction
The complete understanding of the cognitive mechanisms behind the processing of concrete and abstract meanings represents a key and still open question in cognitive science (Barsalou and Wiemer-Hastings, 2005). More specifically, the psycholinguistic literature reports extensive analyses of how concrete concepts are processed; however, there is still little consensus about the nature of abstract concepts (Barsalou and Wiemer-Hastings, 2005; McRae and Jones, 2013; Hill et al., 2014; Vigliocco et al., 2014).
The Context Availability Theory represents one of the earliest theoretical approaches aiming to account for the differences between concrete and abstract concepts (Schwanenflugel and Shoben, 1983). This theory suggests that meaning arises from the ability to create an appropriate context for a concept, which has proven to be more challenging (i.e., leading to longer reaction times and a larger number of errors) for abstract than for concrete concepts. In a computational study, Hill et al. (2014) quantitatively analysed the distinction between concrete and abstract words in a large corpus. Overall, they showed that abstract words occur within a broad range of context words while concrete words occur within a smaller set of context words. Similarly, Hoffman et al. (2013) and Hoffman and Woollams (2015) analysed the concrete vs. abstract dichotomy in terms of semantic diversity, demonstrating that concrete words occur within highly similar contexts while abstract words occur in a broad range of less associated contexts (i.e., exhibiting high semantic diversity). These computational findings are fully in line with the Context Availability Theory: the processing time of concrete words is generally shorter than the processing time of abstract words, as abstract words are attached to a broad range of loosely associated words.
More recently, embodied theories of cognition have suggested that word meanings are grounded in the sensory-motor system (Barsalou and Wiemer-Hastings, 2005; Glenberg and Kaschak, 2002; Hill et al., 2014; Pecher et al., 2011). According to this account, concrete concepts have a direct referent in the real world, while abstract concepts have to activate a series of concrete concepts that provide the necessary situational context required to successfully process their meanings (Barsalou, 1999).
These interdisciplinary outcomes are not fully supported by recent computational studies showing different contextual patterns for concrete and abstract words in text compared to the literature (Bhaskar et al., 2017;. It is becoming clear, however, that the inclusion of information regarding the concreteness of words plays a key role in the automatic identification of non-literal language usage (Turney et al., 2011;Schulte im Walde, 2016, 2017).
The aim of the current study is thus to provide a contextual description of the distributional representation of these two classes of words, to gain insight into empirical features of abstract vs. concrete concepts. This would represent an essential contribution to the resolution of the debate about meaning representation within the human mind, and thereby also help to enhance computationally derived models that are concerned with meaning derivation from text.

Hypotheses
Based on the existing psycholinguistic and computational evidence reported in the previous section, we formulate three hypotheses regarding the distributional nature of concrete and abstract words that we will test in the following studies.
(1) The contexts of both concrete and abstract words are mainly composed of concrete words.
This first hypothesis directly tests the general claim of grounding theories: both concrete and abstract words require the activation of a layer of situational (concrete) information in order to be successfully processed (Barsalou and Wiemer-Hastings, 2005). According to the Distributional Hypothesis (Harris, 1954; Firth, 1968), similar linguistic contexts tend to imply similar meanings of words. Thus, we propose a distributional semantic analysis to quantitatively investigate the contexts in which concrete and abstract words frequently occur.
(2) Abstract words occur in a broad range of distinct contexts whereas concrete words appear in a limited set of contexts.
Based on the computational study by Hill et al. (2014), we expect to find concrete words appearing in a more restricted set of contexts in comparison to abstract words, which should occur in a broad range of contexts. This second hypothesis is explored by providing two fine-grained analyses of the extension and variety in contexts of concrete and abstract words.
(3) Abstract words are more difficult to predict than concrete words, due to their higher contextual variability.
Building upon the previous hypothesis and on the studies by Hoffman et al. (2013) and Hoffman and Woollams (2015), we aim to show that concrete words are easier to predict than abstract words. Specifically, we expect higher entropy values for abstract than for concrete contexts, indicating that on average, we need more information to uniquely encode an abstract word than a concrete word (Shannon, 2001). The reason lies in the high context variability of abstract words: there is a large number of highly probable words satisfying these contexts. In contrast, we expect concrete words to occur in a limited set of different contexts because only a restricted number of words have a high probability of fitting a specific context. Thus, we expect the entropy value of concrete contexts to be lower than the entropy value of abstract contexts.
In the three studies reported in this paper, we systematically test these three hypotheses regarding concrete vs. abstract words, by performing quantitative analyses of the distributional representations across the word classes of nouns, verbs and adjectives.

Materials and Method
For our studies, we selected nouns, verbs and adjectives from the Brysbaert et al. (2014) collection of concreteness ratings for 40,000 English words. In total we used 16,620 target words including 9,240 nouns, 3,976 verbs and 3,404 adjectives. Each word in this collection has been scored by humans according to its concreteness on a scale from 1 (abstract) to 5 (concrete).
Our distributional semantic representations of the target words were built by extracting co-occurrences from the POS-tagged version (Schmid, 1994) of the sentence-shuffled English COW corpus ENCOW16AX (Schäfer and Bildhauer, 2012). We originally constructed three different spaces with window sizes of 2, 10, and 20 context words surrounding the target, and performed parallel analyses for all three spaces. Since we did not find any relevant differences between the three spaces, we report only the analyses based on the distributional space with a window size of 20 context words. Moreover, we restricted the dimensions of our matrix to 16,620 × 16,620 (target words × context words). By using the target words also as context words, we had knowledge about the concreteness score of each context word. In a follow-up study, we performed the same analyses extracting co-occurrences from the British National Corpus (Burnard, 2000). Even though both the size and the nature of these two corpora are extremely different, the results did not show any significant difference.
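To make the construction of the co-occurrence space concrete, the following is a minimal sketch, not the pipeline used in our studies. It assumes that each sentence is already a list of tokens, that the window extends `window` tokens to each side of the target (the text above does not fix this detail), and that the same vocabulary serves as targets and contexts. The function name `cooccurrence_counts` and the `word_POS` token format are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(sentences, vocabulary, window=20):
    """Count co-occurrences between vocabulary words within a symmetric window.

    Assumption: `window` tokens are considered on EACH side of the target.
    Returns a nested mapping: counts[target][context] -> raw frequency.
    """
    vocab = set(vocabulary)
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            if target not in vocab:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                # Skip the target position itself; keep only in-vocabulary contexts.
                if j != i and tokens[j] in vocab:
                    counts[target][tokens[j]] += 1
    return counts


# Toy usage with hypothetical POS-suffixed tokens.
toy_sentences = [["dog_N", "chase_V", "cat_N"], ["idea_N", "seem_V", "good_A"]]
toy_counts = cooccurrence_counts(toy_sentences, {"dog_N", "chase_V", "cat_N", "idea_N"}, window=1)
```

Because targets and contexts share one vocabulary, the resulting mapping corresponds to a square target × context matrix, stored sparsely.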
In order to get a clearer picture about empirical distributional differences for concrete vs. abstract targets, we focused some of our analyses only on the most concrete and abstract targets, expecting words with mid-range concreteness scores to be more difficult in their generation by humans and consequently noisier in their distributional representation. For this reason, we analysed the 1,000 most concrete (concreteness range: 4.82-5.00) and the 1,000 most abstract (1.07-2.17) nouns, the 500 most concrete (4.71-5.00) and most abstract (1.12-2.21) verbs, and the 200 most concrete (4.34-5.00) and most abstract (1.19-1.64) adjectives. The context words, in contrast, were not restricted to a subset and comprised the complete set of 16,620 nouns, verbs and adjectives.
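The selection of the extreme ends of the concreteness scale can be sketched as follows. This is an illustrative helper under the assumption that the ratings for one part-of-speech are available as a plain word-to-score mapping; the name `select_extremes` is hypothetical, and ties at the cut-off are broken by the order of the input.

```python
def select_extremes(ratings, n):
    """Return (most_abstract, most_concrete): the n lowest- and n highest-rated words.

    `ratings` maps each word to its concreteness score (1 = abstract, 5 = concrete).
    Both returned lists are ordered by ascending score.
    """
    ranked = sorted(ratings, key=ratings.get)
    n = max(0, min(n, len(ranked)))
    return ranked[:n], ranked[len(ranked) - n:]


# Toy usage with made-up scores.
toy_ratings = {"table": 4.9, "idea": 1.6, "justice": 1.4, "dog": 4.8, "walk": 3.5}
most_abstract, most_concrete = select_extremes(toy_ratings, 2)
```

In this setting the helper would be called once per part-of-speech, for example with n = 1,000 for nouns, 500 for verbs and 200 for adjectives.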

Study 1: Analysis of Concrete vs. Abstract Co-Occurrences
In this study we test the validity of hypothesis (1): the contexts of both concrete and abstract words are mainly concrete. For this purpose, we analyse the distributions of the 16,620 context dimensions for their concreteness, by the parts-of-speech of target and context words.
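The kind of distribution analysed in this study can be illustrated with a small sketch. It assumes a nested mapping of co-occurrence counts and a mapping from words to concreteness scores, bins the scores in steps of 0.5, and weights each context word by its co-occurrence frequency. Both the bin width and the frequency weighting are assumptions made for illustration, and the function name is hypothetical.

```python
def context_concreteness_histogram(counts, targets, ratings, bin_width=0.5):
    """Histogram of context concreteness (scale 1 to 5) for a set of target words.

    Each context word contributes its co-occurrence frequency to the bin
    that contains its concreteness score. The last bin includes the score 5.0.
    """
    n_bins = int(round((5.0 - 1.0) / bin_width))
    histogram = [0] * n_bins
    for target in targets:
        for context, freq in counts.get(target, {}).items():
            score = ratings[context]
            index = min(int((score - 1.0) / bin_width), n_bins - 1)
            histogram[index] += freq
    return histogram


# Toy usage: one abstract target with one abstract and one concrete context word.
toy_counts = {"idea": {"thought": 3, "table": 1}}
toy_ratings = {"idea": 1.6, "thought": 1.8, "table": 4.9}
toy_histogram = context_concreteness_histogram(toy_counts, ["idea"], toy_ratings)
```

To separate the distributions by the part-of-speech of the context words, the same function could be applied to counts that have been filtered to context nouns, verbs or adjectives beforehand.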

Noun Targets
Figure 1 reports the distribution of noun, verb and adjective contexts for the 1,000 most abstract target nouns (Figure 1a) in comparison to the 1,000 most concrete target nouns (Figure 1b). As clearly shown in Figure 1a, the majority of contexts of an abstract noun are also abstract: noun, verb and adjective context words all show the maximum peak at low concreteness scores. On the contrary, the distributions of the contexts of concrete nouns shown in Figure 1b vary according to POS. The nouns in the context of concrete noun targets are also very concrete, as shown by the high red bar at concreteness 4.5-5. On the other hand, verbs and adjectives show a pattern similar to Figure 1a: most of their mass lies at low concreteness scores.

Verb Targets
Figure 2 shows a pattern very comparable to the one described for noun targets. Contexts of abstract verbs are, on average, also abstract, regardless of their POS. On the other hand, the verbs and adjectives in the contexts of concrete verb targets are mainly abstract, while the nouns are mainly concrete.

Adjective Targets
Again, Figure 3 shows the same pattern as the one reported for nouns and verbs.

Discussion
Table 1 reports an overview of the outcomes of this first study. The "X" indicates the predominant contextual class (abstract vs. concrete words) for each target class by POS. All in all, our results partly disagree with our first hypothesis, which was derived from observations in the literature and according to which we expected the contexts of concrete and abstract words to be mostly composed of concrete words. More specifically, our first hypothesis is confirmed by the contextual distribution of concrete target nouns, as they frequently appear with other concrete nouns. On the contrary, it is rejected by the contextual distribution of abstract nouns, as they primarily co-occur with other abstract nouns. Thus, as we based our hypothesis on the theory of embodied cognition, the observed contextual pattern of abstract nouns challenges this theory.
Further evidence in favour of our hypothesis comes from the nouns in the context of concrete verbs and adjectives, which are mainly concrete. In contrast, concrete and abstract nouns, verbs and adjectives elicit the same contextual pattern regarding context verbs and adjectives: they co-occur with abstract verbs and abstract adjectives to a large extent, which does not support the expectations based on the existing literature.

Study 2: Semantic Diversity of Context
In this study, we test our second hypothesis: abstract words occur in a broad range of distinct contexts whereas concrete words appear in a limited set of different contexts. In the following sections we report two studies where we analyse (i) the number of non-zero dimensions in the representation of concrete vs. abstract words, and (ii) the degree of semantic variability in their contexts.

Non-Zero Dimensions
The analysis of the number of non-zero dimensions in the vector representation of concrete and abstract words provides a first indicator of the contextual richness of our targets. Based on Hill et al. (2014), we expect concrete target words to have significantly fewer active context dimensions than abstract target words, as the former should co-occur with a restricted set of context words. Therefore, we expect the proportion of non-zero context dimensions to be smaller for concrete than for abstract target words.
The following analyses compare the proportions of non-zero context dimensions between the 1,000 highly concrete (blue boxes) and highly abstract (red boxes) target nouns, 500 verbs, and 200 adjectives, based on raw frequency counts. For each POS, we compared the proportion of non-zero dimensions in the full vectors of 16,620 context words for concrete and abstract target words (left side), and the proportion of non-zero dimensions among the context words with the same part-of-speech as the target (respectively, 9,240 context nouns, 3,976 context verbs, 3,404 context adjectives). The star in each box indicates the mean.
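The measure described above can be sketched in a few lines. The sketch assumes a nested mapping of raw co-occurrence counts and returns the share of active dimensions as a fraction between 0 and 1. For the group comparison it uses Welch's t statistic as a stand-in, since the exact variant of the t-test is not specified here; all function names are hypothetical.

```python
import math
import statistics


def nonzero_proportion(counts, target, context_vocab):
    """Fraction of context dimensions with a non-zero count for one target word."""
    row = counts.get(target, {})
    active = sum(1 for context in context_vocab if row.get(context, 0) > 0)
    return active / len(context_vocab)


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (assumed test variant)."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))


# Toy usage: two targets over a four-word context vocabulary.
toy_counts = {"dog": {"cat": 2, "bone": 1}, "idea": {"cat": 1, "bone": 1, "mind": 4, "fact": 2}}
toy_vocab = ["cat", "bone", "mind", "fact"]
toy_proportions = {t: nonzero_proportion(toy_counts, t, toy_vocab) for t in toy_counts}
```

Restricting `context_vocab` to the words of one part-of-speech yields the POS-specific variant of the measure.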

Noun Targets
As shown in Figure 4, the comparison of non-zero context dimensions of concrete (M = 57.80, SD = 23.07) and abstract (M = 57.78, SD = 22.57) target nouns does not show any significant difference (t(33238) = -0.02, p = 0.98). This result indicates that concrete and abstract target nouns co-occur with a similar number of context words. We can observe the exact same pattern when we restrict the contexts to nouns only: there is no significant difference between the number of non-zero context noun dimensions for concrete (M = 32.12, SD = 12.98) and abstract (M = 31.78, SD = 12.76) target nouns (t(18478) = -0.59, p = 0.56).

Verb Targets
Figure 5 reports the number of non-zero dimensions for concrete and abstract verbs. When considering the full set of contexts (left side), concrete verbs (M = 37.93, SD = 22.5) have significantly fewer active contexts than abstract verbs (M = 64.2, SD = 25.73; t(33238) = 17.18, p < 0.001). The exact same outcome is shown when focusing only on verbs as contexts (t(7950) = 16.3, p < 0.001).

Adjective Targets
The analysis of the adjectives in Figure 6 indicates that the number of non-zero dimensions for concrete and abstract adjectives follows the same pattern as for the verbs. When considering the full set of contexts (left side), concrete adjectives (M = 40.4, SD = 24.7) have significantly fewer active contexts than abstract adjectives (M = 59.46, SD = 19.11; t(33238) = 8.63, p < 0.001). The exact same outcome is shown when focusing only on adjectives as contexts (t(6806) = 10.15, p < 0.001).

Semantic Diversity of Context
Based on hypothesis (2), we expect the contexts of concrete words to be more similar among themselves than the contexts of abstract words. We test this hypothesis by computing the semantic diversity of the contexts of concrete and abstract targets. Semantic diversity corresponds to the inverse of the average semantic similarity of each pair of context dimensions of a word (Hoffman et al., 2013). In order to control for pure frequency effects, we transformed the co-occurrence frequency counts into local mutual information (LMI) scores (Evert, 2005).
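The LMI weighting and the average pairwise similarity between the most associated contexts of a target can be sketched as follows. The sketch uses the common formulation LMI = O · log(O / E), where O is the observed co-occurrence frequency and E the frequency expected under independence; the base-2 logarithm, the use of LMI-weighted vectors for the cosine, and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def lmi_scores(counts):
    """Convert raw co-occurrence counts into local mutual information scores."""
    row_totals = {t: sum(row.values()) for t, row in counts.items()}
    col_totals = {}
    for row in counts.values():
        for context, freq in row.items():
            col_totals[context] = col_totals.get(context, 0) + freq
    total = sum(row_totals.values())
    scores = {}
    for target, row in counts.items():
        scores[target] = {}
        for context, observed in row.items():
            if observed > 0:
                expected = row_totals[target] * col_totals[context] / total
                scores[target][context] = observed * math.log2(observed / expected)
    return scores


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_context_similarity(target, scores, k):
    """Average pairwise cosine between the k contexts with the highest LMI for `target`.

    Context words are looked up as targets themselves, which is possible
    because targets and contexts share one vocabulary.
    """
    row = scores[target]
    top = sorted(row, key=row.get, reverse=True)[:k]
    pairs = list(combinations(top, 2))
    if not pairs:
        return 0.0
    return sum(cosine(scores.get(a, {}), scores.get(b, {})) for a, b in pairs) / len(pairs)
```

A lower average similarity corresponds to a higher semantic diversity of the contexts, so the two quantities can be read interchangeably with the direction reversed.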
The study reports the average cosine similarity between context dimensions for concrete and abstract words; the analysis is conducted incrementally, including the top-k most associated context dimensions (from 5 to 16,620 associates) sorted by their LMI scores.

Noun Targets
Figure 7 reports the average semantic similarity between the context dimensions of the 1,000 most concrete (blue boxes) and the 1,000 most abstract (red boxes) target nouns. The analysis is performed step-wise from left to right, starting with the average similarity between the 5 most associated contexts and moving up to the average similarity between all 16,620 context dimensions. Overall, as the number of dimensions increases, both the mean similarity and the difference in means between concrete and abstract words drop, while the difference remains significant. The difference between the mean cosine similarity of the most associated contexts of concrete (M = 0.32, SD = 0.14 at k = 5) and abstract (M = 0.20, SD = 0.13 at k = 5) target nouns is significant (p < 0.001 at k = 5).

Verb Targets
As shown in Figure 8, there are no significant differences (p = 0.38 at k = 5) in the similarity of the context dimensions of the 500 most concrete (M = 0.23, SD = 0.15 at k = 5) and most abstract (M = 0.23, SD = 0.16 at k = 5) verb targets.

Adjective Targets
When analysing the similarity of the contexts of the 200 most concrete and abstract adjectives, we see (Figure 9) the same pattern as shown for nouns. The average similarity of the most associated contexts is significantly higher (p < 0.001 at k = 5) for concrete (M = 0.26, SD = 0.14 at k = 5) than for abstract (M = 0.17, SD = 0.12 at k = 5) target adjectives.

Discussion
According to hypothesis (2), we expected abstract words to occur in a broader range of distinct contexts and concrete words to appear in a more limited set of different contexts. Moreover, the contexts of concrete words should be more restricted and more similar to each other compared to the contexts of abstract words. The results discussed only partially support this hypothesis. The analysis of the number of non-zero context dimensions for concrete and abstract target verbs and adjectives shows results in line with hypothesis (2). On the contrary, concrete and abstract target nouns share the same number of non-zero dimensions. The analysis of the similarity between contexts of concrete and abstract target nouns and adjectives supports our hypothesis, while we do not see any significant difference when analysing the verbs.

Study 3: Entropy of Concrete and Abstract Words
In this study we test our third hypothesis: abstract words are more difficult to predict than concrete words, due to their higher contextual variability. In Study 2 we already started investigating this phenomenon using semantic diversity. In the current study we use entropy as a measure of variability (Shannon, 2001). Based on the assumption that abstract words occur within a high number of distinct contexts, we expect the entropy of abstract words to be higher than the entropy of concrete words. Figure 10 reports the average entropy in the context of the 1,000 most abstract (on the left side) and most concrete (on the right side) target nouns. Regarding the 1,000 most abstract target nouns, the entropy of the 1,000 most abstract context nouns (M = 7.42, SD = 0.58) is significantly higher (p < 0.001) than the entropy of the 1,000 most concrete context nouns (M = 6.44, SD = 0.77). A similar pattern emerges in the analysis of the entropy of the contexts of the 1,000 most concrete target nouns: the difference between concrete (M = 6.64, SD = 0.61) and abstract contexts (M = 7.21, SD = 0.54) is statistically significant (p < 0.001).
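As a reminder of the measure, Shannon entropy of a frequency distribution can be computed as in the following minimal sketch. It assumes that the entropy of a word is taken over its normalised co-occurrence frequencies and is measured in bits; the exact distribution over which entropy was computed is not spelled out above, so this is an illustrative reading, and the function name is hypothetical.

```python
import math


def context_entropy(frequencies):
    """Shannon entropy (in bits) of the distribution given by raw frequencies.

    Zero frequencies are skipped, following the convention 0 * log(0) = 0.
    """
    total = sum(frequencies)
    if total == 0:
        return 0.0
    entropy = 0.0
    for freq in frequencies:
        if freq > 0:
            p = freq / total
            entropy -= p * math.log2(p)
    return entropy


# Toy usage: a word spread evenly over four co-occurring words versus one
# that concentrates all of its mass on a single co-occurring word.
spread_out = context_entropy([5, 5, 5, 5])
concentrated = context_entropy([20, 0, 0, 0])
```

A word whose co-occurrences are spread evenly over many other words receives a high entropy, while a word that concentrates on a few co-occurring words receives a low one, which is the contrast that hypothesis (3) relies on.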

Discussion
The results of this study support the predictions of hypothesis (3): concrete contexts have significantly lower entropy than abstract contexts irrespective of the POS of their target words.

Conclusions
The aim of this work was to provide a very detailed description of the contextual representation of concrete and abstract English nouns, verbs and adjectives. Table 2 summarises the most important findings. 1) Concrete target nouns, verbs and adjectives mainly co-occur with concrete nouns and with abstract verbs and adjectives, while abstract target words always co-occur with abstract words. 2a) The contexts of abstract target verbs and adjectives are broader (more non-zero dimensions) than those of concrete target verbs and adjectives. On the other hand, concrete and abstract target nouns have a similar number of non-zero dimensions. 2b) The most associated contexts of concrete nouns and adjectives are significantly more similar to each other than the contexts of abstract nouns and adjectives. However, no difference emerges between the contexts of verbs. 3) The concrete contexts of concrete and abstract targets (nouns, verbs, adjectives) have significantly lower entropy values than their abstract contexts. Overall, hypotheses (1) and (2) are not fully supported by our analyses; on the contrary, the predictions made in hypothesis (3) are confirmed.
The three studies described in this paper thus show consistent differences in the contexts of concrete and abstract words and yield patterns that challenge the grounding theory of cognition. In their analyses of noun and verb comprehension, Barsalou (1999) and Richardson et al. (2003) suggest that humans process abstract concepts by creating a perceptual representation. These representations are inherently concrete because they are stored as "experiential traces" generated through exposure to real-world situations using our five senses (Van Dam et al., 2010). In the instructions of their norming study, Brysbaert et al. (2014, p. 906) describe concrete words in a similar way: "some words refer to things or actions in reality, which you can experience directly through one of the five senses". In contrast, our study is more closely aligned with recent theories claiming a representational pluralism that includes both perceptual and non-perceptual features (Dove, 2009).
While the reported cognitive theories describe general patterns emerging from the distinction between concrete and abstract words, the novelty of our study is to provide a fine-grained analysis of the distributional nature of these words and an attempt to explain their similarities and differences from a data-driven perspective. In our opinion, the detection of the precise properties of concrete and abstract words makes an extremely valuable contribution to the long-lasting debate about meaning representation in the human mind and to the use of this knowledge to significantly improve the performance of computational models.