Possessors Change Over Time: A Case Study with Artworks

This paper presents a corpus and experimental results to extract possession relations over time. We work with Wikipedia articles about artworks, and extract possession relations along with temporal information indicating when these relations are true. The annotation scheme yields many possessors over time for a given artwork, and experimental results show that an LSTM ensemble can automate the task.


Introduction
All languages have a way to express possessive relationships (Aikhenvald and Dixon, 2012). Possession is an asymmetric semantic relation between two entities, where one entity (the possessee) belongs to the other entity (the possessor) (Stassen, 2009). When it comes to defining possession, belongs includes a wide range of relationships, including kinship (e.g., [my] x [. . . ] y ). Possession relations can be divided into alienable (also referred to as acquired, transferable, non-intimate, etc.) and inalienable (also referred to as inherent, inseparable, intimate, etc.). Possessees that can be separated from their possessors are alienable, and possessees that cannot normally be separated from their possessors are inalienable (Heine, 1997). For example, [John] x 's [condo] y is alienable, and [John] x 's [arm] y is inalienable (some previous works would call the latter a part-whole relation instead). Tham (2004) defines control possession as a relation in which the possessor has temporary control of the possessee, but does not necessarily alienably possess it (e.g., [John] x borrowed the [car] y for the weekend). Following the aforecited works, possession goes beyond ownership of property.

Figure 1: Excerpt of the Wikipedia article about the Arnolfini Portrait: "In 1530 the painting was inherited by Margaret's niece Mary of Hungary, who [. . . ]. It is clearly described in an inventory taken after her death in 1558, when it was inherited by Philip II of Spain."
Virtually all possessees change possessors over time, especially if possession relationships are understood in a broad sense as outlined above. Consider the excerpt of the Wikipedia article about the Arnolfini Portrait in Figure 1. From this excerpt, we know that the painting had at least two possessors (Mary of Hungary and Philip II of Spain), and that they were the possessors from 1530 to 1558 and after 1558 respectively.
In this paper, we track possessors of selected possessees over time. Unlike most previous works (Section 3), we (a) start with a document relevant to the possessee of interest, (b) select plausible possessors and years without syntactic restrictions and including inter-sentential pairs, and then (c) determine whether the plausible possessors are actual possessors with respect to the years. The main contributions of this paper are: (a) 88 Wikipedia articles about artworks annotated with their possessors over time; 1 (b) a detailed corpus analysis (e.g., unique possessors, years and possessor-year pairs); and (c) experimental results showing that an LSTM ensemble outperforms SVM.

Previous Work
We briefly summarize work on possession relationships from a theoretical perspective, and then move to work in computational linguistics.

Possession relations
The very definition of possession is not set in stone. Aikhenvald (2013) distinguishes three core meanings for possessive noun phrases that occur across languages: ownership (of property), whole-part (often referred to as part-whole), and kinship. Following a cross-linguistic perspective, she discusses possessions and time (present and former possession relationships, e.g., my tooth vs. my former axe), temporary and permanent possession (e.g., borrow vs. acquire), and others. Miller and Johnson-Laird (1976) differentiate between three kinds of possession: inherent, accidental, and physical; and provide the following example: He owns an umbrella (inherent), but she's borrowed it (accidental), though she doesn't have it with her (physical).
Possession relations have also been defined in terms of parameters. For example, Stassen (2009) considers two parameters (permanent contact and control) and Heine (1997) defines five parameters (human possessor, concrete possessee, spatial proximity, temporal permanence, and control).
While we do not closely follow any of these previous works, we borrow from them the broad definition of possession relations, and the motivation to work with possessions over time.

Computational Linguistics
Within computational linguistics, possession relations have been mostly studied as one of the many relations encoded in a given syntactic construction. For example, Tratz and Hovy (2013) extract semantic relations within English possessives. They propose a set of 18 relations, e.g., temporal (e.g., [today] x 's [rates] y ) and extent (e.g., [6 hours] y ' [drive] x ). Their controller / owner / user relation (one relation with three aliases) is the closest relation to the possession relations we target in this paper. Extracting semantic relations between noun compounds (Nakov and Hearst, 2013; Tratz and Hovy, 2010) usually includes extracting possession relations, e.g., [family] x [estate] y . These previous works extract all semantic relations, including possessions, between arguments that follow a syntactic construction.
In our previous work (Chinnappa and Blanco, 2018), we identify possession relations between a deterministically chosen person (possessor) and a concrete object (possessee) within a sentence. If a possession relation exists, we also identify the possession type (alienable or control). Finally, we temporally anchor the possession relation with respect to the verb of which the possessor is the subject. In this paper, we take a complementary approach. We start with text relevant to the possessee of interest (specifically, its Wikipedia article), and then extract its possessors without any restrictions beyond considering as possessors only named entities. Furthermore, we specify in which years the possessions were true.
To the best of our knowledge, the work by Banea et al. (2016) is the only one on extracting possession relations without imposing syntactic constraints. Banea and Mihalcea (2018) build a corpus working with personal blogs, and present results on automatic extraction of possession using a naive Bayes approach. They consider as possessors the authors of blogs, and as possessees concrete nouns in blog posts. Regarding time, they annotate possessions at the time of the utterance (when the blog posts were published). Unlike them, we work with one possessee per Wikipedia article (i.e., the artwork the article is about), and then find possessors in the article. Additionally, we extract when a possessor-possessee relation is true with respect to the years in the article, and present results using SVM and end-to-end neural networks.

Annotating Possessions Over Time
In this section, we detail the methodology to create a corpus of possession relations over time. We first discuss the selection of source documents and possessees of interest. Then, we detail what is considered as a potential possessor, and how these possessors are paired with years. Finally, we describe the annotation process (is the potential possessor an actual possessor with respect to the years?) and analyze the resulting corpus.

Selecting Source Documents
Our goal is to target possessors of a given possessee over time. A natural choice is to work with documents about specific objects, as they are likely to describe the history and key events involving the objects. We decided to work with Wikipedia articles about important artworks. The methodology presented here, however, is not limited to artworks, and we believe it is applicable to any article about an object of interest. We selected 100 artworks using online content, including Google queries for famous artwork and famous paintings, and online lists. 2 Then, we downloaded the full content of the corresponding Wikipedia articles. Some of the selected artworks are The Third of May 1808, Philosopher in Meditation, and Saturn Devouring His Son. The final corpus has 88 articles because we discarded articles if we could not select at least three (potential possessor, year) pairs (see below).

Table 1 presents basic counts and percentages of the potential possessors and years after removing duplicates (Step 4), and the pairs generated (Step 5) for all documents. There are 3,230 potential possessors and 940 years, and 12,913 (potential possessor, year) pairs. The most common named entity of potential possessors is PERSON (48%), followed by ORG (30.5%) and GPE / LOC (21.5%). The percentage of (potential possessor, year) pairs depending on the named entity of the potential possessor follows an almost identical distribution (48.8%, 29.7% and 21.4%). Figure 2 shows the distributions of unique potential possessors, years and (potential possessor, year) pairs generated per article (or equivalently, per possessee). While the distributions are far from uniform, the boxplots show that most articles have a substantial number of potential possessors, years and pairs. The minimum number of potential possessors is 4, of years 2, and of pairs 7.
But over 75% of articles have at least 19 unique potential possessors, 5 years and 46 pairs; and over 50% of articles have at least 28 unique potential possessors, 8 years and 86 pairs. In other words, our corpus takes into account many potential possessors and years for the vast majority of articles.

Table 2: Inter-annotator agreements (Cohen's κ): Before 0.69, During 0.59, After 0.77, All 0.70.
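The pair generation described above (potential possessors from PERSON, ORG, GPE and LOC entities, years from four-digit tokens within DATE entities, and their Cartesian product per section) can be sketched as follows. This is a minimal sketch with a hypothetical helper name and toy input; we assume entity mentions have already been extracted with an off-the-shelf NER tagger:

```python
import re
from itertools import product

# Named entity types considered as potential possessors (Table 1).
VALID_POSSESSOR_LABELS = {"PERSON", "ORG", "GPE", "LOC"}

def generate_pairs(entities):
    """Pair every unique potential possessor with every unique year in a section.

    `entities` is a list of (text, label) mentions from an NER tagger.
    """
    possessors, years = set(), set()
    for text, label in entities:
        if label in VALID_POSSESSOR_LABELS:
            possessors.add(text)
        elif label == "DATE":
            # Only four-digit years within DATE entities are kept (Section 3.5).
            years.update(re.findall(r"\b\d{4}\b", text))
    return sorted(product(sorted(possessors), sorted(years)))

# Toy section with 2 potential possessors and 2 years -> 4 pairs.
entities = [("Mary of Hungary", "PERSON"), ("Philip II of Spain", "PERSON"),
            ("1530", "DATE"), ("after her death in 1558", "DATE")]
pairs = generate_pairs(entities)
```

Note that the Cartesian product pairs possessors and years across sentence boundaries within a section, which is what allows the corpus to capture inter-sentential possession relations.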

Validating Possessors and Years
After (potential possessor, year) pairs were generated, they were validated manually. To do so, we asked annotators three questions per pair: was the potential possessor a possessor of the artwork at some point of time before, during, and after the year? The annotation interface showed the title of the article and the section to which the potential possessor and year belong (section title + text). Annotators were instructed to first read the section and then answer all questions. Thus, annotators reveal possession information involving possessors and years that are potentially far away (different clauses, sentences, etc.). Recall that all potential possessors and years within a section are paired, thus pairs are allowed to cross sentence boundaries.

Annotation Quality. Annotations were done in-house by two graduate students. Both of them annotated 25% of the articles individually. Table 2 shows inter-annotator agreements (Cohen's kappa) for each question. Overall, inter-annotator agreement is 0.70 (values between 0.60 and 0.80 are considered substantial (Artstein and Poesio, 2008)). Agreements are higher for Before and After than for During (0.69 and 0.77 vs. 0.59). The remaining articles were annotated once.

Annotation Examples. Figure 3 shows the annotations for one paragraph of the Wikipedia article about Girl with a Pearl Earring (more specifically, from the section titled Ownership and display). The figure shows the annotations on top of a screenshot of the article for clarity purposes, but the annotation interface only showed one section at a time along with all the generated pairs (Section 3.3, equivalent to pre-drawing edges).
Five potential possessors and two years were selected, thus ten (potential possessor, year) pairs were generated. The annotations reveal the intuitive possession information contained within the paragraph. First, Victor de Stuers was an advisor to Arnoldus Andries des Tombe, so there is no evidence that he was a possessor at any point of time (no labeled edges). Second, Vermeer is the artist who made Girl with a Pearl Earring, so there are possession relations before 1881 and 1902. Third, Arnoldus purchased the piece in The Hague in 1881, and in 1902 it was donated to Mauritshuis. So Arnoldus was a possessor in 1881 and after 1881 (until 1902), The Hague in 1881 (recall that non-humans can be possessors and spatial proximity is also considered possession, Section 2.1), and Mauritshuis during and after 1902. We discuss the limitations of the annotation approach in Section 3.5.
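The inter-annotator agreement values reported above can be computed with scikit-learn's cohen_kappa_score (an assumption; the paper does not name the tool used). A minimal sketch with hypothetical yes/no answers from two annotators for the same pairs and question:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical answers from two annotators for ten (potential possessor,
# year) pairs on one question (e.g., possessor at some point before year?).
annotator_1 = ["yes", "no", "no", "yes", "no", "no", "no", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no", "no", "no", "no", "yes", "no", "yes"]

# Cohen's kappa corrects raw agreement for the agreement expected by chance,
# which matters here because the "no" label dominates.
kappa = cohen_kappa_score(annotator_1, annotator_2)
```

In this toy example, raw agreement is 0.8 but chance agreement is 0.58, so kappa is roughly 0.52: lower than the raw figure, which is exactly why kappa is preferred for skewed label distributions like this corpus.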

Annotation Analysis.
Counts of yes labels for the three questions (before, during and after) are rather low (17%, 9% and 19%, Figure 4). This is not surprising, as any PERSON, ORG, LOC and GPE named entity is considered a potential possessor. We note, however, that we annotated a possession relation (yes label) in 35% of the (potential possessor, year) pairs generated (either before, during or after the year).

Figure 5 depicts the distribution of labels per article for (potential possessor, year) pairs generated from the same and different sentences. It is worth noting a couple of interesting patterns. First, the annotations contain many more possessions because we pair potential possessors and years that belong to different sentences (note the different scales in the y-axis). Second, for pairs generated from different sentences, the yes label for during questions is much less likely than the labels for before and after.

Figure 5: Distribution of yes label per article. We provide distributions for each temporal anchor (at some point of time before, during or after year) and for all anchors, and distinguish between possessors and years belonging to the same sentence (left) or different sentences (right). Each boxplot shows the minimum, first quartile, median, third quartile, and maximum.
Finally, Figure 6 shows the distribution of yes label for all generated pairs. At a minimum, each possessee has at least two possession relations. There are a few outlier articles in which annotators identified over 150 possessors in time.

Table 3: Feature set.
• From the possessor and year: the concatenation of tokens; binary flags for each token; the syntactic head (token, lemma and part-of-speech tag); and the named entity type.
• From the sentences to which the possessor and year belong: the tokens, lemmas and part-of-speech tags of (a) a window of 4 tokens to the left and right, (b) all the verbs to the left and right, (c) all the verbs that are ancestors or children in the dependency tree, and (d) all the left and right siblings in the dependency tree.
• Other and Wikipedia article: whether the possessor and year belong to the same sentence, whether the possessor appears before or after the year, the Wikipedia article title (concatenation of tokens and binary flags for each token), and the section title (concatenation of tokens and binary flags for each token).

Limitations
While the proposed procedure successfully identifies possession relations over time, we acknowledge limitations in both the possessors and temporal information considered.
First, we only consider named entities as potential possessors, so it is possible we miss some possessors (e.g., pronouns, the artist, his son). Because of the source documents we work with (Wikipedia articles about artworks) and the fact that we pair all potential possessors and years within a section, this is not a major issue: most mentions in a Wikipedia section can be resolved to a named entity within the same section. We note, however, that coreference resolution (Pradhan et al., 2011) would alleviate this problem.
Second, we only consider four digits within a DATE named entity as temporal information. This means that temporal information encoded in relative dates (e.g., four years later) or historical events (e.g., after World War II) is disregarded. Additionally, we cannot distinguish between several possessors within a year; finer-grained times would be required to do so. To address these issues, temporal parsers (Lee et al., 2014; Strötgen and Gertz, 2015) and anchoring events in time (Reimers et al., 2016) are required.

Experiments and Results
We experiment with traditional Support Vector Machines and neural networks. We divided the articles (and the corresponding (potential possessor, year) pairs) into train (80%) and test (20%), and report results obtained with the test split. Note that splitting pairs randomly would be unsound, as possession relations for the same possessee would be in the train and test splits. We build three classifiers with both SVMs and neural networks (one per question: before, during and after), and all of them predict two labels: yes or no.
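The article-level split above can be sketched with scikit-learn's GroupShuffleSplit (an assumption; the paper does not name the tool used). Grouping pairs by article keeps all pairs for a possessee in a single split; the toy pairs below are hypothetical:

```python
from sklearn.model_selection import GroupShuffleSplit

# Each (potential possessor, year) pair carries the article it came from.
pairs = [("Mary of Hungary", "1530"), ("Philip II of Spain", "1558"),
         ("Arnoldus Andries des Tombe", "1881"), ("Mauritshuis", "1902"),
         ("Louvre", "1797"), ("Francis I", "1519")]
articles = ["Arnolfini Portrait", "Arnolfini Portrait",
            "Girl with a Pearl Earring", "Girl with a Pearl Earring",
            "Mona Lisa", "Mona Lisa"]

# 80/20 split at the article level, not the pair level.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(pairs, groups=articles))

# No article appears in both splits, so possession relations for the
# same possessee never leak from train to test.
train_articles = {articles[i] for i in train_idx}
test_articles = {articles[i] for i in test_idx}
```

Splitting pairs uniformly at random instead would put pairs about the same artwork in both splits and inflate test scores, which is the unsoundness the paragraph warns about.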

Support Vector Machines
We trained the three classifiers using the SVM implementation in scikit-learn (Pedregosa et al., 2011), and tuned hyper-parameters C and γ using 10-fold cross-validation with the train split. We used features extracted from the possessor, the year, and the sentences they belong to. Additionally, we also included the Wikipedia article title and the section title from which the possessor and year were selected. The full feature set is described in Table 3 and we do not elaborate further. Our motivation to try SVMs is to establish a strong supervised baseline, and to compare with neural networks that take as input only plain text.
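The hyper-parameter tuning above can be sketched as follows; the random feature matrix is a hypothetical stand-in for the features in Table 3, and one such classifier would be trained per question (before, during, after):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical stand-in for the extracted features and yes/no labels.
rng = np.random.RandomState(0)
X = rng.rand(100, 10)
y = rng.randint(0, 2, 100)  # 0 = no, 1 = yes

# Tune C and gamma with 10-fold cross-validation on the train split.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10, scoring="f1")
search.fit(X, y)
best_svm = search.best_estimator_
```

Scoring on F1 of the yes label (rather than accuracy) is the natural choice here, since the no label dominates and only yes predictions extract possession relations.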

Neural Networks
We use the implementations provided by the Keras neural network API (Chollet et al., 2015) with TensorFlow backend (Abadi et al., 2015). Additionally, we use GloVe embeddings with 300 dimensions (Pennington et al., 2014) 4 to transform words into their distributed representations, the Adam optimizer (Kingma and Ba, 2014) and categorical cross entropy as a loss function. We train the network with batch size 16 for up to 200 epochs, but stop earlier if no improvement is observed in the validation set for 5 epochs. We reserve 20% of the train split for validation.
The neural network is composed of four Long Short-Term Memory networks (Hochreiter and Schmidhuber, 1997) with 200 units. The outputs of the LSTMs are concatenated along with the embeddings of the possessor and year, and the final output is calculated with a Softmax layer. Each LSTM has as its input a different chunk of text:
• The first LSTM takes as input the sequence of tokens in the sentence containing the possessor (top left in Figure 7). Each token is represented by the corresponding word embedding, plus an additional embedding (also with 300 dimensions) distinguishing the possessor from all other tokens (there are only two unique additional embeddings, white and light gray in Figure 7). Unlike the word embeddings from GloVe, the additional embeddings are initialized randomly and are updated during the training process. Our rationale for adding the additional embeddings is to provide the LSTM with information to learn which tokens surrounding the possessor are more important.
• The second LSTM takes as input the sentence containing the year (top right in Figure 7). The input representation is very similar to the one used in the first LSTM; the only difference is that the additional embeddings (white and dark gray) indicate the year and any other token. Again, our rationale for the additional embeddings is to provide the LSTM with information to learn which tokens surrounding the year are more important.
• The third LSTM (bottom left in Figure 7) takes as input the Wikipedia article title (i.e., the name of the possessee). The input words are represented with their GloVe embeddings.
• The fourth LSTM (bottom right) takes as input the section title from which the possessor and year were selected. The input words are also represented with their GloVe embeddings and no additional information.

4 Available at https://nlp.stanford.edu/projects/glove/, file glove/glove.6B.300.txt
Our rationale is that some sections are less likely to contain valid possessors (e.g., Cultural Impact (low likelihood) vs. Ownership and display (high likelihood)).
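The architecture above can be sketched with the Keras functional API. Dimensions follow the paper (300-dimensional embeddings, 200 LSTM units, Softmax output over yes/no); the vocabulary size and sequence lengths are hypothetical placeholders, and in practice the word embedding matrices would be initialized with GloVe and frozen:

```python
from tensorflow.keras import layers, Model

VOCAB, EMB, UNITS = 20000, 300, 200  # VOCAB is a placeholder
SENT_LEN, TITLE_LEN = 50, 10         # placeholders

def sentence_branch():
    """LSTM over a sentence: GloVe-style token embeddings concatenated with
    a trainable 2-way marker embedding (possessor/year vs. any other token)."""
    tokens = layers.Input(shape=(SENT_LEN,))
    markers = layers.Input(shape=(SENT_LEN,))
    w = layers.Embedding(VOCAB, EMB)(tokens)   # GloVe weights in practice
    m = layers.Embedding(2, EMB)(markers)      # randomly initialized, trained
    h = layers.LSTM(UNITS)(layers.Concatenate()([w, m]))
    return [tokens, markers], h

def title_branch():
    """LSTM over the article or section title, word embeddings only."""
    tokens = layers.Input(shape=(TITLE_LEN,))
    h = layers.LSTM(UNITS)(layers.Embedding(VOCAB, EMB)(tokens))
    return [tokens], h

in1, h1 = sentence_branch()  # sentence containing the possessor
in2, h2 = sentence_branch()  # sentence containing the year
in3, h3 = title_branch()     # article title (name of the possessee)
in4, h4 = title_branch()     # section title

# Embeddings of the possessor and year themselves are concatenated too.
poss, year = layers.Input(shape=(1,)), layers.Input(shape=(1,))
e1 = layers.Flatten()(layers.Embedding(VOCAB, EMB)(poss))
e2 = layers.Flatten()(layers.Embedding(VOCAB, EMB)(year))

merged = layers.Concatenate()([h1, h2, h3, h4, e1, e2])
out = layers.Dense(2, activation="softmax")(merged)  # yes / no

model = Model(in1 + in2 + in3 + in4 + [poss, year], out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

One such model would be trained per question (before, during, after), with batch size 16, up to 200 epochs, and early stopping after 5 epochs without validation improvement, as described above.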

Results
Results obtained with the test set are provided in Table 4. F-measures are always higher for no than yes, but recall that only yes label allows us to extract valid possession relations.
Baselines. The majority baseline always predicts no label for all temporal tags (before, during and after, see percentages in Figure 4), thus it fails to extract any possession information.
SVMs. SVMs obtain higher-than-chance results, but F-scores with the yes label are relatively low (before: 0.33, during: 0.31 and after: 0.44).

Neural Networks. The full neural network always outperforms SVMs, but the difference in F-score varies across temporal tags. We also experimented with modifications of the full neural network to provide insights into which components are more useful. Specifically, we report results not using the additional embeddings for the possessor and year, and disabling the LSTMs for the article title and section title. Note that while yes F-scores for during barely vary regardless of the modifications to the network, we found interesting patterns for before and after. All F-scores discussed below are for the yes label, the only label that is useful to extract possession relations.
• First, the additional embeddings for the possessor and year are beneficial for after (0.47 vs. 0.53, +12.8%) and during (0.29 vs. 0.32, +10.3%), and neutral for before. This leads to the conclusion that the LSTM learns the contexts surrounding the possessor and year successfully only for after and during. Note that the additional embeddings provide information regarding the position of the possessor and year within their sentences.
• Second, the LSTM that takes as its input the article title is beneficial for before (0.37 vs. 0.40, +8.1%) and after (0.46 vs. 0.53, +15.2%), and barely for during (0.31 vs. 0.32, +3.1%). Thus we can conclude that the article title contains useful information to determine the existence of possession relations, and that pretrained word embeddings capture this information.
• Third, the LSTM that takes as its input the section title is beneficial for after (0.48 vs. 0.53, +10.4%), detrimental for before (0.45 vs. 0.40, -11.1%) and barely detrimental for during (0.32 vs. 0.33, -3.0%). These results lead to the conclusion that the section title only contains useful information to determine possession relations in future years with respect to the years mentioned in the section.

Conclusions
Possession is an asymmetric semantic relation between two entities, where one entity (the possessee) belongs to the other entity (the possessor). Following theoretical works, we understand belongs in a broad sense, including physical, temporal, and control possessions.
In this paper, we track possession relations over time. Specifically, we work with Wikipedia articles about artworks, and extract their possessors as well as temporal information with respect to the years explicitly mentioned (before, during or after). We have presented an approach to extract potential possessors and pair them with years, and an annotation scheme to validate them. Overall inter-annotator agreement (Cohen's kappa) is 0.70, and the resulting corpus has substantial information regarding possessors over time: in 75% of articles we validate at least 14 (possessor, year) pairs, and in 50%, at least 30 pairs.
Experimental results show that the task can be automated, although we obtain moderate results. We present an LSTM ensemble that outperforms a traditional SVM. Disabling certain components of the full network shows that the article title and section title benefit different temporal tags, and that the additional embeddings for the possessor and year are beneficial for during.