Conversational Decision-Making Model for Predicting the King’s Decision in the Annals of the Joseon Dynasty

The styles that leaders adopt when making decisions in groups vary, and these styles affect the performance of the group. To understand the key words and speakers associated with decisions, we first formalize the problem as one of predicting leaders' decisions from their discussions with group members. As a dataset, we introduce conversational meeting records from a historical corpus, and we develop a hierarchical RNN structure with attention and pre-trained speaker embedding, which we call the Conversational Decision-Making Model (CDMM). CDMM outperforms the other baselines in predicting leaders' final decisions from the data. We explain why CDMM works better than the other methods by presenting the key words and speakers discovered from the attention values as evidence.


Introduction
Decision making in groups refers to the process of making choices to resolve issues by discussing them with group members (Lunenburg, 2011). Its styles vary with the balance of participation between the leader and the members, ranging from autocratic and democratic to laissez-faire (let go) and delegation types of groups (Lewin et al., 1939; Vroom and Jago, 1988). Social psychologists note that decision making affects group performance and the satisfaction of group members (Yang, 2010), and that leadership plays a role (Larson Jr et al., 1998). In this paper, we study the key factors that are closely related to the decision-making process used by leaders.
First, we build conversational meeting records from The Annals of the Joseon Dynasty (henceforth referred to as the AJD), after which we formalize our research problem as predicting leaders' decisions in conversational discussions from the data (Sec 2). The AJD consists of the records of the kings who governed the Korean peninsula. In the AJD, the kings discuss issues with government officials and decide upon a course of action. Many discussion corpora are available, such as Augmented Multi-party Interaction (AMI) (Carletta et al., 2005), which consists of video recordings of meetings; these corpora are used to identify and summarize decisions in conversation (Hsueh and Moore, 2007; Fernández et al., 2008; Bui et al., 2009). However, the AJD has more speakers than AMI, and it is a longitudinal corpus spanning over 400 years.
[Figure 1: Example meeting record article. Issue: combining two local districts. Officials' utterances: "It is a hard problem." "It is reasonable to combine the regions." "I propose another solution." Decision: the king follows Official C's suggestion.]
To predict the decisions in the corpus, we develop a model which we term the Conversational Decision-Making Model (CDMM) (Sec 3). CDMM is based on the hierarchical RNN structure with attention (Yang et al., 2016), but we add speaker information with pre-trained embedding. We also devise a way to build the speaker embedding from a co-occurrence document network (Sec 3.3). In comparison with several other methods, CDMM shows the highest macro-averaged F1 score (Sec 4). We also show why CDMM works better by examining the attention values to find the key words and speakers (Sec 5).
Meeting articles in the AJD record who said what on an issue in dialogue form, together with the king's decision. Figure 1 shows an example of a meeting record article (footnote 1). In the article, the king and government officials discuss the issue of combining two local regions. The king asks the officials for a solution to the issue, and they state their opinions. At the end of the article, the king decides to follow Official C's suggestion to solve the issue.
We build a corpus from the AJD using the following process. We crawl the AJD website to retrieve the documents and select articles that have three or more speakers per document. We identify the king's final decision in each article by examining the final sentence and the title as summarized by historians. We first check whether the subject of the final sentence is the king, as some issues are dealt with by others, such as the king's mother. We also extract the verbs in the final sentence and the title that indicate the decision. From these, we categorize each king's explicit decisions into five types: Order, Approve, Disapprove, Accept and Reject. Some articles include a discussion of an issue, but the king's final decision is not explicitly recorded or the king postpones the decision. We treat this type of decision as Discuss, the sixth category. Finally, we choose the fifteen kings who each have more than 200 articles with a final decision. Table 1 shows the basic statistics of the data.
Footnote 1: http://sillok.history.go.kr/id/kda_10103027_005
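The selection and labeling rules above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record fields (`king`, `speakers`, `decision`), the helper names, and the choice to count distinct speakers are all assumptions made for the example.

```python
from collections import Counter

DECISIONS = ("Order", "Approve", "Disapprove", "Accept", "Reject", "Discuss")

def label_of(article):
    """Return the decision label; a missing or unknown decision becomes Discuss."""
    label = article.get("decision")
    return label if label in DECISIONS else "Discuss"

def select_articles(articles, min_speakers=3, min_articles_per_king=200):
    """Keep multi-speaker articles of kings that have enough articles.

    `articles` is a list of dicts with hypothetical fields
    'king', 'speakers' and 'decision'.
    """
    # Keep articles with at least `min_speakers` distinct speakers (assumption).
    kept = [a for a in articles if len(set(a["speakers"])) >= min_speakers]
    # Keep only kings with MORE than `min_articles_per_king` remaining articles.
    per_king = Counter(a["king"] for a in kept)
    return [dict(a, decision=label_of(a)) for a in kept
            if per_king[a["king"]] > min_articles_per_king]

# Toy records; the threshold is lowered so the example is small.
toy = [
    {"king": "K1", "speakers": ["king", "A", "B"], "decision": "Order"},
    {"king": "K1", "speakers": ["king", "A", "C"], "decision": None},
    {"king": "K1", "speakers": ["king", "A"], "decision": "Accept"},
    {"king": "K2", "speakers": ["king", "A", "B"], "decision": "Reject"},
]
selected = select_articles(toy, min_articles_per_king=1)
```

In the toy data, the two-speaker article is dropped, K2 is dropped for having too few articles, and the article without an explicit decision is labeled Discuss.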

Conversational Decision Making Model
This section describes our model, the Conversational Decision-Making Model (CDMM), for identifying leaders' decisions from meeting records. CDMM is based on the Hierarchical Attention Network (HAN) (Yang et al., 2016), but we change the sentence level to the utterance level and use speaker information (described in Section 3.2). To encode the speaker information, we build the speaker embedding from a co-occurrence document network (described in Section 3.3).

Word Encoder
To encode the $t$-th word of the $i$-th utterance, $x_{it}$ with $t \in \{1, \dots, T\}$, we first map the word $x_{it}$ to a word vector $w_{it}$ using the word embedding matrix. We use a bi-directional GRU (Bahdanau et al., 2014) and concatenate the forward and backward hidden states into $h_{it}$. Then, we use the attention mechanism in HAN to find the words that are important for classifying the decision. Each word has an attention value $\alpha_{it}$, and we compute the utterance word vector as the attention-weighted sum $u_i = \sum_{t=1}^{T} \alpha_{it} h_{it}$.
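The attention pooling step can be sketched in plain Python. This follows the general form of HAN-style attention (a tanh projection scored against a learned context vector, then a softmax). The parameter names `W`, `b` and `context` are assumptions for the example and are not taken from the paper.

```python
import math

def attention_pool(hidden_states, W, b, context):
    """Attention-weighted sum of hidden states.

    hidden_states: list of T vectors (lists of floats), each of size d.
    W: k x d projection matrix, b: length-k bias, context: length-k vector.
    Returns (alphas, pooled), where alphas sums to 1 and pooled has size d.
    """
    scores = []
    for h in hidden_states:
        # Project the hidden state and squash it with tanh.
        proj = [math.tanh(sum(w * x for w, x in zip(row, h)) + b_k)
                for row, b_k in zip(W, b)]
        # Score the projection against the context vector.
        scores.append(sum(p * c for p, c in zip(proj, context)))
    # Numerically stable softmax over the T scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Weighted sum of the hidden states.
    dim = len(hidden_states[0])
    pooled = [sum(a * h[j] for a, h in zip(alphas, hidden_states))
              for j in range(dim)]
    return alphas, pooled

# Two toy "word" states; the context vector favors the first dimension.
alphas, pooled = attention_pool(
    hidden_states=[[1.0, 0.0], [0.0, 1.0]],
    W=[[1.0, 0.0], [0.0, 1.0]],
    b=[0.0, 0.0],
    context=[1.0, 0.0],
)
```

The same routine applies unchanged at the utterance level, where the inputs are utterance hidden states instead of word hidden states.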

Utterance Encoder with Speaker
In CDMM, the $i$-th utterance has a word sequence representation vector $u_i$ and a speaker vector $s_i$. First, we map the speaker $z_i$ to a vector $s_i$ using the speaker embedding matrix $W_s$: $s_i = W_s z_i$. To encode the sequence of $U$ utterances $(u_i, s_i)$, $i \in \{1, \dots, U\}$, we suggest a GRU-based encoder (Bahdanau et al., 2014) that takes both $u_i$ and $s_i$ as input at each step and thus learns from them simultaneously. In this encoder, $h_i$ is the $i$-th utterance hidden state, and $z_i$ and $r_i$ denote the update and reset gates, respectively (the gate $z_i$ is distinct from the speaker $z_i$ above). This is similar to earlier work (Li et al., 2016), but we add the speaker vector at the utterance level, not the word level. As in the word encoder, we use a bidirectional GRU for the utterance encoder and concatenate the forward and backward hidden states into $h_i$. We use the same attention mechanism to find important utterances. Each utterance has an attention value $\alpha_i$, and the conversation vector is the attention-weighted sum $\sum_{i=1}^{U} \alpha_i h_i$. We also apply dropout (Srivastava et al., 2014) to avoid over-fitting.
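One plausible way to write a GRU step that consumes both $u_i$ and $s_i$ is sketched below. This is an assumption for illustration, not the paper's exact formulation: the weight names $W_\ast$, $V_\ast$, $U_\ast$ and the biases $b_\ast$ are hypothetical, and the speaker vector is assumed to enter each gate through its own matrix, in the spirit of Li et al. (2016).

```latex
\begin{aligned}
z_i &= \sigma\left(W_z u_i + V_z s_i + U_z h_{i-1} + b_z\right) \\
r_i &= \sigma\left(W_r u_i + V_r s_i + U_r h_{i-1} + b_r\right) \\
\tilde{h}_i &= \tanh\left(W_h u_i + V_h s_i + U_h \left(r_i \odot h_{i-1}\right) + b_h\right) \\
h_i &= \left(1 - z_i\right) \odot h_{i-1} + z_i \odot \tilde{h}_i
\end{aligned}
```

Here $z_i$ is the update gate (not the speaker index), $r_i$ is the reset gate, $\sigma$ is the logistic sigmoid and $\odot$ is the element-wise product. Setting every $V_\ast$ to zero recovers an ordinary GRU over the utterance vectors alone.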

Pre-trained Speaker Embedding
Unlike word embedding, which is pre-trained from news or Wikipedia articles (Mikolov et al., 2013), pre-trained speaker embedding for the AJD does not exist. To overcome this limitation, we suggest building the speaker embedding from a co-occurrence document network in the AJD. The AJD contains not only meeting records but also personnel management reports and explanations of the officials. We therefore build a co-occurrence network in which the vertices are people and two individuals are connected if they appear in the same article. The weight of an edge is the number of articles in which the two individuals co-occur. With this network, we obtain the speaker embedding using the node2vec algorithm (Grover and Leskovec, 2016), which generates a vector representation for each node.
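The construction of the weighted co-occurrence network can be sketched as follows. This is a minimal illustration under the assumption that each article has already been reduced to the list of people it mentions. The function name and the toy names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_edges(articles):
    """Count, for every pair of people, the number of articles they share.

    articles: iterable of lists of person names, one list per article.
    Returns a Counter mapping (a, b) with a < b to the edge weight.
    """
    weights = Counter()
    for people in articles:
        # Deduplicate so that one article contributes at most 1 per pair.
        for a, b in combinations(sorted(set(people)), 2):
            weights[(a, b)] += 1
    return weights

toy_articles = [
    ["King", "Official A", "Official B"],
    ["King", "Official A"],
    ["Official B", "Official C"],
]
edges = build_cooccurrence_edges(toy_articles)
```

The resulting weighted edge list is the kind of input that a node2vec implementation typically consumes. The exact input format depends on the implementation used.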

Experiments
This section describes the experiments and results of CDMM as well as other methods for classifying the king's decisions in the AJD.

Experiment Setting
We split the data as 80/10/10 for training/validation/test. Because the meeting records contain fifteen kings, we split the data randomly for each king and merge each part into the entire training, validation and test set.
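The per-king split can be sketched as follows. The record field `king`, the function name and the fixed seed are assumptions for the example. Any remainder from rounding is assigned to the test part.

```python
import random
from collections import defaultdict

def split_per_king(articles, train_ratio=0.8, valid_ratio=0.1, seed=0):
    """Shuffle each king's articles separately, split them, then merge."""
    rng = random.Random(seed)
    by_king = defaultdict(list)
    for article in articles:
        by_king[article["king"]].append(article)
    train, valid, test = [], [], []
    for king in sorted(by_king):
        items = by_king[king][:]
        rng.shuffle(items)
        n_train = int(len(items) * train_ratio)
        n_valid = int(len(items) * valid_ratio)
        train += items[:n_train]
        valid += items[n_train:n_train + n_valid]
        # Whatever is left (about 10%) goes to the test set.
        test += items[n_train + n_valid:]
    return train, valid, test

# Toy data: one king with 10 articles and one with 20.
toy = ([{"king": "K1", "id": i} for i in range(10)]
       + [{"king": "K2", "id": i} for i in range(20)])
train, valid, test = split_per_king(toy)
```

Splitting within each king before merging keeps every king represented in all three parts in roughly the same proportion.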
We compare CDMM with the following methods. The majority-class baseline predicts the majority class, Discuss, for all test examples. We apply Naive Bayes and an SVM with a linear kernel. For these methods, we remove words whose document frequency is smaller than twenty. To see the power of the speaker information, we also run these baselines on word and speaker features together. We also run fastText, which is a classifier with n-gram features and hierarchical softmax, and is similar to CBOW (Mikolov et al., 2013). We use pre-trained Korean word vectors (Grave et al., 2018) for fastText and CDMM. We create the speaker embedding from the AJD. For a fair comparison, we exclude the validation and test articles when constructing the co-occurrence network. We use a node2vec implementation for the speaker embedding. We set the GRU hidden state size to 200, the dimension of the speaker embedding to 200 and the dropout probability to 0.5 for CDMM.
Table 2 shows the results. CDMM performs better than all other methods on the macro-averaged and weighted-averaged metrics. The majority-class baseline shows the lowest performance. Naive Bayes and SVM outperform this baseline. fastText with pre-trained word vectors outperforms its counterpart without them, in accordance with an earlier result (Lample et al., 2016). CDMM without speaker information performs on par with HAN, the only difference being that HAN encodes sentences and CDMM encodes utterances. It does not show good performance, as it models only the hierarchical structure of the conversation. However, when we add speaker information, the performance increases even with random initialization of the speaker embedding. The performances of Naive Bayes and SVM also increase when they are given speakers as features. These observations signal that speaker information is helpful for predicting the king's decisions. Finally, CDMM with pre-trained speaker embedding outperforms the others.
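To make the difference between the macro-averaged and weighted-averaged metrics concrete, the sketch below computes both F1 variants for a majority-class predictor on a toy label set. The labels and counts are invented for the example and are not results from the paper.

```python
from collections import Counter

def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), with 0.0 for empty classes."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)

def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true examples."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * n for c, n in support.items()) / len(y_true)

# Majority-class baseline on a toy label set.
y_true = ["Discuss", "Discuss", "Discuss", "Order", "Accept"]
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)
macro = macro_f1(y_true, y_pred)        # 0.25
weighted = weighted_f1(y_true, y_pred)  # 0.45
```

The majority-class predictor scores 0.75 on Discuss and 0 on the other two classes, so the macro average penalizes it much more than the weighted average does. This is why macro-averaged F1 is informative when one class dominates.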

Discussion
Here, we investigate the attention values to determine the important words and speakers for predicting the king's decisions. We also obtain evidence showing why CDMM with pre-trained speaker embedding outperforms the others.

Key Words and Speakers
We investigate the important words using word attention values. To find the important words, we compute the mutual information (Christopher et al., 2008) between the classes and the words whose attention values are in the top 10% within their utterances. Figure 3 shows the attention weight distributions for two of the top words, "Wish to do" and "Okay". The word "Wish to do" is usually used to make a request to the king. The peak of the attention weight distribution of "Wish to do" for the Approve class is around 0.7, whereas it is around 0.3 for Order and Discuss. We can interpret this to mean that CDMM assigns greater attention to that word when predicting Approve than when predicting Order or Discuss. The word "Okay" is used to consent to the opinions of others. CDMM assigns a higher attention value to this word when predicting Order and Accept than when predicting Discuss.
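The mutual information between a word and a class can be computed from a 2x2 contingency table, as in the standard feature-selection formulation. The sketch below assumes that a word "occurs" in an article when it is among the top-attended words there. That counting convention and the function name are assumptions for the example.

```python
import math

def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between word occurrence and class.

    n11: articles in the class where the word occurs
    n10: articles outside the class where the word occurs
    n01: articles in the class where the word does not occur
    n00: articles outside the class where the word does not occur
    """
    n = n11 + n10 + n01 + n00
    cells = (
        # (joint count, word marginal, class marginal)
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    mi = 0.0
    for joint, word_total, class_total in cells:
        if joint:  # an empty cell contributes nothing
            mi += (joint / n) * math.log2(n * joint / (word_total * class_total))
    return mi

independent = mutual_information(25, 25, 25, 25)  # word carries no class signal
perfect = mutual_information(50, 0, 0, 50)        # word identifies the class
```

Ranking words (or, later, speakers) by this score within each class gives the kind of per-class top lists discussed in this section.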
However, the attention values differ according to the speaker. As shown in Figure 4, CDMM gives a high attention score to the word "Okay" for Accept as compared to the other classes when the speaker is the king. However, when officials use this word, CDMM assigns a high attention value to the word in the Order class. Even though the same word is used, the king's decision changes depending on the speaker. This is additional evidence showing why the speaker information is useful for predicting the decision.
Table 3: Name (translated into English) and position of the speakers who have high mutual information scores for the classes. Local gov is the local government official and Central gov is the central government official. Remonstrator is the official who remonstrates to the king. The position of the speaker is important for predicting the king's decision.

Position of the Speaker
We investigate the key speakers using utterance attention values. To determine the important speakers, we use the same technique that we use for finding important words.
We find that the highly ranked speakers within each class tend to share the same position. Table 3 shows the top ranked speakers and their positions for each class. The chief secretary, who takes orders from the king, has a high rank in the Order class. For Approve and Disapprove, local authorities are highly ranked. For Accept, central government officials have high MI values. Interestingly, officials who remonstrate to the king have high scores in the Disapprove and Reject classes. We can thus say that the kings commonly refuse admonitions from officials.
From these results, we can gain insight into why pre-trained speaker embedding is helpful for predicting the king's decisions. People in the same organization belong to the same community in a co-occurrence network of news articles (Özgür et al., 2008). The AJD network therefore contains this community information, and node2vec captures the closeness of the nodes in its embedding. CDMM incorporates this knowledge into the model and therefore outperforms the other methods.

Conclusion
In this paper, we created conversational meeting data from the Annals of the Joseon Dynasty (AJD). We presented the Conversational Decision-Making Model (CDMM) to predict leaders' decisions from the data. We also suggested building the speaker embedding from a co-occurrence document network with node2vec. With this data, we showed that CDMM outperforms the other methods in terms of most metrics. We implemented CDMM using TensorFlow (Abadi et al., 2016) and made the code and data publicly available. We also analyzed why CDMM succeeds, and identified the key words and speakers, by investigating the attention values.
Studies of small group dynamics can be helpful when attempting to understand group decision-making behavior (Backstrom et al., 2006). Prior work that analyzed small group dynamics relied on a hidden Markov model (Magdon-Ismail et al., 2003), a dynamic Bayesian network (Mathur et al., 2012) or a layered probabilistic model (Cheng et al., 2014) for various datasets such as networks or recorded video. We suggest CDMM, which combines two types of data to predict leaders' decisions. This idea can also be applied to other group dynamics analyses.