Helpful or Hierarchical? Predicting the Communicative Strategies of Chat Participants, and their Impact on Success

When interacting with each other, we motivate, advise, inform, and show love or power towards our peers. However, the way we interact may also indicate how successful we are, as people often try to help each other achieve their goals. We study the chat interactions of thousands of aspiring entrepreneurs who discuss and develop business models. We manually annotate a set of about 5,500 chat interactions with four dimensions of interaction styles (motivation, cooperation, equality, advice). We find that these styles can be reliably predicted, and that the communication styles can be used to predict a number of indices of business success. Our findings indicate that successful communicators are also successful in other domains.


Introduction
People are social beings who communicate their feelings, emotions, thoughts, ideas, etc. through verbal and non-verbal interactions. Based on these interactions, we build relationships, and these relationships, in turn, help create and maintain a network of peers. Peers in a network cooperate with each other, help each other to learn, and exchange ideas. However, they also compete for the same resources (Vega-Redondo et al., 2019), not least attention. Peer networks are particularly important for innovation and entrepreneurship (Gonzalez-Uribe and Leatherbee, 2017), as they produce an active exchange of ideas.
People are usually assumed to be altruistic in networks like online social forums. They cooperate with and help one another with answers, advice, and ideas. The motivations behind helping a peer include, but are not limited to, getting pure pleasure from helping, self-advancement, building a reputation, developing relationships, or sheer entertainment (Tausczik and Pennebaker, 2012).
When people interact with each other, their interactions vary along various communicative styles, such as showing cooperativeness, equality, business orientation, etc. (Rashid and Blanco, 2018). Varying these communication styles provides tools to achieve communicative goals. For example, someone trying to build a reputation will tend to use a more cooperative style. Someone who tries to be helpful may use more words of advice in their interactions. The usage of relationship-establishing styles is more prevalent in certain personalities (Cheng, 2011) and in specific settings. Business-oriented people communicate more independence, tolerance of ambiguity, risk-taking propensity, innovativeness, and leadership qualities (Wagener et al., 2010).
The impact of these styles is, therefore, an essential factor in text analysis. However, due to their complex, decentralized nature, these communication styles have been studied very little in NLP. Cooperativeness is more than just a few keywords: it includes a whole inventory of communicative tools. This property makes it harder to annotate and predict. Part of the reason is the lack of adequate corpora. We provide such a corpus and report encouraging results for the above styles.
Contributions. We introduce a new task, predicting the communicative strategies of interlocutors in a real-life setting, and provide a new, multiply-annotated data set of 5k+ instances. We find that the various communicative dimensions can be efficiently predicted. Additional tests suggest that the communicative strategy of a person is somewhat predictive of their business success.

Data
Our ultimate goal is to predict the communicative styles and strategies of aspiring entrepreneurs in an online peer network. The initial corpus is part of a large-scale social science experiment that involved around 5,000 entrepreneurs from 49 African countries (Vega-Redondo et al., 2019). After completing an online business course, those entrepreneurs interacted in groups of sixty through an Internet platform for about two and a half months, resulting in approximately 140,000 chat interactions. Besides the chat interactions, the original dataset contains background information about the speakers (country of origin, educational background, age, gender, etc.). All the participants submitted business proposals, which were evaluated by a panel to assess their potential.
The original experimental setup was designed to assess how communication among peers affects innovation and entrepreneurship. Vega-Redondo et al. (2019) therefore already applied NLP techniques for semantic analysis of the interactions. They also manually annotated other indicators, i.e., business-relatedness, sentiment, and target audience (one or several people), on a subset of 10k sentences. They trained classifiers on this data to infer these labels for all remaining 130k instances in the corpus. This dataset provides a perfect starting point for our goals.
The first step to address our goal involves annotating speech styles on several interactions among the participants. We work on the same subset of previously annotated data, and add our own annotations to enrich the data further.

Equality, indicating whether there is a display of hierarchy between the speaker and the receiver, with label values equal and hierarchical.
For all styles, unknown is used whenever it is hard to determine any of the other values from context.

Annotation Process
Three graduate students with experience in NLP tasks annotated the corpus. They were trained with written annotation guidelines consisting of definitions and examples for all the communication styles. They also had an hour-long session carrying out sample annotations to ensure that they properly understood the problem. For the annotations, the annotators filled out their responses in interactive spreadsheets choosing the correct value for a particular style. Each of the annotators annotated their part of around 2,100 chat interactions. 502 of these were shared among all three annotators so that we can compute agreement measures. We obtain the most probable labels for the shared portion using MACE (Hovy et al., 2013).
We summarize the inter-annotator agreement coefficients in Table 1 (raw agreement: 83%; averaged pairwise Cohen's κ: 0.53; Krippendorff's α: 0.52). The average MACE competence score of these annotators is 0.53. Table 3 shows the Pearson correlations between pairs of interaction styles and the previous annotations. These correlations indicate that Motivational interactions are usually also Cooperative (0.61), give Advice (0.56), and are Equal (0.54). Interestingly, many Business-related interactions are not very Cooperative (-0.42).
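To make the first two agreement measures concrete, the following is a minimal sketch of raw agreement and averaged pairwise Cohen's κ for three annotators. It uses only the Python standard library; the function names and the toy labels are assumptions made for illustration and are not taken from the released annotations.

```python
from collections import Counter
from itertools import combinations


def raw_agreement(a, b):
    """Fraction of items on which two annotators chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) if both annotators always use one
    and the same label, since chance agreement is then 1.
    """
    n = len(a)
    p_observed = raw_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability that both pick the same label if each
    # labels independently according to their own label distribution.
    p_expected = sum(counts_a[lab] * counts_b[lab] for lab in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def average_pairwise(metric, annotations):
    """Average a two-annotator metric over all annotator pairs."""
    pairs = list(combinations(annotations, 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy labels for the Equality style (three annotators, six items).
annotations = [
    ["equal", "equal", "hierarchical", "equal", "unknown", "hierarchical"],
    ["equal", "equal", "hierarchical", "hierarchical", "unknown", "hierarchical"],
    ["equal", "hierarchical", "hierarchical", "equal", "unknown", "hierarchical"],
]

mean_raw = average_pairwise(raw_agreement, annotations)
mean_kappa = average_pairwise(cohen_kappa, annotations)
print(f"raw agreement: {mean_raw:.2f}, averaged pairwise kappa: {mean_kappa:.2f}")
```

Raw agreement is typically higher than κ because κ discounts the agreement expected by chance, which is consistent with the gap between the 83% and 0.53 figures reported above.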
The label counts are as follows. For cooperativeness, 42.3% are labeled cooperative, 50.3% are Cooperative (Rashid and Blanco, 2018; Wish et al., 1976)
We release our annotations as stand-alone annotations (https://github.com/MilaNLProc/conversationstyle). Table 2 shows a number of actual chat interactions from the dataset with different values for the annotated styles. In interaction (1), the praise is considered a cooperative response, whereas in (2) the speaker is chiding someone, indicating competitiveness. The praise in (3) is motivational. Example (4) does not really communicate any motivation, so it is labeled as neutral for this style. Example (5) is just a greeting and does not indicate anybody displaying hierarchy over anyone else, so it is equal. Example (6) shows the speaker instructing someone on how to behave (hierarchical). In (7), the speaker is advising someone to think about a matter, whereas example (8) is just another neutral statement.

Experiments and Results
We want to predict four styles of interactions (cooperative, motivational, advice, equality), and three subsequent indicators of business success: (1) whether the person owns a business (HAS BUSINESS), (2) whether someone has ever owned a business (BUSINESS EVER) and (3) whether they submitted a business proposal to win funding to start a business (BUSINESS PROPOSAL).
We use (1) an SVM classifier with RBF kernel, which Rashid and Blanco (2018) found effective, to predict both the communicative styles and the business success indicators, and (2) a Multitask Learning (MTL) Convolutional Neural Network to predict the business success indicators.
We divide our annotated dataset into 80-20 stratified train-test splits for predicting communicative styles. For predicting indicators of business success, we use 500 randomly selected instances as test and the rest as training data.
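The two splitting schemes can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the authors' code: the function names, the seed, and the toy label names and counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), keeping label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction of every label for the test portion.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def random_holdout(n_items, n_test=500, seed=0):
    """Return (train_indices, test_indices) with n_test random test items."""
    rng = random.Random(seed)
    test = set(rng.sample(range(n_items), n_test))
    train = [i for i in range(n_items) if i not in test]
    return train, sorted(test)


# Hypothetical label distribution for one style (100 toy items).
labels = ["cooperative"] * 40 + ["other"] * 50 + ["unknown"] * 10
train_idx, test_idx = stratified_split(labels)

# Hypothetical corpus size for the business-success experiments.
success_train, success_test = random_holdout(n_items=2000, n_test=500)
```

Stratifying matters most for the rarer labels such as unknown: a purely random 20% sample could leave too few of them in the test portion to evaluate reliably.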

SVM setup
We use the SVM implementation in scikit-learn (Pedregosa et al., 2011) and tune the hyperparameters (C and γ) using 10-fold cross-validation within the train split. We train one classifier per style and per indicator of business success to predict the different labels.
Feature Set. After basic preprocessing (removal of stop words), tokenization, and parsing (to get the root verb) using spaCy, we extract features from the chat interactions and sentiment lexica. The feature set relies only on language usage. We extract the first word in a chat interaction, the bag-of-words representations (binary flags and tf-idf scores) of the chat interaction, and features from sentiment lexica. Specifically, we extract flags indicating whether the turn has a positive, negative or neutral word in the list by Hamilton et al. (2016), the sentiment score of the chat interaction (the sum of the per-token sentiment scores divided by the number of tokens), and a flag indicating whether the interaction contains a negative word from the list by Hu and Liu (2004). We also extract other features, which include (a) the root verb and (b) binary flags indicating the presence of exclamation marks, question marks, and negation cues from Morante and Daelemans (2012).
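As an illustration of this setup, the sketch below tunes C and γ of an RBF-kernel SVM with 10-fold cross-validation inside the train split, using scikit-learn. It is a simplified sketch rather than the authors' pipeline: only tf-idf bag-of-words features are used (the lexicon, first-word, and root-verb features are omitted), and the toy texts, labels, grid values, and random seed are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy data for the Advice style (not from the corpus).
advice = [
    "you should test your pricing with customers",
    "try to write down your costs first",
    "i suggest you talk to a supplier",
]
neutral = [
    "good morning everyone",
    "i am from ghana",
    "the course ended last week",
]
texts = advice * 10 + neutral * 10
labels = ["advice"] * 30 + ["neutral"] * 30

# 80-20 stratified train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=0
)

# tf-idf features feeding an RBF-kernel SVM.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grid; the source does not report the values searched.
param_grid = {
    "svm__C": [0.1, 1, 10],
    "svm__gamma": [0.01, 0.1, 1],
}

# 10-fold cross-validation runs on the train split only.
search = GridSearchCV(pipeline, param_grid, cv=10)
search.fit(X_train, y_train)

predictions = search.predict(X_test)
print(search.best_params_)
```

Placing the vectorizer inside the pipeline means the tf-idf vocabulary and weights are re-fit on each cross-validation training fold, so no information from the held-out fold leaks into the features.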

Multitask Learning (MTL) setup
We use a standard Convolutional Neural Network over word embeddings, with one output per task. We preprocess the data (lowercasing, removing URLs and stop words, converting numbers to 0, etc.) and learn a skip-gram embedding model (Mikolov et al., 2013), trained for 50 epochs. We use an embedding size of 512, choosing a power of 2 for memory efficiency.
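The preprocessing steps listed above could look roughly like the following sketch. Several details are assumptions, since the source does not specify them: the stop-word list is a tiny hypothetical one, the regular expressions are illustrative, and each whole number is mapped to a single 0 (replacing every digit individually would be an equally plausible reading).

```python
import re

# Hypothetical, deliberately tiny stop-word list; the actual list is not given.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def preprocess(text):
    """Lowercase, drop URLs, map numbers to 0, tokenize, drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove URLs
    text = NUMBER_RE.sub("0", text)    # assumption: whole number -> single 0
    tokens = TOKEN_RE.findall(text)    # simple alphanumeric tokenization
    return [tok for tok in tokens if tok not in STOP_WORDS]


example = preprocess("Visit https://example.com: the price is 1,500 USD in 2019!")
print(example)
```

Token lists produced this way would then serve as the training sentences for the skip-gram model.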
In the CNN, the input layer has the word indices of the text, converted via the embedding matrix into word embeddings. We convolve two parallel channels with max-pooling layers, and convolutional window sizes 4 and 8 over the input. The two window sizes account for both short and relatively long patterns in the texts. In both channels, the initial number of filters is 128 for the first convolution, and 256 in the second one. We join the convolutional channels' output and pass it through an attention mechanism (Bahdanau et al., 2014; Vaswani et al., 2017) to emphasize the weight of any meaningful pattern recognized by the convolutions. We use the implementation of Yang et al. (2016). The output consists of 7 independent, fully-connected layers for the predictions, respectively in the form of discrete labels for classification of one of the business success indicators of a person (HAS

Related Work
There have been a few studies analyzing language usage when people communicate. For example, researchers have studied power (or hierarchical) relationships in online communities (Danescu-Niculescu-Mizil et al., 2012), emails (Prabhakaran and Rambow, 2014), and social networks (Bramsen et al., 2011). Some have studied how roles of Wikipedia editors affect their success (Maki et al., 2017). Danescu-Niculescu-Mizil et al. (2013) analyze politeness in online forums using structural and linguistic features derived from the communications between two individuals. Katerenchuk and Rosenberg (2016) develop an algorithm to predict user influence levels in online communities. Rashid and Blanco (2018) characterize interactions between people with dimensions and produce a dataset annotating dimensions on TV scripts. Vega-Redondo et al. (2019) annotate business relevance and sentiment on online chat interactions among aspiring entrepreneurs. In contrast, we annotate the communicative styles cooperativeness, motivational, advice and equality on chat interactions between young aspiring entrepreneurs, and develop machine learning systems to automatically predict these styles and indicators of business success for the participants.

Conclusions
We present a data set of 5k+ instances annotated with four communication styles, which can be predicted effectively. These communicative styles are, in turn, somewhat predictive of indicators of people's business success. Our results and data set open up interesting new avenues to study the effects of people's communicative strategies on their business success.