ArguminSci: A Tool for Analyzing Argumentation and Rhetorical Aspects in Scientific Writing

Argumentation is arguably one of the central features of scientific language. We present ArguminSci, an easy-to-use tool that analyzes argumentation and other rhetorical aspects of scientific writing, which we collectively dub scitorics. The main aspect we focus on is the fine-grained argumentative analysis of scientific text through identification of argument components. The functionality of ArguminSci is accessible via three interfaces: as a command line tool, via a RESTful application programming interface, and as a web application.


Introduction
Scientific publications are the primary means for convincing scientific communities of the merit of one's work and the importance of research findings (Gilbert, 1976). To this end, authors typically present their work by embracing established practices and specific devices of scientific discourse, such as citations (Gilbert, 1977), that facilitate building persuasive lines of argumentation. Consequently, scientific texts abound with different interrelated rhetorical and argumentative layers. In this work, we refer to this set of mutually related rhetorical aspects of scientific writing as scitorics.
Numerous research groups have already proposed computational models for analyzing scientific language with respect to one or more of these aspects. For example, prior work presented experiments on the automatic assignment of argumentative zones, i.e., sentential discourse roles, to sentences in scientific articles. Similarly, there has been work on the automatic classification of citations with respect to their polarity and purpose (Jha et al., 2017; Lauscher et al., 2017b). It has also been shown that analyses of scitorics can support higher-level computational tasks, such as the attribution of scientific statements to authors (Teufel and Moens, 2000), the identification of research trends (McKeown et al., 2016), or the automatic summarization of scientific articles (Abu-Jbara and Radev, 2011; Lauscher et al., 2017a).
In this work, we present ArguminSci, 1 a tool that aims to support the holistic analysis of scientific publications in terms of scitorics, including the identification of argumentative components. We make ArguminSci publicly available for download. 2 At its core, it relies on separate neural models based on recurrent neural networks with long short-term memory (LSTM) cells (Hochreiter and Schmidhuber, 1997), pre-trained for each of the five scientific publication mining tasks that ArguminSci addresses, namely (1) argumentative component identification, (2) discourse role classification, (3) subjective aspect classification, (4) summary relevance classification, and (5) citation context identification. ArguminSci is available as a command line tool, through a RESTful HTTP-based application programming interface, and as a web-based graphical user interface (i.e., as a web application).

Related Work
We divide the overview of related tools and systems into two categories: (1) systems targeting the analysis of scitorics and (2) tools for argument mining (in other domains).
Tools for the Analysis of Scitorics. The Dr. Inventor Framework provides end-to-end analysis of scientific text, starting with the extraction of text from PDF documents. The system embeds several modules for mining scientific text, e.g., for the discourse role characterization of sentences. Saggion et al. (2017) presented MultiScien, a tool that analyzes scientific text collections in English and Spanish and offers a visualization of discourse categories and summaries. Also, several systems analyzing argumentative zones have been made publicly available (e.g., Guo et al., 2012; Simsek et al., 2013). However, to the best of our knowledge, ArguminSci is the first publicly available system that provides fine-grained argumentative analysis of scientific publications and allows for a joint analysis of scitorics, i.e., argumentation together with several other rhetorical aspects of scientific language.
Argument Mining Tools. Apart from new research models and approaches, several systems and software tools have been proposed for argument mining in other domains, mostly with machine-learning models at their core. Wachsmuth et al. (2017) developed args.me, an argument search engine that aims to support users in finding arguments and forming opinions on controversial topics. 3 Another similar system is ArgumenText (Stab et al., 2018). In contrast to args.me, the search engine of ArgumenText provides access to sentential arguments extracted from large amounts of arbitrary text. The system most similar to ArguminSci is MARGOT (Lippi and Torroni, 2016), 4 which extracts argumentative components from arbitrary text provided by the user. However, MARGOT is not tuned for a particular domain and does not perform well on scientific text (i.e., it cannot account for the peculiarities of argumentative and rhetorical structures of scientific text). Moreover, while MARGOT focuses only on argumentative components, ArguminSci allows for the parallel analysis of four other rhetorical aspects of scientific writing.

System Overview
We first describe the five annotation tasks that ArguminSci covers and the models we train for addressing these tasks. Next, we provide a technical overview of the system capabilities and the interfaces through which ArguminSci can be accessed.

Annotation Tasks and Dataset
Annotation Tasks. Our system supports the following aspects of rhetorical analysis (i.e., automatic annotation) of scientific writing: (1) argument component identification, (2) discourse role classification, (3) subjective aspect classification, (4) citation context identification, and (5) summary relevance classification. Out of these tasks, in accordance with the structure of the annotations in our training corpus, argument component identification and citation context identification are token-level sequence labeling tasks, whereas the remaining three tasks are cast as sentence-level classification tasks.
• Argument Component Identification (ACI): The task is to identify argumentative components in a sentence. That is, given a sentence x = (x_1, ..., x_n) with individual words x_i, assign a sequence of labels y_aci = (y_1, ..., y_n) out of the set of token tags Y_aci. The label set is a combination of the standard B-I-O tagging scheme and the three types of argumentative components, namely background claim, own claim, and data.
• Discourse Role Classification (DRC): Given a sentence x, the task is to classify the role of the sentence in terms of the discourse structure of the publication. The classes are given by the set Y_drc = {Background, Unspecified, Challenge, FutureWork, Approach, Outcome}.
• Subjective Aspect Classification (SAC): Given a sentence x, the task is to assign a single class out of the eight possible categories in Y_sac = {None, Limitation, Advantage, Disadvantage-Advantage, Disadvantage, Common Practice, Novelty, Advantage-Disadvantage}.
• Summary Relevance Classification (SRC): Given a sentence x, choose one class out of the set of possible relevance classes Y_src = {Very relevant, Relevant, May appear, Should not appear, Totally irrelevant}.
• Citation Context Identification (CCI): The task is to identify textual spans corresponding to citation contexts. More specifically, given a sentence x = (x_1, ..., x_n), the task is to decide on a label for each of the tokens x_i. The possible labels are Begin Citation Context, Inside Citation Context, and Outside.

Dataset. For training our models, we used an extension of the Dr. Inventor Corpus (Fisas et al., 2015, 2016), which we annotated with fine-grained argumentation structures (Lauscher et al., 2018). The corpus consists of 40 scientific publications in the field of computer graphics and, besides our annotations of argumentative components, offers four layers of annotation, three of which are on the sentence level (DRC, SAC, SRC). Our argument annotation scheme includes three types of argumentative components:
• Background claim: A statement of argumentative nature that is about, or closely related to, the work of others, common practices in a research field, or background facts related to the topic of the publication.
• Own claim: A statement of argumentative nature that relates to the authors' own work and contribution.
• Data: A fact that serves as evidence pro or against a claim.
More details on the argument-extended corpus we use to train our models can be found in the accompanying resource paper (Lauscher et al., 2018). For more details on the original annotation layers of the Dr. Inventor Corpus, we refer the reader to Fisas et al. (2015, 2016). In Table 1, we provide an overview of all labels for all five scitorics tasks that ArguminSci is capable of recognizing.
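To make the combination of the B-I-O scheme with the three component types concrete, the following minimal Python sketch derives token-level ACI tags from component spans. The helper function, the example sentence, and the exact label strings (e.g., "B-own_claim") are illustrative assumptions, not the corpus's actual tag names.

```python
# Sketch: deriving token-level B-I-O labels for argument component
# identification (ACI). The component types follow the paper; the exact
# label-string format is an assumption for illustration.

COMPONENT_TYPES = ["background_claim", "own_claim", "data"]

def spans_to_bio(tokens, spans):
    """Convert (start, end, type) component spans into B-I-O tags.

    `spans` uses token indices, end-exclusive; tokens outside any
    component span receive the "O" (outside) tag.
    """
    tags = ["O"] * len(tokens)
    for start, end, ctype in spans:
        tags[start] = f"B-{ctype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{ctype}"
    return tags

tokens = ["Our", "approach", "outperforms", "the", "baseline", "."]
# A hypothetical annotation: tokens 0-4 form an "own claim".
print(spans_to_bio(tokens, [(0, 5, "own_claim")]))
```

The same scheme, with Begin/Inside/Outside labels and no component types, covers the CCI task.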

Annotation Models.
At the core of ArguminSci is a collection of bidirectional recurrent networks with long short-term memory cells (Bi-LSTMs) (Hochreiter and Schmidhuber, 1997), one pre-trained for each of the five annotation tasks on our argumentatively extended Dr. Inventor Corpus (Fisas et al., 2015, 2016; Lauscher et al., 2018).
Model Descriptions. As ArguminSci addresses (1) two token-level sequence tagging tasks and (2) three sentence-level classification tasks, the system implements two types of models:
• Token-level Sequence Labeling: Given a sentence x = (x_1, ..., x_n) with words x_i, we first look up the vector representations e_i (i.e., pre-trained word embeddings) of the words x_i. Next, we run a Bi-LSTM and obtain the sentence-contextualized representation h_i for each token by concatenating the forward and backward LSTM states:

h_i = [LSTM_fw(e_1, ..., e_i); LSTM_bw(e_n, ..., e_i)].

Finally, we feed the vector h_i into a single-layer feed-forward network and apply a softmax function on its output to predict the label probability distribution for each token:

ŷ_i = softmax(W h_i + b),

with W ∈ R^(2K×|Y|) being the weight matrix, b ∈ R^(|Y|) the bias vector, and K the state size of the LSTMs.
• Sentence-level Classification: The sentence-level classification builds upon the output of the Bi-LSTM described above. Following Yang et al. (2016), we first obtain a sentence representation by aggregating the individual hidden representations h_i of the words using an intra-sentence attention mechanism:

s = Σ_i α_i h_i.

The individual weights α_i are computed as follows:

α_i = exp(u_att^T u_i) / Σ_j exp(u_att^T u_j), with u_i = tanh(W_att h_i + b_att),

with u_att as the trainable attention head vector and the matrix U containing the Bi-LSTM-contextualized token representations transformed through a single-layer feed-forward network with non-linear activation (i.e., we first non-linearly transform the vectors h_i and stack the transformations u_i to form the matrix U). Analogous to the above-mentioned token-level sequence tagging model, in the last step we apply a feed-forward net with a softmax layer to get the class predictions from the obtained attention-based sentence representation s.
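Both model heads above can be sketched in plain Python. This is a toy illustration of the math only (the per-token softmax head and the intra-sentence attention aggregation), not the tool's actual TensorFlow implementation; all dimensions and weight values are made up.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x, b):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]

def token_label_distribution(h, W, b):
    """Token-level head: softmax(W h_i + b) for one contextualized state."""
    return softmax(matvec(W, h, b))

def attention_sentence_repr(H, W_att, b_att, u_att):
    """Sentence-level aggregation: u_i = tanh(W_att h_i + b_att),
    alpha = softmax over u_att . u_i, and s = sum_i alpha_i h_i."""
    U = [[math.tanh(v) for v in matvec(W_att, h, b_att)] for h in H]
    scores = [sum(ua * ui for ua, ui in zip(u_att, u)) for u in U]
    alphas = softmax(scores)
    dim = len(H[0])
    s = [sum(a * h[d] for a, h in zip(alphas, H)) for d in range(dim)]
    return s, alphas

# Toy example: three tokens with 2K = 2 dimensional Bi-LSTM states
# and |Y| = 3 labels for the token-level task.
H = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W = [[0.1, -0.2], [0.3, 0.0], [0.0, 0.2]]
b = [0.0, 0.1, -0.1]
per_token = [token_label_distribution(h, W, b) for h in H]

W_att = [[0.2, -0.1], [0.3, 0.4]]
b_att = [0.0, 0.1]
u_att = [1.0, -1.0]
s, alphas = attention_sentence_repr(H, W_att, b_att, u_att)
print(alphas)  # attention weights over the three tokens, summing to 1
```

Each per-token output and the attention vector are valid probability distributions, which is the property the softmax layers are there to guarantee.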
We implemented all models in Python, using the TensorFlow framework. 5
Model Performance. We evaluated the performance of our models on a held-out test set, which comprises 12 randomly selected publications from our corpus (roughly 30% of the corpus, totaling 2,874 sentences). In Table 2, we report the results in terms of F1 score, macro-averaged over the task labels.

Interfaces
We offer three different modes of access to ArguminSci: (1) using a command line tool, (2) via a RESTful application programming interface, and (3) using a web application.
Command Line Tool. The first interface ArguminSci offers is a command line tool, invokable with Python. The script must be provided with two mandatory arguments defining the path to the input file containing the text to be annotated and the path to the output folder where the processing results (i.e., annotated text) will be stored. Furthermore, there are five optional flags that define the types of analysis to perform, each corresponding to one of the scitorics tasks. For example, to run ACI and DRC on the input text, the user should set the flags --argumentation and --discourse, respectively. Figure 1 shows the help content for the command line tool.
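The described interface could be reproduced with Python's standard argparse module roughly as follows. Only the flags --argumentation and --discourse are named in the text; the script description, the positional argument names, and the remaining three flag names are assumptions chosen for illustration.

```python
import argparse

# Sketch of a command line interface like the one described above.
# The two flags --argumentation and --discourse are taken from the text;
# the positional argument names and the other three flags are hypothetical.
def build_parser():
    parser = argparse.ArgumentParser(
        description="Annotate scitorics in scientific text.")
    parser.add_argument("input_file", help="path to the text to annotate")
    parser.add_argument("output_folder", help="where annotated text is stored")
    parser.add_argument("--argumentation", action="store_true",
                        help="argument component identification (ACI)")
    parser.add_argument("--discourse", action="store_true",
                        help="discourse role classification (DRC)")
    parser.add_argument("--aspect", action="store_true",
                        help="subjective aspect classification (SAC)")
    parser.add_argument("--summary", action="store_true",
                        help="summary relevance classification (SRC)")
    parser.add_argument("--citation", action="store_true",
                        help="citation context identification (CCI)")
    return parser

# Simulate the example invocation from the text: run ACI and DRC only.
args = build_parser().parse_args(
    ["paper.txt", "out/", "--argumentation", "--discourse"])
print(args.argumentation, args.discourse, args.citation)
```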
RESTful Application Programming Interface. The application programming interface (API) provides one main HTTP POST endpoint, which expects a string parameter text to be submitted. A second parameter api_mode acts as a flag for setting the output format of the predictions (i.e., annotated text) to JSON. A cURL request to our RESTful interface has the following format:

curl --request POST --url http://<host>/predict --data 'text=<text>&api_mode=True'

For example, given the text "Our model performs best.", the API will return a JSON object with a nested structure. In order to enable developers and researchers to use ArguminSci as an HTTP service, we make the RESTful API publicly accessible. 6 For the implementation of the API, we used the Flask framework in Python. 7
Web Application. Finally, the third option for accessing ArguminSci is the web application, based on the template rendering engine Jinja2 8 and the front-end library Bootstrap. 9 We adopt a lean and simple design with a single interaction screen, in which the user can enter the text she wants to annotate with ArguminSci's scitorics annotation models (see Figure 2). Figures 3a and 3b depict the results of the processing. The result is displayed in a tab control in the middle of the screen; the different annotation layers can be accessed via the tab navigation. The spans of the input text are highlighted with colors indicating the different labels predicted by ArguminSci's neural models.
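The cURL example above can be mirrored with Python's standard library. The sketch below only builds the request object (no network call is made); the <host> placeholder is kept from the text, and the helper function name is our own.

```python
from urllib import parse, request

# Sketch: building the POST request for the /predict endpoint shown above.
# The request body mirrors the cURL example; to actually send it, replace
# the <host> placeholder and call urllib.request.urlopen(req).
def build_predict_request(host, text):
    data = parse.urlencode({"text": text, "api_mode": "True"}).encode("utf-8")
    return request.Request(f"http://{host}/predict", data=data, method="POST")

req = build_predict_request("<host>", "Our model performs best.")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```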

Conclusion
Scientific publications, as tools of persuasion (Gilbert, 1977), are highly argumentative and carefully composed texts in which explicit arguments are intertwined with other rhetorical aspects of scientific writing. In this paper, we presented ArguminSci, a tool that offers a holistic analysis of scientific publications through a set of rhetorical and argumentative aspects of scientific writing we collectively dub scitorics. The ArguminSci tool encompasses pre-trained recurrent neural models for two token-level sequence tagging tasks (identification of argumentative components and citation contexts) and three sentence classification tasks (discourse roles, subjective aspects, and summary relevance).
ArguminSci's functionality can be accessed in three different ways: as a command line tool, via a RESTful application programming interface, and as a web application. In future work, we intend to expose the training phase for the models as well. We also plan to allow for different annotation schemes and to extend the tool with models for other scitorics tasks, such as citation purpose and citation polarity classification.