An Online Annotation Assistant for Argument Schemes

Understanding the inferential principles underpinning an argument is essential to the proper interpretation and evaluation of persuasive discourse. Argument schemes capture the conventional patterns of reasoning appealed to in persuasion. The empirical study of these patterns relies on the availability of data about the actual use of argumentation in communicative practice. Annotated corpora of argument schemes, however, are scarce, small, and unrepresentative. Aiming to address this issue, we present one step in the development of improved datasets by integrating the Argument Scheme Key – a novel annotation method based on one of the most popular typologies of argument schemes – into the widely used OVA software for argument analysis.


Introduction
In argumentative discourse, a speaker or writer intends to convince their audience of a contested point of view (van Eemeren et al., 2014). To convince their audience, an appeal is made to reasoning, either in direct conversation (such as a courtroom discussion), or in indirect or monological settings (such as a political speech). The argumentative quality of such discourse can be evaluated from various perspectives. In the current paper, we focus on the argumentative quality in terms of the acceptability of the reasoning appealed to in the arguments – thus disregarding, e.g., the rhetorical effectiveness, another dimension of the quality of argumentative discourse.
Consider Hillary Clinton's argument in Example (1) – taken from the US2016 annotated corpus of television debates in the lead-up to the 2016 US presidential elections (Visser et al., 2019a). Anticipating that her first asserted proposition might not be outright acceptable to the entire audience, she provides a reason in support. Clinton defends her policy proposal by comparing the dangers of potential terrorists flying to the dangers of them buying guns; her argument thus relies on a conventionalised reasoning pattern: that comparable situations should be dealt with similarly.

(1) Hillary Clinton: And we finally need to pass a prohibition on anyone who's on the terrorist watch list from being able to buy a gun in our country. If you're too dangerous to fly, you are too dangerous to buy a gun.
Evaluating an argument begins by identifying the reasoning pattern it is based on. These common reasoning patterns are conceptualised within the field of argumentation theory as 'argument schemes' (Section 2). While corpus-linguistic approaches have gained traction in the study of argumentation – partly motivated by the rise of 'argument mining' (Stede and Schneider, 2018) – these have generally focused on aspects of argumentative discourse other than argument schemes, such as the use of rhetorical figures of speech (Harris and Di Marco, 2017). The empirical study of argument schemes would greatly benefit from quantitative data in the form of annotated text corpora. Existing corpora annotated with argument schemes, however, tend to be based on restricted typologies, be of limited size, or suffer from poor validation (Section 3).
In the current paper, we aim to support the annotation of argument schemes by combining a recently developed annotation method for one of the leading typologies of argument schemes (Section 4) with a popular online software tool for annotating argumentative discourse, OVA (Section 5). The standard version of OVA, and other software for manual argument annotation, such as Araucaria (Reed and Rowe, 2004), Rationale (van Gelder, 2007), and Carneades (Gordon et al., 2007), allow the analyst to label arguments with a particular scheme, but they offer no support in the actual scheme selection; providing that support is the aim of our OVA extension.

Argument Schemes
Argument schemes are theoretical abstractions of the conventional patterns of reasoning appealed to in persuasive communication, substantiating the inferential relation between premise(s) and conclusion. The defeasibility of the schemes sets them apart from the strict reasoning patterns of classical formal logic (e.g., Modus Ponens). The type of argument scheme determines its evaluation criteria, commonly expressed as critical questions – owing to the dialectical origins of the notion (van Eemeren and Garssen, 2019). Adequately arguing for a standpoint implies both that the premise(s) of the argument should be acceptable, and that the argumentative connection between the premise(s) and the conclusion can withstand the critical questioning.
Walton's typology comprises a great variety of schemes, conventionally occurring in argumentative practices ranging from colloquial discussion to legal adjudication (Walton et al., 2008). Many of the schemes are commonly distinguished in dialectical or informal-logical approaches to argumentation (e.g. argument from sign and argument from cause to effect). Others, however, are more exotic or highly specialised (e.g. argument from arbitrariness of a verbal classification), are closer to modes of persuasion in a rhetorical perspective on argumentation (e.g. ethotic argument), or would in other approaches be considered fallacies (e.g. generic ad hominem). The list also includes composite schemes that combine aspects from various schemes into one (e.g. practical reasoning from analogy combining practical reasoning and argument from analogy).

Annotating Argument Schemes
The annotation of argument schemes comprises the classification of the inferential relations between premises and conclusions of arguments in accordance with a particular typology. Figure 1 shows a diagrammatic visualisation of the argument of Example (1), with the classification of the argument scheme – an instance of practical reasoning from analogy – shown in the middle. While we start from Walton's typology, alternative approaches are also employed for scheme identification: for example, Green (2015) presents ten custom argument schemes for genetics research articles, and Musi et al. (2016) propose annotation guidelines based on an alternative typology.

Existing annotations on the basis of Walton's typology tend to use a restricted set of scheme types, and struggle to obtain replicable results. For example, Duschl (2007) initially adopts a selection of nine argument schemes described by Walton (1996) for his annotation of transcribed middle-school student interviews about science fair projects. Later, however, he collapses several schemes into four more general classes no longer directly related to particular scheme types. This deviation from Walton's typology appears to be motivated by the need to improve annotation agreement. The validation of the annotation method does not account for chance agreement, providing only percentage-agreement scores instead of chance-corrected measures.

The Argument Scheme Key (ASK)
Visser et al. (2018) aim to develop an annotation procedure that stays close to Walton's original typology, while facilitating the reliable annotation of a broad range of argument schemes. The resulting method is reported to yield an inter-annotator agreement of 0.723 (in terms of Cohen's (1960) κ) on a 10.2% random sample. The main principle guiding the annotation is the clustering of argument schemes on the basis of intuitively clear features recognisable for annotators. Due to the strong reliance on the distinctive properties of arguments that are characteristic of a particular scheme, the annotation procedure bears a striking resemblance to methods for biological taxonomy – the identification of organisms in the various subfields of biology (see, e.g., Voss (1952); Pankhurst (1978)). Drawing on the biological analogue and building on the guidelines used by Visser et al. (2018), we developed a taxonomic key for the identification of argument schemes in accordance with Walton's typology: the Argument Scheme Key – or ASK.
The ASK (reproduced in Appendix A) is a dichotomous identification key that leads the analyst through a series of disjunctive choices – based on the distinctive features of a 'species' of argument scheme – to the particular type. Starting from the distinction between source-based and other arguments, each further choice in the key leads either to a particular argument scheme or to a further distinction. The distinctive characteristics are numbered; whenever the characteristic that led to a particular point in the key is not the directly preceding one, its number is listed in brackets.
In annotating Example (1), an analyst using the ASK follows a sequence of numbered characteristics to identify the argument as an instance of practical reasoning from analogy: 1. Argument does not depend on a source's opinion or character; 17(1). Conclusion is about a course of action; 18. Argument hinges on another motivation for the action [other than its outcome]; 19. Course of action is compared to a similar or alternative action; 21(19). Action is directly compared to another.
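The decision procedure this sequence follows can be sketched in a few lines of code. The fragment below is a minimal illustration, not the actual ASK: only the five decision points on the path of Example (1) are encoded, and the unlabelled branch targets and the alternative scheme label at point 21 are hypothetical placeholders.

```python
# A sketch of the ASK decision procedure for Example (1). Each numbered
# decision point maps to (question, outcome-if-yes, outcome-if-no); an
# outcome is either another decision point (int) or a scheme label (str).
KEY_FRAGMENT = {
    1:  ("depends on a source's opinion or character",
         2,    # source-based arguments (placeholder, not encoded here)
         17),
    17: ("conclusion is about a course of action",
         18,
         30),  # placeholder
    18: ("argument hinges on the action's outcome",
         20,   # placeholder
         19),
    19: ("course of action is compared to a similar or alternative action",
         21,
         22),  # placeholder
    21: ("action is directly compared to another",
         "practical reasoning from analogy",
         "practical reasoning from alternatives"),  # placeholder label
}

def identify(answers):
    """Walk the key; `answers` maps a decision point to True/False."""
    node = 1
    while isinstance(node, int):
        _question, if_yes, if_no = KEY_FRAGMENT[node]
        node = if_yes if answers[node] else if_no
    return node

# The analyst's choices for Example (1), as listed above:
example_1 = {1: False, 17: True, 18: False, 19: True, 21: True}
print(identify(example_1))  # practical reasoning from analogy
```

Each answer either terminates in a scheme label or moves the walk to the next decision point, mirroring the analyst's path through the printed key.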
The ASK dichotomous identification key can be thought of as a linear textual rendering of a binary taxonomic tree. Figure 2 visualises the decision procedure as such a tree, with each leaf representing an argument scheme label, and all internal nodes showing clusters of schemes that share particular characteristic properties. For each of the numbered binary decision points in the ASK, the tree representation branches into two, thus leading the annotator from the full set of schemes, through their binary choices, to one (and only one) leaf – i.e., an argument scheme classification.

More complex argument structures (Snoeck Henkemans, 1992) can, in turn, be formed by connecting an I-node to an existing S-node, or by chaining the connections.
Whilst the original version of OVA allows a user to label any RA-node as an instance of an argument scheme from Walton's typology by selecting from a dropdown list, in this work we have introduced the option for users to be guided through this process using the ASK. To achieve this, the ASK is first converted into JSON, a fragment of which is shown in Listing 1 (we have also made the full JSON representation available online 1 for download and integration into other argumentation tools). Each branching point in the ASK has two options, which are represented by their text and a result – where the result can either be a scheme name ("resulttype": "scheme") or a pointer to another branching point ("resulttype": "branch").
Listing 1: A fragment of the ASK in JSON

{"id": "existing-character",
 "options": [
   { "text": "Argument relies on the source's good character",
     "result": "Ethotic argument",
     "resulttype": "scheme" },
   { "text": "Argument relies on bad character",
     "result": "negative-character",
     "resulttype": "branch" } ] }

When a user elects to use the ASK to help them select an argument scheme, they are presented with a series of modal dialogue boxes similar to that shown in Figure 3. At each stage, the user selects one of the options and is then either presented with the next dialogue box, or they reach a scheme classification which they can choose to accept and apply. An ordered list of user selections at each stage is recorded so that they can step back through the options if they wish to correct an earlier choice.
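To illustrate how a tool might consume this representation, the sketch below traverses a two-point key with the structure of Listing 1, recording a history of selections as described above. The "existing-character" branching point is taken from Listing 1; the "start" branching point and the "Default inference" label are invented here for the sake of a self-contained example.

```python
import json

# An illustrative stand-in for the full ASK JSON file: a list of branching
# points, each with an "id" and two "options" carrying a "text", a "result",
# and a "resulttype" of either "scheme" (a leaf) or "branch" (a pointer).
ASK_JSON = """
[
 {"id": "start", "options": [
   {"text": "Argument depends on a source",
    "result": "existing-character", "resulttype": "branch"},
   {"text": "Argument does not depend on a source",
    "result": "Default inference", "resulttype": "scheme"}]},
 {"id": "existing-character", "options": [
   {"text": "Argument relies on the source's good character",
    "result": "Ethotic argument", "resulttype": "scheme"},
   {"text": "Argument relies on bad character",
    "result": "negative-character", "resulttype": "branch"}]}
]
"""

branches = {b["id"]: b for b in json.loads(ASK_JSON)}

def walk(choices, start="start"):
    """Follow a sequence of option indices (0 or 1), recording each step so
    a user interface could pop the history to revisit an earlier choice."""
    history, current = [], start
    for choice in choices:
        option = branches[current]["options"][choice]
        history.append((current, option["text"]))
        if option["resulttype"] == "scheme":
            return option["result"], history
        current = option["result"]  # "branch": move to the next point
    return None, history  # ran out of choices before reaching a scheme

scheme, steps = walk([0, 0])
print(scheme)  # Ethotic argument
```

Because every branching point is stored under its "id", stepping back simply means truncating the history and resuming the walk from the last recorded point.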

Conclusion
Identifying the scheme an argument is based on is an important part of evaluating the argumentative quality of discourse. The availability of large, reliable, and representative datasets is essential both to the empirical study of the use of argument schemes in argumentative practice, and to the development of automated classifiers and argument mining techniques. Existing annotated corpora, however, such as those used by Feng and Hirst (2011) and Lawrence and Reed (2015) for the automatic classification of argument schemes, are not validated, are of limited size, or do not represent a broad range of scheme types.
Aiming to improve the availability of high-quality argument scheme corpora, the online annotation assistant we present here combines a novel annotation method for Walton's typology with the widely used OVA software for argument analysis. The Argument Scheme Key (ASK) module is available for annotators in OVA at http://ova.arg.tech. This work constitutes an intermediate step in the development of automated classifiers, utilising the uniquely identifying characteristics of the ASK. Future work will explore the accuracy and robustness of manual annotations by experts, non-experts, and crowd-sourcing (Musi et al., 2016) using the ASK module in OVA.