Natural Language Processing has broadened in scope to tackle more and more challenging language understanding and reasoning tasks. The core NLP tasks remain predominantly unimodal, focusing on linguistic input, despite the fact that we, humans, acquire and use language while communicating in perceptually rich environments. Moving towards human-level AI will require the integration and modeling of multiple modalities beyond language. With this tutorial, our aim is to introduce researchers to the areas of NLP that have dealt with multimodal signals. The key advantage of using multimodal signals in NLP tasks is the complementarity of the data in different modalities. For example, we are less likely to nd descriptions of yellow bananas or wooden chairs in text corpora, but these visual attributes can be readily extracted directly from images. Multimodal signals, such as visual, auditory or olfactory data, have proven useful for models of word similarity and relatedness, automatic image and video description, and even predicting the associated smells of words. Finally, multimodality offers a practical opportunity to study and apply multitask learning, a general machine learning paradigm that improves generalization performance of a task by using training signals of other related tasks.All material associated to the tutorial will be available at http://multimodalnlp.github.io/
Argumentation and debating represent primary intellectual activities of the human mind. People in all societies argue and debate, not only to convince others of their own opinions but also in order to explore the differences between multiple perspectives and conceptualizations, and to learn from this exploration. The process of reaching a resolution on controversial topics typically does not follow a simple sequence of purely logical steps. Rather it involves a wide variety of complex and interwoven actions. Presumably, pros and cons are identified, considered, and weighed, via cognitive processes that often involve persuasion and emotions, which are inherently harder to formalize from a computational perspective.This wide range of conceptual capabilities and activities, have only in part been studied in fields like CL and NLP, and typically within relatively small sub-communities that overlap the ACL audience. The new field of Computational Argumentation has very recently seen significant expansion within the CL and NLP community as new techniques and datasets start to become available, allowing for the first time investigation of the computational aspects of human argumentation in a holistic manner.The main goal of this tutorial would be to introduce this rapidly evolving field to the CL community. Specifically, we will aim to review recent advances in the field and to outline the challenging research questions - that are most relevant to the ACL audience - that naturally arise when trying to model human argumentation.We will further emphasize the practical value of this line of study, by considering real-world CL and NLP applications that are expected to emerge from this research, and to impact various industries, including legal, finance, healthcare, media, and education, to name just a few examples.The first part of the tutorial will provide introduction to the basics of argumentation and rhetoric. Next, we will cover fundamental analysis tasks in Computational Argumentation, including argumentation mining, revealing argument relations, assessing arguments quality, stance classification, polarity analysis, and more. After the coffee break, we will first review existing resources and recently introduced benchmark data. In the following part we will cover basic synthesis tasks in Computational Argumentation, including the relation to NLG and dialogue systems, and the evolving area of Debate Technologies, defined as technologies developed directly to enhance, support, and engage with human debating. Finally, we will present relevant demos, review potential applications, and discuss the future of this emerging field.
Moving beyond post-editing machine translation, a number of recent research efforts have advanced computer aided translation methods that allow for more interactivity, richer information such as confidence scores, and the completed feedback loop of instant adaptation of machine translation models to user translations.This tutorial will explain the main techniques for several aspects of computer aided translation: confidence measures;interactive machine translation (interactive translation prediction);bilingual concordancers;translation option display;paraphrasing (alternative translation suggestions);visualization of word alignment;online adaptation;automatic reviewing;integration of translation memory;eye tracking, logging, and cognitive user models;For each of these, the state of the art and open challenges are presented. The tutorial will also look under the hood of the open source CASMACAT toolkit that is based on MATECAT, and available as a "Home Edition" to be installed on a desktop machine. The target audience of this tutorials are researchers interested in computer aided machine translation and practitioners who want to use or deploy advanced CAT technology.
Representing the semantics of linguistic items in a machine interpretable form has been a major goal of Natural Language Processing since its earliest days. Among the range of different linguistic items, words have attracted the most research attention. However, word representations have an important limitation: they conflate different meanings of a word into a single vector. Representations of word senses have the potential to overcome this inherent limitation. Indeed, the representation of individual word senses and concepts has recently gained in popularity with several experimental results showing that a considerable performance improvement can be achieved across different NLP applications upon moving from word level to the deeper sense and concept levels. Another interesting point regarding the representation of concepts and word senses is that these models can be seamlessly applied to other linguistic items, such as words, phrases, sentences, etc.This tutorial will first provide a brief overview of the recent literature concerning word representation (both count based and neural network based). It will then describe the advantages of moving from the word level to the deeper level of word senses and concepts, providing an extensive review of state of the art systems. Approaches covered will not only include those which draw upon knowledge resources such as WordNet, Wikipedia, BabelNet or FreeBase as reference, but also the so called multi prototype approaches which learn sense distinctions by using different clustering techniques. Our tutorial will discuss the advantages and potential limitations of all approaches, showing their most successful applications to date. We will conclude by presenting current open problems and lines of future work.
Neural Machine Translation (NMT) is a simple new architecture for getting machines to learn to translate. Despite being relatively new (Kalchbrenner and Blunsom, 2013; Cho et al., 2014; Sutskever et al., 2014), NMT has already shown promising results, achieving state-of-the-art performances for various language pairs (Luong et al, 2015a; Jean et al, 2015; Luong et al, 2015b; Sennrich et al., 2016; Luong and Manning, 2016). While many of these NMT papers were presented to the ACL community, research and practice of NMT are only at their beginning stage. This tutorial would be a great opportunity for the whole community of machine translation and natural language processing to learn more about a very promising new approach to MT. This tutorial has four parts.In the first part, we start with an overview of MT approaches, including: (a) traditional methods that have been dominant over the past twenty years and (b) recent hybrid models with the use of neural network components. From these, we motivate why an end-to-end approach like neural machine translation is needed. The second part introduces a basic instance of NMT. We start out with a discussion of recurrent neural networks, including the back-propagation-through-time algorithm and stochastic gradient descent optimizers, as these are the foundation on which NMT builds. We then describe in detail the basic sequence-to-sequence architecture of NMT (Cho et al., 2014; Sutskever et al., 2014), the maximum likelihood training approach, and a simple beam-search decoder to produce translations.The third part of our tutorial describes techniques to build state-of-the-art NMT. We start with approaches to extend the vocabulary coverage of NMT (Luong et al., 2015a; Jean et al., 2015; Chitnis and DeNero, 2015). We then introduce the idea of jointly learning both translations and alignments through an attention mechanism (Bahdanau et al., 2015); other variants of attention (Luong et al., 2015b; Tu et al., 2016) are discussed too. We describe a recent trend in NMT, that is to translate at the sub-word level (Chung et al., 2016; Luong and Manning, 2016; Sennrich et al., 2016), so that language variations can be effectively handled. We then give tips on training and testing NMT systems such as batching and ensembling. In the final part of the tutorial, we briefly describe promising approaches, such as (a) how to combine multiple tasks to help translation (Dong et al., 2015; Luong et al., 2016; Firat et al., 2016; Zoph and Knight, 2016) and (b) how to utilize monolingual corpora (Sennrich et al., 2016). Lastly, we conclude with challenges remained to be solved for future NMT.PS: we would also like to acknowledge the very first paper by Forcada and Ñeco (1997) on sequence-to-sequence models for translation!
The development of game theory in the early 1940's by John von Neumann was a reaction against the then dominant view that problems in economic theory can be formulated using standard methods from optimization theory. Indeed, most real-world economic problems involve conflicting interactions among decision-making agents that cannot be adequately captured by a single (global) objective function. The main idea behind game theory is to shift the emphasis from optimality criteria to equilibrium conditions. Game theory provides a framework to model complex scenarios, with applications in economics and social science but also in different fields of information technology. With the recent development of algorithmic game theory, it has been used to solve problems in computer vision, pattern recognition, machine learning and natural language processing.Game-theoretic frameworks have been used in different ways to study language origin and evolution. Furthermore, the so-called game metaphor has been used by philosophers and linguists to explain how language evolved and how it works. Ludwig Wittgenstein, for example, famously introduced the concept of a language game to explain the conventional nature of language, and put forward the idea of the spontaneous formation of a common language that gradually emerges from the interactions among the speakers within a population.This concept opens the way to the interpretation of language as a complex adaptive system composed of linguistic units and their interactions, which gives rise to the emergence of structural properties. It is the core part of many computational models of language that are based on classical game theory and evolutionary game theory. With the former it is possible to model how speakers form a signaling system in which the ambiguity of the symbols is minimized; with the latter it is possible to model how speakers coordinate their linguistic choices according to the satisfaction that they have about the outcome of a communication act, converging to a common language. In the same vein, many other attempts have been proposed to explain how other characteristics of language follow similar dynamics.Game theory, and in particular evolutionary game theory, thanks to their ability to model interactive situations and to integrate information from multiple sources, have also been used to solve specific problems in natural language processing and information retrieval, such as language generation, word sense disambiguation and document and text clustering.The goal of this tutorial is to offer an introduction to the basic concepts of game theory and to show its main applications in the study of language, from different perspectives. We shall assume no pre-existing knowledge of game theory by the audience, thereby making the tutorial self-contained and understandable by a non-expert.
Billions of short texts are produced every day, in the form of search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Unlike documents, short texts have some unique characteristics which make them difficult to handle. First, short texts, especially search queries, do not always observe the syntax of a written language. This means traditional NLP techniques, such as syntactic parsing, do not always apply to short texts. Second, short texts contain limited context. The majority of search queries contain less than 5 words, and tweets can have no more than 140 characters. Because of the above reasons, short texts give rise to a significant amount of ambiguity, which makes them extremely difficult to handle. On the other hand, many applications, including search engines, ads, automatic question answering, online advertising, recommendation systems, etc., rely on short text understanding. In all these applications, the necessary first step is to transform an input text into a machine-interpretable representation, namely to "understand" the short text. A growing number of approaches leverage external knowledge to address the issue of inadequate contextual information that accompanies the short texts. These approaches can be classified into two categories: Explicit Representation Model (ERM) and Implicit Representation Model (IRM). In this tutorial, we will present a comprehensive overview of short text understanding based on explicit semantics (knowledge graph representation, acquisition, and reasoning) and implicit semantics (embedding and deep learning). Specifically, we will go over various techniques in knowledge acquisition, representation, and inferencing has been proposed for text understanding, and we will describe massive structured and semi-structured data that have been made available in the recent decade that directly or indirectly encode human knowledge, turning the knowledge representation problems into a computational grand challenge with feasible solutions insight.
The ubiquity of metaphor in language (Lakoff and Johnson 1980) has served as impetus for cognitive linguistic approaches to the study of language, mind, and the study of mind (e.g. Thibodeau & Boroditsky 2011). While native speakers use metaphor naturally and easily, the treatment and interpretation of metaphor in computational systems remains challenging because such systems have not succeeded in developing ways to recognize the semantic elements that define metaphor. This tutorial demonstrates MetaNet's frame-based semantic analyses, and their informing of MetaNet's automatic metaphor identification system. Participants will gain a complete understanding of the theoretical basis and the practical workings of MetaNet, and acquire relevant information about the Frame Semantics basis of that knowledge base and the way that FrameNet handles the widespread phenomenon of metaphor in language. The tutorial is geared to researchers and practitioners of language technology, not necessarily experts in metaphor analysis or knowledgeable about either FrameNet or MetaNet, but who are interested in natural language processing tasks that involve automatic metaphor processing, or could benefit from exposure to tools and resources that support frame-based deep semantic, analyses of language, including metaphor as a widespread phenomenon in human language.