<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Raojinfeng</id>
	<title>ACL Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Raojinfeng"/>
	<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/Special:Contributions/Raojinfeng"/>
	<updated>2026-04-11T10:23:48Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.6</generator>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG_blog&amp;diff=12954</id>
		<title>Data sets for NLG blog</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG_blog&amp;diff=12954"/>
		<updated>2020-08-06T20:09:33Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This blog is a supplement to [[Data sets for NLG]], which lists comments about these data sets from users, authors and other interested parties.   We are especially interested in comments about appropriate and inappropriate usage of a data set, &amp;quot;best practice&amp;quot; use of a data set, useful additional information about a data set (eg, scope, how it was constructed), and pointers to related data sets which may be more appropriate for some users.   Links to relevant papers and other resources are welcome.&lt;br /&gt;
&lt;br /&gt;
We&#039;d love to see more content here; please email Ehud Reiter (e.reiter@abdn.ac.uk) with contributions or other comments.&lt;br /&gt;
&lt;br /&gt;
=== E2E ===&lt;br /&gt;
The E2E dataset was used in the [http://www.macs.hw.ac.uk/InteractionLab/E2E/ E2E challenge].&lt;br /&gt;
&lt;br /&gt;
=== SumTime ===&lt;br /&gt;
The SumTime corpus is structured as a database, and presented in text (CSV) and MDB (Microsoft Access) formats.&lt;br /&gt;
&lt;br /&gt;
A good example of the use of SumTime is [https://doi.org/10.1017/S1351324907004664 Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models].&lt;br /&gt;
&lt;br /&gt;
=== Tuna ===&lt;br /&gt;
[http://www.lrec-conf.org/proceedings/lrec2010/pdf/251_Paper.pdf Dutch] and [https://www.aclweb.org/anthology/W17-3532 Mandarin] versions of Tuna have been developed.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG ===&lt;br /&gt;
Thiago Castro Ferreira and Diego Moussallem spent six months producing an enriched version of WebNLG with high-quality annotations. This is available on [https://github.com/ThiagoCF05/webnlg GitHub].&lt;br /&gt;
&lt;br /&gt;
The WebNLG dataset was used in the [http://webnlg.loria.fr/pages/results.html WebNLG challenge].&lt;br /&gt;
&lt;br /&gt;
=== Weather ===&lt;br /&gt;
The weather dataset uses tree-structured meaning representations to support discourse-level structuring, and comprises ~30K human-annotated utterances.&lt;br /&gt;
&lt;br /&gt;
This is available on [https://github.com/facebookresearch/TreeNLG GitHub].&lt;br /&gt;
&lt;br /&gt;
=== Weathergov ===&lt;br /&gt;
The Weathergov corpus contains the output of a template-based weather forecast generator, not human-written forecasts ([https://ehudreiter.com/2017/05/09/weathergov/ blog post]). Hence ML on Weathergov is an exercise in reverse engineering a template-based NLG system, not in training an NLG system from human data.  If you want to train on human-written weather forecasts, consider using the [https://github.com/facebookresearch/TreeNLG Weather corpus] and [https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip SumTime corpus] instead.&lt;br /&gt;
&lt;br /&gt;
=== WikiBio ===&lt;br /&gt;
No manual verification or filtering was applied to this dataset [https://ehudreiter.com/2019/09/26/generated-texts-must-be-accurate/#comment-15983]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12953</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12953"/>
		<updated>2020-08-06T20:00:25Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the data set yourself.&lt;br /&gt;
&lt;br /&gt;
We also have a [[Data sets for NLG blog|blog page]] about data sets, which includes comments about appropriate and inappropriate usage, additional information about data sets, and pointers to related resources.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data (Rotowire) ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data   ([[Data sets for NLG blog#E2E|blog comments]])&lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Methodius Corpus ===&lt;br /&gt;
https://www.inf.ed.ac.uk/research/isdd/admin/package?view=1&amp;amp;id=197&lt;br /&gt;
&lt;br /&gt;
This dataset consists of 5000 short texts describing ancient Greek artefacts, generated by the Methodius NLG system. Each text is linked to its corresponding content plan (including rhetorical relations) and OpenCCG logical form (which describes the syntactic structure).&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip  ([[Data sets for NLG blog#SumTime|blog comments]])&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== ToTTo ===&lt;br /&gt;
https://github.com/google-research-datasets/ToTTo/&lt;br /&gt;
&lt;br /&gt;
100,000 examples of descriptions of the content of highlighted cells in a Wikipedia table. &lt;br /&gt;
&lt;br /&gt;
=== Weather ===&lt;br /&gt;
https://github.com/facebookresearch/TreeNLG&lt;br /&gt;
&lt;br /&gt;
~30K human-annotated utterances for tree-structured weather meaning representations.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip  ([[Data sets for NLG blog#Weathergov|blog comments]])&lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
http://webnlg.loria.fr/pages/data.html   ([[Data sets for NLG blog#WebNLG|blog comments]])&lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== WikiBio (Wikipedia biography dataset) ===&lt;br /&gt;
https://github.com/DavidGrangier/wikipedia-biography-dataset   ([[Data sets for NLG blog#WikiBio|blog comments]])&lt;br /&gt;
&lt;br /&gt;
This dataset gathers 728,321 biographies from Wikipedia. It consists of the first paragraph and the infobox (both tokenized).&lt;br /&gt;
&lt;br /&gt;
=== WikiBio German and French (Wikipedia biography dataset) ===&lt;br /&gt;
https://github.com/PrekshaNema25/StructuredData_To_Descriptions&lt;br /&gt;
&lt;br /&gt;
This dataset consists of the first paragraph and the infobox from German and French Wikipedia biography pages.&lt;br /&gt;
&lt;br /&gt;
=== Wikipedia Person and Animal Dataset ===&lt;br /&gt;
https://eaglew.github.io/dataset/narrating&lt;br /&gt;
&lt;br /&gt;
This dataset gathers 428,748 person and 12,236 animal infoboxes with descriptions, based on the Wikipedia dump (2018/04/01) and Wikidata (2018/04/12).&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations and short and long descriptions for 51K companies, in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (eg, &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php  ([[Data sets for NLG blog#Tuna|blog comments]])&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs, in situations designed to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification.&lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowd sourced scene descriptions and an annotated REG corpus, both of which labelled with Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
==Surface Realisation ==&lt;br /&gt;
&lt;br /&gt;
=== Surface Realization Shared Task 2018 (SR&#039;18) dataset ===&lt;br /&gt;
http://taln.upf.edu/pages/msr2018-ws/SRST.html#data&lt;br /&gt;
&lt;br /&gt;
Description: A multilingual dataset automatically converted from the Universal Dependencies v2.0, comprising unordered syntactic structures (10 languages) and predicate-argument structures (3 languages).&lt;br /&gt;
&lt;br /&gt;
=== Finnish morphology ===&lt;br /&gt;
&lt;br /&gt;
https://www.kaggle.com/mikahama/finnish-locative-cases-for-nouns&lt;br /&gt;
&lt;br /&gt;
Dataset for picking the correct locative case for Finnish nouns (e.g. Venäjä&#039;&#039;&#039;llä&#039;&#039;&#039; vs Suome&#039;&#039;&#039;ssa&#039;&#039;&#039;)&lt;br /&gt;
&lt;br /&gt;
https://www.kaggle.com/mikahama/cases-of-complements-of-finnish-verbs&lt;br /&gt;
&lt;br /&gt;
Dataset for picking the right case for objects of verbs in Finnish (e.g. näen talo&#039;&#039;&#039;n&#039;&#039;&#039; vs uneksin talo&#039;&#039;&#039;sta&#039;&#039;&#039;)&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Alex Context NLG Dataset===&lt;br /&gt;
https://github.com/UFAL-DSG/alex_context_nlg_dataset&lt;br /&gt;
&lt;br /&gt;
A dataset for NLG in dialogue systems in the public transport information domain. It includes the preceding context along with each data instance, which should allow NLG systems trained on this data to adapt to the user&#039;s way of speaking and thereby improve perceived naturalness. Papers: http://workshop.colips.org/re-wochat/documents/02_Paper_6.pdf, https://www.aclweb.org/anthology/W16-3622&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a distinct domain. Each data point is a (dialogue act, ground truth, handcrafted baseline) tuple.&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
=== Hotel Dialogs for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/hotels&lt;br /&gt;
&lt;br /&gt;
This set of hotel corpora includes a set of paraphrases, room and property descriptions, and full hotel dialogues aimed at exploring different ways of eliciting dialogic, conversational descriptions about hotels.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== CASS (French) ===&lt;br /&gt;
https://github.com/euranova/CASS-dataset&lt;br /&gt;
&lt;br /&gt;
This dataset is composed of decisions made by the French Court of cassation and summaries of these decisions written by lawyers.&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. With approximately 3 million posts, it is the largest corpus of informal text (such as social media text), and can be used to train neural summarization models.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Paper Generation ==&lt;br /&gt;
&lt;br /&gt;
=== ACL Title and Abstract Dataset ===&lt;br /&gt;
https://github.com/EagleW/ACL_titles_abstracts_dataset&lt;br /&gt;
&lt;br /&gt;
This dataset gathers 10,874 title and abstract pairs from the ACL Anthology Network (until 2016).&lt;br /&gt;
&lt;br /&gt;
=== PubMed Term, Abstract, Conclusion, Title Dataset ===&lt;br /&gt;
https://eaglew.github.io/dataset/paperrobot_writing&lt;br /&gt;
&lt;br /&gt;
This dataset gathers three types of pairs: Title-to-Abstract (Training: 22,811/Development: 2095/Test: 2095), Abstract-to-Conclusion and Future work (Training: 22,811/Development: 2095/Test: 2095), Conclusion and Future work-to-Title (Training: 15,902/Development: 2095/Test: 2095) from PubMed. Each pair contains a pair of input and output as well as the corresponding terms (from the original KB and link prediction results).&lt;br /&gt;
&lt;br /&gt;
==Challenge Data Repository ==&lt;br /&gt;
&lt;br /&gt;
https://sites.google.com/site/genchalrepository/ &lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12672</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12672"/>
		<updated>2019-09-12T23:34:37Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection is designed for the open-domain question answering setting. Given a question and a set of candidate sentences, the task is to choose the correct sentence that contains the exact answer and can sufficiently support the answer choice. &lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions due to different pre-processing in recent publications: both have the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive/negative answers, and thus has only 65 questions in the development set and 68 questions in the test set. &lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
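The MAP and MRR columns in the tables below can be computed from ranked relevance labels; a minimal Python sketch (not taken from any of the cited systems, and written here only to illustrate the metrics):

```python
def mrr(rankings):
    # rankings: list of per-question label lists (1 = relevant, 0 = not),
    # ordered from best-ranked candidate to worst.
    total = 0.0
    for labels in rankings:
        for i, rel in enumerate(labels):
            if rel:
                total += 1.0 / (i + 1)  # reciprocal rank of first relevant hit
                break
    return total / len(rankings)

def average_precision(labels):
    # Precision at each relevant position, averaged over relevant items.
    hits, score = 0, 0.0
    for i, rel in enumerate(labels):
        if rel:
            hits += 1
            score += hits / (i + 1)
    return score / hits if hits else 0.0

def mean_ap(rankings):
    # MAP: mean of per-question average precision.
    return sum(average_precision(labels) for labels in rankings) / len(rankings)
```

For example, a question whose first relevant candidate is ranked second contributes a reciprocal rank of 0.5 to MRR.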
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - Holographic Dual LSTM Architecture&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.815&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.770&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2019) - Hybrid Co-Attention Network (HCAN)&lt;br /&gt;
| Rao et al. (2019)&lt;br /&gt;
| 0.774&lt;br /&gt;
| 0.843&lt;br /&gt;
|-&lt;br /&gt;
| Tayyar Madabushi (2018) - Question Classification + PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Tayyar Madabushi et al. (2018)&lt;br /&gt;
| 0.836&lt;br /&gt;
| 0.863&lt;br /&gt;
|-&lt;br /&gt;
| Kamath (2019) - Question Classification + RNN + Pre-Attention&lt;br /&gt;
| Kamath et al. (2019)&lt;br /&gt;
| 0.852&lt;br /&gt;
| 0.891&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al.  (2016) - L.D.C Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.784&lt;br /&gt;
| 0.865&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al.  (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al.  (2017) - BiMPM&lt;br /&gt;
| Wang et al.  (2017)&lt;br /&gt;
| 0.802&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Bian et al.  (2017) - Compare-Aggregate&lt;br /&gt;
| Bian et al.  (2017)&lt;br /&gt;
| 0.821&lt;br /&gt;
| 0.899&lt;br /&gt;
|-&lt;br /&gt;
| Shen et al.  (2017) - IWAN&lt;br /&gt;
| Shen et al.  (2017)&lt;br /&gt;
| 0.822&lt;br /&gt;
| 0.889&lt;br /&gt;
|-&lt;br /&gt;
| Tran et al. (2018) - IWAN + sCARNN&lt;br /&gt;
| Tran et al. (2018)&lt;br /&gt;
| 0.829&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2018) - Multi-Cast Attention Networks (MCAN)&lt;br /&gt;
| Tay et al. (2018)&lt;br /&gt;
| 0.838&lt;br /&gt;
| 0.904&lt;br /&gt;
|-&lt;br /&gt;
| Tayyar Madabushi (2018) - Question Classification + PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Tayyar Madabushi et al. (2018)&lt;br /&gt;
| 0.865&lt;br /&gt;
| 0.904&lt;br /&gt;
|-&lt;br /&gt;
| Yoon et al. (2019) - Compare-Aggregate + LanguageModel + LatentClustering&lt;br /&gt;
| Yoon et al. (2019)&lt;br /&gt;
| 0.868&lt;br /&gt;
| 0.928&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* E. Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
* Yi Tay, Minh C. Phan, Luu Anh Tuan and Siu Cheung Hui. 2017. [https://arxiv.org/abs/1707.06372 Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture]. In SIGIR 2017.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2017. [https://arxiv.org/pdf/1707.07847 Enabling Efficient Question Answer Retrieval via Hyperbolic Neural Networks]. In eprint arXiv:1707.07847.&lt;br /&gt;
[[Category:State of the art]]&lt;br /&gt;
* Zhiguo Wang, Wael Hamza and Radu Florian. 2017.  [https://arxiv.org/pdf/1702.03814.pdf Bilateral Multi-Perspective Matching for Natural Language Sentences]. In eprint arXiv:1702.03814.&lt;br /&gt;
* Weijie Bian, Si Li, Zhao Yang, Guang Chen, Zhiqing Lin. 2017. [https://dl.acm.org/citation.cfm?id=3133089&amp;amp;CFID=791659397&amp;amp;CFTOKEN=43388059 A Compare-Aggregate Model with Dynamic-Clip Attention for Answer Selection]. In CIKM 2017.&lt;br /&gt;
* Gehui Shen, Yunlun Yang, Zhi-Hong Deng. 2017. [https://aclanthology.info/pdf/D/D17/D17-1122.pdf Inter-Weighted Alignment Network for Sentence Pair Modeling]. In EMNLP 2017.&lt;br /&gt;
* Quan Hung Tran, Tuan Manh Lai, Gholamreza Haffari, Ingrid Zukerman, Trung Bui, Hung Bui. 2018. [http://www.aclweb.org/anthology/N18-1115 The Context-dependent Additive Recurrent Neural Net]. In NAACL 2018.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2018. [https://arxiv.org/abs/1806.00778 Multi-Cast Attention Networks]. In KDD 2018.&lt;br /&gt;
* Harish Tayyar Madabushi, Mark Lee and John Barnden. 2018. [https://aclanthology.coli.uni-saarland.de/papers/C18-1278/c18-1278 Integrating Question Classification and Deep Learning for improved Answer Selection]. In COLING 2018.&lt;br /&gt;
* Seunghyun Yoon, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Kyomin Jung. 2019. [https://arxiv.org/abs/1905.12897 A Compare-Aggregate Model with Latent Clustering for Answer Selection]. In eprint arXiv:1905.12897.&lt;br /&gt;
* Sanjay Kamath, Brigitte Grau and Yue Ma. 2019. [https://hal.archives-ouvertes.fr/hal-02104488/ Predicting and Integrating Expected Answer Types into a Simple Recurrent Neural Network Model for Answer Sentence Selection]. In CICLing 2019.&lt;br /&gt;
* Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin. 2019. [https://jinfengr.github.io/publications/Rao_etal_EMNLP2019.pdf Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling]. In EMNLP 2019.&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12240</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12240"/>
		<updated>2018-06-01T18:27:38Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection arises in the open-domain question answering setting: given a question and a set of candidate sentences, the goal is to choose the sentence that contains the exact answer and can sufficiently support that answer choice.&lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions because of different pre-processing in later publications: both share the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive (or only negative) answer candidates, and thus has only 65 questions in the development set and 68 questions in the test set.&lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
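The MAP and MRR figures reported in the tables below are standard ranking metrics computed per question over the ranked candidate list. As a minimal illustration (plain Python; the function names are ours, not part of any cited system or evaluation script), given each question's candidates as a ranked list of 0/1 relevance labels:

```python
def average_precision(labels):
    """AP for one ranked candidate list (1 = relevant sentence)."""
    hits, precisions = 0, []
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant rank
    return sum(precisions) / hits if hits else 0.0

def reciprocal_rank(labels):
    """RR: inverse rank of the first relevant candidate."""
    for rank, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def map_mrr(ranked_label_lists):
    """Mean AP and mean RR over all questions."""
    n = len(ranked_label_lists)
    return (sum(average_precision(l) for l in ranked_label_lists) / n,
            sum(reciprocal_rank(l) for l in ranked_label_lists) / n)
```

Note how questions with no relevant candidate contribute 0 to both means, which is one reason the Raw and Clean versions (the latter drops such questions) yield incomparable scores.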
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - Holographic Dual LSTM Architecture&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.815&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.770&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - LDC Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.784&lt;br /&gt;
| 0.865&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al.  (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al.  (2017) - BiMPM&lt;br /&gt;
| Wang et al.  (2017)&lt;br /&gt;
| 0.802&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Bian et al.  (2017) - Compare-Aggregate&lt;br /&gt;
| Bian et al.  (2017)&lt;br /&gt;
| 0.821&lt;br /&gt;
| 0.899&lt;br /&gt;
|-&lt;br /&gt;
| Shen et al.  (2017) - IWAN&lt;br /&gt;
| Shen et al.  (2017)&lt;br /&gt;
| 0.822&lt;br /&gt;
| 0.889&lt;br /&gt;
|-&lt;br /&gt;
| Tran et al. (2018) - IWAN + sCARNN&lt;br /&gt;
| Tran et al. (2018)&lt;br /&gt;
| 0.829&lt;br /&gt;
| 0.875&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* E. Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
* Yi Tay, Minh C. Phan, Luu Anh Tuan and Siu Cheung Hui. 2017. [https://arxiv.org/abs/1707.06372 Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture]. In SIGIR 2017.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2017. [https://arxiv.org/pdf/1707.07847 Enabling Efficient Question Answer Retrieval via Hyperbolic Neural Networks]. In eprint arXiv:1707.07847.&lt;br /&gt;
[[Category:State of the art]]&lt;br /&gt;
* Zhiguo Wang, Wael Hamza and Radu Florian. 2017.  [https://arxiv.org/pdf/1702.03814.pdf Bilateral Multi-Perspective Matching for Natural Language Sentences]. In eprint arXiv:1702.03814.&lt;br /&gt;
* Weijie Bian, Si Li, Zhao Yang, Guang Chen, Zhiqing Lin. 2017. [https://dl.acm.org/citation.cfm?id=3133089&amp;amp;CFID=791659397&amp;amp;CFTOKEN=43388059 A Compare-Aggregate Model with Dynamic-Clip Attention for Answer Selection]. In CIKM 2017.&lt;br /&gt;
* Gehui Shen, Yunlun Yang, Zhi-Hong Deng. 2017. [https://aclanthology.info/pdf/D/D17/D17-1122.pdf Inter-Weighted Alignment Network for Sentence Pair Modeling]. In EMNLP 2017.&lt;br /&gt;
* Quan Hung Tran, Tuan Manh Lai, Gholamreza Haffari, Ingrid Zukerman, Trung Bui, Hung Bui. 2018. [http://www.aclweb.org/anthology/N18-1115 The Context-dependent Additive Recurrent Neural Net]. In NAACL 2018.&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12074</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12074"/>
		<updated>2017-11-16T15:52:57Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection arises in the open-domain question answering setting: given a question and a set of candidate sentences, the goal is to choose the sentence that contains the exact answer and can sufficiently support that answer choice.&lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions because of different pre-processing in later publications: both share the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive (or only negative) answer candidates, and thus has only 65 questions in the development set and 68 questions in the test set.&lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - Holographic Dual LSTM Architecture&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.815&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.770&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - LDC Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.784&lt;br /&gt;
| 0.865&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al.  (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al.  (2017) - BiMPM&lt;br /&gt;
| Wang et al.  (2017)&lt;br /&gt;
| 0.802&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Bian et al.  (2017) - Compare-Aggregate&lt;br /&gt;
| Bian et al.  (2017)&lt;br /&gt;
| 0.821&lt;br /&gt;
| 0.899&lt;br /&gt;
|-&lt;br /&gt;
| Shen et al.  (2017) - IWAN&lt;br /&gt;
| Shen et al.  (2017)&lt;br /&gt;
| 0.822&lt;br /&gt;
| 0.889&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* E. Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
* Yi Tay, Minh C. Phan, Luu Anh Tuan and Siu Cheung Hui. 2017. [https://arxiv.org/abs/1707.06372 Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture]. In SIGIR 2017.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2017. [https://arxiv.org/pdf/1707.07847 Enabling Efficient Question Answer Retrieval via Hyperbolic Neural Networks]. In eprint arXiv:1707.07847.&lt;br /&gt;
[[Category:State of the art]]&lt;br /&gt;
* Zhiguo Wang, Wael Hamza and Radu Florian. 2017.  [https://arxiv.org/pdf/1702.03814.pdf Bilateral Multi-Perspective Matching for Natural Language Sentences]. In eprint arXiv:1702.03814.&lt;br /&gt;
* Weijie Bian, Si Li, Zhao Yang, Guang Chen, Zhiqing Lin. 2017. [https://dl.acm.org/citation.cfm?id=3133089&amp;amp;CFID=791659397&amp;amp;CFTOKEN=43388059 A Compare-Aggregate Model with Dynamic-Clip Attention for Answer Selection]. In CIKM 2017.&lt;br /&gt;
* Gehui Shen, Yunlun Yang, Zhi-Hong Deng. 2017. [https://aclanthology.info/pdf/D/D17/D17-1123.pdf Inter-Weighted Alignment Network for Sentence Pair Modeling]. In EMNLP 2017.&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12073</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12073"/>
		<updated>2017-11-15T16:20:27Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: /* References */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection arises in the open-domain question answering setting: given a question and a set of candidate sentences, the goal is to choose the sentence that contains the exact answer and can sufficiently support that answer choice.&lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions because of different pre-processing in later publications: both share the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive (or only negative) answer candidates, and thus has only 65 questions in the development set and 68 questions in the test set.&lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - Holographic Dual LSTM Architecture&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.815&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.770&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - L.D.C Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.784&lt;br /&gt;
| 0.865&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al. (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2017) - BiMPM&lt;br /&gt;
| Wang et al. (2017)&lt;br /&gt;
| 0.802&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Bian et al. (2017) - Compare-Aggregate&lt;br /&gt;
| Bian et al. (2017)&lt;br /&gt;
| 0.821&lt;br /&gt;
| 0.899&lt;br /&gt;
|-&lt;br /&gt;
| Shen et al. (2017) - IWAN&lt;br /&gt;
| Shen et al. (2017)&lt;br /&gt;
| 0.822&lt;br /&gt;
| 0.889&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* Eyal Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
* Yi Tay, Minh C. Phan, Luu Anh Tuan and Siu Cheung Hui. 2017. [https://arxiv.org/abs/1707.06372 Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture]. In SIGIR 2017.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2017. [https://arxiv.org/pdf/1707.07847 Enabling Efficient Question Answer Retrieval via Hyperbolic Neural Networks]. In eprint arXiv:1707.07847.&lt;br /&gt;
* Zhiguo Wang, Wael Hamza and Radu Florian. 2017. [https://arxiv.org/pdf/1702.03814.pdf Bilateral Multi-Perspective Matching for Natural Language Sentences]. In eprint arXiv:1702.03814.&lt;br /&gt;
* Weijie Bian, Si Li, Zhao Yang, Guang Chen, Zhiqing Lin. 2017. [https://aclanthology.info/pdf/D/D17/D17-1123.pdf A Compare-Aggregate Model with Dynamic-Clip Attention for Answer Selection]. In CIKM 2017.&lt;br /&gt;
* Gehui Shen, Yunlun Yang, Zhi-Hong Deng. 2017. [https://aclanthology.info/pdf/D/D17/D17-1123.pdf Inter-Weighted Alignment Network for Sentence Pair Modeling]. In EMNLP 2017.&lt;br /&gt;
[[Category:State of the art]]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12072</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=12072"/>
		<updated>2017-11-15T16:19:09Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: /* Answer Sentence Selection */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection is designed for the open-domain question answering setting. Given a question and a set of candidate sentences, the task is to choose the correct sentence that contains the exact answer and can sufficiently support the answer choice. &lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions because of different pre-processing in later publications: both share the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive/negative answers, leaving 65 questions in the development set and 68 questions in the test set. &lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - Holographic Dual LSTM Architecture&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.815&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Tay (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.770&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - L.D.C Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Tay et al. (2017) - HyperQA (Hyperbolic Embeddings)&lt;br /&gt;
| Tay et al. (2017)&lt;br /&gt;
| 0.784&lt;br /&gt;
| 0.865&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al. (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2017) - BiMPM&lt;br /&gt;
| Wang et al. (2017)&lt;br /&gt;
| 0.802&lt;br /&gt;
| 0.875&lt;br /&gt;
|-&lt;br /&gt;
| Bian et al. (2017) - Compare-Aggregate&lt;br /&gt;
| Bian et al. (2017)&lt;br /&gt;
| 0.821&lt;br /&gt;
| 0.899&lt;br /&gt;
|-&lt;br /&gt;
| Shen et al. (2017) - IWAN&lt;br /&gt;
| Shen et al. (2017)&lt;br /&gt;
| 0.822&lt;br /&gt;
| 0.889&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* Eyal Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
* Yi Tay, Minh C. Phan, Luu Anh Tuan and Siu Cheung Hui. 2017. [https://arxiv.org/abs/1707.06372 Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture]. In SIGIR 2017.&lt;br /&gt;
* Yi Tay, Luu Anh Tuan, Siu Cheung Hui. 2017. [https://arxiv.org/pdf/1707.07847 Enabling Efficient Question Answer Retrieval via Hyperbolic Neural Networks]. In eprint arXiv:1707.07847.&lt;br /&gt;
* Zhiguo Wang, Wael Hamza and Radu Florian. 2017. [https://arxiv.org/pdf/1702.03814.pdf Bilateral Multi-Perspective Matching for Natural Language Sentences]. In eprint arXiv:1702.03814.&lt;br /&gt;
* Gehui Shen, Yunlun Yang, Zhi-Hong Deng. 2017. [https://aclanthology.info/pdf/D/D17/D17-1123.pdf Inter-Weighted Alignment Network for Sentence Pair Modeling]. In EMNLP 2017.&lt;br /&gt;
[[Category:State of the art]]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11713</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11713"/>
		<updated>2016-12-06T19:08:27Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
The task of answer sentence selection is designed for the open-domain question answering setting. Given a question and a set of candidate sentences, the task is to choose the correct sentence that contains the exact answer and can sufficiently support the answer choice. &lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. &lt;br /&gt;
* Over time, the original dataset diverged into two versions because of different pre-processing in later publications: both share the same training set, but their development and test sets differ. The Raw version has 82 questions in the development set and 100 questions in the test set; the Clean version (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016) removed questions with no answers or with only positive/negative answers, leaving 65 questions in the development set and 68 questions in the test set. &lt;br /&gt;
* Note: MAP/MRR scores on the two versions of TREC QA data (Clean vs Raw) are not comparable according to [https://dl.acm.org/authorize.cfm?key=N27026 Rao et al. (2016)]. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| Yang (2016) - Attention-Based Neural Matching Model&lt;br /&gt;
| Yang et al. (2016)&lt;br /&gt;
| 0.750&lt;br /&gt;
| 0.811&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016) - Pairwise Word Interaction Modelling&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.758&lt;br /&gt;
| 0.822&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.762&lt;br /&gt;
| 0.830&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version of TREC QA&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - L.D.C Model&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2015) - Multi-Perspective CNN&lt;br /&gt;
| He and Lin (2015)&lt;br /&gt;
| 0.777&lt;br /&gt;
| 0.836&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al. (2016) - PairwiseRank + Multi-Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* Eyal Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In COLING 2016.&lt;br /&gt;
* Hua He, Kevin Gimpel and Jimmy Lin. 2015. [http://aclweb.org/anthology/D/D15/D15-1181.pdf Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks]. In EMNLP 2015.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft. 2016. [http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model]. In CIKM 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [https://dl.acm.org/authorize.cfm?key=N27026 Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
[[Category:State of the art]]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11669</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11669"/>
		<updated>2016-10-20T17:19:48Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
Answer sentence selection is a task in the open-domain question answering setting: given a question and a set of candidate sentences, the goal is to select a sentence that contains the exact answer and sufficiently supports the answer choice. &lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. Over time, this dataset diverged into two versions: both share the same training set, but their development and test sets differ due to different pre-processing. The raw version has 82 questions in the development set and 100 questions in the test set. Recently, the raw dataset was cleaned by several researchers (Wang and Ittycheriah 2015, Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016), who removed questions with no answers or with only positive or only negative answers, leaving 65 questions in the development set and 68 questions in the test set. &lt;br /&gt;
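&lt;br /&gt;
The tables below report [http://en.wikipedia.org/wiki/Mean_average_precision MAP] and [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR] over per-question candidate rankings. As a minimal illustration (not taken from any of the cited papers; all names are illustrative), the two metrics can be computed as follows, where each question is represented by its candidates' relevance labels in rank order:&lt;br /&gt;

```python
def average_precision(labels):
    """AP for one ranked candidate list (1 = sentence contains the answer)."""
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def reciprocal_rank(labels):
    """1 / rank of the first correct candidate; 0 if none is correct."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0

def map_mrr(rankings):
    """Mean AP and mean RR across all questions."""
    n = len(rankings)
    mean_ap = sum(average_precision(r) for r in rankings) / n
    mean_rr = sum(reciprocal_rank(r) for r in rankings) / n
    return mean_ap, mean_rr

# Two toy questions: answers at ranks 1 and 3, and at rank 2.
print(map_mrr([[1, 0, 1, 0], [0, 1, 0]]))  # MAP = 2/3, MRR = 0.75
```

Note that questions with no positive candidates contribute zero to both metrics, which is one reason the "clean" version of the dataset (which drops such questions) yields higher scores than the raw version.&lt;br /&gt;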
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Raw Version&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016)&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.755&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - Pairwise + Multiple Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm - Clean Version&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - Lexical Decomposition and Composition&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al. (2016) - Pairwise + Multiple Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* E. Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In eprint arXiv:1602.07019.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [http://www.cs.umd.edu/~jinfeng/publications/PairwiseNeuralNetwork_CIKM2016.pdf Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
[[Category:State of the art]]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11668</id>
		<title>Question Answering (State of the art)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Question_Answering_(State_of_the_art)&amp;diff=11668"/>
		<updated>2016-10-20T17:17:23Z</updated>

		<summary type="html">&lt;p&gt;Raojinfeng: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Answer Sentence Selection ==&lt;br /&gt;
&lt;br /&gt;
Answer sentence selection is a task in the open-domain question answering setting: given a question and a set of candidate sentences, the goal is to select a sentence that contains the exact answer and sufficiently supports the answer choice. &lt;br /&gt;
&lt;br /&gt;
* [http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz QA Answer Sentence Selection Dataset]: labeled sentences using TREC QA track data, provided by [http://cs.stanford.edu/people/mengqiu/ Mengqiu Wang] and first used in [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf Wang et al. (2007)]. Over time, this dataset diverged into two versions: both share the same training set, but their development and test sets differ due to different pre-processing. The raw version has 82 questions in the development set and 100 questions in the test set. Recently, the raw dataset was cleaned by several researchers (Tan et al. 2015, dos Santos et al. 2016, Wang et al. 2016), who removed questions with no answers or with only positive or only negative answers, leaving 65 questions in the development set and 68 questions in the test set. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm -- Raw Version&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Punyakanok (2004)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.419&lt;br /&gt;
| 0.494&lt;br /&gt;
|-&lt;br /&gt;
| Cui (2005)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.427&lt;br /&gt;
| 0.526&lt;br /&gt;
|-&lt;br /&gt;
| Wang (2007)&lt;br /&gt;
| Wang et al. (2007)&lt;br /&gt;
| 0.603&lt;br /&gt;
| 0.685&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;S (2010)&lt;br /&gt;
| Heilman and Smith (2010)&lt;br /&gt;
| 0.609&lt;br /&gt;
| 0.692&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;M (2010)&lt;br /&gt;
| Wang and Manning (2010)&lt;br /&gt;
| 0.595&lt;br /&gt;
| 0.695&lt;br /&gt;
|-&lt;br /&gt;
| Yao (2013)&lt;br /&gt;
| Yao et al. (2013)&lt;br /&gt;
| 0.631&lt;br /&gt;
| 0.748&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2013)&lt;br /&gt;
| Severyn and Moschitti (2013)&lt;br /&gt;
| 0.678&lt;br /&gt;
| 0.736&lt;br /&gt;
|-&lt;br /&gt;
| Shnarch (2013) - Backward &lt;br /&gt;
| Shnarch (2013)&lt;br /&gt;
| 0.686&lt;br /&gt;
| 0.754&lt;br /&gt;
|-&lt;br /&gt;
| Yih (2013) - LCLR&lt;br /&gt;
| Yih et al. (2013)&lt;br /&gt;
| 0.709&lt;br /&gt;
| 0.770&lt;br /&gt;
|-&lt;br /&gt;
| Yu (2014) - TRAIN-ALL bigram+count&lt;br /&gt;
| Yu et al. (2014)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.785&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;N (2015) - Three-Layer BLSTM+BM25&lt;br /&gt;
| Wang and Nyberg (2015)&lt;br /&gt;
| 0.713&lt;br /&gt;
| 0.791&lt;br /&gt;
|-&lt;br /&gt;
| Feng (2015) - Architecture-II&lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.711&lt;br /&gt;
| 0.800&lt;br /&gt;
|-&lt;br /&gt;
| S&amp;amp;M (2015)&lt;br /&gt;
| Severyn and Moschitti (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.808&lt;br /&gt;
|-&lt;br /&gt;
| W&amp;amp;I (2015)&lt;br /&gt;
| Wang and Ittycheriah (2015)&lt;br /&gt;
| 0.746&lt;br /&gt;
| 0.820&lt;br /&gt;
|-&lt;br /&gt;
| H&amp;amp;L (2016)&lt;br /&gt;
| He and Lin (2016)&lt;br /&gt;
| 0.755&lt;br /&gt;
| 0.825&lt;br /&gt;
|-&lt;br /&gt;
| Rao (2016) - Pairwise + Multiple Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.780&lt;br /&gt;
| 0.834&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;1&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Algorithm -- Clean Version&lt;br /&gt;
! Reference&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_average_precision MAP]&lt;br /&gt;
! [http://en.wikipedia.org/wiki/Mean_reciprocal_rank MRR]&lt;br /&gt;
|-&lt;br /&gt;
| Tan (2015) - QA-LSTM/CNN+attention &lt;br /&gt;
| Tan et al. (2015)&lt;br /&gt;
| 0.728&lt;br /&gt;
| 0.832&lt;br /&gt;
|-&lt;br /&gt;
| dos Santos (2016) - Attentive Pooling CNN &lt;br /&gt;
| dos Santos et al. (2016)&lt;br /&gt;
| 0.753&lt;br /&gt;
| 0.851&lt;br /&gt;
|-&lt;br /&gt;
| Wang et al. (2016) - Lexical Decomposition and Composition&lt;br /&gt;
| Wang et al. (2016)&lt;br /&gt;
| 0.771&lt;br /&gt;
| 0.845&lt;br /&gt;
|-&lt;br /&gt;
| Rao et al. (2016) - Pairwise + Multiple Perspective CNN&lt;br /&gt;
| Rao et al. (2016)&lt;br /&gt;
| 0.801&lt;br /&gt;
| 0.877&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
* Vasin Punyakanok, Dan Roth, and Wen-Tau Yih. 2004. [http://cogcomp.cs.illinois.edu/papers/PunyakanokRoYi04a.pdf Mapping dependencies trees: An application to question answering]. In Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, USA.&lt;br /&gt;
* Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. [http://ws.csie.ncku.edu.tw/login/upload/2005/paper/Question%20answering%20Question%20answering%20passage%20retrieval%20using%20dependency%20relations.pdf Question answering passage retrieval using dependency relations]. In Proceedings of the 28th ACM-SIGIR International Conference on Research and Development in Information Retrieval, Salvador, Brazil.&lt;br /&gt;
* Wang, Mengqiu and Smith, Noah A. and Mitamura, Teruko. 2007. [http://www.aclweb.org/anthology/D/D07/D07-1003.pdf What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA]. In EMNLP-CoNLL 2007.&lt;br /&gt;
* Heilman, Michael and Smith, Noah A. 2010. [http://www.aclweb.org/anthology/N10-1145 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions]. In NAACL-HLT 2010.&lt;br /&gt;
* Wang, Mengqiu and Manning, Christopher. 2010. [http://aclweb.org/anthology//C/C10/C10-1131.pdf Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering]. In COLING 2010.&lt;br /&gt;
* E. Shnarch. 2013. Probabilistic Models for Lexical Inference. Ph.D. thesis, Bar Ilan University.&lt;br /&gt;
* Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris and Clark, Peter. 2013. [http://www.aclweb.org/anthology/N13-1106.pdf Answer Extraction as Sequence Tagging with Tree Edit Distance]. In NAACL-HLT 2013.&lt;br /&gt;
* Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej. 2013. [http://research.microsoft.com/pubs/192357/QA-SentSel-Updated-PostACL.pdf Question Answering Using Enhanced Lexical Semantic Models]. In ACL 2013.&lt;br /&gt;
* Severyn, Aliaksei and Moschitti, Alessandro. 2013. [http://www.aclweb.org/anthology/D13-1044.pdf Automatic Feature Engineering for Answer Selection and Extraction]. In EMNLP 2013.&lt;br /&gt;
* Lei Yu, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. 2014. [http://arxiv.org/pdf/1412.1632v1.pdf Deep Learning for Answer Sentence Selection]. In NIPS deep learning workshop.&lt;br /&gt;
* Di Wang and Eric Nyberg. 2015. [http://www.aclweb.org/anthology/P15-2116 A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering]. In ACL 2015.&lt;br /&gt;
* Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. [http://arxiv.org/abs/1508.01585 Applying deep learning to answer selection: A study and an open task]. In ASRU 2015.&lt;br /&gt;
* Aliaksei Severyn and Alessandro Moschitti. 2015. [http://disi.unitn.it/~severyn/papers/sigir-2015-long.pdf Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks]. In SIGIR 2015.&lt;br /&gt;
* Zhiguo Wang and Abraham Ittycheriah. 2015. [http://arxiv.org/abs/1507.02628 FAQ-based Question Answering via Word Alignment]. In eprint arXiv:1507.02628.&lt;br /&gt;
* Ming Tan, Cicero dos Santos, Bing Xiang &amp;amp; Bowen Zhou. 2015. [http://arxiv.org/abs/1511.04108 LSTM-Based Deep Learning Models for Nonfactoid Answer Selection]. In eprint arXiv:1511.04108.&lt;br /&gt;
* Cicero dos Santos, Ming Tan, Bing Xiang &amp;amp; Bowen Zhou. 2016. [http://arxiv.org/abs/1602.03609 Attentive Pooling Networks]. In eprint arXiv:1602.03609.&lt;br /&gt;
* Zhiguo Wang, Haitao Mi and Abraham Ittycheriah. 2016. [http://arxiv.org/pdf/1602.07019v1.pdf Sentence Similarity Learning by Lexical Decomposition and Composition]. In eprint arXiv:1602.07019.&lt;br /&gt;
* Hua He and Jimmy Lin. 2016. [https://cs.uwaterloo.ca/~jimmylin/publications/He_etal_NAACL-HTL2016.pdf Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement]. In NAACL 2016.&lt;br /&gt;
* Jinfeng Rao, Hua He and Jimmy Lin. 2016. [http://www.cs.umd.edu/~jinfeng/publications/PairwiseNeuralNetwork_CIKM2016.pdf Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks]. In CIKM 2016.&lt;br /&gt;
[[Category:State of the art]]&lt;/div&gt;</summary>
		<author><name>Raojinfeng</name></author>
	</entry>
</feed>