English -> French Translation of the AMI and ICSI Corpora: Call for Annotators

Event Notification Type: Other
Contact: Guokan Shang

[apologies for cross-posting]

As part of LinTo (https://linto.ai/), an open-source AI-powered voice assistant project funded by BPI France (http://www.bpifrance.com/), LINAGORA Labs (https://research.linagora.com) and the DaSciM team of Ecole Polytechnique (http://www.lix.polytechnique.fr/dascim) are looking for volunteer annotators fluent in both English and French. The task is to translate the speech transcriptions and abstractive summaries of the AMI and ICSI meeting corpora from English into French.

We need your help to make this project a success.
Join us here: http://lingua.linto.ai/pages/translation-guidelines/

The French versions of the datasets will be made publicly available under the Creative Commons Attribution 4.0 International Licence (CC BY 4.0). These new resources will be a great asset for many NLP research areas, including:

  • Summarization
  • Dialogue and Interactive Systems
  • Machine Translation
  • etc.

The AMI and ICSI corpora (http://groups.inf.ed.ac.uk/ami/) are the only publicly available datasets in the meeting domain that contain rich annotations (dialogue acts, extractive and abstractive summaries, topics, adjacency pairs, links between utterances and summaries, etc.). These annotations are available for 226,903 utterances belonging to 212 different meetings. They can be reused in any language, provided that translations of the utterances and abstractive summaries are available.

Please share this announcement with your colleagues and networks, and feel free to contact us (guokan.shang [at] polytechnique.edu) for more information. Thank you for your valuable contribution!

Kind regards,

Jean-Pierre Lorré (director @ Linagora Labs)
Michalis Vazirgiannis (full professor @ Ecole Polytechnique, director @ DaSciM)
Antoine Tixier (postdoctoral researcher @ DaSciM)
Guokan Shang (PhD student @ DaSciM & Linagora Labs)