Tomáš Mikolov

Also published as: Tomas Mikolov


2018

pdf bib
Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion
Armand Joulin | Piotr Bojanowski | Tomas Mikolov | Hervé Jégou | Edouard Grave
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space. Existing works typically solve a quadratic problem to learn a orthogonal matrix aligning a bilingual lexicon, and use a retrieval criterion for inference. In this paper, we propose an unified formulation that directly optimizes a retrieval criterion in an end-to-end fashion. Our experiments on standard benchmarks show that our approach outperforms the state of the art on word translation, with the biggest improvements observed for distant language pairs such as English-Chinese.

pdf bib
Advances in Pre-Training Distributed Word Representations
Tomas Mikolov | Edouard Grave | Piotr Bojanowski | Christian Puhrsch | Armand Joulin
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Learning Word Vectors for 157 Languages
Edouard Grave | Piotr Bojanowski | Prakhar Gupta | Armand Joulin | Tomas Mikolov
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
Bag of Tricks for Efficient Text Classification
Armand Joulin | Edouard Grave | Piotr Bojanowski | Tomas Mikolov
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

pdf bib
Enriching Word Vectors with Subword Information
Piotr Bojanowski | Edouard Grave | Armand Joulin | Tomas Mikolov
Transactions of the Association for Computational Linguistics, Volume 5

Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram; words being represented as the sum of these representations. Our method is fast, allowing to train models on large corpora quickly and allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, both on word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

2014

pdf bib
Using Neural Networks for Modeling and Representing Natural Languages
Tomas Mikolov
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts

2013

pdf bib
Linguistic Regularities in Continuous Space Word Representations
Tomas Mikolov | Wen-tau Yih | Geoffrey Zweig
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Combining Heterogeneous Models for Measuring Relational Similarity
Alisa Zhila | Wen-tau Yih | Christopher Meek | Geoffrey Zweig | Tomas Mikolov
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2011

pdf bib
A Fast Re-scoring Strategy to Capture Long-Distance Dependencies
Anoop Deoras | Tomáš Mikolov | Kenneth Church
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing