Ashish Venugopal


2011

Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation
Ashish Venugopal | Jakob Uszkoreit | David Talbot | Franz Och | Juri Ganitkevitch
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2009

Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach to integrate evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality by using N-best alignments in both hierarchical and syntax augmented translation systems.

Wider Pipelines: N-Best Alignments and Parses in MT Training
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

State-of-the-art statistical machine translation systems use hypotheses from several maximum a posteriori inference steps, including word alignments and parse trees, to identify translational structure and estimate the parameters of translation models. While this approach leads to a modular pipeline of independently developed components, errors made in these “single-best” hypotheses can propagate to downstream estimation steps that treat these inputs as clean, trustworthy training data. In this work we integrate N-best alignments and parses by using a probability distribution over these alternatives to generate posterior fractional counts for use in downstream estimation. Using these fractional counts in a DOP-inspired syntax-based translation system, we show significant improvements in translation quality over a single-best trained baseline.
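The idea of posterior fractional counts in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each N-best alternative comes with a model log-score, that the scores are renormalized over the N-best list, and that the counted events are aligned word pairs. The function names and the toy German-English data are hypothetical.

```python
import math
from collections import defaultdict


def posterior_weights(log_scores):
    """Renormalize the log-scores of the N-best alternatives into a distribution."""
    m = max(log_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


def fractional_counts(nbest):
    """Accumulate posterior-weighted counts over an N-best list.

    nbest: list of (log_score, events) pairs, one per alternative hypothesis.
    events: hashable items licensed by that hypothesis (here, aligned word
    pairs; an assumption made for illustration).
    """
    weights = posterior_weights([score for score, _ in nbest])
    counts = defaultdict(float)
    for weight, (_, events) in zip(weights, nbest):
        for event in set(events):  # count each event once per hypothesis
            counts[event] += weight
    return dict(counts)


# Hypothetical 3-best alignments for one sentence pair.
nbest = [
    (math.log(0.6), [("das", "the"), ("haus", "house")]),
    (math.log(0.3), [("das", "the"), ("haus", "home")]),
    (math.log(0.1), [("das", "that"), ("haus", "house")]),
]
counts = fractional_counts(nbest)
```

A single-best pipeline would give every event in the top hypothesis a count of 1 and ignore the rest. Here an event shared by several alternatives receives the sum of their posterior weights, so downstream estimation sees the aligner's uncertainty rather than a single hard decision.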

A Systematic Comparison of Phrase-Based, Hierarchical and Syntax-Augmented Statistical MT
Andreas Zollmann | Ashish Venugopal | Franz Och | Jay Ponte
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

The Syntax Augmented MT (SAMT) System at the Shared Task for the 2007 ACL Workshop on Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Matthias Paulik | Stephan Vogel
Proceedings of the Second Workshop on Statistical Machine Translation

An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT
Ashish Venugopal | Andreas Zollmann | Stephan Vogel
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

The CMU-UKA statistical machine translation systems for IWSLT 2007
Ian Lane | Andreas Zollmann | Thuy Linh Nguyen | Nguyen Bach | Ashish Venugopal | Stephan Vogel | Kay Rottmann | Ying Zhang | Alex Waibel
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language pairs: Japanese→English, Chinese→English and Arabic→English. All systems were based on a common phrase-based SMT (statistical machine translation) framework, but for each language pair a specific research problem was tackled. For Japanese→English we focused on two problems: first, punctuation recovery, and second, how to incorporate topic knowledge into the translation framework. Our Chinese→English submission focused on syntax-augmented SMT, and for the Arabic→English task we focused on incorporating morphological decomposition into the SMT framework. This research strategy enabled us to evaluate a wide variety of approaches, which proved effective for the language pairs they were evaluated on.

2006

Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

The CMU-UKA syntax augmented machine translation system for IWSLT-06
Andreas Zollmann | Ashish Venugopal | Stephan Vogel | Alex Waibel
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Syntax Augmented Machine Translation via Chart Parsing
Andreas Zollmann | Ashish Venugopal
Proceedings on the Workshop on Statistical Machine Translation

2005

Training and Evaluating Error Minimization Decision Rules for Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Alex Waibel
Proceedings of the ACL Workshop on Building and Using Parallel Texts

2003

The CMU statistical machine translation system
Stephan Vogel | Ying Zhang | Fei Huang | Alicia Tribble | Ashish Venugopal | Bing Zhao | Alex Waibel
Proceedings of Machine Translation Summit IX: Papers

In this paper we describe the components of our statistical machine translation system. This system combines phrase-to-phrase translations extracted from a bilingual corpus using different alignment approaches. Special methods to extract and align named entities are used. We show how a manual lexicon can be incorporated into the statistical system in an optimized way. Experiments on Chinese-to-English and Arabic-to-English translation tasks are presented.

Effective Phrase Translation Extraction from Alignment Models
Ashish Venugopal | Stephan Vogel | Alex Waibel
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics