Objective: To develop a preliminary Hebrew-to-English machine translation system in a transfer-based framework.
Researchers: In Haifa, Gennadi Lembersky, Amit Kirschenbaum, Reshef Shilon, Yulia Tsvetkov, and Shuly Wintner. This project is joint with a team at the Language Technologies Institute, Carnegie Mellon University, headed by Alon Lavie.
Status: Complete
Funding: The Caesarea Edmond Benjamin de Rothschild Foundation Institute for Interdisciplinary Applications of Computer Science; ISF (grant 137/06).
We will develop a preliminary Hebrew-to-English Machine Translation (MT) system under a transfer-based framework specifically designed for rapid MT prototyping for languages with limited linguistic resources. The task is particularly challenging for two main reasons: the high lexical and morphological ambiguity of Hebrew, and the dearth of available resources for the language. We will use existing, publicly available resources and adapt them in novel ways to support the MT task. The system will consist of two separate modules: a transfer engine, which produces a lattice of possible translation segments, and a decoder, which searches the lattice and selects the most likely translation according to an English language model. We will develop a set of manually crafted transfer rules to improve the translations. Performance will be evaluated using state-of-the-art measures.
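The two-module design described above can be illustrated with a minimal sketch. It is not the project's engine or decoder: the lattice contents, the tiny bigram table standing in for an English language model, and the names `LATTICE`, `BIGRAM_LOGPROB`, `UNSEEN`, `lm_score` and `decode` are all hypothetical. The sketch assumes the transfer engine has already produced arcs over source positions, each carrying one candidate English segment, and shows how a decoder might pick the highest-scoring path.

```python
from collections import defaultdict

# Hypothetical lattice, standing in for the transfer engine's output.
# Each arc covers source positions [start, end) and proposes one English segment.
LATTICE = [
    (0, 1, "he"),
    (0, 1, "it"),
    (1, 2, "saw"),
    (1, 2, "see"),
    (2, 3, "a"),
    (3, 4, "book"),
    (2, 4, "the book"),
]

# Toy bigram log-probabilities standing in for an English language model.
BIGRAM_LOGPROB = {
    ("<s>", "he"): -0.5,
    ("he", "saw"): -1.0,
    ("saw", "the"): -0.7,
    ("the", "book"): -1.2,
    ("book", "</s>"): -0.4,
}
UNSEEN = -10.0  # assumed flat penalty for any bigram missing from the table


def lm_score(prev_word, words):
    """Score `words` given the preceding word; return (log-prob, new last word)."""
    score = 0.0
    for word in words:
        score += BIGRAM_LOGPROB.get((prev_word, word), UNSEEN)
        prev_word = word
    return score, prev_word


def decode(lattice, source_length):
    """Viterbi-style search: best path from position 0 to source_length."""
    arcs_by_start = defaultdict(list)
    for start, end, phrase in lattice:
        # Arcs are assumed to move forward (end > start).
        arcs_by_start[start].append((end, phrase.split()))

    # best[pos][last_word] = (score, words so far); one entry per bigram context.
    best = [dict() for _ in range(source_length + 1)]
    best[0]["<s>"] = (0.0, [])

    for pos in range(source_length):
        for last_word, (score, words) in best[pos].items():
            for end, segment in arcs_by_start[pos]:
                seg_score, new_last = lm_score(last_word, segment)
                candidate = (score + seg_score, words + segment)
                current = best[end].get(new_last)
                if current is None or candidate[0] > current[0]:
                    best[end][new_last] = candidate

    # Close each complete hypothesis with the end-of-sentence bigram.
    finals = [
        (score + BIGRAM_LOGPROB.get((last_word, "</s>"), UNSEEN), words)
        for last_word, (score, words) in best[source_length].items()
    ]
    if not finals:
        return None  # no path covers the whole source sentence
    score, words = max(finals, key=lambda item: item[0])
    return " ".join(words), score


if __name__ == "__main__":
    print(decode(LATTICE, 4))
```

Keeping one hypothesis per (position, last word) pair is sufficient here only because the toy model is a bigram model; a decoder using a longer n-gram history would need a correspondingly richer state.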
A database of transliteration examples. This is an Excel file with three columns: Hebrew form, English form, and Hebrew represented in ASCII (using a one-to-one mapping of the Hebrew characters). The file contains over 20,000 entries obtained automatically, of which the first 1,000 were verified manually.
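To illustrate what a one-to-one ASCII representation of Hebrew characters makes possible, here is a minimal sketch. The mapping it builds is purely hypothetical (it assigns the 27 Hebrew letter code points, including final forms, to `a`-`z` plus `A` in Unicode order) and is not the mapping used in the file; the names `HEB_TO_ASCII`, `ASCII_TO_HEB`, `to_ascii` and `to_hebrew` are likewise assumptions. The point is only that a one-to-one mapping can be inverted without loss.

```python
import string

# Hebrew letters alef..tav, including the five final forms (U+05D0..U+05EA).
HEBREW_LETTERS = [chr(code_point) for code_point in range(0x05D0, 0x05EB)]

# Hypothetical ASCII symbols: 'a'..'z' plus 'A', one per Hebrew letter.
ASCII_SYMBOLS = string.ascii_letters[:len(HEBREW_LETTERS)]

HEB_TO_ASCII = dict(zip(HEBREW_LETTERS, ASCII_SYMBOLS))
ASCII_TO_HEB = {symbol: letter for letter, symbol in HEB_TO_ASCII.items()}


def to_ascii(hebrew_text):
    """Replace each Hebrew letter with its ASCII symbol; pass other characters through."""
    return "".join(HEB_TO_ASCII.get(ch, ch) for ch in hebrew_text)


def to_hebrew(ascii_text):
    """Invert to_ascii for strings that were produced from Hebrew letters."""
    return "".join(ASCII_TO_HEB.get(ch, ch) for ch in ascii_text)


if __name__ == "__main__":
    word = "\u05e9\u05dc\u05d5\u05dd"  # shin, lamed, vav, final mem
    encoded = to_ascii(word)
    print(encoded, to_hebrew(encoded) == word)
```

The round trip is lossless for input consisting only of Hebrew letters; text that already contains ASCII letters would need those characters escaped or excluded first.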
Publications
Reshef Shilon, Hanna Fadida and Shuly Wintner. Incorporating Linguistic Knowledge in Statistical Machine Translation: Translating Prepositions. Proceedings of the EACL-2012 Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pages 106-114, Avignon, France, April 2012.
Reshef Shilon, Nizar Habash, Alon Lavie and Shuly Wintner. Machine translation between Hebrew and Arabic. Machine Translation 26(1-2):177-195, March 2012.
Reshef Shilon, Nizar Habash, Alon Lavie and Shuly Wintner. Machine Translation between Hebrew and Arabic: Needs, Challenges and Preliminary Solutions. Proceedings of AMTA 2010, The Ninth Conference of the Association for Machine Translation in the Americas, Denver, Colorado, November 2010.
Yulia Tsvetkov and Shuly Wintner. Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-2010), pages 3389-3392, Malta, May 2010.
Amit Kirschenbaum and Shuly Wintner. A General Method for Creating a Bilingual Transliteration Dictionary. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC-2010), pages 273-276, Malta, May 2010.
Amit Kirschenbaum and Shuly Wintner. Lightly Supervised Transliteration for Machine Translation. Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09), pages 433-441, Athens, Greece, April 2009.
Idan Szpektor, Ido Dagan, Alon Lavie, Danny Shacham and Shuly Wintner. Cross Lingual and Semantic Retrieval for Cultural Heritage Appreciation. Proceedings of the ACL-2007 Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), pages 65-72, Prague, June 2007.
Alon Lavie, Erik Peterson, Katharina Probst, Shuly Wintner and Yaniv Eytani. Rapid Prototyping of a Transfer-based Hebrew-to-English Machine Translation System. Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, pages 1-10, Baltimore, MD, October 2004.