Building Usable Conversational Machine Translation Systems



Ruhi Sarıkaya


IBM T.J. Watson Research Center, USA




Machine Translation (MT) has been one of the long-standing, elusive goals of natural language processing and artificial intelligence. With increasing globalization and the widespread use of social networking sites, the need to exchange knowledge between people who do not share a common language has put MT in the spotlight. Now, with access to vast amounts of translation data and powerful computers, we are closer than ever to achieving that goal. In this talk we focus on building usable conversational machine translation systems. We will highlight the practical and fundamental challenges of building MT systems and present our solutions and approaches on both fronts. In particular, we present methods for parallel data construction for MT and for language and MT modeling in continuous space. We also demonstrate working conversational MT systems between English and five major languages.


Bio: Dr. Ruhi Sarikaya is a research staff member and team lead in the Human Language Technologies Group at IBM T.J. Watson Research Center. He received the B.S. degree from Bilkent University, Turkey, in 1995, the M.S. degree from Clemson University, SC, in 1997, and the Ph.D. degree from Duke University, NC, in 2001, all in electrical and computer engineering. He has published about 50 technical papers in refereed journals and conference proceedings and holds seven patents in the area of speech and natural language processing. At IBM he has received several prestigious awards for his work, including two Outstanding Technical Achievement Awards (2005 and 2008) and two Research Division Awards (2005 and 2007). Prior to joining IBM in 2001, he was a researcher at the Center for Spoken Language Research (CSLR) at the University of Colorado at Boulder for two years. He also spent the summer of 1999 at the Panasonic Speech Technology Laboratory, Santa Barbara, CA. He served as the publicity chair of IEEE ASRU’05 and gave a tutorial on “Processing Morphologically Rich Languages” at Interspeech’07. Dr. Sarikaya is currently serving as associate editor of IEEE Transactions on Audio Speech and Language Processing. He also served as the lead guest editor of the special issue on “Processing Morphologically-Rich Languages” for IEEE Trans. on Audio Speech & Language Processing. His past and present research interests span all aspects of speech and language processing, including speech recognition, natural language processing, machine translation, machine learning, speech-to-speech translation, speaker identification/verification, and digital signal processing. Dr. Sarikaya is a senior member of IEEE and a member of ACL and ISCA.


May 6, 2009, Wednesday, 14:40, FENS G032