Turing Center at University of Washington

Investigating problems at the crossroads of natural language processing, data mining, Web search, and the Semantic Web.


Previous Events

2008

Colloquium

Mausam (Turing Center) and Stephen Soderland (Turing Center)
Towards Panlingual Translation: Supporting Translations Across All Languages
February 12 (Tuesday), 3:30 pm - 5:00 pm, Electrical Engineering 105

Abstract

The goal of our project is a system that can translate between arbitrary pairs of languages. Unfortunately, most machine translation methodologies assume aligned corpora or grammar rules, which are available for only a small number of major language pairs. This makes scaling the popular approaches to any-language translation virtually impossible. We propose to scale machine translation to a panlingual level by first attempting to solve the lexical translation problem and then proceeding to translating pairs of words, phrases and then sentences. In this talk we primarily describe a novel approach to lexical translation that employs probabilistic inference over the Translation Graph, a novel lexical resource that combines translations from hundreds of machine-readable dictionaries and Wiktionaries. Our inference algorithm opens up several interesting and challenging future directions that we detail in the talk. We will also demo PanImages (http://www.panimages.org), an image search application that uses the Translation Graph.

Symposium

Fourteenth UW/Microsoft Quarterly Symposium in Computational Linguistics
February 15 (Friday), 3:30 pm - 5:30 pm, Gowen 201

You are invited to take advantage of this opportunity to connect with the computational linguistics community at Microsoft and the University of Washington. Sponsored by the UW Departments of Linguistics, Electrical Engineering, and Computer Science; Microsoft Research; and UW alumni at Microsoft. The symposium consists of two invited talks, followed by a poster presentation and an informal reception.

Amar Subramanya and Jeff Bilmes (Electrical Engineering)
Training Speech Recognizers with Uncertain Word Boundaries

Colin Cherry (NLP group, Microsoft Research)
Cohesive Phrase-Based Decoding for Statistical Machine Translation

Michael Tepper (Linguistics)
Knowledge-Lite Induction of Underlying Morphology (KLIUM)

Colloquium

Oren Etzioni, Mausam, and Stephen Soderland (Turing Center)
Towards Panlingual Translation: Supporting Translations Across All Languages
May 16 (Friday), 3:30 pm - 5:00 pm, Sieg 225

Abstract

The goal of our project is a system that can translate between arbitrary pairs of languages. Unfortunately, most machine translation methodologies assume aligned corpora or grammar rules, which are available for only a small number of major language pairs. This makes scaling the popular approaches to any-language translation virtually impossible. We propose to scale machine translation to a panlingual level by first attempting to solve the lexical translation problem and then proceeding to translating pairs of words, phrases and then sentences. In this talk we primarily describe a novel approach to lexical translation that employs probabilistic inference over the Translation Graph, a novel lexical resource that combines translations from hundreds of machine-readable dictionaries and Wiktionaries. We also hope to discuss whether the Graph can be used to support linguistics research, and to unearth new observations by providing lexical data on a large number of languages in concert.

Symposium

Fifteenth UW/Microsoft Quarterly Symposium in Computational Linguistics
June 6 (Friday), 3:00 pm - 5:30 pm, Microsoft Corporation, Building 99 (14820 NE 36th Street), room 1919 (first floor)

You are invited to take advantage of this opportunity to connect with the computational linguistics community at Microsoft and the University of Washington. Sponsored by the UW Departments of Linguistics, Electrical Engineering, and Computer Science; Microsoft Research; and UW alumni at Microsoft.

Transit from UW: Sound Transit Route 545 every 15 minutes from Montlake Freeway Station (2:07 or 2:22 p.m.) to SR 520 and NE 40th Street (2:19 or 2:34 p.m.); cross SR 520 to 148th Avenue NE, turn left, go to NE 36th Street.

Carpooling from UW: Contact Dan Jinguji (danjj@u.washington.edu).

Talks:

Douglas Downey (Computer Science & Engineering)
Autonomous Web-scale Information Extraction

Chris Quirk (Microsoft Research)
Models for Comparable Corpus Fragment Alignment

Demonstrations:

Scott Drellishak, Kelly O’Hara, and Emily M. Bender (Linguistics)
Case and Inflection in the Grammar Matrix

Michael Gamon, Sumit Basu, Dmitriy Belenko, Danyel Fisher, Matthew Hurst, and Arnd Christian König (Microsoft Research and Microsoft Live Labs)
BLEWS: What the Blogosphere Tells You About the News

Michael Gamon, Chris Brockett, Dmitriy Belenko, Bill Dolan, Jianfeng Gao, Lucy Vanderwende (Microsoft Research)
The MSR ESL Assistant

Colloquium

Timothy Baldwin (Computer Science and Software Engineering, Melbourne)
Enhanced Information Access to Troubleshooting-Oriented Web User Forum Data
June 13 (Friday), 11:00 am - 12:00 noon, Paul G. Allen Center for Computer Science and Engineering 303

Abstract

The ILIAD (Improved Linux Information Access by Data Mining) Project is an attempt to apply language technology to the task of Linux troubleshooting by analysing the underlying information structure of a multi-document text discourse and improving information delivery through a combination of filtering, term identification and information extraction techniques. In this talk, I will outline the overall project design and present results for a variety of thread-level filtering tasks.

Speaker

Timothy Baldwin is a Senior Lecturer in the Department of Computer Science and Software Engineering, University of Melbourne. Since completing his PhD at the Tokyo Institute of Technology in 2001, he has been involved with research grants from sources including the NSF, NTT, ARC, NICTA and Google. His research interests include web mining, information extraction, deep linguistic processing, multiword expressions, deep lexical acquisition, and biomedical text mining. He is the author of over 130 journal and conference publications, and has held visiting appointments at NTT Communication Science Laboratories and Saarland University. He is the recipient of a number of awards for both teaching and research in the areas of computer science and natural language processing. He is currently on the editorial board of Computational Linguistics, a series editor for CSLI Publications, and a member of the Deep Linguistic Processing with HPSG Initiative (DELPH-IN).

Symposium

Sixteenth UW/Microsoft Quarterly Symposium in Computational Linguistics
November 14 (Friday), 3:30 pm - 5:30 pm, Law 127

You are invited to take advantage of this opportunity to connect with the computational linguistics community at Microsoft and the University of Washington. Sponsored by the UW Departments of Linguistics, Electrical Engineering, and Computer Science; Microsoft Research; and UW alumni at Microsoft.

Talks:

Meliha Yetişgen Yıldız (Information School and Kiha, Inc.)
Finding the Meaning of Medical Concept Correlations

Jian-Yun Nie (University of Montreal)
Machine Translation and Cross-Language Information Retrieval

Turing Talk

Rion Snow (Computer Science, Stanford)
The Stanford Wordnet Project: Automatically Learning Semantic Hierarchies
December 5 (Friday), 2:00 pm - 3:00 pm, Paul G. Allen Center for Computer Science and Engineering 403

Abstract

This talk describes our work on learning semantic relations and WordNet-like taxonomies from English text. We use machine learning methods to learn the hypernym (is a kind of) and coordinate term (is similar to) relations, and propose a model for inferring taxonomies that combines heterogeneous evidence sources for maximal benefit. Our work has resulted in the Stanford Wordnet Project, which currently offers an augmented version of WordNet with 400,000 additional automatically inferred hyponyms (available at http://ai.stanford.edu/~rion/swn/augmented.html).

Speaker

Rion Snow is a PhD candidate in Computer Science at Stanford University, advised by Professors Andrew Ng and Dan Jurafsky. Rion works at the intersection of machine learning and natural language processing, with a focus on computational semantics. He leads the Stanford Wordnet Project, which aims to learn large-scale semantic networks automatically from natural text. His work on automatically inferring semantic taxonomies received the Best Paper Award at the 2006 conference of the Association for Computational Linguistics.
