MIT Quest AI Roundtable

Natural Language Processing for All

June 4, 2021 | 12pm - 1pm ET

Nearly 7,000 languages are spoken in the world today, but fewer than two dozen have the massive training data required to build AI applications like machine translation and speech-to-text transcription. The problem is compounded by a shortage of computing resources in much of the world. This panel will explore ways of making natural language processing more efficient, interpretable, and linguistically informed, so that it can reach speakers of all languages.




Jacob Andreas

Assistant Professor
MIT Computer Science & Artificial Intelligence Lab

Yoon Kim

Research Scientist
MIT-IBM Watson AI Lab

Ellie Pavlick

Manning Assistant Professor of Computer Science
Computer Science Department
Brown University

Emma Strubell

Assistant Professor
Language Technologies Institute
Carnegie Mellon University

Aude Oliva

Director, MIT Quest Corporate
MIT Director, MIT-IBM Watson AI Lab
Senior Research Scientist, MIT Schwarzman College of Computing



Date: Friday, June 4, 2021
Time: 12pm - 1pm ET
Where: Zoom Webinar

12:00 PM - 12:05 PM

Aude Oliva

12:05 PM - 12:25 PM

Speaker Talks


Speaker: Yoon Kim
Title: Towards Practical Neuro-symbolic Language Systems
Abstract: Neuro-symbolic models incorporate neural networks into classic symbolic systems and offer an alternative class of methods for building human-like computational models of language. In this talk, I present some recent work on using neuro-symbolic models for compositional sequence-to-sequence learning and also discuss some challenges that arise when working with such models in practice.


Speaker: Jacob Andreas
Title: Integrating structured linguistic resources into models for NLP
Abstract: Prior to the widespread use of deep learning methods, structured resources like dictionaries and grammars played a key role in NLP models for many tasks. Especially in low-resource settings, these resources still contain high-quality information about words, word meanings, and relations. How do we build neural models that can take advantage of these resources when we have them, and what can we learn from the design of structured resources about how to train better models from scratch?


Speaker: Emma Strubell
Title: Efficient NLP: Why, how, what?
Abstract: Large, pre-trained language models have become a basic building block for state-of-the-art NLP models. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to the small set of individuals, organizations, and applications with access to substantial computational resources. In this talk I’ll describe why efficient NLP is important, how we might make some NLP models more efficient, and what we may want to focus on going forward in order to maximize impact in this direction.


Speaker: Ellie Pavlick
Title: How many examples is enough?
Abstract: Neural network models are famously data hungry—they require many training examples to achieve good performance, and are easily tricked by spurious correlations during training that don’t generalize well. I will discuss recent work on how neural language models form and use generalizable representations of linguistic structure, and how their behavior at inference time is influenced by the amount and distribution of data seen in training.

12:25 PM - 12:45 PM

Panel Discussion
Jacob Andreas, Yoon Kim, Ellie Pavlick, Emma Strubell, Aude Oliva 

12:45 PM - 1:00 PM

Q&A and Wrap Up