MIT Quest AI Roundtable

Natural Language Processing for All

June 4, 2021 | 12pm - 1pm EST

Nearly 7,000 languages are spoken in the world today, but fewer than two dozen have the massive training data required to build AI applications like language translation and speech-to-text. The problem is compounded by a shortage of computing resources in much of the world. This panel will explore ways of making natural language processing more efficient, interpretable, and linguistically informed, to reach speakers of all languages.




  • Jacob Andreas
    X Consortium Career Development Assistant Professor, Department of Electrical Engineering and Computer Science
    Computer Science and Artificial Intelligence Laboratory
    • Natural Language Processing
    • Machine Learning
  • Aude Oliva
    Director of Strategic Industry Engagement, MIT Schwarzman College of Computing
    MIT Director, MIT-IBM Watson AI Lab
    Senior Research Scientist, MIT CSAIL
    • Computational Neuroscience
    • Cognitive Science
    • Computer Vision
    • Machine Learning
  • Ellie Pavlick
    Manning Assistant Professor of Computer Science
    Computer Science Department
    Brown University
  • Emma Strubell
    Assistant Professor
    Language Technologies Institute
    Carnegie Mellon University



Date: Friday, June 4, 2021
Time: 12pm - 1pm EST
Where: Zoom Webinar

12:00 PM - 12:05 PM

Aude Oliva

12:05 PM - 12:25 PM

Speaker Talks


Speaker: Yoon Kim
Title: Towards Practical Neuro-symbolic Language Systems
Abstract: Neuro-symbolic models incorporate neural networks into classic symbolic systems and offer an alternative class of methods for building human-like computational models of language. In this talk, I present recent work on using neuro-symbolic models for compositional sequence-to-sequence learning and discuss some challenges that arise when working with such models in practice.


Speaker: Jacob Andreas
Title: Integrating structured linguistic resources into models for NLP
Abstract: Prior to the widespread use of deep learning methods, structured resources like dictionaries and grammars played a key role in NLP models for many tasks. Especially in low-resource settings, these resources still contain high-quality information about words, word meanings, and relations. How do we build neural models that can take advantage of these resources when we have them, and what can we learn from the design of structured resources about how to train better models from scratch?


Speaker: Emma Strubell
Title: Efficient NLP: Why, how, what?
Abstract: Large, pre-trained language models have become a basic building block for state-of-the-art NLP models. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to the small set of individuals, organizations, and applications with access to substantial computational resources. In this talk I’ll describe why efficient NLP is important, how we might make some NLP models more efficient, and what we may want to focus on going forward in order to maximize impact in this direction.


Speaker: Ellie Pavlick
Title: How many examples is enough?
Abstract: Neural network models are famously data hungry: they require many training examples to achieve good performance, and are easily tricked by spurious correlations during training that don't generalize well. I will discuss recent work on how neural language models form and use generalizable representations of linguistic structure, and how their behavior at inference time is influenced by the amount and distribution of data seen in training.

12:25 PM - 12:45 PM

Panel Discussion
Jacob Andreas, Yoon Kim, Ellie Pavlick, Emma Strubell, Aude Oliva

12:45 PM - 1:00 PM

Q&A and Wrap Up