MIT Quest AI Roundtable

Natural Language Processing for All

June 4, 2021 | 12pm - 1pm EST

Nearly 7,000 languages are spoken in the world today, but fewer than two dozen have the massive training data required to build AI applications like language translation and speech-to-text. The problem is compounded by a shortage of computing resources in much of the world. This panel will explore ways of making natural language processing more efficient, interpretable, and linguistically informed, to reach speakers of all languages.




  • Jacob Andreas
    X Consortium Career Development Assistant Professor, Department of Electrical Engineering and Computer Science
    Computer Science and Artificial Intelligence Laboratory
    • Natural Language Processing
    • Machine Learning
  • Aude Oliva
    Director of Strategic Industry Engagement, MIT Schwarzman College of Computing
    MIT Director, MIT-IBM Watson AI Lab
    Senior Research Scientist, MIT CSAIL
    • Computational Neuroscience
    • Cognitive Science
    • Computer Vision
    • Machine Learning
  • Ellie Pavlick
    Manning Assistant Professor of Computer Science
    Computer Science Department
    Brown University
  • Emma Strubell
    Assistant Professor
    Language Technologies Institute
    Carnegie Mellon University



Date: Friday, June 4, 2021
Time: 12pm - 1pm EST
Where: Zoom Webinar

12:00 PM - 12:05 PM

Aude Oliva

12:05 PM - 12:25 PM

Speaker Talks


Speaker: Yoon Kim
Title: Towards Practical Neuro-symbolic Language Systems
Abstract: Neuro-symbolic models incorporate neural networks into classic symbolic systems and offer an alternative class of methods for building human-like computational models of language. In this talk, I present recent work on using neuro-symbolic models for compositional sequence-to-sequence learning and discuss some challenges that arise when working with such models in practice.


Speaker: Jacob Andreas
Title: Integrating structured linguistic resources into models for NLP
Abstract: Prior to the widespread use of deep learning methods, structured resources like dictionaries and grammars played a key role in NLP models for many tasks. Especially in low-resource settings, these resources still contain high-quality information about words, word meanings, and relations. How do we build neural models that can take advantage of these resources when we have them, and what can we learn from the design of structured resources about how to train better models from scratch?


Speaker: Emma Strubell
Title: Efficient NLP: Why, how, what?
Abstract: Large, pre-trained language models have become a basic building block for state-of-the-art NLP models. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to the small set of individuals, organizations, and applications with access to substantial computational resources. In this talk I’ll describe why efficient NLP is important, how we might make some NLP models more efficient, and what we may want to focus on going forward in order to maximize impact in this direction.


Speaker: Ellie Pavlick
Title: How many examples is enough?
Abstract: Neural network models are famously data hungry: they require many training examples to achieve good performance, and are easily tricked by spurious correlations during training that don't generalize well. I will discuss recent work on how neural language models form and use generalizable representations of linguistic structure, and how their behavior at inference time is influenced by the amount and distribution of data seen in training.

12:25 PM - 12:45 PM

Panel Discussion
Jacob Andreas, Yoon Kim, Ellie Pavlick, Emma Strubell, Aude Oliva

12:45 PM - 1:00 PM

Q&A and Wrap Up