Natural Language Processing for All
Nearly 7,000 languages are spoken in the world today, but fewer than two dozen have the massive training data required to build AI applications such as language translation and speech-to-text transcription. The problem is compounded by a shortage of computing resources in much of the world. This panel will explore ways of making natural language processing more efficient, interpretable, and linguistically informed, so that it can reach speakers of all languages.
Speakers
- Associate Professor, Department of Electrical Engineering and Computer Science; Computer Science and Artificial Intelligence Laboratory
  - Natural Language Processing
  - Machine Learning
Schedule
Date: Friday, June 4, 2021
Time: 12pm - 1pm ET
Where: Zoom Webinar
Introduction
Aude Oliva
Speaker Talks
Speaker: Yoon Kim
Title: Towards Practical Neuro-symbolic Language Systems
Abstract: Neuro-symbolic models incorporate neural networks into classic symbolic systems and offer an alternative class of methods for building human-like computational models of language. In this talk, I present some recent work on using neuro-symbolic models for compositional sequence-to-sequence learning and also discuss some challenges that arise when working with such models in practice.
Speaker: Jacob Andreas
Title: Integrating structured linguistic resources into models for NLP
Abstract: Prior to the widespread use of deep learning methods, structured resources like dictionaries and grammars played a key role in NLP models for many tasks. Especially in low-resource settings, these resources still contain high-quality information about words, word meanings, and relations. How do we build neural models that can take advantage of these resources when we have them, and what can we learn from the design of structured resources about how to train better models from scratch?
Speaker: Emma Strubell
Title: Efficient NLP: Why, how, what?
Abstract: Large, pre-trained language models have become a basic building block for state-of-the-art NLP models. Unfortunately, training and deploying these models comes at a high computational cost, limiting their development and use to the small set of individuals, organizations, and applications with access to substantial computational resources. In this talk, I'll describe why efficient NLP is important, how we might make some NLP models more efficient, and what we may want to focus on going forward in order to maximize impact in this direction.
Speaker: Ellie Pavlick
Title: How many examples is enough?
Abstract: Neural network models are famously data hungry—they require many training examples to achieve good performance, and are easily tricked by spurious correlations during training that don’t generalize well. I will discuss recent work on how neural language models form and use generalizable representations of linguistic structure, and how their behavior at inference time is influenced by the amount and distribution of data seen in training.
Panel Discussion
Jacob Andreas, Yoon Kim, Ellie Pavlick, Emma Strubell, Aude Oliva
Q&A and Wrap Up