Mission Update - Language

Date: March 19, 2024 | 4pm EST
Location: Quest Conference Room, 45-792

Large language models are fundamental building blocks in many modern AI systems, spanning language processing, robotics, computer vision, software engineering, and more. For models trained on text to be useful for general AI and scientific applications, they must understand not just the structure of language but also the structure of the world; moreover, their language, reasoning, and world-knowledge capabilities must align with those of humans. The research under the umbrella of our Mission aims to provide a robust, theoretically motivated, and empirically grounded framework for studying and improving world knowledge and reasoning in large language models, and to use our understanding of human cognition to make models better. In this talk, we will outline a framework for dissociating language and thought in LLMs (Mahowald, Ivanova et al., forthcoming, TiCS) and briefly discuss several ongoing efforts to (i) design a comprehensive benchmark targeting key aspects of world knowledge; (ii) build models with improved coherence and factual accuracy; (iii) investigate the relationship between language and mathematical and inductive reasoning in humans and LLMs; and (iv) build developmentally plausible models of language learning.