
Distinguished Lecture: Fernando Pereira


Title: Meaning Outside-In

Lecture abstract

Fernando Pereira with Srini Narayanan and Iulia Comșa
Google DeepMind

The temptation to try to explain the connection between form and meaning in language by “inside-out” recursive construction from elementary signs to complex combinations goes far back, further than I am qualified to discuss here. As one who yielded to that temptation for a substantial part of my research career, I struggled with a nagging suspicion that inside-out construction failed to capture the dominant effects of context on the semantic interpretation of complex texts, even as I worked with colleagues on increasingly expressive ways to attempt to capture the interactions between scope, ellipsis, anaphora, and more. Recent advances in large language models suggest a very different, “outside-in” view. By being rewarded for being a very good predictor of masked segments of its training texts in their full sociocultural diversity, the model is forced to learn regularities of language form and use at multiple scales, including regularities in discourse associated with commonly accepted chains of argumentation. That is, the models acquire some informal logic as well as those aspects of so-called common sense that can be “read from” what is written. In this, what the models seem to discover appears closer to ideas from cognitive linguistics and case-based reasoning than from logical semantics. Yet, with suitable prompting – which may be brittle in ways that human subjects are not – models can identify and explain the kinds of ambiguity that intrigued many of us in the inside-out tradition. Just to whet your appetite:

User: Alice recommended a book she hated before Mary did. Did Alice and Mary recommend the same book?

Model: It is not possible to say for sure whether Alice and Mary recommended the same book. The sentence "Alice recommended a book she hated before Mary did" is ambiguous. It could mean that Alice recommended a book that she hated, and then Mary later recommended the same book. Or, it could mean that Alice recommended a book that she hated, and then Mary later recommended a different book that she also hated.
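For readers who want to try this kind of probe themselves, here is a minimal sketch, not the setup behind the exchange above, that sends the same question to an off-the-shelf instruction-tuned model through the Hugging Face transformers pipeline. The model name is only an illustrative placeholder, and answers will vary with the model chosen and the exact prompt wording.

# A minimal sketch, assuming the Hugging Face `transformers` library is installed;
# the model name below is an illustrative placeholder, not the model quoted above.
from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-2-2b-it")

prompt = (
    "Alice recommended a book she hated before Mary did. "
    "Did Alice and Mary recommend the same book?"
)

# Greedy decoding keeps the probe deterministic; sampling would surface
# different paraphrases of the same two readings.
result = generator(prompt, max_new_tokens=200, do_sample=False)
print(result[0]["generated_text"])

As the abstract notes, such prompting can be brittle: small changes in wording may change whether the model surfaces the ambiguity at all.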

In this talk I will review how models handle examples of a variety of semantic, pragmatic, and embodied inference phenomena that many of us investigated for decades, both to start to see how current models perform as language reasoners and to explore what their remaining errors suggest about their representational limits. As when we asked informants for their native-speaker intuitions about possible sentence meanings, do not expect formal or statistical conclusions, at least not just yet. The journey has just started.

Speaker's bio

Fernando Pereira is VP, Research at Google DeepMind, where he focuses on putting language AI into practice. Before that, he led language AI and ML teams at Google Research from 2008 until 2022.

Before coming to Google, he was chair of the Computer and Information Science Department at the University of Pennsylvania and head of the Machine Learning and Information Retrieval Department at AT&T Labs, and held research and management positions at SRI International. He received a Ph.D. in Artificial Intelligence from the University of Edinburgh in 1982 and has over 120 research publications on computational linguistics, machine learning, bioinformatics, speech recognition, and logic programming, as well as several patents. He is a fellow of AAAI, ACM, and ACL; a member of the American Philosophical Society, the American Academy of Arts and Sciences, and the National Academy of Engineering; and a past president of the Association for Computational Linguistics.