Friday, 21st April 2023 - 11am - Clara Meister: Seminar

Title: Using psycholinguistics to understand the decoding of probabilistic language generators

 

Abstract: 

Standard probabilistic language generators often fall short when it comes to producing coherent and fluent text, despite the fact that the underlying models perform well under standard metrics, e.g., perplexity. This discrepancy has puzzled the language generation community for the last few years. In this talk, we'll take a different approach to looking at generation from probabilistic models, drawing on concepts from psycholinguistics and information theory in an attempt to explain some observed behaviors, e.g., why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, and aim to do so in a manner that is simultaneously efficient and error-minimizing; indeed, psycholinguistics research suggests that humans choose each word in a string with this subconscious goal in mind. We propose that decoding from probabilistic models of language should attempt to mimic these behaviors. To motivate this notion, we'll look at common characteristics of several successful decoding strategies, showing how their design allows them to implicitly adhere to attributes of efficient and robust communication. We will then propose a new decoding strategy with the explicit aim of encoding these human-like properties of natural language usage into generated text.
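As a rough, hypothetical illustration of the kind of decoding strategy the abstract gestures at (the abstract does not name a specific algorithm), the Python sketch below restricts sampling to tokens whose information content, i.e. surprisal, lies close to the expected information content (entropy) of the next-token distribution, then renormalizes and samples from that restricted set. The function name, the mass parameter, and the toy distribution are illustrative assumptions, not the method presented in the talk.

    import numpy as np

    def information_constrained_sample(probs, mass=0.9, rng=None):
        # Hypothetical sketch: keep the tokens whose surprisal (-log p) is
        # closest to the entropy of the distribution, up to a cumulative
        # probability mass, then renormalize and sample from that set.
        if rng is None:
            rng = np.random.default_rng()
        probs = np.asarray(probs, dtype=float)
        surprisal = -np.log(probs + 1e-12)               # per-token information content
        entropy = float(np.sum(probs * surprisal))       # expected information content
        order = np.argsort(np.abs(surprisal - entropy))  # closest-to-entropy first
        cumulative = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cumulative, mass)) + 1
        keep = order[:cutoff]
        kept = probs[keep] / probs[keep].sum()           # renormalize over the kept tokens
        return int(rng.choice(keep, p=kept))

    # Toy next-token distribution over a 5-token vocabulary.
    p = np.array([0.60, 0.20, 0.10, 0.06, 0.04])
    print(information_constrained_sample(p, mass=0.9))

Where greedy decoding would always pick the single most probable token, a sketch like this prefers tokens whose surprisal is close to the average, which is one way of formalizing "neither too predictable nor too surprising."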

Bio:

Clara is a third-year PhD student supervised by Professor Ryan Cotterell at ETH Zürich. Her research focuses on decoding methods for language generators, analysis techniques for language models, and computational methods in psycholinguistics. During her PhD she has had the privilege of interning with DeepMind's language team, and she is currently supported by a Google PhD Fellowship.

 
