Distinguished Lecture: Albert Cohen

Interview with Albert Cohen

As part of the 60 years of computer science and AI celebration, distinguished researchers from both disciplines have been invited to visit the School of Informatics, and we have asked them to tell us about their research. Albert Cohen is a research scientist at Google Brain and the chief architect of Tensor Comprehensions, the formulation underlying today's ML compilers.

Title: Pervasive Portable Performance: quand est-ce qu'on arrive ? (are we there yet?)

Lecture abstract

Despite decades of investment in software infrastructure, scientific computing, signal processing and machine learning are stuck in a rut. Some numerical computations are more equal than others: while the core linear algebra operations achieve near-peak performance, even marginally different variants do not get this chance. This results in a dramatic loss of programmability: a tiny group of low-level programmers specializing in heroic optimizations provide the rest of us with a limited range of precooked operations. And since we are going through a Cambrian explosion of hardware accelerators, the problem is all the more pressing and scientifically exciting.
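A minimal sketch of the gap described above, using NumPy as a stand-in for any numerical stack: the "blessed" operation `a @ b` dispatches to a tuned BLAS kernel, while a marginally different variant (here, a matrix product followed by a ReLU, a hypothetical example chosen for illustration) is typically executed as two separate passes over memory, materializing the intermediate and forfeiting the fused, near-peak implementation.

```python
import numpy as np

def matmul(a, b):
    # The precooked operation: dispatched to a vendor BLAS kernel,
    # hand-tuned to run near peak performance.
    return a @ b

def matmul_relu(a, b):
    # A marginally different variant, composed from precooked pieces:
    # the intermediate a @ b is materialized in memory, then traversed
    # again by the elementwise maximum -- no fused kernel exists for it.
    return np.maximum(a @ b, 0.0)

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))
c = matmul_relu(a, b)
```

The two functions have the same asymptotic cost, yet only the first benefits from decades of kernel engineering; closing that gap automatically is precisely the compiler problem the lecture addresses.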

Compilers are obviously part of the answer. But what compilers? Did you say domain-specific code generator? Sure, but what problem are we asking these to solve? Is there a specification of the numerical computations and programming interface they should obey, like BLAS, HLO, ONNX, for linear algebra and deep neural networks? Aren't these the precooked offerings we are precisely attempting to avoid? How should these compilers be built, deployed, retargeted, autotuned? Did I hear scheduling languages in the back? Oh, you meant autoscheduling? Definitely interesting, but this may only be the beginning of the answer, as the specification, deployment, and compiler construction issues remain. Also, what would be the respective roles of machine learning and operations research in the associated automation process? And what about the complexity and maintenance of the associated infrastructure? Some would claim that MLIR will bring peace to the world by making everything composable, extensible and reusable, but what if only a handful of experts dare to contribute? What about correctness? Is there room for mechanically proven compilation, or translation validation? What about the adepts of the affine cult living in a polyhedral world far, far away? Didn't they solve the problem already? If not, maybe those who fell to the program synthesis and SMT-solver side did?

We will review these questions, focusing on tensor algebra. We will also sketch a holistic, collaborative research agenda, aiming to forsake low-level programming while improving the productivity of compiler engineers:

  1. Establishing a parallel with hardware-software codesign, we will advocate for a new tile-level programming interface sitting between the top-level numerical operations and the generators of target- and problem-specific code.
  2. We will propose a structured approach to the construction of tensor compilers. This structure reflects the natural decomposition of tensor algebra. It also stems from the empirical observation by some (few) optimization experts that performance can be made compositional with proper discipline and tuning.
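The tile-level interface advocated above can be sketched in a few lines of NumPy. This is a hypothetical illustration, not the lecture's actual proposal: the top-level operation (a matrix product) is decomposed into fixed-size tiles, and the per-tile kernel is the unit a target-specific code generator would specialize and tune. The tile size `T` and the function names are assumptions made for the example.

```python
import numpy as np

T = 32  # illustrative tile size; in practice chosen per target

def tile_kernel(c_tile, a_tile, b_tile):
    # The tile-level contraction: the unit of work a target-specific
    # generator would lower to vectorized or accelerator code.
    c_tile += a_tile @ b_tile

def tiled_matmul(a, b):
    # Top-level operation expressed as a schedule over tiles; for
    # simplicity this sketch requires dimensions divisible by T.
    m, k = a.shape
    k2, n = b.shape
    assert k == k2 and m % T == 0 and n % T == 0 and k % T == 0
    c = np.zeros((m, n))
    for i in range(0, m, T):
        for j in range(0, n, T):
            for p in range(0, k, T):
                tile_kernel(c[i:i+T, j:j+T],
                            a[i:i+T, p:p+T],
                            b[p:p+T, j:j+T])
    return c
```

The point of the decomposition is that performance becomes compositional: the top-level loop nest captures the problem structure, while all target-specific heroics are confined to `tile_kernel`, which can be retargeted or autotuned independently.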

Speaker's bio: Albert Cohen

Albert is a research scientist at Google. An alumnus of École Normale Supérieure de Lyon and the University of Versailles, he has been a research scientist at Inria, a visiting scholar at the University of Illinois, an invited professor at Philips Research, and a visiting scientist at Facebook Artificial Intelligence Research. He works on parallelizing and optimizing compilers, machine learning compilers, and parallel and synchronous programming languages, with applications to high-performance computing, artificial intelligence and reactive control.