Distinguished Lecture - hosted by UKRI CDT in Natural Language Processing

Model Flows: Powering AI of Science and Science of AI

Abstract:

Neural language models with billions of parameters, trained on trillions of words, are powering the fastest-growing computing applications in history and generating discussion and debate around the world. Yet most scientists cannot study or improve these state-of-the-art models, because the organizations deploying them keep their data and machine learning processes secret. I believe the path to models that are usable by all at low cost, customizable for areas of critical need like the sciences, and transparent and understandable in their capabilities and limitations is open development, with academic and not-for-profit researchers empowered to do reproducible science. In this talk, I’ll discuss some of the work our team is doing to open up the science of language modeling, making it possible to explore new scientific questions and to democratize control of the future of this fascinating and important technology.

The work I’ll present was carried out by a large team at the Allen Institute for Artificial Intelligence in Seattle, in collaboration with the Paul G. Allen School at the University of Washington, and co-led with Hanna Hajishirzi. The team is grateful for various kinds of support and coordination from many organizations, including the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, AMD, CSC - IT Center for Science (Finland), Databricks, Together.ai, the National AI Research Resource Pilot, Oak Ridge National Laboratory, the National Science Foundation, and NVIDIA.

Bio:

Noah A. Smith is a researcher in natural language processing and machine learning. He serves as Vice Provost for Artificial Intelligence, holder of the Charles and Lisa Simonyi Endowed Chair for Artificial Intelligence and Emerging Technologies, and Professor of Computer Science and Engineering at the University of Washington, and as Senior Director of NLP Research at the Allen Institute for AI. He co-directs the OLMo open language modeling initiative and is the PI of the NSF- and NVIDIA-supported project “Open Multimodal AI Infrastructure to Accelerate Science.” His current work spans language, music, and AI research methodology, with a strong emphasis on mentoring; his former mentees now hold faculty and leadership roles worldwide. Smith is a Fellow of the Association for Computational Linguistics and has received numerous awards for research and innovation.