Professor Geoffrey Hinton Distinguished Lecture on Boltzmann Machines

Abstract

To train a neural net efficiently we need to compute the gradient of some measure of the performance of the net with respect to each of the connection weights. The standard way to do this is to use the chain rule to backpropagate gradients through layers of neurons. Professor Geoffrey Hinton will briefly review a few of the engineering successes of backpropagation and then describe a very different way of getting the gradients that, for a while, seemed a lot more plausible as a model of how the brain gets gradients.
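For concreteness, here is a minimal sketch, not drawn from the lecture, of what "using the chain rule to backpropagate gradients through layers of neurons" looks like for a small two-layer network with a squared-error loss; the sizes, data, and learning rate are made up for illustration.

    # Minimal chain-rule backpropagation through a two-layer net (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))          # 4 inputs with 3 features (made up)
    t = rng.normal(size=(4, 1))          # 4 targets (made up)
    W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(5, 1))

    for _ in range(100):
        # Forward pass.
        h = np.tanh(x @ W1)              # hidden layer
        y = h @ W2                       # linear output
        loss = 0.5 * np.mean((y - t) ** 2)
        # Backward pass: the chain rule gives the gradient of the loss
        # with respect to every connection weight.
        dy = (y - t) / len(x)            # dL/dy
        dW2 = h.T @ dy                   # dL/dW2
        dh = dy @ W2.T                   # dL/dh
        dW1 = x.T @ (dh * (1 - h ** 2))  # dL/dW1, propagated through the tanh
        W1 -= 0.1 * dW1
        W2 -= 0.1 * dW2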

Consider a system composed of binary neurons that can be active or inactive, with weighted pairwise couplings between neurons, including long-range couplings. If the neurons represent pixels in a binary image, we can store a set of binary training images by adjusting the coupling weights so that the images are local minima of a Hopfield energy function, which is minus the sum, over all pairs of active neurons, of their coupling weights. But this energy function can only capture pairwise correlations. It cannot represent the kinds of complicated higher-order correlations that occur in images. Now suppose that in addition to the "visible" neurons that represent the pixel intensities, we also have a large set of hidden neurons that have weighted couplings with each other and with the visible neurons. Suppose also that all of the neurons are asynchronous and stochastic: each neuron adopts the active state with log odds equal to the difference in the energy function between its inactive and active states. Given a set of training images, is there a simple way to set the weights on all of the couplings so that the training images are local minima of the free energy function obtained by integrating out the states of the hidden neurons? The Boltzmann machine learning algorithm solved this problem in an elegant way. It was a proof of principle that learning in neural networks with hidden neurons was possible using only locally available information, contrary to what was generally believed at the time.
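A minimal sketch, not part of the lecture, of the classical Boltzmann machine learning procedure as it is usually described: with energy E(s) = −Σ_{i<j} w_ij s_i s_j, each binary neuron turns on with log odds equal to its energy gap Σ_j w_ij s_j, and each coupling weight is nudged by the difference between how often its two neurons are co-active with the visible neurons clamped to a training image (the positive phase) and how often they are co-active when the whole network runs free (the negative phase). The tiny dataset, network sizes, and function names below are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gibbs_sweep(s, W, free_idx):
        # One asynchronous sweep: each free neuron turns on with log odds equal
        # to its energy gap  E(s_i = 0) - E(s_i = 1) = sum_j w_ij s_j.
        for i in free_idx:
            gap = W[i] @ s                # self-couplings stay zero below
            s[i] = 1.0 if rng.random() < sigmoid(gap) else 0.0
        return s

    def train(images, n_hidden, epochs=200, lr=0.05, sweeps=10):
        n_vis = images.shape[1]
        n = n_vis + n_hidden
        W = np.zeros((n, n))              # symmetric couplings, zero diagonal
        hid = np.arange(n_vis, n)
        everyone = np.arange(n)
        for _ in range(epochs):
            pos = np.zeros_like(W)
            neg = np.zeros_like(W)
            for v in images:
                # Positive phase: clamp the visible neurons to a training image
                # and let the hidden neurons settle by Gibbs sampling.
                s = np.concatenate([v, rng.integers(0, 2, n_hidden).astype(float)])
                for _ in range(sweeps):
                    gibbs_sweep(s, W, hid)
                pos += np.outer(s, s)
                # Negative phase: let all neurons run free (a short chain here is
                # only a crude stand-in for sampling the model's equilibrium).
                s = rng.integers(0, 2, n).astype(float)
                for _ in range(sweeps):
                    gibbs_sweep(s, W, everyone)
                neg += np.outer(s, s)
            # Local learning rule: strengthen a coupling when its two neurons
            # co-occur more often with the data clamped than when running free.
            dW = lr * (pos - neg) / len(images)
            np.fill_diagonal(dW, 0.0)
            W += dW
        return W

    # Tiny made-up dataset: four 3-pixel binary "images".
    images = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]], dtype=float)
    W = train(images, n_hidden=2)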

Speaker Biography

Geoffrey Hinton received his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at the University of California San Diego and spent five years as a faculty member in the Computer Science department at Carnegie-Mellon University. He then moved to the Department of Computer Science at the University of Toronto, where he is now a Professor Emeritus. From 2013 to 2023 he worked half-time for Google, where he became a Vice President and Engineering Fellow.

He was one of the researchers who introduced the backpropagation algorithm and the first to use backpropagation for learning word embeddings. His other contributions to neural network research include Boltzmann machines, distributed representations, time-delay neural nets, mixtures of experts, variational learning and deep learning. His research group in Toronto made major breakthroughs in deep learning that revolutionized speech recognition and object classification.

Geoffrey Hinton is a fellow of the UK Royal Society and a foreign member of the US National Academy of Engineering and the US National Academy of Sciences. His awards include the David E. Rumelhart Prize, the IJCAI Award for Research Excellence, the Killam Prize for Engineering, the NSERC Herzberg Gold Medal, the IEEE James Clerk Maxwell Gold Medal, the NEC C&C Award, the BBVA Award, the Honda Prize, the ACM Turing Award, the Princess of Asturias Award, the VinFuture Grand Prize, and the Nobel Prize in Physics.