Tuesday, 6th December 2022
Opportunities and challenges for deep learning in biotechnology - Diego Oyarzun

Abstract: A central goal in biotechnology is engineering cells that produce high-value chemicals. These compounds feed into many products we use in our everyday lives, including food, cosmetics, medicines, and materials. Since cells can feed on sustainable sources (e.g. food waste), this technology offers a promising path to move away from petrochemical-based production and promote a more circular economy.

From a computational standpoint, the problem is to regress protein production from short sequences of DNA, i.e. strings of 50-200 characters from a four-letter alphabet. Such regressors can then be wrapped into optimisation routines to find new DNA sequences that produce more of the target protein. In a recent paper to appear in Nature Communications (preprint available), we showed that off-the-shelf deep learning architectures can effortlessly provide high predictive accuracy that is sufficient for most applications in biotechnology. The real challenge is to make these algorithms work for end users in biology, particularly given the large data requirements for training. Biological data is expensive, and few laboratories have the incentives or budgets to invest six-figure sums solely for the purpose of model training.

In this talk, I will discuss some of our technical results and seek feedback from our ANC colleagues on approaches for low-N regression that could be useful in this class of problems. The work is the result of a collaboration between biology PhD students, machine learners (Oisin Mac Aodha from ANC), and molecular biologists in France. The ideas in the talk are also the subject of an upcoming article in the journal Current Opinion in Biotechnology.

Event type: Workshop
Date: Tuesday, 6th December 2022
Time: 11:00
Location: G.03
Speaker(s): Diego Oyarzun
Chair/Host: Nigel Goddard

This article was published on 2024-11-22