2023 cohort

Meet our 2023 cohort.

Alex Belo

PhD Project: Fragment-based drug design: an AI approach to go from fragment to drug molecule

Supervisors: Antonia Mey, Chris Wood

Drug design and development is a key area of scientific research with a direct impact on quality of life. In the early stages of this process, candidate small-molecule drugs are designed and then refined and tested for their medicinal potential. A key experimental technique at this stage is fragment-based drug design (FBDD): small molecular fragments are experimentally screened against a protein target of choice and ranked by their observed interaction with the target. These data can be used directly to select fragments for further development into a functional drug molecule, or as a training set for machine learning models for drug design.

This project will leverage data from fragment-based X-ray screens at the XChem beamline at Diamond and literature datasets to design AI-based methods for generating drug-like molecules from fragments. The first goal is to analyse and clean existing fragment-based data and make this data usable in machine learning applications. We will investigate interaction patterns between fragments and protein targets to identify recurring common pharmacophores using clustering strategies. We will design benchmarks that test AI-based and conventional methods on case studies of going from fragment to drug candidate. Lastly, we will combine these methods with accurate ways of assessing binding affinities between the proposed molecules and the protein target, leveraging ML-based methods as well as alchemical free energy simulation-based methods.
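As a flavour of the clustering step, the following is a minimal Python sketch of grouping fragment hits into recurring chemotypes by fingerprint similarity using RDKit's Butina clustering. The SMILES strings are illustrative placeholders rather than real XChem hits, and a genuine pharmacophore analysis would use 3D interaction features rather than 2D fingerprints.

```python
# Toy sketch: cluster fragment hits by 2D fingerprint similarity (Butina clustering).
# Assumes RDKit is installed; the SMILES list is illustrative, not real screening data.
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit import DataStructs
from rdkit.ML.Cluster import Butina

fragments = ["c1ccccc1O", "c1ccncc1", "CC(=O)Nc1ccccc1", "c1ccc2[nH]ccc2c1"]
mols = [Chem.MolFromSmiles(s) for s in fragments]
fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048) for m in mols]

# Butina expects a flattened lower-triangle distance matrix (1 - Tanimoto similarity).
dists = []
for i in range(1, len(fps)):
    sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
    dists.extend(1.0 - s for s in sims)

clusters = Butina.ClusterData(dists, len(fps), distThresh=0.6, isDistData=True)
for idx, cluster in enumerate(clusters):
    print(f"cluster {idx}: {[fragments[j] for j in cluster]}")
```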


Leonie Bossemeyer

PhD Project: Assisting decision making from visual data via human-in-the-loop AI

Supervisors: Oisin Mac Aodha, Tom MacGillivray

The proposed research project explores enhancing human-machine collaboration in high-stakes visual decision-making settings such as radiology and ophthalmology. Given the rapid advancements in computer vision and machine learning, integrating AI methods into medical decision-making processes holds significant promise.

However, current systems often interact with users in a static manner, failing to adapt to the dynamic learning processes of human experts. This project aims to develop ML systems that not only assist humans but also adapt to their evolving knowledge and skills. To this end, we will investigate how interpretable representations can be leveraged to foster direct collaboration between users and AI systems, how more realistic models of human learning can improve the teaching of human users, and how decision making can be supported in real time. Ultimately, this research aims to create adaptive systems that support decision-making by managing cognitive load and providing relevant information, while also contributing to the user's learning process, making human-machine collaboration more effective and keeping the final decision in the user's hands.
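One simple ingredient of such a system is a deferral rule that routes low-confidence cases to the expert, keeping the final decision in human hands. The sketch below is a toy illustration under assumed softmax outputs and an arbitrary threshold, not the project's actual interaction model.

```python
import numpy as np

# Toy sketch of a confidence-based deferral rule: the model answers only when its
# softmax confidence clears a threshold; otherwise the case is routed to the expert.
# The threshold and probabilities are illustrative placeholders.
def route(probabilities: np.ndarray, threshold: float = 0.85) -> str:
    confidence = probabilities.max()
    return "model decision" if confidence >= threshold else "defer to human expert"

print(route(np.array([0.93, 0.05, 0.02])))  # -> model decision
print(route(np.array([0.55, 0.30, 0.15])))  # -> defer to human expert
```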

Yu (Jade) Cheng

PhD Project: Robust Visual and Textual Explanations in the Medical Domain

Supervisors: Hakan Bilen, Mirella Lapata

This project addresses the critical need for interpretability in medical image classification by developing a framework that combines gradient-based visual explanations with concept-based classification. Existing approaches to medical image classification, particularly those built on architectures like CLIP, often provide little diagnostic information beyond basic image labels, limiting their clinical applicability and reliability. Our approach leverages gradient-based techniques to visually interpret how Transformer-based models derive feature embeddings from input images, enhancing the understanding of model predictions. Additionally, inspired by recent advances in concept-based models, we employ large language models, such as Llama 3.1, to generate concise, clinically relevant descriptors for diseases and pathologies. These descriptors serve as concept vectors aligned with visual features from images, providing interpretable outputs. By integrating visual and textual explanations, this project aims to develop a robust and interpretable AI framework that improves the clinical adoption of medical image classification models, ultimately fostering greater trust and reliability in AI-driven diagnostics.
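The following is a minimal sketch of the concept-bottleneck idea described above: image embeddings are scored against text-derived concept vectors, and a linear head classifies from those interpretable scores. Random tensors stand in for CLIP-style image and descriptor embeddings; this is an illustration of the mechanism, not the project's implementation.

```python
import torch
import torch.nn.functional as F

# Toy concept-bottleneck sketch: an image embedding is scored against text-derived
# concept vectors, and a linear head classifies from the concept scores alone, so
# every prediction decomposes over named clinical descriptors. Random tensors stand
# in for real CLIP-style embeddings.
torch.manual_seed(0)
n_concepts, dim, n_classes = 8, 512, 3

image_embedding = F.normalize(torch.randn(1, dim), dim=-1)
concept_vectors = F.normalize(torch.randn(n_concepts, dim), dim=-1)  # from LLM descriptors

concept_scores = image_embedding @ concept_vectors.T   # (1, n_concepts), interpretable layer
classifier = torch.nn.Linear(n_concepts, n_classes)
logits = classifier(concept_scores)

print("concept scores:", concept_scores.squeeze().tolist())
print("predicted class:", logits.argmax(dim=-1).item())
```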

Chaeeun Lee

PhD Project: Improving Clinical Evidence Retrieval for Rare and Genetic Diseases

Supervisors: Ian Simpson, Pasquale Minervini

Dewy Nijhof

PhD Project: Exploring Molecular Mechanisms of Comorbidity: A Network-Based Analysis of ADHD and Autism

Supervisors: Oksana Sorokina, Douglas Armstrong

Autism Spectrum Disorder (ASD) and Attention-Deficit/Hyperactivity Disorder (ADHD) are two prevalent neurodevelopmental disorders (NDDs) that are frequently comorbid, posing significant challenges for diagnosis and treatment. Despite their distinct clinical profiles, they share several overlapping symptoms, suggesting common underlying genetic mechanisms. However, comprehensive research exploring this genetic overlap is still lacking. This PhD project aims to address this gap by investigating the shared molecular basis of ASD and ADHD, with a particular focus on synaptic mechanisms.

Building on preliminary research conducted during a Master of Science by Research (MScRes) pilot project, this study will meticulously curate disorder-specific gene lists for ASD and ADHD through an extensive literature review. Each gene included in these lists will be supported by at least two independent studies, with careful consideration given to sample demographics, data sources, and methodological rigor. The central hypothesis of this research is that the majority of the genetic overlap between ASD and ADHD is localized within the synapse, where brain cells communicate with each other. This hypothesis is grounded in extensive literature that highlights synaptic dysfunction as a key feature of both disorders.
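As an illustration of how such an overlap could be quantified, the sketch below runs a hypergeometric enrichment test on assumed, purely illustrative gene-list sizes; the real analysis would use the curated disorder-specific lists described above.

```python
from scipy.stats import hypergeom

# Toy sketch: test whether the overlap between curated ASD and ADHD gene lists is
# larger than chance, given a background of protein-coding genes. All counts are
# illustrative placeholders, not results from the curated lists.
background = 20000   # protein-coding genes
asd_genes = 800      # curated ASD list size
adhd_genes = 300     # curated ADHD list size
overlap = 60         # genes appearing on both lists

# P(overlap >= observed) if the ADHD list were drawn at random from the background.
p_value = hypergeom.sf(overlap - 1, background, asd_genes, adhd_genes)
print(f"enrichment p-value: {p_value:.3g}")
```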

Kendig Sham

PhD Project: Developing Methods for Causal Mechanisms in Spatiotemporal Biomedical Data

Supervisors: Stuart King, Ian MacCormick, Christopher Lucas

Scientific research often involves studying causal mechanisms. However, natural phenomena are inherently spatiotemporal. To advance scientific discovery, it is crucial to apply robust causal inference methods to spatial data. This approach holds significant potential across biomedical research, from geographical modelling of disease spread to prevent pandemics, to identifying disease causality through biomedical imaging. The predominant current paradigm for causal inference, the Rubin Causal Model (RCM), is better suited to traditional pharmaceutical clinical trials. The RCM requires the Stable Unit Treatment Value Assumption (SUTVA) to hold: interventions are standardised, and inter-subject interference is absent. SUTVA often fails in the context of spatiotemporal data, necessitating a revised approach to causal inference. This project is divided into five phases to address these challenges.

First, I will review methods for causal discovery and causal inference from spatial data, and explore the links between statistical mechanics and causal inference.

Second, I will create synthetic datasets from known dynamics. These datasets will serve as benchmarks for measuring the effectiveness of current and new methods (a minimal sketch of such a dataset follows the summary below).

Third, I will evaluate existing published methods on published non-biomedical spatial data as well as on the synthetic benchmarking data.

Fourth, I will develop new methods suited to spatiotemporal data and benchmark them on the synthetic data.

Last, I will apply methods that demonstrate satisfactory results on the benchmarking data to biomedical datasets with a spatiotemporal structure.

In summary, this project aims to develop new methods for understanding causal mechanisms from spatial data, sitting at the intersection of space, time and mechanism.
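To illustrate the second phase, the following sketch generates a synthetic spatial dataset with a known direct effect plus neighbour interference, so SUTVA fails by construction and the bias of a naive difference-in-means estimator can be measured against ground truth. All parameters are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Toy synthetic benchmark: units on a grid receive a spatially clustered treatment
# (as real interventions often are), and each unit's outcome depends on its own
# treatment plus the fraction of treated neighbours, violating SUTVA by design.
rng = np.random.default_rng(0)
side, direct_effect, spillover = 50, 2.0, 1.5

# Spatially clustered treatment from a smoothed random field.
field = uniform_filter(rng.normal(size=(side, side)), size=5, mode="wrap")
treatment = (field > 0).astype(float)

# Fraction of treated 4-neighbours (wrapped shifts) -> interference term.
neighbour_frac = sum(np.roll(treatment, s, axis=a) for s in (-1, 1) for a in (0, 1)) / 4.0
outcome = direct_effect * treatment + spillover * neighbour_frac + rng.normal(0, 1, (side, side))

# Naive difference-in-means is biased because treated units have more treated neighbours.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()
print(f"true direct effect: {direct_effect}, naive difference-in-means: {naive:.2f}")
```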

Luwei (Demi) Wang

Personal website

PhD Project: Clustering and Causality in Healthcare

Supervisors: Sohan Seth, Nazir Lone

This research proposal aims to address the limitations of current personalized medicine approaches, which predominantly rely on correlation-based methods, by developing novel causal machine learning models specifically tailored to healthcare data. Traditional machine learning techniques excel at prediction tasks but often fall short in identifying the underlying causal mechanisms that influence patient outcomes. To advance precision medicine, there is a pressing need for methods that go beyond prediction to extract actionable, causal insights from observational data, thereby enabling more informed and effective medical interventions. The proposed research will focus on integrating causal inference with clustering techniques to uncover both hidden and observed causal relationships within complex healthcare datasets. By leveraging cumulants and conditional independence tests, the research will develop models that can accurately identify and quantify causal dependencies, even in the presence of latent confounders. These models aim to be interpretable, stable, and operationalizable in clinical settings, ultimately improving decision-making processes and patient outcomes. Key outputs of the project will include the development of robust, nonparametric clustering and causal models, the creation of open-source tools for the wider research community, and a series of publications detailing the methodologies and their applications in healthcare. This work will provide healthcare professionals with advanced tools that offer deeper causal insights, thereby enhancing the efficacy of personalized medicine.
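As a flavour of the conditional-independence machinery mentioned above, here is a minimal partial-correlation test on synthetic data where Z is a common cause of X and Y: the two are correlated marginally but independent given Z. This is a textbook illustration, not the project's model.

```python
import numpy as np
from scipy import stats

# Toy partial-correlation conditional-independence test: regress X and Y on the
# conditioning variable Z, then correlate the residuals. Data are synthetic, with
# Z as a common cause of X and Y.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)

def residualise(v, z):
    Z = np.column_stack([np.ones_like(z), z])      # intercept + Z
    beta, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ beta

r_marg, p_marg = stats.pearsonr(x, y)
r_part, p_part = stats.pearsonr(residualise(x, z), residualise(y, z))
print(f"marginal: r={r_marg:.2f} (p={p_marg:.1e}); given Z: r={r_part:.2f} (p={p_part:.2f})")
```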

Cameron Wheeler

PhD Project: Advancement of 3D Segmentation Methodologies for Multi-Organ Systems within Positron Emission Tomography-Computed Tomography

Supervisors: Adriana Tavares, Eleonora D'Arnese

Medical imaging generates vast amounts of 3D volumetric data and plays a crucial role in numerous clinical workflows. Within medical imaging, computed tomography (CT) and positron emission tomography (PET) are frequently combined to perform anatomical and functional imaging. With the rise of total-body scanners, it is now feasible to image a whole patient within 1-2 bed positions. However, most image analysis and quantification is still performed manually. A key analysis step is the delineation of volumes of interest (VOIs) such as organ structures or tumours. Manual VOI delineation is slow and produces inconsistent results that are prone to intra-observer variability, problems that are amplified by the increased data volumes from total-body scanners. Consequently, there is a growing need to automate this process. Current methods range from simple value thresholding to machine learning algorithms; however, existing automated methods for multi-organ whole-body image analysis remain inadequate for clinical applications. This is particularly true for diseased datasets, a serious limitation given that almost all patients are imaged precisely because they are ill. These inaccuracies can affect clinical decision making and result in substandard patient treatment. As such, this project aims to advance the field of 3D multi-organ segmentation for PET/CT imaging, with a specific focus on clinical application to diseased patients.

There are two avenues to this project. In the first, I shall design benchmarking software; this contribution aims not only to assist in evaluating my own work but also to standardize and simplify how evaluation is performed in the field. Currently, evaluation practices vary widely and best practices are not followed consistently, making methodological comparisons difficult. As such, there is a need for simple-to-use, easily accessible leaderboard and benchmarking software. Using this software, I shall set the baselines against which to compare my other work; a minimal sketch of the kind of metric such software would standardize follows.
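This sketch computes per-organ Dice scores between predicted and reference 3D label volumes, assuming integer-labelled volumes and random placeholder data.

```python
import numpy as np

# Toy sketch of a metric a segmentation leaderboard would standardise: per-organ
# Dice scores between predicted and reference 3D label volumes. Labels and volumes
# here are random placeholders.
def dice(pred: np.ndarray, ref: np.ndarray, label: int) -> float:
    p, r = pred == label, ref == label
    denom = p.sum() + r.sum()
    return 2.0 * np.logical_and(p, r).sum() / denom if denom else float("nan")

rng = np.random.default_rng(0)
reference = rng.integers(0, 4, size=(64, 64, 64))   # 0 = background, 1-3 = organs
prediction = np.where(rng.random(reference.shape) < 0.9, reference, 0)  # ~90% agreement

for organ in (1, 2, 3):
    print(f"organ {organ}: Dice = {dice(prediction, reference, organ):.3f}")
```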

The second avenue will focus on advancing the current state of the art in multi-organ (abdominal) segmentation, and there are several ways I seek to do this. Currently, foundation models for segmentation perform well in 2D object segmentation, but their application within medical imaging is underwhelming; parameter-efficient fine-tuning (PEFT), a family of methods for adapting foundation models to domain-specific applications, remains underexplored within 3D medical image segmentation (a generic sketch follows). Secondly, current multi-organ frameworks utilize several models for different groups of target structures. An investigation into the performance of a single model for all target structures within the abdominal region could also prove useful, as inference would then only have to take place once. Finally, it has been shown that Transformer-based architectures underperform compared to convolutional architectures. An investigation into modifications that improve the Transformer for medical segmentation tasks could yield clinically applicable performance. This could prove helpful for diseased images, where global context within the image often affects the morphology of VOIs.
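The following is a generic LoRA-style sketch of parameter-efficient fine-tuning: a frozen linear layer is augmented with a trainable low-rank update, so only a small fraction of parameters are trained. It illustrates the PEFT idea in general, not any specific segmentation foundation model.

```python
import torch

# Toy LoRA-style adapter: the pretrained weights are frozen and only a low-rank
# update (rank * (in + out) parameters) is trained. A generic PEFT illustration.
class LoRALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # freeze pretrained weights
        self.down = torch.nn.Linear(base.in_features, rank, bias=False)
        self.up = torch.nn.Linear(rank, base.out_features, bias=False)
        torch.nn.init.zeros_(self.up.weight)   # start as a zero (identity) update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

layer = LoRALinear(torch.nn.Linear(256, 256))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} of {total}")
```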

Junkai Yang

PhD Project: Multi-model decoder of V1 neural signals

Supervisors: Arno Onken, Nathalie Rochefort

Encoding and decoding neural signals in the primary visual cortex (V1) are widely studied topics in neuroscience. While encoding helps us understand how the brain processes visual information, decoding can potentially benefit the development of brain-computer interfaces for therapeutic purposes. However, pixel-by-pixel reconstruction of visual stimuli from human data remains challenging, since human recordings are usually noisy and have low spatial resolution, and the human visual system is inherently complex. In contrast, the visual system of the mouse shares similar properties with that of the human while being relatively simpler. Newly developed technologies such as two-photon calcium imaging even allow several thousand neurons to be monitored simultaneously, making the mouse a good model for both tasks. To leverage this single-cell-level technology for visual stimulus reconstruction, this project will include a series of experiments from three angles: 1) generative models such as generative adversarial networks (GANs) and diffusion models (DMs) have been used for decoding electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) data, and I will explore the feasibility of applying generative models to decoding calcium imaging data; 2) I will reconstruct images by splitting the decoding into two steps, where the first model converts the neural responses to latent vectors and the second model reconstructs the natural images from the latent vectors (a minimal sketch of this two-stage idea follows); 3) I will experiment with different strategies for integrating behavioural information during decoding. Eventually, the optimal solutions from the three angles will be combined to form the final decoder, and I will conduct quantitative comparisons between the final decoder and models from other studies.
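A minimal sketch of the two-stage idea, with synthetic data standing in for calcium-imaging responses and a closed-form ridge regression as the first stage; the second stage would be a pretrained generator, which is only indicated here.

```python
import numpy as np

# Toy two-stage decoder: stage 1 maps neural responses to image latent vectors with
# ridge regression; stage 2 would feed the predicted latents into a pretrained
# generator (GAN/diffusion decoder). All data are synthetic stand-ins.
rng = np.random.default_rng(0)
n_trials, n_neurons, latent_dim = 500, 1000, 64

latents = rng.normal(size=(n_trials, latent_dim))      # image latents (stage-2 input)
responses = latents @ rng.normal(size=(latent_dim, n_neurons)) \
    + rng.normal(scale=0.5, size=(n_trials, n_neurons))

# Stage 1: closed-form ridge regression from responses to latents.
lam = 10.0
W = np.linalg.solve(responses.T @ responses + lam * np.eye(n_neurons), responses.T @ latents)
predicted_latents = responses @ W

# Stage 2 (not shown): decode predicted_latents to pixels with a pretrained generator.
corr = np.corrcoef(latents.ravel(), predicted_latents.ravel())[0, 1]
print(f"latent reconstruction correlation: {corr:.3f}")
```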

Yongcheng Yao

Personal website

PhD Project: Trustworthy Multimodal Medical Foundation Model

Supervisors: Timothy Hospedales

The proposed project aims to develop and evaluate trustworthy multimodal medical foundation models, with a particular focus on Vision-Language Models (VLMs) in healthcare. The project emphasizes improving the trustworthiness of VLMs by enhancing their explainability and reliability. Explainability allows users to understand the reasoning behind a model's decisions, while reliability ensures that the model generates outputs grounded in accurate, factual knowledge. To achieve this, retrieval-augmented systems will be explored to reduce hallucinations and improve model transparency (a minimal sketch of the idea follows). The research also aims to incorporate traditional medical tasks such as image segmentation, classification, and detection into the VLM framework, and will explore the spatial reasoning and measurement abilities of VLMs to enhance medical image computing on 2D and 3D medical images.
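As a flavour of retrieval augmentation, the toy sketch below retrieves the most similar facts from a small knowledge store by cosine similarity and prepends them to a prompt, so the model's answer can be grounded in retrieved text. Random vectors stand in for a real text encoder, and the facts are illustrative.

```python
import numpy as np

# Toy retrieval-augmentation sketch: embed the query, retrieve top-k facts by cosine
# similarity, and prepend them to the prompt. Embeddings are random stand-ins for a
# real text encoder; the knowledge store is illustrative.
rng = np.random.default_rng(0)
facts = [
    "Pleural effusion appears as blunting of the costophrenic angle on chest X-ray.",
    "Cardiomegaly is a cardiothoracic ratio greater than 0.5 on a PA chest film.",
    "Ground-glass opacity is a hazy increase in lung attenuation on CT.",
]
fact_embeddings = rng.normal(size=(len(facts), 128))
query_embedding = fact_embeddings[1] + 0.1 * rng.normal(size=128)  # query near fact 1

def top_k(query, store, k=2):
    sims = store @ query / (np.linalg.norm(store, axis=1) * np.linalg.norm(query))
    return np.argsort(sims)[::-1][:k]

retrieved = [facts[i] for i in top_k(query_embedding, fact_embeddings)]
prompt = "Context:\n" + "\n".join(retrieved) + "\nQuestion: Is the heart enlarged?"
print(prompt)
```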