Tuesday 4 November 2024

Host: Ava Khamseh

Speaker: Alina Kumukova

Title: From (causal) molecular mechanisms and cell states to Real-World Evidence and back

Abstract: The genome contains the complete set of DNA that provides instructions for cells and tissues to develop and function. Genomic medicine integrates insights from the genome with information about a person’s health to design and apply improved diagnostic tools and treatments. Research in genomic medicine goes beyond identifying risk factors and aims to pinpoint the underlying causal mechanisms, often across several biological scales.

Molecular scale: In the first part of this talk, we present Stator, a data-driven methodology based on structure learning that identifies cell (sub)types and states by quantifying higher-order (beyond pairwise) gene expression dependencies. Unlike methods such as clustering, Stator does not rely on cells’ local proximity in transcriptome space. We illustrate Stator with a recent application: the successful detection of early cancer cells in liver cancer.

Genotype-phenotype scale: In the second part of the talk, we present a novel approach rooted in semi-parametric efficient estimation theory, integrating population genetics, functional genomics and targeted machine learning (TarGene), to quantify epistatic contributions to human traits via transcription factor mechanisms. By taking experimentally verified differentially binding variants across 9 nuclear hormone receptors as candidates and using UK Biobank data across 768 traits, we reveal, for the first time, hundreds of epistatic interactions involving these transcription factor mechanisms.

Clinical scale: In the third part of the talk, we continue the semi-parametric estimation theme to generate Real-World Evidence via causal identification of treatment effects on complex trait outcomes (here, cardiovascular events) using Real-World Data (here, All of Us Biobank).

Finally, we sketch ongoing ideas and work on how these biomedical scales can be integrated to prioritise causal pathways to disease and to design corresponding optimised interventions.