IPAB Workshop - 26/2/2026

Title: Specifying, Measuring, and Simulating Safety in Cyber-Physical AI Systems

Abstract: When developing machine learning systems (object detectors, RL controllers, behavioural prediction algorithms), many of us are used to thinking of success in terms of "accuracy of an isolated module on a test dataset". Yet if we ever wish to deploy such (typically "black-box") machinery in the real world, it will inevitably be a small part of a larger system, interacting with other modules in some complex scenario. What does it mean for our system to perform "safely" and "successfully" in the context of, say, a full autonomous vehicle stack?

This talk will give an overview of some of my work in this area. In particular, it will touch on the following questions:
- How do we even express what our specifications are in a given cyber-physical scenario?
- How can we measure our degree of risk and success across a range of feasible scenarios?
- How can we create efficient testing strategies given a possibly infinite number of simulations we could run?

Feb 26 2026 13.00 - 14.00
Craig Innes
MF2