Diffusion-Based Deep Generative Models for Assessing Safety in Autonomous Vehicles

My PhD research project, in partnership with the Transport Research Laboratory, focuses on deep generative modelling and its application to scenario generation for testing autonomous vehicle safety.

It is hoped that the use of autonomous vehicles will significantly reduce the number of road accidents caused by human error. However, extensive testing will be necessary to demonstrate that they satisfy a high standard of safety before they can be introduced. Much of this testing must be carried out in simulated environments, which allow for a far greater degree of safety and flexibility than road testing. In simulation, the vehicle AI is tested on a set of predetermined scenarios. Examples of scenarios include recovering from a loss of control due to road conditions, performing manoeuvres in the presence of oncoming traffic, and responding to a sudden deceleration by a leading vehicle. A key issue in autonomous vehicle safety is the need for a huge, diverse set of scenarios that reflect both normal driving conditions and difficult situations. This project aims to develop methods for automating scenario generation, both by reconstructing conditions from real-life driving datasets and by creating completely new scenarios using generative modelling.

The problem of generative modelling is to generate new, realistic samples that appear to come from the same unknown distribution as a given dataset. Diffusion-based models are a recent advance in generative modelling, which work by gradually transforming random noise into data through a diffusion process. The model is constructed by first defining a stochastic differential equation (SDE) that gradually adds noise to the data until it resembles Gaussian noise of known mean and variance. The time reversal of this process is itself a diffusion of known form, which depends on the original SDE coefficients as well as the score function (the gradient of the log-density) of the perturbed data distribution. This unknown score function can be approximated by a neural network using denoising score matching. New samples can then be generated from noise by simulating the reverse-time diffusion.
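
The sampling step described above can be illustrated with a minimal sketch on a toy problem. This is not the project's method: it assumes one-dimensional Gaussian data and a forward SDE dx = -0.5*BETA*x dt + sqrt(BETA) dW with constant BETA, chosen so that the score of the perturbed distribution is known exactly and no neural network is needed. The names BETA, T, DATA_MEAN, DATA_STD, marginal, score and sample_reverse are all hypothetical, and the reverse-time SDE is simulated with a plain Euler-Maruyama scheme.

```python
import math
import random
import statistics

# All constants below are assumptions made for this toy illustration.
BETA = 1.0                      # constant noise rate of the forward SDE
T = 8.0                         # final time; p_T is then very close to N(0, 1)
DATA_MEAN, DATA_STD = 2.0, 0.5  # toy "data" distribution N(2, 0.5^2)


def marginal(t):
    """Mean and variance of the perturbed data distribution p_t.

    Forward SDE: dx = -0.5*BETA*x dt + sqrt(BETA) dW, started from the data.
    """
    decay = math.exp(-BETA * t)
    mean = DATA_MEAN * math.sqrt(decay)
    var = DATA_STD ** 2 * decay + (1.0 - decay)
    return mean, var


def score(x, t):
    """Exact score d/dx log p_t(x); in practice a neural network,
    trained by denoising score matching, would stand in for this."""
    mean, var = marginal(t)
    return -(x - mean) / var


def sample_reverse(n_samples, n_steps, rng):
    """Euler-Maruyama simulation of the reverse-time SDE from t = T to t = 0.

    Reverse-time drift: f(x, t) - g(t)^2 * score(x, t),
    with f(x, t) = -0.5*BETA*x and g(t)^2 = BETA.
    """
    dt = T / n_steps
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)  # start from the Gaussian reference noise
        for i in range(n_steps):
            t = T - i * dt
            drift = -0.5 * BETA * x - BETA * score(x, t)
            noise = math.sqrt(BETA * dt) * rng.gauss(0.0, 1.0)
            x = x - drift * dt + noise  # one step backwards in time
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = sample_reverse(n_samples=1000, n_steps=400, rng=random.Random(0))
    print("sample mean:", statistics.mean(draws))
    print("sample std: ", statistics.stdev(draws))
```

If the sketch is behaving as intended, the sample mean and standard deviation should land close to DATA_MEAN and DATA_STD, up to Monte Carlo and discretisation error. In a realistic setting the exact score function would be replaced by a trained network, and the data would be far higher-dimensional than this toy example.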

Diffusion models are efficient to train and have been shown to be effective at generating diverse samples from complex, high-dimensional distributions. This project investigates diffusion models and how they can be adapted to the scenario generation problem, so that realistic new scenarios can be generated from an existing database.


Research proposal: Diffusion-Based Deep Generative Models for Assessing Safety in Autonomous Vehicles. An introduction to deep generative models in the context of generating scenarios to test autonomous vehicle safety in simulators, with a particular focus on diffusion-based models.

Stochastic Dynamic Optimisation. An introduction to the properties, solution methods, and applications of Markov decision processes and stochastic games.

The Particle Filter. An introduction to particle filtering and particle MCMC, with applications to epidemic modelling.


Master’s project: Statistics and Data Science for Text Data (2021). An introduction to the field of natural language processing with a particular focus on language modelling. The poster and presentation focus on word embeddings.

STOR-i internship project: Approximate posterior sampling via stochastic optimisation (2019). An overview of how stochastic gradient Markov chain Monte Carlo algorithms can be used for computationally efficient Bayesian inference.