PhD
Below is a description of my PhD project which I am doing in partnership with the UK Cabinet Office.
Uncertainty Quantification when Combining Forecasts and Simulation for Long-term Problems
Supervised by Luke Rhodes-Leader and Ivan Svetunkov (Lancaster University), and Aaron Morris and Zara Grout (Joint Data Analysis Centre, UK Cabinet Office)
While JDAC works on many short-term projects, it also focuses on long-term issues that the government may need to address, such as the effects of demographic and climate change on the workforce and vulnerable populations over the next 20 years. To explore these kinds of issues, JDAC relies on long-term forecasting techniques. These forecasts, based on statistical and machine learning models, help predict future trends when suitable data are available. However, forecasts alone are not enough and need to be connected to decision-making. By integrating forecasts into simulation models, analysts can study how different policy choices might affect public systems such as healthcare.
This can be done using simulation techniques such as stochastic discrete-event models, agent-based models, or deterministic system dynamics models. These are especially useful when policymakers need to evaluate different strategies for reducing risks. In such cases, forecasts act as inputs to the simulation models, allowing policymakers to assess which strategies will be most effective.
All models come with some level of uncertainty. In forecasting, this uncertainty can come from the data itself, the way model parameters are estimated, and the structure of the model. When forecasts are used as inputs for simulation models, this uncertainty carries over. Simulation models also have their own uncertainties, such as stochastic randomness and model assumptions. However, unlike standard simulation input uncertainty, the uncertainty coming from the forecast arises not only from estimating parameters but also from the unknown future conditions. This project will focus on measuring and managing uncertainty at the point where forecasting and simulation intersect.
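To illustrate the idea, here is a minimal sketch (not taken from the project; the data, forecast model, and capacity simulation are all illustrative assumptions) of how sampled forecast paths can be fed into a simple stochastic simulation so that the output spread reflects both forecast and simulation uncertainty:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical historical data: monthly demand for a public service.
history = np.array([100, 104, 103, 108, 112, 111, 115, 119, 118, 124], dtype=float)

# Fit a simple linear-trend forecast; the residual spread estimates data uncertainty.
t = np.arange(len(history))
slope, intercept = np.polyfit(t, history, 1)
resid_sd = np.std(history - (intercept + slope * t), ddof=2)

H = 12             # forecast horizon
n_scenarios = 500  # sampled future demand paths

outputs = []
for _ in range(n_scenarios):
    # Sample a future demand path: point forecast plus future noise.
    future_t = np.arange(len(history), len(history) + H)
    demand = intercept + slope * future_t + rng.normal(0, resid_sd, H)

    # Feed the sampled path into a toy stochastic capacity simulation:
    # each period, served demand is capped by capacity; unmet demand queues.
    capacity, backlog = 120.0, 0.0
    for d in demand:
        arrivals = rng.poisson(max(d, 0)) + backlog
        backlog = max(arrivals - capacity, 0.0)
    outputs.append(backlog)

# The output distribution now mixes forecast uncertainty (sampled paths)
# with the simulation's own stochastic uncertainty.
print(f"mean final backlog: {np.mean(outputs):.1f}")
print(f"90% interval: ({np.percentile(outputs, 5):.1f}, {np.percentile(outputs, 95):.1f})")
```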
MRes and Undergraduate
During the MRes year of the STOR-i programme we get the opportunity to research different topics and produce reports and presentations on them. Below you can find links to the research I did during this year, as well as the research I did as part of my undergraduate degree and during my STOR-i summer internship.
Using Item Response Theory to Assign Group Memberships
STOR601: Technical Report supervised by Gabriel Wallin, Lancaster University, 2025
Item Response Theory (IRT) models are commonly used in the construction and evaluation of educational tests. They are probabilistic models for individuals’ responses to a set of items and are typically applied to categorical data. These models are latent factor models which classify individuals into groups based on latent traits, with individuals who share the same trait value behaving in the same way. The method can also be used in other settings where group membership affects the distribution of individuals’ latent traits. One example is political data, such as using the voting records of US senators to classify each senator into one of the two main parties.
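As a rough sketch of the setup (entirely illustrative, assuming a standard two-parameter logistic IRT model with hypothetical "senators", "votes", and group-shifted latent traits, and a deliberately crude classification rule rather than the estimation used in the report):

```python
import numpy as np

rng = np.random.default_rng(0)

def irt_2pl(theta, a, b):
    # Two-parameter logistic IRT model: probability that a person with
    # latent trait `theta` endorses an item with discrimination `a`
    # and difficulty `b`.
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical data: 20 senators in two groups, responding to 10 votes (items).
group = rng.integers(0, 2, size=20)                        # true party labels
theta = rng.normal(np.where(group == 0, -1.0, 1.0), 0.3)   # group-shifted traits
a = rng.uniform(0.8, 2.0, size=10)                         # item discriminations
b = rng.normal(0.0, 1.0, size=10)                          # item difficulties

# Simulate binary responses (e.g. yes/no votes).
probs = irt_2pl(theta[:, None], a[None, :], b[None, :])
responses = rng.uniform(size=probs.shape) < probs

# A crude classification: threshold a simple trait proxy (the proportion
# of 'yes' responses) to assign each individual to a group.
estimate = responses.mean(axis=1)
assigned = (estimate > np.median(estimate)).astype(int)
print("agreement with true groups:",
      max(np.mean(assigned == group), np.mean(assigned != group)))
```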
Evaluating Solutions for Making Decisions Under Uncertainty
STOR601: Non-Technical Report supervised by Jamie Fairbrother, Lancaster University, 2025
Decision-making is an important part of many people’s lives; however, information that would help inform a decision is often unavailable at the time the decision must be made. This is a problem in many areas of industry, such as transportation, supply chain management and finance, where making the wrong decision can have large consequences. We therefore want ways of modelling decision problems that yield solutions which perform well under the uncertainty we face.
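A classic toy example of this kind of problem is the newsvendor: an order quantity must be chosen before demand is known. The sketch below (all costs and the demand distribution are made up for illustration) evaluates candidate decisions by averaging over sampled scenarios, one simple way of finding solutions that perform well under uncertainty:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical newsvendor problem: order before demand is revealed.
cost, price = 3.0, 8.0
demand_samples = rng.gamma(shape=4.0, scale=25.0, size=2000)  # uncertain demand

def expected_profit(order):
    # Sample average approximation: average profit over demand scenarios.
    sold = np.minimum(order, demand_samples)
    return np.mean(price * sold - cost * order)

# Evaluate candidate decisions and pick the one that performs best on average.
candidates = np.arange(0, 251, 5)
profits = [expected_profit(q) for q in candidates]
best = candidates[int(np.argmax(profits))]
print(f"best order quantity: {best}, expected profit: {max(profits):.1f}")

# The classical solution orders at the critical quantile (price-cost)/price,
# which the sample-based answer should roughly match.
print("critical-quantile order:", np.quantile(demand_samples, (price - cost) / price))
```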
Using the Bouncy Particle Sampler To Remove Reversibility in MCMC
STOR608: Contemporary Topic Sprints MCMC Report supervised by Christopher Nemeth, Lancaster University, 2025
Markov Chain Monte Carlo (MCMC) is a class of algorithms for drawing samples from a probability distribution, often one that is high-dimensional or complex, meaning that analytic techniques alone cannot be used to study it. This is achieved by constructing a Markov chain whose stationary distribution approximates the distribution of interest. Common algorithms include Metropolis-Hastings, Gibbs sampling, and the Metropolis-adjusted Langevin algorithm. However, these methods share the property of reversibility, which can lead to unwanted behaviour when sampling, such as slow, diffusive exploration of the target.
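For context, here is a minimal sketch of random-walk Metropolis-Hastings, one of the reversible methods the report contrasts with the Bouncy Particle Sampler (the standard-normal target and step size are illustrative choices, not from the report):

```python
import numpy as np

rng = np.random.default_rng(3)

def log_target(x):
    # Unnormalised log-density of the distribution we want to sample
    # from (a standard normal, for illustration).
    return -0.5 * x**2

# Random-walk Metropolis-Hastings: a reversible MCMC algorithm.
n_samples, step = 10_000, 1.0
x = 0.0
samples = np.empty(n_samples)
for i in range(n_samples):
    proposal = x + step * rng.normal()        # symmetric proposal
    log_accept = log_target(proposal) - log_target(x)
    if np.log(rng.uniform()) < log_accept:    # accept/reject step
        x = proposal
    samples[i] = x

print(f"sample mean: {samples.mean():.3f}, sample sd: {samples.std():.3f}")
```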
Using Known Boundaries to Improve Bayesian Emulation
Master’s Dissertation supervised by Ian Vernon, Durham University, 2024
As increasingly complex models are used in a variety of situations, and each evaluation of these models can be expensive, methods for efficiently performing these evaluations and approximating the underlying function are needed. One way of doing this is with a Bayes linear emulator. Whilst these perform well, many techniques are being developed to improve their performance and reduce the number of expensive evaluations required. This project looks at one such technique: incorporating into the emulator information about a boundary of the input space on which the model can be solved analytically.
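To show the basic machinery (without the known-boundary enhancement the dissertation studies), here is a sketch of the Bayes linear adjustment with a zero prior mean and a squared-exponential prior covariance; the toy function and all kernel settings are illustrative assumptions:

```python
import numpy as np

def sq_exp_cov(x1, x2, sigma2=1.0, length=0.5):
    # Squared-exponential prior covariance between function values.
    return sigma2 * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / (2 * length**2))

# "Expensive" model evaluated at a few design points (toy function here).
f = lambda x: np.sin(3 * x)
X = np.array([0.0, 0.3, 0.6, 0.9])
D = f(X)
E_D = np.zeros(len(X))  # prior expectation of the data, taken as zero

# Bayes linear adjustment at new inputs x*:
#   E_D[f(x*)]   = E[f(x*)] + Cov(f(x*), D) Var(D)^{-1} (D - E[D])
#   Var_D[f(x*)] = Var[f(x*)] - Cov(f(x*), D) Var(D)^{-1} Cov(D, f(x*))
Xs = np.linspace(0, 1, 5)
prior_mean = np.zeros_like(Xs)
K_sD = sq_exp_cov(Xs, X)
K_DD = sq_exp_cov(X, X) + 1e-8 * np.eye(len(X))  # jitter for stability
K_ss = sq_exp_cov(Xs, Xs)

post_mean = prior_mean + K_sD @ np.linalg.solve(K_DD, D - E_D)
post_var = np.diag(K_ss - K_sD @ np.linalg.solve(K_DD, K_sD.T))

for x, m, v in zip(Xs, post_mean, post_var):
    print(f"x={x:.2f}  emulator mean={m:+.3f}  sd={np.sqrt(max(v, 0)):.3f}  truth={f(x):+.3f}")
```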
Exploring Methods for Finding Maxima Using Bayesian Optimisation
STOR-i Summer Internship supervised by Daniel Dodd, Lancaster University, 2022
The global optimisation of expensive, potentially gradient-free, black-box functions is a critical problem in science and engineering. In these settings, the form of the objective function is generally unknown, and even a single evaluation is costly which renders optimisation difficult. For optimisation problems with any of these challenges, Bayesian optimisation is a prominent approach that has been shown to obtain better results, in fewer evaluations, than alternatives such as grid or random search-based methods. The general idea is to construct a probabilistic model of the objective function and then sequentially decide where to evaluate it next. In particular, Gaussian processes are probabilistic models known to make well-calibrated predictions and, therefore, stand as a robust model of choice. The aim of this project was to explore Gaussian processes and Bayesian optimisation methods through the GPJax package.
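The internship itself used GPJax, but to keep a sketch self-contained the loop below uses a small hand-rolled NumPy Gaussian process with the expected improvement acquisition; the objective, kernel settings, and grid search over candidates are all illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

def objective(x):
    # Black-box function to maximise (cheap here, expensive in practice).
    return -(x - 0.7) ** 2 + 0.05 * np.sin(20 * x)

def gp_posterior(X, y, Xs, length=0.2, sigma2=1.0, noise=1e-6):
    # Gaussian process posterior mean/sd with a squared-exponential kernel.
    k = lambda a, b: sigma2 * np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * length**2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks, Kss = k(Xs, X), k(Xs, Xs)
    mean = Ks @ np.linalg.solve(K, y)
    var = np.diag(Kss - Ks @ np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 1e-12, None))

# Bayesian optimisation loop: fit the GP, maximise expected improvement,
# evaluate the objective at the chosen point, repeat.
X = rng.uniform(0, 1, size=3)   # initial design
y = objective(X)
grid = np.linspace(0, 1, 200)
for _ in range(10):
    mean, sd = gp_posterior(X, y, grid)
    best = y.max()
    z = (mean - best) / sd
    ei = (mean - best) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    x_next = grid[np.argmax(ei)]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

print(f"best point found: x={X[np.argmax(y)]:.3f}, value={y.max():.4f}")
```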