# Interns

Below you can find details of the summer 2019 interns including a description of their research project.

## Dylan Bahia

### Supervisor: Jordan Richards

Investigating the Eﬀect of Dependence on Averaging Extremes

Extreme Value Theory is often used to model extreme weather events, such as extreme rainfall, which is the main cause of river ﬂooding. Given data at separate locations, we can ﬁt simple models to understand the distribution of heavy rainfall at these single locations. However, ﬂooding is generally not caused by an extreme event at a single location; it is caused by extreme rainfall averaged over several locations, often referred to as a catchment area. This project aims to investigate how the distribution of extreme rainfall at single locations, and the dependence between extreme events at diﬀerent sites, aﬀects the distribution of the average over all sites. We do this using a subset of data provided by the Met Oﬃce, which consists of gridded hourly rainfall across the north of England. Taking each grid box to be a single location, we can ﬁt distributions that model extreme events at each particular location. Dependence between grid boxes can also be quantiﬁed using empirical measures. We then average over adjacent grid boxes and ﬁt the same distributions. We are particularly interested in how the parameters of the distributions change as we average over an increasing number of grid boxes, and how the dependence between locations inﬂuence this change.

## Matthew Darlington

### Supervisor: Livia Stark

Optimal learning for multi-armed bandits

Optimal learning is concerned with eﬃciently gathering information (via observations) that is used in decision making. It becomes important when the way information is gathered is expensive, so that we are willing to put some eﬀort into making the process more eﬃcient. Learning can take place in one of two settings, oﬄine, or online. In oﬄine learning we make a decision after a number of observations have taken place, while in online learning decisions are made sequentially, so that a decision results in a new observation that in turn informs our next decision. This project will focus on learning for multi-armed bandits.

Multi armed bandits present an online learning problem. They are easiest to visualise as a collection of slot machines (sometimes referred to as one armed bandits). The rewards from the slot machines are random, and each machine has a diﬀerent, unknown expected reward. The goal is to maximise one’s earnings from playing the slot machines. That can be achieved by playing the machine with the highest expected reward. However, the expected rewards can only be estimated by playing the machines and observing their random rewards. Therefore there is a trade-oﬀ between exploring bandits to learn more about their expected rewards and exploiting bandits with known high expected rewards.

## Katie Dixon

### Supervisor: Szymon Urbas

Recruitment to Phase III Clinical Trials

Clinical trials are a series of rigorous experiments examining the eﬀect of a new treatment in humans. They are essential in the drug-approval process, as per the European Medicines Agency guidelines. In order for a drug to be made available to the public, it must pass a number of statistical tests each with suﬃcient certainty in the outcomes. The most costly part of the trials process is Phase III, which is composed of randomised controlled studies with large samples of patients. Patients are being continuously enrolled across a number of recruitment centres.

The standard way of modelling recruitment in a practical setting is to use a hierarchical Poissongamma (PG) model as introduced in Anisimov and Fedorov (2007). The framework assumes that the rates at which patients come into each centre do not change over time. The main argument for using the simple model is the limited data available for inferences as well as tractable predictive distributions. A recent work of Lan et al. (2018) explores the idea of decaying recruitment rates. However, the proposed model lacks ﬂexibility in accounting for a multitude of recruitment patterns appearing across diﬀerent studies.

The internship project will concern itself with the analysis of a ﬂexible class of recruitment models in data-rich scenarios. The project will likely tackle an open problem in the area which is the presence of a mixture of diﬀerent recruitment patterns appearing in a single study. This will likely involve novel ways of clustering centres based on the observed recruitments. The project will entail a mixture of applied probability, likelihood/Bayesian inference and predictive modelling. There will be a strong computing component in the form of eﬃcient optimisation or simulation methods.

## Matthew Gorton

### Supervisor: Alan Wise

Investigating Optimism in the Exploration/Exploitation Dilemma

A stochastic multi-armed bandit problem is one where a learner/agent has to maximise their sum of rewards by playing a row of slot machines, or ‘arms’, in sequence. In each round the learner pulls an arm and receives a reward corresponding to this arm. The rewards that are generated from each arm are assumed to be distributed as a noisy realisation of some unknown mean, therefore, maximising the sum of the rewards relies on ﬁnding the arm with the highest mean. We wish to create policies to tell us which arm to play next in order to maximise our reward sum.

The challenge to ﬁnding the best arm in the multi-armed bandit problem is the exploration-exploitation dilemma. This dilemma occurs since at any time point we need to decide between playing arms which have been played a low number of times (exploration) or the arm with the best estimated mean (exploitation). If the learner explores too much then they will miss out on playing the optimum arm, however, if the learner chooses to exploit the best arm, without exploring other options, then they could end up exploiting a sub-optimal arm. It is clear that the best policies balances both exploration and exploitation. The policies which we will study in this project follow the philosophy of optimism in the face of uncertainty.

These policies work by being optimistic towards options which we are uncertain about. For instance, consider an intern visiting Lancaster University for the ﬁrst time. For lunch, if they are optimistic about the local food places (Sultan’s/ Go Burritto) over chains (Subway/ Greggs) then they will be more likely to explore the places that they are more uncertain about. In multi-armed bandits, these policies give each arm an upper conﬁdence bound index (UCB), which usually takes the form of the estimated mean reward plus some bias, and the arm is played with the largest value of the index. These types of policies are mathematically guaranteed to never perform badly - but can we do better? This is the major question of this project.

## Joseph Holey

### Supervisor: Mike O'Malley

Predicting ocean current speed using drifter trajectories

The dataset I am using involves tracking of drifting objects in the sea which are tracked by GPS. These drifting objects are commonly referred to as drifters. In summary, the location of drifters are processed to obtain quarter daily Longitude, Latitude and Velocity Data. In order to model complex phenomena in the ocean one of the ﬁrst pre-processing steps is to remove a large scale mean velocity of the drifters. In other words, focus on the residuals, a model which predicts velocity, given location. Currently one of the most popular methods to do this involve binning the data, then extracting a mean in each bin and using this mean as the prediction. This project will aim to develop a better, more accurate method to predict velocity at a given location.

The general scope of methods you will be focusing on are classed as nonparametric regression, this includes spline regression, Gaussian processes, local polynomial regression and more. These methods are generally applied to independently distributed data. One of the more diﬃcult aspects of modelling this type data is accounting for the non-independent nature. In particular, the next sampled location in a trajectory strongly depends on the current location and velocity at that location. In particular this sequential sampling can strongly aﬀect model selection which will be a large part of this project.

Initially the project will look at modelling a relatively simple toy simulation which I will supply. The reasoning behind this is that we know the true underlying process, therefore the models you ﬁt can be compared to the known ground truth. The method which is found to work best can then be applied to the real dataset with empirical evidence that it works on similar data.

## Shyam Popat

### Supervisor: Jess Gillam

Point Processes on Categorical Data

The aim of this project to explore point process methods to model categorical time series, speciﬁcally data provided by Howz. Howz is home monitoring system based on research that indicates changes in daily routine can identify potential health risks. Howz use appliances placed around the house and other low-cost sources such as smart meter data to detect these changes. This data is a great example of how categorical data applies to real life situations. One potential way of modelling this data is to use point processes.

Point processes are composed of a time series of binary events (Daley and VereJones, 2003). There exist many diﬀerent point processes that could be useful for modelling this data, such as Poisson processes, Hawkes processes and Renewal processes (Rizoiu et al., 2017). The goal of this project is to ﬁnd ways to model multiple sensors, looking at the time between sensors being triggered to see if this indicates a change in routine. One extension to this project would be exploring the relationship between the categories; thus having diﬀerent models for each category. We could also look into subject speciﬁc eﬀects in the data.

## Katy Ring

### Supervisor: Mirjam Kirchner / Tom Grundy

Detection Boundaries of Univariate Changepoints in Gaussian Data

Changepoint detection deals with the problem of identifying structural changes in sequential data, such as deviations in mean, volatility, or trend. In many applications, these points are of interest as they might be linked to some exogenous cause. In the univariate case, the factors impacting on the detectability of a changepoint are well known: size of the change, location of the change, type of change, number of observations, and noise. However, the interplay of these parameters with the detectability of a changepoint hidden within a data sequence are yet to be studied in detail.

In this project, we investigate the reliability of the likelihood ratio test (LRT) statistic for detecting a single change in a univariate Gaussian process. To this end, we will conduct a simulation study testing diﬀerent settings of the parameters change in mean, change in variance, location of the change, and sample size. In particular, we are interested in ﬁnding parameter combinations for which the LRT becomes unreliable. For example, for a ﬁxed variance, sequence length, and changepoint position in the data, we would decrease the change in mean until we ﬁnd a region in which the LRT scatters around zero. The overall aim is to derive a surface that splits the parameter space of the LRT statistic into a detectable (LRT > 0) and undetectable (LRT →±0) changepoint region. Ideally, as a next step, an explicit relation between the simulation parameters and the detection boundary would be determined empirically. Further experiments on detection boundaries are possible, such as analysing non-Gaussian data or alternative test statistics.

How the diﬃculty of multivariate changepoint problems vary with dimension and sparsity

Multivariate changepoint detection aims to identify structural changes in multivariate time series. Increasingly more attention is being paid to developing methods to identify multivariate changes with little information known about the diﬃculty of multivariate changepoint problems and how they scale with dimension and sparsity. This work aims to answer the question: ‘If we have a user deﬁned signiﬁcant change size, could a change this size actually be detected using changepoint methods?’

In the univariate changepoint setting, it is well understood that there are many factors that aﬀect detectability of a changepoint including: size of the change, location of the change, type of change and length of the time series. Moving from a univariate to multivariate setting adds several layers to the detection problem; including the dimension of the time series and the sparsity of the changepoint.

Recent work at Lancaster University has considered, computationally, the case of a multivariate change in mean problem where the change size is identical in all dimensions. Under these constraints a relationship was identiﬁed between the size of the change and the number of dimensions that ensures the true and false positive rates remain constant. This project seeks to identify a relationship between the sparsity of a changepoint and the diﬃculty of detecting it as well as exploring the problems theoretically to give more justiﬁcation to the computational ﬁndings. If time and interest allows, we will also explore the aﬀect of varying the size of change in each series on the detectability of the changepoint.

## Moaaz Sidat

### Supervisor: Holly Jackson

Evaluating A Response Adaptive Clinical Trial using simulations

Before a new drug can be distributed to the public it must ﬁrst go through rigorous testing to make sure it is safe and eﬀective. This evaluation in humans is undertaken in a series of clinical trials. The approach most often used in clinical trials is the randomised controlled trial (RCT), which assigns all patients with equal probability to each treatment in the trial. Therefore RCTs are an eﬃcient way to identify if there is a signiﬁcant diﬀerence between the treatments in the study. Hence, the equal allocation of patients to each treatment, maximises the power of the study. However, RCTs do not allow the possibility of changing the probability of assigning a patient to the treatments. If it emerges before the end of the trial that one treatment is clearly more eﬀective than the other, then to maximise the number of patients treated successfully, logic dictates the remaining patients should be allocated to the most eﬀective treatment.

Response adaptive designs use information from previous patients to decide which treatment to assign to the next patient. They vary the arm allocation in order to favour the treatment which is estimated to be best. Multi-Armed Bandits (MAB) are an example of a response-adaptive design. They allocate patients to competing treatments in order to balance learning (identifying the best treatment) and earning (treating as many patients as eﬀectively as possible). One issue with some response adaptive designs is every patient is expected to produce the same outcome if given the same treatment. However some patients will have certain characteristics (also known as covariates) which means they will react to the same treatment diﬀerently. For example an overweight man in his twenties, may react diﬀerently to a drug than an underweight woman in her eighties.

This internship will focus on a randomised allocation method with nonparametric estimation for a multi-armed bandit problem with covariates. This method uses nonparametric regression techniques (including polynomial regression, splines and random forests) to estimate which treatment is best for the next patient due to their particular covariate. The main emphasis of this project is the endpoint. An endpoint could be binary, such as the treatment curing the patient or not, integer valued, such as the number of epileptic ﬁts in 6 months, continuous, such as a change in blood pressure, or it could be the survival time of a patient.

## Jack Trainer

### Supervisor: Matt Bold

Heuristic procedures for the resource-constrained project scheduling problem

The resource-constrained project scheduling problem (RCPSP) is a well-studied problem in operational research. Given a set of precedence-related activities of known durations and resource requirements, and a limited amount of resource, the RCPSP consists of ﬁnding a schedule that minimises the time to complete all the activities (known as the project makespan). Solving this problem on a large scale is very diﬃcult. Hence, whilst many exact solution methods exist for solving the RCPSP, these are too slow and therefore largely ineﬀective at solving this problem on a realistic scale. Therefore, the study and evaluation of fast, but inexact procedures (known as heuristic procedures) for solving the RCPSP is critical for real-world application.

Priority-rule heuristics are simple, yet eﬀective, scheduling procedures, consisting of a rule for ordering activities into a so-called activity list representation, and a rule for turning the activity list representation into a complete schedule. This simple class of heuristics form the basis of many of the most successful heuristic procedures for the RCPSP. This project aims to compare the eﬀectiveness of a number of diﬀerent procedures from this large subset of heuristics, by testing them on a large database of RCPSP test-instances, as well as investigate possible further improvements and extensions to them.

## Connie Trojan

### Supervisor: Srshti Putcha

Approximate posterior sampling via stochastic optimisation

We now have access to so much data that many existing statistical methods are not very eﬀective in terms of computation. These changes have prompted considerable interest amongst the machine learning and statistics communities to develop methods which can scale easily in relation to the size of the data. The “size” of a data set can refer to either the number of observations it has (tall data) or to its dimensions (wide data). This project will focus on a class of methods designed to scale up as the number of available observations increases.

In recent years, there has been a demand for large scale machine learning models based on stochastic optimisation methods. These algorithms are mainly used for their computational eﬃciency, making it possible to train models even when it is necessary to incorporate a large number of observations. The speed oﬀered by stochastic optimisation can be attributed to the fact that only a subset of examples from the dataset are used at each iteration. The main drawback of this approach is that parameter uncertainty cannot be captured since only a point estimate of the local optimum is produced.

Bayesian inference methods allow us to get a much better understanding of the parameter uncertainty present in the learning process. The Bayesian posterior distribution is generally simulated using statistical algorithms known as Markov chain Monte Carlo (MCMC). Unfortunately, MCMC algorithms often involve calculations over the whole dataset at each iteration, which means that they can be very slow for large datasets. To tackle this issue, a whole host of scalable MCMC algorithms have been developed in the literature. In particular, stochastic gradient MCMC (SGMCMC) methods combine the computational savings oﬀered by stochastic optimisation with posterior sampling, allowing us to capture parameter uncertainty more eﬀectively. This project will focus on implementing and testing the stochastic gradient Langevin dynamics (SGLD) algorithm. SGLD exploits the similarity between Langevin dynamics and stochastic optimisation methods to construct a robust sampler for tall data.

## Liv Watson

### Supervisor: Harry Spearing

Using Pairwise Comparison in Sports to Rank and Forecast

The aim of this project is to develop a ranking system for sports. Deﬁning a ‘good’ ranking depends on the aim. A ranking system that is used to predict future results must provide accurate predictions and could have a complex structure, whereas a system designed to seed players for a tournament needs to be robust to exploitation, fair, and easy to understand. Generally, a system that excels in one of these areas will fail in the other.

It is expected that the focus of this project will be the former, namely to develop an accurate and robust ranking system. The accuracy of the system can be measured by comparing its predictive performance against existing benchmarks as well as bookmaker’s odds, and robustness can be measured by the ranking’s sensitivity to small changes in match outcomes. A ranking system that is applicable to all sports is of course ideal, but sport speciﬁc features will need to be considered to achieve state-of-the-art prediction accuracy, and some general knowledge or interest in sports will be of use. Initially, simulated data will be used to design the ranking system. Then, once a general framework for the model is established, real data from a sport of the student’s choice will be used to test it. The model will then be tweaked to make use of all the available sport speciﬁc features of the data.

## Gwen Williams

### Supervisor: Nicola Rennie

Bid Price Controls for Dynamic Pricing in the Airline Industry

In the airline industry, revenue management systems seek to maximise revenue by forecasting the expected demand for diﬀerent ﬂights, and optimally determining the prices at which to sell tickets over time. Ideally, rather than setting prices at the start of the booking horizon, they should be updated over time depending on how many people have so far purchased tickets and how much time remains until departure. One such method of dynamically pricing tickets is the use of bid price controls. Bid price controls set threshold values for each leg of a ﬂight network; such that an itinerary (path on the network requested by a potential passenger) is sold only if its fare exceeds the sum of the threshold values along the path (Talluri and Ryzin, 1998).

Given that bid price controls require forecasts of demand; if demand is not as expected, for example due to increased sales around the time of major sporting events or carnivals, this results in non-optimal pricing, which leads to a decrease in potential revenue. So far, we have considered the potential gains in revenue when incorrect forecasts are updated under simpler revenue management pricing control mechanisms, and found that revenue can be increased by up to 20%. This project will similarly seek to quantify the potential gains in revenue from updating the bid prices when unexpected demand is detected.