PhD: Ranking Systems in Sport

Listen to a talk presented at joint meeting of the Sports Analytics Lab at Harvard University/American Statistical Association Section on Statistics in Sports seminar series.

A brief clip from the Royal Statistical Society's audio podcast series, where I speak to Brian Tarran at the 2019 RSS conference in Belfast.

Lancaster University Sports Statistics Motivation


Pick your favourite sport. Chances are, you have an opinion on who the best in the world is, at this current moment, or of all time, or who would win if A played B. But is it possible to develop a system which returns an objective answer to these questions?

In developing such systems, it is crucial to capture as much information as possible about the dynamic world in which we live. Understand it. Learn from it. Predict it. Athlete’s injuries, the weather, and even economic factors all impact the outcome of these events and the implied ability of the athletes or teams. This project requires a wide range of strategies in order to capture these signals, from graph theory to extreme value theory, and contextual information from news websites, so that the most accurate system of ranking sports teams or athletes is formulated.

Ranking systems in sport are not only interesting to the inquisitive fan, but a fair and accurate system is at the core of all sports organisational bodies and the multi-billion pound industries that they represent.

But these systems are not exclusive to sports.

Methodological advances in the field of sports ranking systems have far-reaching consequences. Ranking systems are used to rank webpages, or to rank schools and hospitals, or even to determine the most essential medical treatments. So, a ranking system based on poor methodology can have much more severe repercussions than incorrectly seeding a tennis tournament… Ultimately, the importance of ranking systems is self-evident, and sport creates a fruitful playground in which ample advancements can be made.


In most commonly used ranking systems, some level of underlying transitivity is assumed. If transitivity exists in a system then information about pairwise comparisons can be translated to other linked pairs. For example, if typically A beats B and B beats C, this could inform us about the expected outcome between A and C. We show that in the seminal Bradley-Terry model knowing the probabilities of A beating B and B beating C completely defines the probability of A beating C, with these probabilities determined by individual skill levels of A, B and C. Users of this model tend not to investigate the validity of this transitive assumption, nor that some skill levels may not be statistically significantly different from each other; the latter leading to false conclusions about rankings. We provide a novel extension to the Bradley-Terry model, which accounts for both of these features: the intransitive relationships between pairs of objects are dealt with through interaction terms that are specific to each pair; and by partitioning the n skills into A + 1 ≤ n distinct clusters, any differences in the objects’ skills become significant, given appropriate A. With n competitors there are n(n − 1)/2 interactions, so even in multiple round robin competitions this gives too many parameters to efficiently estimate. Therefore we separately cluster the n(n − 1)/2 values of intransitivity into K clusters, giving (A, K) estimatable values respectively, typically with A + K < n. Using a Bayesian hierarchical model, (A, K) are treated as unknown, and inference is conducted via a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. The model is shown to have an improved fit out of sample in both simulated data and when applied to American League baseball data.

Harry Spearing applies extreme value theory to personal best swim times to rank swimmers and predict future world records.

Abstract: The International Swimming Federation (FINA) uses a very simple points system with the aim to rank swimmers across all Olympic events. The points acquired is a function of the ratio of the recorded time and the current world record for that event. With some world records considered better than others however, bias is introduced between events, with some being much harder to attain points where the world record is hard to beat. A model based on extreme value theory will be introduced, where swim-times are modelled through their rate of occurrence, and with the distribution of the best times following a generalised Pareto distribution. Within this framework, the strength of a particular swim is judged based on its position compared to the whole distribution of swim-times, rather than just the world record. This model also accounts for the date of the swim, as training methods improve over the years, as well as changes in technology, such as full body suits. The parameters of the generalised Pareto distribution, for each of the 34 individual Olympic events, will be shown to vary with a covariate, leading to a novel single unified description of swim quality over all events and time. This structure, which allows information to be shared across all strokes, distances, and genders, improves the predictive power as well as the model robustness compared to equivalent independent models. A by-product of the model is that it is possible to estimate other features of interest, such as the ultimate possible time, the distribution of new world records for any event, and to correct swim times for the effect of full body suits. The methods will be illustrated using a dataset of the best 500 swim-times for each event in the period 2001-2018.

Previous mini-projects

Exploring Solutions to the Stochastic Multi-Armed Bandit Problem

This report begins with an introduction to machine learning and its various classifications, and the crossover between machine learning and statistical learning in multi-armed bandit problems. A literature review of solutions to multi-armed bandit problems are presented: the ε-greedy algorithm, Thompson sampling algorithms, and Upper-Confidence-Bound (UCB) algorithms. A simulation is then formulated and coded in R and tested on a 10-armed Bernoulli bandit to compare these methods along with an algorithm introduced in this report, the greedy-η(t) algorithm. It is shown that the greedy-η(t) algorithm outperforms the standard ε-greedy algorithm after approximately a 10th of the horizon time, and will always outperform the ε-greedy algorithm asymptotically, independent of parameter selection. In a macro test of the code, it is shown that the greedy-η algorithm is more robust and its performance is less dependent on prior information. The simulation is based in a Bayesian setting with a Beta posterior distribution. View the R code for this simulation. Finally a comparison of these methods is presented based on theoretical guarantees on regret bounds, the simulation in this report, and other simulations in the literature.

View article.

Sequential Monte Carlo Methods in State Space Models.

This report formulates the mathematics behind hidden Markov models, before introducing and simulating a Kalman filter. Later, approximate techniques are presented and applied to a hidden Markov model, with a simulation of one particular example, the Bootstrap filter. Throughout this report it is assumed that the parameters in the models are known, however in practice this is not the case.

View article.

Markov Chains Applied to Time Dependent Queuing Systems.

Continuous Time Markov chains are applied to an M(t)/M/S queue using a transition rate matrix and the expected number in the queue E(n(t)) is tracked over time. This is tested on two different arrival rate functions and the experimental findings agree with the theoretical values. The upper 10 % tail of the distribution of the number in the queue varies in a similar way to the expected number E(n(t)) in the queue. For 3 servers, this peaks at 97 in our example. We conclude that the number of servers S, has more effect on E(n(t)) than on the upper 10 % tail of the distribution. Discrete Time Markov Decision Chains are applied to and M(t)/M/1 queue and backwards recursion is used to find a closed form for the optimal policy.

View article.

The Existence, Nature, and Effect of Dark Energy.

The cosmic age problem is approached by comparing observations of globular clusters with the theoretical age of the Universe derived from the Friedmann equation, assuming a cosmological constant and a Universe which is flat and matter dominated. My data concludes that there is indeed a need for dark energy. The possibility of a cyclic Universe is considered, and is shown how this can solve many current problems in the Standard Cosmological Model such as the flatness problem, the coincidence problem, the horizon problem, and the dark energy problem.

View article.

Further topics

For a light read into a broad range of topics in statistics and operational research, visit my blog [note: inactive].


Harry Spearing Royal Statistical Society Belfast 2019

I have been a PhD student within the STOR-i Centre for Doctoral Training, Lancaster University, since 2018. From April 2021-2022 I took a break from my PhD to work as a researcher at Roche in Basel , Switzerland. There, I worked within their neuroscience section, designing and implementing novel statistical models in order to increase our understanding of Huntington's disease and multiple sclerosis. Broadly, my research interests include Bayesian modelling and inference, extreme value theory, and spatio-temporal statistics. With a background in astrophysics and cosmology, my real passion lies at the interplay between mathematics and science - the enabling of new discoveries through statistical analysis of scientific data. Applications of particular interest include physical systems, biology and climatology.

Hobbies and Interests

I love to explore nature through cycling, walking, running and swimming. I play guitar, violin, and drums, and enjoy jamming with friends.