This seminar is part of the Management Science PhD seminar series and is the first part of a two-seminar session.
Abstract: The k-nearest neighbour (k-NN) algorithm is one of the most widely used benchmark algorithms in classification, valued for its simplicity and intuitiveness in finding similar instances in multivariate, high-dimensional feature spaces of arbitrary attribute scales. In contrast, applications of k-NN to forecasting time series data are fewer, and have largely focussed on assessing various distance metrics as similarity measures to identify similar univariate time series shapes in past data. In particular, k-NN applications in electricity load forecasting have been confined to identifying past realisations of the same dependent variable which match future realisations, in a non-causal approach to forecasting. However, deterministic calendar information is readily available for both past and future time series motifs, allowing the distinction between load profiles of working days, weekends and bank holidays to be encoded as categorical variables and included in the search for similar neighbours. In this paper, we propose a multivariate k-NN regression method for forecasting electricity demand in the UK market, which uses additional features that encode calendar and holiday information pertaining to the day being forecast. We assess the efficacy of this approach in a robust empirical evaluation using UK electricity load data. The approach improves on conventional k-NN approaches and is more accurate than standard benchmark methods.
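To illustrate the general idea of adding calendar features to the neighbour search, here is a minimal sketch in Python. It is not the method from the paper: the day-type categories, the one-hot encoding, the `calendar_weight` scaling, the Euclidean distance, the simple averaging of neighbours and the toy load values are all assumptions made for illustration. Each candidate is described by the previous day's load profile together with the type of the day to be forecast, so that the search can tell a Sunday followed by a working day apart from a Saturday followed by a Sunday.

```python
import math

# Assumed day-type categories; the paper's actual encoding may differ.
DAY_TYPES = ("workday", "weekend", "holiday")


def one_hot(day_type):
    """Encode a day type as a one-hot vector over DAY_TYPES."""
    return [1.0 if day_type == t else 0.0 for t in DAY_TYPES]


def features(prev_profile, target_day_type, calendar_weight):
    """Concatenate the previous day's load profile with the weighted
    one-hot calendar encoding of the day being forecast."""
    calendar = [calendar_weight * v for v in one_hot(target_day_type)]
    return list(prev_profile) + calendar


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_forecast(profiles, day_types, next_day_type, k=1, calendar_weight=10.0):
    """Forecast the next day's profile as the mean of the k historical days
    whose (previous-day profile, own day type) features are closest to
    (latest profile, next day type). `profiles` is ordered oldest first."""
    query = features(profiles[-1], next_day_type, calendar_weight)
    candidates = []
    for i in range(1, len(profiles)):
        x = features(profiles[i - 1], day_types[i], calendar_weight)
        candidates.append((euclidean(query, x), i))
    candidates.sort()  # ties are broken by the earlier day index
    neighbours = [profiles[i] for _, i in candidates[:k]]
    horizon = len(neighbours[0])
    return [sum(p[h] for p in neighbours) / len(neighbours) for h in range(horizon)]


# Toy data (made up): two weeks, Monday to Sunday, two load readings per day.
week_types = ["workday"] * 5 + ["weekend"] * 2
day_types = week_types * 2
profiles = [[10.0, 20.0] if t == "workday" else [5.0, 8.0] for t in day_types]

# The last observed day is a Sunday; the day to forecast is a working day.
with_calendar = knn_forecast(profiles, day_types, "workday", k=1)
without_calendar = knn_forecast(profiles, day_types, "workday", k=1, calendar_weight=0.0)
```

On this toy data, the calendar-aware search selects the earlier Monday and returns the working-day profile. Setting `calendar_weight` to zero reduces the search to matching on load shape alone, and the nearest match becomes a Sunday that followed a Saturday, so a weekend profile is returned.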