
Tails, Droughts and Extremes

8th February 2016

The topic of this blog post follows from one of the talks we had last week on various research topics. The overall topic it came under was Extreme Value Theory (EVT), and one section was Covariate Modelling in this context. The talk was given by Emma Eastoe who, after looking at the theory, described how these ideas can be applied to two environmental issues, namely groundwater levels and the Greenland ice sheet. Both of these are projects she is currently engaged with.

The simplest case in which Extreme Value Theory comes in useful is when we have a series of iid random variables $X_t$ which are assumed to come from some unknown probability distribution $F$. Statistics is often aimed at modelling the main part of $F$, and the tails may not be the focus. But in some cases, such as storm modelling, it is the tails of the distribution that are most important in describing rare events. This is the arena of EVT. One cannot hope to design flood defences effectively for extreme conditions simply by looking near the mean and mode of the distribution of the data. Unfortunately, rare events mean that not much data actually exists, but with some clever tricks, these can be modelled quite well.

Thankfully, there is an analogous result to the Central Limit Theorem that applies to the maxima of a sample rather than its mean. This is called the Unified Extremal Types Theorem:
Let $M_n$ be the maximum of a set of iid random variables $\{X_1,...,X_n\}$. If there exist normalising constants $a_n>0$ and $b_n$ such that, as $n\rightarrow\infty$, $$Pr\left[\frac{M_n-b_n}{a_n}\leq x\right]\rightarrow G(x)$$ for some non-degenerate distribution $G$, then $G(x) = \exp\{-[1+\xi x]_+^{-1/\xi}\}$.
$G$ is known as the Generalised Extreme Value distribution. For large enough $n$, this gives an approximation to the distribution of the maxima, and so is very important in EVT. It holds for any $F$ for which such a limit exists. It can also be scaled and shifted by replacing $x$ with $(x-\mu)/\sigma$, for parameters $\mu$ and $\sigma>0$. From this, for a high threshold $u$, the conditional distribution of $X$ given $X>u$ can be shown to satisfy, approximately, $$Pr\left[X> x \mid X> u\right]\approx \left[1+\xi \frac{x-u}{\sigma}\right]_+^{-1/\xi}, \quad x>u.$$ This is called the Generalised Pareto distribution and is also very useful. Its parameters are the scale $\sigma>0$ (which here depends on the threshold $u$) and the shape $\xi$.
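
To make the two formulas above concrete, here is a minimal sketch in plain Python. The function names `gev_cdf`, `gpd_survival` and `simulate_normalised_maxima` are my own, and the choice of Exponential(1) data is an assumption made purely for illustration: for that distribution the constants $a_n = 1$ and $b_n = \log n$ are known to work, and the limit is the $\xi \rightarrow 0$ (Gumbel) case of $G$.

```python
import math
import random


def gev_cdf(x, xi, mu=0.0, sigma=1.0):
    """GEV distribution function G((x - mu) / sigma) with shape xi."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # The limit xi -> 0 is the Gumbel distribution exp(-exp(-z)).
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower end point when xi > 0,
        # above the upper end point when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gpd_survival(x, u, xi, sigma):
    """Generalised Pareto approximation to Pr[X > x | X > u], for x >= u."""
    if abs(xi) < 1e-12:
        # The limit xi -> 0 is an exponential tail.
        return math.exp(-(x - u) / sigma)
    t = 1.0 + xi * (x - u) / sigma
    return t ** (-1.0 / xi) if t > 0.0 else 0.0


def simulate_normalised_maxima(n, reps, rng):
    """Maxima of n Exponential(1) draws, normalised with a_n = 1, b_n = log(n)."""
    return [
        max(rng.expovariate(1.0) for _ in range(n)) - math.log(n)
        for _ in range(reps)
    ]


# Illustrative settings only; nothing here comes from the talk.
rng = random.Random(0)
maxima = simulate_normalised_maxima(n=200, reps=2000, rng=rng)

# Compare the empirical distribution of the maxima with the GEV limit (xi = 0).
for x in (-1.0, 0.0, 1.0, 2.0):
    empirical = sum(m <= x for m in maxima) / len(maxima)
    print(x, round(empirical, 3), round(gev_cdf(x, xi=0.0), 3))
```

The two printed columns should be close to each other, which is the theorem at work on simulated data. With real data the constants $a_n$ and $b_n$ are unknown, which is why they are absorbed into $\mu$ and $\sigma$ and estimated.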

What Emma Eastoe considered was the case when the $X_t$ are either not independent, or each $X_t$ comes from a different distribution $F_t$. In this situation, there are two different methods, linear regression models and random effects models, both of which can be applied to the Generalised Pareto and Generalised Extreme Value distributions.

The example presented was groundwater level. This gives an indication of the overall amount of water in an aquifer. Aquifers are useful in times of little rainfall, as the water can be extracted from them and then supplied to residences. The problem is that the monthly groundwater level is fairly strongly correlated with the level in the same month of previous years and in the months before it. This means that simply fitting a Generalised Pareto or Generalised Extreme Value distribution will not model the situation effectively. To deal with this, Eastoe has used linear regression models and included rainfall, potential evaporation and a year-to-year trend as covariates for the parameter $\mu$ of the Generalised Extreme Value distribution. The results were interesting in that the minima and maxima showed different significant covariates. For minima, the year and potential evaporation proved to be very important, whereas for the maxima, it was the rainfall that showed up as significant. The next thing to look at is whether or not this is because the minima and maxima react to this information in slightly different ways.
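
As a rough illustration of what "covariates for $\mu$" means, the sketch below writes down the negative log-likelihood of a Generalised Extreme Value distribution whose location is $\mu_t = \beta^T z_t$ for a covariate vector $z_t$. This is not Eastoe's code or data: the function names, the single made-up "rainfall" covariate and all parameter values are assumptions chosen only so that the example runs.

```python
import math
import random


def gev_neg_log_lik(beta, sigma, xi, covariates, maxima):
    """Negative log-likelihood of a GEV with location mu_t = beta . z_t.

    covariates[t] is a list such as [1.0, rainfall_t]; the leading 1.0
    gives an intercept. Returns math.inf for inadmissible parameters.
    """
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for z_t, x_t in zip(covariates, maxima):
        mu_t = sum(b * z for b, z in zip(beta, z_t))
        s = (x_t - mu_t) / sigma
        if abs(xi) < 1e-12:
            # Gumbel limit of the GEV density.
            total += math.log(sigma) + s + math.exp(-s)
        else:
            t = 1.0 + xi * s
            if t <= 0.0:
                return math.inf  # observation outside the support
            total += (math.log(sigma)
                      + (1.0 / xi + 1.0) * math.log(t)
                      + t ** (-1.0 / xi))
    return total


def sample_gev(mu, sigma, xi, rng):
    """Draw one GEV value by inverting G (assumes xi != 0)."""
    u = rng.random()
    return mu + sigma * ((-math.log(u)) ** (-xi) - 1.0) / xi


# Synthetic data: every value below is hypothetical, not taken from the talk.
rng = random.Random(1)
true_beta, true_sigma, true_xi = [10.0, 2.0], 1.0, 0.1
covariates = [[1.0, rng.random()] for _ in range(300)]  # [intercept, "rainfall"]
maxima = [
    sample_gev(sum(b * z for b, z in zip(true_beta, z_t)), true_sigma, true_xi, rng)
    for z_t in covariates
]

# The model that uses the covariate should fit better (smaller value)
# than one that ignores it by setting its coefficient to zero.
with_covariate = gev_neg_log_lik(true_beta, true_sigma, true_xi, covariates, maxima)
without_covariate = gev_neg_log_lik([10.0, 0.0], true_sigma, true_xi, covariates, maxima)
print(with_covariate, without_covariate)
```

In practice one would pass a function like this to a numerical optimiser to estimate $\beta$, $\sigma$ and $\xi$, and then judge whether a covariate is significant by comparing fits with and without it. Minima can be handled with the same code by negating the data.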

After this, Eastoe briefly discussed some work on the Greenland ice sheet that she is about to begin. She has yet to start modelling, but her early analysis of the data suggests that the observational data and the scientists' numerical models differ a lot in the tails. This is an example of something Professor Jon Tawn (director of STOR-i) said to us at the beginning of the year: that serious study of the data is very important before leaping into modelling.

The other parts of the talk discussed what one should do when the data is correlated and methods to tackle this, as well as some example applications such as flood risk modelling. The flood risk management example required real thought about how to model the data. In particular, although two points may be close together in a spatial sense, this does not mean that their flood risks and water levels will be similar, possibly because one of them is near the confluence of two rivers. A more appropriate measure of closeness must therefore be considered in order to take this into account. The idea settled upon was to compare the centres of the areas from which each point receives water. I found the whole talk very interesting, and EVT seems very interesting mathematically as well as having some very important applications.