{"id":718,"date":"2022-04-04T11:17:00","date_gmt":"2022-04-04T11:17:00","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/?p=718"},"modified":"2022-05-06T11:27:08","modified_gmt":"2022-05-06T11:27:08","slug":"models-for-maxima-extremes-part-2","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/2022\/04\/04\/models-for-maxima-extremes-part-2\/","title":{"rendered":"Models for Maxima &#8211; Extremes Part 2"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">What is an Extreme?<\/h3>\n\n\n\n<p>In my last post, I introduced the ideas and motivations behind the study of extreme value theory, an area of statistics central to engineering and risk assessment. Many structures or processes need to be designed to function even when facing extreme conditions. We now ask &#8211; what exactly do we consider to be an extreme event? <\/p>\n\n\n\n<p>Well, clearly, it needs to be an observation that can be said to be extreme in some way; that is, it must be unusually small or unusually large. In most applications we are interested in when a phenomenon, such as a weather event, is going to be unusually large, so the majority of the theory is developed around this case. 
Of course, many of the models can be applied to the reverse case with some adjustment, but I will not cover this here.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/extreme.png\" alt=\"\" class=\"wp-image-721\" width=\"416\" height=\"490\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/extreme.png 555w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/extreme-255x300.png 255w\" sizes=\"(max-width: 416px) 100vw, 416px\" \/><figcaption>Two people possessing extreme heights<\/figcaption><\/figure><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Not enough data<\/h3>\n\n\n\n<p>Unfortunately, the very nature of extreme values makes modelling them via standard methods a bad idea. They are improbable events by definition, and so there is usually not enough data within range of the values we are considering to provide good results. To illustrate this, consider the following. Imagine we have some density function, such as the one shown below. If we do not know this density function and wish to model it, we can look to the observations. In the example below we have plenty of data points, so this can be done using standard methods such as the empirical distribution function. Now imagine that we only want to model the function in the upper tail region marked in red &#8211; how could we do this? Well, since this is a tail region, few observations lie in it, meaning we cannot rely on standard methods to the same extent. 
For this reason, tail-specific models have been developed.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/tail.png\" alt=\"\" class=\"wp-image-722\" width=\"479\" height=\"280\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/tail.png 638w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/tail-300x175.png 300w\" sizes=\"(max-width: 479px) 100vw, 479px\" \/><figcaption>An example distribution &#8211; extremes in red<\/figcaption><\/figure><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Maxima<\/h3>\n\n\n\n<p>Before we can develop a model for extremes, we must first decide which values we are going to treat as extreme. Sometimes it seems obvious, such as in the example above, but often it is not clear. What if there were more data points close to the red line? How would I have decided where to place it in that case, and thus categorise some points as extreme and some not? Well, there are a couple of ways of doing this. The first, block maxima, I will cover here. The second, threshold exceedances, I will look at in the next blog post. <\/p>\n\n\n\n<p>Say we have some observations <span class=\"wp-katex-eq\" data-display=\"false\"> \\{X_1,\\ldots,X_{n}\\} <\/span>. The most extreme value here will be the maximum, given by <span class=\"wp-katex-eq\" data-display=\"false\"> M= \\max_{i=1,\\ldots,n} X_i <\/span>. This is what we want to find the distribution of. We could, of course, find some estimate for <span class=\"wp-katex-eq\" data-display=\"false\">F_X<\/span>, the distribution of <em>X<\/em>, and then approximate the distribution of <em>M <\/em>as <span class=\"wp-katex-eq\" data-display=\"false\"> F^n_X<\/span>, the distribution of the maximum when the observations are independent and identically distributed. 
This isn&#8217;t a great idea though, because if our estimate of <span class=\"wp-katex-eq\" data-display=\"false\"> F_X<\/span> is slightly off, raising it to the power <em>n<\/em> magnifies the error, so our estimated distribution for the maximum will be greatly affected. Instead, it is best to directly find a form for the distribution of the maximum. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Generalised Extreme Value (GEV)<\/h3>\n\n\n\n<p>Under suitable normalisation, the limiting distribution of the maximum, when it exists, is a member of the Generalised Extreme Value family. The distribution function of this family is<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"459\" height=\"86\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/gev.png\" alt=\"\" class=\"wp-image-739\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/gev.png 459w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-content\/uploads\/sites\/32\/2022\/05\/gev-300x56.png 300w\" sizes=\"(max-width: 459px) 100vw, 459px\" \/><\/figure><\/div>\n\n\n\n<p>for <span class=\"wp-katex-eq\" data-display=\"false\">  -\\infty &lt; \\mu &lt; \\infty,\\, \\sigma &gt; 0\\, \\,\\text{and} \\, \\,-\\infty &lt; \\xi &lt; \\infty <\/span>. The value of <span class=\"wp-katex-eq\" data-display=\"false\">\\xi <\/span> is particularly important, as it determines how heavy the tail of the distribution is and also which special case of the family we have: under the usual sign convention, the Weibull (<span class=\"wp-katex-eq\" data-display=\"false\">\\xi &lt; 0<\/span>), Gumbel (<span class=\"wp-katex-eq\" data-display=\"false\">\\xi = 0<\/span>), or Fr\u00e9chet (<span class=\"wp-katex-eq\" data-display=\"false\">\\xi &gt; 0<\/span>).<\/p>\n\n\n\n<p>This distribution can be fitted to data by splitting the data up into blocks of length <em>n<\/em>. Then, the maximum of each block is taken, and these maxima are treated as observations from the GEV distribution. The question of how large each of these blocks must be is an important one. 
If we pick them too large, we won&#8217;t have enough block maxima to fit the model; if they&#8217;re too small, the theory behind the GEV won&#8217;t hold. Sometimes the data suggest a natural block size though; for example, if we have rainfall data over the past 50 years, we could take the block size to be 1 year of observations. The parameters of the GEV can then be estimated by maximising the likelihood over these yearly maxima.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Wasting data<\/h3>\n\n\n\n<p>There is an issue with this block maxima approach &#8211; one of wasting data. Say that we have a time series of seasonal weather conditions, such as rainfall, which we segment into monthly blocks. A particularly wet month may contain many extreme values, and so by only considering the maximum of these values we are losing information about the others. Conversely, a dry month may not contain any extreme values, but we would be fitting our model to its maximum anyway, potentially distorting the model. The alternative method of considering threshold exceedances avoids this problem. I&#8217;ll discuss this in my next post. <\/p>\n\n\n\n<p>In the meantime, if you&#8217;d like to read some of the theory behind the GEV distribution, I&#8217;d again recommend <a href=\"https:\/\/www.amazon.co.uk\/Extreme-Value-Theory-Introduction-Engineering\/dp\/144192020X\/ref=asc_df_144192020X\/?tag=googshopuk-21&amp;linkCode=df0&amp;hvadid=334395561503&amp;hvpos=&amp;hvnetw=g&amp;hvrand=9910861710224376562&amp;hvpone=&amp;hvptwo=&amp;hvqmt=&amp;hvdev=c&amp;hvdvcmdl=&amp;hvlocint=&amp;hvlocphy=1006854&amp;hvtargid=pla-718179723512&amp;psc=1&amp;th=1&amp;psc=1\">this book<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What is an Extreme? In my last post, I introduced the ideas and motivations behind the study of extreme value theory, an area of statistics central to engineering and risk assessment. 
Many structures or processes need to be designed to function even when facing extreme conditions. We now ask &#8211; what exactly do we consider&hellip;&nbsp;<a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/2022\/04\/04\/models-for-maxima-extremes-part-2\/\" rel=\"bookmark\">Read More &raquo;<span class=\"screen-reader-text\">Models for Maxima &#8211; Extremes Part 2<\/span><\/a><\/p>\n","protected":false},"author":35,"featured_media":719,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","slim_seo":{"title":"Models for Maxima - Extremes Part 2 - Matthew Speers","description":"What is an Extreme? 
In my last post, I introduced the ideas and motivations behind the study of extreme value theory, an area of statistics central to engineeri"},"footnotes":""},"categories":[1],"tags":[],"class_list":["post-718","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/posts\/718","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/users\/35"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/comments?post=718"}],"version-history":[{"count":16,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/posts\/718\/revisions"}],"predecessor-version":[{"id":740,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/posts\/718\/revisions\/740"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/media\/719"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/media?parent=718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/categories?post=718"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/matthew-speers\/wp-json\/wp\/v2\/tags?post=718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}