{"id":365,"date":"2021-05-13T20:58:54","date_gmt":"2021-05-13T20:58:54","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/?p=365"},"modified":"2021-05-14T00:22:12","modified_gmt":"2021-05-14T00:22:12","slug":"markov-chains-and-hidden-models","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/2021\/05\/13\/markov-chains-and-hidden-models\/","title":{"rendered":"Markov Chains and Hidden Models"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\"><span class=\"has-inline-color has-black-color\"><strong>Markov Chains<\/strong><\/span><\/h3>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">Markov chains are a fairly common, and relatively simple, way to statistically model random processes. They have been used in many different domains, ranging from text generation to financial modeling. Overall, Markov chains are conceptually quite intuitive and very accessible, in that they can be implemented without any advanced statistical or mathematical concepts. They are a great way to start learning about probabilistic modeling and data science techniques.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"has-inline-color has-black-color\"><strong>Weather example<\/strong><\/span><\/h3>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">Imagine that each day when you wake up, you observe whether it is sunny or rainy. You might then want to assign probabilities: given the state of the day today (sunny or rainy), what is the probability that tomorrow will be sunny? You can use this <strong>distribution<\/strong> to predict the weather for days to come, based on whatever the current weather state is at the time. 
This is schematically illustrated in the following picture:<\/span><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"743\" height=\"219\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Sunny_Rainy.png\" alt=\"\" class=\"wp-image-371\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Sunny_Rainy.png 743w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Sunny_Rainy-300x88.png 300w\" sizes=\"(max-width: 743px) 100vw, 743px\" \/><figcaption><strong>Sunny\/Rainy day: a 2-state Markov chain<\/strong><\/figcaption><\/figure><\/div>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">Based on our observations, we have estimated the probabilities for the transitions above. In particular, if today is sunny, there is a 90% chance that tomorrow is sunny and a 10% chance that it rains. If, however, today is rainy, there is a 50% chance of transitioning to either a sunny or a rainy day. These transitions are also called <strong>jumps<\/strong>, and the system itself is called a <strong>Markov chain<\/strong>, after its inventor, the Russian mathematician <a href=\"https:\/\/en.wikipedia.org\/wiki\/Andrey_Markov\" data-type=\"URL\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/Andrey_Markov\">Andrey Markov<\/a>. The feature that makes Markov chains special is the so-called <strong>memoryless<\/strong> property: how the chain evolves depends only on the current state of the system. This is the case for our weather model above, as whether it is sunny or rainy tomorrow depends only on what the state is now. This property typically leaves Markov chains unable to produce sequences in which some longer-term underlying trend would be expected to occur. 
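The jump mechanism described above can be sketched in a few lines of Python (a minimal illustration, not production code; the function names are my own, and the transition probabilities are the ones from the sunny/rainy example):

```python
import random

# Transition probabilities from the sunny/rainy example:
# P[current][next] = probability of jumping from `current` to `next`.
P = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

def step(state):
    """Sample tomorrow's weather using only today's state (the memoryless property)."""
    nxt = P[state]
    return random.choices(list(nxt), weights=list(nxt.values()))[0]

def simulate(state, n_days):
    """Generate a weather path of n_days jumps starting from `state`."""
    path = [state]
    for _ in range(n_days):
        state = step(state)
        path.append(state)
    return path

random.seed(0)
print(simulate("sunny", 7))
```

Note that `step` receives only the current state, never the history; that is exactly the memoryless property expressed in code.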
For example, while a Markov chain may be able to mimic the writing style of an author based on word frequencies, it would be unable to produce text with deep meaning or thematic significance, since these are developed over much longer sequences of text.&nbsp;<strong>Markov chains therefore lack the ability to produce context-dependent content, since they cannot take into account the full chain of prior states.<\/strong> <\/span><\/p>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">A typical way to represent a Markov chain is through its transition matrix, which encodes how the chain moves forward in time: the entry in row i, column j is the probability of jumping from state i to state j. For our example above, the transition matrix is:<\/span><\/p>\n\n\n\n<figure class=\"wp-block-table aligncenter\"><table><tbody><tr><td><\/td><td><span class=\"has-inline-color has-black-color\"><strong>1: Sunny<\/strong><\/span><\/td><td><span class=\"has-inline-color has-black-color\"><strong>2: Rainy<\/strong><\/span><\/td><\/tr><tr><td><span class=\"has-inline-color has-black-color\"><strong>1: Sunny<\/strong><\/span><\/td><td><span class=\"has-inline-color has-black-color\"><strong>0.9<\/strong><\/span><\/td><td><span class=\"has-inline-color has-black-color\"><strong>0.1<\/strong><\/span><\/td><\/tr><tr><td><span class=\"has-inline-color has-black-color\"><strong>2: Rainy<\/strong><\/span><\/td><td><span class=\"has-inline-color has-black-color\"><strong>0.5<\/strong><\/span><\/td><td><span class=\"has-inline-color has-black-color\"><strong>0.5<\/strong><\/span><\/td><\/tr><\/tbody><\/table><figcaption><strong><span class=\"has-inline-color has-black-color\">Transition matrix (rows: today&#8217;s state; columns: tomorrow&#8217;s state)<\/span><\/strong><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"has-inline-color has-black-color\"><strong>Hidden Markov Models<\/strong><\/span><\/h3>\n\n\n\n<p class=\"has-text-align-left\"><span class=\"has-inline-color has-black-color\">Before introducing the formal definition, let us first take a look at the following figure:<\/span><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"369\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Hidden_Final-1024x369.png\" alt=\"\" class=\"wp-image-374\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Hidden_Final-1024x369.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Hidden_Final-300x108.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Hidden_Final-768x277.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/Hidden_Final.png 1171w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption><span class=\"has-inline-color has-black-color\"><strong>Hidden Markov Model<\/strong><\/span><\/figcaption><\/figure><\/div>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">Imagine that there is a process <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}<\/span> that evolves in time. We will consider the so-called jump chain, where each time step, 1, 2, . . ., represents a transition of the process to another state. The process <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}<\/span> will be the unobservable component of the hidden Markov model, and at each time <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span>, <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}_{t}<\/span> will contain possibly many unknowns that we cannot see. In our simple simulation example, <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}<\/span> will contain the number of rabbits <span class=\"wp-katex-eq\" data-display=\"false\">R_{t}<\/span> and the number of foxes <span class=\"wp-katex-eq\" data-display=\"false\">F_{t}<\/span> at time <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span>. 
Another assumption will be that if the process is currently at time <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span>, how it evolves in the future is independent of its past. This is the Markov process in our hidden model.<\/span><\/p>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">OK, so what can we see? In practice, we only obtain noisy observations of some of the unknowns contained in the process <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}<\/span>. Furthermore, while in theory we could obtain observations at each transition, in practice this might not be possible. For example, you do not observe the population of rabbits each time it increases by one, but you might send a person to record their numbers once a month. This will be our noisy observation, say <span class=\"wp-katex-eq\" data-display=\"false\">Y_{t}<\/span> at time <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span>. In our discussion, we will assume that some of the unknowns are unobservable even through noisy observations (this is realistic, as foxes are rarely seen). Additionally, we shall assume that <span class=\"wp-katex-eq\" data-display=\"false\">Y_{t}<\/span> depends only on the current state of the system <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}_{t}<\/span> and not on the past. 
Mathematically, we can express a hidden Markov model as follows:<\/span><\/p>\n\n\n\n<p><span class=\"has-inline-color has-black-color\"><strong>Definition<\/strong>: The pair <span class=\"wp-katex-eq\" data-display=\"false\">(\\mathbf{X}_{t}, Y_{t})<\/span> is a hidden Markov model if:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><span class=\"has-inline-color has-black-color\"><span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}_{t}<\/span> is a Markov process whose behaviour is not directly observable<\/span><\/li><li><span class=\"has-inline-color has-black-color\"><span class=\"wp-katex-eq\" data-display=\"false\">P (Y_{t} \\in A|\\mathbf{X}_{1} = \\mathbf{x}_{1} , . . . , \\mathbf{X}_{t} = \\mathbf{x}_{t}) = P (Y_{t} \\in A|\\mathbf{X}_{t} = \\mathbf{x}_{t})<\/span>, for all <span class=\"wp-katex-eq\" data-display=\"false\">t \\geq 1<\/span> and any measurable set <span class=\"wp-katex-eq\" data-display=\"false\">A<\/span><\/span><\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"has-inline-color has-black-color\"><strong>Simulation<\/strong><\/span><\/h3>\n\n\n\n<p><span class=\"has-inline-color has-black-color\">If we assume that the individual variables in the hidden process <span class=\"wp-katex-eq\" data-display=\"false\">\\mathbf{X}<\/span>, rabbits and foxes, each follow a Poisson process, then the rates of the exponential holding times between jumps determine how the process evolves. If we settle on Gamma-distributed noise, with the mean of the Gamma equal to the true state, we can simulate the system. 
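Under these assumptions, the system can be simulated along the following lines (a sketch only: the jump rates, initial populations, and Gamma noise level are made-up illustrative values, and the function names are my own):

```python
import random

# Illustrative jump rates (made up): expected jumps per unit time per population.
RATES = {"rabbit": 2.0, "fox": 0.5}

def simulate_hidden(t_max, seed=0):
    """Simulate the hidden jump chain (R_t, F_t).

    Holding times between jumps are exponential with the total rate;
    at each jump one of the two populations increases by one
    (two competing Poisson processes)."""
    rng = random.Random(seed)
    t, rabbits, foxes = 0.0, 10, 3
    path = [(t, rabbits, foxes)]
    total = sum(RATES.values())
    while True:
        t += rng.expovariate(total)          # exponential holding time
        if t > t_max:
            break
        if rng.random() < RATES["rabbit"] / total:
            rabbits += 1                     # a rabbit jump
        else:
            foxes += 1                       # a fox jump
        path.append((t, rabbits, foxes))
    return path

def observe(rabbits, rng):
    """Noisy count: Gamma noise whose mean equals the true rabbit population."""
    shape = 20.0                             # made-up noise level
    return rng.gammavariate(shape, rabbits / shape)  # mean = shape * scale = rabbits

# Only the rabbits are observed, once a "month", matching the assumption
# that the foxes stay hidden even from the noisy observations.
rng = random.Random(1)
path = simulate_hidden(t_max=12.0)
obs = [(m, observe(max(r for t, r, f in path if t <= m), rng))
       for m in range(1, 13)]
```

The Gamma scale is chosen as `rabbits / shape` so that the observation is unbiased; a larger `shape` would mean less noisy monthly counts.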
The following plot shows the noisy observations as red crosses, the fox population as the orange path, and the rabbit population as the blue path:<\/span><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/True_Simulation_Plus_Observations.png\" alt=\"\" class=\"wp-image-383\" width=\"622\" height=\"437\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/True_Simulation_Plus_Observations.png 622w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-content\/uploads\/sites\/31\/2021\/05\/True_Simulation_Plus_Observations-300x211.png 300w\" sizes=\"(max-width: 622px) 100vw, 622px\" \/><figcaption><span class=\"has-inline-color has-black-color\"><strong>Population of rabbits and foxes<\/strong><\/span><\/figcaption><\/figure><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Markov Chains Markov chains are a fairly common, and relatively simple, way to statistically model random processes. They have been used in many different domains, ranging from text generation to financial modeling. 
Overall, Markov Chains are conceptually quite intuitive, and are very accessible in that they can be implemented without the use of any advanced&hellip; <br \/> <a class=\"read-more\" href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/2021\/05\/13\/markov-chains-and-hidden-models\/\">Read more<\/a><\/p>\n","protected":false},"author":32,"featured_media":366,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-365","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/posts\/365","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/users\/32"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/comments?post=365"}],"version-history":[{"count":12,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/posts\/365\/revisions"}],"predecessor-version":[{"id":387,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/posts\/365\/revisions\/387"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/media\/366"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/media?parent=365"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"ht
tps:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/categories?post=365"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/martin-dimitrov\/wp-json\/wp\/v2\/tags?post=365"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}