{"id":231,"date":"2020-04-29T09:07:00","date_gmt":"2020-04-29T09:07:00","guid":{"rendered":"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/?p=231"},"modified":"2020-05-01T13:51:48","modified_gmt":"2020-05-01T13:51:48","slug":"missing-data-part-1-introducing-the-missingness-mechanism","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/2020\/04\/29\/missing-data-part-1-introducing-the-missingness-mechanism\/","title":{"rendered":"Missing Data: Introducing the Missingness Mechanism"},"content":{"rendered":"\n<p>Often when we collect data, some is missing. What do we do? Well, there is a load of stuff to cover here (and I\u2019m going to do it over a few posts). This post is going to cover an important question: what is causing the data to be missing?<\/p>\n\n\n\n<p>What causes the data to be missing is known as the Missingness Mechanism. There are three main types: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR).<\/p>\n\n\n\n<p>You can think of these as traffic lights: MCAR is green (easy to deal with), MAR is amber (a bit problematic but there are some decent methods out there) and MNAR is red (a total pig to deal with).<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Missing Completely at Random<\/h5>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\n<div class=\"wp-block-image\"><figure class=\"alignright size-large is-resized\"><img decoding=\"async\" src=\"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-1024x768.jpg\" alt=\"\" class=\"wp-image-211\" width=\"256\" height=\"192\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-1024x768.jpg 1024w, 
https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-300x225.jpg 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-768x576.jpg 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-1536x1152.jpg 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-2048x1536.jpg 2048w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/04\/Markus10-1140x855.jpg 1140w\" sizes=\"(max-width: 256px) 100vw, 256px\" \/><figcaption>Markus snoozes peacefully, having decided that his data is MCAR<\/figcaption><\/figure><\/div>\n<\/div><\/div>\n\n\n\n<p>As the name suggests, Missing Completely at Random means that the missingness in your data follows a totally random pattern. Nothing in the data, observed or unobserved, is driving it. This is nice, because you can get away with some simple methods to deal with it.<\/p>\n\n\n\n<p>I describe some of those methods in <a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/2020\/05\/01\/missing-data-part-ii-what-to-do-with-missing-data\/\">my next post<\/a> on missing data. <\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Missing at Random<\/h5>\n\n\n\n<p>Missing at Random data is where the missingness is driven by something in the data we are collecting, but that something is a thing we have observed. 
The preceding sentence starts to give me a headache if I think about it too much, so I prefer to think of it in terms of an example.<\/p>\n\n\n\n<p>Imagine a university does a survey of former students, to find out where they are working, what their income bracket is, etc.<\/p>\n\n\n\n<p>Let\u2019s say that alumni who work in a particular sector are less likely to disclose their income. But they do disclose which sector they work in. That data would be Missing at Random.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Missing Not at Random<\/h5>\n\n\n\n<p>But what if alumni are less likely to respond to that income question the more they earn? Then we have Missing Not at Random data. The missingness depends on something we do not observe: the missing income values themselves.<\/p>\n\n\n\n<p>This is very difficult to deal with and often causes bias in our analysis. To make it even more difficult, we cannot test, from the observed data alone, whether the missingness mechanism is Missing at Random or Missing Not at Random.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Want to know more?<\/h4>\n\n\n\n<p>You can read more about missingness mechanisms in Chapter 1 of the book below, which is a really good book on missing data in general.<\/p>\n\n\n\n<p>Little, R. J. A. and Rubin, D. B. (2020). <em>Statistical analysis with missing data<\/em>. Wiley Series in Probability and Statistics. Wiley, Hoboken, NJ, third edition.<\/p>\n\n\n\n<p>I\u2019ve also included a link to the paper that introduced the idea of considering missingness mechanisms.<\/p>\n\n\n\n<p><a href=\"https:\/\/academic.oup.com\/biomet\/article-abstract\/63\/3\/581\/270932\">Rubin, D. B. (1976). Inference and missing data. <em>Biometrika<\/em>, 63(3):581-592.<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<p>Thank you for reading. <a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/2020\/05\/01\/missing-data-part-ii-what-to-do-with-missing-data\/\">Click here<\/a> to see my next post in this series. 
This will discuss some simple methods to deal with missing data.<\/p>\n\n\n\n<p>Or you can skip to <a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/2020\/05\/01\/dealing-with-imputation-uncertainty\/\">my final post<\/a> on missing data, which discusses a method that lets you quantify the uncertainty you introduce into your analysis by using some of the methods discussed in my second post.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<p>I wrote a 20-page report on Missing Data as part of my studies at STOR-i. It discusses the ideas above in more depth. If you want to take a look, <a href=\"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-content\/uploads\/sites\/14\/2020\/05\/RT2__Missing_Data_TW_1.5_spacing.pdf\">click here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Often when we collect data, some is missing. What do we do? This post is going to cover an important question: what is causing the data to be 
missing?<\/p>\n","protected":false},"author":8,"featured_media":246,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-231","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/posts\/231","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/comments?post=231"}],"version-history":[{"count":17,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/posts\/231\/revisions"}],"predecessor-version":[{"id":292,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/posts\/231\/revisions\/292"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/media\/246"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/media?parent=231"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/categories?post=231"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/tessa-wilkie\/wp-json\/wp\/v2\/tags?post=231"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}