{"id":250,"date":"2021-04-14T11:18:52","date_gmt":"2021-04-14T11:18:52","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/?p=250"},"modified":"2021-04-30T12:14:13","modified_gmt":"2021-04-30T12:14:13","slug":"statistics-in-social-science3-step-by-step-tutorial-on-one-way-anova-test","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/2021\/04\/14\/statistics-in-social-science3-step-by-step-tutorial-on-one-way-anova-test\/","title":{"rendered":"Statistics in Social Science(3): Step-by-Step tutorial on One-way ANOVA test"},"content":{"rendered":"\n<p><span class=\"has-inline-color has-secondary-color\">This post explains the one-way ANOVA test in detail (its assumptions, when to use it, and how to interpret it), and ends with a worked example in R.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is this test for?<\/h2>\n\n\n\n<p>You may be familiar with the t-test, or with its nonparametric alternatives, used to test whether two groups differ (e.g., whether the mean score differs between two classes, or whether one treatment is better than another). The one-way analysis of variance (ANOVA) is used to <strong>determine if there is a significant difference among the means of three or more independent groups<\/strong>. For example, it could be used to ask:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>whether the mean score differs among four classes<\/li><li>whether the mean effect differs among three types of treatment<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Assumptions:<\/h2>\n\n\n\n<p>There is no free lunch. 
To use the one-way ANOVA test, the data should satisfy three assumptions:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>The response variable is normally distributed within each group (technically, it is the residuals that need to be normally distributed, which amounts to the same condition here). For example, if we want to compare the mean score of three classes, the score should have a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_distribution#:~:text=The%20normal%20distribution%20is%20the,a%20specified%20mean%20and%20variance.\">normal distribution<\/a> within each class. <\/li><li>The variances are homogeneous. This means the population variance should be equal across groups. For example, the scores of the students in the three classes should show a similar spread. <\/li><li>The observations should be independent. This means one observation does not influence another. For example, student A&#8217;s grade does not influence student B&#8217;s grade, as they took the exam independently.<\/li><\/ul>\n\n\n\n<p>All three assumptions need to be checked whenever the one-way ANOVA test is used. Now, let&#8217;s look at how to run the ANOVA test in R.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How to do it and interpret it (an example in R)<\/h2>\n\n\n\n<p>Let&#8217;s use R&#8217;s built-in dataset &#8216;PlantGrowth&#8217;. It contains the weights of 30 plants in three groups: 10 plants that received no treatment (the control group, ctrl), 10 plants that received treatment 1 (trt1), and 10 plants that received treatment 2 (trt2). 
Our aim is to find out whether the mean weight differs among the three groups.<\/p>\n\n\n\n<p>First, let&#8217;s draw a boxplot to look at the data graphically.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-2-1024x541.png\" alt=\"\" class=\"wp-image-252\" width=\"566\" height=\"298\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-2-1024x541.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-2-300x158.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-2-768x406.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-2.png 1043w\" sizes=\"auto, (max-width: 566px) 100vw, 566px\" \/><\/figure><\/div>\n\n\n\n<p>From the boxplot, plants that received treatment 1 appear to weigh slightly less than those in the control group, but the difference is not large. Plants that received treatment 2 appear to weigh more than those in the other two groups.<\/p>\n\n\n\n<p>Next, we test the difference with a one-way ANOVA and get the following result:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># PlantGrowth is built into R; store it under the name used below\ndata &lt;- PlantGrowth\nres.aov &lt;- aov(weight ~ group, data = data)\n# Summary of the analysis\nsummary(res.aov)\n            Df Sum Sq Mean Sq F value Pr(&gt;F)  \ngroup        2  3.766  1.8832   4.846 0.0159 *\nResiduals   27 10.492  0.3886                 \n---\nSignif. 
codes:  0 \u2018***\u2019 0.001 \u2018**\u2019 0.01 \u2018*\u2019 0.05 \u2018.\u2019 0.1 \u2018 \u2019 1<\/code><\/pre>\n\n\n\n<h5 class=\"wp-block-heading\">Interpretation<\/h5>\n\n\n\n<p>At the 5% significance level, the p-value of the test is below 0.05 (p = 0.0159 &lt; 0.05), so we conclude that the group means differ significantly. <\/p>\n\n\n\n<p>However, the ANOVA only tells us that some difference exists among the groups; it does not tell us which pairs of groups differ. To compare specific pairs of groups, we can use Tukey&#8217;s multiple pairwise comparisons:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>TukeyHSD(res.aov)\n  Tukey multiple comparisons of means\n    95% family-wise confidence level\nFit: aov(formula = weight ~ group, data = data)\n$group\n            diff        lwr       upr     p adj\ntrt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711\ntrt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960\ntrt2-trt1  0.865  0.1737839 1.5562161 0.0120064<\/code><\/pre>\n\n\n\n<p>At the 5% significance level, we conclude that the mean plant weight under treatment 2 is significantly higher than under treatment 1 (adjusted p = 0.012). However, there is no statistical evidence that treatment 2 differs from the control (adjusted p = 0.198), or that treatment 1 differs from the control (adjusted p = 0.391).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Checking the assumptions<\/h4>\n\n\n\n<p>Now let&#8217;s check the assumptions:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Normality assumption: On the QQ plot, most points lie on the straight line except points 4, 15 and 17. However, we only have a small sample (30 plants), so a normal QQ plot like this is reasonable. We can also test normality with the Shapiro-Wilk test. 
At the 5% significance level, we cannot reject the null hypothesis that the residuals are normally distributed (p = 0.4379).<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-3-1024x541.png\" alt=\"\" class=\"wp-image-253\" width=\"623\" height=\"328\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-3-1024x541.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-3-300x158.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-3-768x406.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-3.png 1043w\" sizes=\"auto, (max-width: 623px) 100vw, 623px\" \/><\/figure><\/div>\n\n\n\n<pre class=\"wp-block-code\"><code>shapiro.test(x = residuals(res.aov) )\n\n\tShapiro-Wilk normality test\n\ndata:  residuals(res.aov)\nW = 0.96607, p-value = 0.4379<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\"><li>Homogeneous variance assumption: From the Residuals vs Fitted plot, there is slight evidence of non-constant variance, since the spread differs somewhat between groups. However, it does not look serious. Levene&#8217;s test (leveneTest, from the car package) can also be used to test the homogeneity of variance. 
At the 5% significance level, we cannot reject the null hypothesis of equal variances (p = 0.3412 &gt; 0.05), so it is reasonable to assume homogeneity of variance across the treatment groups.<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-4-1024x541.png\" alt=\"\" class=\"wp-image-254\" width=\"599\" height=\"316\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-4-1024x541.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-4-300x158.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-4-768x406.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/image-4.png 1043w\" sizes=\"auto, (max-width: 599px) 100vw, 599px\" \/><\/figure><\/div>\n\n\n\n<pre class=\"wp-block-code\"><code>library(car)  # leveneTest() comes from the car package\nleveneTest(weight ~ group, data = data)\nLevene's Test for Homogeneity of Variance (center = median)\n      Df F value Pr(&gt;F)\ngroup  2  1.1192 0.3412\n      27      <\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\"><li>Independence assumption: This assumption needs more thought, since it depends on how the data were collected. In our example, we can assume it holds, since the weight of one plant does not influence the weight of the others.<\/li><\/ul>\n\n\n\n<p>That&#8217;s all done! 
This post draws on the following page, which includes the specific R code:<\/p>\n\n\n\n<p><a href=\"http:\/\/mathsbox.com\/notebooks\/python-utilities.html\">http:\/\/mathsbox.com\/notebooks\/python-utilities.html<\/a><\/p>\n\n\n\n<p>I also found these useful guides that use SPSS to run a one-way ANOVA test:<\/p>\n\n\n\n<p><a href=\"https:\/\/statistics.laerd.com\/statistical-guides\/one-way-anova-statistical-guide-3.php\">https:\/\/statistics.laerd.com\/statistical-guides\/one-way-anova-statistical-guide-3.php<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/statistics.laerd.com\/spss-tutorials\/one-way-anova-using-spss-statistics.php\">https:\/\/statistics.laerd.com\/spss-tutorials\/one-way-anova-using-spss-statistics.php<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This blog will explain the one-way ANOVA test in detail (including assumptions, implementing situation and explanation), and an example analysed&hellip;<\/p>\n","protected":false},"author":25,"featured_media":252,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[],"class_list":["post-250","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blogs"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/250","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/users\/25"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk
\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/comments?post=250"}],"version-history":[{"count":6,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/250\/revisions"}],"predecessor-version":[{"id":378,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/250\/revisions\/378"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/media\/252"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/media?parent=250"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/categories?post=250"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/tags?post=250"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}