class: center, middle # Many Types of Categories: Multi-Way and Factorial ANOVA ![](images/anova/yoda_factorial.jpg) --- class: center, middle # Etherpad <br><br> <center><h3>https://etherpad.wikimedia.org/p/607-many-predictors-2020</h3></center> --- # A Non-Additive World 1. Replicating Categorical Variable Combinations: Factorial Models 2. Evaluating Interaction Effects 3. How to Look at Means with an Interaction Effect 4. Unbalanced Data --- # The world isn't additive - Until now, we have assumed factors combine additively - the effect of one is not dependent on the effect of the other -- - BUT - what if the effect of one factor depends on another? -- - This is an **INTERACTION** and is quite common -- - Biology: The science of "It depends..." -- - This is challenging to think about and visualize, but if you can master it, you will go far! --- # Intertidal Grazing! .center[ ![image](./images/22/grazing-expt.jpeg) #### Do grazers reduce algal cover in the intertidal? ] --- # Experiment Replicated on Two Ends of a gradient ![image](./images/22/zonation.jpg) --- # Factorial Experiment ![image](./images/22/factorial_blocks.jpg) --- # Factorial Design ![image](./images/22/factorial_layout.jpg) Note: You can have as many treatment types or observed category combinations as you want (and then 3-way, 4-way, etc. interactions) --- # The Data: See the dependency of one treatment on another? 
<img src="anova_3_files/figure-html/plot_algae-1.png" style="display: block; margin: auto;" /> --- # If we had fit y ~ a + b, residuals look weird <img src="anova_3_files/figure-html/graze_assumptions-1.png" style="display: block; margin: auto;" /> A Tukey Non-Additivity Test would Scream at us --- # A Factorial Model `$$\large y_{ijk} = \beta_{0} + \sum \beta_{i}x_{i} + \sum \beta_{j}x_{j} + \sum \beta_{ij}x_{ij} + \epsilon_{ijk}$$` `$$\large \epsilon_{ijk} \sim N(0, \sigma^{2} )$$` `$$\large x_{i} = 0,1, x_{j} = 0,1, x_{ij} = 0,1$$` - Note the new last term - Deviation due to treatment combination -- <hr> This is still something that can be in the form `$$\Large \boldsymbol{Y} = \boldsymbol{\beta X} + \boldsymbol{\epsilon}$$` --- # The Data (Four Rows) <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> height </th> <th style="text-align:left;"> herbivores </th> <th style="text-align:right;"> sqrtarea </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> low </td> <td style="text-align:left;"> minus </td> <td style="text-align:right;"> 9.4055728 </td> </tr> <tr> <td style="text-align:left;"> low </td> <td style="text-align:left;"> plus </td> <td style="text-align:right;"> 11.9767608 </td> </tr> <tr> <td style="text-align:left;"> mid </td> <td style="text-align:left;"> minus </td> <td style="text-align:right;"> 0.7071068 </td> </tr> <tr> <td style="text-align:left;"> mid </td> <td style="text-align:left;"> plus </td> <td style="text-align:right;"> 0.7071068 </td> </tr> </tbody> </table> --- # The Dummy-Coded Data <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;"> (Intercept) </th> <th style="text-align:right;"> heightmid </th> <th style="text-align:right;"> herbivoresplus </th> <th style="text-align:right;"> heightmid:herbivoresplus </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> 
<td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> </tr> </tbody> </table> --- # Fitting **Least Squares** ```r graze_int <- lm(sqrtarea ~ height + herbivores + height:herbivores, data=algae) ## OR graze_int <- lm(sqrtarea ~ height*herbivores, data=algae) ``` -- **Likelihood** ```r graze_int_glm <- glm(sqrtarea ~ height*herbivores, data=algae, family = gaussian(link = "identity")) ``` **Bayes** ```r graze_int_brm <- brm(sqrtarea ~ height*herbivores, data=algae, family = gaussian(link = "identity"), chains = 2, file = "intertidal_brms.rds") ``` --- # Assumptions are Met <img src="anova_3_files/figure-html/graze_assumptions_int-1.png" style="display: block; margin: auto;" /> --- # A Non-Additive World 1. Replicating Categorical Variable Combinations: Factorial Models 2. .red[Evaluating Interaction Effects] 3. How to Look at Means with an Interaction Effect 4. 
Unbalanced Data --- # Omnibus Tests for Interactions - Can do an F-Test `$$SS_{Total} = SS_{A} + SS_{B} + SS_{AB} +SS_{Error}$$` `$$SS_{AB} = n\sum_{i}\sum_{j}(\bar{Y_{ij}} - \bar{Y_{i}} - \bar{Y_{j}} + \bar{Y})^{2}$$` `$$df=(i-1)(j-1)$$` -- - Can do an ANODEV - Compare A + B versus A + B + A:B -- - Can do CV as we did before, only now one model has an interaction - Again, think about what models you are comparing -- - Can look at finite population variance of interaction --- # ANOVA <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> height </td> <td style="text-align:right;"> 88.97334 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.3740858 </td> <td style="text-align:right;"> 0.5430962 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 1512.18349 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 6.3579319 </td> <td style="text-align:right;"> 0.0143595 </td> </tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 2616.95555 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 11.0029142 </td> <td style="text-align:right;"> 0.0015486 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 14270.52238 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> --- # ANODEV <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> LR Chisq </th> <th style="text-align:right;"> Df 
</th> <th style="text-align:right;"> Pr(>Chisq) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> height </td> <td style="text-align:right;"> 0.3740858 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.5407855 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 6.3579319 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.0116858 </td> </tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 11.0029142 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.0009097 </td> </tr> </tbody> </table> --- # What do the Coefficients Mean? <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 32.91450 </td> <td style="text-align:right;"> 3.855532 </td> <td style="text-align:right;"> 8.536955 </td> <td style="text-align:right;"> 0.0000000 </td> </tr> <tr> <td style="text-align:left;"> heightmid </td> <td style="text-align:right;"> -10.43090 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> -1.913034 </td> <td style="text-align:right;"> 0.0605194 </td> </tr> <tr> <td style="text-align:left;"> herbivoresplus </td> <td style="text-align:right;"> -22.51075 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> -4.128484 </td> <td style="text-align:right;"> 0.0001146 </td> </tr> <tr> <td style="text-align:left;"> heightmid:herbivoresplus </td> <td style="text-align:right;"> 25.57809 </td> <td style="text-align:right;"> 7.711064 </td> <td style="text-align:right;"> 3.317064 </td> <td 
style="text-align:right;"> 0.0015486 </td> </tr> </tbody> </table> - Intercept chosen as basal condition (low, herbivores -) -- - Changing height to mid is associated with a loss of 10 units of algae relative to low/- -- - Adding herbivores is associated with a loss of 22 units of algae relative to low/- -- - BUT - if you add herbivores and mid together, that combination is associated with an extra 25 units of algae beyond the sum of the two separate effects (a net loss of about 7 units relative to low/-) -- .center[**NEVER TRY TO INTERPRET ADDITIVE EFFECTS ALONE WHEN AN INTERACTION IS PRESENT**<br>that way lies madness] --- # A Non-Additive World 1. Replicating Categorical Variable Combinations: Factorial Models 2. Evaluating Interaction Effects 3. .red[How to Look at Means with an Interaction Effect] 4. Unbalanced Data --- # Let's Look at Means, Figures, and Posthocs .center[.middle[ ![image](./images/22/gosling_p_value.jpg) ]] --- # This view is intuitive <img src="anova_3_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" /> --- # This view is also intuitive <img src="anova_3_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" /> --- # Posthocs and Factorial Designs - Must look at simple effects first in the presence of an interaction - The effects of individual treatment combinations - If you have an interaction, this is what you do! -- - Main effects describe effects of one variable in the complete absence of the other - Useful only if one treatment CAN be absent - Only have meaning if there is no interaction --- # Posthoc Comparisons Averaging Over Levels of Height - Misleading! 
``` contrast estimate SE df t.ratio p.value minus - plus 9.72 3.86 60 2.521 0.0144 Results are averaged over the levels of: height ``` --- # Posthoc with Simple Effects <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> contrast </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> df </th> <th style="text-align:right;"> t.ratio </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> low minus - mid minus </td> <td style="text-align:right;"> 10.430905 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> 1.913034 </td> <td style="text-align:right;"> 0.0605194 </td> </tr> <tr> <td style="text-align:left;"> low minus - low plus </td> <td style="text-align:right;"> 22.510748 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> 4.128484 </td> <td style="text-align:right;"> 0.0001146 </td> </tr> <tr> <td style="text-align:left;"> low minus - mid plus </td> <td style="text-align:right;"> 7.363559 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> 1.350481 </td> <td style="text-align:right;"> 0.1819337 </td> </tr> <tr> <td style="text-align:left;"> mid minus - low plus </td> <td style="text-align:right;"> 12.079843 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> 2.215450 </td> <td style="text-align:right;"> 0.0305355 </td> </tr> <tr> <td style="text-align:left;"> mid minus - mid plus </td> <td style="text-align:right;"> -3.067346 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> -0.562553 </td> <td style="text-align:right;"> 
0.5758352 </td> </tr> <tr> <td style="text-align:left;"> low plus - mid plus </td> <td style="text-align:right;"> -15.147189 </td> <td style="text-align:right;"> 5.452546 </td> <td style="text-align:right;"> 60 </td> <td style="text-align:right;"> -2.778003 </td> <td style="text-align:right;"> 0.0072896 </td> </tr> </tbody> </table> -- .center[**That's a Lot to Drink In!**] --- # Might be easier visually <img src="anova_3_files/figure-html/graze_posthoc_plot-1.png" style="display: block; margin: auto;" /> --- # We are often interested in something simpler... <img src="anova_3_files/figure-html/graze_posthoc_plot2-1.png" style="display: block; margin: auto;" /> --- # Why think about interactions - "It depends" is a rule in biology - Context-dependent interactions are everywhere - Using categorical predictors in a factorial design is an elegant way to see interactions without worrying about shapes of relationships - BUT - it all comes down to a general linear model! And the same inferential frameworks we have been dealing with since day 1 --- # Final Thought - You can have 2, 3, and more-way interactions! .center[.middle[ ![image](./images/22/4_way_interaction.jpg) ]] --- # A Non-Additive World 1. Replicating Categorical Variable Combinations: Factorial Models 2. Evaluating Interaction Effects 3. How to Look at Means with an Interaction Effect 4. .red[Unbalanced Data] --- # What about unbalanced designs? ![](images/anova/unbalanced_cat.png) --- # Coda: Oh no! 
I lost a replicate (or two) ```r algae_unbalanced <- algae[-c(1:5), ] ``` --- # Type of Sums of Squares Matters Type I <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Mean Sq </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> height </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 151.8377 </td> <td style="text-align:right;"> 151.8377 </td> <td style="text-align:right;"> 0.6380017 </td> <td style="text-align:right;"> 0.4278712 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1384.0999 </td> <td style="text-align:right;"> 1384.0999 </td> <td style="text-align:right;"> 5.8158020 </td> <td style="text-align:right;"> 0.0192485 </td> </tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2933.5934 </td> <td style="text-align:right;"> 2933.5934 </td> <td style="text-align:right;"> 12.3265653 </td> <td style="text-align:right;"> 0.0008998 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> 13089.4237 </td> <td style="text-align:right;"> 237.9895 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> Type II <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td 
style="text-align:left;"> height </td> <td style="text-align:right;"> 77.87253 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.3272099 </td> <td style="text-align:right;"> 0.5696373 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 1384.09995 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5.8158020 </td> <td style="text-align:right;"> 0.0192485 </td> </tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 2933.59337 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12.3265653 </td> <td style="text-align:right;"> 0.0008998 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 13089.42369 </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> --- # Enter Type III <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 14188.804 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 59.619447 </td> <td style="text-align:right;"> 0.0000000 </td> </tr> <tr> <td style="text-align:left;"> height </td> <td style="text-align:right;"> 1175.967 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.941256 </td> <td style="text-align:right;"> 0.0303521 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 4242.424 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 17.826097 </td> <td style="text-align:right;"> 0.0000915 </td> 
</tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 2933.593 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12.326565 </td> <td style="text-align:right;"> 0.0008998 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 13089.424 </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> -- Compare to type II <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> height </td> <td style="text-align:right;"> 77.87253 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.3272099 </td> <td style="text-align:right;"> 0.5696373 </td> </tr> <tr> <td style="text-align:left;"> herbivores </td> <td style="text-align:right;"> 1384.09995 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5.8158020 </td> <td style="text-align:right;"> 0.0192485 </td> </tr> <tr> <td style="text-align:left;"> height:herbivores </td> <td style="text-align:right;"> 2933.59337 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12.3265653 </td> <td style="text-align:right;"> 0.0008998 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 13089.42369 </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> --- # What’s Going On: Type I, II, and III Sums of Squares **Type I Sums of Squares:** -- SS for A calculated from a model with A + Intercept versus just Intercept -- SS for B calculated from a model with A 
+ B + Intercept versus A + Intercept -- SS for A:B calculated from a model with A + B + A:B + Intercept versus A + B + Intercept -- This is **fine** for a balanced design. Variation evenly partitioned. With unbalanced data, the answer depends on the order in which terms are entered. --- # What’s Going On: Type I, II, and III Sums of Squares **Type II Sums of Squares:** -- SS for A calculated from a model with A + B + Intercept versus B + Intercept -- SS for B calculated from a model with A + B + Intercept versus A + Intercept -- SS for A:B calculated from a model with A + B + A:B + Intercept versus A + B + Intercept -- Interaction not incorporated in assessing main effects --- # What’s Going On: Type I, II, and III Sums of Squares **Type III Sums of Squares:** -- SS for A calculated from a model with A + B + A:B + Intercept versus B + A:B + Intercept -- SS for B calculated from a model with A + B + A:B + Intercept versus A + A:B + Intercept -- SS for A:B calculated from a model with A + B + A:B + Intercept versus A + B + Intercept -- Each SS is the unique contribution of a treatment -- **very conservative** --- # What’s Going On: Type I, II, and III Sums of Squares <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Type I </th> <th style="text-align:left;"> Type II </th> <th style="text-align:left;"> Type III </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Test for A </td> <td style="text-align:left;"> A v. 1 </td> <td style="text-align:left;"> A + B v. B </td> <td style="text-align:left;"> A + B + A:B v. B + A:B </td> </tr> <tr> <td style="text-align:left;"> Test for B </td> <td style="text-align:left;"> A + B v. A </td> <td style="text-align:left;"> A + B v. A </td> <td style="text-align:left;"> A + B + A:B v. A + A:B </td> </tr> <tr> <td style="text-align:left;"> Test for A:B </td> <td style="text-align:left;"> A + B + A:B v. A + B </td> <td style="text-align:left;"> A + B + A:B v. A + B </td> <td style="text-align:left;"> A + B + A:B v. A + B </td> </tr> </tbody> </table> --- # Which SS to Use? - Traditionally, we are urged to use Type III -- - What do type III models mean? - A + B + A:B v. B + A:B -- - Interactions the same for all, and if A:B is real, main effects not important -- - Type III has lower power for main effects -- - Type II produces more meaningful results if main effects are a concern - which they are!
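
---

# Sketch: Sums of Squares as Model Comparisons

The Type I and Type II comparisons described in the preceding slides can be checked by hand. What follows is a minimal sketch in base R, not the code behind the tables in this deck. It assumes simulated data (`sim`, with made-up cell means and error SD) as a stand-in for `algae`, drops five rows to unbalance the design, and computes the sums of squares for `height` as differences in residual sums of squares between nested models. The helper `rss()` is hypothetical and defined here only for illustration.

```r
# Simulated stand-in for the algae data (NOT the real data set):
# a 2 x 2 factorial with 16 replicates per cell and an interaction.
set.seed(607)
sim <- expand.grid(rep = 1:16,
                   height = c("low", "mid"),
                   herbivores = c("minus", "plus"))
sim$height <- factor(sim$height)
sim$herbivores <- factor(sim$herbivores)

# Made-up cell means and error SD, chosen only for illustration
is_mid  <- sim$height == "mid"
is_plus <- sim$herbivores == "plus"
cell_mean <- 33 - 10 * is_mid - 22 * is_plus + 25 * (is_mid & is_plus)
sim$sqrtarea <- rnorm(nrow(sim), mean = cell_mean, sd = 15)

# Lose five replicates from one cell, so the design is unbalanced
sim_unbalanced <- sim[-c(1:5), ]

# Hypothetical helper: residual sum of squares of a fitted lm
rss <- function(formula) deviance(lm(formula, data = sim_unbalanced))

# Type I SS for height: A + Intercept versus just Intercept
ss_height_type1 <- rss(sqrtarea ~ 1) - rss(sqrtarea ~ height)

# Type II SS for height: A + B + Intercept versus B + Intercept
ss_height_type2 <- rss(sqrtarea ~ herbivores) -
  rss(sqrtarea ~ height + herbivores)

# anova() on an lm gives sequential (Type I) sums of squares
fit <- lm(sqrtarea ~ height * herbivores, data = sim_unbalanced)
type1_table <- anova(fit)

type1_table
c(type1 = ss_height_type1, type2 = ss_height_type2)
```

The hand-computed Type I value should match the `height` row of `anova(fit)`, while the Type II value differs because the design is unbalanced. If the `car` package is installed, `car::Anova(fit, type = "II")` should report the same Type II sum of squares; for `type = "III"`, the factors generally need sum-to-zero contrasts for the main-effect tests to be interpretable.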