class: center, middle <!-- make the interaction plots make more sense and relate to a question --> # Mixed Models ![:scale 70%](images/mixed_models/lmer_lemur_tim_doherty.jpg) --- ## A Mixed Outline 1. Mixed Models 2. A Variable Intercept Mixed Model - Yes, there will be assumption checks 3. Variable Slopes and Intercepts 4. GLMMs - a digression 5. Visualizing Uncertainty with Mixed Models --- class: center, middle ![](images/mixed_models/mixed_barbenheimer.jpg) --- ## A Random Effects Model for a Clustered Mean `$$Y_{ij} = \alpha_{j} + \epsilon_{ij}$$` `$$\alpha_{j} \sim \mathcal{N}(\mu_{\alpha}, \sigma^2_{\alpha})$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` --- ## A Random Effects Model for a Clustered Mean `$$Y_{ij} = \beta_0 + \alpha_{j} + \epsilon_{ij}$$` `$$\alpha_{j} \sim \mathcal{N}(0, \sigma^2_{\alpha})$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - here `\(\beta_0\)` is the grand mean - `\(\alpha_j\)` is the random effect due to being in cluster j --- ## Now add predictors: Mixed Models with Variable Intercepts `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \epsilon_{ij}$$` `$$\alpha_{j} \sim \mathcal{N}(0, \sigma^2_{\alpha})$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Here, `\(\alpha_j\)` is the deviation from the grand mean due to being in cluster j --- ## Wait.... Isn't this Just a Model with a Categorical variable and a Continuous Variable? `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \epsilon_{ij}$$` -- Let's Re-write -- `$$Y_{ij} = \beta_0 + \beta_1 X_{1i} + \sum \beta_j X_{ij} + \epsilon_{ij}$$` `$$X_{ij} = 0,1$$` -- - Yup - THE SAME as a linear model with a categorical and continuous predictor -- - BUT - `\(\alpha_j\)` in model 1 or `\(\beta_j\)` in model 2 is constrained by a gaussian distribution -- - And we do not control for the categorical variable when estimating `\(\beta_1\)` - Endogeneity assumption! --- ## Now add predictors: Mixed Models with Variable Slopes and Intercepts `$$Y_{ij} = \alpha_{j} + \beta_{j}X_{ij} + \epsilon_{ij}$$` `$$\begin{pmatrix} \alpha_{j} \\ \beta_{j} \end{pmatrix} \sim \mathcal{MVN} \left ( \begin{pmatrix} \mu_{\alpha} \\ \mu_{\beta} \end{pmatrix} , \begin{pmatrix} \sigma_{\alpha}^{2}& \sigma_{\alpha \beta}\\ \sigma_{\alpha \beta} & \sigma_{\beta}^{2} \end{pmatrix} \right )$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Here we see the slope and intercept vary around a grand mean. - Those grand means are often refered to as the fixed effects. -- - The slope and intercept covary, and hence follow a multivariate normal distribution. --- ## Now add predictors: Mixed Models with Variable Slopes and Intercepts `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \gamma_{j}X_{ij} + \epsilon_{ij}$$` `$$\begin{pmatrix} \alpha_{j} \\ \gamma_{j} \end{pmatrix} \sim \mathcal{MVN} \left ( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma_{\alpha}^{2} & \sigma_{\alpha \gamma}\\ \sigma_{\alpha \gamma} & \sigma_{\beta}^{2} \end{pmatrix} \right )$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Here `\(\beta_0 + \beta_1 X_{ij}\)` is the average model for the population. - You might see these refered to as the fixed effects. -- - `\(\alpha_j + \gamma_j X_{ij}\)` are the random effects. -- - We still have a MVN of the RE --- ## Why MVN? When the slope changes, the intercept must also change so that the center of the line is close to `\(\bar{X}\)` and `\(\bar{Y}\)` <img src="mixed_models_files/figure-html/mvn-1.png" style="display: block; margin: auto;" /> --- # Wait, isn't this just an interaction effect? Consider the Following Model `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \gamma_{j}X_{ij} + \epsilon_{ij}$$` -- - If `\(\alpha_j\)` and `\(\gamma_j\)` are drawn from a MVN distribution, this is a mixed model. -- - If they are not, this is a model with an interaction between cluster and a covariate -- Consider the model re-written: `$$Y_{ij} = \beta_0 + \beta_1 X_{i} + \sum \alpha_{j} X_j + \sum \gamma_{j}X_{i} X_j + \epsilon_{ij}$$` `$$X_j = 0,1$$` -- - This is ye olde interaction between a continuous and categorical predictor - It's the same model, re-written - it's all about constraining our terms & assuming exogeneity of the RE --- # THE BIG DIFFERENCE between RE and FE - Up until now, we have been fitting models where `\(\alpha_j\)` could be anything. -- - In truth, that made an assumption: `$$\alpha_j \sim N(0, \infty)$$` -- - With an RE, we assume `$$\alpha_j \sim N(0, \sigma^2_{cluster})$$` -- - BUT we also assume that covariate levels do not depend on clusters - i.e., if we were to resample X in our clusters, the rank order of the mean values of X per cluster would change every time --- ## A Mixed Outline 1. Mixed Models 2. .red[ A Variable Intercept Mixed Model ] - Yes, there will be assumption checks 3. Variable Slopes and Intercepts 4. GLMMs - a digression 5. Visualizing Uncertainty with Mixed Models --- # Let's take this to the beach with Tide Height: RIKZ ![:scale 80%](./images/mixed_models/denmark-lightsbeach.jpeg) Richness of benthic species in 45 sampling stations along the coastline of The Netherlands, measured by two researches of the the RIKZ (Rijksinstituut voor Kust en Zee), i.e., the Dutch Institute for Coastal and Marine Management. --- # How is Tidal Height of Measurement Associated With Species Richness? <img src="mixed_models_files/figure-html/plot_varint-1.png" style="display: block; margin: auto;" /> (note, log as this is count data, which, really....) --- # A Variable Intercept Model `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \epsilon_{ij}$$` `$$\alpha_{j} \sim \mathcal{N}(0, \sigma^2_{\alpha})$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Here, `\(\beta_0\)` is the grand mean (fixed effect) intercept. - `\(\alpha_{j}\)` is the deviation from that due to random beach-to-beach variation --- # Quick Endogeneity Check <img src="mixed_models_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" /> - In truth, this should come from system-specific knowledge. - Also some formal tests (e.g., Hausman test), but, not always useful --- # The R model, two ways LME4: From Doug Bates, Ben Bolker, and Many More ```r library(lme4) # first the lmer rikz_varint <- lmer(log_richness ~ NAP + (1|Beach), data = RIKZdat) ``` Using Template Model Builder from Mollie Brooks, Ben Bolker, and more ```r library(glmmTMB) # now with TMB rikz_varint_tmb <- glmmTMB(log_richness ~ NAP + (1|Beach), data = RIKZdat) ``` --- # The R model, two ways LME4: From Doug Bates, Ben Bolker, and Many More ```r library(lme4) # first the lmer rikz_varint <- lmer(log_richness ~ NAP + `(1|Beach)`, data = RIKZdat) ``` Using Template Model Builder from Mollie Brooks, Ben Bolker, and more ```r library(glmmTMB) # now with TMB rikz_varint_tmb <- glmmTMB(log_richness ~ NAP + `(1|Beach)`, data = RIKZdat) ``` --- # You Knew Assumption Checks Were Coming, Right? 1. Check your prediction distribution! 2. For LMMs, all linearity and normality assumptions hold. 3. BUT - we are now also assuming our REs have a normal distribution --- # Ye Olde Predictions <img src="mixed_models_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> --- # Ye Olde QQ Plot <img src="mixed_models_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" /> --- # Ye Olde HOV Plot <img src="mixed_models_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" /> --- # Are Our REs Normal? <img src="mixed_models_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> --- class: center, middle # Next: Evaluate! --- # Coefficients and SDs <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> effect </th> <th style="text-align:left;"> group </th> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> fixed </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 1.81 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 13.00 </td> </tr> <tr> <td style="text-align:left;"> fixed </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> NAP </td> <td style="text-align:right;"> -0.52 </td> <td style="text-align:right;"> 0.07 </td> <td style="text-align:right;"> -7.96 </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> sd__(Intercept) </td> <td style="text-align:right;"> 0.37 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Residual </td> <td style="text-align:left;"> sd__Observation </td> <td style="text-align:right;"> 0.41 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> - Our fixed slope and intercept mean what they did as an lm. -- - Note, these are FEs - for the entire population -- - We also get the SD of the RE --- # What are my REs? <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> effect </th> <th style="text-align:left;"> component </th> <th style="text-align:left;"> group </th> <th style="text-align:left;"> level </th> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 0.33 </td> <td style="text-align:right;"> 0.20 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 0.67 </td> <td style="text-align:right;"> 0.21 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.32 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.29 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 0.15 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 6 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.13 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.21 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.11 </td> <td style="text-align:right;"> 0.19 </td> </tr> <tr> <td style="text-align:left;"> ran_vals </td> <td style="text-align:left;"> cond </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> -0.08 </td> <td style="text-align:right;"> 0.19 </td> </tr> </tbody> </table> --- # Visualizing REs? <img src="mixed_models_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> --- # Combined coefficients <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;"> (Intercept) </th> <th style="text-align:right;"> NAP </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 2.15 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 2.50 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.48 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.51 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.96 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.68 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.60 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.70 </td> <td style="text-align:right;"> -0.52 </td> </tr> <tr> <td style="text-align:right;"> 1.73 </td> <td style="text-align:right;"> -0.52 </td> </tr> </tbody> </table> --- # Whither the R2? ``` # R2 for Mixed Models Conditional R2: 0.708 Marginal R2: 0.492 ``` -- - The *Marginal R<sup>2</sup>* is the R<sup>2</sup> due to fixed effects alone - The fit explained by population-level effects -- - The *Conditional R<sup>2</sup>* is the R<sup>2</sup> due to fixed + random effects - The fit explained by both population level effects and individual variation -- - The Conditional R<sup>2</sup> will always be larger - If Marginal = 0.01, but Conditional = 0.99, what have you explained? --- class: center, middle # And now, some model visualization --- # Visualizing Fixed (Marginal) Effects Only <img src="mixed_models_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" /> --- # Visualizing Conditional Effects Only <img src="mixed_models_files/figure-html/unnamed-chunk-16-1.png" style="display: block; margin: auto;" /> --- # Putting it All Together <img src="mixed_models_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" /> --- # Putting it All Together <img src="mixed_models_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" /> --- # Putting it All Together <img src="mixed_models_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" /> --- ## A Mixed Outline 1. Mixed Models 2. A Variable Intercept Mixed Model - Yes, there will be assumption checks 3. .red[ Variable Slopes and Intercepts ] 4. GLMMs - a digression 5. Visualizing Uncertainty with Mixed Models --- class: center, middle ![](images/mixed_models/same_model.png) --- ## Now add predictors: Mixed Models with Variable Slopes and Intercepts `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \alpha_{j} + \gamma_{j}X_{ij} + \epsilon_{ij}$$` `$$\begin{pmatrix} \alpha_{j} \\ \gamma_{j} \end{pmatrix} \sim \mathcal{MVN} \left ( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma_{\alpha}^{2} & \sigma_{\alpha \gamma}\\ \sigma_{\alpha \gamma} & \sigma_{\beta}^{2} \end{pmatrix} \right )$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Here, we will allow the effect of NAP to vary by Beach --- # Level-1, Level-2 Model Style - What if you want to be explicit about where different sources of variation entered from? -- - For example, what if we had sample nested in beach nested in region nested in country, nested in.... -- - This can get a bit crazy, but, it makes life easier to separate models into levels, with the replicate/individual level as Level-1, the lowest level cluset as level-2, etc... --- # Level-1, Level-2 Model Style Level-1 Model (the measurement level) `$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + \epsilon_{ij}$$` -- Level-2 Model (the beach level) `$$\beta_{0j} = \beta_0 + \alpha_{j}$$` `$$\beta_{1j} = \beta_1 + \gamma_{j}$$` -- We now refer to parameters in the level-2 model as **hyperparameters** --- # Variable Slope-Intercept Models in R ```r rikz_varslopeint <- lmer(log_richness ~ NAP + `(NAP + 1|Beach)`, data = RIKZdat) ``` - Including both slope and intercept in random term. - Could have left off intercept --- # Diagnostics on Multiple REs <img src="mixed_models_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" /> --- # Visualize As Before <img src="mixed_models_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" /><table> <thead> <tr> <th style="text-align:right;"> R2_conditional </th> <th style="text-align:right;"> R2_marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.8265222 </td> <td style="text-align:right;"> 0.4360792 </td> </tr> </tbody> </table> --- class:center, middle ![](images/mixed_models/many_names.jpg) --- # Let's Get Hierarchical - Clusters can have **cluster-level predictors**. -- - Think infection of cells within individuals within counties within states within... - Individuals might have different jobs that bring them into contact with different numbers of other infected people - Counties will have different levels of masking prevalence - States will have different levels of funding for testing -- - For example, in the RIKZ data, each beach has a different wave exposure -- - Hierarchical predictors can affect outcomes - they could even affect slopes --- # Level-1, Level-2 Model Style Makes this Easier Level-1 Model (the measurement level) `$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + \epsilon_{ij}$$` -- Level-2 Model (the beach level) `$$\beta_{0j} = \beta_0 + \alpha_{j} + \beta_2 X_j$$` `$$\beta_{1j} = \beta_1 + \gamma_{j}$$` --- class: center, middle ![](images/mixed_models/hyperparameters.jpg) --- ## Or All One `$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \beta_2 X_{j} + \alpha_{j} + \gamma_{j}X_{ij} + \epsilon_{ij}$$` `$$\begin{pmatrix} \alpha_{j} \\ \gamma_{j} \end{pmatrix} \sim \mathcal{MVN} \left ( \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma_{\alpha}^{2} & \sigma_{\alpha \gamma}\\ \sigma_{\alpha \gamma} & \sigma_{\beta}^{2} \end{pmatrix} \right )$$` `$$\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$` - Note that the hierarchical predictor is multiplied by `\(X_j\)` --- # The Model in R ```r rikz_hier <- lmer(log_richness ~ NAP + exposure + (NAP + 1|Beach), data = RIKZdat) ``` --- # We can look at All Paremeters <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> effect </th> <th style="text-align:left;"> group </th> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> fixed </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 5.84 </td> <td style="text-align:right;"> 0.80 </td> <td style="text-align:right;"> 7.32 </td> </tr> <tr> <td style="text-align:left;"> fixed </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> NAP </td> <td style="text-align:right;"> -0.54 </td> <td style="text-align:right;"> 0.09 </td> <td style="text-align:right;"> -6.23 </td> </tr> <tr> <td style="text-align:left;"> fixed </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> exposure </td> <td style="text-align:right;"> -0.39 </td> <td style="text-align:right;"> 0.08 </td> <td style="text-align:right;"> -5.02 </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> sd__(Intercept) </td> <td style="text-align:right;"> 0.16 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> cor__(Intercept).NAP </td> <td style="text-align:right;"> -0.41 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Beach </td> <td style="text-align:left;"> sd__NAP </td> <td style="text-align:right;"> 0.20 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> ran_pars </td> <td style="text-align:left;"> Residual </td> <td style="text-align:left;"> sd__Observation </td> <td style="text-align:right;"> 0.35 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr> </tbody> </table> --- # More Coefficients - some vary some don't <table class="table table-striped" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;"> (Intercept) </th> <th style="text-align:right;"> NAP </th> <th style="text-align:right;"> exposure </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 5.97 </td> <td style="text-align:right;"> -0.45 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.78 </td> <td style="text-align:right;"> -0.42 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.77 </td> <td style="text-align:right;"> -0.49 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.76 </td> <td style="text-align:right;"> -0.47 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 6.02 </td> <td style="text-align:right;"> -0.87 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.85 </td> <td style="text-align:right;"> -0.42 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.84 </td> <td style="text-align:right;"> -0.52 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.73 </td> <td style="text-align:right;"> -0.55 </td> <td style="text-align:right;"> -0.39 </td> </tr> <tr> <td style="text-align:right;"> 5.82 </td> <td style="text-align:right;"> -0.72 </td> <td style="text-align:right;"> -0.39 </td> </tr> </tbody> </table> --- # Visualization is an MLR Problem <img src="mixed_models_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" /> --- # Plotting Fixed Effects of Multiple Predictors <img src="mixed_models_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" /> --- # Plotting Everything <img src="mixed_models_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" /> --- # Mixed Model Structure - You can build models as complex as you want, but... -- - Your RE structure should reflect your sampling design -- - Sometimes variance components are close to 0. Do they need to be there? -- - Also, do not forget your endogeneity assumption! -- - Hierarchies: you can do a lot with level-1 and level-2 thinking! --- ## A Mixed Outline 1. Mixed Models 2. A Variable Intercept Mixed Model - Yes, there will be assumption checks 3. Variable Slopes and Intercepts 4. .red[ GLMMs - a digression ] 5. Visualizing Uncertainty with Mixed Models --- class: center, middle ![](images/mixed_models/more_names.png) --- # This is Not Normal <img src="mixed_models_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" /> --- # We CAN fit Generalized Linear Mixed Models Let's try this for a variable-slope variable intercept Poisson model with a Log Link `$$\large \eta_{ij} = \alpha + \beta_{j}$$` `$$\ \beta_{j} \sim \mathcal{N}(0, \sigma^2_{site})$$` `$$\large log(\widehat{Y_{ij}}) = \eta_{ij}$$` `$$\large Y_{ij} \sim \mathcal{P}(\widehat{Y_{ij}})$$` --- # We can fit with glmer or glmmTMB ```r rikz_pois <- glmer(Richness ~ NAP + (NAP + 1|Beach), family = poisson(link = "log"), data = RIKZdat) ``` --- # Check that prediction <img src="mixed_models_files/figure-html/unnamed-chunk-32-1.png" style="display: block; margin: auto;" /> --- # Check that Overdispersion <img src="mixed_models_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" /> --- # Check that Overdispersion <img src="mixed_models_files/figure-html/unnamed-chunk-34-1.png" style="display: block; margin: auto;" /> --- # Check the Random Effects ``` [[1]] ``` <img src="mixed_models_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" /> --- # Evaluate Away! <img src="mixed_models_files/figure-html/unnamed-chunk-36-1.png" style="display: block; margin: auto;" /> ``` # A tibble: 1 × 2 R2_conditional R2_marginal <dbl> <dbl> 1 0.815 0.400 ``` --- # Final Thoughts on Mixed Models - These are *very* powerful -- - They let you cope with data structures FE would fail on -- - They are more efficient with respect to DF and hence narrower CIs -- - Shrinkage is philosophically a real plus -- - BUT - they will take some fiddling, as MANY parameters and hyperparameters as well as complex likelihood functions