Linear Models
Generalized Linear Models & Likelihood
Mixed Models
$$t_b = \frac{b - \beta_0}{SE_b}$$

Usually $H_0: \beta_0 = 0$, but we can test other hypotheses
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 1.925 | 1.506 | 1.278 | 0.218 |
| resemblance | 2.989 | 0.571 | 5.232 | 0.000 |
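A minimal way to reproduce a table like this in R, assuming a data frame `puffer` with columns `predators` and `resemblance` (the data used throughout this deck):

```r
# fit the linear model and pull the coefficient t-tests
puffer_lm <- lm(predators ~ resemblance, data = puffer)
summary(puffer_lm)$coefficients
```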
p is very small here, so...
We reject the hypothesis of no slope for resemblance, but fail to reject it for the intercept.
We reject the hypothesis that there is no relationship between resemblance and predator visits in our experiment.

60% of the variability in predator visits is associated with resemblance ($R^2 = 0.6$).
$H_0$: The model predicts no variation in the data.

$H_A$: The model predicts variation in the data.

To evaluate these hypotheses, we need a measure of variation explained by the model versus error: the sums of squares!
This is an Analysis of Variance... ANOVA!

$$SS_{Total} = SS_{Regression} + SS_{Error}$$
Distance from $\hat{y}$ to $\bar{y}$
$$SS_R = \sum (\hat{Y}_i - \bar{Y})^2, \quad df = 1$$

$$SS_E = \sum (Y_i - \hat{Y}_i)^2, \quad df = n - 2$$
To compare them, we need to correct for different degrees of freedom. This is the Mean Square.

$$MS = SS/DF$$

e.g., $MS_E = \frac{SS_E}{n - 2}$

$$F = \frac{MS_R}{MS_E} \quad \text{with } df = 1, n - 2$$
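In R, `anova()` on the fitted model computes these sums of squares, mean squares, and the F-test; a sketch, assuming the `puffer_lm` fit from the earlier sketch:

```r
# ANOVA table: SS, MS, F, and p for resemblance
anova(puffer_lm)
```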
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| resemblance | 1 | 255.1532 | 255.153152 | 27.37094 | 5.64e-05 |
| Residuals | 18 | 167.7968 | 9.322047 | | |
We reject the null hypothesis that resemblance does not explain variability in predator approaches
T-tests evaluate whether coefficients are different from 0
Often, F and T agree - but not always
T-tests for Coefficients with Treatment Contrasts
F Tests for Variability Explained by Including Categorical Predictor
More T-Tests for Posthoc Evaluation
$$SS_{Total} = SS_{Model} + SS_{Error}$$

(Classic ANOVA: $SS_{Total} = SS_{Between} + SS_{Within}$)

Yes, these are the same!
$$SS_{Model} = \sum_i \sum_j (\bar{Y}_i - \bar{Y})^2, \quad df = k - 1$$

$$SS_{Error} = \sum_i \sum_j (Y_{ij} - \bar{Y}_i)^2, \quad df = n - k$$
To compare them, we need to correct for different degrees of freedom. This is the Mean Square.

$$MS = SS/DF, \quad \text{e.g., } MS_W = \frac{SS_W}{n - k}$$
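A sketch producing a table like the one below, assuming a data frame `dat` with a response `y` and a three-level factor `group` (all names hypothetical):

```r
# one-way ANOVA via a linear model
fit <- lm(y ~ group, data = dat)
anova(fit)
```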
| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| group | 2 | 0.5402533 | 0.2701267 | 7.823136 | 0.0012943 |
| Residuals | 42 | 1.4502267 | 0.0345292 | | |
We have strong confidence that we can reject the null hypothesis
Treatment $H_0$: $\mu_{i1} = \mu_{i2} = \mu_{i3} = \ldots$

Block $H_0$: $\mu_{j1} = \mu_{j2} = \mu_{j3} = \ldots$

i.e., the variance due to each treatment type is no different than noise.
$$SS_{Total} = SS_A + SS_B + SS_{Error}$$
Let's assume Y ~ A + B, where A is categorical and B is continuous.

F-tests are really model comparisons.

The SS for A is the residual SS of Y ~ B minus the residual SS of Y ~ A + B.

Proceed as normal.
This also works for interactions: each interaction is tested against a model containing all additive terms and lower-order interaction components (see the sketch below).
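A sketch of this model comparison done explicitly in R (all names hypothetical):

```r
# full and reduced models
full <- lm(y ~ A + B, data = dat)
no_A <- lm(y ~ B, data = dat)

# SS for A = residual SS of no_A minus residual SS of full,
# with the F-test reported by the model comparison
anova(no_A, full)
```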
We've been here before with SEs and CIs.

In truth, we are using t-tests.

BUT: we now correct p-values for family-wise error (if at all).
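A sketch of such corrected post-hoc comparisons, assuming the `group` model from the ANOVA table above (names hypothetical):

```r
library(emmeans)

group_lm <- lm(y ~ group, data = dat)

# all pairwise t-tests with Tukey-adjusted p-values
emmeans(group_lm, pairwise ~ group, adjust = "tukey")
```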
Linear Models
Generalized Linear Models & Likelihood
Mixed Models
$$L(H|D) = p(D|H)$$

where D is the data and H is the hypothesis (model), including both a data-generating process and some choice of parameters (often called $\theta$). The error-generating process is inherent in the choice of probability distribution used for calculation.
Let's say this is our data:

```
 [1]  3.37697212  3.30154837  1.90197683  1.86959410  0.20346568  3.72057350
 [7]  3.93912102  2.77062225  4.75913135  3.11736679  2.14687718  3.90925918
[13]  4.19637296  2.62841610  2.87673977  4.80004312  4.70399588 -0.03876461
[19]  0.71102505  3.05830349
```

We know that the data come from a normal population with a $\sigma$ of 1... but we want to get the MLE of the mean.

$$p(D|\theta) = \prod p(D_i|\theta)$$

$$= \prod \text{dnorm}(D_i, \mu, \sigma = 1)$$
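A minimal R sketch of this calculation via grid search, with `dat` holding the values printed above:

```r
dat <- c(3.37697212, 3.30154837, 1.90197683, 1.86959410, 0.20346568,
         3.72057350, 3.93912102, 2.77062225, 4.75913135, 3.11736679,
         2.14687718, 3.90925918, 4.19637296, 2.62841610, 2.87673977,
         4.80004312, 4.70399588, -0.03876461, 0.71102505, 3.05830349)

# candidate values of the mean
mu_grid <- seq(0, 6, by = 0.001)

# log-likelihood of the data at each candidate mean, sigma fixed at 1
ll <- sapply(mu_grid, function(m) sum(dnorm(dat, mean = m, sd = 1, log = TRUE)))

# the MLE of mu: equals the sample mean
mu_grid[which.max(ll)]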
MLE = 2.896
We use the log-likelihood as it is not subject to rounding/underflow error from multiplying many small probabilities, and $-2 \times$ differences in log-likelihood are approximately $\chi^2$ distributed.
Distribution of sums of squares of k data points drawn from N(0,1)
k = Degrees of Freedom
Measures goodness of fit
A large probability density indicates a close match between an observation's squared deviation and its expectation
The 68% CI corresponds to a drop of 0.49 log-likelihood units from the maximum (half the $\chi^2_1$ 68% quantile), so....

The 95% CI corresponds to a drop of 1.92 log-likelihood units from the maximum (half the $\chi^2_1$ 95% quantile of 3.84), so....
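Continuing the grid-search sketch from earlier, a profile confidence interval keeps every candidate mean within that drop of the maximum log-likelihood:

```r
# approximate 95% CI: mu values within 1.92 log-likelihood units of the max
range(mu_grid[ll > max(ll) - 1.92])
```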
$$L(\theta|\text{Data}) = \prod_{i=1}^{n} \mathcal{N}(\text{Visits}_i \mid \beta_0 + \beta_1 \text{Resemblance}_i, \sigma)$$

where $\beta_0, \beta_1, \sigma$ are elements of $\theta$
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 1.925 | 1.506 | 1.278 | 0.218 |
| resemblance | 2.989 | 0.571 | 5.232 | 0.000 |
The test statistic is a Wald Z-test, assuming a well-behaved, quadratic confidence interval.
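A table of this shape can come from `broom::tidy()` on the fitted model; a minimal sketch, assuming the puffer data:

```r
puffer_glm <- glm(predators ~ resemblance, data = puffer)
broom::tidy(puffer_glm)
```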
Compare $p(D|\theta_1)$ versus $p(D|\theta_2)$

$$G = \frac{L(H_1|D)}{L(H_2|D)}$$

$G$ is the ratio of maximum likelihoods from each model

Used to compare goodness of fit of different models/hypotheses

Most often, $\theta$ = MLE versus $\theta = 0$

$-2\log(G)$ is $\chi^2$ distributed
A new test statistic: $D = -2\log(G)$

$$= 2\left[\log L(H_2|D) - \log L(H_1|D)\right]$$

We then scale by a dispersion parameter (e.g., variance, etc.)

It's $\chi^2$ distributed!
We compare our slope + intercept model to a model fit with only an intercept!

Note: models must have the SAME response variable.

```r
int_only <- glm(predators ~ 1, data = puffer)
```
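A sketch of the full comparison, assuming the same `puffer` data (the model names are illustrative):

```r
full_mod <- glm(predators ~ resemblance, data = puffer)

# likelihood ratio test: intercept-only vs. slope + intercept
anova(int_only, full_mod, test = "LRT")
```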
```
Analysis of Deviance Table

Model 1: predators ~ 1
Model 2: predators ~ resemblance
  Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
1        19     422.95                          
2        18     167.80  1   255.15 1.679e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Note: this uses the difference in deviance divided by the dispersion (here, the variance) as the LRT statistic.
```
Analysis of Deviance Table (Type II tests)

Response: predators
            LR Chisq Df Pr(>Chisq)    
resemblance   27.371  1  1.679e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Here, the LRT statistic is the difference in deviance divided by the dispersion, where the dispersion is the variance.
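The Type II table above can come from `car::Anova()` on the full model; a sketch, assuming the `full_mod` fit from before:

```r
# Type II likelihood ratio tests (the default for a glm)
car::Anova(full_mod)
```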
Linear Models
Generalized Linear Models & Likelihood
Mixed Models
- Satterthwaite approximation: based on sample sizes and variances within groups. Implemented in `lmerTest` (which is kinda broken at the moment).
- Kenward-Roger's approximation: implemented in `car::Anova()` and `pbkrtest`.
Baseline - only for balanced LMMs!
```
Analysis of Variance Table
    npar Sum Sq Mean Sq F value
NAP    1 10.467  10.467  63.356
```
Satterthwaite
```
Type III Analysis of Variance Table with Satterthwaite's method
    Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
NAP 10.467  10.467     1 37.203  63.356 1.495e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Kenward-Roger
```
Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)

Response: log_richness
         F Df Df.res    Pr(>F)    
NAP 62.154  1 37.203 1.877e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
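A sketch of how these tables can be produced, using the model named later in this deck and a hypothetical data frame `rikz`:

```r
library(lmerTest)  # replaces lme4::lmer to provide Satterthwaite df

rikz_varint <- lmer(log_richness ~ NAP + (1 | Beach), data = rikz)

anova(rikz_varint)                             # Satterthwaite (lmerTest default)
car::Anova(rikz_varint, test.statistic = "F")  # Kenward-Roger F tests
```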
F-test

```
# A tibble: 1 × 5
  term  statistic    df Df.res       p.value
  <chr>     <dbl> <dbl>  <dbl>         <dbl>
1 NAP        62.2     1   37.2 0.00000000188
```

LR Chisq

```
# A tibble: 1 × 4
  term  statistic    df  p.value
  <chr>     <dbl> <dbl>    <dbl>
1 NAP        63.4     1 1.72e-15
```
LR Chisq where REML = FALSE

```
# A tibble: 1 × 4
  term  statistic    df  p.value
  <chr>     <dbl> <dbl>    <dbl>
1 NAP        65.6     1 5.54e-16
```
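One way to obtain a likelihood-ratio χ² test is to refit with maximum likelihood and use `drop1()`; a sketch, assuming the `rikz_varint` model from the sketch above:

```r
# LR chi-square tests require ML, not REML
rikz_ml <- update(rikz_varint, REML = FALSE)
drop1(rikz_ml, test = "Chisq")
```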
For LMMs, make sure you have fit with REML = TRUE
One school of thought is to leave them in place
- `lmerTest::ranova()`
- `RLRsim`, but it gets tricky

```r
ranova(rikz_varint)
```
```
ANOVA-like table for random-effects: Single term deletions

Model:
log_richness ~ NAP + (1 | Beach)
            npar  logLik    AIC    LRT Df Pr(>Chisq)    
<none>         4 -32.588 73.175                         
(1 | Beach)    3 -38.230 82.460 11.285  1  0.0007815 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Yes you can!
Some of the fun issues of denominator DF raise their heads
Keep up to date on the literature/FAQ when you are using them!
Linear Models
Generalized Linear Models & Likelihood
Mixed Models