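$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \qquad \epsilon_i \sim N(0, \sigma)$$
$$y_i = \beta_0 + \sum_{j=1}^{K} \beta_j x_{ij} + \epsilon_i$$
$$mass_i = \beta_0 + \beta_1 sex_i + \epsilon_i$$
$$mass_i = \beta_1 adelie_i + \beta_2 chinstrap_i + \beta_3 gentoo_i + \epsilon_i$$
$$mass_i = \beta_1 adelie_i + \beta_2 chinstrap_i + \beta_3 gentoo_i + \beta_4 male_i + \epsilon_i$$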
The Categorical as Continuous
Many Levels of One Category
Interpretation of Categorical Results
Querying Your Model to Compare Groups
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
$$\epsilon_i \sim N(0, \sigma)$$
But what if $x_i$ took only the values 0 and 1?
Horns prevent these lizards from being eaten by birds. Are horn lengths different between living and dead lizards, indicating selection pressure?
What if we code Dead = 0 and Living = 1?
Squamosal horn length | Survive | Status |
---|---|---|
13.1 | 1 | Living |
15.2 | 0 | Dead |
15.5 | 0 | Dead |
15.7 | 1 | Living |
17.2 | 0 | Dead |
17.7 | 1 | Living |
Squamosal horn length | Survive | Status | StatusDead | StatusLiving |
---|---|---|---|---|
13.1 | 1 | Living | 0 | 1 |
15.2 | 0 | Dead | 1 | 0 |
15.5 | 0 | Dead | 1 | 0 |
15.7 | 1 | Living | 0 | 1 |
17.2 | 0 | Dead | 1 | 0 |
17.7 | 1 | Living | 0 | 1 |
Squamosal horn length | Survive | Status | (Intercept) | StatusLiving |
---|---|---|---|---|
13.1 | 1 | Living | 1 | 1 |
15.2 | 0 | Dead | 1 | 0 |
15.5 | 0 | Dead | 1 | 0 |
15.7 | 1 | Living | 1 | 1 |
17.2 | 0 | Dead | 1 | 0 |
17.7 | 1 | Living | 1 | 1 |
This is known as a Treatment Contrast structure
$$Length_i = \beta_0 + \beta_1 Status_i + \epsilon_i$$
We can always recode groups as "dummy" variables that take the value 0 or 1
We could even fit a model with no $\beta_0$, giving Dead its own 0/1 column and Living its own 0/1 column
This approach works for any unordered categorical (nominal) variable
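As a minimal sketch of the two codings described above, the six rows shown in the table can be typed in by hand and passed to `model.matrix()`. The names `lizards` and `horn_length` are made up for illustration, and only the six tabulated values are used, so the estimates are not results for the full data set.

```r
# Six rows copied from the table above; object and column names are hypothetical
lizards <- data.frame(
  horn_length = c(13.1, 15.2, 15.5, 15.7, 17.2, 17.7),
  Status = factor(c("Living", "Dead", "Dead", "Living", "Dead", "Living"),
                  levels = c("Dead", "Living"))
)

# Treatment contrasts: an intercept column plus a 0/1 column for Living
X_treatment <- model.matrix(~ Status, data = lizards)

# No intercept: one 0/1 column for Dead and one for Living
X_means <- model.matrix(~ 0 + Status, data = lizards)

# The same data fit under each coding
fit_treatment <- lm(horn_length ~ Status, data = lizards)
fit_means <- lm(horn_length ~ 0 + Status, data = lizards)

coef(fit_treatment)  # Dead mean, then Living minus Dead
coef(fit_means)      # Dead mean, then Living mean
```

Both fits describe the same two group means; they differ only in whether the second coefficient is a mean or a difference from the reference group.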
Many Levels of One Category
What is the variance between groups vs. within groups?
Underlying linear model with control = intercept, dummy variable for bipolar
Underlying linear model with control = intercept, dummy variable for schizo
$$y_{ij} = \beta_0 + \sum \beta_j x_{ij} + \epsilon_{ij}, \qquad x_{ij} \in \{0, 1\}$$
i = replicate, j = group
$x_{ij}$ indicates presence/absence (1/0) of level j for individual i
This is the multiple predictor extension of a two-category model
All categories are orthogonal
One category is absorbed into $\beta_0$ as the reference level for ease of fitting, and the other $\beta_j$ are differences from it
$$y_{ij} = \alpha_j + \epsilon_{ij}$$
$$\epsilon_{ij} \sim N(0, \sigma^2)$$
$$y_{ij} = \bar{y} + (\bar{y}_j - \bar{y}) + (y_{ij} - \bar{y}_j)$$
Consider $\bar{y}$ an intercept, $(\bar{y}_j - \bar{y})$ the deviations from the intercept by treatment, and $(y_{ij} - \bar{y}_j)$ the residuals
We can calculate this from a fitted model to answer questions, but it is a relic of a bygone era
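The decomposition into grand mean, treatment deviations, and residuals can be checked numerically. The sketch below uses a small made-up data set (three hypothetical groups `a`, `b`, `c`, not the brainGene data) and confirms that the between-group and within-group sums of squares add up to the total, and that they match what `anova()` reports for the fitted model.

```r
# Hypothetical data: three groups of four observations each
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(1.0, 1.4, 0.8, 1.2,  2.1, 1.9, 2.4, 2.0,  0.5, 0.9, 0.7, 0.3)
)

grand_mean <- mean(d$y)              # y-bar
group_mean <- ave(d$y, d$group)      # y-bar_j, repeated on each row of its group
treatment_dev <- group_mean - grand_mean
residual <- d$y - group_mean

# Each observation is rebuilt exactly from the three pieces
reconstructed <- grand_mean + treatment_dev + residual

# Sums of squares between and within groups
ss_total <- sum((d$y - grand_mean)^2)
ss_between <- sum(treatment_dev^2)
ss_within <- sum(residual^2)

# The same quantities from a fitted linear model
fit <- lm(y ~ group, data = d)
anova(fit)
```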
Using Least Squares
library(broom)  # tidy()
library(dplyr)  # select()
brain_lm <- lm(PLP1.expression ~ group, data = brainGene)
tidy(brain_lm) |>
  select(-c(4:5)) |>  # keep term, estimate, std.error
  knitr::kable(digits = 3) |>
  kableExtra::kable_styling()
term | estimate | std.error |
---|---|---|
(Intercept) | -0.004 | 0.048 |
groupschizo | -0.191 | 0.068 |
groupbipolar | -0.259 | 0.068 |
Interpretation of Categorical Results
$$y_{ij} = \beta_0 + \sum \beta_j x_{ij} + \epsilon_{ij}$$
term | estimate | std.error |
---|---|---|
(Intercept) | -0.004 | 0.048 |
groupschizo | -0.191 | 0.068 |
groupbipolar | -0.259 | 0.068 |
What does this mean?
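Intercept ($\beta_0$) = the average value associated with being in the control group
Other coefficients ($\beta_j$) = the average difference between each other group and control (group minus control)
Note: R orders factor levels alphabetically by default and uses the first level as the reference; here control is the reference, so the levels were set explicitly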
$$y_{ij} = \alpha_j + \epsilon_{ij}$$
group | estimate | std.error |
---|---|---|
control | -0.0040000 | 0.0479786 |
schizo | -0.1953333 | 0.0479786 |
bipolar | -0.2626667 | 0.0479786 |
What does this mean?
Being in group $j$ is associated with an average outcome of $\alpha_j$.
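The two parameterizations above describe the same group means: each $\alpha_j$ equals $\beta_0$ for the reference group, or $\beta_0 + \beta_j$ for any other group. In the tables above, for example, -0.004 + (-0.191) is approximately -0.195. The sketch below shows this on simulated data; the data, seed, and object names are assumptions for illustration, not the brainGene data.

```r
# Simulated stand-in data with the same three group labels (values are made up)
set.seed(42)
d <- data.frame(
  group = rep(c("control", "schizo", "bipolar"), each = 10),
  y = rnorm(30, mean = rep(c(0, -0.2, -0.25), each = 10), sd = 0.2)
)

# Default factor ordering is alphabetical, which would make bipolar the reference
default_levels <- levels(factor(d$group))

# Set the order explicitly so that control is the reference level
d$group <- factor(d$group, levels = c("control", "schizo", "bipolar"))

fit_treatment <- lm(y ~ group, data = d)   # beta_0 plus differences from control
fit_means <- lm(y ~ 0 + group, data = d)   # one alpha_j per group

beta <- coef(fit_treatment)
alpha <- coef(fit_means)

# Group means implied by the treatment-contrast coefficients
implied_means <- c(beta[1], beta[1] + beta[-1])
```

`relevel(d$group, ref = "control")` is an alternative when only the reference level needs to change.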
We can look at fit to data - even in categorical data!
# R2 for Linear Regression
  R2: 0.271
  adj. R2: 0.237
But, remember, this is based on the sample at hand.
Adjusted $R^2$: adjusts for sample size and model complexity (k = # of coefficients other than the intercept = # groups - 1)
$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
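The adjusted $R^2$ formula can be checked against what `summary()` reports for a fitted model. This is a sketch on simulated data with three hypothetical groups; `adj_r2` is a helper written for this example, and k is taken as the number of coefficients excluding the intercept, which is the choice that makes the formula agree with R's output.

```r
# Helper written for this example: adjusted R2 from R2, sample size n,
# and k = number of coefficients excluding the intercept
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Simulated data: three hypothetical groups of 12 observations
set.seed(7)
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 12)),
  y = rnorm(36) + rep(c(0, 0.5, 1), each = 12)
)

fit <- lm(y ~ group, data = d)
s <- summary(fit)

k <- length(coef(fit)) - 1   # two dummy variables for three groups
manual <- adj_r2(s$r.squared, n = nrow(d), k = k)

c(manual = manual, reported = s$adj.r.squared)
```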
Querying Your Model to Compare Groups
Many mini linear models with two means... multiple comparisons!
Each group has a mean and SE
We can calculate a comparison for each pair of groups
BUT, we lose precision as we keep going back to the same model for more comparisons
Remember, every time we look at a system, there is some chance that our CI does not cover the true value
Each time we compare means, we have a chance of our CI not covering the true value
To minimize this possibility, we correct (widen) our CIs to control the Family-Wise Error Rate
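How quickly the chance of at least one miss grows can be sketched with a short calculation. It assumes the m comparisons are independent and each uses a 95% interval, which is a simplification (comparisons that share a group are not independent). The last lines show the per-comparison level a Bonferroni correction would use instead.

```r
alpha <- 0.05   # per-comparison error rate (95% CIs)
m <- 1:10       # number of comparisons

# Chance that at least one of m intervals misses its true value,
# assuming independent comparisons
fwer <- 1 - (1 - alpha)^m

# Bonferroni: shrink the per-comparison error rate to alpha / m
bonferroni_alpha <- alpha / m
fwer_bonferroni <- 1 - (1 - bonferroni_alpha)^m

round(data.frame(m, fwer, bonferroni_alpha, fwer_bonferroni), 4)
```

With three comparisons, as for three groups compared pairwise, the uncorrected rate is already about 14%.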
Ignore it
Increase your CI given m = # of comparisons
Other multiple comparison corrections
contrast | estimate | conf.low | conf.high |
---|---|---|---|
control - schizo | 0.1913333 | 0.0544024 | 0.3282642 |
control - bipolar | 0.2586667 | 0.1217358 | 0.3955976 |
schizo - bipolar | 0.0673333 | -0.0695976 | 0.2042642 |
contrast | estimate | conf.low | conf.high |
---|---|---|---|
control - schizo | 0.1913333 | 0.0221330 | 0.3605337 |
control - bipolar | 0.2586667 | 0.0894663 | 0.4278670 |
schizo - bipolar | 0.0673333 | -0.1018670 | 0.2365337 |
contrast | estimate | conf.low | conf.high |
---|---|---|---|
control - schizo | 0.1913333 | 0.0264873 | 0.3561793 |
control - bipolar | 0.2586667 | 0.0938207 | 0.4235127 |
schizo - bipolar | 0.0673333 | -0.0975127 | 0.2321793 |
contrast | estimate | conf.low | conf.high |
---|---|---|---|
schizo - control | -0.1913333 | -0.3474491 | -0.0352176 |
bipolar - control | -0.2586667 | -0.4147824 | -0.1025509 |
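Intervals like those tabulated above become wider or narrower depending on the correction used. The base-R sketch below compares the half-width of a pairwise CI with no correction, with a Bonferroni correction, and with Tukey's HSD. It runs on simulated data (the group means, group size of 15, and seed are assumptions), so its numbers will not reproduce the tables.

```r
# Simulated stand-in data: 3 groups of 15 (values are made up)
set.seed(1)
n_per <- 15
d <- data.frame(
  group = factor(rep(c("control", "schizo", "bipolar"), each = n_per),
                 levels = c("control", "schizo", "bipolar")),
  y = rnorm(3 * n_per, mean = rep(c(0, -0.2, -0.25), each = n_per), sd = 0.25)
)

fit <- aov(y ~ group, data = d)
df_res <- df.residual(fit)
sigma_hat <- sqrt(sum(residuals(fit)^2) / df_res)  # pooled residual SD
se_diff <- sigma_hat * sqrt(2 / n_per)             # SE of a difference of two means

# All pairwise differences (second group minus first)
means <- tapply(d$y, d$group, mean)
pairs <- combn(levels(d$group), 2)
diffs <- means[pairs[2, ]] - means[pairs[1, ]]
m <- ncol(pairs)

# Half-widths of 95% intervals under three approaches
hw_none <- qt(1 - 0.05 / 2, df_res) * se_diff
hw_bonf <- qt(1 - 0.05 / (2 * m), df_res) * se_diff
hw_tukey <- qtukey(0.95, nmeans = 3, df = df_res) / sqrt(2) * se_diff

# Built-in Tukey intervals for comparison with the manual calculation
tukey <- TukeyHSD(fit, "group", conf.level = 0.95)$group

c(none = hw_none, tukey = hw_tukey, bonferroni = hw_bonf)
```

Packages such as emmeans wrap these calculations, but the base-R version keeps each step visible.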
At the end of the day, they are just another linear model
We can understand a lot about groups, though
We can begin to see the value of queries/counterfactuals
$$\hat{Y} = X\beta \qquad Y \sim N(\hat{Y}, \Sigma)$$
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \qquad \epsilon_i \sim N(0, \sigma)$$