Regression, ANOVA, t-test are related…

minipost stats

In a new episode of things I always forget how to find.

Joshua Kunst http://jkunst.com/
06-08-2021

Source: https://stats.stackexchange.com/questions/59047/how-are-regression-the-t-test-and-the-anova-all-versions-of-the-general-linear

I always fail to remember the code that shows how these models are related, so I will put it here for my future self. The important thing to check is the p-values: they are the same in all three outputs.

The data, according to help(sleep):

Data which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients -Scheffé, Henry (1959) The Analysis of Variance. New York, NY: Wiley.

Now, load packages and data.

library(tibble)
library(broom)

data("sleep")

sleep <- as_tibble(sleep)

glimpse(sleep)
Rows: 20
Columns: 3
$ extra <dbl> 0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0, ~
$ group <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, ~
$ ID    <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8,~

Regression

linear_model <- lm(extra ~ group, data = sleep)

summary(linear_model)

Call:
lm(formula = extra ~ group, data = sleep)

Residuals:
   Min     1Q Median     3Q    Max 
-2.430 -1.305 -0.580  1.455  3.170 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   0.7500     0.6004   1.249   0.2276  
group2        1.5800     0.8491   1.861   0.0792 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.899 on 18 degrees of freedom
Multiple R-squared:  0.1613,    Adjusted R-squared:  0.1147 
F-statistic: 3.463 on 1 and 18 DF,  p-value: 0.07919

tidy(linear_model)
# A tibble: 2 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)     0.75     0.600      1.25  0.228 
2 group2          1.58     0.849      1.86  0.0792
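The coefficients above can be read directly as group means. As a minimal sketch (base R only; the variable names `mean_1`, `mean_2`, `fit`, `intercept` and `slope` are mine, not part of the original post), the intercept should match the mean of group 1 and the `group2` coefficient should match the difference between the two group means:

```r
# Sketch: the regression coefficients are just group means in disguise.
data("sleep")

# Raw group means computed by hand
mean_1 <- mean(sleep$extra[sleep$group == "1"])
mean_2 <- mean(sleep$extra[sleep$group == "2"])

# Same model as in the post
fit <- lm(extra ~ group, data = sleep)
intercept <- coef(fit)[["(Intercept)"]]
slope <- coef(fit)[["group2"]]

# Intercept = mean of the reference group (group 1)
all.equal(intercept, mean_1)

# group2 coefficient = mean of group 2 minus mean of group 1
all.equal(slope, mean_2 - mean_1)
```

Both comparisons should return TRUE, which is why testing the `group2` coefficient against zero is the same question as testing whether the two means differ.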

ANOVA

anova <- aov(extra ~ group, data = sleep)

summary(anova)
            Df Sum Sq Mean Sq F value Pr(>F)  
group        1  12.48  12.482   3.463 0.0792 .
Residuals   18  64.89   3.605                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

tidy(anova)
# A tibble: 2 x 6
  term         df sumsq meansq statistic p.value
  <chr>     <dbl> <dbl>  <dbl>     <dbl>   <dbl>
1 group         1  12.5  12.5       3.46  0.0792
2 Residuals    18  64.9   3.60     NA    NA     
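With only two groups, the F statistic of the ANOVA should be the square of the t statistic of the regression slope. A small sketch to check this, assuming base R only (the names `fit`, `aov_fit`, `t_value` and `f_value` are introduced here for illustration):

```r
# Sketch: for a two-level factor, F from the ANOVA equals t^2 from the regression.
data("sleep")

fit <- lm(extra ~ group, data = sleep)
aov_fit <- aov(extra ~ group, data = sleep)

# t statistic of the group2 coefficient
t_value <- summary(fit)$coefficients["group2", "t value"]

# F statistic of the group term (first row of the ANOVA table)
f_value <- summary(aov_fit)[[1]][["F value"]][1]

# These should agree up to floating point error
all.equal(t_value^2, f_value)
```

This is the numeric link between the 1.861 in the regression table and the 3.463 in the ANOVA table.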

\(t\)-test

t_test <- t.test(extra ~ group, var.equal = TRUE, data = sleep) 

t_test

    Two Sample t-test

data:  extra by group
t = -1.8608, df = 18, p-value = 0.07919
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.363874  0.203874
sample estimates:
mean in group 1 mean in group 2 
           0.75            2.33 

tidy(t_test)
# A tibble: 1 x 10
  estimate estimate1 estimate2 statistic p.value parameter conf.low
     <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>
1    -1.58      0.75      2.33     -1.86  0.0792        18    -3.36
# ... with 3 more variables: conf.high <dbl>, method <chr>,
#   alternative <chr>
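To check the p-values without reading three printed tables, they can be pulled out and compared side by side. This is a sketch in base R under my own naming (`p_lm`, `p_aov`, `p_t`, `p_welch`); it also includes the default Welch t-test, which does not assume equal variances and therefore should not match exactly:

```r
# Sketch: extract the p-value from each approach and compare them.
data("sleep")

# Regression: p-value of the group2 coefficient
p_lm <- summary(lm(extra ~ group, data = sleep))$coefficients["group2", "Pr(>|t|)"]

# ANOVA: p-value of the group term (first row of the table)
p_aov <- summary(aov(extra ~ group, data = sleep))[[1]][["Pr(>F)"]][1]

# t-test with pooled variance, as used in the post
p_t <- t.test(extra ~ group, var.equal = TRUE, data = sleep)$p.value

# Default t.test() is Welch's version (var.equal = FALSE)
p_welch <- t.test(extra ~ group, data = sleep)$p.value

c(lm = p_lm, aov = p_aov, t_test = p_t, welch = p_welch)

# The first three agree; Welch's test differs slightly
all.equal(p_lm, p_aov)
all.equal(p_lm, p_t)
isTRUE(all.equal(p_t, p_welch))
```

The equivalence depends on `var.equal = TRUE`. Note also that the t statistic from `t.test()` has the opposite sign to the regression one, because `t.test()` reports group 1 minus group 2 while the `group2` coefficient is group 2 minus group 1.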

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/jbkunst/blog, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Kunst (2021, June 8). Data, Code and Visualization: Regression, ANOVA, t-test are related.... Retrieved from http://jkunst.com/blog/posts/2021-06-08-regression-anova-t-test/

BibTeX citation

@misc{kunst2021regression,
  author = {Kunst, Joshua},
  title = {Data, Code and Visualization: Regression, ANOVA, t-test are related...},
  url = {http://jkunst.com/blog/posts/2021-06-08-regression-anova-t-test/},
  year = {2021}
}