10  Correlation and \(R^2\)

Last modified on 26. December 2025 at 19:12:18

“A quote.” — Dan Meyer

10.1 General background

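library(tidyverse)  # provides tibble() and mutate()

# Simulated data with a strong linear relationship (slope 1.2, residual SD 3)
r2_good_tbl <- tibble(weight = abs(rnorm(5, 3, 4)),
                      jumplength = 10 + 1.2 * weight + rnorm(5, 0, 3))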

r2_good_fit <- lm(jumplength ~ weight, data = r2_good_tbl)

mean_good_cat_jump <- mean(r2_good_tbl$jumplength)

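r2_good_plot_tbl <- r2_good_tbl |>
  mutate(sy = jumplength - mean(jumplength),  # deviation from the mean jump length
         e = residuals(r2_good_fit))          # residual from the fitted line

# Sums of squares, rounded to two decimals
sum_good_sy <- sum(r2_good_plot_tbl$sy^2) |> round(2)  # SS_total
sum_good_e <- sum(r2_good_plot_tbl$e^2) |> round(2)    # SS_res
# R^2 from the unrounded sums, so rounding does not distort the ratio
r2_good <- 1 - sum(r2_good_plot_tbl$e^2) / sum(r2_good_plot_tbl$sy^2)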


# Same weights, but a weaker relationship (slope 0.5, residual SD 4)
r2_bad_tbl <- tibble(weight = r2_good_tbl$weight,
                     jumplength = 9 + 0.5 * weight + rnorm(5, 0, 4))

r2_bad_fit <- lm(jumplength ~ weight, data = r2_bad_tbl)

mean_bad_cat_jump <- mean(r2_bad_tbl$jumplength)

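r2_bad_plot_tbl <- r2_bad_tbl |>
  mutate(sy = jumplength - mean(jumplength),  # deviation from the mean jump length
         e = residuals(r2_bad_fit),           # residual from the fitted line
         e2 = e^2)                            # squared residual

# Sums of squares, rounded to two decimals
sum_bad_sy <- sum(r2_bad_plot_tbl$sy^2) |> round(2)  # SS_total
sum_bad_e <- sum(r2_bad_plot_tbl$e2) |> round(2)     # SS_res
# R^2 from the unrounded sums, so rounding does not distort the ratio
r2_bad <- 1 - sum(r2_bad_plot_tbl$e2) / sum(r2_bad_plot_tbl$sy^2)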

\[ R^2 = 1 - \cfrac{SS_{res}}{SS_{total}} \]

Figure 10.1: foo
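
The definition of \(R^2\) above can be checked against the value that `lm()` reports. The following is a minimal sketch in base R, not the code behind the figure: the seed, the sample size of 20 and the names `ss_total`, `ss_res` and `r2_manual` are assumptions chosen for illustration.

```r
# Minimal sketch: R^2 by hand versus the value reported by lm().
# The seed and the simulated data are illustrative assumptions.
set.seed(1)
weight <- abs(rnorm(20, 3, 4))
jumplength <- 10 + 1.2 * weight + rnorm(20, 0, 3)

fit <- lm(jumplength ~ weight)

ss_total <- sum((jumplength - mean(jumplength))^2)  # SS_total
ss_res <- sum(residuals(fit)^2)                     # SS_res
r2_manual <- 1 - ss_res / ss_total

# Compare with the R^2 stored in the model summary
all.equal(r2_manual, summary(fit)$r.squared)

# In a simple linear regression with an intercept,
# R^2 equals the squared Pearson correlation
all.equal(r2_manual, cor(weight, jumplength)^2)
```

Both comparisons should agree up to floating-point error. The second one holds only for a model with a single predictor and an intercept.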
set.seed(20251226)  # fixed seed so the simulated values are reproducible
cor_high_tbl <- tibble(weight = abs(rnorm(5, 3, 4)),
                       jumplength = 10 + 1 * weight + rnorm(5, 0, 4)) |> 
  mutate(sweight = weight - mean(weight),
         sjump = jumplength - mean(jumplength),
         ss_xy = sweight * sjump,
         ss_x = sweight^2,
         ss_y = sjump^2,
         sign_xy = ifelse(sign(ss_xy) == -1, "\U2012", "+"))
set.seed(202511)

sum(cor_high_tbl$ss_x)
[1] 33.95848
sum(cor_high_tbl$ss_x) |> sqrt()
[1] 5.82739
sum(cor_high_tbl$ss_y)
[1] 47.94158
sum(cor_high_tbl$ss_y) |> sqrt()
[1] 6.923986
sum(cor_high_tbl$ss_xy)
[1] 33.21142
cor(cor_high_tbl$weight, cor_high_tbl$jumplength)
[1] 0.8231087

\[ r = \cfrac{SS_{xy}}{\sqrt{SS_x}\cdot\sqrt{SS_y}} \]

Figure 10.2: foo
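
As a quick check, the sums printed above can be plugged into the formula for \(r\) by hand. The numbers are the console outputs shown in this section, so the result matches `cor()` only up to rounding.

\[ r = \cfrac{33.21142}{5.82739 \cdot 6.923986} \approx \cfrac{33.21142}{40.34877} \approx 0.8231 \]

This agrees with the value of 0.8231087 returned by `cor(cor_high_tbl$weight, cor_high_tbl$jumplength)`.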

10.2 Theoretical background

10.3 R packages used

10.4 Data

10.5 Alternatives

Further tutorials and R packages on XXX

10.6 Glossary

term

what does it mean.

10.7 The meaning of “Models of Reality” in this chapter

  • itemize with max. 5-6 words

10.8 Summary

References