84 Baustelle
Letzte Änderung am 22. August 2025 um 08:47:22
Dieses Kapitel wird in den nächsten Wochen geschrieben. Ich plane zum Beginn des WiSe 2025/26 eine neue Version des Kapitels erstellt zu haben. Während das Kapitel entsteht, funktioniert so manches dann nicht so wie es soll.
rnorm()
: Erzeugt Zufallszahlen aus einer Normalverteilung.runif()
: Erzeugt Zufallszahlen aus einer Gleichverteilung.rbinom()
: Generiert Zufallszahlen aus einer Binomialverteilung.rpois()
: Erzeugt Zufallszahlen aus einer Poisson-Verteilung.rgamma()
: Generiert Zufallszahlen aus einer Gamma-Verteilung.rexp()
: Erzeugt Zufallszahlen aus einer Exponentialverteilung.rt()
: Generiert Zufallszahlen aus einer Student’s t-Verteilung.rchisq()
: Erzeugt Zufallszahlen aus einer Chi-Quadrat-Verteilung.rbeta()
: Erzeugt Zufallszahlen aus einer Beta-Verteilung.rf()
: Erzeugt Zufallszahlen aus einer F-Verteilung.rlogis()
: Erzeugt Zufallszahlen aus einer logistischen Verteilung.rweibull()
: Erzeugt Zufallszahlen aus einer Weibull-Verteilung.
- Draw what you want to visualize
- Make models for it
- Select the best model
- Visualize the best model
- Investigate its parameters
84.1 Genutzte R Pakete
Wir wollen folgende R Pakete in diesem Kapitel nutzen.
R Code [zeigen / verbergen]
::p_load(tidyverse, gtsummary, marginaleffects, emmeans, scales,
pacman
ggpmisc, readxl, conflicted)conflicts_prefer(dplyr::mutate)
conflicts_prefer(dplyr::summarize)
conflicts_prefer(dplyr::filter)
conflicts_prefer(ggplot2::annotate)
<- c("#000000", "#E69F00", "#56B4E9",
cb_pal "#009E73", "#F0E442", "#F5C710",
"#0072B2", "#D55E00", "#CC79A7")
##
<- label_number(style_negative = "minus", accuracy = 0.01)
nice_number <- label_pvalue(prefix = c("p < ", "p = ", "p > "))
nice_p <- function(x1, y1, slope) {
find_intercept <- slope * (-x1) + y1
intercept return(intercept)
}
An der Seite des Kapitels findest du den Link Quellcode anzeigen, über den du Zugang zum gesamten R-Code dieses Kapitels erhältst.
84.2 Daten
R Code [zeigen / verbergen]
<- read_excel("data/fleas_model_data.xlsx") |>
flea_model_tbl mutate(feeding = as_factor(feeding),
stage = as_factor(stage),
bonitur = as.numeric(bonitur),
infected = factor(infected, labels = c("healthy", "infected"))) |>
select(feeding, stage, jump_length, weight, hatched, count_leg, bonitur, infected)
84.3 Hypothesen
\[ Z=\frac{h(\hat{\theta})-H_0}{\sqrt{\hat{V}[h(\hat{\theta})]}} \]
R Code [zeigen / verbergen]
= lm(jump_length ~ 0 + feeding, data = flea_model_tbl)
feeding_fit coef(feeding_fit)
feedingsugar_water feedingblood feedingketchup
70.00562 107.87063 101.69437
R Code [zeigen / verbergen]
summary(feeding_fit) |> coef()
Estimate Std. Error t value Pr(>|t|)
feedingsugar_water 70.00562 7.516627 9.313436 4.598588e-12
feedingblood 107.87063 7.516627 14.350934 2.091381e-18
feedingketchup 101.69437 7.516627 13.529256 1.821420e-17
\[ Z = \cfrac{\hat{\beta_1}-H_0}{\sqrt{\widehat{V}[\beta_1]}} = \cfrac{75.43938 - 0}{4.791469} = 15.74 \]
R Code [zeigen / verbergen]
2*pt(15.74452, df = 45, lower.tail = FALSE)
[1] 6.366696e-20
R Code [zeigen / verbergen]
<- lm(jump_length ~ feeding * stage, data = flea_model_tbl)
feeding_fit coef(feeding_fit)
(Intercept) feedingblood
78.97750 20.73500
feedingketchup stagejuvenile
13.17125 -17.94375
feedingblood:stagejuvenile feedingketchup:stagejuvenile
34.26000 37.03500
84.4 Prädiktion
R Code [zeigen / verbergen]
<- flea_model_tbl |>
simple_tbl filter(stage == "adult")
<- lm(jump_length ~ feeding, simple_tbl)
simple_fit coef(simple_fit)
(Intercept) feedingblood feedingketchup
78.97750 20.73500 13.17125
R Code [zeigen / verbergen]
predictions(simple_fit)
Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
79.0 8.45 9.35 <0.001 66.6 62.4 95.5
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
99.7 8.45 11.81 <0.001 104.4 83.2 116.3
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
92.1 8.45 10.91 <0.001 89.7 75.6 108.7
Type: response
R Code [zeigen / verbergen]
avg_predictions(simple_fit)
Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
90.3 4.88 18.5 <0.001 251.8 80.7 99.8
Type: response
R Code [zeigen / verbergen]
<- flea_model_tbl |>
simple_2_tbl filter(stage == "adult")
<- lm(jump_length ~ weight*feeding, simple_2_tbl)
simple_2_fit coef(simple_2_fit)
(Intercept) weight feedingblood
43.21968178 2.11178610 29.67377681
feedingketchup weight:feedingblood weight:feedingketchup
43.24721935 -0.03097599 -1.65129790
R Code [zeigen / verbergen]
|> broom::augment()
simple_2_fit predictions(simple_2_fit)
avg_predictions(simple_2_fit, by = "feeding")
tibble(weight = simple_2_tbl$weight,
jump_length = simple_2_tbl$jump_length,
feeding = simple_2_tbl$feeding,
estimate = predictions(simple_2_fit)$estimate) |>
ggplot(aes(weight, estimate, shape = feeding)) +
geom_point(aes(weight, jump_length), color = "blue") +
geom_point(color = "red") +
geom_line(aes(y = predict(simple_2_fit))) +
geom_hline(yintercept = avg_predictions(simple_2_fit, by = "feeding")$estimate)
84.5 Counterfactual
R Code [zeigen / verbergen]
avg_comparisons(feeding_fit,
by = "stage",
variables = list("feeding" = "pairwise"),
vcov = "HC3")
Contrast stage Estimate Std. Error z Pr(>|z|) S 2.5 %
blood - sugar_water adult 20.73 11.6 1.794 0.0728 3.8 -1.92
ketchup - blood adult -7.56 15.0 -0.503 0.6148 0.7 -37.02
ketchup - sugar_water adult 13.17 11.4 1.157 0.2472 2.0 -9.14
blood - sugar_water juvenile 54.99 17.7 3.106 0.0019 9.0 20.29
ketchup - blood juvenile -4.79 22.3 -0.215 0.8301 0.3 -48.52
ketchup - sugar_water juvenile 50.21 14.3 3.512 <0.001 11.1 22.19
97.5 %
43.4
21.9
35.5
89.7
38.9
78.2
Term: feeding
Type: response
84.6 Weitere R Pakete
84.7 Marginal effects models
84.7.1 Kontrafaktische Vergleiche
Kontrafaktische Vergleiche (eng. counterfactual)
R Code [zeigen / verbergen]
<- lm(jump_length ~ weight*I(weight^2)*feeding + count_leg,
model_grp_sq data = flea_model_tbl)
tidy(model_grp_sq)
# A tibble: 13 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 49.8 36.5 1.37 1.80e- 1
2 weight 1.11 11.7 0.0946 9.25e- 1
3 I(weight^2) 0.0615 0.950 0.0647 9.49e- 1
4 feedingblood -13.1 92.9 -0.141 8.89e- 1
5 feedingketchup -7.07 68.1 -0.104 9.18e- 1
6 count_leg 0.0973 0.0113 8.59 3.84e-10
7 weight:I(weight^2) -0.00305 0.0223 -0.137 8.92e- 1
8 weight:feedingblood 7.02 27.8 0.253 8.02e- 1
9 weight:feedingketchup 0.517 20.7 0.0250 9.80e- 1
10 I(weight^2):feedingblood -0.562 2.30 -0.244 8.09e- 1
11 I(weight^2):feedingketchup 0.0565 1.80 0.0313 9.75e- 1
12 weight:I(weight^2):feedingblood 0.0142 0.0536 0.266 7.92e- 1
13 weight:I(weight^2):feedingketchup 0.0000836 0.0468 0.00179 9.99e- 1
R Code [zeigen / verbergen]
<- datagrid(model = model_grp_sq,
cfct_data weight = c(5, 15),
grid_type = "counterfactual")
R Code [zeigen / verbergen]
nrow(flea_model_tbl)
[1] 48
R Code [zeigen / verbergen]
nrow(cfct_data)
[1] 96
R Code [zeigen / verbergen]
<- model_grp_sq |>
mfx_cfct slopes(datagrid(weight = c(5, 15),
grid_type = "counterfactual"),
variables = "weight")
rbind(head(mfx_cfct), tail(mfx_cfct))
Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
1.49 4.01 0.372 0.710 0.5 -6.37 9.36
1.49 4.01 0.372 0.710 0.5 -6.37 9.36
1.49 4.02 0.372 0.710 0.5 -6.38 9.36
1.49 4.02 0.372 0.710 0.5 -6.38 9.36
1.49 4.02 0.372 0.710 0.5 -6.38 9.36
1.49 4.01 0.372 0.710 0.5 -6.37 9.36
3.16 2.20 1.433 0.152 2.7 -1.16 7.48
3.16 2.21 1.426 0.154 2.7 -1.18 7.50
3.16 2.21 1.426 0.154 2.7 -1.18 7.50
3.16 2.21 1.426 0.154 2.7 -1.18 7.50
3.16 2.20 1.433 0.152 2.7 -1.16 7.48
3.16 2.20 1.433 0.152 2.7 -1.16 7.48
Term: weight
Type: response
Comparison: dY/dX
R Code [zeigen / verbergen]
|>
mfx_cfct group_by(weight) |>
summarize(avg_slope = mean(estimate))
# A tibble: 2 × 2
weight avg_slope
<dbl> <dbl>
1 5 2.68
2 15 1.57
R Code [zeigen / verbergen]
|>
model_grp_sq avg_slopes(newdata = datagrid(weight = c(5, 20),
grid_type = "counterfactual"),
variables = "weight",
by = c("weight", "feeding"))
weight feeding Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
5 sugar_water 1.4930 4.01 0.3721 0.710 0.5 -6.37 9.36
5 blood 3.9624 8.00 0.4955 0.620 0.7 -11.71 19.64
5 ketchup 2.5814 5.16 0.4999 0.617 0.7 -7.54 12.70
20 sugar_water -0.0983 2.13 -0.0462 0.963 0.1 -4.27 4.07
20 blood 1.5321 1.68 0.9140 0.361 1.5 -1.75 4.82
20 ketchup 2.7790 5.72 0.4856 0.627 0.7 -8.44 14.00
Term: weight
Type: response
Comparison: dY/dX
84.8 Analyse von Zeitreihen
Hier nochmal {mgcv}
und {gamm4}
Introduction to Generalized Additive Mixed Models
s()
und Interaktion mit s(x_1, by = f_1)
84.9 Links
- Mixed
- Mixed II
- Marginal and conditional effects for GLMMs with {marginaleffects} | Andrew Heiss – Andrew Heiss
- R Paket
{gapminder}
- Gapminder
- Lists are my secret weapon for reporting stats with knitr - Higher Order Functions
- Visualizing {dplyr}’s mutate(), summarize(), group_by(), and ungroup() with animations | Andrew Heiss – Andrew Heiss