Add HERS dataset linear regression example with interactions, interspersed with birthweight analysis (Vittinghoff Ch. 4) #381

Open
Copilot wants to merge 24 commits into main from copilot/add-example-model-hers-dataset

Contributor

Copilot AI commented Apr 7, 2026

Integrates a HERS dataset linear regression example into the "Linear Models Overview" chapter, interspersed with the existing birthweight analysis to mirror the narrative flow of Vittinghoff Ch. 4.

Changes Made

HERS content subfiles (all at repo root, for easy repositioning)

  • _sec_hers_data.qmd — HERS data introduction: data loading, scatter plot (faceted by HT with label_both), pairwise key-variables plot (GGally::ggpairs()), and data notation (H, B, V variables)
  • _sec_hers_lm_parallel.qmd — Parallel lines model: LDL ~ HT + BMI; fitted-values plot faceted by HT
  • _sec_hers_lm_interact.qmd — Interaction model: LDL ~ HT + BMI + statins + BMI:statins; plot faceted by HT
  • _sec_hers_lm_gof.qmd — Goodness of fit: AIC/BIC table and deviance for both HERS models
  • _sec_hers_lm_diagnostics.qmd — Diagnostics: residuals vs fitted and QQ plots for both hers_lm1 and hers_lm2, all faceted by HT with label_both
  • _sec_hers_lm_model_selection.qmd — Model selection: LRT comparing the two HERS models
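
As a rough illustration of how the two models in the list above relate, the sketch below fits both formulas with base R and compares them. The data frame `sim_hers`, its simulated coefficients, and the seed are assumptions made purely for illustration; the real HERS data and the subfiles' actual code are not reproduced here. The likelihood ratio test is computed by hand from `logLik()` rather than with the `lrtest()` call the PR uses, so the sketch has no package dependencies.

```r
# Simulated stand-in for the HERS data (hypothetical values, illustration only).
set.seed(1)
n <- 500
sim_hers <- data.frame(
  HT = factor(sample(c("placebo", "hormone therapy"), n, replace = TRUE)),
  statins = factor(sample(c("no", "yes"), n, replace = TRUE)),
  BMI = rnorm(n, mean = 28, sd = 5)
)

# Set reference levels explicitly, as the PR does for HT and statins.
sim_hers$HT <- relevel(sim_hers$HT, ref = "placebo")
sim_hers$statins <- relevel(sim_hers$statins, ref = "no")

# Simulated outcome with a BMI-by-statins interaction (made-up effect sizes).
sim_hers$LDL <- with(
  sim_hers,
  140 - 5 * (HT == "hormone therapy") - 15 * (statins == "yes") +
    0.5 * BMI - 0.6 * BMI * (statins == "yes") + rnorm(n, sd = 30)
)

# Parallel-lines model and interaction model, using the PR's formulas.
hers_lm1 <- lm(LDL ~ HT + BMI, data = sim_hers)
hers_lm2 <- lm(LDL ~ HT + BMI + statins + BMI:statins, data = sim_hers)

# Goodness-of-fit summary: AIC, BIC, and deviance (residual sum of squares).
gof <- data.frame(
  model = c("hers_lm1", "hers_lm2"),
  AIC = c(AIC(hers_lm1), AIC(hers_lm2)),
  BIC = c(BIC(hers_lm1), BIC(hers_lm2)),
  deviance = c(deviance(hers_lm1), deviance(hers_lm2))
)
print(gof)

# Likelihood ratio test by hand: hers_lm1 is nested in hers_lm2.
ll1 <- logLik(hers_lm1)
ll2 <- logLik(hers_lm2)
lrt_stat <- 2 * (as.numeric(ll2) - as.numeric(ll1))
lrt_df <- attr(ll2, "df") - attr(ll1, "df")
lrt_p <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)
cat("LRT statistic:", lrt_stat, "on", lrt_df, "df; p-value:", lrt_p, "\n")
```

Because `hers_lm2` adds two terms (`statins` and `BMI:statins`) to `hers_lm1`, the test has 2 degrees of freedom.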

Structure in _sec_linreg_understanding.qmd

  • HERS data intro (_sec_hers_data.qmd) inserted directly after the birthweight dataset section
  • HERS parallel model included after the birthweight parallel lines coefficient interpretation
  • HERS interaction model included after the birthweight interaction coefficient comparison table

"Assessing model fit" section (_sec_linreg_assess.qmd)

  • HERS goodness-of-fit examples added after the birthweight GOF subsection
  • HERS diagnostic examples added after the birthweight Diagnostics subsection
  • HERS model selection examples added after the birthweight Model Selection subsection

Other

  • Interaction model uses BMI:statins (not BMI:HT); statin use $V$ added to data notation
  • All HERS plots faceted by HT using facet_wrap(~HT, labeller = label_both)
  • Factor reference levels explicitly set: relevel(ref = "placebo") for HT, relevel(ref = "no") for statins
  • GGally::ggpairs() key-variables panel added to the HERS data tabset, showing LDL, HT, BMI, statins, and age colored by treatment group
  • GGally (v2.4.0) added to DESCRIPTION Suggests and renv.lock
  • inst/WORDLIST deduplicated and sorted (105 unique words after merging with main)
  • Deleted unreferenced standalone _sec_exm_hers_lm.qmd

Testing

  • ✅ Lint passes (lintr::lint() on all changed .qmd files)
  • ✅ Spell check passes (spelling::spell_check_package())

Copilot AI and others added 2 commits April 7, 2026 20:05
Copilot AI changed the title from "[WIP] Add an example model with the HERS dataset for linear regression" to "Add HERS dataset linear regression example (Vittinghoff Ch. 4)" on Apr 7, 2026
Copilot AI requested a review from d-morrison April 7, 2026 20:08
@d-morrison
Owner

add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

@d-morrison
Owner

@copilot add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

Copilot AI and others added 2 commits April 7, 2026 20:41
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

Done in commit 27418f9. The HERS analysis is now interspersed with the birthweight sections:

  • 2.2.4 Motivating example: hers data — data loading, scatter plot, and notation (H = HT treatment, B = BMI)
  • 2.2.5 Parallel lines regression for hers data — LDL ~ HT + BMI, mirroring the birthweight parallel lines model
  • 2.2.7 Interactions in hers data — LDL ~ HT + BMI + HT:BMI, added right after the birthweight interaction section (2.2.6)

Each HERS section follows the same structure as its birthweight counterpart: model equation, parameter table, and fitted-values plot.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • glmnet.stanford.edu
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R (dns block)
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R unam… (dns block)
  • olsrr.rsquaredacademy.com
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R (dns block)
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R unam… (dns block)
  • packagemanager.posit.co
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a159b3cb76 (dns block)
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a143be690e (dns block)
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a1777d9204 (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title from "Add HERS dataset linear regression example (Vittinghoff Ch. 4)" to "Add HERS dataset linear regression example with interactions, interspersed with birthweight analysis (Vittinghoff Ch. 4)" on Apr 7, 2026
@d-morrison
Owner

@copilot add interaction between bmi and statin use to the HERS model, and remove the interaction between bmi and HT

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot add interaction between bmi and statin use to the HERS model, and remove the interaction between bmi and HT

Done in commit f7d07a6. The interaction model is now LDL ~ HT + BMI + statins + BMI:statins:

  • Removed HT:BMI interaction
  • Added statins as a main effect and BMI:statins as the interaction term
  • Added statin use ($U$/$V$) to the data notation section
  • Updated the scatter plot to colour/shape by statin use (no vs yes)

The model equation is now:
$$\mu(h,v,b) = \beta_0 + \beta_H h + \beta_V v + \beta_B b + \beta_{VB}(v \cdot b)$$
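
Reading the equation above group by group may help with interpretation. The coding here is an assumption inferred from the stated reference levels: $v = 0$ for no statin use and $v = 1$ for statin use.

$$\mu(h,0,b) = \beta_0 + \beta_H h + \beta_B b$$
$$\mu(h,1,b) = (\beta_0 + \beta_V) + \beta_H h + (\beta_B + \beta_{VB})\, b$$
$$\frac{\partial \mu}{\partial b} = \beta_B + \beta_{VB} v$$

Under that coding, $\beta_B$ is the BMI slope among non-users of statins, $\beta_B + \beta_{VB}$ is the slope among statin users, and $\beta_H$ shifts the intercept by the same amount in both groups, since HT no longer interacts with BMI.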

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • rspm-sync.rstudio.com
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/RtmpdqGxja/renv-download-config-2677336aef2b (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@d-morrison
Owner

@copilot resolve merge conflicts

@d-morrison d-morrison removed their request for review April 10, 2026 18:54
LRT is already covered in _sec_hers_lm_model_selection.qmd.
The GOF file should focus on AIC, BIC, and deviance only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot

Done in commits c46e981 and f9d6ec1:

  • HERS data intro moved: Extracted into _sec_hers_data.qmd and inserted directly after the birthweight dataset section (before "Parallel lines regression")
  • All HERS graphs faceted by HT: Added facet_wrap(~HT, labeller = label_both) to scatter plot in _sec_hers_data.qmd, fitted-values plot in _sec_hers_lm_parallel.qmd, and interaction model plot in _sec_hers_lm_interact.qmd
  • All HERS content in subfiles: _sec_hers_data.qmd, _sec_hers_lm_parallel.qmd, _sec_hers_lm_interact.qmd, _sec_hers_lm_gof.qmd, _sec_hers_lm_diagnostics.qmd, _sec_hers_lm_model_selection.qmd
  • HERS examples for all "Evaluating model fit" subsections:
    • Goodness of fit: AIC/BIC table and deviance for hers_lm1 vs hers_lm2 (_sec_hers_lm_gof.qmd)
    • Diagnostics: Residuals vs fitted and QQ plot, both faceted by HT (_sec_hers_lm_diagnostics.qmd)
    • Model selection: LRT comparing the two HERS models (_sec_hers_lm_model_selection.qmd)
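
The faceted diagnostics described in the list above could look roughly like the following sketch. It assumes the `ggplot2` package is installed and uses a simulated data frame (`sim_hers`, with made-up effect sizes) in place of the real HERS data; the helper data frame `diag_data` and the plot object names are hypothetical and not taken from the PR's subfiles.

```r
library(ggplot2)

# Simulated stand-in for the HERS data (hypothetical values, illustration only).
set.seed(3)
n <- 300
sim_hers <- data.frame(
  HT = factor(
    sample(c("placebo", "hormone therapy"), n, replace = TRUE),
    levels = c("placebo", "hormone therapy")
  ),
  BMI = rnorm(n, mean = 28, sd = 5)
)
sim_hers$LDL <- 140 - 5 * (sim_hers$HT == "hormone therapy") +
  0.4 * sim_hers$BMI + rnorm(n, sd = 30)

# Parallel-lines model, using the PR's formula.
hers_lm1 <- lm(LDL ~ HT + BMI, data = sim_hers)

# Collect fitted values and residuals alongside the faceting variable.
diag_data <- data.frame(
  HT = sim_hers$HT,
  fitted = fitted(hers_lm1),
  resid = residuals(hers_lm1)
)

# Residuals vs fitted, one panel per HT group.
# label_both puts "HT: placebo" style titles on the facet strips.
resid_plot <- ggplot(diag_data, aes(x = fitted, y = resid)) +
  geom_point(alpha = 0.5) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_wrap(~HT, labeller = label_both)

# Normal QQ plot of the residuals, also faceted by HT.
qq_plot <- ggplot(diag_data, aes(sample = resid)) +
  geom_qq() +
  geom_qq_line() +
  facet_wrap(~HT, labeller = label_both)

print(resid_plot)
print(qq_plot)
```

The same pattern would apply to `hers_lm2` by swapping in that model's fitted values and residuals.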

Contributor

Copilot AI left a comment

Pull request overview

This PR integrates a HERS dataset linear regression example into the “Linear Models Overview” chapter, interleaving it with the existing birthweight analysis to follow the narrative flow of Vittinghoff Ch. 4.

Changes:

  • Adds new HERS-focused Quarto subfiles covering data intro, parallel-lines and interaction models, GOF, diagnostics, and model selection.
  • Inserts the new HERS subfiles into the “Understanding…” and “Assessing model fit” sections via {{< include ... >}}.
  • Updates inst/WORDLIST to reflect new terminology and deduplicate/sort entries.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| inst/WORDLIST | Updates spelling whitelist for new HERS terms and general cleanup. |
| _subfiles/Linear-models-overview/_sec_linreg_understanding.qmd | Includes new HERS data + modeling sections into the narrative flow. |
| _subfiles/Linear-models-overview/_sec_linreg_assess.qmd | Includes new HERS GOF/diagnostics/model-selection examples after birthweight sections. |
| _sec_hers_data.qmd | New HERS dataset intro: load data, show table/plot, define notation. |
| _sec_hers_lm_parallel.qmd | New HERS parallel-lines model fit + coefficient table + fitted-values overlay plot. |
| _sec_hers_lm_interact.qmd | New HERS interaction model fit + interpretation + fitted overlay plot. |
| _sec_hers_lm_gof.qmd | New AIC/BIC and deviance comparisons for the HERS models. |
| _sec_hers_lm_diagnostics.qmd | New residual-vs-fitted and QQ diagnostics plots (currently for hers_lm1). |
| _sec_hers_lm_model_selection.qmd | New LRT comparison (lrtest) between hers_lm1 and hers_lm2. |

Comment thread _sec_hers_data.qmd Outdated

::: notes

{{< include _sec_hers_intro.qmd >}}
Comment thread _sec_hers_data.qmd
Comment on lines +93 to +97
::: notes
"Placebo" is the **reference level** for the categorical variable $T$,
and "no" is the **reference level** for statin use $U$.
The choice of reference level is arbitrary;
it only affects the interpretation of the intercept and corresponding indicator coefficients.
Comment thread _sec_hers_lm_diagnostics.qmd Outdated
@@ -0,0 +1,34 @@
### Diagnostics for `hers` models
@d-morrison
Owner

@copilot apply changes based on the comments in this thread

also, add a graph of the key variables of the HERS data, including the treatment variable, the outcome variable, and the most important covariates

Copilot AI and others added 2 commits April 16, 2026 19:36
…ariables plot

Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/a89643f9-3164-469b-9b64-621cc53426a0

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

also, add a graph of the key variables of the HERS data, inclu...

Done in commits f61a83b and 9d7eb27:

PR review fixes:

  • Fixed include path: {{< include _sec_hers_intro.qmd >}} → {{< include _subfiles/shared/_sec_hers_intro.qmd >}}
  • Added explicit relevel() calls: relevel(ref = "placebo") for HT and relevel(ref = "no") for statins (both were already in correct order from as_factor(), but now explicit)
  • Expanded diagnostics into two subsections — one for hers_lm1 (parallel-lines model) and one for hers_lm2 (interaction model), each with residuals vs fitted and QQ plots, both faceted by HT
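
The effect of the explicit `relevel()` calls mentioned above can be checked with a small base R sketch. The data frame `d`, its simulated values, and the model object names are hypothetical and used only for illustration. The sketch suggests that changing the reference level changes how the intercept and indicator coefficient are parameterized, but not the fitted values.

```r
# Simulated two-group data (hypothetical values, illustration only).
set.seed(2)
d <- data.frame(
  HT = factor(
    rep(c("placebo", "hormone therapy"), each = 50),
    levels = c("placebo", "hormone therapy")
  ),
  BMI = rnorm(100, mean = 28, sd = 5)
)
d$LDL <- 140 - 5 * (d$HT == "hormone therapy") + 0.4 * d$BMI +
  rnorm(100, sd = 20)

# Fit with "placebo" as the reference level (made explicit with relevel()).
d$HT <- relevel(d$HT, ref = "placebo")
fit_placebo_ref <- lm(LDL ~ HT + BMI, data = d)

# Refit with the other level as the reference.
d2 <- d
d2$HT <- relevel(d2$HT, ref = "hormone therapy")
fit_ht_ref <- lm(LDL ~ HT + BMI, data = d2)

# The indicator coefficient flips sign and the intercept shifts...
print(coef(fit_placebo_ref))
print(coef(fit_ht_ref))

# ...but the fitted values are the same under either parameterization.
print(all.equal(fitted(fit_placebo_ref), fitted(fit_ht_ref)))
```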

Key variables graph:

  • Added a "Key variables" tab in the HERS data panel-tabset with a GGally::ggpairs() plot of LDL (outcome), HT (treatment), BMI, statins, and age — colored by HT treatment group

@d-morrison d-morrison marked this pull request as ready for review April 19, 2026 00:34
Copilot AI review requested due to automatic review settings April 21, 2026 00:43
Contributor

Copilot AI left a comment

Pull request overview

Adds a new HERS linear regression walkthrough (parallel lines, interaction, GOF/diagnostics/model selection) into the Linear Models Overview chapter, interleaved with the existing birthweight narrative to mirror Vittinghoff Ch. 4.

Changes:

  • Inserts new HERS include subfiles into the “Understanding linear regression” and “Assessing model fit” subfiles.
  • Adds new Quarto subfiles at repo root for HERS data intro, two models, GOF, diagnostics, and LRT-based model selection.
  • Updates inst/WORDLIST with additional allowed technical terms.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| inst/WORDLIST | Updates spellcheck allowlist for new terminology used in the added HERS content. |
| _subfiles/Linear-models-overview/_sec_linreg_understanding.qmd | Includes the HERS intro + model examples into the “Understanding” flow. |
| _subfiles/Linear-models-overview/_sec_linreg_assess.qmd | Includes HERS GOF/diagnostics/model-selection examples into the “Assessing fit” flow. |
| _sec_hers_data.qmd | New HERS dataset introduction, plots, and notation. |
| _sec_hers_lm_parallel.qmd | New HERS parallel-lines model example and fitted-values visualization. |
| _sec_hers_lm_interact.qmd | New HERS interaction model example and visualization by statin use. |
| _sec_hers_lm_gof.qmd | New AIC/BIC and deviance comparisons for the two HERS models. |
| _sec_hers_lm_diagnostics.qmd | New residual-vs-fitted and QQ diagnostics for both HERS models. |
| _sec_hers_lm_model_selection.qmd | New LRT comparison between HERS models. |

Comment thread _sec_hers_lm_interact.qmd
Comment on lines +39 to +44
hers_lm2 |>
parameters() |>
parameters::print_md(
include_reference = include_reference_lines,
select = "{estimate}"
)
Comment thread inst/WORDLIST
Comment on lines 33 to 40
WCGS
Welch
Zhou
biomarkers
Biostat
bmatrix
BMI
bmt
Comment thread _sec_hers_data.qmd
### Motivating example: `hers` data {.smaller}

:::{.callout-note}
This section is based on @vittinghoff2e, Chapter 4.
@d-morrison
Owner

@copilot add GGally package to renv and Description

@d-morrison d-morrison removed their request for review April 21, 2026 01:17
Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/911ad1ab-e623-4550-9811-addaf2dbd9b9

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 21, 2026 01:52
Contributor Author

Copilot AI commented Apr 21, 2026

@copilot add GGally package to renv and Description

Done in commit 238201b. Added GGally to DESCRIPTION Suggests and ran renv::install("GGally"); renv::snapshot() to add GGally 2.4.0 (along with its new dependencies ggstats and patchwork) to renv.lock.

Copilot AI requested a review from d-morrison April 21, 2026 01:54
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add an example model with the HERS dataset for linear regression

3 participants