What happened?
If any by variable passed to ard_stack_hierarchical() is not present in the denominator dataset, and a subject has records at more than one level of that by variable, only one of those records is retained for the subject.
In the following example, a subject has one MILD and one MODERATE AE, but only the last one (MODERATE) is counted.
library(cards)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
# load data
adsl <- pharmaverseadam::adsl |>
  filter(SAFFL == "Y", SITEID == "701")
adae <- pharmaverseadam::adae |>
  filter(SAFFL == "Y", SITEID == "701")

# subset data to the first two SOCs to limit the output
adae <- adae |>
  filter(AESOC %in% unique(AESOC)[1:2]) |>
  unique()

# duplicate one subject's record, change its severity, and append it
adae_row_add <- adae |> slice(1)
print(adae_row_add |> select(USUBJID, TRT01A, AESOC, AESEV))
#> # A tibble: 1 × 4
#>   USUBJID     TRT01A  AESOC                                                AESEV
#>   <chr>       <chr>   <chr>                                                <chr>
#> 1 01-701-1015 Placebo GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS MILD
adae <- adae |>
  bind_rows(
    adae_row_add |> mutate(AESEV = "MODERATE")
  )

# calculate manually: distinct subjects per treatment, severity, and SOC
manual_counts <- adae |>
  select(USUBJID, TRT01A, AESEV, AESOC) |>
  unique() |>
  group_by(TRT01A, AESEV, AESOC) |>
  tally() |>
  ungroup()

# calculate with ard_stack_hierarchical()
ard_counts <- ard_stack_hierarchical(
  data = adae,
  by = c(TRT01A, AESEV),
  variables = AESOC,
  statistic = ~ "n",
  denominator = adsl,
  id = USUBJID,
  by_stats = FALSE
) |>
  unlist_ard_columns()
#> ℹ Denominator set by "TRT01A" column in `denominator` data frame.

# manual counts: 3 MILD, 1 MODERATE
manual_counts |>
  filter(TRT01A == "Placebo",
         AESOC == "GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS")
#> # A tibble: 2 × 4
#>   TRT01A  AESEV    AESOC                                                    n
#>   <chr>   <chr>    <chr>                                                <int>
#> 1 Placebo MILD     GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS     3
#> 2 Placebo MODERATE GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS     1

# ard_stack_hierarchical() counts: 2 MILD, 1 MODERATE
ard_counts |>
  filter(group1_level == "Placebo",
         variable_level == "GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS")
#> {cards} data frame: 3 x 13
#>   group1 group1_level group2 group2_level variable variable_level stat_name
#> 1 TRT01A      Placebo  AESEV         MILD    AESOC      GENERAL …         n
#> 2 TRT01A      Placebo  AESEV     MODERATE    AESOC      GENERAL …         n
#> 3 TRT01A      Placebo  AESEV       SEVERE    AESOC      GENERAL …         n
#>   stat_label stat
#> 1          n    2
#> 2          n    1
#> 3          n    0
#> ℹ 4 more variables: context, fmt_fun, warning, error

Created on 2025-11-24 with reprex v2.1.1
I'm not 100% sure if this is a bug or intentional behavior, but the result is unexpected in some cases and may be confusing to users. The help file does describe the logic accurately, so it's probably intentional and might just need additional clarification.
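To make the suspected mechanism concrete, here is a minimal dplyr sketch of the subsetting as I understand it from the help file: the data are sorted by id, by, and variables, and then only the last row is kept within each group of id, the by variables that also exist in the denominator, and the hierarchy variable. This is an assumption about the logic, not the package's actual code, and toy_adae and toy_adsl are hypothetical stand-ins for the real data.

```r
library(dplyr)

# Hypothetical toy data: subject S1 has both a MILD and a MODERATE event
toy_adae <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S3"),
  TRT01A  = "Placebo",
  AESOC   = "GEN",
  AESEV   = c("MILD", "MODERATE", "MILD", "MILD")
)
toy_adsl <- data.frame(
  USUBJID = c("S1", "S2", "S3"),
  TRT01A  = "Placebo"
)

by_vars <- c("TRT01A", "AESEV")
# Only by variables that exist in the denominator take part in the grouping;
# AESEV is not in toy_adsl, so it is dropped here.
by_in_denom <- intersect(by_vars, names(toy_adsl))

# Assumed logic: sort, then keep the last row per subject within each group
kept <- toy_adae |>
  arrange(USUBJID, TRT01A, AESEV, AESOC) |>
  group_by(across(all_of(c("USUBJID", by_in_denom, "AESOC")))) |>
  slice_tail(n = 1) |>
  ungroup()

# S1's MILD row is gone, so MILD is counted for 2 subjects instead of 3
sketch_counts <- kept |>
  count(TRT01A, AESEV, AESOC)
sketch_counts
```

On this toy data the sketch gives 2 MILD and 1 MODERATE, the same pattern as the ard_stack_hierarchical() result above, while counting distinct subjects per severity would give 3 MILD and 1 MODERATE.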