First, thanks for this great package, I've been looking forward to it. I appreciate all the effort that goes into making something like this.
I've been running into an error when I run `cont_did()` in cases where the number of groups is smaller than the number of periods (although I now see that this issue applies more generally; see the `df3` case below). For example, when I have many periods with one ever-treated group and one never-treated group, I often get the error `Error in overall_weights(att_gt, ...) : something's going wrong calculating overall weights`.
I think I've tracked down the issue, although my comparative advantage is not in R, so it's possible I'm wrong here. I've provided a minimal working example below, followed by some code that I've used to (possibly?) diagnose the issue.
The error is thrown at the end of `overall_weights()` in `pte_aggte.R` when `sum(out_weight) != 1`. For the examples I've run, the weights should sum to exactly 1, but the condition evaluates to `TRUE` because of a floating-point precision issue: the computed sum differs from 1 by a tiny, non-zero amount.
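As a generic illustration of why an exact `!=` comparison on doubles can misfire (this is a standalone sketch using the textbook `0.1 + 0.2` case, not the actual weights from `overall_weights()`):

```r
# Two quantities that are equal mathematically but not in double precision.
x <- 0.1 + 0.2

exact_equal <- (x == 0.3)                # exact comparison
gap         <- abs(x - 0.3)              # size of the discrepancy
near_equal  <- isTRUE(all.equal(x, 0.3)) # comparison with a tolerance

print(exact_equal)   # FALSE
print(gap < 1e-15)   # TRUE: the gap is tiny but not zero
print(near_equal)    # TRUE
```

The same mechanism would explain a sum of weights that prints as 1 but still fails `sum(out_weight) != 1`.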
(Also, let me know if I should move this report to the ptetools repo. I wasn't entirely sure where to put it.)
First, here's a quick MWE that demonstrates the issue.
```r
library("contdid")
set.seed(117)

# Dataset from contdid README
df1 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 4,
  num_groups = 4,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)

# Dataset where ever-treated units are all treated mid-way through the sample
df2 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 10,
  num_groups = 10,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
df2$G = ifelse(
  df2$G != 0,
  5,
  0
)

# Dataset similar to README example, with contrived number of groups/periods
df3 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 11,
  num_groups = 11,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)

# Run cont_did for each dataset
r1 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df1,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)
r2 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df2,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)
r3 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df3,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)
```
Model `r1` runs properly (modulo a warning about uniform vs. pointwise CIs); `r2` and `r3` give the error:
```
Error in overall_weights(att_gt, ...) :
  something's going wrong calculating overall weights
```
Note that the same error also shows up with non-linear models (e.g., `num_knots = 1` and `degree = 3`, as in the debugging code below).
Here's some code/directions that will document the precision issue:
```r
library("contdid")
set.seed(117)

# Set breakpoint
options(error = browser)
debugonce(ptetools:::overall_weights)

# Dataset where ever-treated units are all treated mid-way through the sample
df2 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 10,
  num_groups = 10,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
df2$G = ifelse(
  df2$G != 0,
  5,
  0
)

# Run model
r2 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df2,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3
)

# Step through function with "n" until out_weight has been created.
print(out_weight) # Output: Three 0s, and 6 approximations of 1/6; should sum to 1
print(sum(out_weight)) # Output: 1
print(sum(out_weight) - 1) # Output: Small, non-zero value

# Run the "quick sanity check" from overall_weights()
if (sum(out_weight) != 1) stop("something's going wrong calculating overall weights")
# Output: stop("something's going wrong calculating overall weights")
```
For what it's worth, it's easy to make and break examples of this. E.g., change the 5 in `df2` to 6 and the error goes away.
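One possible way to make the sanity check robust, sketched under assumptions: `weights_sum_to_one()` and its `tol` default are hypothetical names of mine, not anything in ptetools, and the example vector uses a deliberately exaggerated perturbation as a stand-in for rounding error.

```r
# Hypothetical helper (not part of ptetools): compare the sum to 1
# within a tolerance instead of testing exact equality.
weights_sum_to_one <- function(w, tol = sqrt(.Machine$double.eps)) {
  abs(sum(w) - 1) < tol
}

# Stand-in for weights whose sum is off by a tiny numerical error.
w_ok  <- c(0.25, 0.25, 0.5 + 1e-13)
# Weights that are genuinely wrong and should still be rejected.
w_bad <- c(0.5, 0.4)

print(sum(w_ok) != 1)             # TRUE: the exact check would call stop()
print(weights_sum_to_one(w_ok))   # TRUE: the tolerant check passes
print(weights_sum_to_one(w_bad))  # FALSE: real problems are still caught
```

Base R's `isTRUE(all.equal(sum(w), 1))` would be an alternative to the explicit `abs()` comparison; which tolerance is appropriate is a judgment call for the maintainers.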