Error raised when calculating overall weights under some parameterizations of the dataset

First, thanks for this great package; I've been looking forward to it. I appreciate all the effort that goes into making something like this.

I've been running into an error when I run `cont_did()` in cases where the number of groups is smaller than the number of periods (although I now see that this issue applies more generally; see the `df3` case below). For example, when I have many periods with one ever-treated group and one never-treated group, I often get the error `Error in overall_weights(att_gt, ...) :  something's going wrong calculating overall weights`.

I _think_ I've tracked down the issue, although my comparative advantage is not in R, so it's possible I'm wrong here. I've provided a minimal working example below, followed by some code that I used to (possibly?) diagnose the issue.

The error is thrown at the end of `overall_weights()` in `pte_aggte.R` when `sum(out_weight) != 1`. For the examples I've run, the weights should sum to exactly 1, but the condition still evaluates to `TRUE` because of a floating-point precision issue: the computed sum differs from 1 by a tiny rounding error.
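To illustrate the general phenomenon outside the package, here is a minimal base-R sketch. It does not call `ptetools` or `contdid`, and the variable names (`x`, `w`, `gap`) are made up for illustration only.

```r
# Exact equality on doubles can fail even when the math says it should hold.
x <- 0.1 + 0.2
print(x == 0.3)                   # FALSE: the two doubles differ in their last bits
print(isTRUE(all.equal(x, 0.3)))  # TRUE: all.equal() compares within a tolerance

# Weights that should sum to 1 may only do so up to rounding error.
w <- c(0, 0, 0, rep(1 / 6, 6))    # same shape as the out_weight vector in this report
gap <- abs(sum(w) - 1)            # zero or a tiny rounding error, depending on platform
print(gap < 1e-12)                # TRUE either way
```

Whether this particular `w` trips an exact `!= 1` test can depend on the platform and on how the weights were computed, which may be why the error shows up for some parameterizations and not others.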

(Also, let me know if I should move this report to the ptetools repo. I wasn't entirely sure where to put it.)

First, here's a quick MWE that demonstrates the issue.

```r
library("contdid")
set.seed(117)

# Dataset from contdid README
df1 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 4,
  num_groups = 4,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)

# Dataset where ever-treated units are all treated mid-way through the sample
df2 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 10,
  num_groups = 10,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
df2$G = ifelse(
  df2$G != 0,
  5,
  0
)

# Dataset similar to README example, with contrived number of groups/periods
df3 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 11,
  num_groups = 11,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)

# Run cont_did for each dataset
r1 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df1,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)

r2 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df2,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)

r3 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df3,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 0,
  degree = 1
)

```

Model `r1` runs properly (modulo a warning about uniform vs. pointwise CIs), but `r2` and `r3` both fail with the error:
```text
Error in overall_weights(att_gt, ...) :
  something's going wrong calculating overall weights
```

Note that the same error also shows up for non-linear models (e.g., with `num_knots = 1` and `degree = 3`, as in the code below).

Here's some code/directions that will document the precision issue:

```r
library("contdid")
set.seed(117)

# Set breakpoint
options(error = browser)
debugonce(ptetools:::overall_weights)

# Dataset where ever-treated units are all treated mid-way through the sample
df2 = simulate_contdid_data(
  n = 5000,
  num_time_periods = 10,
  num_groups = 10,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
df2$G = ifelse(
  df2$G != 0,
  5,
  0
)

# Run model
r2 = cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df2,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "nevertreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3
)

# Step through the function with "n" until out_weight has been created, then run:
print(out_weight)  # Output: three 0s and six approximations of 1/6; should sum to 1
print(sum(out_weight))  # Output: 1 (at R's default print precision)
print(sum(out_weight) - 1)  # Output: small, non-zero value

# Run the "quick sanity check" from overall_weights()
if (sum(out_weight) != 1) stop("something's going wrong calculating overall weights")
# Result: the condition is TRUE, so stop() fires with the message above
``` 

For what it's worth, it's easy to construct examples that do and do not trigger this. E.g., changing the `5` assigned to `df2$G` to `6` makes the error go away.
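One possible way to make the sanity check robust would be to compare within a tolerance instead of testing exact equality. The sketch below is only a suggestion and not the actual `ptetools` code; `check_weights_sum_to_one()` and its `tol` argument are hypothetical names I made up for illustration.

```r
# Hypothetical helper (not part of ptetools): raise an error only if the
# weights are meaningfully far from summing to 1.
check_weights_sum_to_one <- function(out_weight, tol = sqrt(.Machine$double.eps)) {
  if (abs(sum(out_weight) - 1) > tol) {
    stop("something's going wrong calculating overall weights")
  }
  invisible(TRUE)
}

check_weights_sum_to_one(c(0, 0, 0, rep(1 / 6, 6)))  # passes silently
# check_weights_sum_to_one(c(0.5, 0.4))              # would raise the error
```

The default tolerance here is an assumption; `isTRUE(all.equal(sum(out_weight), 1))` would be a similar base-R alternative.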
