Overview
Building on the excellent proposal in #233, this issue tracks a correct and complete implementation of coef() and vcov() S3 methods for MP and AGGTEobj objects. Before implementing, several scaling and clustering questions need to be resolved.
coef.MP
Straightforward — just names the ATT(g,t) vector:
coef.MP <- function(object, ...) {
  x <- object$att
  names(x) <- paste0("ATT(", object$group, ",", object$t, ")")
  x
}
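As a quick illustration of the naming scheme, the method can be exercised on a stand-in object. The sketch below assumes only the three fields the method reads (`att`, `group`, `t`); `mock_mp` and its values are made up for illustration and are not output from `att_gt()`.

```r
# Same method as proposed above, repeated so the snippet runs on its own.
coef.MP <- function(object, ...) {
  x <- object$att
  names(x) <- paste0("ATT(", object$group, ",", object$t, ")")
  x
}

# Hypothetical stand-in for an MP object (values are invented).
mock_mp <- structure(
  list(
    group = c(2004, 2004, 2006),
    t     = c(2004, 2005, 2006),
    att   = c(-0.010, -0.042, 0.015)
  ),
  class = "MP"
)

est <- coef(mock_mp)  # dispatches to coef.MP via the stats::coef generic
print(est)
```

Each element is labelled by its (group, period) pair, which is what downstream hypothesis-testing tools would match restrictions against.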
vcov.MP
vcov() must return $\widehat{\text{Var}}(\hat\theta)$, the estimated variance matrix of the ATT(g,t) estimator itself.
Analytical case (bstrap = FALSE)
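The V_analytical matrix stored on the MP object estimates $\hat\Sigma = \frac{1}{n}\sum_i \psi_i\psi_i'$, the asymptotic variance of $\sqrt{n}(\hat\theta - \theta)$, so it must be divided by n to obtain the variance of $\hat\theta$ itself. The correct vcov is: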
vcov.MP <- function(object, ...) as.matrix(object$V_analytical) / object$n
This is consistent with the existing Wald pre-test (W = n * t(att) %*% solve(V) %*% att).
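The scaling argument can be checked numerically. The following is a minimal sketch on simulated influence functions (the matrix `psi` and the vector `att` are invented, not taken from a fitted model): dividing the estimate of Σ by n gives a matrix whose plain quadratic form reproduces the n-scaled Wald statistic.

```r
set.seed(1)
n   <- 500
psi <- matrix(rnorm(n * 3), nrow = n, ncol = 3)  # simulated influence functions (assumption)
att <- c(0.10, -0.05, 0.02)                      # made-up ATT(g,t) estimates

Sigma_hat <- crossprod(psi) / n  # plays the role of V_analytical
V_theta   <- Sigma_hat / n       # what vcov() should return

# Wald statistic written the way the pre-test does it ...
W_pretest <- n * drop(t(att) %*% solve(Sigma_hat) %*% att)
# ... and written directly in terms of the vcov() matrix.
W_vcov    <- drop(t(att) %*% solve(V_theta) %*% att)

se <- sqrt(diag(V_theta))        # pointwise standard errors
```

The two statistics agree up to floating-point error, which is the sense in which the proposed `vcov.MP` is consistent with the existing pre-test.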
Bootstrap case (bstrap = TRUE, no extra clustering)
mboot() scales bootstrap draws by sqrt(n), so cov(bres) also estimates $\hat\Sigma$. Same division by n applies.
Bootstrap case with clustervars
mboot() aggregates influence functions to cluster means and scales by sqrt(n_clusters), so cov(bres_cl) estimates $n_{cl} \cdot \text{Var}(\hat\theta)$. The correct vcov is cov(bres_cl) / n_clusters. This is consistent with how mboot already computes se = bSigma / sqrt(n_clusters).
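The clustered scaling can be sketched with a toy multiplier bootstrap. This is not the code inside mboot(): it assumes Rademacher multipliers and simulated influence functions, and the names `psi_bar`, `bres_cl` and `V_boot` are illustrative. It only shows why the covariance of the sqrt(n_clusters)-scaled draws has to be divided by n_clusters.

```r
set.seed(42)
n_clusters <- 40
cl  <- rep(seq_len(n_clusters), each = 10)              # cluster id for each unit
psi <- matrix(rnorm(length(cl) * 2), ncol = 2) +        # idiosyncratic part
       matrix(rnorm(n_clusters * 2), ncol = 2)[cl, ]    # shared cluster shock

# Cluster-mean influence functions (n_clusters x k).
psi_bar <- rowsum(psi, cl) / as.vector(table(cl))

B <- 2000
bres_cl <- t(replicate(B, {
  v <- sample(c(-1, 1), n_clusters, replace = TRUE)     # Rademacher multipliers (assumption)
  sqrt(n_clusters) * colMeans(v * psi_bar)              # one scaled bootstrap draw
}))

V_boot  <- cov(bres_cl)          # estimates n_clusters * Var(theta-hat)
vcov_cl <- V_boot / n_clusters   # variance of the estimator itself
```

Up to simulation noise, `V_boot` matches `crossprod(psi_bar) / n_clusters`, the analytical counterpart built from the same cluster means.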
Changes needed for bootstrap VCV
bout$V is already computed inside mboot() but currently discarded. n_clusters is also computed inside mboot() but not returned. Both need to be threaded through to the MP object. Memory cost is negligible — same size as V_analytical which is already stored.
Correctness guard under clustering
V_analytical ignores within-cluster correlation and is known to be anti-conservative when clustervars is set; the Wald pre-test already suppresses itself in this case. vcov.MP must apply the same guard: warn (or error) when clustervars is set and no bootstrap VCV is available.
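One possible shape for the guard is sketched below. It rests on several assumptions: `V_boot` and `n_clusters` are hypothetical fields standing in for whatever the threaded-through bootstrap output ends up being called, and the clustering variables are assumed to be reachable as `object$DIDparams$clustervars`.

```r
# Sketch only; field names V_boot and n_clusters are hypothetical.
vcov.MP <- function(object, ...) {
  clustered <- !is.null(object$DIDparams$clustervars)  # assumed location of clustervars

  # Prefer the bootstrap VCV when it has been stored on the object.
  if (!is.null(object$V_boot)) {
    n_eff <- if (clustered) object$n_clusters else object$n
    return(as.matrix(object$V_boot) / n_eff)
  }

  # Fall back to the analytical VCV, warning if it ignores clustering.
  if (clustered) {
    warning("clustervars is set but no bootstrap VCV is stored; ",
            "the analytical VCV ignores clustering.", call. = FALSE)
  }
  as.matrix(object$V_analytical) / object$n
}

# Hypothetical clustered object without a bootstrap VCV.
mock_cl <- structure(
  list(V_analytical = diag(2), n = 100,
       DIDparams = list(clustervars = "state")),
  class = "MP"
)
res <- tryCatch(vcov(mock_cl), warning = function(w) "warned")
```

Whether the fallback should warn or stop is a design choice; a warning keeps existing scripts running, while an error makes the anti-conservative case impossible to miss.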
Analytical cluster-robust VCV
Currently the package requires bstrap = TRUE whenever clustervars is specified. In principle, an analytical cluster-robust sandwich estimator is available:
$$\hat\Sigma_{cl} = \frac{1}{n_{cl}} \sum_c \bar\psi_c \bar\psi_c'$$
where $\bar\psi_c = \frac{1}{n_c}\sum_{i \in c} \psi_i$ is the cluster-mean influence function — already computed inside mboot() as cluster_mean_if. Exposing this analytically would allow vcov() to be correct under clustering without bootstrap, and would remove the forced-bootstrap requirement for clustered SEs.
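The estimator above can be sketched directly from unit-level influence functions. The data here are simulated with a shared cluster shock, clusters are balanced, and the variable names are illustrative; the snippet is a sanity check of the formula, not a proposed implementation.

```r
set.seed(7)
n_clusters <- 50
cl  <- rep(seq_len(n_clusters), each = 8)               # balanced clusters (assumption)
n   <- length(cl)
psi <- matrix(rnorm(n * 2), ncol = 2) +                 # idiosyncratic part
       matrix(rnorm(n_clusters * 2), ncol = 2)[cl, ]    # positive within-cluster correlation

# Cluster-mean influence functions, one row per cluster.
psi_bar  <- rowsum(psi, cl) / as.vector(table(cl))

Sigma_cl <- crossprod(psi_bar) / n_clusters   # cluster-robust estimate of Sigma
vcov_cl  <- Sigma_cl / n_clusters             # Var(theta-hat) under clustering

vcov_iid <- crossprod(psi) / n / n            # analytical VCV that ignores clustering

se_ratio <- sqrt(diag(vcov_cl) / diag(vcov_iid))
```

With positively correlated units inside each cluster, `se_ratio` exceeds 1 for every estimate, which illustrates how the unclustered analytical VCV understates uncertainty.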
vcov.AGGTEobj
This is fully feasible — the influence functions are already stored on every AGGTEobj and the computation is identical in structure to V_analytical on MP.
What is stored
| Type | Per-estimate IFs | Overall IF |
|---|---|---|
| "dynamic" | dynamic.inf.func.e (n × K, one col per event time) | dynamic.inf.func (n × 1) |
| "group" | selective.inf.func.g (n × G, one col per cohort) | selective.inf.func (n × 1) |
| "calendar" | calendar.inf.func.t (n × T, one col per calendar period) | calendar.inf.func (n × 1) |
| "simple" | single estimate only; VCV reduces to a 1×1 scalar | simple.att (n × 1) |
Construction
# e.g. for dynamic type
IF <- object$inf.function$dynamic.inf.func.e  # n × K
n <- nobs(object)
V <- crossprod(IF) / n  # estimates Σ, the asymptotic variance of sqrt(n) * (θ̂ - θ)
V_theta <- V / n        # estimates Var(θ̂); this is what vcov() should return
Subtlety for "group" type
tidy.AGGTEobj for "group" includes an overall average row alongside the per-cohort rows. A complete (G+1)×(G+1) VCV requires stacking the overall IF with the per-cohort IFs: cbind(selective.inf.func, selective.inf.func.g).
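The stacking step can be sketched on simulated influence functions. Here the overall influence function is built as a fixed-weight combination of the per-cohort columns, which is a simplification made for illustration (the weights `w` are invented, and any contribution from estimating the weights is ignored).

```r
set.seed(3)
n <- 300
G <- 3
IF_g       <- matrix(rnorm(n * G), nrow = n, ncol = G)  # stands in for selective.inf.func.g
w          <- c(0.5, 0.3, 0.2)                          # hypothetical cohort weights
IF_overall <- IF_g %*% w                                # stands in for selective.inf.func (n x 1)

# Overall column first, then one column per cohort.
IF_all <- cbind(IF_overall, IF_g)
colnames(IF_all) <- c("overall", paste0("g", seq_len(G)))

V_full <- crossprod(IF_all) / n / n                     # (G+1) x (G+1) estimate of Var(theta-hat)
```

In this toy setup the overall column is an exact linear combination of the others, so `V_full` is rank-deficient; that is worth keeping in mind before inverting a stacked VCV in a joint test.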
Bootstrap and clustering
Same considerations as vcov.MP apply — bout$V and n_clusters need to be stored to support the clustered case correctly.
coef.AGGTEobj
Returns the named vector of aggregated estimates (att.egt, plus overall.att for "group" type):
coef.AGGTEobj <- function(object, ...) {
  # example for dynamic type
  x <- object$att.egt
  names(x) <- paste0("ATT(", object$egt, ")")
  x
}
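A version that covers the other aggregation types might look like the sketch below. The labels `"ATT(overall)"` and `"ATT(simple)"` are placeholder naming choices rather than an agreed convention, and the overall estimate is placed first so that the ordering matches cbind(selective.inf.func, selective.inf.func.g).

```r
# Sketch only; the overall/simple labels are hypothetical naming choices.
coef.AGGTEobj <- function(object, ...) {
  if (object$type == "simple") {
    return(c("ATT(simple)" = object$overall.att))
  }
  x <- object$att.egt
  names(x) <- paste0("ATT(", object$egt, ")")
  if (object$type == "group") {
    # Prepend the overall average so coef() lines up with the stacked IFs.
    x <- c("ATT(overall)" = object$overall.att, x)
  }
  x
}

# Hypothetical stand-in for a "group" aggregation (values are invented).
mock_group <- structure(
  list(type = "group", egt = c(2004, 2006),
       att.egt = c(-0.03, 0.01), overall.att = -0.012),
  class = "AGGTEobj"
)
est_group <- coef(mock_group)
```

Keeping the element order of coef() identical to the row and column order of vcov() is what lets generic tools pair each estimate with its variance.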
Why this matters
Once coef() and vcov() are available, users can directly use standard R tools for linear hypothesis testing — for example car::linearHypothesis() — without having to reconstruct these objects manually. This would also allow replicating and extending the existing Wald pre-test (mp$W) to arbitrary linear combinations of ATT estimates.
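To make the use case concrete, here is a base-R sketch of a Wald test of a linear restriction, using only a coefficient vector and a VCV of the kind coef() and vcov() would return. The numbers in `theta` and `V` are invented; the restriction tested is that the two pre-treatment estimates are jointly zero.

```r
# Hypothetical event-study estimates and their (diagonal, made-up) VCV.
theta <- c("ATT(-2)" = 0.01, "ATT(-1)" = -0.02, "ATT(0)" = 0.30)
V     <- diag(c(1e-4, 1e-4, 4e-4))
dimnames(V) <- list(names(theta), names(theta))

# H0: R %*% theta = r, here ATT(-2) = ATT(-1) = 0.
R <- rbind(c(1, 0, 0),
           c(0, 1, 0))
r <- c(0, 0)

d    <- R %*% theta - r
W    <- drop(t(d) %*% solve(R %*% V %*% t(R)) %*% d)     # Wald statistic
pval <- pchisq(W, df = nrow(R), lower.tail = FALSE)      # chi-squared p-value
```

With the S3 methods in place, `theta` and `V` would come from `coef(mp)` and `vcov(mp)`, and the restriction matrix could be any linear combination of ATT estimates.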