Add coef() and vcov() S3 methods for MP and AGGTEobj #254

@pedrohcgs


Overview

Building on the excellent proposal in #233, this issue tracks a correct and complete implementation of coef() and vcov() S3 methods for MP and AGGTEobj objects. Before implementing, several scaling and clustering questions need to be resolved.


coef.MP

Straightforward — just names the ATT(g,t) vector:

coef.MP <- function(object, ...) {
  x <- object$att
  names(x) <- paste0("ATT(", object$group, ",", object$t, ")")
  x
}
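
As a quick illustration of the naming scheme, the method can be exercised on a mock object. The object below is hand-built and carries only the three fields that coef.MP reads; it is not the output of a real att_gt() call, and the numbers are made up.

```r
# Same definition as above, repeated so the snippet is self-contained
coef.MP <- function(object, ...) {
  x <- object$att
  names(x) <- paste0("ATT(", object$group, ",", object$t, ")")
  x
}

# Hypothetical stand-in for an MP object (illustrative values only)
mp_mock <- structure(
  list(
    att   = c(0.12, 0.30, -0.05),
    group = c(2004, 2004, 2006),
    t     = c(2004, 2005, 2006)
  ),
  class = "MP"
)

est <- coef(mp_mock)  # dispatches to coef.MP via the stats::coef generic
print(est)
```

Each element is labelled by its (group, time) pair, so downstream tools can refer to individual ATT(g,t) estimates by name.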

vcov.MP

vcov() must return $\widehat{\text{Var}}(\hat\theta)$, the estimated variance matrix of the ATT(g,t) estimator itself.

Analytical case (bstrap = FALSE)

V_analytical, as stored on the MP object, estimates $\hat\Sigma = \frac{1}{n}\sum_i \psi_i\psi_i'$, the asymptotic variance of $\sqrt{n}(\hat\theta - \theta)$. Dividing by n gives the variance of $\hat\theta$ itself, so the correct vcov is:

vcov.MP <- function(object, ...) as.matrix(object$V_analytical) / object$n

This is consistent with the existing Wald pre-test (W = n * t(att) %*% solve(V) %*% att).
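
The consistency claim can be checked numerically. The sketch below uses simulated influence functions rather than a fitted model, so psi, att, and V are stand-ins for the corresponding pieces of an MP object; it only demonstrates that the Wald statistic built from V with the explicit factor n matches the one built from V / n.

```r
set.seed(1)
n <- 500
k <- 3

psi <- matrix(rnorm(n * k), n, k)        # simulated influence functions (n x k)
att <- colMeans(psi) + c(0.1, 0, -0.1)   # stand-in for the ATT(g,t) vector
V   <- crossprod(psi) / n                # plays the role of V_analytical
vc  <- V / n                             # what the proposed vcov.MP would return

# Wald statistic as in the existing pre-test (explicit factor n)
W_pretest <- n * drop(t(att) %*% solve(V) %*% att)

# The same statistic computed from the vcov()-scaled matrix
W_vcov <- drop(t(att) %*% solve(vc) %*% att)

# Standard errors follow directly from the diagonal
se <- sqrt(diag(vc))
```

Because solve(V / n) equals n * solve(V), the two statistics agree up to floating-point error.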

Bootstrap case (bstrap = TRUE, no extra clustering)

mboot() scales bootstrap draws by sqrt(n), so cov(bres) also estimates $\hat\Sigma$, and the same division by n applies.

Bootstrap case with clustervars

mboot() aggregates influence functions to cluster means and scales by sqrt(n_clusters), so cov(bres_cl) estimates $n_{cl} \cdot \text{Var}(\hat\theta)$. The correct vcov is cov(bres_cl) / n_clusters. This is consistent with how mboot already computes se = bSigma / sqrt(n_clusters).

Changes needed for bootstrap VCV

bout$V is already computed inside mboot() but currently discarded. n_clusters is also computed inside mboot() but not returned. Both need to be threaded through to the MP object. Memory cost is negligible — same size as V_analytical which is already stored.

Correctness guard under clustering

V_analytical ignores correlation between units within the same cluster and is known to be anti-conservative when clustervars is set; the Wald pre-test already suppresses itself in this case. vcov.MP must apply the same guard: warn (or error) when clustervars is set and no bootstrap VCV is available.
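
One possible shape for the guard is sketched below. The field names V_boot and n_clusters are hypothetical (they stand for the values that would be threaded through from mboot()), and reading the clustering setting from object$DIDparams$clustervars is an assumption about where it lives on the MP object. Whether to warn or stop is left open above; this sketch warns.

```r
# Sketch only: V_boot and n_clusters are hypothetical fields on the MP object
vcov.MP <- function(object, ...) {
  clustered <- !is.null(object$DIDparams$clustervars)

  if (!is.null(object$V_boot)) {
    # Bootstrap VCV: divide by the number of clusters when clustered, else by n
    denom <- if (clustered) object$n_clusters else object$n
    return(as.matrix(object$V_boot) / denom)
  }

  if (clustered) {
    warning("clustervars is set but no bootstrap VCV is stored; ",
            "V_analytical does not account for clustering", call. = FALSE)
  }
  as.matrix(object$V_analytical) / object$n
}

# Hand-built mock objects (not real att_gt() output)
mp_plain <- structure(
  list(V_analytical = diag(2), n = 100, DIDparams = list(clustervars = NULL)),
  class = "MP"
)
mp_clustered_boot <- structure(
  list(V_boot = diag(2) * 4, n = 100, n_clusters = 20,
       DIDparams = list(clustervars = "state")),
  class = "MP"
)
mp_clustered_noboot <- structure(
  list(V_analytical = diag(2), n = 100, DIDparams = list(clustervars = "state")),
  class = "MP"
)

v_plain <- vcov(mp_plain)
v_boot  <- vcov(mp_clustered_boot)
warned  <- tryCatch({ vcov(mp_clustered_noboot); FALSE },
                    warning = function(w) TRUE)
```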

Analytical cluster-robust VCV

Currently the package requires bstrap = TRUE whenever clustervars is specified. In principle, an analytical cluster-robust sandwich estimator is available:

$$\hat\Sigma_{cl} = \frac{1}{n_{cl}} \sum_c \bar\psi_c \bar\psi_c'$$

where $\bar\psi_c = \frac{1}{n_c}\sum_{i \in c} \psi_i$ is the cluster-mean influence function — already computed inside mboot() as cluster_mean_if. Exposing this analytically would allow vcov() to be correct under clustering without bootstrap, and would remove the forced-bootstrap requirement for clustered SEs.
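
A minimal sketch of this estimator on simulated data follows. The influence functions, the cluster assignment, and the equal cluster sizes are all assumptions made for illustration; nothing here calls mboot() or any package code.

```r
set.seed(42)
n_cl <- 50                 # number of clusters
m    <- 4                  # units per cluster (equal sizes, for simplicity)
n    <- n_cl * m
k    <- 2                  # number of ATT(g,t) parameters

cluster <- rep(seq_len(n_cl), each = m)

# A shared cluster-level shock induces correlation within clusters
shock <- matrix(rnorm(n_cl * k), n_cl, k)[cluster, ]
psi   <- shock + matrix(rnorm(n * k), n, k)

# Cluster-mean influence functions (n_cl x k)
psi_bar <- rowsum(psi, cluster) / as.vector(table(cluster))

Sigma_cl <- crossprod(psi_bar) / n_cl   # the estimator in the display above
vcov_cl  <- Sigma_cl / n_cl             # estimates Var(theta-hat) under clustering

# For comparison: the unclustered analogue, V_analytical / n
vcov_iid <- crossprod(psi) / n / n
```

Comparing diag(vcov_cl) with diag(vcov_iid) shows how much the unclustered matrix understates the variance when units within a cluster are positively correlated.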


vcov.AGGTEobj

This is fully feasible — the influence functions are already stored on every AGGTEobj and the computation is identical in structure to V_analytical on MP.

What is stored

| Type | Per-estimate IFs | Overall IF |
| --- | --- | --- |
| "dynamic" | dynamic.inf.func.e (n × K, one col per event time) | dynamic.inf.func (n × 1) |
| "group" | selective.inf.func.g (n × G, one col per cohort) | selective.inf.func (n × 1) |
| "calendar" | calendar.inf.func.t (n × T, one col per calendar period) | calendar.inf.func (n × 1) |
| "simple" | single estimate only; VCV reduces to a 1×1 scalar | simple.att (n × 1) |

Construction

# e.g. for dynamic type
IF <- object$inf.function$dynamic.inf.func.e   # n × K
n  <- nobs(object)
V  <- crossprod(IF) / n    # t(IF) %*% IF / n; estimates Σ (asymptotic variance of sqrt(n)*θ̂)
vc <- V / n                # estimates Var(θ̂); named vc to avoid masking stats::vcov

Subtlety for "group" type

tidy.AGGTEobj for "group" includes an overall average row alongside the per-cohort rows. A complete (G+1)×(G+1) VCV requires stacking the overall IF with the per-cohort IFs: cbind(selective.inf.func, selective.inf.func.g).
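
The stacking step can be sketched with simulated influence functions. IF_g and IF_overall below are stand-ins for selective.inf.func.g and selective.inf.func, and the fixed weights used to build the overall column are an assumption made only so the example is self-contained.

```r
set.seed(7)
n <- 300
G <- 3
w <- c(0.5, 0.3, 0.2)                         # illustrative cohort weights

IF_g       <- matrix(rnorm(n * G), n, G)      # stand-in for selective.inf.func.g
IF_overall <- IF_g %*% w                      # stand-in for selective.inf.func

IF_all <- cbind(IF_overall, IF_g)             # n x (G + 1), overall row first
vc     <- crossprod(IF_all) / n / n           # (G + 1) x (G + 1) estimate of Var
dimnames(vc) <- rep(list(c("overall", paste0("g", seq_len(G)))), 2)

# In this mock the overall IF is an exact linear combination of the cohort IFs,
# so its variance can be recovered from the cohort block
v_overall_check <- drop(t(w) %*% vc[-1, -1] %*% w)
```

In this mock the stacked matrix is singular because the overall column is an exact linear combination of the others. Whether the real stacked VCV is close to singular depends on how the overall influence function is constructed, so a joint test over all G+1 rows should be approached with care.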

Bootstrap and clustering

Same considerations as vcov.MP apply — bout$V and n_clusters need to be stored to support the clustered case correctly.


coef.AGGTEobj

Returns the named vector of aggregated estimates (att.egt, plus overall.att for "group" type):

coef.AGGTEobj <- function(object, ...) {
  # example for dynamic type
  x <- object$att.egt
  names(x) <- paste0("ATT(", object$egt, ")")
  x
}
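
A sketch of how the method might cover all four aggregation types is given below. It assumes the object exposes type, egt, att.egt, and overall.att, that "simple" carries only overall.att, and that the overall estimate is placed first for "group" so the ordering matches cbind(selective.inf.func, selective.inf.func.g). The labels "ATT" and "ATT(overall)" are illustrative choices, not an agreed naming scheme.

```r
# Sketch only: labels and ordering are assumptions, see the note above
coef.AGGTEobj <- function(object, ...) {
  if (object$type == "simple") {
    return(c("ATT" = object$overall.att))
  }
  x <- object$att.egt
  names(x) <- paste0("ATT(", object$egt, ")")
  if (object$type == "group") {
    # Overall average first, to line up with the stacked influence functions
    x <- c("ATT(overall)" = object$overall.att, x)
  }
  x
}

# Hand-built mock objects (not real aggte() output)
agg_dyn <- structure(
  list(type = "dynamic", egt = c(-1, 0, 1), att.egt = c(0.01, 0.20, 0.35),
       overall.att = 0.275),
  class = "AGGTEobj"
)
agg_grp <- structure(
  list(type = "group", egt = c(2004, 2006), att.egt = c(0.25, 0.10),
       overall.att = 0.19),
  class = "AGGTEobj"
)

b_dyn <- coef(agg_dyn)
b_grp <- coef(agg_grp)
```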

Why this matters

Once coef() and vcov() are available, users can directly use standard R tools for linear hypothesis testing — for example car::linearHypothesis() — without having to reconstruct these objects manually. This would also allow replicating and extending the existing Wald pre-test (mp$W) to arbitrary linear combinations of ATT estimates.
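
To make the use case concrete, here is a base-R sketch of a Wald test of R b = r built only from a coefficient vector and its variance matrix, which is the computation that tools such as car::linearHypothesis() perform for a chi-square test. The helper wald_test is hypothetical, and b and V are made-up numbers standing in for coef() and vcov() output.

```r
# Hypothetical helper: Wald test of the linear restriction R %*% b = r
wald_test <- function(b, V, R, r = rep(0, nrow(R))) {
  d  <- R %*% b - r
  W  <- drop(t(d) %*% solve(R %*% V %*% t(R)) %*% d)
  df <- nrow(R)
  c(statistic = W, df = df, p.value = pchisq(W, df, lower.tail = FALSE))
}

# Made-up event-study estimates: two pre-periods and one post-period
b <- c(pre2 = 0.02, pre1 = -0.01, post0 = 0.30)
V <- diag(c(0.01, 0.01, 0.04))

# Joint null: both pre-period effects are zero
R <- rbind(c(1, 0, 0),
           c(0, 1, 0))

out <- wald_test(b, V, R)
print(out)
```

With coef() and vcov() in place, b and V would come straight from the fitted object, and R could encode any linear combination of ATT estimates.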

Labels: enhancement (New feature or request)
