add QTE support for covariate adaptive randomization #107
Implement predict_qte in SimpleStratifiedDistributionEstimator and AdjustedStratifiedDistributionEstimator with stratified bootstrap (resampling within each stratum independently) to correctly estimate variance under CAR designs.
Pull request overview
Adds CAR-compatible QTE inference by overriding predict_qte in the stratified estimators to use stratified bootstrap (resampling independently within strata) for variance estimation.
Changes:
- Implement predict_qte for SimpleStratifiedDistributionEstimator using stratified bootstrap.
- Implement predict_qte for AdjustedStratifiedDistributionEstimator using stratified bootstrap.
- Add required imports (Optional, norm) to support the new QTE CI computation.
    # Stratified bootstrap: resample within each stratum independently
    bootstrap_indexes = np.concatenate([
        np.random.choice(idx, size=len(idx), replace=True)
        for idx in strata_indices.values()
    ])

    qtes[b] = self._compute_qtes(
        target_treatment_arm,
        control_treatment_arm,
        quantiles,
        self.covariates[bootstrap_indexes],
        self.treatment_arms[bootstrap_indexes],
        self.outcomes[bootstrap_indexes],
        self.strata[bootstrap_indexes],
    )
This PR changes QTE variance estimation under CAR by using stratified bootstrap, but the existing unit tests only assert shapes and basic ordering. Please add a test that would fail if bootstrapping were not stratified (e.g., assert each bootstrap replicate preserves per-stratum sample counts, or compare variance vs an unstratified bootstrap on an imbalanced-strata synthetic dataset).
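A minimal sketch of the first suggested check (per-stratum counts are preserved in every replicate) could look like the following. The helper `stratified_bootstrap_indices` and both test names are hypothetical and stand in for the resampling logic in the diff; the sketch also uses `np.random.default_rng` rather than the global `np.random` state used in the PR, purely so the test is deterministic.

```python
import numpy as np


def stratified_bootstrap_indices(strata, rng):
    """Hypothetical helper: resample row indices with replacement within each stratum."""
    strata = np.asarray(strata)
    groups = [np.flatnonzero(strata == s) for s in np.unique(strata)]
    return np.concatenate([rng.choice(g, size=len(g), replace=True) for g in groups])


def test_bootstrap_preserves_stratum_counts():
    rng = np.random.default_rng(0)
    # Deliberately imbalanced strata: 5, 50 and 500 observations.
    strata = np.repeat([0, 1, 2], [5, 50, 500])
    expected = np.bincount(strata, minlength=3)
    for _ in range(200):
        idx = stratified_bootstrap_indices(strata, rng)
        # Every replicate must keep the original per-stratum sample sizes.
        assert np.array_equal(np.bincount(strata[idx], minlength=3), expected)


def test_unstratified_bootstrap_breaks_stratum_counts():
    rng = np.random.default_rng(0)
    strata = np.repeat([0, 1, 2], [5, 50, 500])
    expected = np.bincount(strata, minlength=3)
    n = len(strata)
    mismatches = 0
    for _ in range(200):
        idx = rng.choice(n, size=n, replace=True)  # plain (unstratified) bootstrap
        if not np.array_equal(np.bincount(strata[idx], minlength=3), expected):
            mismatches += 1
    # With strata this imbalanced, plain resampling does not reproduce the counts.
    assert mismatches > 0
```

The second test documents why the first one is meaningful: the same assertion fails as soon as the resampling ignores strata.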
        quantiles: Optional[np.ndarray] = None,
        alpha: float = 0.05,
        n_bootstrap=500,
        display_progress: bool = True,
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Compute Quantile Treatment Effects (QTE) using stratified bootstrap.

        Uses stratified bootstrap (resampling independently within each stratum) to
        correctly estimate variance under covariate adaptive randomization (CAR).

        Args:
            target_treatment_arm (int): The index of the treatment arm of the treatment group.
            control_treatment_arm (int): The index of the treatment arm of the control group.
            quantiles (np.ndarray, optional): Quantiles used for QTE. Defaults to [0.1, 0.2, ..., 0.9].
            alpha (float, optional): Significance level of the confidence bound. Defaults to 0.05.
            n_bootstrap (int, optional): Number of bootstrap samples. Defaults to 500.
            display_progress (bool, optional): Whether to display a progress bar. Defaults to True.

        Returns:
            Tuple[np.ndarray, np.ndarray, np.ndarray]: A tuple containing:
                - Expected QTEs (np.ndarray): Treatment effect estimates at each quantile
                - Lower bounds (np.ndarray): Lower confidence interval bounds
                - Upper bounds (np.ndarray): Upper confidence interval bounds
        """
        qte = self._compute_qtes(
            target_treatment_arm,
            control_treatment_arm,
            quantiles,
            self.covariates,
            self.treatment_arms,
            self.outcomes,
            self.strata,
        )
quantiles is optional in the signature and docstring, but it is passed directly into _compute_qtes, which expects an ndarray and will break on None. Please initialize the default quantile grid when quantiles is None (and validate range/order).
    qte = self._compute_qtes(
        target_treatment_arm,
        control_treatment_arm,
        quantiles,
        self.covariates,
        self.treatment_arms,
        self.outcomes,
        self.strata,
    )
AdjustedStratifiedDistributionEstimator._compute_cumulative_distribution draws a fresh random fold assignment on each call (folds = np.random.randint(...)). Because predict_qte calls _compute_qtes many times inside the bootstrap, the resulting CI will include extra Monte Carlo noise from re-randomizing folds, not just resampling variability. Consider fixing folds once (e.g., store them at fit time or accept a random_state and reuse a RNG/seed) so bootstrap variance reflects sampling uncertainty only.
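One way to read this suggestion is sketched below: draw a fold label per original observation once, then look the labels up through the bootstrap indices instead of re-drawing them on every call. The class `FixedFolds`, its `for_sample` method, and the `random_state` and `n_folds` parameters are hypothetical names for illustration and are not part of the estimator's current API.

```python
import numpy as np


class FixedFolds:
    """Hypothetical helper: assign each observation to a fold once and reuse it."""

    def __init__(self, n_obs, n_folds, random_state=None):
        rng = np.random.default_rng(random_state)
        # One fold label per ORIGINAL observation, drawn a single time.
        self.labels = rng.integers(0, n_folds, size=n_obs)

    def for_sample(self, indexes):
        """Return fold labels for a (possibly resampled) set of row indexes."""
        return self.labels[np.asarray(indexes)]


# Usage: the same observation always lands in the same fold, in every replicate.
folds = FixedFolds(n_obs=10, n_folds=2, random_state=0)
bootstrap_indexes = np.array([3, 3, 7, 0, 9])
boot_folds = folds.for_sample(bootstrap_indexes)
```

Under this design, duplicated rows in a bootstrap replicate share a fold, so a resampled copy of an observation never sits on both the training and the evaluation side of the cross-fitting split.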
        quantiles: Optional[np.ndarray] = None,
        alpha: float = 0.05,
        n_bootstrap=500,
        display_progress: bool = True,
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Compute Quantile Treatment Effects (QTE) using stratified bootstrap.

        Uses stratified bootstrap (resampling independently within each stratum) to
        correctly estimate variance under covariate adaptive randomization (CAR).

        Args:
            target_treatment_arm (int): The index of the treatment arm of the treatment group.
            control_treatment_arm (int): The index of the treatment arm of the control group.
            quantiles (np.ndarray, optional): Quantiles used for QTE. Defaults to [0.1, 0.2, ..., 0.9].
            alpha (float, optional): Significance level of the confidence bound. Defaults to 0.05.
            n_bootstrap (int, optional): Number of bootstrap samples. Defaults to 500.
            display_progress (bool, optional): Whether to display a progress bar. Defaults to True.

        Returns:
            Tuple[np.ndarray, np.ndarray, np.ndarray]: A tuple containing:
                - Expected QTEs (np.ndarray): Treatment effect estimates at each quantile
                - Lower bounds (np.ndarray): Lower confidence interval bounds
                - Upper bounds (np.ndarray): Upper confidence interval bounds
        """
        qte = self._compute_qtes(
            target_treatment_arm,
            control_treatment_arm,
            quantiles,
            self.covariates,
            self.treatment_arms,
            self.outcomes,
            self.strata,
        )
quantiles is documented as optional with a default ([0.1, …, 0.9]) but it’s passed straight into _compute_qtes. If the caller leaves quantiles=None, _compute_qtes will error when accessing quantiles.shape. Please set a default array when quantiles is None (and ideally validate they’re in (0,1)).
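A possible shape for that default-and-validate step is sketched here. The function name `resolve_quantiles` is hypothetical; the default grid follows the docstring ([0.1, 0.2, ..., 0.9]), and the strictly-increasing check is an assumption about what "order" validation should mean.

```python
from typing import Optional

import numpy as np


def resolve_quantiles(quantiles: Optional[np.ndarray] = None) -> np.ndarray:
    """Hypothetical helper: fill in the documented default grid and validate input."""
    if quantiles is None:
        # Documented default: 0.1, 0.2, ..., 0.9
        return np.linspace(0.1, 0.9, 9)
    q = np.asarray(quantiles, dtype=float)
    if q.ndim != 1 or q.size == 0:
        raise ValueError("quantiles must be a non-empty 1-D array")
    # Written as a positive check so that NaN values are rejected as well.
    if not np.all((q > 0) & (q < 1)):
        raise ValueError("quantiles must lie strictly between 0 and 1")
    if np.any(np.diff(q) <= 0):
        raise ValueError("quantiles must be strictly increasing")
    return q
```

Calling `quantiles = resolve_quantiles(quantiles)` at the top of `predict_qte` would keep the `None` handling in one place, so `_compute_qtes` always receives an ndarray.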
    # Precompute stratum indices for stratified bootstrap
    unique_strata = np.unique(self.strata)
    strata_indices = {s: np.where(self.strata == s)[0] for s in unique_strata}

    qtes = np.zeros((n_bootstrap, qte.shape[0]))
    bootstrap_iter = range(n_bootstrap)
    if display_progress:
        bootstrap_iter = tqdm(bootstrap_iter, desc="Bootstrap QTE")
    for b in bootstrap_iter:
        # Stratified bootstrap: resample within each stratum independently
        bootstrap_indexes = np.concatenate([
            np.random.choice(idx, size=len(idx), replace=True)
            for idx in strata_indices.values()
        ])
The stratified-bootstrap implementation here is duplicated verbatim in both stratified estimator classes. Consider extracting it into a shared private helper (or into DistributionEstimatorBase) to avoid divergence/bugs when one implementation is updated and the other isn’t.
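As a rough illustration of such a shared helper, the sketch below takes a callable that maps row indexes to a statistic and wraps the stratified resampling loop around it. The function name `stratified_bootstrap_ci`, its signature, and the normal-approximation interval (point estimate ± z × bootstrap standard error) are assumptions, since the PR's actual CI computation is not shown in the excerpts. It uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm` only to keep the example dependency-light.

```python
from statistics import NormalDist
from typing import Callable, Tuple

import numpy as np


def stratified_bootstrap_ci(
    strata: np.ndarray,
    statistic: Callable[[np.ndarray], np.ndarray],
    alpha: float = 0.05,
    n_bootstrap: int = 500,
    random_state=None,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Hypothetical shared helper: point estimate plus normal-approximation CI."""
    strata = np.asarray(strata)
    rng = np.random.default_rng(random_state)
    # Precompute the row indexes belonging to each stratum.
    groups = [np.flatnonzero(strata == s) for s in np.unique(strata)]

    # Point estimate on the full sample; `statistic` must return a 1-D array.
    point = np.asarray(statistic(np.arange(len(strata))), dtype=float)
    reps = np.empty((n_bootstrap, point.shape[0]))
    for b in range(n_bootstrap):
        # Resample with replacement independently within each stratum.
        idx = np.concatenate([rng.choice(g, size=len(g), replace=True) for g in groups])
        reps[b] = statistic(idx)

    se = reps.std(axis=0, ddof=1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return point, point - z * se, point + z * se


# Usage on synthetic data: median of an outcome, resampled within two strata.
demo_rng = np.random.default_rng(0)
demo_strata = np.repeat([0, 1], [40, 160])
demo_outcomes = demo_rng.normal(size=200) + demo_strata
est, lower, upper = stratified_bootstrap_ci(
    demo_strata,
    lambda idx: np.array([np.median(demo_outcomes[idx])]),
    n_bootstrap=200,
    random_state=1,
)
```

Each estimator class could then pass a closure over its own `_compute_qtes` call as `statistic`, leaving a single copy of the resampling loop to maintain.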
close #64