The TestMechs package implements the methodology from the paper
“Testing
Mechanisms”
by Soonwoo Kwon and Jonathan Roth. The package provides tests for the
“sharp null of full mediation”, which conjectures that the effect of a
treatment operates through a particular conjectured mechanism (or set of
mechanisms) M. It also provides lower bounds on the fraction of
“always-takers” who are affected by the treatment despite having the
same value of M regardless of treatment status. All core functions in
the package — test_sharp_null(), lb_frac_affected(), and
partial_density_plot() — support conditional random assignment and
accept a reg_formula argument. When you provide a formula, the package
regression-adjusts the relevant partial probabilities using OLS or IV.
Note that the approach in the paper requires the mediator
You can install the development version of TestMechs from GitHub with:
# Install devtools if not already installed
install.packages("devtools")
# Install package
devtools::install_github("jonathandroth/TestMechs")We illustrate how the package can be used by walking through how the
code can be applied to the application of Baranov et al. (2020) in
Section 5.2 of “Testing
Mechanisms”.
In Baranov et al. (2020),
We start with loading the required packages and the data.
# Load TestMechs
library(TestMechs)
# Load other packages that are required to run the example
library(dplyr)
library(ggplot2)
library(haven)
# Load data
data("baranov_data")
# Restrict to the experimental sample
mother_data <- mother_data %>% filter(THP_sample == 1)We begin with the case where
The main functions we will be using are: 1) partial_density_plot() to
plot the partial densities to visually detect potential violations of
the sharp null, 2) test_sharp_null() to conduct a statistical test
for the sharp null and 3) lb_frac_affected() to compute the sharp
lower bound for the fraction of always-takers (or never-takers) that are
affected by treatment. Each function can be adjusted for conditional
random assignment by providing a reg_formula if needed (e.g., adding
baseline covariates or instruments). While this example covers the basic
usage of these functions, please refer to the documentation of each
function (e.g., ?test_sharp_null) for a more detailed description.
We first provide graphical evidence using a partial density plot using
the function partial_density_plot(). When
nt_plot <-
partial_density_plot(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
num_Ybins = 5,
plot_nts = T,
density_1_label = "Prob. In Treated Group (P(Y,M=0|D=1))",
density_0_label = "Prob. In Control Group (P(Y,M=0|D=0))") +
ylab("Probability") +
xlab("Event: No Grandmother Present (M=0) and Y in Stated Range")
nt_plot +
annotate(geom = "text",
x = 5,y=0.15,
label = "Higher prob. in \n treated group, \n violating sharp null", size = 3) +
geom_segment(x=4.7,y=0.17,
xend = 4.5, yend = 0.175,
arrow = arrow(length = unit(0.03,"npc")),
color = "black", show.legend = F) +
geom_segment(x=5,y=0.13,
xend = 5, yend = 0.10,
arrow = arrow(length = unit(0.03,"npc")),
color = "black", show.legend = F) +
theme(legend.position="bottom",
legend.title = element_blank())This figure shows estimates of
The argument plot_nts = T tells the package to make a plot showing the
inequalities corresponding to there being no treatment effect for the
“never-takers” (i.e. individuals with plot_nts is set to F, we get a similar plot for the
always-takers, which checks whether
at_plot <-
partial_density_plot(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
num_Ybins = 5,
plot_nts = F,
density_1_label = "Prob. In Treated Group (P(Y,M=1|D=1))",
density_0_label = "Prob. In Control Group (P(Y,M=1|D=0))") +
ylab("Probability") +
xlab("Event: Grandmother Present (M=1) and Y in Stated Range")
at_plot +
theme(legend.position="bottom",
legend.title = element_blank())While the figure above hints at a possible violation of the sharp
null, it does not come with any uncertainty quantification. The
function test_sharp_null() conducts statistical inference of the
sharp null using the method described in Section 4 of the paper. The
following snippet runs this test based on the test proposed in Cox and
Shi (2023), which is our recommended approach for most applications. The
package supports using the tests provided by Andrews, Roth, and Pakes
(2023) and Fang, Santos, Shaikh, and Torgovitsky (2023); these methods
can be specified by changing the method argument from "CS" to
"ARP" or "FSST". When M is binary, as in our example here, one can
also use the test from Kitagawa (2015) by setting method = "toru".
test_result <- test_sharp_null(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
method = "CS", #use Cox and Shi test
num_Ybins = 5, #discretize using 5 bins
cluster = "uc") #cluster SEs at uc level
#> Loading required package: lpinfer
test_result$pval
#> [,1]
#> [1,] 0.02283916The test gives a p-value of 0.023, and thus the sharp null is rejected
at the 5% significance level. Here, the p-value corresponds to the
smallest value of num_Ybins = 5. Currently, the function discretizes num_Ybins value is not provided but
The test above suggests that the treatment effect does not operate
entirely through the presence of a grandmother in the home. There are
some people (never-takers) whose outcome is affected by the treatment
despite having no change in lb_frac_affected computes a point estimate of this lower bound. The
argument at_group = 0 corresponds to computing this lower bound for
the never-takers, who are referred to as “0-always takers” in the more
general notation in the paper.
lb_nts <- lb_frac_affected(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
num_Ybins = 5,
at_group = 0)
lb_nts
#> [1] 0.1858912Our estimates of the lower bound imply that at least 19 percent of
never-takers are affected by the treatment. One could likewise test the
fraction of never-takers affected by setting at_group = 1 (in this
case, the lower bound is zero). If at_group is set to NULL, then the
package calculates the fraction pooling across all types that have the
same value of
By default, TestMechs imposes the monotonicity assumption that the
treatment can only increase the value of max_defiers_share parameter to be non-zero, which bounds the number of
“defiers” by max_defiers_share.
We rerun the test above with max_defiers_share = .01, which allows one
percent of the population to be defiers.
test_result_defiers <- test_sharp_null(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
method = "CS",
num_Ybins = 5,
cluster = "uc",
max_defiers_share = .01)
test_result_defiers$pval
#> [,1]
#> [1,] 0.04630939The p-value increases to 0.046, so the test rejects the sharp null even if you allow one percent of the population to be defiers. (Allowing for larger shares of defiers will eventually lead to an insignificant result.)
Likewise, we can also calculate the lower bound on the fraction of never-takers under this relaxed monotonicity.
lb_nts_defiers <- lb_frac_affected(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
num_Ybins = 5,
at_group = 0,
max_defiers_share = .01)
lb_nts_defiers
#> [1] 0.1716415Our estimates of the lower bound imply that at least 17 percent of never-takers are affected by the treatment when we allow one percent of the population to be defiers.
We next turn to the setting where we are interested in testing whether the effect is mediated by relationship quality with the husband, which is measured on a 1-5 scale. We can again test the sharp null and estimate a lower bound on the fraction affected.
test_sharp_null(df = mother_data,
d = "treat",
m = "relationship_husb",
y = "motherfinancial",
num_Ybins = 5,
method = "CS",
cluster = "uc")$pval
#> [,1]
#> [1,] 0.02838332Again, we reject the sharp null that all the treatment effect goes through the relationship quality with the husband.
We can also estimate a lower bound on the fraction of always-takers:
lb_frac_affected(df = mother_data,
d = "treat",
m = "relationship_husb",
y = "motherfinancial",
num_Ybins = 5,
at_group = NULL,
allow_min_defiers = TRUE)
#> [1] 0.1002207Here, the parameter at_group = NULL asks the function compute the
lower bound on the fraction of always-takers, pooled across different
allow_min_defiers = TRUE
calculates the lower bound allowing for the minimum number of defiers
consistent with the empirical distribution — see footnote 25 of the
paper for details.)
We note that while the method works with multi-valued discrete mediators
(such as our 1-5 score), we generally expect the power of the test to
decrease as one approaches an approximately continuous mediator. See the
discussion in Remarks 2 and 3 of the paper regarding power and
discretization of
Next, we test the null hypothesis that the treatment effect is explained
by the combination of the two mechanisms. This is done by passing a
vector of variables names for the m argument.
test_result_both <- test_sharp_null(df = mother_data,
d = "treat",
m = c("relationship_husb",
"grandmother"),
y = "motherfinancial",
num_Ybins = 5,
method = "CS",
cluster = "uc")
test_result_both$pval
#> [,1]
#> [1,] 0.6540863With a p-value of 0.654, we cannot reject the sharp null that the combination of presence of grandmother and relationship quality with husband fully explain the treatment effect.
Again, we can estimate a lower bound on the fraction of those affected
by treatment, pooled across different
lb_frac_both <- lb_frac_affected(df = mother_data,
d = "treat",
m = c("relationship_husb",
"grandmother"),
y = "motherfinancial",
num_Ybins = 5,
allow_min_defiers = TRUE)
lb_frac_both
#> [1] 0.07251284We estimate a lower bound of 7 percent, although this does not appear to be statistically significant given the test result above.
The examples above focus on data from a randomized controlled trial,
where treatment reg_formula argument,
which allows the researcher to provide a regression formula (or IV
formula) to estimate treatment effects after adjusting linearly for
observable characteristics. The reg_formula argument is passed to the
fixest package to estimate treatment effects, and thus accommodates
fixest functionality, including IV and high-dimensional fixed effects.
While adjusting for covariates is not necessary in our running example,
which is an RCT, we can still adjust for covariates to increase
precision. Below we show how this is done using the baseline covariates
age_baseline, edu_mo_baseline, and wealth_baseline. For brevity,
we focus on testing the sharp null using covariates using the function
test_sharp_null(), although the reg_formula result can analogously
be passed to the functions partial_density_plot() and
lb_frac_affected().
test_result_gm_ols <- test_sharp_null(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
reg_formula = "~ treat + age_baseline + edu_mo_baseline + wealth_baseline",
method = "CS",
num_Ybins = 5,
cluster = "uc")
#> Loading required package: lpinfer
test_result_gm_ols$pval
#> [,1]
#> [1,] 0.03105106The key new argument is
reg_formula = "~ treat + age_baseline + edu_mo_baseline + wealth_baseline".
This says that we should estimate the effect of the treatment (treat)
on
We can also use reg_formula to estimate treatment effects using
instrumental variables. To illustrate how this works, we create an
instrumental variable iv equal to the true treatment variable plus
noise. We now modify the reg_formula argument to use iv as an
instrument for treat.
set.seed(0)
mother_data$iv <- mother_data$treat + rnorm(n = length(mother_data$treat), sd = 0.1) #iv = treat + noise
test_result_gm_iv <- test_sharp_null(df = mother_data,
d = "treat",
m = "grandmother",
y = "motherfinancial",
reg_formula = "~ age_baseline + edu_mo_baseline + wealth_baseline | treat ~ iv", #iv spec
method = "CS",
num_Ybins = 5,
cluster = "uc")
test_result_gm_iv$pval
#> [,1]
#> [1,] 0.0311524
