moderndid.ddd_mp_rc
- moderndid.ddd_mp_rc(data, y_col, time_col, id_col, group_col, partition_col, covariate_cols=None, control_group='nevertreated', base_period='universal', est_method='dr', boot=False, biters=1000, cband=False, cluster=None, alpha=0.05, trim_level=0.995, random_state=None, n_jobs=1)
Compute the multi-period doubly robust DDD estimator for the ATT with repeated cross-section data.
Implements the multi-period triple difference-in-differences estimator from [1] for repeated cross-section data with staggered treatment adoption. Unlike with panel data, a different sample of units is observed in each period.
The target parameters are the group-time average treatment effects
\[ATT(g, t) = \mathbb{E}[Y_t(g) - Y_t(\infty) \mid S=g, Q=1]\]
for all treatment cohorts \(g \in \mathcal{G}_{\mathrm{trt}}\) and time periods \(t \in \{2, \ldots, T\}\) such that \(t \geq g\).
For each \((g, t)\) cell, the estimator compares outcomes at time \(t\) to a base period. With base_period="universal", all comparisons use period \(g-1\) (the last pre-treatment period for cohort \(g\)). With base_period="varying", each comparison uses period \(t-1\).
For repeated cross-sections, the estimator follows the approach of [2], extending the DDD framework from [1]. Unlike the panel case, where outcomes are differenced within units, the RCS estimator fits separate outcome regression models for the target period \(t\) and the base period for each subgroup.
When multiple comparison groups are available (not-yet-treated setting), the estimator combines them using optimal GMM weights (Equation 4.11 from [1])
\[\widehat{w}_{\mathrm{gmm}}^{g,t} = \frac{\widehat{\Omega}_{g,t}^{-1} \mathbf{1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}}\]
where \(\widehat{\Omega}_{g,t}\) is the covariance matrix of \(\widehat{ATT}_{\mathrm{dr},g_c}(g,t)\) across comparison groups. The GMM estimator (Equation 4.12 from [1]) is then
\[\widehat{ATT}_{\mathrm{dr,gmm}}(g,t) = \frac{\mathbf{1}' \widehat{\Omega}_{g,t}^{-1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}} \widehat{ATT}_{\mathrm{dr}}(g,t).\]
- Parameters:
- data
DataFrame Repeated cross-section data in long format with columns for outcome, time, observation id, treatment group, and partition.
- y_col
str Name of the outcome variable column.
- time_col
str Name of the time period column.
- id_col
str Name of the observation identifier column. For RCS, this can be a row index since units are not tracked across periods.
- group_col
str Name of the treatment group column (the first period in which treatment is enabled). Use 0 or np.inf for never-treated units.
- partition_col
str Name of the partition/eligibility column (1 = eligible, 0 = ineligible).
- covariate_cols
list of str or None, default None Names of covariate columns in the data. If None, uses intercept only.
- control_group
{"nevertreated", "notyettreated"}, default "nevertreated" Which units to use as controls. With "notyettreated", multiple comparison groups may be available, triggering GMM aggregation.
- base_period
{"universal", "varying"}, default "universal" Base period selection. "universal" uses period g-1 as baseline for all comparisons; "varying" uses period t-1 for each t.
- est_method
{"dr", "reg", "ipw"}, default "dr" Estimation method for each 2-period comparison.
- boot
bool, default False Whether to use multiplier bootstrap for inference.
- biters
int, default 1000 Number of bootstrap repetitions (only used if boot=True).
- cband
bool, default False Whether to compute uniform confidence bands (only used if boot=True).
- cluster
str or None, default None Name of the column containing cluster identifiers for clustered standard errors. If provided, the bootstrap resamples at the cluster level (only used if boot=True).
- alpha
float, default 0.05 Significance level for confidence intervals.
- trim_level
float, default 0.995 Trimming level for propensity scores.
- random_state
int, Generator, or None, default None Controls random number generation for bootstrap reproducibility.
- n_jobs
int, default 1 Number of parallel jobs for group-time estimation. 1 = sequential (default), -1 = all cores, >1 = that many workers.
- Returns:
DDDMultiPeriodRCResult A NamedTuple containing:
att: Array of ATT(g,t) point estimates
se: Standard errors for each ATT(g,t)
uci, lci: Confidence interval bounds
groups: Treatment cohort for each estimate
times: Time period for each estimate
glist, tlist: Unique cohorts and periods
inf_func_mat: Influence function matrix (n_obs x k)
n: Number of observations
args: Estimation arguments
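To make the relationship between att, se, alpha, and the interval bounds concrete, here is a minimal sketch of a pointwise normal-approximation interval. It assumes the analytical (boot=False) bounds take the usual form of estimate plus or minus a standard normal critical value times the standard error; that form is an assumption, not something this page confirms, and with boot=True and cband=True the critical value would presumably come from the bootstrap instead. The helper name pointwise_ci and the numbers are hypothetical.

```python
from statistics import NormalDist


def pointwise_ci(att, se, alpha=0.05):
    """Return (lci, uci) lists for pointwise (1 - alpha) normal intervals.

    Hypothetical helper for illustration; not part of moderndid.
    """
    # Two-sided standard normal critical value, about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lci = [a - z * s for a, s in zip(att, se)]
    uci = [a + z * s for a, s in zip(att, se)]
    return lci, uci


# Made-up estimates for two (g, t) cells.
att = [0.50, -0.10]
se = [0.10, 0.20]
lci, uci = pointwise_ci(att, se, alpha=0.05)
```

In this made-up example the first interval excludes zero and the second does not, which is the reading a user would typically take from lci and uci.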
References
[1] Ortiz-Villavicencio, M., & Sant’Anna, P. H. C. (2025). Better Understanding Triple Differences Estimators. arXiv preprint arXiv:2505.09942. https://arxiv.org/abs/2505.09942
[2] Sant’Anna, P. H. C., & Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of Econometrics, 219(1), 101-122. https://doi.org/10.1016/j.jeconom.2020.06.003
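The GMM combination given earlier (Equations 4.11 and 4.12 of [1]) can be illustrated numerically. The sketch below takes a vector of ATT(g,t) estimates, one per comparison group, and their covariance matrix, and applies the weight formula directly. It is not the library's internal code: the function name gmm_combine and the input numbers are hypothetical, and the covariance matrix is assumed to be symmetric and invertible.

```python
import numpy as np


def gmm_combine(att_by_comparison, omega):
    """Combine per-comparison-group ATT(g,t) estimates with GMM weights.

    att_by_comparison: shape (k,), one estimate per comparison group.
    omega: shape (k, k), covariance matrix of those estimates.
    Hypothetical helper for illustration; not part of moderndid.
    """
    att = np.asarray(att_by_comparison, dtype=float)
    omega = np.asarray(omega, dtype=float)
    ones = np.ones(att.shape[0])
    # Solve Omega x = 1 rather than forming the explicit inverse.
    omega_inv_ones = np.linalg.solve(omega, ones)
    # w = Omega^{-1} 1 / (1' Omega^{-1} 1); the weights sum to one.
    weights = omega_inv_ones / (ones @ omega_inv_ones)
    # For symmetric Omega, w' ATT equals the estimator in Equation 4.12.
    return weights, float(weights @ att)


# Made-up inputs: two comparison groups, the second four times noisier.
att_hat = np.array([1.0, 2.0])
omega_hat = np.array([[1.0, 0.0],
                      [0.0, 4.0]])
weights, att_gmm = gmm_combine(att_hat, omega_hat)
```

With uncorrelated estimates the weights reduce to inverse-variance weights, so the more precise comparison group receives weight 0.8 and the combined estimate is 1.2.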