moderndid.ddd_mp_rc

moderndid.ddd_mp_rc(data, y_col, time_col, id_col, group_col, partition_col, covariate_cols=None, control_group='nevertreated', base_period='universal', est_method='dr', boot=False, biters=1000, cband=False, cluster=None, alpha=0.05, trim_level=0.995, random_state=None, n_jobs=1)

Compute the multi-period doubly robust DDD estimator for the ATT with repeated cross-section data.

Implements the multi-period triple difference-in-differences estimator from [1] for repeated cross-section data with staggered treatment adoption. Unlike panel data, different samples are observed in each period.

The target parameters are the group-time average treatment effects

\[ATT(g, t) = \mathbb{E}[Y_t(g) - Y_t(\infty) \mid S=g, Q=1]\]

for all treatment cohorts \(g \in \mathcal{G}_{\mathrm{trt}}\) and time periods \(t \in \{2, \ldots, T\}\) such that \(t \geq g\).

For each \((g, t)\) cell, the estimator compares outcomes at time \(t\) to a base period. With base_period="universal", all comparisons use period \(g-1\) (the last pre-treatment period for cohort \(g\)). With base_period="varying", each comparison uses period \(t-1\).
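The cell enumeration and base-period rule described above can be sketched in a few lines. This only illustrates the rule as stated in the text and is not the library's internal code; the helper names `att_cells` and `base_period_for` are hypothetical.

```python
def att_cells(cohorts, n_periods):
    """List the (g, t) cells with t in {2, ..., T} and t >= g."""
    return [
        (g, t)
        for g in sorted(cohorts)
        for t in range(2, n_periods + 1)
        if t >= g
    ]


def base_period_for(g, t, base_period="universal"):
    """Return the comparison period for cell (g, t), per the rule stated above."""
    if base_period == "universal":
        return g - 1  # last pre-treatment period for cohort g
    if base_period == "varying":
        return t - 1  # period immediately before t
    raise ValueError(f"unknown base_period: {base_period!r}")


cells = att_cells(cohorts=[2, 3], n_periods=4)
for g, t in cells:
    print(g, t, base_period_for(g, t, "universal"), base_period_for(g, t, "varying"))
```

For cohort 3 at period 4, for example, the universal rule compares against period 2 and the varying rule against period 3.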

For repeated cross-sections, the estimator follows the approach of [2], extending the DDD framework from [1]. Unlike the panel case, where outcomes are differenced within units, the RCS estimator fits separate outcome regression models for the target period \(t\) and for the base period within each subgroup.
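To make the contrast with panel data concrete, consider the simplest case: no covariates, one treated cohort, a never-treated comparison group, and two periods. A per-period "outcome regression" then reduces to a cell mean, and the triple difference is assembled from eight such means rather than from within-unit differences. The sketch below shows only this idea. It is not the library's doubly robust estimator, the toy outcome values are made up for illustration, and all function and key names are hypothetical.

```python
from statistics import mean


def cell_mean(rows, treated, eligible, period):
    """Mean outcome among observations in one (treated, eligible, period) cell."""
    return mean(
        r["y"]
        for r in rows
        if r["treated"] == treated and r["eligible"] == eligible and r["period"] == period
    )


def change(rows, treated, eligible, base, target):
    """Difference of period-specific cell means (no unit is followed over time)."""
    return cell_mean(rows, treated, eligible, target) - cell_mean(rows, treated, eligible, base)


def ddd_means(rows, base, target):
    """Unconditional triple difference built from period-specific cell means."""
    treated_gap = change(rows, 1, 1, base, target) - change(rows, 1, 0, base, target)
    control_gap = change(rows, 0, 1, base, target) - change(rows, 0, 0, base, target)
    return treated_gap - control_gap


# Toy data: keys are (treated, eligible, period); each value is a separate observation.
toy = {
    (1, 1, 1): [1.0, 3.0], (1, 1, 2): [6.0, 8.0],
    (1, 0, 1): [1.0, 1.0], (1, 0, 2): [2.0, 4.0],
    (0, 1, 1): [2.0, 2.0], (0, 1, 2): [3.0, 5.0],
    (0, 0, 1): [0.0, 2.0], (0, 0, 2): [1.0, 3.0],
}
rows = [
    {"treated": s, "eligible": q, "period": p, "y": y}
    for (s, q, p), ys in toy.items()
    for y in ys
]
estimate = ddd_means(rows, base=1, target=2)
print(estimate)
```

With these toy values the changes in cell means are 5, 2, 2, and 1, so the triple difference is (5 - 2) - (2 - 1) = 2.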

When multiple comparison groups are available (the not-yet-treated setting), the estimator combines the comparison-group-specific estimates using optimal GMM weights (Equation 4.11 from [1]):

\[\widehat{w}_{\mathrm{gmm}}^{g,t} = \frac{\widehat{\Omega}_{g,t}^{-1} \mathbf{1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}}\]

where \(\widehat{\Omega}_{g,t}\) is the covariance matrix of \(\widehat{ATT}_{\mathrm{dr},g_c}(g,t)\) across comparison groups. The GMM estimator (Equation 4.12 from [1]) is then

\[\widehat{ATT}_{\mathrm{dr,gmm}}(g,t) = \frac{\mathbf{1}' \widehat{\Omega}_{g,t}^{-1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}} \widehat{ATT}_{\mathrm{dr}}(g,t).\]
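The two formulas above amount to solving one linear system and taking a weighted average. The following is a minimal sketch in plain Python, assuming the covariance matrix \(\widehat{\Omega}_{g,t}\) and the vector of comparison-group-specific estimates are already available. The numeric inputs are made up for illustration, and the helper names `solve`, `gmm_weights`, and `gmm_att` are hypothetical rather than part of the library.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gmm_weights(omega):
    """w = Omega^{-1} 1 / (1' Omega^{-1} 1)."""
    omega_inv_one = solve(omega, [1.0] * len(omega))
    total = sum(omega_inv_one)
    return [v / total for v in omega_inv_one]


def gmm_att(omega, att_by_comparison_group):
    """Weighted combination of the comparison-group-specific estimates."""
    weights = gmm_weights(omega)
    return sum(w * a for w, a in zip(weights, att_by_comparison_group))


omega = [[1.0, 0.5], [0.5, 2.0]]   # illustrative covariance matrix
atts = [2.0, 3.0]                  # illustrative estimates, one per comparison group
weights = gmm_weights(omega)
combined = gmm_att(omega, atts)
print(weights, combined)
```

The weights always sum to one. When \(\widehat{\Omega}_{g,t}\) is diagonal, they reduce to inverse-variance weights.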
Parameters:
data : DataFrame

Repeated cross-section data in long format with columns for outcome, time, observation id, treatment group, and partition.

y_col : str

Name of the outcome variable column.

time_col : str

Name of the time period column.

id_col : str

Name of the observation identifier column. For RCS, this can be a row index since units are not tracked across periods.

group_col : str

Name of the treatment group column (the first period in which treatment is enabled). Use 0 or np.inf for never-treated units.

partition_col : str

Name of the partition/eligibility column (1 = eligible, 0 = ineligible).

covariate_cols : list of str or None, default None

Names of covariate columns in the data. If None, an intercept-only specification is used.

control_group : {"nevertreated", "notyettreated"}, default "nevertreated"

Which units to use as controls. With "notyettreated", multiple comparison groups may be available, triggering GMM aggregation.

base_period : {"universal", "varying"}, default "universal"

Base period selection. "universal" uses period g-1 as the baseline for all comparisons; "varying" uses period t-1 for each t.

est_method : {"dr", "reg", "ipw"}, default "dr"

Estimation method for each 2-period comparison.

boot : bool, default False

Whether to use multiplier bootstrap for inference.

biters : int, default 1000

Number of bootstrap repetitions (only used if boot=True).

cband : bool, default False

Whether to compute uniform confidence bands (only used if boot=True).

cluster : str or None, default None

Name of the column containing cluster identifiers for clustered standard errors. If provided, the bootstrap resamples at the cluster level (only used if boot=True).

alpha : float, default 0.05

Significance level for confidence intervals.

trim_level : float, default 0.995

Trimming level for propensity scores.

random_state : int, Generator, or None, default None

Controls random number generation for bootstrap reproducibility.

n_jobs : int, default 1

Number of parallel jobs for group-time estimation. 1 = sequential (default), -1 = all cores, >1 = that many workers.

Returns:
DDDMultiPeriodRCResult

A NamedTuple containing:

  • att: Array of ATT(g,t) point estimates

  • se: Standard errors for each ATT(g,t)

  • uci, lci: Upper and lower confidence interval bounds

  • groups: Treatment cohort for each estimate

  • times: Time period for each estimate

  • glist, tlist: Unique cohorts and periods

  • inf_func_mat: Influence function matrix (n_obs x k)

  • n: Number of observations

  • args: Estimation arguments
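A hedged usage sketch follows. It builds toy repeated cross-section rows with the standard library, then shows how the documented keyword arguments and result fields might be used together. The simulated data-generating process, the column names, and the helper names `make_rcs_rows` and `estimate` are hypothetical. That `data` may be a pandas DataFrame is an assumption, since the signature only says "DataFrame". The call itself is left commented out because it requires `moderndid` and `pandas` to be installed.

```python
import random


def make_rcs_rows(n_per_period=200, periods=(1, 2, 3), seed=0):
    """Simulate toy RCS rows; every row is a distinct observation."""
    rng = random.Random(seed)
    rows = []
    for t in periods:
        for _ in range(n_per_period):
            g = rng.choice([0, 2, 3])   # 0 marks never-treated, as documented
            q = rng.choice([0, 1])      # 1 = eligible, 0 = ineligible
            x = rng.gauss(0.0, 1.0)
            treated_now = g != 0 and t >= g and q == 1
            y = 0.5 * x + 0.3 * t + 0.2 * q + (1.0 if treated_now else 0.0) + rng.gauss(0.0, 1.0)
            # The id is just a row index, since units are not tracked across periods.
            rows.append({"id": len(rows), "time": t, "group": g, "partition": q, "x": x, "y": y})
    return rows


def estimate(rows):
    """Call ddd_mp_rc with the documented keyword arguments and print each cell."""
    import pandas as pd   # assumption: a pandas DataFrame is accepted as `data`
    import moderndid

    result = moderndid.ddd_mp_rc(
        data=pd.DataFrame(rows),
        y_col="y",
        time_col="time",
        id_col="id",
        group_col="group",
        partition_col="partition",
        covariate_cols=["x"],
        control_group="nevertreated",
        base_period="universal",
        est_method="dr",
    )
    # One line per (g, t) cell, using the result fields listed above.
    cells = zip(result.groups, result.times, result.att, result.se, result.lci, result.uci)
    for g, t, att, se, lo, hi in cells:
        print(f"g={g} t={t} att={att:.3f} se={se:.3f} ci=[{lo:.3f}, {hi:.3f}]")
    return result


rows = make_rcs_rows()
# result = estimate(rows)  # uncomment once moderndid and pandas are installed
```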

See also

ddd_rc

Two-period DDD estimator for repeated cross-section data.

ddd_mp

Multi-period DDD estimator for panel data.

References

[1]

Ortiz-Villavicencio, M., & Sant’Anna, P. H. C. (2025). Better Understanding Triple Differences Estimators. arXiv preprint arXiv:2505.09942. https://arxiv.org/abs/2505.09942

[2]

Sant’Anna, P. H. C., & Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of Econometrics, 219(1), 101-122. https://doi.org/10.1016/j.jeconom.2020.06.003