moderndid.ddd_mp

moderndid.ddd_mp(data, y_col, time_col, id_col, group_col, partition_col, covariate_cols=None, control_group='nevertreated', base_period='universal', est_method='dr', boot=False, biters=1000, cband=False, cluster=None, alpha=0.05, random_state=None, n_jobs=1)

Compute the multi-period doubly robust DDD estimator for the ATT with panel data.

Implements the multi-period triple difference-in-differences estimator from [1]. The target parameters are the group-time average treatment effects

\[ATT(g, t) = \mathbb{E}[Y_t(g) - Y_t(\infty) \mid S=g, Q=1]\]

for all treatment cohorts \(g \in \mathcal{G}_{trt}\) and time periods \(t \in \{2, \ldots, T\}\) such that \(t \geq g\).

For each (g,t) cell with comparison group \(g_{\mathrm{c}}\), the doubly robust estimator (Equation 4.8 from [1]) is

\[\begin{split}\widehat{ATT}_{\mathrm{dr},g_{\mathrm{c}}}(g,t) &= \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g,0}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g,Q=0}(X)\right)\right] \\ &+ \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g_{\mathrm{c}},1}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g_{\mathrm{c}},Q=1}(X)\right)\right] \\ &- \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g_{\mathrm{c}},0}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g_{\mathrm{c}},Q=0}(X)\right)\right].\end{split}\]

When multiple comparison groups are available (not-yet-treated setting), the estimator combines them using optimal GMM weights (Equation 4.11 from [1])

\[\widehat{w}_{gmm}^{g,t} = \frac{\widehat{\Omega}_{g,t}^{-1} \mathbf{1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}}\]

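where \(\widehat{\Omega}_{g,t}\) is the covariance matrix of \(\widehat{ATT}_{dr,g_c}(g,t)\) across comparison groups and \(\mathbf{1}\) is a vector of ones. The GMM estimator (Equation 4.12 from [1]) then applies these weights to \(\widehat{ATT}_{dr}(g,t)\), the vector that stacks the comparison-group-specific estimates: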

\[\widehat{ATT}_{dr,gmm}(g,t) = \frac{\mathbf{1}' \widehat{\Omega}_{g,t}^{-1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}} \widehat{ATT}_{dr}(g,t).\]
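
The weighting in the two equations above can be sketched with NumPy as follows. This is a minimal illustration, not the library's internal code: the function name `gmm_combine` and the toy inputs are assumptions, and the covariance matrix is taken as given rather than estimated.

```python
import numpy as np


def gmm_combine(att_by_comparison, omega):
    """Combine comparison-group-specific ATT estimates with GMM weights.

    Hypothetical helper for illustration only; it is not part of moderndid.
    """
    att = np.asarray(att_by_comparison, dtype=float)
    omega = np.asarray(omega, dtype=float)
    ones = np.ones(att.shape[0])

    # Solve Omega x = 1 instead of forming the explicit inverse.
    omega_inv_ones = np.linalg.solve(omega, ones)

    # w = Omega^{-1} 1 / (1' Omega^{-1} 1); the weights sum to one.
    weights = omega_inv_ones / (ones @ omega_inv_ones)

    # Omega is symmetric, so w' ATT equals (1' Omega^{-1} / 1' Omega^{-1} 1) ATT.
    att_gmm = float(weights @ att)
    return weights, att_gmm


# Toy inputs: two comparison groups with uncorrelated estimates.
att_hat = np.array([1.0, 2.0])
omega_hat = np.array([[1.0, 0.0], [0.0, 3.0]])
weights, att_gmm = gmm_combine(att_hat, omega_hat)
```

In this toy case the estimate with the smaller variance receives the larger weight, which is the usual inverse-variance behavior when the off-diagonal entries are zero.
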
Parameters:
data : DataFrame

Panel data in long format with columns for outcome, time, unit id, treatment group, and partition.

y_col : str

Name of the outcome variable column.

time_col : str

Name of the time period column.

id_col : str

Name of the unit identifier column.

group_col : str

Name of the treatment group column, which holds the first period in which treatment is enabled for the unit. Use 0 or np.inf for never-treated units.

partition_col : str

Name of the partition/eligibility column (1 = eligible, 0 = ineligible).

covariate_cols : list of str or None, default None

Names of covariate columns in the data. If None, only an intercept is used.

control_group : {"nevertreated", "notyettreated"}, default "nevertreated"

Which units to use as controls. With "notyettreated", multiple comparison groups may be available, triggering GMM aggregation.

base_period : {"universal", "varying"}, default "universal"

Base period selection. "universal" uses period g-1 as baseline for all comparisons; "varying" uses period t-1 for each t.

est_method : {"dr", "reg", "ipw"}, default "dr"

Estimation method for each 2-period comparison: doubly robust ("dr"), outcome regression ("reg"), or inverse probability weighting ("ipw").

boot : bool, default False

Whether to use multiplier bootstrap for inference.

biters : int, default 1000

Number of bootstrap repetitions (only used if boot=True).

cband : bool, default False

Whether to compute uniform confidence bands (only used if boot=True).

cluster : str or None, default None

Name of the column containing cluster identifiers for clustered standard errors. If provided, the bootstrap resamples at the cluster level (only used if boot=True).

alpha : float, default 0.05

Significance level for confidence intervals.

random_state : int, Generator, or None, default None

Controls random number generation for bootstrap reproducibility.

n_jobs : int, default 1

Number of parallel jobs for group-time estimation. 1 runs sequentially, -1 uses all available cores, and a value greater than 1 uses that many workers.

Returns:
DDDMultiPeriodResult

A NamedTuple containing:

  • att: Array of ATT(g,t) point estimates

  • se: Standard errors for each ATT(g,t)

  • uci, lci: Upper and lower confidence interval bounds

  • groups: Treatment cohort for each estimate

  • times: Time period for each estimate

  • glist, tlist: Unique cohorts and periods

  • inf_func_mat: Influence function matrix (n x k)

  • n: Number of units

  • args: Estimation arguments
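
A call using the parameters and return fields documented above might look like the sketch below. The column names (`id`, `time`, `y`, `group`, `partition`, `x`), the synthetic data-generating process, and the helpers `make_panel` and `run_ddd_mp` are all hypothetical. The sketch also assumes that `data` may be a pandas DataFrame, which the description above does not state explicitly.

```python
import random


def make_panel(n_units=300, periods=(1, 2, 3), seed=0):
    """Build a synthetic long-format panel as a list of row dicts (illustrative)."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        group = rng.choice([0, 2, 3])    # 0 marks never-treated units
        partition = rng.choice([0, 1])   # 1 = eligible, 0 = ineligible
        x = rng.gauss(0.0, 1.0)          # one time-invariant covariate
        unit_effect = rng.gauss(0.0, 1.0)
        for t in periods:
            # Treatment only affects eligible units from their cohort period on.
            treated = group > 0 and t >= group and partition == 1
            y = (unit_effect + 0.5 * t + 0.3 * x
                 + (2.0 if treated else 0.0) + rng.gauss(0.0, 1.0))
            rows.append({"id": unit, "time": t, "y": y,
                         "group": group, "partition": partition, "x": x})
    return rows


def run_ddd_mp(rows):
    """Call moderndid.ddd_mp on the rows; assumes pandas input is accepted."""
    # Imported here so the data builder above runs without these packages.
    import pandas as pd
    import moderndid

    data = pd.DataFrame(rows)
    return moderndid.ddd_mp(
        data,
        y_col="y",
        time_col="time",
        id_col="id",
        group_col="group",
        partition_col="partition",
        covariate_cols=["x"],
        control_group="nevertreated",
        est_method="dr",
    )


rows = make_panel()
```

Calling `run_ddd_mp(rows)` should return the result tuple described above, whose `att`, `se`, `groups`, and `times` fields can be read side by side to inspect each (g,t) cell.
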

See also

ddd_panel

Two-period DDD estimator for panel data.

Notes

The influence functions are rescaled by \(n / n_{g,t}\) where \(n_{g,t}\) is the number of units in each (g,t) cell, following the approach in [1].

The standard errors are computed from the influence function matrix as

\[\widehat{V} = \frac{1}{n} \widehat{\Psi}' \widehat{\Psi}, \quad \widehat{se}_{g,t} = \sqrt{\widehat{V}_{g,t,g,t} / n}\]

where \(\widehat{\Psi}\) is the \(n \times k\) matrix of influence functions. For cells with GMM aggregation, the standard error formula from Equation 4.12 is used instead.
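
The analytical standard error formula above can be sketched with NumPy as follows. The helper name `influence_function_se` and the toy matrix are assumptions made for illustration; the sketch takes the rescaled influence function matrix as given and does not cover the GMM case.

```python
import numpy as np


def influence_function_se(psi):
    """Standard errors from an n x k influence function matrix.

    Hypothetical helper for illustration only; it is not part of moderndid.
    """
    psi = np.asarray(psi, dtype=float)
    n = psi.shape[0]

    # V = (1/n) * Psi' Psi is the k x k covariance estimate.
    v_hat = psi.T @ psi / n

    # se for each (g,t) cell is sqrt(diagonal entry of V divided by n).
    return np.sqrt(np.diag(v_hat) / n)


# Toy matrix: n = 4 units, k = 2 group-time cells.
psi = np.array([[1.0, 2.0],
                [-1.0, -2.0],
                [1.0, -2.0],
                [-1.0, 2.0]])
se = influence_function_se(psi)
```

Applied to the `inf_func_mat` field of the returned result, a computation of this form would be expected to match `se` for cells without GMM aggregation, assuming no clustering or bootstrap is in use.
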

References

[1] Ortiz-Villavicencio, M., & Sant’Anna, P. H. C. (2025). Better Understanding Triple Differences Estimators. arXiv preprint arXiv:2505.09942. https://arxiv.org/abs/2505.09942