moderndid.ddd_mp

moderndid.ddd_mp(data, y_col, time_col, id_col, group_col, partition_col, covariate_cols=None, control_group='nevertreated', base_period='universal', est_method='dr', boot=False, biters=1000, cband=False, cluster=None, alpha=0.05, random_state=None, n_jobs=1)

Compute the multi-period doubly robust DDD estimator for the ATT with panel data.

Implements the multi-period triple difference-in-differences estimator from [1]. The target parameters are the group-time average treatment effects

\[ATT(g, t) = \mathbb{E}[Y_t(g) - Y_t(\infty) \mid S=g, Q=1]\]

for all treatment cohorts \(g \in \mathcal{G}_{trt}\) and time periods \(t \in \{2, \ldots, T\}\) such that \(t \geq g\).

For each \((g, t)\) cell with comparison group \(g_{\mathrm{c}}\), the doubly robust estimator (Equation 4.8 from [1]) is

\[\begin{split}\widehat{ATT}_{\mathrm{dr},g_{\mathrm{c}}}(g,t) &= \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g,0}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g,Q=0}(X)\right)\right] \\ &+ \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g_{\mathrm{c}},1}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g_{\mathrm{c}},Q=1}(X)\right)\right] \\ &- \mathbb{E}_n\left[ \left(\widehat{w}_{\mathrm{trt}}^{S=g,Q=1}(S,Q) - \widehat{w}_{g_{\mathrm{c}},0}^{S=g,Q=1}(S,Q,X)\right) \left(Y_t - Y_{g-1} - \widehat{m}_{Y_t-Y_{g-1}}^{S=g_{\mathrm{c}},Q=0}(X)\right)\right].\end{split}\]

When multiple comparison groups are available (not-yet-treated setting), the estimator combines them using optimal GMM weights (Equation 4.11 from [1])

\[\widehat{w}_{gmm}^{g,t} = \frac{\widehat{\Omega}_{g,t}^{-1} \mathbf{1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}}\]

where \(\widehat{\Omega}_{g,t}\) is the covariance matrix of \(\widehat{ATT}_{dr,g_c}(g,t)\) across comparison groups. The GMM estimator (Equation 4.12 from [1]) is then

\[\widehat{ATT}_{dr,gmm}(g,t) = \frac{\mathbf{1}' \widehat{\Omega}_{g,t}^{-1}} {\mathbf{1}' \widehat{\Omega}_{g,t}^{-1} \mathbf{1}} \widehat{ATT}_{dr}(g,t).\]
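The weighting in Equations 4.11 and 4.12 can be checked numerically. The following is a minimal NumPy sketch, not the library's internal code: the helper name `gmm_weights` and the numbers in `att_by_comparison` and `omega_hat` are made up for illustration. It assumes \(\widehat{\Omega}_{g,t}\) is symmetric and invertible.

```python
import numpy as np


def gmm_weights(omega):
    """Return Omega^{-1} 1 / (1' Omega^{-1} 1) for a k x k covariance matrix.

    Hypothetical helper for illustration; not part of moderndid.
    """
    omega = np.asarray(omega, dtype=float)
    ones = np.ones(omega.shape[0])
    # Solve Omega x = 1 rather than forming the explicit inverse.
    omega_inv_ones = np.linalg.solve(omega, ones)
    return omega_inv_ones / (ones @ omega_inv_ones)


# Illustrative values only: two comparison groups for one (g, t) cell.
att_by_comparison = np.array([0.50, 0.80])
omega_hat = np.array([[0.04, 0.00],
                      [0.00, 0.16]])

weights = gmm_weights(omega_hat)
# Because omega_hat is symmetric, weights @ att equals Equation 4.12.
att_gmm = weights @ att_by_comparison

print(weights)
print(att_gmm)
```

With a diagonal covariance matrix the weights reduce to inverse-variance weights, so the more precise comparison-group estimate receives the larger weight, and the weights sum to one by construction.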

Parameters:

data (DataFrame): Panel data in long format with columns for outcome, time, unit id, treatment group, and partition.
y_col (str): Name of the outcome variable column.
time_col (str): Name of the time period column.
id_col (str): Name of the unit identifier column.
group_col (str): Name of the treatment group column (the first period in which treatment is enabled). Use 0 or np.inf for never-treated units.
partition_col (str): Name of the partition/eligibility column (1 = eligible, 0 = ineligible).
covariate_cols (list of str or None, default None): Names of covariate columns in the data. If None, uses intercept only.
control_group ({"nevertreated", "notyettreated"}, default "nevertreated"): Which units to use as controls. With "notyettreated", multiple comparison groups may be available, triggering GMM aggregation.
base_period ({"universal", "varying"}, default "universal"): Base period selection. "universal" uses period g-1 as baseline for all comparisons; "varying" uses period t-1 for each t.
est_method ({"dr", "reg", "ipw"}, default "dr"): Estimation method for each 2-period comparison.
boot (bool, default False): Whether to use multiplier bootstrap for inference.
biters (int, default 1000): Number of bootstrap repetitions (only used if boot=True).
cband (bool, default False): Whether to compute uniform confidence bands (only used if boot=True).
cluster (str or None, default None): Name of the column containing cluster identifiers for clustered standard errors. If provided, the bootstrap resamples at the cluster level (only used if boot=True).
alpha (float, default 0.05): Significance level for confidence intervals.
random_state (int, Generator, or None, default None): Controls random number generation for bootstrap reproducibility.
n_jobs (int, default 1): Number of parallel jobs for group-time estimation. 1 = sequential (default), -1 = all cores, >1 = that many workers.
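As a rough illustration of how these parameters map onto a dataset, the sketch below builds a small synthetic long-format panel and collects keyword arguments that match the signature above. The column names (`id`, `time`, `group`, `partition`, `x1`, `y`), the data-generating process, and the wrapper `estimate` are assumptions made for this example; only the `ddd_mp` argument names come from the signature.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_units = 200
periods = [1, 2, 3, 4]

# One row per unit: cohort (0 = never-treated), eligibility, one covariate.
units = pd.DataFrame({
    "id": np.arange(n_units),
    "group": rng.choice([0, 3, 4], size=n_units),
    "partition": rng.integers(0, 2, size=n_units),
    "x1": rng.normal(size=n_units),
})

# Expand to long format: one row per (unit, period).
panel = units.merge(pd.DataFrame({"time": periods}), how="cross")

# Made-up outcome: effect of 1.0 for eligible units once their cohort is treated.
treated = (
    (panel["group"] > 0)
    & (panel["time"] >= panel["group"])
    & (panel["partition"] == 1)
)
panel["y"] = (
    panel["x1"] + 0.1 * panel["time"] + 1.0 * treated
    + rng.normal(size=len(panel))
)

# Keyword arguments taken from the documented signature.
ddd_kwargs = dict(
    y_col="y",
    time_col="time",
    id_col="id",
    group_col="group",
    partition_col="partition",
    covariate_cols=["x1"],
    control_group="notyettreated",
    est_method="dr",
)


def estimate(data):
    """Hypothetical wrapper; requires moderndid to be installed."""
    import moderndid
    return moderndid.ddd_mp(data, **ddd_kwargs)
```

Calling `estimate(panel)` should return a `DDDMultiPeriodResult`, provided the package is installed. Setting `boot=True` together with `random_state` would be the way to make bootstrap inference reproducible.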

Returns:

DDDMultiPeriodResult

A NamedTuple containing:

att: Array of ATT(g,t) point estimates
se: Standard errors for each ATT(g,t)
uci, lci: Upper and lower confidence interval bounds
groups: Treatment cohort for each estimate
times: Time period for each estimate
glist, tlist: Unique cohorts and periods
inf_func_mat: Influence function matrix (n x k)
n: Number of units
args: Estimation arguments
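Because the result exposes parallel arrays, it is often convenient to tabulate them. The sketch below uses a stand-in NamedTuple (`StandInResult`, hypothetical) that carries only some of the fields listed above, filled with made-up numbers; the 1.96 multiplier is used just to populate the stand-in and is not a statement about how the library computes its intervals.

```python
from typing import NamedTuple

import numpy as np
import pandas as pd


class StandInResult(NamedTuple):
    """Hypothetical stand-in mirroring a few DDDMultiPeriodResult fields."""
    att: np.ndarray
    se: np.ndarray
    lci: np.ndarray
    uci: np.ndarray
    groups: np.ndarray
    times: np.ndarray


# Made-up numbers for three (g, t) cells.
att = np.array([0.9, 1.1, 1.0])
se = np.array([0.20, 0.25, 0.30])
result = StandInResult(
    att=att,
    se=se,
    lci=att - 1.96 * se,
    uci=att + 1.96 * se,
    groups=np.array([3, 3, 4]),
    times=np.array([3, 4, 4]),
)

# One row per ATT(g, t) estimate.
table = pd.DataFrame({
    "group": result.groups,
    "time": result.times,
    "att": result.att,
    "se": result.se,
    "lci": result.lci,
    "uci": result.uci,
})
# Periods elapsed since the cohort was first treated.
table["event_time"] = table["time"] - table["group"]
# Flag cells whose interval does not cover zero.
table["excludes_zero"] = (table["lci"] > 0) | (table["uci"] < 0)

print(table)
```

The same tabulation should work on an actual result object, since it only reads the `att`, `se`, `lci`, `uci`, `groups`, and `times` fields.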