moderndid.att_gt

moderndid.att_gt(data, yname, tname, idname=None, gname=None, xformla=None, weightsname=None, alp=0.05, cband=True, boot=False, biters=1000, clustervars=None, est_method='dr', panel=True, allow_unbalanced_panel=False, control_group='nevertreated', anticipation=0, base_period='varying', random_state=None, n_jobs=1, n_partitions=None, max_cohorts=None, backend=None)

Compute group-time average treatment effects.

Implements difference-in-differences estimation for staggered adoption designs where treatment timing varies across units, following [1]. This approach addresses the challenges of standard two-way fixed-effects regressions by providing flexible estimators that allow for treatment effect heterogeneity across groups and over time.

Let \(G_i\) denote the time period when unit \(i\) is first treated, with \(G_i = \infty\) for never-treated units, and let \(C_i\) be an indicator for never-treated status. The fundamental parameter of interest is the group-time average treatment effect, \(ATT(g,t)\), which measures the average effect for units first treated in period \(g\) as of time \(t\)

\[ATT(g,t) = \mathbb{E}[Y_t(g) - Y_t(0) \mid G = g].\]

Identification relies on a conditional parallel trends assumption. When using never-treated units as the comparison group, the assumption requires that trends in untreated potential outcomes are the same for the treatment group and never-treated units conditional on covariates \(X\)

\[\mathbb{E}[Y_t(0) - Y_{t-1}(0) \mid X, G = g] = \mathbb{E}[Y_t(0) - Y_{t-1}(0) \mid X, C = 1],\]

where \(C = 1\) indicates never-treated units. The doubly robust estimand combines inverse probability weighting and outcome regression, providing consistency if either the propensity score or outcome model is correctly specified

\[ATT_{dr}(g,t) = \mathbb{E}\left[\left(\frac{G_g}{\mathbb{E}[G_g]} - \frac{\frac{p_g(X) C}{1 - p_g(X)}} {\mathbb{E}\left[\frac{p_g(X) C}{1 - p_g(X)}\right]}\right) \left(\Delta Y_t - m_{g,t}(X)\right)\right],\]

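where \(G_g\) is an indicator for units first treated in period \(g\), \(p_g(X)\) is the propensity score (the probability of belonging to group \(g\) rather than the never-treated group, given \(X\)), \(\Delta Y_t\) is the change in outcomes, and \(m_{g,t}(X)\) is the expected outcome change for the comparison group conditional on \(X\).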
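The doubly robust estimand above can be illustrated with a minimal sketch on a hypothetical toy dataset with a single binary covariate. The data, the function name dr_att, and the use of cell means for both the propensity score and the outcome regression are assumptions made for illustration; this is not how moderndid estimates the nuisance functions internally.

```python
from statistics import mean

# Hypothetical toy data: one row per unit in group g or the never-treated group.
# x is a binary covariate, treated is the indicator G_g (0 means C = 1),
# and dy is the outcome change Delta Y_t.
# By construction the untreated trend is 1 + 2*x and the true effect is 0.5.
units = [
    # (x, treated, dy)
    (0, 1, 1.5),
    (1, 1, 3.5), (1, 1, 3.5), (1, 1, 3.5),
    (0, 0, 0.9), (0, 0, 1.0), (0, 0, 1.1),
    (1, 0, 3.0),
]


def dr_att(units):
    """Doubly robust ATT(g,t) with cell-mean (saturated) nuisance estimates."""
    cells = sorted({x for x, _, _ in units})
    # Propensity score p_g(X): share of group-g units within each covariate cell.
    # Overlap (p < 1 in every cell) is assumed; otherwise the weights are undefined.
    p = {c: mean(g for x, g, _ in units if x == c) for c in cells}
    # Outcome regression m_{g,t}(X): mean outcome change among comparison units.
    m = {c: mean(dy for x, g, dy in units if x == c and g == 0) for c in cells}

    w_treat = [g for _, g, _ in units]                            # G_g
    w_comp = [p[x] * (1 - g) / (1 - p[x]) for x, g, _ in units]   # p C / (1 - p)
    mean_treat, mean_comp = mean(w_treat), mean(w_comp)
    return mean(
        (wt / mean_treat - wc / mean_comp) * (dy - m[x])
        for (x, _, dy), wt, wc in zip(units, w_treat, w_comp)
    )


# Unconditional difference in mean outcome changes, for comparison.
naive = mean(dy for _, g, dy in units if g == 1) - mean(
    dy for _, g, dy in units if g == 0
)

print(f"naive DiD: {naive:.3f}")
print(f"doubly robust: {dr_att(units):.3f}")
```

In this toy example the treated units are concentrated in the x = 1 cell, where the untreated trend is steeper, so the unconditional comparison overstates the effect, while the doubly robust estimate should recover a value close to the 0.5 built into the data.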

Parameters:

data (DataFrame): Panel data in long format. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.
yname (str): The name of the outcome variable.
tname (str): The name of the column containing the time periods.
idname (str, optional): The name of the column identifying the individual (cross-sectional unit). Required for panel data.
gname (str): The name of the variable that contains the first period in which a particular observation is treated. It defines which "group" a unit belongs to. It should be a positive number for all observations in treated groups and 0 for units in the untreated group.
xformla (str, optional): A formula for the covariates to include in the model, of the form "~ X1 + X2". The default is None, which is equivalent to xformla="~1".
weightsname (str, optional): The name of the column containing the sampling weights. If not set, all observations have the same weight.
alp (float, default=0.05): The significance level.
cband (bool, default=True): Whether to compute a uniform confidence band that covers all of the group-time average treatment effects with fixed probability 1-alp.
boot (bool, default=False): Whether to compute standard errors using the multiplier bootstrap. Clustered standard errors require boot=True.
biters (int, default=1000): The number of bootstrap iterations to use. Only applicable if boot=True.
clustervars (list[str], optional): A list of variable names to cluster on. At most two variables are allowed (more raises an error), and one of them must be the same as idname, which allows for clustering at the individual level.
est_method ({"dr", "ipw", "reg"} or callable, default="dr"): The method used to compute group-time average treatment effects. The default, "dr", uses the doubly robust approach. The other built-in methods are "ipw" for inverse probability weighting and "reg" for first-step regression estimators. A user-supplied function for estimating group-time average treatment effects can also be passed.
panel (bool, default=True): Whether the data is a panel dataset. The panel dataset should be provided in long format.
allow_unbalanced_panel (bool, default=False): Whether to keep an unbalanced panel instead of balancing it with respect to time and id. The default is False, which means that att_gt drops all units that are not observed in every period.
control_group ({"nevertreated", "notyettreated"}, default="nevertreated"): Which units to use as the control group. The default, "nevertreated", sets the control group to be the units that never participate in the treatment. "notyettreated" instead uses the units that have not yet been treated in the period under consideration, which includes the never-treated units.
anticipation (int, default=0): The number of time periods before treatment in which units can anticipate participating in the treatment, so that anticipation can affect their untreated potential outcomes.
base_period ({"varying", "universal"}, default="varying"): Whether to use a "varying" base period or a "universal" base period. With a varying base period, each pre-treatment estimate uses the immediately preceding period as its base; with a universal base period, the base is fixed at period g - anticipation - 1 for every estimate of group g.
random_state (int, Generator, optional): Controls the randomness of the bootstrap. Pass an int for reproducible results across multiple function calls. A NumPy Generator instance is also accepted.
n_jobs (int, default=1): Number of parallel jobs for group-time estimation. 1 = sequential (default), -1 = all cores, >1 = that many workers.
n_partitions (int or None, default=None): Number of Dask partitions per cell. Only used when data is a Dask DataFrame; ignored for non-Dask inputs.
max_cohorts (int or None, default=None): Maximum number of treatment cohorts to process in parallel. Only used when data is a Dask DataFrame; ignored for non-Dask inputs.
backend ({"numpy", "cupy"} or None, default=None): Array backend to use for this call only. When set, the backend is activated for the duration of this call and reverted automatically when the call returns. None (the default) uses whatever backend is currently active (see set_backend). Ignored when data is a Dask DataFrame.
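The gname coding differs from the notation used in the formulas, where never-treated units have \(G_i = \infty\): in the data, untreated units are coded as 0. As a small sketch, a hypothetical helper (to_gname is not part of moderndid) could map raw first-treatment periods, with None or infinity marking never-treated units, to that coding before the column is passed as gname.

```python
import math


def to_gname(first_treated):
    """Map a raw first-treatment period to the 0-for-untreated coding.

    Hypothetical helper for illustration only: None or infinity is assumed
    to mark a never-treated unit in the raw data.
    """
    if first_treated is None or math.isinf(first_treated):
        return 0
    if first_treated <= 0:
        raise ValueError("treated groups need a positive first-treatment period")
    return first_treated


raw_periods = [2004, None, 2006, math.inf, 2007]
encoded = [to_gname(g) for g in raw_periods]
print(encoded)
```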

Returns:

MPResult

Object containing group-time average treatment effect results:

groups: Array indicating which group (period first treated) each ATT is for
times: Array indicating which time period each ATT is for
att_gt: Array of group-time average treatment effects
se_gt: Standard errors for each ATT(g,t)
vcov_analytical: Analytical variance-covariance matrix
critical_value: Critical value for confidence intervals (simultaneous if bootstrap with cband=True)
influence_func: Influence function matrix for each ATT(g,t)
n_units: Number of unique cross-sectional units
wald_stat: Wald statistic for pre-testing parallel trends
wald_pvalue: P-value for the parallel trends pre-test
alpha: Significance level used
estimation_params: Dictionary with estimation details (control_group, anticipation_periods, etc.)
G: Unit-level group assignments
weights_ind: Unit-level sampling weights (if provided)

See also

aggte: Aggregate group-time average treatment effects.

References

[1] Callaway, B., & Sant’Anna, P. H. (2021). “Difference-in-differences with multiple time periods.” Journal of Econometrics, 225(2), 200-230. https://doi.org/10.1016/j.jeconom.2020.12.001

Examples

The dataset below contains county-level teen employment rates for 500 counties from 2003-2007. Some states are first treated in 2004, some in 2006, and some in 2007. The variable first.treat indicates the first period in which a state is treated:

In [1]: import numpy as np
   ...: from moderndid import att_gt, load_mpdta
   ...: 
   ...: df = load_mpdta()
   ...: print(df.head())
   ...: 
shape: (5, 6)
┌──────┬────────────┬──────────┬──────────┬─────────────┬───────┐
│ year ┆ countyreal ┆ lpop     ┆ lemp     ┆ first.treat ┆ treat │
│ ---  ┆ ---        ┆ ---      ┆ ---      ┆ ---         ┆ ---   │
│ i64  ┆ i64        ┆ f64      ┆ f64      ┆ i64         ┆ i64   │
╞══════╪════════════╪══════════╪══════════╪═════════════╪═══════╡
│ 2003 ┆ 8001       ┆ 5.896761 ┆ 8.461469 ┆ 2007        ┆ 1     │
│ 2004 ┆ 8001       ┆ 5.896761 ┆ 8.33687  ┆ 2007        ┆ 1     │
│ 2005 ┆ 8001       ┆ 5.896761 ┆ 8.340217 ┆ 2007        ┆ 1     │
│ 2006 ┆ 8001       ┆ 5.896761 ┆ 8.378161 ┆ 2007        ┆ 1     │
│ 2007 ┆ 8001       ┆ 5.896761 ┆ 8.487352 ┆ 2007        ┆ 1     │
└──────┴────────────┴──────────┴──────────┴─────────────┴───────┘

We can compute group-time average treatment effects for a staggered adoption design where different units adopt treatment at different time periods. The output is an object of type MPResult which is a container for the results:

In [2]: result = att_gt(
   ...:     data=df,
   ...:     yname="lemp",
   ...:     tname="year",
   ...:     gname="first.treat",
   ...:     idname="countyreal",
   ...:     est_method="dr",
   ...:     boot=False
   ...: )
   ...: print(result)
   ...: 
==============================================================================
 Group-Time Average Treatment Effects
==============================================================================

┌───────┬──────┬──────────┬────────────┬────────────────────────────┐
│ Group │ Time │ ATT(g,t) │ Std. Error │ [95% Pointwise Conf. Band] │
├───────┼──────┼──────────┼────────────┼────────────────────────────┤
│  2004 │ 2004 │  -0.0105 │     0.0233 │ [-0.0561,  0.0351]         │
│  2004 │ 2005 │  -0.0704 │     0.0310 │ [-0.1312, -0.0097] *       │
│  2004 │ 2006 │  -0.1373 │     0.0364 │ [-0.2087, -0.0658] *       │
│  2004 │ 2007 │  -0.1008 │     0.0344 │ [-0.1682, -0.0335] *       │
│  2006 │ 2004 │   0.0065 │     0.0233 │ [-0.0392,  0.0522]         │
│  2006 │ 2005 │  -0.0028 │     0.0196 │ [-0.0411,  0.0356]         │
│  2006 │ 2006 │  -0.0046 │     0.0178 │ [-0.0394,  0.0302]         │
│  2006 │ 2007 │  -0.0412 │     0.0202 │ [-0.0809, -0.0016] *       │
│  2007 │ 2004 │   0.0305 │     0.0150 │ [ 0.0010,  0.0600] *       │
│  2007 │ 2005 │  -0.0027 │     0.0164 │ [-0.0349,  0.0294]         │
│  2007 │ 2006 │  -0.0311 │     0.0179 │ [-0.0661,  0.0040]         │
│  2007 │ 2007 │  -0.0261 │     0.0167 │ [-0.0587,  0.0066]         │
└───────┴──────┴──────────┴────────────┴────────────────────────────┘

------------------------------------------------------------------------------
 Signif. codes: '*' confidence band does not cover 0

 P-value for pre-test of parallel trends assumption:  0.1681

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Control Group:  Never Treated
 Anticipation Periods:  0

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Estimation Method:  Doubly Robust

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Significance level: 0.05
 Analytical standard errors
==============================================================================
 Reference: Callaway and Sant'Anna (2021)
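
The pointwise bands in the table above can be reproduced approximately from the printed estimates and standard errors. The sketch below assumes that, with analytical standard errors (boot=False), the critical value is the standard normal quantile at 1 - alp/2. The helper pointwise_band is hypothetical, and the inputs are the rounded values copied from the table, so the last digit may differ from the printed band.

```python
from statistics import NormalDist


def pointwise_band(att, se, alp=0.05):
    """Return (lower, upper) for a pointwise band around one ATT(g,t)."""
    # Assumption: a standard normal critical value, as used with analytical
    # standard errors when no bootstrap is run.
    crit = NormalDist().inv_cdf(1 - alp / 2)
    return att - crit * se, att + crit * se


# Rounded values from the table row for group 2004, time 2005.
lower, upper = pointwise_band(att=-0.0704, se=0.0310)
print(f"[{lower:.4f}, {upper:.4f}]")

# The band excludes 0, which corresponds to the '*' flag in that row.
print("excludes zero:", upper < 0 or lower > 0)
```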