moderndid.ipwdid

moderndid.ipwdid(data, yname, tname, idname=None, treatname=None, xformla=None, panel=True, est_method='ipw', weightsname=None, boot=False, boot_type='weighted', n_boot=999, inf_func=False, trim_level=0.995)

Wrap the inverse propensity weighted DiD estimators for the ATT.

This function is a wrapper for inverse propensity weighted (IPW) DiD estimators. It accepts panel or stationary repeated cross-section data and dispatches to the appropriate estimator based on the panel and est_method arguments.

Parameters:

data : pandas.DataFrame | polars.DataFrame

The input data containing outcome, time, unit ID, treatment, and optionally covariates and weights. Accepts both pandas and polars DataFrames.

yname : str

Name of the column containing the outcome variable.

tname : str

Name of the column containing the time periods (must have exactly 2 periods).

idname : str | None, default None

Name of the column containing the unit ID. Required if panel=True.

treatname : str

Name of the column containing the treatment group indicator. For panel data: time-invariant indicator (1 if ever treated, 0 if never treated). For repeated cross-sections: treatment status in the post-period.

xformla : str | None, default None

A formula for the covariates to include in the model. Should be of the form “~ X1 + X2” (intercept is always included). If None, equivalent to “~ 1” (intercept only).

panel : bool, default True

Whether the data is panel (True) or repeated cross-sections (False). Panel data should be in long format with each row representing a unit-time observation.

est_method : {“ipw”, “std_ipw”}, default “ipw”

The IPW estimation method to use.

“ipw”: Standard inverse propensity weighted estimator (Horvitz-Thompson type). Weights are not normalized to sum to one. This is based on Abadie (2005).
“std_ipw”: Standardized (Hajek-type) inverse propensity weighted estimator. Weights are normalized to sum to one, which can improve finite sample performance when propensity scores are close to 0 or 1.

weightsname : str | None, default None

Name of the column containing sampling weights. If None, all observations have equal weight. Weights are normalized to have mean 1.

boot : bool, default False

Whether to compute bootstrap standard errors. If False, analytical standard errors are reported.

boot_type : {“weighted”, “multiplier”}, default “weighted”

Type of bootstrap to perform (only relevant if boot=True).

n_boot : int, default 999

Number of bootstrap repetitions (only relevant if boot=True).

inf_func : bool, default False

Whether to return the influence function values.

trim_level : float, default 0.995

The level of trimming for the propensity score.
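
As a rough illustration of two preprocessing steps mentioned in the parameters above, the sketch below rescales sampling weights to have mean 1 and builds a trimming mask from estimated propensity scores. The trimming rule shown (drop comparison units whose propensity score is at or above trim_level, keep all treated units) is an assumption based on common practice for this family of estimators; this page does not spell out the exact rule. The helper names normalize_weights and trim_mask are hypothetical and are not part of moderndid.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale sampling weights so that their mean is 1 (hypothetical helper)."""
    raw_weights = np.asarray(raw_weights, dtype=float)
    return raw_weights / raw_weights.mean()


def trim_mask(ps, d, trim_level=0.995):
    """Return a boolean keep-mask under an ASSUMED trimming rule.

    Treated units (d == 1) are always kept. Comparison units (d == 0) are
    kept only if their propensity score is strictly below trim_level.
    """
    ps = np.asarray(ps, dtype=float)
    d = np.asarray(d)
    keep = np.ones(ps.shape, dtype=bool)
    keep[d == 0] = ps[d == 0] < trim_level
    return keep


# Toy inputs, made up for illustration only.
weights = normalize_weights([2.0, 4.0, 6.0, 8.0])
ps = np.array([0.30, 0.999, 0.999, 0.50])
d = np.array([0, 0, 1, 1])
keep = trim_mask(ps, d)

print(weights)  # relative weights are preserved, mean is 1
print(keep)     # the second unit (comparison, ps near 1) is dropped
```

Rescaling to mean 1 leaves the relative importance of observations unchanged. Trimming guards against comparison units whose weight ps / (1 - ps) would otherwise become very large.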

Returns:

IPWDIDResult

NamedTuple containing:

att: The IPW DiD point estimate.
se: The IPW DiD standard error.
uci: The upper bound of a 95% confidence interval.
lci: The lower bound of a 95% confidence interval.
boots: Bootstrap draws of the ATT if boot=True.
att_inf_func: Influence function values if inf_func=True.
call_params: Original function call parameters.
args: Arguments used in the estimation.
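
The interval fields relate to att and se in the usual way. As a quick check, the rounded values printed in the first example on this page are reproduced by att ± 1.96 × se. The critical value 1.96 is inferred from those printed numbers and is an assumption, not something taken from the library's source.

```python
# Rounded point estimate and standard error as printed in the first example.
att = -1107.8720
se = 408.6127

# Assumed normal critical value for a 95% interval (inferred, not confirmed).
z = 1.96

lci = att - z * se  # lower bound
uci = att + z * se  # upper bound

print(f"[{lci:.4f}, {uci:.4f}]")
```

Because the inputs are already rounded to four decimals, the result agrees with the printed interval [-1908.7530, -306.9911] only up to about three decimal places.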

See also

drdid: Doubly robust DiD estimator.
ordid: Outcome regression DiD estimator.

Notes

The IPW estimator uses the propensity score (probability of being in the treated group) to reweight observations and create a balanced comparison between treated and control units. The standard IPW estimator (“ipw”) uses unnormalized weights, while the standardized version (“std_ipw”) normalizes weights to sum to one within each group, which can improve performance when propensity scores are extreme.
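
To make the difference between the two weighting schemes concrete, here is a minimal NumPy sketch for the panel case. It assumes the propensity scores have already been estimated and ignores sampling weights, trimming, and inference. The function name ipw_att_panel and the toy arrays are hypothetical; this is an illustration of the textbook formulas, not the library's implementation.

```python
import numpy as np


def ipw_att_panel(delta_y, d, ps, standardized=False):
    """Sketch of an IPW DiD estimate of the ATT with two-period panel data.

    delta_y : post-period outcome minus pre-period outcome, per unit
    d       : 1 for treated units, 0 for comparison units
    ps      : propensity score P(D = 1 | X), assumed to be given
    """
    delta_y = np.asarray(delta_y, dtype=float)
    d = np.asarray(d, dtype=float)
    ps = np.asarray(ps, dtype=float)

    w_treat = d                           # treated units get weight 1
    w_cont = ps * (1.0 - d) / (1.0 - ps)  # comparison units get odds weights

    eta_treat = np.mean(w_treat * delta_y) / np.mean(w_treat)

    # Horvitz-Thompson type: scale the comparison term by the treated share.
    # Hajek type: scale by the comparison weights' own mean, so the
    # normalized comparison weights sum to one.
    denom_cont = np.mean(w_cont) if standardized else np.mean(w_treat)
    eta_cont = np.mean(w_cont * delta_y) / denom_cont

    return eta_treat - eta_cont


# Toy data, made up for illustration only.
delta_y = np.array([3.0, 5.0, 1.0, 2.0, 0.0])
d = np.array([1, 1, 0, 0, 0])
ps = np.array([0.6, 0.5, 0.5, 0.2, 0.5])

att_ht = ipw_att_panel(delta_y, d, ps, standardized=False)
att_hajek = ipw_att_panel(delta_y, d, ps, standardized=True)
print(att_ht, att_hajek)
```

The two versions differ only in the denominator of the comparison-group term. When the comparison weights do not average to the treated share in a given sample, the estimates diverge, as they do in this toy data.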

Unlike doubly robust methods, IPW estimators are not robust to misspecification of the propensity score model. However, they can be more efficient when the propensity score model is correctly specified and there is substantial overlap between treated and control groups.

References

[1]

Abadie, A. (2005), “Semiparametric Difference-in-Differences Estimators”, Review of Economic Studies, vol. 72(1), pp. 1-19. https://doi.org/10.1111/0034-6527.00321

[2]

Sant’Anna, P. H. C. and Zhao, J. (2020), “Doubly Robust Difference-in-Differences Estimators”, Journal of Econometrics, vol. 219(1), pp. 101-122. https://doi.org/10.1016/j.jeconom.2020.06.003

Examples

Estimate the average treatment effect on the treated (ATT) using inverse propensity weighting with panel data from a job training program. IPW reweights observations to create balance between treated and control groups.

In [1]: import moderndid
   ...: from moderndid import load_nsw
   ...: 
   ...: nsw_data = load_nsw()
   ...: 
   ...: att_result = moderndid.ipwdid(
   ...:     data=nsw_data,
   ...:     yname="re",
   ...:     tname="year",
   ...:     idname="id",
   ...:     treatname="experimental",
   ...:     xformla="~ age + educ + black + married + nodegree + hisp + re74",
   ...:     panel=True,
   ...:     est_method="ipw",
   ...: )
   ...: 

In [2]: print(att_result)
==============================================================================
 Inverse Probability Weighted DiD Estimator
==============================================================================
 Computed from 32834 observations and 12 covariates.

┌────────────┬────────────┬──────────┬───────────────────────────┐
│        ATT │ Std. Error │ Pr(>|t|) │ [95% Conf. Interval]      │
├────────────┼────────────┼──────────┼───────────────────────────┤
│ -1107.8720 │   408.6127 │   0.0067 │ [-1908.7530, -306.9911] * │
└────────────┴────────────┴──────────┴───────────────────────────┘

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Data structure: Panel data

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Weight type: Normalized (Hajek-type estimator)
 Propensity score: Logistic regression

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Standard errors: Analytical
 Propensity score trimming: 0.995
==============================================================================
 Reference: Abadie (2005), Review of Economic Studies

We can also use the standardized (Hajek-type) IPW estimator, which normalizes weights to sum to one and can be more stable with extreme propensity scores.

In [3]: att_result_std = moderndid.ipwdid(
   ...:     data=nsw_data,
   ...:     yname="re",
   ...:     tname="year",
   ...:     idname="id",
   ...:     treatname="experimental",
   ...:     xformla="~ age + educ + black + married + nodegree + hisp + re74",
   ...:     panel=True,
   ...:     est_method="std_ipw",
   ...: )
   ...: 

In [4]: print(att_result_std)
==============================================================================
 Standardized IPW DiD Estimator (Hajek-type)
==============================================================================
 Computed from 32834 observations and 12 covariates.

┌────────────┬────────────┬──────────┬───────────────────────────┐
│        ATT │ Std. Error │ Pr(>|t|) │ [95% Conf. Interval]      │
├────────────┼────────────┼──────────┼───────────────────────────┤
│ -1021.6095 │   397.5201 │   0.0102 │ [-1800.7488, -242.4701] * │
└────────────┴────────────┴──────────┴───────────────────────────┘

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Data structure: Panel data

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Weight type: Normalized (Hajek-type estimator)
 Propensity score: Logistic regression

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Standard errors: Analytical
 Propensity score trimming: 0.995
==============================================================================
 Reference: Abadie (2005), Review of Economic Studies