moderndid.calculate_pscore_ipt

moderndid.calculate_pscore_ipt(D, X, iw, quantiles=None)

Calculate propensity scores using Inverse Probability Tilting for the ATT.

Implements a variant of IPT tailored to estimating the Average Treatment Effect on the Treated (ATT). Instead of re-weighting both the treated and control groups to match the full sample, it estimates a propensity score model that implies a re-weighting of the control group to match the covariate distribution of the treated group. The propensity score parameters \(\gamma\) solve the optimization problem

\[\widehat{\gamma}^{ipt} = \arg\max_{\gamma \in \Gamma} \mathbb{E}_{n} \left[D X^{\prime} \gamma - (1-D) \exp(X^{\prime} \gamma)\right].\]

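Differentiating the objective with respect to \(\gamma\) and setting the gradient to zero gives \(\mathbb{E}_{n}\left[\left(D - (1-D)\exp(X^{\prime}\gamma)\right) X\right] = 0\). Since \(\exp(X^{\prime}\gamma) = p(X)/(1-p(X))\) under the logistic link, this first-order condition, once the observation weights are applied, implies the balancing property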

\[\sum_{i: D_i=1} w_i X_i = \sum_{i: D_i=0} w_i \frac{\widehat{p}(X_i)}{1-\widehat{p}(X_i)} X_i,\]

where \(\widehat{p}(X) = \text{expit}(X'\widehat{\gamma}^{ipt})\) is the estimated propensity score, \(\text{expit}(v) = \exp(v) / (1 + \exp(v))\) is the logistic function, and \(w_i\) are the observation weights. Because \(X\) includes an intercept, the total weight of the re-weighted control group also equals that of the treated group, so the weighted average of covariates in the control group matches the weighted average in the treated group. This is a key condition for identifying the ATT.
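The balancing property can be checked numerically. The following is a minimal sketch, not the library's implementation: it maximizes the objective above with damped Newton steps in plain NumPy (the library itself is described as using trust-constr and BFGS), and the helper name `fit_ipt_att_sketch`, the simulated data, and the tolerances are all assumptions made for illustration.

```python
import numpy as np


def fit_ipt_att_sketch(D, X, w, max_iter=100, tol=1e-9):
    """Hypothetical helper: maximize mean(w * (D * X@g - (1 - D) * exp(X@g)))."""
    n = X.shape[0]

    def objective(g):
        index = X @ g
        return np.mean(w * (D * index - (1 - D) * np.exp(index)))

    gamma = np.zeros(X.shape[1])
    for _ in range(max_iter):
        odds = np.exp(X @ gamma)  # equals p / (1 - p) under the logistic link
        grad = X.T @ (w * (D - (1 - D) * odds)) / n  # first-order condition
        if np.max(np.abs(grad)) < tol:
            break
        # Hessian of the concave objective (negative definite for full-rank X).
        hess = -(X * (w * (1 - D) * odds)[:, None]).T @ X / n
        direction = -np.linalg.solve(hess, grad)  # Newton ascent direction
        step, current = 1.0, objective(gamma)
        # Halve the step until the objective does not decrease.
        while step > 1e-8 and not (objective(gamma + step * direction) >= current):
            step *= 0.5
        gamma = gamma + step * direction
    return gamma


# Simulated data (assumed for illustration): intercept plus two covariates.
rng = np.random.default_rng(0)
n_obs = 2000
X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])
true_index = X @ np.array([-0.5, 0.3, -0.5])
D = (rng.uniform(size=n_obs) < 1 / (1 + np.exp(-true_index))).astype(float)
w = np.ones(n_obs)

gamma_hat = fit_ipt_att_sketch(D, X, w)
pscore = 1 / (1 + np.exp(-(X @ gamma_hat)))
odds = pscore / (1 - pscore)

# Left- and right-hand sides of the balancing property.
treated_moments = X.T @ (w * D)
control_moments = X.T @ (w * (1 - D) * odds)
balanced = np.allclose(treated_moments, control_moments, atol=1e-5)
print(treated_moments, control_moments, balanced)
```

The first entry of each moment vector corresponds to the intercept, so it compares the total treated weight with the total odds-weighted control weight.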

Parameters:

D (numpy.ndarray): Treatment indicator (1D array), equal to 1 for treated units and 0 for control units.
X (numpy.ndarray): Covariate matrix (2D array, n_obs x n_features). Must include an intercept column.
iw (numpy.ndarray): Individual observation weights (1D array).
quantiles (dict[int, list[float]] | None): Dict mapping column indices of X to quantile levels (values between 0 and 1). For example, {1: [0.25, 0.5, 0.75]} adds the 25th, 50th, and 75th percentiles of the 2nd column (index 1) as balance constraints. Default is None (no quantile constraints).
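The text above does not say how a quantile becomes a balance constraint. One plausible reading, which is an assumption and not confirmed here, is that each quantile level adds an indicator column \(1\{X_j \le q\}\) to the covariate matrix, so that balancing its mean balances the covariate's distribution function at that cutoff. A minimal sketch of that idea, using a hypothetical helper `add_quantile_indicators` and unweighted quantiles:

```python
import numpy as np


def add_quantile_indicators(X, quantiles):
    """Hypothetical helper: append 1{X[:, col] <= cutoff} for each requested quantile."""
    extra = []
    for col, probs in quantiles.items():
        cutoffs = np.quantile(X[:, col], probs)  # unweighted quantiles (assumption)
        for cutoff in cutoffs:
            extra.append((X[:, col] <= cutoff).astype(float))
    if not extra:
        return X
    return np.column_stack([X] + extra)


# Toy covariate matrix: an intercept and one covariate taking values 0..7.
X_demo = np.column_stack([np.ones(8), np.arange(8.0)])
X_aug = add_quantile_indicators(X_demo, {1: [0.25, 0.5, 0.75]})
print(X_aug.shape)
```

Whether the library uses weighted quantiles, or a different transformation altogether, should be checked against its source.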

Returns:

numpy.ndarray: Propensity scores.

Notes

The general IPT framework described in [1] for the ATE involves solving two separate moment equations to find weights for the treated and control groups that balance covariates with the full sample. These are equations (8) and (11) in their paper, where \(G(\cdot)\) denotes the propensity score link function and \(t(X)\) a vector of known functions of the covariates:

\[\frac{1}{N} \sum_{i=1}^{N}\left\{\frac{D_{i}} {G\left(t\left(X_{i}\right)^{\prime} \delta_{I P T}^{1}\right)}-1\right\} t\left(X_{i}\right)=0\]

and

\[\frac{1}{N} \sum_{i=1}^{N}\left\{\frac{1-D_{i}} {1-G\left(t\left(X_{i}\right)^{\prime} \delta_{I P T}^{0}\right)}-1\right\} t\left(X_{i}\right)=0.\]

This implementation, following [2], uses a single objective function tailored for ATT estimation. It first attempts to solve the problem with a trust-constr optimizer. If that fails, it falls back to a BFGS optimization of a modified loss function, and finally to a standard logit model if both IPT optimizations fail.

References

[1]

Graham, B., Pinto, C., and Egel, D. (2012), “Inverse Probability Tilting for Moment Condition Models with Missing Data,” The Review of Economic Studies, 79(3), 1053-1079. https://doi.org/10.1093/restud/rdr047

[2]

Sant’Anna, P. H., and Zhao, J. (2020), “Inverse Probability Weighting with Missing Data,” Journal of the American Statistical Association, 115(530), 1542-1552. https://doi.org/10.1080/01621459.2019.1635520