moderndid.diddynamic.dyn_balancing

moderndid.diddynamic.dyn_balancing(data, yname: str, tname: str, idname: str, treatment_name: str, ds1: list[int], ds2: list[int], xformla: str | None = None, fixed_effects: list[str] | None = None, pooled: bool = False, clustervars: list[str] | None = None, balancing: str = 'dcb', method: str = 'lasso_plain', alp: float = 0.05, final_period: int | None = None, initial_period: int | None = None, adaptive_balancing: bool = True, debias: bool = False, continuous_treatment: bool = False, lb: float = 0.0005, ub: float = 2.0, regularization: bool = True, fast_adaptive: bool = False, grid_length: int = 1000, n_beta_nonsparse: float = 0.0001, ratio_coefficients: float = 0.3333333333333333, nfolds: int = 10, lags: int | None = None, robust_quantile: bool = True, demeaned_fe: bool = False, histories_length: list[int] | None = None, final_periods: list[int] | None = None, impulse_response: bool = False, n_jobs: int = 1) → DynBalancingResult | DynBalancingHistoryResult | DynBalancingHetResult

Estimate treatment effects under dynamic treatment regimes.

Implements the dynamic covariate balancing (DCB) estimator of [1] for comparing potential outcomes under two treatment histories \(d_{1:T}\) and \(d'_{1:T}\). The average treatment effect is defined as

\[\text{ATE}(d_{1:T}, d'_{1:T}) = \mu_T(d_{1:T}) - \mu_T(d'_{1:T}),\]

where \(\mu_T(d_{1:T}) = \mathbb{E}[Y_T(d_{1:T})]\) is the mean potential outcome in the final period \(T\) under treatment history \(d_{1:T}\).

Identification relies on a sequential conditional independence assumption and overlap. For each period \(t\), the DCB estimator solves a quadratic program to find balancing weights \(\hat{\gamma}_t\) that satisfy dynamic covariate balance constraints while minimising the \(\ell_2\) norm of the weights. The potential outcome is then estimated as a bias-corrected weighted average of outcomes in the final period. IPW, AIPW, and IPW-MSM estimators are also available as benchmarks.
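To make the balancing step concrete, the following is a minimal single-period sketch, not the library's implementation. It assumes exact balance (equality constraints) rather than the tolerance-based constraints DCB uses, and it ignores any sign or bound constraints the actual estimator may impose. The data and the names `X`, `matches`, `A`, `b` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 50 units, 3 covariates.
n, p = 50, 3
X = rng.normal(size=(n, p))

# Units whose observed treatment matches the target history in this period.
matches = np.arange(n) % 2 == 0
X_m = X[matches]

# Equality constraints: the weights sum to one and reproduce the
# full-sample covariate means (a simplification of DCB's balance bands).
A = np.vstack([np.ones(X_m.shape[0]), X_m.T])  # shape (p + 1, n_matching)
b = np.concatenate([[1.0], X.mean(axis=0)])    # shape (p + 1,)

# For a consistent underdetermined system, lstsq returns the
# minimum l2-norm solution, i.e. the smallest-norm balancing weights.
gamma, *_ = np.linalg.lstsq(A, b, rcond=None)

print("max balance residual:", np.abs(A @ gamma - b).max())
print("squared l2 norm of weights:", float(gamma @ gamma))
```

With equality constraints the problem has this closed-form least-norm solution; the inequality-constrained version described above requires a quadratic programming solver instead.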

Parameters:

data (DataFrame): Panel data in long format. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.
yname (str): The name of the outcome variable.
tname (str): The name of the column containing the time periods.
idname (str): The individual (cross-sectional unit) id name.
treatment_name (str): The name of the binary treatment column.
ds1 (list[int]): Target treatment history for the first potential outcome. Length must equal the number of time periods.
ds2 (list[int]): Target treatment history for the second potential outcome. Must have the same length as ds1.
xformla (str or None, default=None): A formula for the covariates to include in the model. It should be of the form "~ X1 + X2".
fixed_effects (list[str] or None, default=None): Column names to include as fixed-effect dummies.
pooled (bool, default=False): If True, pool observations across periods for coefficient estimation.
clustervars (list[str] or None, default=None): Column names on which to cluster standard errors.
balancing ({'dcb', 'aipw', 'ipw', 'ipw_msm'}, default='dcb'): Weighting strategy. 'dcb' uses dynamic covariate balancing, 'ipw' uses inverse probability weighting, 'aipw' uses augmented IPW, and 'ipw_msm' uses stabilised marginal structural model weights.
method ({'lasso_plain', 'lasso_subsample'}, default='lasso_plain'): LASSO estimation strategy for the coefficient stage.
alp (float, default=0.05): Significance level for confidence intervals.
final_period (int or None, default=None): Last time period to include. Defaults to the maximum in the data.
initial_period (int or None, default=None): First time period to include. Defaults to the minimum in the data.
adaptive_balancing (bool, default=True): If True, use tighter balance constraints on covariates with large estimated coefficients.
debias (bool, default=False): If True, apply bootstrap debiasing with 20 replicates.
continuous_treatment (bool, default=False): If True, treat the treatment variable as continuous.
lb (float, default=0.0005): Lower bound for tuning constant grid search.
ub (float, default=2.0): Upper bound for tuning constant grid search.
regularization (bool, default=True): If True, use cross-validated LASSO; otherwise use ridge.
fast_adaptive (bool, default=False): If True, use flat grid search instead of three-segment nested search.
grid_length (int, default=1000): Number of grid points for tuning constant search.
n_beta_nonsparse (float, default=1e-4): Threshold below which a rescaled coefficient is treated as zero.
ratio_coefficients (float, default=1/3): Fraction of largest coefficients to prioritise when sparsity is low.
nfolds (int, default=10): Cross-validation folds for LASSO.
lags (int or None, default=None): Treatment lags for the coefficient stage.
robust_quantile (bool, default=True): If True, use chi-squared critical values for inference.
demeaned_fe (bool, default=False): If True, demean fixed effects before estimation.
histories_length (list[int] or None, default=None): If provided, estimate ATEs for varying treatment history lengths. Each entry k must satisfy 1 <= k <= len(ds1). For each k, the last k elements of ds1 and ds2 are used. Returns a DynBalancingHistoryResult. Mutually exclusive with final_periods.
final_periods (list[int] or None, default=None): If provided, estimate ATEs at each specified final period. Returns a DynBalancingHetResult. Mutually exclusive with histories_length.
impulse_response (bool, default=False): If True (requires histories_length), estimate impulse responses instead of cumulative effects. For each history length k, the treatment sequences are set to ds1 = [1, 0, ..., 0] and ds2 = [0, 0, ..., 0] (both length k), measuring the effect of a one-period treatment shock at varying horizons.
n_jobs (int, default=1): Number of parallel workers for histories_length and final_periods modes. 1 = sequential, -1 = all cores, >1 = that many threads.
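The way histories_length and impulse_response reshape the target histories can be illustrated with a short standalone sketch. The helper `truncate_histories` is hypothetical and is not part of moderndid; it only mirrors the rules stated in the parameter descriptions above (last k elements for cumulative effects, a one-period shock followed by zeros for impulse responses).

```python
def truncate_histories(ds1, ds2, histories_length, impulse_response=False):
    """Hypothetical helper: build the (ds1, ds2) pair used for each length k."""
    if len(ds1) != len(ds2):
        raise ValueError("ds1 and ds2 must have the same length")
    pairs = {}
    for k in histories_length:
        if not 1 <= k <= len(ds1):
            raise ValueError(f"history length {k} must be between 1 and {len(ds1)}")
        if impulse_response:
            # One-period treatment shock followed by zeros, versus never treated.
            pairs[k] = ([1] + [0] * (k - 1), [0] * k)
        else:
            # Keep only the k most recent periods of each target history.
            pairs[k] = (ds1[-k:], ds2[-k:])
    return pairs


# Cumulative effects: compare the last k periods of each history.
print(truncate_histories([1, 0, 1], [0, 0, 0], [1, 2, 3]))

# Impulse responses: the supplied histories only determine the admissible lengths.
print(truncate_histories([1, 0, 1], [0, 0, 0], [1, 2, 3], impulse_response=True))
```

Reading the dictionary by key k shows which pair of sequences each entry of histories_length compares under the stated rules.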

Returns:

DynBalancingResult or DynBalancingHistoryResult or DynBalancingHetResult

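When neither histories_length nor final_periods is set, returns a single DynBalancingResult. Otherwise returns the corresponding multi-result container: DynBalancingHistoryResult when histories_length is given, DynBalancingHetResult when final_periods is given. A DynBalancingResult has the following fields: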

att: The ATE point estimate (\(\mu_1 - \mu_2\))
var_att: Variance of the ATE
mu1: Potential outcome estimate under ds1
mu2: Potential outcome estimate under ds2
var_mu1: Variance of mu1
var_mu2: Variance of mu2
robust_quantile: Chi-squared critical value for inference
gaussian_quantile: Gaussian critical value for inference
gammas: Balancing weights per treatment history
coefficients: LASSO coefficients per treatment history
imbalances: Covariate imbalance measures
estimation_params: Metadata (observation count, variable names, etc.)

References

[1] Viviano, D. and Bradic, J. (2026). “Dynamic covariate balancing: estimating treatment effects over time with potential local projections.” Biometrika, asag016. https://doi.org/10.1093/biomet/asag016

[2] Acemoglu, D., Naidu, S., Restrepo, P., and Robinson, J.A. (2019). “Democracy does cause growth.” Journal of Political Economy, 127(1), 47-100. https://doi.org/10.1086/700936

Examples

The dataset below contains 141 countries observed across six five-year periods (1989–2010) from the democracy and growth study of Acemoglu et al. (2019) [2]. The treatment D is a binary democracy indicator that can switch on and off across periods, and the outcome Y is log GDP per capita. This is the same application used in [1].

We estimate the effect of being democratic for two consecutive periods compared to not being democratic, controlling for five country-level covariates and region fixed effects. The treatment histories ds1=[1, 1] and ds2=[0, 0] specify the two sequences to compare, read left to right from the earliest to the most recent period:

from moderndid import load_acemoglu, dyn_balancing

df = load_acemoglu()
result = dyn_balancing(
    data=df,
    yname="Y",
    tname="Time",
    idname="Unit",
    treatment_name="D",
    ds1=[1, 1],
    ds2=[0, 0],
    xformla="~ V1 + V2 + V3 + V4 + V5",
    fixed_effects=["region"],
)
print(result)

==============================================================================
 Dynamic Covariate Balancing Estimation
==============================================================================

 DCB estimation for the ATE:

┌────────┬────────────┬──────────┬────────────────────────┐
│    ATE │ Std. Error │ Pr(>|t|) │ [95% Conf. Interval]   │
├────────┼────────────┼──────────┼────────────────────────┤
│ 0.3011 │     0.2032 │   0.1383 │ [ -0.0971,   0.6993]   │
└────────┴────────────┴──────────┴────────────────────────┘

------------------------------------------------------------------------------
 Signif. codes: '*' confidence interval does not cover 0

------------------------------------------------------------------------------
 Potential Outcomes
------------------------------------------------------------------------------
 mu(ds1):  8.0044  (0.1397)
 mu(ds2):  7.7033  (0.1476)

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Treatment history ds1: [1, 1]
 Treatment history ds2: [0, 0]
 Outcome variable: Y
 Units: 137
 Observations: 274

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Balancing: DCB
 Coefficient estimation: lasso_plain

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Significance level: 0.05
 Analytical standard errors
 Robust (chi-squared) critical values
==============================================================================
 Viviano and Bradic (2026)
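As a reading aid, the printed summary can be checked by hand. The sketch below rests on two assumptions inferred from the printed numbers, not from the library's source: that the ATE variance is the sum of the two potential-outcome variances, and that the 95% interval uses a multiplier of about 1.96. How moderndid derives its robust critical value is not shown here.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the printed output above.
mu1, se_mu1 = 8.0044, 0.1397
mu2, se_mu2 = 7.7033, 0.1476
alp = 0.05

ate = mu1 - mu2

# Assumption: var(ATE) = var(mu1) + var(mu2).
se_ate = sqrt(se_mu1**2 + se_mu2**2)

# Assumption: two-sided normal critical value (about 1.96 at alp = 0.05).
crit = NormalDist().inv_cdf(1 - alp / 2)
ci = (ate - crit * se_ate, ate + crit * se_ate)
p_value = 2 * (1 - NormalDist().cdf(abs(ate) / se_ate))

print(f"ATE = {ate:.4f}, SE = {se_ate:.4f}")
print(f"CI  = [{ci[0]:.4f}, {ci[1]:.4f}], p = {p_value:.4f}")
```

Under these assumptions the computed values agree with the table up to rounding of the four-decimal inputs.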