moderndid.diddynamic.dyn_balancing

moderndid.diddynamic.dyn_balancing(data, yname: str, tname: str, idname: str, treatment_name: str, ds1: list[int], ds2: list[int], xformla: str | None = None, fixed_effects: list[str] | None = None, pooled: bool = False, clustervars: list[str] | None = None, balancing: str = 'dcb', method: str = 'lasso_plain', alp: float = 0.05, final_period: int | None = None, initial_period: int | None = None, adaptive_balancing: bool = True, debias: bool = False, continuous_treatment: bool = False, lb: float = 0.0005, ub: float = 2.0, regularization: bool = True, fast_adaptive: bool = False, grid_length: int = 1000, n_beta_nonsparse: float = 0.0001, ratio_coefficients: float = 0.3333333333333333, nfolds: int = 10, lags: int | None = None, robust_quantile: bool = True, demeaned_fe: bool = False, histories_length: list[int] | None = None, final_periods: list[int] | None = None, impulse_response: bool = False, n_jobs: int = 1) -> DynBalancingResult | DynBalancingHistoryResult | DynBalancingHetResult

Estimate treatment effects under dynamic treatment regimes.

Implements the dynamic covariate balancing (DCB) estimator of [1] for comparing potential outcomes under two treatment histories \(d_{1:T}\) and \(d'_{1:T}\). The average treatment effect is defined as

\[\text{ATE}(d_{1:T}, d'_{1:T}) = \mu_T(d_{1:T}) - \mu_T(d'_{1:T}),\]

where \(\mu_T(d_{1:T}) = \mathbb{E}[Y_T(d_{1:T})]\) is the mean potential outcome in the final period \(T\) under treatment history \(d_{1:T}\).

Identification relies on a sequential conditional independence assumption and overlap. For each period \(t\), the DCB estimator solves a quadratic program to find balancing weights \(\hat{\gamma}_t\) that satisfy dynamic covariate balance constraints while minimising the \(\ell_2\) norm. The potential outcome is then estimated as a bias-corrected weighted average of outcomes in the final period. IPW, AIPW, and IPW-MSM alternatives are also available as benchmarks.
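As a rough single-period illustration of the weighting step, the sketch below finds the minimum \(\ell_2\)-norm weights that sum to one and reproduce a target covariate mean exactly. It is a simplification, not the package's solver: the estimator described above uses dynamic balance constraints across periods, whereas this sketch imposes exact balance in one period and omits any sign or bound constraints a full quadratic program would typically carry. The data, the `target` vector, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 units with 3 covariates; selection depends on the first one.
X = rng.normal(size=(50, 3))
follows_history = X[:, 0] + rng.normal(size=50) > 0
X_sub = X[follows_history]        # covariates of units on the target history
target = X.mean(axis=0)           # balance toward the full-sample mean

# Linear constraints A @ gamma = b: weights sum to one and match the target means.
A = np.vstack([np.ones(len(X_sub)), X_sub.T])
b = np.concatenate([[1.0], target])

# The pseudoinverse returns the minimum l2-norm solution of A @ gamma = b.
gamma = np.linalg.pinv(A) @ b

print("sum of weights:", gamma.sum())
print("max imbalance:", np.abs(X_sub.T @ gamma - target).max())
```

Because selection here depends on the first covariate, uniform weights would leave that covariate imbalanced; the minimum-norm weights remove the imbalance while staying as close to uniform as the constraints allow.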

Parameters:
data : DataFrame

Panel data in long format. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.

yname : str

The name of the outcome variable.

tname : str

The name of the column containing the time periods.

idname : str

The name of the column containing the unit (cross-sectional) identifier.

treatment_name : str

The name of the binary treatment column.

ds1 : list[int]

Target treatment history for the first potential outcome. Length must equal the number of time periods.

ds2 : list[int]

Target treatment history for the second potential outcome. Must have the same length as ds1.

xformla : str or None, default=None

A formula for the covariates to include in the model. It should be of the form "~ X1 + X2".

fixed_effects : list[str] or None, default=None

Column names to include as fixed-effect dummies.

pooled : bool, default=False

If True, pool observations across periods for coefficient estimation.

clustervars : list[str] or None, default=None

Column names on which to cluster standard errors.

balancing : {'dcb', 'aipw', 'ipw', 'ipw_msm'}, default='dcb'

Weighting strategy. 'dcb' uses dynamic covariate balancing, 'ipw' uses inverse probability weighting, 'aipw' uses augmented IPW, and 'ipw_msm' uses stabilised marginal structural model weights.

method : {'lasso_plain', 'lasso_subsample'}, default='lasso_plain'

LASSO estimation strategy for the coefficient stage.

alp : float, default=0.05

Significance level for confidence intervals.

final_period : int or None, default=None

Last time period to include. Defaults to the maximum in the data.

initial_period : int or None, default=None

First time period to include. Defaults to the minimum in the data.

adaptive_balancing : bool, default=True

If True, use tighter balance constraints on covariates with large estimated coefficients.

debias : bool, default=False

If True, apply bootstrap debiasing with 20 replicates.

continuous_treatment : bool, default=False

If True, treat the treatment variable as continuous.

lb : float, default=0.0005

Lower bound for tuning constant grid search.

ub : float, default=2.0

Upper bound for tuning constant grid search.

regularization : bool, default=True

If True, use cross-validated LASSO; otherwise use ridge.

fast_adaptive : bool, default=False

If True, use flat grid search instead of three-segment nested search.

grid_length : int, default=1000

Number of grid points for tuning constant search.

n_beta_nonsparse : float, default=1e-4

Threshold below which a rescaled coefficient is treated as zero.

ratio_coefficients : float, default=1/3

Fraction of largest coefficients to prioritise when sparsity is low.

nfolds : int, default=10

Cross-validation folds for LASSO.

lags : int or None, default=None

Treatment lags for the coefficient stage.

robust_quantile : bool, default=True

If True, use chi-squared critical values for inference.

demeaned_fe : bool, default=False

If True, demean fixed effects before estimation.

histories_length : list[int] or None, default=None

If provided, estimate ATEs for varying treatment history lengths. Each entry k must satisfy 1 <= k <= len(ds1). For each k, the last k elements of ds1 and ds2 are used. Returns a DynBalancingHistoryResult. Mutually exclusive with final_periods.

final_periods : list[int] or None, default=None

If provided, estimate ATEs at each specified final period. Returns a DynBalancingHetResult. Mutually exclusive with histories_length.

impulse_response : bool, default=False

If True (requires histories_length), estimate impulse responses instead of cumulative effects. For each history length k, the treatment sequences are set to ds1 = [1, 0, ..., 0] and ds2 = [0, 0, ..., 0] (both length k), measuring the effect of a one-period treatment shock at varying horizons.

n_jobs : int, default=1

Number of parallel workers for the histories_length and final_periods modes. Use 1 to run sequentially, -1 to use all cores, or a value greater than 1 to use that many threads.
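To make the histories_length and impulse_response rules concrete, the sketch below builds the treatment sequences they describe for each history length k. The helper name `sequences_for_lengths` is hypothetical and only mirrors the rules stated in the parameter descriptions; it is not part of moderndid.

```python
def sequences_for_lengths(ds1, ds2, histories_length, impulse_response=False):
    """Return {k: (ds1_k, ds2_k)} following the documented rules (illustrative helper)."""
    if len(ds1) != len(ds2):
        raise ValueError("ds1 and ds2 must have the same length")
    out = {}
    for k in histories_length:
        if not 1 <= k <= len(ds1):
            raise ValueError(f"history length {k} is outside 1..{len(ds1)}")
        if impulse_response:
            # One-period treatment shock followed by no treatment, versus never treated.
            out[k] = ([1] + [0] * (k - 1), [0] * k)
        else:
            # Keep the last k elements of each target history.
            out[k] = (ds1[-k:], ds2[-k:])
    return out


cumulative = sequences_for_lengths([1, 0, 1], [0, 0, 0], [1, 2, 3])
impulse = sequences_for_lengths([1, 1, 1], [0, 0, 0], [1, 2, 3], impulse_response=True)
print(cumulative[2])  # ([0, 1], [0, 0])
print(impulse[3])     # ([1, 0, 0], [0, 0, 0])
```

In impulse-response mode the contents of ds1 and ds2 do not affect the generated sequences; in this sketch only their length bounds the admissible values of k.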

Returns:
DynBalancingResult or DynBalancingHistoryResult or DynBalancingHetResult

When neither histories_length nor final_periods is set, returns a single DynBalancingResult. Otherwise returns the corresponding multi-result container. The single-result object exposes the following fields:

  • att: The ATE point estimate (\(\mu_1 - \mu_2\))

  • var_att: Variance of the ATE

  • mu1: Potential outcome estimate under ds1

  • mu2: Potential outcome estimate under ds2

  • var_mu1: Variance of mu1

  • var_mu2: Variance of mu2

  • robust_quantile: Chi-squared critical value for inference

  • gaussian_quantile: Gaussian critical value for inference

  • gammas: Balancing weights per treatment history

  • coefficients: LASSO coefficients per treatment history

  • imbalances: Covariate imbalance measures

  • estimation_params: Metadata (observation count, variable names, etc.)
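As a sketch of how these fields might combine into an interval, the snippet below forms a symmetric confidence interval from a point estimate, its variance, and a critical value. It assumes the interval has the form att ± critical value × sqrt(var_att). The numbers are made up, the helper `confidence_interval` is hypothetical, and the Gaussian critical value is computed with the standard library rather than read from a result object.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(att, var_att, critical_value):
    """Symmetric interval around att (illustrative helper, not a moderndid function)."""
    half_width = critical_value * sqrt(var_att)
    return att - half_width, att + half_width


alp = 0.05
gaussian_crit = NormalDist().inv_cdf(1 - alp / 2)  # two-sided Gaussian critical value
lower, upper = confidence_interval(att=0.5, var_att=0.04, critical_value=gaussian_crit)
print(round(lower, 4), round(upper, 4))
```

Passing a different critical value, such as one taken from the robust_quantile field, would widen or narrow the interval without changing its centre.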

References

[1]

Viviano, D. and Bradic, J. (2026). “Dynamic covariate balancing: estimating treatment effects over time with potential local projections.” Biometrika, asag016. https://doi.org/10.1093/biomet/asag016

[2]

Acemoglu, D., Naidu, S., Restrepo, P., and Robinson, J.A. (2019). “Democracy does cause growth.” Journal of Political Economy, 127(1), 47-100. https://doi.org/10.1086/700936

Examples

The dataset below contains 141 countries observed across six five-year periods (1989–2010) from the democracy and growth study of Acemoglu et al. (2019) [2]. The treatment D is a binary democracy indicator that can switch on and off across periods, and the outcome Y is log GDP per capita. This is the same application used in [1].

We estimate the effect of being democratic for two consecutive periods compared to not being democratic, controlling for five country-level covariates and region fixed effects. The treatment histories ds1=[1, 1] and ds2=[0, 0] specify the two sequences to compare, read left to right from the earliest to the most recent period:

from moderndid import load_acemoglu, dyn_balancing

df = load_acemoglu()
result = dyn_balancing(
    data=df,
    yname="Y",
    tname="Time",
    idname="Unit",
    treatment_name="D",
    ds1=[1, 1],
    ds2=[0, 0],
    xformla="~ V1 + V2 + V3 + V4 + V5",
    fixed_effects=["region"],
)
print(result)
==============================================================================
 Dynamic Covariate Balancing Estimation
==============================================================================

 DCB estimation for the ATE:

┌────────┬────────────┬──────────┬────────────────────────┐
│    ATE │ Std. Error │ Pr(>|t|) │ [95% Conf. Interval]   │
├────────┼────────────┼──────────┼────────────────────────┤
│ 0.3011 │     0.2032 │   0.1383 │ [ -0.0971,   0.6993]   │
└────────┴────────────┴──────────┴────────────────────────┘

------------------------------------------------------------------------------
 Signif. codes: '*' confidence interval does not cover 0

------------------------------------------------------------------------------
 Potential Outcomes
------------------------------------------------------------------------------
 mu(ds1):  8.0044  (0.1397)
 mu(ds2):  7.7033  (0.1476)

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Treatment history ds1: [1, 1]
 Treatment history ds2: [0, 0]
 Outcome variable: Y
 Units: 137
 Observations: 274

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Balancing: DCB
 Coefficient estimation: lasso_plain

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Significance level: 0.05
 Analytical standard errors
 Robust (chi-squared) critical values
==============================================================================
 Viviano and Bradic (2026)
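As a quick sanity check on the printed summary, the sketch below recomputes the ATE, its standard error, and the p-value from the rounded potential-outcome estimates shown above. It assumes the ATE variance is the sum of the two potential-outcome variances and that the p-value is a two-sided normal p-value. Both assumptions are inferred from the printed numbers rather than stated by the documentation, and agreement is only up to rounding.

```python
from math import sqrt
from statistics import NormalDist

# Rounded values copied from the "Potential Outcomes" block of the printed output.
mu1, se1 = 8.0044, 0.1397
mu2, se2 = 7.7033, 0.1476

ate = mu1 - mu2                    # difference in potential outcomes
se = sqrt(se1**2 + se2**2)         # assumes var(ATE) = var(mu1) + var(mu2)
p_value = 2 * (1 - NormalDist().cdf(abs(ate) / se))  # assumes a two-sided normal test

print(f"ATE={ate:.4f}  SE={se:.4f}  p={p_value:.4f}")
```

The recomputed ATE and standard error match the table to four decimals, and the p-value agrees to about three decimals, which is what one would expect when starting from rounded inputs.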