moderndid.diddynamic.dyn_balancing
- moderndid.diddynamic.dyn_balancing(data, yname: str, tname: str, idname: str, treatment_name: str, ds1: list[int], ds2: list[int], xformla: str | None = None, fixed_effects: list[str] | None = None, pooled: bool = False, clustervars: list[str] | None = None, balancing: str = 'dcb', method: str = 'lasso_plain', alp: float = 0.05, final_period: int | None = None, initial_period: int | None = None, adaptive_balancing: bool = True, debias: bool = False, continuous_treatment: bool = False, lb: float = 0.0005, ub: float = 2.0, regularization: bool = True, fast_adaptive: bool = False, grid_length: int = 1000, n_beta_nonsparse: float = 0.0001, ratio_coefficients: float = 0.3333333333333333, nfolds: int = 10, lags: int | None = None, robust_quantile: bool = True, demeaned_fe: bool = False, histories_length: list[int] | None = None, final_periods: list[int] | None = None, impulse_response: bool = False, n_jobs: int = 1) -> DynBalancingResult | DynBalancingHistoryResult | DynBalancingHetResult
Estimate treatment effects under dynamic treatment regimes.
Implements the dynamic covariate balancing (DCB) estimator of [1] for comparing potential outcomes under two treatment histories \(d_{1:T}\) and \(d'_{1:T}\). The average treatment effect is defined as
\[\text{ATE}(d_{1:T}, d'_{1:T}) = \mu_T(d_{1:T}) - \mu_T(d'_{1:T}),\]
where \(\mu_T(d_{1:T}) = \mathbb{E}[Y_T(d_{1:T})]\) is the expected potential outcome in period \(T\) under treatment history \(d_{1:T}\).
Identification relies on a sequential conditional independence assumption and on overlap. For each period \(t\), the DCB estimator solves a quadratic program to find balancing weights \(\hat{\gamma}_t\) that satisfy dynamic covariate balance constraints while minimising the \(\ell_2\) norm of the weights. The potential outcome is then estimated as a bias-corrected weighted average of outcomes in the final period. IPW, AIPW, and IPW-MSM alternatives are also available as benchmarks.
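To make the idea of minimum-norm balancing weights concrete, here is a deliberately simplified sketch. It is not the library's quadratic program: it assumes a single period, a single covariate, exact balance rather than a tolerance, and no sign constraints on the weights. The helper name `min_norm_balancing_weights` is hypothetical. Under those assumptions the problem has a closed form, because the Lagrangian conditions force each weight to be linear in the covariate.

```python
def min_norm_balancing_weights(x, target):
    """Weights g minimising sum(g_i**2) subject to
    sum(g_i) == 1 and sum(g_i * x_i) == target.

    Simplified stand-in for a balancing step (assumption: one covariate,
    exact balance, weights may be negative).
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("covariate has no variation; balance is degenerate")
    # Stationarity gives g_i = a + b * x_i; solve the 2x2 system
    #   n*a  + sx*b  = 1
    #   sx*a + sxx*b = target
    a = (sxx - sx * target) / det
    b = (n * target - sx) / det
    return [a + b * v for v in x]


# Covariate values among units following the target treatment history
# (made-up numbers), reweighted to match a population mean of 0.6.
x_treated = [0.2, 0.5, 0.9, 1.4]
weights = min_norm_balancing_weights(x_treated, target=0.6)
weighted_mean = sum(w * v for w, v in zip(weights, x_treated))
print(weights, weighted_mean)
```

When the target equals the sample mean of `x`, the solution collapses to uniform weights, which is the smallest-norm choice that sums to one.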
- Parameters:
- data
DataFrame Panel data in long format. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.
- yname
str The name of the outcome variable.
- tname
str The name of the column containing the time periods.
- idname
str The individual (cross-sectional unit) id name.
- treatment_name
str The name of the binary treatment column.
- ds1
list[int] Target treatment history for the first potential outcome. Length must equal the number of time periods.
- ds2
list[int] Target treatment history for the second potential outcome. Must have the same length as ds1.
- xformla
str or None, default=None A formula for the covariates to include in the model. It should be of the form "~ X1 + X2".
- fixed_effects
list[str] or None, default=None Column names to include as fixed-effect dummies.
- pooled
bool, default=False If True, pool observations across periods for coefficient estimation.
- clustervars
list[str] or None, default=None Column names on which to cluster standard errors.
- balancing
{'dcb', 'aipw', 'ipw', 'ipw_msm'}, default='dcb' Weighting strategy. 'dcb' uses dynamic covariate balancing, 'ipw' uses inverse probability weighting, 'aipw' uses augmented IPW, and 'ipw_msm' uses stabilised marginal structural model weights.
- method
{'lasso_plain', 'lasso_subsample'}, default='lasso_plain' LASSO estimation strategy for the coefficient stage.
- alp
float, default=0.05 Significance level for confidence intervals.
- final_period
int or None, default=None Last time period to include. Defaults to the maximum in the data.
- initial_period
int or None, default=None First time period to include. Defaults to the minimum in the data.
- adaptive_balancing
bool, default=True If True, use tighter balance constraints on covariates with large estimated coefficients.
- debias
bool, default=False If True, apply bootstrap debiasing with 20 replicates.
- continuous_treatment
bool, default=False If True, treat the treatment variable as continuous.
- lb
float, default=0.0005 Lower bound for tuning constant grid search.
- ub
float, default=2.0 Upper bound for tuning constant grid search.
- regularization
bool, default=True If True, use cross-validated LASSO; otherwise use ridge.
- fast_adaptive
bool, default=False If True, use a flat grid search instead of the three-segment nested search.
- grid_length
int, default=1000 Number of grid points for tuning constant search.
- n_beta_nonsparse
float, default=1e-4 Threshold below which a rescaled coefficient is treated as zero.
- ratio_coefficients
float, default=1/3 Fraction of largest coefficients to prioritise when sparsity is low.
- nfolds
int, default=10 Cross-validation folds for LASSO.
- lags
int or None, default=None Treatment lags for the coefficient stage.
- robust_quantile
bool, default=True If True, use chi-squared critical values for inference.
- demeaned_fe
bool, default=False If True, demean fixed effects before estimation.
- histories_length
list[int] or None, default=None If provided, estimate ATEs for varying treatment history lengths. Each entry k must satisfy 1 <= k <= len(ds1). For each k, the last k elements of ds1 and ds2 are used. Returns a DynBalancingHistoryResult. Mutually exclusive with final_periods.
- final_periods
list[int] or None, default=None If provided, estimate ATEs at each specified final period. Returns a DynBalancingHetResult. Mutually exclusive with histories_length.
- impulse_response
bool, default=False If True (requires histories_length), estimate impulse responses instead of cumulative effects. For each history length k, the treatment sequences are set to ds1 = [1, 0, ..., 0] and ds2 = [0, 0, ..., 0] (both length k), measuring the effect of a one-period treatment shock at varying horizons.
- n_jobs
int, default=1 Number of parallel workers for the histories_length and final_periods modes. 1 = sequential, -1 = all cores, >1 = that many threads.
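The rules for histories_length and impulse_response can be sketched in a few lines of plain Python. This is an illustration of the documented behaviour only, not the library's internal code; the helper name `history_sequences` and its error messages are hypothetical.

```python
def history_sequences(ds1, ds2, histories_length, impulse_response=False):
    """Return {k: (ds1_k, ds2_k)} following the documented rules.

    Hypothetical helper for illustration; not part of moderndid.
    """
    if len(ds1) != len(ds2):
        raise ValueError("ds1 and ds2 must have the same length")
    sequences = {}
    for k in histories_length:
        if not 1 <= k <= len(ds1):
            raise ValueError(f"history length {k} must be in 1..{len(ds1)}")
        if impulse_response:
            # One-period shock k periods back, compared with never treated.
            sequences[k] = ([1] + [0] * (k - 1), [0] * k)
        else:
            # Cumulative effect: keep the last k elements of each history.
            sequences[k] = (ds1[-k:], ds2[-k:])
    return sequences


cumulative = history_sequences([0, 1, 1], [0, 0, 0], [1, 2, 3])
impulse = history_sequences([1, 1, 1], [0, 0, 0], [1, 3], impulse_response=True)
print(cumulative[2])  # last two elements of each history
print(impulse[3])     # shock three periods before the final period
```

Slicing from the end keeps the most recent periods, which matches the convention that histories are read from the earliest to the most recent period.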
- Returns:
DynBalancingResult or DynBalancingHistoryResult or DynBalancingHetResult When neither histories_length nor final_periods is set, returns a single DynBalancingResult. Otherwise returns the corresponding multi-result container.
- att: The ATE point estimate (\(\mu_1 - \mu_2\))
- var_att: Variance of the ATE
- mu1: Potential outcome estimate under ds1
- mu2: Potential outcome estimate under ds2
- var_mu1: Variance of mu1
- var_mu2: Variance of mu2
- robust_quantile: Chi-squared critical value for inference
- gaussian_quantile: Gaussian critical value for inference
- gammas: Balancing weights per treatment history
- coefficients: LASSO coefficients per treatment history
- imbalances: Covariate imbalance measures
- estimation_params: Metadata (observation count, variable names, etc.)
References
[1] Viviano, D. and Bradic, J. (2026). “Dynamic covariate balancing: estimating treatment effects over time with potential local projections.” Biometrika, asag016. https://doi.org/10.1093/biomet/asag016
[2] Acemoglu, D., Naidu, S., Restrepo, P., and Robinson, J.A. (2019). “Democracy does cause growth.” Journal of Political Economy, 127(1), 47-100. https://doi.org/10.1086/700936
Examples
The dataset below contains 141 countries observed across six five-year periods (1989–2010) from the democracy and growth study of Acemoglu et al. (2019) [2]. The treatment D is a binary democracy indicator that can switch on and off across periods, and the outcome Y is log GDP per capita. This is the same application used in [1].
We estimate the effect of being democratic for two consecutive periods compared to not being democratic, controlling for five country-level covariates and region fixed effects. The treatment histories ds1=[1, 1] and ds2=[0, 0] specify the two sequences to compare, read left to right from the earliest to the most recent period:

    from moderndid import load_acemoglu, dyn_balancing

    df = load_acemoglu()
    result = dyn_balancing(
        data=df,
        yname="Y",
        tname="Time",
        idname="Unit",
        treatment_name="D",
        ds1=[1, 1],
        ds2=[0, 0],
        xformla="~ V1 + V2 + V3 + V4 + V5",
        fixed_effects=["region"],
    )
    print(result)

    ==============================================================================
     Dynamic Covariate Balancing Estimation
    ==============================================================================
    DCB estimation for the ATE:
    ┌────────┬────────────┬──────────┬────────────────────────┐
    │ ATE    │ Std. Error │ Pr(>|t|) │ [95% Conf. Interval]   │
    ├────────┼────────────┼──────────┼────────────────────────┤
    │ 0.3011 │ 0.2032     │ 0.1383   │ [ -0.0971, 0.6993]     │
    └────────┴────────────┴──────────┴────────────────────────┘
    ------------------------------------------------------------------------------
     Signif. codes: '*' confidence interval does not cover 0
    ------------------------------------------------------------------------------
     Potential Outcomes
    ------------------------------------------------------------------------------
     mu(ds1): 8.0044 (0.1397)
     mu(ds2): 7.7033 (0.1476)
    ------------------------------------------------------------------------------
     Data Info
    ------------------------------------------------------------------------------
     Treatment history ds1: [1, 1]
     Treatment history ds2: [0, 0]
     Outcome variable: Y
     Units: 137
     Observations: 274
    ------------------------------------------------------------------------------
     Estimation Details
    ------------------------------------------------------------------------------
     Balancing: DCB
     Coefficient estimation: lasso_plain
    ------------------------------------------------------------------------------
     Inference
    ------------------------------------------------------------------------------
     Significance level: 0.05
     Analytical standard errors
     Robust (chi-squared) critical values
    ==============================================================================
     Viviano and Bradic (2026)
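As a sanity check on how the printed quantities relate, the sketch below recomputes the ATE row from the two potential-outcome estimates. Two things here are assumptions rather than documented behaviour: that the variance of the ATE is the sum of the two potential-outcome variances, and that a standard normal critical value is used. Both happen to reproduce the printed row up to rounding, but the library may compute them differently (the summary itself reports chi-squared critical values).

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the printed summary above (standard errors in parentheses).
mu1, se_mu1 = 8.0044, 0.1397
mu2, se_mu2 = 7.7033, 0.1476
alp = 0.05

att = mu1 - mu2                        # ATE = mu(ds1) - mu(ds2)
se_att = sqrt(se_mu1**2 + se_mu2**2)   # assumption: variances add

z = NormalDist().inv_cdf(1 - alp / 2)  # assumption: normal critical value
ci = (att - z * se_att, att + z * se_att)
p_value = 2 * (1 - NormalDist().cdf(abs(att / se_att)))

print(f"ATE={att:.4f}  SE={se_att:.4f}  p={p_value:.4f}  "
      f"CI=[{ci[0]:.4f}, {ci[1]:.4f}]")
```

Because the inputs are themselves rounded to four decimals, the recomputed interval and p-value can differ from the printed ones in the last digit.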