moderndid.gen_ddd_mult_periods

moderndid.gen_ddd_mult_periods(n: int, dgp_type: int = 1, panel: bool = True, random_state=None) → dict

Generate data with staggered treatment adoption for multi-period DDD.

Generates data in which units adopt treatment at different times across three periods. The DGP has three timing groups (cohort 0: never treated; cohort 2: first treated in period 2; cohort 3: first treated in period 3) and two partitions (eligible and ineligible).
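To make the staggered design concrete, the sketch below marks which (cohort, partition, period) cells count as treated. It assumes the usual triple-difference convention that a unit is treated only if it is eligible (partition 1), belongs to an adopting cohort, and the period is at or after its adoption period. It also assumes periods are labeled 1 to 3. The function name `is_treated` is illustrative and not part of moderndid.

```python
def is_treated(cohort, partition, period):
    """Return True if the cell is treated under the assumed DDD convention."""
    # Cohort 0 is never treated; cohorts 2 and 3 are coded by the
    # period in which they first receive treatment.
    return cohort != 0 and partition == 1 and period >= cohort


# Print the treatment pattern over the three (assumed) period labels.
for cohort in (0, 2, 3):
    for partition in (0, 1):
        status = [int(is_treated(cohort, partition, t)) for t in (1, 2, 3)]
        print(f"cohort={cohort} partition={partition} treated by period: {status}")
```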

Parameters:

n : int

Number of units to simulate. For panel data, this is the total number of units observed in all periods. For repeated cross-section data, this is the number of observations per period.

dgp_type : {1, 2, 3, 4}, default=1

Controls nuisance function specification:

1: Both propensity score and outcome regression use Z (both correct)
2: Propensity score uses X, outcome regression uses Z (OR correct)
3: Propensity score uses Z, outcome regression uses X (PS correct)
4: Both use X (both misspecified when estimating with Z)
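The four settings are typically used to check double robustness: an estimator that combines a propensity score with an outcome regression is expected to stay consistent as long as at least one of the two models is correctly specified. The lookup below transcribes the list above for an estimator that uses Z. The names `DGP_SPEC` and `dr_expected_consistent` are illustrative and not part of moderndid.

```python
# Which nuisance model is correctly specified when the estimator uses Z,
# transcribed from the dgp_type list above.
DGP_SPEC = {
    1: {"ps_correct": True, "or_correct": True},
    2: {"ps_correct": False, "or_correct": True},
    3: {"ps_correct": True, "or_correct": False},
    4: {"ps_correct": False, "or_correct": False},
}


def dr_expected_consistent(dgp_type):
    """A doubly robust estimator needs at least one correct nuisance model."""
    spec = DGP_SPEC[dgp_type]
    return spec["ps_correct"] or spec["or_correct"]


for k in sorted(DGP_SPEC):
    print(k, dr_expected_consistent(k))
```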

panel : bool, default=True

If True, generate panel data where each unit is observed in all periods. If False, generate repeated cross-section data where different units are sampled in each period.

random_state : int, Generator, or None, default=None

Seed or random number generator used to make the simulated data reproducible.

Returns:

dict

Dictionary containing:

data: pl.DataFrame in long format with columns [id, group, partition, time, y, cov1, cov2, cov3, cov4, cluster]
data_wide: pl.DataFrame in wide format with one row per unit (only for panel=True)
es_0_oracle: Oracle event-study parameter at event time 0
prob_g2_p1: Proportion of units that are both in cohort 2 and eligible
prob_g3_p1: Proportion of units that are both in cohort 3 and eligible
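A minimal usage sketch based only on the signature and return keys documented above. The helper `summarize_ddd_sim` is a hypothetical name, not part of moderndid, and the argument values (`n=500`, `random_state=42`) are arbitrary. The import is guarded so the snippet still runs where moderndid is not installed.

```python
def summarize_ddd_sim(result):
    """Collect a few facts from the dict returned by gen_ddd_mult_periods.

    Hypothetical helper (not part of moderndid); it reads only the keys
    listed in the Returns section.
    """
    return {
        "columns": list(result["data"].columns),
        # data_wide is documented as available only for panel=True.
        "has_wide": result.get("data_wide") is not None,
        "es_0_oracle": result["es_0_oracle"],
        # Share of units that are eligible and in an adopting cohort.
        "share_eligible_adopters": result["prob_g2_p1"] + result["prob_g3_p1"],
    }


try:
    from moderndid import gen_ddd_mult_periods
except ImportError:  # moderndid not installed; skip the live call
    gen_ddd_mult_periods = None

if gen_ddd_mult_periods is not None:
    sim = gen_ddd_mult_periods(n=500, dgp_type=1, panel=True, random_state=42)
    print(summarize_ddd_sim(sim))
```

Passing the same `random_state` again should reproduce the same simulated data, which makes the oracle value and the cohort shares comparable across runs.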