moderndid.cont_did

moderndid.cont_did(data, yname, tname, idname, gname=None, dname=None, xformla='~1', target_parameter='level', aggregation='dose', treatment_type='continuous', dose_est_method='parametric', dvals=None, degree=3, num_knots=0, allow_unbalanced_panel=False, control_group='notyettreated', anticipation=0, weightsname=None, alp=0.05, cband=False, boot=False, boot_type='multiplier', biters=1000, clustervars=None, base_period='varying', random_state=None, n_partitions=None, backend=None, **kwargs)

Compute difference-in-differences with a continuous treatment.

Implements difference-in-differences estimation for settings where treatment intensity varies across units but remains constant over time for each unit, following [1].

With continuous treatments, two distinct causal parameters are of interest. The average treatment effect on the treated at dose \(d\), denoted \(ATT(d|d)\), measures the effect of receiving dose \(d\) compared to no treatment among units that actually received dose \(d\)

\[ATT(d|d) = \mathbb{E}[Y_{t}(d) - Y_{t}(0) \mid D = d].\]

The average causal response on the treated, \(ACRT(d|d)\), measures the marginal effect of increasing the dose, i.e., the slope of the dose-response function

\[ACRT(d|d) = \left.\frac{\partial}{\partial l} \mathbb{E}[Y_{t}(l) \mid D = d]\right|_{l=d}.\]

Under a parallel trends assumption, \(ATT(d|d)\) is identified by comparing the change in outcomes over time, \(\Delta Y\), in dose group \(d\) with the change among untreated units

\[ATT(d|d) = \mathbb{E}[\Delta Y \mid D = d] - \mathbb{E}[\Delta Y \mid D = 0].\]
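As a minimal numerical sketch of this identification result, assume a two-period setup and a dose that takes only a few discrete values. The data and the helper `att_at_dose` are hypothetical and are not part of moderndid:

```python
from statistics import mean

# Hypothetical two-period data: (dose D, outcome change Delta Y) per unit.
units = [
    (0.0, 1.0), (0.0, 1.2), (0.0, 0.8),   # untreated units, D = 0
    (0.5, 1.6), (0.5, 1.4),               # dose group d = 0.5
    (1.0, 2.3), (1.0, 2.1), (1.0, 2.2),   # dose group d = 1.0
]

def att_at_dose(units, d):
    """Sample analogue of E[Delta Y | D = d] - E[Delta Y | D = 0]."""
    treated_change = mean(dy for dose, dy in units if dose == d)
    untreated_change = mean(dy for dose, dy in units if dose == 0.0)
    return treated_change - untreated_change

att_half = att_at_dose(units, 0.5)   # roughly 0.5
att_one = att_at_dose(units, 1.0)    # roughly 1.2
print(round(att_half, 3), round(att_one, 3))
```

With a truly continuous dose, few units share any exact value of \(d\), so simple group means like these are not available. This is why the function smooths over doses using the method selected by dose_est_method.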

Aggregating over the dose distribution among treated units yields the overall average treatment effect and average causal response

\[ATT^o = \mathbb{E}[ATT(D|D) \mid D > 0], \quad ACRT^o = \mathbb{E}[ACRT(D|D) \mid D > 0].\]
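To illustrate the aggregation step, the sketch below averages made-up dose-specific effects over the empirical dose distribution of treated units. The doses and the functions `att` and `acrt` are invented for illustration; in practice these curves are estimated rather than known.

```python
from statistics import mean

# Hypothetical doses; units with D == 0 are untreated and excluded below.
doses = [0.0, 0.0, 0.2, 0.4, 0.4, 0.8, 1.0]

def att(d):
    """Made-up dose-specific effect ATT(d|d) = 2 * d."""
    return 2.0 * d

def acrt(d):
    """Made-up marginal effect ACRT(d|d), constant at 2 for every dose."""
    return 2.0

treated = [d for d in doses if d > 0]

# ATT^o = E[ATT(D|D) | D > 0] and ACRT^o = E[ACRT(D|D) | D > 0],
# computed as simple averages over the treated units' own doses.
overall_att = mean(att(d) for d in treated)
overall_acrt = mean(acrt(d) for d in treated)
print(round(overall_att, 3), overall_acrt)
```

Because the average runs over treated units, doses that occur more often (0.4 here) receive proportionally more weight.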

Parameters:

data : DataFrame

Panel data in long format. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.

yname : str

Name of the column containing the outcome variable.

tname : str

Name of the column containing the time period variable.

idname : str

Name of the column containing the unit ID variable.

gname : str, optional

Name of the column containing the timing-group variable indicating when treatment starts for each unit. If None, it will be computed from the treatment variable. Should be 0 for never-treated units.

dname : str

Name of the column containing the continuous treatment variable. This should represent the “dose” or amount of treatment received, and should be constant across time periods for each unit. Use 0 for never-treated units.

xformla : str, default="~1"

A formula for the covariates to include in the model. Should be of the form “~ X1 + X2” (intercept is always included). Currently only “~1” (no covariates) is supported.

target_parameter : {"level", "slope"}, default="level"

Type of treatment effect to focus on:

“level”: Average treatment effect (ATT) at different dose levels
“slope”: Average causal response (ACRT), the derivative of the dose-response curve

For aggregation="dose", both ATT(d) and ACRT(d) are always computed and reported regardless of this setting. This parameter mainly affects aggregation="eventstudy", where it determines whether to aggregate ATT or ACRT over event time.

aggregation : {"dose", "eventstudy"}, default="dose"

How to aggregate the treatment effects:

“dose”: Average across timing-groups and time periods, report by dose. Both ATT(d) and ACRT(d) curves are returned.
“eventstudy”: Average across timing-groups and doses, report by event time. Returns ATT or ACRT by event time depending on target_parameter.

treatment_type : {"continuous", "discrete"}, default="continuous"

Nature of the treatment variable. Only “continuous” is currently supported.

dose_est_method : {"parametric", "cck"}, default="parametric"

Method for estimating dose-specific effects:

“parametric”: Use B-splines with specified degree and knots
“cck”: Use non-parametric method based on [2].

dvals : array_like, optional

Values of the treatment dose at which to compute effects. If None, uses quantiles of the dose distribution among treated units.

degree : int, default=3

Degree of the B-spline basis functions. Combined with num_knots=0 (default), this fits a global polynomial of the specified degree.

num_knots : int, default=0

Number of interior knots for the B-spline. More knots allow more flexibility but may increase variance.

allow_unbalanced_panel : bool, default=False

Whether to allow unbalanced panel data. Currently not supported.

control_group : {"notyettreated", "nevertreated"}, default="notyettreated"

Which units to use as controls:

“notyettreated”: Units not yet treated by time t
“nevertreated”: Only never-treated units

anticipation : int, default=0

Number of time periods before treatment where effects may appear.

weightsname : str, optional

Name of the column containing sampling weights. If None, all observations have equal weight.

alp : float, default=0.05

Significance level for confidence intervals (e.g., 0.05 for 95% CI).

cband : bool, default=False

Whether to compute uniform confidence bands over all dose values.

boot : bool, default=False

Whether to use bootstrap inference. If False, uses analytical standard errors.

boot_type : str, default="multiplier"

Type of bootstrap to perform (“multiplier” or “empirical”). Only used when boot=True.

biters : int, default=1000

Number of bootstrap iterations for inference. Only used when boot=True.

clustervars : str, optional

Variable(s) for clustering standard errors. Not currently supported.

base_period : {"varying", "universal"}, default="varying"

How to choose the base period for comparisons:

“varying”: Use different base periods for different timing groups
“universal”: Use the same base period for all comparisons

random_state : int, Generator, optional

Controls the randomness of the bootstrap. Pass an int for reproducible results across multiple function calls. Can also accept a NumPy Generator instance.

n_partitions : int, optional

Number of partitions for distributed computation when data is a Dask or Spark DataFrame. If None, defaults to the framework’s default parallelism.

backend : {"numpy", "cupy"} or None, default=None

Array backend to use for this call only. When set, the backend is activated before estimation and the previous backend is restored when the call returns. None (the default) uses whatever backend is currently active (see set_backend). Ignored when data is a Dask or Spark DataFrame.

**kwargs

Additional keyword arguments passed to internal functions.
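As an aside on the degree and num_knots parameters described above, the following sketch shows why zero interior knots amounts to a global polynomial. It assumes a clamped knot vector on [0, 1], which is an assumption about the basis rather than a statement about moderndid's internals. Under that assumption a degree-p B-spline basis with no interior knots coincides with the degree-p Bernstein basis, which spans the polynomials of degree at most p. The helper `bernstein_basis` is hypothetical:

```python
from math import comb

def bernstein_basis(x, degree):
    """Degree-`degree` Bernstein basis at x in [0, 1]: degree + 1 functions."""
    return [
        comb(degree, i) * x**i * (1 - x) ** (degree - i)
        for i in range(degree + 1)
    ]

degree = 3
grid = [k / 10 for k in range(11)]

for x in grid:
    basis = bernstein_basis(x, degree)
    # The basis functions sum to one at every point (partition of unity).
    assert abs(sum(basis) - 1.0) < 1e-12
    # Coefficients i / degree reproduce the linear function f(x) = x, so
    # ordinary polynomial terms lie in the span of the basis.
    linear = sum((i / degree) * b for i, b in enumerate(basis))
    assert abs(linear - x) < 1e-12

n_basis = len(bernstein_basis(0.5, degree))
print(n_basis)  # 4 basis functions for a cubic with no interior knots
```

Each interior knot adds one basis function to a B-spline basis, which is the source of the extra flexibility (and extra variance) mentioned under num_knots.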

Returns:

DoseResult or PTEResult

Results object containing:

dose : Array of dose values at which effects are evaluated
att_d : Dose-specific ATT estimates
att_d_se : Standard errors for dose-specific ATT
acrt_d : Dose-specific ACRT estimates (computed for aggregation="dose" regardless of target_parameter)
acrt_d_se : Standard errors for dose-specific ACRT
overall_att : Overall average treatment effect
overall_att_se : Standard error for overall ATT
overall_acrt : Overall average causal response (if applicable)
overall_acrt_se : Standard error for overall ACRT

References

[1]

Callaway, B., Goodman-Bacon, A., & Sant’Anna, P. H. (2024). “Difference-in-differences with a continuous treatment.” Journal of Econometrics, forthcoming. https://arxiv.org/abs/2107.02637

[2]

Chen, X., Christensen, T. M., & Kankanala, S. (2024). “Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities.” https://arxiv.org/abs/2107.11869

Examples

Generate simulated panel data with a continuous treatment:

In [1]: import moderndid
   ...: data = moderndid.gen_cont_did_data(n=500, seed=42)
   ...: data.head()
   ...: 
Out[1]: 
shape: (5, 5)
┌─────┬─────┬──────────┬─────────────┬──────────┐
│ id  ┆ G   ┆ D        ┆ time_period ┆ Y        │
│ --- ┆ --- ┆ ---      ┆ ---         ┆ ---      │
│ i64 ┆ i64 ┆ f64      ┆ i64         ┆ f64      │
╞═════╪═════╪══════════╪═════════════╪══════════╡
│ 1   ┆ 4   ┆ 0.736706 ┆ 1           ┆ 2.89755  │
│ 1   ┆ 4   ┆ 0.736706 ┆ 2           ┆ 3.830672 │
│ 1   ┆ 4   ┆ 0.736706 ┆ 3           ┆ 4.490186 │
│ 1   ┆ 4   ┆ 0.736706 ┆ 4           ┆ 6.080861 │
│ 2   ┆ 2   ┆ 0.886403 ┆ 1           ┆ 5.049402 │
└─────┴─────┴──────────┴─────────────┴──────────┘
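Before estimating, it can be worth checking the documented requirements that the dose is constant within each unit and that never-treated units (G equal to 0) have a dose of 0. The following is a rough sketch in plain Python over row dictionaries. `check_dose_panel` is a hypothetical helper, not a moderndid function, and the third row is deliberately invented to trigger a problem:

```python
from collections import defaultdict

def check_dose_panel(rows, idname="id", gname="G", dname="D"):
    """Return a list of problems found in long-format panel rows."""
    doses = defaultdict(set)   # unit id -> set of doses seen over time
    groups = {}                # unit id -> timing group
    for row in rows:
        doses[row[idname]].add(row[dname])
        groups[row[idname]] = row[gname]

    problems = []
    for unit, seen in doses.items():
        if len(seen) > 1:
            problems.append(f"unit {unit}: dose varies over time {sorted(seen)}")
        if groups[unit] == 0 and seen != {0}:
            problems.append(f"unit {unit}: never-treated but dose is non-zero")
    return problems

rows = [
    {"id": 1, "G": 4, "D": 0.736706, "time_period": 1},
    {"id": 1, "G": 4, "D": 0.736706, "time_period": 2},
    {"id": 2, "G": 0, "D": 0.25, "time_period": 1},  # invented bad row
]
problems = check_dose_panel(rows)
print(problems)
```

For a polars DataFrame such as the one shown above, the rows could be obtained with `data.to_dicts()`.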

Estimate ATT as a function of dose using the parametric (B-spline) estimator:

In [2]: result = moderndid.cont_did(
   ...:     data=data,
   ...:     yname="Y",
   ...:     tname="time_period",
   ...:     idname="id",
   ...:     gname="G",
   ...:     dname="D",
   ...:     target_parameter="level",
   ...:     aggregation="dose",
   ...:     degree=3,
   ...:     biters=100
   ...: )
   ...: result
   ...: 
Out[2]: 
==============================================================================
 Continuous Treatment Dose-Response Results
==============================================================================

 Overall ATT:

┌────────┬────────────┬────────────────────────┐
│    ATT │ Std. Error │ [95% Conf. Interval]   │
├────────┼────────────┼────────────────────────┤
│ 0.1267 │     0.1578 │ [ -0.1826,   0.4360]   │
└────────┴────────────┴────────────────────────┘

 Overall ACRT:

┌────────┬────────────┬────────────────────────┐
│   ACRT │ Std. Error │ [95% Conf. Interval]   │
├────────┼────────────┼────────────────────────┤
│ 0.1947 │     0.1984 │ [ -0.1942,   0.5837]   │
└────────┴────────────┴────────────────────────┘

------------------------------------------------------------------------------
 Signif. codes: '*' confidence band does not cover 0

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Control Group: Not Yet Treated
 Anticipation Periods: 0

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Estimation Method: Parametric (B-spline)
 Spline Degree: 3
 Number of Knots: 0

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Significance level: 0.05
 Bootstrap standard errors
==============================================================================
 Reference: Callaway et al. (2024)
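The intervals printed above are consistent with a normal-approximation interval, that is, the estimate plus or minus z times the standard error, where z is the 1 - alp/2 standard normal quantile. The sketch below recomputes the Overall ATT interval from the rounded values in the output. That moderndid builds its pointwise intervals in exactly this way is an assumption, and `normal_ci` is a hypothetical helper:

```python
from statistics import NormalDist

def normal_ci(estimate, std_error, alp=0.05):
    """Two-sided normal-approximation interval at significance level alp."""
    z = NormalDist().inv_cdf(1 - alp / 2)  # about 1.96 for alp = 0.05
    return estimate - z * std_error, estimate + z * std_error

# Rounded Overall ATT and standard error taken from the output above.
lower, upper = normal_ci(0.1267, 0.1578)
print(f"[{lower:.4f}, {upper:.4f}]")  # [-0.1826, 0.4360]
```

Since the interval covers 0, no '*' appears next to it in the table.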

For the non-parametric CCK estimator, we need exactly 2 groups and 2 time periods:

In [3]: data_cck = moderndid.gen_cont_did_data(
   ...:     n=500, num_time_periods=2, seed=42
   ...: )
   ...: cck_result = moderndid.cont_did(
   ...:     data=data_cck,
   ...:     yname="Y",
   ...:     tname="time_period",
   ...:     idname="id",
   ...:     gname="G",
   ...:     dname="D",
   ...:     dose_est_method="cck",
   ...:     target_parameter="level",
   ...:     aggregation="dose",
   ...:     biters=100
   ...: )
   ...: cck_result
   ...: 
Out[3]: 
==============================================================================
 Continuous Treatment Dose-Response Results
==============================================================================

 Overall ATT:

┌────────┬────────────┬────────────────────────┐
│    ATT │ Std. Error │ [95% Conf. Interval]   │
├────────┼────────────┼────────────────────────┤
│ 0.3220 │     0.0932 │ [  0.1394,   0.5046] * │
└────────┴────────────┴────────────────────────┘

 Overall ACRT:

┌────────┬────────────┬────────────────────────┐
│   ACRT │ Std. Error │ [95% Conf. Interval]   │
├────────┼────────────┼────────────────────────┤
│ 2.7915 │     4.4607 │ [ -5.9513,  11.5344]   │
└────────┴────────────┴────────────────────────┘

------------------------------------------------------------------------------
 Signif. codes: '*' confidence band does not cover 0

------------------------------------------------------------------------------
 Data Info
------------------------------------------------------------------------------
 Control Group: Not Yet Treated
 Anticipation Periods: 0

------------------------------------------------------------------------------
 Estimation Details
------------------------------------------------------------------------------
 Estimation Method: Non-parametric (CCK)

------------------------------------------------------------------------------
 Inference
------------------------------------------------------------------------------
 Significance level: 0.05
 Bootstrap standard errors
==============================================================================
 Reference: Callaway et al. (2024)