moderndid.load_nsw

moderndid.load_nsw() -> DataFrame

Load the NSW (National Supported Work) demonstration dataset.

This dataset is from the National Supported Work (NSW) Demonstration, a randomized employment training program operated in the mid-1970s. It has been widely used in the causal inference literature, particularly for demonstrating difference-in-differences methods.

The dataset is a balanced panel in long format with 16,417 individuals observed in 1975 (pre-treatment) and 1978 (post-treatment), for a total of 32,834 observations.
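The balanced-panel property described above can be checked with a few lines of plain Python. The sketch below is illustrative only: `is_balanced_panel` is a hypothetical helper, not part of moderndid, and `toy_rows` holds made-up values in the same long layout rather than real NSW records. With the real data, one could presumably pass `moderndid.load_nsw().to_dicts()` in place of the toy rows, since the function is documented as returning a polars DataFrame.

```python
from collections import defaultdict


def is_balanced_panel(rows, id_col="id", time_col="year"):
    """Return True if every unit is observed exactly once in every period.

    Hypothetical helper for illustration; not part of moderndid.
    """
    periods_by_unit = defaultdict(list)
    for row in rows:
        periods_by_unit[row[id_col]].append(row[time_col])

    # The full set of periods that appear anywhere in the data.
    all_periods = sorted({p for ps in periods_by_unit.values() for p in ps})

    # Balanced means each unit's periods match the full set, with no
    # gaps and no duplicates.
    return all(sorted(ps) == all_periods for ps in periods_by_unit.values())


# Toy rows with made-up values (NOT the real NSW data), in long format:
# one row per individual per year.
toy_rows = [
    {"id": 1, "year": 1975, "re": 0.0},
    {"id": 1, "year": 1978, "re": 1200.0},
    {"id": 2, "year": 1975, "re": 500.0},
    {"id": 2, "year": 1978, "re": 900.0},
]

balanced = is_balanced_panel(toy_rows)
n_units = len({r["id"] for r in toy_rows})
print(balanced, n_units, len(toy_rows))
```

In a balanced two-period panel the number of rows is twice the number of individuals, which matches the 16,417 individuals and 32,834 observations stated above.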

Returns:
polars.DataFrame

A DataFrame with the following columns:

  • id: Individual identifier

  • year: Year (1975 or 1978)

  • experimental: Treatment indicator (1 if treated, 0 if control)

  • re: Real earnings in the observation year (outcome variable)

  • age: Age in years

  • educ: Years of education

  • black: Indicator for Black race

  • married: Indicator for married status

  • nodegree: Indicator for no high school degree

  • hisp: Indicator for Hispanic ethnicity

  • re74: Real earnings in 1974
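As a rough illustration of how the `year`, `experimental`, and `re` columns fit together, the sketch below computes a simple two-group, two-period difference-in-differences of mean earnings. It is an unadjusted comparison of means, not the estimator moderndid itself implements. `did_2x2` is a hypothetical helper, and `toy_rows` contains made-up values rather than real NSW records.

```python
from statistics import mean


def did_2x2(rows, pre=1975, post=1978):
    """Unadjusted 2x2 difference-in-differences of mean earnings.

    Hypothetical helper for illustration; not part of moderndid.
    Expects long-format rows with 'year', 'experimental', and 're' keys.
    """
    def group_mean(treated, year):
        return mean(
            r["re"]
            for r in rows
            if r["experimental"] == treated and r["year"] == year
        )

    # Change over time within each group, then the difference of changes.
    treated_change = group_mean(1, post) - group_mean(1, pre)
    control_change = group_mean(0, post) - group_mean(0, pre)
    return treated_change - control_change


# Toy rows with made-up earnings (NOT the real NSW data).
toy_rows = [
    {"id": 1, "year": 1975, "experimental": 1, "re": 1000.0},
    {"id": 1, "year": 1978, "experimental": 1, "re": 3000.0},
    {"id": 2, "year": 1975, "experimental": 1, "re": 2000.0},
    {"id": 2, "year": 1978, "experimental": 1, "re": 5000.0},
    {"id": 3, "year": 1975, "experimental": 0, "re": 1500.0},
    {"id": 3, "year": 1978, "experimental": 0, "re": 2500.0},
    {"id": 4, "year": 1975, "experimental": 0, "re": 2500.0},
    {"id": 4, "year": 1978, "experimental": 0, "re": 3500.0},
]

estimate = did_2x2(toy_rows)
print(estimate)
```

The helper works on a list of dictionaries so that it runs without any third-party packages; the covariates (`age`, `educ`, `black`, `married`, `nodegree`, `hisp`, `re74`) are ignored here and would enter only in a covariate-adjusted analysis.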

References

[1]

LaLonde, R. J. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review, 76(4), 604-620.