moderndid.load_nsw
- moderndid.load_nsw() -> DataFrame
Load the NSW (National Supported Work) demonstration dataset.
This dataset is from the National Supported Work (NSW) Demonstration, a randomized employment training program operated in the mid-1970s. It has been widely used in the causal inference literature, particularly for demonstrating difference-in-differences methods.
The dataset is a balanced panel in long format with 16,417 individuals observed in 1975 (pre-treatment) and 1978 (post-treatment), for a total of 32,834 observations.
- Returns:
polars.DataFrame: A DataFrame with the following columns:
id: Individual identifier
year: Year (1975 or 1978)
experimental: Treatment indicator (1 if treated, 0 if control)
re: Real earnings (outcome variable)
age: Age in years
educ: Years of education
black: Indicator for Black race
married: Indicator for married status
nodegree: Indicator for no high school degree
hisp: Indicator for Hispanic ethnicity
re74: Real earnings in 1974
References
[1] LaLonde, R. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review, 76(4), 604-620.