What is ModernDiD?
Difference-in-differences (DiD) is one of the most widely used methods for causal inference from observational data. The modern DiD literature has produced many estimators, but implementations are scattered across separate R and Stata packages with incompatible APIs and output formats.
ModernDiD brings them together into a single Python library with a consistent API. Every estimator follows the same three-step workflow: estimate, aggregate, visualize. Switching between designs means changing one function call, not learning a new package.
Who is this for?
ModernDiD is built for applied researchers, economists, and data scientists doing causal inference with difference-in-differences. If you already use R packages for DiD, ModernDiD gives you the same estimators in Python with a unified interface. If you are new to DiD, the introduction covers the key ideas.
A typical analysis estimates group-time effects, aggregates them, and plots the result.
import moderndid as did
# Works with pandas, polars, pyarrow, duckdb, or any Arrow-compatible DataFrame
data = did.load_mpdta()
# Estimate group-time ATTs
result = did.att_gt(
    data=data,
    yname="lemp",
    tname="year",
    idname="countyreal",
    gname="first.treat",
)
# Aggregate into an event study
event_study = did.aggte(result, type="dynamic")
# Visualize
did.plot_event_study(event_study)
Switching between estimators means changing one function call. The aggregation and plotting interface stays the same.
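To make the aggregate step more concrete, the sketch below averages group-time effects by event time (periods since first treatment), weighting each group by its size. It is a simplified illustration in plain Python and is not ModernDiD's implementation; the function name `event_study_average`, the input layout, and all numbers are hypothetical.

```python
from collections import defaultdict


def event_study_average(att_gt, group_sizes):
    """Average ATT(g, t) by event time e = t - g, weighted by group size.

    att_gt: dict mapping (group, period) -> estimated effect
    group_sizes: dict mapping group -> number of units in that group
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for (group, period), att in att_gt.items():
        event_time = period - group
        weighted_sum[event_time] += group_sizes[group] * att
        total_weight[event_time] += group_sizes[group]
    return {e: weighted_sum[e] / total_weight[e] for e in sorted(weighted_sum)}


# Made-up group-time effects for two treatment cohorts (illustration only)
att_gt = {
    (2004, 2004): 1.0,
    (2004, 2005): 2.0,
    (2006, 2006): 2.0,
    (2006, 2007): 4.0,
}
group_sizes = {2004: 20, 2006: 60}

dynamic = event_study_average(att_gt, group_sizes)
for event_time, effect in dynamic.items():
    print(f"e = {event_time}: {effect:.2f}")
```

Larger cohorts pull the event-time average toward their own effects, which is why the group sizes enter as weights.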
Key features
Consistent API. Every estimator returns a typed result object with the same interface for summarizing, aggregating, and plotting. Learn the pattern once and apply it everywhere.
DataFrame agnostic. Pass any Arrow-compatible DataFrame, such as polars, pandas, pyarrow, or duckdb. Compatibility is powered by narwhals.
Scales up. Runs locally on a laptop, then transparently scales to multi-node Dask and Spark clusters for datasets that exceed single-machine memory. Just pass a Dask or Spark DataFrame and the distributed backend activates automatically.
Fast computation. Polars for internal data wrangling, NumPy vectorization, Numba JIT compilation, and threaded parallel compute.
GPU acceleration. Optional CuPy-accelerated regression and propensity score estimation on NVIDIA GPUs, with multi-GPU scaling in distributed environments. See the GPU guide.
Native plots. Built-in plotnine visualizations returning standard ggplot objects you can customize with the full grammar of graphics.
Publication tables. Result objects implement the maketables plug-in interface, so you can build publication-quality regression tables directly from estimation output. See the publication tables guide.
Panel utilities. Diagnose, reshape, and clean panel data with built-in tools for gap detection, balancing, first-differencing, and wide/long conversion. See the panel utilities guide.
Robust inference. Analytical standard errors, bootstrap (weighted and multiplier), and simultaneous confidence bands.
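As a rough illustration of the multiplier bootstrap idea named in the list above, the sketch below computes a standard error for the simplest possible estimator, a sample mean. Instead of resampling rows, each draw perturbs the centered observations with random +1/-1 (Rademacher) weights. The function name `multiplier_bootstrap_se`, the choice of Rademacher weights, and the data are assumptions made for this example; ModernDiD's own weights and standard-error formula may differ.

```python
import random
import statistics


def multiplier_bootstrap_se(values, n_boot=999, seed=0):
    """Multiplier bootstrap standard error of the sample mean.

    Each draw multiplies the centered observations (the influence
    function of the mean) by independent Rademacher weights and
    averages them, giving one perturbation of the estimate.
    """
    rng = random.Random(seed)
    n = len(values)
    mean = sum(values) / n
    influence = [v - mean for v in values]
    draws = []
    for _ in range(n_boot):
        weights = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        draws.append(sum(w * psi for w, psi in zip(weights, influence)) / n)
    # The spread of the perturbations estimates the sampling variability
    return statistics.stdev(draws), draws


# Made-up sample (illustration only)
sample = [2.1, 1.9, 2.4, 2.0, 1.7, 2.3, 2.2, 1.8]
se, draws = multiplier_bootstrap_se(sample)
print(f"bootstrap SE of the mean: {se:.4f}")
```

Because only the weights are redrawn, the influence values are computed once and reused across all draws, which is what makes this style of bootstrap cheap compared with re-estimating on resampled data.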
Estimators
ModernDiD covers the main DiD designs used in applied work. Each estimator targets a different treatment structure. See the estimator overview for detailed descriptions, key arguments, and guidance on when to use each one.
| Design | When to use | Reference |
|---|---|---|
| Staggered adoption | Binary treatment turns on permanently at different times | |
| Two-period DR DiD | Classic two-period, two-group setting | |
| Continuous treatment | Treatment intensity varies across units | |
| Triple differences | Within-group eligibility variation (e.g., eligible vs. ineligible) | |
| Intertemporal treatment | Non-absorbing treatment that switches on/off over time | |
| Sensitivity analysis | Assess robustness to parallel trends violations | |
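For readers new to the method, the classic two-period, two-group setting listed among the designs above reduces to a difference of four cell means: the change in the treated group minus the change in the control group. The sketch below shows that arithmetic in plain Python. It is the unadjusted difference of means, not the doubly robust estimator, and the function name `did_2x2` and the data are hypothetical.

```python
from statistics import mean


def did_2x2(rows):
    """Plain 2x2 difference-in-differences from cell means.

    rows: iterable of (treated: bool, post: bool, outcome: float)
    """
    cells = {(d, p): [] for d in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    treated_change = mean(cells[(True, True)]) - mean(cells[(True, False)])
    control_change = mean(cells[(False, True)]) - mean(cells[(False, False)])
    return treated_change - control_change


# Made-up outcomes (illustration only)
rows = [
    (True, False, 10.0), (True, False, 12.0),   # treated, before
    (True, True, 15.0), (True, True, 17.0),     # treated, after
    (False, False, 8.0), (False, False, 10.0),  # control, before
    (False, True, 9.0), (False, True, 11.0),    # control, after
]
estimate = did_2x2(rows)
print(f"DiD estimate: {estimate:.1f}")
```

Here the treated group's mean rises by 5 and the control group's by 1, so the estimate attributes the remaining 4 to treatment under the parallel trends assumption.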
Next steps
Installation for detailed install options and optional extras.
Introduction to DiD for background on the difference-in-differences framework.
Quickstart to learn the API with working examples.