moderndid.core.panel.make_balanced_panel

moderndid.core.panel.make_balanced_panel(data: Any, idname: str, tname: str) -> Any

Drop units not observed in every time period.

Many difference-in-differences estimators require a strictly balanced panel where every unit appears in every time period. When allow_unbalanced_panel=False (the default in att_gt), the preprocessing pipeline calls this function automatically. Calling it beforehand lets you inspect how many units will be dropped and decide whether balancing, gap-filling with fill_panel_gaps, or a flexible threshold via complete_data is more appropriate.
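To make the "inspect how many units will be dropped" step concrete, here is a minimal sketch in plain pandas of what strict balancing means. It is an illustration on a toy panel, not the library's implementation; the column names `county`, `year`, and `y` and the toy values are assumptions chosen for the example.

```python
import pandas as pd

# Toy panel (hypothetical data): county 2 is not observed in 2001.
panel = pd.DataFrame(
    {
        "county": [1, 1, 1, 2, 2, 3, 3, 3],
        "year": [2000, 2001, 2002, 2000, 2002, 2000, 2001, 2002],
        "y": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    }
)

# Number of distinct time periods in the whole panel.
n_periods = panel["year"].nunique()

# Number of distinct periods in which each unit is observed.
periods_per_unit = panel.groupby("county")["year"].nunique()

# Units missing at least one period are the ones strict balancing would drop.
incomplete_units = periods_per_unit[periods_per_unit < n_periods].index.tolist()

# Keep only rows belonging to fully observed units.
balanced = panel[~panel["county"].isin(incomplete_units)]

print(f"Units dropped: {incomplete_units}")
print(f"Before: {len(panel)} rows, After: {len(balanced)} rows")
```

If the list of incomplete units is long relative to the sample, that may be a sign that dropping them discards too much data and one of the alternatives is worth considering.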

Parameters:
data : DataFrame

Panel data. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.

idname : str

Unit identifier column.

tname : str

Time period column.

Returns:
DataFrame

Balanced panel in the same format as data.

See also

complete_data

Keep units observed in at least min_periods periods.

fill_panel_gaps

Insert null rows instead of dropping units.

is_balanced_panel

Check whether the panel is already balanced.
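The three strategies described above (strict balancing, a minimum-periods threshold, and gap-filling with nulls) can be contrasted with a rough pandas sketch. It does not call `complete_data` or `fill_panel_gaps`, whose signatures are not shown here; it only mimics the behavior their descriptions suggest. The toy data and the `min_periods = 2` value are assumptions for illustration.

```python
import pandas as pd

# Toy panel (hypothetical data): county 1 is complete, county 2 misses 2001,
# county 3 is observed only in 2001.
panel = pd.DataFrame(
    {
        "county": [1, 1, 1, 2, 2, 3],
        "year": [2000, 2001, 2002, 2000, 2002, 2001],
        "y": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

n_periods = panel["year"].nunique()
# Per-row count of distinct periods observed for that row's unit.
counts = panel.groupby("county")["year"].transform("nunique")

# Strategy 1: strict balancing, keep only units observed in every period.
strict = panel[counts == n_periods]

# Strategy 2: threshold, keep units observed in at least min_periods periods.
min_periods = 2  # assumed value for this example
thresholded = panel[counts >= min_periods]

# Strategy 3: gap-filling, add a null row for every missing (unit, period) pair.
full_index = pd.MultiIndex.from_product(
    [panel["county"].unique(), sorted(panel["year"].unique())],
    names=["county", "year"],
)
filled = panel.set_index(["county", "year"]).reindex(full_index).reset_index()

print(len(strict), len(thresholded), len(filled))
print(filled["y"].isna().sum())
```

Strict balancing keeps the fewest rows, the threshold keeps partially observed units as they are, and gap-filling keeps every unit at the cost of introducing missing values that downstream code must handle.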

Examples

In [1]: from moderndid import make_balanced_panel, load_favara_imbs
   ...: 
   ...: df = load_favara_imbs()
   ...: balanced = make_balanced_panel(df, idname="county", tname="year")
   ...: print(f"Before: {df.shape[0]} rows, After: {balanced.shape[0]} rows")
   ...: 
Before: 12538 rows, After: 12516 rows