moderndid.core.panel.assign_rc_ids

moderndid.core.panel.assign_rc_ids(data: Any) -> Any

Add a unique "rowid" column for repeated cross-section data.

In repeated cross-section designs each observation is a different individual, so there is no natural unit identifier to track over time. This function assigns a sequential integer "rowid" that can be passed as the idname argument to att_gt with panel=False.
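The idea can be illustrated with a minimal pure-Python sketch. It is not moderndid's implementation: the helper name `add_rowid` and the sample records are hypothetical, and the sketch assumes ids start at 0 and follow row order, as in the example output further down.

```python
from typing import Any


def add_rowid(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the records with a sequential integer "rowid".

    Hypothetical helper for illustration only; the input list is not modified.
    """
    # enumerate() gives each row its position, which serves as a unique id
    # because no row is linked to any other row across time periods.
    return [{"rowid": i, **rec} for i, rec in enumerate(records)]


# Made-up repeated cross-section rows: the same county can appear in several
# years, but each row is treated as a separate observation.
rows = [
    {"county": 1001, "year": 1994},
    {"county": 1001, "year": 1995},
    {"county": 1003, "year": 1994},
]

with_ids = add_rowid(rows)
for row in with_ids:
    print(row)
```

Because every row receives its own id, an estimator that is given "rowid" as the unit identifier treats each observation as a distinct unit, which matches the repeated cross-section setting.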

Parameters:
data : DataFrame

Repeated cross-section data. Accepts any object implementing the Arrow PyCapsule Interface (__arrow_c_stream__), including polars, pandas, pyarrow Table, and cudf DataFrames.

Returns:
DataFrame

Original data plus an integer "rowid" column, in the same format as data.

See also

att_gt

Pass panel=False for repeated cross-section estimation.

Examples

In [1]: from moderndid import assign_rc_ids, load_favara_imbs
   ...: 
   ...: df = load_favara_imbs()
   ...: df = assign_rc_ids(df)
   ...: df.select("rowid", "county", "year").head(5)
   ...: 
Out[1]: 
shape: (5, 3)
┌───────┬────────┬──────┐
│ rowid ┆ county ┆ year │
│ ---   ┆ ---    ┆ ---  │
│ u32   ┆ i64    ┆ i64  │
╞═══════╪════════╪══════╡
│ 0     ┆ 1001   ┆ 1994 │
│ 1     ┆ 1001   ┆ 1995 │
│ 2     ┆ 1001   ┆ 1996 │
│ 3     ┆ 1001   ┆ 1997 │
│ 4     ┆ 1001   ┆ 1998 │
└───────┴────────┴──────┘