moderndid.npiv_j

moderndid.npiv_j(y, x, w, x_grid=None, j_x_degree=3, k_w_degree=4, j_x_segments_set=None, k_w_segments_set=None, knots='uniform', basis='tensor', x_min=None, x_max=None, w_min=None, w_max=None, grid_num=50, biters=99, alpha=0.5, check_is_fullrank=False, seed=None)

Implement Lepski’s method for optimal sieve dimension selection.

Implements the bootstrap-based test from [1] for selecting the optimal number of B-spline basis functions in nonparametric instrumental variables (NPIV) estimation. The method compares estimates across a grid of sieve dimensions \(\hat{\mathcal{J}}\). For each pair \((J, J_2)\) with \(J_2 > J\), it computes a sup-t-statistic for the difference in estimates

\[\sup_{x \in \mathcal{X}} \left| \frac{\hat{h}_J(x) - \hat{h}_{J_2}(x)}{\hat{\sigma}_{J, J_2}(x)} \right|.\]
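As a rough illustration of the statistic above, the sketch below evaluates it on a finite grid with NumPy. The helper name `sup_t_statistic` and the toy arrays are hypothetical and are not part of moderndid; in practice the two estimates and the standard error would come from the NPIV fits at dimensions \(J\) and \(J_2\).

```python
import numpy as np


def sup_t_statistic(h_j, h_j2, sigma_j_j2):
    """Largest absolute studentized difference between two estimates on a grid.

    h_j, h_j2   : estimates at dimensions J and J2 on the same grid points
    sigma_j_j2  : estimated standard error of the difference at each point
    """
    h_j = np.asarray(h_j, dtype=float)
    h_j2 = np.asarray(h_j2, dtype=float)
    sigma_j_j2 = np.asarray(sigma_j_j2, dtype=float)
    # The sup over x is approximated by the max over the evaluation grid.
    return float(np.max(np.abs(h_j - h_j2) / sigma_j_j2))


# Toy values on a three-point grid (made up for illustration only).
h_small = np.array([1.0, 2.0, 3.0])
h_large = np.array([1.5, 2.0, 2.0])
sigma = np.array([0.5, 1.0, 0.25])
stat = sup_t_statistic(h_small, h_large, sigma)
```

The supremum over \(\mathcal{X}\) is replaced by a maximum over grid points, so the value depends on how fine the grid is (compare the `grid_num` parameter).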

The optimal dimension \(\hat{J}\) is the smallest \(J \in \hat{\mathcal{J}}\) for which this statistic is below a bootstrap critical value \(\theta_{1-\hat{\alpha}}^*\) for all \(J_2 > J\).
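The selection rule just described can be sketched as a scan over the grid of dimensions. This is a minimal illustration of the unadjusted rule only, assuming the pairwise sup-t statistics and the critical value have already been computed; the function `lepski_select` and the numbers are hypothetical, and the library may apply further truncation or adjustments on top of this rule.

```python
import numpy as np


def lepski_select(j_grid, sup_t, critical_value):
    """Smallest J whose statistics against every larger J2 are below the cutoff.

    j_grid : increasing candidate dimensions
    sup_t  : square array; sup_t[i, k] is the statistic for the pair
             (j_grid[i], j_grid[k]); only entries with k > i are read
    """
    j_grid = np.asarray(j_grid)
    sup_t = np.asarray(sup_t, dtype=float)
    for i in range(len(j_grid) - 1):
        # Compare J = j_grid[i] against all larger dimensions J2.
        if np.all(sup_t[i, i + 1:] < critical_value):
            return int(j_grid[i])
    # No smaller J passed, so fall back to the largest dimension in the grid.
    return int(j_grid[-1])


# Toy statistics for four candidate dimensions (made up for illustration).
j_grid = [1, 3, 7, 15]
sup_t = np.array([
    [0.0, 5.0, 6.0, 4.0],
    [0.0, 0.0, 1.2, 2.5],
    [0.0, 0.0, 0.0, 1.1],
    [0.0, 0.0, 0.0, 0.0],
])
selected = lepski_select(j_grid, sup_t, critical_value=2.0)
```

Here \(J = 1\) and \(J = 3\) are rejected because at least one comparison with a larger dimension exceeds the cutoff, so the scan stops at \(J = 7\).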

The bootstrap critical value is the \((1-\hat{\alpha})\) quantile of the multiplier bootstrap process

\[\sup_{\left\{\left(x, J, J_{2}\right) \in \mathcal{X} \times \hat{\mathcal{J}} \times \hat{\mathcal{J}}: J_{2}>J\right\}} \left|\frac{D_{J}^{*}(x)-D_{J_{2}}^{*}(x)}{\hat{\sigma}_{J, J_{2}}(x)}\right|,\]

where \(D_J^*(x) = (\psi^J(x))' \mathbf{M}_J \hat{\mathbf{u}}_J^*\) is a multiplier bootstrap version of the estimation error, and \(\hat{\sigma}_{J, J_2}^2(x)\) is the estimated variance of the difference in estimators. This procedure avoids the need to select tuning parameters for the test itself and performs well in practice.
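To make the quantile step concrete, the following sketch takes an array of already studentized bootstrap differences and returns the \((1-\alpha)\) quantile of their supremum. It assumes the draws are stored with shape `(biters, n_pairs, n_grid)`; that layout, the helper name `bootstrap_critical_value`, and the random placeholder draws are illustrative assumptions, and the sketch treats \(\alpha\) as given rather than estimating \(\hat{\alpha}\) from the data.

```python
import numpy as np


def bootstrap_critical_value(boot_t, alpha):
    """(1 - alpha) quantile of the sup of |studentized bootstrap differences|.

    boot_t : array of shape (biters, n_pairs, n_grid); entry [b, p, g] is
             (D*_J(x_g) - D*_J2(x_g)) / sigma_{J,J2}(x_g) for draw b, pair p
    """
    boot_t = np.asarray(boot_t, dtype=float)
    # Sup over all (x, J, J2) within each bootstrap draw.
    sup_per_draw = np.abs(boot_t).reshape(boot_t.shape[0], -1).max(axis=1)
    return float(np.quantile(sup_per_draw, 1.0 - alpha))


# Placeholder draws: standard normal noise stands in for the real process.
rng = np.random.default_rng(0)
boot_t = rng.standard_normal((99, 3, 50))
theta = bootstrap_critical_value(boot_t, alpha=0.5)
```

With `alpha=0.5` the cutoff is the median of the per-draw suprema. `np.quantile` interpolates linearly by default, which may differ slightly from the quantile convention used inside the library.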

Parameters:

y (numpy.ndarray): Dependent variable vector.
x (numpy.ndarray): Endogenous regressor matrix.
w (numpy.ndarray): Instrument matrix.
x_grid (numpy.ndarray, optional): Grid points for evaluation. If None, created automatically.
j_x_degree (int, default=3): Degree of B-spline basis for \(X\).
k_w_degree (int, default=4): Degree of B-spline basis for \(W\).
j_x_segments_set (numpy.ndarray, optional): Set of \(J\) values to test. If None, uses [1, 3, 7, 15, 31, 63].
k_w_segments_set (numpy.ndarray, optional): Set of \(K\) values to test. If None, computed from the \(J\) values.
knots ({"uniform", "quantiles"}, default="uniform"): Knot placement method.
basis ({"tensor", "additive", "glp"}, default="tensor"): Type of basis.
x_min, x_max, w_min, w_max (float, optional): Range limits for basis construction.
grid_num (int, default=50): Number of grid points for evaluation.
biters (int, default=99): Number of bootstrap replications.
alpha (float, default=0.5): Significance level for the test.
check_is_fullrank (bool, default=False): Whether to check for full rank.
seed (int, optional): Random seed for reproducibility.

Returns:

dict

Dictionary containing:

j_tilde: Selected J value
j_hat: Unadjusted Lepski choice
j_hat_n: Truncated value
j_x_seg: Final selected J segments
k_w_seg: Corresponding K segments
theta_star: Bootstrap critical value
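A possible usage sketch follows. The data-generating process and the helper names `simulate_npiv_data` and `select_dimension` are hypothetical; only the call signature and the returned keys are taken from this page. Passing `x` and `w` as two-dimensional arrays of shape `(n, 1)` is an assumption based on their description as matrices.

```python
import numpy as np


def simulate_npiv_data(n=500, seed=0):
    """Toy NPIV design: x is endogenous and w is a valid instrument."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=n)
    v = rng.normal(0.0, 0.1, size=n)            # first-stage error
    u = 0.5 * v + rng.normal(0.0, 0.1, size=n)  # correlated with v, so x is endogenous
    x = 0.8 * w + v
    y = np.sin(2.0 * np.pi * x) + u
    # Assumption: regressors and instruments are passed as (n, 1) matrices.
    return y, x.reshape(-1, 1), w.reshape(-1, 1)


def select_dimension(y, x, w, seed=42):
    """Run the selection and return the chosen J and the bootstrap cutoff."""
    import moderndid  # imported here so the simulation above runs without it

    result = moderndid.npiv_j(y, x, w, biters=99, alpha=0.5, seed=seed)
    return result["j_tilde"], result["theta_star"]


y, x, w = simulate_npiv_data()
# With moderndid installed:
# j_tilde, theta_star = select_dimension(y, x, w)
```

Fixing `seed` makes the bootstrap draws, and therefore `theta_star`, reproducible across runs.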