moderndid.npiv_j

moderndid.npiv_j(y, x, w, x_grid=None, j_x_degree=3, k_w_degree=4, j_x_segments_set=None, k_w_segments_set=None, knots='uniform', basis='tensor', x_min=None, x_max=None, w_min=None, w_max=None, grid_num=50, biters=99, alpha=0.5, check_is_fullrank=False, seed=None)

Implement Lepski’s method for optimal sieve dimension selection.

Implements the bootstrap-based test from [1] for selecting the optimal number of B-spline basis functions in nonparametric instrumental variables (NPIV) estimation. The method compares estimates across a grid of sieve dimensions \(\hat{\mathcal{J}}\). For each pair \((J, J_2)\) with \(J_2 > J\), it computes a sup-t-statistic for the difference in estimates

\[\sup_{x \in \mathcal{X}} \left| \frac{\hat{h}_J(x) - \hat{h}_{J_2}(x)}{\hat{\sigma}_{J, J_2}(x)} \right|.\]

The optimal dimension \(\hat{J}\) is the smallest \(J \in \hat{\mathcal{J}}\) for which this statistic is below a bootstrap critical value \(\theta_{1-\hat{\alpha}}^*\) for all \(J_2 > J\).
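The selection rule above can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the code used by `moderndid`: the helpers `sup_t_stat` and `lepski_choice`, the toy estimates in `h`, the constant standard errors in `sigma`, and the critical value `theta_star = 2.0` are all assumptions made up for the example.

```python
def sup_t_stat(h_j, h_j2, sigma):
    """Largest absolute studentized difference over the evaluation grid."""
    return max(abs(a - b) / s for a, b, s in zip(h_j, h_j2, sigma))


def lepski_choice(stats, j_grid, theta_star):
    """Smallest J whose statistic against every larger J2 is below theta_star.

    stats maps (J, J2) with J2 > J to the sup-t statistic for that pair.
    """
    j_sorted = sorted(j_grid)
    if not j_sorted:
        raise ValueError("j_grid must not be empty")
    for i, j in enumerate(j_sorted):
        # The largest J has no larger competitor, so it always qualifies.
        if all(stats[(j, j2)] < theta_star for j2 in j_sorted[i + 1:]):
            return j


# Toy estimates on a three-point grid for J in {1, 3, 7} (made-up numbers).
h = {1: [0.0, 0.0, 0.0], 3: [0.0, 1.0, 2.0], 7: [0.1, 1.1, 1.9]}
pairs = [(1, 3), (1, 7), (3, 7)]
sigma = {pair: [0.5, 0.5, 0.5] for pair in pairs}  # hypothetical standard errors

stats = {(j, j2): sup_t_stat(h[j], h[j2], sigma[(j, j2)]) for j, j2 in pairs}
j_hat = lepski_choice(stats, h.keys(), theta_star=2.0)
```

Here J = 1 is rejected because its estimate differs too much from those at J = 3 and J = 7, while J = 3 is accepted because its only larger competitor, J = 7, gives a small statistic.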

The bootstrap critical value is the \((1-\hat{\alpha})\) quantile of the multiplier bootstrap process

\[\sup_{\left\{\left(x, J, J_{2}\right) \in \mathcal{X} \times \hat{\mathcal{J}} \times \hat{\mathcal{J}}: J_{2}>J\right\}} \left|\frac{D_{J}^{*}(x)-D_{J_{2}}^{*}(x)}{\hat{\sigma}_{J, J_{2}}(x)}\right|,\]

where \(D_J^*(x) = (\psi^J(x))' \mathbf{M}_J \hat{\mathbf{u}}_J^*\) is a multiplier bootstrap version of the estimation error, and \(\hat{\sigma}_{J, J_2}^2(x)\) is the estimated variance of the difference in estimators. This procedure avoids the need to select tuning parameters for the test itself and performs well in practice.
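As a rough sketch of the quantile step described above, the snippet below takes bootstrap draws of the differences \(D_J^*(x) - D_{J_2}^*(x)\) as given and computes the critical value from them. The function `bootstrap_critical_value`, the array layout, and the random placeholder draws are assumptions for illustration; constructing \(D_J^*\) from \(\mathbf{M}_J\) and the multiplier residuals is not shown.

```python
import numpy as np


def bootstrap_critical_value(boot_diffs, sigma, alpha):
    """(1 - alpha) quantile of the bootstrap sup statistic.

    boot_diffs: shape (biters, n_pairs, n_grid), D*_J(x) - D*_{J2}(x) per draw
    sigma:      shape (n_pairs, n_grid), estimated std. dev. of each difference
    """
    t_star = np.abs(boot_diffs) / sigma      # sigma broadcasts over the draws
    sup_star = t_star.max(axis=(1, 2))       # one sup over (x, J, J2) per draw
    return np.quantile(sup_star, 1.0 - alpha)


# Placeholder inputs only: standard normal draws stand in for real NPIV output.
rng = np.random.default_rng(0)
boot_diffs = rng.standard_normal((99, 3, 50))   # 99 draws, 3 pairs, 50 grid points
sigma = np.ones((3, 50))
theta_star = bootstrap_critical_value(boot_diffs, sigma, alpha=0.5)
```

Taking the supremum jointly over the grid and all pairs before taking the quantile is what makes a single critical value valid for every comparison at once.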

Parameters:
y : numpy.ndarray

Dependent variable vector.

x : numpy.ndarray

Endogenous regressor matrix.

w : numpy.ndarray

Instrument matrix.

x_grid : numpy.ndarray, optional

Grid points for evaluation. If None, created automatically.

j_x_degree : int, default=3

Degree of B-spline basis for \(X\).

k_w_degree : int, default=4

Degree of B-spline basis for \(W\).

j_x_segments_set : numpy.ndarray, optional

Set of \(J\) values to test. If None, uses [1, 3, 7, 15, 31, 63].

k_w_segments_set : numpy.ndarray, optional

Set of \(K\) values to test. If None, computed from \(J\) values.

knots : {"uniform", "quantiles"}, default="uniform"

Knot placement method.

basis : {"tensor", "additive", "glp"}, default="tensor"

Type of basis.

x_min, x_max, w_min, w_max : float, optional

Range limits for basis construction.

grid_num : int, default=50

Number of grid points for evaluation.

biters : int, default=99

Number of bootstrap replications.

alpha : float, default=0.5

Significance level for test.

check_is_fullrank : bool, default=False

Whether to check for full rank.

seed : int, optional

Random seed for reproducibility.
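To illustrate what the two `knots` options usually mean for spline bases, the sketch below places interior knots either at equal spacing over the range of the data or at empirical quantiles. The helper `interior_knots` is hypothetical and is not part of `moderndid`; whether the library computes its knots exactly this way, and how `x_min` and `x_max` enter, is an assumption not covered here.

```python
import numpy as np


def interior_knots(x, n_segments, knots="uniform"):
    """Interior knot locations splitting the range of x into n_segments pieces."""
    # Interior probabilities, e.g. [0.25, 0.5, 0.75] for four segments.
    probs = np.linspace(0.0, 1.0, n_segments + 1)[1:-1]
    if knots == "uniform":
        return x.min() + probs * (x.max() - x.min())
    if knots == "quantiles":
        return np.quantile(x, probs)
    raise ValueError(f"unknown knots option: {knots!r}")


# Skewed toy data: most points near zero, one far to the right.
x = np.array([0.0, 0.1, 0.2, 0.3, 10.0])
uniform = interior_knots(x, 2, "uniform")      # midpoint of the range
quantile = interior_knots(x, 2, "quantiles")   # median of the data
```

With skewed data the uniform knot lands in a region with few observations, while the quantile knot sits where the data are concentrated.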

Returns:
dict

Dictionary containing:

  • j_tilde: Selected J value

  • j_hat: Unadjusted Lepski choice

  • j_hat_n: Truncated value

  • j_x_seg: Final selected J segments

  • k_w_seg: Corresponding K segments

  • theta_star: Bootstrap critical value

See also

npiv_choose_j

Full data-driven selection procedure

npiv_jhat_max

Compute maximum feasible dimension

References

[1]

Chen, X., Christensen, T. M., & Kankanala, S. (2024). Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities. https://arxiv.org/abs/2107.11869.