moderndid.npiv_j
- moderndid.npiv_j(y, x, w, x_grid=None, j_x_degree=3, k_w_degree=4, j_x_segments_set=None, k_w_segments_set=None, knots='uniform', basis='tensor', x_min=None, x_max=None, w_min=None, w_max=None, grid_num=50, biters=99, alpha=0.5, check_is_fullrank=False, seed=None)
Implement Lepski’s method for optimal sieve dimension selection.
Implements the bootstrap-based test from [1] for selecting the optimal number of B-spline basis functions in nonparametric instrumental variables (NPIV) estimation. The method compares estimates across a grid of sieve dimensions \(\hat{\mathcal{J}}\). For each pair \((J, J_2)\) with \(J_2 > J\), it computes a sup-t-statistic for the difference in estimates
\[\sup_{x \in \mathcal{X}} \left| \frac{\hat{h}_J(x) - \hat{h}_{J_2}(x)}{\hat{\sigma}_{J, J_2}(x)} \right|.\]
The optimal dimension \(\hat{J}\) is the smallest \(J \in \hat{\mathcal{J}}\) for which this statistic is below a bootstrap critical value \(\theta_{1-\hat{\alpha}}^*\) for all \(J_2 > J\).
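The selection rule above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name `lepski_select`, the layout of the `sup_t` array, and the use of `<=` for "below" are all assumptions made for this example, and the critical value is taken as given.

```python
import numpy as np


def lepski_select(j_grid, sup_t, theta_star):
    """Return the smallest J whose sup-t statistics against every larger J2
    do not exceed theta_star.

    j_grid     : increasing 1-D sequence of candidate dimensions (hypothetical input).
    sup_t      : square array; sup_t[i, k] holds the sup-t statistic for the
                 pair (j_grid[i], j_grid[k]). Only entries with k > i are read.
    theta_star : critical value, assumed to be computed elsewhere.
    """
    j_grid = np.asarray(j_grid)
    sup_t = np.asarray(sup_t, dtype=float)
    if j_grid.size == 0:
        raise ValueError("j_grid must contain at least one candidate dimension")
    for i, j in enumerate(j_grid):
        # The largest J has no J2 > J, so it qualifies vacuously.
        if np.all(sup_t[i, i + 1:] <= theta_star):
            return int(j)


# Illustrative numbers only (not produced by any estimator).
sup_t = np.array([
    [0.0, 1.2, 3.5, 4.0],
    [0.0, 0.0, 1.1, 1.8],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])
print(lepski_select([4, 5, 7, 11], sup_t, theta_star=2.0))  # prints 5
```

In this toy input the first candidate fails because its statistic against the third candidate (3.5) exceeds the critical value, while the second candidate passes against all larger ones.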
The bootstrap critical value is the \((1-\hat{\alpha})\) quantile of the multiplier bootstrap process
\[\sup_{\left\{\left(x, J, J_{2}\right) \in \mathcal{X} \times \hat{\mathcal{J}} \times \hat{\mathcal{J}}: J_{2}>J\right\}} \left|\frac{D_{J}^{*}(x)-D_{J_{2}}^{*}(x)}{\hat{\sigma}_{J, J_{2}}(x)}\right|,\]
where \(D_J^*(x) = (\psi^J(x))' \mathbf{M}_J \hat{\mathbf{u}}_J^*\) is a multiplier bootstrap version of the estimation error, and \(\hat{\sigma}_{J, J_2}^2(x)\) is the estimated variance of the difference in estimators. This procedure avoids the need to select tuning parameters for the test itself and performs well in practice.
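As a rough sketch of the critical-value step, the code below takes precomputed bootstrap draws of \(D_J^*(x)\) and the standard errors \(\hat{\sigma}_{J, J_2}(x)\), forms the studentized differences for every pair with \(J_2 > J\), takes the supremum over pairs and grid points within each draw, and returns the empirical \((1-\alpha)\) quantile. The function name, the array layouts, and the direct use of `alpha` in place of \(\hat{\alpha}\) are assumptions for illustration; how the library builds these inputs is not shown here.

```python
import numpy as np


def bootstrap_critical_value(d_star, sigma, alpha):
    """Empirical (1 - alpha) quantile of the sup statistic over bootstrap draws.

    d_star : array of shape (B, n_J, n_grid); d_star[b, i, :] is the b-th
             bootstrap draw of D*_J(x) for the i-th candidate dimension.
    sigma  : array of shape (n_J, n_J, n_grid); sigma[i, k, :] is the standard
             error of the difference for the pair (i, k). Only k > i is read.
    alpha  : level used for the quantile (stands in for alpha-hat).
    """
    d_star = np.asarray(d_star, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n_j = d_star.shape[1]
    # All index pairs (i, k) with k > i, i.e. J2 > J on an increasing grid.
    i_idx, k_idx = np.triu_indices(n_j, k=1)
    diff = d_star[:, i_idx, :] - d_star[:, k_idx, :]   # (B, n_pairs, n_grid)
    z = np.abs(diff / sigma[i_idx, k_idx, :])          # studentized differences
    sup_per_draw = z.max(axis=(1, 2))                  # sup over pairs and x
    return float(np.quantile(sup_per_draw, 1.0 - alpha))


# Synthetic inputs purely to exercise the function.
rng = np.random.default_rng(0)
d_star = rng.normal(size=(99, 4, 50))
sigma = np.ones((4, 4, 50))
theta_star = bootstrap_critical_value(d_star, sigma, alpha=0.5)
```

Because the supremum runs jointly over \(x\) and all pairs, a single critical value serves every comparison in the selection rule.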
- Parameters:
- y
numpy.ndarray Dependent variable vector.
- x
numpy.ndarray Endogenous regressor matrix.
- w
numpy.ndarray Instrument matrix.
- x_grid
numpy.ndarray, optional Grid points for evaluation. If None, created automatically.
- j_x_degree
int, default=3 Degree of B-spline basis for \(X\).
- k_w_degree
int, default=4 Degree of B-spline basis for \(W\).
- j_x_segments_set
numpy.ndarray, optional Set of \(J\) values to test. If None, uses [1, 3, 7, 15, 31, 63].
- k_w_segments_set
numpy.ndarray, optional Set of \(K\) values to test. If None, computed from \(J\) values.
- knots
{"uniform", "quantiles"}, default="uniform" Knot placement method.
- basis
{"tensor", "additive", "glp"}, default="tensor" Type of basis.
- x_min, x_max, w_min, w_max
float, optional Range limits for basis construction.
- grid_num
int, default=50 Number of grid points for evaluation.
- biters
int, default=99 Number of bootstrap replications.
- alpha
float, default=0.5 Significance level for test.
- check_is_fullrank
bool, default=False Whether to check for full rank.
- seed
int, optional Random seed for reproducibility.
- Returns:
dict Dictionary containing:
- j_tilde: Selected J value
- j_hat: Unadjusted Lepski choice
- j_hat_n: Truncated value
- j_x_seg: Final selected J segments
- k_w_seg: Corresponding K segments
- theta_star: Bootstrap critical value
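A possible call pattern, based only on the signature and return keys documented here, is sketched below. The simulated data, the helper names `simulate_npiv_data` and `select_dimension`, and the input shapes (a 1-D `y` with two-dimensional `x` and `w`) are assumptions for illustration; the accepted shapes should be checked against the installed version.

```python
import numpy as np


def simulate_npiv_data(n=500, seed=0):
    """Generate a toy endogenous design (hypothetical, for illustration only)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=n)        # instrument
    u = rng.normal(scale=0.2, size=n)        # structural error
    v = rng.normal(scale=0.1, size=n)        # first-stage noise
    x = 0.8 * w + 0.5 * u + v                # x shares u with y, so it is endogenous
    y = np.sin(2.0 * np.pi * x) + u          # outcome
    # Assumed layout: y as a vector, x and w as single-column matrices.
    return y, x.reshape(-1, 1), w.reshape(-1, 1)


def select_dimension(y, x, w, seed=0):
    """Call moderndid.npiv_j with the documented keyword arguments."""
    import moderndid  # imported lazily so the data helper works without it

    return moderndid.npiv_j(
        y, x, w,
        knots="uniform",
        basis="tensor",
        biters=99,
        alpha=0.5,
        seed=seed,
    )
```

With the package installed, `result = select_dimension(*simulate_npiv_data())` should return the dictionary described above, so that `result["j_tilde"]` and `result["theta_star"]` give the selected dimension and the bootstrap critical value.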
See also
npiv_choose_j: Full data-driven selection procedure
npiv_jhat_max: Compute maximum feasible dimension
References
[1] Chen, X., Christensen, T. M., & Kankanala, S. (2024). Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities. https://arxiv.org/abs/2107.11869.