# Sieve 2SLS Instrumental Variable Estimation

The sieve-based instrumental variable estimator SieveTSLS uses a two-stage least squares estimation procedure. The user must specify the sieve basis for $$T$$, $$X$$ and $$Y$$ (Hermite polynomials or a set of indicator functions) and the number of basis elements to include in each expansion. Formally, we assume that we can write:

$\begin{split}Y =~& \sum_{d=1}^{d^Y} \sum_{k=1}^{d^X} \beta^Y_{d,k} \psi_d(T) \rho_k(X) + \gamma (X,W) + \epsilon \\ T =~& \sum_{d=1}^{d^T} \sum_{k=1}^{d^X} \beta^T_{d,k} \phi_d(Z) \rho_k(X) + \delta (X,W) + u\end{split}$

where $$\{\psi_d\}$$ is the sieve basis for $$Y$$ with degree $$d^Y$$, $$\{\rho_k\}$$ is the sieve basis for $$X$$ with degree $$d^X$$, $$\{\phi_d\}$$ is the sieve basis for $$T$$ with degree $$d^T$$, $$Z$$ are the instruments, $$(X,W)$$ is the horizontal concatenation of $$X$$ and $$W$$, and $$u$$ and $$\epsilon$$ may be correlated. Note that each basis is named after the equation it appears in, not after its argument: each $$\psi_d$$ is a function from $$\mathbb{R}^{\dim(T)}$$ into $$\mathbb{R}$$, each $$\rho_k$$ is a function from $$\mathbb{R}^{\dim(X)}$$ into $$\mathbb{R}$$, and each $$\phi_d$$ is a function from $$\mathbb{R}^{\dim(Z)}$$ into $$\mathbb{R}$$.
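
To make the interacted features $$\psi_d(T) \rho_k(X)$$ concrete, here is a minimal sketch for scalar $$T$$ and $$X$$. It assumes probabilists' Hermite polynomials as evaluated by NumPy's `hermevander`; the helper names `hermite_features` and `interact` are hypothetical and are not part of the library's API, and the library's own featurizers may use a different normalization.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def hermite_features(v, n_terms):
    """Evaluate He_0, ..., He_{n_terms-1} at each entry of the 1-D array v."""
    return hermevander(np.asarray(v, dtype=float), n_terms - 1)

def interact(a, b):
    """Row-wise products a[i, d] * b[i, k], flattened to shape (n, d_a * d_b)."""
    return (a[:, :, None] * b[:, None, :]).reshape(a.shape[0], -1)

t = np.array([0.0, 1.0, 2.0])
x = np.array([1.0, -1.0, 0.5])

psi = hermite_features(t, 3)    # shape (3, 3): He_0(t), He_1(t), He_2(t)
rho = hermite_features(x, 2)    # shape (3, 2): He_0(x), He_1(x)
features = interact(psi, rho)   # shape (3, 6): psi_d(t_i) * rho_k(x_i)
```

Each row of `features` holds the $$d^Y \cdot d^X$$ regressors for one observation, ordered with $$k$$ varying fastest.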

Our goal is to estimate

$\tau(\vec{t}_0, \vec{t}_1, \vec{x}) = \sum_{d=1}^{d^Y} \sum_{k=1}^{d^X} \beta^Y_{d,k} \rho_k(\vec{x}) \left(\psi_d(\vec{t}_1) - \psi_d(\vec{t}_0)\right)$
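
As an illustration of this formula, the sketch below evaluates $$\tau$$ for a given coefficient matrix. The function `effect`, the toy monomial bases, and the coefficient values are all assumptions made for the example; they are not taken from the library.

```python
import numpy as np

def effect(beta, psi, rho, t0, t1, x):
    """tau(t0, t1, x) = sum_{d,k} beta[d, k] * rho_k(x) * (psi_d(t1) - psi_d(t0))."""
    dpsi = psi(t1) - psi(t0)       # shape (d_Y,)
    return dpsi @ beta @ rho(x)    # scalar

# Toy bases for a scalar treatment and a scalar feature (illustrative only).
psi = lambda t: np.array([t, t ** 2])
rho = lambda x: np.array([1.0, x])

# Hypothetical coefficients beta^Y, with rows indexed by d and columns by k.
beta = np.array([[1.0, 0.5],
                 [0.0, 2.0]])

tau = effect(beta, psi, rho, t0=0.0, t1=1.0, x=2.0)
```

Writing the double sum as the bilinear form `dpsi @ beta @ rho(x)` shows that any constant element of $$\{\psi_d\}$$ cancels in the difference and so does not contribute to $$\tau$$.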

We do this by first estimating each of the functions $$\mathbb{E}[\psi_d(T)|X,Z,W]$$ by linear projection of $$\psi_d(t_i)$$ onto the features $$\{\phi_d(z_i) \rho_k(x_i) \}$$ and $$(x_i,w_i)$$. We then project $$y_i$$ onto these estimated functions, interacted with $$\rho_k(x_i)$$, and $$(x_i,w_i)$$ to arrive at an estimate $$\hat{\beta}^Y$$ whose individual coefficients $$\hat{\beta}^Y_{d,k}$$ can be used to return our estimate of $$\tau$$.
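
The two projections can be sketched with ordinary least squares in NumPy. This is a simplified illustration on simulated data, not the library's implementation: the data-generating process, the choice of bases ($$\psi_1(T) = T$$, $$\rho(X) = (1, X)$$, $$\phi_1(Z) = Z$$), the empty $$W$$, and the helper `ols` are all assumptions. Interacting the first-stage fitted values with $$\rho_k(x_i)$$ in the second stage follows from the model equation for $$Y$$.

```python
import numpy as np

def ols(features, target):
    """Least-squares coefficients of target on features."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)
z = rng.normal(size=n)
u = rng.normal(size=n)
eps = u + 0.5 * rng.normal(size=n)       # correlated with u, so T is endogenous
t = z + 0.5 * z * x + u
y = (1.0 + 2.0 * x) * t + x + eps        # true beta^Y = (1, 2)

ones = np.ones(n)
psi = t[:, None]                          # psi_1(T) = T
rho = np.column_stack([ones, x])          # rho_1(X) = 1, rho_2(X) = X
phi = z[:, None]                          # phi_1(Z) = Z
controls = np.column_stack([x, ones])     # (X, W) plus an intercept; W is empty

# Stage 1: project each psi_d(t_i) onto {phi_d(z_i) rho_k(x_i)} and the controls.
inst = (phi[:, :, None] * rho[:, None, :]).reshape(n, -1)
stage1 = np.column_stack([inst, controls])
psi_hat = stage1 @ ols(stage1, psi)       # fitted E[psi_d(T) | X, Z, W]

# Stage 2: project y_i onto psi_hat_d * rho_k(x_i) and the controls.
treat = (psi_hat[:, :, None] * rho[:, None, :]).reshape(n, -1)
stage2 = np.column_stack([treat, controls])
coef = ols(stage2, y)
beta_y = coef[: treat.shape[1]].reshape(psi.shape[1], rho.shape[1])
```

In this simulation the true effect of moving $$T$$ from 0 to 1 is $$1 + 2x$$, so `beta_y` should be close to `[[1, 2]]`. Regressing `y` directly on `t` and `t * x` would not recover these values, because `eps` is correlated with `u`.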