=================
Inference
=================

Bootstrap Inference
====================

Every estimator can provide bootstrap-based confidence intervals by passing ``inference='bootstrap'`` or
``inference=BootstrapInference(n_bootstrap_samples=100, n_jobs=-1)`` (see :class:`.BootstrapInference`).
These intervals are computed by training multiple clones of the original estimator on bootstrap subsamples
drawn with replacement and then taking the quantiles of the estimate distribution across the clones.
See also :class:`.BootstrapEstimator` for more details on this. For instance:

.. testsetup::

    import numpy as np
    X = np.random.choice(np.arange(5), size=(100,3))
    Y = np.random.normal(size=(100,2))
    y = np.random.normal(size=(100,))
    T = T0 = T1 = np.random.choice(np.arange(3), size=(100,2))
    t = t0 = t1 = T[:,0]
    W = np.random.normal(size=(100,2))

.. testcode::

    from econml.dml import NonParamDML
    from sklearn.ensemble import RandomForestRegressor

    est = NonParamDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_final=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W, inference='bootstrap')
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)

OLS Inference
====================

For estimators where the final stage CATE estimate is based on an Ordinary Least Squares regression,
we offer normality-based confidence intervals by default (leaving the setting ``inference='auto'``
unchanged) or by explicitly setting ``inference='statsmodels'``. Depending on the estimator, you can
also alter the covariance type calculation via ``inference=StatsModelsInference(cov_type='HC1')``
or ``inference=StatsModelsInferenceDiscrete(cov_type='HC1')``, as sketched at the end of this section.
See :class:`.StatsModelsInference` and :class:`.StatsModelsInferenceDiscrete` for more details.
This holds, for instance, for the :class:`.LinearDML` and the :class:`.LinearDRLearner`, e.g.:

.. testcode::

    from econml.dml import LinearDML
    from sklearn.ensemble import RandomForestRegressor

    est = LinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                    model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)

.. testcode::

    from econml.dr import LinearDRLearner
    from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

    est = LinearDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                          model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.effect(X)
    lb, ub = est.effect_interval(X, alpha=0.05)

This inference is enabled by our :class:`.StatsModelsLinearRegression` extension to the scikit-learn
:class:`~sklearn.linear_model.LinearRegression`.
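As a concrete illustration of the covariance-type option mentioned above, here is a minimal sketch,
reusing the synthetic data from the setup above, that requests heteroskedasticity-robust (HC1)
standard errors for the final OLS stage:

.. testcode::

    from econml.dml import LinearDML
    from econml.inference import StatsModelsInference
    from sklearn.ensemble import RandomForestRegressor

    est = LinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                    model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
    # Pass an inference object instead of a string to control the covariance estimator
    est.fit(y, t, X=X, W=W, inference=StatsModelsInference(cov_type='HC1'))
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)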
Debiased Lasso Inference
=========================

For estimators where the final stage CATE estimate is based on a high-dimensional linear model with
a sparsity constraint, we offer confidence intervals using the debiased lasso technique. This holds,
for instance, for the :class:`.SparseLinearDML` and the :class:`.SparseLinearDRLearner`. You can
enable such intervals by default (leaving the setting ``inference='auto'`` unchanged) or by
explicitly setting ``inference='debiasedlasso'``, e.g.:

.. testcode::

    from econml.dml import SparseLinearDML
    from sklearn.ensemble import RandomForestRegressor

    est = SparseLinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                          model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)

.. testcode::

    from econml.dr import SparseLinearDRLearner
    from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

    est = SparseLinearDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                                model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.effect(X)
    lb, ub = est.effect_interval(X, alpha=0.05)

This inference is enabled by our implementation of the :class:`.DebiasedLasso` extension to the
scikit-learn :class:`~sklearn.linear_model.Lasso`.

Subsampled Honest Forest Inference
===================================

For estimators where the final stage CATE estimate is a non-parametric model based on a Random Forest,
we offer confidence intervals via the bootstrap-of-little-bags approach (see [Athey2019]_) for
estimating the uncertainty of an Honest Random Forest. This holds, for instance, for the
:class:`.CausalForestDML` and the :class:`.ForestDRLearner`. Such intervals are enabled by leaving
inference at its default setting of ``'auto'`` or by explicitly setting ``inference='blb'``, e.g.:

.. testcode::

    from econml.dml import CausalForestDML
    from sklearn.ensemble import RandomForestRegressor

    est = CausalForestDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                          model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)

.. testcode::

    from econml.dr import ForestDRLearner
    from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

    est = ForestDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                          model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
    est.fit(y, t, X=X, W=W)
    point = est.effect(X)
    lb, ub = est.effect_interval(X, alpha=0.05)

This inference is enabled by our implementation of the :class:`~econml.grf.RegressionForest` extension
to the scikit-learn :class:`~sklearn.ensemble.RandomForestRegressor`.

OrthoForest Bootstrap of Little Bags Inference
==============================================

For the Orthogonal Random Forest estimators (see :class:`.DMLOrthoForest`, :class:`.DROrthoForest`),
we provide confidence intervals built via the bootstrap-of-little-bags approach ([Athey2019]_).
This technique is well suited for estimating the uncertainty of the honest causal forests underlying
the OrthoForest estimators. Such intervals are enabled by leaving inference at its default setting of
``'auto'`` or by explicitly setting ``inference='blb'``, e.g.:

.. testcode::

    from econml.orf import DMLOrthoForest
    from econml.sklearn_extensions.linear_model import WeightedLasso

    est = DMLOrthoForest(n_trees=10,
                         min_leaf_size=3,
                         model_T=WeightedLasso(alpha=0.01),
                         model_Y=WeightedLasso(alpha=0.01))
    est.fit(y, t, X=X, W=W)
    point = est.const_marginal_effect(X)
    lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)

.. todo::
    * Subsampling
    * Doubly Robust Gradient Inference
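The discrete-treatment :class:`.DROrthoForest` mentioned above is used analogously. The following is
a minimal sketch under the same synthetic data, where the nuisance models are illustrative choices
rather than the estimator's defaults:

.. testcode::

    from econml.orf import DROrthoForest
    from econml.sklearn_extensions.linear_model import WeightedLasso
    from sklearn.linear_model import LogisticRegression

    # t is a discrete treatment taking values in {0, 1, 2}
    est = DROrthoForest(n_trees=10,
                        min_leaf_size=3,
                        propensity_model=LogisticRegression(),
                        model_Y=WeightedLasso(alpha=0.01))
    est.fit(y, t, X=X, W=W)
    point = est.effect(X)  # effect of treatment level 1 relative to the baseline level 0
    lb, ub = est.effect_interval(X, alpha=0.05)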