Inference
Bootstrap Inference
Every estimator can provide bootstrap-based confidence intervals by passing inference='bootstrap' or inference=BootstrapInference(n_bootstrap_samples=100, n_jobs=-1) (see BootstrapInference). These intervals are computed by training multiple clones of the original estimator on bootstrap samples drawn with replacement, and then taking the quantiles of the estimate distribution across the clones. See BootstrapEstimator for more details.
For instance:
from econml.dml import NonParamDML
from sklearn.ensemble import RandomForestRegressor
est = NonParamDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                  model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                  model_final=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W, inference='bootstrap')
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
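The BootstrapInference object form mentioned above can equivalently be passed to fit; a short sketch using the same estimator:
from econml.dml import NonParamDML
from econml.inference import BootstrapInference
from sklearn.ensemble import RandomForestRegressor
est = NonParamDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                  model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                  model_final=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
# Train 100 bootstrap clones, parallelizing across all available cores
est.fit(y, t, X=X, W=W, inference=BootstrapInference(n_bootstrap_samples=100, n_jobs=-1))
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)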
OLS Inference
For estimators where the final stage CATE estimate is based on an Ordinary Least Squares regression, we offer normality-based confidence intervals by default (leaving the setting inference='auto' unchanged), or by explicitly setting inference='statsmodels'. Depending on the estimator, one can also alter the covariance type calculation via inference=StatsModelsInference(cov_type='HC1') or inference=StatsModelsInferenceDiscrete(cov_type='HC1'). See StatsModelsInference and StatsModelsInferenceDiscrete for more details. This holds, for instance, for the LinearDML and the LinearDRLearner, e.g.:
from econml.dml import LinearDML
from sklearn.ensemble import RandomForestRegressor
est = LinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
from econml.dr import LinearDRLearner
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
est = LinearDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.effect(X)
lb, ub = est.effect_interval(X, alpha=0.05)
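To alter the covariance type, the inference object can be passed explicitly; a short sketch using the heteroskedasticity-robust HC1 option from above:
from econml.dml import LinearDML
from econml.inference import StatsModelsInference
from sklearn.ensemble import RandomForestRegressor
est = LinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
# Use heteroskedasticity-robust (HC1) standard errors in the final OLS stage
est.fit(y, t, X=X, W=W, inference=StatsModelsInference(cov_type='HC1'))
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)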
This inference is enabled by our StatsModelsLinearRegression extension to the scikit-learn LinearRegression.
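For illustration, the extension can also be fit on its own as a drop-in replacement for LinearRegression; a minimal sketch, where the predict_interval and coef__interval calls are our assumption about its interval API:
from econml.sklearn_extensions.linear_model import StatsModelsLinearRegression
import numpy as np
X = np.random.normal(size=(1000, 3))
y = X @ np.array([1.0, 0.5, -2.0]) + np.random.normal(size=1000)
model = StatsModelsLinearRegression(cov_type='HC1')
model.fit(X, y)
point = model.predict(X)
# Assumed interval API: normality-based intervals for predictions and coefficients
lb, ub = model.predict_interval(X, alpha=0.05)
coef_lb, coef_ub = model.coef__interval(alpha=0.05)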
Debiased Lasso Inference
For estimators where the final stage CATE estimate is based on a high-dimensional linear model with a sparsity constraint, we offer confidence intervals using the debiased lasso technique. This holds, for instance, for the SparseLinearDML and the SparseLinearDRLearner. You can enable such intervals by default (leaving the setting inference='auto' unchanged), or by explicitly setting inference='debiasedlasso', e.g.:
from econml.dml import SparseLinearDML
from sklearn.ensemble import RandomForestRegressor
est = SparseLinearDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
from econml.dr import SparseLinearDRLearner
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
est = SparseLinearDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                            model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.effect(X)
lb, ub = est.effect_interval(X, alpha=0.05)
This inference is enabled by our implementation of the DebiasedLasso extension to the scikit-learn Lasso.
Subsampled Honest Forest Inference
For estimators where the final stage CATE estimate is a non-parametric model based on a Random Forest, we offer confidence intervals via the bootstrap-of-little-bags approach (see [Athey2019]) for estimating the uncertainty of an Honest Random Forest. This holds, for instance, for the CausalForestDML and the ForestDRLearner. Such intervals are enabled by leaving inference at its default setting of 'auto' or by explicitly setting inference='blb', e.g.:
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor
est = CausalForestDML(model_y=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_t=RandomForestRegressor(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
from econml.dr import ForestDRLearner
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
est = ForestDRLearner(model_regression=RandomForestRegressor(n_estimators=10, min_samples_leaf=10),
                      model_propensity=RandomForestClassifier(n_estimators=10, min_samples_leaf=10))
est.fit(y, t, X=X, W=W)
point = est.effect(X)
lb, ub = est.effect_interval(X, alpha=0.05)
This inference is enabled by our implementation of the RegressionForest extension to the scikit-learn RandomForestRegressor.
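The forest can also be used directly as a regressor with intervals; a minimal sketch, assuming it is importable from econml.grf and exposes a predict_interval method:
from econml.grf import RegressionForest
import numpy as np
X = np.random.normal(size=(1000, 5))
y = X[:, 0] + np.random.normal(size=1000)
# Honest, subsampled forest; intervals come from the bootstrap of little bags
forest = RegressionForest(n_estimators=100, min_samples_leaf=10)
forest.fit(X, y)
point = forest.predict(X)
lb, ub = forest.predict_interval(X, alpha=0.05)  # assumed interval API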
OrthoForest Bootstrap of Little Bags Inference
For the Orthogonal Random Forest estimators (see DMLOrthoForest, DROrthoForest), we provide confidence intervals built via the bootstrap-of-little-bags approach ([Athey2019]). This technique is well suited for estimating the uncertainty of the honest causal forests underlying the OrthoForest estimators. Such intervals are enabled by leaving inference at its default setting of 'auto' or by explicitly setting inference='blb', e.g.:
from econml.orf import DMLOrthoForest
from econml.sklearn_extensions.linear_model import WeightedLasso
est = DMLOrthoForest(n_trees=10,
                     min_leaf_size=3,
                     model_T=WeightedLasso(alpha=0.01),
                     model_Y=WeightedLasso(alpha=0.01))
est.fit(y, t, X=X, W=W)
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)
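The discrete-treatment variant follows the same pattern; a sketch for DROrthoForest, assuming t takes a small number of discrete values and using a LogisticRegression propensity model:
from econml.orf import DROrthoForest
from econml.sklearn_extensions.linear_model import WeightedLasso
from sklearn.linear_model import LogisticRegression
est = DROrthoForest(n_trees=10,
                    min_leaf_size=3,
                    propensity_model=LogisticRegression(),
                    model_Y=WeightedLasso(alpha=0.01))
est.fit(y, t, X=X, W=W)  # t must be discrete for the doubly robust variant
point = est.const_marginal_effect(X)
lb, ub = est.const_marginal_effect_interval(X, alpha=0.05)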