econml.grf._base_grf.BaseGRF
- class econml.grf._base_grf.BaseGRF(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=10, min_samples_leaf=5, min_weight_fraction_leaf=0.0, min_var_fraction_leaf=None, min_var_leaf_on_val=False, max_features='auto', min_impurity_decrease=0.0, max_samples=0.45, min_balancedness_tol=0.45, honest=True, inference=True, fit_intercept=True, subforest_size=4, n_jobs=-1, random_state=None, verbose=0, warm_start=False)
Bases: econml._ensemble._ensemble.BaseEnsemble
Base class for Generalized Random Forests for solving linear moment equations of the form:
E[J * theta(x) - A | X = x] = 0
where J is a (d, d) random matrix, A is a (d, 1) random vector and theta(x) is a local parameter to be estimated, which might contain both relevant and nuisance parameters.
Warning: This class should not be used directly. Use derived classes instead.
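As a minimal sketch of the moment equation E[J * theta(x) - A | X = x] = 0, consider the scalar case d = 1 with J = T * T and A = T * y, which corresponds to a local linear regression of y on T. This choice of J and A is an illustrative assumption and is not prescribed by this base class; the helper name solve_scalar_moment is hypothetical.

```python
def solve_scalar_moment(J, A, weights):
    """Solve sum_i w_i * (J_i * theta - A_i) = 0 for a scalar theta."""
    jac = sum(w * j for w, j in zip(weights, J))
    alpha = sum(w * a for w, a in zip(weights, A))
    return alpha / jac


# Toy data with y = 2 * T exactly (assumed for illustration).
T = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
J = [t * t for t in T]                 # J_i = T_i^2
A = [t * yi for t, yi in zip(T, y)]    # A_i = T_i * y_i
weights = [1.0, 1.0, 1.0]              # uniform stand-in for kernel weights

theta = solve_scalar_moment(J, A, weights)
print(theta)
```

In a forest, the weights would come from the leaves that a target point x shares with the training samples rather than being uniform.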
- __init__(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=10, min_samples_leaf=5, min_weight_fraction_leaf=0.0, min_var_fraction_leaf=None, min_var_leaf_on_val=False, max_features='auto', min_impurity_decrease=0.0, max_samples=0.45, min_balancedness_tol=0.45, honest=True, inference=True, fit_intercept=True, subforest_size=4, n_jobs=-1, random_state=None, verbose=0, warm_start=False)
Methods
__init__([n_estimators, criterion, ...])
apply(X) – Apply trees in the forest to X, return leaf indices.
decision_path(X) – Return the decision path in the forest.
feature_importances([max_depth, ...]) – The feature importances based on the amount of parameter heterogeneity they create.
fit(X, T, y, *[, sample_weight]) – Build a forest of trees from the training set (X, T, y) and any other auxiliary variables.
get_params([deep]) – Get parameters for this estimator.
get_subsample_inds() – Re-generate the exact same sample indices as those at fit time using the same pseudo-randomness.
oob_predict(Xtrain) – Return the relevant output predictions for each of the training data points, when only trees where that data point was not used are incorporated.
predict(X[, interval, alpha]) – Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].
predict_alpha_and_jac(X[, slice, parallel]) – Return the value of the conditional jacobian E[J | X=x] and the conditional alpha E[A | X=x] using the forest as kernel weights.
predict_and_var(X) – Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs] and their covariance matrix.
predict_full(X[, interval, alpha]) – Return the fitted local parameters for each x in X, i.e. theta(x).
predict_interval(X[, alpha]) – Return the confidence interval for the relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].
predict_moment_and_var(X, parameter[, ...]) – Return the value of the conditional expected moment vector at each sample and for the given parameter estimate for each sample.
predict_projection(X, projector) – Return the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x).
predict_projection_and_var(X, projector) – Return the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x), as well as the variance of that inner product.
predict_projection_var(X, projector) – Return the variance of the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x).
predict_tree_average(X) – Return the prefix of relevant fitted local parameters for each X, i.e. theta(X)[1..n_relevant_outputs].
predict_tree_average_full(X) – Return the fitted local parameters for each X, i.e. theta(X).
predict_var(X) – Return the covariance matrix of the prefix of relevant fitted local parameters for each x in X.
prediction_stderr(X) – Return the standard deviation of each coordinate of the prefix of relevant fitted local parameters for each x in X.
set_params(**params) – Set the parameters of this estimator.
Attributes
feature_importances_
- apply(X)
Apply trees in the forest to X, return leaf indices.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
X_leaves – For each datapoint x in X and for each tree in the forest, return the index of the leaf x ends up in.
- Return type
ndarray of shape (n_samples, n_estimators)
- decision_path(X)
Return the decision path in the forest.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
indicator (sparse matrix of shape (n_samples, n_nodes)) – A node indicator matrix in which nonzero elements indicate that the sample goes through the corresponding node. The matrix is in CSR format.
n_nodes_ptr (ndarray of shape (n_estimators + 1,)) – The columns from indicator[n_nodes_ptr[i]:n_nodes_ptr[i+1]] give the indicator values for the i-th estimator.
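The role of n_nodes_ptr can be sketched with a dense stand-in for one row of the indicator matrix. The forest layout below (two trees with 3 and 5 nodes) and the helper nodes_visited are hypothetical; with a real CSR matrix one would slice columns instead of a Python list.

```python
# Hypothetical forest with 2 trees: tree 0 has 3 nodes, tree 1 has 5 nodes.
n_nodes_ptr = [0, 3, 8]

# Dense stand-in for one row (one sample) of the indicator matrix.
# Columns 0-2 belong to tree 0, columns 3-7 belong to tree 1.
indicator_row = [1, 0, 1, 1, 1, 0, 0, 1]


def nodes_visited(indicator_row, n_nodes_ptr, i):
    """Return the tree-local node ids that the sample visits in tree i."""
    start, stop = n_nodes_ptr[i], n_nodes_ptr[i + 1]
    return [k for k, flag in enumerate(indicator_row[start:stop]) if flag]


path_tree0 = nodes_visited(indicator_row, n_nodes_ptr, 0)
path_tree1 = nodes_visited(indicator_row, n_nodes_ptr, 1)
print(path_tree0, path_tree1)
```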
- feature_importances(max_depth=4, depth_decay_exponent=2.0)
The feature importances based on the amount of parameter heterogeneity they create. The higher, the more important the feature. The importance of a feature is computed as the (normalized) total heterogeneity that the feature creates. Each split on the feature, in each tree, adds:
parent_weight * (left_weight * right_weight) * mean((value_left[k] - value_right[k])**2) / parent_weight**2
to the importance of the feature. Each such quantity is also weighted by the depth of the split. These importances are normalized at the tree level and then averaged across trees.
- Parameters
max_depth (int, default 4) – Splits of depth larger than max_depth are not used in this calculation.
depth_decay_exponent (double, default 2.0) – The contribution of each split to the total score is re-weighted by 1 / (1 + depth)**depth_decay_exponent (2.0 by default).
- Returns
feature_importances_ – The normalized importance of each feature, measured by the total parameter heterogeneity it induces.
- Return type
ndarray of shape (n_features,)
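The per-split quantity and the depth re-weighting described above can be sketched for a single tree as follows. This is an illustration only: it assumes parent_weight = left_weight + right_weight, uses made-up split values, and the helper split_contribution is hypothetical rather than part of the library.

```python
def split_contribution(left_weight, right_weight, value_left, value_right,
                       depth, depth_decay_exponent=2.0):
    """Depth-weighted heterogeneity contributed by one split."""
    parent_weight = left_weight + right_weight  # assumption for this sketch
    sq_diffs = [(l - r) ** 2 for l, r in zip(value_left, value_right)]
    mean_sq = sum(sq_diffs) / len(sq_diffs)
    raw = parent_weight * (left_weight * right_weight) * mean_sq / parent_weight ** 2
    return raw / (1 + depth) ** depth_decay_exponent


# (feature index, contribution) for two hypothetical splits in one tree.
splits = [
    (0, split_contribution(2.0, 2.0, [1.0, 0.0], [0.0, 0.0], depth=1)),
    (1, split_contribution(3.0, 1.0, [2.0, 2.0], [0.0, 0.0], depth=0)),
]

# Accumulate per feature, then normalize at the tree level.
totals = {}
for feature, score in splits:
    totals[feature] = totals.get(feature, 0.0) + score
norm = sum(totals.values())
importances = {f: s / norm for f, s in totals.items()}
print(importances)
```

The deeper split on feature 0 is discounted by 1 / (1 + 1)**2, so the root split on feature 1 dominates the normalized importances.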
- fit(X, T, y, *, sample_weight=None, **kwargs)
Build a forest of trees from the training set (X, T, y) and any other auxiliary variables.
- Parameters
X (array_like of shape (n_samples, n_features)) – The training input samples. Internally, its dtype will be converted to dtype=np.float64.
T (array_like of shape (n_samples, n_treatments)) – The treatment vector for each sample.
y (array_like of shape (n_samples,) or (n_samples, n_outcomes)) – The outcome values for each sample.
sample_weight (array_like of shape (n_samples,), default None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node.
**kwargs (dictionary of array_like items of shape (n_samples, d_var)) – Auxiliary random variables that go into the moment function (e.g. instrument, censoring, etc.). These variables are passed as is to the get_pointJ and get_alpha methods of the child classes.
- Returns
self
- Return type
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
- get_subsample_inds()
Re-generate the exact same sample indices as those at fit time using the same pseudo-randomness.
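The idea of re-generating indices from the same pseudo-randomness can be sketched with a seeded generator. This is only an illustration of the principle: the function draw_subsample_inds is hypothetical, and the library's actual subsampling scheme is not reproduced here.

```python
import random


def draw_subsample_inds(n_samples, n_trees, max_samples, seed):
    """Draw one index subsample per tree from a generator seeded with `seed`."""
    rng = random.Random(seed)
    size = int(max_samples * n_samples)
    return [sorted(rng.sample(range(n_samples), size)) for _ in range(n_trees)]


# Drawing twice with the same seed reproduces the same subsamples.
first = draw_subsample_inds(20, 3, 0.45, seed=123)
again = draw_subsample_inds(20, 3, 0.45, seed=123)
print(first == again)
```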
- oob_predict(Xtrain)
Return the relevant output predictions for each of the training data points, when only trees where that data point was not used are incorporated. This method is not available if the estimator was trained with warm_start=True.
- Parameters
Xtrain ((n_training_samples, n_features) matrix) – Must be exactly the same X matrix that was passed to the forest at fit time.
- Returns
oob_preds – The out-of-bag predictions of the relevant output parameters for each of the training points.
- Return type
(n_training_samples, n_relevant_outputs) matrix
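The out-of-bag idea can be sketched as follows: for each training point, combine only the trees whose subsample did not contain that point. The plain averaging of per-tree predictions used here is a simplifying assumption, the data are made up, and oob_average is a hypothetical helper.

```python
def oob_average(tree_preds, tree_inds, n_samples):
    """Average per-tree predictions over trees that did not use each sample."""
    out = []
    for i in range(n_samples):
        vals = [preds[i] for preds, inds in zip(tree_preds, tree_inds)
                if i not in inds]
        # A sample used by every tree has no out-of-bag prediction.
        out.append(sum(vals) / len(vals) if vals else float("nan"))
    return out


# Training indices used by each of 3 hypothetical trees.
tree_inds = [{0, 1}, {1, 2}, {0, 2}]
# Per-tree predictions for the 3 training samples.
tree_preds = [[1.0, 1.0, 4.0], [2.0, 2.0, 2.0], [3.0, 6.0, 3.0]]

oob = oob_average(tree_preds, tree_inds, 3)
print(oob)
```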
- predict(X, interval=False, alpha=0.05)
Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
interval (bool, default False) – Whether to return a confidence interval too.
alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.
- Returns
theta(X)[1, .., n_relevant_outputs] (array_like of shape (n_samples, n_relevant_outputs)) – The estimated relevant parameters for each row of X
lb(x), ub(x) (array_like of shape (n_samples, n_relevant_outputs)) – The lower and upper end of the confidence interval for each parameter. Return value is omitted if interval=False.
- predict_alpha_and_jac(X, slice=None, parallel=True)
Return the value of the conditional jacobian E[J | X=x] and the conditional alpha E[A | X=x] using the forest as kernel weights, i.e.:
alpha(x) = (1/n_trees) sum_{trees} (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] A[i]
jac(x) = (1/n_trees) sum_{trees} (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] J[i]
where w[i] is the sample weight (1.0 if sample_weight is None).
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
slice (list of int or None, default None) – If not None, then only the trees with index in slice will be used to calculate the mean and the variance.
parallel (bool, default True) – Whether the averaging should happen using parallelism or not. Parallelism adds some overhead but makes it faster with many trees.
- Returns
alpha (array_like of shape (n_samples, n_outputs)) – The estimated conditional A, alpha(x) for each sample x in X
jac (array_like of shape (n_samples, n_outputs, n_outputs)) – The estimated conditional J, jac(x) for each sample x in X
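The two averaging formulas above can be sketched in the scalar case (n_outputs = 1). The leaf contents below are made up, and the helpers leaf_average and forest_alpha_and_jac are hypothetical names used only for this illustration.

```python
def leaf_average(values, weights):
    """(1/|leaf|) * sum_i w[i] * values[i] over the samples in one leaf."""
    return sum(w * v for w, v in zip(weights, values)) / len(values)


def forest_alpha_and_jac(leaves):
    """Average leaf estimates over trees; `leaves` holds one (A, J, w) per tree."""
    n_trees = len(leaves)
    alpha = sum(leaf_average(A, w) for A, J, w in leaves) / n_trees
    jac = sum(leaf_average(J, w) for A, J, w in leaves) / n_trees
    return alpha, jac


# For one target point x: the samples in leaf(x) of each tree.
leaves = [
    ([2.0, 4.0], [1.0, 1.0], [1.0, 1.0]),  # tree 0: leaf alpha 3.0, leaf jac 1.0
    ([6.0], [3.0], [1.0]),                 # tree 1: leaf alpha 6.0, leaf jac 3.0
]

alpha, jac = forest_alpha_and_jac(leaves)
print(alpha, jac)
```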
- predict_and_var(X)
Return the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs] and their covariance matrix.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
theta(x)[1, .., n_relevant_outputs] (array_like of shape (n_samples, n_relevant_outputs)) – The estimated relevant parameters for each row of X
var(theta(x)) (array_like of shape (n_samples, n_relevant_outputs, n_relevant_outputs)) – The covariance of theta(x)[1, .., n_relevant_outputs]
- predict_full(X, interval=False, alpha=0.05)
Return the fitted local parameters for each x in X, i.e. theta(x).
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
interval (bool, default False) – Whether to return a confidence interval too.
alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.
- Returns
theta(x) (array_like of shape (n_samples, n_outputs)) – The estimated parameters for each row x of X
lb(x), ub(x) (array_like of shape (n_samples, n_outputs)) – The lower and upper end of the confidence interval for each parameter. Return value is omitted if interval=False.
- predict_interval(X, alpha=0.05)
Return the confidence interval for the relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs].
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
alpha (float in (0, 1), default 0.05) – The confidence level of the confidence interval. Returns a symmetric (alpha/2, 1-alpha/2) confidence interval.
- Returns
lb(x), ub(x) – The lower and upper end of the confidence interval for each parameter.
- Return type
array_like of shape (n_samples, n_relevant_outputs)
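One common way to build a symmetric (alpha/2, 1-alpha/2) interval, assumed here for illustration and not confirmed by the text above, is a normal approximation around the point estimate using a standard error of the kind prediction_stderr returns. The helper normal_interval and the numbers are hypothetical.

```python
from statistics import NormalDist


def normal_interval(point, stderr, alpha=0.05):
    """Symmetric (alpha/2, 1 - alpha/2) interval under a normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return point - z * stderr, point + z * stderr


# Hypothetical point estimate and standard error for one parameter.
lb, ub = normal_interval(1.0, 0.5, alpha=0.05)
print(lb, ub)
```

A smaller alpha yields a wider interval, since the quantile z grows as alpha shrinks.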
- predict_moment_and_var(X, parameter, slice=None, parallel=True)
Return the value of the conditional expected moment vector at each sample and for the given parameter estimate for each sample:
M(x; theta(x)) := E[J | X=x] theta(x) - E[A | X=x]
where conditional expectations are estimated based on the forest weights, i.e.:
M_tree(x; theta(x)) := (1/ |leaf(x)|) sum_{val sample i in leaf(x)} w[i] (J[i] theta(x) - A[i])
M(x; theta(x)) = (1/n_trees) sum_{trees} M_tree(x; theta(x))
where w[i] is the sample weight (1.0 if sample_weight is None), as well as the variance of the local moment vector across trees:
Var(M_tree(x; theta(x))) = (1/n_trees) sum_{trees} M_tree(x; theta(x)) @ M_tree(x; theta(x)).T
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
parameter (array_like of shape (n_samples, n_outputs)) – An estimate of the parameter theta(x) for each sample x in X
slice (list of int or None, default None) – If not None, then only the trees with index in slice will be used to calculate the mean and the variance.
parallel (bool, default True) – Whether the averaging should happen using parallelism or not. Parallelism adds some overhead but makes it faster with many trees.
- Returns
moment (array_like of shape (n_samples, n_outputs)) – The estimated conditional moment M(x; theta(x)) for each sample x in X
moment_var (array_like of shape (n_samples, n_outputs)) – The variance of the conditional moment Var(M_tree(x; theta(x))) across trees for each sample x
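The moment and its across-tree variance can be sketched in the scalar case, where M_tree @ M_tree.T reduces to M_tree squared. The leaf contents and the candidate parameter value are made up, and the helper names are hypothetical.

```python
def tree_moment(A, J, w, theta):
    """M_tree(x; theta) = (1/|leaf|) * sum_i w[i] * (J[i] * theta - A[i])."""
    return sum(wi * (j * theta - a) for a, j, wi in zip(A, J, w)) / len(A)


def forest_moment_and_var(leaves, theta):
    """Return the tree-averaged moment and the mean of squared tree moments."""
    per_tree = [tree_moment(A, J, w, theta) for A, J, w in leaves]
    n_trees = len(per_tree)
    moment = sum(per_tree) / n_trees
    moment_var = sum(m * m for m in per_tree) / n_trees
    return moment, moment_var


# For one target point x: (A, J, w) for the samples in leaf(x) of each tree.
leaves = [
    ([2.0, 4.0], [1.0, 1.0], [1.0, 1.0]),  # tree moment at theta=2: -1.0
    ([6.0], [3.0], [1.0]),                 # tree moment at theta=2: 0.0
]

moment, moment_var = forest_moment_and_var(leaves, theta=2.0)
print(moment, moment_var)
```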
- predict_projection(X, projector)
Return the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x), i.e.:
mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X
- Returns
mu(x) – The estimated inner product of the relevant parameters with the projector for each row x of X
- Return type
array_like of shape (n_samples, 1)
- predict_projection_and_var(X, projector)
Return the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x), i.e.:
mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>
as well as the variance of mu(x).
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X
- Returns
mu(x) (array_like of shape (n_samples, 1)) – The estimated inner product of the relevant parameters with the projector for each row x of X
var(mu(x)) (array_like of shape (n_samples, 1)) – The variance of the estimated inner product
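For a single sample, the projection mu(x) is a plain inner product. If a covariance matrix V of the relevant parameters is available (for example, of the kind predict_var returns), the variance of a linear combination is p' V p. Whether the library computes the variance exactly this way is an assumption; the values and helper names below are hypothetical.

```python
def project(theta, projector):
    """Inner product <theta, projector> for one sample."""
    return sum(t * p for t, p in zip(theta, projector))


def projection_variance(cov, projector):
    """Variance of <theta, projector> given Cov(theta): p' V p."""
    n = len(projector)
    return sum(projector[i] * cov[i][j] * projector[j]
               for i in range(n) for j in range(n))


theta = [1.0, 3.0]                  # relevant parameters at one x
projector = [1.0, -1.0]             # contrast between the two parameters
cov = [[0.5, 0.25], [0.25, 1.0]]    # assumed covariance of theta at x

mu = project(theta, projector)
var_mu = projection_variance(cov, projector)
print(mu, var_mu)
```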
- predict_projection_var(X, projector)
Return the variance of the inner product of the prefix of relevant fitted local parameters for each x in X, i.e. theta(x)[1..n_relevant_outputs], with a projector vector projector(x), i.e.:
Var(mu(x)) for mu(x) := <theta(x)[1..n_relevant_outputs], projector(x)>
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
projector (array_like of shape (n_samples, n_relevant_outputs)) – The projector vector for each sample x in X
- Returns
var(mu(x)) – The variance of the estimated inner product
- Return type
array_like of shape (n_samples, 1)
- predict_tree_average(X)
Return the prefix of relevant fitted local parameters for each X, i.e. theta(X)[1..n_relevant_outputs]. This method simply returns the average of the parameters estimated by each tree. predict should be preferred over predict_tree_average, as it performs a more stable averaging across trees.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
theta(X)[1, .., n_relevant_outputs] – The estimated relevant parameters for each row of X
- Return type
array_like of shape (n_samples, n_relevant_outputs)
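A sketch of why averaging per-tree parameters can be less stable than averaging the moment ingredients first. It assumes, as an interpretation not stated above, that the preferred method solves the moment equation using forest-averaged alpha and jac, whereas the tree average solves per tree and then averages. Scalar case, made-up numbers.

```python
# (alpha_t, jac_t) for one target point x in three hypothetical trees.
# The third tree has a nearly singular jac, so its own estimate is extreme.
trees = [(2.0, 1.0), (2.2, 1.0), (0.1, 0.01)]

# Tree average: solve jac_t * theta_t = alpha_t per tree, then average.
per_tree_theta = [a / j for a, j in trees]
tree_avg = sum(per_tree_theta) / len(trees)

# Pooled solve: average alpha and jac across trees, then solve once.
mean_alpha = sum(a for a, _ in trees) / len(trees)
mean_jac = sum(j for _, j in trees) / len(trees)
pooled = mean_alpha / mean_jac

print(tree_avg, pooled)
```

The pooled estimate stays close to the two well-conditioned trees, while the tree average is pulled toward the extreme per-tree value.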
- predict_tree_average_full(X)
Return the fitted local parameters for each X, i.e. theta(X). This method simply returns the average of the parameters estimated by each tree. predict_full should be preferred over predict_tree_average_full, as it performs a more stable averaging across trees.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
theta(X) – The estimated parameters for each row of X
- Return type
array_like of shape (n_samples, n_outputs)
- predict_var(X)
Return the covariance matrix of the prefix of relevant fitted local parameters for each x in X.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
var(theta(x)) – The covariance of theta(x)[1, .., n_relevant_outputs]
- Return type
array_like of shape (n_samples, n_relevant_outputs, n_relevant_outputs)
- prediction_stderr(X)
Return the standard deviation of each coordinate of the prefix of relevant fitted local parameters for each x in X.
- Parameters
X (array_like of shape (n_samples, n_features)) – The input samples. Internally, it will be converted to dtype=np.float64.
- Returns
std(theta(x)) – The standard deviation of each theta(x)[i] for i in {1, .., n_relevant_outputs}
- Return type
array_like of shape (n_samples, n_relevant_outputs)
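The per-coordinate standard deviation relates to a covariance matrix through the square roots of its diagonal. The sketch below shows that relation for one sample with a made-up covariance; stderr_from_cov is a hypothetical helper, not a library function.

```python
import math


def stderr_from_cov(cov):
    """Square roots of the diagonal of a covariance matrix."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]


# Assumed covariance of the relevant parameters at one x.
cov = [[0.25, 0.1], [0.1, 4.0]]
stderr = stderr_from_cov(cov)
print(stderr)
```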
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
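The <component>__<parameter> naming convention can be sketched by splitting each key at the first double underscore. This is an illustration of the convention only, not the estimator's implementation; the component name "forest" and the helper split_nested_params are hypothetical.

```python
def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> entries."""
    own, nested = {}, {}
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # Route the remainder of the key to the named component.
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params({
    "n_estimators": 200,
    "forest__max_depth": 5,     # "forest" is a hypothetical component name
    "forest__honest": True,
})
print(own, nested)
```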