econml.sklearn_extensions.linear_model.StatsModelsLinearRegression
- class econml.sklearn_extensions.linear_model.StatsModelsLinearRegression(fit_intercept=True, cov_type='HC0', *, enable_federation=False)
Bases: econml.sklearn_extensions.linear_model._StatsModelsWrapper
Class which mimics weighted linear regression from the statsmodels package.
However, unlike statsmodels WLS, this class also supports sample variances in addition to sample weights, which enables more accurate inference when working with summarized data.
- Parameters
fit_intercept (bool, default True) – Whether to fit an intercept in this model
cov_type (string, default “HC0”) – The covariance approach to use. Supported values are “HC0”, “HC1”, and “nonrobust”.
enable_federation (bool, default False) – Whether to enable federation (aggregating this model’s results with other models in a distributed setting). This requires additional memory proportional to the number of columns in X to the fourth power.
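The three cov_type values correspond to standard covariance estimators for least squares. The sketch below shows the textbook formulas for the simplest case, a single regressor with no intercept; it is an illustration under that assumption, not this class's implementation, and the helper name slope_variances is hypothetical.

```python
import math


def slope_variances(x, y):
    """Variance estimates for the slope b in y = b*x + e (one regressor, no intercept)."""
    n, k = len(x), 1  # k = number of fitted parameters
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - b * xi for xi, yi in zip(x, y)]

    # nonrobust: one common error variance, estimated with n - k degrees of freedom
    nonrobust = sum(e * e for e in resid) / (n - k) / sxx
    # HC0: sandwich form, each observation contributes its own squared residual
    hc0 = sum((xi * e) ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    # HC1: HC0 scaled by the small-sample correction n / (n - k)
    hc1 = hc0 * n / (n - k)
    return b, {"nonrobust": nonrobust, "HC0": hc0, "HC1": hc1}


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.4, 3.6, 5.6]
b, variances = slope_variances(x, y)
for name, var in variances.items():
    print(name, math.sqrt(var))  # standard error under each approach
```

HC0 and HC1 remain valid when the error variance differs across observations, while “nonrobust” assumes a constant error variance.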
Methods
__init__([fit_intercept, cov_type, ...])
aggregate(models) – Aggregate multiple models into one.
coef__interval([alpha]) – Gets a confidence interval bounding the fitted coefficients.
fit(X, y[, sample_weight, freq_weight, ...]) – Fits the model.
get_params([deep]) – Get parameters for this estimator.
intercept__interval([alpha]) – Gets a confidence interval bounding the intercept(s) (or 0 if no intercept was fit).
predict(X) – Predicts the output given an array of instances.
predict_interval(X[, alpha]) – Gets a confidence interval bounding the prediction.
prediction_stderr(X) – Gets the standard error of the predictions.
set_params(**params) – Set the parameters of this estimator.
Attributes
coef_ – Get the model's coefficients on the covariates.
coef_stderr_ – Gets the standard error of the fitted coefficients.
intercept_ – Get the intercept(s) (or 0 if no intercept was fit).
intercept_stderr_ – Gets the standard error of the intercept(s) (or 0 if no intercept was fit).
- static aggregate(models: List[econml.sklearn_extensions.linear_model.StatsModelsLinearRegression])
Aggregate multiple models into one.
- Parameters
models (list of StatsModelsLinearRegression) – The models to aggregate
- Returns
agg_model – The aggregated model
- Return type
StatsModelsLinearRegression
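To see why models fitted on separate data partitions can be combined at all, consider the coefficients alone: least squares depends on the data only through a few sums, so adding those sums across partitions reproduces the pooled fit. The sketch below shows this for one regressor with an intercept. It is a conceptual illustration only; the helper names are hypothetical, and it makes no claim about how aggregate is implemented or how standard errors are combined.

```python
def sufficient_stats(x, y):
    """Sums that determine the least-squares fit of y = a + b*x."""
    return (
        len(x),
        sum(x),
        sum(y),
        sum(xi * xi for xi in x),
        sum(xi * yi for xi, yi in zip(x, y)),
    )


def fit_from_stats(stats):
    """Solve the normal equations from the sums; returns (intercept, slope)."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def aggregate_stats(all_stats):
    """Combine partitions by adding their sums element-wise."""
    return tuple(sum(parts) for parts in zip(*all_stats))


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.9, 5.2, 6.8, 9.1, 11.0]

site_a = sufficient_stats(x[:3], y[:3])  # first partition
site_b = sufficient_stats(x[3:], y[3:])  # second partition

pooled = fit_from_stats(sufficient_stats(x, y))
federated = fit_from_stats(aggregate_stats([site_a, site_b]))
print(pooled, federated)  # the two fits agree up to floating-point error
```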
- coef__interval(alpha=0.05)
Gets a confidence interval bounding the fitted coefficients.
- Parameters
alpha (float, default 0.05) – The significance level; the interval has confidence level 1 - alpha. The bounds are the alpha/2-quantile and the (1-alpha/2)-quantile of the parameter distribution.
- Returns
coef__interval – The lower and upper bounds of the confidence interval of the coefficients
- Return type
{tuple ((p, d) array, (p, d) array), tuple ((d,) array, (d,) array)}
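As a rough illustration of how alpha maps to interval bounds, the sketch below takes the alpha/2 and (1-alpha/2) quantiles of a normal distribution centered on a point estimate. The normal approximation is an assumption made here for illustration (this page does not state which distribution is used), and normal_interval is a hypothetical helper.

```python
from statistics import NormalDist


def normal_interval(estimate, stderr, alpha=0.05):
    """Two-sided interval from the alpha/2 and (1 - alpha/2) normal quantiles."""
    dist = NormalDist(mu=estimate, sigma=stderr)
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1 - alpha / 2)


# A coefficient estimate of 2.0 with standard error 0.5, at alpha = 0.05
lower, upper = normal_interval(estimate=2.0, stderr=0.5, alpha=0.05)
print(lower, upper)
```

A smaller alpha gives a wider interval, since the quantiles move further into the tails.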
- fit(X, y, sample_weight=None, freq_weight=None, sample_var=None)
Fits the model.
- Parameters
X ((N, d) nd array_like) – covariates
y ({(N,), (N, p)} nd array_like) – output variable(s)
sample_weight ((N,) array_like or None) – Individual weights for each sample. If None, it assumes equal weight.
freq_weight ((N,) array_like of int or None) – Weight for the observation. Observation i is treated as the mean outcome of freq_weight[i] independent observations. When sample_var is not None, this should be provided.
sample_var ({(N,), (N, p)} nd array_like or None) – Variance of the outcome(s) of the original freq_weight[i] observations that were used to compute the mean outcome represented by observation i.
- Returns
self
- Return type
StatsModelsLinearRegression
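The freq_weight and sample_var arguments describe summarized data: each row stands for a group of raw observations that share the same covariates. The sketch below shows one way such a summary could be built from raw rows. The summarize helper is hypothetical, and the use of the population variance (dividing by the group size) is an assumption, since this page does not say which variance convention is expected.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def summarize(X_rows, y):
    """Collapse raw observations with identical covariates into one summary row each."""
    groups = defaultdict(list)
    for row, yi in zip(X_rows, y):
        groups[tuple(row)].append(yi)

    X_sum = [list(row) for row in groups]                 # one row per distinct covariate value
    y_mean = [fmean(v) for v in groups.values()]          # mean outcome of each group
    freq_weight = [len(v) for v in groups.values()]       # number of raw observations per group
    sample_var = [pvariance(v) for v in groups.values()]  # outcome variance within each group
    return X_sum, y_mean, freq_weight, sample_var


X_rows = [(0,), (0,), (0,), (1,), (1,)]
y = [1.0, 2.0, 3.0, 4.0, 6.0]
X_sum, y_mean, freq_weight, sample_var = summarize(X_rows, y)
print(X_sum, y_mean, freq_weight, sample_var)
```

The resulting arrays line up with the signature above: X_sum as X, y_mean as y, and freq_weight and sample_var under the arguments of the same names.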
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
- intercept__interval(alpha=0.05)
Gets a confidence interval bounding the intercept(s) (or 0 if no intercept was fit).
- Parameters
alpha (float, default 0.05) – The significance level; the interval has confidence level 1 - alpha. The bounds are the alpha/2-quantile and the (1-alpha/2)-quantile of the parameter distribution.
- Returns
intercept__interval – The lower and upper bounds of the confidence interval of the intercept(s)
- Return type
- predict(X)
Predicts the output given an array of instances.
- Parameters
X ((n, d) array_like) – The covariates on which to predict
- Returns
predictions – The predicted mean outcomes
- Return type
{(n,) array, (n, p) array}
- predict_interval(X, alpha=0.05)
Gets a confidence interval bounding the prediction.
- Parameters
X ((n, d) array_like) – The covariates on which to predict
alpha (float, default 0.05) – The significance level; the interval has confidence level 1 - alpha. The bounds are the alpha/2-quantile and the (1-alpha/2)-quantile of the parameter distribution.
- Returns
prediction_intervals – The lower and upper bounds of the confidence intervals of the predicted mean outcomes
- Return type
{tuple ((n,) array, (n,) array), tuple ((n, p) array, (n, p) array)}
- prediction_stderr(X)
Gets the standard error of the predictions.
- Parameters
X ((n, d) array_like) – The covariates at which to predict
- Returns
prediction_stderr – The standard error of each coordinate of the output at each point we predict
- Return type
(n, p) array_like
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- property coef_
Get the model’s coefficients on the covariates.
- Returns
coef_ – The coefficients of the variables in the linear regression. If the label y was p-dimensional, then the result is a matrix of coefficients whose j-th row contains the coefficients corresponding to the j-th coordinate of the label.
- Return type
{(d,), (p, d)} nd array_like
- property coef_stderr_
Gets the standard error of the fitted coefficients.
- Returns
coef_stderr_ – The standard error of the coefficients
- Return type
{(d,), (p, d)} nd array_like
- property intercept_
Get the intercept(s) (or 0 if no intercept was fit).
- Returns
intercept_ – The intercept of the linear regression. If the label y was p-dimensional, then the result is a vector whose j-th entry contains the intercept corresponding to the j-th coordinate of the label.
- Return type
float or (p,) nd array_like