econml.sklearn_extensions.linear_model.StatsModelsLinearRegression

class econml.sklearn_extensions.linear_model.StatsModelsLinearRegression(fit_intercept=True, cov_type='HC0', *, enable_federation=False)

Bases: econml.sklearn_extensions.linear_model._StatsModelsWrapper

Class which mimics weighted linear regression from the statsmodels package.

However, unlike statsmodels WLS, this class also supports sample variances in addition to sample weights, which enables more accurate inference when working with summarized data.

Parameters
  • fit_intercept (bool, default True) – Whether to fit an intercept in this model

  • cov_type (string, default “HC0”) – The covariance approach to use. Supported values are “HC0”, “HC1”, and “nonrobust”.

  • enable_federation (bool, default False) – Whether to enable federation (aggregating this model’s results with other models in a distributed setting). This requires additional memory proportional to the number of columns in X to the fourth power.
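
The three cov_type values correspond to familiar covariance estimators for least-squares coefficients. The sketch below computes the textbook versions of each with NumPy for an unweighted fit. It is an illustration only and is not taken from the EconML source; the helper name ols_covariances and the demo data are hypothetical, and the class itself may differ in details such as its handling of weights and sample variances.

```python
import numpy as np


def ols_covariances(X, y):
    """Return OLS coefficients and three textbook covariance estimates.

    X is assumed to already contain a column of ones if an intercept
    is wanted (hypothetical helper, unweighted case only).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    bread = np.linalg.inv(X.T @ X)          # (X'X)^{-1}
    beta = bread @ X.T @ y                  # OLS coefficients
    resid = y - X @ beta

    # Classical estimate: one residual variance shared by all observations.
    nonrobust = bread * (resid @ resid) / (n - k)

    # HC0: heteroskedasticity-robust sandwich with squared residuals.
    meat = X.T @ (X * resid[:, None] ** 2)
    hc0 = bread @ meat @ bread

    # HC1: HC0 with a degrees-of-freedom correction n / (n - k).
    hc1 = hc0 * n / (n - k)

    return beta, {"nonrobust": nonrobust, "HC0": hc0, "HC1": hc1}


x = np.arange(6, dtype=float)
X_demo = np.column_stack([np.ones_like(x), x])
y_demo = 1.0 + 2.0 * x + np.array([0.1, -0.2, 0.3, -0.4, 0.5, -0.3])
beta_demo, cov_demo = ols_covariances(X_demo, y_demo)
```

Standard errors are the square roots of the diagonal of whichever matrix is selected, so HC1 standard errors are always larger than HC0 ones by a factor of sqrt(n / (n - k)).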

__init__(fit_intercept=True, cov_type='HC0', *, enable_federation=False)

Methods

__init__([fit_intercept, cov_type, ...])

aggregate(models)

Aggregate multiple models into one.

coef__interval([alpha])

Gets a confidence interval bounding the fitted coefficients.

fit(X, y[, sample_weight, freq_weight, ...])

Fits the model.

get_params([deep])

Get parameters for this estimator.

intercept__interval([alpha])

Gets a confidence interval bounding the intercept(s) (or 0 if no intercept was fit).

predict(X)

Predicts the output given an array of instances.

predict_interval(X[, alpha])

Gets a confidence interval bounding the prediction.

prediction_stderr(X)

Gets the standard error of the predictions.

set_params(**params)

Set the parameters of this estimator.

Attributes

coef_

Get the model's coefficients on the covariates.

coef_stderr_

Gets the standard error of the fitted coefficients.

intercept_

Get the intercept(s) (or 0 if no intercept was fit).

intercept_stderr_

Gets the standard error of the intercept(s) (or 0 if no intercept was fit).

static aggregate(models: List[econml.sklearn_extensions.linear_model.StatsModelsLinearRegression])

Aggregate multiple models into one.

Parameters

models (list of StatsModelsLinearRegression) – The models to aggregate

Returns

agg_model – The aggregated model

Return type

StatsModelsLinearRegression
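
The idea behind aggregating linear models fitted on separate partitions can be sketched with sufficient statistics: summing each partition's X'X and X'y recovers the coefficients of a single fit on the pooled data. The code below shows only that coefficient step, under the assumption of an unweighted fit. The helpers sufficient_stats and aggregate_ols are hypothetical and are not the EconML implementation, which also has to combine the quantities needed for the covariance estimate.

```python
import numpy as np


def sufficient_stats(X, y):
    """Per-partition statistics needed for the pooled OLS coefficients."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return X.T @ X, X.T @ y


def aggregate_ols(stats):
    """Combine per-partition (X'X, X'y) pairs and solve for the coefficients."""
    XtX = sum(s[0] for s in stats)
    Xty = sum(s[1] for s in stats)
    return np.linalg.solve(XtX, Xty)


rng = np.random.default_rng(0)
X_all = np.column_stack([np.ones(40), rng.normal(size=(40, 2))])
y_all = X_all @ np.array([0.5, 1.0, -2.0]) + rng.normal(scale=0.1, size=40)

# Two partitions, as if the data lived on two separate machines.
parts = [(X_all[:15], y_all[:15]), (X_all[15:], y_all[15:])]
beta_agg = aggregate_ols([sufficient_stats(Xp, yp) for Xp, yp in parts])

# Reference: a single fit on all rows.
beta_full = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
```

Only the small d-by-d and d-length summaries travel between partitions, never the raw rows.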

coef__interval(alpha=0.05)

Gets a confidence interval bounding the fitted coefficients.

Parameters

alpha (float, default 0.05) – The significance level. The interval runs from the alpha/2-quantile to the (1-alpha/2)-quantile of the parameter distribution, so it has confidence level 1-alpha.

Returns

coef__interval – The lower and upper bounds of the confidence interval of the coefficients

Return type

{tuple ((p, d) array, (p,d) array), tuple ((d,) array, (d,) array)}
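
To make the role of alpha concrete, the sketch below builds a two-sided interval from an estimate and its standard error using only the Python standard library. It assumes a normal approximation for the parameter distribution; the function normal_interval is a hypothetical helper, not part of EconML.

```python
from statistics import NormalDist


def normal_interval(estimate, stderr, alpha=0.05):
    """Two-sided (1 - alpha) interval under a normal approximation."""
    dist = NormalDist(mu=estimate, sigma=stderr)
    lower = dist.inv_cdf(alpha / 2)        # alpha/2-quantile
    upper = dist.inv_cdf(1 - alpha / 2)    # (1 - alpha/2)-quantile
    return lower, upper


lower, upper = normal_interval(1.0, 0.5, alpha=0.05)
```

With alpha=0.05 the bounds sit about 1.96 standard errors on either side of the estimate; a smaller alpha widens the interval.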

fit(X, y, sample_weight=None, freq_weight=None, sample_var=None)

Fits the model.

Parameters
  • X ((N, d) nd array_like) – covariates

  • y ({(N,), (N, p)} nd array_like) – output variable(s)

  • sample_weight ((N,) array_like or None) – Individual weights for each sample. If None, it assumes equal weight.

  • freq_weight ((N,) array_like of int or None) – Frequency weight of each observation. Observation i is treated as the mean outcome of freq_weight[i] independent observations. This should be provided whenever sample_var is not None.

  • sample_var ({(N,), (N, p)} nd array_like or None) – Variance of the outcome(s) of the original freq_weight[i] observations that were used to compute the mean outcome represented by observation i.

Returns

self

Return type

StatsModelsLinearRegression
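
The freq_weight and sample_var arguments describe summarized data: each row stands for a group of raw observations that share the same covariates. The sketch below shows one way such a summary could be built from raw data with NumPy. The helper summarize is hypothetical, and it assumes the variance is the within-group population variance (divisor n rather than n - 1); check that convention against your EconML version before relying on it.

```python
import numpy as np


def summarize(X, y):
    """Collapse rows with identical covariates into per-group summaries.

    Returns the unique covariate rows, the mean outcome per group,
    the group sizes, and the within-group variance (assumed ddof=0).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    X_unique, inverse = np.unique(X, axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # keep the index 1-D across NumPy versions
    n_groups = X_unique.shape[0]

    freq_weight = np.bincount(inverse, minlength=n_groups)
    y_mean = np.bincount(inverse, weights=y, minlength=n_groups) / freq_weight
    y_sq_mean = np.bincount(inverse, weights=y ** 2, minlength=n_groups) / freq_weight
    sample_var = y_sq_mean - y_mean ** 2

    return X_unique, y_mean, freq_weight, sample_var


X_raw = [[0.0], [0.0], [1.0], [1.0], [1.0]]
y_raw = [1.0, 3.0, 2.0, 4.0, 6.0]
X_s, y_s, n_s, var_s = summarize(X_raw, y_raw)
```

Following the signature above, the summaries would then be passed as fit(X_s, y_s, freq_weight=n_s, sample_var=var_s).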

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

intercept__interval(alpha=0.05)

Gets a confidence interval bounding the intercept(s) (or 0 if no intercept was fit).

Parameters

alpha (float, default 0.05) – The significance level. The interval runs from the alpha/2-quantile to the (1-alpha/2)-quantile of the parameter distribution, so it has confidence level 1-alpha.

Returns

intercept__interval – The lower and upper bounds of the confidence interval of the intercept(s)

Return type

{tuple ((p,) array, (p,) array), tuple (float, float)}

predict(X)

Predicts the output given an array of instances.

Parameters

X ((n, d) array_like) – The covariates on which to predict

Returns

predictions – The predicted mean outcomes

Return type

{(n,) array, (n,p) array}

predict_interval(X, alpha=0.05)

Gets a confidence interval bounding the prediction.

Parameters
  • X ((n, d) array_like) – The covariates on which to predict

  • alpha (float, default 0.05) – The significance level. The interval runs from the alpha/2-quantile to the (1-alpha/2)-quantile of the parameter distribution, so it has confidence level 1-alpha.

Returns

prediction_intervals – The lower and upper bounds of the confidence intervals of the predicted mean outcomes

Return type

{tuple ((n,) array, (n,) array), tuple ((n,p) array, (n,p) array)}

prediction_stderr(X)

Gets the standard error of the predictions.

Parameters

X ((n, d) array_like) – The covariates at which to predict

Returns

prediction_stderr – The standard error of each coordinate of the output at each point we predict

Return type

(n, p) array_like
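
For a linear model, the standard error of a predicted mean at a point x follows from the parameter covariance V as sqrt(x' V x). The sketch below applies that standard formula to a single-output model with NumPy. The function prediction_stderr_sketch and the example covariance are hypothetical, and the rows of X are assumed to include a leading 1 for the intercept.

```python
import numpy as np


def prediction_stderr_sketch(X, param_cov):
    """Standard error of x @ beta for each row x, given Cov(beta)."""
    X = np.asarray(X, dtype=float)
    param_cov = np.asarray(param_cov, dtype=float)
    # Row-wise quadratic form x' V x.
    var = np.einsum("ij,jk,ik->i", X, param_cov, X)
    # Guard against tiny negative values caused by rounding.
    return np.sqrt(np.clip(var, 0.0, None))


cov = np.array([[0.04, 0.0],
                [0.0, 0.01]])          # hypothetical Cov(intercept, slope)
X_new = np.array([[1.0, 0.0],
                  [1.0, 2.0]])         # leading 1 is the intercept column
se = prediction_stderr_sketch(X_new, cov)
```

The uncertainty grows as the point moves away from where the slope contributes nothing: the second row has a larger standard error than the first because the slope variance is scaled by the squared covariate value.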

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance

property coef_

Get the model’s coefficients on the covariates.

Returns

coef_ – The coefficients of the variables in the linear regression. If the label y was p-dimensional, the result is a (p, d) matrix of coefficients whose j-th row contains the coefficients corresponding to the j-th coordinate of the label.

Return type

{(d,), (p, d)} nd array_like

property coef_stderr_

Gets the standard error of the fitted coefficients.

Returns

coef_stderr_ – The standard error of the coefficients

Return type

{(d,), (p, d)} nd array_like

property intercept_

Get the intercept(s) (or 0 if no intercept was fit).

Returns

intercept_ – The intercept of the linear regression. If the label y was p-dimensional, the result is a vector whose j-th entry contains the intercept corresponding to the j-th coordinate of the label.

Return type

float or (p,) nd array_like

property intercept_stderr_

Gets the standard error of the intercept(s) (or 0 if no intercept was fit).

Returns

intercept_stderr_ – The standard error of the intercept(s)

Return type

float or (p,) nd array_like