econml.iv.nnet.DeepIV

class econml.iv.nnet.DeepIV(*, n_components, m, h, n_samples, use_upper_bound_loss=False, n_gradient_samples=0, optimizer='adam', first_stage_options={'epochs': 100}, second_stage_options={'epochs': 100})[source]

Bases: econml._cate_estimator.BaseCateEstimator

The Deep IV Estimator (see http://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf).

Parameters

n_components (int) – Number of components in the mixture density network
m ((tensor, tensor) -> Layer) – Method for building a Keras model that featurizes the z and x inputs
h ((tensor, tensor) -> Layer) – Method for building a model of y given t and x
n_samples (int) – The number of samples to use
use_upper_bound_loss (bool, optional) – Whether to use an upper bound to the true loss (equivalent to adding a regularization penalty on the variance of h). Defaults to False.
n_gradient_samples (int, optional) – The number of separate additional samples to use when calculating the gradient. This can only be nonzero if use_upper_bound_loss is False, in which case the gradient of the returned loss will be an unbiased estimate of the gradient of the true loss. Defaults to 0.
optimizer (str, optional) – The optimizer to use. Defaults to “adam”
first_stage_options (dictionary, optional) – The keyword arguments to pass to Keras’s fit method when training the first stage model. Defaults to {“epochs”: 100}.
second_stage_options (dictionary, optional) – The keyword arguments to pass to Keras’s fit method when training the second stage model. Defaults to {“epochs”: 100}.
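
The link between the upper-bound loss and a variance penalty on h, mentioned under use_upper_bound_loss, can be checked numerically. The sketch below is not econml code: `h_samples` is a hypothetical list of values of h(t, x) at treatments sampled from the first-stage model, and the "true" per-observation loss is assumed to be the squared error against the mean of those values.

```python
from statistics import fmean, pvariance

# Hypothetical values of h(t_i, x) at 5 treatments sampled from the
# first-stage model for a single observation (illustrative numbers only).
h_samples = [1.0, 2.5, 0.5, 3.0, 2.0]
y = 2.2  # observed outcome for that observation

# Loss on the averaged prediction: (y - mean_i h_i)^2
true_loss = (y - fmean(h_samples)) ** 2

# Upper-bound loss: mean_i (y - h_i)^2, never smaller than true_loss
# (Jensen's inequality applied to the square).
upper_bound_loss = fmean((y - h) ** 2 for h in h_samples)

# The gap between the two is the population variance of the sampled h values.
variance_penalty = pvariance(h_samples)

print(true_loss, upper_bound_loss, variance_penalty)
```

Under these assumptions, minimizing the upper bound is the same as minimizing the averaged-prediction loss plus the variance of h across the sampled treatments.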

Methods

`__init__`(*, n_components, m, h, n_samples[, ...])
`ate`([X])	Calculate the average treatment effect \(E_X[\tau(X, T0, T1)]\).
`ate_inference`([X])	Inference results for the quantity \(E_X[\tau(X, T0, T1)]\) produced by the model.
`ate_interval`([X, alpha])	Confidence intervals for the quantity \(E_X[\tau(X, T0, T1)]\) produced by the model.
`cate_feature_names`([feature_names])	Public interface for getting feature names.
`cate_output_names`([output_names])	Public interface for getting output names.
`cate_treatment_names`([treatment_names])	Public interface for getting treatment names.
`effect`([X, T0, T1])	Calculate the heterogeneous treatment effect τ(·,·,·).
`effect_inference`([X, T0, T1])	Inference results for the quantities \(\tau(X, T0, T1)\) produced by the model.
`effect_interval`([X, T0, T1, alpha])	Confidence intervals for the quantities \(\tau(X, T0, T1)\) produced by the model.
`fit`(Y, T, *, X, Z[, inference])	Estimate the counterfactual model from data.
`marginal_ate`(T[, X])	Calculate the average marginal effect \(E_{T, X}[\partial\tau(T, X)]\).
`marginal_ate_inference`(T[, X])	Inference results for the quantities \(E_{T,X}[\partial \tau(T, X)]\) produced by the model.
`marginal_ate_interval`(T[, X, alpha])	Confidence intervals for the quantities \(E_{T,X}[\partial \tau(T, X)]\) produced by the model.
`marginal_effect`(T[, X])	Calculate the marginal effect ∂τ(·, ·) around a base treatment point conditional on features.
`marginal_effect_inference`(T[, X])	Inference results for the quantities \(\partial \tau(T, X)\) produced by the model.
`marginal_effect_interval`(T[, X, alpha])	Confidence intervals for the quantities \(\partial \tau(T, X)\) produced by the model.
`predict`(T, X)	Predict outcomes given treatment assignments and features.

Attributes

dowhy

Get an instance of DoWhyWrapper to allow other functionalities from the dowhy package.

ate(X=None, *, T0, T1)

Calculate the average treatment effect \(E_X[\tau(X, T0, T1)]\).

The effect is calculated between the two treatment points and is averaged over the population of X variables.

Parameters

T0 ((m, d_t) matrix or vector of length m) – Base treatments for each sample
T1 ((m, d_t) matrix or vector of length m) – Target treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample

Returns

τ – Average treatment effects on each outcome. Note that when Y is a vector rather than a 2-dimensional array, the result will be a scalar.

Return type

float or (d_y,) array
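
As a rough illustration of the averaging described above, the sketch below computes a per-sample effect between two treatment points and then takes the mean over the samples. The `outcome` function is a hypothetical stand-in for a fitted outcome model, not part of econml.

```python
from statistics import fmean

def outcome(t, x):
    # Hypothetical fitted outcome model: the treatment effect grows with x.
    return (1.0 + 0.5 * x) * t + x

def ate(X, T0, T1):
    # Effect between the two treatment points for each sample,
    # then averaged over the population of X.
    effects = [outcome(t1, x) - outcome(t0, x) for x, t0, t1 in zip(X, T0, T1)]
    return fmean(effects)

X = [0.0, 1.0, 2.0, 3.0]
T0 = [0.0] * len(X)   # base treatment for every sample
T1 = [1.0] * len(X)   # target treatment for every sample
average_effect = ate(X, T0, T1)
print(average_effect)
```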

ate_inference(X=None, *, T0, T1)

Inference results for the quantity \(E_X[\tau(X, T0, T1)]\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

X ((m, d_x) matrix, optional) – Features for each sample
T0 ((m, d_t) matrix or vector of length m, default 0) – Base treatments for each sample
T1 ((m, d_t) matrix or vector of length m, default 1) – Target treatments for each sample

Returns

PopulationSummaryResults – The inference results instance contains prediction and prediction standard error and can on demand calculate confidence interval, z statistic and p value. It can also output a dataframe summary of these inference results.

Return type

object
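
To make the quantities named above concrete, here is a minimal sketch of how a z statistic, p value and confidence interval can be derived from a point estimate and its standard error under a normal approximation. The function name `normal_summary` is hypothetical, and the results object returned by the library may compute these differently (for example, from bootstrap replicates).

```python
from statistics import NormalDist

def normal_summary(point, stderr, alpha=0.05):
    """Normal-approximation summary of a point estimate (illustrative only)."""
    std_normal = NormalDist()
    z_stat = point / stderr
    # Two-sided p value for the null hypothesis that the true value is zero.
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
    # Half-width of the (1 - alpha) interval.
    half_width = std_normal.inv_cdf(1 - alpha / 2) * stderr
    return {
        "point": point,
        "stderr": stderr,
        "zstat": z_stat,
        "pvalue": p_value,
        "ci": (point - half_width, point + half_width),
    }

summary = normal_summary(point=1.0, stderr=0.5)
print(summary)
```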

ate_interval(X=None, *, T0, T1, alpha=0.05)

Confidence intervals for the quantity \(E_X[\tau(X, T0, T1)]\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

X ((m, d_x) matrix, optional) – Features for each sample
T0 ((m, d_t) matrix or vector of length m, default 0) – Base treatments for each sample
T1 ((m, d_t) matrix or vector of length m, default 1) – Target treatments for each sample
alpha (float in [0, 1], default 0.05) – The significance level of the reported interval. The interval from the alpha/2 level to the 1-alpha/2 level is reported, for a nominal coverage of 1-alpha.

Returns

lower, upper – The lower and the upper bounds of the confidence interval for each quantity.

Return type

tuple(type of ate(X, T0, T1), type of ate(X, T0, T1))
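
The role of alpha can be sketched with a simple percentile interval over replicated estimates. This is an illustration under assumptions, not the library's procedure: `replicates` is a made-up list standing in for ATE estimates from bootstrap refits, and the `percentile` helper is hypothetical.

```python
def percentile(sorted_vals, q):
    # Linear interpolation between order statistics, with q in [0, 1].
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def percentile_interval(replicates, alpha=0.05):
    # Lower bound at the alpha/2 level, upper bound at the 1 - alpha/2 level.
    vals = sorted(replicates)
    return percentile(vals, alpha / 2), percentile(vals, 1 - alpha / 2)

# Made-up replicated estimates: 0.00, 0.01, ..., 1.00
replicates = [i / 100 for i in range(101)]
lower, upper = percentile_interval(replicates, alpha=0.10)
print(lower, upper)
```

A smaller alpha widens the interval, since the bounds move further into the tails of the replicated estimates.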

cate_feature_names(feature_names=None)

Public interface for getting feature names.

To be overridden by estimators that apply transformations to the input features.

Parameters: feature_names (list of str of length X.shape[1] or None) – The names of the input features. If None and X is a dataframe, it defaults to the column names from the dataframe.
Returns: out_feature_names – Returns feature names.
Return type: list of str or None

cate_output_names(output_names=None)

Public interface for getting output names.

To be overridden by estimators that apply transformations to the outputs.

Parameters: output_names (list of str of length Y.shape[1] or None) – The names of the outcomes. If None and the Y passed to fit was a dataframe, it defaults to the column names from the dataframe.
Returns: output_names – Returns output names.
Return type: list of str

cate_treatment_names(treatment_names=None)

Public interface for getting treatment names.

To be overridden by estimators that apply transformations to the treatments.

Parameters: treatment_names (list of str of length T.shape[1] or None) – The names of the treatments. If None and the T passed to fit was a dataframe, it defaults to the column names from the dataframe.
Returns: treatment_names – Returns treatment names.
Return type: list of str

effect(X=None, T0=0, T1=1)[source]

Calculate the heterogeneous treatment effect τ(·,·,·).

The effect is calculated between the two treatment points conditional on a vector of features on a set of m test samples {T0ᵢ, T1ᵢ, Xᵢ}.

Parameters

T0 ((m × dₜ) matrix) – Base treatments for each sample
T1 ((m × dₜ) matrix) – Target treatments for each sample
X ((m × dₓ) matrix, optional) – Features for each sample

Returns

τ – Heterogeneous treatment effects on each outcome for each sample. Note that when Y is a vector rather than a 2-dimensional array, the corresponding singleton dimension will be collapsed (so this method will return a vector).

Return type

(m × d_y) matrix
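
The signature defaults T0=0 and T1=1 are scalars, while the parameters are documented per sample. The sketch below shows one plausible reading, in which a scalar treatment is repeated for every row of X before the per-sample difference is taken. Both `outcome` and `broadcast` are hypothetical helpers, and the scalar-repetition behavior is an assumption rather than something this page states.

```python
def outcome(t, x):
    # Hypothetical fitted outcome model.
    return 2.0 * t * x + 1.0

def broadcast(value, m):
    # Repeat a scalar treatment for each of the m samples; pass sequences through.
    return list(value) if isinstance(value, (list, tuple)) else [value] * m

def effect(X, T0=0, T1=1):
    m = len(X)
    T0, T1 = broadcast(T0, m), broadcast(T1, m)
    # One effect per sample: outcome at the target minus outcome at the base.
    return [outcome(t1, x) - outcome(t0, x) for x, t0, t1 in zip(X, T0, T1)]

X = [0.5, 1.0, 1.5]
default_effects = effect(X)                              # scalar T0=0, T1=1
custom_effects = effect(X, T0=[1, 1, 1], T1=[3, 3, 3])   # per-sample treatments
print(default_effects, custom_effects)
```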

effect_inference(X=None, *, T0=0, T1=1)

Inference results for the quantities \(\tau(X, T0, T1)\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

X ((m, d_x) matrix, optional) – Features for each sample
T0 ((m, d_t) matrix or vector of length m, default 0) – Base treatments for each sample
T1 ((m, d_t) matrix or vector of length m, default 1) – Target treatments for each sample

Returns

InferenceResults – The inference results instance contains prediction and prediction standard error and can on demand calculate confidence interval, z statistic and p value. It can also output a dataframe summary of these inference results.

Return type

object

effect_interval(X=None, *, T0=0, T1=1, alpha=0.05)

Confidence intervals for the quantities \(\tau(X, T0, T1)\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

X ((m, d_x) matrix, optional) – Features for each sample
T0 ((m, d_t) matrix or vector of length m, default 0) – Base treatments for each sample
T1 ((m, d_t) matrix or vector of length m, default 1) – Target treatments for each sample
alpha (float in [0, 1], default 0.05) – The significance level of the reported interval. The interval from the alpha/2 level to the 1-alpha/2 level is reported, for a nominal coverage of 1-alpha.

Returns

lower, upper – The lower and the upper bounds of the confidence interval for each quantity.

Return type

tuple(type of effect(X, T0, T1), type of effect(X, T0, T1))

fit(Y, T, *, X, Z, inference=None)[source]

Estimate the counterfactual model from data.

That is, estimate functions τ(·, ·, ·), ∂τ(·, ·).

Parameters

Y ((n × d_y) matrix or vector of length n) – Outcomes for each sample
T ((n × d_t) matrix or vector of length n) – Treatments for each sample
X ((n × d_x) matrix) – Features for each sample
Z ((n × d_z) matrix) – Instruments for each sample
inference (str, Inference instance, or None) – Method for performing inference. This estimator supports ‘bootstrap’ (or an instance of BootstrapInference).

Return type

self
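
To illustrate why fit takes instruments Z in addition to Y and T, the sketch below uses a deliberately simple linear analogue, not the Deep IV procedure itself and not econml code. It simulates data with an unobserved confounder and compares a naive regression slope with a basic instrumental-variable ratio. The data-generating process, the omission of features X, and the `cov` helper are all assumptions made for the example.

```python
import random

rng = random.Random(0)
n = 20000
true_effect = 2.0

Z, T, Y = [], [], []
for _ in range(n):
    z = rng.gauss(0, 1)                  # instrument: influences y only through t
    u = rng.gauss(0, 1)                  # unobserved confounder of t and y
    t = z + u + rng.gauss(0, 1)          # treatment depends on instrument and confounder
    y = true_effect * t + 3.0 * u + rng.gauss(0, 1)
    Z.append(z)
    T.append(t)
    Y.append(y)

def cov(a, b):
    # Population covariance of two equal-length sequences.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

# Regressing y on t directly absorbs the confounder and overstates the effect.
naive_slope = cov(T, Y) / cov(T, T)

# Ratio of covariances with the instrument recovers the effect in this linear setup.
iv_slope = cov(Z, Y) / cov(Z, T)

print(naive_slope, iv_slope)
```

In this simulation the naive slope lands near 3 while the instrument-based estimate lands near the true value of 2.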

marginal_ate(T, X=None)

Calculate the average marginal effect \(E_{T, X}[\partial\tau(T, X)]\).

The marginal effect is calculated around a base treatment point and averaged over the population of X.

Parameters

T ((m, d_t) matrix) – Base treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample

Returns

grad_tau – Average marginal effects on each outcome. Note that when Y or T is a vector rather than a 2-dimensional array, the corresponding singleton dimensions in the output will be collapsed (e.g. if both are vectors, then the output of this method will be a scalar).

Return type

(d_y, d_t) array

marginal_ate_inference(T, X=None)

Inference results for the quantities \(E_{T,X}[\partial \tau(T, X)]\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

T ((m, d_t) matrix) – Base treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample

Returns

PopulationSummaryResults – The inference results instance contains prediction and prediction standard error and can on demand calculate confidence interval, z statistic and p value. It can also output a dataframe summary of these inference results.

Return type

object

marginal_ate_interval(T, X=None, *, alpha=0.05)

Confidence intervals for the quantities \(E_{T,X}[\partial \tau(T, X)]\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

T ((m, d_t) matrix) – Base treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample
alpha (float in [0, 1], default 0.05) – The significance level of the reported interval. The interval from the alpha/2 level to the 1-alpha/2 level is reported, for a nominal coverage of 1-alpha.

Returns

lower, upper – The lower and the upper bounds of the confidence interval for each quantity.

Return type

tuple(type of marginal_ate(T, X), type of marginal_ate(T, X))

marginal_effect(T, X=None)[source]

Calculate the marginal effect ∂τ(·, ·) around a base treatment point conditional on features.

Parameters

T ((m × dₜ) matrix) – Base treatments for each sample
X ((m × dₓ) matrix, optional) – Features for each sample

Returns

grad_tau – Heterogeneous marginal effects on each outcome for each sample. Note that when Y or T is a vector rather than a 2-dimensional array, the corresponding singleton dimensions in the output will be collapsed (e.g. if both are vectors, then the output of this method will also be a vector).

Return type

(m × d_y × d_t) array
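
A marginal effect around a base treatment point can be pictured as the derivative of the outcome with respect to the treatment, evaluated at that point for each sample. The sketch below approximates it with a central finite difference on a hypothetical `outcome` function. This only illustrates the quantity and makes no claim about how the library computes the gradient.

```python
def outcome(t, x):
    # Hypothetical fitted outcome model; its derivative in t is 2 * x * t + 1.
    return x * t ** 2 + t

def marginal_effect(T, X, eps=1e-5):
    # Central finite difference in the treatment, one value per sample.
    return [
        (outcome(t + eps, x) - outcome(t - eps, x)) / (2 * eps)
        for t, x in zip(T, X)
    ]

T = [0.0, 1.0, 2.0]   # base treatment points
X = [1.0, 1.0, 0.5]   # features
grads = marginal_effect(T, X)

# Averaging the per-sample values gives the corresponding average marginal effect.
average_marginal_effect = sum(grads) / len(grads)
print(grads, average_marginal_effect)
```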

marginal_effect_inference(T, X=None)

Inference results for the quantities \(\partial \tau(T, X)\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

T ((m, d_t) matrix) – Base treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample

Returns

InferenceResults – The inference results instance contains prediction and prediction standard error and can on demand calculate confidence interval, z statistic and p value. It can also output a dataframe summary of these inference results.

Return type

object

marginal_effect_interval(T, X=None, *, alpha=0.05)

Confidence intervals for the quantities \(\partial \tau(T, X)\) produced by the model. Available only if inference was not None when the fit method was called.

Parameters

T ((m, d_t) matrix) – Base treatments for each sample
X ((m, d_x) matrix, optional) – Features for each sample
alpha (float in [0, 1], default 0.05) – The significance level of the reported interval. The interval from the alpha/2 level to the 1-alpha/2 level is reported, for a nominal coverage of 1-alpha.

Returns

lower, upper – The lower and the upper bounds of the confidence interval for each quantity.

Return type

tuple(type of marginal_effect(T, X), type of marginal_effect(T, X))

predict(T, X)[source]

Predict outcomes given treatment assignments and features.

Parameters

T ((m × dₜ) matrix) – Base treatments for each sample
X ((m × dₓ) matrix) – Features for each sample

Returns

Y – Outcomes for each sample. Note that when Y is a vector rather than a 2-dimensional array, the corresponding singleton dimension will be collapsed (so this method will return a vector).

Return type

(m × d_y) matrix

property dowhy

Get an instance of DoWhyWrapper to allow other functionalities from the dowhy package (e.g. causal graph, refutation test).

Returns: DoWhyWrapper – An instance of DoWhyWrapper
Return type: instance