API Reference
— Method Configuration(;estimands, scm=nothing, adjustment=nothing) = Configuration(estimands, scm, adjustment)
A Configuration is a set of estimands to be estimated. If the set contains causal (identifiable) estimands, these will be identified using the provided scm and adjustment.
— Method OSE(;models=default_models(), resampling=nothing, ps_lowerbound=1e-8, machine_cache=false)
Defines a One Step Estimator using the specified models for estimation of the nuisance parameters. The estimator is a function that can be applied to estimate estimands for a dataset.
- models: A Dict(variable => model, ...) where the variables are the outcome variables modeled by the models.
- resampling: Outer resampling strategy. Setting it to nothing (default) falls back to vanilla estimation, while any valid MLJ.ResamplingStrategy will result in CV-OSE.
- ps_lowerbound: Lower bound for the propensity score to avoid division by 0. The special value nothing results in a data-adaptive definition as described here.
- machine_cache: Whether each MLJ.machine created during estimation should cache data.
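using TMLE
using MLJLinearModels
# One model per modeled variable: the outcome :Y and the treatment :T
models = Dict(:Y => LinearRegressor(), :T => LogisticClassifier())
ose = OSE(models=models)
# Ψ is an estimand and dataset a table, both defined beforehand
Ψ̂ₙ, cache = ose(Ψ, dataset)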
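To picture what ps_lowerbound guards against, the sketch below truncates a vector of propensity scores before inverting them. It is plain Julia and is not meant to mirror the estimator's internals; the helper name truncate_ps and the sample values are hypothetical.

```julia
# Hypothetical helper (not part of the package): bound propensity scores
# away from zero so that their inverses stay finite.
truncate_ps(ps::AbstractVector{<:Real}; lowerbound=1e-8) = max.(ps, lowerbound)

# Made-up propensity scores, including an exact zero
ps = [0.0, 1e-12, 0.3, 0.9]

# Inverse-probability weights computed on the truncated scores
weights = 1 ./ truncate_ps(ps)
```

Scores already above the bound are left unchanged; only the first two entries are raised to the bound.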
— Type An SCM is simply a wrapper around a MetaGraph over a Directed Acyclic Graph.
— Method TMLEE(;models=default_models(), resampling=nothing, ps_lowerbound=1e-8, weighted=false, tol=nothing, max_iter=1, machine_cache=false)
Defines a TMLE estimator using the specified models for estimation of the nuisance parameters. The estimator is a function that can be applied to estimate estimands for a dataset.
- models (default: default_models()): A Dict(variable => model, ...) where the variables are the outcome variables modeled by the models.
- resampling (default: nothing): Outer resampling strategy. Setting it to nothing (default) falls back to vanilla TMLE, while any valid MLJ.ResamplingStrategy will result in CV-TMLE.
- ps_lowerbound (default: 1e-8): Lower bound for the propensity score to avoid division by 0. The special value nothing results in a data-adaptive definition as described here.
- weighted (default: false): Whether the fluctuation model is a classic GLM or a weighted version. The weighted fluctuation has been shown to be more robust to positivity violations in practice.
- tol (default: nothing): Convergence threshold for the TMLE algorithm iterations. If nothing (default), 1/(sample size) will be used. See also max_iter.
- max_iter (default: 1): Maximum number of iterations for the TMLE algorithm.
- machine_cache (default: false): Whether each MLJ.machine created during estimation should cache data.
using MLJLinearModels
tmle = TMLEE()
Ψ̂ₙ, cache = tmle(Ψ, dataset)
— Method TreatmentTransformer(;encoder=encoder())
Treatments in TMLE are represented by CategoricalArrays. If a treatment column has type OrderedFactor, its integer representation is used; make sure that the levels correspond to your expectations. All other columns are one-hot encoded.
— Method Distributions.estimate(r::JointEstimate)
Retrieves the final estimate, i.e. the estimate after the TMLE step.
— Method Distributions.estimate(r::EICEstimate)
Retrieves the final estimate, i.e. the estimate after the TMLE step.
— Method A plate Structural Causal Model where:
- For all outcomes: oᵢ = fᵢ(treatments, confounders, outcome_extra_covariates)
- For all treatments: tⱼ = fⱼ(confounders)
StaticSCM([:Y], [:T₁, :T₂], [:W₁, :W₂, :W₃]; outcome_extra_covariates=[:C])
— Method brute_force_ordering(estimands; η_counts = nuisance_function_counts(estimands))
Finds an optimal ordering of the estimands to minimize the maximum cache size. The approach is brute force: all permutations are generated and evaluated, and if a minimum is found early it is immediately returned. The theoretical complexity is O(N!). However, due to the early stopping and the shuffling, the actual cost is expected to be much smaller than that.
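The brute-force search with early stopping can be illustrated with a toy model. The sketch below is not the package's implementation: the cost function max_cache_size, the permutation helper and the function brute_force_sketch are all hypothetical, and each estimand is simply represented as the set of nuisance functions it needs.

```julia
using Random

# Toy cost (an assumption): at step i the cache holds the nuisance functions
# of the current estimand plus those fitted earlier that a later estimand needs.
function max_cache_size(ordering)
    maxsize = 0
    for i in eachindex(ordering)
        seen = reduce(union, ordering[1:i])
        future = reduce(union, ordering[i+1:end]; init=Set{Symbol}())
        live = union(ordering[i], intersect(seen, future))
        maxsize = max(maxsize, length(live))
    end
    return maxsize
end

# All permutations of a vector, generated recursively with Base only
function all_permutations(items)
    length(items) <= 1 && return [copy(items)]
    perms = Vector{typeof(items)}()
    for i in eachindex(items)
        rest = vcat(items[1:i-1], items[i+1:end])
        for p in all_permutations(rest)
            push!(perms, vcat([items[i]], p))
        end
    end
    return perms
end

function brute_force_sketch(estimands; rng=Random.default_rng())
    # No ordering can beat the largest single estimand
    lowerbound = maximum(length, estimands)
    best, bestcost = estimands, max_cache_size(estimands)
    for perm in shuffle(rng, all_permutations(estimands))
        cost = max_cache_size(perm)
        if cost < bestcost
            best, bestcost = perm, cost
        end
        bestcost == lowerbound && break  # stop as soon as the bound is reached
    end
    return best, bestcost
end

# Made-up estimands: the first and third share the propensity score :G_T1
estimands = [Set([:Q_Y, :G_T1]), Set([:Q_Z, :G_T2]), Set([:Q_Y2, :G_T1])]
best, cost = brute_force_sketch(estimands)
```

In this toy setting, placing the two estimands that share :G_T1 next to each other avoids keeping that function cached while an unrelated estimand is processed. Shuffling the permutations makes it likely that a bound-attaining ordering is met well before all N! candidates are visited.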
— Method compose(f, estimation_results::Vararg{EICEstimate, N}) where N
Provides an estimator of f(estimation_results...).
Mathematical details
The following is a summary from Asymptotic Statistics, A. W. van der Vaart.
Consider k TMLEs computed from a dataset of size n and embodied by Tₙ = (T₁,ₙ, ..., Tₖ,ₙ). Since each of them is asymptotically normal, the multivariate CLT provides the joint distribution:
√n(Tₙ - Ψ₀) ↝ N(0, Σ),
where Σ is the covariance matrix of the TMLEs' influence curves.
Let f:ℜᵏ→ℜᵐ be a map differentiable at Ψ₀. Then, the delta method provides the limiting distribution of √n(f(Tₙ) - f(Ψ₀)). Because Tₙ is asymptotically normal, the result is:
√n(f(Tₙ) - f(Ψ₀)) ↝ N(0, ∇f(Ψ₀) ⋅ Σ ⋅ (∇f(Ψ₀))ᵀ),
where ∇f(Ψ₀):ℜᵏ→ℜᵐ is a linear map such that, by abuse of notation and identifying the function with its multiplication matrix, ∇f(Ψ₀):h ↦ ∇f(Ψ₀) ⋅ h. The matrix ∇f(Ψ₀) is the Jacobian of f at Ψ₀.
Hence, all we need to do is:
- Compute the covariance matrix Σ.
- Compute the Jacobian ∇f, which can be done using Julia's automatic differentiation facilities.
- The final estimator is then asymptotically normal, centered at f₀ = f(Ψ₀), with asymptotic covariance σ₀ = ∇f(Ψ₀) ⋅ Σ ⋅ (∇f(Ψ₀))ᵀ.
- f: An array-input differentiable map.
- estimation_results: One or more EICEstimate results.
Assuming res₁ and res₂ are TMLEs:
f(x, y) = [x^2 - y, y - 3x]
compose(f, res₁, res₂)
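For the map f used in this example, the delta-method covariance can be worked out by hand. The sketch below does so with a manually derived Jacobian instead of automatic differentiation; it is not how compose is implemented, and the point estimates, covariance matrix and sample size are made-up numbers.

```julia
using LinearAlgebra

f(x, y) = [x^2 - y, y - 3x]

# Jacobian of f derived by hand: rows are outputs, columns are (x, y)
jacobian_f(x, y) = [2x -1.0; -3.0 1.0]

estimates = [1.0, 2.0]           # made-up point estimates (T₁,ₙ, T₂,ₙ)
Σ = [0.04 0.01; 0.01 0.09]       # made-up covariance of the influence curves
n = 100                          # made-up sample size

J = jacobian_f(estimates...)
Σf = J * Σ * J'                  # asymptotic covariance of √n(f(Tₙ) - f(Ψ₀))
composed = f(estimates...)       # plug-in estimate of f(Ψ₀)
stderrs = sqrt.(diag(Σf) ./ n)   # standard errors of the two components
```

The division by n converts the asymptotic covariance of the √n-scaled quantity into the approximate covariance of f(Tₙ) itself.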
— Method default_models(;Q_binary=LinearBinaryClassifier(), Q_continuous=LinearRegressor(), G=LinearBinaryClassifier())
Creates a dictionary containing default models to be used by downstream estimators. Each provided model is prepended (in an MLJ.Pipeline) with an MLJ.ContinuousEncoder.
By default:
- Q_binary is a LinearBinaryClassifier
- Q_continuous is a LinearRegressor
- G is a LinearBinaryClassifier
The following changes the default Q_binary to a LogisticClassifier and provides a RidgeRegressor for special_y:
using MLJLinearModels
models = default_models(
    Q_binary = LogisticClassifier(),  # replaces the default binary outcome model
    special_y = RidgeRegressor()      # model used only for the variable special_y
)
— Method epsilons(cache)
Retrieves the fluctuations' epsilons corresponding to each targeting step from the cache.
— Method estimates(cache)
Retrieves the estimates corresponding to each targeting step from the cache.
— Method factorialEstimand(
constructor::Union{typeof(CM), typeof(ATE), typeof(AIE)},
treatments, outcome;
Generates a factorial JointEstimand with components of type constructor.
For the ATE and the AIE, the generated components are restricted to the Cartesian Product of single treatment levels transitions. For example, consider two treatment variables T₁ and T₂ each taking three possible values (0, 1, 2). For each treatment variable, the single treatment levels transitions are defined by (0 → 1, 1 → 2). Then, the Cartesian Product of these transitions is taken, resulting in a 2 x 2 = 4 dimensional joint estimand:
- (T₁: 0 → 1, T₂: 0 → 1)
- (T₁: 0 → 1, T₂: 1 → 2)
- (T₁: 1 → 2, T₂: 0 → 1)
- (T₁: 1 → 2, T₂: 1 → 2)
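The enumeration above can be reproduced with Base Julia alone. The following sketch only illustrates the Cartesian product of consecutive level transitions; the helper transitions is hypothetical and the code does not build actual estimands.

```julia
# Consecutive level transitions for one treatment,
# e.g. (0, 1, 2) gives [(0, 1), (1, 2)]
transitions(levels) = [(levels[i], levels[i+1]) for i in 1:length(levels)-1]

treatment_levels = (T₁ = (0, 1, 2), T₂ = (0, 1, 2))

# One vector of transitions per treatment variable
per_treatment = map(transitions, values(treatment_levels))

# Cartesian product of the transitions, one NamedTuple per component
components = vec([NamedTuple{keys(treatment_levels)}(combo)
                  for combo in Iterators.product(per_treatment...)])
```

Each element of components pairs one transition of T₁ with one transition of T₂, giving the four combinations listed above (possibly in a different order).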
A JointEstimand with causal or statistical components.
- constructor: CM, ATE or AIE.
- treatments: An AbstractDictionary/NamedTuple of treatment levels (e.g. (T=(0, 1, 2),)) or a treatment iterator, in which case a dataset must be provided to infer the levels from it.
- outcome: The outcome variable.
- confounders=nothing: The generated components will inherit these confounding variables. If nothing, causal estimands are generated.
- outcome_extra_covariates=(): The generated components will inherit these outcome_extra_covariates.
- dataset: An optional dataset to enforce a positivity constraint and infer treatment levels.
- positivity_constraint=nothing: Only components that pass the positivity constraint are added to the JointEstimand. A dataset must then be provided.
- freq_table: This is only to be used by factorialEstimands to avoid unnecessary computations.
- verbosity=1: Verbosity level.
- An Average Treatment Effect with causal components:
factorialEstimand(ATE, (T₁ = (0, 1), T₂=(0, 1, 2)), :Y₁)
- An Average Interaction Effect with statistical components:
factorialEstimand(AIE, (T₁ = (0, 1, 2), T₂=(0, 1, 2)), :Y₁, confounders=[:W₁, :W₂])
- With a dataset, the treatment levels can be inferred and a positivity constraint enforced:
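factorialEstimand(ATE, [:T₁, :T₂], :Y₁,
    confounders=[:W₁, :W₂],
    dataset=dataset,              # assumed to be a table defined beforehand
    positivity_constraint=0.1)    # illustrative threshold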
— Method factorialEstimands(constructor::Union{typeof(ATE), typeof(AIE)}, dataset, treatments, outcomes; confounders=nothing, outcome_extra_covariates=(), positivity_constraint=nothing, verbosity=1)
Generates a JointEstimand for each outcome in outcomes. See factorialEstimand.
— Method gradients(cache)
Retrieves the gradients corresponding to each targeting step from the cache.
— Method groups_ordering(estimands)
This will order estimands based on: propensity score first, outcome mean second. This heuristic should work reasonably well in practice. It could be optimized further by:
- Organising the propensity score groups that share similar components to be close together.
- Brute forcing the ordering of these groups to find an optimal one.
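The "propensity score first, outcome mean second" heuristic amounts to a two-key sort. The sketch below shows the idea on made-up records; the field names propensity and outcome_mean and the string identifiers are hypothetical stand-ins for the estimands' nuisance functions, not the package's data structures.

```julia
# Made-up estimands, each tagged with identifiers of its nuisance functions
estimands = [
    (name="Ψ1", propensity="G_T2", outcome_mean="Q_Y1"),
    (name="Ψ2", propensity="G_T1", outcome_mean="Q_Y2"),
    (name="Ψ3", propensity="G_T2", outcome_mean="Q_Y2"),
    (name="Ψ4", propensity="G_T1", outcome_mean="Q_Y1"),
]

# Sort on the propensity score first, then on the outcome mean
ordered = sort(estimands; by = e -> (e.propensity, e.outcome_mean))
ordered_names = [e.name for e in ordered]
```

After sorting, estimands that share a propensity score sit next to each other, so a fitted propensity score model can be reused for consecutive estimands before it is discarded.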
— Function significance_test(estimate::EICEstimate, Ψ₀=0)
Performs a TTest.
— Function significance_test(estimate::JointEstimate, Ψ₀=zeros(size(estimate.estimate, 1)))
Performs a TTest if the estimate is one-dimensional and a HotellingT2Test otherwise.