Home
Overview
TMLE.jl is a Julia implementation of the Targeted Minimum Loss-Based Estimation (TMLE) framework. If you are interested in leveraging the power of modern machine-learning methods while preserving interpretability and statistical inference guarantees, you are in the right place. TMLE.jl is compatible with any MLJ-compliant algorithm and any dataset that respects the Tables interface.
Installation
TMLE.jl can be installed via the Package Manager and supports Julia v1.6 and greater.
Pkg> add TMLE
Quick Start
To run an estimation procedure, we need three ingredients:
1. A dataset: here a simulation dataset
For illustration, assume we know the actual data generating process is as follows:
\[\begin{aligned} W &\sim \mathcal{Uniform}(0, 1) \\ T &\sim \mathcal{Bernoulli}(logistic(1-2 \cdot W)) \\ Y &\sim \mathcal{Normal}(1 + 3 \cdot T - T \cdot W, 0.01) \end{aligned}\]
Because we know the data generating process, we can simulate some data accordingly:
using TMLE
using Distributions
using StableRNGs
using Random
using CategoricalArrays
using MLJLinearModels
using LogExpFunctions
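rng = StableRNG(123)
n = 100
# Confounder
W = rand(rng, Uniform(), n)
# Binary treatment: T is true with probability logistic(1 - 2W)
T = rand(rng, Uniform(), n) .< logistic.(1 .- 2W)
# Outcome with Gaussian noise of standard deviation 0.01
Y = 1 .+ 3T .- T.*W .+ rand(rng, Normal(0, 0.01), n)
# A NamedTuple of columns respects the Tables interface; the treatment is made categorical
dataset = (Y=Y, T=categorical(T), W=W)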
2. A quantity of interest: here the Average Treatment Effect (ATE)
The Average Treatment Effect of $T$ on $Y$, confounded by $W$, is specified as:
Ψ = ATE(
    outcome=:Y,
    treatment_values=(T=(case=true, control=false),),
    treatment_confounders=(T=[:W],)
)
TMLE.StatisticalATE
-----
Outcome: Y
Treatment: OrderedCollections.OrderedDict{Symbol, @NamedTuple{control::Bool, case::Bool}}(:T => (control = 0, case = 1))
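Because the data generating process is known here, the target value can be worked out by hand, which gives a reference point for any estimate. Under that process, $\mathbb{E}[Y \mid T, W] = 1 + 3T - TW$ and $W$ is uniform on $(0, 1)$, so adjusting for $W$ gives:
\[\begin{aligned} ATE &= \mathbb{E}_W\big[\mathbb{E}[Y \mid T = 1, W] - \mathbb{E}[Y \mid T = 0, W]\big] \\ &= \mathbb{E}_W\big[(4 - W) - 1\big] \\ &= 3 - \mathbb{E}[W] = 2.5 \end{aligned}\]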
3. An estimator: here a Targeted Maximum Likelihood Estimator (TMLE)
tmle = TMLEE()
result, _ = tmle(Ψ, dataset, verbosity=0);
result
One sample t-test
-----------------
Population details:
parameter of interest: Mean
value under h_0: 0
point estimate: 2.49314
95% confidence interval: (2.434, 2.552)
Test summary:
outcome with 95% confidence: reject h_0
two-sided p-value: <1e-93
Details:
number of observations: 100
t-statistic: 83.86604877048882
degrees of freedom: 99
empirical standard error: 0.029727653554063295
We are comforted to see that the 95% confidence interval covers the ground truth: under the data generating process above, the true ATE is $3 - \mathbb{E}[W] = 2.5$! 🥳
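To see why adjusting for $W$ matters in this example, the following sketch compares a naive difference in means with the adjusted ground truth on a large simulated sample. It is illustrative only: it relies on Julia's standard library rather than TMLE.jl, redefines `logistic` locally, and the names `naive` and `truth` as well as the sample size are choices made here, not part of the package.

```julia
using Random
using Statistics

# Local definition so that the sketch needs no external package
logistic(x) = 1 / (1 + exp(-x))

rng = MersenneTwister(123)
n = 1_000_000

# Same data generating process as in the Quick Start, with a larger sample
W = rand(rng, n)
T = rand(rng, n) .< logistic.(1 .- 2 .* W)
Y = 1 .+ 3 .* T .- T .* W .+ 0.01 .* randn(rng, n)

# Naive contrast: ignores that treated units tend to have smaller W
naive = mean(Y[T]) - mean(Y[.!T])

# Adjusted ground truth: average over W of E[Y | T=1, W] - E[Y | T=0, W] = 3 - W
truth = mean(3 .- W)

println("naive difference in means: ", round(naive, digits=3))
println("adjusted ground truth:     ", round(truth, digits=3))
```

Since treatment is more likely when $W$ is small, the naive contrast overshoots the true effect. This is the confounding bias that estimators adjusting for $W$ are designed to remove.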
Scope and Distinguishing Features
The goal of this package is to provide an entry point for semi-parametric, asymptotically unbiased and efficient estimation in Julia. The two main general estimators known to achieve these properties are the One-Step estimator and the Targeted Maximum-Likelihood estimator. Most of the current effort has centered on estimands that are compositions of the counterfactual mean.
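As a rough sketch of how the two estimators relate (generic notation from the semi-parametric literature, not taken from the package's source), write $\hat{P}$ for an initial estimate of the data distribution, $\Psi(\hat{P})$ for the corresponding plug-in estimate, and $\phi_{\hat{P}}$ for the efficient influence function of the estimand. Then, for observations $O_1, \dots, O_n$:
\[\begin{aligned} \hat{\Psi}_{\text{One-Step}} &= \Psi(\hat{P}) + \frac{1}{n} \sum_{i=1}^{n} \phi_{\hat{P}}(O_i) \\ \hat{\Psi}_{\text{TMLE}} &= \Psi(\hat{P}^{\star}), \quad \text{where } \hat{P}^{\star} \text{ is a fluctuation of } \hat{P} \text{ such that } \frac{1}{n} \sum_{i=1}^{n} \phi_{\hat{P}^{\star}}(O_i) \approx 0 \end{aligned}\]
Both remove the first-order bias of the plug-in estimate: the One-Step estimator adds a correction term, whereas TMLE updates the nuisance estimates and remains a plug-in estimator.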
Distinguishing Features:
- Estimands: Counterfactual Mean, Average Treatment Effect, Interactions, Any composition thereof
- Estimators: TMLE, One-Step, in both canonical and cross-validated versions.
- Machine-Learning: Any MLJ compatible model
- Dataset: Any dataset respecting the Tables interface (e.g. DataFrames.jl)
- Factorial Treatment Variables:
  - Multiple treatments
  - Categorical treatment values
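For orientation, the estimands listed above are usually written as follows in the causal inference literature. This is a sketch in generic notation, assuming the confounders $W$ suffice for identification; the symbols are illustrative and not taken from the package itself. Here $t$ and $t'$ denote the case and control levels of a treatment, and $T_1, T_2$ two treatments for the interaction:
\[\begin{aligned} CM_{t} &= \mathbb{E}_W\big[\mathbb{E}[Y \mid T = t, W]\big] \\ ATE_{t' \rightarrow t} &= CM_{t} - CM_{t'} \\ IATE &= \mathbb{E}_W\big[\mathbb{E}[Y \mid T_1 = t_1, T_2 = t_2, W] - \mathbb{E}[Y \mid T_1 = t_1, T_2 = t_2', W] \\ &\qquad - \mathbb{E}[Y \mid T_1 = t_1', T_2 = t_2, W] + \mathbb{E}[Y \mid T_1 = t_1', T_2 = t_2', W]\big] \end{aligned}\]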
Citing TMLE.jl
If you use TMLE.jl for your own work and would like to cite us, here are the BibTeX and APA formats:
- BibTeX
@software{Labayle_TMLE_jl,
author = {Labayle, Olivier and Beentjes, Sjoerd and Khamseh, Ava and Ponting, Chris},
title = {{TMLE.jl}},
url = {https://github.com/olivierlabayle/TMLE.jl}
}
- APA
Labayle, O., Beentjes, S., Khamseh, A., & Ponting, C. TMLE.jl [Computer software]. https://github.com/olivierlabayle/TMLE.jl