Home

Overview

TMLE.jl is a Julia implementation of the Targeted Minimum Loss-Based Estimation (TMLE) framework. If you want to leverage modern machine-learning methods while preserving interpretability and statistical inference guarantees, you are in the right place. TMLE.jl is compatible with any MLJ-compliant algorithm and any dataset respecting the Tables interface.

Installation

TMLE.jl can be installed via the Package Manager and supports Julia v1.6 and greater.

Pkg> add TMLE
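
The same installation can be scripted through the Pkg standard library, which is convenient in setup scripts or notebooks. This is a minimal sketch that assumes the package is registered under the name TMLE, as in the command above:

```julia
# Programmatic equivalent of `Pkg> add TMLE`
using Pkg
Pkg.add("TMLE")
```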

Quick Start

To run an estimation procedure, we need 3 ingredients:

1. A dataset: here a simulation dataset

For illustration, assume we know the actual data generating process is as follows:

\[\begin{aligned} W &\sim \mathrm{Uniform}(0, 1) \\ T &\sim \mathrm{Bernoulli}(\operatorname{logistic}(1 - 2 \cdot W)) \\ Y &\sim \mathrm{Normal}(1 + 3 \cdot T - T \cdot W, 0.01) \end{aligned}\]

Because we know the data generating process, we can simulate some data accordingly:

using TMLE
using Distributions
using StableRNGs
using Random
using CategoricalArrays
using MLJLinearModels
using LogExpFunctions

rng = StableRNG(123)
n = 100
W = rand(rng, Uniform(), n)
T = rand(rng, Uniform(), n) .< logistic.(1 .- 2W)
Y = 1 .+ 3T .- T.*W .+ rand(rng, Normal(0, 0.01), n)
dataset = (Y=Y, T=categorical(T), W=W)

2. A quantity of interest: here the Average Treatment Effect (ATE)

The Average Treatment Effect of $T$ on $Y$ confounded by $W$ is defined as:

Ψ = ATE(
    outcome=:Y,
    treatment_values=(T=(case=true, control = false),),
    treatment_confounders=(T=[:W],)
)
TMLE.StatisticalATE
-----
Outcome: Y
Treatment: OrderedCollections.OrderedDict{Symbol, @NamedTuple{control::Bool, case::Bool}}(:T => (control = 0, case = 1))
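
For the data generating process given earlier, the value of this estimand can be worked out by hand, which gives a reference point for judging any estimate. The derivation below is a sketch that only uses the conditional mean of $Y$ stated in that process and the fact that $W$ is uniform on $(0, 1)$:

\[\begin{aligned} ATE &= \mathbb{E}_W\left[\mathbb{E}[Y \mid T = 1, W] - \mathbb{E}[Y \mid T = 0, W]\right] \\ &= \mathbb{E}_W\left[(1 + 3 - W) - 1\right] \\ &= 3 - \mathbb{E}[W] = 3 - 0.5 = 2.5 \end{aligned}\]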

3. An estimator: here a Targeted Maximum Likelihood Estimator (TMLE)

tmle = TMLEE()
result, _ = tmle(Ψ, dataset, verbosity=0);
result
One sample t-test
-----------------
Population details:
    parameter of interest:   Mean
    value under h_0:         0
    point estimate:          2.49314
    95% confidence interval: (2.434, 2.552)

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           <1e-93

Details:
    number of observations:   100
    t-statistic:              83.86604877048882
    degrees of freedom:       99
    empirical standard error: 0.029727653554063295

We are comforted to see that the 95% confidence interval covers the ground truth: under the data generating process above, the true ATE is $3 - \mathbb{E}[W] = 2.5$! 🥳
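
The ground truth referred to here can also be approximated numerically. The following is a minimal sketch, not part of TMLE.jl, that relies only on the Julia standard library. It draws a large sample of $W$, builds the two counterfactual outcomes by setting $T$ to 1 and to 0 for every unit, and averages their difference. The names `Y_case`, `Y_control` and `ate_mc`, the seed, and the sample size are illustrative assumptions.

```julia
using Random, Statistics

rng = MersenneTwister(2024)   # illustrative seed
n = 1_000_000                 # large sample to keep Monte Carlo error small

W = rand(rng, n)                  # W ~ Uniform(0, 1)
noise = 0.01 .* randn(rng, n)     # outcome noise with standard deviation 0.01

# Counterfactual outcomes: plug T = 1 and T = 0 into Y = 1 + 3T - T*W + noise
Y_case = 1 .+ 3 .* 1 .- 1 .* W .+ noise
Y_control = 1 .+ 3 .* 0 .- 0 .* W .+ noise

# Monte Carlo approximation of the Average Treatment Effect
ate_mc = mean(Y_case .- Y_control)
println("Monte Carlo ATE ≈ ", round(ate_mc, digits=3))
```

The printed value should be close to 2.5, which is the quantity the confidence interval above is expected to cover.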

Scope and Distinguishing Features

The goal of this package is to provide an entry point for semi-parametric, asymptotically unbiased and efficient estimation in Julia. The two main general-purpose estimators known to achieve these properties are the One-Step estimator and the Targeted Maximum-Likelihood estimator. Most of the effort so far has focused on estimands that are compositions of the counterfactual mean.

Distinguishing Features:

  • Estimands: Counterfactual Mean, Average Treatment Effect, Interactions, and any composition thereof
  • Estimators: TMLE and One-Step, in both canonical and cross-validated versions
  • Machine-Learning: Any MLJ-compatible model
  • Dataset: Any dataset respecting the Tables interface (e.g. DataFrames.jl)
  • Factorial Treatment Variables:
    • Multiple treatments
    • Categorical treatment values
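
To illustrate the Tables requirement listed above: a plain NamedTuple of equal-length vectors, like the `dataset` built in the Quick Start, already satisfies the Tables interface. The sketch below assumes Tables.jl is installed in the active environment and uses a small hand-written dataset whose values are purely illustrative.

```julia
using Tables

# A NamedTuple of vectors is treated as a column table by Tables.jl
dataset = (Y = [1.0, 4.2, 0.9], T = [false, true, false], W = [0.3, 0.8, 0.5])

# Check that the object is recognised as a table
is_table = Tables.istable(dataset)

# Access column names and one column through the generic interface
cols = Tables.columns(dataset)
names = collect(Tables.columnnames(cols))
w_column = Tables.getcolumn(cols, :W)

println("istable: ", is_table, ", columns: ", names)
```

Any other source for which `Tables.istable` returns `true`, such as a DataFrame, can be accessed through the same calls.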

Citing TMLE.jl

If you use TMLE.jl for your own work and would like to cite us, here are the BibTeX and APA formats:

  • BibTeX
@software{Labayle_TMLE_jl,
    author = {Labayle, Olivier and Beentjes, Sjoerd and Khamseh, Ava and Ponting, Chris},
    title = {{TMLE.jl}},
    url = {https://github.com/olivierlabayle/TMLE.jl}
}
  • APA

Labayle, O., Beentjes, S., Khamseh, A., & Ponting, C. TMLE.jl [Computer software]. https://github.com/olivierlabayle/TMLE.jl