Density Ratio Training

Density ratio estimation is the core machine-learning step in the SBI workflow. The goal is to learn the ratio \(p_A(x) / p_B(x)\) between two hypotheses directly from simulated data, without estimating either density individually. This is done by training a binary classifier to distinguish events drawn from each hypothesis — the classifier score is then converted to a density ratio.

How it works

The density_ratio_trainer provides an end-to-end interface for training density ratio networks. Given a dataset containing events from both hypotheses (with per-event weights and binary labels), the trainer handles feature scaling, network training, optional post-hoc calibration, and a suite of diagnostic checks.

A typical training call looks like:

from nsbi_common_utils.training import density_ratio_trainer

trainer = density_ratio_trainer(
    dataset=df,
    weights=weights,
    training_labels=labels,
    features=feature_list,
    features_scaling=feature_list,
    sample_name=["htautau", "ztautau"],
    output_name="htautau_vs_ztautau",
    path_to_figures="plots/",
    path_to_models="models/",
)

trainer.train(
    hidden_layers=3,
    neurons=64,
    number_of_epochs=200,
    batch_size=1024,
    learning_rate=1e-3,
    scalerType="StandardScaler",
    ensemble_index=0,
)

The trained model is automatically exported to ONNX format for portable, backend-agnostic inference.

Using the fit configuration

In practice, many of the inputs to the trainer — training features, which processes to train, and which process serves as the reference hypothesis — are read from the fit configuration file via ConfigManager:

from nsbi_common_utils import configuration, datasets

config = configuration.ConfigManager(file_path_string="config_fit.yml")

# Training features and which to standardise
features, features_scaling = config.get_training_features()

# Which processes get their own density ratio network
basis_samples = config.get_basis_samples()        # e.g. ["htautau", "ztautau"]

# The denominator process in the density ratio
reference_samples = config.get_reference_samples() # e.g. ["ztautau"]

# Load data from ROOT files defined in the config
datasets_helper = datasets.datasets(config_path="config_fit.yml", branches_to_load=features)
dataset_dict = datasets_helper.load_datasets_from_config(load_systematics=False)

These values can also be passed manually if you are using the training APIs independently of the configuration system.

Data requirements

The trainer expects a single DataFrame with events from both hypotheses, along with:

Weights — per-event weights, normalised independently per class so each class contributes equally.
Labels — 1 for hypothesis A (numerator) and 0 for hypothesis B (denominator).

The data is automatically split into training, validation, and holdout sets. The random seed and split metadata are saved to disk for reproducibility.

Feature scaling

Three scaling strategies are available via the scalerType parameter: "StandardScaler", "MinMax", and "PowerTransform_Yeo". The features_scaling argument controls which features are scaled — features not listed pass through unchanged.

Ensemble training

To reduce variance in the learned density ratios, multiple independent networks can be trained by passing different ensemble_index values. Each ensemble member saves its own model, scaler, and metadata with an index suffix. On a cluster, ensemble members are trained in parallel via HTCondor/DAGMan.

Calibration

Raw classifier outputs may not be perfectly calibrated probabilities. The trainer supports optional post-hoc calibration using either isotonic regression or histogram-based methods. When enabled, the calibrator is saved alongside the model and applied automatically at inference time.

Diagnostics

After training, several built-in diagnostic methods help validate the quality of the learned density ratios:

Overtraining check (make_overfit_plots) — compares score distributions between training and holdout data.
Calibration curve (make_calib_plots) — verifies that predicted scores match true class fractions.
Reweighting check (make_reweighted_plots) — the key closure test: reweighting hypothesis B by the learned ratio should reproduce hypothesis A.
Normalisation test (test_normalization) — checks that \(\int r(x) \, p_B(x) \, dx \approx 1\).

Extending with custom models

The training infrastructure is not limited to the built-in DensityRatioLightning and MultiClassLightning modules. The Lightning modules, trainer classes, and utility functions (ONNX export, batched inference, calibration) are designed as independent, composable components.

To add a new model type — for example a direct density estimator based on normalising flows — you would:

Write a new pl.LightningModule subclass that defines the architecture, loss, and optimiser. It should expose a mlp and out attribute if you want to reuse the ONNX export utilities (save_model, convert_torch_to_onnx) directly, or you can handle export separately.
Use the existing utility functions (save_model, load_trained_model, predict_with_onnx) for serialisation and inference — these work with any ONNX-compatible model.
Optionally write a new trainer class following the same pattern as density_ratio_trainer or preselection_network_trainer to handle data splitting, scaling, and diagnostics.

The shared utilities in nsbi_common_utils.training.utils and the callbacks/datasets in nsbi_common_utils.lightning_tools are reusable across any model type.

Where it fits in the pipeline

Density ratio training happens after data preprocessing and preselection (Stages 2/2b), and before model evaluation and workspace construction (Stage 3b). The trained models produce per-event density ratio arrays that are assembled into the statistical model by the workspace builder.