Density Ratio Training

Density ratio estimation is the core machine-learning step in the SBI workflow. The goal is to learn the ratio \(p_A(x) / p_B(x)\) between two hypotheses directly from simulated data, without estimating either density individually. This is done by training a binary classifier to distinguish events drawn from each hypothesis — the classifier score is then converted to a density ratio.
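
The score-to-ratio conversion is the standard likelihood-ratio trick. As a sketch, assume a classifier score \(s(x)\) trained with binary cross-entropy, label 1 for hypothesis A and 0 for hypothesis B, and equal total weight per class. The symbols \(s\), \(s^{*}\) and \(r\) are introduced here for illustration. The optimal score and the resulting ratio are then:

```latex
\[
s^{*}(x) = \frac{p_A(x)}{p_A(x) + p_B(x)}
\quad\Longrightarrow\quad
r(x) \equiv \frac{p_A(x)}{p_B(x)} = \frac{s^{*}(x)}{1 - s^{*}(x)}
\]
```

A trained network only approximates \(s^{*}\), which is why the calibration and diagnostic steps described later matter.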

How it works

The density_ratio_trainer provides an end-to-end interface for training density ratio networks. Given a dataset containing events from both hypotheses (with per-event weights and binary labels), the trainer handles feature scaling, network training, optional post-hoc calibration, and a suite of diagnostic checks.

A typical training call looks like:

from nsbi_common_utils.training import density_ratio_trainer

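# df, weights, labels and feature_list are assumed to be prepared beforehand:
# a DataFrame with events from both hypotheses, per-event weights,
# binary labels, and the names of the training feature columns.
trainer = density_ratio_trainer(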
    dataset=df,
    weights=weights,
    training_labels=labels,
    features=feature_list,
    features_scaling=feature_list,
    sample_name=["htautau", "ztautau"],
    output_name="htautau_vs_ztautau",
    path_to_figures="plots/",
    path_to_models="models/",
)

trainer.train(
    hidden_layers=3,
    neurons=64,
    number_of_epochs=200,
    batch_size=1024,
    learning_rate=1e-3,
    scalerType="StandardScaler",
    ensemble_index=0,
)

The trained model is automatically exported to ONNX format for portable, backend-agnostic inference.

Using the fit configuration

In practice, many of the inputs to the trainer — training features, which processes to train, and which process serves as the reference hypothesis — are read from the fit configuration file via ConfigManager:

from nsbi_common_utils import configuration, datasets

config = configuration.ConfigManager(file_path_string="config_fit.yml")

# Training features and which to standardise
features, features_scaling = config.get_training_features()

# Which processes get their own density ratio network
basis_samples = config.get_basis_samples()        # e.g. ["htautau", "ztautau"]

# The denominator process in the density ratio
reference_samples = config.get_reference_samples() # e.g. ["ztautau"]

# Load data from ROOT files defined in the config
datasets_helper = datasets.datasets(config_path="config_fit.yml", branches_to_load=features)
dataset_dict = datasets_helper.load_datasets_from_config(load_systematics=False)

These values can also be passed manually if you are using the training APIs independently of the configuration system.

Data requirements

The trainer expects a single DataFrame with events from both hypotheses, along with:

  • Weights: per-event weights, normalised independently per class so each class contributes equally.

  • Labels: 1 for hypothesis A (numerator) and 0 for hypothesis B (denominator).
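
The per-class weight normalisation can be sketched as follows. This is a minimal NumPy illustration, not the trainer's own code: the helper name normalise_weights_per_class is hypothetical, and rescaling each class to unit total weight is an assumption about what "contributes equally" means.

```python
import numpy as np

def normalise_weights_per_class(weights, labels):
    """Rescale weights so that each class (label 0 and label 1) sums to 1.

    Hypothetical helper for illustration; not part of nsbi_common_utils.
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    normalised = np.empty_like(weights)
    for cls in (0, 1):
        mask = labels == cls
        normalised[mask] = weights[mask] / weights[mask].sum()
    return normalised

# Toy input: class 0 carries far more total weight than class 1
weights = np.array([2.0, 6.0, 0.1, 0.3])
labels = np.array([0, 0, 1, 1])
w_norm = normalise_weights_per_class(weights, labels)
```

After this step both classes have the same total weight, so the classifier is not biased towards the hypothesis with the larger raw yield.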

The data is automatically split into training, validation, and holdout sets. The random seed and split metadata are saved to disk for reproducibility.
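
A reproducible three-way split can be illustrated as below. The split fractions, the seed value, and the function name split_indices are assumptions made for this sketch; the trainer's actual proportions are not specified here.

```python
import numpy as np

def split_indices(n_events, seed, frac_train=0.6, frac_val=0.2):
    """Shuffle event indices with a fixed seed and cut them into three disjoint sets.

    Hypothetical helper; the fractions are illustrative defaults.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_events)
    n_train = int(round(frac_train * n_events))
    n_val = int(round(frac_val * n_events))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    holdout = order[n_train + n_val:]
    return train, val, holdout

train_idx, val_idx, holdout_idx = split_indices(n_events=1000, seed=42)
# Re-running with the same seed reproduces exactly the same split
train_again, _, _ = split_indices(n_events=1000, seed=42)
```

Storing the seed alongside the model is enough to rebuild the same holdout set later, for example when re-running diagnostics.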

Feature scaling

Three scaling strategies are available via the scalerType parameter: "StandardScaler", "MinMax", and "PowerTransform_Yeo". The features_scaling argument controls which features are scaled — features not listed pass through unchanged.
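
The selective-scaling behaviour can be sketched with plain NumPy. Only the standardisation case is shown (zero mean, unit variance, which is what "StandardScaler" conventionally denotes). The function scale_selected and the feature names are hypothetical, and the trainer's internal implementation may differ.

```python
import numpy as np

def scale_selected(data, features, features_scaling):
    """Standardise only the columns named in features_scaling; copy the rest unchanged."""
    scaled = np.array(data, dtype=float, copy=True)
    for name in features_scaling:
        col = features.index(name)
        mean = scaled[:, col].mean()
        std = scaled[:, col].std()
        scaled[:, col] = (scaled[:, col] - mean) / std
    return scaled

features = ["pt_lead", "n_jets"]   # hypothetical feature names
features_scaling = ["pt_lead"]     # only this column is standardised
data = np.array([[20.0, 1.0], [40.0, 2.0], [60.0, 3.0]])
scaled = scale_selected(data, features, features_scaling)
```

In a real workflow the mean and standard deviation would be computed on the training split only and stored for reuse at inference.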

Ensemble training

To reduce variance in the learned density ratios, multiple independent networks can be trained by passing different ensemble_index values. Each ensemble member saves its own model, scaler, and metadata with an index suffix. On a cluster, ensemble members are trained in parallel via HTCondor/DAGMan.
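
How the ensemble members are combined at inference is not described here. One common choice, shown below purely as an assumption, is to average the per-event density ratios across members and use their spread as a rough stability indicator. The numbers are made up for illustration.

```python
import numpy as np

# Hypothetical per-event density ratios from three ensemble members
# (rows: ensemble_index 0, 1, 2; columns: events)
member_ratios = np.array([
    [0.9, 1.2, 2.0],
    [1.1, 1.0, 2.2],
    [1.0, 1.1, 1.8],
])

ensemble_mean = member_ratios.mean(axis=0)    # combined estimate per event
ensemble_spread = member_ratios.std(axis=0)   # member-to-member variation per event
```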

Calibration

Raw classifier outputs may not be perfectly calibrated probabilities. The trainer supports optional post-hoc calibration using either isotonic regression or histogram-based methods. When enabled, the calibrator is saved alongside the model and applied automatically at inference time.
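
A histogram-based calibration can be sketched as follows: bin the raw scores, then replace each score by the weighted fraction of hypothesis-A events in its bin. The function names, the binning, and the toy data are hypothetical and do not reproduce the trainer's implementation.

```python
import numpy as np

def fit_histogram_calibrator(scores, labels, weights, n_bins=10):
    """Return bin edges and the weighted label-1 fraction in each score bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    is_a = labels == 1
    num, _ = np.histogram(scores[is_a], bins=edges, weights=weights[is_a])
    den, _ = np.histogram(scores, bins=edges, weights=weights)
    # Empty bins fall back to the bin centre so the map is defined everywhere
    centres = 0.5 * (edges[:-1] + edges[1:])
    safe_den = np.where(den > 0, den, 1.0)
    fraction = np.where(den > 0, num / safe_den, centres)
    return edges, fraction

def apply_histogram_calibrator(scores, edges, fraction):
    """Map raw scores to calibrated probabilities by bin lookup."""
    idx = np.clip(np.digitize(scores, edges) - 1, 0, len(fraction) - 1)
    return fraction[idx]

# Toy data: true class-1 probability p_true, deliberately distorted raw score p_true**2
rng = np.random.default_rng(0)
p_true = rng.random(20_000)
labels = (rng.random(p_true.size) < p_true).astype(int)
weights = np.ones_like(p_true)
raw_scores = p_true**2

edges, fraction = fit_histogram_calibrator(raw_scores, labels, weights)
calibrated = apply_histogram_calibrator(raw_scores, edges, fraction)

# Mean absolute deviation from the true probability, before and after calibration
err_raw = np.mean(np.abs(raw_scores - p_true))
err_cal = np.mean(np.abs(calibrated - p_true))
```

Isotonic regression plays the same role but fits a monotone step function instead of fixed-width bins, which avoids having to choose a bin count.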

Diagnostics

After training, several built-in diagnostic methods help validate the quality of the learned density ratios:

  • Overtraining check (make_overfit_plots) — compares score distributions between training and holdout data.

  • Calibration curve (make_calib_plots) — verifies that predicted scores match true class fractions.

  • Reweighting check (make_reweighted_plots) — the key closure test: reweighting hypothesis B by the learned ratio should reproduce hypothesis A.

  • Normalisation test (test_normalization) — checks that \(\int r(x) \, p_B(x) \, dx \approx 1\).
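
The normalisation and reweighting checks can be illustrated on a toy problem where the true ratio is known analytically. The sketch below assumes two unit-width Gaussians and uses the exact ratio in place of a trained network. It does not call test_normalization or make_reweighted_plots, and all names in it are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_a = 0.5                                            # p_A = N(mu_a, 1), p_B = N(0, 1)
x_b = rng.normal(loc=0.0, scale=1.0, size=200_000)    # events drawn from p_B
w_b = np.ones_like(x_b)                               # per-event weights

def true_ratio(x):
    """Exact p_A(x) / p_B(x) for the two unit-width Gaussians above."""
    return np.exp(mu_a * x - 0.5 * mu_a**2)

r = true_ratio(x_b)

# Normalisation test: the weighted mean of r over p_B events
# estimates the integral of r(x) p_B(x) dx, which should be close to 1
normalisation = np.sum(w_b * r) / np.sum(w_b)

# Reweighting closure: p_B events weighted by r should reproduce p_A,
# checked here through the mean of x (expected to be close to mu_a)
reweighted_mean = np.sum(w_b * r * x_b) / np.sum(w_b * r)
```

With a trained network in place of true_ratio, a normalisation far from 1 or a reweighted distribution that misses hypothesis A points to a poorly learned ratio.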

Extending with custom models

The training infrastructure is not limited to the built-in DensityRatioLightning and MultiClassLightning modules. The Lightning modules, trainer classes, and utility functions (ONNX export, batched inference, calibration) are designed as independent, composable components.

To add a new model type — for example a direct density estimator based on normalising flows — you would:

  1. Write a new pl.LightningModule subclass that defines the architecture, loss, and optimiser. It should expose mlp and out attributes if you want to reuse the ONNX export utilities (save_model, convert_torch_to_onnx) directly; otherwise you can handle export separately.

  2. Use the existing utility functions (save_model, load_trained_model, predict_with_onnx) for serialisation and inference — these work with any ONNX-compatible model.

  3. Optionally write a new trainer class following the same pattern as density_ratio_trainer or preselection_network_trainer to handle data splitting, scaling, and diagnostics.

The shared utilities in nsbi_common_utils.training.utils and the callbacks/datasets in nsbi_common_utils.lightning_tools are reusable across any model type.

Where it fits in the pipeline

Density ratio training happens after data preprocessing and preselection (Stages 2/2b), and before model evaluation and workspace construction (Stage 3b). The trained models produce per-event density ratio arrays that are assembled into the statistical model by the workspace builder.