API Documentation

Classes

class relAI.CosineActivation

A custom activation function that applies the cosine function.

The CosineActivation class is a PyTorch module that applies the cosine activation function to the input tensor.

forward(x)

Applies the cosine activation to the input tensor.

The forward method takes an input tensor and, element-wise, subtracts each input value from its cosine, returning cos(x) - x.

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The tensor with the cosine activation applied.

Return type:

torch.Tensor
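The element-wise rule described above can be illustrated without PyTorch. The following is a minimal plain-Python sketch, not the library's implementation. The helper name cosine_activation is hypothetical, and the sketch assumes the output is cos(x) - x, as the description of forward states.

```python
import math


def cosine_activation(values):
    """Return cos(v) - v for each value (hypothetical helper, plain Python)."""
    return [math.cos(v) - v for v in values]


# cos(0) - 0 is exactly 1.0; other inputs give cos(v) minus v.
print(cosine_activation([0.0, 1.0]))
```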

class relAI.AE(layer_sizes)

Autoencoder model implemented as a PyTorch module.

The AE class represents an autoencoder model with specified sizes of the layers. It consists of an encoder and a decoder, both utilizing the CosineActivation as the activation function.

Parameters:

layer_sizes (list[int]) – A list containing the sizes of the layers of the encoder (decoder built with symmetry).

Variables:
  • encoder (torch.nn.Sequential) – The encoder module.

  • decoder (torch.nn.Sequential) – The decoder module.
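As a rough illustration of "decoder built with symmetry", the sketch below derives the (input size, output size) pair of each layer from layer_sizes. It assumes the decoder simply reverses the encoder's sizes; the helper name layer_pairs is hypothetical and is not part of relAI.

```python
def layer_pairs(layer_sizes):
    """Return (in, out) size pairs for the encoder and a mirrored decoder.

    Assumption: the decoder reverses the encoder's layer sizes.
    """
    encoder = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    reversed_sizes = layer_sizes[::-1]
    decoder = list(zip(reversed_sizes[:-1], reversed_sizes[1:]))
    return encoder, decoder


enc, dec = layer_pairs([8, 12, 16])
print(enc)  # [(8, 12), (12, 16)]
print(dec)  # [(16, 12), (12, 8)]
```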

build_decoder(layer_sizes)

Builds the decoder part of an autoencoder model based on the specified layer sizes.

Parameters:

layer_sizes (list[int]) – A list of integers representing the number of nodes in each layer of the decoder.

Returns:

The decoder module of the autoencoder model.

Return type:

torch.nn.Sequential

build_encoder(layer_sizes)

Builds the encoder part of an autoencoder model based on the specified layer sizes.

Parameters:

layer_sizes (list[int]) – A list of integers representing the number of nodes in each layer of the encoder.

Returns:

The encoder module of the autoencoder model.

Return type:

torch.nn.Sequential

forward(x)

Performs the forward pass of the autoencoder model.

The forward method takes an input tensor and passes it through the encoder, obtaining the encoded representation. The encoded representation is then passed through the decoder to reconstruct the original input.

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The reconstructed tensor.

Return type:

torch.Tensor

class relAI.ReliabilityDetector(ae, proxy_model, mse_thresh)

Reliability Detector for assessing the reliability of data points.

The ReliabilityDetector class computes the reliability of data points based on a specified autoencoder (ae), a proxy model (proxy_model, stored as clf), and an MSE threshold (mse_thresh).

Parameters:
  • ae (AE) – The autoencoder model.

  • proxy_model – The proxy model used for the local fit reliability computation.

  • mse_thresh (float) – The MSE threshold used for the density reliability computation.

Variables:
  • ae (AE) – The autoencoder model.

  • clf – The proxy model used for the local fit reliability computation.

  • mse_thresh (float) – The MSE threshold for the density reliability computation.

compute_density_reliability(x)

Computes the density reliability of a data point.

The density reliability is determined by computing the mean squared error (MSE) between the input data point and its reconstructed representation obtained from the autoencoder. If the MSE is less than (or equal to) the specified MSE threshold, the data point is considered reliable (returns 1), otherwise unreliable (returns 0).

Parameters:

x (numpy.ndarray) – The input data point.

Returns:

The density reliability value (1 for reliable, 0 for unreliable).

Return type:

int
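The thresholding rule above can be sketched in plain Python. This is only an illustration of the logic: the reconstruction x_hat is passed in directly instead of being produced by an autoencoder, and the helper names mse and density_reliability are hypothetical.

```python
def mse(x, x_hat):
    """Mean squared error between a point and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def density_reliability(x, x_hat, mse_thresh):
    """Return 1 if the reconstruction error is within the threshold, else 0."""
    return 1 if mse(x, x_hat) <= mse_thresh else 0


x = [1.0, 2.0, 3.0]
x_hat = [1.0, 2.0, 4.0]  # squared errors 0, 0, 1 -> MSE = 1/3
print(density_reliability(x, x_hat, 0.5))  # 1
print(density_reliability(x, x_hat, 0.1))  # 0
```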

compute_localfit_reliability(x)

Computes the local fit reliability of a data point.

The local fit reliability is determined by using the proxy model to predict the local fit reliability of the input data point. The input data point is reshaped to match the expected input format of the proxy model. The predicted reliability value is returned.

Parameters:

x (numpy.ndarray) – The input data point.

Returns:

The local fit reliability class predicted by the proxy model (1 for reliable, 0 for unreliable).

Return type:

int

compute_total_reliability(x)

Computes the combined reliability of a data point.

The combined reliability is determined by combining the density reliability and the local fit reliability. If both reliabilities are positive (1), the data point is considered reliable (returns True), otherwise unreliable (returns False).

Parameters:

x (numpy.ndarray) – The input data point.

Returns:

The combined reliability value (True for reliable, False for unreliable).

Return type:

bool
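The combination rule amounts to a logical AND of the two partial results. A minimal sketch, with the hypothetical helper total_reliability taking the two already-computed values:

```python
def total_reliability(density, localfit):
    """Return True only when both partial reliabilities equal 1."""
    return density == 1 and localfit == 1


print(total_reliability(1, 1))  # True
print(total_reliability(1, 0))  # False
```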

class relAI.DensityPrincipleDetector(autoencoder, threshold)

Density Principle Detector for assessing the density reliability of data points.

The DensityPrincipleDetector class computes the density reliability of data points based on a specified autoencoder (autoencoder) and a threshold (threshold).

Parameters:
  • autoencoder (AE) – The autoencoder model.

  • threshold (float) – The threshold for determining the density reliability.

Variables:
  • ae (AE) – The autoencoder model.

  • thresh (float) – The threshold for determining the density reliability.

compute_reliability(x)

Computes the density reliability of a data point.

The density reliability is determined by computing the mean squared error (MSE) between the input data point and its reconstructed representation obtained from the autoencoder. If the MSE is less than or equal to the specified threshold, the data point is considered reliable (returns 1), otherwise unreliable (returns 0).

Parameters:

x (numpy.ndarray) – The input data point.

Returns:

The density reliability value (1 for reliable, 0 for unreliable).

Return type:

int

Functions

relAI.compute_dataset_avg_mse(ae, X)

Compute the average mean squared error (MSE) for a given autoencoder model and dataset.

Parameters:
  • ae (torch.nn.Module) – The autoencoder model.

  • X (numpy.ndarray) – The dataset of interest.

Returns:

The average MSE value for the reconstructed samples.

Return type:

float

relAI.compute_dataset_reliability(RD, X, mode='total')

Computes the reliability of the samples in a dataset.

This function computes the density, local-fit, or total reliability of the samples in the dataset X, depending on the mode specified, using the ReliabilityDetector RD.

Parameters:
  • RD (ReliabilityDetector) – A ReliabilityDetector object.

  • X (array-like) – The dataset to evaluate.

  • mode (str) – The type of reliability to compute. Available options: ‘density’, ‘local-fit’, ‘total’. Default is ‘total’.

Returns:

A 1-D NumPy array containing the reliability of each sample (1 for reliable, 0 for unreliable).

Return type:

numpy.ndarray
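The mode argument can be thought of as selecting one of the three ReliabilityDetector methods documented above. The sketch below shows that dispatch with a stand-in detector. StubDetector and dataset_reliability are hypothetical names, the stub's decision rules are invented for the example, and the sketch returns a plain list instead of a NumPy array.

```python
class StubDetector:
    """Hypothetical stand-in exposing the three documented method names."""

    def compute_density_reliability(self, x):
        return 1 if sum(x) < 10 else 0  # invented rule, for illustration only

    def compute_localfit_reliability(self, x):
        return 1 if x[0] >= 0 else 0  # invented rule, for illustration only

    def compute_total_reliability(self, x):
        return bool(
            self.compute_density_reliability(x)
            and self.compute_localfit_reliability(x)
        )


def dataset_reliability(detector, X, mode="total"):
    """Apply the method selected by mode to every sample in X."""
    methods = {
        "density": detector.compute_density_reliability,
        "local-fit": detector.compute_localfit_reliability,
        "total": detector.compute_total_reliability,
    }
    if mode not in methods:
        raise ValueError(f"unknown mode: {mode!r}")
    return [int(methods[mode](x)) for x in X]


X = [[1, 2], [-1, 2], [20, 1]]
print(dataset_reliability(StubDetector(), X, mode="density"))    # [1, 1, 0]
print(dataset_reliability(StubDetector(), X, mode="local-fit"))  # [1, 0, 1]
print(dataset_reliability(StubDetector(), X))                    # [1, 0, 0]
```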

relAI.create_and_train_autoencoder(training_set, validation_set, batchsize, layer_sizes=None, epochs=1000, optimizer=None, loss_function=MSELoss())

Creates and trains an autoencoder model using the provided training and validation sets.

This function creates an autoencoder model based on the specified layer sizes and trains it using the provided training and validation sets. It performs multiple epochs of training, updating the model parameters based on the specified optimizer and loss function. The model is evaluated on the validation set after each epoch, and the resulting validation loss is displayed in a plot.

Parameters:
  • training_set (numpy.ndarray) – The training set.

  • validation_set (numpy.ndarray) – The validation set.

  • batchsize (int) – The batch size used for training.

  • layer_sizes (list) –

    A list containing the number of nodes of each layer of the encoder (decoder built with symmetry).

    If None, the default dimension of the encoder’s layers is [dim_input, dim_input + 4, dim_input + 8, dim_input + 16, dim_input + 32]

  • epochs (int) – The number of training epochs (default: 1000).

  • optimizer (torch.optim.Optimizer) – The optimizer used for parameter updates. If None, an Adam optimizer with default parameters will be used (default: None).

  • loss_function (torch.nn.Module) – The loss function used for training. If None, the mean squared error (MSE) loss function will be used (default: torch.nn.MSELoss()).

Returns:

The trained autoencoder model.

Return type:

AE (torch.nn.Module)
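For reference, the default described for layer_sizes=None can be written out as a small helper. The function name default_layer_sizes is hypothetical; only the offsets come from the parameter description above.

```python
def default_layer_sizes(dim_input):
    """Encoder layer sizes used when layer_sizes is None, per the description."""
    return [dim_input + offset for offset in (0, 4, 8, 16, 32)]


print(default_layer_sizes(10))  # [10, 14, 18, 26, 42]
```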

relAI.create_autoencoder(layer_sizes)

Gets an autoencoder model with the specified sizes of the layers.

This function gets an autoencoder model using the AE class, implemented as a PyTorch module, with the specified layers’ sizes. The autoencoder is used for the implementation of the Density Principle.

Parameters:

layer_sizes (list) – A list containing the number of nodes of each layer of the encoder (decoder built with symmetry).

Returns:

An instance of the autoencoder model.

Return type:

AE (torch.nn.Module)

relAI.create_reliability_detector(ae, syn_pts, acc_syn, mse_thresh, acc_thresh, proxy_model='MLP')

Creates a ReliabilityDetector object for a given autoencoder, synthetic points, accuracy of the synthetic points, MSE threshold, and accuracy threshold.

This function creates a ReliabilityDetector object using the specified autoencoder, synthetic points, accuracy of the synthetic points, MSE threshold, and accuracy threshold. The ReliabilityDetector assigns the density reliability of samples based on their reconstruction error (MSE) with respect to the MSE threshold. It assigns the local fit reliability based on the prediction of a model (‘proxy_model’) trained on the synthetic points, which are labelled as “local-fit” reliable or unreliable by comparing their associated accuracy with the accuracy threshold.

Parameters:
  • ae (torch.nn.Module) – The autoencoder used for projection.

  • syn_pts (array-like) – The synthetic points used for training the “local-fit” reliability predictor.

  • acc_syn (array-like) – The accuracy scores corresponding to the synthetic points.

  • mse_thresh (float) – The MSE threshold used for assigning the density reliability scores.

  • acc_thresh (float) – The accuracy threshold used for assigning the “local-fit” reliability scores.

  • proxy_model (str) – The type of proxy model used for training the “local-fit” reliability predictor. Available options: ‘MLP’, ‘tree’. Default is ‘MLP’ (Multi-Layer Perceptron).

Returns:

A ReliabilityDetector object.

Return type:

ReliabilityDetector
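The labelling step that prepares the proxy model's training targets can be sketched as follows. This is an assumption-laden illustration: the helper label_synthetic_points is hypothetical, and treating an accuracy exactly equal to acc_thresh as reliable is a choice made for the example, not something the description specifies.

```python
def label_synthetic_points(acc_syn, acc_thresh):
    """Label each synthetic point 1 (reliable) or 0 (unreliable).

    Assumption: accuracy equal to the threshold counts as reliable.
    """
    return [1 if acc >= acc_thresh else 0 for acc in acc_syn]


print(label_synthetic_points([0.95, 0.60, 0.80], 0.80))  # [1, 0, 1]
```

The resulting labels, paired with syn_pts, would then serve as the training set of the chosen proxy model.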

relAI.density_predictor(ae, mse_thresh)

Creates a DensityPrincipleDetector object for a given autoencoder and MSE threshold.

This function creates a DensityPrincipleDetector object using the specified autoencoder and MSE threshold. The DensityPrincipleDetector is a density-based detector that assigns reliability scores to samples based on their reconstruction error (MSE) compared to the MSE threshold.

Parameters:
  • ae (torch.nn.Module) – The autoencoder used for projection.

  • mse_thresh (float) – The MSE threshold used for assigning reliability scores.

Returns:

A DensityPrincipleDetector object.

Return type:

DensityPrincipleDetector

relAI.generate_synthetic_points(predict_func, X_train, y_train, method='GN', k=5)

Generates synthetic points based on the specified method.

This function generates synthetic points based on the method specified in ‘method’. With ‘GN’, the synthetic points are generated from the training set by adding Gaussian random noise, with different values of variance, to the continuous variables, and by randomly drawing the values of binary and integer variables proportionally to their frequencies.

Parameters:
  • X_train (numpy.ndarray) – The training set with shape (n_samples, n_features).

  • method (str) – The method used to generate synthetic points (default: ‘GN’). Currently, only the ‘GN’ (Gaussian Noise) method is supported.

Returns:

The synthetic points generated with the specified method.

Return type:

numpy.ndarray
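The ‘GN’ idea can be illustrated with a simplified, dependency-free sketch. It is not the library's algorithm: the function gaussian_noise_points and the parameters discrete_cols, sigma, n_points, and seed are hypothetical, and a single noise level is used where the description mentions several variances.

```python
import random


def gaussian_noise_points(X_train, discrete_cols, sigma, n_points, seed=0):
    """Generate synthetic points from X_train (simplified GN-style sketch)."""
    rng = random.Random(seed)
    n_features = len(X_train[0])
    points = []
    for _ in range(n_points):
        base = rng.choice(X_train)  # training sample to perturb
        point = []
        for j in range(n_features):
            if j in discrete_cols:
                # Picking a random row's value samples each observed value
                # proportionally to its frequency in the column.
                point.append(rng.choice(X_train)[j])
            else:
                # Continuous feature: add zero-mean Gaussian noise.
                point.append(base[j] + rng.gauss(0.0, sigma))
        points.append(point)
    return points


X_train = [[0.5, 1], [1.5, 0], [2.5, 1]]
synthetic = gaussian_noise_points(X_train, discrete_cols={1}, sigma=0.1, n_points=4)
print(len(synthetic), len(synthetic[0]))  # 4 2
```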

relAI.mse_threshold_barplot(ae, X_val, y_val, predict_func)

Generates a bar plot of performance metrics based on different MSE thresholds.

This function generates a bar plot of performance metrics based on different Mean Squared Error (MSE) thresholds (selected as percentiles of the MSE of the validation set). It computes different scores for the reliable and unreliable samples obtained, and the number and percentage of unreliable samples, using the val_scores_diff_mse function. The bar plot shows the percentage of unreliable samples, as well as various performance metrics (e.g., balanced_accuracy, precision, recall, F1-score, MCC, or Brier score) for reliable and unreliable samples at each MSE threshold. A slider allows selecting the MSE threshold and updating the plot accordingly.

Parameters:
  • ae (torch.nn.Module) – The autoencoder used for projection.

  • X_val (array-like) – The validation dataset.

  • y_val (array-like) – The validation labels.

  • predict_func (callable) – The predict function of the classifier.

Returns:

A Plotly Figure object representing the MSE threshold bar plot.

Return type:

go.Figure

relAI.mse_threshold_plot(ae, X_val, y_val, predict_func, metric='f1_score')

Generates a plot of performance metrics based on different MSE thresholds (selected as percentiles of the MSE of the validation set).

This function generates a plot of performance metrics based on different Mean Squared Error (MSE) thresholds. It computes the number (and percentage) of reliable and unreliable samples obtained with each threshold, and different performance metrics, using the val_scores_diff_mse function. The plot shows the selected performance metric (‘metric’) (e.g., balanced_accuracy, precision, recall, F1-score, MCC, or Brier score) for reliable and unreliable samples at different MSE thresholds, together with their number and percentage. A slider lets the user move along the x-axis.

Parameters:
  • ae (torch.nn.Module) – The autoencoder used for projection.

  • X_val (array-like) – The validation dataset.

  • y_val (array-like) – The validation labels.

  • predict_func (callable) – The predict function of the classifier.

  • metric (str) – The performance metric to display on the plot. Available options: ‘balanced_accuracy’, ‘precision’, ‘recall’, ‘f1_score’, ‘mcc’, ‘brier_score’. Default is ‘f1_score’.

Returns:

A Plotly Figure object representing the MSE threshold plot.

Return type:

go.Figure

relAI.perc_mse_threshold(ae, validation_set, perc=95)

Computes the MSE threshold as a percentile of the MSE of the validation set.

This function computes the MSE threshold as a percentile of the MSE of the validation set using an autoencoder model. It calculates the MSE for each sample in the validation set and returns the specified percentile threshold.

Parameters:
  • ae (torch.nn.Module) – The autoencoder model.

  • validation_set (numpy.ndarray) – The validation set with shape (n_samples, n_features).

  • perc (int) – The percentile threshold to compute (default: 95).

Returns:

The MSE threshold as the specified percentile of the MSE of the validation set.

Return type:

float
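To make the percentile step concrete, the sketch below computes a threshold from a list of per-sample MSE values that are assumed to be already available. The helper percentile is hypothetical and uses linear interpolation between neighbouring ranks; whether relAI uses the same interpolation rule is an assumption.

```python
def percentile(values, perc):
    """Percentile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * perc / 100
    lower = int(rank)  # rank is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


# Hypothetical per-sample reconstruction errors on a validation set.
per_sample_mse = [0.01, 0.02, 0.03, 0.04, 0.50]
mse_thresh = percentile(per_sample_mse, 95)
print(mse_thresh)  # about 0.408
```

With perc=95, roughly 5% of the validation samples have an MSE above the returned threshold.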

relAI.train_autoencoder(ae, training_set, validation_set, batchsize, epochs=1000, optimizer=None, loss_function=MSELoss())

Trains the autoencoder model using the provided training and validation sets.

This function trains the autoencoder model using the provided training and validation sets. It performs multiple epochs of training, updating the model parameters based on the specified optimizer and loss function. The model is evaluated on the validation set after each epoch, and the resulting validation loss is displayed in a plot.

Parameters:
  • ae (torch.nn.Module) – The autoencoder model to be trained.

  • training_set (numpy.ndarray) – The training set.

  • validation_set (numpy.ndarray) – The validation set.

  • batchsize (int) – The batch size used for training.

  • epochs (int) – The number of training epochs (default: 1000).

  • optimizer (torch.optim.Optimizer) – The optimizer used for parameter updates. If None, an Adam optimizer with default parameters will be used (default: None).

  • loss_function (torch.nn.Module) – The loss function used for training. If None, the mean squared error (MSE) loss function will be used (default: torch.nn.MSELoss()).

Returns:

The trained autoencoder model.

Return type:

AE (torch.nn.Module)