LogGammaLightningModule#

class cosmolayer.LogGammaLightningModule(num_segment_types, temperature_exponents, area_per_segment, reference_temperature=298.15, max_iter=100, learning_rate=0.001, weight_decay=0.0, normalize_targets=False, loss_function='mse_loss', initialization=42)[source]#

PyTorch Lightning module for batched training of a learnable CosmoLayer.

This class is the canonical high-level training interface for CosmoLayer. It constructs an internal CosmoLayer with learnable interaction matrices and defines the optimization, training, validation, test, and prediction logic.

The targets are the log-activity coefficients of the components. To handle other tasks, subclass LogGammaLightningModule and override the predict_from_log_gamma method. For instance:

from scipy.constants import R  # molar gas constant, J/(mol K)

class ExcessGibbsLightningModule(LogGammaLightningModule):
    def predict_from_log_gamma(self, T, x, log_gamma):
        # Molar excess Gibbs energy: g_E = R * T * sum_i x_i * ln(gamma_i)
        return (R * T * (x * log_gamma).sum(dim=-1)).unsqueeze(-1)

The module is batch-first throughout. All inputs must represent a minibatch of b datapoints, and the returned predictions must have leading dimension b. Targets must have the same shape as the predictions.
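
A self-contained sketch of this batch-first contract, using plain torch tensors with made-up shapes rather than real COSMO inputs:

```python
import torch

# Stand-ins for single-datapoint inputs: a scalar temperature, mole
# fractions for two components, and a per-component feature tensor.
T = torch.tensor(298.15)
x = torch.tensor([0.2, 0.8])
features = torch.rand(2, 5)

single_inputs = (T, x, features)

# Promote each tensor to a minibatch of size b = 1 by adding a leading axis.
batched_inputs = tuple(t.unsqueeze(0) for t in single_inputs)
for t in batched_inputs:
    assert t.shape[0] == 1  # leading dimension is the batch size

# A minibatch of b = 4 datapoints is built by stacking along dim 0.
batch_of_four = tuple(torch.stack([t] * 4, dim=0) for t in single_inputs)
assert batch_of_four[1].shape == (4, 2)
```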

Parameters:
  • num_segment_types (int) – Number of COSMO segment types.

  • temperature_exponents (tuple[int, ...]) – Exponents defining the temperature dependence of the interaction matrices.

  • area_per_segment (float) – Area associated with one segment.

  • reference_temperature (float, optional) – Reference temperature used by CosmoLayer. Default is 298.15.

  • max_iter (int, optional) – Maximum number of internal fixed-point or iterative solver steps used by CosmoLayer. Default is 100.

  • learning_rate (float, optional) – Learning rate for the Adam optimizer. Default is 1e-3.

  • weight_decay (float, optional) – Weight decay for the Adam optimizer. Default is 0.0.

  • normalize_targets (bool, optional) – Whether to normalize the targets before computing the loss. Default is False.

  • loss_function (str, optional) – Loss function used in training, validation, and test steps. Must be a valid loss function from torch.nn.functional. Default is "mse_loss".

  • initialization (Sequence[NDArray[np.float64]] | int, optional) –

    Initialization for the learnable interaction matrices.

    • If an int is provided, it is interpreted as the random seed used to sample one matrix per temperature exponent from a standard normal distribution.

    • If a sequence of NumPy arrays is provided, it must contain exactly one array per temperature exponent, and each array must have shape (num_segment_types, num_segment_types).

    Default is 42.
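
The seeded initialization can be sketched as follows; this is an illustrative NumPy reimplementation of the described behavior, not the module's actual internal code, and the RNG it uses may differ:

```python
import numpy as np

num_segment_types = 3
temperature_exponents = (0, -1)  # hypothetical exponents, one matrix each
seed = 42

# One standard-normal matrix per temperature exponent, sampled from the seed.
rng = np.random.default_rng(seed)
matrices = [
    rng.standard_normal((num_segment_types, num_segment_types))
    for _ in temperature_exponents
]

# These are also the shapes required when passing explicit arrays instead:
# exactly one (num_segment_types, num_segment_types) array per exponent.
assert len(matrices) == len(temperature_exponents)
assert all(m.shape == (num_segment_types, num_segment_types) for m in matrices)
```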

Examples

>>> import torch
>>> from importlib.resources import files
>>> import cosmolayer as cl
>>> from cosmolayer import cosmosac
>>> model = cosmosac.CosmoSac2010Model
>>> module = cl.LogGammaLightningModule(
...     num_segment_types=model.num_segment_types,
...     temperature_exponents=model.temperature_exponents,
...     area_per_segment=model.area_per_segment,
... )
>>> solute_path = files("cosmolayer.data") / "NCCO.cosmo"
>>> solvent_path = files("cosmolayer.data") / "O.cosmo"
>>> datapoint = cosmosac.CosmoSacMixtureDatapoint(
...     cosmo_files=[solute_path, solvent_path],
...     mole_fractions=[0.2, 0.8],
...     temperature=298.15,
...     targets=[-0.2, 0.02],
...     model=model,
... )
>>> single_inputs = datapoint.get_inputs()
>>> batched_inputs = tuple(x.unsqueeze(0) for x in single_inputs)
>>> preds = module(batched_inputs)
>>> preds.shape
torch.Size([1, 2])

Methods

configure_optimizers()[source]#

Configure the optimizer used during training.

Returns:

Adam optimizer over all module parameters.

Return type:

torch.optim.Optimizer
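
Based on the documented defaults, the returned optimizer corresponds to a plain Adam over all module parameters; a minimal sketch with a stand-in parameter list:

```python
import torch

# Stand-in for the module's learnable parameters (e.g. interaction matrices).
params = [torch.nn.Parameter(torch.zeros(3, 3))]

# Adam with the documented defaults: learning_rate=1e-3, weight_decay=0.0.
optimizer = torch.optim.Adam(params, lr=1e-3, weight_decay=0.0)

assert isinstance(optimizer, torch.optim.Optimizer)
```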

forward(inputs)[source]#

Compute predictions for a minibatch of datapoints.

Parameters:

inputs (InputsType) – Batched input tuple (temperature, mole_fractions, areas, volumes, probabilities). All tensors must be batch-first and represent the same minibatch of size b.

Returns:

Batched predictions with leading dimension b.

Return type:

torch.Tensor

on_fit_start()[source]#

Called at the very beginning of fit.

When running under DDP, this hook is called on every process.

predict_from_log_gamma(T, x, log_gamma)[source]#

Convert log-activity coefficients to final predictions.

Parameters:
  • T (torch.Tensor) – Temperature in the same units as the reference temperature. Shape: (…,).

  • x (torch.Tensor) – Mole fractions of the components. Must sum to 1. Shape: (…, num_components).

  • log_gamma (torch.Tensor) – Logarithms of the activity coefficients. Shape: (…, num_components).

Returns:

Final predictions.

Return type:

torch.Tensor
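
As a further illustration, a hypothetical override could return the activity coefficients themselves rather than their logarithms; shown here as a standalone function over plain tensors rather than a subclass:

```python
import torch

def predict_activity_coefficients(T, x, log_gamma):
    """Hypothetical override body: gamma = exp(log_gamma), per component."""
    return torch.exp(log_gamma)

T = torch.full((4,), 298.15)        # batch of 4 temperatures, shape (4,)
x = torch.tensor([[0.2, 0.8]] * 4)  # batch-first mole fractions, shape (4, 2)
log_gamma = torch.zeros(4, 2)       # ideal mixture: all log-gammas are zero

preds = predict_activity_coefficients(T, x, log_gamma)
assert preds.shape == (4, 2)  # leading dimension is the batch size
```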

test_step(batch, batch_idx)[source]#

Run one test step on a minibatch and update regression metrics.

Parameters:
  • batch (tuple[InputsType, torch.Tensor]) – Batched inputs and batched ground-truth targets. Targets must have the same shape as the model predictions, with leading dimension equal to the minibatch size.

  • batch_idx (int) – Index of the current batch.

Returns:

Test loss for the batch.

Return type:

torch.Tensor

training_step(batch, batch_idx)[source]#

Run one training step on a minibatch.

Parameters:
  • batch (tuple[InputsType, torch.Tensor]) – Batched inputs and batched ground-truth targets. Targets must have the same shape as the model predictions, with leading dimension equal to the minibatch size.

  • batch_idx (int) – Index of the current batch.

Returns:

Training loss for the batch.

Return type:

torch.Tensor
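
Per the loss_function parameter, the loss in each step is a torch.nn.functional function selected by name; a minimal sketch of that lookup (assuming a simple getattr resolution, which the documented constraint implies but does not spell out):

```python
import torch
import torch.nn.functional as F

loss_function = "mse_loss"  # must name a valid torch.nn.functional loss
loss_fn = getattr(F, loss_function)

preds = torch.tensor([[0.0, 0.0]])    # batched predictions, shape (1, 2)
targets = torch.tensor([[1.0, 1.0]])  # targets with the same shape

loss = loss_fn(preds, targets)  # mean squared error over all elements
assert torch.isclose(loss, torch.tensor(1.0))
```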

validation_step(batch, batch_idx)[source]#

Run one validation step on a minibatch.

Parameters:
  • batch (tuple[InputsType, torch.Tensor]) – Batched inputs and batched ground-truth targets. Targets must have the same shape as the model predictions, with leading dimension equal to the minibatch size.

  • batch_idx (int) – Index of the current batch.

Returns:

Validation loss for the batch.

Return type:

torch.Tensor