CosmoSacMixtureDatapoint

class cosmolayer.cosmosac.CosmoSacMixtureDatapoint(cosmo_files, mole_fractions, temperature, targets=None, model=Model(min_sigma=-0.025, max_sigma=0.025, num_points=51, area_per_segment=7.25, averaging_radius=np.float64(1.5191269449366247), f_decay=3.57, sigma_0=0.007, merge_profiles=False, temperature_exponents=(1, 3)))

Subclass of MixtureDatapoint for COSMO-SAC mixtures.

Parameters:
  • cosmo_files (Sequence[os.PathLike[str]]) – Paths to COSMO files, one per component. Order must match mole_fractions and rows of component_targets.

  • mole_fractions (Sequence[float]) – Mole fractions for each component (should sum to 1).

  • temperature (float) – Temperature in Kelvin.

  • targets (Sequence[float] | None, optional) – Target values for the mixture (e.g. activity coefficients, excess properties). Length defines the number of training targets. If None, no training targets are stored.

  • model (Model) – COSMO-SAC model used to load components and compute probabilities.

Raises:

ValueError – If the number of mole fractions does not match the number of COSMO files.
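The documented length check can be pictured with a minimal sketch. The helper `check_mixture_inputs` below is hypothetical and not part of cosmolayer; it only mirrors the stated rule (one mole fraction per COSMO file) and additionally reports whether the fractions sum to 1, which the parameter description recommends but does not say is enforced.

```python
import math
from typing import Sequence


def check_mixture_inputs(
    cosmo_files: Sequence[object], mole_fractions: Sequence[float]
) -> bool:
    """Hypothetical helper, not part of cosmolayer.

    Raises ValueError on a length mismatch (the documented failure mode) and
    returns whether the mole fractions sum to 1 within a small tolerance.
    """
    if len(cosmo_files) != len(mole_fractions):
        raise ValueError(
            f"Got {len(cosmo_files)} COSMO files "
            f"but {len(mole_fractions)} mole fractions."
        )
    return math.isclose(sum(mole_fractions), 1.0, abs_tol=1e-9)


files = ["C=C(N)O.cosmo", "NCCO.cosmo"]

# Lengths match and the fractions sum to 1.
ok = check_mixture_inputs(files, [0.5, 0.5])

# One fraction for two files triggers the ValueError.
try:
    check_mixture_inputs(files, [1.0])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```

Running a check like this before constructing datapoints surfaces malformed rows early, before any COSMO file is parsed.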

Examples

Build a binary mixture datapoint from packaged COSMO files and read inputs and targets:

>>> from importlib.resources import files
>>> from cosmolayer.cosmosac import CosmoSac2002Model
>>> from cosmolayer.cosmosac.datapoint import CosmoSacMixtureDatapoint
>>> data = files("cosmolayer.data")
>>> cosmo_files = [data / "C=C(N)O.cosmo", data / "NCCO.cosmo"]
>>> mole_fractions = [0.5, 0.5]
>>> temperature = 298.15
>>> targets = [1.2]
>>> dp = CosmoSacMixtureDatapoint(
...     cosmo_files, mole_fractions, temperature,
...     targets, CosmoSac2002Model,
... )
>>> dp.temperature
298.15
>>> dp.mole_fractions.shape
(2,)
>>> dp.areas.shape, dp.volumes.shape
((2,), (2,))
>>> dp.probabilities.shape
(2, 51)
>>> dp.targets.shape
(1,)
>>> dp.num_components, dp.num_segment_types
(2, 51)
>>> dp.num_targets
1

Methods

classmethod from_series(series, cosmo_files, mole_fractions, temperature, targets, model=Model(min_sigma=-0.025, max_sigma=0.025, num_points=51, area_per_segment=7.25, averaging_radius=np.float64(1.5191269449366247), f_decay=3.57, sigma_0=0.007, merge_profiles=False, temperature_exponents=(1, 3)))

Build a mixture datapoint from one row of a DataFrame (as a Series).

This method is useful for building MixtureTrainingDataset instances from a pandas DataFrame using pandas.DataFrame.apply.

Each column specifier is either a column name (a string), in which case the value is read from series[key], or a literal number or path (a float or os.PathLike), which is converted to a string and used as-is. This lets you combine DataFrame columns with fixed values (e.g. the same solvent, temperature, or mole fractions for every datapoint).
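The column-specifier rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the cosmolayer implementation: `resolve_spec` is a hypothetical helper, plain dicts stand in for pandas Series (both support `row[key]` lookup), the file names `a.cosmo` and `b.cosmo` are placeholders, and literals are passed through unchanged, leaving out any internal conversion the library performs.

```python
import os
from pathlib import Path
from typing import Any, Mapping, Union

# A specifier is a column name (str) or a literal value (number or path).
Spec = Union[str, float, os.PathLike]


def resolve_spec(row: Mapping[str, Any], spec: Spec) -> Any:
    """Hypothetical resolver, not the cosmolayer implementation."""
    if isinstance(spec, str):
        return row[spec]  # column name: look the value up in the row
    return spec  # literal number or path: use as given


# Two rows that share a fixed solvent file and a fixed temperature.
rows = [
    {"file_a": Path("a.cosmo"), "target_1": 1.2},
    {"file_a": Path("b.cosmo"), "target_1": 0.9},
]
solvent = Path("NCCO.cosmo")

resolved = [
    {
        "cosmo_files": [resolve_spec(r, "file_a"), resolve_spec(r, solvent)],
        "temperature": resolve_spec(r, 298.15),
        "targets": [resolve_spec(r, "target_1")],
    }
    for r in rows
]
```

Under this rule a path passed as a plain str would be read as a column name, which is presumably why the parameter list asks for pathlib.Path when a literal file path is intended.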

Examples

>>> from importlib.resources import files
>>> from pathlib import Path
>>> import pandas as pd
>>> from cosmolayer.cosmosac import CosmoSacMixtureDatapoint
>>> data = Path(str(files("cosmolayer") / "data"))
>>> row = pd.Series(
...     {
...         "file_a": data / "C=C(N)O.cosmo",
...         "target_1": 1.2,
...     }
... )
>>> point = CosmoSacMixtureDatapoint.from_series(
...     series=row,
...     cosmo_files=["file_a", data / "NCCO.cosmo"],
...     mole_fractions=[0.25, 0.75],
...     temperature=298.15,
...     targets=["target_1"],
... )
>>> point.num_components, point.num_targets
(2, 1)
>>> point.mole_fractions.tolist()
[0.25, 0.75]

Parameters:
  • series (pd.Series) – One row of a DataFrame (e.g. from df.iloc[i] or df.iterrows()).

  • cosmo_files (Sequence[str | pathlib.Path]) – For each component, either a column name (str) or a path to a COSMO file (pathlib.Path).

  • mole_fractions (Sequence[str | float]) – For each component, either a column name (str) or a literal mole fraction (float). Values should sum to 1.

  • temperature (str | float) – Column name for temperature in Kelvin, or a literal temperature.

  • targets (Sequence[str | float]) – For each target, either a column name (str) or a literal value (float).

  • model (Model, optional) – COSMO-SAC model used to load components. Default is CosmoSac2010Model.

Returns:

A datapoint built from the series values.

Return type:

CosmoSacMixtureDatapoint