Models API
Embedder
The main SIMBA model for embedding MS/MS spectra.
- class simba.core.models.transformers.embedder.Embedder(d_model, n_layers, dropout=0.1, weights=None, lr=None, use_element_wise=True, use_cosine_distance=True, use_adduct=False, categorical_adducts=False, adduct_mass_map='', use_ce=False, use_ion_activation=False, use_ion_method=False)[source]
Bases:
LightningModule
Embeds spectra. The model receives a set of pairs of molecules and trains the similarity model on them.
- __init__(d_model, n_layers, dropout=0.1, weights=None, lr=None, use_element_wise=True, use_cosine_distance=True, use_adduct=False, categorical_adducts=False, adduct_mass_map='', use_ce=False, use_ion_activation=False, use_ion_method=False)[source]
Initialize the Embedder.
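The `use_cosine_distance` flag suggests that pairs of embeddings may be compared by cosine similarity. The snippet below is only a minimal sketch of that idea in plain PyTorch, not SIMBA's training code: the embedding values, the target similarities, and the choice of a mean-squared-error loss are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

# Hypothetical embeddings for two sides of each pair, shape (n_pairs, d_model).
emb_a = torch.tensor([[1.0, 0.0], [1.0, 1.0]])
emb_b = torch.tensor([[2.0, 0.0], [1.0, -1.0]])

# One cosine similarity per pair; values lie in [-1, 1].
similarity = F.cosine_similarity(emb_a, emb_b, dim=1)

# Hypothetical target similarities for the same pairs (assumption).
target = torch.tensor([1.0, 0.0])

# Assumed regression loss between predicted and target similarity.
loss = F.mse_loss(similarity, target)
```

Cosine similarity ignores vector length, so the first pair scores as identical even though the two embeddings differ in magnitude.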
Spectrum Transformer Encoder
- class simba.core.models.transformers.spectrum_transformer_encoder_custom.SpectrumTransformerEncoderCustom(*args, use_adduct: bool = False, categorical_adducts: bool = False, adduct_mass_map: str = '', use_ce: bool = False, use_ion_activation: bool = False, use_ion_method: bool = False, **kwargs)[source]
Bases:
SpectrumTransformerEncoder
- __init__(*args, use_adduct: bool = False, categorical_adducts: bool = False, adduct_mass_map: str = '', use_ce: bool = False, use_ion_activation: bool = False, use_ion_method: bool = False, **kwargs)[source]
Custom Spectrum Transformer Encoder with optional metadata usage.
- use_adduct
use adduct info during training
- Type:
bool
- categorical_adducts
convert adduct mass to vector
- Type:
bool
- adduct_mass_map
file that maps adduct masses to vectors
- Type:
str
- use_ce
use collision energy during training
- Type:
bool
- use_ion_activation
use ion activation info during training
- Type:
bool
- use_ion_method
use ionization method during training
- Type:
bool
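One way to keep these metadata options together is a small configuration object whose fields mirror the keyword arguments above. The sketch below uses only the standard library; `MetadataFlags` and its `enabled` helper are hypothetical names introduced for illustration and are not part of SIMBA.

```python
from dataclasses import asdict, dataclass


@dataclass
class MetadataFlags:
    """Hypothetical container; field names mirror the keyword arguments above."""

    use_adduct: bool = False
    categorical_adducts: bool = False
    adduct_mass_map: str = ""
    use_ce: bool = False
    use_ion_activation: bool = False
    use_ion_method: bool = False

    def enabled(self) -> list[str]:
        """Return the names of the boolean flags that are switched on."""
        return [name for name, value in asdict(self).items() if value is True]


# Enable adduct and collision-energy metadata; everything else keeps its default.
flags = MetadataFlags(use_adduct=True, use_ce=True)

# Plain dict of keyword arguments with the same names as the constructor's.
encoder_kwargs = asdict(flags)
```

The resulting dictionary could be unpacked as `**encoder_kwargs` alongside whatever positional arguments the base encoder expects. Those arguments are not documented in this section, so the constructor call itself is left out.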
- precursor_hook(mz_array: Tensor, intensity_array: Tensor, **kwargs: dict)[source]
Define how additional information in the batch may be used.
Overwrite this method to define custom functionality dependent on information in the batch. Examples would be to incorporate any combination of the mass, charge, retention time, or ion mobility of a precursor ion.
The representation returned by this method is prepended to the peak representations that are fed into the Transformer encoder and ultimately contributes to the spectrum representation, which is the first element of the sequence in the model output.
By default, this method returns a tensor of zeros.
- Parameters:
mz_array (torch.Tensor of shape (n_spectra, n_peaks)) – The zero-padded m/z dimension for a batch of mass spectra.
intensity_array (torch.Tensor of shape (n_spectra, n_peaks)) – The zero-padded intensity dimension for a batch of mass spectra.
**kwargs (dict) – The additional data passed with the batch.
- Returns:
The precursor representations.
- Return type:
torch.Tensor of shape (batch_size, d_model)
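The override pattern described for `precursor_hook` can be sketched with a toy stand-in class. The code below does not use SIMBA or its base encoder: `ToyEncoder`, `CollisionEnergyEncoder`, the linear projections, and the `ce` batch key are all hypothetical, chosen only to show the default zero tensor, a metadata-dependent override, and the prepending step.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for the real encoder base class (not SIMBA code)."""

    def __init__(self, d_model: int) -> None:
        super().__init__()
        self.d_model = d_model
        # Toy peak embedding: project each (m/z, intensity) pair to d_model.
        self.peak_proj = nn.Linear(2, d_model)

    def precursor_hook(self, mz_array, intensity_array, **kwargs):
        # Default behaviour: a tensor of zeros, one row per spectrum.
        return torch.zeros(mz_array.shape[0], self.d_model)

    def forward(self, mz_array, intensity_array, **kwargs):
        peaks = self.peak_proj(torch.stack([mz_array, intensity_array], dim=-1))
        precursor = self.precursor_hook(mz_array, intensity_array, **kwargs)
        # Prepend the precursor representation as the first sequence element.
        return torch.cat([precursor.unsqueeze(1), peaks], dim=1)


class CollisionEnergyEncoder(ToyEncoder):
    """Override the hook to embed a per-spectrum collision energy."""

    def __init__(self, d_model: int) -> None:
        super().__init__(d_model)
        self.ce_proj = nn.Linear(1, d_model)

    def precursor_hook(self, mz_array, intensity_array, **kwargs):
        # "ce" is an assumed batch key holding one value per spectrum.
        ce = kwargs["ce"].float().unsqueeze(-1)  # (n_spectra, 1)
        return self.ce_proj(ce)  # (n_spectra, d_model)


# Two zero-padded spectra with up to three peaks each.
mz = torch.tensor([[100.0, 200.0, 0.0], [150.0, 0.0, 0.0]])
intensity = torch.tensor([[0.5, 1.0, 0.0], [1.0, 0.0, 0.0]])
ce = torch.tensor([20.0, 35.0])

out = CollisionEnergyEncoder(d_model=8)(mz, intensity, ce=ce)
default_out = ToyEncoder(d_model=8).precursor_hook(mz, intensity)
```

Here `out` has one more sequence position than there are peaks, because the hook's output occupies position 0, while `default_out` shows the all-zero default.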