dsipts.models.Samformer module

class dsipts.models.Samformer.Samformer(out_channels, past_steps, future_steps, past_channels, future_channels, embs, hidden_size, use_revin, rho=0.5, dropout_rate=0.1, activation='', persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Bases: Base

Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198

Parameters:
  • out_channels (int) – number of variables to be predicted

  • past_steps (int) – Lookback window length

  • future_steps (int) – Horizon window length

  • past_channels (int) – number of past variables

  • future_channels (int) – number of future auxiliary variables

  • embs (List[int]) – list of embeddings

  • hidden_size (int) – first embedding size of the model (‘r’ in the paper)

  • use_revin (bool) – whether to apply RevIN (reversible instance normalization) to the input series

  • rho (float, optional) – neighborhood radius for Sharpness-Aware Minimization (SAM). Defaults to 0.5.

  • dropout_rate (float, optional) – dropout rate. Defaults to 0.1.

  • activation (str, optional) – activation function to use, e.g. ‘nn.GELU’. Defaults to ‘’.

  • persistence_weight (float, optional) – Defaults to 0.0.

  • loss_type (str, optional) – Defaults to ‘l1’.

  • quantiles (List[float], optional) – Defaults to []. NOT USED

  • optim (Union[str,None], optional) – Defaults to None.

  • optim_config (Union[dict,None], optional) – Defaults to None.

  • scheduler_config (Union[dict,None], optional) – Defaults to None.
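As a sketch, the constructor arguments above can be collected in a plain dictionary before instantiation. All concrete values below are illustrative assumptions for the example, not tuned or recommended defaults:

```python
# Illustrative hyper-parameter dictionary for Samformer.
# Every concrete value here is an assumption for the sketch.
samformer_config = {
    "out_channels": 1,        # one target variable to predict
    "past_steps": 96,         # lookback window length
    "future_steps": 24,       # forecasting horizon
    "past_channels": 3,       # numerical past variables
    "future_channels": 0,     # future covariates are not supported by this model
    "embs": [],               # categorical embeddings are not supported by this model
    "hidden_size": 64,        # 'r' in the paper
    "use_revin": True,        # apply RevIN normalization
    "rho": 0.5,               # SAM neighborhood radius
    "dropout_rate": 0.1,
    "loss_type": "l1",
}

# model = Samformer(**samformer_config)  # requires dsipts to be installed
```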

handle_multivariate = True
handle_future_covariates = False
handle_categorical_variables = False
handle_quantile_loss = False
description = 'Can   handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
__init__(out_channels, past_steps, future_steps, past_channels, future_channels, embs, hidden_size, use_revin, rho=0.5, dropout_rate=0.1, activation='', persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198

Parameters:
  • out_channels (int) – number of variables to be predicted

  • past_steps (int) – Lookback window length

  • future_steps (int) – Horizon window length

  • past_channels (int) – number of past variables

  • future_channels (int) – number of future auxiliary variables

  • embs (List[int]) – list of embeddings

  • hidden_size (int) – first embedding size of the model (‘r’ in the paper)

  • use_revin (bool) – whether to apply RevIN (reversible instance normalization) to the input series

  • rho (float, optional) – neighborhood radius for Sharpness-Aware Minimization (SAM). Defaults to 0.5.

  • dropout_rate (float, optional) – dropout rate. Defaults to 0.1.

  • activation (str, optional) – activation function to use, e.g. ‘nn.GELU’. Defaults to ‘’.

  • persistence_weight (float, optional) – Defaults to 0.0.

  • loss_type (str, optional) – Defaults to ‘l1’.

  • quantiles (List[float], optional) – Defaults to []. NOT USED

  • optim (Union[str,None], optional) – Defaults to None.

  • optim_config (Union[dict,None], optional) – Defaults to None.

  • scheduler_config (Union[dict,None], optional) – Defaults to None.

forward(batch)[source]

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.

  • x_num_past: the numerical past variables. Always present.

  • x_num_future: the numerical future variables.

  • x_cat_past: the categorical past variables.

  • x_cat_future: the categorical future variables.

  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.Tensor
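To make the expected batch layout concrete, the sketch below builds a batch dictionary whose values are shape tuples standing in for tensors of those shapes. The batch size, window lengths, and channel counts are illustrative assumptions:

```python
# Illustrative batch structure for forward(); shape tuples stand in
# for torch tensors of those shapes. All sizes are assumptions.
batch_size, past_steps, future_steps = 32, 96, 24
past_channels, out_channels = 3, 1

batch = {
    "y": (batch_size, future_steps, out_channels),          # target, always present
    "x_num_past": (batch_size, past_steps, past_channels),  # numerical past, always present
    "idx_target": [0],  # position of the target feature(s) in x_num_past
}
# Optional keys (unused by Samformer, which handles neither future nor
# categorical covariates): x_num_future, x_cat_past, x_cat_future
```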