dsipts.models.Samformer module

class dsipts.models.Samformer.Samformer(out_channels, past_steps, future_steps, past_channels, future_channels, embs, hidden_size, use_revin, rho=0.5, dropout_rate=0.1, activation='', persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Bases: Base

Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198

Parameters:
  • out_channels (int) – number of variables to be predicted

  • past_steps (int) – Lookback window length

  • future_steps (int) – Horizon window length

  • past_channels (int) – number of past variables

  • future_channels (int) – number of future auxiliary variables

  • embs (List[int]) – list of embeddings

  • hidden_size (int) – first embedding size of the model (‘r’ in the paper)

  • use_revin (bool) – whether to apply RevIN (reversible instance normalization) to the input series

  • rho (float, optional) – neighborhood radius for Sharpness-Aware Minimization (SAM). Defaults to 0.5.

  • dropout_rate (float, optional) – dropout rate. Defaults to 0.1.

  • activation (str, optional) – activation function to use, e.g. ‘nn.GELU’. Defaults to ‘’.

  • persistence_weight (float, optional) – Defaults to 0.0.

  • loss_type (str, optional) – Defaults to ‘l1’.

  • quantiles (List[float], optional) – Defaults to []. NOT USED

  • optim (Union[str,None], optional) – Defaults to None.

  • optim_config (Union[dict,None], optional) – Defaults to None.

  • scheduler_config (Union[dict,None], optional) – Defaults to None.
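As a sketch, the constructor arguments above can be collected in a plain dictionary before instantiation. All concrete values below are illustrative assumptions for the example, not tuned or recommended defaults:

```python
# Illustrative hyper-parameter dictionary for Samformer.
# Every concrete value here is an assumption for the sketch.
samformer_config = {
    "out_channels": 1,        # one target variable to predict
    "past_steps": 96,         # lookback window length
    "future_steps": 24,       # forecasting horizon
    "past_channels": 3,       # numerical past variables
    "future_channels": 0,     # future covariates are not supported by this model
    "embs": [],               # categorical embeddings are not supported by this model
    "hidden_size": 64,        # 'r' in the paper
    "use_revin": True,        # apply RevIN normalization
    "rho": 0.5,               # SAM neighborhood radius
    "dropout_rate": 0.1,
    "loss_type": "l1",
}

# model = Samformer(**samformer_config)  # requires dsipts to be installed
```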

handle_multivariate = True
handle_future_covariates = False
handle_categorical_variables = False
handle_quantile_loss = False
description = 'Can   handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
__init__(out_channels, past_steps, future_steps, past_channels, future_channels, embs, hidden_size, use_revin, rho=0.5, dropout_rate=0.1, activation='', persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198

Parameters:
  • out_channels (int) – number of variables to be predicted

  • past_steps (int) – Lookback window length

  • future_steps (int) – Horizon window length

  • past_channels (int) – number of past variables

  • future_channels (int) – number of future auxiliary variables

  • embs (List[int]) – list of embeddings

  • hidden_size (int) – first embedding size of the model (‘r’ in the paper)

  • use_revin (bool) – whether to apply RevIN (reversible instance normalization) to the input series

  • rho (float, optional) – neighborhood radius for Sharpness-Aware Minimization (SAM). Defaults to 0.5.

  • dropout_rate (float, optional) – dropout rate. Defaults to 0.1.

  • activation (str, optional) – activation function to use, e.g. ‘nn.GELU’. Defaults to ‘’.

  • persistence_weight (float, optional) – Defaults to 0.0.

  • loss_type (str, optional) – Defaults to ‘l1’.

  • quantiles (List[float], optional) – Defaults to []. NOT USED

  • optim (Union[str,None], optional) – Defaults to None.

  • optim_config (Union[dict,None], optional) – Defaults to None.

  • scheduler_config (Union[dict,None], optional) – Defaults to None.

forward(batch)[source]

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.

  • x_num_past: the numerical past variables. Always present.

  • x_num_future: the numerical future variables.

  • x_cat_past: the categorical past variables.

  • x_cat_future: the categorical future variables.

  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.Tensor
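To make the expected batch layout concrete, the sketch below builds a batch dictionary whose values are shape tuples standing in for tensors of those shapes. The batch size, window lengths, and channel counts are illustrative assumptions:

```python
# Illustrative batch structure for forward(); shape tuples stand in
# for torch tensors of those shapes. All sizes are assumptions.
batch_size, past_steps, future_steps = 32, 96, 24
past_channels, out_channels = 3, 1

batch = {
    "y": (batch_size, future_steps, out_channels),          # target, always present
    "x_num_past": (batch_size, past_steps, past_channels),  # numerical past, always present
    "idx_target": [0],  # position of the target feature(s) in x_num_past
}
# Optional keys (unused by Samformer, which handles neither future nor
# categorical covariates): x_num_future, x_cat_past, x_cat_future
```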