dsipts package

Subpackages

Module contents

class dsipts.Autoformer(label_len: int, d_model: int, dropout_rate: float, kernel_size: int, activation: str = 'torch.nn.ReLU', factor: float = 0.5, n_head: int = 1, n_layer_encoder: int = 2, n_layer_decoder: int = 2, hidden_size: int = 1048, **kwargs)

Bases: Base

Autoformer from https://github.com/cure-lab/LTSF-Linear

Parameters:
  • label_len (int) – see the original implementation; it appears to be a warm-up length (the decoder also produces some past predictions that are filtered out at the end)

  • d_model (int) – embedding dimension of the attention layer

  • dropout_rate (float) – dropout rate

  • kernel_size (int) – kernel size

  • activation (str, optional) – activation function to use. Defaults to ‘torch.nn.ReLU’.

  • factor (float, optional) – parameter of .autoformer.layers.AutoCorrelation used to find the top k. Defaults to 0.5.

  • n_head (int, optional) – number of heads. Defaults to 1.

  • n_layer_encoder (int, optional) – number of encoder layers. Defaults to 2.

  • n_layer_decoder (int, optional) – number of decoder layers. Defaults to 2.

  • hidden_size (int, optional) – output dimension of the transformer layer. Defaults to 1048.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Base(verbose: bool, past_steps: int, future_steps: int, past_channels: int, future_channels: int, out_channels: int, embs_past: List[int], embs_fut: List[int], n_classes: int = 0, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], reduction_mode: str = 'mean', use_classical_positional_encoder: bool = False, emb_dim: int = 16, optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None)

Bases: LightningModule

This is the base model: each implemented model must override the init and forward methods. The inference step is optional; by default it uses the forward method, but for recurrent networks you should implement your own.

Parameters:
  • verbose (bool) – Flag to enable verbose logging.

  • past_steps (int) – Number of past time steps to consider.

  • future_steps (int) – Number of future time steps to predict.

  • past_channels (int) – Number of channels in the past input data.

  • future_channels (int) – Number of channels in the future input data.

  • out_channels (int) – Number of output channels.

  • embs_past (List[int]) – List of embedding dimensions for past data.

  • embs_fut (List[int]) – List of embedding dimensions for future data.

  • n_classes (int, optional) – Number of classes for classification. Defaults to 0.

  • persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.

  • loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.

  • quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.

  • reduction_mode (str, optional) – Mode for reduction for categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.

  • use_classical_positional_encoder (bool, optional) – Flag to use the classical positional encoder; otherwise an embedding layer is also used for the positions. Defaults to False.

  • emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.

  • optim (Union[str, None], optional) – Optimizer type. Defaults to None.

  • optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.

  • scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.

Raises:
  • AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.

  • AssertionError – If the number of output channels is not 1 for classification tasks.
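
A hedged sketch of this contract (the attribute and batch-key names are assumptions taken from the descriptions on this page, not a verified dsipts API):

   import torch
   from dsipts import Base

   class MyModel(Base):
       # every model overrides __init__ and forward; inference is optional
       def __init__(self, **kwargs):
           super().__init__(**kwargs)                      # Base consumes the common arguments listed above
           # past_steps / future_steps are assumed to be stored by Base
           self.net = torch.nn.Linear(self.past_steps, self.future_steps)

       def forward(self, batch: dict) -> torch.Tensor:
           x = batch['x_num_past'][:, :, 0]                # first numerical past channel, shape [B, past_steps]
           return self.net(x).unsqueeze(-1)                # shape [B, future_steps, 1]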

description = 'Can NOT  handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
abstractmethod forward(batch: dict) tensor

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor
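
For orientation, a minimal sketch of how such a batch could look; the tensor shapes are assumptions inferred from the parameter descriptions on this page, not guaranteed by the library:

   import torch

   B, past_steps, future_steps = 32, 100, 20
   batch = {
       'y':            torch.randn(B, future_steps, 1),                    # target variable(s)
       'x_num_past':   torch.randn(B, past_steps, 3),                      # numerical past variables
       'x_num_future': torch.randn(B, future_steps, 2),                    # numerical future variables
       'x_cat_past':   torch.zeros(B, past_steps, 2, dtype=torch.long),    # categorical past variables
       'x_cat_future': torch.zeros(B, future_steps, 2, dtype=torch.long),  # categorical future variables
       'idx_target':   torch.tensor([0]),                                  # index of the target in the past array
   }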

handle_categorical_variables = False
handle_future_covariates = False
handle_multivariate = False
handle_quantile_loss = False
inference(batch: dict) tensor

Usually it is OK to return the output of the forward method, but sometimes it is not (e.g. for RNNs)

Parameters:

batch (dict) – batch

Returns:

result

Return type:

torch.tensor

class dsipts.Categorical(name: str, frequency: int, duration: List[int], classes: int, action: ActionEnum, level: List[float])

Bases: object

Class for generating toy categorical data

Parameters:
  • name (str) – name of the categorical signal

  • frequency (int) – frequency of the signal

  • duration (List[int]) – duration of each class

  • classes (int) – number of classes

  • action (str) – one of ‘additive’ or ‘multiplicative’

  • level (List[float]) – intensity of each class

generate_signal(length: int) None

Generate the response signal

Parameters:

length (int) – length of the signal

plot() None

Plot the series

class dsipts.CrossFormer(d_model: int, hidden_size: int, n_head: int, seg_len: int, n_layer_encoder: int, win_size: int, factor: int = 10, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', **kwargs)

Bases: Base

CrossFormer from https://openreview.net/forum?id=vSVLM2j9eie

Parameters:
  • d_model (int) – The dimensionality of the model.

  • hidden_size (int) – The size of the hidden layers.

  • n_head (int) – The number of attention heads.

  • seg_len (int) – The length of the segments.

  • n_layer_encoder (int) – The number of layers in the encoder.

  • win_size (int) – The size of the window for attention.

  • factor (int, optional) – see .crossformer.attn.TwoStageAttentionLayer. Defaults to 10.

  • dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.

  • activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.

  • **kwargs – Additional keyword arguments for the parent class.

Returns:

This method does not return a value.

Return type:

None

Raises:

ValueError – If the activation function is not recognized.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.D3VAE(scale=0.1, hidden_size=64, num_layers=2, dropout_rate=0.1, diff_steps=200, loss_type='kl', beta_end=0.01, beta_schedule='linear', channel_mult=2, mult=1, num_preprocess_blocks=1, num_preprocess_cells=3, num_channels_enc=16, arch_instance='res_mbconv', num_latent_per_group=6, num_channels_dec=16, groups_per_scale=2, num_postprocess_blocks=1, num_postprocess_cells=2, beta_start=0, freq='h', **kwargs)

Bases: Base

This is the base model: each implemented model must override the init and forward methods. The inference step is optional; by default it uses the forward method, but for recurrent networks you should implement your own.

Parameters:
  • verbose (bool) – Flag to enable verbose logging.

  • past_steps (int) – Number of past time steps to consider.

  • future_steps (int) – Number of future time steps to predict.

  • past_channels (int) – Number of channels in the past input data.

  • future_channels (int) – Number of channels in the future input data.

  • out_channels (int) – Number of output channels.

  • embs_past (List[int]) – List of embedding dimensions for past data.

  • embs_fut (List[int]) – List of embedding dimensions for future data.

  • n_classes (int, optional) – Number of classes for classification. Defaults to 0.

  • persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.

  • loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.

  • quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.

  • reduction_mode (str, optional) – Mode for reduction for categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.

  • use_classical_positional_encoder (bool, optional) – Flag to use the classical positional encoder; otherwise an embedding layer is also used for the positions. Defaults to False.

  • emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.

  • optim (Union[str, None], optional) – Optimizer type. Defaults to None.

  • optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.

  • scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.

Raises:
  • AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.

  • AssertionError – If the number of output channels is not 1 for classification tasks.

forward(batch: dict) tensor

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

inference(batch: dict) tensor

Care here: we need to implement this because predicting step N uses the prediction at step N-1. TODO: fix, because the known continuous variables are not yet handled here

Parameters:

batch (dict) – batch of the dataloader

Returns:

result

Return type:

torch.tensor

class dsipts.Diffusion(d_model: int, out_channels: int, past_steps: int, future_steps: int, past_channels: int, future_channels: int, embs: List[int], learn_var: bool, cosine_alpha: bool, diffusion_steps: int, beta: float, gamma: float, n_layers_RNN: int, d_head: int, n_head: int, dropout_rate: float, activation: str, subnet: int, perc_subnet_learning_for_step: float, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[float] = [], optim: str | None = None, optim_config: dict | None = None, scheduler_config: dict | None = None, **kwargs)

Bases: Base

Denoising Diffusion Probabilistic Model

Parameters:
  • d_model (int)

  • out_channels (int) – number of target variables

  • past_steps (int) – size of past window

  • future_steps (int) – size of future window to be predicted

  • past_channels (int) – number of variables available for the past context

  • future_channels (int) – number of variables known in the future, available for forecasting

  • embs (list[int]) – categorical variables dimensions for embeddings

  • learn_var (bool) – Flag to make the model learn the posterior variance (if True) or use the variance of the posterior distribution

  • cosine_alpha (bool) – Flag for the generation of alphas and betas

  • diffusion_steps (int) – number of noising steps for the initial sample

  • beta (float) – starting variable to generate the diffusion perturbations. Ignored if cosine_alpha == True

  • gamma (float) – trade-off variable to balance the loss between noise prediction and NegativeLikelihood/KL_Divergence.

  • n_layers_RNN (int) – param for subnet

  • d_head (int) – param for subnet

  • n_head (int) – param for subnet

  • dropout_rate (float) – param for subnet

  • activation (str) – param for subnet

  • subnet (int) – 1 for the attention subnet, 2 for the linear subnet. Others can be added (wait for Black Friday for discounts)

  • perc_subnet_learning_for_step (float) – percentage controlling how many subnets are trained for each batch. Decrease this value if the loss blows up.

  • persistence_weight (float, optional) – Defaults to 0.0.

  • loss_type (str, optional) – Defaults to ‘l1’.

  • quantiles (List[float], optional) – Only [] accepted. Defaults to [].

  • optim (Union[str,None], optional) – Defaults to None.

  • optim_config (Union[dict,None], optional) – Defaults to None.

  • scheduler_config (Union[dict,None], optional) – Defaults to None.

cat_categorical_vars(batch: dict)

Extracting categorical context about past and future

Parameters:

batch (dict) – Keys checked -> [‘x_cat_past’, ‘x_cat_future’]

Returns:

cat_emb_past, cat_emb_fut

Return type:

List[torch.Tensor, torch.Tensor]

description = 'Can NOT  handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan NOT  handle Quantile loss function'
forward(batch: dict) float

training process of the diffusion network

Parameters:

batch (dict) – variables loaded

Returns:

total loss about the prediction of the noises over all subnets extracted

Return type:

float

gaussian_likelihood(x, mean, var)
gaussian_log_likelihood(x, mean, var)
handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = False
handle_quantile_loss = False
improving_weight_during_training()

Each time we sample from the multinomial we subtract the minimum for more precise sampling, avoiding large learning differences among subnets.

This leads to more stable inference also in early training, mainly for the common context embedding.

For probabilistic reasons, weights have to be > 0, so we subtract min - 1

inference(batch: dict) Tensor

Inference process to forecast future y

Parameters:

batch (dict) – Keys checked [‘x_num_past, ‘idx_target’, ‘x_num_future’, ‘x_cat_past’, ‘x_cat_future’]

Returns:

generated sequence [batch_size, future_steps, num_var]

Return type:

torch.Tensor

normal_kl(mean1, logvar1, mean2, logvar2)

Compute the KL divergence between two Gaussians (also called relative entropy). The KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P: KL(P||Q) = E_P[log(P/Q)] = -E_P[log(Q/P)].

In the context of machine learning, KL(P||Q) is often called the ‘information gain’ achieved if P were used instead of Q, which is currently used.

Shapes are automatically broadcasted, so batches can be compared to scalars, among other use cases.
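
For reference, the standard closed form for the KL divergence between two Gaussians parametrized by mean and log-variance, written as a standalone sketch (not necessarily the exact code of this method):

   import torch

   def normal_kl_reference(mean1, logvar1, mean2, logvar2):
       # KL( N(mean1, exp(logvar1)) || N(mean2, exp(logvar2)) ), broadcasting over shapes
       return 0.5 * (logvar2 - logvar1
                     + (torch.exp(logvar1) + (mean1 - mean2) ** 2) / torch.exp(logvar2)
                     - 1.0)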

q_sample(x_start: Tensor, t: int) List[Tensor]

Diffuse x_start for t diffusion steps.

In other words, sample from q(x_t | x_0).

Also, compute the mean and variance of the diffusion posterior:

q(x_{t-1} | x_t, x_0)

Posterior mean and variance are the ones to be predicted

Parameters:
  • x_start (torch.Tensor) – values to be predicted

  • t (int) – diffusion step

Returns:

q_sample, posterior mean, posterior log variance and the actual noise

Return type:

List[torch.Tensor, torch.Tensor, torch.Tensor]
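
As orientation, the standard DDPM forward-diffusion sample that q_sample performs can be sketched as follows (assuming a precomputed alphas_cumprod tensor; the actual method additionally returns the posterior mean and log variance):

   import torch

   def q_sample_reference(x_start, t, alphas_cumprod):
       # sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)
       noise = torch.randn_like(x_start)
       a_bar = alphas_cumprod[t]
       x_t = torch.sqrt(a_bar) * x_start + torch.sqrt(1.0 - a_bar) * noise
       return x_t, noise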

remove_var(tensor: Tensor, indexes_to_exclude: list, dimension: int) Tensor

Function to remove variables from tensors in chosen dimension and position

Parameters:
  • tensor (torch.Tensor) – starting tensor

  • indexes_to_exclude (list) – indexes along the chosen dimension that we want to exclude

  • dimension (int) – dimension of the tensor on which we want to work (not a list of dims!)

Returns:

new tensor without the chosen variables

Return type:

torch.Tensor
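
A minimal sketch of what this utility does, assuming plain index filtering (the real implementation may differ):

   import torch

   def remove_var_reference(tensor, indexes_to_exclude, dimension):
       # keep every index along `dimension` except those listed in `indexes_to_exclude`
       keep = [i for i in range(tensor.shape[dimension]) if i not in indexes_to_exclude]
       return tensor.index_select(dimension, torch.tensor(keep, device=tensor.device))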

class dsipts.DilatedConv(sum_layers: bool, hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, activation: str = 'torch.nn.ReLU', remove_last=False, dropout_rate: float = 0.1, use_bn: bool = False, use_glu: bool = True, glu_percentage: float = 1.0, **kwargs)

Bases: Base

Custom encoder-decoder

Parameters:
  • sum_layers (bool) – Flag indicating whether to sum the layers.

  • hidden_RNN (int) – Number of hidden units in the RNN.

  • num_layers_RNN (int) – Number of layers in the RNN.

  • kind (str) – Type of RNN to use (e.g., ‘LSTM’, ‘GRU’).

  • kernel_size (int) – Size of the convolutional kernel.

  • activation (str, optional) – Activation function to use. Defaults to ‘torch.nn.ReLU’.

  • remove_last (bool, optional) – Flag to indicate whether to remove the last element in the sequence. Defaults to False.

  • dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.

  • use_bn (bool, optional) – Flag to indicate whether to use batch normalization. Defaults to False.

  • use_glu (bool, optional) – Flag to indicate whether to use Gated Linear Units (GLU). Defaults to True.

  • glu_percentage (float, optional) – Percentage of GLU to apply. Defaults to 1.0.

  • **kwargs – Additional keyword arguments.

Returns:

None

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

It is mandatory to implement this method

Parameters:

batch (dict) – batch of the dataloader

Returns:

result

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
inference(batch: dict) tensor

Usually it is OK to return the output of the forward method, but sometimes it is not (e.g. for RNNs)

Parameters:

batch (dict) – batch

Returns:

result

Return type:

torch.tensor

class dsipts.DilatedConvED(sum_layers: bool, hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, dropout_rate: float = 0.1, use_bn: bool = False, use_cumsum: bool = True, use_bilinear: bool = False, activation: str = 'torch.nn.ReLU', **kwargs)

Bases: Base

Initialize the model with specified parameters.

Parameters:
  • sum_layers (bool) – Flag indicating whether to sum layers in the encoder/decoder blocks.

  • hidden_RNN (int) – Number of hidden units in the RNN.

  • num_layers_RNN (int) – Number of layers in the RNN.

  • kind (str) – Type of RNN to use (‘lstm’ or ‘gru’).

  • kernel_size (int) – Size of the convolutional kernel.

  • dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.

  • use_bn (bool, optional) – Flag to use batch normalization. Defaults to False.

  • use_cumsum (bool, optional) – Flag to use cumulative sum. Defaults to True.

  • use_bilinear (bool, optional) – Flag to use bilinear layers. Defaults to False.

  • activation (str, optional) – Activation function to use. Defaults to ‘torch.nn.ReLU’.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If the specified activation function is not recognized or if the kind is not ‘lstm’ or ‘gru’.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

It is mandatory to implement this method

Parameters:

batch (dict) – batch of the dataloader

Returns:

result

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Duet(factor: int, d_model: int, n_head: int, n_layer: int, CI: bool, d_ff: int, noisy_gating: bool, num_experts: int, kernel_size: int, hidden_size: int, k: int, dropout_rate: float = 0.1, activation: str = '', **kwargs)

Bases: Base

Initializes the model with the specified parameters. https://github.com/decisionintelligence/DUET

Parameters:
  • factor (int) – The factor for attention scaling. NOT USED, but present in the original implementation.

  • d_model (int) – The dimensionality of the model.

  • n_head (int) – The number of attention heads.

  • n_layer (int) – The number of layers in the encoder.

  • CI (bool) – Perform channel independent operations.

  • d_ff (int) – The dimensionality of the feedforward layer.

  • noisy_gating (bool) – Flag to indicate if noisy gating is used.

  • num_experts (int) – The number of experts in the mixture of experts.

  • kernel_size (int) – The size of the convolutional kernel.

  • hidden_size (int) – The size of the hidden layer.

  • k (int) – The number of clusters for the linear extractor.

  • dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.

  • activation (str, optional) – The activation function to use. Defaults to ‘’.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If the activation function is not recognized.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch: dict) float

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.ITransformer(hidden_size: int, d_model: int, n_head: int, n_layer_decoder: int, use_norm: bool, class_strategy: str = 'projection', dropout_rate: float = 0.1, activation: str = '', **kwargs)

Bases: Base

Initialize the ITransformer model for time series forecasting.

This class implements the Inverted Transformer architecture as described in the paper “ITRANSFORMER: INVERTED TRANSFORMERS ARE EFFECTIVE FOR TIME SERIES FORECASTING” (https://arxiv.org/pdf/2310.06625).

Parameters:
  • hidden_size (int) – The first embedding size of the model (‘r’ in the paper).

  • d_model (int) – The second embedding size (r-tilde in the paper). Should be smaller than hidden_size.

  • n_head (int) – The number of attention heads.

  • n_layer_decoder (int) – The number of layers in the decoder.

  • use_norm (bool) – Flag to indicate whether to use normalization.

  • class_strategy (str, optional) – The strategy for classification, can be ‘projection’, ‘average’, or ‘cls_token’. Defaults to ‘projection’.

  • dropout_rate (float, optional) – The dropout rate for regularization. Defaults to 0.1.

  • activation (str, optional) – The activation function to be used. Defaults to ‘’.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If the activation function is not recognized.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
forward(batch: dict) float

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Informer(d_model: int, hidden_size: int, n_layer_encoder: int, n_layer_decoder: int, mix: bool = True, activation: str = 'torch.nn.ReLU', remove_last=False, attn: str = 'prob', distil: bool = True, factor: int = 5, n_head: int = 1, dropout_rate: float = 0.1, **kwargs)

Bases: Base

Initialize the model with specified parameters. https://github.com/zhouhaoyi/Informer2020/tree/main/models

Parameters:
  • d_model (int) – The dimensionality of the model.

  • hidden_size (int) – The size of the hidden layers.

  • n_layer_encoder (int) – The number of layers in the encoder.

  • n_layer_decoder (int) – The number of layers in the decoder.

  • mix (bool, optional) – Whether to use mixed attention. Defaults to True.

  • activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.

  • remove_last (bool, optional) – Whether to remove the last layer. Defaults to False.

  • attn (str, optional) – The type of attention mechanism to use. Defaults to ‘prob’.

  • distil (bool, optional) – Whether to use distillation. Defaults to True.

  • factor (int, optional) – The factor for attention. Defaults to 5.

  • n_head (int, optional) – The number of attention heads. Defaults to 1.

  • dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If any of the parameters are invalid.

Notes

Ensure that split_params is set up with shift: ${model_configs.future_steps}, as it is required!

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.LinearTS(kernel_size: int, hidden_size: int, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', kind: str = 'linear', use_bn: bool = False, simple: bool = False, **kwargs)

Bases: Base

Initialize the model with specified parameters. Linear model from https://github.com/cure-lab/LTSF-Linear/blob/main/run_longExp.py

Parameters:
  • kernel_size (int) – Kernel dimension for the initial moving average.

  • hidden_size (int) – Hidden size of the linear block.

  • dropout_rate (float, optional) – Dropout rate in Dropout layers. Default is 0.1.

  • activation (str, optional) – Activation function in PyTorch. Default is ‘torch.nn.ReLU’.

  • kind (str, optional) – Type of model, can be ‘linear’, ‘dlinear’ (de-trending), or ‘nlinear’ (differential). Defaults to ‘linear’.

  • use_bn (bool, optional) – If True, Batch Normalization layers will be added and Dropouts will be removed. Default is False.

  • simple (bool, optional) – If True, the model used is the same as illustrated in the paper; otherwise, a more complex model with the same idea is used. Default is False.

  • **kwargs – Additional keyword arguments for the parent class.

Raises:

ValueError – If an invalid activation function is provided.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function\n THE SIMPLE IMPLEMENTATION DOES NOT USE CATEGORICAL NOR FUTURE VARIABLES'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Monash(filename: str, baseUrl: str = 'https://forecastingdata.org/', rebuild: bool = False)

Bases: object

Class for downloading datasets listed here https://forecastingdata.org/

Parameters:
  • filename (str) – name of the class, used for saving

  • baseUrl (str, optional) – url to the source page. Defaults to ‘https://forecastingdata.org/’.

  • rebuild (bool, optional) – if true the table will be loaded from the webpage otherwise it will be loaded from the saved file. Defaults to False.

download_dataset(path: str, id: int, rebuild=False) None

download a specific dataset

Parameters:
  • path (str) – path in which to save the data

  • id (int) – id of the dataset

  • rebuild (bool, optional) – if true the dataset will be re-downloaded. Defaults to False.

generate_dataset(id: int) None | DataFrame

Parse the id-th dataset into a convenient format and return a pandas DataFrame

Parameters:

id (int) – id of the dataset

Returns:

dataframe

Return type:

None or pd.DataFrame

load(filename: str) None

Load a Monash structure

Parameters:

filename (str) – filename to load

save(filename: str) None

Save the Monash structure

Parameters:

filename (str) – name of the file to generate
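
A hedged usage sketch based on the methods above (the dataset id and paths are illustrative):

>>> from dsipts import Monash
>>> m = Monash(filename='monash', rebuild=True)   # scrape the dataset table from forecastingdata.org
>>> m.download_dataset(path='data', id=4)         # download the dataset with id 4
>>> df = m.generate_dataset(id=4)                 # parse it into a pandas DataFrame
>>> m.save('monash')                              # persist the Monash structure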

class dsipts.PatchTST(d_model: int, patch_len: int, kernel_size: int, decomposition: bool = True, activation: str = 'torch.nn.ReLU', n_head: int = 1, n_layer: int = 2, stride: int = 8, remove_last: bool = False, hidden_size: int = 1048, dropout_rate: float = 0.1, **kwargs)

Bases: Base

Initializes the model with specified parameters. https://github.com/yuqinie98/PatchTST/blob/main/

Parameters:
  • d_model (int) – The dimensionality of the model.

  • patch_len (int) – The length of the patches.

  • kernel_size (int) – The size of the kernel for convolutional layers.

  • decomposition (bool, optional) – Whether to use decomposition. Defaults to True.

  • activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.

  • n_head (int, optional) – The number of attention heads. Defaults to 1.

  • n_layer (int, optional) – The number of layers in the model. Defaults to 2.

  • stride (int, optional) – The stride for convolutional layers. Defaults to 8.

  • remove_last (bool, optional) – Whether to remove the last layer. Defaults to False.

  • hidden_size (int, optional) – The size of the hidden layers. Defaults to 1048.

  • dropout_rate (float, optional) – The dropout rate for regularization. Defaults to 0.1.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If the activation function is not recognized.

description = 'Can   handle multivariate output \nCan NOT  handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = False
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Persistent(**kwargs)

Bases: Base

Simple persistent model aligned with all the others

description = 'Can   handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = False
handle_future_covariates = False
handle_multivariate = True
handle_quantile_loss = False
class dsipts.RNN(hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, activation: str = 'torch.nn.ReLU', remove_last=False, dropout_rate: float = 0.1, use_bn: bool = False, num_blocks: int = 4, bidirectional: bool = True, lstm_type: str = 'slstm', **kwargs)

Bases: Base

Initialize a recurrent model with an encoder-decoder structure.

Parameters:
  • hidden_RNN (int) – Hidden size of the RNN block.

  • num_layers_RNN (int) – Number of RNN layers.

  • kind (str) – Type of RNN to use, either ‘gru’, ‘lstm’, or ‘xlstm’.

  • kernel_size (int) – Kernel size in the encoder convolutional block.

  • activation (str, optional) – Activation function from PyTorch. Default is ‘torch.nn.ReLU’.

  • remove_last (bool, optional) – If True, the model learns the difference with respect to the last seen point. Default is False.

  • dropout_rate (float, optional) – Dropout rate in Dropout layers. Default is 0.1.

  • use_bn (bool, optional) – If True, Batch Normalization layers will be added and Dropouts will be removed. Default is False.

  • num_blocks (int, optional) – Number of xLSTM blocks (only for xLSTM). Default is 4.

  • bidirectional (bool, optional) – If True, the RNN is bidirectional. Default is True.

  • lstm_type (str, optional) – Type of LSTM to use (only for xLSTM), either ‘slstm’ or ‘mlstm’. Default is ‘slstm’.

  • **kwargs – Additional keyword arguments.

Raises:

ValueError – If the specified kind is not ‘lstm’, ‘gru’, or ‘xlstm’.

forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.Samformer(hidden_size: int, use_revin: bool, activation: str = '', **kwargs)

Bases: Base

Initialize the model with specified parameters. Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198

Parameters:
  • hidden_size (int) – The size of the hidden layer.

  • use_revin (bool) – Flag indicating whether to use RevIN.

  • activation (str, optional) – The activation function to use. Defaults to ‘’.

  • **kwargs – Additional keyword arguments passed to the parent class.

Raises:

ValueError – If the activation function is not recognized.

description = 'Can   handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
forward(batch: dict) float

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = False
handle_future_covariates = False
handle_multivariate = True
handle_quantile_loss = False
class dsipts.Simple(hidden_size: int, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', **kwargs)

Bases: Base

This is the base model: each implemented model must override the init and forward methods. The inference step is optional; by default it uses the forward method, but for recurrent networks you should implement your own.

Parameters:
  • verbose (bool) – Flag to enable verbose logging.

  • past_steps (int) – Number of past time steps to consider.

  • future_steps (int) – Number of future time steps to predict.

  • past_channels (int) – Number of channels in the past input data.

  • future_channels (int) – Number of channels in the future input data.

  • out_channels (int) – Number of output channels.

  • embs_past (List[int]) – List of embedding dimensions for past data.

  • embs_fut (List[int]) – List of embedding dimensions for future data.

  • n_classes (int, optional) – Number of classes for classification. Defaults to 0.

  • persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.

  • loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.

  • quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.

  • reduction_mode (str, optional) – Mode for reduction for categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.

  • use_classical_positional_encoder (bool, optional) – Flag to use the classical positional encoder; otherwise an embedding layer is also used for the positions. Defaults to False.

  • emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.

  • optim (Union[str, None], optional) – Optimizer type. Defaults to None.

  • optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.

  • scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.

Raises:
  • AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.

  • AssertionError – If the number of output channels is not 1 for classification tasks.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function\n THE SIMPLE IMPLEMENTATION DOES NOT USE CATEGORICAL NOR FUTURE VARIABLES'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.TFT(d_model: int, num_layers_RNN: int, d_head: int, n_head: int, dropout_rate: float, **kwargs)

Bases: Base

Initializes the model for time series forecasting with attention mechanisms and recurrent neural networks.

This model is designed for direct forecasting, allowing for multi-output and multi-horizon predictions. It leverages attention mechanisms to enhance the selection of relevant past time steps and learn long-term dependencies. The architecture includes RNN enrichment, gating mechanisms to minimize the impact of irrelevant variables, and the ability to output prediction intervals through quantile regression.

Key features include:
  • Direct Model: predicts all future steps at once.
  • Multi-Output Forecasting: capable of predicting one or more variables simultaneously.
  • Multi-Horizon Forecasting: predicts variables at multiple future time steps.
  • Attention-Based Mechanism: enhances the selection of relevant past time steps and learns long-term dependencies.
  • RNN Enrichment: utilizes an LSTM for an initial autoregressive approximation, which is refined by the rest of the network.
  • Gating Mechanisms: reduce the contribution of irrelevant variables.
  • Prediction Intervals: outputs percentiles (e.g., 10th, 50th, 90th) at each time step.

The model also facilitates interpretability by identifying:
  • Global importance of variables for both past and future.
  • Temporal patterns.
  • Significant events.

Parameters:
  • d_model (int) – General hidden dimension across the network, adjustable in sub-networks.

  • num_layers_RNN (int) – Number of layers in the recurrent neural network (LSTM).

  • d_head (int) – Dimension of each attention head.

  • n_head (int) – Number of attention heads.

  • dropout_rate (float) – Dropout rate applied uniformly across all dropout layers.

  • **kwargs – Additional keyword arguments for further customization.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch: dict) Tensor

Temporal Fusion Transformer

Collecting data:
  • Extract the autoregressive variable(s).
  • Embed and compute a first approximated prediction.
  • ‘summary_past’ and ‘summary_fut’ collect data about past and future: all the different data are concatenated on dimension 2, which is then mixed through a MEAN over that dimension. The information comes from the other tensors of the batch taken as input.

TFT actual computations:
  • Residual connection for y_past and summary_past
  • Residual connection for y_fut and summary_fut
  • GRN1 for past and for fut
  • ATTENTION(summary_fut, summary_past, y_past)
  • Residual connection for the attention itself
  • GRN2 for the attention
  • Residual connection for the attention and summary_fut
  • Linear for the actual values and reshape

Parameters:

batch (dict) – Keys used are [‘x_num_past’, ‘idx_target’, ‘x_num_future’, ‘x_cat_past’, ‘x_cat_future’]

Returns:

shape [B, self.future_steps, self.out_channels, self.mul] or [B, self.future_steps, self.out_channels] according to quantiles

Return type:

torch.Tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
remove_var(tensor: Tensor, indexes_to_exclude: int, dimension: int) Tensor

Function to remove variables from tensors in chosen dimension and position

Parameters:
  • tensor (torch.Tensor) – starting tensor

  • indexes_to_exclude (int) – index along the chosen dimension that we want to exclude

  • dimension (int) – dimension of the tensor on which we want to work

Returns:

new tensor without the chosen variables

Return type:

torch.Tensor

class dsipts.TIDE(hidden_size: int, d_model: int, n_add_enc: int, n_add_dec: int, dropout_rate: float, activation: str = '', **kwargs)

Bases: Base

Initializes the model with specified parameters for a neural network architecture. Long-term Forecasting with TiDE: Time-series Dense Encoder https://arxiv.org/abs/2304.08424

Parameters:
  • hidden_size (int) – The size of the hidden layers.

  • d_model (int) – The dimensionality of the model.

  • n_add_enc (int) – The number of additional encoder layers.

  • n_add_dec (int) – The number of additional decoder layers.

  • dropout_rate (float) – The dropout rate to be applied in the layers.

  • activation (str, optional) – The activation function to be used. Defaults to an empty string.

  • **kwargs – Additional keyword arguments passed to the parent class.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch: dict) float

Forward method used during the training loop

Parameters:

batch (dict) – variables loaded

Returns:

output of the model

Return type:

float

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
remove_var(tensor: Tensor, indexes_to_exclude: list, dimension: int) Tensor

Function to remove variables from tensors in chosen dimension and position

Parameters:
  • tensor (torch.Tensor) – starting tensor

  • indexes_to_exclude (list) – indexes along the chosen dimension that we want to exclude

  • dimension (int) – dimension of the tensor on which we want to work (not a list of dims!)

Returns:

new tensor without the chosen variables

Return type:

torch.Tensor

class dsipts.TTM(model_path: str, past_steps: int, future_steps: int, freq_prefix_tuning: bool, freq: str, prefer_l1_loss: bool, prefer_longer_context: bool, loss_type: str, num_input_channels, prediction_channel_indices, exogenous_channel_indices, decoder_mode, fcm_context_length, fcm_use_mixer, fcm_mix_layers, fcm_prepend_past, enable_forecast_channel_mixing, out_channels: int, embs: List[int], remove_last=False, optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, verbose=False, use_quantiles=False, persistence_weight: float = 0.0, quantiles: List[int] = [], **kwargs)

Bases: Base

TODO and FIX for future and past categorical variables

Parameters:
  • model_path (str) – _description_

  • past_steps (int) – _description_

  • future_steps (int) – _description_

  • freq_prefix_tuning (bool) – _description_

  • freq (str) – _description_

  • prefer_l1_loss (bool) – _description_

  • loss_type (str) – _description_

  • num_input_channels (_type_) – _description_

  • prediction_channel_indices (_type_) – _description_

  • exogenous_channel_indices (_type_) – _description_

  • decoder_mode (_type_) – _description_

  • fcm_context_length (_type_) – _description_

  • fcm_use_mixer (_type_) – _description_

  • fcm_mix_layers (_type_) – _description_

  • fcm_prepend_past (_type_) – _description_

  • enable_forecast_channel_mixing (_type_) – _description_

  • out_channels (int) – _description_

  • embs (List[int]) – _description_

  • remove_last (bool, optional) – _description_. Defaults to False.

  • optim (Union[str,None], optional) – _description_. Defaults to None.

  • optim_config (dict, optional) – _description_. Defaults to None.

  • scheduler_config (dict, optional) – _description_. Defaults to None.

  • verbose (bool, optional) – _description_. Defaults to False.

  • use_quantiles (bool, optional) – _description_. Defaults to False.

  • persistence_weight (float, optional) – _description_. Defaults to 0.0.

  • quantiles (List[int], optional) – _description_. Defaults to [].

forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y: the target variable(s). Always present.
  • x_num_past: the numerical past variables. Always present.
  • x_num_future: the numerical future variables.
  • x_cat_past: the categorical past variables.
  • x_cat_future: the categorical future variables.
  • idx_target: index of the target features in the past array.

Returns:

output of the model

Return type:

torch.tensor

class dsipts.TimeSeries(name: str, stacked: bool = False)

Bases: object

Class for generating a time series object. If you don’t have any time series, you can build a fake one using some helper classes (Categorical, for instance).

Parameters:
  • name (str) – name of the series

  • stacked (bool) – if true it is a stacked model

Usage:

For example we can generate a toy timeseries:

  • add a multiplicative categorical feature (weekly)

>>> settimana = Categorical('settimanale',1,[1,1,1,1,1,1,1],7,'multiplicative',[0.9,0.8,0.7,0.6,0.5,0.99,0.99])
  • an additive monthly feature (here a year is composed of 5 months)

>>> mese = Categorical('mensile',1,[31,28,20,10,33],5,'additive',[10,20,-10,20,0])
  • a spotted categorical variable that occurs every 100 days and lasts 7 days

>>> spot = Categorical('spot',100,[7],1,'additive',[10])
>>> ts = TimeSeries('prova')
>>> ts.generate_signal(length = 5000,categorical_variables = [settimana,mese,spot],noise_mean=1,type=0) ##we can add also noise
>>> ts.plot()
create_data_loader(data: DataFrame, past_steps: int, future_steps: int, shift: int = 0, keep_entire_seq_while_shifting: bool = False, starting_point: None | dict = None, skip_step: int = 1, is_inference: bool = False) MyDataset

Create the dataset for the training/inference step

Parameters:
  • data (pd.DataFrame) – input dataset, usually a subset of self.data

  • past_steps (int) – past context length

  • future_steps (int) – future lags to predict

  • shift (int, optional) – if > 0 the future input variables (categorical and numerical) will be shifted. For example, for attention models it is better to start with a known value of y and use it during the process. Defaults to 0.

  • keep_entire_seq_while_shifting (bool, optional) – if the dataset is shifted, you may want the future data to be of length future_steps + shift (as in Informer). Defaults to False.

  • starting_point (Union[None, dict], optional) – a dictionary indicating whether a sample must be considered. It is checked on the first lag in the future (useful in case your model has to predict only starting from hour 12, for example). Defaults to None.

  • skip_step (int, optional) – step between two consecutive samples. Usually there is a skip of one between two samples, but for debugging or training-time purposes you can skip some samples. Defaults to 1.

Returns:

class that extends torch.utils.data.Dataset (see utils)

keys of a batch:
  • y: the target variable(s)
  • x_num_past: the numerical past variables
  • x_num_future: the numerical future variables
  • x_cat_past: the categorical past variables
  • x_cat_future: the categorical future variables
  • idx_target: index of the target features in the past array

Return type:

MyDataset
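
A hedged example of building a dataset from a loaded series (parameter values are illustrative; self.data is the frame populated by load_signal):

>>> dataset = ts.create_data_loader(data=ts.data, past_steps=100, future_steps=20, shift=0, skip_step=1)
>>> sample = dataset[0]   # one sample, a dict with the keys listed above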

enrich(dataset, columns)
generate_signal(length: int = 5000, categorical_variables: List[Categorical] = [], noise_mean: int = 1, type: int = 0) None

This will generate a synthetic signal with a selected length, a noise level, and some categorical variables. The additive series are added at the end, while the multiplicative series act on the original signal. The TS structure will be populated.

Parameters:
  • length (int, optional) – length of the signal. Defaults to 5000.

  • categorical_variables (List[Categorical], optional) – list of Categorical variables. Defaults to [].

  • noise_mean (int, optional) – variance of the noise to add at the end. Defaults to 1.

  • type (int, optional) – type of the timeseries (only type=0 available right now). Defaults to 0.

inference(batch_size: int = 100, num_workers: int = 4, split_params: None | dict = None, rescaling: bool = True, data: DataFrame = None, steps_in_future: int = 0, check_holes_and_duplicates: bool = True, is_inference: bool = False) DataFrame

Similar to inference_on_set; the only difference is split_params, which must contain these keys (using the defaults can be sufficient): ‘past_steps’, ‘future_steps’, ‘shift’, ‘keep_entire_seq_while_shifting’, ‘starting_point’.

skip_step is set to 1 for convenience (generally you want all the predictions). You can set split_params to None and use the standard parameters (at your own risk).

Parameters:
  • batch_size (int, optional) – see inference_on_set. Defaults to 100.

  • num_workers (int, optional) – see inference_on_set. Defaults to 4.

  • split_params (Union[None,dict], optional) – see inference_on_set. Defaults to None.

  • rescaling (bool, optional) – see inference_on_set. Defaults to True.

  • data (pd.DataFrame, optional) – starting dataset. Defaults to None.

  • steps_in_future (int, optional) – if > 0 the dataset is extended in order to make predictions in the future. Defaults to 0.

  • check_holes_and_duplicates (bool, optional) – if False the routine does not check for holes or duplicates; set to False for stacked models. Defaults to True.

Returns:

predicted values

Return type:

pd.DataFrame
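
A hedged call sketch for producing forecasts on new data (new_data is an illustrative pandas DataFrame with the same columns used at training time):

>>> predictions = ts.inference(batch_size=100, data=new_data, steps_in_future=20)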

inference_on_set(batch_size: int = 100, num_workers: int = 4, split_params: None | dict = None, set: str = 'test', rescaling: bool = True, data: None | Dataset = None) DataFrame

This function allows to get the prediction on a particular set (train, test or validation).

Parameters:
  • batch_size (int, optional) – batch size. Defaults to 100.

  • num_workers (int, optional) – num workers. Defaults to 4.

  • split_params (Union[None,dict], optional) – if not None the splitting procedure will use the given parameters, otherwise it will use the same configuration used in training. Defaults to None.

  • set (str, optional) – train, validation or test. Defaults to ‘test’.

  • rescaling (bool, optional) – if rescaling is True the output will be rescaled to the initial values. Defaults to True.

  • data (None or pd.DataFrame, optional)

Returns:

the predicted values in a pandas format

Return type:

pd.DataFrame

load(model: Base, filename: str, load_last: bool = True, dirpath: str | None = None, weight_path: str | None = None) None

Load a saved model

Parameters:
  • model (Base) – class of the model to load (it will be instantiated by pytorch-lightning)

  • filename (str) – filename of the saved model

  • load_last (bool, optional) – if true the last checkpoint will be loaded otherwise the best (in the validation set). Defaults to True.

  • dirpath (Union[str,None], optional) – if None we assume that the model is loaded on the same PC where it was trained; otherwise we can pass the dirpath where everything has been saved. Defaults to None.

  • weight_path (Union[str, None], optional) – if None the standard path will be used. Defaults to None.

load_signal(data: DataFrame, enrich_cat: List[str] = [], past_variables: List[str] = [], future_variables: List[str] = [], target_variables: List[str] = [], cat_past_var: List[str] = [], cat_fut_var: List[str] = [], check_past: bool = True, group: None | str = None, check_holes_and_duplicates: bool = True, silly_model: bool = False) None
This is a crucial point in the data structure. We expect here to have a dataset with time as timestamp.
There are some checks:

1- the duplicates will be removed, keeping the first instance

2- the frequency will be inferred taking the minimum time distance between samples

3- the dataset will be filled completing the missing timestamps

Parameters:
  • data (pd.DataFrame) – input dataset the column indicating the time must be called time

  • enrich_cat (List[str], optional) – it is possible to let this function enrich the dataset for example adding the standard columns: hour, dow, month and minute. Defaults to [].

  • past_variables (List[str], optional) – list of column names of past variables not available for future times. Defaults to [].

  • future_variables (List[str], optional) – list of future variables available for future times. Defaults to [].

  • target_variables (List[str], optional) – list of the target variables. They will be added to past_variables by default unless check_past is False. Defaults to [].

  • cat_past_var (List[str], optional) – list of the past categorical variables. Defaults to [].

  • cat_fut_var (List[str], optional) – list of the future categorical variables. Defaults to [].

  • check_past (bool, optional) – see target_variables. Defaults to True.

  • group (str or None, optional) – if not None, the time series dataset is considered to be composed of homogeneous time series coming from different realizations (for example points of sale, cities, locations), and the related series are not split during sample generation. Defaults to None

  • check_holes_and_duplicates (bool, optional) – if False, duplicates and holes will not be checked and the dataloader may not work correctly; disable at your own risk. Defaults to True

  • silly_model (bool, optional) – if True, target variables will be added to the pool of the future variables. This can be useful to see if information passes through the decoder part of your model (if any)
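
A hedged example of loading a frame (column names are illustrative; the time column must be called time):

>>> import pandas as pd
>>> df = pd.DataFrame({'time': pd.date_range('2021-01-01', periods=500, freq='h'), 'y': range(500)})
>>> ts = TimeSeries('example')
>>> ts.load_signal(df, target_variables=['y'], enrich_cat=['hour', 'dow'])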

plot()

Easy way to check the loaded data

Returns:

figure of the target variables

Return type:

plotly.graph_objects._figure.Figure

save(filename: str) None

save the timeseries object

Parameters:

filename (str) – name of the file

set_model(model: Base, config: dict = None, custom_init: bool = False)

Set the model to train

Parameters:
  • model (Base) – see models

  • config (dict, optional) – usually the configuration used by the model. Defaults to None.

  • custom_init (bool, optional) – if true, a custom initialization paradigm will be used (see weight_init in models/utils.py).

set_verbose(verbose: bool)
split_for_train(perc_train: float | None = 0.6, perc_valid: float | None = 0.2, range_train: List[datetime | str] | None = None, range_validation: List[datetime | str] | None = None, range_test: List[datetime | str] | None = None, past_steps: int = 100, future_steps: int = 20, shift: int = 0, keep_entire_seq_while_shifting: bool = False, starting_point: None | dict = None, skip_step: int = 1, normalize_per_group: bool = False, check_consecutive: bool = True, scaler: str = 'StandardScaler()') List[DataLoader]

Split the data and create the datasets.

Parameters:
  • perc_train (Union[float,None], optional) – fraction of the training set. Defaults to 0.6.

  • perc_valid (Union[float,None], optional) – fraction of the validation set. Defaults to 0.2.

  • range_train (Union[List[Union[datetime, str]],None], optional) – a list of two elements indicating the starting point and end point of the training set (string date style or datetime). Defaults to None.

  • range_validation (Union[List[Union[datetime, str]],None], optional) – a list of two elements indicating the starting point and end point of the validation set (string date style or datetime). Defaults to None.

  • range_test (Union[List[Union[datetime, str]],None], optional) – a list of two elements indicating the starting point and end point of the test set (string date style or datetime). Defaults to None.

  • past_steps (int, optional) – past steps to consider for making the prediction. Defaults to 100.

  • future_steps (int, optional) – future steps to predict. Defaults to 20.

  • shift (int, optional) – see create_data_loader. Defaults to 0.

  • keep_entire_seq_while_shifting (bool, optional) – if the dataset is shifted, you may want the future data to be of length future_steps+shift (like Informer). Defaults to False

  • starting_point (Union[None, dict], optional) – see create_data_loader. Defaults to None.

  • skip_step (int, optional) – see create_data_loader. Defaults to 1.

  • normalize_per_group (boolean, optional) – if True and self.group is not None, the variables are scaled with respect to the groups. Defaults to False

  • check_consecutive (boolean, optional) – if False the check on consecutive ranges is skipped. Defaults to True

  • scaler – instance of a sklearn.preprocessing scaler. Default ‘StandardScaler()’

Returns:

three dataloaders used for training or inference

Return type:

List[DataLoader, DataLoader, DataLoader]
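
A minimal sketch, assuming ts is the time-series object with the signal already loaded; the percentages and window lengths below are illustrative:

train_dl, valid_dl, test_dl = ts.split_for_train(
    perc_train=0.7,
    perc_valid=0.15,
    past_steps=100,            # encoder window
    future_steps=20,           # forecasting horizon
    shift=0,
    skip_step=1,
    scaler="StandardScaler()",
)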

train_model(dirpath: str, split_params: dict, batch_size: int = 100, num_workers: int = 4, max_epochs: int = 500, auto_lr_find: bool = True, gradient_clip_val: float | None = None, gradient_clip_algorithm: str = 'value', devices: str | List[int] = 'auto', precision: str | int = 32, modifier: None | str = None, modifier_params: None | dict = None, seed: int = 42) float

Train the model

Parameters:
  • dirpath (str) – path where all the training outputs will be stored

  • split_params (dict) – see split_for_train

  • batch_size (int, optional) – batch size. Defaults to 100.

  • num_workers (int, optional) – num_workers for the dataloader. Defaults to 4.

  • max_epochs (int, optional) – maximum epochs to perform. Defaults to 500.

  • auto_lr_find (bool, optional) – find the initial learning rate, see pytorch-lightning. Defaults to True.

  • gradient_clip_val (Union[float,None], optional) – gradient_clip_val. Defaults to None. See https://lightning.ai/docs/pytorch/stable/advanced/training_tricks.html

  • gradient_clip_algorithm (str, optional) – gradient_clip_algorithm. Defaults to ‘value’. See https://lightning.ai/docs/pytorch/stable/advanced/training_tricks.html

  • devices (Union[str,List[int]], optional) – devices to use. Use auto if cpu or the list of gpu to use otherwise. Defaults to ‘auto’.

  • precision (Union[str,int], optional) – precision to use. Usually 32 bit is fine but for larger models you should try ‘bf16’. If ‘auto’ it will use ‘bf16’ on GPU and 32 on CPU

  • modifier (Union[str,None], optional) – if not None a modifier is applied to the dataloader. Sometimes lightning has very restrictive rules on the dataloader, or we may want to use an ML model before or after the DL model (see the readme for more information)

  • modifier_params (Union[dict,None], optional) – parameters of the modifier

  • seed (int, optional) – seed for reproducibility
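
A minimal sketch of a training call, assuming ts is the time-series object with the signal loaded and a model set; split_params mirrors the arguments of split_for_train and all values are illustrative:

split_params = dict(perc_train=0.7, perc_valid=0.15,
                    past_steps=100, future_steps=20,
                    shift=0, skip_step=1)

ts.train_model(dirpath="checkpoints/run_01",
               split_params=split_params,
               batch_size=128,
               max_epochs=100,
               auto_lr_find=True,
               devices="auto",
               precision=32,
               seed=42)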

class dsipts.TimeXER(patch_len: int, d_model: int, n_head: int, d_ff: int = 512, dropout_rate: float = 0.1, n_layer_decoder: int = 1, activation: str = '', **kwargs)

Bases: Base

Initialize the model with the specified parameters. See https://github.com/thuml/Time-Series-Library/blob/main/models/TimeMixer.py

Parameters:
  • patch_len (int) – Length of the patches.

  • d_model (int) – Dimension of the model.

  • n_head (int) – Number of attention heads.

  • d_ff (int, optional) – Dimension of the feedforward network. Defaults to 512.

  • dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.

  • n_layer_decoder (int, optional) – Number of layers in the decoder. Defaults to 1.

  • activation (str, optional) – Activation function to use. Defaults to ‘’.

  • **kwargs – Additional keyword arguments passed to the superclass.

Raises:

ValueError – If an invalid activation function is provided.

description = 'Can   handle multivariate output \nCan   handle future covariates\nCan   handle categorical covariates\nCan   handle Quantile loss function'
forward(batch: dict) float

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are: y (the target variable(s), always present), x_num_past (the numerical past variables, always present), x_num_future (the numerical future variables), x_cat_past (the categorical past variables), x_cat_future (the categorical future variables), idx_target (index of the target features in the past array)

Returns:

output of the model

Return type:

torch.tensor

handle_categorical_variables = True
handle_future_covariates = True
handle_multivariate = True
handle_quantile_loss = True
class dsipts.VQVAEA(past_steps: int, future_steps: int, past_channels: int, future_channels: int, hidden_channels: int, embs: List[int], d_model: int, max_voc_size: int, num_layers: int, dropout_rate: float, commitment_cost: float, decay: float, n_heads: int, out_channels: int, epoch_vqvae: int, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, **kwargs)

Bases: Base

Custom encoder-decoder

Parameters:
  • past_steps (int) – number of past datapoints used

  • future_steps (int) – number of future lag to predict

  • past_channels (int) – number of numeric past variables, must be >0

  • future_channels (int) – number of future numeric variables

  • embs (List) – list of the initial dimension of the categorical variables

  • cat_emb_dim (int) – final dimension of each categorical variable

  • hidden_RNN (int) – hidden size of the RNN block

  • num_layers_RNN (int) – number of RNN layers

  • kind (str) – one among GRU or LSTM

  • kernel_size (int) – kernel size in the encoder convolutional block

  • sum_emb (bool) – if true the contribution of each embedding will be summed-up otherwise stacked

  • out_channels (int) – number of output channels

  • activation (str, optional) – pytorch activation function. Defaults to torch.nn.ReLU

  • remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point

  • persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0

  • loss_type (str, optional) – this model uses custom losses, l1 or mse. Custom losses can be linear_penalization or exponential_penalization. Defaults to l1

  • quantiles (List[int], optional) – we can use the quantile loss if len(quantiles) > 0 (usually 0.1, 0.5, 0.9) or the L1 loss in case len(quantiles) == 0. Defaults to [].

  • dropout_rate (float, optional) – dropout rate in Dropout layers

  • use_bn (bool, optional) – if true BN layers will be added and dropouts will be removed

  • use_glu (bool,optional) – use GLU for feature selection. Defaults to True.

  • glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.

  • n_classes (int) – number of classes (0 in regression)

  • optim (str, optional) – if not None it expects a pytorch optim method. Defaults to None that is mapped to Adam.

  • optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.

  • scheduler_config (dict, optional) – configuration for stepLR scheduler. Defaults to None.

description = 'Can NOT  handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are: y (the target variable(s), always present), x_num_past (the numerical past variables, always present), x_num_future (the numerical future variables), x_cat_past (the categorical past variables), x_cat_future (the categorical future variables), idx_target (index of the target features in the past array)

Returns:

output of the model

Return type:

torch.tensor

generate(idx, max_new_tokens, temperature=1.0, do_sample=False, top_k=None, num_samples=100)

Take a conditioning sequence of indices idx (LongTensor of shape (b,t)) and complete the sequence max_new_tokens times, feeding the predictions back into the model each time. Most likely you’ll want to make sure to be in model.eval() mode of operation for this.

gpt(tokens)
handle_categorical_variables = False
handle_future_covariates = False
handle_multivariate = False
handle_quantile_loss = False
inference(batch: dict) tensor

Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)

Parameters:

batch (dict) – batch

Returns:

result

Return type:

torch.tensor

class dsipts.VVA(past_steps: int, future_steps: int, past_channels: int, future_channels: int, embs: List[int], d_model: int, max_voc_size: int, token_split: int, num_layers: int, dropout_rate: float, n_heads: int, out_channels: int, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, **kwargs)

Bases: Base

Custom encoder-decoder

Parameters:
  • past_steps (int) – number of past datapoints used

  • future_steps (int) – number of future lag to predict

  • past_channels (int) – number of numeric past variables, must be >0

  • future_channels (int) – number of future numeric variables

  • embs (List) – list of the initial dimension of the categorical variables

  • cat_emb_dim (int) – final dimension of each categorical variable

  • hidden_RNN (int) – hidden size of the RNN block

  • num_layers_RNN (int) – number of RNN layers

  • kind (str) – one among GRU or LSTM

  • kernel_size (int) – kernel size in the encoder convolutional block

  • sum_emb (bool) – if true the contribution of each embedding will be summed-up otherwise stacked

  • out_channels (int) – number of output channels

  • activation (str, optional) – pytorch activation function. Defaults to torch.nn.ReLU

  • remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point

  • persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0

  • loss_type (str, optional) – this model uses custom losses, l1 or mse. Custom losses can be linear_penalization or exponential_penalization. Defaults to l1

  • quantiles (List[int], optional) – we can use the quantile loss if len(quantiles) > 0 (usually 0.1, 0.5, 0.9) or the L1 loss in case len(quantiles) == 0. Defaults to [].

  • dropout_rate (float, optional) – dropout rate in Dropout layers

  • use_bn (bool, optional) – if true BN layers will be added and dropouts will be removed

  • use_glu (bool,optional) – use GLU for feature selection. Defaults to True.

  • glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.

  • n_classes (int) – number of classes (0 in regression)

  • optim (str, optional) – if not None it expects a pytorch optim method. Defaults to None that is mapped to Adam.

  • optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.

  • scheduler_config (dict, optional) – configuration for stepLR scheduler. Defaults to None.

configure_optimizers()

This long function is unfortunately doing something very simple and is being very defensive: We are separating out all parameters of the model into two buckets: those that will experience weight decay for regularization and those that won’t (biases, and layernorm/embedding weights). We are then returning the PyTorch optimizer object.

description = 'Can NOT  handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
forward(batch)

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are: y (the target variable(s), always present), x_num_past (the numerical past variables, always present), x_num_future (the numerical future variables), x_cat_past (the categorical past variables), x_cat_future (the categorical future variables), idx_target (index of the target features in the past array)

Returns:

output of the model

Return type:

torch.tensor

generate(idx, max_new_tokens, temperature=1.0, do_sample=False, top_k=None, num_samples=100)

Take a conditioning sequence of indices idx (LongTensor of shape (b,t)) and complete the sequence max_new_tokens times, feeding the predictions back into the model each time. Most likely you’ll want to make sure to be in model.eval() mode of operation for this.

handle_categorical_variables = False
handle_future_covariates = False
handle_multivariate = False
handle_quantile_loss = False
inference(batch: dict) tensor

Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)

Parameters:

batch (dict) – batch

Returns:

result

Return type:

torch.tensor

dsipts.beauty_string(message: str, type: str, verbose: bool)
dsipts.extend_time_df(x: DataFrame, freq: str | int, group: str | None = None, global_minmax: bool = False) DataFrame

Utility for generating a full dataset and then merging the real data

Parameters:
  • x (pd.DataFrame) – dataframe containing the column time

  • freq (str) – frequency (in pandas notation) of the resulting dataframe

  • group (string or None) – if not None the min/max are computed per group column. Defaults to None

  • global_minmax (bool) – if True the min/max are computed globally (instead of per group). Usually used for stacked models

Returns:

a dataframe with the column time ranging from the minimum of x to the maximum, with frequency freq

Return type:

pd.DataFrame
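
A minimal sketch; the toy dataframe below is missing the 02:00 timestamp, and extend_time_df produces the full hourly time column onto which the original data can be merged back:

import pandas as pd
from dsipts import extend_time_df

df = pd.DataFrame({
    "time": pd.to_datetime(["2024-01-01 00:00",
                            "2024-01-01 01:00",
                            "2024-01-01 03:00"]),
    "y": [1.0, 2.0, 4.0],
})

full = extend_time_df(df, freq="h")             # complete hourly time column
merged = full.merge(df, on="time", how="left")  # the missing row shows up as NaN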

dsipts.get_freq(freq) str

Get the frequency based on the reported string. Not all possibilities may be covered here.

Parameters:

freq (str) – string coming from

Returns:

pandas frequency format

Return type:

str

dsipts.read_public_dataset(path: str, dataset: str) Tuple[DataFrame, List[str]]

Returns the chosen public dataset. Please download the dataset from here https://drive.google.com/drive/folders/1ZOYpTUa82_jCcxIdTmyr0LXQfvaM9vIy or ask agobbi@fbk.eu.

Parameters:
  • path (str) – path to data

  • dataset (str) – dataset (one of ‘electricity’,’etth1’,’etth2’,’ettm1’,’ettm2’,’exchange_rate’,’illness’,’traffic’,’weather’)

Returns:

A dataframe in which the target variable is y and the time index is time, together with the list of the covariates

Return type:

Tuple[pd.DataFrame,List[str]]
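
A minimal sketch, assuming the datasets from the linked folder have already been downloaded under ./data:

from dsipts import read_public_dataset

data, covariates = read_public_dataset(path="data", dataset="etth1")
print(data.head())   # the target column is y and the time column is time
print(covariates)    # names of the remaining covariates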