dsipts.models package¶
Subpackages¶
- dsipts.models.autoformer package
- dsipts.models.crossformer package
- dsipts.models.d3vae package
- Submodules
- dsipts.models.d3vae.diffusion_process module
- dsipts.models.d3vae.embedding module
- dsipts.models.d3vae.encoder module
- dsipts.models.d3vae.model module
- dsipts.models.d3vae.neural_operations module
- dsipts.models.d3vae.resnet module
- dsipts.models.d3vae.utils module
- Module contents
- dsipts.models.duet package
- dsipts.models.informer package
- dsipts.models.itransformer package
- Submodules
- dsipts.models.itransformer.Embed module
- dsipts.models.itransformer.SelfAttention_Family module
- dsipts.models.itransformer.Transformer_EncDec module
- Module contents
- dsipts.models.patchtst package
- dsipts.models.samformer package
- dsipts.models.tft package
- dsipts.models.timexer package
- dsipts.models.ttm package
- Submodules
- dsipts.models.ttm.configuration_tinytimemixer module
- dsipts.models.ttm.consts module
- dsipts.models.ttm.modeling_tinytimemixer module
FeatureMixerBlock
ForecastChannelHeadMixer
PatchMixerBlock
PinballLoss
SampleTinyTimeMixerPredictionOutput
TinyTimeMixerAdaptivePatchingBlock
TinyTimeMixerAttention
TinyTimeMixerBatchNorm
TinyTimeMixerBlock
TinyTimeMixerCategoricalEmbeddingLayer
TinyTimeMixerChannelFeatureMixerBlock
TinyTimeMixerDecoder
TinyTimeMixerEncoder
TinyTimeMixerEncoderOutput
TinyTimeMixerForMaskedPrediction
TinyTimeMixerForPrediction
TinyTimeMixerForPredictionHead
TinyTimeMixerForPredictionOutput
TinyTimeMixerForPredictionOutput.backbone_hidden_state
TinyTimeMixerForPredictionOutput.decoder_hidden_state
TinyTimeMixerForPredictionOutput.hidden_states
TinyTimeMixerForPredictionOutput.loc
TinyTimeMixerForPredictionOutput.loss
TinyTimeMixerForPredictionOutput.prediction_outputs
TinyTimeMixerForPredictionOutput.scale
TinyTimeMixerGatedAttention
TinyTimeMixerLayer
TinyTimeMixerMLP
TinyTimeMixerMeanScaler
TinyTimeMixerModel
TinyTimeMixerModelOutput
TinyTimeMixerNOPScaler
TinyTimeMixerNormLayer
TinyTimeMixerPatchify
TinyTimeMixerPositionalEncoding
TinyTimeMixerPreTrainedModel
TinyTimeMixerStdScaler
nll()
weighted_average()
- dsipts.models.ttm.utils module
- Module contents
- dsipts.models.vva package
- dsipts.models.xlstm package
Submodules¶
dsipts.models.Autoformer module¶
- class dsipts.models.Autoformer.Autoformer(label_len: int, d_model: int, dropout_rate: float, kernel_size: int, activation: str = 'torch.nn.ReLU', factor: float = 0.5, n_head: int = 1, n_layer_encoder: int = 2, n_layer_decoder: int = 2, hidden_size: int = 1048, **kwargs)¶
Bases:
Base
Autoformer from https://github.com/cure-lab/LTSF-Linear
- Parameters:
label_len (int) – see the original implementation; it seems to be a warm-up dimension (the decoder also produces some past predictions that are filtered out at the end)
d_model (int) – embedding dimension of the attention layer
dropout_rate (float) – dropout rate
kernel_size (int) – kernel size
activation (str, optional) – activation function to use. Defaults to ‘torch.nn.ReLU’.
factor (float, optional) – parameter of .autoformer.layers.AutoCorrelation for finding the top k. Defaults to 0.5.
n_head (int, optional) – number of heads. Defaults to 1.
n_layer_encoder (int, optional) – number of encoder layers. Defaults to 2.
n_layer_decoder (int, optional) – number of decoder layers. Defaults to 2.
hidden_size (int, optional) – output dimension of the transformer layer. Defaults to 1048.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array (an example batch is sketched below).
- Returns:
output of the model
- Return type:
torch.tensor
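The batch layout described above can be sketched as a plain dictionary of tensors. The sizes and shapes below are illustrative assumptions (they are not taken from the dsipts dataloader); the example only shows how the documented keys fit together.

```python
import torch

# Illustrative sizes (assumptions, not dsipts defaults)
B, past_steps, future_steps = 8, 64, 16
n_num_past, n_num_future, n_cat = 3, 2, 2

batch = {
    # target variable(s), always present
    "y": torch.randn(B, future_steps, 1),
    # numerical past variables, always present
    "x_num_past": torch.randn(B, past_steps, n_num_past),
    # numerical variables known in the future
    "x_num_future": torch.randn(B, future_steps, n_num_future),
    # categorical covariates (integer-coded), past and future
    "x_cat_past": torch.randint(0, 10, (B, past_steps, n_cat)),
    "x_cat_future": torch.randint(0, 10, (B, future_steps, n_cat)),
    # position of the target feature(s) inside x_num_past
    "idx_target": torch.tensor([0]),
}

# A trained dsipts model would then be called as: output = model(batch)
```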
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.CrossFormer module¶
- class dsipts.models.CrossFormer.CrossFormer(d_model: int, hidden_size: int, n_head: int, seg_len: int, n_layer_encoder: int, win_size: int, factor: int = 10, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', **kwargs)¶
Bases:
Base
CrossFormer from https://openreview.net/forum?id=vSVLM2j9eie
- Parameters:
d_model (int) – The dimensionality of the model.
hidden_size (int) – The size of the hidden layers.
n_head (int) – The number of attention heads.
seg_len (int) – The length of the segments.
n_layer_encoder (int) – The number of layers in the encoder.
win_size (int) – The size of the window for attention.
factor (int, optional) – see .crossformer.attn.TwoStageAttentionLayer. Defaults to 10.
dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.
activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.
**kwargs – Additional keyword arguments for the parent class.
- Returns:
This method does not return a value.
- Return type:
None
- Raises:
ValueError – If the activation function is not recognized.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.D3VAE module¶
- class dsipts.models.D3VAE.D3VAE(scale=0.1, hidden_size=64, num_layers=2, dropout_rate=0.1, diff_steps=200, loss_type='kl', beta_end=0.01, beta_schedule='linear', channel_mult=2, mult=1, num_preprocess_blocks=1, num_preprocess_cells=3, num_channels_enc=16, arch_instance='res_mbconv', num_latent_per_group=6, num_channels_dec=16, groups_per_scale=2, num_postprocess_blocks=1, num_postprocess_cells=2, beta_start=0, freq='h', **kwargs)¶
Bases:
Base
This is the base model; each implemented model must override the init and forward methods. The inference step is optional: by default it uses the forward method, but for recurrent networks you should implement your own method.
- Parameters:
verbose (bool) – Flag to enable verbose logging.
past_steps (int) – Number of past time steps to consider.
future_steps (int) – Number of future time steps to predict.
past_channels (int) – Number of channels in the past input data.
future_channels (int) – Number of channels in the future input data.
out_channels (int) – Number of output channels.
embs_past (List[int]) – List of embedding dimensions for past data.
embs_fut (List[int]) – List of embedding dimensions for future data.
n_classes (int, optional) – Number of classes for classification. Defaults to 0.
persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.
loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.
quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.
reduction_mode (str, optional) – Reduction mode for the categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.
use_classical_positional_encoder (bool, optional) – Flag to use classical positional encoding instead of an embedding layer for the positions. Defaults to False.
emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.
optim (Union[str, None], optional) – Optimizer type. Defaults to None.
optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.
scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.
- Raises:
AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.
AssertionError – If the number of output channels is not 1 for classification tasks.
- forward(batch: dict) tensor ¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- inference(batch: dict) tensor ¶
Take care here: we need to implement this because predicting step N uses the prediction at step N-1. TODO: fix, since the known continuous variables are not handled here yet.
- Parameters:
batch (dict) – batch of the dataloader
- Returns:
result
- Return type:
torch.tensor
- dsipts.models.D3VAE.copy_parameters(net_source: Module, net_dest: Module, strict=True) None ¶
Copies parameters from one network to another.
- Parameters:
net_source – Input network.
net_dest – Output network.
strict – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True.
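A behaviour-equivalent sketch of such a parameter copy using only standard PyTorch calls (an assumption about the implementation, not the package's code):

```python
import torch.nn as nn

def copy_parameters_sketch(net_source: nn.Module, net_dest: nn.Module, strict: bool = True) -> None:
    # load_state_dict enforces matching keys when strict=True,
    # mirroring the documented behaviour of copy_parameters
    net_dest.load_state_dict(net_source.state_dict(), strict=strict)
```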
dsipts.models.Diffusion module¶
- class dsipts.models.Diffusion.Diffusion(d_model: int, out_channels: int, past_steps: int, future_steps: int, past_channels: int, future_channels: int, embs: List[int], learn_var: bool, cosine_alpha: bool, diffusion_steps: int, beta: float, gamma: float, n_layers_RNN: int, d_head: int, n_head: int, dropout_rate: float, activation: str, subnet: int, perc_subnet_learning_for_step: float, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[float] = [], optim: str | None = None, optim_config: dict | None = None, scheduler_config: dict | None = None, **kwargs)¶
Bases:
Base
Denoising Diffusion Probabilistic Model
- Parameters:
d_model (int)
out_channels (int) – number of target variables
past_steps (int) – size of past window
future_steps (int) – size of future window to be predicted
past_channels (int) – number of variables available for the past context
future_channels (int) – number of variables known in the future, available for forecasting
embs (list[int]) – categorical variables dimensions for embeddings
learn_var (bool) – Flag to make the model learn the posterior variance (if True) or use the variance of the posterior distribution
cosine_alpha (bool) – Flag for the generation of alphas and betas (a generic schedule sketch follows this parameter list)
diffusion_steps (int) – number of noising steps for the initial sample
beta (float) – starting variable to generate the diffusion perturbations. Ignored if cosine_alpha == True
gamma (float) – trade-off variable to balance the loss between noise prediction and negative log-likelihood / KL divergence.
n_layers_RNN (int) – param for subnet
d_head (int) – param for subnet
n_head (int) – param for subnet
dropout_rate (float) – param for subnet
activation (str) – param for subnet
subnet (int) – 1 for the attention subnet, 2 for the linear subnet. Others can be added (wait for Black Friday for discounts).
perc_subnet_learning_for_step (float) – percentage determining how many subnets are trained for each batch. Decrease this value if the loss blows up.
persistence_weight (float, optional) – Defaults to 0.0.
loss_type (str, optional) – Defaults to ‘l1’.
quantiles (List[float], optional) – Only [] accepted. Defaults to [].
optim (Union[str,None], optional) – Defaults to None.
optim_config (Union[dict,None], optional) – Defaults to None.
scheduler_config (Union[dict,None], optional) – Defaults to None.
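For reference, a generic sketch of how linear and cosine noise schedules are typically built in DDPM-style models (a common recipe, not necessarily the exact schedule used by this class; beta_start/beta_end names are illustrative):

```python
import math
import torch

def linear_betas(diffusion_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> torch.Tensor:
    # linearly spaced noise levels, as in the original DDPM paper
    return torch.linspace(beta_start, beta_end, diffusion_steps)

def cosine_betas(diffusion_steps: int, s: float = 0.008) -> torch.Tensor:
    # cosine schedule: betas derived from a cosine alpha_bar curve
    steps = torch.arange(diffusion_steps + 1, dtype=torch.float64)
    alpha_bar = torch.cos(((steps / diffusion_steps) + s) / (1 + s) * math.pi / 2) ** 2
    betas = 1 - alpha_bar[1:] / alpha_bar[:-1]
    return betas.clamp(max=0.999).float()
```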
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- cat_categorical_vars(batch: dict)¶
Extracting categorical context about past and future
- Parameters:
batch (dict) – Keys checked -> [‘x_cat_past’, ‘x_cat_future’]
- Returns:
cat_emb_past, cat_emb_fut
- Return type:
List[torch.Tensor, torch.Tensor]
- description = 'Can NOT handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan NOT handle Quantile loss function'¶
- forward(batch: dict) float ¶
training process of the diffusion network
- Parameters:
batch (dict) – variables loaded
- Returns:
total loss about the prediction of the noises over all subnets extracted
- Return type:
float
- gaussian_likelihood(x, mean, var)¶
- gaussian_log_likelihood(x, mean, var)¶
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = False¶
- handle_quantile_loss = False¶
- improving_weight_during_training()¶
Each time we sample from the multinomial we subtract the minimum for more precise sampling, avoiding large learning differences among subnets.
This leads to more stable inference even in early training, mainly for the common context embedding.
For probabilistic reasons, the weights have to be > 0, so we subtract min - 1.
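A minimal sketch of the sampling trick described above, assuming the weights are a 1-D tensor with one entry per subnet (names are illustrative):

```python
import torch

def sample_subnets(weights: torch.Tensor, n_samples: int) -> torch.Tensor:
    # shift so every weight is strictly positive, as required by torch.multinomial
    shifted = weights - weights.min() + 1.0
    # draw the subnets to train for this batch
    return torch.multinomial(shifted, n_samples, replacement=True)
```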
- inference(batch: dict) Tensor ¶
Inference process to forecast future y
- Parameters:
batch (dict) – Keys checked [‘x_num_past, ‘idx_target’, ‘x_num_future’, ‘x_cat_past’, ‘x_cat_future’]
- Returns:
generated sequence [batch_size, future_steps, num_var]
- Return type:
torch.Tensor
- normal_kl(mean1, logvar1, mean2, logvar2)¶
Compute the KL divergence between two Gaussians (also called relative entropy). The KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P: KL(P||Q) = E_P[log(P/Q)].
In machine learning, KL(P||Q) is often called the ‘information gain’ achieved if P were used instead of Q (which is currently used).
Shapes are automatically broadcast, so batches can be compared to scalars, among other use cases.
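The closed-form KL between two diagonal Gaussians parameterized by mean and log-variance can be sketched as follows (a standard formula, assumed to match this method's convention):

```python
import torch

def normal_kl_sketch(mean1, logvar1, mean2, logvar2):
    # KL( N(mean1, exp(logvar1)) || N(mean2, exp(logvar2)) ), broadcast over shapes
    return 0.5 * (
        logvar2 - logvar1
        + (torch.exp(logvar1) + (mean1 - mean2) ** 2) / torch.exp(logvar2)
        - 1.0
    )
```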
- prepare_data_per_node: bool¶
- q_sample(x_start: Tensor, t: int) List[Tensor] ¶
Diffuse x_start for t diffusion steps.
In other words, sample from q(x_t | x_0).
Also, compute the mean and variance of the diffusion posterior:
q(x_{t-1} | x_t, x_0)
Posterior mean and variance are the ones to be predicted
- Parameters:
x_start (torch.Tensor) – values to be predicted
t (int) – diffusion step
- Returns:
q_sample, posterior mean, posterior log variance and the actual noise
- Return type:
List[torch.Tensor, torch.Tensor, torch.Tensor]
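A generic DDPM-style sketch of sampling q(x_t | x_0) with a precomputed cumulative product of alphas (the standard closed form; the actual method also returns the posterior mean and log-variance):

```python
import torch

def q_sample_sketch(x_start: torch.Tensor, t: int, alpha_bar: torch.Tensor):
    # closed form: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    noise = torch.randn_like(x_start)
    x_t = alpha_bar[t].sqrt() * x_start + (1 - alpha_bar[t]).sqrt() * noise
    return x_t, noise
```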
- remove_var(tensor: Tensor, indexes_to_exclude: list, dimension: int) Tensor ¶
Function to remove variables from tensors in chosen dimension and position
- Parameters:
tensor (torch.Tensor) – starting tensor
indexes_to_exclude (list) – indexes along the chosen dimension that we want to exclude
dimension (int) – dimension of the tensor on which we want to work (not a list of dims!)
- Returns:
new tensor without the chosen variables
- Return type:
torch.Tensor
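One way to drop the listed indexes along a chosen dimension with plain PyTorch indexing (a sketch of the documented behaviour, not the package's code):

```python
import torch

def remove_var_sketch(tensor: torch.Tensor, indexes_to_exclude: list, dimension: int) -> torch.Tensor:
    # keep every index along `dimension` except the excluded ones
    keep = [i for i in range(tensor.shape[dimension]) if i not in indexes_to_exclude]
    return tensor.index_select(dimension, torch.tensor(keep, device=tensor.device))
```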
- training: bool¶
- class dsipts.models.Diffusion.SubNet1(aux_past_ch, aux_fut_ch, learn_var: bool, output_channel: int, d_model: int, d_head: int, n_head: int, activation: str, dropout_rate: float)¶
Bases:
Module
-> SUBNET of the DIFFUSION MODEL (DDPM)
It starts with an autoregressive LSTM computation of epsilon, which is then subtracted from the ‘y_noised’ tensor. This is always possible! We thus obtain an approximation of ‘eps_hat’, which at the end goes through a residual connection with its embedded version ‘emb_eps_hat’.
‘emb_eps_hat’ is updated using the available categorical information about the series: an attention network compares past categoricals with future categoricals to update the embedded predicted noise.
Also, if auxiliary numerical variables are available both in the past and in the future, the changes in these variables are captured by another attention network.
The goal is to always ensure a valuable computation of ‘eps’, and then refine it when enough data are available. Both attentions use { Q = *_future, K = *_past, V = y_past }, using as many context variables as possible for better updates (a sketch of this query/key/value assignment follows the parameter list).
- Parameters:
learn_var (bool) – set if the network has to learn the optim variance of each step
output_channel (int) – number of variables to be predicted
future_steps (int) – number of steps in the future, i.e. the number of timesteps to be predicted
d_model (int) – hidden dimension of the model
num_layers_RNN (int) – number of layers for autoregressive prediction
d_head (int) – hidden dimension of each attention head
n_head (int) – number of attention heads
dropout_rate (float)
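A minimal sketch of the query/key/value assignment described above, using a standard multi-head attention layer (shapes and sizes are assumptions):

```python
import torch
import torch.nn as nn

# queries come from the future context, keys from the past context,
# values from the embedded past target
d_model, n_head = 64, 4
attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)

summary_fut = torch.randn(8, 16, d_model)   # future categorical/numerical context
summary_past = torch.randn(8, 64, d_model)  # past context
y_past_emb = torch.randn(8, 64, d_model)    # embedded past target

update, _ = attn(query=summary_fut, key=summary_past, value=y_past_emb)
print(update.shape)  # torch.Size([8, 16, 64])
```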
- forward(y_noised: Tensor, y_past: Tensor, cat_past: Tensor, cat_fut: Tensor, num_past: Tensor | None = None, num_fut: Tensor | None = None)¶
DIFFUSION SUBNET
- Parameters:
y_noised (torch.Tensor) – [B, future_step, num_var]
y_past (torch.Tensor) – [B, past_step, num_var]
cat_past (torch.Tensor, optional) – [B, past_step, d_model]. Defaults to None.
cat_fut (torch.Tensor, optional) – [B, future_step, d_model]. Defaults to None.
num_past (torch.Tensor, optional) – [B, past_step, d_model]. Defaults to None.
num_fut (torch.Tensor, optional) – [B, future_step, d_model]. Defaults to None.
- Returns:
predicted noise [B, future_step, num_var]. According to ‘learn_var’ param in initialization, the subnet returns another tensor of same size about the variance
- Return type:
torch.Tensor
- class dsipts.models.Diffusion.SubNet2(aux_past_ch, aux_fut_ch, learn_var: bool, past_steps, future_steps, output_channel: int, d_model: int, activation: str, dropout_rate: float)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(y_noised: Tensor, y_past: Tensor, cat_past: Tensor, cat_fut: Tensor, num_past: Tensor | None = None, num_fut: Tensor | None = None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.Diffusion.SubNet3(learn_var, flag_aux_num, num_var, d_model, pred_step, num_layers, d_head, n_head, dropout)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(y_noised: Tensor, y_past: Tensor, cat_past: Tensor, cat_fut: Tensor, num_past: Tensor | None = None, num_fut: Tensor | None = None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
dsipts.models.DilatedConv module¶
- class dsipts.models.DilatedConv.Block(input_channels: int, kernel_size: int, output_channels: int, input_size: int, sum_layers: bool)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: tensor) tensor ¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.DilatedConv.DilatedConv(sum_layers: bool, hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, activation: str = 'torch.nn.ReLU', remove_last=False, dropout_rate: float = 0.1, use_bn: bool = False, use_glu: bool = True, glu_percentage: float = 1.0, **kwargs)¶
Bases:
Base
Custom encoder-decoder
- Parameters:
sum_layers (bool) – Flag indicating whether to sum the layers.
hidden_RNN (int) – Number of hidden units in the RNN.
num_layers_RNN (int) – Number of layers in the RNN.
kind (str) – Type of RNN to use (e.g., ‘LSTM’, ‘GRU’).
kernel_size (int) – Size of the convolutional kernel.
activation (str, optional) – Activation function to use. Defaults to ‘torch.nn.ReLU’.
remove_last (bool, optional) – Flag to indicate whether to remove the last element in the sequence. Defaults to False.
dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.
use_bn (bool, optional) – Flag to indicate whether to use batch normalization. Defaults to False.
use_glu (bool, optional) – Flag to indicate whether to use Gated Linear Units (GLU). Defaults to True.
glu_percentage (float, optional) – Percentage of GLU to apply. Defaults to 1.0.
**kwargs – Additional keyword arguments.
- Returns:
None
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
It is mandatory to implement this method
- Parameters:
batch (dict) – batch of the dataloader
- Returns:
result
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- inference(batch: dict) tensor ¶
Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)
- Parameters:
batch (dict) – batch
- Returns:
result
- Return type:
torch.tensor
- prepare_data_per_node: bool¶
- training: bool¶
- class dsipts.models.DilatedConv.GLU(d_model: int)¶
Bases:
Module
Gated Linear Unit, the ‘Gate’ block in the TFT paper. Sub-net of the GRN: linear(x) * sigmoid(linear(x)). No dimension changes.
- Parameters:
d_model (int) – model dimension
- forward(x: Tensor) Tensor ¶
Gated Linear Unit. Sub-net of the GRN: linear(x) * sigmoid(linear(x)). No dimension changes: [bs, seq_len, d_model]
- Parameters:
x (torch.Tensor)
- Returns:
torch.Tensor
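A minimal sketch of this gate, matching the formula in the docstring (linear(x) * sigmoid(linear(x)), no dimension change):

```python
import torch
import torch.nn as nn

class GLUSketch(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.value = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # [bs, seq_len, d_model] -> [bs, seq_len, d_model]
        return self.value(x) * torch.sigmoid(self.gate(x))
```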
dsipts.models.DilatedConvED module¶
- class dsipts.models.DilatedConvED.Block(input_channels: int, kernel_size: int, output_channels: int, input_size: int, sum_layers: bool)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: tensor) tensor ¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.DilatedConvED.DilatedConvED(sum_layers: bool, hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, dropout_rate: float = 0.1, use_bn: bool = False, use_cumsum: bool = True, use_bilinear: bool = False, activation: str = 'torch.nn.ReLU', **kwargs)¶
Bases:
Base
Initialize the model with specified parameters.
- Parameters:
sum_layers (bool) – Flag indicating whether to sum layers in the encoder/decoder blocks.
hidden_RNN (int) – Number of hidden units in the RNN.
num_layers_RNN (int) – Number of layers in the RNN.
kind (str) – Type of RNN to use (‘lstm’ or ‘gru’).
kernel_size (int) – Size of the convolutional kernel.
dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.
use_bn (bool, optional) – Flag to use batch normalization. Defaults to False.
use_cumsum (bool, optional) – Flag to use cumulative sum. Defaults to True.
use_bilinear (bool, optional) – Flag to use bilinear layers. Defaults to False.
activation (str, optional) – Activation function to use. Defaults to ‘torch.nn.ReLU’.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If the specified activation function is not recognized or if the kind is not ‘lstm’ or ‘gru’.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
It is mandatory to implement this method
- Parameters:
batch (dict) – batch of the dataloader
- Returns:
result
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
- class dsipts.models.DilatedConvED.GLU(d_model: int)¶
Bases:
Module
Gated Linear Unit, the ‘Gate’ block in the TFT paper. Sub-net of the GRN: linear(x) * sigmoid(linear(x)). No dimension changes.
- Parameters:
d_model (int) – model dimension
- forward(x: Tensor) Tensor ¶
Gated Linear Unit. Sub-net of the GRN: linear(x) * sigmoid(linear(x)). No dimension changes: [bs, seq_len, d_model]
- Parameters:
x (torch.Tensor)
- Returns:
torch.Tensor
dsipts.models.Duet module¶
- class dsipts.models.Duet.Duet(factor: int, d_model: int, n_head: int, n_layer: int, CI: bool, d_ff: int, noisy_gating: bool, num_experts: int, kernel_size: int, hidden_size: int, k: int, dropout_rate: float = 0.1, activation: str = '', **kwargs)¶
Bases:
Base
Initializes the model with the specified parameters. https://github.com/decisionintelligence/DUET
- Parameters:
factor (int) – The factor for attention scaling. NOT USED but in the original implementation
d_model (int) – The dimensionality of the model.
n_head (int) – The number of attention heads.
n_layer (int) – The number of layers in the encoder.
CI (bool) – Perform channel independent operations.
d_ff (int) – The dimensionality of the feedforward layer.
noisy_gating (bool) – Flag to indicate if noisy gating is used.
num_experts (int) – The number of experts in the mixture of experts.
kernel_size (int) – The size of the convolutional kernel.
hidden_size (int) – The size of the hidden layer.
k (int) – The number of clusters for the linear extractor.
dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.
activation (str, optional) – The activation function to use. Defaults to ‘’.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If the activation function is not recognized.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch: dict) float ¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.ITransformer module¶
- class dsipts.models.ITransformer.ITransformer(hidden_size: int, d_model: int, n_head: int, n_layer_decoder: int, use_norm: bool, class_strategy: str = 'projection', dropout_rate: float = 0.1, activation: str = '', **kwargs)¶
Bases:
Base
Initialize the ITransformer model for time series forecasting.
This class implements the Inverted Transformer architecture as described in the paper “ITRANSFORMER: INVERTED TRANSFORMERS ARE EFFECTIVE FOR TIME SERIES FORECASTING” (https://arxiv.org/pdf/2310.06625).
- Parameters:
hidden_size (int) – The first embedding size of the model (‘r’ in the paper).
d_model (int) – The second embedding size (r^{tilda} in the model). Should be smaller than hidden_size.
n_head (int) – The number of attention heads.
n_layer_decoder (int) – The number of layers in the decoder.
use_norm (bool) – Flag to indicate whether to use normalization.
class_strategy (str, optional) – The strategy for classification, can be ‘projection’, ‘average’, or ‘cls_token’. Defaults to ‘projection’.
dropout_rate (float, optional) – The dropout rate for regularization. Defaults to 0.1.
activation (str, optional) – The activation function to be used. Defaults to ‘’.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If the activation function is not recognized.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)¶
- forward(batch: dict) float ¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.Informer module¶
- class dsipts.models.Informer.Informer(d_model: int, hidden_size: int, n_layer_encoder: int, n_layer_decoder: int, mix: bool = True, activation: str = 'torch.nn.ReLU', remove_last=False, attn: str = 'prob', distil: bool = True, factor: int = 5, n_head: int = 1, dropout_rate: float = 0.1, **kwargs)¶
Bases:
Base
Initialize the model with specified parameters. https://github.com/zhouhaoyi/Informer2020/tree/main/models
- Parameters:
d_model (int) – The dimensionality of the model.
hidden_size (int) – The size of the hidden layers.
n_layer_encoder (int) – The number of layers in the encoder.
n_layer_decoder (int) – The number of layers in the decoder.
mix (bool, optional) – Whether to use mixed attention. Defaults to True.
activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.
remove_last (bool, optional) – Whether to remove the last layer. Defaults to False.
attn (str, optional) – The type of attention mechanism to use. Defaults to ‘prob’.
distil (bool, optional) – Whether to use distillation. Defaults to True.
factor (int, optional) – The factor for attention. Defaults to 5.
n_head (int, optional) – The number of attention heads. Defaults to 1.
dropout_rate (float, optional) – The dropout rate. Defaults to 0.1.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If any of the parameters are invalid.
Notes
Ensure that split_params: shift: ${model_configs.future_steps} is set, as it is required.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.LinearTS module¶
- class dsipts.models.LinearTS.LinearTS(kernel_size: int, hidden_size: int, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', kind: str = 'linear', use_bn: bool = False, simple: bool = False, **kwargs)¶
Bases:
Base
Initialize the model with specified parameters. Linear model from https://github.com/cure-lab/LTSF-Linear/blob/main/run_longExp.py
- Parameters:
kernel_size (int) – Kernel dimension for the initial moving average.
hidden_size (int) – Hidden size of the linear block.
dropout_rate (float, optional) – Dropout rate in Dropout layers. Default is 0.1.
activation (str, optional) – Activation function in PyTorch. Default is ‘torch.nn.ReLU’.
kind (str, optional) – Type of model, can be ‘linear’, ‘dlinear’ (de-trending), or ‘nlinear’ (differential). Defaults to ‘linear’.
use_bn (bool, optional) – If True, Batch Normalization layers will be added and Dropouts will be removed. Default is False.
simple (bool, optional) – If True, the model used is the same as illustrated in the paper; otherwise, a more complex model with the same idea is used. Default is False.
**kwargs – Additional keyword arguments for the parent class.
- Raises:
ValueError – If an invalid activation function is provided.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function\n THE SIMPLE IMPLEMENTATION DOES NOT USE CATEGORICAL NOR FUTURE VARIABLES'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
- class dsipts.models.LinearTS.moving_avg(kernel_size, stride)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.LinearTS.series_decomp(kernel_size)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
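These two helpers follow the LTSF-Linear recipe referenced above: a moving average extracts the trend and the residual is the seasonal part. A generic sketch of that decomposition (the padding details are an assumption):

```python
import torch

def series_decomp_sketch(x: torch.Tensor, kernel_size: int):
    # x: [batch, seq_len, channels]; pad the ends so the output keeps seq_len
    front = x[:, :1, :].repeat(1, (kernel_size - 1) // 2, 1)
    end = x[:, -1:, :].repeat(1, kernel_size // 2, 1)
    padded = torch.cat([front, x, end], dim=1)
    trend = torch.nn.functional.avg_pool1d(
        padded.permute(0, 2, 1), kernel_size=kernel_size, stride=1
    ).permute(0, 2, 1)
    seasonal = x - trend
    return seasonal, trend
```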
dsipts.models.PatchTST module¶
- class dsipts.models.PatchTST.PatchTST(d_model: int, patch_len: int, kernel_size: int, decomposition: bool = True, activation: str = 'torch.nn.ReLU', n_head: int = 1, n_layer: int = 2, stride: int = 8, remove_last: bool = False, hidden_size: int = 1048, dropout_rate: float = 0.1, **kwargs)¶
Bases:
Base
Initializes the model with the specified parameters. https://github.com/yuqinie98/PatchTST/blob/main/
- Parameters:
d_model (int) – The dimensionality of the model.
patch_len (int) – The length of the patches.
kernel_size (int) – The size of the kernel for convolutional layers.
decomposition (bool, optional) – Whether to use decomposition. Defaults to True.
activation (str, optional) – The activation function to use. Defaults to ‘torch.nn.ReLU’.
n_head (int, optional) – The number of attention heads. Defaults to 1.
n_layer (int, optional) – The number of layers in the model. Defaults to 2.
stride (int, optional) – The stride for convolutional layers. Defaults to 8.
remove_last (bool, optional) – Whether to remove the last layer. Defaults to False.
hidden_size (int, optional) – The size of the hidden layers. Defaults to 1048.
dropout_rate (float, optional) – The dropout rate for regularization. Defaults to 0.1.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If the activation function is not recognized.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan NOT handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = False¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.Persistent module¶
- class dsipts.models.Persistent.Persistent(**kwargs)¶
Bases:
Base
Simple persistence model aligned with all the others
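In a persistence baseline the forecast simply repeats the last observed value of each target; a sketch of that idea using the documented batch keys (shapes and key handling are assumptions, not the package's code):

```python
import torch

def persistence_sketch(batch: dict, future_steps: int) -> torch.Tensor:
    # select the target columns from the past window and repeat the last value
    past_targets = batch["x_num_past"][:, :, batch["idx_target"]]
    last = past_targets[:, -1:, :]             # [B, 1, out_channels]
    return last.repeat(1, future_steps, 1)     # [B, future_steps, out_channels]
```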
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = True¶
- handle_quantile_loss = False¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.RNN module¶
- class dsipts.models.RNN.MyBN(channels)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.RNN.RNN(hidden_RNN: int, num_layers_RNN: int, kind: str, kernel_size: int, activation: str = 'torch.nn.ReLU', remove_last=False, dropout_rate: float = 0.1, use_bn: bool = False, num_blocks: int = 4, bidirectional: bool = True, lstm_type: str = 'slstm', **kwargs)¶
Bases:
Base
Initialize a recurrent model with an encoder-decoder structure.
- Parameters:
hidden_RNN (int) – Hidden size of the RNN block.
num_layers_RNN (int) – Number of RNN layers.
kind (str) – Type of RNN to use: ‘gru’, ‘lstm’, or ‘xlstm’.
kernel_size (int) – Kernel size in the encoder convolutional block.
activation (str, optional) – Activation function from PyTorch. Default is ‘torch.nn.ReLU’.
remove_last (bool, optional) – If True, the model learns the difference with respect to the last seen point. Default is False.
dropout_rate (float, optional) – Dropout rate in Dropout layers. Default is 0.1.
use_bn (bool, optional) – If True, Batch Normalization layers will be added and Dropouts will be removed. Default is False.
num_blocks (int, optional) – Number of xLSTM blocks (only for xLSTM). Default is 4.
bidirectional (bool, optional) – If True, the RNN is bidirectional. Default is True.
lstm_type (str, optional) – Type of LSTM to use (only for xLSTM), either ‘slstm’ or ‘mlstm’. Default is ‘slstm’.
**kwargs – Additional keyword arguments.
- Raises:
ValueError – If the specified kind is not ‘lstm’, ‘gru’, or ‘xlstm’.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.Samformer module¶
- class dsipts.models.Samformer.Samformer(hidden_size: int, use_revin: bool, activation: str = '', **kwargs)¶
Bases:
Base
Initialize the model with specified parameters. Samformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. https://arxiv.org/pdf/2402.10198
- Parameters:
hidden_size (int) – The size of the hidden layer.
use_revin (bool) – Flag indicating whether to use RevIN (reversible instance normalization; a generic sketch follows below).
activation (str, optional) – The activation function to use. Defaults to ‘’.
**kwargs – Additional keyword arguments passed to the parent class.
- Raises:
ValueError – If the activation function is not recognized.
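RevIN standardizes each series with its own statistics before the model and restores them afterwards; a generic sketch of that idea (not the package's implementation):

```python
import torch

def revin_normalize(x: torch.Tensor, eps: float = 1e-5):
    # x: [batch, seq_len, channels]; per-instance, per-channel statistics over time
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True) + eps
    return (x - mean) / std, mean, std

def revin_denormalize(y: torch.Tensor, mean: torch.Tensor, std: torch.Tensor):
    # restore the original scale on the model output
    return y * std + mean
```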
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- forward(batch: dict) float ¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = True¶
- handle_quantile_loss = False¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.Simple module¶
- class dsipts.models.Simple.Simple(hidden_size: int, dropout_rate: float = 0.1, activation: str = 'torch.nn.ReLU', **kwargs)¶
Bases:
Base
This is the base model; each implemented model must override the init and forward methods. The inference step is optional: by default it uses the forward method, but for recurrent networks you should implement your own method.
- Parameters:
verbose (bool) – Flag to enable verbose logging.
past_steps (int) – Number of past time steps to consider.
future_steps (int) – Number of future time steps to predict.
past_channels (int) – Number of channels in the past input data.
future_channels (int) – Number of channels in the future input data.
out_channels (int) – Number of output channels.
embs_past (List[int]) – List of embedding dimensions for past data.
embs_fut (List[int]) – List of embedding dimensions for future data.
n_classes (int, optional) – Number of classes for classification. Defaults to 0.
persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.
loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.
quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.
reduction_mode (str, optional) – Reduction mode for the categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.
use_classical_positional_encoder (bool, optional) – Flag to use classical positional encoding instead of an embedding layer for the positions. Defaults to False.
emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.
optim (Union[str, None], optional) – Optimizer type. Defaults to None.
optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.
scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.
- Raises:
AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.
AssertionError – If the number of output channels is not 1 for classification tasks.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function\n THE SIMPLE IMPLEMENTATION DOES NOT USE CATEGORICAL NOR FUTURE VARIABLES'¶
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.TFT module¶
- class dsipts.models.TFT.TFT(d_model: int, num_layers_RNN: int, d_head: int, n_head: int, dropout_rate: float, **kwargs)¶
Bases:
Base
Initializes the model for time series forecasting with attention mechanisms and recurrent neural networks.
This model is designed for direct forecasting, allowing for multi-output and multi-horizon predictions. It leverages attention mechanisms to enhance the selection of relevant past time steps and learn long-term dependencies. The architecture includes RNN enrichment, gating mechanisms to minimize the impact of irrelevant variables, and the ability to output prediction intervals through quantile regression.
Key features include:
- Direct Model: predicts all future steps at once.
- Multi-Output Forecasting: capable of predicting one or more variables simultaneously.
- Multi-Horizon Forecasting: predicts variables at multiple future time steps.
- Attention-Based Mechanism: enhances the selection of relevant past time steps and learns long-term dependencies.
- RNN Enrichment: uses an LSTM for an initial autoregressive approximation, which is refined by the rest of the network.
- Gating Mechanisms: reduce the contribution of irrelevant variables.
- Prediction Intervals: outputs percentiles (e.g., 10th, 50th, 90th) at each time step.
The model also facilitates interpretability by identifying:
- Global importance of variables for both past and future.
- Temporal patterns.
- Significant events.
- Parameters:
d_model (int) – General hidden dimension across the network, adjustable in sub-networks.
num_layers_RNN (int) – Number of layers in the recurrent neural network (LSTM).
d_head (int) – Dimension of each attention head.
n_head (int) – Number of attention heads.
dropout_rate (float) – Dropout rate applied uniformly across all dropout layers.
**kwargs – Additional keyword arguments for further customization.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch: dict) Tensor ¶
Temporal Fusion Transformer
Collecting data:
- Extract the autoregressive variable(s).
- Embed them and compute a first approximate prediction.
- Build ‘summary_past’ and ‘summary_fut’, collecting data about past and future: all the different data (taken from the other tensors of the input batch) are concatenated on dimension 2 and then mixed through a mean over that dimension.
Actual TFT computations:
- Residual connection for y_past and summary_past
- Residual connection for y_fut and summary_fut
- GRN1 for past and for future
- ATTENTION(summary_fut, summary_past, y_past)
- Residual connection for the attention itself
- GRN2 for the attention
- Residual connection for attention and summary_fut
- Linear layer for the actual values and reshape
- Parameters:
batch (dict) – Keys used are [‘x_num_past’, ‘idx_target’, ‘x_num_future’, ‘x_cat_past’, ‘x_cat_future’]
- Returns:
shape [B, self.future_steps, self.out_channels, self.mul] or [B, self.future_steps, self.out_channels] according to quantiles
- Return type:
torch.Tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- remove_var(tensor: Tensor, indexes_to_exclude: int, dimension: int) Tensor ¶
Function to remove variables from tensors in chosen dimension and position
- Parameters:
tensor (torch.Tensor) – starting tensor
indexes_to_exclude (int) – indexes along the chosen dimension that we want to exclude
dimension (int) – dimension of the tensor on which we want to work
- Returns:
new tensor without the chosen variables
- Return type:
torch.Tensor
- training: bool¶
dsipts.models.TIDE module¶
- class dsipts.models.TIDE.ResidualBlock(in_size: int, out_size: int, dropout_rate: float, activation_fun: str = '')¶
Bases:
Module
Residual Block used as the basic layer of the architecture.
MLP with one hidden layer, an activation, and a skip connection. The working dimension is basically d_model, but it is better to make input_dim and output_dim explicit:
in_size and out_size handle the dimensions at different stages of the network (a minimal sketch follows the parameter list).
- Parameters:
in_size (int)
out_size (int)
dropout_rate (float)
activation_fun (str, optional) – activation function to use in the Residual Block. Defaults to nn.ReLU.
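A minimal sketch of such a residual block (one hidden layer, activation, dropout, and a linear skip connection when in_size != out_size; the internal layout is an assumption):

```python
import torch
import torch.nn as nn

class ResidualBlockSketch(nn.Module):
    def __init__(self, in_size: int, out_size: int, dropout_rate: float):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_size, out_size),
            nn.ReLU(),
            nn.Linear(out_size, out_size),
            nn.Dropout(dropout_rate),
        )
        # project the skip connection when the dimensions differ
        self.skip = nn.Linear(in_size, out_size) if in_size != out_size else nn.Identity()
        self.norm = nn.LayerNorm(out_size)

    def forward(self, x: torch.Tensor, apply_final_norm: bool = True) -> torch.Tensor:
        out = self.mlp(x) + self.skip(x)
        return self.norm(out) if apply_final_norm else out
```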
- forward(x, apply_final_norm=True)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.TIDE.TIDE(hidden_size: int, d_model: int, n_add_enc: int, n_add_dec: int, dropout_rate: float, activation: str = '', **kwargs)¶
Bases:
Base
Initializes the model with specified parameters for a neural network architecture. Long-term Forecasting with TiDE: Time-series Dense Encoder https://arxiv.org/abs/2304.08424
- Parameters:
hidden_size (int) – The size of the hidden layers.
d_model (int) – The dimensionality of the model.
n_add_enc (int) – The number of additional encoder layers.
n_add_dec (int) – The number of additional decoder layers.
dropout_rate (float) – The dropout rate to be applied in the layers.
activation (str, optional) – The activation function to be used. Defaults to an empty string.
**kwargs – Additional keyword arguments passed to the parent class.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch: dict) float ¶
training process of the diffusion network
- Parameters:
batch (dict) – variables loaded
- Returns:
total loss about the prediction of the noises over all subnets extracted
- Return type:
float
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- remove_var(tensor: Tensor, indexes_to_exclude: list, dimension: int) Tensor ¶
Function to remove variables from tensors in chosen dimension and position
- Parameters:
tensor (torch.Tensor) – starting tensor
indexes_to_exclude (list) – indexes along the chosen dimension that we want to exclude
dimension (int) – dimension of the tensor on which we want to work (not a list of dims!)
- Returns:
new tensor without the chosen variables
- Return type:
torch.Tensor
- training: bool¶
dsipts.models.TTM module¶
- class dsipts.models.TTM.TTM(model_path: str, past_steps: int, future_steps: int, freq_prefix_tuning: bool, freq: str, prefer_l1_loss: bool, prefer_longer_context: bool, loss_type: str, num_input_channels, prediction_channel_indices, exogenous_channel_indices, decoder_mode, fcm_context_length, fcm_use_mixer, fcm_mix_layers, fcm_prepend_past, enable_forecast_channel_mixing, out_channels: int, embs: List[int], remove_last=False, optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, verbose=False, use_quantiles=False, persistence_weight: float = 0.0, quantiles: List[int] = [], **kwargs)¶
Bases:
Base
TODO and FIX for future and past categorical variables
- Parameters:
model_path (str) – _description_
past_steps (int) – _description_
future_steps (int) – _description_
freq_prefix_tuning (bool) – _description_
freq (str) – _description_
prefer_l1_loss (bool) – _description_
loss_type (str) – _description_
num_input_channels (_type_) – _description_
prediction_channel_indices (_type_) – _description_
exogenous_channel_indices (_type_) – _description_
decoder_mode (_type_) – _description_
fcm_context_length (_type_) – _description_
fcm_use_mixer (_type_) – _description_
fcm_mix_layers (_type_) – _description_
fcm_prepend_past (_type_) – _description_
enable_forecast_channel_mixing (_type_) – _description_
out_channels (int) – _description_
embs (List[int]) – _description_
remove_last (bool, optional) – _description_. Defaults to False.
optim (Union[str,None], optional) – _description_. Defaults to None.
optim_config (dict, optional) – _description_. Defaults to None.
scheduler_config (dict, optional) – _description_. Defaults to None.
verbose (bool, optional) – _description_. Defaults to False.
use_quantiles (bool, optional) – _description_. Defaults to False.
persistence_weight (float, optional) – _description_. Defaults to 0.0.
quantiles (List[int], optional) – _description_. Defaults to [].
- forward(batch)¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
dsipts.models.TimeXER module¶
- class dsipts.models.TimeXER.TimeXER(patch_len: int, d_model: int, n_head: int, d_ff: int = 512, dropout_rate: float = 0.1, n_layer_decoder: int = 1, activation: str = '', **kwargs)¶
Bases:
Base
Initialize the model with specified parameters. https://github.com/thuml/Time-Series-Library/blob/main/models/TimeMixer.py
- Parameters:
patch_len (int) – Length of the patches.
d_model (int) – Dimension of the model.
n_head (int) – Number of attention heads.
d_ff (int, optional) – Dimension of the feedforward network. Defaults to 512.
dropout_rate (float, optional) – Dropout rate for regularization. Defaults to 0.1.
n_layer_decoder (int, optional) – Number of layers in the decoder. Defaults to 1.
activation (str, optional) – Activation function to use. Defaults to ‘’.
**kwargs – Additional keyword arguments passed to the superclass.
- Raises:
ValueError – If an invalid activation function is provided.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can handle multivariate output \nCan handle future covariates\nCan handle categorical covariates\nCan handle Quantile loss function'¶
- forward(batch: dict) float ¶
Forward method used during the training loop
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array.
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = True¶
- handle_future_covariates = True¶
- handle_multivariate = True¶
- handle_quantile_loss = True¶
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.VQVAEA module¶
- class dsipts.models.VQVAEA.VQVAEA(past_steps: int, future_steps: int, past_channels: int, future_channels: int, hidden_channels: int, embs: List[int], d_model: int, max_voc_size: int, num_layers: int, dropout_rate: float, commitment_cost: float, decay: float, n_heads: int, out_channels: int, epoch_vqvae: int, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, **kwargs)¶
Bases:
Base
Custom encoder-decoder
- Parameters:
past_steps (int) – number of past datapoints used
future_steps (int) – number of future lag to predict
past_channels (int) – number of numeric past variables, must be >0
future_channels (int) – number of future numeric variables
embs (List) – list of the initial dimension of the categorical variables
cat_emb_dim (int) – final dimension of each categorical variable
hidden_RNN (int) – hidden size of the RNN block
num_layers_RNN (int) – number of RNN layers
kind (str) – one among GRU or LSTM
kernel_size (int) – kernel size in the encoder convolutional block
sum_emb (bool) – if true the contribution of each embedding will be summed-up otherwise stacked
out_channels (int) – number of output channels
activation (str, optional) – PyTorch activation function. Defaults to torch.nn.ReLU.
remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point
persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0.
loss_type (str, optional) – this model uses custom losses, l1, or mse. Custom losses can be linear_penalization or exponential_penalization. Defaults to l1.
quantiles (List[int], optional) – quantile loss is used if len(quantiles) > 0 (usually 0.1, 0.5, 0.9), otherwise L1Loss is used when len(quantiles) == 0. Defaults to [].
dropout_rate (float, optional) – dropout rate in Dropout layers
use_bn (bool, optional) – if true BN layers will be added and dropouts will be removed
use_glu (bool, optional) – use GLU for feature selection. Defaults to True.
glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.
n_classes (int) – number of classes (0 in regression)
optim (str, optional) – if not None it expects a pytorch optim method. Defaults to None, which is mapped to Adam.
optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.
scheduler_config (dict, optional) – configuration for stepLR scheduler. Defaults to None.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can NOT handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop.
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array
- Returns:
output of the model
- Return type:
torch.tensor
- generate(idx, max_new_tokens, temperature=1.0, do_sample=False, top_k=None, num_samples=100)¶
Take a conditioning sequence of indices idx (LongTensor of shape (b,t)) and complete the sequence max_new_tokens times, feeding the predictions back into the model each time. Most likely you’ll want to make sure to be in model.eval() mode of operation for this.
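A hedged usage sketch of the autoregressive loop this method describes; the vocabulary size is illustrative and model stands for a trained VQVAEA instance, so the call itself is shown commented out.
import torch

idx = torch.randint(0, 512, (4, 32))   # LongTensor of shape (b, t): 4 conditioning sequences of 32 tokens
# model.eval()                          # the docstring recommends eval mode
# with torch.no_grad():
#     completed = model.generate(
#         idx,
#         max_new_tokens=16,            # append 16 new tokens, one at a time
#         temperature=1.0,
#         do_sample=True,               # sample instead of greedy selection (assumed semantics)
#         top_k=50,                     # restrict sampling to the 50 most likely tokens (assumed semantics)
#         num_samples=100,
#     )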
- gpt(tokens)¶
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = False¶
- handle_quantile_loss = False¶
- inference(batch: dict) → tensor ¶
Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)
- Parameters:
batch (dict) – batch
- Returns:
result
- Return type:
torch.tensor
- prepare_data_per_node: bool¶
- training: bool¶
- dsipts.models.VQVAEA.random() → x in the interval [0, 1) ¶
dsipts.models.VVA module¶
- class dsipts.models.VVA.VVA(past_steps: int, future_steps: int, past_channels: int, future_channels: int, embs: List[int], d_model: int, max_voc_size: int, token_split: int, num_layers: int, dropout_rate: float, n_heads: int, out_channels: int, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None, **kwargs)¶
Bases:
Base
Custom encoder-decoder
- Parameters:
past_steps (int) – number of past datapoints used
future_steps (int) – number of future lag to predict
past_channels (int) – number of numeric past variables, must be >0
future_channels (int) – number of future numeric variables
embs (List) – list of the initial dimension of the categorical variables
cat_emb_dim (int) – final dimension of each categorical variable
hidden_RNN (int) – hidden size of the RNN block
num_layers_RNN (int) – number of RNN layers
kind (str) – one among GRU or LSTM
kernel_size (int) – kernel size in the encoder convolutional block
sum_emb (bool) – if true the contribution of each embedding will be summed-up otherwise stacked
out_channels (int) – number of output channels
activation (str, optional) – PyTorch activation function. Defaults to torch.nn.ReLU.
remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point
persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0.
loss_type (str, optional) – this model uses custom losses, l1, or mse. Custom losses can be linear_penalization or exponential_penalization. Defaults to l1.
quantiles (List[int], optional) – quantile loss is used if len(quantiles) > 0 (usually 0.1, 0.5, 0.9), otherwise L1Loss is used when len(quantiles) == 0. Defaults to [].
dropout_rate (float, optional) – dropout rate in Dropout layers
use_bn (bool, optional) – if true BN layers will be added and dropouts will be removed
use_glu (bool, optional) – use GLU for feature selection. Defaults to True.
glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.
n_classes (int) – number of classes (0 in regression)
optim (str, optional) – if not None it expects a pytorch optim method. Defaults to None, which is mapped to Adam.
optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.
scheduler_config (dict, optional) – configuration for stepLR scheduler. Defaults to None.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- configure_optimizers()¶
This long function is unfortunately doing something very simple and is being very defensive: We are separating out all parameters of the model into two buckets: those that will experience weight decay for regularization and those that won’t (biases, and layernorm/embedding weights). We are then returning the PyTorch optimizer object.
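The bucketing described above can be sketched in generic PyTorch as follows; this is an illustration of the weight-decay/no-decay split, not the actual body of configure_optimizers, and the AdamW choice and hyperparameters are assumptions.
import torch
from torch import nn

def make_optimizer(model: nn.Module, lr: float = 1e-3, weight_decay: float = 1e-2):
    # Decay weights of ordinary layers; never decay biases or norm/embedding weights.
    decay, no_decay = [], []
    for module in model.modules():
        for name, param in module.named_parameters(recurse=False):
            if not param.requires_grad:
                continue
            if name.endswith("bias") or isinstance(module, (nn.LayerNorm, nn.Embedding, nn.BatchNorm1d)):
                no_decay.append(param)
            else:
                decay.append(param)
    groups = [g for g in (
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ) if g["params"]]
    return torch.optim.AdamW(groups, lr=lr)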
- description = 'Can NOT handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- forward(batch)¶
Forward method used during the training loop.
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array
- Returns:
output of the model
- Return type:
torch.tensor
- generate(idx, max_new_tokens, temperature=1.0, do_sample=False, top_k=None, num_samples=100)¶
Take a conditioning sequence of indices idx (LongTensor of shape (b,t)) and complete the sequence max_new_tokens times, feeding the predictions back into the model each time. Most likely you’ll want to make sure to be in model.eval() mode of operation for this.
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = False¶
- handle_quantile_loss = False¶
- inference(batch: dict) → tensor ¶
Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)
- Parameters:
batch (dict) – batch
- Returns:
result
- Return type:
torch.tensor
- prepare_data_per_node: bool¶
- training: bool¶
dsipts.models.base module¶
- class dsipts.models.base.Base(verbose: bool, past_steps: int, future_steps: int, past_channels: int, future_channels: int, out_channels: int, embs_past: List[int], embs_fut: List[int], n_classes: int = 0, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], reduction_mode: str = 'mean', use_classical_positional_encoder: bool = False, emb_dim: int = 16, optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None)¶
Bases:
LightningModule
This is the base model: each implemented model must override the init method and the forward method. The inference step is optional; by default it uses the forward method, but for recurrent networks you should implement your own method.
- Parameters:
verbose (bool) – Flag to enable verbose logging.
past_steps (int) – Number of past time steps to consider.
future_steps (int) – Number of future time steps to predict.
past_channels (int) – Number of channels in the past input data.
future_channels (int) – Number of channels in the future input data.
out_channels (int) – Number of output channels.
embs_past (List[int]) – List of embedding dimensions for past data.
embs_fut (List[int]) – List of embedding dimensions for future data.
n_classes (int, optional) – Number of classes for classification. Defaults to 0.
persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.
loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.
quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.
reduction_mode (str, optional) – Mode for reduction for categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.
use_classical_positional_encoder (bool, optional) – Flag to use classical (sinusoidal) positional encoding instead of an embedding layer for the positions. Defaults to False.
emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.
optim (Union[str, None], optional) – Optimizer type. Defaults to None.
optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.
scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.
- Raises:
AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.
AssertionError – If the number of output channels is not 1 for classification tasks.
- description = 'Can NOT handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- abstractmethod forward(batch: dict) → tensor ¶
Forward method used during the training loop.
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = False¶
- handle_quantile_loss = False¶
- inference(batch: dict) → tensor ¶
Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)
- Parameters:
batch (dict) – batch
- Returns:
result
- Return type:
torch.tensor
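Since each concrete model must override __init__ and forward, a minimal subclass sketch is shown below; the single-linear-head architecture, the (batch, future_steps, out_channels) output shape and the idea that the remaining Base arguments arrive through **kwargs are all assumptions for illustration.
import torch
from torch import nn
from dsipts.models.base import Base

class MyTinyModel(Base):
    # Illustrative subclass: one linear head from the flattened past numeric
    # window to the full forecast horizon.
    def __init__(self, past_steps: int, future_steps: int,
                 past_channels: int, out_channels: int, **kwargs):
        super().__init__(past_steps=past_steps, future_steps=future_steps,
                         past_channels=past_channels, out_channels=out_channels,
                         **kwargs)
        self.future_steps = future_steps
        self.out_channels = out_channels
        self.head = nn.Linear(past_steps * past_channels,
                              future_steps * out_channels)

    def forward(self, batch: dict) -> torch.Tensor:
        x = batch["x_num_past"]                 # assumed shape (B, past_steps, past_channels)
        b = x.shape[0]
        out = self.head(x.reshape(b, -1))
        return out.reshape(b, self.future_steps, self.out_channels)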
- dsipts.models.base.dilate_loss(outputs, targets, alpha, gamma, device)¶
- dsipts.models.base.standardize_momentum(x, order)¶
dsipts.models.base_v2 module¶
- class dsipts.models.base_v2.Base(verbose: bool, past_steps: int, future_steps: int, past_channels: int, future_channels: int, out_channels: int, embs_past: List[int], embs_fut: List[int], n_classes: int = 0, persistence_weight: float = 0.0, loss_type: str = 'l1', quantiles: List[int] = [], reduction_mode: str = 'mean', use_classical_positional_encoder: bool = False, emb_dim: int = 16, optim: str | None = None, optim_config: dict = None, scheduler_config: dict = None)¶
Bases:
LightningModule
This is the base model: each implemented model must override the init method and the forward method. The inference step is optional; by default it uses the forward method, but for recurrent networks you should implement your own method.
- Parameters:
verbose (bool) – Flag to enable verbose logging.
past_steps (int) – Number of past time steps to consider.
future_steps (int) – Number of future time steps to predict.
past_channels (int) – Number of channels in the past input data.
future_channels (int) – Number of channels in the future input data.
out_channels (int) – Number of output channels.
embs_past (List[int]) – List of embedding dimensions for past data.
embs_fut (List[int]) – List of embedding dimensions for future data.
n_classes (int, optional) – Number of classes for classification. Defaults to 0.
persistence_weight (float, optional) – Weight for persistence in loss calculation. Defaults to 0.0.
loss_type (str, optional) – Type of loss function to use (‘l1’ or ‘mse’). Defaults to ‘l1’.
quantiles (List[int], optional) – List of quantiles for quantile loss. Defaults to an empty list.
reduction_mode (str, optional) – Mode for reduction for categorical embedding layer (‘mean’, ‘sum’, ‘none’). Defaults to ‘mean’.
use_classical_positional_encoder (bool, optional) – Flag to use classical (sinusoidal) positional encoding instead of an embedding layer for the positions. Defaults to False.
emb_dim (int, optional) – Dimension of categorical embeddings. Defaults to 16.
optim (Union[str, None], optional) – Optimizer type. Defaults to None.
optim_config (dict, optional) – Configuration for the optimizer. Defaults to None.
scheduler_config (dict, optional) – Configuration for the learning rate scheduler. Defaults to None.
- Raises:
AssertionError – If the number of quantiles is not equal to 3 when quantiles are provided.
AssertionError – If the number of output channels is not 1 for classification tasks.
- allow_zero_length_dataloader_with_multiple_devices: bool¶
- description = 'Can NOT handle multivariate output \nCan NOT handle future covariates\nCan NOT handle categorical covariates\nCan NOT handle Quantile loss function'¶
- abstractmethod forward(batch: dict) → tensor ¶
Forward method used during the training loop.
- Parameters:
batch (dict) – the batch structure. The keys are: y: the target variable(s), always present; x_num_past: the numerical past variables, always present; x_num_future: the numerical future variables; x_cat_past: the categorical past variables; x_cat_future: the categorical future variables; idx_target: index of the target features in the past array
- Returns:
output of the model
- Return type:
torch.tensor
- handle_categorical_variables = False¶
- handle_future_covariates = False¶
- handle_multivariate = False¶
- handle_quantile_loss = False¶
- inference(batch: dict) → tensor ¶
Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN)
- Parameters:
batch (dict) – batch
- Returns:
result
- Return type:
torch.tensor
- prepare_data_per_node: bool¶
- training: bool¶
- dsipts.models.base_v2.dilate_loss(outputs, targets, alpha, gamma, device)¶
- dsipts.models.base_v2.standardize_momentum(x, order)¶
dsipts.models.utils module¶
- class dsipts.models.utils.CPRS(alpha=0.5, reduction='mean')¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(y_hat, target, weights=None)¶
Compute the almost fair CRPS loss (efficient version).
- Parameters:
y_hat – Ensemble tensor of shape (batch_size, n_members, …)
target – Tensor of shape (batch_size, …)
weights – Optional per-variable or per-location weights
- Returns:
Loss tensor
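A hedged usage sketch, following the shapes given in the parameter description; the concrete sizes are illustrative.
import torch
from dsipts.models.utils import CPRS

ensemble = torch.randn(32, 10, 24)     # (batch_size, n_members, ...): 10 ensemble members, 24 steps
target = torch.randn(32, 24)           # (batch_size, ...)

loss_fn = CPRS(alpha=0.5, reduction="mean")
loss = loss_fn(ensemble, target)       # optional per-variable weights could be passed as a third argument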
- class dsipts.models.utils.Embedding_cat_variables(length: int, d_model: int, emb_dims: list, reduction_mode: str = 'mean', use_classical_positional_encoder: bool = False, device: str = 'cpu')¶
Bases:
Module
Embeds categorical variables with optional positional encodings.
- Parameters:
length (int) – Sequence length (e.g., total time steps).
d_model (int) – Output embedding dimension.
emb_dims (list) – Vocabulary sizes for each categorical feature.
reduction_mode (str) – ‘mean’, ‘sum’, or ‘none’.
use_classical_positional_encoder (bool) – Whether to use sinusoidal positional encoding.
device (str) – Device name (e.g., ‘cpu’ or ‘cuda’).
Notes
If reduction_mode is ‘none’, all embeddings are concatenated.
If use_classical_positional_encoder is True, uses fixed sin/cos encoding.
If False, treats position as a categorical variable and embeds it.
- forward(BS: int, x: Tensor | None) → Tensor ¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- get_cat_n_embd(cat_vars)¶
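A hedged usage sketch; the assumption that x holds integer codes of shape (batch, length, n_features) matching emb_dims, and the behaviour with x=None, are not guaranteed by this page.
import torch
from dsipts.models.utils import Embedding_cat_variables

# Two categorical features with vocabulary sizes 12 (e.g. month) and 7 (e.g. weekday),
# embedded into a 32-dimensional space over a window of 64 steps (illustrative values).
emb = Embedding_cat_variables(
    length=64,
    d_model=32,
    emb_dims=[12, 7],
    reduction_mode="mean",                    # 'mean', 'sum', or 'none'
    use_classical_positional_encoder=False,   # embed the position as a categorical variable
    device="cpu",
)

x = torch.stack([torch.randint(0, 12, (8, 64)),
                 torch.randint(0, 7, (8, 64))], dim=-1)   # assumed (batch, length, n_features)
out = emb(8, x)   # BS=8; with x=None only the positional part would remain (assumption)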
- class dsipts.models.utils.L1Loss¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(preds, target)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.utils.PathDTWBatch(*args, **kwargs)¶
Bases:
Function
- static backward(ctx, grad_output)¶
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses. (Defining this function is equivalent to defining the vjp function.)
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- static forward(ctx, D, gamma)¶
Define the forward of the custom autograd Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types). See combining-forward-context for more details.
Usage 2 (Separate forward and ctx):
@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass

@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass
The forward no longer accepts a ctx argument. Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward. See extending-autograd for more details.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used in jvp.
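PathDTWBatch and SoftDTWBatch follow the "Usage 1" pattern documented above; a generic, self-contained example of that pattern (unrelated to DTW, purely illustrative) is:
import torch

class ClampedSquare(torch.autograd.Function):
    # y = clamp(x, -1, 1) ** 2, with the gradient supplied in backward().
    @staticmethod
    def forward(ctx, x):
        clamped = x.clamp(-1.0, 1.0)
        ctx.save_for_backward(clamped)    # stash tensors needed by backward
        return clamped ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (clamped,) = ctx.saved_tensors
        # derivative is 2*x inside (-1, 1) and 0 where the clamp saturates
        return grad_output * 2.0 * clamped * (clamped.abs() < 1.0)

x = torch.randn(5, requires_grad=True)
ClampedSquare.apply(x).sum().backward()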
- class dsipts.models.utils.Permute¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(input)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class dsipts.models.utils.QuantileLossMO(quantiles)¶
Bases:
Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(preds, target)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
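The class body is not reproduced on this page; for reference, a generic pinball (quantile) loss over a set of quantiles can be written as below. This is a standalone illustration, not QuantileLossMO's code.
import torch

def pinball_loss(preds, target, quantiles):
    # preds: (..., n_quantiles), target: (...). loss_q = max(q*e, (q-1)*e) with e = target - pred.
    losses = []
    for i, q in enumerate(quantiles):
        e = target - preds[..., i]
        losses.append(torch.max(q * e, (q - 1) * e))
    return torch.mean(torch.stack(losses))

preds = torch.randn(8, 16, 3)              # three quantile heads
target = torch.randn(8, 16)
loss = pinball_loss(preds, target, [0.1, 0.5, 0.9])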
- class dsipts.models.utils.SinkhornDistance(eps, max_iter, reduction='none')¶
Bases:
object
Given two empirical measures each with \(P_1\) locations \(x\in\mathbb{R}^{D_1}\) and \(P_2\) locations \(y\in\mathbb{R}^{D_2}\), outputs an approximation of the regularized OT cost for point clouds.
- Parameters:
eps (float) – regularization coefficient
max_iter (int) – maximum number of Sinkhorn iterations
reduction (string, optional) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied, ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed. Default: ‘none’
- Shape:
Input: \((N, P_1, D_1)\), \((N, P_2, D_2)\)
Output: \((N)\) or \(()\), depending on reduction
- M(C, u, v)¶
Modified cost for logarithmic updates
- static ave(u, u1, tau)¶
Barycenter subroutine, used by kinetic acceleration through extrapolation.
- compute(x, y)¶
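A hedged usage sketch with the input shapes listed above; the exact structure of the value returned by compute is an assumption.
import torch
from dsipts.models.utils import SinkhornDistance

x = torch.randn(4, 50, 3)   # (N, P1, D1): 4 point clouds of 50 points in R^3
y = torch.randn(4, 60, 3)   # (N, P2, D2): 4 point clouds of 60 points in R^3

sinkhorn = SinkhornDistance(eps=0.1, max_iter=100, reduction="none")
result = sinkhorn.compute(x, y)   # approximate regularized OT cost per pair (assumed)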
- class dsipts.models.utils.SoftDTWBatch(*args, **kwargs)¶
Bases:
Function
- static backward(ctx, grad_output)¶
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses. (Defining this function is equivalent to defining the vjp function.)
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- static forward(ctx, D, gamma=1.0)¶
Define the forward of the custom autograd Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
@staticmethod
def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
    pass
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types). See combining-forward-context for more details.
Usage 2 (Separate forward and ctx):
@staticmethod
def forward(*args: Any, **kwargs: Any) -> Any:
    pass

@staticmethod
def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
    pass
The forward no longer accepts a ctx argument. Instead, you must also override the torch.autograd.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward. See extending-autograd for more details.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used in jvp.
- dsipts.models.utils.compute_softdtw(D, gamma)¶
- dsipts.models.utils.compute_softdtw_backward(D_, R, gamma)¶
- dsipts.models.utils.dtw_grad(theta, gamma)¶
- dsipts.models.utils.dtw_hessian_prod(theta, Z, Q, E, gamma)¶
- dsipts.models.utils.get_activation(activation)¶
- dsipts.models.utils.get_scope(handle_multivariate, handle_future_covariates, handle_categorical_variables, handle_quantile_loss)¶
- dsipts.models.utils.my_max(x, gamma)¶
- dsipts.models.utils.my_max_hessian_product(p, z, gamma)¶
- dsipts.models.utils.my_min(x, gamma)¶
- dsipts.models.utils.my_min_hessian_product(p, z, gamma)¶
- dsipts.models.utils.pairwise_distances(x, y=None)¶
- Input: x is a Nxd matrix; y is an optional Mxd matrix.
- Output: dist is a NxM matrix where dist[i,j] is the squared norm between x[i,:] and y[j,:], i.e. dist[i,j] = ||x[i,:] - y[j,:]||^2.
If y is not given then y = x is used.
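The formula above expands to ||x[i]||^2 - 2·x[i]·y[j] + ||y[j]||^2, which is how such a function is usually vectorized; a standalone sketch of that identity (not necessarily this package's implementation):
import torch
from typing import Optional

def pairwise_sq_dists(x: torch.Tensor, y: Optional[torch.Tensor] = None) -> torch.Tensor:
    # x: (N, d), y: (M, d) or None (then y = x). Returns (N, M) squared distances.
    if y is None:
        y = x
    x_norm = (x ** 2).sum(dim=1, keepdim=True)    # (N, 1)
    y_norm = (y ** 2).sum(dim=1).unsqueeze(0)     # (1, M)
    dist = x_norm - 2.0 * x @ y.t() + y_norm      # broadcasts to (N, M)
    return dist.clamp(min=0.0)                    # guard against tiny negative values

d = pairwise_sq_dists(torch.randn(5, 3))          # y defaults to x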
- dsipts.models.utils.weight_init(m)¶
- Usage:
model = Model()
model.apply(weight_init)
- dsipts.models.utils.weight_init_zeros(m)¶