dsipts.models.VVA module

class dsipts.models.VVA.VVA(past_steps, future_steps, past_channels, future_channels, embs, d_model, max_voc_size, token_split, num_layers, dropout_rate, n_heads, out_channels, persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Bases: Base

Custom encoder-decoder

Parameters:
  • past_steps (int) – number of past datapoints used

  • future_steps (int) – number of future lags to predict

  • past_channels (int) – number of numeric past variables, must be >0

  • future_channels (int) – number of future numeric variables

  • embs (List) – list of the initial dimension of the categorical variables

  • cat_emb_dim (int) – final dimension of each categorical variable

  • hidden_RNN (int) – hidden size of the RNN block

  • num_layers_RNN (int) – number of RNN layers

  • kind (str) – one among GRU or LSTM

  • kernel_size (int) – kernel size in the encoder convolutional block

  • sum_emb (bool) – if True the contributions of the embeddings are summed, otherwise stacked

  • out_channels (int) – number of output channels

  • activation (str, optional) – PyTorch activation function. Defaults to torch.nn.ReLU

  • remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point

  • persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0.0

  • loss_type (str, optional) – loss to use: l1, mse, or one of the custom losses linear_penalization or exponential_penalization. Defaults to l1.

  • quantiles (List[float], optional) – quantile loss is used if len(quantiles) > 0 (usually [0.1, 0.5, 0.9]); L1 loss is used if len(quantiles) == 0. Defaults to [].

  • dropout_rate (float, optional) – dropout rate in Dropout layers

  • use_bn (bool, optional) – if True, BatchNorm layers will be added and dropout layers removed

  • use_glu (bool,optional) – use GLU for feature selection. Defaults to True.

  • glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.

  • n_classes (int) – number of classes (0 in regression)

  • optim (str, optional) – name of a PyTorch optimizer. Defaults to None, which is mapped to Adam.

  • optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.

  • scheduler_config (dict, optional) – configuration for the StepLR scheduler. Defaults to None.
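A minimal sketch of how the constructor arguments fit together. All hyper-parameter values below are hypothetical placeholders, not recommended settings; instantiating the model itself requires dsipts to be installed.

```python
# Hypothetical hyper-parameters for VVA; values are for illustration only.
vva_config = {
    "past_steps": 64,        # past datapoints fed to the encoder
    "future_steps": 16,      # future lags to predict
    "past_channels": 3,      # numeric past variables (must be > 0)
    "future_channels": 0,    # numeric future variables
    "embs": [],              # initial dimensions of categorical variables
    "d_model": 128,
    "max_voc_size": 512,
    "token_split": 4,
    "num_layers": 2,
    "dropout_rate": 0.1,
    "n_heads": 4,
    "out_channels": 1,
    "loss_type": "l1",
    "quantiles": [],         # empty list -> L1 loss instead of quantile loss
}
# model = dsipts.models.VVA.VVA(**vva_config)  # requires dsipts installed
```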

handle_multivariate = False
handle_future_covariates = False
handle_categorical_variables = False
handle_quantile_loss = False
description = 'Can NOT  handle multivariate output \nCan NOT  handle future covariates\nCan NOT  handle categorical covariates\nCan NOT  handle Quantile loss function'
__init__(past_steps, future_steps, past_channels, future_channels, embs, d_model, max_voc_size, token_split, num_layers, dropout_rate, n_heads, out_channels, persistence_weight=0.0, loss_type='l1', quantiles=[], optim=None, optim_config=None, scheduler_config=None, **kwargs)[source]

Custom encoder-decoder

Parameters:
  • past_steps (int) – number of past datapoints used

  • future_steps (int) – number of future lags to predict

  • past_channels (int) – number of numeric past variables, must be >0

  • future_channels (int) – number of future numeric variables

  • embs (List) – list of the initial dimension of the categorical variables

  • cat_emb_dim (int) – final dimension of each categorical variable

  • hidden_RNN (int) – hidden size of the RNN block

  • num_layers_RNN (int) – number of RNN layers

  • kind (str) – one among GRU or LSTM

  • kernel_size (int) – kernel size in the encoder convolutional block

  • sum_emb (bool) – if True the contributions of the embeddings are summed, otherwise stacked

  • out_channels (int) – number of output channels

  • activation (str, optional) – PyTorch activation function. Defaults to torch.nn.ReLU

  • remove_last (bool, optional) – if True the model learns the difference with respect to the last seen point

  • persistence_weight (float) – weight controlling the divergence from the persistence model. Defaults to 0.0

  • loss_type (str, optional) – loss to use: l1, mse, or one of the custom losses linear_penalization or exponential_penalization. Defaults to l1.

  • quantiles (List[float], optional) – quantile loss is used if len(quantiles) > 0 (usually [0.1, 0.5, 0.9]); L1 loss is used if len(quantiles) == 0. Defaults to [].

  • dropout_rate (float, optional) – dropout rate in Dropout layers

  • use_bn (bool, optional) – if True, BatchNorm layers will be added and dropout layers removed

  • use_glu (bool,optional) – use GLU for feature selection. Defaults to True.

  • glu_percentage (float, optional) – percentage of features to use. Defaults to 1.0.

  • n_classes (int) – number of classes (0 in regression)

  • optim (str, optional) – name of a PyTorch optimizer. Defaults to None, which is mapped to Adam.

  • optim_config (dict, optional) – configuration for Adam optimizer. Defaults to None.

  • scheduler_config (dict, optional) – configuration for the StepLR scheduler. Defaults to None.

configure_optimizers()[source]

This long function is unfortunately doing something very simple and is being very defensive: We are separating out all parameters of the model into two buckets: those that will experience weight decay for regularization and those that won’t (biases, and layernorm/embedding weights). We are then returning the PyTorch optimizer object.
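The bucketing idea can be sketched in pure Python. The real method inspects the model's modules and returns a PyTorch optimizer; this sketch only shows the partitioning rule, using hypothetical parameter names and name-based heuristics for illustration.

```python
def split_decay_buckets(param_names):
    """Partition parameter names into weight-decay / no-decay buckets.

    Biases and layernorm/embedding weights are excluded from weight decay;
    everything else (e.g. linear/attention weights) gets decayed.
    """
    decay, no_decay = set(), set()
    for name in param_names:
        if name.endswith("bias") or ".ln" in name or "embedding" in name:
            no_decay.add(name)
        else:
            decay.add(name)
    return decay, no_decay

# Hypothetical parameter names, as produced by model.named_parameters()
params = [
    "encoder.attn.weight",       # regular weight -> decayed
    "encoder.attn.bias",         # bias -> not decayed
    "encoder.ln1.weight",        # layernorm weight -> not decayed
    "token_embedding.weight",    # embedding weight -> not decayed
]
decay, no_decay = split_decay_buckets(params)
```

The two buckets would then become two parameter groups handed to the optimizer, one with `weight_decay` set and one with `weight_decay=0.0`.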

forward(batch)[source]

Forward method used during the training loop

Parameters:

batch (dict) – the batch structure. The keys are:
  • y – the target variable(s); always present
  • x_num_past – the numerical past variables; always present
  • x_num_future – the numerical future variables
  • x_cat_past – the categorical past variables
  • x_cat_future – the categorical future variables
  • idx_target – index of the target features in the past array

Returns:

output of the model

Return type:

torch.tensor
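A minimal sketch of the batch structure forward() expects. In practice the values are tensors; plain lists stand in here, and all shapes are hypothetical.

```python
# Hypothetical batch for a model with 2 past numeric channels and 1 target.
# Real batches hold torch tensors; lists are used here for illustration.
batch = {
    "y": [[0.1], [0.2]],              # target variable(s); always present
    "x_num_past": [[0.5, 0.3]],       # numerical past variables; always present
    "x_num_future": [[0.0]],          # numerical future variables (if any)
    "x_cat_past": [[1]],              # categorical past variables (if any)
    "x_cat_future": [[2]],            # categorical future variables (if any)
    "idx_target": [0],                # index of the target in the past array
}
required = {"y", "x_num_past"}
```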

generate(idx, max_new_tokens, temperature=1.0, do_sample=False, top_k=None, num_samples=100)[source]

Take a conditioning sequence of indices idx (LongTensor of shape (b,t)) and complete the sequence max_new_tokens times, feeding the predictions back into the model each time. Most likely you’ll want to make sure to be in model.eval() mode of operation for this.
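The loop generate() performs can be sketched in pure Python. The `next_token` callable below is a hypothetical stand-in for the model's forward pass plus sampling; the real method works on LongTensors and supports temperature, top-k, and sampling options.

```python
def generate_sketch(idx, max_new_tokens, next_token):
    """Autoregressive completion: extend idx by max_new_tokens tokens,
    feeding each prediction back in as conditioning for the next step."""
    seq = list(idx)
    for _ in range(max_new_tokens):
        seq.append(next_token(seq))  # predictions are fed back into the model
    return seq
```

Before calling the real method, the model should normally be switched to evaluation mode (model.eval()), as the docstring notes.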

inference(batch)[source]

Usually it is enough to return the output of the forward method, but some models (e.g. RNNs) require a different inference procedure.

Parameters:

batch (dict) – batch

Returns:

result

Return type:

torch.tensor