dsipts.models.ttm package¶

Submodules¶

dsipts.models.ttm.configuration_tinytimemixer module¶

TinyTimeMixer model configuration

class dsipts.models.ttm.configuration_tinytimemixer.TinyTimeMixerConfig(context_length: int = 64, patch_length: int = 8, num_input_channels: int = 1, prediction_length: int = 16, patch_stride: int = 8, prediction_channel_indices: list | None = None, exogenous_channel_indices: list | None = None, d_model: int = 16, expansion_factor: int = 2, num_layers: int = 3, dropout: float = 0.2, mode: str = 'common_channel', gated_attn: bool = True, norm_mlp: str = 'LayerNorm', self_attn: bool = False, self_attn_heads: int = 1, use_positional_encoding: bool = False, positional_encoding_type: str = 'sincos', scaling: str | bool | None = 'std', loss: str | None = 'mse', init_std: float = 0.02, post_init: bool = False, norm_eps: float = 1e-05, adaptive_patching_levels: int = 0, resolution_prefix_tuning: bool = False, frequency_token_vocab_size: int = 5, head_dropout: float = 0.2, distribution_output: str = 'student_t', num_parallel_samples: int = 100, decoder_num_layers: int = 8, decoder_d_model: int = 8, decoder_adaptive_patching_levels: int = 0, decoder_raw_residual: bool = False, decoder_mode: str = 'common_channel', use_decoder: bool = True, enable_forecast_channel_mixing: bool = False, fcm_gated_attn: bool = True, fcm_context_length: int = 1, fcm_use_mixer: bool = False, fcm_mix_layers: int = 2, fcm_prepend_past: bool = True, fcm_prepend_past_offset: int | None = None, categorical_vocab_size_list: list | None = None, prediction_filter_length: int | None = None, init_linear: str = 'pytorch', init_embed: str = 'pytorch', quantile: float = 0.5, huber_delta: float = 1, mask_value: int = 0, **kwargs)¶

Bases: PretrainedConfig

This is the configuration class to store the configuration of a [TinyTimeMixerModel]. It is used to instantiate a TinyTimeMixer model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the TinyTimeMixer architecture.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

Parameters:

context_length (int, optional, defaults to 64) – The context/history length for the input sequence.
patch_length (int, optional, defaults to 8) – The patch length for the input sequence.
num_input_channels (int, optional, defaults to 1) – Number of input variates. For univariate series, set it to 1.
patch_stride (int, optional, defaults to 8) – Number of points to stride. If its value is the same as patch_length, we get non-overlapping patches.
d_model (int, optional, defaults to 16) – Hidden feature size of the model.
prediction_length (int, optional, defaults to 16) – Number of time steps to forecast for a forecasting task. Also known as the Forecast Horizon.
num_parallel_samples (int, optional, defaults to 100) – The number of samples to generate in parallel for probabilistic forecast.
expansion_factor (int, optional, defaults to 2) – Expansion factor to use inside MLP. Recommended range is 2-5. Larger value indicates more complex model.
num_layers (int, optional, defaults to 3) – Number of layers to use. Recommended range is 3-15. Larger value indicates more complex model.
dropout (float, optional, defaults to 0.2) – The dropout probability for the TinyTimeMixer backbone. Recommended range is 0.2-0.7.
mode (str, optional, defaults to “common_channel”) – Mixer Mode. Determines how to process the channels. Allowed values: “common_channel”, “mix_channel”. In “common_channel” mode, we follow Channel-independent modelling with no explicit channel-mixing. Channel mixing happens in an implicit manner via shared weights across channels. (preferred first approach) In “mix_channel” mode, we follow explicit channel-mixing in addition to patch and feature mixer. (preferred approach when channel correlations are very important to model)
gated_attn (bool, optional, defaults to True) – Enable Gated Attention.
norm_mlp (str, optional, defaults to “LayerNorm”) – Normalization layer (BatchNorm or LayerNorm).
self_attn (bool, optional, defaults to False) – Enable Tiny self attention across patches. This can be enabled when the output of Vanilla TinyTimeMixer with gated attention is not satisfactory. Enabling this leads to explicit pair-wise attention and modelling across patches.
self_attn_heads (int, optional, defaults to 1) – Number of self-attention heads. Works only when self_attn is set to True.
use_positional_encoding (bool, optional, defaults to False) – Enable the use of positional embedding for the tiny self-attention layers. Works only when self_attn is set to True.
positional_encoding_type (str, optional, defaults to “sincos”) – Positional encodings. Options “random” and “sincos” are supported. Works only when use_positional_encoding is set to True
scaling (string or bool, optional, defaults to “std”) – Whether to scale the input targets via “mean” scaler, “std” scaler or no scaler if None. If True, the scaler is set to “mean”.
loss (string, optional, defaults to “mse”) – The loss function used to finetune or pretrain the model. Allowed values are “mse”, “mae”, “pinball” or “huber”. Use pinball loss for probabilistic forecasts of different quantiles. Distribution head (nll) is currently disabled and not allowed.
init_std (float, optional, defaults to 0.02) – The standard deviation of the truncated normal weight initialization distribution.
post_init (bool, optional, defaults to False) – Whether to use custom weight initialization from transformers library, or the default initialization in PyTorch. Setting it to False performs PyTorch weight initialization.
norm_eps (float, optional, defaults to 1e-05) – A value added to the denominator for numerical stability of normalization.
adaptive_patching_levels (int, optional, defaults to 0) –
If adaptive_patching_levels is i, then we will have i levels, each level having num_layers layers. Level ids start at 0. num_patches at level i is multiplied by (2^i) and num_features at level i is divided by (2^i). For example, if adaptive_patching_levels is 3, then we will have 3 levels:

level 2: num_features//(2^2), num_patches*(2^2)
level 1: num_features//(2^1), num_patches*(2^1)
level 0: num_features//(2^0), num_patches*(2^0)

adaptive_patching_levels = 1 is the same as a one-level PatchTSMixer. This module is disabled when adaptive_patching_levels is 0 or a negative value. Defaults to 0 (off mode).
resolution_prefix_tuning (bool, optional, defaults to False) – Enable if your dataloader has time resolution information as defined in get_freq_mapping function in modelling_tinytimemixer.
frequency_token_vocab_size (int, optional, defaults to 5) – Vocab size to use when resolution_prefix_tuning is enabled.
head_dropout (float, optional, defaults to 0.2) – The dropout probability for the TinyTimeMixer head.
distribution_output (string, optional, defaults to “student_t”) – The distribution emission head for the model when loss is “nll”. Could be either “student_t”, “normal” or “negative_binomial”.
prediction_channel_indices (list, optional) – List of channel indices to forecast. If None, forecast all channels. Target data is expected to have all channels and we explicitly filter the channels in prediction and target before loss computation. Please provide the indices in sorted ascending order.
exogenous_channel_indices (list, optional) – List of channel indices whose values are known in the forecast period. Please provide the indices in sorted ascending order.
decoder_num_layers (int, optional, defaults to 8) – Number of layers to use in decoder
decoder_d_model (int, optional, defaults to 8) – Defines the hidden feature size of the decoder.
decoder_adaptive_patching_levels (int, optional, defaults to 0) – Adaptive Patching levels for decoder. Preferable to set it to 0 for decoder to keep it light weight.
decoder_raw_residual (bool, optional, defaults to False) – Flag to enable merging of raw embedding with encoder embedding for decoder input. Defaults to False.
decoder_mode (string, optional, defaults to “common_channel”) – Decoder channel mode. Use “common_channel” for channel-independent modelling and “mix_channel” for channel-mixing modelling.
use_decoder (bool, optional, defaults to True) – Enable to use decoder.
enable_forecast_channel_mixing (bool, optional, defaults to False) – Enable if we want to reconcile forecasts across all channels and also to enable exogenous infusion, if you have them.
fcm_gated_attn (bool, optional, defaults to True) – Enable gated attention in forecast channel mixing block.
fcm_context_length (int, optional, defaults to 1) – Surrounding context length to use. For example, to consider 2 lag points before and after a data point, set fcm_context_length to 2.
fcm_use_mixer (bool, optional, defaults to False) – Enable mixing in the forecast channel mixing block.
fcm_mix_layers (int, optional, defaults to 2) – Number of mixer layers to use if fcm_use_mixer is enabled
fcm_prepend_past (bool, optional, defaults to True) – Prepend last context for forecast reconciliation
fcm_prepend_past_offset (int, optional, defaults to None)
categorical_vocab_size_list (list, optional) – List of vocab sizes for all the tokenized categorical variables to use. Pass it in the same order as used in the forward call param static_categorical_values.
prediction_filter_length (int, optional, defaults to None) – Actual length in the prediction output to use for loss calculations.
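The level arithmetic described for adaptive_patching_levels can be checked with a few lines of plain Python. This is a minimal sketch of the shape bookkeeping only, not the library's implementation; the helper name `adaptive_patching_shapes` and the example values are assumptions made for illustration.

```python
def adaptive_patching_shapes(num_patches, num_features, adaptive_patching_levels):
    """Return {level: (num_patches, num_features)} following the documented rule.

    Hypothetical helper: level i multiplies num_patches by 2**i and
    floor-divides num_features by 2**i.
    """
    if adaptive_patching_levels <= 0:
        return {}  # adaptive patching is disabled for 0 or negative values
    shapes = {}
    for level in range(adaptive_patching_levels):
        factor = 2 ** level
        shapes[level] = (num_patches * factor, num_features // factor)
    return shapes


shapes = adaptive_patching_shapes(num_patches=8, num_features=16, adaptive_patching_levels=3)
for level in sorted(shapes, reverse=True):
    print("level", level, "->", shapes[level])
```

Because the feature size is floor-divided, a d_model that is not divisible by 2^(levels - 1) would lose features at the deepest level, so it seems prudent to pick d_model accordingly.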

Example:

```python
>>> from dsipts.models.ttm.configuration_tinytimemixer import TinyTimeMixerConfig
>>> from dsipts.models.ttm.modeling_tinytimemixer import TinyTimeMixerModel

>>> # Initializing a default TinyTimeMixer configuration
>>> configuration = TinyTimeMixerConfig()

>>> # Randomly initializing a model (with random weights) from the configuration
>>> model = TinyTimeMixerModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
```

attribute_map: dict[str, str] = {'hidden_size': 'd_model', 'num_hidden_layers': 'num_layers'}¶

check_and_init_preprocessing()¶

model_type: str = 'tinytimemixer'¶

dsipts.models.ttm.consts module¶

dsipts.models.ttm.modeling_tinytimemixer module¶

PyTorch TinyTimeMixer model.

class dsipts.models.ttm.modeling_tinytimemixer.FeatureMixerBlock(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden: Tensor)¶

Parameters: hidden (torch.Tensor of shape (batch_size, num_patches, d_model)) – Input tensor to the layer.
Returns: Transformed tensor.
Return type: torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.ForecastChannelHeadMixer(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(base_forecasts: Tensor, past_values: Tensor | None, future_values: Tensor | None = None)¶

Parameters:

base_forecasts (torch.Tensor of shape (batch_size, prediction length, forecast_channels)) – Base Forecasts to reconcile
past_values (torch.FloatTensor of shape (batch_size, seq_length, num_input_channels)) – Context values of the time series. For a forecasting task, this denotes the history/past time series values. For univariate time series, num_input_channels dimension should be 1. For multivariate time series, it is greater than 1.
future_values (torch.Tensor of shape (batch_size, prediction length, input_channels), optional, Defaults to None) – Actual groundtruths of the forecasts. Pass dummy values (say 0) for forecast channels, if groundtruth is unknown. Pass the correct values for Exogenous channels where the forecast values are known.

Returns:

Updated forecasts of shape (batch_size, prediction length, forecast_channels)

Return type:

torch.Tensor
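The description of future_values (dummy values for forecast channels, real values for exogenous channels) can be illustrated for a single sample. The sketch below uses nested lists instead of tensors to stay dependency-free; the helper `build_future_values` and its `known_exogenous` input format are hypothetical and not part of the package.

```python
def build_future_values(prediction_length, num_input_channels, known_exogenous):
    """Build one sample of shape (prediction_length, num_input_channels).

    known_exogenous maps a channel index to its list of known future values
    (an input format chosen only for this sketch). All other channels are
    filled with the dummy value 0.0.
    """
    future = [[0.0] * num_input_channels for _ in range(prediction_length)]
    for channel, values in known_exogenous.items():
        if len(values) != prediction_length:
            raise ValueError("each exogenous series must cover the full horizon")
        for step, value in enumerate(values):
            future[step][channel] = float(value)
    return future


# Channels 0 and 1 are forecast targets (unknown); channel 2 is exogenous.
sample = build_future_values(3, 3, {2: [10, 11, 12]})
```

In practice the same layout would be stacked over the batch dimension and converted to a tensor before being passed to the model.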

class dsipts.models.ttm.modeling_tinytimemixer.PatchMixerBlock(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden_state)¶

Parameters: hidden_state (torch.Tensor) – Input tensor.
Returns: Transformed tensor.
Return type: torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.PinballLoss(quantile: float)¶

Bases: Module

Initialize the Pinball Loss for multidimensional tensors.

Args: quantile (float): The desired quantile (e.g., 0.5 for median, 0.9 for 90th percentile).

forward(predictions, targets)¶

Compute the Pinball Loss for shape [b, seq_len, channels].

Args: predictions (torch.Tensor): Predicted values, shape [b, seq_len, channels]. targets (torch.Tensor): Ground truth values, shape [b, seq_len, channels].

Returns: torch.Tensor: The mean pinball loss over all dimensions.
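For reference, the standard pinball (quantile) loss can be written out in plain Python. This is a sketch of the textbook definition applied to nested lists of shape [b, seq_len, channels]; it is assumed, not verified, that the class above computes the same quantity on tensors.

```python
def pinball_loss(predictions, targets, quantile):
    """Mean pinball loss over nested lists of shape [b, seq_len, channels]."""
    total, count = 0.0, 0
    for pred_seq, target_seq in zip(predictions, targets):
        for pred_step, target_step in zip(pred_seq, target_seq):
            for pred, target in zip(pred_step, target_step):
                error = target - pred
                # Under-prediction costs quantile * |error|,
                # over-prediction costs (1 - quantile) * |error|.
                total += max(quantile * error, (quantile - 1.0) * error)
                count += 1
    return total / count


predictions = [[[1.0], [3.0]]]  # shape [1, 2, 1]
targets = [[[2.0], [2.0]]]
loss_q90 = pinball_loss(predictions, targets, quantile=0.9)
loss_q50 = pinball_loss(predictions, targets, quantile=0.5)
```

With quantile=0.5 the loss reduces to half the mean absolute error, which is why 0.5 corresponds to a median forecast.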

class dsipts.models.ttm.modeling_tinytimemixer.SampleTinyTimeMixerPredictionOutput(sequences: FloatTensor = None)¶

Bases: ModelOutput

Base class for time series model’s predictions outputs that contains the sampled values from the chosen distribution.

Parameters: sequences (torch.FloatTensor of shape (batch_size, num_samples, prediction_length, number_channels)) – Sampled values from the chosen distribution.

sequences: FloatTensor = None¶

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerAdaptivePatchingBlock(config: TinyTimeMixerConfig, adapt_patch_level: int)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden: Tensor)¶

Parameters: hidden (torch.Tensor of shape (batch_size x nvars x num_patch x d_model)) – Input tensor to the layer.
Returns: Transformed tensor.
Return type: torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerAttention(embed_dim: int, num_heads: int, dropout: float = 0.0, is_decoder: bool = False, bias: bool = True, is_causal: bool = False, config: TinyTimeMixerConfig | None = None)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden_states: Tensor, key_value_states: Tensor | None = None, past_key_value: Tuple[Tensor] | None = None, attention_mask: Tensor | None = None, layer_head_mask: Tensor | None = None, output_attentions: bool = False) → Tuple[Tensor, Tensor | None, Tuple[Tensor] | None]¶

Input shape: Batch x Time x Channel

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerBatchNorm(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(inputs: Tensor)¶

Parameters: inputs (torch.Tensor of shape (batch_size, sequence_length, d_model)) – Input for batch norm calculation.
Returns: torch.Tensor of shape (batch_size, sequence_length, d_model)

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerBlock(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden_state, output_hidden_states: bool = False)¶

Parameters:

hidden_state (torch.Tensor) – The input tensor.
output_hidden_states (bool, optional, defaults to False.) – Whether to output the hidden states as well.

Returns:

The embedding. list: List of all hidden states if output_hidden_states is set to True.

Return type:

torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerCategoricalEmbeddingLayer(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(static_categorical_values: Tensor)¶

Parameters:

static_categorical_values (torch.FloatTensor of shape (batch_size, number_of_categorical_variables)) – Tokenized categorical values can be passed here. Ensure to pass in the same order as the vocab size list used in the TinyTimeMixerConfig param categorical_vocab_size_list.

Returns:

torch.Tensor of shape (batch_size, number_of_categorical_variables, num_patches, d_model)

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerChannelFeatureMixerBlock(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(inputs: Tensor)¶

Parameters: inputs (torch.Tensor of shape (batch_size, num_channels, num_patches, d_model)) – Input to the MLP layer.
Returns: torch.Tensor of the same shape as inputs

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerDecoder(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden_state, patch_input, output_hidden_states: bool = False, static_categorical_values: Tensor | None = None)¶

Parameters:

hidden_state (torch.Tensor of shape (batch_size x nvars x num_patch x d_model)) – The input tensor from backbone.
output_hidden_states (bool, optional, defaults to False.) – Whether to output the hidden states as well.
static_categorical_values (torch.FloatTensor of shape (batch_size, number_of_categorical_variables), optional) – Tokenized categorical values can be passed here. Ensure to pass in the same order as the vocab size list used in the TinyTimeMixerConfig param categorical_vocab_size_list.

Returns:

The embedding. list: List of all hidden states if output_hidden_states is set to True.

Return type:

torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerEncoder(config: TinyTimeMixerConfig)¶

Bases: TinyTimeMixerPreTrainedModel

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config_class¶: alias of TinyTimeMixerConfig

forward(past_values: Tensor, output_hidden_states: bool | None = False, return_dict: bool | None = None, freq_token: Tensor | None = None) → Tuple | TinyTimeMixerEncoderOutput¶

Parameters:

past_values (torch.FloatTensor of shape (batch_size, seq_length, num_input_channels)) – Context values of the time series. For univariate time series, num_input_channels dimension should be 1. For multivariate time series, it is greater than 1.
output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers.
return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

Returns:

torch.FloatTensor of shape (batch_size, n_vars, num_patches, d_model)

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerEncoderOutput(last_hidden_state: FloatTensor = None, hidden_states: Tuple[FloatTensor] | None = None)¶

Bases: ModelOutput

Base class for TinyTimeMixerEncoderOutput, with potential hidden states.

Parameters:

last_hidden_state (torch.FloatTensor of shape (batch_size, num_channels, num_patches, d_model)) – Hidden-state at the output of the last layer of the model.
hidden_states (tuple(torch.FloatTensor), optional) – Hidden-states of the model at the output of each layer.

hidden_states: Tuple[FloatTensor] | None = None¶

last_hidden_state: FloatTensor = None¶

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerForMaskedPrediction(config: TinyTimeMixerConfig)¶

Bases: TinyTimeMixerForPrediction

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config_class¶: alias of TinyTimeMixerConfig

forward(past_values: Tensor, future_values: Tensor | None = None, past_observed_mask: Tensor | None = None, future_observed_mask: Tensor | None = None, output_hidden_states: bool | None = False, return_loss: bool = True, return_dict: bool | None = None, freq_token: Tensor | None = None, static_categorical_values: Tensor | None = None, metadata: Tensor | None = None) → TinyTimeMixerForPredictionOutput¶

past_observed_mask (torch.Tensor of shape (batch_size, sequence_length, num_input_channels), optional):

Boolean mask to indicate which past_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:

1 or True for values that are observed,

0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

future_values (torch.FloatTensor of shape (batch_size, target_len, num_input_channels) for forecasting, (batch_size, num_targets) for regression, or (batch_size,) for classification, optional):

Target values of the time series, which serve as labels for the model. The future_values are what the model needs during training to learn to output, given the past_values. Note that this is NOT required for a pretraining task.

For a forecasting task, the shape is (batch_size, target_len, num_input_channels). Even if we want to forecast only specific channels by setting the indices in the prediction_channel_indices parameter, pass the target data with all channels, as channel filtering for both prediction and target is applied before the loss computation.

future_observed_mask (torch.Tensor of shape (batch_size, prediction_length, num_targets), optional):

Boolean mask to indicate which future_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:

1 or True for values that are observed,

0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

return_loss (bool, optional):

Whether to return the loss in the forward call.

static_categorical_values (torch.FloatTensor of shape (batch_size, number_of_categorical_variables), optional):

Tokenized categorical values can be passed here. Ensure to pass in the same order as the vocab size list used in the TinyTimeMixerConfig param categorical_vocab_size_list

metadata (torch.Tensor, optional):

A tensor containing metadata. Currently unused in TinyTimeMixer, but used to support custom trainers. Defaults to None.

Returns:

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerForPrediction(config: TinyTimeMixerConfig)¶

Bases: TinyTimeMixerPreTrainedModel

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config_class¶: alias of TinyTimeMixerConfig

forward(past_values: Tensor, future_values: Tensor | None = None, past_observed_mask: Tensor | None = None, future_observed_mask: Tensor | None = None, output_hidden_states: bool | None = False, return_loss: bool = True, return_dict: bool | None = None, freq_token: Tensor | None = None, static_categorical_values: Tensor | None = None, metadata: Tensor | None = None) → TinyTimeMixerForPredictionOutput¶

past_observed_mask (torch.Tensor of shape (batch_size, sequence_length, num_input_channels), optional):

Boolean mask to indicate which past_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:

1 or True for values that are observed,

0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

future_values (torch.FloatTensor of shape (batch_size, target_len, num_input_channels) for forecasting, (batch_size, num_targets) for regression, or (batch_size,) for classification, optional):

Target values of the time series, which serve as labels for the model. The future_values are what the model needs during training to learn to output, given the past_values. Note that this is NOT required for a pretraining task.

For a forecasting task, the shape is (batch_size, target_len, num_input_channels). Even if we want to forecast only specific channels by setting the indices in the prediction_channel_indices parameter, pass the target data with all channels, as channel filtering for both prediction and target is applied before the loss computation.

future_observed_mask (torch.Tensor of shape (batch_size, prediction_length, num_targets), optional):

Boolean mask to indicate which future_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:

1 or True for values that are observed,

0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

return_loss (bool, optional):

Whether to return the loss in the forward call.

static_categorical_values (torch.FloatTensor of shape (batch_size, number_of_categorical_variables), optional):

Tokenized categorical values can be passed here. Ensure to pass in the same order as the vocab size list used in the TinyTimeMixerConfig param categorical_vocab_size_list

metadata (torch.Tensor, optional):

A tensor containing metadata. Currently unused in TinyTimeMixer, but used to support custom trainers. Defaults to None.

Returns:

generate(past_values: Tensor, past_observed_mask: Tensor | None = None) → SampleTinyTimeMixerPredictionOutput¶

Generate sequences of sample predictions from a model with a probability distribution head.

Parameters:

past_values (torch.FloatTensor of shape (batch_size, sequence_length, num_input_channels)) – Past values of the time series that serves as context in order to predict the future.
past_observed_mask (torch.Tensor of shape (batch_size, sequence_length, num_input_channels), optional) –
Boolean mask to indicate which past_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:
- 1 or True for values that are observed,
- 0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

Returns:

[SampleTinyTimeMixerPredictionOutput] where the outputs sequences tensor will have shape (batch_size, number of samples, prediction_length, num_input_channels).
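The past_observed_mask convention (True for observed values, False for NaNs that were replaced by zeros) can be sketched for a single channel as follows. The helper `mask_and_fill` is hypothetical and uses plain lists; with tensors the same idea is typically expressed with an isnan check followed by a fill.

```python
import math


def mask_and_fill(series):
    """Return (filled_values, observed_mask) for one channel.

    NaNs are replaced by 0.0 and flagged with False in the mask;
    every other value is kept and flagged with True.
    """
    filled, mask = [], []
    for value in series:
        missing = isinstance(value, float) and math.isnan(value)
        filled.append(0.0 if missing else float(value))
        mask.append(not missing)
    return filled, mask


values, observed = mask_and_fill([1.0, float("nan"), 3.0])
```

The filled values would go into past_values and the mask into past_observed_mask, both with the same shape.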

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerForPredictionHead(config: TinyTimeMixerConfig, distribution_output=None)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden_features, past_values, future_values=None)¶

Parameters:

hidden_features – Input hidden features.
past_values (torch.FloatTensor of shape (batch_size, seq_length, num_input_channels)) – Context values of the time series. For a forecasting task, this denotes the history/past time series values. For univariate time series, num_input_channels dimension should be 1. For multivariate time series, it is greater than 1.
future_values (torch.Tensor of shape (batch_size, prediction length, input_channels), optional, Defaults to None) – Actual groundtruths of the forecasts. Pass dummy values (say 0) for forecast channels, if groundtruth is unknown. Pass the correct values for Exogenous channels where the forecast values are known.

Returns:

torch.Tensor of shape (batch_size, prediction_length, forecast_channels).

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerForPredictionOutput(loss: FloatTensor | None = None, prediction_outputs: FloatTensor = None, backbone_hidden_state: FloatTensor = None, decoder_hidden_state: FloatTensor = None, hidden_states: Tuple[FloatTensor] | None = None, loc: FloatTensor = None, scale: FloatTensor = None)¶

Bases: ModelOutput

Output type of [TinyTimeMixerForPrediction].

Parameters:

prediction_outputs (torch.FloatTensor of shape (batch_size, prediction_length, num_input_channels)) – Prediction output from the forecast head.
backbone_hidden_state (torch.FloatTensor of shape (batch_size, num_input_channels, num_patches, d_model)) – Backbone embeddings before passing through the decoder
decoder_hidden_state (torch.FloatTensor of shape (batch_size, num_input_channels, num_patches, d_model)) – Decoder embeddings before passing through the head.
hidden_states (tuple(torch.FloatTensor), optional) – Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
loss (torch.FloatTensor of shape (), optional, returned when future_values is provided) – Total loss.
loc (torch.FloatTensor, optional of shape (batch_size, 1, num_input_channels)) – Input mean
scale (torch.FloatTensor, optional of shape (batch_size, 1, num_input_channels)) – Input std dev

backbone_hidden_state: FloatTensor = None¶

decoder_hidden_state: FloatTensor = None¶

hidden_states: Tuple[FloatTensor] | None = None¶

loc: FloatTensor = None¶

loss: FloatTensor | None = None¶

prediction_outputs: FloatTensor = None¶

scale: FloatTensor = None¶

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerGatedAttention(in_size: int, out_size: int)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(inputs)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerLayer(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden: Tensor)¶

Parameters: hidden (torch.Tensor of shape (batch_size, num_patches, d_model)) – Input tensor to the layer.
Returns: Transformed tensor.
Return type: torch.Tensor

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerMLP(in_features, out_features, config)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(inputs: Tensor)¶

Parameters: inputs (torch.Tensor of shape (batch_size, num_channels, num_patches, d_model)) – Input to the MLP layer.
Returns: torch.Tensor of the same shape as inputs

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerMeanScaler(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(data: Tensor, observed_indicator: Tensor) → Tuple[Tensor, Tensor, Tensor]¶

Parameters:

data (torch.Tensor of shape (batch_size, sequence_length, num_input_channels)) – input for Batch norm calculation
observed_indicator (torch.BoolTensor of shape (batch_size, sequence_length, num_input_channels)) – Calculating the scale on the observed indicator.

Returns:

tuple of torch.Tensor of shapes: ((batch_size, sequence_length, num_input_channels), (batch_size, 1, num_input_channels), (batch_size, 1, num_input_channels))
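To make the three-element return value concrete, here is a rough single-channel sketch of mean scaling restricted to observed points. It assumes the common definition (scale = mean absolute value of the observed entries, loc = 0); the `minimum_scale` floor and the fallback scale of 1.0 when nothing is observed are assumptions of this sketch, not documented behaviour of the class.

```python
def mean_scale(data, observed, minimum_scale=1e-10):
    """Scale one channel by the mean absolute value of its observed points.

    Returns (scaled_data, loc, scale). loc is 0.0 because mean scaling
    does not shift the data in this sketch.
    """
    observed_values = [abs(x) for x, seen in zip(data, observed) if seen]
    if observed_values:
        scale = sum(observed_values) / len(observed_values)
    else:
        scale = 1.0  # assumed fallback when no point is observed
    scale = max(scale, minimum_scale)  # avoid division by zero
    scaled = [x / scale for x in data]
    return scaled, 0.0, scale


scaled, loc, scale = mean_scale([2.0, 0.0, 4.0], [True, False, True])
```

In the batched case, loc and scale would carry one value per sample and channel, matching the (batch_size, 1, num_input_channels) shapes listed above.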

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerModel(config: TinyTimeMixerConfig)¶

Bases: TinyTimeMixerPreTrainedModel

Initialize internal Module state, shared by both nn.Module and ScriptModule.

config_class¶: alias of TinyTimeMixerConfig

forward(past_values: Tensor, past_observed_mask: Tensor | None = None, output_hidden_states: bool | None = False, return_dict: bool | None = None, freq_token: Tensor | None = None) → TinyTimeMixerModelOutput¶

past_observed_mask (torch.Tensor of shape (batch_size, sequence_length, num_input_channels), optional):

Boolean mask to indicate which past_values were observed and which were missing. Mask values selected in [0, 1] or [False, True]:

1 or True for values that are observed,

0 or False for values that are missing (i.e. NaNs that were replaced by zeros).

Returns:

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerModelOutput(last_hidden_state: FloatTensor = None, hidden_states: Tuple[FloatTensor] | None = None, patch_input: FloatTensor = None, loc: FloatTensor | None = None, scale: FloatTensor | None = None)¶

Bases: ModelOutput

Base class for model’s outputs, with potential hidden states.

Parameters:

last_hidden_state (torch.FloatTensor of shape (batch_size, num_channels, num_patches, d_model)) – Hidden-state at the output of the last layer of the model.
hidden_states (tuple(torch.FloatTensor), optional) – Hidden-states of the model at the output of each layer.
patch_input (torch.FloatTensor of shape (batch_size, num_channels, num_patches, patch_length)) – Patched input data to the model.
loc (torch.FloatTensor of shape (batch_size, 1, num_channels), optional) – Gives the mean of the context window per channel. Used for revin denorm outside the model, if revin enabled.
scale (torch.FloatTensor of shape (batch_size, 1, num_channels), optional) – Gives the std dev of the context window per channel. Used for revin denorm outside the model, if revin enabled.
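Denormalizing with loc and scale outside the model amounts to inverting a standardization. The following sketch assumes the usual inverse, value * scale + loc, applied per channel; the function `denormalize` is a hypothetical helper and operates on one sample stored as nested lists.

```python
def denormalize(scaled_forecast, loc, scale):
    """Map a scaled forecast back to the original units, channel by channel.

    scaled_forecast: list of time steps, each a list with one value per channel.
    loc, scale: per-channel mean and standard deviation of the context window.
    """
    return [
        [value * s + m for value, m, s in zip(step, loc, scale)]
        for step in scaled_forecast
    ]


forecast = denormalize(
    [[0.0, 1.0], [-1.0, 2.0]],  # two time steps, two channels
    loc=[10.0, 100.0],
    scale=[2.0, 5.0],
)
```

With tensors of shape (batch_size, 1, num_channels), broadcasting over the time dimension would perform the same operation in one expression.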

hidden_states: Tuple[FloatTensor] | None = None¶

last_hidden_state: FloatTensor = None¶

loc: FloatTensor | None = None¶

patch_input: FloatTensor = None¶

scale: FloatTensor | None = None¶

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerNOPScaler(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(data: Tensor, observed_indicator: Tensor = None) → Tuple[Tensor, Tensor, Tensor]¶

Parameters:

data (torch.Tensor of shape (batch_size, sequence_length, num_input_channels)) – input for Batch norm calculation

Returns:

tuple of torch.Tensor of shapes: ((batch_size, sequence_length, num_input_channels), (batch_size, 1, num_input_channels), (batch_size, 1, num_input_channels))

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerNormLayer(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(inputs: Tensor)¶

Parameters: inputs (torch.Tensor of shape (batch_size, num_channels, num_patches, d_model)) – Input to the normalization layer.
Returns: torch.Tensor of shape (batch_size, num_channels, num_patches, d_model)

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerPatchify(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(past_values: Tensor)¶

Parameters: past_values (torch.Tensor of shape (batch_size, sequence_length, num_channels), required) – Input for patchification
Returns: torch.Tensor of shape (batch_size, num_channels, num_patches, patch_length)
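
The shape transformation documented here can be reproduced with Tensor.unfold. This is a sketch of the general technique, not the class's actual code; in particular, how the real layer treats a sequence length that is not a multiple of the stride is not stated on this page, and the function name is hypothetical.

```python
import torch

def patchify_sketch(past_values, patch_length, patch_stride):
    """Split (batch, seq_len, channels) into (batch, channels, num_patches, patch_length)."""
    # unfold slides a window of patch_length over the time axis (dim=1) and
    # yields (batch, num_patches, channels, patch_length).
    patches = past_values.unfold(dimension=1, size=patch_length, step=patch_stride)
    # Move the channel axis ahead of the patch axis.
    return patches.transpose(1, 2).contiguous()

x = torch.arange(2 * 64 * 3, dtype=torch.float32).reshape(2, 64, 3)
patches = patchify_sketch(x, patch_length=8, patch_stride=8)
```

With the configuration defaults shown at the top of this page (context_length=64, patch_length=8, patch_stride=8), this gives (64 - 8) / 8 + 1 = 8 patches per channel.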

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerPositionalEncoding(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(patch_input: Tensor)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerPreTrainedModel(config: PretrainedConfig, *inputs, **kwargs)¶

Bases: PreTrainedModel

Initialize internal Module state, shared by both nn.Module and ScriptModule.

base_model_prefix = 'model'¶

config_class¶: alias of TinyTimeMixerConfig

main_input_name = 'past_values'¶

supports_gradient_checkpointing = False¶

class dsipts.models.ttm.modeling_tinytimemixer.TinyTimeMixerStdScaler(config: TinyTimeMixerConfig)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(data: Tensor, observed_indicator: Tensor) → Tuple[Tensor, Tensor, Tensor]¶

Parameters:

data (torch.Tensor of shape (batch_size, sequence_length, num_input_channels)) – input for Batch norm calculation
observed_indicator (torch.BoolTensor of shape (batch_size, sequence_length, num_input_channels)) – Mask of observed values; the scale is calculated over the observed values only.

Returns:

tuple of torch.Tensor of shapes: ((batch_size, sequence_length, num_input_channels), (batch_size, 1, num_input_channels), (batch_size, 1, num_input_channels))
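
A masked standardization with these input and output shapes might look like the sketch below. The function name, the minimum_scale constant, and the order of the returned tuple (scaled data, loc, scale) are assumptions for illustration; the page does not document them.

```python
import torch

def std_scale_sketch(data, observed_indicator, dim=1, minimum_scale=1e-5):
    """Standardize data along dim using only the observed entries."""
    observed = observed_indicator.to(data.dtype)
    # Number of observed points per series; clamp to avoid division by zero.
    denominator = observed.sum(dim, keepdim=True).clamp_min(1.0)
    loc = (data * observed).sum(dim, keepdim=True) / denominator
    variance = (((data - loc) * observed) ** 2).sum(dim, keepdim=True) / denominator
    # minimum_scale keeps the scale strictly positive for constant series.
    scale = torch.sqrt(variance + minimum_scale)
    return (data - loc) / scale, loc, scale

data = torch.tensor([[[1.0], [3.0], [0.0]]])        # last value is a zero-filled gap
mask = torch.tensor([[[True], [True], [False]]])
scaled, loc, scale = std_scale_sketch(data, mask)
```

In this example the unobserved zero does not contribute, so loc is the mean of 1.0 and 3.0 only.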

dsipts.models.ttm.modeling_tinytimemixer.nll(input: Distribution, target: Tensor) → Tensor¶: Computes the negative log likelihood loss from input distribution with respect to target.
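
For a torch.distributions object, the negative log likelihood is the negated log_prob of the target. A minimal sketch under that assumption (the function name is hypothetical, and a Normal distribution is used only for illustration):

```python
import torch
from torch.distributions import Normal

def nll_sketch(input, target):
    # Minus the log-density of target under the given distribution.
    return -input.log_prob(target)

dist = Normal(loc=torch.zeros(3), scale=torch.ones(3))
loss = nll_sketch(dist, torch.zeros(3))
```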

dsipts.models.ttm.modeling_tinytimemixer.weighted_average(input_tensor: Tensor, weights: Tensor | None = None, dim=None) → Tensor¶

Computes the weighted average of a given tensor across a given dim, masking values associated with weight zero, meaning instead of nan * 0 = nan you will get 0 * 0 = 0.

Parameters:

input_tensor (torch.FloatTensor) – Input tensor, of which the average must be computed.
weights (torch.FloatTensor, optional) – Weights tensor, of the same shape as input_tensor.
dim (int, optional) – The dim along which to average input_tensor.

Returns:

The tensor with values averaged along the specified dim.

Return type:

torch.FloatTensor
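
The masking behaviour described above can be sketched as follows. This is an illustrative reimplementation, not the package's code; clamping the weight total to a minimum of 1.0 is an assumption made here to avoid division by zero.

```python
import torch

def weighted_average_sketch(input_tensor, weights=None, dim=None):
    if weights is None:
        return input_tensor.mean() if dim is None else input_tensor.mean(dim=dim)
    # Where the weight is zero, substitute 0 so that nan * 0 cannot yield nan.
    weighted = torch.where(weights != 0, input_tensor * weights,
                           torch.zeros_like(input_tensor))
    if dim is None:
        return weighted.sum() / weights.sum().clamp(min=1.0)
    # Assumed safeguard: clamp the weight total to avoid division by zero.
    return weighted.sum(dim=dim) / weights.sum(dim=dim).clamp(min=1.0)

values = torch.tensor([1.0, float("nan"), 3.0])
weights = torch.tensor([1.0, 0.0, 1.0])
result = weighted_average_sketch(values, weights, dim=0)
```

The NaN entry carries weight zero, so the result is the plain average of the two remaining values.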

dsipts.models.ttm.utils module¶

class dsipts.models.ttm.utils.ForceReturn(value)¶

Bases: Enum

Enum for the force_return parameter in the get_model function.

“zeropad” = Returns a pre-trained TTM that has a context length higher than the input context length; hence, the user must apply zero-padding to use the returned model.
“rolling” = Returns a pre-trained TTM that has a prediction length lower than the requested prediction length; hence, the user must apply the rolling technique to use the returned model to forecast to the desired length. The RecursivePredictor class can be utilized in this scenario.
“random_init_small” = Returns a randomly initialized small TTM which must be trained before performing inference.
“random_init_medium” = Returns a randomly initialized medium TTM which must be trained before performing inference.
“random_init_large” = Returns a randomly initialized large TTM which must be trained before performing inference.

RANDOM_INIT_LARGE = 'random_init_large'¶

RANDOM_INIT_MEDIUM = 'random_init_medium'¶

RANDOM_INIT_SMALL = 'random_init_small'¶

ROLLING = 'rolling'¶

ZEROPAD = 'zeropad'¶

class dsipts.models.ttm.utils.ModelSize(value)¶

Bases: Enum

Enum for the size parameter in the get_random_ttm function.

LARGE = 'large'¶

MEDIUM = 'medium'¶

SMALL = 'small'¶

class dsipts.models.ttm.utils.RMSELoss¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(yhat, y)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
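
This page gives no description of what RMSELoss computes. Assuming from the name that it is the root of the mean squared error over all elements, an equivalent module could be sketched as below; the class name RMSELossSketch is hypothetical.

```python
import torch
from torch import nn

class RMSELossSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()

    def forward(self, yhat, y):
        # Square root of the mean squared error over all elements.
        return torch.sqrt(self.mse(yhat, y))

criterion = RMSELossSketch()
# Calling the module instance (not forward directly) runs registered hooks.
loss = criterion(torch.tensor([1.0, 2.0, 3.0]), torch.tensor([1.0, 2.0, 5.0]))
```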

dsipts.models.ttm.utils.check_ttm_model_path(model_path)¶

dsipts.models.ttm.utils.count_parameters(model: Module) → int¶

Count trainable parameters in a model

Parameters: model (torch.nn.Module) – The model.
Returns: Number of parameters requiring gradients.
Return type: int
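
Counting the parameters that require gradients is commonly written as a sum over model.parameters(). The following is a sketch of that idiom with a hypothetical function name, not necessarily the package's implementation.

```python
import torch
from torch import nn

def count_parameters_sketch(model):
    # Sum the element counts of all parameters that take part in training.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

model = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 1))
total = count_parameters_sketch(model)        # 4*3 + 3 + 3*1 + 1

# Freezing a weight matrix removes its 12 elements from the count.
model[0].weight.requires_grad_(False)
trainable = count_parameters_sketch(model)
```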

dsipts.models.ttm.utils.get_frequency_token(token_name: str)¶

dsipts.models.ttm.utils.get_model(model_path: str, model_name: str = 'ttm', context_length: int | None = None, prediction_length: int | None = None, freq_prefix_tuning: bool = False, freq: str | None = None, prefer_l1_loss: bool = False, prefer_longer_context: bool = True, force_return: str | None = None, return_model_key: bool = False, **kwargs) → str | PreTrainedModel¶

The TTM model card offers a suite of models with varying context_length and prediction_length combinations. This wrapper automatically selects the right model based on the given context_length and prediction_length, abstracting away the internal complexity.

Parameters:

model_path (str) – HuggingFace model card path or local model path (Ex. ibm-granite/granite-timeseries-ttm-r2)
model_name (str, optional) – Model name to use. Current allowed values: [ttm]. Defaults to “ttm”.
context_length (int, optional) – Input Context length or history. Defaults to None.
prediction_length (int, optional) – Length of the forecast horizon. Defaults to None.
freq_prefix_tuning (bool, optional) – If True, it will prefer TTM models that are trained with the frequency prefix tuning configuration. Defaults to False.
freq (str, optional) – Resolution or frequency of the data. Defaults to None. Allowed values are as per the tsfm_public.toolkit.time_series_preprocessor.DEFAULT_FREQUENCY_MAPPING. See this for details: https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/time_series_preprocessor.py.
prefer_l1_loss (bool, optional) – If True, it will prefer choosing models that were trained with L1 loss or mean absolute error loss. Defaults to False.
prefer_longer_context (bool, optional) – If True, it will prefer selecting a model with a longer context/history. Defaults to True.
force_return (str, optional) – Forces get_model() to return a TTM model even when the provided configuration does not match any existing TTM; the closest possible TTM is returned. Allowed values are [“zeropad”, “rolling”, “random_init_small”, “random_init_medium”, “random_init_large”, None]. “zeropad” = Returns a pre-trained TTM that has a context length higher than the input context length; hence, the user must apply zero-padding to use the returned model. “rolling” = Returns a pre-trained TTM that has a prediction length lower than the requested prediction length; hence, the user must apply the rolling technique to use the returned model to forecast to the desired length. The RecursivePredictor class can be utilized in this scenario. “random_init_small” = Returns a randomly initialized small TTM which must be trained before performing inference. “random_init_medium” = Returns a randomly initialized medium TTM which must be trained before performing inference. “random_init_large” = Returns a randomly initialized large TTM which must be trained before performing inference. None = force_return is disabled; an error is raised if no suitable model is found. Defaults to None.
return_model_key (bool, optional) – If True, only the TTM model name will be returned, instead of the actual model. This does not download the model, and only returns the name of the suitable model. Defaults to False.

Returns:

Returns the Model, or the model name.

Return type:

Union[str, PreTrainedModel]
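
The zero-padding and rolling techniques that force_return leaves to the user can be sketched as below. Everything here is illustrative: the helper names are hypothetical, padding on the left (so that the most recent values stay at the end of the window) is an assumption, and dummy_predict stands in for a real model. The RecursivePredictor class mentioned above is not used.

```python
import torch

def zero_pad_context(past_values, model_context_length):
    """Left-pad (batch, seq_len, channels) with zeros up to model_context_length."""
    batch, seq_len, channels = past_values.shape
    pad = model_context_length - seq_len
    if pad <= 0:
        return past_values
    zeros = torch.zeros(batch, pad, channels,
                        dtype=past_values.dtype, device=past_values.device)
    return torch.cat([zeros, past_values], dim=1)

def rolling_forecast(predict_fn, past_values, model_prediction_length, target_length):
    """Forecast repeatedly, feeding predictions back in, until target_length steps exist."""
    context_length = past_values.shape[1]
    context, chunks, produced = past_values, [], 0
    while produced < target_length:
        step = predict_fn(context)[:, :model_prediction_length]
        chunks.append(step)
        produced += step.shape[1]
        # Slide the window: drop the oldest values, append the new predictions.
        context = torch.cat([context, step], dim=1)[:, -context_length:]
    return torch.cat(chunks, dim=1)[:, :target_length]

# Hypothetical stand-in for a model with prediction length 4: repeats the last value.
def dummy_predict(context):
    return context[:, -1:, :].repeat(1, 4, 1)

history = torch.ones(1, 6, 2)
padded = zero_pad_context(history, model_context_length=8)
forecast = rolling_forecast(dummy_predict, padded,
                            model_prediction_length=4, target_length=10)
```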

dsipts.models.ttm.utils.get_random_ttm(context_length: int, prediction_length: int, size: str = 'small', **kwargs) → PreTrainedModel¶

Get a TTM with random weights.

Parameters:

context_length (int) – Context length or history.
prediction_length (int) – Prediction length or forecast horizon.
size (str, optional) – Size of the desired TTM (small/medium/large). Defaults to “small”.

Raises:

ValueError – If wrong size is provided.
ValueError – Context length should be at least 4 if size=small, or at least 16 if size=medium, or at least 32 if size=large.

Returns:

TTM model with randomly initialized weights.

Return type:

PreTrainedModel
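
The two documented ValueError conditions can be checked before calling get_random_ttm. The helper below is hypothetical and only mirrors the minimum context lengths stated in the Raises section; it is not part of the package.

```python
# Minimum context lengths per size, as stated in the Raises section above.
MIN_CONTEXT_LENGTH = {"small": 4, "medium": 16, "large": 32}

def validate_random_ttm_args(context_length, size="small"):
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    if size not in MIN_CONTEXT_LENGTH:
        raise ValueError(
            f"Unknown size {size!r}; expected one of {sorted(MIN_CONTEXT_LENGTH)}."
        )
    minimum = MIN_CONTEXT_LENGTH[size]
    if context_length < minimum:
        raise ValueError(
            f"context_length must be at least {minimum} for size={size!r}."
        )
    return True
```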
