dsipts.ITransformer
- class dsipts.ITransformer(hidden_size: int, d_model: int, n_head: int, n_layer_decoder: int, use_norm: bool, class_strategy: str = 'projection', dropout_rate: float = 0.1, activation: str = '', **kwargs)
- Initialize the ITransformer model for time series forecasting.
- This class implements the Inverted Transformer architecture described in the paper “iTransformer: Inverted Transformers Are Effective for Time Series Forecasting” (https://arxiv.org/pdf/2310.06625).
- Parameters:
- hidden_size (int) – The first embedding size of the model (‘r’ in the paper). 
- d_model (int) – The second embedding size (\tilde{r} in the model). Should be smaller than hidden_size.
- n_head (int) – The number of attention heads. 
- n_layer_decoder (int) – The number of layers in the decoder. 
- use_norm (bool) – Flag to indicate whether to use normalization. 
- class_strategy (str, optional) – The strategy for classification, can be ‘projection’, ‘average’, or ‘cls_token’. Defaults to ‘projection’. 
- dropout_rate (float, optional) – The dropout rate for regularization. Defaults to 0.1. 
- activation (str, optional) – The activation function to be used. Defaults to ‘’. 
- **kwargs – Additional keyword arguments. 
 
- Raises:
- ValueError – If the activation function is not recognized. 
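The "inverted" design named above can be illustrated without any deep-learning library. The following is a minimal sketch in plain Python, not the dsipts implementation: it assumes that use_norm corresponds to the per-variate standardization over time used in the reference code accompanying the paper, and the helper names (normalize_per_variate, invert, embed) and the toy weights are hypothetical. It shows how each variate's whole history becomes one token, which is then projected to an embedding of fixed size.

```python
import math


def normalize_per_variate(series, eps=1e-5):
    """Standardize each variate over time; return (normalized, means, stds).

    `series` is a list of T time steps, each a list of N variate values.
    Assumption: this mirrors what `use_norm=True` does; dsipts may differ.
    """
    n_steps, n_vars = len(series), len(series[0])
    means = [sum(row[j] for row in series) / n_steps for j in range(n_vars)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in series) / n_steps + eps)
        for j in range(n_vars)
    ]
    normalized = [
        [(row[j] - means[j]) / stds[j] for j in range(n_vars)] for row in series
    ]
    return normalized, means, stds


def invert(series):
    """Turn T rows of N variates into N tokens of length T (one per variate)."""
    return [list(column) for column in zip(*series)]


def embed(tokens, weight):
    """Project each length-T token to d_model values; `weight` is d_model x T."""
    return [
        [sum(w * x for w, x in zip(w_row, token)) for w_row in weight]
        for token in tokens
    ]


# Toy input: T = 4 time steps, N = 2 variates.
series = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
normalized, means, stds = normalize_per_variate(series)
tokens = invert(normalized)  # 2 tokens, each of length 4

# Hypothetical projection weights with d_model = 3 (normally learned).
weight = [[0.25] * 4, [1.0, 0.0, 0.0, -1.0], [0.0, 1.0, -1.0, 0.0]]
embedded = embed(tokens, weight)  # 2 tokens, each of length 3
```

Attention is then applied across the N variate tokens rather than across time steps, which is the main difference from a conventional Transformer forecaster.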
- __init__(hidden_size: int, d_model: int, n_head: int, n_layer_decoder: int, use_norm: bool, class_strategy: str = 'projection', dropout_rate: float = 0.1, activation: str = '', **kwargs) -> None
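The documented constraints on the constructor arguments can be checked before a model is built. The helper below is a hypothetical sketch, not part of dsipts: check_itransformer_args and ALLOWED_CLASS_STRATEGIES are names introduced here for illustration, and the dropout range check is an assumption based on dropout being a probability. The set of recognized activation strings is not listed in this page, so it is deliberately left unchecked.

```python
# Values documented for `class_strategy`.
ALLOWED_CLASS_STRATEGIES = ("projection", "average", "cls_token")


def check_itransformer_args(hidden_size, d_model, class_strategy="projection",
                            dropout_rate=0.1):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper for illustration; it does not call dsipts.
    """
    problems = []
    if class_strategy not in ALLOWED_CLASS_STRATEGIES:
        problems.append(
            f"class_strategy must be one of {ALLOWED_CLASS_STRATEGIES}, "
            f"got {class_strategy!r}"
        )
    # Documented recommendation: d_model should be smaller than hidden_size.
    if d_model >= hidden_size:
        problems.append("d_model should be smaller than hidden_size")
    # Assumption: dropout_rate is a probability.
    if not 0.0 <= dropout_rate <= 1.0:
        problems.append("dropout_rate is assumed to be a probability in [0, 1]")
    return problems


ok = check_itransformer_args(hidden_size=64, d_model=32)
bad = check_itransformer_args(hidden_size=32, d_model=32, class_strategy="max")
```

Here ok is an empty list, while bad reports both the unknown strategy and the embedding-size recommendation.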
 
- Methods
- __init__(hidden_size, d_model, n_head, ...) – Initialize the ITransformer model for time series forecasting.
- add_module(name, module) – Add a child module to the current module.
- all_gather(data[, group, sync_grads]) – Gather tensors or collections of tensors from multiple processes.
- apply(fn) – Apply fn recursively to every submodule (as returned by .children()) as well as self.
- backward(loss, *args, **kwargs) – Called to perform backward on the loss returned in training_step().
- bfloat16() – Casts all floating point parameters and buffers to bfloat16 datatype.
- buffers([recurse]) – Return an iterator over module buffers.
- children() – Return an iterator over immediate children modules.
- clip_gradients(optimizer[, ...]) – Handles gradient clipping internally.
- compile(*args, **kwargs) – Compile this Module's forward using torch.compile().
- compute_loss(batch, y_hat) – Custom loss calculation.
- configure_callbacks() – Configure model-specific callbacks.
- configure_gradient_clipping(optimizer[, ...]) – Perform gradient clipping for the optimizer parameters.
- configure_model() – Hook to create modules in a strategy and precision aware context.
- configure_optimizers() – Each model has optim_config and scheduler_config.
- configure_sharded_model() – Deprecated.
- cpu() – See torch.nn.Module.cpu().
- cuda([device]) – Moves all model parameters and buffers to the GPU.
- double() – See torch.nn.Module.double().
- eval() – Set the module in evaluation mode.
- extra_repr() – Return the extra representation of the module.
- float() – See torch.nn.Module.float().
- forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
- forward(batch) – Forward method used during the training loop.
- freeze() – Freeze all params for inference.
- get_buffer(target) – Return the buffer given by target if it exists, otherwise throw an error.
- get_extra_state() – Return any extra state to include in the module's state_dict.
- get_parameter(target) – Return the parameter given by target if it exists, otherwise throw an error.
- get_submodule(target) – Return the submodule given by target if it exists, otherwise throw an error.
- half() – See torch.nn.Module.half().
- inference(batch) – Usually it is ok to return the output of the forward method but sometimes not (e.g. RNN).
- ipu([device]) – Move all model parameters and buffers to the IPU.
- load_from_checkpoint(checkpoint_path[, ...]) – Primary way of loading a model from a checkpoint.
- load_state_dict(state_dict[, strict, assign]) – Copy parameters and buffers from state_dict into this module and its descendants.
- log(name, value[, prog_bar, logger, ...]) – Log a key, value pair.
- log_dict(dictionary[, prog_bar, logger, ...]) – Log a dictionary of values at once.
- lr_scheduler_step(scheduler, metric) – Override this method to adjust the default way the Trainer calls each scheduler.
- lr_schedulers() – Returns the learning rate scheduler(s) that are being used during training.
- manual_backward(loss, *args, **kwargs) – Call this directly from your training_step() when doing optimizations manually.
- modules() – Return an iterator over all modules in the network.
- mtia([device]) – Move all model parameters and buffers to the MTIA.
- named_buffers([prefix, recurse, ...]) – Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
- named_children() – Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
- named_modules([memo, prefix, remove_duplicate]) – Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
- named_parameters([prefix, recurse, ...]) – Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
- on_after_backward() – Called after loss.backward() and before optimizers are stepped.
- on_after_batch_transfer(batch, dataloader_idx) – Override to alter or apply batch augmentations to your batch after it is transferred to the device.
- on_before_backward(loss) – Called before loss.backward().
- on_before_batch_transfer(batch, dataloader_idx) – Override to alter or apply batch augmentations to your batch before it is transferred to the device.
- on_before_optimizer_step(optimizer) – Called before optimizer.step().
- on_before_zero_grad(optimizer) – Called after training_step() and before optimizer.zero_grad().
- on_fit_end() – Called at the very end of fit.
- on_fit_start() – Called at the very beginning of fit.
- on_load_checkpoint(checkpoint) – Called by Lightning to restore your model.
- on_predict_batch_end(outputs, batch, batch_idx) – Called in the predict loop after the batch.
- on_predict_batch_start(batch, batch_idx[, ...]) – Called in the predict loop before anything happens for that batch.
- on_predict_end() – Called at the end of predicting.
- on_predict_epoch_end() – Called at the end of predicting.
- on_predict_epoch_start() – Called at the beginning of predicting.
- on_predict_model_eval() – Called when the predict loop starts.
- on_predict_start() – Called at the beginning of predicting.
- on_save_checkpoint(checkpoint) – Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
- on_test_batch_end(outputs, batch, batch_idx) – Called in the test loop after the batch.
- on_test_batch_start(batch, batch_idx[, ...]) – Called in the test loop before anything happens for that batch.
- on_test_end() – Called at the end of testing.
- on_test_epoch_end() – Called in the test loop at the very end of the epoch.
- on_test_epoch_start() – Called in the test loop at the very beginning of the epoch.
- on_test_model_eval() – Called when the test loop starts.
- on_test_model_train() – Called when the test loop ends.
- on_test_start() – Called at the beginning of testing.
- on_train_batch_end(outputs, batch, batch_idx) – Called in the training loop after the batch.
- on_train_batch_start(batch, batch_idx) – Called in the training loop before anything happens for that batch.
- on_train_end() – Called at the end of training before logger experiment is closed.
- on_train_epoch_end() – PyTorch Lightning hook.
- on_train_epoch_start() – Called in the training loop at the very beginning of the epoch.
- on_train_start() – Called at the beginning of training after sanity check.
- on_validation_batch_end(outputs, batch, ...) – Called in the validation loop after the batch.
- on_validation_batch_start(batch, batch_idx) – Called in the validation loop before anything happens for that batch.
- on_validation_end() – Called at the end of validation.
- on_validation_epoch_end() – PyTorch Lightning hook.
- on_validation_epoch_start() – Called in the validation loop at the very beginning of the epoch.
- on_validation_model_eval() – Called when the validation loop starts.
- on_validation_model_train() – Called when the validation loop ends.
- on_validation_model_zero_grad() – Called by the training loop to release gradients before entering the validation loop.
- on_validation_start() – Called at the beginning of validation.
- optimizer_step(epoch, batch_idx, optimizer) – Override this method to adjust the default way the Trainer calls the optimizer.
- optimizer_zero_grad(epoch, batch_idx, optimizer) – Override this method to change the default behaviour of optimizer.zero_grad().
- optimizers([use_pl_optimizer]) – Returns the optimizer(s) that are being used during training.
- parameters([recurse]) – Return an iterator over module parameters.
- predict_dataloader() – An iterable or collection of iterables specifying prediction samples.
- predict_step(*args, **kwargs) – Step function called during predict().
- prepare_data() – Use this to download and prepare data.
- print(*args, **kwargs) – Prints only from process 0.
- register_backward_hook(hook) – Register a backward hook on the module.
- register_buffer(name, tensor[, persistent]) – Add a buffer to the module.
- register_forward_hook(hook, *[, prepend, ...]) – Register a forward hook on the module.
- register_forward_pre_hook(hook, *[, ...]) – Register a forward pre-hook on the module.
- register_full_backward_hook(hook[, prepend]) – Register a backward hook on the module.
- register_full_backward_pre_hook(hook[, prepend]) – Register a backward pre-hook on the module.
- register_load_state_dict_post_hook(hook) – Register a post-hook to be run after module's load_state_dict() is called.
- register_load_state_dict_pre_hook(hook) – Register a pre-hook to be run before module's load_state_dict() is called.
- register_module(name, module) – Alias for add_module().
- register_parameter(name, param) – Add a parameter to the module.
- register_state_dict_post_hook(hook) – Register a post-hook for the state_dict() method.
- register_state_dict_pre_hook(hook) – Register a pre-hook for the state_dict() method.
- requires_grad_([requires_grad]) – Change if autograd should record operations on parameters in this module.
- save_hyperparameters(*args[, ignore, frame, ...]) – Save arguments to hparams attribute.
- set_extra_state(state) – Set extra state contained in the loaded state_dict.
- set_submodule(target, module[, strict]) – Set the submodule given by target if it exists, otherwise throw an error.
- setup(stage) – Called at the beginning of fit (train + validate), validate, test, or predict.
- share_memory() – See torch.Tensor.share_memory_().
- state_dict(*args[, destination, prefix, ...]) – Return a dictionary containing references to the whole state of the module.
- teardown(stage) – Called at the end of fit (train + validate), validate, test, or predict.
- test_dataloader() – An iterable or collection of iterables specifying test samples.
- test_step(*args, **kwargs) – Operates on a single batch of data from the test set.
- to(*args, **kwargs) – See torch.nn.Module.to().
- to_empty(*, device[, recurse]) – Move the parameters and buffers to the specified device without copying storage.
- to_onnx([file_path, input_sample]) – Saves the model in ONNX format.
- to_torchscript([file_path, method, ...]) – By default compiles the whole model to a ScriptModule.
- toggle_optimizer(optimizer) – Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
- toggled_optimizer(optimizer) – Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
- train([mode]) – Set the module in training mode.
- train_dataloader() – An iterable or collection of iterables specifying training samples.
- training_step(batch, batch_idx) – PyTorch Lightning hook.
- transfer_batch_to_device(batch, device, ...) – Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
- type(dst_type) – See torch.nn.Module.type().
- unfreeze() – Unfreeze all parameters for training.
- untoggle_optimizer(optimizer) – Resets the state of required gradients that were toggled with toggle_optimizer().
- val_dataloader() – An iterable or collection of iterables specifying validation samples.
- validation_step(batch, batch_idx) – PyTorch Lightning hook.
- xpu([device]) – Move all model parameters and buffers to the XPU.
- zero_grad([set_to_none]) – Reset gradients of all model parameters.
- Attributes
- CHECKPOINT_HYPER_PARAMS_KEY
- CHECKPOINT_HYPER_PARAMS_NAME
- CHECKPOINT_HYPER_PARAMS_TYPE
- T_destination
- automatic_optimization – If set to False you are responsible for calling .backward(), .step(), .zero_grad().
- call_super_init
- current_epoch – The current epoch in the Trainer, or 0 if not attached.
- device
- device_mesh – Strategies like ModelParallelStrategy will create a device mesh that can be accessed in the configure_model() hook to parallelize the LightningModule.
- dtype
- dump_patches
- example_input_array – The example input array is a specification of what the module can consume in the forward() method.
- fabric
- global_rank – The index of the current process across all nodes and devices.
- global_step – Total training batches seen across all epochs.
- hparams – The collection of hyperparameters saved with save_hyperparameters().
- hparams_initial – The collection of hyperparameters saved with save_hyperparameters().
- local_rank – The index of the current process within a single node.
- logger – Reference to the logger object in the Trainer.
- loggers – Reference to the list of loggers in the Trainer.
- on_gpu – Returns True if this model is currently located on a GPU.
- strict_loading – Determines how Lightning loads this model using .load_state_dict(..., strict=model.strict_loading).
- trainer
- training