dsipts.models.samformer package¶
Submodules¶
dsipts.models.samformer.utils module¶
- class dsipts.models.samformer.utils.RevIN(num_features: int, eps=1e-05, affine=True)¶
Bases: Module
- Parameters:
num_features – the number of features or channels
eps – a value added for numerical stability
affine – if True, RevIN has learnable affine parameters
- forward(x, mode: str)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
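Usage sketch
The mode string follows the usual RevIN convention ("norm" to normalize each instance, "denorm" to invert that normalization on the model output); the exact accepted values are an assumption about this implementation, so treat the snippet below as a minimal illustration rather than a verbatim recipe.
>>> import torch
>>> from dsipts.models.samformer.utils import RevIN
>>> x = torch.randn(32, 96, 7)           # (batch, sequence length, channels)
>>> revin = RevIN(num_features=7, eps=1e-05, affine=True)
>>> x_norm = revin(x, mode="norm")       # normalize each instance before the backbone
>>> y_hat = x_norm                       # stand-in for the backbone's forecast
>>> y = revin(y_hat, mode="denorm")      # invert the normalization on the way out
>>> y.shape
torch.Size([32, 96, 7])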
- class dsipts.models.samformer.utils.SAM(params, base_optimizer, rho=0.05, adaptive=False, **kwargs)¶
Bases: Optimizer
- first_step(zero_grad=False)¶
- load_state_dict(state_dict)¶
Load the optimizer state.
- Parameters:
state_dict (dict) – optimizer state. Should be an object returned from a call to state_dict().
Warning
Make sure this method is called after initializing torch.optim.lr_scheduler.LRScheduler, as calling it beforehand will overwrite the loaded learning rates.
Note
The names of the parameters (if they exist under the “param_names” key of each param group in state_dict()) will not affect the loading process. To use the parameters’ names for custom cases (such as when the parameters in the loaded state dict differ from those initialized in the optimizer), a custom register_load_state_dict_pre_hook should be implemented to adapt the loaded dict accordingly. If param_names exist in the loaded state dict’s param_groups, they will be saved and will override the current names, if present, in the optimizer state. If they do not exist in the loaded state dict, the optimizer param_names will remain unchanged.
Example
>>> # xdoctest: +SKIP
>>> model = torch.nn.Linear(10, 10)
>>> optim = torch.optim.SGD(model.parameters(), lr=3e-4)
>>> scheduler1 = torch.optim.lr_scheduler.LinearLR(
...     optim,
...     start_factor=0.1,
...     end_factor=1,
...     total_iters=20,
... )
>>> scheduler2 = torch.optim.lr_scheduler.CosineAnnealingLR(
...     optim,
...     T_max=80,
...     eta_min=3e-5,
... )
>>> lr = torch.optim.lr_scheduler.SequentialLR(
...     optim,
...     schedulers=[scheduler1, scheduler2],
...     milestones=[20],
... )
>>> lr.load_state_dict(torch.load("./save_seq.pt"))
>>> # now load the optimizer checkpoint after loading the LRScheduler
>>> optim.load_state_dict(torch.load("./save_optim.pt"))
- second_step(zero_grad=False)¶
- step(closure=None)¶
Perform a single optimization step to update the parameters.
- Parameters:
closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
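Usage sketch
SAM needs two forward/backward passes per batch: first_step() perturbs the weights towards the worst case inside the rho-ball, second_step() restores them and applies the base optimizer update. The sketch below assumes, as in the reference SAM recipe, that extra keyword arguments (such as lr) are forwarded to base_optimizer; it is an illustration, not a verbatim training loop from this package.
>>> import torch
>>> from dsipts.models.samformer.utils import SAM
>>> model = torch.nn.Linear(10, 1)
>>> criterion = torch.nn.MSELoss()
>>> optimizer = SAM(model.parameters(), base_optimizer=torch.optim.SGD, rho=0.05, lr=1e-3)
>>> x, y = torch.randn(32, 10), torch.randn(32, 1)
>>> # first pass: gradients at the current weights, then climb to the worst-case point
>>> criterion(model(x), y).backward()
>>> optimizer.first_step(zero_grad=True)
>>> # second pass: gradients at the perturbed weights, then restore and update
>>> criterion(model(x), y).backward()
>>> optimizer.second_step(zero_grad=True)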
- dsipts.models.samformer.utils.scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False, scale=None)¶
A copy-paste of the reference implementation from https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html
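Usage sketch
Since this mirrors the PyTorch reference implementation, it should accept the usual (..., sequence length, head dimension) query/key/value tensors; a minimal sketch under that assumption:
>>> import torch
>>> from dsipts.models.samformer.utils import scaled_dot_product_attention
>>> q = torch.randn(2, 4, 16, 32)   # (batch, heads, sequence length, head dim)
>>> k = torch.randn(2, 4, 16, 32)
>>> v = torch.randn(2, 4, 16, 32)
>>> # softmax(q @ k^T / sqrt(head dim)) @ v, optionally masked, causal, with dropout
>>> out = scaled_dot_product_attention(q, k, v, is_causal=False)
>>> out.shape
torch.Size([2, 4, 16, 32])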