autocast.processors.vit
- class PatchEmbedding(dim_in, hidden_dim, groups=12, n_spatial_dims=2, patch_size=None)
Bases: Module
Image to Patch Embedding.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
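The note above can be illustrated without PyTorch. The following is a minimal sketch, assuming a toy stand-in class (`ToyModule` and `Double` are hypothetical names, not part of `autocast` or `torch`) that mimics how calling an instance dispatches to `forward` and then runs registered forward hooks, whereas calling `forward` directly skips them.

```python
class ToyModule:
    """Toy stand-in for the call/hook dispatch of torch.nn.Module (not the real class)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks receive (module, inputs, output), as forward hooks do in PyTorch.
        self._forward_hooks.append(hook)

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            result = hook(self, (x,), out)
            if result is not None:  # a hook may replace the output
                out = result
        return out


class Double(ToyModule):
    """Hypothetical subclass that overrides forward."""

    def forward(self, x):
        return 2 * x


calls = []
layer = Double()
layer.register_forward_hook(lambda module, inputs, output: calls.append(output))

via_call = layer(3)             # runs forward, then the registered hook
via_forward = layer.forward(3)  # same result, but the hook is never invoked
```

Both calls return the same value, but only the first one is recorded by the hook, which is why the instance should be called rather than `forward`.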
- class PatchUnembedding(dim_out, hidden_dim=768, groups=12, n_spatial_dims=2, patch_size=None)
Bases: Module
Patch to Image Unembedding.
- Parameters:
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
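To make the image-to-patch and patch-to-image geometry concrete, here is a minimal pure-Python sketch. It assumes non-overlapping square patches on a 2D grid whose sides are divisible by the patch size. The helper names `patchify` and `unpatchify` are hypothetical, and the learned layers that PatchEmbedding and PatchUnembedding apply (controlled by `hidden_dim` and `groups`) are not modelled here.

```python
def patchify(image, patch):
    """Split an H x W nested list into non-overlapping patch x patch tiles.

    Returns flattened tiles in row-major tile order.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("patch size must divide both spatial dimensions")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append(
                [image[top + i][left + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def unpatchify(tokens, height, width, patch):
    """Inverse of patchify: write each flattened tile back to its grid position."""
    image = [[None] * width for _ in range(height)]
    tiles_per_row = width // patch
    for index, token in enumerate(tokens):
        top = (index // tiles_per_row) * patch
        left = (index % tiles_per_row) * patch
        for k, value in enumerate(token):
            image[top + k // patch][left + k % patch] = value
    return image


image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = patchify(image, patch=2)  # 4 tokens of 4 values each
restored = unpatchify(tokens, height=4, width=4, patch=2)
```

A 4 x 4 grid with `patch=2` yields four tokens, and unpatchifying them recovers the original grid.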
- class AxialAttentionBlock(hidden_dim=768, num_heads=12, n_spatial_dims=2, drop_path=0.0, layer_scale_init_value=1e-06, n_noise_channels=None)
Bases: Module
Axial attention block for multi-dimensional feature processing.
This module performs scaled dot-product attention over spatial axes, enabling efficient attention computation for multi-dimensional inputs.
- Parameters:
- forward(x, x_noise)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
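The idea of attending over one spatial axis at a time can be sketched in plain Python. This is an illustration only, under simplifying assumptions: a single head, no learned query/key/value projections (Q = K = V), and no noise conditioning, layer scale, or drop path. The function names are hypothetical, and how AxialAttentionBlock combines the per-axis results is not specified here.

```python
import math


def attend(sequence):
    """Scaled dot-product self-attention over a list of vectors, with Q = K = V."""
    dim = len(sequence[0])
    output = []
    for query in sequence:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in sequence
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        output.append(
            [
                sum(w * value[d] for w, value in zip(weights, sequence)) / total
                for d in range(dim)
            ]
        )
    return output


def axial_attention(grid, axis):
    """Apply attend() independently along one axis of an H x W grid of vectors."""
    if axis == 1:  # attend within each row (along the width)
        return [attend(row) for row in grid]
    # attend within each column: transpose, attend, transpose back
    columns = [list(col) for col in zip(*grid)]
    attended = [attend(col) for col in columns]
    return [list(row) for row in zip(*attended)]


# H = 2, W = 3, feature dimension 2
grid = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]],
]
along_width = axial_attention(grid, axis=1)
along_height = axial_attention(grid, axis=0)
```

For an H x W grid, full attention over all H·W tokens compares (H·W)² token pairs, whereas attending along each axis separately compares H·W·(H + W) pairs, which is where the efficiency gain comes from.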
- class AViT(dim_in, dim_out, n_spatial_dims, spatial_resolution, hidden_dim=768, num_heads=12, processor_blocks=8, drop_path=0.0, groups=12, n_noise_channels=None, patch_size=None)
Bases: Module
Uses axial attention to predict forward dynamics.
This simplified version just stacks time in channels.
- Parameters:
- forward(x, x_noise=None)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
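"Stacking time in channels" can be pictured as folding the time axis into the channel axis, so that T frames with C channels each become a single input with T·C channels. The sketch below uses nested lists and assumes a time-major `[T][C][H][W]` layout. Both the layout and the helper name `stack_time_in_channels` are assumptions made for illustration, not something the AViT documentation states.

```python
def stack_time_in_channels(frames):
    """Flatten a [T][C][H][W] nested list into [T*C][H][W].

    Channels of frame 0 come first, then channels of frame 1, and so on.
    """
    return [channel for frame in frames for channel in frame]


T, C, H, W = 3, 2, 4, 4
# Each channel map is filled with the value t * 10 + c so its origin is easy to trace.
frames = [
    [[[t * 10 + c for _ in range(W)] for _ in range(H)] for c in range(C)]
    for t in range(T)
]
stacked = stack_time_in_channels(frames)
```

Here `stacked` has T·C = 6 channel maps, and the map at index `t * C + c` is the one that came from time step `t`, channel `c`.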
- class AViTProcessor(in_channels, out_channels, spatial_resolution, hidden_dim=64, num_heads=4, n_layers=4, drop_path=0.0, groups=8, loss_func=None, n_noise_channels=None, patch_size=None)
Bases: Processor[EncodedBatch]
Vision Transformer Module.
- Parameters:
- forward(x, x_noise=None)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- loss(batch)
Compute loss between output and target.
- Parameters:
batch (EncodedBatch)
- Return type: