auto_model
- ignite.distributed.auto.auto_model(model, sync_bn=False, use_fsdp=False, **kwargs)[source]
Helper method to adapt provided model for non-distributed and distributed configurations (supporting all available backends from available_backends()). Internally, we perform the following:

- send model to current device() if model's parameters are not on the device.
- wrap the model with torch DistributedDataParallel for native torch distributed if world size is larger than 1 (see the sketch after this list).
- wrap the model with torch FSDP2 fully_shard instead of DDP if use_fsdp=True and native torch distributed is used with world size larger than 1.
- wrap the model with torch DataParallel if no distributed context is found and more than one CUDA device is available.
- broadcast the initial variable states from rank 0 to all other processes if Horovod distributed framework is used.
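As a minimal sketch of the DDP case (the two-process nccl launch below is purely illustrative):

```python
import torch.nn as nn
import ignite.distributed as idist

def training(local_rank):
    model = nn.Linear(10, 2)
    # auto_model moves the model to idist.device() and, since a native torch
    # distributed backend is used with world size > 1, wraps it in
    # DistributedDataParallel.
    model = idist.auto_model(model)
    print(idist.get_rank(), type(model).__name__, idist.device())

# Illustrative launch: 2 processes per node with the nccl backend.
with idist.Parallel(backend="nccl", nproc_per_node=2) as parallel:
    parallel.run(training)
```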
- Parameters:
  - model (Module) – model to adapt.
  - sync_bn (bool) – if True, applies torch convert_sync_batchnorm to the model for native torch distributed only. Default, False. Note, if using Nvidia/Apex, batchnorm conversion should be applied before calling amp.initialize. Incompatible with use_fsdp=True.
  - use_fsdp (bool) – if True, applies torch FSDP2 fully_shard to the model instead of wrapping with DistributedDataParallel for native torch distributed backends (NCCL, GLOO, MPI). Default, False. When enabled, kwargs are forwarded to fully_shard(), allowing control over reshard_after_forward, mp_policy, offload_policy, etc. Note: FSDP2 does not support auto_wrap_policy; manually call fully_shard() on submodules before passing the model to auto_model. Requires PyTorch >= 2.0.
  - kwargs (Any) – kwargs forwarded to the wrapping class: torch DistributedDataParallel, torch FSDP2 fully_shard (when use_fsdp=True), or torch DataParallel if applicable. Please make sure to use acceptable kwargs for the given backend (see the sketch after this list).
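For illustration, a minimal sketch combining sync_bn with a kwarg forwarded to DistributedDataParallel; the model architecture and the find_unused_parameters setting are arbitrary choices for the example:

```python
import torch.nn as nn
import ignite.distributed as idist

model = nn.Sequential(nn.Linear(10, 10), nn.BatchNorm1d(10), nn.Linear(10, 2))

# Under a native torch distributed backend with world size > 1, BatchNorm layers
# are converted to SyncBatchNorm and the extra kwarg is forwarded to
# DistributedDataParallel.
model = idist.auto_model(model, sync_bn=True, find_unused_parameters=True)
```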
- Returns:
torch.nn.Module
- Return type:
  torch.nn.Module
Examples
```python
import ignite.distributed as idist

model = idist.auto_model(model)
```
In addition, with NVidia/Apex it can be used in the following way:
```python
from apex import amp
import ignite.distributed as idist

model, optimizer = amp.initialize(model, optimizer, opt_level=opt_level)
model = idist.auto_model(model)
```
To use FSDP2 with bf16 mixed precision:
```python
import torch
import ignite.distributed as idist
from torch.distributed._composable.fsdp import fully_shard, MixedPrecisionPolicy

bf16_policy = MixedPrecisionPolicy(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.bfloat16,
)

# Optionally shard submodules first:
for layer in model.layers:
    fully_shard(layer)

model = idist.auto_model(model, use_fsdp=True, mp_policy=bf16_policy)
```
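Other FSDP2 kwargs are forwarded in the same way; a minimal sketch, assuming a PyTorch build that provides CPUOffloadPolicy alongside fully_shard:

```python
import ignite.distributed as idist
from torch.distributed._composable.fsdp import CPUOffloadPolicy

# Keep parameters unsharded between forward and backward and offload them to
# CPU; both kwargs are forwarded by auto_model to fully_shard().
model = idist.auto_model(
    model,
    use_fsdp=True,
    reshard_after_forward=False,
    offload_policy=CPUOffloadPolicy(),
)
```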
Changed in version 0.4.2: Added Horovod distributed framework. Added sync_bn argument.

Changed in version 0.4.3: Added kwargs to idist.auto_model.