mlreco.utils.adabound module

class mlreco.utils.adabound.AdaBound(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements the AdaBound algorithm, proposed in "Adaptive Gradient Methods with Dynamic Bound of Learning Rate".

Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – Adam learning rate (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • final_lr (float, optional) – final (SGD) learning rate (default: 0.1)

  • gamma (float, optional) – convergence speed of the bound functions (default: 1e-3)

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)

  • amsbound (boolean, optional) – whether to use the AMSBound variant of this algorithm (default: False)
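The role of final_lr and gamma is easiest to see from the bound functions themselves. The sketch below follows the bounds used in the reference implementation that accompanies the AdaBound paper; it is an assumption that this module uses the same formulas, and the helper name adabound_bounds is hypothetical. It also ignores the rescaling of final_lr that the reference code applies when lr is changed by a scheduler.

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Return (lower, upper) clipping bounds on the per-parameter step size.

    `step` is the 1-indexed optimization step. Both bounds converge to
    `final_lr` as `step` grows; a larger `gamma` makes them converge faster.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper


# Early steps leave the Adam step size almost unconstrained; late steps
# squeeze it toward final_lr, which is where the SGD-like behaviour comes from.
for t in (1, 1_000, 1_000_000):
    lo, hi = adabound_bounds(t)
    print(f"step={t:>9}: lower={lo:.6f}, upper={hi:.6f}")
```

With the defaults, the interval is very wide at the first step and has narrowed to roughly [0.05, 0.2] by step 1000.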

__init__(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Constructs the optimizer. The arguments are those described under Parameters above.

__setstate__(state)[source]

step(closure=None)[source]

Performs a single optimization step.

Parameters
  • closure (callable, optional) – a closure that reevaluates the model and returns the loss
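To illustrate the step(closure) contract without depending on PyTorch, here is a minimal pure-Python sketch of one AdaBound update on a single scalar. Param and ScalarAdaBound are hypothetical stand-ins, not part of this module. The update rule follows the paper's reference implementation with weight decay and AMSBound omitted; whether this module matches it line for line is an assumption.

```python
import math


class Param:
    """Hypothetical stand-in for a tensor: one scalar value and its gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class ScalarAdaBound:
    """Toy scalar optimizer (assumed update rule; no weight decay, no AMSBound)."""

    def __init__(self, param, lr=1e-3, betas=(0.9, 0.999),
                 final_lr=0.1, gamma=1e-3, eps=1e-8):
        self.param = param
        self.lr, self.betas = lr, betas
        self.final_lr, self.gamma, self.eps = final_lr, gamma, eps
        self.t = 0    # number of steps taken so far
        self.m = 0.0  # running average of the gradient
        self.v = 0.0  # running average of the squared gradient

    def step(self, closure=None):
        # The closure re-evaluates the model, refreshes param.grad,
        # and returns the loss, which step() passes back to the caller.
        loss = closure() if closure is not None else None
        g = self.param.grad
        beta1, beta2 = self.betas
        self.t += 1
        self.m = beta1 * self.m + (1 - beta1) * g
        self.v = beta2 * self.v + (1 - beta2) * g * g
        # Bias-corrected Adam step size divided by the RMS gradient estimate.
        adam_lr = self.lr * math.sqrt(1 - beta2 ** self.t) / (1 - beta1 ** self.t)
        rate = adam_lr / (math.sqrt(self.v) + self.eps)
        # Clip the effective rate into bounds that tighten around final_lr.
        lower = self.final_lr * (1 - 1 / (self.gamma * self.t + 1))
        upper = self.final_lr * (1 + 1 / (self.gamma * self.t))
        rate = min(max(rate, lower), upper)
        self.param.value -= rate * self.m
        return loss


w = Param(5.0)
opt = ScalarAdaBound(w)


def closure():
    # Toy model: loss = (w - 2)^2, so the gradient is 2 * (w - 2).
    w.grad = 2.0 * (w.value - 2.0)
    return (w.value - 2.0) ** 2


first_loss = opt.step(closure)  # loss evaluated before the first update
for _ in range(2000):
    last_loss = opt.step(closure)
print(first_loss, last_loss, w.value)
```

With the real optimizer, the closure would typically zero the gradients, run the forward pass, call backward() on the loss, and return it.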

class mlreco.utils.adabound.AdaBoundW(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements the AdaBound algorithm with decoupled weight decay (arxiv.org/abs/1711.05101). AdaBound itself was proposed in "Adaptive Gradient Methods with Dynamic Bound of Learning Rate".

Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – Adam learning rate (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • final_lr (float, optional) – final (SGD) learning rate (default: 0.1)

  • gamma (float, optional) – convergence speed of the bound functions (default: 1e-3)

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)

  • weight_decay (float, optional) – weight decay, applied in decoupled form rather than as an L2 penalty on the gradient (default: 0)

  • amsbound (boolean, optional) – whether to use the AMSBound variant of this algorithm (default: False)
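The practical difference between the two classes is where weight_decay enters the update. The sketch below contrasts the two treatments on scalars. Both helper names are hypothetical, and the formulas follow the reference AdaBound package, where the decoupled decay is subtracted from the weight directly; it is an assumption that this module behaves the same way.

```python
def l2_coupled_gradient(grad, weight, weight_decay):
    """AdaBound-style: fold the penalty into the gradient.

    The decay term then flows through the moment estimates and is
    rescaled by the adaptive step size like any other gradient component.
    """
    return grad + weight_decay * weight


def decoupled_update(weight, adaptive_step, weight_decay):
    """AdaBoundW-style: apply the adaptive step, then shrink the weight directly.

    `adaptive_step` is the already-clipped step computed from the
    undecayed gradient, so the decay never touches the moment estimates.
    """
    return weight - adaptive_step - weight_decay * weight


# Same weight and decay coefficient, two different places the decay is applied.
print(l2_coupled_gradient(grad=0.5, weight=2.0, weight_decay=0.01))
print(decoupled_update(weight=2.0, adaptive_step=0.1, weight_decay=0.01))
```

Because the decoupled form bypasses the adaptive scaling, the same weight_decay value generally does not give equivalent regularization in AdaBound and AdaBoundW.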

__init__(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Constructs the optimizer. The arguments are those described under Parameters above.

__setstate__(state)[source]

step(closure=None)[source]

Performs a single optimization step.

Parameters
  • closure (callable, optional) – a closure that reevaluates the model and returns the loss
