quri_parts.algo.optimizer.adam module#
- class OptimizerStateAdam(params, cost=0.0, status=OptimizerStatus.SUCCESS, niter=0, funcalls=0, gradcalls=0, m=<factory>, v=<factory>)#
Bases: OptimizerState
Optimizer state for Adam.
- Parameters:
params (algo.optimizer.interface.Params) –
cost (float) –
status (OptimizerStatus) –
niter (int) –
funcalls (int) –
gradcalls (int) –
m (algo.optimizer.interface.Params) –
v (algo.optimizer.interface.Params) –
- m: Params#
- v: Params#
- class Adam(lr=0.05, betas=(0.9, 0.999), eps=1e-09, ftol=1e-05)#
Bases: Optimizer
Adam optimization algorithm proposed in [1].
- Parameters:
lr (float) – learning rate.
betas (Sequence[float]) – coefficients used in the update rules of the moving averages of the gradient and its magnitude. betas represents the robustness of the optimizer; hence, when using sampling, higher values for betas are recommended.
eps (float) – a small scalar number used to avoid division by zero.
ftol (Optional[float]) – If not None, convergence is judged by the cost function tolerance; see ftol() for details.
- Ref:
[1] Adam: A Method for Stochastic Optimization, Diederik P. Kingma, Jimmy Ba (2014). https://arxiv.org/abs/1412.6980.
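For reference, a minimal NumPy sketch of the standard Adam update from [1], illustrating how the m and v fields of OptimizerStateAdam track moving averages of the gradient and its squared magnitude. This is an illustrative transcription of the paper's rule, not the library's internal code, which may differ in details.

```python
import numpy as np

def adam_update(params, grad, m, v, t, lr=0.05, betas=(0.9, 0.999), eps=1e-9):
    """One Adam step as described in [1] (illustrative sketch, not library code)."""
    beta1, beta2 = betas
    m = beta1 * m + (1 - beta1) * grad        # moving average of the gradient
    v = beta2 * v + (1 - beta2) * grad ** 2   # moving average of its squared magnitude
    m_hat = m / (1 - beta1 ** t)              # bias correction (t is the 1-based step count)
    v_hat = v / (1 - beta2 ** t)
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v
```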
- get_init_state(init_params)#
Returns an initial state for optimization.
- Parameters:
init_params (algo.optimizer.interface.Params) –
- Return type:
OptimizerStateAdam
- step(state, cost_function, grad_function=None)#
Runs a single optimization step and returns a new state.
- Parameters:
state (OptimizerState) –
cost_function (algo.optimizer.interface.CostFunction) –
grad_function (algo.optimizer.interface.GradientFunction | None) –
- Return type:
OptimizerStateAdam
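A minimal usage sketch of the get_init_state()/step() loop documented above, minimizing a simple quadratic cost. It assumes Adam and OptimizerStatus are importable from quri_parts.algo.optimizer; treat the details as illustrative rather than canonical.

```python
# Illustrative sketch: minimize a convex quadratic cost with Adam.
# Assumes Adam and OptimizerStatus are exported from quri_parts.algo.optimizer.
import numpy as np
from quri_parts.algo.optimizer import Adam, OptimizerStatus

def cost_fn(params):
    # f(x) = ||x - 1||^2, minimized at params == 1
    return float(np.sum((params - 1.0) ** 2))

def grad_fn(params):
    # analytic gradient of cost_fn
    return 2.0 * (params - 1.0)

optimizer = Adam(lr=0.05, ftol=1e-5)
state = optimizer.get_init_state(np.zeros(3))

for _ in range(1000):
    state = optimizer.step(state, cost_fn, grad_fn)
    if state.status in (OptimizerStatus.CONVERGED, OptimizerStatus.FAILED):
        break

print(state.niter, state.cost, state.params)
```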
- class AdaBelief(lr=0.001, betas=(0.9, 0.99), eps=1e-16, ftol=1e-05)#
Bases: Adam
AdaBelief optimization algorithm proposed in [1].
- Parameters:
lr (float) – learning rate.
betas (Sequence[float]) – coefficients used in the update rules of the moving averages of the gradient and its magnitude. betas represents the robustness of the optimizer; hence, when using sampling, higher values for betas are recommended.
eps (float) – a small scalar number used to avoid division by zero.
ftol (Optional[float]) – If not None, convergence is judged by the cost function tolerance; see ftol() for details.
- Ref:
[1] AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients, Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, James S. Duncan (2020). https://arxiv.org/abs/2010.07468.
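For comparison with Adam, a sketch of the second-moment update that distinguishes AdaBelief in [1]: instead of averaging the squared gradient, it averages the squared deviation of the gradient from its moving average (the "belief"), while the rest of the step mirrors Adam. This follows the paper's rule and is not a transcription of the library's internals.

```python
import numpy as np

def adabelief_second_moment(grad, m, s, beta2=0.99, eps=1e-16):
    """AdaBelief's s_t update from [1] (illustrative sketch, not library code)."""
    # Adam:      v_t = beta2 * v_{t-1} + (1 - beta2) * grad**2
    # AdaBelief: s_t = beta2 * s_{t-1} + (1 - beta2) * (grad - m)**2 + eps
    return beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps
```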