quri_parts.algo.optimizer.adam module#
- class OptimizerStateAdam(params, cost=0.0, status=OptimizerStatus.SUCCESS, niter=0, funcalls=0, gradcalls=0, m=<factory>, v=<factory>)#
Bases: OptimizerState
Optimizer state for Adam.
- Parameters:
params (algo.optimizer.interface.Params) –
cost (float) –
status (OptimizerStatus) –
niter (int) –
funcalls (int) –
gradcalls (int) –
m (algo.optimizer.interface.Params) –
v (algo.optimizer.interface.Params) –
- m: Params#
Moving average of the gradient (the first moment estimate).
- v: Params#
Moving average of the squared gradient (the second moment estimate).
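A short inspection sketch (an illustrative assumption, not part of the original documentation): it builds a state via Adam.get_init_state(), documented below, and assumes Adam is importable from this module with parameters given as a numpy array.

```python
import numpy as np

from quri_parts.algo.optimizer.adam import Adam

# Build an initial Adam state; the Adam class is documented below.
state = Adam().get_init_state(np.array([0.1, 0.2]))

# The state bundles the current parameters with the moment estimates m and v
# and the bookkeeping counters inherited from OptimizerState.
print(state.params, state.m, state.v)
print(state.niter, state.funcalls, state.gradcalls)
```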
- class Adam(lr=0.05, betas=(0.9, 0.999), eps=1e-09, ftol=1e-05)#
Bases: Optimizer
Adam optimization algorithm proposed in [1].
- Parameters:
lr (float) – learning rate.
betas (Sequence[float]) – coefficients used in the update rules of the moving averages of the gradient and its magnitude. betas controls the robustness of the optimizer; hence, when using sampling, higher values for betas are recommended.
eps (float) – a small scalar number used for avoiding zero division.
ftol (Optional[float]) – If not None, judge convergence by cost function tolerance. See ftol() for details.
- Ref:
[1] Adam: A Method for Stochastic Optimization, Diederik P. Kingma, Jimmy Ba (2014). https://arxiv.org/abs/1412.6980.
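A minimal construction sketch (an illustration under assumptions, not from the original documentation): it assumes Adam is importable from this module and that initial parameters are passed as a numpy array, as suggested by the Params type above.

```python
import numpy as np

from quri_parts.algo.optimizer.adam import Adam

# Hyperparameters shown are the documented defaults.
optimizer = Adam(lr=0.05, betas=(0.9, 0.999), eps=1e-9, ftol=1e-5)

# get_init_state() builds an OptimizerStateAdam for the given initial
# parameters (in standard Adam the moving averages m and v start at zero).
init_state = optimizer.get_init_state(np.array([1.0, -2.0]))
```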
- get_init_state(init_params)#
Returns an initial state for optimization.
- Parameters:
init_params (algo.optimizer.interface.Params) –
- Return type:
OptimizerStateAdam
- step(state, cost_function, grad_function=None)#
Run a single optimization step and return a new state.
- Parameters:
state (OptimizerState) –
cost_function (algo.optimizer.interface.CostFunction) –
grad_function (algo.optimizer.interface.GradientFunction | None) –
- Return type:
OptimizerStateAdam
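A sketch of a full optimization loop built on step(). This is an illustrative assumption rather than library documentation: the quadratic cost and its gradient are hypothetical stand-ins, and OptimizerStatus is assumed to be importable from quri_parts.algo.optimizer.interface.

```python
import numpy as np

from quri_parts.algo.optimizer.adam import Adam
from quri_parts.algo.optimizer.interface import OptimizerStatus

# Hypothetical objective: a quadratic bowl f(x) = sum(x_i^2), minimized at 0.
def cost_fn(params: np.ndarray) -> float:
    return float(np.sum(params ** 2))

# Its analytic gradient, df/dx_i = 2 * x_i.
def grad_fn(params: np.ndarray) -> np.ndarray:
    return 2.0 * params

optimizer = Adam(ftol=1e-5)
state = optimizer.get_init_state(np.array([1.0, -2.0]))

# Iterate until the ftol-based convergence check (or a failure) moves the
# status away from SUCCESS, with an iteration cap as a safety net.
for _ in range(1000):
    state = optimizer.step(state, cost_fn, grad_fn)
    if state.status != OptimizerStatus.SUCCESS:
        break

print(state.params, state.cost, state.niter)
```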
- class AdaBelief(lr=0.001, betas=(0.9, 0.99), eps=1e-16, ftol=1e-05)#
Bases: Adam
AdaBelief optimization algorithm proposed in [1].
- Parameters:
lr (float) – learning rate.
betas (Sequence[float]) – coefficients used in the update rules of the moving averages of the gradient and its magnitude. betas controls the robustness of the optimizer; hence, when using sampling, higher values for betas are recommended.
eps (float) – a small scalar number used for avoiding zero division.
ftol (Optional[float]) – If not None, judge convergence by cost function tolerance. See ftol() for details.
- Ref:
[1] AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients, Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, James S. Duncan (2020). https://arxiv.org/abs/2010.07468.
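Since AdaBelief subclasses Adam, it exposes the same get_init_state()/step() interface and can be swapped in directly; a minimal sketch under the same import assumptions as the Adam examples above:

```python
import numpy as np

from quri_parts.algo.optimizer.adam import AdaBelief

# Same protocol as Adam; only the defaults and the second-moment
# ("belief") update rule differ.
optimizer = AdaBelief(lr=1e-3, betas=(0.9, 0.99), eps=1e-16, ftol=1e-5)
state = optimizer.get_init_state(np.array([0.5, 0.5]))
```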