How can I print the Learning Rate at each epoch with Adam optimizer in Keras?







Because online learning does not work well with Keras when using an adaptive optimizer (the learning-rate schedule resets when .fit() is called), I want to see if I can simply set it manually. To do that, however, I need to know what the learning rate was at the last epoch.





That said, how can I print the learning rate at each epoch? I think I can do it through a callback but it seems that you have to recalculate it each time and I'm not sure how to do that with Adam.



I found this in another thread but it only works with SGD:


from keras.callbacks import Callback
from keras import backend as K

class SGDLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs=None):
        optimizer = self.model.optimizer
        # Time-based decay: lr / (1 + decay * iterations)
        lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay *
                                          K.cast(optimizer.iterations, K.dtype(optimizer.decay)))))
        print('\nLR: {:.6f}\n'.format(lr))
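
For reference, this is roughly how I attach it (toy model and data, just for illustration):

# Toy model and data, only to show how the callback is passed to .fit().
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

model = Sequential([Dense(1, input_dim=4)])
model.compile(optimizer=SGD(lr=0.01, decay=1e-4), loss='mse')

x, y = np.random.rand(32, 4), np.random.rand(32, 1)
model.fit(x, y, epochs=3, callbacks=[SGDLearningRateTracker()])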





Your question doesn't have an answer. Adam does not have a single learning rate. – Ricardo Cruz, Apr 3 at 16:48




2 Answers



This piece of code might help you. It is based on the Keras implementation of the Adam optimizer (the beta values are the Keras defaults):


from keras.callbacks import Callback
from keras import backend as K

class AdamLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs=None):
        beta_1, beta_2 = 0.9, 0.999  # Keras defaults
        optimizer = self.model.optimizer
        # Base lr, with the optimizer's time-based decay applied if it is set
        lr = K.eval(optimizer.lr)
        if K.eval(optimizer.decay) > 0:
            lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay *
                                              K.cast(optimizer.iterations, K.floatx()))))
        # Adam's bias-corrected step size at iteration t
        t = K.cast(optimizer.iterations, K.floatx()) + 1
        lr_t = lr * K.eval(K.sqrt(1. - K.pow(beta_2, t)) / (1. - K.pow(beta_1, t)))
        print('\nLR: {:.6f}\n'.format(lr_t))
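
Note that the multiplier applied to lr above is Adam's bias-correction factor, so the printed value changes from epoch to epoch even when decay is 0, and it slowly approaches the base learning rate. A quick sanity check with the default betas (plain Python, independent of Keras):

# The factor sqrt(1 - beta_2^t) / (1 - beta_1^t) tends to 1 as t grows,
# so the printed LR converges to the base learning rate.
import math

beta_1, beta_2 = 0.9, 0.999
for t in (1, 10, 100, 1000, 10000):
    print(t, math.sqrt(1. - beta_2 ** t) / (1. - beta_1 ** t))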


from keras.callbacks import Callback
from keras import backend as K

class MyCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        # Apply the optimizer's time-based decay, if any
        decay = self.model.optimizer.decay
        iterations = self.model.optimizer.iterations
        lr_with_decay = lr / (1. + decay * K.cast(iterations, K.dtype(decay)))
        print(K.eval(lr_with_decay))



Follow this thread.
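
If it helps, here is a minimal (hypothetical) way to try it; lr_with_decay only differs from the base lr if the optimizer was created with a non-zero decay:

# Toy model and data; decay must be non-zero for the printed value to change.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential([Dense(1, input_dim=4)])
model.compile(optimizer=Adam(lr=0.001, decay=1e-4), loss='mse')

x, y = np.random.rand(32, 4), np.random.rand(32, 1)
model.fit(x, y, epochs=3, callbacks=[MyCallback()])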





That's not the learning rate used by Adam. That's SGD with decay. – Ricardo Cruz, Apr 3 at 16:37






