How to decay your learning rate

Lewkowycz, Aitor

Computer Science > Machine Learning

arXiv:2103.12682 (cs)

[Submitted on 23 Mar 2021]

Abstract: Complex learning rate schedules have become an integral part of deep learning. We find empirically that common fine-tuned schedules decay the learning rate after the weight norm bounces. This leads to the proposal of ABEL: an automatic scheduler which decays the learning rate by keeping track of the weight norm. ABEL's performance matches that of tuned schedules and is more robust with respect to its parameters. Through extensive experiments in vision, NLP, and RL, we show that if the weight norm does not bounce, we can simplify schedules even further with no loss in performance. In such cases, a complex schedule has similar performance to a constant learning rate with a decay at the end of training.
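
The idea in the abstract, decaying the learning rate once the weight norm bounces and otherwise decaying once near the end of training, can be illustrated with a small sketch. This is not the paper's ABEL algorithm; it is a simplified illustration under stated assumptions. The class name `BounceDecayScheduler`, the bounce test (norm goes from decreasing to increasing), the decay factor of 0.1, and the 85% end-of-training threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BounceDecayScheduler:
    """Hypothetical scheduler: decay the learning rate when the weight norm bounces.

    Simplified illustration only; not the ABEL algorithm from the paper.
    """
    lr: float
    total_epochs: int
    decay_factor: float = 0.1           # assumed value
    final_decay_fraction: float = 0.85  # assumed value
    _prev_norm: Optional[float] = field(default=None, init=False)
    _was_decreasing: bool = field(default=False, init=False)
    _has_decayed: bool = field(default=False, init=False)

    def step(self, epoch: int, weight_norm: float) -> float:
        """Record this epoch's weight norm and return the learning rate to use."""
        if self._prev_norm is not None:
            delta = weight_norm - self._prev_norm
            if delta < 0:
                # The norm is still shrinking; remember that for bounce detection.
                self._was_decreasing = True
            elif delta > 0 and self._was_decreasing:
                # The norm went down and is now going up: treat this as a bounce.
                self.lr *= self.decay_factor
                self._was_decreasing = False
                self._has_decayed = True
        self._prev_norm = weight_norm

        # Fallback: if no bounce ever triggered a decay, decay once near the end.
        near_end = epoch >= self.final_decay_fraction * self.total_epochs
        if near_end and not self._has_decayed:
            self.lr *= self.decay_factor
            self._has_decayed = True
        return self.lr


# Example: the norm falls for three epochs, then rises (a bounce at epoch 3).
scheduler = BounceDecayScheduler(lr=0.1, total_epochs=10)
bounce_lrs = [scheduler.step(e, n) for e, n in enumerate([10.0, 9.0, 8.0, 8.5, 9.0])]

# Example: the norm only grows, so the single decay happens near the end.
flat = BounceDecayScheduler(lr=0.1, total_epochs=10)
no_bounce_lrs = [flat.step(e, float(e + 1)) for e in range(10)]
```

In a real training loop, `weight_norm` would be computed from the model parameters once per epoch (for example, the L2 norm of all weights) and the returned value passed to the optimizer. How the paper measures the norm and detects a bounce may differ from this sketch.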

Comments:	9 + 14 pages, 5 + 11 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2103.12682 [cs.LG]
	(or arXiv:2103.12682v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2103.12682

Submission history

From: Aitor Lewkowycz
[v1] Tue, 23 Mar 2021 17:00:23 UTC (6,216 KB)
