Learning to learn by gradient descent by gradient descent (2016)


#datascience #machinelearning

Optimization algorithms are still designed by hand. This paper casts the design of an optimization algorithm as a learning problem, so that the optimizer itself is learned from data.

Learning to learn with recurrent neural networks

By applying meta-learning, we are essentially applying a learning algorithm recursively to itself. Unrolled over time, this process forms a recurrent neural network (RNN): its computation graph is directed and contains cycles.

LSTM (Long Short-Term Memory) is a recurrent neural network architecture that uses gated memory cells to mitigate vanishing gradients over long sequences.

In this work, they directly parameterize the optimizer: the update rule is θ_{t+1} = θ_t + g_t(∇f(θ_t), φ), where g is a recurrent network with parameters φ. A good optimizer is defined as one with low expected loss over a distribution of objective functions f.
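A minimal sketch of this parameterized update rule in plain NumPy, on a toy quadratic objective. The function and variable names (`g`, `phi`, `H`) are my own; the optimizer weights here are random and untrained, since the meta-training loop (optimizing φ by gradient descent on the summed losses) is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(theta):
    return 0.5 * np.sum(theta ** 2)          # toy objective

def grad_f(theta):
    return theta                              # its gradient

H = 8                                         # hidden size of the optimizer RNN
phi = {                                       # optimizer parameters (phi)
    "W_in":  rng.normal(scale=0.1, size=(H, 1)),
    "W_h":   rng.normal(scale=0.1, size=(H, H)),
    "W_out": rng.normal(scale=0.1, size=(1, H)),
}

def g(grad, h, phi):
    """One optimizer-RNN step: maps the per-coordinate gradient and hidden
    state to a parameter update and a new hidden state."""
    h = np.tanh(phi["W_in"] @ grad.reshape(1, -1) + phi["W_h"] @ h)
    update = (phi["W_out"] @ h).ravel()
    return update, h

theta = rng.normal(size=3)                    # optimizee parameters
h = np.zeros((H, theta.size))                 # one hidden column per coordinate
losses = []
for t in range(50):
    update, h = g(grad_f(theta), h, phi)
    theta = theta + update                    # theta_{t+1} = theta_t + g_t(...)
    losses.append(f(theta))
print(losses[0], losses[-1])
```

The key structural point is that the update is produced by a network rather than a hand-designed rule like SGD's `-lr * grad`; meta-training would shape φ so that these updates minimize f quickly.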

Challenge: optimizing tens of thousands of parameters is not feasible with a single fully connected RNN, since its hidden state (and hence its weight matrices) would be enormous.
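A back-of-envelope illustration of the blow-up (my own arithmetic, not figures from the paper): the recurrent weight matrix of a fully connected RNN grows quadratically with its hidden size. The paper's fix is a small LSTM applied coordinatewise, sharing its weights across all parameters, so the optimizer's size is independent of the optimizee's.

```python
# Fully connected RNN over all optimizee parameters: hidden state on the
# order of the parameter count, so W_h alone is quadratic in that count.
n_params = 50_000                     # "tens of thousands" of parameters
hidden = n_params
full_rnn_weights = hidden * hidden + hidden * n_params   # W_h + input weights
print(f"{full_rnn_weights:,}")        # -> 5,000,000,000 weights

# Coordinatewise LSTM (the paper's approach, toy sizes): a tiny recurrent
# cell with scalar input, shared across every coordinate.
coord_hidden = 20
coordwise_weights = coord_hidden * coord_hidden + coord_hidden * 1
print(f"{coordwise_weights:,}")       # -> 420 weights
```

Weight sharing across coordinates is what makes the learned optimizer applicable to networks of any size.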