Connections between Reinforcement Learning and Representation Learning

Date: 10/6/2022

Speaker: Benjamin Eysenbach

Location: 122 Gates Hall and Zoom

Time: 2:40 p.m.-3:30 p.m.

Abstract: In reinforcement learning (RL), it is easier to solve a task if given a good representation. Deep RL promises to simultaneously solve an RL problem and a representation learning problem; it promises simpler methods with fewer objective functions and fewer hyperparameters. However, prior work often finds that these end-to-end approaches tend to be unstable, and instead addresses the representation learning problem with additional machinery (e.g., auxiliary losses, data augmentation). How can we design RL algorithms that directly acquire good representations?

In this talk, I’ll share how we approached this problem in an unusual way: rather than using RL to solve a representation learning problem, we showed how (contrastive) representation learning can be used to solve some RL problems. The key idea will be to treat the value function as a classifier, which distinguishes between good and bad outcomes, similar to how contrastive learning distinguishes between positive and negative examples. By carefully choosing the inputs to a (contrastive) representation learning algorithm, we learn representations that (provably) encode a value function. We use this idea to design a new RL algorithm that is much simpler than prior work while achieving equal or better performance on simulated benchmarks. On the theoretical side, this work uncovers connections between contrastive learning, hindsight relabeling, successor features, and reward learning.

Bio: Benjamin Eysenbach is a fifth-year PhD student at Carnegie Mellon University, advised by Ruslan Salakhutdinov and Sergey Levine. His research focuses on algorithms for decision-making (reinforcement learning). Much of his research is about revealing connections between seemingly disparate algorithms and ideas, leading to new algorithms that are typically simpler, carry stronger theoretical guarantees, and work better in practice. Ben is the recipient of the NSF and Hertz graduate fellowships. Prior to his PhD, he was a resident at Google Research and received his B.S. in math from MIT.