Updates on Policy Gradients

This is the eighth part of “An Outsider’s Tour of Reinforcement Learning.” Part 9 is here. Part 7 is here. Part 1 is here. I’ve been swamped with a bit of a travel binge and... Continue

A Model, You Know What I Mean?

This is the seventh part of “An Outsider’s Tour of Reinforcement Learning.” Part 8 is here. Part 6 is here. Part 1 is here. The role of models in reinforcement learning remains hotly debated. Model-free... Continue

The Policy of Truth

This is the sixth part of “An Outsider’s Tour of Reinforcement Learning.” Part 7 is here. Part 5 is here. Part 1 is here. Our first generic candidate for solving reinforcement learning is Policy Gradient.... Continue

A Game of Chance to You to Him Is One of Real Skill

This is the fifth part of “An Outsider’s Tour of Reinforcement Learning.” Part 6 is here. Part 4 is here. Part 1 is here. The first two parts of this series highlighted two parallel aspirations... Continue

The Linear Quadratic Regulator

This is the fourth part of “An Outsider’s Tour of Reinforcement Learning.” Part 5 is here. Part 3 is here. Part 1 is here. What would be a dead simple baseline for understanding optimal control... Continue

The Linearization Principle

This is the third part of “An Outsider’s Tour of Reinforcement Learning.” Part 4 is here. Part 2 is here. Part 1 is here. I have an ethos for tackling problems in machine learning that... Continue

Total Control

This is the second part of “An Outsider’s Tour of Reinforcement Learning.” Part 3 is here. Part 1 is here. In addition to the reasons I’ve discussed so far, I’ve been fascinated with the resurgence... Continue

Make It Happen

This is the first part of “An Outsider’s Tour of Reinforcement Learning.” Part 2 is here. If you read Hacker News, you’d think that deep reinforcement learning can be used to solve any problem. Deep... Continue

Lessons from Optics, The Other Deep Learning

Would you say deep learning is mature enough to be taught in high schools? Here’s why I ask. Some time ago, I received an email from a product manager at a very large company. I... Continue

Directions of Ascent

Last November was a dramatic wake-up call to many of us in information technology, and I’ve spent a large part of the last year learning about how I and others in similar positions can help... Continue