OpenAI Releases its Hindsight Experience Replay Machine Learning Algorithm

  • 1 March 2018
  • Sam Mire

Pretty soon, robots may be able to use the phrase, “In hindsight…” without a hint of comedic effect or irony. That’s because San Francisco-based startup OpenAI has released code that incorporates an invaluable resource, hindsight, into the development of advanced machine learning algorithms.

OpenAI is well known in the tech industry, with heavyweights such as Elon Musk having sat on its board in the past, despite his publicly stated concerns about the threats AI could pose. But even without Musk’s input into the company’s mission, OpenAI has made great strides in machine learning.

The company’s Hindsight Experience Replay algorithm, or HER, builds on what is known as reinforcement learning. It allows robots to take a trial-and-error approach with a programmed goal in mind, receiving a reward when they take an action that moves them closer to that goal. This, not coincidentally, is the way in which humans are taught to do an array of tasks – from potty-training to riding a bike – from the earliest of ages.

Where HER takes machine learning to the next level is in giving robots another very human cognitive process. Instead of learning only from their successful attempts, bots using HER can carry the lessons from their failures into future tasks as well. An attempt that misses its intended goal is replayed as though the outcome the robot actually reached had been the goal all along, so even a failure becomes a usable example, increasing the data from which the robot can draw in the future.
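One way to picture learning from a failed attempt is the short Python sketch below. It is an illustration under stated assumptions, not OpenAI's released code: the names `Transition`, `cookie` and `relabel_with_hindsight` are hypothetical, the "robot" is just a point on a plane, actions are left out for brevity, and the relabeling swaps in the final position reached, which is only one of several possible choices.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Transition(NamedTuple):
    """One step of experience, stored with the goal it was aiming for."""
    state: Point
    next_state: Point
    goal: Point
    reward: float


def cookie(achieved: Point, goal: Point, tolerance: float = 0.05) -> float:
    """Hypothetical reward: 1.0 if the goal was reached, otherwise 0.0."""
    return 1.0 if math.dist(achieved, goal) <= tolerance else 0.0


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending the final position was the goal all along."""
    hindsight_goal = episode[-1].next_state
    return [
        step._replace(
            goal=hindsight_goal,
            reward=cookie(step.next_state, hindsight_goal),
        )
        for step in episode
    ]


# A failed attempt: the robot wanted to reach (2, 0) but stopped at (1, 0).
intended_goal = (2.0, 0.0)
failed_episode = [
    Transition((0.0, 0.0), (0.5, 0.0), intended_goal,
               cookie((0.5, 0.0), intended_goal)),
    Transition((0.5, 0.0), (1.0, 0.0), intended_goal,
               cookie((1.0, 0.0), intended_goal)),
]

# Keep both versions: the original failure and its hindsight copy.
replay_buffer = failed_episode + relabel_with_hindsight(failed_episode)
```

In the original episode every step earns nothing, while the hindsight copy ends with a rewarded step, so the learner has at least one success to study even though the intended goal was never reached.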

You may be wondering how one ‘rewards’ a robot for a successful action. Under a scheme called “sparse rewards”, the OpenAI researchers feed the algorithm numbers that have been likened to cookies for the robot. The binary system – one ‘cookie’ for success, none for failure – is believed to be superior, for this application, to dense rewards, in which the size of the cookie varies with how close an attempt comes to the goal.
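The difference between the two kinds of cookie can be sketched in a few lines of Python. This is a hedged illustration only: the function names, the distance-based dense reward and the `tolerance` value are assumptions made for the example, and the one-or-nothing values simply follow the cookie description above; they are not taken from OpenAI's code.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def sparse_reward(achieved: Point, goal: Point, tolerance: float = 0.05) -> float:
    """Binary reward: one 'cookie' if the goal is reached, none otherwise."""
    return 1.0 if math.dist(achieved, goal) <= tolerance else 0.0


def dense_reward(achieved: Point, goal: Point) -> float:
    """Graded reward (assumed form): less negative the closer the attempt lands."""
    return -math.dist(achieved, goal)


goal = (0.0, 0.0)
near_miss = (0.1, 0.0)   # close, but outside the tolerance
far_miss = (3.0, 4.0)    # nowhere near the goal

# The sparse reward treats both misses the same; the dense one ranks them.
sparse_values = (sparse_reward(near_miss, goal), sparse_reward(far_miss, goal))
dense_values = (dense_reward(near_miss, goal), dense_reward(far_miss, goal))
```

A sparse reward is simple to specify because it only asks whether the task was done, but on its own it gives the learner little feedback after a miss, and that is the gap hindsight is meant to fill.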

This video further explains OpenAI’s algorithm, which represents another significant step toward realizing the potential of machine learning.

