OpenAI Gym Gives Reinforcement Learning A Work Out
Written by Mike James   
Friday, 29 April 2016

When OpenAI, an open source AI initiative backed by Elon Musk, Sam Altman and Ilya Sutskever, was announced at the end of last year, I doubt anyone expected it to produce anything so quickly, and certainly not something connected with reinforcement learning. OpenAI Gym is what it sounds like - an exercise facility for reinforcement learning. 


Since the success of DeepMind's deep Q-learning at playing games, and of AlphaGo at Go in particular, reinforcement learning (RL) has gone from an academic backwater to front-line AI.

The big problem is that reinforcement learning is a difficult technique to characterise. Put simply, an RL system learns not by being told how close it is to the desired result, but by receiving rewards based on its behaviour. Of course, this is largely how we learn, and if it can be made to work efficiently it promises us not just effective AI but new knowledge. For example, AlphaGo taught itself to play Go and in the process discovered approaches to the game that humans had ignored. 
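To make the idea of learning from rewards concrete, here is a minimal sketch of a two-armed bandit learner written in Python. It is not taken from OpenAI Gym; the function name `run_bandit`, the payout probabilities and the epsilon-greedy strategy are all assumptions chosen for illustration. The learner is never told which arm is correct. It only sees a reward of 1 or 0 after each pull and has to work out the better arm from that.

```python
import random


def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy learner (hypothetical, not part of Gym)."""
    rng = random.Random(seed)
    counts = [0] * len(payout_probs)    # number of pulls per arm
    values = [0.0] * len(payout_probs)  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm at random
            arm = rng.randrange(len(payout_probs))
        else:
            # Exploit: pick the arm with the best estimate so far
            arm = max(range(len(values)), key=values.__getitem__)
        # The learner never sees payout_probs, only the reward that comes back
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the mean reward for the chosen arm
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.8])
    print("estimated values:", values)
    print("pulls per arm:", counts)
```

After enough pulls the estimate for the better arm should settle near its true payout rate and most pulls should go to that arm, even though no one ever labelled it as the right answer.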

OpenAI claims that two things are holding RL back: 

  • The need for better benchmarks. In supervised learning, progress has been driven by large labeled datasets like ImageNet. In RL, the closest equivalent would be a large and diverse collection of environments. However, the existing open-source collections of RL environments don't have enough variety, and they are often difficult to even set up and use.

  • Lack of standardization of environments used in publications. Subtle differences in the problem definition, such as the reward function or the set of actions, can drastically alter a task's difficulty. This issue makes it difficult to reproduce published research and compare results from different papers. 

The motivation behind OpenAI Gym is to provide a set of environments that different RL programs can be tested in. These are: 

  • Classic control and toy text: complete small-scale tasks, mostly from the RL literature, such as pole balancing. 

  • Algorithmic: perform computations such as adding multi-digit numbers and reversing sequences. 

  • Atari: play classic Atari games. 

  • Board games: play Go on 9x9 and 19x19 boards.  In this release there is a fixed opponent based on a good algorithmic method. 

  • 2D and 3D robots: control a robot simulation using accurate physics.


At the moment you can connect your RL system to the gym using Python. Of course, it is up to you to map the RL system onto the environment, as the documentation says:

"We provide the environment; you provide the algorithm.

You can write your agent using your existing numerical computation library, such as TensorFlow or Theano."
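The split between environment and algorithm can be sketched in a few lines of Python. The example below does not use the Gym library itself. `CorridorEnv` and `run_episode` are hypothetical names, and the toy environment is hand-written so that the code runs on its own. What it borrows is the convention, as I understand it from the Gym documentation, that an environment exposes `reset()` to start an episode and `step(action)` to return an observation, a reward, a done flag and an info dictionary.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the first observation
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, anything else moves left; walls clamp the move
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        at_goal = self.position == self.length - 1
        reward = 1.0 if at_goal else 0.0
        done = at_goal or self.steps >= self.max_steps
        return self.position, reward, done, {}


def run_episode(env, policy):
    """Run one episode; the policy is the part the user supplies."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_episode(CorridorEnv(), lambda obs: rng.randrange(2)))
```

The loop in `run_episode` knows nothing about corridors, so in principle any environment that follows the same convention could be dropped in, and any policy, from a random one to a neural network built with TensorFlow or Theano, could be passed as the second argument.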

The idea is to collect and curate a set of results that indicate how well the different approaches generalize.  

It is good to see that an open source initiative is doing something other than simply reproducing what is being done in the closed software world. It would be very easy for OpenAI to simply build its own TensorFlow or an alternative, but OpenAI Gym is novel and needed. 


More Information

OpenAI Gym

https://github.com/openai/gym

Related Articles

AI Goes Open Source To The Tune Of $1 Billion

GNU Gneural Network - Do We Need Another Open Source DNN?

Google's DeepMind Demis Hassabis Gives The Strachey Lecture

AlphaGo Beats Lee Sedol Final Score 4-1

Why AlphaGo Changes Everything

Google's DeepMind Learns To Play Arcade Games

Microsoft Wins ImageNet Using Extremely Deep Neural Networks

The Flaw In Every Neural Network Just Got A Little Worse

 


 

Last Updated ( Wednesday, 12 July 2023 )