AI Wins At Rubik's Cube With Just One Hand
Written by Mike James   
Saturday, 19 October 2019

A better headline would have been "with one hand tied behind its back", but this robot has no other hand and no back to speak of. It does, however, have two neural networks and can solve the cube using just one hand. What is the importance of this work?

We keep seeing examples of neural networks solving difficult problems, but mostly in simulations or game playing. It must have occurred to you that a really interesting application would be to put a neural network into a robot and see how well it walks or runs or whatever. This is what has now been tried at OpenAI. In this case the real-world component is a human-like robotic hand. Just the one.

The architecture of the system includes three vision networks to determine the position of the cube and a recurrent network to control the hand.

Of course, training a network on a real hand would take far too long given the tens of thousands of repetitions it needs, so the training was done in simulation.

The neural network controlling the hand was trained using reinforcement learning, but in a simulated environment. The problem with simulated environments is that they generally lack the variation encountered in the real world. The solution, proposed by the team at OpenAI, is ADR - Automatic Domain Randomization. Instead of just varying the problem slightly, the parameters of the simulation itself are changed, i.e. not just the scrambling of the Rubik's cube, but the dynamics. At first the environment is fixed and the robot learns to manipulate the cube. After this initial training the randomization starts - the size of the cube varies slightly, the dynamics of the hand are changed and so on. This forces the system to learn a robust, and hopefully generalizable, solution.
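To make the idea concrete, here is a minimal Python sketch of the kind of curriculum described above. It is not OpenAI's implementation: the names `ParamRange`, `adr_update` and `sample_environment`, the parameter names, the step sizes and the success thresholds are all assumptions made for illustration. It also simplifies by moving every range together on a single success rate. Each simulator parameter starts as a fixed value, its range widens when the policy is doing well and narrows again when it struggles.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Randomization range for one simulator parameter (hypothetical helper)."""
    nominal: float  # value used in the initial, fixed environment
    low: float
    high: float
    step: float     # how far each boundary moves per update (assumed value)

    @classmethod
    def fixed(cls, nominal, step):
        # Start with no randomization: low == high == nominal.
        return cls(nominal, nominal, nominal, step)

    def sample(self, rng):
        return rng.uniform(self.low, self.high)

    def widen(self):
        self.low -= self.step
        self.high += self.step

    def narrow(self):
        # Shrink back towards the nominal value, but never past it.
        self.low = min(self.low + self.step, self.nominal)
        self.high = max(self.high - self.step, self.nominal)


def adr_update(ranges, success_rate, widen_above=0.8, narrow_below=0.4):
    """Widen all ranges if the policy succeeds often, narrow them if it fails often.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    for r in ranges.values():
        if success_rate >= widen_above:
            r.widen()
        elif success_rate <= narrow_below:
            r.narrow()


def sample_environment(ranges, rng):
    """Draw one randomized set of simulator parameters for a training episode."""
    return {name: r.sample(rng) for name, r in ranges.items()}


# Illustrative parameters only; a real simulator would expose many more.
ranges = {
    "cube_size_cm": ParamRange.fixed(5.7, step=0.1),
    "finger_friction": ParamRange.fixed(1.0, step=0.05),
}
rng = random.Random(0)
first_env = sample_environment(ranges, rng)  # identical to the nominal environment
adr_update(ranges, success_rate=0.9)         # policy is doing well, so ranges widen
later_env = sample_environment(ranges, rng)  # now drawn from the wider ranges
```

In a real training loop, `success_rate` would come from evaluating the current policy in the simulator, and the sampled dictionary would be used to configure the next simulated episode.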

To see if the learning is applicable to the real world, the problem was transferred to real hardware and that hardware was tested in a range of non-ideal situations. The robot hand equivalent of kicking the robot dog includes putting a rubber glove on, tying two fingers together, poking the cube with a stuffed giraffe and so on. I didn't make the giraffe part up.

There is a slick video explaining the ideas, but I think this uncut video showing the hand in action is more impressive. Remember, as you watch it, that none of this behavior was programmed:

 

Now watch the slick presentation video; it too is interesting:

 

It's not perfect: the robot solves the cube only 60% of the time, and only 20% of the time for a maximally difficult scramble, so we don't have to worry that it is going to beat humans at the moment. What is more important is that neural networks and real robots are working together, and reinforcement learning with simulations seems to be the way to make it happen.

As the research paper concludes:

"In this work, we introduce automatic domain randomization (ADR), a powerful algorithm for sim2real transfer. We show that ADR leads to improvements over previously established baselines that use manual domain randomization for both vision and control. We further demonstrate that ADR, when combined with our custom robot platform, allows us to successfully solve a manipulation problem of unprecedented complexity: solve a Rubik’s cube using a real humanoid robot, the Shadow Dexterous Hand. By systematically studying the behavior of our learned policies, we find clear signs of emergent meta-learning. Policies trained with ADR are able to adapt at deployment time to the physical reality, which it has never seen during training, via updates to their recurrent state."

More Information

Solving Rubik's Cube with a Robot Hand

https://openai.com/blog/solving-rubiks-cube/

Related Articles

AI Learns To Solve Rubik's Cube - Fast!

A Robot Learns To Do Things Using A Deep Neural Network

Rubik's Cube Is Hard - NP Hard

Rubik's Cube Breakthrough

Rubik's cube - the order of God's Number

17x17x17 Rubik Cube Solved In 7.5 Hours

Last Updated ( Wednesday, 21 July 2021 )