AI Wins At Rubik's Cube With Just One Hand

Written by Mike James

Saturday, 19 October 2019

A better headline would have been "with one hand tied behind its back", but this robot doesn't have another hand, or a back to speak of. It does, however, have two neural networks and can solve the cube using just one hand. What is the importance of this work?

We keep seeing examples of neural networks solving difficult problems, but mostly in simulations or game playing. It must have occurred to you that a really interesting application would be to put a neural network into a robot and see how well it walks or runs or whatever. This is what has now been tried at OpenAI. In this case the real-world component is a human-like robotic hand. Just the one.

The architecture of the system includes three vision networks to determine the position of the cube and a recurrent network to control the hand:

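[Diagram: vision networks estimate the position of the cube and feed the recurrent network that controls the hand]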

Of course, training a network on a real hand would take far too long for the tens of thousands of repetitions it needs, so the training was done in simulation.

The neural network controlling the hand was trained using reinforcement learning, but in a simulated environment. The problem with simulated environments is that they generally lack the variation encountered in the real world. The solution proposed by the team at OpenAI is ADR - Automatic Domain Randomization. Instead of just varying the problem slightly, the parameters of the simulation itself are changed, i.e. not just the scrambling of the Rubik's cube, but also the dynamics. At first the environment is fixed and the robot learns to manipulate the cube. After this initial training the randomization starts - the size of the cube varies slightly, the dynamics of the hand are changed and so on. This forces the system to learn a robust, and hopefully generalizable, solution.
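To make the idea concrete, here is a minimal Python sketch of a randomization schedule of this kind. It is not OpenAI's code. The names `ParamRange`, `adr_update`, `sample_environment` and `make_ranges`, the two example parameters, their nominal values and the success thresholds are all assumptions made for illustration, and the rule of widening every range at once is a simplification. Each parameter starts as a fixed value, and its sampling range grows only while the policy keeps succeeding:

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """One randomized simulation parameter (hypothetical example values)."""
    nominal: float  # fixed value used before any randomization
    low: float      # current lower bound of the sampling range
    high: float     # current upper bound of the sampling range
    step: float     # how far a bound moves per update (assumed)

    def sample(self, rng: random.Random) -> float:
        # While low == high this always returns the nominal value.
        return rng.uniform(self.low, self.high)

    def widen(self) -> None:
        self.low -= self.step
        self.high += self.step

    def narrow(self) -> None:
        # Never shrink past the nominal value.
        self.low = min(self.low + self.step, self.nominal)
        self.high = max(self.high - self.step, self.nominal)


def make_ranges() -> dict:
    """Start with a fixed environment: every range has zero width."""
    return {
        "cube_size_cm": ParamRange(nominal=5.7, low=5.7, high=5.7, step=0.1),
        "finger_friction": ParamRange(nominal=1.0, low=1.0, high=1.0, step=0.05),
    }


def sample_environment(ranges: dict, rng: random.Random) -> dict:
    """Draw one set of simulation parameters for a training episode."""
    return {name: r.sample(rng) for name, r in ranges.items()}


def adr_update(ranges: dict, success_rate: float,
               widen_at: float = 0.8, narrow_at: float = 0.4) -> None:
    """Widen all ranges when the policy does well, narrow them when it struggles.

    The thresholds are assumed values, not taken from the paper.
    """
    for r in ranges.values():
        if success_rate >= widen_at:
            r.widen()
        elif success_rate <= narrow_at:
            r.narrow()
```

In a training loop you would call `sample_environment` before each simulated episode, measure how often the policy succeeds over a batch of episodes, and pass that rate to `adr_update`. The effect is a curriculum: the simulated world only becomes more varied once the policy can cope with the variation it already faces.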

To see if the learning is applicable to the real world, the problem was transferred to real hardware and that hardware was tested in a range of non-ideal situations. The robot hand equivalent of kicking the robot dog includes putting a rubber glove on, tying two fingers together, poking the cube with a stuffed giraffe and so on. I didn't make the giraffe part up.

There is a slick video explaining the ideas, but I think this uncut video showing the hand in action is more impressive. Remember, as you watch it, that none of this behavior was programmed:

Now watch the slick presentation video; it too is interesting:

It's not perfect: the robot solves the cube only 60% of the time, and only 20% of the time for a maximally difficult scramble, so we don't have to worry that it is going to beat humans at the moment. What is more important is that neural networks and real robots are working together, and reinforcement learning with simulations seems to be the way to make it happen.

As the research paper concludes:

"In this work, we introduce automatic domain randomization (ADR), a powerful algorithm for sim2real transfer. We show that ADR leads to improvements over previously established baselines that use manual domain randomization for both vision and control. We further demonstrate that ADR, when combined with our custom robot platform, allows us to successfully solve a manipulation problem of unprecedented complexity: solve a Rubik’s cube using a real humanoid robot, the Shadow Dexterous Hand. By systematically studying the behavior of our learned policies, we find clear signs of emergent meta-learning. Policies trained with ADR are able to adapt at deployment time to the physical reality, which it has never seen during training, via updates to their recurrent state. "

More Information

Solving Rubik's Cube with a Robot Hand

https://openai.com/blog/solving-rubiks-cube/

AI Learns To Solve Rubik's Cube - Fast!

A Robot Learns To Do Things Using A Deep Neural Network

Rubik's Cube Is Hard - NP Hard

Rubik's Cube Breakthrough

Rubik's cube - the order of God's Number

17x17x17 Rubik Cube Solved In 7.5 Hours

Last Updated ( Wednesday, 21 July 2021 )
