Microsoft Research Improves AI In Gaming
Written by Kay Ewbank   
Wednesday, 12 August 2020

Microsoft Research has announced several improvements to the use of reinforcement learning in gaming. The improvements include the development of game agents that learn how to collaborate in teams with human players.

The first of the announcements from Microsoft Research is Project Paidia, a collaboration between Microsoft Research Cambridge and Ninja Theory. The hope is that Paidia will enable the creation of agents that learn to collaborate with human players.


The researchers say that traditional programming techniques require the developer to anticipate every possible game situation and what the in-game characters should do in those situations. If you use reinforcement learning, this isn't necessary. Instead, developers control a reward signal which the game character then learns to optimize while responding to what's happening in the game. The result is

"nuanced situation and player-aware emergent behavior that would be challenging or prohibitive to achieve using traditional Game AI"
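To make the idea of a developer-controlled reward signal concrete, here is a minimal sketch in Python. It is not Project Paidia's method: it uses plain tabular Q-learning on a hypothetical five-cell corridor, and the `reward` function, the step penalty, and all hyperparameters are assumptions chosen for illustration. The only thing the "developer" specifies is the reward; the behavior (walk right to the goal) is learned.

```python
import random

N_STATES = 5            # hypothetical corridor with cells 0..4
GOAL = N_STATES - 1     # the agent should learn to reach the last cell
ACTIONS = (-1, +1)      # move left, move right


def reward(next_state):
    """The developer-controlled signal: +1 at the goal, small cost per step."""
    return 1.0 if next_state == GOAL else -0.01


def step(state, action):
    """Deterministic toy environment; walls clamp the position."""
    next_state = min(max(state + action, 0), GOAL)
    return next_state, reward(next_state), next_state == GOAL


def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely random exploration policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            a = rng.randrange(2)
            next_state, r, done = step(state, ACTIONS[a])
            target = r if done else r + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action in every non-goal cell, read off the learned table.
policy = [ACTIONS[max((0, 1), key=lambda a: q_table[s][a])] for s in range(GOAL)]
print(policy)
```

Nothing in the code says "go right"; changing `reward` (for example, rewarding cell 0 instead) changes what the agent learns without touching the rest of the program.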


Project Paidia's aim is specifically that of collaboration with human players. The researchers say that human players are notoriously creative and hard to predict, which makes genuine collaboration towards shared goals very hard. The team are using Ninja Theory’s latest game, Bleeding Edge, to test their work as it is team-based and includes a range of characters that have to work together to score points and defeat their opponents. In their latest demo, the team showcases how reinforcement learning enables agents to learn to coordinate their actions.

The first area the project is looking into is how to make reinforcement learning efficient and reliable for game developers (for example, by combining it with uncertainty estimation and imitation). They are also considering how to construct deep learning architectures that give agents the right abilities, such as long-term memory, and how to enable agents that can rapidly adapt to new game situations.

The uncertainty estimation uses a version of Random Network Distillation (RND) to estimate the confidence of the deep learning model. The version of RND used maintains an uncertainty model separate from the model making predictions, with two types of neural networks: a predictor and a prior. Roughly speaking, the gap between prior and predictor is a good indication of how certain the model should be about its outputs.
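The prior/predictor idea can be sketched in a few lines of Python with NumPy. This is a simplified illustration under stated assumptions, not the researchers' implementation: both networks are tiny one-input networks with random `tanh` hidden layers, the predictor is "trained" by fitting only its output weights with ridge regression, and names such as `prior`, `predictor`, and `uncertainty` are hypothetical. The prior stays fixed; the predictor is fitted to imitate it on the training inputs only, so the squared gap between them tends to be small where data was seen and larger elsewhere.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 50  # hidden units per network (arbitrary choice for this sketch)


def random_features(x, weights, biases):
    """Hidden layer of a tiny one-input network: tanh(x * w + b)."""
    return np.tanh(np.outer(x, weights) + biases)


# Prior: a randomly initialised network that is never trained.
prior_w, prior_b = rng.normal(size=HIDDEN), rng.normal(size=HIDDEN)
prior_out = rng.normal(size=HIDDEN) / np.sqrt(HIDDEN)


def prior(x):
    return random_features(x, prior_w, prior_b) @ prior_out


# Predictor: its own random hidden layer; only the output weights are
# fitted (ridge regression) so that it matches the prior on training inputs.
pred_w, pred_b = rng.normal(size=HIDDEN), rng.normal(size=HIDDEN)
x_train = rng.uniform(-1.0, 1.0, size=200)   # inputs the model has "seen"
phi = random_features(x_train, pred_w, pred_b)
ridge = 1e-6
pred_out = np.linalg.solve(phi.T @ phi + ridge * np.eye(HIDDEN),
                           phi.T @ prior(x_train))


def predictor(x):
    return random_features(x, pred_w, pred_b) @ pred_out


def uncertainty(x):
    """Squared gap between predictor and prior, used as an uncertainty score."""
    return (predictor(x) - prior(x)) ** 2


near = np.linspace(-1.0, 1.0, 50)   # inside the training range
far = np.linspace(4.0, 6.0, 50)     # well outside the training range
print("mean gap near training data:", uncertainty(near).mean())
print("mean gap far from training data:", uncertainty(far).mean())
```

Keeping the uncertainty model separate in this way means the score can be computed without touching the model that makes the actual predictions.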

The second piece of research aims to help agents recall items, locations, and other players that are currently out of sight but have been seen earlier in the game. The third area of research is designed to make agents better at learning and adapting to new challenges, such as exploring unknown environments, by using Bayes-Adaptive Deep RL and meta-learning. The research papers describing the work are all available for download.


More Information

Project Paidia Game Demo

Project Paidia Website

Conservative Uncertainty Estimation By Fitting Prior Networks

AMRL: Aggregated Memory For Reinforcement Learning

VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

Related Articles

Training A Cellular Automaton

Microsoft Releases DeepSpeed For PyTorch

DeepFaceDrawing Using Machine Learning

Go Master Retires Citing AI Supremacy 


Last Updated ( Wednesday, 12 August 2020 )