Google AI Introduces A Machine Learning Based System For Game Developers To Quickly And Efficiently Train Game-Testing Agents


Google AI recently announced a machine learning-based framework that game developers can use to train game-testing agents quickly and efficiently, freeing human testers to focus on more complicated problems. The system requires no machine learning (ML) expertise, works with a wide range of popular game genres, and can train an ML policy, which generates game actions from the game state, on a single game instance in less than an hour. Google AI has also released an open-source library that shows how these techniques may be used in practice.

Simply playing the game is the most basic form of video game testing. Many serious bugs (such as crashes or falling out of the world) are easy to detect and fix. The major challenge is finding them within the vast state space of a modern game. The developers therefore focused on training a system that could “just play the game” at scale.

An effective way to do this is to let developers train a group of game-testing agents rather than a single, super-effective agent that plays the entire game from end to end. Each agent completes a task lasting a few minutes, referred to as a “gameplay loop.”

Bridging the gap between the simulation-centric world of video games and the data-centric world of ML is one of the most fundamental obstacles in applying machine learning to game development. Instead of asking developers to translate the game state directly into custom, low-level ML features (which would be too time-consuming) or attempting to learn from raw pixels (which would require too much training data), the proposed solution gives developers an efficient, game-developer-friendly API that lets them describe their game in terms of the core state that a player perceives and the semantic actions that the player can take. All of this information is expressed through concepts that game developers are familiar with, such as entities, raycasts, 3D locations and rotations, buttons, and joysticks.

This high-level semantic API is simple to use and lets the system adapt to the specific game being developed. The combination of API building blocks that a developer chooses influences the network architecture, because it indicates the type of gaming scenario in which the system is deployed. For example, action outputs are handled differently depending on whether they represent a digital button or an analog joystick, and image processing techniques are used to handle observations that result from an agent probing its environment with raycasts (much like how autonomous vehicles analyze their environment with LIDAR).
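As a rough illustration of how declared building blocks could steer architecture choices, the sketch below maps each observation and action type to a processing strategy. Everything in it is hypothetical: the class names, the `plan_network` function, and the specific choices (sigmoid for buttons, tanh for joysticks, 1D convolution for raycasts) are assumptions made for the example and are not taken from Google's library.

```python
from dataclasses import dataclass

# Hypothetical building blocks; these are NOT the released library's types.
@dataclass
class Button:
    name: str            # digital input: pressed or not pressed

@dataclass
class Joystick:
    name: str
    axes: int = 2        # analog input: each axis in [-1, 1]

@dataclass
class Raycasts:
    name: str
    num_rays: int        # number of distance probes around the agent

@dataclass
class Entity:
    name: str            # contributes a 3D position and rotation

def plan_network(observations, actions):
    """Pick an (assumed) processing strategy for each declared block."""
    plan = {}
    for obs in observations:
        if isinstance(obs, Raycasts):
            # Ray distances form a 1D "image", so a convolution is one option.
            plan[obs.name] = f"1D convolution over {obs.num_rays} ray distances"
        elif isinstance(obs, Entity):
            plan[obs.name] = "egocentric direction-and-distance encoding"
    for act in actions:
        if isinstance(act, Button):
            plan[act.name] = "sigmoid output (press probability)"
        elif isinstance(act, Joystick):
            plan[act.name] = f"tanh output with {act.axes} axes in [-1, 1]"
    return plan

if __name__ == "__main__":
    plan = plan_network(
        observations=[Raycasts("surroundings", num_rays=16), Entity("enemy")],
        actions=[Button("jump"), Joystick("move")],
    )
    for name, choice in plan.items():
        print(f"{name}: {choice}")
```

The point of the sketch is only that the developer describes the game in familiar terms and the system, not the developer, derives the ML-specific details from that description.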

This API is general enough to model many popular control schemes (the configuration of action outputs that control movement) in games, including first-person shooters, third-person shooters with camera-relative controls, racing games, and twin-stick shooters. Because 3D movement and targeting are a fundamental part of gameplay in most of these games, the networks are designed to tend naturally toward simple behaviors such as aiming, approach, or avoidance. The system achieves this by analyzing the game’s control scheme and creating neural network layers that perform specialized processing of the game’s observations and actions. For example, the locations and rotations of objects in the game world are automatically translated into directions and distances from the perspective of the AI-controlled game entity. This transformation usually speeds up learning and helps the learned network generalize.
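The translation from world-space positions into directions and distances can be sketched in a few lines. This is a simplified 2D version that considers only the ground plane and the agent's yaw; the function name `to_egocentric` and the angle conventions are assumptions for illustration, not the library's implementation.

```python
import math

def to_egocentric(agent_pos, agent_yaw, target_pos):
    """Return (distance, bearing) of a target as seen from the agent.

    agent_pos, target_pos: (x, z) coordinates on the ground plane.
    agent_yaw: heading in radians; 0 faces +x, counterclockwise is positive.
    bearing: angle to the target relative to the heading, in [-pi, pi].
    """
    dx = target_pos[0] - agent_pos[0]
    dz = target_pos[1] - agent_pos[1]
    distance = math.hypot(dx, dz)
    world_angle = math.atan2(dz, dx)
    # Subtract the heading, then wrap the result back into [-pi, pi].
    relative = world_angle - agent_yaw
    bearing = math.atan2(math.sin(relative), math.cos(relative))
    return distance, bearing

if __name__ == "__main__":
    # The same relative layout at two different world locations and headings.
    print(to_egocentric((0.0, 0.0), 0.0, (4.0, 0.0)))
    print(to_egocentric((1.0, 1.0), math.pi / 2, (1.0, 5.0)))
```

Both calls describe a target four units straight ahead, so both produce the same features. That invariance is one plausible reason such a representation helps a network generalize: the policy does not need to relearn "target ahead" for every place on the map.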

After a neural network architecture has been generated, the network needs to be trained to play the game using a suitable learning algorithm.

For this use case, imitation learning (IL), which trains ML policies by observing experts play the game, works well. Unlike reinforcement learning (RL), where the agent must discover a good policy on its own, IL only needs to replicate the behavior of a human expert. Because game developers and testers are experts in their own games, they can quickly demonstrate how to play. The system uses an IL approach inspired by the DAgger algorithm, which lets it take advantage of video games’ most compelling quality: interactivity.
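The core idea of a DAgger-style loop is that the learner, not the expert, drives the game, while the expert labels the states the learner actually reaches. The toy sketch below shows that loop on a one-dimensional track. It is a minimal illustration under stated assumptions: the environment, the scripted `expert_action` (standing in for a human demonstrator), and the tabular policy are all invented for the example and do not reflect the released library.

```python
from collections import Counter, defaultdict

GOAL = 5  # hypothetical target cell on a 1D track with cells 0..10

def expert_action(state):
    """Scripted stand-in for the human expert: step toward GOAL."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0

def train(dataset):
    """Fit a trivial tabular policy: the most common expert label per state."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}

def rollout(policy, start, steps=12):
    """Let the learner play; states it has never seen default to 'stay' (0)."""
    state, visited = start, []
    for _ in range(steps):
        visited.append(state)
        state = max(0, min(10, state + policy.get(state, 0)))
    return visited

def dagger(starts, iterations=6):
    dataset, policy = [], {}
    for _ in range(iterations):
        for start in starts:
            # The learner drives; the expert labels every state it reaches.
            for state in rollout(policy, start):
                dataset.append((state, expert_action(state)))
        # Retrain on the aggregated dataset from all iterations so far.
        policy = train(dataset)
    return policy

if __name__ == "__main__":
    policy = dagger(starts=[0, 10])
    print(rollout(policy, 0))
    print(rollout(policy, 10))
```

In each iteration the learner gets one cell further before it reaches a state it has no label for, and the expert's correction at that state extends the policy. In an interactive setting the same pattern would correspond to a tester taking over the controls when the agent gets stuck, then handing control back.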

The proposed system blends a high-level semantic API with a DAgger-inspired interactive training flow to create meaningful machine learning policies for video game testing across genres. As a working example of this system, Google AI has released an open-source library. No prior knowledge of machine learning is necessary, and training agents for test applications can easily be completed in less than an hour on a single developer machine. Hopefully, this research will spur the development of machine learning approaches that may be used in real-world game production in simple, effective, and enjoyable ways.


