US Army Researchers Develop A New Framework For Collaborative Multi-Agent Reinforcement Learning Systems

Centralized learning for multi-agent systems depends heavily on information-sharing mechanisms. However, the research community has not studied these mechanisms extensively.

Army researchers have proposed a framework that provides a baseline for developing collaborative multi-agent systems. The team comprised Dr. Piyush K. Sharma, along with Drs. Erin Zaroukian, Rolando Fernandez, Derrik Asher, and Michael Dorothy from the DEVCOM Army Research Laboratory, and Anjon Basak, a postdoctoral fellow with the Oak Ridge Associated Universities fellowship program.

The team’s survey of reinforcement learning (RL) algorithms and their information-sharing paradigms provides a basis for investigating how centralized learning can improve multi-agent systems’ ability to work together.

Studies show that training multiple agents together is quite challenging. Complex, dynamic environments suffer from the curse of dimensionality: the joint state and action space grows exponentially with the number of agents, so adding agents during training complicates coordination. Moreover, the parameters that govern information sharing are often poorly characterized and difficult to reason about.
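The scaling problem above can be sketched with a quick calculation: if each of n agents chooses from k discrete actions, the joint action space has k to the power n combinations. The function name and the numbers below are illustrative, not from the paper.

```python
# Sketch of why joint training scales poorly: with n agents that each
# choose from k actions, the joint (Cartesian-product) action space
# has k**n combinations.

def joint_action_space_size(num_agents: int, actions_per_agent: int) -> int:
    """Number of distinct joint actions across all agents."""
    return actions_per_agent ** num_agents

# With 5 actions per agent, the joint space explodes as agents are added:
for n in (1, 2, 4, 8):
    print(n, joint_action_space_size(n, 5))
# → 1 5 / 2 25 / 4 625 / 8 390625
```

This exponential growth is one reason naive joint training becomes intractable, motivating the structured information-sharing schemes the survey examines.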

This study goes beyond previous research by providing a consolidated view of the state of the art (SOTA) in RL algorithms and establishing a novel approach to defining the information shared during centralized learning.

Their paper, “Survey of recent multi-agent reinforcement learning algorithms utilizing centralized training,” introduces a model that can efficiently characterize the essential information-sharing parameters. The researchers suggest that centralized training can provide a suitable solution for developing autonomous systems. They explain that consistent, centralized training can result in multi-agent systems that work together more reliably, increasing soldiers’ trust in the AI.
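For context, many of the algorithms in this space follow a centralized-training, decentralized-execution pattern: a critic may see information from all agents during training, while each actor acts only on its own local observation at execution time. The following is a minimal structural sketch of that pattern; all class names and the placeholder policy/value logic are illustrative, not the paper's implementation.

```python
# Minimal sketch of centralized training with decentralized execution
# (CTDE). All names and the placeholder logic are illustrative.
import random

class Actor:
    """At execution time, each agent acts from its own local observation."""
    def __init__(self, num_actions: int):
        self.num_actions = num_actions

    def act(self, local_obs):
        # Placeholder policy: random action; a real actor is learned.
        return random.randrange(self.num_actions)

class CentralizedCritic:
    """During training only, the critic may condition on information
    shared across agents: all observations and all actions."""
    def value(self, all_obs, all_actions):
        # Placeholder estimate; a real critic is a learned function
        # of the joint observation-action input.
        return float(len(all_obs) + sum(all_actions))

actors = [Actor(num_actions=3) for _ in range(2)]
critic = CentralizedCritic()

obs = [[0.1], [0.2]]                            # per-agent local observations
acts = [a.act(o) for a, o in zip(actors, obs)]  # decentralized execution
q = critic.value(obs, acts)                     # centralized training signal
```

The key design point is that the extra shared information exists only in the training loop; deployed agents remain decentralized.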

The team investigated recent centralized learning algorithms, focusing on identifying and characterizing their underlying mathematical frameworks. They believe these frameworks can help in exploring alternate centralized learning techniques and gauging their effect on learning and emergent collaborative behaviors. They surveyed algorithms published in the last five to six years, noting that because these algorithms are so recent, they have not yet been explored extensively; this was the main motivation for studying them.

Rather than focusing on how information is shared, the researchers defined and categorized the sharing mechanisms in terms of what is shared. They assert that they have identified gaps in recent RL techniques that, if addressed, could improve the process of training agents. This work will help in training autonomous multi-agent systems. They also aim to investigate particular aspects of multi-agent RL methods that train agents in a centralized fashion.
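To make the "what is shared" framing concrete, one plausible way to encode such a categorization is as a set of tags per algorithm. The categories below are common kinds of shared information in the multi-agent RL literature generally; they are illustrative and are not the paper's own taxonomy.

```python
# Illustrative only: tagging *what* is shared during centralized
# training. These categories are not the paper's taxonomy.
from enum import Enum, auto

class SharedInfo(Enum):
    OBSERVATIONS = auto()   # other agents' local observations
    ACTIONS = auto()        # other agents' chosen actions
    REWARDS = auto()        # individual or joint reward signals
    PARAMETERS = auto()     # shared policy/network weights
    VALUES = auto()         # centralized value/critic estimates

# An algorithm can then be characterized by the set of things it
# shares; e.g., a hypothetical centralized-critic method might share:
centralized_critic_shares = {SharedInfo.OBSERVATIONS, SharedInfo.ACTIONS}
print(sorted(s.name for s in centralized_critic_shares))
# → ['ACTIONS', 'OBSERVATIONS']
```

Describing algorithms by the content of their sharing, rather than its mechanics, makes gaps easier to spot: any untagged combination is a candidate for new techniques.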

Sharma states that centralized techniques have many limitations; therefore, the team plans to conduct an empirical analysis of existing decentralized learning techniques. They will model and simulate multi-agent RL training to validate and extend theories of agent learning, behavior, and coordination.

The team believes that their survey will help researchers develop RL techniques for collaborative multi-agent systems, including units of robots that could work alongside soldiers in the future.