Using Multi-Agent Debate for Improving Reasoning and Factual Accuracy of Large Language Models (LLMs)

Large Language Models (LLMs) have recently drawn wide attention for their advanced capabilities. Models with strong language generation and understanding abilities, such as OpenAI’s GPT-3.5 and the multimodal GPT-4, are being widely adopted across industries. Use cases include answering questions, summarizing text, translating between languages, and text-to-text transformation.

LLMs can produce coherent text, understand and respond to prompts, and even learn from a small number of examples, which is known as few-shot learning. With few-shot learning, LLMs use supervised information to classify new data with only a few labeled samples. Because LLMs still have room for improvement, a team of MIT and Google Brain researchers has proposed, in a recent research paper, a complementary approach based on ‘multi-agent debate’ to improve the quality of the responses LLMs generate.
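To make few-shot learning concrete, here is a minimal sketch of how a handful of labeled examples might be packed into a single prompt. It assumes the prompt-based setting, where the examples are supplied as text rather than used for training. The function name `build_few_shot_prompt`, the `Review:`/`Sentiment:` template, and the sample reviews are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format a few labeled examples followed by one unlabeled query.

    The template below is an illustrative assumption, not a standard format.
    """
    blocks = []
    for text, label in examples:
        # Each labeled example shows the model the expected input/output shape.
        blocks.append(f"Review: {text}\nSentiment: {label}\n")
    # The query uses the same shape but leaves the label for the model to fill in.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo_examples = [
        ("The battery lasts all day.", "positive"),
        ("The screen cracked within a week.", "negative"),
    ]
    print(build_few_shot_prompt(demo_examples, "Setup was quick and painless."))
```

The resulting string would be sent to the model as an ordinary prompt, and the model is expected to continue it with a label.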

The team has introduced a mechanism in which multiple instances of an LLM propose and debate their individual responses and reasoning processes over several rounds, rather than relying on a single model instance. The objective is to reach a final answer that has been carefully reviewed and improved through a collaborative effort. This complementary method uses a ‘society of minds’ approach, inspired by the idea that the collective intelligence of multiple minds working together can lead to better performance and more accurate results.

This approach involves a number of models, or agents, all of which are asked the same question at the start. Each agent then repeatedly reviews and revises its answer in light of the other agents’ replies, with the goal of improving overall performance. In this way, discussion among several language model instances is used to improve the deductive reasoning and factual accuracy of the final response.
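The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers’ implementation: the names `Agent`, `build_debate_prompt`, `debate`, and `make_stub_agent` are hypothetical, the prompt wording is invented, and the stub agents are simple deterministic stand-ins for real LLM calls so that the example runs without any model or API.

```python
from collections import Counter
from typing import Callable, List

# An agent is treated as a black box: any function from a prompt to a reply.
Agent = Callable[[str], str]

REPLY_PREFIX = "- Agent reply: "  # hypothetical marker used in the debate prompt


def build_debate_prompt(question: str, other_replies: List[str]) -> str:
    """Ask an agent to reconsider the question given the other agents' replies."""
    quoted = "\n".join(REPLY_PREFIX + reply for reply in other_replies)
    return (
        f"Question: {question}\n"
        f"Other agents answered:\n{quoted}\n"
        "Using these replies as additional information, give an updated answer."
    )


def debate(question: str, agents: List[Agent], rounds: int) -> List[str]:
    """Run the debate and return each agent's final reply."""
    # Initial step: every agent answers the same question independently.
    replies = [agent(question) for agent in agents]
    for _ in range(rounds):
        updated = []
        for i, agent in enumerate(agents):
            # Each agent sees every reply except its own from the previous round.
            others = replies[:i] + replies[i + 1:]
            updated.append(agent(build_debate_prompt(question, others)))
        replies = updated
    return replies


def make_stub_agent(initial: str) -> Agent:
    """Stand-in for an LLM (assumption): adopts a reply that a strict
    majority of the other agents gave, otherwise repeats its initial answer."""
    def agent(prompt: str) -> str:
        seen = [
            line[len(REPLY_PREFIX):]
            for line in prompt.splitlines()
            if line.startswith(REPLY_PREFIX)
        ]
        if seen:
            answer, count = Counter(seen).most_common(1)[0]
            if count * 2 > len(seen):
                return answer
        return initial
    return agent


if __name__ == "__main__":
    stub_agents = [make_stub_agent("12"), make_stub_agent("12"), make_stub_agent("15")]
    print(debate("What is 7 + 5?", stub_agents, rounds=1))
```

In practice each stub would be replaced by a call to a real model. Because `debate` only requires a function from prompt to reply, the number of agents and the number of rounds are plain parameters, and the agents do not need to be instances of the same model.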

The team has observed significant improvements in mathematical and strategic reasoning using the ‘society of minds’ approach, showing how the collective intelligence of multiple LLM instances leads to better performance. The proposed method also addresses false conclusions and hallucinations, a known weakness of modern models. The team has found that their method reduces the likelihood of such errors and improves the factual accuracy of the generated content.

One benefit of this approach is its adaptability: it can be applied to existing black-box LLMs without requiring significant changes. All tasks investigated follow the same process, with the same prompts, which keeps the method consistent and simple to use. In its evaluation, the team has observed that increasing either the number of agents or the number of debate rounds improves the models’ performance. It has also been found that multi-agent debate can enable two different language models, such as ChatGPT and Bard, to cooperatively solve a task that neither can solve individually.

In conclusion, the ‘society of minds’ strategy has the potential to greatly improve LLM performance, creating new opportunities for advances in language generation and comprehension. By using this method, LLMs can provide more accurate and dependable responses, reason more effectively, and make fewer of the mistakes commonly found in language models.

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.
