PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers

Decision-making is critical for organizations, involving data analysis and selecting the most suitable alternative to achieve specific goals. In business scenarios like pharmaceutical distribution networks, companies face complex decisions such as determining which plants to operate, how many employees to hire, and how to optimize production costs while ensuring timely delivery. The decision-making task traditionally requires three steps: planning the necessary analysis, retrieving relevant data, and making decisions based on that data. While decision support systems have been developed to aid the latter two steps, the crucial first step of planning the required analysis has remained a human-driven process. Automating this step and enabling end-to-end decision-making without human intervention poses significant challenges for current methodologies.

Researchers have developed various benchmarks to evaluate natural language processing (NLP) tasks involving structured data, such as Table Natural Language Inference (NLI) and Tabular Question Answering (QA). These benchmarks assess the ability to reason over tabular data and answer questions or determine the validity of hypotheses based on the provided information. However, these benchmarks do not consider business rules or the ability of language models (LMs) to query large structured databases, limiting their applicability to real-world decision-making scenarios. Also, techniques like Retrieval-Augmented Generation (RAG) have been explored to enhance LMs by allowing them to retrieve and incorporate external data into their responses. While these methods have shown promising results on tasks requiring multi-hop reasoning, they still face limitations in solving complex decision-making tasks effectively.

The researchers from the School of Computing, KAIST propose a new task called Decision QA, which aims to enable LMs to make optimal decisions by analyzing structured data and business rules. Decision QA is a QA-style task that takes a database, business rules, and a decision-making question as input and generates the best decision as output. To facilitate this task, the researchers introduce a benchmark called DQA, consisting of two scenarios: Locating and Building. The Locating scenario involves questions about the optimal placement of resources (e.g., where to locate a merchant), while the Building scenario deals with questions related to resource allocation (e.g., how many resources to supply to a factory). The benchmark is built using data extracted from strategy video games that mimic real-world business situations.
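The input/output structure of Decision QA described above can be sketched as a simple interface. This is an illustrative sketch, not the paper's actual code; all field names and the example values are assumptions.

```python
from dataclasses import dataclass

# Hypothetical sketch of the Decision QA task interface: a database (schema),
# business rules, and a question go in; a single best decision comes out.
@dataclass
class DecisionQAInput:
    database_schema: str   # schema of the structured database (relational or graph)
    business_rules: str    # rules constraining the decision, in natural language
    question: str          # the decision-making question

@dataclass
class DecisionQAOutput:
    decision: str          # the best decision, e.g. a location or a quantity

# Illustrative Locating-scenario example (values invented for this sketch).
example = DecisionQAInput(
    database_schema="CREATE TABLE trade(src TEXT, dst TEXT, volume INT);",
    business_rules="A merchant's profit grows with total trade volume at its city.",
    question="In which city should the merchant be located to maximize profit?",
)
```

The point of the sketch is that, unlike Table NLI or Tabular QA benchmarks, the rules and the database are both first-class inputs the model must combine.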

The proposed method, called PlanRAG (Plan-then-Retrieval Augmented Generation), extends the existing iterative RAG technique to tackle the Decision QA task more effectively. The key components of PlanRAG are as follows:

  1. Planning: The LM takes the decision-making question, database schema, and business rules as input and generates an initial plan describing the series of data analyses needed for decision-making.
  2. Retrieving & Answering: Unlike traditional RAG, the LM incorporates the initial plan along with the question, schema, and rules. It generates data analysis queries based on the plan, executes them on the database, and reasons about the results to determine if re-planning or further retrieval is needed for better decision-making.
  3. Re-planning: If the initial plan is insufficient, the LM assesses the current plan and query results, and generates a new plan for further analysis or corrects the direction of previous analysis.

The planning, retrieving & answering, and re-planning steps are performed iteratively until the LM determines that no further analysis is needed to make the decision. This iterative process, guided by the generated plans, allows PlanRAG to effectively handle complex decision-making tasks by continuously refining its analysis approach.
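The iterative loop described above can be sketched in pseudocode-like Python. Every function below is a stub standing in for an LLM call or a database query; the names and control flow are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of the PlanRAG loop. In a real system, make_plan, next_query,
# needs_replan, and answer would be LLM calls, and execute would run a SQL or
# graph query against the database. All names here are illustrative.

def make_plan(question, schema, rules, history=None):
    # Stub: the LM generates a list of data analyses needed for the decision.
    return ["find candidate cities", "sum trade volume per city"]

def next_query(plan, results):
    # Stub: the LM turns the next unfinished plan step into a query,
    # or returns None when no further retrieval is needed.
    if len(results) < len(plan):
        return f"QUERY for step: {plan[len(results)]}"
    return None

def execute(query):
    # Stub: execute the query on the database.
    return f"result of ({query})"

def needs_replan(plan, results):
    # Stub: the LM judges whether the current plan is still sufficient.
    return False

def answer(question, plan, results):
    # Stub: the LM reasons over all retrieved results to make the decision.
    return f"best decision given {len(results)} analysis results"

def plan_rag(question, schema, rules, max_iters=10):
    plan = make_plan(question, schema, rules)      # 1. planning
    results = []
    for _ in range(max_iters):
        query = next_query(plan, results)          # 2. retrieving & answering
        if query is None:
            break                                  # no further analysis needed
        results.append(execute(query))
        if needs_replan(plan, results):            # 3. re-planning
            plan = make_plan(question, schema, rules, history=results)
    return answer(question, plan, results)
```

The key structural difference from plain iterative RAG is visible in the loop: queries are generated from an explicit plan, and the plan itself can be revised mid-analysis rather than only the next retrieval step.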

The PlanRAG method significantly enhances the decision-making performance of language models compared to the state-of-the-art iterative RAG technique. PlanRAG excels at handling both simple and complex decision-making questions, outperforming existing methods by 15.8% in the Locating scenario and 7.4% in the Building scenario. Its strength lies in systematic planning and data retrieval, resulting in substantially lower rates of missed critical data analysis. PlanRAG demonstrates superior performance across relational and graph databases, particularly excelling in complex scenarios requiring multi-hop reasoning on graph databases.

This study explored LLMs for complex decision-making tasks. The researchers proposed Decision QA, a new task requiring LLMs to generate optimal decisions by considering business rules and situations from large databases. They created the DQA benchmark with 301 decision-making scenarios extracted from video games mimicking real-world situations. Also, they introduced PlanRAG, a novel technique that incorporates planning and re-planning steps into the retrieval-augmented generation process. Extensive experiments demonstrated PlanRAG's significant performance improvements over state-of-the-art methods on the DQA benchmark, highlighting its effectiveness for decision-making applications involving LLMs and structured data.

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.


Asjad is an intern consultant at Marktechpost. He is pursuing a B.Tech in mechanical engineering at the Indian Institute of Technology, Kharagpur. Asjad is a machine learning and deep learning enthusiast who is always researching the applications of machine learning in healthcare.
