Meet PLASMA: A Novel Two-Pronged AI Approach To Endow Small Language Models With Procedural Knowledge And (Counterfactual) Planning Capabilities

Large language models (LLMs) excel at many downstream tasks that call for common sense, thanks in large part to their scale. One such task is procedural planning, which entails breaking down a high-level goal (for instance, “see a movie”) into a sequence of coherent, logical, goal-oriented steps, i.e., a plan (“Look up movie showings,” “Choose a movie,” …). Recent approaches use LLMs to model this task as a conditional text generation problem. LLMs do well on the task, but their widespread deployment is hampered by high computational cost and limited accessibility.

Researchers from the Allen Institute for Artificial Intelligence, the University of Washington, the University of Southern California, Tohoku University, and the University of Pittsburgh present PLASMA (PLAn with SMAll models), a two-pronged framework that helps small LMs acquire planning skills. They combine symbolic procedural knowledge distillation, which improves the implicit knowledge in small LMs, with an inference-time decoding algorithm that enables more structured reasoning (Figure 1). They propose a two-stage formulation of extended procedural knowledge distillation:

(i) knowledge verbalization, to elicit procedural knowledge from an LLM, and

(ii) knowledge distillation, to transfer the knowledge produced by the LLM to a smaller LM.
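The two stages can be pictured with a small sketch. This is only an illustration of the data flow, not the authors' pipeline: `toy_teacher` stands in for a real LLM call, `non_empty` stands in for whatever quality filter is applied, the prompt layout is an assumption, and the canned plans are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Plan = List[str]


def verbalize(goals: List[str],
              teacher: Callable[[str], Plan],
              keep: Callable[[str, Plan], bool]) -> List[Tuple[str, Plan]]:
    """Stage (i): ask a teacher for one plan per goal and keep the ones that pass a filter."""
    pairs = []
    for goal in goals:
        plan = teacher(goal)
        if keep(goal, plan):
            pairs.append((goal, plan))
    return pairs


def to_training_text(pairs: List[Tuple[str, Plan]]) -> List[Tuple[str, str]]:
    """Prepare stage (ii): serialize (goal, plan) pairs as prompt/target strings
    that a smaller LM could be fine-tuned on."""
    examples = []
    for goal, plan in pairs:
        target = "\n".join(f"Step {i}: {step}" for i, step in enumerate(plan, 1))
        examples.append((f"Goal: {goal}\nPlan:", target))
    return examples


def toy_teacher(goal: str) -> Plan:
    # Hypothetical stand-in for an LLM; the canned plans are illustrative only.
    canned: Dict[str, Plan] = {
        "see a movie": ["Look up movie showings", "Choose a movie", "Buy a ticket"],
        "bake a cake": [],  # simulates a failed generation
    }
    return canned.get(goal, [])


def non_empty(goal: str, plan: Plan) -> bool:
    # Hypothetical filter: drop empty generations.
    return len(plan) > 0


pairs = verbalize(["see a movie", "bake a cake"], toy_teacher, non_empty)
examples = to_training_text(pairs)
print(examples[0][0])
print(examples[0][1])
```

In a real setup the resulting prompt/target pairs would be fed to a standard fine-tuning loop for the smaller model; that part is omitted here.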

In addition to the traditional planning task, they verbalize knowledge for novel task formulations in counterfactual settings: counterfactual planning and counterfactual plan revision.

Figure 1: Knowledge Distillation from Symbolic Procedures

In these counterfactual tasks, the model generates or revises a plan for a given goal (for example, “see a movie”) while adhering to an additional condition (for example, “at home”). The tasks provide a more realistic setting because they require models to reason about the contextually constrained situations that arise in real-world applications. The knowledge verbalization process yields COPLAN, a large (counterfactual) procedural planning dataset. COPLAN is then used to train the smaller PLASMA models through both task-specific and multi-task distillation. The authors observe that the standard next-token prediction objective of auto-regressive LMs (applied during distillation) does not equip the models with the causal and temporal reasoning needed to produce high-quality plans, nor with a way to recover from mistakes made in earlier steps.
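One way to picture multi-task training data for these tasks is to serialize each example into a single prompt that names the task and carries the optional condition. The sketch below is an assumption made for illustration: the `Example` class, the task labels, and the prompt layout are hypothetical and do not reflect the actual COPLAN format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    task: str                                 # hypothetical label, e.g. "counterfactual_planning"
    goal: str                                 # high-level goal, e.g. "see a movie"
    condition: Optional[str] = None           # extra constraint, e.g. "at home"
    initial_plan: Optional[List[str]] = None  # plan to revise (revision task only)


def build_prompt(ex: Example) -> str:
    """Serialize one example into a single prompt string for a multi-task model."""
    parts = [f"[{ex.task}]", f"Goal: {ex.goal}"]
    if ex.condition:
        parts.append(f"Condition: {ex.condition}")
    if ex.initial_plan:
        parts.append("Initial plan: " + "; ".join(ex.initial_plan))
    parts.append("Plan:")
    return "\n".join(parts)


prompt = build_prompt(Example("counterfactual_planning", "see a movie", "at home"))
print(prompt)
```

Putting the task name in the prompt lets one model serve all three tasks, while a task-specific model would simply be trained on examples with a single label.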

To overcome this difficulty, they introduce PLASMA+, which adds a verifier-guided step-wise beam search that makes better use of the multi-step structure of plans. Specifically, they incorporate a step-wise verifier into the decoding procedure to help PLASMA+ produce plans that are more semantically coherent and temporally accurate. Through experiments, they show that the approach successfully endows smaller LMs with planning skills. On the standard planning task, smaller student models (of varying sizes) outperform their teacher by 17.57% on average. The best student model is comparable even to GPT-3, a model 16 times its size.
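The idea of letting a verifier steer decoding one plan step at a time can be sketched with a toy beam search. Everything here is an assumption made for illustration: the `propose`, `lm_score`, and `verify` callables are hypothetical stand-ins for the student LM and the step verifier, the weighted sum with `alpha` is just one plausible way to combine the two scores, and the third step in `ORDER` is invented for the example.

```python
from typing import Callable, List, Tuple

Plan = Tuple[str, ...]


def stepwise_beam_search(goal: str,
                         propose: Callable[[str, Plan], List[str]],
                         lm_score: Callable[[str, Plan, str], float],
                         verify: Callable[[str, Plan, str], float],
                         beam_size: int = 2,
                         max_steps: int = 3,
                         alpha: float = 0.5) -> Plan:
    """Grow plans one whole step at a time, ranking partial plans by a mix of
    LM score and verifier score (the mixing rule is an assumption)."""
    beams: List[Tuple[Plan, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Plan, float]] = []
        for plan, score in beams:
            for step in propose(goal, plan):
                step_score = (alpha * lm_score(goal, plan, step)
                              + (1 - alpha) * verify(goal, plan, step))
                candidates.append((plan + (step,), score + step_score))
        if not candidates:
            break
        # Keep only the highest-scoring partial plans.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]


# Toy components (illustrative only).
ORDER = ["Look up movie showings", "Choose a movie", "Go to the theater"]


def propose(goal: str, plan: Plan) -> List[str]:
    # Offer every step that has not been used yet.
    return [s for s in ORDER if s not in plan]


def lm_score(goal: str, plan: Plan, step: str) -> float:
    # An uninformative LM: every step looks equally likely.
    return 0.0


def verify(goal: str, plan: Plan, step: str) -> float:
    # A toy verifier that rewards the step that belongs next in the reference order.
    return 1.0 if step == ORDER[len(plan)] else 0.0


best = stepwise_beam_search("see a movie", propose, lm_score, verify)
print(best)
```

Because the toy LM score is flat, the ordering of the final plan comes entirely from the verifier, which is the effect the decoding-time guidance is meant to have on step ordering.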

Furthermore, they distill counterfactual planning skills into small models for the first time, reaching a 93% validity rate in human evaluation. In a simulated environment, their model greatly exceeds earlier GPT-3-based work in executability (17%) and accuracy (25%). Taken as a whole, their framework, which consists of symbolic procedural distillation, the decoding-time algorithm, the proposed tasks, and the COPLAN dataset, offers a significant resource and a starting point for future research in procedural planning.


Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
