Microsoft Researchers Introduce Reprompting: An Iterative Sampling Algorithm that Searches for the Chain-of-Thought (CoT) Recipes for a Given Task without Human Intervention

Large Language Models (LLMs) have recently transformed Natural Language Processing, in large part through few-shot prompting. They are now used across many domains, including machine translation, natural language understanding, text completion, sentiment analysis, and speech recognition. With few-shot prompting, an LLM is given a few examples of a task along with natural language instructions, and it uses these to adapt to the task. These prompting techniques have limitations, however, on tasks that require iterative steps and constraint propagation, and a new approach has been introduced to overcome them.

A team of researchers at Microsoft Research, Redmond, USA, recently introduced a new method called Reprompting, which aims to address these limitations. The approach automatically searches for effective chain-of-thought (CoT) prompts. Chain-of-thought prompting improves the reasoning ability of large language models and helps them perform complex reasoning tasks by providing a few chain-of-thought demonstrations as exemplars in the prompt. Reprompting finds such CoT prompts efficiently and without any human involvement.
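To make the idea of CoT exemplars concrete, here is a minimal Python sketch of how a few-shot chain-of-thought prompt might be assembled. The `Exemplar` class, the `build_cot_prompt` function, and the "Q:/A:" layout are illustrative assumptions, not the format used by the researchers.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked demonstration (hypothetical structure for illustration)."""
    question: str
    reasoning: str  # the chain of thought leading to the answer
    answer: str


def build_cot_prompt(exemplars, question,
                     instruction="Answer the question. Think step by step."):
    """Concatenate an instruction, worked exemplars, and the new question."""
    parts = [instruction, ""]
    for ex in exemplars:
        parts.append(f"Q: {ex.question}")
        parts.append(f"A: {ex.reasoning} The answer is {ex.answer}.")
        parts.append("")
    # The final question is left open for the model to complete.
    parts.append(f"Q: {question}")
    parts.append("A:")
    return "\n".join(parts)


demo = Exemplar(
    question="Tom has 3 apples and buys 2 more. How many does he have?",
    reasoning="Tom starts with 3 apples. Buying 2 more gives 3 + 2 = 5.",
    answer="5",
)
prompt = build_cot_prompt([demo], "Ann has 4 pens and loses 1. How many are left?")
print(prompt)
```

Writing the `reasoning` field by hand for every task is the manual effort that Reprompting is meant to remove.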

The Reprompting algorithm uses an iterative sampling approach known as Gibbs sampling. It frames the problem as sampling from a joint distribution of CoT recipes. Since this distribution is difficult to characterize directly, Gibbs sampling is used as an approximation method: it repeatedly resamples one recipe at a time, conditioned on the others. In effect, the method tries different recipes and keeps those that work best.
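For readers unfamiliar with Gibbs sampling, the following toy Python sketch shows the general technique on a tiny joint distribution over two binary variables. The probability table and function names are made up for illustration and have nothing to do with CoT recipes; the point is only that resampling one variable at a time from its conditional recovers the joint distribution.

```python
import random

# Joint distribution over two binary variables (a, b); illustrative numbers.
JOINT = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def sample_conditional(index, state, rng):
    """Resample state[index] from its conditional given the other variable."""
    weights = []
    for value in (0, 1):
        candidate = list(state)
        candidate[index] = value
        # Unnormalised conditional probability is the joint table entry.
        weights.append(JOINT[tuple(candidate)])
    return rng.choices((0, 1), weights=weights)[0]


def gibbs(num_steps, seed=0):
    """Run a Gibbs sampler and return the empirical frequency of each state."""
    rng = random.Random(seed)
    state = [0, 0]
    counts = {key: 0 for key in JOINT}
    for _ in range(num_steps):
        # One sweep: update each variable in turn, holding the other fixed.
        for index in (0, 1):
            state[index] = sample_conditional(index, state, rng)
        counts[tuple(state)] += 1
    return {key: count / num_steps for key, count in counts.items()}


freqs = gibbs(20000)
for key in sorted(JOINT):
    print(key, "target", JOINT[key], "estimated", round(freqs[key], 3))
```

The sampler never evaluates the joint distribution as a whole; it only needs to draw from each conditional, which is what makes the approach attractive when the joint is hard to characterize.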

The Reprompting algorithm begins by sampling initial CoT recipes through zero-shot prompting, where no worked examples are provided in the prompt. Zero-shot prompting lets an LLM generate responses to a task without task-specific demonstrations. The algorithm then iteratively samples new recipes using previously sampled solutions as parent prompts, and these new recipes are used to solve other training problems, with the aim of converging on a set of recipes that share similar chains of thought.
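The loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the researchers' implementation: the function `reprompt`, the parameters `k` and `p_reject`, the acceptance rule, and the two stub "LLM" callables are all hypothetical stand-ins so the example runs without any model.

```python
import random


def reprompt(problems, zero_shot_llm, few_shot_llm,
             num_iters=50, k=2, p_reject=1.0, seed=0):
    """Sketch of an iterative recipe search.

    problems: list of (question, gold_answer) pairs.
    Both LLM callables return a (reasoning, predicted_answer) tuple.
    """
    rng = random.Random(seed)
    # Step 1: initialise one recipe per training problem via zero-shot prompting.
    recipes = [zero_shot_llm(question) for question, _ in problems]
    for _ in range(num_iters):
        # Step 2: pick one problem to resample and a few other recipes as parents.
        j = rng.randrange(len(problems))
        others = [i for i in range(len(problems)) if i != j]
        parents = rng.sample(others, min(k, len(others)))
        demos = [(problems[i][0], recipes[i]) for i in parents]
        question, gold = problems[j]
        candidate = few_shot_llm(demos, question)
        # Step 3 (assumed rule): keep correct candidates; otherwise reject
        # the candidate with probability p_reject.
        if candidate[1] == gold or rng.random() > p_reject:
            recipes[j] = candidate
    return recipes


# Stand-in "models" for simple addition questions such as "2+3".
def zero_shot_stub(question):
    return ("no reasoning", "0")  # deliberately poor initial recipe


def few_shot_stub(demos, question):
    # A real model would condition on demos; this stub ignores them.
    a, b = (int(part) for part in question.split("+"))
    return (f"{a} plus {b} is {a + b}", str(a + b))


problems = [("2+3", "5"), ("4+1", "5"), ("7+2", "9")]
recipes = reprompt(problems, zero_shot_stub, few_shot_stub)
for (question, gold), (reasoning, predicted) in zip(problems, recipes):
    print(question, "->", predicted, "|", reasoning)
```

Passing different callables for `zero_shot_llm` and `few_shot_llm` is one way to express using one model for initialization and another for sampling.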

The algorithm has been evaluated on five Big-Bench Hard (BBH) tasks that require multi-step reasoning. BBH focuses on tasks believed to be beyond the abilities of current language models. ChatGPT and InstructGPT were used as the LLMs in the evaluation. In these experiments, Reprompting outperformed zero-shot, few-shot, and human-written CoT prompting.

Reprompting also showed significant potential for model combination, in which different LLMs are used for initializing and for sampling new recipes. This can transfer knowledge from a stronger model to a weaker one, resulting in noticeably better performance from the weaker model. Reprompting outperformed human-written CoT prompting on BBH tasks by up to 17 points. The researchers note that CoT recipes that work well on one model may not work well on another, highlighting the need to optimize CoT for each model to allow fairer comparisons.

To sum up, the Reprompting algorithm is an automated approach for finding effective CoT prompts for LLMs without human intervention. It addresses limitations of existing prompting methods and achieves stronger performance on tasks requiring multi-step reasoning.



Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.
