A New Artificial Intelligence (AI) Approach Called PromptPG Learns to Select In-Context Examples From a Small Amount of Training Data via Policy Gradient When Interacting With the GPT-3 API

Recent advances in Natural Language Processing have made it possible to build intelligent systems with a better and more articulate understanding of language than ever before. Large models like ChatGPT, PaLM, and DALL-E are continuously improving and showing rapid gains in performance. Large Language Models (LLMs) imitate humans and help perform tasks such as generating content, summarising long passages of text, answering questions, and completing code. LLMs are trained on massive amounts of data and have shown strong results in many domains, including mathematics. They have made progress on mathematical problems such as mathematical reasoning and Math Word Problems (MWP).

Although current LLMs can solve mathematical problems posed as text, they still struggle with tabular mathematical data, which requires reasoning over heterogeneous information. Researchers from the University of California, Los Angeles, the Georgia Institute of Technology, and the Allen Institute for AI introduce a new approach called PromptPG that handles grade-level mathematical reasoning problems over both tabular and textual data. The method is based on policy gradient, an approach to solving reinforcement learning problems.

Policy gradient mainly involves three steps: sampling actions, observing rewards, and updating the policy. PromptPG applies this idea by learning to select in-context examples from a small amount of training data and then constructing the prompt for each test example from them. It learns this selection while interacting with the GPT-3 API. To train and evaluate the approach, the researchers behind PromptPG developed a new dataset, Tabular Math Word Problems (TabMWP), consisting of 38,431 open-domain mathematical reasoning problems over textual and tabular data. The dataset contains 28,876 different questions, 6,153 different answers, and 35,442 different solutions. Each question is paired with a tabular context presented as an image, as semi-structured text, and as a structured table. Questions come in two types: free-text and multiple-choice.
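The three policy-gradient steps described above (sample an action, observe a reward, update the policy) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified policy with one learnable score per candidate example, selects a single example per step, and replaces the GPT-3 call with a hypothetical `mock_reward` function that returns +1 for a correct answer and -1 otherwise. All names and numbers here are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mock_reward(example_idx, rng):
    """Hypothetical stand-in for a GPT-3 call: +1 if the answer is right, else -1.

    Assumption for illustration: candidate 2 is the most helpful example.
    """
    p_correct = 0.9 if example_idx == 2 else 0.3
    return 1.0 if rng.random() < p_correct else -1.0


def train_policy(num_candidates=5, steps=2000, lr=0.1, seed=0):
    """Learn selection probabilities over candidates with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * num_candidates  # policy parameters, one score per candidate
    for _ in range(steps):
        probs = softmax(logits)
        # 1) Sample an action: pick one in-context example.
        action = rng.choices(range(num_candidates), weights=probs, k=1)[0]
        # 2) Observe the reward for the prompt built with that example.
        reward = mock_reward(action, rng)
        # 3) Update the policy along reward * grad log pi(action).
        for i in range(num_candidates):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad_log_prob
    return softmax(logits)


if __name__ == "__main__":
    final_probs = train_policy()
    print([round(p, 3) for p in final_probs])
```

In this toy setting the policy gradually shifts probability toward the candidate that most often leads to a correct answer. A policy that conditions on the problem being solved would need per-problem features instead of a single score per candidate.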

The researchers show that PromptPG achieves an average accuracy of 68.23% on the TabMWP dataset, a state-of-the-art result and a 5.31% gain over random selection of in-context examples. Several pre-trained models were evaluated on TabMWP, including GPT-3, whose few-shot performance had been held back by its dependence on which in-context examples are selected. By learning to select the in-context examples, PromptPG reduces this variance and improves the model's performance without relying on hand-crafted heuristics.

The PromptPG interface is user-friendly and offers simple filters. Users can choose the type of question they want a solution for, either free-text or multiple-choice. They can then select an answer type from options such as integer, Boolean, and decimal. They can also specify the grade, the number of rows and columns, and the table title.

PromptPG is a notable advance given current LLMs’ limitations in solving complex mathematical problems that require reasoning. The approach can boost the performance of GPT-3 on such problems.

Check out the Paper, GitHub, and Project Page. All credit for this research goes to the researchers on this project.

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.