Researchers Built a Neural Network That Not Only Solves but Explains and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level

This article is a summary written by Marktechpost staff, based on the paper 'A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level'. All credit for this research goes to the researchers of this project. Check out the paper and GitHub.


Machine learning has spread into many fields, including education. Researchers and educators from MIT, Columbia University, Harvard University, and the University of Waterloo have built a neural network that solves, explains, and generates university math problems.

They used a neural network pre-trained on text and fine-tuned on code to answer mathematics course problems, explain solutions, and generate new questions at a human level. Using few-shot learning with OpenAI’s Codex transformer, the system automatically synthesizes programs and runs them to answer course problems with 81 percent automated accuracy.
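The "synthesize a program, then run it" loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `solve_by_program_synthesis` and `fake_codex` are hypothetical names, the prompt wording is invented for the example, and the code model is replaced by a stub that returns a fixed program so the sketch runs without any external service.

```python
from typing import Callable


def solve_by_program_synthesis(question: str, synthesize: Callable[[str], str]) -> object:
    """Ask a code model for a program, execute it, and return its `answer` variable."""
    # Assumed prompt format: the question wrapped in a docstring-style comment.
    prompt = f'"""\n{question}\nStore the result in a variable named answer.\n"""\n'
    program = synthesize(prompt)
    namespace: dict = {}
    # Running generated code is unsafe outside a sandbox; acceptable only for a toy demo.
    exec(program, namespace)
    return namespace.get("answer")


def fake_codex(prompt: str) -> str:
    """Hypothetical stand-in for a code model; always returns the same program."""
    return "answer = sum(k * k for k in range(1, 11))\n"


result = solve_by_program_synthesis(
    "What is the sum of the squares of the integers from 1 to 10?",
    fake_codex,
)
print(result)  # 385
```

The answer comes from executing the generated program rather than from the model's text output, which is the distinction the article draws between Codex-based program synthesis and a text-only model such as GPT-3.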

They also curated a new dataset of questions from MIT’s most famous mathematics courses. In addition, the neural network answers questions from the MATH dataset (covering Prealgebra, Algebra, Counting and Probability, Intermediate Algebra, Number Theory, and Precalculus), the current benchmark of advanced mathematics problems designed to assess mathematical reasoning.

They chose questions at random and produced solutions in several modalities, including numbers, equations, and graphs. The latest GPT-3 language model, pre-trained on text, automatically solved only 18.8 percent of these university problems with zero-shot learning and 30.8 percent with few-shot learning plus the recent chain-of-thought prompting. In contrast, program synthesis with few-shot learning using Codex, which is fine-tuned on code, yields programs that automatically solve 81 percent of these questions.

Their method raises the prior state-of-the-art automated accuracy on the benchmark subjects from 8.8 to 81.1 percent. According to the authors, this study is the first to automatically solve university-level mathematics course problems at a human level and to explain and generate such questions at scale, which they present as a milestone for higher education. Automating these tasks could save considerable time and effort.
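To put the jump from 8.8 to 81.1 percent in perspective, the short calculation below works out the absolute and relative improvement. It uses only the two figures quoted above; the variable names are illustrative.

```python
prior_accuracy = 8.8   # prior state of the art on the benchmark subjects, in percent
new_accuracy = 81.1    # accuracy reported for the new method, in percent

absolute_gain = new_accuracy - prior_accuracy   # gain in percentage points
relative_gain = new_accuracy / prior_accuracy   # multiplicative improvement

print(f"Absolute gain: {absolute_gain:.1f} percentage points")  # 72.3
print(f"Relative gain: {relative_gain:.1f}x")                   # 9.2x
```

A roughly ninefold increase is what is usually meant by an improvement of about an order of magnitude.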


A random sample of questions that do not require input images or proofs was chosen from each course or topic. A language model pre-trained on text (GPT-3 text-davinci-002) automatically solves only 18 percent (for courses) and 25.5 percent (for MATH benchmark subjects) of these problems. In comparison, with zero-shot learning, a network pre-trained on text and fine-tuned on code (OpenAI Codex code-davinci-002) synthesizes programs that automatically solve 71 percent (for courses) and 72.2 percent (for MATH benchmark subjects) of the problems. With the same network and few-shot learning, 81 percent (for courses) and 81.1 percent (for MATH benchmark subjects) of the problems are solved automatically. The few-shot examples are the questions already solved zero-shot whose embeddings are nearest to the new question, paired with their synthesized code.
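The few-shot selection step can be illustrated with a small nearest-neighbor sketch. It assumes cosine similarity as the distance measure and uses hand-made two-dimensional vectors in place of real embeddings; the function names, the prompt layout, and the toy data are all hypothetical and are not taken from the paper's code.

```python
import math
from typing import List, Sequence, Tuple

SolvedItem = Tuple[str, str, Sequence[float]]  # (question, synthesized code, embedding)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_few_shot_examples(
    query_embedding: Sequence[float], solved: List[SolvedItem], k: int
) -> List[Tuple[str, str]]:
    """Return the k solved (question, code) pairs most similar to the query."""
    ranked = sorted(
        solved,
        key=lambda item: cosine_similarity(query_embedding, item[2]),
        reverse=True,
    )
    return [(question, code) for question, code, _ in ranked[:k]]


def build_few_shot_prompt(question: str, examples: List[Tuple[str, str]]) -> str:
    """Concatenate example question/code pairs, then the new question."""
    parts = [f'"""{q}"""\n{code}\n' for q, code in examples]
    parts.append(f'"""{question}"""\n')
    return "\n".join(parts)


# Toy data: questions assumed to have been solved zero-shot, with made-up embeddings.
solved = [
    ("Differentiate x**2", "answer = '2*x'", [1.0, 0.0]),
    ("Find P(two heads in two flips)", "answer = 0.25", [0.0, 1.0]),
    ("Differentiate x**3", "answer = '3*x**2'", [0.9, 0.1]),
]
examples = select_few_shot_examples([1.0, 0.0], solved, k=2)
prompt = build_few_shot_prompt("Differentiate x**4", examples)
print(prompt)
```

With these toy vectors the two calculus questions are selected and the probability question is left out, so the prompt shows the model worked examples that resemble the new question.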

They show how a neural network solves, explains, and generates university-level problems from MIT’s best-known mathematics courses at a human level. Their method combines three innovations:

  1. Using recent neural networks pre-trained on text and fine-tuned on code, rather than pre-trained on text alone
  2. Few-shot learning that synthesizes programs which automatically solve course problems
  3. A pipeline to solve questions, explain solutions, and generate new questions that students cannot distinguish from course questions

Their approach is the first to solve university-level mathematics course problems and improves on the state of the art by an order of magnitude, raising automated accuracy on randomly sampled benchmark problems. AI’s expanding role in automatic course evaluation and content generation has implications for higher education. The paper offers several examples showing how input questions can be rewritten as programming tasks for which Codex produces correct output.
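As a rough illustration of rewriting a question as a programming task, the sketch below wraps a course question in a docstring-style prompt and optionally names a library to use. The function `to_programming_task`, the prompt wording, and the choice of `sympy` in the example are assumptions made for illustration; the paper's actual prompt transformations may differ.

```python
from typing import Optional


def to_programming_task(question: str, library: Optional[str] = None) -> str:
    """Wrap a course question as a docstring-style programming prompt."""
    lines = ["Write a Python program that answers the question below."]
    if library:
        # Naming a library gives the code model context about the expected tools.
        lines.append(f"Use the {library} library.")
    lines.append(f"Question: {question.strip()}")
    lines.append("Store the final result in a variable named answer.")
    body = "\n".join(lines)
    return f'"""\n{body}\n"""\n'


prompt = to_programming_task("  Find the derivative of x**3 at x = 2. ", library="sympy")
print(prompt)
```

The resulting string is what would be sent to the code model; the generated program, not the prompt, is then executed to obtain the answer.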

Although the research does not go into the technical specifics of the underlying network, it presents extensive empirical evidence that transformers pre-trained on text and fine-tuned on code can reach human-level performance on questions from university-level mathematics courses. The researchers also demonstrate how prompt-generation methods can be used to construct question-solving programs for mathematical subjects, including solutions with graphs. The code used to build this model is available on GitHub.

Overall, this study demonstrates that, using program synthesis, transformers pre-trained on text and fine-tuned on code can automatically answer, grade, and generate university-level mathematics course problems in real time. The team believes this offers an opportunity to address fundamental pedagogical challenges and could bring benefits to higher education such as automated evaluation and content generation.

Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
