Meta AI and UPF Researchers Introduce Toolformer: A Language Model That Learns in a Self-Supervised Way How to Use Different Tools Such as Search Engines via Simple API Calls

Large language models (LLMs) have become incredibly popular, mainly because of their outstanding performance on a range of natural language processing tasks. One of their most significant differentiating factors is their impressive ability to solve new tasks from just a few examples or text prompts. This makes it all the more surprising that these ostensibly all-knowing LLMs frequently struggle with basic functionality, such as performing arithmetic or accessing up-to-date information about recent events. At the same time, much simpler and smaller models perform remarkably well at exactly these tasks.

To counter these limitations, researchers have sought to give language models access to external tools such as search engines, calculators, or calendars via APIs. Unfortunately, current methods either restrict tool use to task-specific settings or depend heavily on human annotations, which has kept tool use in LLMs from being adopted more widely. Researchers from Meta AI Research and the Universitat Pompeu Fabra collaborated to develop Toolformer, a model that teaches itself to use external tools such as search engines, calculators, and translation systems via API calls to improve its performance on various downstream tasks. Toolformer is trained to decide which APIs to call, when to call them, what arguments to pass, and how best to incorporate the results into future token prediction. Their paper, “Toolformer: Language Models Can Teach Themselves to Use Tools,” provides more information about their research.

Before constructing the model, the team first set out the requirements that Toolformer should meet in comparison to existing approaches. The first requirement was that tool use should be learned in a self-supervised manner, without requiring large amounts of human annotation. Not only are human annotations expensive and time-consuming to collect, but what humans consider useful can also differ from what a model actually finds useful. The second requirement was that the model should be able to decide for itself which tool to use, when, and how, without losing any of its generality. This makes it possible to use tools much more broadly, since their use is not tied to a specific task.

The Toolformer methodology uses in-context learning as its foundation to create complete datasets from scratch. Given a few manually written examples that show how to use a specific API, the LLM annotates a large language modeling dataset with candidate API calls. A self-supervised loss is then used to determine which of these API calls actually help the model predict future tokens. Finally, the researchers fine-tune the model on the API calls deemed most helpful. This simple self-supervised approach enables an LLM such as Toolformer to learn to control a variety of tools, including a calculator, a question-answering system, a search engine, a translation system, and a calendar. Notably, the team represents each API call as a sequence of text, allowing API calls to be inserted seamlessly into any given text. As a result, the method is independent of the training dataset, so the model retains all of its generality and language modeling capabilities.
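The filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bracket-and-arrow text format, the function names linearize_call and keep_call, the threshold value, and the toy_loss stand-in (which simply counts continuation words missing from the context instead of querying a real language model) are all assumptions made for illustration. The idea is to keep a candidate API call only if inserting the call together with its result lowers the loss on the following tokens by at least a threshold, compared with the better of two baselines: no call at all, or the call without its result.

```python
from typing import Callable, Optional

# Hypothetical signature: loss_fn(context, continuation) -> loss on the continuation.
LossFn = Callable[[str, str], float]


def linearize_call(tool: str, argument: str, result: Optional[str] = None) -> str:
    """Render an API call as plain text so it can be spliced into a sentence.

    The "[Tool(arg) -> result]" format is an assumption for this sketch.
    """
    if result is None:
        return f"[{tool}({argument})]"
    return f"[{tool}({argument}) -> {result}]"


def keep_call(prefix: str, continuation: str, tool: str, argument: str,
              result: str, loss_fn: LossFn, threshold: float = 0.5) -> bool:
    """Keep the call only if call + result reduces the loss by at least `threshold`."""
    loss_plain = loss_fn(prefix, continuation)
    loss_call_only = loss_fn(prefix + linearize_call(tool, argument) + " ", continuation)
    loss_with_result = loss_fn(
        prefix + linearize_call(tool, argument, result) + " ", continuation
    )
    # Compare against the better (lower) of the two baselines.
    baseline = min(loss_plain, loss_call_only)
    return baseline - loss_with_result >= threshold


def toy_loss(context: str, continuation: str) -> float:
    """Toy stand-in for a language-model loss (an assumption, not a real model):
    one unit of loss per continuation word that does not occur in the context."""
    return float(sum(1 for word in continuation.split() if word not in context))


if __name__ == "__main__":
    prefix = "Out of 1400 participants, 400 passed, which is "
    continuation = "0.29 of the total"
    # A calculator result that contains the upcoming number is judged helpful.
    print(keep_call(prefix, continuation, "Calculator", "400 / 1400", "0.29", toy_loss))
    # A calendar result tells the model nothing about the continuation.
    print(keep_call(prefix, continuation, "Calendar", "", "Monday", toy_loss))
```

In a real setting, loss_fn would be replaced by the language model's own loss on the continuation, and only the calls that pass this check would be written back into the text used for fine-tuning.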

Using a pretrained GPT-J model with 6.7B parameters as the base, the researchers evaluated Toolformer on a range of downstream tasks, including mathematical reasoning and question answering. Toolformer achieved strong zero-shot results in these experiments, outperforming the considerably larger GPT-3 model and other baselines without compromising its core language modeling capabilities. To sum up, Toolformer is a language model that learns in a self-supervised way to use tools such as search engines, calculators, and translation systems through simple API calls, which substantially improves its zero-shot performance on various downstream tasks.

