Meet NexusRaven-V2: A 13B LLM That Outperforms GPT-4 in Zero-Shot Function Calling and Turns Natural Language Instructions into Executable Code

LLMs can be fine-tuned on code-related datasets to generate code snippets, including function calls. Given context or a prompt, these models can suggest or generate code that involves function calls. They can also be used for natural language understanding of code-related queries or instructions: developers enter questions or descriptions, and the model interprets them to return relevant function calls or code segments.

LLMs can assist in code completion by proposing function calls or suggesting relevant functions based on the surrounding context or partial code, which helps developers write code faster and more accurately. Given a task or problem description, they can also recommend appropriate APIs or procedures, helping developers find the right functions to call within their code. Integrated into development environments, LLMs can offer real-time assistance on function calls, parameter types, and potential errors.

Researchers at Nexusflow have released NexusRaven-V2, an open-source LLM that turns natural language instructions into executable code for using tools. The OpenAI Assistant API serves as the key to enabling copilots and agents to use software tools, and NexusRaven-V2 aims to advance open-source models for the same copilot and agent use cases.
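
To make the idea of turning an instruction into a tool call more concrete, here is a minimal sketch of how candidate tools might be described to a function-calling model. The `get_weather` tool, the `build_prompt` helper, and the prompt layout are all assumptions made for illustration; the exact prompt format NexusRaven-V2 expects is not described in this article.

```python
import inspect


def get_weather(city: str, unit: str = "celsius") -> str:
    """Return a short weather report for a city."""
    # Hypothetical tool used only for illustration.
    return f"Sunny in {city} ({unit})"


def build_prompt(functions, user_query: str) -> str:
    """Describe each tool by signature and docstring, then append the query.

    The layout below is an assumed, illustrative format, not the model's
    documented prompt template.
    """
    parts = []
    for fn in functions:
        signature = f"def {fn.__name__}{inspect.signature(fn)}:"
        doc = inspect.getdoc(fn) or ""
        parts.append(f'Function:\n{signature}\n    """{doc}"""\n')
    parts.append(f"User Query: {user_query}")
    return "\n".join(parts)


prompt = build_prompt([get_weather], "What is the weather in Paris?")
print(prompt)
```

A model prompted this way would be expected to answer with a call such as `get_weather(city="Paris")`, which the host application can then validate and run.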

NexusRaven-V2 surpasses GPT-4 by up to 7% in function-calling success rate on human-generated use cases involving nested and composite functions. The model is instruction-tuned from Meta’s CodeLlama-13B-Instruct. Its training data is sourced through Nexusflow’s pipelines exclusively from open code corpora, without using proprietary LLMs. It is commercially permissive for both community developers and enterprises.
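
A nested function call is one whose argument is itself the result of another call. The sketch below shows one way an application might execute such a call once a model has produced it as text. The tools, the `TOOLS` registry, and the sample model output are hypothetical. Parsing with Python's `ast` module instead of calling `eval` is a design choice of this example, not something the article attributes to Nexusflow.

```python
import ast


def get_coordinates(city: str) -> tuple:
    """Hypothetical tool: look up (latitude, longitude) for a city."""
    table = {"Paris": (48.86, 2.35)}
    return table[city]


def get_weather(coordinates: tuple, unit: str = "celsius") -> str:
    """Hypothetical tool: report the weather at a pair of coordinates."""
    lat, lon = coordinates
    return f"18 degrees {unit} at ({lat}, {lon})"


# Only functions registered here may be invoked by model output.
TOOLS = {"get_coordinates": get_coordinates, "get_weather": get_weather}


def run_call(node):
    """Recursively evaluate a call, allowing only registered tools and literals."""
    if isinstance(node, ast.Call):
        if not isinstance(node.func, ast.Name) or node.func.id not in TOOLS:
            raise ValueError("call to an unregistered function")
        if any(k.arg is None for k in node.keywords):
            raise ValueError("**kwargs unpacking is not allowed")
        args = [run_call(a) for a in node.args]
        kwargs = {k.arg: run_call(k.value) for k in node.keywords}
        return TOOLS[node.func.id](*args, **kwargs)
    # Anything that is not a call must be a plain literal (str, int, tuple, ...).
    return ast.literal_eval(node)


def execute(call_text: str):
    """Parse the model's text as a single expression and run it."""
    tree = ast.parse(call_text, mode="eval")
    return run_call(tree.body)


# Assumed example of what a function-calling model might emit.
model_output = 'get_weather(coordinates=get_coordinates(city="Paris"), unit="celsius")'
result = execute(model_output)
print(result)
```

The inner `get_coordinates` call is evaluated first and its result is passed to `get_weather`, which is the composition pattern the nested use cases exercise.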

On average, NexusRaven-V2 achieves a 4% higher function-calling success rate than the latest GPT-4 model on the team’s human-curated benchmark. The gap is widest on the 4 challenging tasks that require nested and composite function calls, where the advantage reaches up to 7%. NexusRaven-V2 is also more robust than GPT-4 to variations in how developers describe functions.
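
For readers who want to see how a success rate averaged over tasks could be computed, the following is a rough sketch. It assumes a prediction counts as a success when it parses to the same call as the reference, ignoring quote style and keyword-argument order. The benchmark's actual scoring rules are not described here, and the toy data is invented placeholder input, not benchmark content.

```python
import ast


def normalize(call_text: str) -> str:
    """Parse a call and sort keyword arguments so their order does not matter."""
    tree = ast.parse(call_text, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            node.keywords.sort(key=lambda k: k.arg or "")
    return ast.dump(tree)


def success_rate(predictions, references) -> float:
    """Fraction of predictions that match their reference call."""
    hits = 0
    for pred, ref in zip(predictions, references):
        try:
            hits += normalize(pred) == normalize(ref)
        except SyntaxError:
            pass  # unparsable model output counts as a failure
    return hits / len(references)


def average_over_tasks(tasks: dict) -> float:
    """Unweighted mean of per-task success rates (an assumed aggregation)."""
    rates = [success_rate(preds, refs) for preds, refs in tasks.values()]
    return sum(rates) / len(rates)


# Toy placeholder data: (predictions, references) per task.
toy_tasks = {
    "task_a": (
        ["f(b=2, a=1)", "g(h(x='q'))", "k(2)"],
        ["f(a=1, b=2)", 'g(h(x="q"))', "k(1)"],
    ),
    "task_b": (["m("], ["m()"]),
}
rate_a = success_rate(*toy_tasks["task_a"])
overall = average_over_tasks(toy_tasks)
print(rate_a, overall)
```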

The team released open-source utility artifacts that let users replace mainstream proprietary function-calling APIs with NexusRaven-V2 in their software workflows. They also provide online demos and Colab notebooks for onboarding and integration. In addition, they open-sourced their evaluation benchmark, Nexus-Function-Calling, and established a Hugging Face leaderboard. The benchmark includes an extensive collection of real-life, human-curated function-calling examples covering a range of use cases and difficulty levels.
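
As an illustration of what swapping out a proprietary function-calling API can involve, the sketch below converts a tool definition written in the JSON-schema style used by OpenAI's function-calling interface into a Python function stub that could be shown to a code-oriented model. The `schema_to_stub` helper and the type mapping are assumptions of this example; the team's released utilities may work quite differently.

```python
# Assumed mapping from JSON-schema types to Python type names.
JSON_TO_PYTHON = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "array": "list",
    "object": "dict",
}


def schema_to_stub(tool: dict) -> str:
    """Render a JSON-schema tool definition as a Python function stub."""
    fn = tool["function"]
    params = fn.get("parameters", {})
    props = params.get("properties", {})
    required = set(params.get("required", []))
    # Put required parameters first so the signature is valid Python.
    ordered = sorted(props.items(), key=lambda item: item[0] not in required)
    args = []
    for name, spec in ordered:
        py_type = JSON_TO_PYTHON.get(spec.get("type"), "object")
        default = "" if name in required else " = None"
        args.append(f"{name}: {py_type}{default}")
    name = fn["name"]
    arg_list = ", ".join(args)
    doc = fn.get("description", "")
    return f'def {name}({arg_list}):\n    """{doc}"""'


# Hypothetical tool definition used only for illustration.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "unit": {"type": "string"},
                "city": {"type": "string"},
            },
            "required": ["city"],
        },
    },
}
stub = schema_to_stub(tool)
print(stub)
```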

In the future, function-calling LLMs could benefit educational settings by providing learners with real-time assistance, guiding them on invoking functions correctly, thereby aiding in their understanding of programming concepts.

Check out the Reference Article, GitHub, and Model. All credit for this research goes to the researchers of this project.
