New AI Research from Microsoft Presents an Experimental Study on the Use of OpenAI’s ChatGPT for Robotics Applications

Tremendous progress in NLP has made possible large language models (LLMs) such as BERT, GPT-3, and Codex, which have transformed several fields. These models perform exceptionally well in applications including text generation, machine translation, and code synthesis. OpenAI’s ChatGPT, a generative text model that is pretrained and then fine-tuned with human feedback, is a recent addition to this set of models. Unlike earlier models, which mostly operate on a single prompt, ChatGPT offers strong interaction abilities through dialogue, combining text generation with code synthesis.

ChatGPT for Robotics

Unlike text-only applications, robotics systems must understand real-world physics, interpret their surroundings, and take physical action. To interact with users and to interpret and execute commands in ways that are physically possible and make sense in the real world, a generative robotics model needs a high level of common-sense knowledge and a sophisticated world model. These challenges go beyond the original scope of language models: the model must not only understand the meaning of a given text but also turn the intent behind it into a plan of physical actions.

ChatGPT can adapt to various robot form factors, engage in closed-loop reasoning via dialogue, and solve a wide range of robotics problems in a zero-shot fashion. Because robotics is a well-established field, many black-box and open-source libraries already provide basic functionality in the perception and action domains (e.g., object detection and segmentation, mapping, motion planning, controls, and grasping). Given the right prompt, the LLM can use these predefined routines for robot reasoning and execution. The name of each application programming interface (API) function must accurately reflect its overall purpose and behavior. Clear names are essential for the LLM to reason about the functional relationships between APIs and produce the expected result.
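As a rough illustration of why clear API names matter, the sketch below builds a prompt from a small function library by listing each function's name, signature, and docstring. The functions `get_object_position` and `fly_to`, the helper names, and the prompt wording are hypothetical and are not taken from the Microsoft study; the robot functions are stubs that are never executed.

```python
import inspect

# Hypothetical high-level robot functions. Names and signatures are
# illustrative assumptions, not the API used in the Microsoft study.
def get_object_position(object_name: str) -> tuple:
    """Return the (x, y, z) position of the named object in meters."""
    raise NotImplementedError("stub for illustration only")

def fly_to(x: float, y: float, z: float) -> None:
    """Fly the drone to the given (x, y, z) position in meters."""
    raise NotImplementedError("stub for illustration only")

def describe_api(functions) -> str:
    """Summarize each function as 'name(signature): docstring', one per line."""
    lines = []
    for fn in functions:
        signature = inspect.signature(fn)
        doc = inspect.getdoc(fn) or ""
        lines.append(f"{fn.__name__}{signature}: {doc}")
    return "\n".join(lines)

def build_prompt(task: str, functions) -> str:
    """Combine the API summary and the task description into one prompt."""
    return (
        "You may only use the following functions:\n"
        f"{describe_api(functions)}\n\n"
        f"Task: {task}\n"
        "Respond with Python code."
    )

prompt = build_prompt(
    "Fly to the shelf and hover in front of it.",
    [get_object_position, fly_to],
)
print(prompt)
```

With descriptive names and docstrings, the prompt already tells the model what each call does; a name like `f1` or `move2` would leave the model guessing.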

Use of ChatGPT for robotics applications, as presented by Microsoft

Researchers in Microsoft’s Autonomous Systems and Robotics Group demonstrated the viability of OpenAI’s ChatGPT for robotics applications, showing how to design prompts and instruct ChatGPT to use specific robotics libraries to program a task. According to the Microsoft researchers, robotics today depends on a closed loop in which an engineer codes the task, monitors the robot’s behavior, and adjusts the robot’s program accordingly.

In Microsoft’s vision, ChatGPT could be used to convert a natural-language description of a task into code the robot can run. This would allow a non-technical user (on the loop) to take the place of the engineer (in the loop). The user’s only responsibilities would be to provide the original task description in natural language, observe the robot, and give feedback on its behavior, also in natural language, which ChatGPT would then turn into code to improve the behavior.

Using an experimental approach, Microsoft’s researchers developed a variety of use cases, such as zero-shot task planning to guide a drone as it inspects the contents of a shelf, robotic arm manipulation, and API-based object detection and distance search.

Microsoft’s method for using ChatGPT in robotics

To make ChatGPT practical for robotics applications, Microsoft has concentrated on three primary areas: the design of the prompts used to direct ChatGPT, the use of existing APIs, and human feedback provided via text. These three components form the backbone of a strategy for employing ChatGPT in robotics.

  1. The user defines a set of high-level application programming interfaces (APIs) or function libraries that ChatGPT should use.
  2. The user describes the goal of the task in terms of the available APIs or functions.
  3. Finally, the user evaluates ChatGPT’s code, either by inspecting it directly or by running it in a simulator, and gives ChatGPT feedback.

If the user is pleased with the results, the resulting code may be used to instruct a robot.
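The three steps can be sketched as a simple loop. This is a minimal sketch under stated assumptions, not Microsoft’s implementation: `refine_with_feedback`, `fake_model`, and `fake_user` are hypothetical names, and the model and the user are replaced by small stand-in functions so that the example runs without any external service.

```python
from typing import Callable, List, Optional

def refine_with_feedback(
    generate_code: Callable[[List[str]], str],
    get_feedback: Callable[[str], Optional[str]],
    api_description: str,
    task: str,
    max_rounds: int = 5,
) -> str:
    """Describe the API, describe the task, then iterate on user feedback."""
    conversation = [
        f"Available functions:\n{api_description}",  # step 1: the API library
        f"Task: {task}",                             # step 2: the task goal
    ]
    code = generate_code(conversation)
    for _ in range(max_rounds):
        # Step 3: the user inspects the code (or a simulation of it).
        feedback = get_feedback(code)
        if feedback is None:  # None means the user is satisfied
            break
        conversation.append(f"Previous code:\n{code}")
        conversation.append(f"Feedback: {feedback}")
        code = generate_code(conversation)
    return code

# Stand-in for the language model (assumption: a real system would call
# a chat model here). It raises the altitude once asked to fly higher.
def fake_model(conversation: List[str]) -> str:
    altitude = 2.0 if any("higher" in message for message in conversation) else 1.0
    return f"fly_to(0.0, 0.0, {altitude})"

# Stand-in for the human on the loop: accepts only a 2-meter altitude.
def fake_user(code: str) -> Optional[str]:
    return None if "2.0" in code else "Fly higher, to 2 meters."

final_code = refine_with_feedback(
    fake_model, fake_user, "fly_to(x, y, z)", "Hover above the origin."
)
print(final_code)
```

Capping the loop with `max_rounds` is a design choice of this sketch; it keeps the exchange bounded if the feedback never converges.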

All of the prompts and conversations that the Microsoft team used in the study are available on a new collaborative open-source platform that Microsoft is releasing to the public. The team also plans to add robotics simulators and interfaces to the platform so that ChatGPT-generated algorithms can be tested.


Check out the Paper. All credit for this research goes to the researchers on this project.

Dhanshree Shenwai is a Computer Science Engineer with experience at FinTech companies in the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier in today’s evolving world.
