AutoCodeRover: An Automated AI Approach for Solving GitHub Issues to Autonomously Achieve Program Improvement

Large Language Models (LLMs) have advanced to the point where developers can use LLM-based programming assistants for automated coding tasks, changing how software is developed. Writing code, however, is only one aspect of software engineering; another is ongoing program improvement, which covers feature additions, issue fixes, and software evolution more broadly.

In recent research, a team from the National University of Singapore has presented an automated method for resolving GitHub issues, improving programs by adding new features and fixing bugs. The approach, known as AutoCodeRover, combines advanced code search capabilities with LLMs to produce program patches or updates.

Rather than viewing a software project as merely a collection of files, the team has concentrated on program representation, in particular abstract syntax trees (ASTs). Through iterative search operations, their code search methodology retrieves context by leveraging the program’s structure, including classes and methods, to improve the LLM’s understanding of the issue’s root cause.
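
To make the idea of structure-aware code search concrete, here is a minimal sketch built on Python's standard `ast` module. It is an illustration only, not AutoCodeRover's actual implementation: the helper names `search_class` and `search_method_in_class`, the `CodeHit` record, and the sample `Cart` code are all hypothetical.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class CodeHit:
    """One search result: what was found, where, and its source text."""
    kind: str      # "class" or "method"
    name: str      # qualified name of the definition
    lineno: int    # 1-based line where the definition starts
    source: str    # exact source text of the definition


def search_class(tree: ast.Module, source: str, class_name: str) -> List[CodeHit]:
    """Return every class definition in the tree named `class_name`."""
    return [
        CodeHit("class", node.name, node.lineno,
                ast.get_source_segment(source, node) or "")
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef) and node.name == class_name
    ]


def search_method_in_class(tree: ast.Module, source: str,
                           class_name: str, method_name: str) -> List[CodeHit]:
    """Return methods named `method_name` defined directly in `class_name`."""
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            for item in node.body:
                is_function = isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
                if is_function and item.name == method_name:
                    hits.append(CodeHit(
                        "method", f"{class_name}.{method_name}", item.lineno,
                        ast.get_source_segment(source, item) or ""))
    return hits


# Hypothetical project code used only to exercise the search helpers.
SAMPLE = '''\
class Cart:
    def __init__(self):
        self.items = []

    def total(self):
        return sum(price for _, price in self.items)


def total(values):
    return sum(values)
'''

tree = ast.parse(SAMPLE)
class_hits = search_class(tree, SAMPLE, "Cart")
method_hits = search_method_in_class(tree, SAMPLE, "Cart", "total")

for hit in method_hits:
    print(f"{hit.kind} {hit.name} at line {hit.lineno}")
```

Because the search walks the syntax tree instead of matching text, the module-level `total` function is not confused with `Cart.total`. An iterative search in this spirit would let the LLM narrow down from a class to a specific method and pull in only that snippet as context.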

The work is evaluated on SWE-bench-lite, a recent benchmark made up of 300 real GitHub issues pertaining to feature additions and bug fixes. Tests on SWE-bench-lite show that this method is more effective at solving GitHub issues than previous attempts by the AI community, resolving over 20% of the issues. The approach fixed 67 GitHub issues in less than ten minutes each on average; by comparison, developers took an average of 2.77 days to resolve an issue.
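
The figures above can be cross-checked with a few lines of arithmetic. This sketch assumes that the 67 fixed issues are counted against the 300 issues in the benchmark, and it uses ten minutes as an upper bound on the automated time per issue, so the resulting ratio is a rough lower bound and not a reported result.

```python
TOTAL_ISSUES = 300        # size of SWE-bench-lite, as stated in the article
RESOLVED_ISSUES = 67      # issues fixed by the approach
MANUAL_DAYS = 2.77        # average developer time to resolve one issue
AUTOMATED_MINUTES = 10    # upper bound on average automated time per issue

# Share of benchmark issues resolved (assumes 67 is out of the 300).
resolution_rate = RESOLVED_ISSUES / TOTAL_ISSUES

# Convert the manual average to minutes so both times share a unit.
manual_minutes = MANUAL_DAYS * 24 * 60

# Ratio of manual time to the ten-minute bound, for resolved issues only.
time_ratio = manual_minutes / AUTOMATED_MINUTES

print(f"Resolution rate: {resolution_rate:.1%}")
print(f"Manual average: {manual_minutes:.1f} minutes")
print(f"Manual time / automated bound: {time_ratio:.0f}x")
```

The resolution rate works out to roughly 22%, in line with the "over 20%" figure. The time comparison applies only to the issues that were actually fixed.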

The team has summarized their primary contributions as follows.

  1. The team has emphasized working with program representations, particularly abstract syntax trees. They consider this strategy essential for autonomous software engineering processes, and it highlights the value of exploring the structural properties of code in greater detail.
  2. The study focuses on approaches to code search that imitate how software developers think. Using program structures such as classes, methods, and code snippets makes finding pertinent code context more like human reasoning and helps LLMs use that context more efficiently.
  3. The team has stressed prioritizing the effectiveness of automated repair over time efficiency, as long as realistic time limits are met. They imposed a 10-minute time limit on automated repair and found that it fixed 22% of the GitHub issues on SWE-bench-lite. This is far faster than the 2.77-day average for manual resolution.
  4. When addressing GitHub issues, code search is guided by debugging and analysis techniques, specifically test-based fault localization. With this integration, efficacy increased notably: in a single AutoCodeRover run on SWE-bench-lite, it rose from 16% to 20%.
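
As an illustration of how test-based fault localization can point code search at likely culprits, here is a minimal sketch of spectrum-based fault localization using the Ochiai formula, a common choice for this task. The article does not say which formula AutoCodeRover uses, so treat Ochiai as an assumption; the function name `ochiai_scores`, the test names, and the coverage data are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def ochiai_scores(coverage: Dict[str, Set[str]],
                  failing: Set[str]) -> List[Tuple[str, float]]:
    """Rank program elements by Ochiai suspiciousness, highest first.

    coverage maps each test name to the set of methods it executes;
    failing is the set of test names that failed.
    """
    total_failed = len(failing)
    elements = set().union(*coverage.values())
    scores = []
    for element in elements:
        # Failing and passing tests that execute this element.
        failed_cov = sum(1 for test, cov in coverage.items()
                         if element in cov and test in failing)
        passed_cov = sum(1 for test, cov in coverage.items()
                         if element in cov and test not in failing)
        # Ochiai: ef / sqrt(total_failed * (ef + ep)); 0 when undefined.
        denom = math.sqrt(total_failed * (failed_cov + passed_cov))
        scores.append((element, failed_cov / denom if denom else 0.0))
    # Sort by descending score, then by name for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical coverage data: which methods each test executes.
coverage = {
    "test_total": {"Cart.total", "Cart.add"},
    "test_add": {"Cart.add"},
    "test_empty": {"Cart.total"},
}
failing = {"test_total", "test_empty"}

ranking = ochiai_scores(coverage, failing)
for method, score in ranking:
    print(f"{method}: {score:.2f}")
```

Methods executed mostly by failing tests rise to the top of the ranking, which gives a code search a natural starting point before the LLM reads any source.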

In conclusion, this approach opens the door to autonomous software engineering, anticipating a time when code auto-generated by LLMs can itself be automatically improved. By automating program improvement tasks such as adding new features and fixing bugs, AutoCodeRover can increase overall productivity and streamline the software development process.

Check out the Paper. All credit for this research goes to the researchers of this project.

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.
