As software proliferates and its uses multiply, software developers are in huge demand. One response is to automate some of the routine tasks that consume developers' time. To that end, productivity tools (editors) such as Eclipse and Visual Studio suggest relevant code snippets that developers can use as they work. Behind such tools, sophisticated language models have learned to read and write computer code after training on thousands of examples. But these models, too, have baked-in vulnerabilities: if a developer isn't careful, an attacker can subtly manipulate the inputs to these models and steer them toward almost any prediction.
In a new paper, researchers at the MIT-IBM Watson AI Lab address this issue by unveiling an automated method for finding weaknesses in code-processing models and retraining them to be more resistant to attacks. MIT researcher Una-May O’Reilly and IBM-affiliated researcher Sijia Liu have made significant contributions to using AI to make automated programming tools smarter and more secure. The results will be presented at the International Conference on Learning Representations.
What do the code-processing models do?
- Like language models that learn to write news stories or poetry, code-processing models learn to generate programs.
- They offer assistance by predicting what software developers will do next.
- They suggest programs that fit the task at hand or produce program summaries to document how the software works.
- They can also be trained to find and fix bugs.
Srikant, an MIT graduate student, and his colleagues discovered that deceiving these models is simple: just renaming a variable, inserting a fake print statement, or making other cosmetic edits to the programs a model is processing can lead it to incorrect outcomes. A deeper flaw of these code-processing models is that although they are experts on the statistical relationships among words and phrases, they only vaguely grasp their true meaning. OpenAI’s GPT-3 language model, for example, can write prose that ranges from eloquent to incomprehensible, but only a human reader can tell the difference.
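To make this concrete, here is a minimal sketch of the kind of semantics-preserving edits described above (the function names and the inserted print statement are invented for illustration, not taken from the paper): both versions compute exactly the same result, yet the cosmetic differences can flip a code-processing model's prediction.

```python
def sum_list(numbers):
    """Original program: sums a list of numbers."""
    total = 0
    for n in numbers:
        total += n
    return total

def sum_list_perturbed(xs):
    """Adversarially edited copy: the parameter is renamed and a dead
    print statement is inserted, yet the behavior is unchanged."""
    total = 0
    if False:  # dead code: never executes, but a model still "reads" it
        print("innocuous-looking token")
    for n in xs:
        total += n
    return total

# The two programs are functionally identical.
assert sum_list([1, 2, 3]) == sum_list_perturbed([1, 2, 3])
```

A human reviewer would wave both versions through; the point is that a model keying on surface tokens rather than meaning may not.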
In their paper, the researchers propose a method for automatically perturbing programs to expose flaws in the models that process them. It solves a two-part optimization problem: an algorithm identifies the sites in a program where adding or replacing text makes the model most prone to mistakes, and it also determines which types of edits pose the biggest threat. The framework demonstrates how fragile some models are: their text-summarization model failed a third of the time when a single edit was made to a program, and more than half of the time when five edits were made, the researchers say. On the other hand, they demonstrate that the model can learn from its errors and, as a result, achieve a better understanding of programming.
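The two-part search can be sketched as a simple greedy loop. Everything below is a stand-in for illustration, not the authors' actual algorithm: the toy `model_loss` scorer, the identifier names, and the candidate vocabulary are all invented, and a real attack would query the trained code-processing model instead of this placeholder.

```python
# A sketch of the two-part optimization described above: pick WHICH
# identifier in the program to rename (part 1: where to edit) and WHICH
# replacement token to use (part 2: what to insert), keeping the edit
# that hurts the model the most.

def model_loss(tokens):
    # Placeholder objective for illustration only; a real implementation
    # would measure the victim model's error on these tokens.
    return sum(len(t) for t in tokens if t not in {"def", "return"})

def rename(tokens, old, new):
    """Apply a semantics-preserving rename to every occurrence of `old`."""
    return [new if t == old else t for t in tokens]

def greedy_attack(tokens, renameable, vocab, budget=1):
    """Greedily apply up to `budget` renames that most increase the loss."""
    tokens = list(tokens)
    for _ in range(budget):
        best_loss, best_name, best_tok = model_loss(tokens), None, None
        for name in renameable:            # part 1: choose the edit site
            for tok in vocab:              # part 2: choose the new token
                loss = model_loss(rename(tokens, name, tok))
                if loss > best_loss:
                    best_loss, best_name, best_tok = loss, name, tok
        if best_name is None:
            break                          # no edit increases the loss
        tokens = rename(tokens, best_name, best_tok)
        renameable = [n for n in renameable if n != best_name]
    return tokens

program = ["def", "f", "(", "x", ")", ":", "return", "x"]
attacked = greedy_attack(program, renameable=["f", "x"], vocab=["zz", "longname"])
# With this toy scorer, the search renames the variable `x` to `longname`.
```

Renaming every occurrence of an identifier together keeps the program's behavior intact, which is what makes such edits "cosmetic" from a compiler's point of view while still confusing the model.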
The framework can help code-processing models better grasp a program’s intent. However, the question of what these black-box deep-learning models are actually learning remains open for future work.