Stability AI Introduces Stable Code: A General Purpose Base Code Language Model

Machine learning has found important applications in programming languages, from code understanding to code representation and completion. Earlier work focused on exploiting the deep semantic structure of programs, as in Code2Vec, Code2Seq, and graph representation learning for code. These architectures are tailor-made for native program structures such as Abstract Syntax Trees (ASTs) and Data Flow Graphs (DFGs), which imposes a significant limitation: they can only be applied to code that parses into a complete, well-formed structure.

Later research showed that transformer-based models can treat code like natural language, operating at the lexical (text) level. Since then, language models have been widely used to model code across a variety of tasks. In code completion, such models may be invoked every few seconds, so strong models that run on consumer devices are preferable: they avoid network latency and reduce dependence on gated APIs.

The researchers from Stability AI introduced Stable Code, a general-purpose base code language model targeting code completion, reasoning, math, and other software-engineering tasks. They also introduced an instruction-tuned variant, Stable Code Instruct, which supports a natural chat interface for question-answering and instruction-following tasks.

Stable Code is built on top of Stable LM, a state-of-the-art English natural-language LLM at the 3-billion-parameter scale. The model is a causal decoder-only transformer similar in design to the LLaMA architecture. The main differences from LLaMA are:

  • Position Embeddings: Rotary Position Embeddings are applied to the first 25% of head embedding dimensions for improved throughput.
  • Normalization: LayerNorm with learned bias terms as opposed to RMSNorm.
  • Biases: All bias terms were removed from the feed-forward networks and multi-head self-attention layers, except for the biases of the key, query, and value projections.
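To make the first bullet concrete, below is a minimal numpy sketch of *partial* rotary position embeddings, i.e., RoPE applied only to the first 25% of each head's embedding dimensions while the remaining dimensions pass through untouched. The function name, shapes, and the interleaved-pair layout are illustrative assumptions, not Stability AI's implementation (which operates on batched multi-head tensors).

```python
import numpy as np

def partial_rotary(x: np.ndarray, rotary_pct: float = 0.25,
                   base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to the first `rotary_pct` of the
    head dimension; the remaining dimensions are left unchanged.

    x: (seq_len, head_dim) array. Illustrative sketch only; real
    implementations work on (batch, heads, seq, head_dim) tensors.
    """
    seq_len, head_dim = x.shape
    rot_dim = int(head_dim * rotary_pct)            # e.g. 25% of head dims
    x_rot, x_pass = x[:, :rot_dim], x[:, rot_dim:]

    # Standard RoPE inverse frequencies over the rotated dimensions
    inv_freq = 1.0 / (base ** (np.arange(0, rot_dim, 2) / rot_dim))
    pos = np.arange(seq_len)[:, None]               # (seq_len, 1)
    angles = pos * inv_freq[None, :]                # (seq_len, rot_dim // 2)
    cos, sin = np.cos(angles), np.sin(angles)

    # Rotate interleaved (even, odd) pairs by the position-dependent angle
    x1, x2 = x_rot[:, 0::2], x_rot[:, 1::2]
    rotated = np.empty_like(x_rot)
    rotated[:, 0::2] = x1 * cos - x2 * sin
    rotated[:, 1::2] = x1 * sin + x2 * cos

    return np.concatenate([rotated, x_pass], axis=-1)
```

Rotating only a fraction of the dimensions is cheaper than full RoPE (fewer trigonometric multiplies per token), which is the throughput motivation cited above; position 0 is always unchanged since its rotation angle is zero.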

Despite being relatively small, Stable Code matches the average performance of Llama and StarCoder across programming languages. Stable Code 3B also achieves strong results at the 3B scale, showing remarkable capabilities on code completion tasks. The researchers additionally evaluated the instruction-tuned models on the code subset of the challenging multi-turn benchmark.

In conclusion, the researchers from Stability AI introduced Stable Code and Stable Code Instruct to address different software-development use cases. Both are compact decoder-only language models. The researchers conducted extensive evaluations and comparisons with other similarly sized models, demonstrating Stable Code and Stable Code Instruct's remarkable performance. They also provide an analysis of the models on typical edge-computing architectures.

Check out the Paper and Blog. All credit for this research goes to the researchers of this project.