Facebook AI Releases ‘CompilerGym’: A Library of High-Performance, Easy-to-Use Reinforcement Learning Environments For Compiler Optimization Tasks

Source: https://github.com/facebookresearch/CompilerGym

Compilers are essential components of the computing stack because they convert human-written programs into executable binaries. When optimizing these programs, however, compilers rely on a large number of hand-written heuristics. This leaves a large gap between what people write and the optimal solution.

Facebook presents CompilerGym, a library of high-performance, easy-to-use reinforcement learning (RL) environments for compiler optimization tasks. CompilerGym, built on OpenAI Gym, gives ML practitioners powerful tools to improve compiler optimizations without knowing anything about compiler internals or working with low-level C++ code.

CompilerGym wraps common compiler optimization problems and exposes them as reinforcement learning problems. The first release of this library includes reinforcement learning environments for three compiler problems:

  • Phase ordering using LLVM
  • Flag tuning using GCC
  • Loop nest generation using CUDA

Using machine learning to improve compilers matters because poorly optimized programs are slow and consume a lot of computing resources and energy, which limits the use of energy-efficient edge devices and makes data centers less environmentally friendly.

Source: https://ai.facebook.com/blog/compilergym-making-compiler-optimizations-accessible-to-all

Features

This new library includes the following features:

  • API: OpenAI’s Gym interface allows users to write their agents in Python.
  • Tasks and Actions: It provides environments for the three compiler problems above, with CUDA loop nest generation exposed through loop_tool.
  • Data sets: It includes thousands of real-world programs covering a wide range of programming languages and domains for use in training and evaluating agents.
  • Representations: It provides raw representations of programs as well as various pre-computed features.
  • Rewards: It supports optimizing for runtime and code size out of the box.
  • Testing: It offers a validation process to ensure that results are reproducible.
  • Baselines: It includes many baseline algorithms.
  • Competition: Users can submit their results and view them on community leaderboards.
  • Accessibility: It also offers a set of command-line tools for interacting with the environments without writing any code, and an interactive online front end for browsing the optimization spaces.

How it works

The compiler optimization problems are exposed as Gym environments, one per problem. Each environment has a set of observation spaces, reward signals, and action spaces tailored to the compiler optimization problem at hand. The observations describe the program being compiled, and the agent is rewarded when the code quality improves. These environments can be used in the same way as any other Gym environment, and the team also demonstrates how they can be integrated with libraries such as RLlib. The original post includes the complete code needed to perform a random walk through the optimization space for LLVM phase ordering:

Source: https://ai.facebook.com/blog/compilergym-making-compiler-optimizations-accessible-to-all
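As a rough illustration of such a random walk, the sketch below is written against the generic Gym interface rather than copied from the original post. It assumes the classic four-value `step()` return used by Gym at the time. The helper names `random_walk` and `main` are hypothetical, and the environment ID, benchmark, observation space, and reward space passed to `compiler_gym.make` are taken from the project's README as best recalled, so they should be checked against the installed version.

```python
def random_walk(env, steps=100):
    """Take up to `steps` random actions in a Gym-style environment.

    Assumes the classic Gym API: step() returns
    (observation, reward, done, info). Returns the cumulative reward.
    """
    env.reset()
    total_reward = 0.0
    for _ in range(steps):
        # Pick any action uniformly from the environment's action space.
        action = env.action_space.sample()
        _observation, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            # Stop early if the episode ends.
            break
    return total_reward


def main():
    # Imported lazily so random_walk() can be used with any Gym-style env.
    # Requires the compiler_gym package to be installed.
    import compiler_gym

    # Assumed configuration (from the CompilerGym README, not this article).
    env = compiler_gym.make(
        "llvm-v0",
        benchmark="cbench-v1/qsort",
        observation_space="Autophase",
        reward_space="IrInstructionCountOz",
    )
    try:
        print("cumulative reward:", random_walk(env, steps=100))
    finally:
        env.close()
```

Calling `main()` runs the walk against the LLVM environment. Keeping `random_walk` independent of CompilerGym means the same loop works with any other Gym environment.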

At each step of the LLVM phase ordering environment outlined above, the agent must choose the next optimization pass to run from a collection of 123 different optimizations. To build an intuition for the action space, the team estimates the instruction count reward attributable to each action through random trials over a set of programs.

Source: https://ai.facebook.com/blog/compilergym-making-compiler-optimizations-accessible-to-all

In the above graph, the -reg2mem pass appears to offer the agent the least reward, since it increases the instruction count, whereas the -mem2reg and -sroa passes appear to give the agent the most reward, since they decrease the instruction count. This matches an understanding of LLVM’s internals, in which the symmetric passes -reg2mem and -mem2reg are responsible for demoting register values to memory and promoting memory accesses to registers, respectively.

-instsimplify, on the other hand, appears to reduce program size by deleting unnecessary instructions, whereas -float2int appears to consistently yield no reward. This makes sense because the pass only rewrites instructions rather than adding or eliminating them.
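The per-action estimate discussed above can be approximated with a short sketch like the one below. It is not the team's code. The function name `mean_reward_per_action` and its parameters are hypothetical, and it assumes a Gym-style environment whose `reset()` accepts a `benchmark` keyword (as CompilerGym environments do), whose `step()` returns four values, and whose actions are hashable integers.

```python
from collections import defaultdict


def mean_reward_per_action(env, benchmarks, trials_per_benchmark=10,
                           steps_per_trial=20):
    """Estimate the average reward of each action via random trials.

    For every benchmark, run several short random episodes and record the
    reward observed each time an action is taken. Returns a dict mapping
    action index -> mean observed reward.
    """
    totals = defaultdict(float)  # sum of rewards seen per action
    counts = defaultdict(int)    # number of times each action was taken

    for benchmark in benchmarks:
        for _ in range(trials_per_benchmark):
            # Assumption: reset() takes the program to compile as a keyword.
            env.reset(benchmark=benchmark)
            for _ in range(steps_per_trial):
                action = env.action_space.sample()
                _obs, reward, done, _info = env.step(action)
                totals[action] += reward
                counts[action] += 1
                if done:
                    break

    return {action: totals[action] / counts[action] for action in totals}
```

With an instruction count reward, an action that tends to shrink the program (such as -mem2reg in the discussion above) would show a positive mean, an action that grows it would show a negative mean, and a pass that only rewrites instructions would sit near zero. Note that the reward of a pass depends on which passes ran before it, so these averages are only a coarse summary.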

The team also built leaderboards showing the inference time and rewards achieved by basic search algorithms, so that researchers in the community can compare and publicize their results on CompilerGym environments.

Source: https://ai.facebook.com/blog/compilergym-making-compiler-optimizations-accessible-to-all

The team believes that machines can learn how to optimize code instead of relying entirely on human specialists. Thanks to recent advances in deep learning and reinforcement learning, they anticipate a time when machines will be significantly more efficient at tuning programs for performance and energy efficiency.

Paper: https://arxiv.org/abs/2109.08267

Code: https://github.com/facebookresearch/CompilerGym

Source: https://ai.facebook.com/blog/compilergym-making-compiler-optimizations-accessible-to-all