With the growing interest in machine learning (ML), differential privacy is gaining traction in analytics. It is a mathematically rigorous framework for quantifying the anonymization of sensitive data. With this interest in mind, Facebook AI launched Opacus.
Opacus is a new high-speed library for training PyTorch models with differential privacy. It requires minimal code modifications and has little impact on training performance, offering an easier path to adopting differential privacy in machine learning and to advancing research. Compared with existing state-of-the-art methods, Opacus has the significant advantage of being more scalable.
Other features that Opacus has to offer:
- Opacus can compute batched per-sample gradients by leveraging Autograd hooks in PyTorch, resulting in an order-of-magnitude speedup compared with existing differential privacy libraries that rely on micro-batching.
- Opacus also has something to offer on the security front: it uses a cryptographically safe pseudo-random number generator for its security-critical code, processing an entire batch of parameters on the GPU at high speed.
- Opacus is comparatively flexible to use. When it comes to prototyping ideas, PyTorch makes it quick for researchers and engineers to mix and match Opacus code with PyTorch and pure Python code.
- When it comes to productivity, Opacus offers tutorials and helper functions that warn you about incompatible layers before training starts, as well as automatic refactoring mechanisms.
- Opacus keeps track of how much of your privacy budget (a core mathematical concept in differential privacy) you have spent at any given point in time, enabling real-time monitoring and early stopping, which makes working with Opacus notably interactive.
The developers’ goal behind Opacus is to preserve the privacy of each training sample while limiting the impact on the accuracy of the final model. Opacus accomplishes this by modifying a standard PyTorch optimizer to enforce and measure differential privacy during training. In this way, the developers hope to bridge the gap between machine learning engineers and the security community with a faster, more flexible platform built on PyTorch.
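Conceptually, the optimizer modification adds two steps to a standard gradient update. The toy sketch below (NumPy only, with illustrative names and values that are not Opacus's own) shows the core mechanism: clip each example's gradient to a maximum norm, then add calibrated Gaussian noise before averaging.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_gradient_step(per_sample_grads, max_grad_norm=1.0, noise_multiplier=1.0):
    """Toy DP-SGD gradient aggregation.

    per_sample_grads: array of shape (batch_size, num_params), one
    gradient row per training example.
    """
    # 1. Clip each example's gradient so its L2 norm is at most max_grad_norm,
    #    bounding any single sample's influence on the update.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, max_grad_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale

    # 2. Sum the clipped gradients, add Gaussian noise scaled to the clipping
    #    bound, and average over the batch.
    noise = rng.normal(0.0, noise_multiplier * max_grad_norm,
                       size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_sample_grads)

grads = rng.normal(size=(8, 4))   # stand-in for per-sample gradients
noisy_grad = dp_gradient_step(grads)
```

Opacus performs these same two steps efficiently inside its optimizer wrapper, with the per-sample gradients computed in a single batched pass rather than one example at a time.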
Opacus is open source, available for public use, and licensed under Apache-2.0. To install the latest version, use pip: `pip install opacus`. The library has also been open-sourced on GitHub.