Zama Open-Sources Concrete ML v0.2, Letting Data Scientists With No Prior Cryptography Knowledge Automatically Turn Classical Machine Learning (ML) Models Into Their FHE Equivalents

This Article Is Based On The Zama Research Article 'Announcing Concrete ML v0.2'. All Credit For This Research Goes To The Researchers On This Project 👏👏👏

Zama is a Paris-based startup that aims to bring end-to-end encryption to AI by letting developers use Python to build models that run on encrypted data. In late April, the Zama team published the public alpha of Concrete ML, a package built on top of Concrete Numpy. The release gives data scientists with no prior knowledge of cryptography simple APIs for automatically converting traditional machine learning (ML) models into their fully homomorphic encryption (FHE) equivalents. One of the main goals of this version is to make Concrete ML as convenient as possible for users of popular machine learning frameworks. Concrete ML does not reimplement training for linear models and trees, so researchers can keep using the many variations and features of these models that the scikit-learn package supports.

The team ran a series of experiments comparing scikit-learn models with their Concrete ML counterparts. On simple 2D problems, linear models running in FHE performed comparably to their unencrypted scikit-learn equivalents. In the current release, however, the performance of strongly quantized linear classifiers falls quickly as the number of dimensions increases; the team plans to address this in future releases. Tree-based classifiers, by contrast, show outstanding accuracy on encrypted data with Concrete ML. Tree models rely heavily on comparisons, and Zama's approach to FHE handles these well through Programmable Bootstrapping. As a result, tree-based models perform as well in FHE as their scikit-learn/xgboost counterparts. This holds even for datasets with many dimensions, which matters because tree-based models are typically the most performant choice for tabular data. Data scientists will be able to deploy Decision Trees, Random Forests, and Gradient Boosted Trees once the team publishes dedicated deployment APIs.
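
To make the link between tree comparisons and Programmable Bootstrapping more concrete, here is a minimal plaintext sketch. Programmable bootstrapping lets a lookup table be applied to an encrypted integer, so a threshold test on a quantized feature can be expressed as a table with one entry per possible input. The code below only simulates that idea on unencrypted integers. The helper names (`quantize`, `comparison_table`, `tree_predict`), the thresholds, and the 3-bit precision are assumptions chosen for illustration; none of this is Concrete ML's API or its actual implementation.

```python
# Plaintext illustration only: nothing here is encrypted, and none of these
# helpers belong to Concrete ML. All names and values are hypothetical.

def quantize(value, lo, hi, n_bits):
    """Map a float in [lo, hi] to an integer in [0, 2**n_bits - 1]."""
    levels = (1 << n_bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def comparison_table(threshold_q, n_bits):
    """Precompute the bit (x <= threshold_q) for every possible n-bit input."""
    return [1 if x <= threshold_q else 0 for x in range(1 << n_bits)]


def tree_predict(features_q, tables, leaf_values):
    """Evaluate a depth-2 tree using table lookups and arithmetic only."""
    root = tables["root"][features_q[0]]
    left = tables["left"][features_q[1]]
    right = tables["right"][features_q[1]]
    # Exactly one indicator equals 1, so no data-dependent branching is needed.
    indicators = [
        root * left,
        root * (1 - left),
        (1 - root) * right,
        (1 - root) * (1 - right),
    ]
    return sum(i * v for i, v in zip(indicators, leaf_values))


N_BITS = 3
TABLES = {
    "root": comparison_table(quantize(0.6, 0.0, 1.0, N_BITS), N_BITS),
    "left": comparison_table(quantize(0.3, 0.0, 1.0, N_BITS), N_BITS),
    "right": comparison_table(quantize(0.7, 0.0, 1.0, N_BITS), N_BITS),
}
LEAF_VALUES = [0, 1, 2, 3]  # class label stored at each of the four leaves

sample = [quantize(0.2, 0.0, 1.0, N_BITS), quantize(0.9, 0.0, 1.0, N_BITS)]
print("predicted class:", tree_predict(sample, TABLES, LEAF_VALUES))
```

The tree is evaluated without any `if` on the input data: every node is computed and the leaf is selected arithmetically, which mirrors the constraint that an encrypted value cannot drive control flow. Each table has only `2**n_bits` entries, which is one way to see why low-precision comparisons are a natural fit for this style of computation.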

Neural networks are also available in Concrete ML. Their low performance in FHE stems from the time-consuming computation these classifiers require, especially as the number of layers increases. Future versions are expected to improve on this, notably through quantization-aware training and higher precision. Deep learning has not been overlooked: the researchers have worked to support generic torch models supplied by users. The performance of networks quantized after training declines rapidly as network complexity increases, with just 2-3 neurons currently supported, but for now the team has prioritized feature completeness. Larger networks that perform well under FHE constraints are also in development.
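
The effect of post-training quantization described above can be illustrated with a small plaintext experiment. The sketch below quantizes the weights and inputs of a single dot product (the core operation of a dense layer) to a given bit width, computes the result on integers, and measures the error against the floating-point result. The scheme shown, symmetric uniform quantization, and all function names are assumptions made for this example; they are not taken from Concrete ML, whose quantization may differ.

```python
import random

# Plaintext illustration only. Symmetric uniform quantization is an assumed
# scheme for this sketch, not necessarily the one Concrete ML uses.

def quantize_uniform(values, n_bits):
    """Map floats to signed integers in [-q_max, q_max] plus a float scale."""
    if n_bits < 2:
        raise ValueError("n_bits must be at least 2 for signed quantization")
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    q_max = (1 << (n_bits - 1)) - 1
    scale = max_abs / q_max
    return [round(v / scale) for v in values], scale


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def quantized_dot(weights, inputs, n_bits):
    """Dot product computed on integers, rescaled to a float at the end."""
    q_w, s_w = quantize_uniform(weights, n_bits)
    q_x, s_x = quantize_uniform(inputs, n_bits)
    return dot(q_w, q_x) * s_w * s_x


def mean_abs_error(n_bits, dim=16, trials=200, seed=0):
    """Average |float result - quantized result| over random vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        w = [rng.uniform(-1, 1) for _ in range(dim)]
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        total += abs(dot(w, x) - quantized_dot(w, x, n_bits))
    return total / trials


for bits in (2, 3, 4, 8):
    print(f"{bits}-bit mean abs error: {mean_abs_error(bits):.4f}")
```

In a multi-layer network, each layer's rounding error feeds into the next, which is consistent with the article's point that strongly quantized networks degrade as they grow. Quantization-aware training, mentioned above as a planned improvement, addresses this by letting the model adapt to the rounding during training rather than after it.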

The release of Concrete ML is a promising step for private ML computation. Thanks to Zama's Programmable Bootstrapping, tree-based classifiers are highly performant and especially well suited to FHE. The Zama team is now working to bring linear models and neural networks up to the same standard as tree-based classifiers. Concrete ML's open-source code and documentation are available at the links below.

Source: https://www.zama.ai/post/announcing-concrete-ml-v0-2

Github: https://github.com/zama-ai/concrete-ml