OctoML is a Washington-based startup that offers an acceleration platform for deploying machine learning models on a range of hardware. The platform primarily helps engineering teams deploy machine learning models seamlessly and with increased accuracy. It is built on the open-source Apache TVM compiler framework. In its recent Series B funding round, OctoML raised $28 million, bringing the company’s total funding to around $47 million. The funds will be used to scale up the organization's operations, build a customer success team, and expand the engineering team to develop new features for the platform.
The company was co-founded in 2019 by Jared Roesch, Jason Knight, Luis Ceze, Thierry Moreau, and Tianqi Chen. It claims to provide developers with a platform that automatically optimizes a machine learning model’s performance on cloud and edge devices alike. OctoML is currently in early access, with about 1,000 sign-ups for the platform. In a nutshell, the company claims to offer its customers the following:
- Automatic functioning: ML models can be optimized to perform better simply by using the company's web UI or API.
- Fast performance: The platform makes the ML model run quickly and efficiently, allowing it to reach the market in a fraction of the time.
- Portability: The platform is built for flexibility, so optimized models can be deployed on any hardware.
OctoML recently launched support for the Apple M1 chip, on which it reported excellent performance. The company has also partnered with industry stalwarts such as Microsoft, Qualcomm, and AMD to build open-source components and provide optimization services for a wide range of machine learning models. Interestingly, Microsoft is also a customer of OctoML's platform.
OctoML's engineering teams are also gearing up to support training models as well as optimizing them. This could help attract new customers, since training machine learning models can be costly and demands considerable effort from users. The company aims to become an end-to-end solution for its customers.
The platform works in five simple steps:
- The model topology is uploaded in any format, be it TensorFlow, PyTorch, or ONNX.
- The optimization, benchmarking, and packaging of the model are done across various hardware platforms and application language runtimes.
- The model’s performance is then compared across various CPU and GPU instance types, so that device sizing requirements can be evaluated.
- A packaging format is chosen from options that include a Python wheel and a C API.
- The binary is prepared and provided to the customer, who can then deploy the model with it on whatever hardware is available.
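
The comparison step can be pictured with a small sketch. The code below is not OctoML's API; the class name, the function name, the target names, and every number are hypothetical and exist only to illustrate how benchmark results for several instance types might be turned into a sizing decision, here by picking the cheapest target that still meets a latency budget.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark row. All fields are illustrative, not real OctoML output."""
    target: str             # hypothetical instance type name
    latency_ms: float       # made-up mean inference latency
    hourly_cost_usd: float  # made-up instance price


def cheapest_target_within(
    results: Iterable[BenchmarkResult], max_latency_ms: float
) -> Optional[BenchmarkResult]:
    """Return the lowest-cost result that meets the latency budget, or None."""
    eligible = [r for r in results if r.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    # Break cost ties in favour of the faster target.
    return min(eligible, key=lambda r: (r.hourly_cost_usd, r.latency_ms))


# Hypothetical benchmark table for one optimized model.
results = [
    BenchmarkResult("cpu-small", 42.0, 0.10),
    BenchmarkResult("cpu-large", 18.0, 0.40),
    BenchmarkResult("gpu-small", 6.0, 0.90),
]

choice = cheapest_target_within(results, max_latency_ms=20.0)
print(choice.target if choice else "no target meets the budget")
```

With these invented figures, a 20 ms budget rules out the smallest CPU instance and selects the larger CPU instance over the GPU because it is cheaper. A real sizing decision would likely weigh further factors such as throughput and memory.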
The company's product, the Octomizer, is a SaaS offering that simplifies deployment by optimizing, benchmarking, and packaging models. It can also be used as a sizing tool to inform the next hardware purchase. For now, the platform has no auto-deployment feature: the Octomizer creates an artifact that customers must deploy themselves, though OctoML could add deployment support in the future to improve its service.