As demand for energy-efficient, sustainable, and smart technology rapidly increases, IBM has developed what it describes as the world’s first energy-efficient chip for AI inference and training, built with 7-nanometer technology. In a paper presented at the 2021 International Solid-State Circuits Conference (ISSCC), a team of researchers proposed a hardware accelerator that supports a range of model types and achieves leading power efficiency across all of them.
AI accelerators are specialized hardware designed to speed up AI applications, especially neural networks, deep learning, and machine learning. They are typically multicore and rely heavily on low-precision arithmetic or in-memory computing, techniques that boost the performance of large AI workloads.
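To make the low-precision idea concrete, here is a minimal sketch (not IBM's implementation) of symmetric linear quantization, the common trick of mapping 32-bit float weights onto 8-bit integers so an accelerator can do cheaper integer arithmetic. The function names are illustrative, not from any particular library:

```python
def quantize_int8(weights):
    """Symmetric linear quantization of float weights to int8.
    Returns the int8 values plus the scale needed to dequantize."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0                      # map [-max_abs, max_abs] onto [-127, 127]
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.03, 0.9]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

Each weight is stored in a quarter of the memory, at the cost of a small rounding error bounded by half the quantization step; this is the trade-off behind the reduced silicon area and memory traffic mentioned below.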
According to IBM, the four-core chip, currently in the research stage, is optimized for low-precision workloads across a variety of AI and machine learning models. Low-precision computation requires less silicon area, improves cache usage, and reduces memory bottlenecks, cutting both the time and the energy cost of training AI models. The chip is among the very few to incorporate ultra-low-precision ‘hybrid FP8’ formats, and it is one of the first to feature power management that maximizes overall performance by slowing down computation phases with high power consumption.
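IBM’s earlier hybrid-FP8 research pairs two different 8-bit floating-point layouts: a format with more mantissa bits (1 sign, 4 exponent, 3 mantissa) for forward-pass values, and one with more exponent bits and hence more dynamic range (1-5-2) for backward-pass gradients. The sketch below rounds a value into such a format; it is a simplified illustration that ignores subnormals and special values, not the chip’s actual datapath:

```python
import math

def quantize_float(x, exp_bits, man_bits):
    """Round x to the nearest value in a sign/exponent/mantissa float format.
    Simplified model: no subnormals, infinities, or NaNs."""
    if x == 0.0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    sign = -1.0 if x < 0 else 1.0
    e = math.floor(math.log2(abs(x)))
    e = max(min(e, bias), 1 - bias)    # clamp to the normal exponent range
    step = 2.0 ** (e - man_bits)       # spacing between representable values
    return sign * round(abs(x) / step) * step

# Forward pass favors precision; backward gradients favor dynamic range.
fwd = quantize_float(0.1, exp_bits=4, man_bits=3)
bwd = quantize_float(0.1, exp_bits=5, man_bits=2)
```

Splitting the formats this way lets gradients, which can span many orders of magnitude during training, keep enough range, while activations and weights keep enough precision, all within 8 bits.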
IBM claims the new chip achieved more than 80% utilization for training and more than 60% for inference, far better than other dedicated training and inference chips. Such high sustained utilization translates into better real-world performance. In the coming years, IBM aims to apply the novel chip design to commercial applications, including large-scale training in the cloud, security, privacy, and autonomous vehicles. According to the researchers, the chip can be used for cloud training of large deep learning models in vision, speech, and natural language processing using 8-bit formats. Beyond training, it can serve inference applications such as speech-to-text AI services, financial transaction fraud detection, and natural language processing services.