Please don't forget to join our ML Subreddit Need help in creating ML Research content for your lab/startup? Talk to us [email protected]
Deep learning has emerged as our generation’s most critical computing job. Tasks that were once the unique realm of humans are now regularly executed at human or superhuman levels by computers.
Bigger isn’t necessarily better for the neural networks that underpin today’s artificial intelligence. Recent advances in machine understanding of language, for example, have relied on the creation of some of the world’s most significant AI models and the cramming of massive amounts of text into them. A new cluster of computer chips might allow these networks to expand to nearly inconceivable sizes, demonstrating if going much bigger could unleash even more AI gains, not just in language comprehension but also in robotics and computer vision.
The first chip was 10,000 times quicker than a GPU in a CFD task, according to a Department of Energy National Energy Technology Lab test. Still, the limited memory footprint meant it had significant limits, especially when AI models couldn’t fit on the device.
Now Cerebras Systems has enhanced support for the popular open-source PyTorch and TensorFlow machine-learning frameworks on the Wafer-Scale Engine 2 processors that power its CS-2 system, which is good news for those who want their AI chips large.
One of the most popular machine learning frameworks is PyTorch. Developers utilize it to speed up the research prototype to production deployment transition. As model sizes become more significant and transformer models become more popular, machine learning practitioners must access computing solutions like the Cerebras CS-2 that are quick to set up and utilize. The developer community now has a solid tool to allow new advancements in AI with the CS-2 running CSoft.
According to the chip designer, enhanced support is a significant milestone. It will simplify running standard AI/ML models on Cerebras’ machines, allowing the six-year-old business to compete with AI systems and processor makers that support various languages and models.
The enhanced framework support is now baked into the Cerebras CSoft software stack, allowing ML researchers to develop their models in TensorFlow or PyTorch and execute them on the CS-2 without any changes. Cerebras’ better integrations, which would enable models developed for GPUs or CPUs to operate on the CS-2 without any adjustments, are also critical.
Cerebras argues that this is significant since their WSE-2 technology is quicker and better prepared than GPUs, including Nvidia’s two-year-old flagship A100 chip, handling models of varied sizes.
Because of the WSE-2’s vast number of cores, the large quantity of memory, and high memory bandwidth, Cerebras’ hardware avoids difficulties observed by traditional processors in PyTorch models. This eliminates the need for small and medium-sized models to be divided across numerous processors, impeding data transfer.
Large PyTorch models that don’t fit on the WSE-2 chip may still be run on the wafer-sized processor by storing the model’s activations on the device and shifting parameters called weights to and from the chip layer by layer.
This sounds great, but Cerebras still has a long way to go before it can compete with Nvidia for even a tiny fraction of the market without including Intel, AMD, and several other firms in various stages of deployment.
Cerebras has clients in North America, Asia, Europe, and the Middle East. The wafer-size processor producer isn’t positioned to make a considerable market entrance against the top semiconductor giants pushing processors and accelerators. GlaxoSmithKline, AstraZeneca, and TotalEnergies are among the corporations on the list, including national and regional institutions, including Argonne National Laboratory and the Pittsburgh Supercomputing Center.
With a growing number of end-users to point to and a more appropriate software stack for AI engineers, people may expect more significant names to support the unique architectural approach. At the very least, the business has demonstrated that the once-fanciful notion of wafer-scale chip manufacture and usage is within reach. The software stack required to enable on-chip complexity can be developed, even if it takes years.
The wafer-size AI chips erase the major bottleneck to artificial intelligence growth by decreasing the time to train models from weeks to seconds by boosting deep learning computing. It allows profound learning practitioners to test hypotheses more rapidly and explore concepts that are currently untestable or too expensive to investigate using older systems. It lowers the cost of curiosity, speeding up the introduction of new ideas and approaches to future AI.