Meet ‘NeuRRAM,’ A New Neuromorphic Chip For Edge AI That Uses a Tiny Portion of the Power and Space of Current Computer Platforms

A multidisciplinary research team has created a chip that performs computations directly in memory and runs a variety of AI applications, all at a fraction of the energy consumed by current general-purpose AI computing platforms.

The NeuRRAM neuromorphic chip is a state-of-the-art “compute-in-memory” chip, a type of hybrid chip that runs computations directly in memory. It can perform sophisticated cognitive tasks without relying on a network connection to a central server. The chip is also highly versatile and supports many neural network models and architectures.

A researcher who worked on the chip said, “The conventional wisdom is that the higher efficiency of compute-in-memory is at the cost of versatility, but our NeuRRAM chip obtains efficiency while not sacrificing versatility.”

At the moment, AI computing is both power-hungry and computationally expensive. Most AI applications on edge devices send data to the cloud, where the AI processes and analyses it, and the results are then sent back to the device. That’s because most edge devices are battery-powered, which limits the power available for computation.

By lowering the power consumption required for AI inference at the edge, the NeuRRAM chip could lead to more robust, smarter, and more accessible edge devices, as well as smarter manufacturing. It could also improve data privacy, because transferring data from devices to the cloud carries increased security risks. One significant bottleneck on AI chips is the transfer of data from memory to compute units. It’s comparable to an eight-hour commute for a two-hour workday.

To address this data transfer issue, the researchers used resistive random-access memory (RRAM), a type of non-volatile memory that allows computation directly within memory rather than in separate compute units. One of the researchers developed RRAM and other emerging memory technologies that are now used as synapse arrays for neuromorphic computing.

The chip combines the efficiency of RRAM with great flexibility for a range of AI applications, such as deep learning and other machine learning workloads.

The work required a carefully constructed methodology with several levels of “co-optimization” across the abstraction layers of hardware and software, from the design of the chip to its configuration for running different AI tasks. The team also accounted for a range of constraints, from memory device physics to circuits and network architecture. The chip now gives researchers a platform for addressing these problems across the stack, from devices and circuits to algorithms.

Chip functionality

Researchers used a metric called the energy-delay product, or EDP, to gauge the chip’s energy efficiency. EDP multiplies the energy consumed by an operation by the time the operation takes, so a lower value is better. By this measure, the NeuRRAM chip outperforms state-of-the-art chips.
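As a rough illustration of how the metric works, the sketch below computes EDP as energy times delay and compares two chips. The function names and every number in it are hypothetical and chosen only for the example; none of them are measurements from NeuRRAM or any other chip.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return the energy-delay product in joule-seconds (lower is better)."""
    if energy_joules < 0 or delay_seconds < 0:
        raise ValueError("energy and delay must be non-negative")
    return energy_joules * delay_seconds


def edp_ratio(baseline_edp: float, candidate_edp: float) -> float:
    """Return how many times lower the candidate's EDP is than the baseline's."""
    if candidate_edp <= 0:
        raise ValueError("candidate EDP must be positive")
    return baseline_edp / candidate_edp


if __name__ == "__main__":
    # Made-up figures for two hypothetical chips running the same operation.
    baseline = energy_delay_product(energy_joules=4e-6, delay_seconds=2e-3)
    candidate = energy_delay_product(energy_joules=2e-6, delay_seconds=2e-3)
    print(f"baseline EDP:  {baseline:.3e} J*s")
    print(f"candidate EDP: {candidate:.3e} J*s")
    print(f"candidate is {edp_ratio(baseline, candidate):.1f}x lower")
```

Because the two quantities are multiplied, a chip cannot score well by being frugal but slow, or fast but wasteful; both have to improve together.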

Researchers ran a variety of AI tasks on the chip. It achieved 99% accuracy on a handwritten digit recognition task, 85.7% on an image classification task, and 84.7% on a Google speech command recognition task. Additionally, the chip reduced image reconstruction error on an image recovery task by 70%. These results are on par with current digital chips operating at the same bit precision, but they were obtained with significantly less energy.

In many earlier works on compute-in-memory chips, AI benchmark results were often obtained through software simulation rather than on the device itself.

The next steps involve scaling the design to more advanced technology nodes and improving the architectures and circuits. In addition, the researchers want to address other applications, such as spiking neural networks.

The researchers said they can do better at the device level, improve the circuit design to implement additional features, and address diverse applications with their dynamic NeuRRAM platform. One of the researchers also helped found a company that is working to commercialize compute-in-memory technology.

New architecture

The key to NeuRRAM’s energy efficiency is a novel method for sensing output in memory. Conventional approaches use voltage as the input and measure current as the result, which makes circuits more complex and more power-hungry. In NeuRRAM, the researchers designed a neuron circuit that senses voltage and performs analogue-to-digital conversion in an energy-efficient way. This voltage-mode sensing can activate all the rows and all the columns of an RRAM array in a single computation cycle, which allows higher parallelism.
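The difference between the two sensing schemes can be sketched numerically. The model below is an idealized assumption made for this example and is not taken from the article or the chip’s circuit: in current-mode sensing each output column is held at ground and its current is the sum of conductance times input voltage, while in voltage-mode sensing each output column is left floating and settles (by Kirchhoff’s current law) to the conductance-weighted average of the input voltages. The function names and values are hypothetical.

```python
def current_mode_output(conductances, input_voltages):
    """Column currents when each output column is clamped to ground.

    conductances[i][j] is the conductance (siemens) between input row i
    and output column j; input_voltages[i] is the voltage on row i.
    """
    n_cols = len(conductances[0])
    return [
        sum(row[j] * v for row, v in zip(conductances, input_voltages))
        for j in range(n_cols)
    ]


def voltage_mode_output(conductances, input_voltages):
    """Settled column voltages when each output column is left floating.

    With no net current into a floating column, its voltage equals the
    conductance-weighted average of the row voltages.
    """
    n_cols = len(conductances[0])
    outputs = []
    for j in range(n_cols):
        total_g = sum(row[j] for row in conductances)
        weighted = sum(row[j] * v for row, v in zip(conductances, input_voltages))
        outputs.append(weighted / total_g)
    return outputs


if __name__ == "__main__":
    g = [[1.0, 3.0], [1.0, 1.0]]  # hypothetical conductances
    v = [1.0, 0.0]                # hypothetical input voltages
    print(current_mode_output(g, v))
    print(voltage_mode_output(g, v))
```

In this simplified picture the voltage-mode result is a normalized version of the same weighted sum, and it stays within the range of the input voltages no matter how many rows are driven at once.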

In the NeuRRAM architecture, CMOS neuron circuits are physically interleaved with RRAM weights. This differs from conventional designs, which typically place CMOS circuits on the periphery of the RRAM weights. The neuron’s connections to the RRAM array can be configured to serve as either the neuron’s input or its output. This allows neural network inference in several data flow directions without overhead in area or power, which in turn makes the architecture easier to reconfigure.
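A software analogy may help show what multiple data flow directions mean. In the sketch below, a single stored weight array is used in both directions: driving the rows and reading the columns, or driving the columns and reading the rows. This is only an illustration with hypothetical names and values, not a model of the chip’s circuits.

```python
def forward(weights, row_inputs):
    """Drive the rows and read the columns: out[j] = sum_i w[i][j] * x[i]."""
    n_cols = len(weights[0])
    return [
        sum(row[j] * x for row, x in zip(weights, row_inputs))
        for j in range(n_cols)
    ]


def backward(weights, col_inputs):
    """Drive the columns and read the rows: out[i] = sum_j w[i][j] * y[j]."""
    return [sum(w * y for w, y in zip(row, col_inputs)) for row in weights]


if __name__ == "__main__":
    w = [[1.0, 2.0], [3.0, 4.0]]  # one stored weight array, never copied
    print(forward(w, [1.0, 0.0]))   # rows to columns
    print(backward(w, [1.0, 0.0]))  # columns to rows
```

Models that pass data through the same weights in both directions, such as Boltzmann-machine-style networks, benefit from this because the weights do not need to be stored twice.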

Researchers developed a set of hardware-algorithm co-optimization techniques to ensure that the accuracy of AI computations is preserved across different neural network architectures. The techniques were validated on various neural networks, including convolutional neural networks, long short-term memory networks, and restricted Boltzmann machines.

NeuRRAM is a neuromorphic AI chip whose 48 neurosynaptic cores work in parallel to distribute processing. To achieve high versatility and high efficiency at the same time, NeuRRAM supports data parallelism by mapping a layer of the neural network model onto multiple cores for parallel inference on multiple data. It also supports model parallelism by mapping different layers of a model onto different cores and performing inference in a pipelined fashion.
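The two mapping styles can be sketched in a few lines. The code below is a simplified scheduling illustration with hypothetical function names and core numbering, not the chip’s actual mapping software: the first function splits inputs across cores that hold copies of the same layer, and the second lists which input each core handles at each time step when successive layers sit on successive cores.

```python
def data_parallel_assignment(inputs, core_ids):
    """Split inputs round-robin across cores that each hold the same layer."""
    assignment = {core: [] for core in core_ids}
    for index, item in enumerate(inputs):
        assignment[core_ids[index % len(core_ids)]].append(item)
    return assignment


def pipeline_schedule(num_layers, num_inputs):
    """Model parallelism: layer k lives on core k.

    Returns one dict per time step mapping core index -> input index,
    so that core k works on input (t - k) at time step t.
    """
    schedule = []
    for t in range(num_inputs + num_layers - 1):
        step = {}
        for core in range(num_layers):
            item = t - core
            if 0 <= item < num_inputs:
                step[core] = item
        schedule.append(step)
    return schedule


if __name__ == "__main__":
    print(data_parallel_assignment(["a", "b", "c"], core_ids=[0, 1]))
    for t, step in enumerate(pipeline_schedule(num_layers=2, num_inputs=3)):
        print(t, step)
```

In the pipelined case every core is busy once the pipeline has filled, which is why mapping layers to separate cores can keep throughput high even for a single stream of inputs.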

