With the rise of large language models like ChatGPT, neural networks have become increasingly popular in natural language processing. The recent success of LLMs rests largely on deep neural networks and their ability to process and analyze large volumes of data efficiently and precisely. With the latest neural network architectures and training methods, their applications have set new benchmarks and become extremely powerful.
Recent research has explored a new direction in neural network design: networks that can process the weights and gradients of other neural networks. These networks are known as Neural Functional Networks (NFNs). A neural functional is a function of a neural network, meaning that it takes quantities such as the network's weights, gradients, or sparsity masks as its input. Neural Functional Networks have several applications, ranging from learned optimization and processing implicit neural representations to network editing and policy evaluation.
Designing effective architectures that process the weights and gradients of other networks requires some guiding principles. The researchers have proposed a framework for developing permutation equivariant neural functionals, which takes into account the permutation symmetries present in the weights of deep feedforward neural networks. Because the hidden neurons in a deep feedforward network have no intrinsic order, reordering them (together with their incoming and outgoing weights) leaves the function computed by the network unchanged. The team has designed the new networks to respect this same symmetry, and calls them permutation equivariant neural functionals.
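The permutation symmetry described above can be checked on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes a two-layer MLP with a tanh activation, and the helper names `mlp_forward` and `permute_hidden` are hypothetical. It reorders the hidden neurons and confirms that the output stays the same up to floating-point rounding.

```python
import math
import random


def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer MLP on plain lists: y = W2 * tanh(W1 * x + b1) + b2."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(W2, b2)
    ]


def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden neurons: rows of W1, entries of b1, columns of W2."""
    W1_p = [W1[j] for j in perm]
    b1_p = [b1[j] for j in perm]
    W2_p = [[row[j] for j in perm] for row in W2]
    return W1_p, b1_p, W2_p


# Toy sizes and random weights (assumptions made for this illustration).
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 4, 2
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

perm = [2, 0, 3, 1]  # an arbitrary reordering of the four hidden neurons
W1_p, b1_p, W2_p = permute_hidden(W1, b1, W2, perm)

y = mlp_forward(x, W1, b1, W2, b2)
y_p = mlp_forward(x, W1_p, b1_p, W2_p, b2)

# The weights differ, but the computed function does not.
max_diff = max(abs(a - b) for a, b in zip(y, y_p))
print("max output difference:", max_diff)
```

Two different weight settings therefore describe the same network, which is why a model that reads weights as input benefits from treating such reorderings consistently.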
The team has also introduced a set of key building blocks for this framework called NF-Layers (Neural Functional layers). NF-Layers are linear layers whose inputs and outputs are weight-space features. They are constrained to be permutation equivariant through a suitable parameter-sharing structure, much as convolution layers achieve translation equivariance by sharing parameters across spatial positions.
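To give a feel for how parameter sharing produces permutation equivariance, here is a minimal sketch of the simplest such linear layer, acting on a list of scalars. This is not the NF-Layer from the paper, which operates on weight-space features with a more elaborate sharing pattern; the function name `equivariant_linear` and the parameters `a` and `b` are assumptions made for illustration. Each output shares one coefficient for its own input and one for the mean of all inputs, so permuting the inputs permutes the outputs in the same way.

```python
import random


def equivariant_linear(x, a, b):
    """Linear map with two shared parameters: out_i = a * x_i + b * mean(x)."""
    mean = sum(x) / len(x)
    return [a * xi + b * mean for xi in x]


rng = random.Random(1)
x = [rng.uniform(-1, 1) for _ in range(5)]
a, b = 0.7, -0.3  # the only two learnable parameters in this toy layer
perm = [3, 1, 4, 0, 2]

# Permute the input, then apply the layer.
out_of_permuted = equivariant_linear([x[j] for j in perm], a, b)

# Apply the layer, then permute the output.
out = equivariant_linear(x, a, b)
permuted_out = [out[j] for j in perm]

# Equivariance: both orders agree up to floating-point rounding.
equiv_gap = max(abs(p - q) for p, q in zip(out_of_permuted, permuted_out))
print("equivariance gap:", equiv_gap)
```

An unconstrained linear layer on five inputs would have 25 free weights; the sharing constraint reduces this to two, which mirrors how a convolution reuses one small kernel at every position.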
Just as a Convolutional Neural Network (CNN) operates on spatial features, Neural Functional Networks (NFNs) operate on weight-space features. This framework processes neural network weights while respecting their permutation symmetries. The researchers have demonstrated the effectiveness of permutation equivariant neural functionals on a varied set of tasks that involve processing the weights of multi-layer perceptrons (MLPs) and CNNs. These tasks include predicting classifier generalization, producing “winning ticket” sparsity masks for initializations, and extracting information from the weights of implicit neural representations (INRs). NFNs make it possible to treat a collection of INRs as a dataset, with the weights of each INR serving as a single data point. NFNs have also been trained to edit INR weights to produce visual changes, such as image dilation.
In conclusion, this research provides a new approach to designing neural networks that can process the weights of other networks, with potential applications across many areas of machine learning. The researchers have also noted improvements that could be made in the future, such as reducing the activation sizes produced by NF-Layers and extending NF-Layers to process the weights of more complex architectures such as ResNets and Transformers, thereby allowing larger-scale applications.
Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.