Imagine forgetting everything you’d learned – how to balance, how to leap, how to coordinate the movement of your hands – and starting from scratch every time you learned a new skill (jumping rope, for example). Most machine learning models today are trained in this manner: for each new task, the model’s parameters are initialized with random numbers. Typically, each new model is trained from the ground up to accomplish one thing and one thing only, rather than extending an existing model to learn new tasks. This results in:
- Separate models developed for each task
- Longer training time for each new task
- Much more data required to learn each new task
This is not how people approach new tasks. There is a need for models with a range of capabilities that can be called upon as needed and stitched together to perform new, more complex tasks, similar to how the mammalian brain generalizes across tasks.
Google Research has introduced Pathways, a new AI architecture aimed at a weakness of current machine learning models: they are too narrowly focused on a single goal. The proposed architecture is designed to handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world.
To perceive the world, people use several senses at once. Contemporary AI systems, by contrast, typically process only one modality of information at a time, whether text, images, or speech. Pathways is meant to enable multimodal models that combine vision, audio, and language understanding, addressing weaknesses of existing systems while combining their strengths. The expected result is a model that is more accurate and less prone to errors and biases.
Of course, an AI model doesn’t have to be limited to these familiar senses; Pathways may handle more abstract forms of data, assisting human scientists in finding meaningful patterns in complex systems like climate dynamics.
Most of today’s models are “dense,” meaning that the entire neural network is activated to complete a task, no matter how simple or complex the task is. This, too, is unlike how people approach problems. Our brains have many distinct regions specialized for different tasks, and we engage only those relevant to the situation at hand.
AI can work in the same way. It is possible to build a single model that is “sparsely” activated, meaning that only certain parts of the network are activated as needed. The advantages of this approach are:
- The model dynamically learns which parts of the network are good at which tasks.
- It learns how to route each task through the most relevant parts of the model.
- It has a larger capacity to learn a variety of tasks.
- It is faster and much more energy efficient, because the whole network is not activated for every task.
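To make sparse activation concrete, here is a minimal sketch in plain Python. It is not Pathways' actual routing mechanism, which this article does not describe. It assumes a generic top-k "mixture of experts" layer: a router scores every expert for a given input, and only the highest-scoring few are run. The names `SparseLayer`, `route`, and `forward` are hypothetical, and the weights are random rather than learned.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseLayer:
    """Hypothetical sparsely activated layer (illustration only).

    Assumption: each expert is a small linear map, and the router is a
    linear scorer. In a real system both would be trained; here they are
    random so the example stays self-contained.
    """

    def __init__(self, num_experts, dim, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # One row of router weights per expert.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a dim x dim matrix.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, x):
        """Pick the top_k experts for input x and weight them."""
        scores = matvec(self.router, x)
        ranked = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, x):
        """Run only the chosen experts and mix their outputs."""
        chosen, weights = self.route(x)
        out = [0.0] * len(x)
        for idx, weight in zip(chosen, weights):
            expert_out = matvec(self.experts[idx], x)
            out = [o + weight * y for o, y in zip(out, expert_out)]
        return out, chosen


layer = SparseLayer(num_experts=8, dim=4, top_k=2)
output, used = layer.forward([0.5, -1.0, 0.25, 2.0])
print(f"experts used: {used} (2 of {len(layer.experts)})")
```

In this sketch only two of the eight experts do any work for a given input, which is where the speed and energy savings would come from. A dense layer would correspond to setting `top_k` equal to `num_experts`.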
Pathways is intended to let a single AI system generalize across thousands or millions of tasks and to understand different types of data efficiently, moving us out of the era of single-purpose models that merely recognize patterns. The team believes that Pathways will guide the world toward a future in which more general-purpose intelligent systems reflect a deeper understanding of our world and adapt to new needs.
Swapnil is currently pursuing his B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. He is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields.