Language modelling uses statistical and probabilistic techniques to predict the sequence of words in a sentence. These models are widely used in Natural Language Processing applications that generate text as output. A notable example is an AI model trained to predict the next word in a text string from the preceding words. This technology helps search engines and texting apps suggest the next word before the user types it. Its use is not restricted to prediction: it has also proved helpful in question answering, document summarization, and story completion.
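To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is a simple word-pair (bigram) counter, far simpler than the neural models discussed in this article, and the function names and the toy corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word follows each other word in the text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for prev_word, next_word in zip(words, words[1:]):
        follow_counts[prev_word][next_word] += 1
    return follow_counts


def predict_next(follow_counts, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    followers = follow_counts.get(word.lower())
    if not followers:
        return None, 0.0  # the word was never seen with a follower
    best, count = followers.most_common(1)[0]
    return best, count / sum(followers.values())


# Toy corpus, invented for this example
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)
suggestion, prob = predict_next(model, "the")
print(suggestion, prob)
```

In this toy corpus, "the" is followed by "cat" in two of its four occurrences, so the sketch suggests "cat" with an estimated probability of 0.5. Models like the ones in the study replace the counting table with a neural network and look at far more than one preceding word.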
Although these models were designed only to predict the next word in a text, a new study by MIT neuroscientists reveals that their functioning resembles that of the language processing centers of the human brain. Computer models that perform other kinds of language tasks do not show this similarity to the human brain. This offers evidence that the human brain may use next-word prediction to drive language processing.
These recently developed models belong to a class called deep neural networks, a category of machine learning loosely inspired by the organization and activity of the human brain. These networks contain computational nodes that form connections of differing strengths, arranged in layers that pass information between each other in a prescribed manner.
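The description of nodes, connection strengths, and layers can be sketched in a few lines of Python. This is an illustrative toy, not one of the models from the study: the weights are hand-picked hypothetical values, and the `tanh` activation is an assumption chosen for simplicity.

```python
import math


def layer_forward(inputs, weights, biases):
    """One layer: each node sums its weighted inputs, adds a bias, applies tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass information layer by layer and keep the activity of every layer."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(layer_forward(activations[-1], weights, biases))
    return activations


# Hypothetical connection strengths: 2 inputs -> 3 hidden nodes -> 1 output node
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.6, 0.3]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]
activity = network_forward([1.0, 2.0], layers)
print(activity)
```

Each inner list of weights holds the connection strengths feeding one node, and `activity` records what every node in every layer did for the given input.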
Over the past decade, scientists have developed models that perform object recognition as well as the primate brain. The MIT researchers have also shown that the workings of visual object recognition models match the organization of the primate visual cortex.
The researchers compared 43 different language models with the language processing centers in the human brain. Among them are next-word prediction models such as the Generative Pre-trained Transformer 3 (GPT-3), which, given a prompt, generates text similar to what a human would produce.
The researchers presented each model with a string of words and measured the activity of the nodes that make up its deep neural network. They then compared these patterns with activity in the human brain, measured while people performed three language tasks:
- Listening to stories.
- Reading one sentence at a time.
- Reading sentences in which one word is unveiled at a time.
The human datasets consisted of functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements, the latter taken from people undergoing brain surgery for epilepsy. Behavioral measures, such as how fast people read a given text, were also used in the comparison. The best-performing next-word prediction models showed activity patterns similar to those detected in the human brain.
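One simple way to picture such a comparison is to correlate a model node's activity across a set of sentences with a brain response to the same sentences. The Python sketch below does exactly that. It is a simplified illustration and not necessarily the procedure used in the paper; the node names and all numbers are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant signal carries no pattern to compare
    return cov / (spread_x * spread_y)


def best_matching_node(node_activity, brain_response):
    """Score every node against the brain response; return the best one and all scores."""
    scores = {
        name: pearson(activity, brain_response)
        for name, activity in node_activity.items()
    }
    return max(scores, key=scores.get), scores


# Made-up responses to five sentences, for illustration only
brain_response = [0.2, 0.9, 0.4, 0.7, 0.1]
node_activity = {
    "node_a": [0.1, 0.8, 0.5, 0.6, 0.2],
    "node_b": [0.9, 0.1, 0.5, 0.3, 0.8],
}
best, scores = best_matching_node(node_activity, brain_response)
print(best, scores)
```

A score close to 1 means the node's activity rises and falls with the brain response across sentences, while a negative score means it moves in the opposite direction.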
One of the salient features of predictive models such as GPT-3 is an element called a forward one-way predictive transformer, which makes predictions about what comes next based on previous sequences. A vital feature of this transformer is that it can base its predictions on a very long prior context, not just the previous word.
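The "one-way" property can be illustrated with a small Python sketch of causal attention weights, in which each position may draw on every earlier position but never on later ones. This is a generic illustration of the masking idea, not GPT-3's actual code, and the uniform relevance scores are a made-up assumption.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax where position i may only attend to positions 0..i.

    scores[i][j] is a raw relevance score of position j for position i.
    Future positions (j > i) receive a weight of exactly zero.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]  # the prior context, including position i itself
        peak = max(visible)     # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        hidden = [0.0] * (len(row) - i - 1)  # masked future positions
        weights.append([e / total for e in exps] + hidden)
    return weights


# Hypothetical uniform scores for a four-word sequence
scores = [[0.0] * 4 for _ in range(4)]
weights = causal_attention_weights(scores)
for row in weights:
    print(row)
```

With uniform scores, the first word can only look at itself, while the fourth word spreads its attention evenly over all four positions seen so far. The context available to each prediction grows with the length of the preceding text.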
A key takeaway from this study is that language processing is a highly constrained problem, and one of its main difficulties is that it must happen in real time. The AI networks were not designed to mimic the workings of the human brain, yet they ended up brain-like. This suggests that, in a sense, a kind of convergent evolution has occurred between AI and nature.
The researchers propose building variants of these models in the future and evaluating how small changes in their design affect their performance and their fit to human neural data. The idea is to use them to understand how the human brain functions. The next step would be to integrate the high-performing language models with previously developed computer models, making them capable of complex tasks such as constructing perceptual representations of the physical world.
The goal is to move toward better AI models, as well as better models than in the past of how other parts of the brain work and how intelligence emerges.
- Paper: https://www.biorxiv.org/content/10.1101/2020.06.26.174482v2
Chaithali is a technical content writing consultant at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is interested in the field of Data Analytics and has a keen interest in exploring its applications in various domains. She is passionate about content writing and debating.