Foundation models have taken the Artificial Intelligence community by storm. They have contributed to a wide range of industries, including healthcare, finance, education, and entertainment. Popular large pre-trained models such as GPT-3, DALL-E 2, and BERT are known as foundation models, and they perform tasks that were out of reach only a few years ago. GPT-3 can write an essay or generate other content from just a short natural language prompt. DALL-E 2, a text-to-image model, can create images from a simple textual description. Models like these are a major reason why Artificial Intelligence and Machine Learning are going through a paradigm shift.
In a recent research paper, a team of researchers explored the scope of foundation models in decision-making. The team proposes conceptual tools and technical background for examining the problem space in depth and for identifying new research directions. A foundation model is a model trained in such a way that it can be adapted to downstream tasks, i.e., tasks for which it was not explicitly trained. The related terms self-supervised model and pre-trained model are often used interchangeably with foundation model. Because they are reusable, these models can be applied to tasks across many fields and industries.
The research paper reviews recent methods that ground foundation models in practical decision-making. These methods include prompting, conditional generative modeling, planning, optimal control, and reinforcement learning. The paper provides the relevant background and notation for sequential decision-making. It also introduces a few example scenarios where foundation models and decision-making are better considered jointly, such as using human feedback for dialogue tasks, using the internet as an environment for decision-making, and treating video generation as a universal policy.
Foundation models can serve as generative models of behavior and of the environment. On the behavior side, the paper discusses skill discovery as an example. On the environment side, foundation models can act as generative models of the world for conducting model-based rollouts. These models can also describe the different components of decision-making, such as states (S), behaviors (A), dynamics (T), and task specifiers (R), through generative modeling or representation learning, with examples including plug-and-play vision-language models and model-based representation learning.
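To make the four components concrete, here is a minimal Python sketch of a model-based rollout on a toy one-dimensional problem. It is only an illustration under stated assumptions: the names `DecisionProblem` and `rollout` are hypothetical, the states and actions are plain integers, and the hand-written lambdas stand in for what would be learned models in practice. None of this is taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int   # element of S (toy assumption: a position on a line)
Action = int  # element of A (toy assumption: a step of 0 or 1)


@dataclass
class DecisionProblem:
    """Hypothetical container for the components named in the text."""
    policy: Callable[[State], Action]            # behavior: picks an action in A
    dynamics: Callable[[State, Action], State]   # T: predicts the next state
    reward: Callable[[State, Action], float]     # R: specifies the task


def rollout(problem: DecisionProblem, start: State,
            horizon: int) -> List[Tuple[State, Action, float]]:
    """Simulate the policy inside the dynamics model for `horizon` steps."""
    state = start
    trajectory = []
    for _ in range(horizon):
        action = problem.policy(state)
        r = problem.reward(state, action)
        trajectory.append((state, action, r))
        state = problem.dynamics(state, action)
    return trajectory


# Toy instance: walk right along a line until position 3 is reached.
toy = DecisionProblem(
    policy=lambda s: 1 if s < 3 else 0,
    dynamics=lambda s, a: s + a,
    reward=lambda s, a: 1.0 if s + a == 3 else 0.0,
)
trajectory = rollout(toy, start=0, horizon=4)
print(trajectory)
```

In this sketch the rollout never touches a real environment: every next state comes from `dynamics`, which is the role a generative model of the environment would play.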
The paper closes by discussing common challenges in applying foundation models to decision-making. One is the dataset gap: the broad datasets used for vision and language tasks can differ in structure and form from interactive datasets. For example, videos in a broad dataset mostly lack explicit action labels, whereas actions and rewards are essential components of interactive datasets. To close this gap, broad video and text data can be made more task-specific by post-processing, using techniques such as hindsight relabeling of actions and rewards. Conversely, decision-making datasets can be made broader by combining a variety of task-specific datasets. Overall, the paper explains how advances in foundation models can be applied to a range of decision-making problems once these challenges are addressed.
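The hindsight relabeling idea mentioned above can be illustrated with a small Python sketch. It assumes a simplified goal-conditioned setting in which an episode failed to reach its intended goal, and it relabels only goals and rewards, not actions: the state the episode actually ended in is treated as if it had been the goal all along. The `Transition` type and the `relabel_with_hindsight` function are hypothetical names for illustration, not the paper's method.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    """One hypothetical logged step of interaction."""
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Replace each goal with the state the episode actually reached.

    The reward becomes 1.0 on transitions that land on the new goal
    and 0.0 elsewhere, so a failed episode turns into a successful
    demonstration for a different goal.
    """
    if not episode:
        return []
    achieved = episode[-1].next_state
    return [
        t._replace(goal=achieved,
                   reward=1.0 if t.next_state == achieved else 0.0)
        for t in episode
    ]


# The agent aimed for state 5 but only got as far as state 3.
episode = [
    Transition(state=0, action=1, next_state=1, goal=5, reward=0.0),
    Transition(state=1, action=1, next_state=2, goal=5, reward=0.0),
    Transition(state=2, action=1, next_state=3, goal=5, reward=0.0),
]
relabeled = relabel_with_hindsight(episode)
for t in relabeled:
    print(t)
```

Data with no useful reward signal for its original goal becomes usable training data for the goal it did achieve, which is the sense in which post-processing can make broad data more task-specific.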
Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.