The Allen Institute for AI (AI2) has announced the development of a groundbreaking open language model called AI2 OLMo (Open Language Model). OLMo will be a state-of-the-art generative language model at a scale of 70 billion parameters, comparable to other large language models. The project is expected to be completed by 2024. It aims to give the research community access to all aspects of model creation, fostering collaboration and advancing the science of language models.
AI2 is partnering with leading technology companies, including AMD and CSC, to develop OLMo. The collaboration involves utilizing the GPU capabilities of the AMD-powered LUMI pre-exascale supercomputer, known for its energy efficiency. By leveraging the power of this eco-friendly supercomputer, AI2 aims to create a unique and open language model that will allow researchers to work directly on language models for the first time.
A key aspect of OLMo is its openness and accessibility to the research community. AI2 plans to make all elements of the project openly available, including data, code, training curves, evaluation benchmarks, and ethical considerations surrounding the model’s development. By providing complete transparency, AI2 intends to empower researchers to build upon and enhance OLMo, enabling faster and safer progress in the field. The goal is to collaboratively develop the best open language model in the world.
The AI2 team aims to ensure that OLMo becomes a genuinely open model that provides unique value to the AI research community. Every component created for OLMo, including training data, code, model weights, intermediate checkpoints, and ablations, will be openly available, well-documented, and reproducible, with few exceptions and under suitable licensing. The release strategy for the model and its artifacts is still being developed. Additionally, AI2 plans to create a demo and release interaction data from consenting users.
In parallel with the model’s development, AI2 will make decisions to maximize the model’s usability and efficiency without compromising performance. The goal is to make OLMo accessible to a wide range of AI researchers, fostering diversity of perspectives and accelerating improvements in language model development. AI2 also intends to create and release a meticulously studied and documented model training dataset, encompassing pre-training data, instruction data, and human interaction data.
Recognizing the importance of ethical considerations, AI2 takes a pragmatic approach to ethics and openness throughout the OLMo project. The team will document the decisions, concerns, and trade-offs regarding the ethical and societal impacts of creating and releasing the OLMo model. AI2 promotes AI knowledge and understanding by sharing progress, challenges, and discoveries. Legal experts, both internal and external, are actively involved in the model-building process to assess privacy and intellectual property rights issues at multiple checkpoints.
AI2 has partnered with organizations such as Surge AI and MosaicML to collaborate on data and training code for OLMo. An ethics review committee comprising internal and external advisors has been established to provide feedback during the project. The OLMo model and API will serve as valuable resources for the wider community, enabling better understanding of and engagement in the generative AI revolution. AI2 welcomes support and partnerships from organizations aligned with its values of AI for the common good and responsible, beneficial AI technologies.
Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.