Llama 2
The next generation of our open source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters.
Llama 2 pretrained models are trained on 2 trillion tokens and have twice the context length of Llama 1. The fine-tuned models have been trained on over 1 million human annotations.
Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests.
Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations.
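As an illustration of how the released chat weights are typically loaded, here is a minimal sketch using Hugging Face Transformers; the meta-llama/Llama-2-7b-chat-hf checkpoint name, gated-access approval, and the torch/transformers/accelerate dependencies are assumptions rather than part of the release itself.

# A minimal sketch, assuming gated access to the meta-llama checkpoints on Hugging Face
# and an environment with torch, transformers, and accelerate installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single GPU
    device_map="auto",
)

# Llama-2-chat was fine-tuned with an [INST] ... [/INST] dialogue format.
prompt = "[INST] Explain the difference between pretraining and fine-tuning. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))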
We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2.
Learn more
OLMo 2
OLMo 2 is a family of fully open language models developed by the Allen Institute for AI (Ai2), designed to give researchers and developers transparent access to training data, open-source code, reproducible training recipes, and comprehensive evaluations. The models are trained on up to 5 trillion tokens and are competitive with leading open-weight models such as Llama 3.1 on English academic benchmarks. OLMo 2 emphasizes training stability, implementing techniques that prevent loss spikes during long training runs, and uses staged training interventions during late pretraining to address capability deficiencies. The models incorporate state-of-the-art post-training methodologies from Ai2's Tülu 3, yielding the OLMo 2-Instruct models. An actionable evaluation framework, the Open Language Modeling Evaluation System (OLMES), was established to guide improvements through the development stages; it consists of 20 evaluation benchmarks assessing core capabilities.
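For readers who want to try the models, the following is a minimal sketch of querying an OLMo 2 instruct checkpoint with Hugging Face Transformers; the allenai/OLMo-2-1124-7B-Instruct model name and the need for a transformers release recent enough to include OLMo 2 support are assumptions.

# A minimal sketch, assuming the allenai/OLMo-2-1124-7B-Instruct checkpoint name and a
# transformers release recent enough to include OLMo 2 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B-Instruct"  # assumed instruct checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

# The instruct variants ship a chat template that the tokenizer can apply directly.
messages = [{"role": "user", "content": "What does 'fully open' mean for a language model?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))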
Learn more
Alpaca
Instruction-following models such as GPT-3.5 (text-davinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. Many users now interact with these models regularly and even use them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. To make maximum progress on addressing these pressing problems, it is important for the academic community to engage. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI's text-davinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta's LLaMA 7B model.
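To make the fine-tuning setup concrete, below is a small sketch of the instruction-following prompt format popularized by the Stanford Alpaca release; the wording follows the public stanford_alpaca repository and is reproduced here purely for illustration.

# A small sketch of the Alpaca-style instruction prompt; the template wording follows the
# public stanford_alpaca repository and is included for illustration only.
ALPACA_PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
ALPACA_PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str, context: str = "") -> str:
    """Format one example the way the Alpaca fine-tuning data is formatted."""
    if context:
        return ALPACA_PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    return ALPACA_PROMPT_NO_INPUT.format(instruction=instruction)

print(build_prompt("List three failure modes of instruction-following models."))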
Learn more
Molmo
Molmo is a family of open, state-of-the-art multimodal AI models developed by the Allen Institute for AI (Ai2). These models are designed to bridge the gap between open and proprietary systems, achieving competitive performance across a wide range of academic benchmarks and human evaluations. Unlike many existing multimodal models that rely heavily on synthetic data from proprietary systems, Molmo is trained entirely on open data, ensuring transparency and reproducibility. A key innovation in Molmo's development is the introduction of PixMo, a novel dataset comprising highly detailed image captions collected from human annotators using speech-based descriptions, as well as 2D pointing data that enables the models to answer questions using both natural language and non-verbal cues. This allows Molmo to interact with its environment in more nuanced ways, such as pointing to objects within images, thereby enhancing its applicability in fields like robotics and augmented reality.
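As a rough usage sketch, the snippet below follows the pattern published on the Molmo model cards for loading a checkpoint with Hugging Face Transformers; the allenai/Molmo-7B-D-0924 model name, the processor.process and generate_from_batch helpers (provided through trust_remote_code), and the local image path are all assumptions.

# A rough sketch following the usage pattern on the Molmo model cards; the checkpoint name,
# the processor.process / generate_from_batch helpers (loaded via trust_remote_code), and
# the local image path are assumptions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

model_id = "allenai/Molmo-7B-D-0924"  # assumed checkpoint name

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto")
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto")

# Ask the model to point at an object; Molmo answers with text plus point coordinates.
inputs = processor.process(images=[Image.open("example.jpg")], text="Point to the dog in this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
generated = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(generated, skip_special_tokens=True))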
Learn more