Mellum-4b-base is JetBrains’ first open-source large language model, designed and optimized for code-related tasks. It has 4 billion parameters, uses a LLaMA-style architecture, and was trained on over 4.2 trillion tokens spanning multiple programming languages, drawn from datasets such as The Stack, StarCoder training data, and CommitPack. With a context window of 8,192 tokens, it excels at code completion, fill-in-the-middle tasks, and intelligent code suggestions for professional developer tools and IDEs. Trained in bf16 with automatic mixed precision (AMP), the model is efficient for both cloud inference with vLLM and local deployment via llama.cpp or Ollama. The base model is not fine-tuned for downstream tasks, but it is designed to be easily adapted through supervised fine-tuning (SFT) or reinforcement learning (RL). Benchmarks on RepoBench, SAFIM, and HumanEval demonstrate competitive performance, and specialized fine-tuned versions for Python already show strong improvements.
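For fill-in-the-middle completion, the prefix and suffix around the cursor are composed into a single prompt using the model's special tokens. As a minimal sketch, the helper below assumes StarCoder-style token names (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); verify the exact tokens and their ordering against the Mellum tokenizer's special tokens before use.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Compose a fill-in-the-middle prompt for a code model.

    Token names here follow the StarCoder convention and are an
    assumption; check them against the model's tokenizer config.
    The model is expected to generate the code that belongs between
    the prefix and the suffix.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Example: ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(2, 3))\n",
)
print(prompt)
```

The resulting string is passed to the model as an ordinary prompt; the generated continuation (up to the end-of-text token) is the inferred middle section.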
Features
- 4B parameter LLaMA-style architecture optimized for coding tasks
- Trained on 4.2T tokens from The Stack, StarCoder, CommitPack, and Wikipedia
- 8,192-token context window for handling larger codebases
- Efficient for both cloud inference (vLLM) and local use (llama.cpp, Ollama)
- Base model with support for SFT and RL fine-tuning for specific applications
- Strong benchmark results on RepoBench, SAFIM, and HumanEval tasks
- Includes a Python SFT variant that outperforms the base model
- Licensed under Apache 2.0 for open and flexible use
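For cloud inference, the model can be served behind vLLM's OpenAI-compatible API. The commands below are a deployment sketch, assuming the model is published on the Hugging Face Hub as `JetBrains/Mellum-4b-base`; adjust the model id, port, and hardware flags to your setup.

```shell
# Launch an OpenAI-compatible server (defaults to port 8000).
# bf16 matches the precision the model was trained in.
vllm serve JetBrains/Mellum-4b-base --dtype bfloat16

# Request a plain completion from the running server.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "JetBrains/Mellum-4b-base", "prompt": "def fibonacci(n):", "max_tokens": 64}'
```

For local use, the same weights can instead be converted to GGUF and run through llama.cpp or Ollama, as noted in the feature list.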