MaskFormer is a unified image segmentation framework from Facebook Research that handles semantic, instance, and panoptic segmentation within a single architecture. Traditional pipelines treat these tasks separately, typically using per-pixel classification for semantic segmentation and a different approach for instance-level segmentation. MaskFormer instead reformulates segmentation as a mask classification problem: the model predicts a set of binary masks, each paired with a single class label, so the same model, loss, and training procedure apply across tasks, which simplifies training and evaluation workflows. It is built on top of Detectron2, supports datasets including ADE20K, Cityscapes, COCO-Stuff, and Mapillary Vistas, and provides pretrained baselines for each. Its successor, Mask2Former, extends the same meta-architecture and reported state-of-the-art results on semantic, instance, and panoptic segmentation benchmarks at the time of its release. MaskFormer’s modular design, dataset integration, and compatibility with existing Detectron2 models make it a practical baseline for segmentation research.
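To make the mask classification idea concrete, here is a minimal sketch of how a set of per-query class scores and mask predictions could be combined into a semantic label map. It assumes the formulation described in the MaskFormer paper (class probability times mask probability, summed over queries) and uses NumPy with random inputs in place of a trained model. The function and variable names (`semantic_inference`, `class_logits`, `mask_logits`) are illustrative and are not taken from the MaskFormer or Detectron2 code.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_inference(class_logits, mask_logits):
    """Combine N mask/class predictions into one semantic label map.

    class_logits: (N, K + 1) array; the last column is assumed to be
                  the "no object" class.
    mask_logits:  (N, H, W) array of per-query binary mask logits.
    Returns an (H, W) array of class indices in [0, K).
    """
    # Drop the "no object" column after normalising over all K + 1 classes.
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # (N, K)
    mask_probs = sigmoid(mask_logits)                     # (N, H, W)
    # For each class k and pixel (h, w), sum over queries n of
    # class_probs[n, k] * mask_probs[n, h, w].
    scores = np.einsum("nk,nhw->khw", class_probs, mask_probs)
    return scores.argmax(axis=0)


# Random stand-ins for model outputs (hypothetical sizes).
rng = np.random.default_rng(0)
num_queries, num_classes, height, width = 5, 3, 4, 6
class_logits = rng.normal(size=(num_queries, num_classes + 1))
mask_logits = rng.normal(size=(num_queries, height, width))

label_map = semantic_inference(class_logits, mask_logits)
print(label_map.shape)  # (4, 6)
```

Because every query contributes a whole mask with one label, the same set of predictions can in principle be post-processed differently for semantic or panoptic output, which is what allows one model to serve several tasks.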
Features
- Unified architecture for semantic, instance, and panoptic segmentation
- Built on Detectron2 and compatible with its existing models and datasets
- Supports ADE20K, Cityscapes, COCO-Stuff, and Mapillary Vistas datasets
- Reformulates segmentation as a mask classification task, so one model and loss cover all tasks
- Includes pretrained baselines and a comprehensive model zoo
- Foundation for Mask2Former, which reported state-of-the-art segmentation results