ToMe (Token Merging) is a PyTorch-based optimization framework that significantly accelerates Vision Transformer (ViT) architectures without retraining. Developed by researchers at Meta AI (Facebook), ToMe merges similar tokens within transformer layers, cutting redundant computation while preserving model accuracy. Unlike token pruning, which discards (typically background) tokens outright, ToMe merges tokens based on feature similarity, so it can compress both foreground and background information. ToMe integrates seamlessly into existing transformer models such as DeiT, MAE, SWAG, and timm ViTs, offering 2–3× speedups at inference and substantial efficiency gains during training. The method can be applied dynamically at inference time or incorporated into training for improved accuracy.
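To make the merging idea concrete, the sketch below merges the `r` most similar token pairs between two alternating token sets and averages them. It is a simplified illustration of the concept only, not the repository's bipartite soft matching implementation: `merge_similar_tokens` is a hypothetical name, and the real method measures similarity on attention keys and uses a size-weighted average rather than the plain mean shown here.

```python
# Conceptual sketch of similarity-based token merging (illustrative only).
import torch


def merge_similar_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
    """Merge the r most similar token pairs in x of shape (batch, tokens, dim)."""
    # Split tokens into two alternating sets (a bipartite partition).
    a, b = x[:, ::2, :], x[:, 1::2, :]

    # Cosine similarity between every token in `a` and every token in `b`.
    scores = torch.einsum(
        "bid,bjd->bij",
        torch.nn.functional.normalize(a, dim=-1),
        torch.nn.functional.normalize(b, dim=-1),
    )

    # For each token in `a`, find its best match in `b`, then merge away
    # the r highest-scoring tokens and keep the rest.
    best_vals, best_idx = scores.max(dim=-1)                  # (batch, |a|)
    merge_order = best_vals.argsort(dim=-1, descending=True)
    src_idx = merge_order[:, :r]                              # tokens in `a` to merge away
    keep_idx = merge_order[:, r:]                             # tokens in `a` to keep
    dst_idx = best_idx.gather(-1, src_idx)                    # their targets in `b`

    d = x.shape[-1]
    kept_a = a.gather(1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    src = a.gather(1, src_idx.unsqueeze(-1).expand(-1, -1, d))

    # Fold each merged token into its destination in `b` by averaging
    # (ToMe tracks token "size" for a weighted average; plain mean here).
    b = b.scatter_reduce(
        1, dst_idx.unsqueeze(-1).expand(-1, -1, d), src,
        reduce="mean", include_self=True,
    )

    return torch.cat([kept_a, b], dim=1)


tokens = torch.randn(2, 196, 768)            # e.g. ViT-B/16 patch tokens
merged = merge_similar_tokens(tokens, r=16)
print(merged.shape)                          # torch.Size([2, 180, 768])
```

Applying such a step in every block shrinks the token count progressively through the network, which is where the compute savings come from.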
## Features
- Supports both ImageNet evaluation and research extensions
- Open-source PyTorch patching tools for quick integration into existing models (see the usage sketch after this list)
- Offers pretrained checkpoints for DeiT, ViT-B/L/H, and MAE models
- Can be applied without retraining or integrated during training for better results
- Compatible with timm, SWAG, and MAE ViT implementations
- Provides 2–3× inference speedup with minimal accuracy loss
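Below is a minimal usage sketch of the patching workflow for a timm ViT: load a stock pretrained model, patch it in place, and set the per-layer reduction `r`. The `tome.patch.timm` entry point and the `model.r` attribute are assumed here from the project's documented interface (check the repository for the exact API), and the timing loop is only a rough illustration of the claimed speedup, not a rigorous benchmark.

```python
import time

import timm
import torch
import tome  # installed from the facebookresearch/ToMe repository

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load an off-the-shelf ViT from timm; no ToMe-specific training is required.
model = timm.create_model("vit_base_patch16_224", pretrained=True).eval().to(device)

# Patch the model in place so its blocks merge tokens, then choose how many
# token pairs to merge per layer (larger r -> faster, slightly less accurate).
tome.patch.timm(model)
model.r = 16

# Rough throughput check on random data (illustrative only).
x = torch.randn(32, 3, 224, 224, device=device)
with torch.no_grad():
    model(x)  # warm-up
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(10):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"{10 * x.shape[0] / elapsed:.1f} img/s")
```

Running the same timing loop before and after the `tome.patch.timm` call gives a quick sense of the inference speedup on your own hardware; accuracy can then be checked with a standard ImageNet evaluation.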