Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
rapidhash - rust implementation
A rust implementation of the rapidhash function, the official successor to wyhash.
- High quality, the fastest hash passing all tests in the SMHasher and SMHasher3 benchmark. Collision-based study showed a collision probability lower than wyhash, foldhash, and close to ideal.
- Very fast, the fastest passing hash in SMHasher3. Significant peak throughput improvement over wyhash and foldhash. Fastest memory-safe hash. Fastest platform-independent hash. Fastest const hash.
- Platform independent, works on all platforms, no dependency on machine-specific vectorized or cryptographic hardware instructions. Optimised for both AMD64 and AArch64.
- Memory safe, when the
unsafe
feature is disabled (default). This implementation has also been fuzz-tested withcargo fuzz
. - No dependencies and no-std compatible when disabling default features.
- Official successor to wyhash with improved speed, quality, and compatibility.
- Inline variants that use
#[inline(always)]
onRapidInlineHash
andRapidInlineHashBuilder
to force compiler optimisations on specific input types (can double the hash performance depending on the hashed type). - Run-time and compile-time hashing as the hash implementation is fully
const
. - Idiomatic
std::hash::Hasher
compatible hasher forHashMap
andHashSet
usage. - Non-cryptographic hash function that's "minimally DoS resistant" in the same manner as foldhash.
Usage
Persistent Hashing
Full compatibility with C++ rapidhash algorithms, methods are provided for all rapidhash V1, V2, and V3 (with micro/nano) variants. These are stable functions whose output will not change between crate versions.
use ;
// rapidhash V3 algorithm, for fast, high-quality, persistent hashing
use rapidhash_v3;
assert_eq!;
In-Memory Hashing
Following rust's std::hash
traits, the underlying hash function may change between minor versions, and is only suitable for in-memory hashing. These types are optimised for speed and minimal DoS resistance.
RapidHasher
: Astd::hash::Hasher
compatible hasher that uses the rapidhash algorithm.RapidHashBuilder
: Astd::hash::BuildHasher
for initialising the hasher with the default seed and secrets.RandomState
: Astd::hash::BuildHasher
for initialising the hasher with a random seed and secrets.RapidHashMap
andRapidHashSet
: Helper types for usingRapidHasher
withHashMap
andHashSet
.
use RapidHashMap;
// std HashMap with the RapidHashBuilder hasher.
let mut map = default;
map.insert;
CLI
Rapidhash can also be installed as a CLI tool to hash files or stdin. This is not a cryptographic hash, but should be much faster than cryptographic hashes. This is fully compatible with the C++ rapidhash V1, V2, and V3 algorithms.
Output is the decimal string of the u64
hash value.
# install
cargo install rapidhash
# hash a file (output: 8543579700415218186)
rapidhash --v3 example.txt
# hash stdin (output: 8543579700415218186)
echo "example" | rapidhash --v3
Rapidhash Versions
Rapidhash has multiple versions of the algorithm.
Persistent Hashing
Fixed versioning with C++ compatibility is presented in rapidhash::v1
, rapidhash::v2
, and rapidhash::v3
modules.
Rapidhash V3 is the recommended, fastest, and most recent version of the hash. Others are provided for backwards compatibility.
In-Memory Hashing
Rust hasing traits (RapidHasher
, RapidBuildHasher
, etc.) are implemented in rapidhash::fast
, rapidhash::quality
, and rapidhash::inner
modules. These are not guaranteed to give a consistent hash output between crate versions as the rust Hasher
trait is not designed for this.
- Use
rapidhash::fast
for optimal hashing speed with a slightly lower hash quality. Best for most datastructures such as HashMap and HashSet usage. - Use
rapidhash::quality
where statistical hash quality is the priority, such as HyperLogLog or MinHash algorithms. - Use
rapidhash::inner
to configure advanced parameters to configure the hash function specifically to your use case. This allows tweaking the following compile time parameters, which all change the hash output:AVALANCHE
: Enables the final avalanche mixing step to improve hash quality. Enabled on quality, disabled on fast.FNV
: Hash integer types using FNV instead of the rapidhash algorithm. Reduces DoS resistance on integer types. Disabled on quality, enabled on fast.COMPACT
: Generates fewer instructions at compile time with less manual loop unrolling, but may be slower on some platforms. Disabled by default.PROTECTED
: Slightly stronger hash quality and DoS resistance by performing two extra XOR instructions on every mix step. Disabled by default.
Features
default
:std
std
: Enables theRapidHashMap
andRapidHashSet
helper types.rand
: Enables using therand
library to more securely initialiseRandomState
. Includes therand
crate dependency.rng
: EnablesRapidRng
, a fast, non-cryptographic PRNG based on rapidrng. Includes therand_core
crate dependency.unsafe
: Uses unsafe pointer arithmetic to skip some unnecessary bounds checks for a small 3-4% performance improvement.
Benchmarks
Initial benchmarks on M1 Max (aarch64) for various input sizes.
Hashing Benchmarks
- Rapidhash is generally faster with string and byte inputs.
- Foldhash is generally faster with integer tuples by using a 128bit buffer for integer inputs.
┌────────────────┬─────────────┬────────────┐
│ metric ┆ rapidhash-f ┆ foldhash-f │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 │
╞════════════════╪═════════════╪════════════╡
│ avg_rank ┆ 1.50 ┆ 1.50 │
│ geometric_mean ┆ 4.72 ┆ 4.83 │
└────────────────┴─────────────┴────────────┘
┌────────────────┬────────────┬─────────────┬────────────┐
│ distr ┆ bench ┆ rapidhash-f ┆ foldhash-f │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ f64 ┆ f64 │
╞════════════════╪════════════╪═════════════╪════════════╡
│ u32 ┆ hashonly ┆ 0.56 ┆ 0.62 │
│ u32 ┆ lookupmiss ┆ 1.40 ┆ 1.51 │
│ u32 ┆ lookuphit ┆ 1.76 ┆ 1.84 │
│ u32 ┆ setbuild ┆ 3.86 ┆ 4.07 │
│ u32pair ┆ hashonly ┆ 0.85 ┆ 0.62 │
│ u32pair ┆ lookupmiss ┆ 1.88 ┆ 1.59 │
│ u32pair ┆ lookuphit ┆ 2.25 ┆ 1.88 │
│ u32pair ┆ setbuild ┆ 4.52 ┆ 4.28 │
│ u64 ┆ hashonly ┆ 0.81 ┆ 0.62 │
│ u64 ┆ lookupmiss ┆ 1.77 ┆ 1.46 │
│ u64 ┆ lookuphit ┆ 2.08 ┆ 1.83 │
│ u64 ┆ setbuild ┆ 4.32 ┆ 4.10 │
│ u64lobits ┆ hashonly ┆ 0.81 ┆ 0.62 │
│ u64lobits ┆ lookupmiss ┆ 1.73 ┆ 1.46 │
│ u64lobits ┆ lookuphit ┆ 2.00 ┆ 1.81 │
│ u64lobits ┆ setbuild ┆ 4.18 ┆ 4.02 │
│ u64hibits ┆ hashonly ┆ 0.81 ┆ 0.62 │
│ u64hibits ┆ lookupmiss ┆ 1.71 ┆ 1.46 │
│ u64hibits ┆ lookuphit ┆ 2.12 ┆ 1.80 │
│ u64hibits ┆ setbuild ┆ 4.04 ┆ 4.05 │
│ u64pair ┆ hashonly ┆ 1.31 ┆ 0.78 │
│ u64pair ┆ lookupmiss ┆ 2.52 ┆ 1.84 │
│ u64pair ┆ lookuphit ┆ 2.91 ┆ 2.14 │
│ u64pair ┆ setbuild ┆ 5.18 ┆ 4.33 │
│ ipv4 ┆ hashonly ┆ 0.55 ┆ 0.62 │
│ ipv4 ┆ lookupmiss ┆ 1.45 ┆ 1.52 │
│ ipv4 ┆ lookuphit ┆ 1.77 ┆ 1.83 │
│ ipv4 ┆ setbuild ┆ 4.02 ┆ 4.05 │
│ ipv6 ┆ hashonly ┆ 0.83 ┆ 0.78 │
│ ipv6 ┆ lookupmiss ┆ 1.81 ┆ 1.74 │
│ ipv6 ┆ lookuphit ┆ 2.55 ┆ 2.39 │
│ ipv6 ┆ setbuild ┆ 4.44 ┆ 4.32 │
│ rgba ┆ hashonly ┆ 1.25 ┆ 0.63 │
│ rgba ┆ lookupmiss ┆ 2.52 ┆ 1.71 │
│ rgba ┆ lookuphit ┆ 3.28 ┆ 2.51 │
│ rgba ┆ setbuild ┆ 5.90 ┆ 4.72 │
│ strenglishword ┆ hashonly ┆ 1.38 ┆ 6.29 │
│ strenglishword ┆ lookupmiss ┆ 4.26 ┆ 6.65 │
│ strenglishword ┆ lookuphit ┆ 8.73 ┆ 10.92 │
│ strenglishword ┆ setbuild ┆ 14.78 ┆ 17.22 │
│ struuid ┆ hashonly ┆ 2.75 ┆ 5.51 │
│ struuid ┆ lookupmiss ┆ 6.42 ┆ 8.01 │
│ struuid ┆ lookuphit ┆ 9.95 ┆ 12.05 │
│ struuid ┆ setbuild ┆ 13.82 ┆ 16.29 │
│ strurl ┆ hashonly ┆ 5.01 ┆ 7.38 │
│ strurl ┆ lookupmiss ┆ 8.01 ┆ 9.89 │
│ strurl ┆ lookuphit ┆ 14.45 ┆ 16.15 │
│ strurl ┆ setbuild ┆ 21.27 ┆ 22.78 │
│ strdate ┆ hashonly ┆ 1.33 ┆ 5.47 │
│ strdate ┆ lookupmiss ┆ 4.40 ┆ 6.29 │
│ strdate ┆ lookuphit ┆ 6.42 ┆ 8.01 │
│ strdate ┆ setbuild ┆ 9.76 ┆ 12.54 │
│ accesslog ┆ hashonly ┆ 1.55 ┆ 1.16 │
│ accesslog ┆ lookupmiss ┆ 2.93 ┆ 2.26 │
│ accesslog ┆ lookuphit ┆ 4.06 ┆ 3.21 │
│ accesslog ┆ setbuild ┆ 6.46 ┆ 5.48 │
│ kilobyte ┆ hashonly ┆ 31.78 ┆ 31.93 │
│ kilobyte ┆ lookupmiss ┆ 35.09 ┆ 33.73 │
│ kilobyte ┆ lookuphit ┆ 72.32 ┆ 76.98 │
│ kilobyte ┆ setbuild ┆ 101.35 ┆ 114.84 │
│ tenkilobyte ┆ hashonly ┆ 235.44 ┆ 314.27 │
│ tenkilobyte ┆ lookupmiss ┆ 243.00 ┆ 317.11 │
│ tenkilobyte ┆ lookuphit ┆ 608.65 ┆ 683.77 │
│ tenkilobyte ┆ setbuild ┆ 1034.19 ┆ 1079.70 │
└────────────────┴────────────┴─────────────┴────────────┘
Benchmark notes
- Hash throughput/latency does not measure hash "quality", and many of the benchmarked functions fail SMHasher3 quality tests. Hash quality affects hashmap performance, as well as algorithms that benefit from high quality hash functions such as HyperLogLog and MinHash.
- Most hash functions will be affected heavily by whether the compiler has inlined them. Rapidhash tries very hard to always be inlined by the compiler, but the larger a program or benchmark gets, the less likely it is to be inlined due to Rust's
BuildHasher::hash_one
method not being#[inline(always)]
. gxhash
has high throughput by using AES instructions. It's a great hash function, but is not a portable hash function (often requirestarget-cpu=native
to compile), uses unsafe code, and is not minimally DoS resistant.- Benchmark your own use case, with your real world dataset! We suggest experimenting with rapidhash, foldhash, and gxhash to see what works for you. Different inputs will benefit from different hash functions.
Versioning
The minimum supported Rust version (MSRV) is 1.77.0.
The rapidhash crate follows the following versioning scheme:
- Major for breaking API changes and MSRV version bumps.
- Minor for significant API additions/deprecations or any changes to
RapidHasher
output. - Patch for bug fixes and performance improvements.
Persistent hash outputs (eg. rapidhash_v3
) are guaranteed to be stable. In-memory hash outputs (eg. RapidHasher
) may change between minor versions to allow us to freely improve performance.
License and Acknowledgements
This project is licensed under both the MIT and Apache-2.0 licenses. You are free to choose either license.
With thanks to Nicolas De Carli for the original rapidhash C++ implementation, which is licensed under the MIT License.
With thanks to Justin Bradford for letting us use the rapidhash crate name 🍻