String Interner
A data structure to cache strings efficiently, with minimal memory footprint and the ability to associate the interned strings with unique symbols. These symbols allow for constant-time comparisons and look-ups to the underlying interned string contents. Also, iterating through the interned strings is cache efficient.
Internals
- Internally a hashmap `M` and a vector `V` are used. `V` stores the contents of interned strings while `M` has internal references into the strings of `V` to avoid duplicates.
- `V` stores the strings with an indirection to avoid iterator invalidation.
- Returned symbols usually have a low memory footprint and are efficiently comparable.
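
The layout described above can be pictured with a minimal, safe sketch. This is not the crate's implementation: the names `SketchInterner` and `Sym` are hypothetical, and `Rc<str>` is swapped in for the internal references so that the map and the vector can share one allocation per string without `unsafe` code.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical symbol type: a small index into the vector `V`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Sym(u32);

/// Sketch only: `map` plays the role of `M`, `strings` the role of `V`.
#[derive(Default)]
pub struct SketchInterner {
    map: HashMap<Rc<str>, Sym>,
    strings: Vec<Rc<str>>,
}

impl SketchInterner {
    /// Returns the existing symbol for `s`, or stores `s` and returns a new one.
    pub fn get_or_intern(&mut self, s: &str) -> Sym {
        if let Some(&sym) = self.map.get(s) {
            return sym;
        }
        let sym = Sym(self.strings.len() as u32);
        // One allocation, shared by the map key and the vector entry.
        let shared: Rc<str> = Rc::from(s);
        self.strings.push(Rc::clone(&shared));
        self.map.insert(shared, sym);
        sym
    }

    /// Constant-time look-up of the string behind a symbol.
    pub fn resolve(&self, sym: Sym) -> Option<&str> {
        self.strings.get(sym.0 as usize).map(|s| &**s)
    }

    /// Number of distinct interned strings.
    pub fn len(&self) -> usize {
        self.strings.len()
    }
}
```

Because each entry of the vector is a pointer to a separately allocated string, growing the vector moves only the pointers and leaves the string contents where they are, which is the indirection the list above refers to.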
Planned Features
- Safe abstraction wrapper that protects the user from the following misuses:
  - Using symbols of a different string interner instance to resolve strings in another.
  - Using symbols that are no longer valid (i.e. the associated string interner is no longer available).
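
One possible way to catch the first kind of misuse at run time is to tag every symbol with the identity of the interner that created it. The sketch below is purely illustrative and says nothing about how the crate plans to implement this feature; `TaggedInterner`, `TaggedSym` and the global `NEXT_ID` counter are all hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical global counter used to give every interner a distinct id.
static NEXT_ID: AtomicU32 = AtomicU32::new(0);

/// Hypothetical symbol that remembers which interner produced it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaggedSym {
    owner: u32,
    index: u32,
}

/// Sketch only: a linear scan replaces the hashmap to keep the example short.
pub struct TaggedInterner {
    id: u32,
    strings: Vec<String>,
}

impl TaggedInterner {
    pub fn new() -> Self {
        Self {
            id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
            strings: Vec::new(),
        }
    }

    pub fn get_or_intern(&mut self, s: &str) -> TaggedSym {
        let index = match self.strings.iter().position(|x| x.as_str() == s) {
            Some(i) => i,
            None => {
                self.strings.push(s.to_owned());
                self.strings.len() - 1
            }
        };
        TaggedSym { owner: self.id, index: index as u32 }
    }

    /// Returns `None` for symbols that were created by another interner.
    pub fn resolve(&self, sym: TaggedSym) -> Option<&str> {
        if sym.owner != self.id {
            return None;
        }
        self.strings.get(sym.index as usize).map(String::as_str)
    }
}
```

The price of this approach is a larger symbol, which works against the low memory footprint mentioned earlier, and it does not address the second kind of misuse at all.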
License
Licensed under either of
- Apache license, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Changelog
- 0.10.0
  - Implement pluggable backends for `StringInterner`. Uses the new `BucketBackend` by default, which results in significant performance boosts and lower memory consumption as well as fewer overall memory allocations.

    This makes it possible for dependencies to alter the behavior of internment. The `string-interner` crate comes with 2 predefined backends:
    - `SimpleBackend`: This is how the `StringInterner` of previous versions worked by default. It performs one allocation per interned string.
    - `BucketBackend`: Tries to minimize memory allocations and packs interned strings densely. This is the new default behavior for this crate.
  - Due to the above introduction of backends some APIs have been removed:
    - `reserve`
    - `capacity`
    - the entire `iter` module
      - Note: Simple iteration through the `StringInterner`'s interned strings and their symbols is still possible if the used backend supports iteration.
    - `resolve_unchecked`: Has no replacement yet, but might be reintroduced in future versions.
    - `shrink_to_fit`: The API design was never really a good fit for interners.
- 0.9.0
  - Remove `Ord` trait bound from `Symbol` trait
    - Also change `Symbol::from_usize(usize) -> Self` to `Symbol::try_from_usize(usize) -> Option<Self>`
  - Minor performance improvements for `DefaultSymbol::try_from_usize`
  - Put all iterator types into the `iter` sub module
  - Put all symbol types into the `symbol` sub module
  - Add new symbol types:
    - `SymbolU16`: 16-bit wide symbol
    - `SymbolU32`: 32-bit wide symbol (default)
    - `SymbolUsize`: same size as `usize`
  - Various internal improvements and reorganizations
- 0.8.0
  - Make it possible to use this crate in `no_std` environments
    - Use the new `hashbrown` crate feature together with `no_std`
  - Rename `Sym` to `DefaultSymbol`
  - Add `IntoIterator` impl for `&StringInterner`
  - Add some `#[inline]` annotations which improve performance for queries
  - Various internal improvements (uses `Pin` self-referentials now)
- 0.7.1
  - CRITICAL fix use after free bug in `StringInterner::clone()`
  - implement `std::iter::Extend` for `StringInterner`
  - `Sym::from_usize` now avoids using `unsafe` code
  - optimize `FromIterator` impl of `StringInterner`
  - move to Rust 2018 edition

  Thanks YOSHIOKA Takuma for implementing this release.
- 0.7.0
  - changed license from MIT to MIT/APACHE2.0
  - removed generic impl of `Symbol` for types that are `From<usize>` and `Into<usize>`
  - removed `StringInterner::clear` API since its usage breaks invariants
  - added `StringInterner::{capacity, reserve}` APIs
  - introduced a new default symbol type `Sym` that is a thin wrapper around `NonZeroU32` (idea by koute)
  - made `DefaultStringInterner` a type alias for the new `StringInterner<Sym>`
  - added convenient `FromIterator` impl to `StringInterner<S: Sym>`
  - dev
    - rewrote all unit tests (serde tests are still missing)
    - entirely refactored benchmark framework
  - added `html_root_url` to crate root

  Thanks matklad for suggestions and impulses
- 0.6.3
  - fixed a bug that `StringInterner`'s `Send` impl didn't respect its generic `HashBuilder` parameter. Fixes GitHub issue #4.
- 0.6.2
  - added `shrink_to_fit` public method to `StringInterner` (by artemshein)
- 0.6.1
  - fixed a bug that inserting non-owning string types (e.g. `str`) was broken due to dangling pointers (Thanks to artemshein for fixing it!)
- 0.6.0
  - added optional serde serialization and deserialization support
  - more efficient and generic `PartialEq` implementation for `StringInterner`
  - made `StringInterner` generic over `BuildHasher` to allow for custom hashers
- 0.5.0
  - added `IntoIterator` trait implementation for `StringInterner`
  - greatly simplified iterator code
- 0.4.0
  - removed restrictive constraint for `Unsigned` for `Symbol`
- 0.3.3
  - added `Send` and `Sync` to `InternalStrRef` to make `StringInterner` itself `Send` and `Sync`
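
The difference between the two storage strategies mentioned in the 0.10.0 entry (one allocation per interned string versus dense packing) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Backend` trait, its `push` and `resolve` methods, and the types `OnePerString`, `Packed` and `Interner` are hypothetical and do not reproduce the crate's actual backend API or the real `BucketBackend` layout.

```rust
use std::collections::HashMap;

/// Hypothetical storage interface: stores strings and hands out indices.
pub trait Backend: Default {
    /// Stores `s` and returns its index. Deduplication is left to the caller.
    fn push(&mut self, s: &str) -> usize;
    /// Returns the string stored at `index`, if any.
    fn resolve(&self, index: usize) -> Option<&str>;
}

/// One heap allocation per interned string.
#[derive(Default)]
pub struct OnePerString {
    strings: Vec<Box<str>>,
}

impl Backend for OnePerString {
    fn push(&mut self, s: &str) -> usize {
        self.strings.push(s.into());
        self.strings.len() - 1
    }

    fn resolve(&self, index: usize) -> Option<&str> {
        self.strings.get(index).map(|s| &**s)
    }
}

/// Packs all strings back to back into one buffer and records end offsets.
#[derive(Default)]
pub struct Packed {
    buffer: String,
    ends: Vec<usize>,
}

impl Backend for Packed {
    fn push(&mut self, s: &str) -> usize {
        self.buffer.push_str(s);
        self.ends.push(self.buffer.len());
        self.ends.len() - 1
    }

    fn resolve(&self, index: usize) -> Option<&str> {
        let end = *self.ends.get(index)?;
        let start = if index == 0 { 0 } else { self.ends[index - 1] };
        Some(&self.buffer[start..end])
    }
}

/// Interner that is generic over its storage strategy.
/// Simplification: the dedup map owns a second copy of every string.
pub struct Interner<B: Backend> {
    dedup: HashMap<String, usize>,
    backend: B,
}

impl<B: Backend> Interner<B> {
    pub fn new() -> Self {
        Self { dedup: HashMap::new(), backend: B::default() }
    }

    pub fn get_or_intern(&mut self, s: &str) -> usize {
        if let Some(&index) = self.dedup.get(s) {
            return index;
        }
        let index = self.backend.push(s);
        self.dedup.insert(s.to_owned(), index);
        index
    }

    pub fn resolve(&self, index: usize) -> Option<&str> {
        self.backend.resolve(index)
    }
}
```

With this shape, switching strategies is a matter of changing the type parameter, for example `Interner::<Packed>::new()` instead of `Interner::<OnePerString>::new()`, while calling code stays the same.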