Talc
Talc is a performant and flexible memory allocator, with first class support for no_std and WebAssembly. It's suitable for projects such as operating system kernels, website backends, or arena allocation in single-threaded contexts.
Is your project targeting WASM? Check out usage and comparisons here.
Table of Contents
- Setup
- Benchmarks
- Algorithm
- Testing
- General Usage
- Advanced Usage
- Conditional Features
- Stable Rust and MSRV
- Support Me
- Changelog
Setup
Use it as an arena allocator via the Allocator API as follows:
use *;
use ;
static mut ARENA: = ;
Or as a global allocator:
use *;
static mut ARENA: = ;
static ALLOCATOR: = new.lock;
Note that both of these examples use the spin crate's mutex as a locking mechanism. Any lock implementing lock_api will do, though.
See the std_global_allocator example, General Usage and Advanced Usage for more details.
Benchmarks
Macrobenchmarks (based on galloc's benchmarks)
The original benchmarks have been modified (e.g. replacing rand with fastrand) in order to alleviate the overhead. Additionally, alignment requirements are inversely exponentially frequent, ranging from 22 bytes to 218, with 22 and 23 being most common.
Random Actions Benchmark Results
The number of successful allocations, deallocations, and reallocations within the allotted time.
Note that these results are sensitive to the allocation sizes, ratio of allocations to deallocations, and other such factors.
Heap Efficiency Benchmark Results
The average occupied capacity upon first allocation failure when randomly allocating/deallocating/reallocating.
| Allocator | Average Random Actions Heap Efficiency |
|---|---|
| dlmalloc | 99.07% |
| talc | 98.87% |
| linked_list_allocator | 98.28% |
| galloc | 95.86% |
| buddy_alloc | 58.75% |
Microbenchmarks (based on simple_chunk_allocator's benchmark)
Pre-fail allocations account for all allocations up until the first allocation failure, at which point heap pressure has become a major factor. Some allocators deal with heap pressure better than others, and many applications aren't concerned with such cases (where allocation failure results in a panic), hence they are separated out for separate consideration. Actual number of pre-fail allocations can be quite noisy due to random allocation sizes.
RESULTS OF BENCHMARK: Talc
2011833 allocation attempts, 1419683 successful allocations, 26972 pre-fail allocations, 1408883 deallocations
CATEGORY | OCTILE 0 1 2 3 4 5 6 7 8 | AVERAGE
---------------------|--------------------------------------------------------------------------|---------
All Allocations | 42 42 63 84 84 105 126 189 48468 | 133 ticks
Pre-Fail Allocations | 42 63 63 84 84 105 105 126 6489 | 102 ticks
Deallocations | 42 84 105 105 189 252 273 399 31899 | 228 ticks
RESULTS OF BENCHMARK: Buddy Allocator
2201551 allocation attempts, 1543457 successful allocations, 16227 pre-fail allocations, 1536871 deallocations
CATEGORY | OCTILE 0 1 2 3 4 5 6 7 8 | AVERAGE
---------------------|--------------------------------------------------------------------------|---------
All Allocations | 21 42 42 63 63 63 63 63 21693 | 57 ticks
Pre-Fail Allocations | 21 42 42 42 63 63 63 84 4578 | 77 ticks
Deallocations | 42 63 63 63 63 84 84 126 18795 | 99 ticks
RESULTS OF BENCHMARK: Dlmalloc
1993087 allocation attempts, 1404059 successful allocations, 23911 pre-fail allocations, 1392832 deallocations
CATEGORY | OCTILE 0 1 2 3 4 5 6 7 8 | AVERAGE
---------------------|--------------------------------------------------------------------------|---------
All Allocations | 42 63 84 147 168 189 231 315 26166 | 181 ticks
Pre-Fail Allocations | 42 63 105 147 168 189 210 273 1218 | 172 ticks
Deallocations | 42 105 126 147 231 273 336 420 45507 | 257 ticks
RESULTS OF BENCHMARK: Galloc
276978 allocation attempts, 203844 successful allocations, 24233 pre-fail allocations, 193851 deallocations
CATEGORY | OCTILE 0 1 2 3 4 5 6 7 8 | AVERAGE
---------------------|--------------------------------------------------------------------------|---------
All Allocations | 42 63 84 294 12201 26859 41937 46116 127512 | 19259 ticks
Pre-Fail Allocations | 42 42 42 63 63 63 63 735 35007 | 663 ticks
Deallocations | 42 63 84 210 231 294 399 651 19635 | 324 ticks
RESULTS OF BENCHMARK: Linked List Allocator
134333 allocation attempts, 103699 successful allocations, 24836 pre-fail allocations, 93275 deallocations
CATEGORY | OCTILE 0 1 2 3 4 5 6 7 8 | AVERAGE
---------------------|--------------------------------------------------------------------------|---------
All Allocations | 42 4242 9723 16359 24633 35448 48027 59094 1060941 | 29863 ticks
Pre-Fail Allocations | 42 798 2205 3969 6216 9051 12747 18375 1126293 | 11534 ticks
Deallocations | 42 3171 6972 11319 16254 22029 29211 38661 100044 | 19274 ticks
Q: Why does Buddy Allocator perform much better here than in the random actions benchmark?
A: The buddy allocator's performance is heavily dependant on the size of allocations in random actions, as it doesn't appear to reallocate efficiently. The microbenchmark results only measure allocation and deallocation, with no regard to reallocation. (The currently-used sizes of 1 to 20000 bytes leads to the results above in Random Actions.)
Algorithm
This is a dlmalloc-style linked list allocator with boundary tagging and bucketing, aimed at general-purpose use cases. Allocation is O(n) worst case, while in-place reallocations and deallocations are O(1). In practice, it's speedy.
The main algorithmic difference between Talc and Galloc, using a similar algorithm, is that Talc doesn't bucket by alignment at all, assuming most allocations will require at most a machine-word size alignment. Instead, a much broader range of bucket sizes are used, which should often be more efficient.
Additionally, the layout of chunk metadata is rearranged to allow for smaller minimum-size chunks to reduce memory overhead of small allocations. The minimum chunk size is 3 * usize, with a single usize being reserved per allocation.
Testing
Tests on most of the helper types and Talc functions.
Other than that, lots of fuzzing of the allocator.
General Usage
Here is the list of Talc methods:
- Constructors:
new
- Information:
get_allocated_span- returns the minimum span containing all allocated memory
- Management:
claim- claim memory to establishing a new heapextend- extend the extent of a heaptruncate- reduce the extent of a heaplock- wraps theTalcin aTalck, which supports theGlobalAllocandAllocatorAPIs
- Allocation:
mallocfreegrowshrink
Read their documentation for more info.
Span is a handy little type for describing memory regions, because trying to manipulate Range<*mut u8> or *mut [u8] or base_ptr-size pairs tends to be inconvenient or annoying.
Advanced Usage
The most powerful feature of the allocator is that it has a modular OOM handling system, allowing you to fail out of or recover from allocation failure easily.
As an example, recovering by extending the heap is implemented below.
use *;
Conditional Features
lock_api(default): Provides theTalcklocking wrapper type that implementsGlobalAlloc.allocator(default, requires nightly): Provides anAllocatortrait implementation viaTalck.nightly_api(default, requires nightly): Provides theSpan::from(*mut [T])andSpan::from_slicefunctions.
Stable Rust and MSRV
Talc can be built on stable Rust by using --no-default-features --features=lock_api (lock_api isn't strictly necessary).
Disabling nightly_api makes Span::from(*mut [T]) and Span::from_slice unavailable. See the std_global_allocator example for how to get around this restriction in certain contexts.
The MSRV is currently 1.67.1
Support Me
If you find the project useful, please consider donating via Paypal. Thanks!
On the other hand, I'm looking for part-time programming work for which South Africans are eligible. If you know of any suitable vacancies, please get in touch. Here's my LinkedIn.
Changelog
v4.0.0
- Changed
Talck's API to be more inline with Rust norms.Talcknow hides its internal structure (no more.0).Talck::talc()has been replaced byTalck::lock().Talck::new()andTalck::into_inner(self)have been added.- Removed
TalckRefand implemented theAllocatortrait onTalckdirectly. No need to calltalck.allocator()anymore.
- Changed API for provided locking mechanism
- Moved
AssumeUnlockableintotalc::locking::AssumeUnlockable - Removed
Talc::lock_assume_single_threaded, use.lock::<talc::locking::AssumeUnlockable>()directly instead.
- Moved
- Improvements to documentation here and there. Thanks polarathene for the contribution!
v3.1.2
- Some improvements to documentation.
v3.1.1
- Changed the WASM OOM handler's behavior to be more robust if other code calls
memory.growduring the allocator's use.
v3.1.0
- Reduced use of nightly-only features, and feature-gated the remainder (
Span::from(*mut [T])andSpan::from_slice) behindnightly_api. nightly_apifeature is default-enabled- WARNING: use of
default-features = falsemay cause unexpected errors if the gated functions are used. Consider addingnightly_apior using another function.
- WARNING: use of
v3.0.1
- Improved documentation
- Improved and updated benchmarks
- Increased the range of allocation sizes on Random Actions. (sorry Buddy Allocator!)
- Increased the number of iterations the Heap Efficiency benchmark does to produce more accurate and stable values.
v3.0.0
- Added support for multiple discontinuous heaps! This required some major API changes
new_arenano longer exists (usenewand thenclaim)inithas been replaced withclaimclaim,extendandtruncatenow return the new heap extentInitOnOomis nowClaimOnOom.- All of the above now have different behavior and documentation.
- Each heap now has a fixed overhead of one
usizeat the bottom.
To migrate from v2 to v3, keep in mind that you must keep track of the heaps if you want to resize them, by storing the returned Spans. Read claim, extend and truncate's documentation for all the details.
v2.2.1
- Rewrote the allocator internals to place allocation metadata above the allocation.
- This will have the largest impact on avoiding false sharing, where previously, the allocation metadata for one allocation would infringe on the cache-line of the allocation before it, even if a sufficiently high alignment was demanded. A marginal/negligible increase in single-threaded performance resulted, too.
- Removed heap_exhaustion and replaced heap_efficiency benchmarks.
- Improved documentation and other resources.
- Changed the WASM size measurement to include slightly less overhead.
v2.2.0
- Added
dlmallocto the benchmarks. - WASM should now be fully supported via
TalckWasm. Let me know what breaks ;)- Find more details here.
v2.1.0
- Tests are now passing on 32 bit targets.
- Documentation fixes and improvements for various items.
- Fixed using
lock_apiwithoutallocator. - Experimental WASM support has been added via
TalckWasmon WASM targets.
v2.0.0
- Removed dependency on
spinand switched to usinglock_api(thanks Stefan Lankes)- You can specify the lock you want to use with
talc.lock::<spin::Mutex<()>>()for example.
- You can specify the lock you want to use with
- Removed the requirement that the
Talcstruct must not be moved, and removed themovfunction.- The arena is now used to store metadata, so extremely small arenas will result in allocation failure.
- Made the OOM handling system use generics and traits instead of a function pointer.
- Use
ErrOnOomto do what it says on the tin.InitOnOomis similar but inits to the given span if completely uninitialized. ImplementOomHandleron any struct to implement your own behaviour (the OOM handler state can be accessed fromhandle_oomviatalc.oom_handler).
- Use
- Changed the API and internals of
Spanand other changes to passmiri's Stacked Borrows checks.- Span now uses pointers exclusively and carries provenance.
- Updated the benchmarks in a number of ways, notably adding
buddy_allocand removingsimple_chunk_allocator.