1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
// MIT License // // Copyright (c) 2020 Gregory Meyer // // Permission is hereby granted, free of charge, to any person // obtaining a copy of this software and associated documentation files // (the "Software"), to deal in the Software without restriction, // including without limitation the rights to use, copy, modify, merge, // publish, distribute, sublicense, and/or sell copies of the Software, // and to permit persons to whom the Software is furnished to do so, // subject to the following conditions: // // The above copyright notice and this permission notice shall be // included in all copies or substantial portions of the Software. // // THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, // EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF // MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND // NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS // BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN // ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN // CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE // SOFTWARE. //! Lockfree hash tables. //! //! The hash tables in this crate are, at their core, open addressing hash //! tables implemented using open addressing and boxed buckets. The core of //! these hash tables are bucket arrays, which consist of a vector of atomic //! pointers to buckets, an atomic pointer to the next bucket array, and an //! epoch number. In the context of this crate, an atomic pointer is a nullable //! pointer that is accessed and manipulated using atomic memory operations. //! Each bucket consists of a key and a possibly-uninitialized value. //! //! The key insight into making the hash table resizeable is to incrementally //! copy buckets from the old bucket array to the new bucket array. As buckets //! are copied between bucket arrays, their pointers in the old bucket array are //! CAS'd with a null pointer that has a sentinel bit set. If the CAS fails, //! that thread must read the bucket pointer again and retry copying it into the //! new bucket array. If at any time a thread reads a bucket pointer with the //! sentinel bit set, that thread knows that a new (larger) bucket array has //! been allocated. That thread will then immediately attempt to copy all //! buckets to the new bucket array. It is possible to implement an algorithm in //! which a subset of buckets are relocated per-thread; such an algorithm has //! not been implemented for the sake of simplicity. //! //! Bucket pointers that have been copied from an old bucket array into a new //! bucket array are marked with a borrowed bit. If a thread copies a bucket //! from an old bucket array into a new bucket array, fails to CAS the bucket //! pointer in the old bucket array, it attempts to CAS the bucket pointer in //! the new bucket array that it previously inserted to. If the bucket pointer //! in the new bucket array does *not* have the borrowed tag bit set, that //! thread knows that the value in the new bucket array was modified more //! recently than the value in the old bucket array. To avoid discarding updates //! to the new bucket array, a thread will never replace a bucket pointer that //! has the borrowed tag bit set with one that does not. To see why this is //! necessary, consider the case where a bucket pointer is copied into the new //! array, removed from the new array by a second thread, then copied into the //! new array again by a third thread. //! //! Mutating operations are, at their core, an atomic compare-and-swap (CAS) on //! a bucket pointer. Insertions CAS null pointers and bucket pointers with //! matching keys, modifications CAS bucket pointers with matching keys, and //! removals CAS non-tombstone bucket pointers. Tombstone bucket pointers are //! bucket pointers with a tombstone bit set as part of a removal; this //! indicates that the bucket's value has been moved from and will be destroyed //! if it has not beel already. //! //! As previously mentioned, removing an entry from the hash table results in //! that bucket pointer having a tombstone bit set. Insertions cannot //! displace a tombstone bucket unless their key compares equal, so once an //! entry is inserted into the hash table, the specific index it is assigned to //! will only ever hold entries whose keys compare equal. Without this //! restriction, resizing operations could result in the old and new bucket //! arrays being temporarily inconsistent. Consider the case where one thread, //! as part of a resizing operation, copies a bucket into a new bucket array //! while another thread removes and replaces that bucket from the old bucket //! array. If the new bucket has a non-matching key, what happens to the bucket //! that was just copied into the new bucket array? //! //! Tombstone bucket pointers are typically not copied into new bucket arrays. //! The exception is the case where a bucket pointer was copied to the new //! bucket array, then CAS on the old bucket array fails because that bucket has //! been replaced with a tombstone. In this case, the tombstone bucket pointer //! will be copied over to reflect the update without displacing a key from its //! bucket. //! //! This hash table algorithm was inspired by [a blog post by Jeff Phreshing] //! that describes the implementation of the Linear hash table in [Junction], a //! C++ library of concurrent data structrures. Additional inspiration was drawn //! from the lockfree hash table described by Cliff Click in [a tech talk] given //! at Google in 2007. //! //! [a blog post by Jeff Phreshing]: https://preshing.com/20160222/a-resizable-concurrent-map/ //! [Junction]: https://github.com/preshing/junction //! [a tech talk]: https://youtu.be/HJ-719EGIts pub mod map; pub mod segment; #[cfg(test)] #[macro_use] pub(crate) mod test_util; pub use map::HashMap; pub use segment::HashMap as SegmentedHashMap;