RustCrypto: Blobby
An encoding and decoding library for the Blobby (blb) file format, which serves as a simple,
deduplicated storage format for a sequence of binary blobs.
Examples
// We recommend to save blobby data into separate files and
// use the `include_bytes!` macro
static BLOBBY_DATA: &[u8] = b"\x08\x02\x05hello\x06world!\x01\x02 \x00\x03\x06:::\x03\x01\x00";
static SLICE: &[&[u8]] = parse_into_slice!(BLOBBY_DATA);
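assert_eq!(SLICE.len(), 8);
assert_eq!(SLICE[0], b"hello");
assert_eq!(SLICE[1], b" ");
assert_eq!(SLICE[2], b"");
assert_eq!(SLICE[3], b"world!");
assert_eq!(SLICE[4], b":::");
assert_eq!(SLICE[5], b"world!");
assert_eq!(SLICE[6], b"hello");
assert_eq!(SLICE[7], b"");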
parse_into_structs!;
assert_eq!;
assert_eq!;
assert_eq!;
Encoding and decoding utilities
This crate provides encoding and decoding utilities for converting between the Blobby format and a text file of hex-encoded strings.
Let's say we have the following test vectors for a 64-bit hash function:
COUNT = 0
INPUT = 0123456789ABCDEF0123456789ABCDEF
OUTPUT = 217777950848CECD
COUNT = 1
INPUT =
OUTPUT = F7CD1446C9161C0A
COUNT = 2
INPUT = FFFEFD
OUTPUT = 80081C35AA43F640
To convert these vectors into the Blobby format, first rewrite them in the following form:
0123456789ABCDEF0123456789ABCDEF
217777950848CECD

F7CD1446C9161C0A
FFFEFD
80081C35AA43F640
The first, third, and fifth lines are hex-encoded hash inputs, while the second,
fourth, and sixth lines are the hex-encoded hash outputs for the input on the preceding line.
Note that the file should contain a trailing empty line (i.e. every data line should end
with \n).
This file can be converted to the Blobby format by running the following command:
To inspect contents of an existing Blobby file you can use the following command:
The output file will contain a sequence of hex-encoded byte strings stored in the input file.
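The text layout described above (one hex string per line, where an empty line stands for an empty blob) can be read with a few lines of Rust. This is a minimal sketch for illustration only, not the crate's own encoder; the function names `decode_hex_line` and `parse_hex_file` are hypothetical.

```rust
/// Decode one line of hex digits into bytes.
/// Returns `None` on an odd number of digits or a non-hex character.
fn decode_hex_line(line: &str) -> Option<Vec<u8>> {
    let bytes = line.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// Parse a whole text file: every line becomes one blob,
/// and an empty line becomes an empty blob.
fn parse_hex_file(text: &str) -> Option<Vec<Vec<u8>>> {
    // `str::lines` does not yield an extra empty item for the final "\n",
    // so a file in which every data line ends with "\n" parses cleanly.
    text.lines().map(decode_hex_line).collect()
}
```

Applied to the six-line file shown earlier, `parse_hex_file` would return six blobs, the third of which is empty.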
Storage format
The storage format represents a sequence of binary blobs. Unsigned numbers are encoded as git-flavored variable-length quantities (VLQ).
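To make the number encoding concrete, here is a minimal sketch of a git-style VLQ codec. It assumes the scheme used by git's varint routines as I understand them: big-endian groups of 7 bits, a continuation flag in the high bit of every byte except the last, and an offset of one applied to each continuation group so that every number has exactly one encoding. The names `encode_vlq` and `decode_vlq` are hypothetical and are not part of the crate's API.

```rust
/// Encode `value` as a git-style VLQ.
fn encode_vlq(mut value: u64) -> Vec<u8> {
    // The last byte holds the low 7 bits and has no continuation flag.
    let mut out = vec![(value & 0x7f) as u8];
    value >>= 7;
    while value != 0 {
        // Offset by one so that multi-byte encodings never overlap
        // with shorter ones.
        value -= 1;
        out.push(0x80 | (value & 0x7f) as u8);
        value >>= 7;
    }
    // Bytes were produced from least to most significant.
    out.reverse();
    out
}

/// Decode a git-style VLQ from the start of `data`.
/// Returns the value and the number of bytes consumed,
/// or `None` on truncated input or overflow.
fn decode_vlq(data: &[u8]) -> Option<(u64, usize)> {
    let mut pos = 0;
    let mut byte = *data.get(pos)?;
    pos += 1;
    let mut value = u64::from(byte & 0x7f);
    while byte & 0x80 != 0 {
        byte = *data.get(pos)?;
        pos += 1;
        value = value
            .checked_add(1)?
            .checked_mul(128)?
            .checked_add(u64::from(byte & 0x7f))?;
    }
    Some((value, pos))
}
```

Under this scheme values below 128 occupy a single byte, which is why every number in the example data above is one byte long.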
Blobby files start with two numbers: the total number of blobs in the file, n, and the
number of de-duplicated blobs, d. These numbers are followed by d entries.
Each entry starts with an integer m, immediately followed by m bytes representing a de-duplicated binary blob.
Next follow n entries representing the sequence of stored blobs.
Each entry starts with an unsigned integer l. The least significant
bit of this integer is used as a flag. If the flag is equal to 0, then the
number is followed by l >> 1 bytes, representing a stored binary blob.
Otherwise, the entry references the de-duplicated entry number l >> 1,
which must be smaller than d.
License
Licensed under either of:
Apache License, Version 2.0
MIT license
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.