Kaitai-of-data-encoding: Add support for big numbers
Context
Follows !9972 (merged). See handling data encoding of client-libs for more general context.
Closes #6256 (closed)
This MR attempts to define a Kaitai Struct description for big numbers, i.e. Data_encoding.N and Data_encoding.Z.
First, let's look at how big numbers are encoded in octez. According to the `.mli` file, big numbers are encoded as follows:
(** Big number
In JSON, data is encoded as a string containing the decimal representation
of the number.
In binary, data is encoded as a variable length sequence of
bytes, with a running unary size bit: the most significant bit of
each byte tells if this is the last byte in the sequence (0) or if
there is more to read (1). The second most significant bit of the
first byte is reserved for the sign (positive if zero). Sizing and
sign bits ignored, data is then the binary representation of the
absolute value of the number in little-endian order. *)
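To make the binary layout concrete, here is a minimal Python sketch of an encoder that follows the description quoted above. The name `encode_z` is hypothetical and not part of octez; the sketch only illustrates the bit layout and is not the library's implementation.

```python
def encode_z(n: int) -> bytes:
    """Encode a signed integer using the layout quoted above (illustrative sketch)."""
    sign = 1 if n < 0 else 0
    mag = abs(n)
    # First byte: continuation bit (0x80), sign bit (0x40), then the 6 lowest bits.
    byte = (sign << 6) | (mag & 0x3F)
    mag >>= 6
    out = []
    while mag > 0:
        out.append(byte | 0x80)  # more bytes follow, so set the continuation bit
        byte = mag & 0x7F        # next 7 bits, least significant group first
        mag >>= 7
    out.append(byte)             # last byte: continuation bit stays 0
    return bytes(out)


print(encode_z(12345).hex())  # b9c001
```

The three bytes `b9 c0 01` are `10111001 11000000 00000001` in binary, which matches the `octez-codec` output shown in the manual testing section.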
Below we list the proposed Kaitai Struct specification file. It is inspired by `kaitai_struct_formats/common/vlq_base128_be.ksy`. The main differences are that (1) the order of the groups is reversed and (2) our data is signed. Additionally, we don't cap the value at 8 bytes but rather stick to a generic representation (TODO elaborate).
meta:
  id: ground_n
  endian: be
types:
  group:
    instances:
      has_next:
        value: ((b & 128) != 0)
      value:
        value: (b & 127)
    seq:
      - id: b
        type: u1
seq:
  - id: groups
    type: group
    repeat: until
    repeat-until: not (_.has_next)
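The `repeat-until` loop of this specification can be mirrored in a few lines of Python. This is a sketch, not generated Kaitai code: `read_groups` and `n_value` are hypothetical names, and the value computation assumes that `N` carries no sign bit, so that every group contributes 7 bits in little-endian order.

```python
from typing import List, Tuple


def read_groups(data: bytes, pos: int = 0) -> Tuple[List[int], int]:
    """Read bytes until one has its top bit clear; return the bytes read and the new position."""
    groups = []
    while True:
        b = data[pos]        # raises IndexError if the input is truncated
        pos += 1
        groups.append(b)
        if (b & 128) == 0:   # has_next is false: stop, like `repeat-until: not (_.has_next)`
            return groups, pos


def n_value(groups: List[int]) -> int:
    """Assumed N layout: 7 payload bits per group, least significant group first."""
    total = 0
    for i, b in enumerate(groups):
        total += (b & 127) << (7 * i)
    return total


groups, end = read_groups(bytes([0xAC, 0x02, 0xFF]))
print(groups, end, n_value(groups))  # [172, 2] 2 300
```

The trailing `0xFF` is left unread, which illustrates that the parser consumes exactly as many bytes as the continuation bits dictate.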
Alternatively, for testing purposes, we can define a variant whose value is capped at 8 bytes.
meta:
  id: z
  endian: be
seq:
  - id: groups
    type: group
    repeat: until
    repeat-until: not _.has_next
types:
  group:
    seq:
      - id: b
        type: u1
    instances:
      has_next:
        value: (b & 0b1000_0000) != 0
      value:
        value: b & 0b0111_1111
instances:
  # Index of the last group. Not in the original listing, but `value` below
  # references `last`; this definition is assumed, mirroring vlq_base128_be.ksy.
  last:
    value: groups.size - 1
  sign:
    value: (groups[0].value >> 6)
  value:
    value: >-
      (groups[0].value & 0b0011_1111)
      + (last >= 1 ? (groups[1].value << 6) : 0)
      + (last >= 2 ? (groups[2].value << 13) : 0)
      + (last >= 3 ? (groups[3].value << 20) : 0)
      + (last >= 4 ? (groups[4].value << 27) : 0)
      + (last >= 5 ? (groups[5].value << 34) : 0)
      + (last >= 6 ? (groups[6].value << 41) : 0)
      + (last >= 7 ? (groups[7].value << 48) : 0)
Work break down
In order to define this specification we first have to add support for:
- `repeat`, `not` and bitwise operation expressions.
- !9933 (merged) - user defined types
- nested instances
- Add support for `n`
- Add support for `z`
- Add support for data that is bounded with an `n`-encoded size, e.g. `bytes`, `string`...
Manually testing the MR
Assume we want to encode the number `12345` as `z`.
- Running the following command prints the binary representation of `12345` encoded as `z`:
  ./octez-codec encode ground.Z from '"12345"' | xxd -r -p | xxd -b
  00000000: 10111001 11000000 00000001
- We expect Kaitai to parse this as:
  group_0: 10111001
  group_1: 11000000
  group_2: 00000001
  sign = 0 (positive)
- Notice that the first bit of every group is the continuation bit, and that the second bit of `group_0` is the sign (positive if `0`). With the continuation and sign bits ignored, the data is in `le` order. Taking all this into consideration, we get the following bit sequence: `0000001_1000000_111001`, which corresponds to `12345` in binary.
- Observe the equation:
  - Start with `group_0 & 0b0011_1111` as the initial value (this removes the continuation and sign bits).
  - For every `group_i` with `i >= 1`, left-shift its value (`group_i.value << (6 + (i-1) * 7)`) and add it to the previous sum.
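The walk-through above can be checked with a short Python sketch. `decode_z` is a hypothetical helper, not part of octez or of the generated parser; unlike the capped `.ksy` variant, it loops over all groups instead of unrolling eight terms.

```python
def decode_z(data: bytes) -> int:
    """Decode one signed big number from the start of `data` (illustrative sketch)."""
    # Collect groups: strip the continuation bit, stop at the first byte that has it clear.
    groups = []
    for b in data:
        groups.append(b & 0b0111_1111)
        if (b & 0b1000_0000) == 0:
            break
    else:
        raise ValueError("empty or truncated input: no terminating byte found")

    sign = groups[0] >> 6             # second most significant bit of group_0
    value = groups[0] & 0b0011_1111   # initial value: the 6 payload bits of group_0
    for i in range(1, len(groups)):
        value += groups[i] << (6 + (i - 1) * 7)
    return -value if sign else value


print(decode_z(bytes([0b10111001, 0b11000000, 0b00000001])))  # 12345
```

For the example bytes the sum is `57 + (64 << 6) + (1 << 13) = 57 + 4096 + 8192 = 12345`.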
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (`docs/protocols/alpha.rst` for the protocol and the environment, `CHANGES.rst` at the root of the repository for everything else).
- Select suitable reviewers using the `Reviewers` field below.
- Select as `Assignee` the next person who should take action on that MR