
Kaitai-of-data-encoding: Add support for big numbers

Context

Follows !9972 (merged). See handling data encoding of client-libs for more general context.

Closes #6256 (closed)

This MR aims to define a Kaitai Struct description for big numbers, i.e. Data_encoding.N and Data_encoding.Z.

First, let's look at how big numbers are encoded in Octez. According to the .mli file, big numbers are encoded as follows:

 (** Big number

      In JSON, data is encoded as a string containing the decimal representation
      of the number.

      In binary, data is encoded as a variable length sequence of
      bytes, with a running unary size bit: the most significant bit of
      each byte tells is this is the last byte in the sequence (0) or if
      there is more to read (1). The second most significant bit of the
      first byte is reserved for the sign (positive if zero). Sizing and
      sign bits ignored, data is then the binary representation of the
      absolute value of the number in little-endian order. *)
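
As an illustration of the binary layout described in this comment, here is a minimal Python sketch of an encoder. It is not the Octez implementation; the function name encode_z is hypothetical, and the layout (6 data bits in the first byte, 7 in each following byte) is our reading of the comment above.

```python
def encode_z(n: int) -> bytes:
    """Encode a signed integer following the "Big number" comment (sketch)."""
    sign = 0x40 if n < 0 else 0x00  # second most significant bit of byte 0
    mag = abs(n)
    # First byte: the 6 least significant data bits plus the sign bit.
    chunks = [(mag & 0x3F) | sign]
    mag >>= 6
    # Following bytes: 7 data bits each, least significant group first.
    while mag > 0:
        chunks.append(mag & 0x7F)
        mag >>= 7
    # Set the continuation bit (0x80) on every byte except the last one.
    return bytes(c | 0x80 for c in chunks[:-1]) + bytes(chunks[-1:])


print(encode_z(12345).hex())  # b9c001
```

The output matches the bytes shown in the manual test further down (10111001 11000000 00000001).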

Below is the proposed Kaitai Struct specification file. It is inspired by kaitai_struct_formats/common/vlq_base128_be.ksy. The main differences are that (1) the order of the groups is reversed and (2) our data is signed. Additionally, we do not cap the value at 8 bytes but stick to a generic representation (TODO elaborate).

meta:
  id: ground_n
  endian: be
types:
  group:
    seq:
      - id: b
        type: u1
    instances:
      has_next:
        value: ((b & 128) != 0)
      value:
        value: (b & 127)
seq:
  - id: groups
    type: group
    repeat: until
    repeat-until: not (_.has_next)
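
The specification above only splits the input into groups and leaves the assembly of the number to the consumer. The following Python sketch shows one way to do that assembly without any size cap. The name decode_n is hypothetical, and the sketch assumes that n carries no sign bit, so every byte contributes its 7 low bits.

```python
def decode_n(data: bytes) -> int:
    """Assemble an unsigned big number from its groups (sketch)."""
    value = 0
    for i, b in enumerate(data):
        # Mirrors the `value` instance: b & 127, placed at bit offset 7 * i.
        value |= (b & 0x7F) << (7 * i)
        # Mirrors `has_next`: stop at the first byte whose top bit is clear.
        if (b & 0x80) == 0:
            return value
    raise ValueError("truncated input: no byte with a clear continuation bit")


print(decode_n(bytes([0xB9, 0x60])))  # 12345
```

Python integers are arbitrary precision, so the sketch works for any number of groups.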

Alternatively, for testing purposes, we can define one whose value is capped at 8 bytes.


meta:
  id: z
  endian: be
seq:
  - id: groups
    type: group
    repeat: until
    repeat-until: not _.has_next
types:
  group:
    seq:
      - id: b
        type: u1
    instances:
      has_next:
        value: (b & 0b1000_0000) != 0
      value:
        value: b & 0b0111_1111
instances:
  # Index of the last parsed group, used by the `value` expression below.
  last:
    value: groups.size - 1
  sign:
    value: groups[0].value >> 6
  value:
    value: >-
      (groups[0].value & 0b0011_1111)
      + (last >= 1 ? (groups[1].value << 6) : 0)
      + (last >= 2 ? (groups[2].value << 13) : 0)
      + (last >= 3 ? (groups[3].value << 20) : 0)
      + (last >= 4 ? (groups[4].value << 27) : 0)
      + (last >= 5 ? (groups[5].value << 34) : 0)
      + (last >= 6 ? (groups[6].value << 41) : 0)
      + (last >= 7 ? (groups[7].value << 48) : 0)
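
For comparison, the same computation can be sketched in Python without the 8-byte cap. The function name decode_z is hypothetical. Unlike the specification above, which exposes sign and value (the absolute value) as separate instances, the sketch combines them into one signed integer.

```python
def decode_z(data: bytes) -> int:
    """Decode a signed big number, mirroring the `sign`/`value` instances."""
    groups = []
    for b in data:
        groups.append(b & 0x7F)      # group.value
        if (b & 0x80) == 0:          # not group.has_next
            break
    else:
        raise ValueError("truncated input: no byte with a clear continuation bit")

    sign = groups[0] >> 6            # 1 means negative
    magnitude = groups[0] & 0x3F     # 6 data bits in the first group
    for i in range(1, len(groups)):
        # Shifts are 6, 13, 20, ... exactly as in the `value` expression.
        magnitude += groups[i] << (6 + 7 * (i - 1))
    return -magnitude if sign else magnitude


print(decode_z(bytes.fromhex("b9c001")))  # 12345
```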

Work breakdown

In order to define this specification we first have to add support for:

  • repeat, not, and bitwise-operation expressions
  • !9933 (merged) - user defined types
  • nested instances
  • Add support for n
  • Add support for z
  • Add support for data that is bounded by an n-encoded size, e.g. bytes, string...

Manually testing the MR

Assume we want to encode the number 12345 as z.

  1. Running the following command prints the binary representation of 12345 encoded as z:
./octez-codec encode ground.Z from '"12345"' | xxd -r -p | xxd -b
00000000: 10111001 11000000 00000001
  2. We expect Kaitai to parse this as:
group_0: 10111001
group_1: 11000000
group_2: 00000001

sign = 0 (positive)
  3. Notice that the first bit of every group is the continuation bit, and that the second bit of group_0 is the sign bit (positive if 0). With the continuation and sign bits ignored, the data is in little-endian order. Taking all this into consideration, we get the following bit sequence: 0000001_1000000_111001, which is 12345 in binary.
  4. Observe how the value expression computes this:
  • Start with group_0 & 0b0011_1111 as the initial value (this removes the continuation and sign bits).
  • For every group_i with i >= 1, left-shift its value (group_i.value << (6 + (i-1) * 7)) and add it to the running sum.
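
The arithmetic in the last step can be checked with a few lines of Python. This is only a trace of the formula on the three bytes from the example; the variable names are ours.

```python
raw = bytes([0b10111001, 0b11000000, 0b00000001])

# Strip the continuation bit from every byte (group.value in the spec).
groups = [b & 0b0111_1111 for b in raw]

# group_0 additionally loses its sign bit.
total = groups[0] & 0b0011_1111
terms = [total]

# group_i contributes group_i.value << (6 + (i - 1) * 7) for i >= 1.
for i in range(1, len(groups)):
    term = groups[i] << (6 + (i - 1) * 7)
    terms.append(term)
    total += term

print(terms, total)  # [57, 4096, 8192] 12345
```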

Checklist

  • Document the interface of any function added or modified (see the coding guidelines)
  • Document any change to the user interface, including configuration parameters (see node configuration)
  • Provide automatic testing (see the testing guide).
  • For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
  • Select suitable reviewers using the Reviewers field below.
  • Select as Assignee the next person who should take action on that MR
Edited by Martin Tomazic
