Substrate uses a lightweight and efficient encoding and decoding program to optimize how data is sent and received over the network. The program used to serialize and deserialize data is called the SCALE codec, with SCALE being an acronym for Simple Concatenated Aggregate Little-Endian.
bool
Boolean values are encoded using the least significant bit of a single byte.
True
0x01
u16
Basic integers are encoded using a fixed-width little-endian (LE) format.
42
0x2a00
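As a rough sketch of the two rows above, using only the Rust standard library rather than the codec itself, a bool becomes one byte and a u16 becomes two bytes with the least significant byte first. The function names `encode_bool` and `encode_u16` are illustrative assumptions, not part of any SCALE library.

```rust
/// Encode a bool as a single byte: 0x00 for false, 0x01 for true.
fn encode_bool(value: bool) -> Vec<u8> {
    vec![value as u8]
}

/// Encode a u16 as two bytes in little-endian order
/// (least significant byte first).
fn encode_u16(value: u16) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}
```

Under these assumptions, `encode_u16(42)` yields the bytes 0x2a and 0x00, matching the table entry.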
Compact
A "compact" or general integer encoding is sufficient for encoding large integers (up to 2**536 - 1) and is more efficient at encoding most values than the fixed-width version. (Though for single-byte values, the fixed-width integer is never worse.)
0
0x00
1
0x04
42
0xa8
69
0x1501
100000000000000
0x0b00407a10f35a
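The example values above are consistent with a scheme in which the two least significant bits of the first byte select one of four modes: one byte for values below 2**6, two bytes below 2**14, four bytes below 2**30, and a length-prefixed big-integer form for anything larger. The following is a minimal sketch of that scheme in plain Rust, assuming inputs that fit in a u128 (so it does not cover the full range of the format). `encode_compact` is a hypothetical name, not the codec's API.

```rust
/// Compact-encode an unsigned integer.
/// Sketch only: inputs are limited to u128.
fn encode_compact(value: u128) -> Vec<u8> {
    if value < (1 << 6) {
        // Mode 0b00: the value sits in the upper six bits of one byte.
        vec![(value as u8) << 2]
    } else if value < (1 << 14) {
        // Mode 0b01: two bytes, little-endian.
        (((value as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if value < (1 << 30) {
        // Mode 0b10: four bytes, little-endian.
        (((value as u32) << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Mode 0b11: big-integer mode. The upper six bits of the first
        // byte hold (number of value bytes - 4); the value follows in
        // little-endian order without trailing zero bytes.
        let bytes = value.to_le_bytes();
        let len = bytes.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
        let mut out = vec![(((len - 4) as u8) << 2) | 0b11];
        out.extend_from_slice(&bytes[..len]);
        out
    }
}
```

With this sketch, 69 falls into the two-byte mode and yields 0x1501, while 100000000000000 needs six value bytes after a 0x0b prefix.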
Vec
A collection of same-typed values is encoded as a compact encoding of the number of items, followed by each item's encoding, concatenated in order.
Vector of u16 values: [4, 8, 15, 16, 23, 42]
0x18040008000f00100017002a00
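The row above can be sketched in plain Rust for a vector of u16 values: a compact-encoded item count, then each item as two little-endian bytes. The helpers `compact_len` and `encode_vec_u16` are hypothetical names introduced for illustration, and `compact_len` assumes the length fits in a u32.

```rust
/// Compact-encode a length prefix (sketch, lengths up to u32::MAX).
fn compact_len(n: u32) -> Vec<u8> {
    match n {
        // One-byte mode: value in the upper six bits.
        0..=0x3f => vec![(n as u8) << 2],
        // Two-byte mode.
        0x40..=0x3fff => (((n as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        // Four-byte mode.
        0x4000..=0x3fff_ffff => ((n << 2) | 0b10).to_le_bytes().to_vec(),
        // Big-integer mode with four value bytes (prefix byte 0x03).
        _ => {
            let mut out = vec![0b11];
            out.extend_from_slice(&n.to_le_bytes());
            out
        }
    }
}

/// Encode a slice of u16 values: item count first, then each item
/// in fixed-width little-endian form.
fn encode_vec_u16(items: &[u16]) -> Vec<u8> {
    let mut out = compact_len(items.len() as u32);
    for item in items {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}
```

For the six items in the example, the prefix is 6 shifted left by two bits, which is 0x18, followed by twelve item bytes.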
BitVec
A sequence of bools, represented in a more space-efficient bit format.
0b00000010_01111101
0x287d02
str, Bytes, String
Strings are vectors of bytes (Vec<u8>) containing a valid UTF-8 sequence.
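Because a string is treated as a vector of bytes, its encoding can be sketched as a compact length prefix (counting bytes, not characters) followed by the raw UTF-8 bytes. The names `compact_len` and `encode_str` below are hypothetical, and the length is assumed to fit in a u32.

```rust
/// Compact-encode a length prefix (sketch, lengths up to u32::MAX).
fn compact_len(n: u32) -> Vec<u8> {
    match n {
        0..=0x3f => vec![(n as u8) << 2],
        0x40..=0x3fff => (((n as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        0x4000..=0x3fff_ffff => ((n << 2) | 0b10).to_le_bytes().to_vec(),
        _ => {
            let mut out = vec![0b11];
            out.extend_from_slice(&n.to_le_bytes());
            out
        }
    }
}

/// Encode a string as a byte vector: byte count first, then the
/// UTF-8 bytes unchanged.
fn encode_str(s: &str) -> Vec<u8> {
    let bytes = s.as_bytes();
    let mut out = compact_len(bytes.len() as u32);
    out.extend_from_slice(bytes);
    out
}
```

In this sketch, "hello" is five bytes long, so it is prefixed with 0x14 (5 shifted left by two bits).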
enum
A fixed number of variants, each mutually exclusive and potentially implying a further value or series of values. The first byte encodes the index of the value's variant. Any further bytes encode the data that the variant implies. Because the index occupies a single byte, no more than 256 variants are supported.
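To make the variant-index rule concrete, here is a small hand-written sketch for a hypothetical two-variant enum. The type `IntOrBool` and the function `encode_int_or_bool` are assumptions made for illustration, and the variant indices are assumed to follow declaration order.

```rust
/// Hypothetical enum used only for illustration.
enum IntOrBool {
    Int(u8),
    Bool(bool),
}

/// Encode the variant index as the first byte, then the payload.
fn encode_int_or_bool(value: &IntOrBool) -> Vec<u8> {
    match value {
        // Variant index 0, followed by the u8 payload.
        IntOrBool::Int(n) => vec![0x00, *n],
        // Variant index 1, followed by the bool payload as one byte.
        IntOrBool::Bool(b) => vec![0x01, *b as u8],
    }
}
```

Under these assumptions, `IntOrBool::Int(42)` encodes to 0x002a and `IntOrBool::Bool(true)` to 0x0101.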
struct
For structures, the values are named, but the names are irrelevant to the encoding: they are ignored, and only the order of the fields matters. All containers store elements consecutively. For containers, the order of the elements is not fixed; it depends on the container and cannot be relied on at decoding. This implicitly means that decoding a byte array into a structure that enforces an order and then re-encoding it could produce a different byte array than the one originally decoded.
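The point about field names can be sketched with a hypothetical structure whose fields are encoded one after another in declaration order, with nothing identifying the field names in the output. `Reading` and `encode_reading` are illustrative names, not part of any SCALE library.

```rust
/// Hypothetical structure used only for illustration.
struct Reading {
    sensor: u16,
    valid: bool,
    level: u8,
}

/// Concatenate the field encodings in declaration order.
/// Field names do not appear in the output.
fn encode_reading(r: &Reading) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&r.sensor.to_le_bytes()); // u16, little-endian
    out.push(r.valid as u8); // bool as one byte
    out.push(r.level); // u8 as-is
    out
}
```

A value with `sensor: 42`, `valid: true`, and `level: 7` would come out as 0x2a000107 in this sketch. Renaming the fields would leave the bytes unchanged, while reordering them would not.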