LDM

Binary Format

The LDM binary format is similar to MessagePack (with inspiration from LuaJIT's String Buffers), but specialized for Lua/LuaJIT and my methodology. It is intended for small and medium-sized data, ranging from network messages to immutable file formats [1]. It doesn't aim to be a standard, but it aims to be portable and durable.

Another document explains why this format was created.

The format must be backwards compatible and should be frozen.

Generalities

Each object/value is encoded using a byte tag.

The first 216 tag values (0 to 215) are for tags with embedded unsigned integers. Each of these tags covers a range of values.

The last 40 tag values (216 to 255) are for unique tags.

An <n>b suffix designates the number of bits of an unsigned integer which is a parameter of the object format.

Dictionaries

There are two external dictionaries and one internal dictionary.

Internal Object Dictionary

This dictionary is internal; it is part of the encoded data.

In the context of this section, an object is a string or a table.

Each time an object is packed (or unpacked) into the format, it follows three steps:

The incentive for having two dictionaries is to have better compression. The heuristic is that an object either appears once or it appears a lot (e.g. a list of objects with similar fields).

This dictionary is ultimately about reference handling. Here are noteworthy benefits:

External Object and Metatable Dictionaries

These dictionaries are external. They represent data which must stay available and backwards compatible for the lifespan of the packed data pieces using them.

Each dictionary associates an index, between 1 and 2^32 inclusive, with an external resource. When packing, a reference (the index) to the resource is encoded. Lower indexes use fewer bytes.

Backwards compatibility in this context means that the dictionary can only be modified by adding new entries; previously defined entries must not change. It is still possible to mark an entry as removed with nil, to explicitly drop support for packed data pieces using it.
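The append-only rule can be illustrated with a small sketch. It is written in Python purely for illustration and is not the LDM implementation; the class name `ExternalDictionary` and its methods are hypothetical, and `None` stands in for Lua's nil.

```python
class ExternalDictionary:
    """Hypothetical append-only table mapping indexes (from 1) to resources."""

    MAX_INDEX = 2**32  # upper bound on indexes stated in the text

    def __init__(self):
        self._entries = []  # index i lives at list position i - 1

    def add(self, resource):
        """Append a new entry and return its index; existing entries are untouched."""
        if resource is None:
            raise ValueError("None is reserved for removed entries")
        if len(self._entries) >= self.MAX_INDEX:
            raise OverflowError("dictionary is full")
        self._entries.append(resource)
        return len(self._entries)

    def remove(self, index):
        """Mark an entry as removed; its index is never reused."""
        self._check(index)
        self._entries[index - 1] = None  # stands in for Lua's nil

    def get(self, index):
        """Return the resource, or None if support for it was dropped."""
        self._check(index)
        return self._entries[index - 1]

    def _check(self, index):
        if not 1 <= index <= len(self._entries):
            raise KeyError(index)
```

Because entries are only ever appended, an index written into old packed data keeps pointing at the same resource (or at an explicit removal marker) for as long as the dictionary lives.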

Example use cases for the external object dictionary:

Example use cases for the metatable dictionary:

Tags Overview

Range tags:

name                    bits      count  value      hexadecimal  Lua type
positive int 6b         00xxxxxx  64     0 - 63     0x00 - 0x3f  number
negative int 5b         010xxxxx  32     64 - 95    0x40 - 0x5f  number
internal object ref 5b  011xxxxx  32     96 - 127   0x60 - 0x7f  table, string
external object ref 5b  100xxxxx  32     128 - 159  0x80 - 0x9f  table, string, userdata, function, thread
string 5b               101xxxxx  32     160 - 191  0xa0 - 0xbf  string
array 4b                1100xxxx  16     192 - 207  0xc0 - 0xcf  table
map 3b                  11010xxx  8      208 - 215  0xd0 - 0xd7  table
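
As a rough illustration of how the range tags partition the first 216 byte values, the sketch below classifies a tag byte and extracts its embedded unsigned integer. It is Python for illustration only (not the Lua implementation); `RANGE_TAGS` and `classify_tag` are hypothetical names, and the values are copied from the range tags listed above.

```python
# Range tags: (name, first tag value, number of embedded bits).
RANGE_TAGS = [
    ("positive int", 0x00, 6),
    ("negative int", 0x40, 5),
    ("internal object ref", 0x60, 5),
    ("external object ref", 0x80, 5),
    ("string", 0xA0, 5),
    ("array", 0xC0, 4),
    ("map", 0xD0, 3),
]


def classify_tag(tag):
    """Return (name, embedded value) for a range tag, or ("unique", None)."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("a tag is a single byte")
    for name, base, bits in RANGE_TAGS:
        # Each range covers 2^bits consecutive tag values starting at base.
        if base <= tag < base + (1 << bits):
            return name, tag - base
    # Everything from 216 (0xd8) upwards is a unique tag.
    return "unique", None
```

For example, `classify_tag(0xA5)` identifies a string tag with an embedded value of 5, while `classify_tag(0xD8)` falls through to the unique tags.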

Unique tags:

name                            value      hexadecimal  Lua type
positive int 8,16,32b           216 - 218  0xd8 - 0xda  number
negative int 8,16,32b           219 - 221  0xdb - 0xdd  number
unsigned int 64b                222        0xde         cdata<uint64_t>
signed int 64b                  223        0xdf         cdata<int64_t>
internal object entry 8,16,32b  224 - 226  0xe0 - 0xe2  table, string
internal object ref 8,16,32b    227 - 229  0xe3 - 0xe5  table, string
external object ref 8,16,32b    230 - 232  0xe6 - 0xe8  table, string, userdata, function, thread
metatable ref 8,16,32b          233 - 235  0xe9 - 0xeb  table
string 8,16,32b                 236 - 238  0xec - 0xee  string
array 8,16,32b                  239 - 241  0xef - 0xf1  table
map 8,16,32b                    242 - 244  0xf2 - 0xf4  table
mixed table                     245        0xf5         table
unused                          246 - 251  0xf6 - 0xfb
float64                         252        0xfc         number
true                            253        0xfd         boolean
false                           254        0xfe         boolean
nil                             255        0xff         nil

Grammar

value = nil | true | false | number | string | table |
        internal-object-entry | internal-object-ref |
        external-object-ref | metatable-ref table

nil = 0xff
true = 0xfd
false = 0xfe

number = positive-int | negative-int | 0xde u64 | 0xdf i64 | 0xfc f64
positive-int = <0x00 + value> |
               0xd8 u8 | 0xd9 u16 | 0xda u32
negative-int = <0x40 + absolute value> |
               0xdb u8 | 0xdc u16 | 0xdd u32
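
A minimal sketch of the integer rules above, in Python for illustration only. The function name `encode_int` is hypothetical. One assumption goes beyond the grammar: the u8/u16/u32 payload of a negative int is taken to hold the absolute value, as the embedded form does. Choosing the smallest form that fits is also a choice of this sketch, not something the grammar mandates.

```python
import struct


def encode_int(n):
    """Encode an integer with the smallest positive-int / negative-int form."""
    if n >= 0:
        embedded_base, embedded_limit, wide_base, magnitude = 0x00, 64, 0xD8, n
    else:
        # Assumption: every negative form stores the absolute value.
        embedded_base, embedded_limit, wide_base, magnitude = 0x40, 32, 0xDB, -n
    if magnitude < embedded_limit:
        # Range tag: the value is embedded in the tag byte itself.
        return bytes([embedded_base + magnitude])
    # Unique tags: three consecutive tags for u8, u16 and u32 payloads.
    for offset, fmt in enumerate(("<B", "<H", "<I")):
        if magnitude < 1 << (8 * struct.calcsize(fmt)):
            return bytes([wide_base + offset]) + struct.pack(fmt, magnitude)
    raise OverflowError("magnitude needs the 64-bit tags (0xde / 0xdf)")
```

Under these assumptions, 63 fits in a single byte (0x3f), while 64 needs two (0xd8 0x40).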

string = string-header <string>
string-header = <0xa0 + length> |
                0xec u8 | 0xed u16 | 0xee u32
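
The string rule can be sketched the same way: a header carrying the byte length, followed by the raw bytes. This is illustrative Python, not the Lua implementation, and `encode_string` is a hypothetical name.

```python
import struct


def encode_string(data: bytes) -> bytes:
    """Prefix raw bytes with the shortest string-header that fits the length."""
    n = len(data)
    if n < 32:
        header = bytes([0xA0 + n])             # length embedded in 5 bits
    elif n < 1 << 8:
        header = b"\xec" + struct.pack("<B", n)
    elif n < 1 << 16:
        header = b"\xed" + struct.pack("<H", n)
    elif n < 1 << 32:
        header = b"\xee" + struct.pack("<I", n)
    else:
        raise OverflowError("string too long for a u32 length")
    return header + data
```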

table = array | map | mixed
array = array-header array-content
array-header = <0xc0 + length> |
               0xef u8 | 0xf0 u16 | 0xf1 u32
array-content = {value}
map = map-header map-content
map-header = <0xd0 + pairs> |
             0xf2 u8 | 0xf3 u16 | 0xf4 u32
map-content = {value value}
mixed = 0xf5 array-header map-header array-content map-content
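
To make the table layouts concrete, here is a hedged Python sketch that assembles arrays, maps and mixed tables from values that are already encoded. All function names are hypothetical; in particular, `_header` is just a helper capturing the shared "embedded, else u8/u16/u32" pattern.

```python
import struct


def _header(embedded_base, embedded_limit, wide_base, count):
    """Shared pattern: embed the count in the tag, else use u8/u16/u32."""
    if count < embedded_limit:
        return bytes([embedded_base + count])
    for offset, fmt in enumerate(("<B", "<H", "<I")):
        if count < 1 << (8 * struct.calcsize(fmt)):
            return bytes([wide_base + offset]) + struct.pack(fmt, count)
    raise OverflowError("count does not fit in a u32")


def array_header(length):
    return _header(0xC0, 16, 0xEF, length)


def map_header(pairs):
    return _header(0xD0, 8, 0xF2, pairs)


def encode_array(items):
    """items: list of already-encoded values."""
    return array_header(len(items)) + b"".join(items)


def encode_map(pairs):
    """pairs: list of (encoded key, encoded value) tuples."""
    return map_header(len(pairs)) + b"".join(k + v for k, v in pairs)


def encode_mixed(items, pairs):
    """Mixed table: tag, both headers, then array content, then map content."""
    return (b"\xf5" + array_header(len(items)) + map_header(len(pairs))
            + b"".join(items) + b"".join(k + v for k, v in pairs))
```

Note that in the mixed form both headers come before any content, so a reader knows both sizes before it reads the first element.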

internal-object-entry = 0xe0 u8 | 0xe1 u16 | 0xe2 u32
internal-object-ref = <0x60 + index> |
                      0xe3 u8 | 0xe4 u16 | 0xe5 u32

external-object-ref = <0x80 + index> |
                      0xe6 u8 | 0xe7 u16 | 0xe8 u32
metatable-ref = 0xe9 u8 | 0xea u16 | 0xeb u32


u8 = <8 bits unsigned integer>
u16 = <16 bits unsigned integer, little-endian>
u32 = <32 bits unsigned integer, little-endian>
u64 = <64 bits unsigned integer, little-endian>
i64 = <64 bits signed integer, two's complement, little-endian>
f64 = <IEEE 754 double precision floating point number, little-endian>
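
The primitive encodings above map directly onto Python's standard `struct` module, which makes for a compact illustration. The names `PRIMITIVES`, `read` and `write` are hypothetical; the format codes are standard `struct` codes, with `<` selecting little-endian and fixed sizes.

```python
import struct

# struct format for each primitive of the grammar.
PRIMITIVES = {
    "u8": "<B", "u16": "<H", "u32": "<I",
    "u64": "<Q", "i64": "<q", "f64": "<d",
}


def write(kind, value):
    """Serialize one primitive to bytes."""
    return struct.pack(PRIMITIVES[kind], value)


def read(kind, buf, pos=0):
    """Read one primitive at pos; return (value, position after it)."""
    fmt = PRIMITIVES[kind]
    (value,) = struct.unpack_from(fmt, buf, pos)
    return value, pos + struct.calcsize(fmt)
```

For instance, a float64 value is the tag byte 0xfc followed by `write("f64", x)`, and can be read back with `read("f64", buf, 1)`.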

[1] Files that are not modified in-place, unlike a SQLite database.