At first, I started to work on a MessagePack implementation for LuaJIT. I wanted something that met my needs in terms of simplicity, performance, safety, robustness, output stability, flexibility and concurrency. Because the Lua data model is fundamental to my methodology, I considered it worthwhile to invest my time in an implementation that I can deeply understand, test and optimize.
- Simplicity and performance, e.g. by targeting LuaJIT 2.1 exclusively instead of multiple Lua versions.
- Safety and robustness, e.g. to deal properly with malicious inputs when unpacking. This is especially relevant because LuaJIT exposes powerful low-level abstractions that can be used to improve performance, but that can also compromise safety.
- Output stability. Determinism is not guaranteed; it is a best effort to facilitate the use of delta compression (e.g. with Fossil).
- Flexibility, e.g. global and per-call pack/unpack configuration.
- Concurrency, e.g. concurrent/progressive processing with coroutines.
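
As an illustration of the last three points, here is a minimal sketch of progressive, order-stable output in plain Lua. It is not the LDM API: `key_less`, `sorted_keys` and `pack_chunks` are hypothetical names, the "key=value" chunks are a stand-in for a real encoding, and keys are assumed to be numbers or strings. Because `pairs` does not guarantee an iteration order, sorting the keys is one way to make the output reproducible, and a coroutine lets the caller consume it chunk by chunk.

```lua
-- Hypothetical sketch, not the LDM API: emit a flat table's entries
-- progressively and in a stable order.

-- Order keys by type name first, then by value, so that mixed
-- number/string keys can be compared without raising an error.
local function key_less(a, b)
  local ta, tb = type(a), type(b)
  if ta ~= tb then return ta < tb end
  return a < b
end

-- Return the keys of t (assumed to be numbers or strings) in a stable order.
local function sorted_keys(t)
  local keys = {}
  for k in pairs(t) do keys[#keys + 1] = k end
  table.sort(keys, key_less)
  return keys
end

-- Return an iterator producing one "key=value" chunk per call, so the
-- caller decides when to resume (e.g. between I/O operations).
local function pack_chunks(t)
  return coroutine.wrap(function()
    for _, k in ipairs(sorted_keys(t)) do
      coroutine.yield(tostring(k) .. "=" .. tostring(t[k]))
    end
  end)
end

-- Usage: collect the chunks; the order does not depend on how `pairs`
-- happens to traverse the table.
local out = {}
for chunk in pack_chunks({ b = 2, a = 1, [1] = "x" }) do
  out[#out + 1] = chunk
end
local result = table.concat(out, ";")
```

Sorting costs time on every table, which is one reason stability of this kind would be offered as a best effort rather than a guarantee.
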
I wanted to stick with MessagePack to avoid creating a new format and to increase interoperability, but I realized that the way I would use MessagePack would not be interoperable. MessagePack doesn't sufficiently match the Lua data model, and designing a format just for Lua(JIT) opened opportunities for simplification and specialization.
The LDM format has the following noteworthy advantages:
- Conceptual simplification: the format rests on Lua instead of generalizing to all programming languages.
- Representation of the Lua data model, preserving object identities and references of a directed graph.
- Domain-specific compression: string deduplication, external strings.
- Representation of external resources using external dictionaries.
- Support for metatables. They can be used in place of MessagePack extensions.
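
To make the identity-preservation point concrete, here is a minimal sketch of how a graph of tables can be flattened so that each table is stored once and referenced by id, then rebuilt with shared references and cycles intact. This is not the LDM wire format: `flatten`, `rebuild` and the `{ ref = id }` markers are hypothetical, the root is assumed to be a table, and keys are assumed to be strings or numbers. A tree-oriented format would instead duplicate `shared` and fail to terminate on the cycle.

```lua
-- Hypothetical sketch, not the LDM wire format: store each table once
-- and refer to it by id, so identities and cycles survive a round trip.

-- Flatten the graph reachable from root into a list of nodes.
-- Returns a reference to the root and the node list.
local function flatten(root)
  local ids, nodes = {}, {}   -- table -> id, id -> flattened node

  local function visit(v)
    if type(v) ~= "table" then return v end  -- scalars are stored inline
    local id = ids[v]
    if id then return { ref = id } end       -- already seen: emit a reference
    id = #nodes + 1
    ids[v] = id
    local node = {}
    nodes[id] = node                         -- register before recursing (cycles)
    for k, x in pairs(v) do node[k] = visit(x) end
    return { ref = id }
  end

  return visit(root), nodes
end

-- Rebuild the graph: create every table first, then wire up the fields.
local function rebuild(root_ref, nodes)
  local tables = {}
  for id = 1, #nodes do tables[id] = {} end
  for id, node in ipairs(nodes) do
    for k, x in pairs(node) do
      if type(x) == "table" then
        tables[id][k] = tables[x.ref]        -- resolve reference to the shared table
      else
        tables[id][k] = x
      end
    end
  end
  return tables[root_ref.ref]
end

-- Usage: one table referenced twice, plus a cycle back to the root.
local shared = { name = "shared" }
local graph = { left = shared, right = shared }
graph.self = graph

local root_ref, nodes = flatten(graph)
local copy = rebuild(root_ref, nodes)
```

After the round trip, `copy.left` and `copy.right` are the same table and `copy.self` is `copy` itself, while none of them is the original `shared` or `graph`.
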