ImagicTheCat's Blog

My Development Methodology

This document is primarily for myself. It may not be up to date or well structured. I wanted to improve on that before sharing, but I'm afraid I might never do so with this way of thinking. Just like my projects, this document is important to me; it can be improved over time, and it is shared even if not advertised. It may be of interest to people trying to understand what I'm doing in my projects, especially if I am no longer here to tell.

General

Programming

General:

Extended:

Heuristics

Heuristics that I consider true or useful to apply in my projects.

Guidelines


Code Complexity (experimental)

Use gzip/DEFLATE to measure code complexity. The rough reasoning is as follows:

This method measures information complexity; it doesn't take into account the reasoning behind the code or the relations within it. But I believe it is still much better than line count, and it is simple to perform reliably on any kind of language or format (gzip/DEFLATE is everywhere).
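
A minimal sketch of the measure, assuming a Unix-like shell with gzip and wc available (the file name is hypothetical):

    -- measure complexity as the gzip-compressed size of a source file
    local function complexity(path)
      local p = assert(io.popen("gzip -c < '" .. path .. "' | wc -c"))
      local bytes = tonumber(p:read("*a"))
      p:close()
      return bytes
    end

    print(complexity("mymodule.lua")) -- compressed size in bytes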

Correctness

There are multiple ways to reach a certain degree of correctness for a program; this is my model.

In my current methodology, the necessary degree of correctness is low, and formalism and static analysis are almost absent. The preferred ways are relation testing and consistency testing (assertions).

Analysis

Testing

Testing by relation is about checking a relation over the space of possible inputs/outputs of an operation. Cases can be generated, or they can be produced manually (commonly called unit testing).
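
A minimal sketch of relation testing over generated cases, assuming a hypothetical encode/decode pair whose relation is decode(encode(x)) == x:

    -- hypothetical operations: hex encoding and its inverse
    local function encode(s)
      return (s:gsub(".", function(c) return string.format("%02x", c:byte()) end))
    end
    local function decode(h)
      return (h:gsub("%x%x", function(cc) return string.char(tonumber(cc, 16)) end))
    end

    -- generate random inputs and check the relation on each
    math.randomseed(42)
    for _ = 1, 1000 do
      local t = {}
      for i = 1, math.random(0, 64) do t[i] = string.char(math.random(0, 255)) end
      local input = table.concat(t)
      assert(decode(encode(input)) == input, "inverse relation violated")
    end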

Examples of relation types:

Testing by consistency / coherence is about checking that the behavior of an operation or program is consistent with its rules and constraints. This can be done with ahead-of-time testing or with runtime checks: a lot of assertions about the state of a program can be inserted to detect issues as soon as possible. For example, if a program crashes, it is considered incoherent and a test failure.
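
A minimal sketch of runtime consistency checks, using a hypothetical stack whose rule is that `size` always matches the stored elements:

    local stack = {size = 0}

    local function push(v)
      assert(v ~= nil, "pushing nil would break the structure")
      stack.size = stack.size + 1
      stack[stack.size] = v
    end

    local function pop()
      assert(stack.size > 0, "pop on empty stack: incoherent state")
      local v = stack[stack.size]
      stack[stack.size] = nil
      stack.size = stack.size - 1
      return v
    end

    push(1); push(2)
    assert(pop() == 2 and pop() == 1) -- detect issues as soon as possible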

Performance

This section lists domains, methods and principles to follow when dealing with performance (computation, storage, networking, etc.).

Locality

Manage things to increase the locality of information when computing.

From closest to furthest: registers, L1-2-3 caches, RAM, drives, network, etc.

Caching

Caching can be used to improve locality, but also to factorize computations (prevent recomputing reused outputs).
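
A minimal sketch of this factorization, memoizing a pure function of one argument (the function is hypothetical):

    local function memoize(f)
      local cache = {}
      return function(x)
        local v = cache[x]
        if v == nil then
          v = f(x) -- computed once per distinct input
          cache[x] = v
        end
        return v
      end
    end

    local square = memoize(function(x) return x * x end)
    square(4) -- computed
    square(4) -- cached: the output is reused, not recomputed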

Pre-computation

A kind of caching: costly elements which will or might be needed later are computed ahead of time.

Algorithmic Improvements

Improve algorithmic complexity.

Overhead Reduction

Remove simple unnecessary operations, e.g. use lower-level constructs, avoid redundant checks across layers of abstraction, etc.

Approximation

Use an approximated process instead of a costly accurate one (heuristics, etc.).

Data Layout

Inspired[1] by Data-Oriented Design (if not the actual thing). This optimization can be very effective in my methodology!

The way I design things in Lua is inspired by Object-Oriented Programming, or at least I am using objects, more specifically a directed graph of objects. Each object/table has fields. When dealing with a lot of elements, this means creating a lot of tables. Instead of doing that, the data layout can be transposed / inverted: instead of having an array of tables with fields, we have an array for each field of the tables. This is a simplification, because we can put multiple fields in the same Lua or FFI array if they tend to be accessed together.
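
A minimal sketch of the transposition, with hypothetical particle fields:

    -- array of tables (object-like): particles[i] = {x = 0, y = 0, vx = 0, vy = 0}
    -- transposed: one array per field (vx/vy could share an array with x/y
    -- if they tend to be accessed together)
    local x, y, vx, vy = {}, {}, {}, {}

    local function add(px, py)
      local n = #x + 1
      x[n], y[n], vx[n], vy[n] = px, py, 0, 0
      return n
    end

    local function update(dt)
      for i = 1, #x do
        x[i] = x[i] + vx[i] * dt
        y[i] = y[i] + vy[i] * dt
      end
    end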

I have found in testing that it can be much more efficient even without the FFI, and that the increase in efficiency persists even without JIT compilation. It already seems very effective without going into the specifics of the hardware (e.g. cache line size).

It has a significant effect on multiple metrics:

The drawback is that this transposition / inversion of the data layout seems to imply the removal of the object-oriented abstraction and requires manual bookkeeping. But this removal of abstraction also comes with access to the context, i.e. how the objects (now just data) are used, which can reveal possible optimizations.

If we believe the heuristic that hotspots are mostly in loops, this data layout fits well, and it fits LuaJIT's heuristics.

It's also not just an optimization; it has interesting design properties. For example, it can be used as an alternative to a hierarchy of classes by using dynamic "components" which are assigned to entities, a design commonly called Entity Component System (but in my case I just care about systems and entities).

Distributed Computations (Parallelism)

Multi-threading, networks, etc.

For Multi-Threading:

Concurrent Computations

E.g. coroutines. These concurrent computations do not happen in parallel; the goal is to reduce the synchronous latency of an operation, which will probably increase its asynchronous latency and use more compute overall. But in a lot of cases, the synchronous latency is the problem. An example is Lua's incremental GC, which works concurrently with user code and uses (slightly?) more compute than a stop-the-world GC, but with much less synchronous latency, i.e. it will not freeze the program as the latter does.

Another way to see it is that the slow synchronous operation is divided into smaller ones and it becomes a slow asynchronous operation.
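
A minimal sketch of this division with a coroutine, where the work and the step size are hypothetical:

    -- a slow synchronous operation divided into smaller steps
    local task = coroutine.create(function()
      local sum = 0
      for i = 1, 1e8 do
        sum = sum + i
        if i % 1e6 == 0 then coroutine.yield() end -- give control back
      end
      return sum
    end)

    -- the caller resumes one step at a time (e.g. once per frame),
    -- interleaving the operation with other work
    local ok, result
    repeat
      ok, result = coroutine.resume(task)
      assert(ok, result)
    until coroutine.status(task) == "dead"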

Distributing Computations Over Time

The following is based on my experience in Factorio. Resources can be produced and consumed with different profiles. For example, the consumption can have spikes; in that case, storage (a buffer) may be used to mitigate those spikes.

Available computing resources for a specific system (ignoring energy saving) are a function of time: a given span of time is a given amount of available computing units. But those units fundamentally cannot be stored in a buffer in preparation for a consumption spike. Therefore, the periods when the processor is unused are computing units lost forever.

By using the idle time of the processor to compute elements that will or might be used in the future, time can be saved and latency reduced.

Fluidity

Most video games need to produce a certain number of frames per second, at a steady pace. For the game to be fluid, each frame cannot exceed a certain time budget. A spike in computing consumption results in lost frames (stuttering) or higher latency for concurrent operations.

Ideally, instead of consuming a low or medium amount of compute most of the time and a spiky amount sporadically, the game would consume a steady amount of compute all the time. This not only preserves the fluidity of the experience, but can also increase the amount of compute available to build the experience, because those computing units are no longer lost to idle time.
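
A minimal sketch of a steady compute budget, draining a work queue each frame (names and budget value are hypothetical):

    local queue = {}    -- pending units of work (functions)
    local BUDGET = 1e-3 -- seconds of background work per frame

    local function push_work(fn) queue[#queue + 1] = fn end

    local function run_budget()
      local deadline = os.clock() + BUDGET
      while queue[1] and os.clock() < deadline do
        -- run one unit of work (FIFO; a ring buffer would avoid the O(n) removal)
        table.remove(queue, 1)()
      end
    end

    -- each frame: update(dt); run_budget(); draw()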

Profiling

For general CPU profiling, use a statistical/sampling profiler. Avoid as much as possible profiling a modified version of the original system, e.g. avoid instrumentation. If the profile doesn't sufficiently match the original system, it can't be used to guide optimizations.

When investigating synchronous latency, use manual and specific instrumentation, progressing from the shallowest to the deepest level of detail, searching for the culprit with the smallest amount of change to the system.

For non-CPU profiling, like network bandwidth, use automatic/general instrumentation (e.g. count the number of bytes sent by a specific line of code).
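
A minimal sketch of such instrumentation, wrapping a hypothetical send function and accumulating bytes per call site:

    local counts = {} -- "source:line" -> bytes sent

    local function instrumented_send(peer, data)
      local info = debug.getinfo(2, "Sl") -- caller's source and line
      local site = info.short_src .. ":" .. info.currentline
      counts[site] = (counts[site] or 0) + #data
      return send(peer, data) -- hypothetical original send
    end

    -- later: dump per-line byte counts
    for site, bytes in pairs(counts) do print(site, bytes) end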

Versioning

This section describes how I should version my projects.

There are two kinds of versions: release and development.

Release

A release is a named specific point in time of a project to be distributed to users.

The version has the form <major>.<minor>, where major and minor are continuously incremented integers (starting at 1 for the major number).

Major versions are not backwards compatible and should be considered different branches of the project (which may be simultaneously maintained).

Minor versions are backwards compatible. The notion of backwards compatibility is subtle; thus it is not described here.

The frequency of major version releases must be inversely proportional to the number of users.

Development

A development version is a commit / check-in hash. Don't use the 0.<minor> form to describe a pre-version of the project.

FFI Bindings

In the case of bindings, don't see the LuaJIT FFI as a replacement for a C compiler, but as a better method than the Lua C API (less friction with C and higher-level) to write a wrapper. The FFI has limitations and is not able to load/check the C interface (headers) of loaded libraries. But if a C interface (a wrapper) is designed specifically for the FFI around a C library, the C compiler can do the work (e.g. checking, handling complex or platform-specific headers, etc.).

Making a C wrapper may be more work, but is safer and can be adapted to FFI needs.
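
A minimal sketch, where the library foo and its wrapper foo_wrapper are hypothetical:

    --[[ C side, compiled once against the real headers:
      // foo_wrapper.c
      #include <foo.h>
      double foo_wrapper_compute(double x){ return foo_compute(x); }
    ]]

    -- Lua side: the FFI only sees the simple, FFI-friendly interface
    local ffi = require("ffi")
    ffi.cdef[[
    double foo_wrapper_compute(double x);
    ]]
    local foo = ffi.load("foo_wrapper")
    print(foo.foo_wrapper_compute(2.0))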

Coroutines

As a rule for interoperability in the general case: only the entity or system that yielded the coroutine should be responsible for resuming it. This prevents resume conflicts, which allows composition of yielding operations.
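
A minimal sketch of the rule, with a hypothetical timer system: wait() yields from the timer system, so only the timer system resumes the coroutine, and wait() composes with other yielding operations:

    local timers = {} -- coroutine -> wake time

    local function wait(delay) -- must be called from inside a coroutine
      timers[coroutine.running()] = os.clock() + delay
      coroutine.yield() -- yielded by the timer system...
    end

    local function update_timers()
      local now, ready = os.clock(), {}
      for co, t in pairs(timers) do
        if now >= t then ready[#ready + 1] = co end
      end
      for _, co in ipairs(ready) do
        timers[co] = nil
        assert(coroutine.resume(co)) -- ...and resumed only by it
      end
    end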

Entity Systems

As mentioned in the performance section, changing the data layout can be more than an optimization. This section explains a method which removes my need for OOP (my version of it). I should never use a class system again; I can still use objects, i.e. direct use of metatables, but not a class system with the intent to build some kind of class hierarchy.

The idea is to have systems. A system is an object: it exposes an interface made of methods, and it implements processes and manages the data related to them. The system is an abstraction over its data; it can manipulate its structures as necessary. A system can use other systems.

The following is a specific implementation of this idea. Entity systems are more specific systems dealing with unique shared entities. In a specific context, which we could call a world, each entity is identified by a unique identifier which cannot be reused: a Lua number incremented at each entity creation. This number is then given to various entity systems to register the entity into them. The components of an entity are implicitly given by the systems it is registered in.

This implies that components are dynamic and that use-after-free / double-free are not really an issue. But there is still bookkeeping to be done to avoid memory leaks.

Entity systems can have different purposes, and they can own multiple data structures associated with different processes. An important purpose would be to perform a specific computation on all of its entities. To make that efficient, the system can use arrays, with or without the FFI.

The allocation strategy of these arrays is: add a new element at the end and remove an element by moving the last element into the removed element's slot. This is simple and prevents holes. To do that, a map from entity_id to internal_id and a map from internal_id to entity_id (effectively an array) are used. This is possible because the system's data is behind an abstraction. The internal_id can then be used to index the different arrays.
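
A minimal sketch of this strategy, with a hypothetical field array hp:

    local system = {
      internals = {}, -- entity_id -> internal_id
      entities = {},  -- internal_id -> entity_id (effectively an array)
      hp = {}         -- field array, indexed by internal_id
    }

    function system.add(id, hp)
      local n = #system.entities + 1
      system.entities[n], system.internals[id] = id, n
      system.hp[n] = hp
    end

    function system.remove(id)
      local i, last = system.internals[id], #system.entities
      local last_id = system.entities[last]
      -- move the last element into the removed slot: no holes
      system.entities[i], system.internals[last_id] = last_id, i
      system.hp[i] = system.hp[last]
      system.entities[last], system.hp[last] = nil, nil
      system.internals[id] = nil
    end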

In preliminary testing, performance is better on multiple metrics compared to objects, even without the FFI and without the JIT. The following are my observations.

Memory usage should be lower:

GC collection should be faster:

For processing, the optimization opportunities seem much better, as I understand it from Data-Oriented Design: batch operations, weave together related data or split unrelated data, improve cache density, decrease (instruction-)cache misses, remove some branches, do manual loop hoisting to optimize for the interpreter, etc.

In conclusion, it should not be less efficient, and it gives the opportunity to be much more efficient. But the impact of this method with regard to simplicity and performance still needs to be determined in a real project.

Recommendations:

Documentation

Data

LÖVE and Simplicity

I no longer consider making (and playing) multiplayer experiences, especially a game as a service or even one with a dedicated server like Minecraft, to be essential. I like that an experience can be enjoyed alone, offline, customized, and with the data stored locally. It also simplifies things a lot, because multiplayer is more complex (synchronization, authentication, security, data responsibility, etc.).

I can still make single-player games with multiplayer features, but at a smaller scale, with a server running on LÖVE, for LAN (see ZeroTier) or the Internet, whether it is embedded in the game, launched as a separate application on the desktop, or run on a dedicated GNU/Linux server. In all cases, the server and the client would use the same methodology, which simplifies things a lot, especially for distribution.

This means that it is possible to no longer use a system event loop like ljuv and to rely instead on LÖVE threads, channels and elscheduler to build what is needed for a specific game, which is simpler.
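
A minimal sketch using only LÖVE threads and channels (the job protocol is hypothetical):

    -- main.lua (runs inside LÖVE)
    local worker = love.thread.newThread([[
      local jobs = love.thread.getChannel("jobs")
      local results = love.thread.getChannel("results")
      while true do
        local job = jobs:demand() -- blocking pop
        if job == "quit" then break end
        results:push(job * job) -- hypothetical work
      end
    ]])
    worker:start()

    love.thread.getChannel("jobs"):push(7)
    -- e.g. in love.update: poll for results without blocking
    local r = love.thread.getChannel("results"):pop()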

I can probably do all I want with LÖVE, LuaJIT and the FFI, without any AOT compilation, which would dramatically increase distribution complexity. I want to follow this constraint of not doing AOT compilation or using additional native dependencies, for simplicity and to master optimizations through a deep understanding of the interpreter, the tracing JIT, C data semantics, processors (Linux perf), etc.

(As a last resort, I could do AOT compilation of C code for performance reasons; it could be an optional library loaded at runtime, with a fallback path written in Lua.)

Furthermore, a good chunk of computation is or can be performed on the GPU dynamically with GLSL, making this constraint even more reasonable.

All I Need Is LÖVE. Not exactly, because Lua/LuaJIT and C are still at the core of my methodology and LÖVE is not completely aligned with how I want to do things, but it sounds good.

This allows me to focus on what really matters to me: developing interactive and non-interactive audiovisual experiences, prototypes and games in a way that seeks simplicity. It is about programming, simulation, algorithmic generation, fiction, lore, world building, storytelling, visuals, animations, text, music, audio atmospheres, sound design, etc.

A lot can already be done with this constraint.


  1. ^ Mostly from CppCon 2014: Mike Acton, "Data-Oriented Design and C++". I don't agree with everything; I just took the fundamental ideas.