
My Development Methodology

This document is primarily for myself. It may not be up to date or well structured. I wanted to improve it before sharing, but I'm afraid I might never do so with that way of thinking. Like my projects, this document is important to me, can be improved over time, and is shared even if not advertised to others. It may be of interest to people trying to understand what I'm doing in my projects, especially if I am no longer here to tell them.

The purpose of this methodology is to create things in the context of video game development, e.g. standalone music, visuals, audiomations or, of course, games. It is designed with the premise of working alone, on small codebases, with infrequent changes.

The methodology seeks simplicity, efficiency and, through progressive refinements, quality.

General

Licensing

Programming

General:

Extended:

Heuristics

Heuristics that I consider true or useful to apply in my projects.

Guidelines


Dependencies

Apart from fundamental dependencies like LÖVE or LuaJIT, all of the other dependencies should be Lua code (see LÖVE and Simplicity at the end of this document).

Instead of using a package manager like LuaRocks, or building a very simple one for my use case, I just add the dependencies to the source tree.

Advantages I can think of:

The only drawback I can think of is the need to store more data, but this is negligible in my case. Also, care must be taken with the licenses and stewards of third-party dependencies, but they should be rare (especially to avoid the need to shun artifacts).

Code Complexity (experimental)

Use gzip/DEFLATE to measure code complexity. The rough reasoning is as follows:

This method measures information complexity; it doesn't take into account the reasoning behind the code or the relations within it. But I believe it is still much better than line count, and it is simple to perform reliably on any kind of language or format (gzip/DEFLATE is everywhere).
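
A minimal sketch of the measurement, assuming LÖVE 11+ for love.data.compress; the file path is only an example.

    -- Measure the "complexity" of a source file as its gzip-compressed size.
    -- Assumes LÖVE 11+ (love.data.compress); the path is illustrative.
    local function complexity(path)
      local f = assert(io.open(path, "rb"))
      local src = f:read("*a")
      f:close()
      local compressed = love.data.compress("string", "gzip", src)
      return #compressed, #src
    end

    local c, raw = complexity("src/entity.lua")
    print(("entity.lua: %d compressed bytes (%d raw)"):format(c, raw))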

Correctness

There are multiple ways to reach a certain degree of correctness for a program; this is my model.

In my current methodology, the necessary degree of correctness is low, and formalism and static analysis are almost absent. The preferred ways are relation testing and consistency testing (assertions).

Analysis

Testing

Testing by relation is about checking a relation over the field of possible inputs / outputs of an operation. Cases can be generated, or they can be produced manually (commonly called unit testing).
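
A minimal sketch of a generated relation test, here an inverse relation (decode of encode); the encode/decode functions are placeholders for the operation under test.

    -- Check the inverse relation decode(encode(x)) == x over generated cases.
    -- encode/decode are placeholders for the operation under test.
    local function encode(n) return tostring(n) end
    local function decode(s) return tonumber(s) end

    math.randomseed(42) -- deterministic case generation
    for _ = 1, 1000 do
      local x = math.random(-1e6, 1e6)
      assert(decode(encode(x)) == x, "relation violated for "..x)
    end
    print("relation holds on generated cases")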

Examples of relation types:

Testing by consistency / coherence is about checking that the behavior of an operation or program is consistent with its rules and constraints. This can be done with ahead-of-time testing or with runtime checks: many assertions about the state of the program can be inserted to detect issues as soon as possible. For example, if a program crashes, it is considered incoherent and a test failure.
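
A small sketch of such runtime checks, with illustrative names:

    -- Assertions about the program state, failing as soon as it becomes incoherent.
    -- The account/withdraw names are illustrative.
    local function withdraw(account, amount)
      assert(amount > 0, "amount must be positive")
      account.balance = account.balance - amount
      assert(account.balance >= 0, "incoherent state: negative balance")
      return account.balance
    end

    print(withdraw({balance = 10}, 3)) --> 7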

To facilitate testing, don't hesitate to expose internal functions; just don't document them. The documentation defines the API, but other things can be exposed in the actual code; if someone uses these undocumented features, it is at their own risk.

Performances

This section lists domains, ways and principles to follow when dealing with performances (computation, storage, networking, etc.).

Locality

Manage things to increase the locality of information when computing.

From closest to furthest: registers, L1-2-3 caches, RAM, drives, network, etc.

Caching

Caching can be used to improve locality, but also to factorize computations (prevent recomputing reused outputs).
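
A minimal memoization sketch of this factorization; compute stands in for a costly pure operation.

    -- Cache outputs keyed by input to avoid recomputing reused results.
    -- compute is a stand-in for a costly pure operation.
    local function compute(x) return x * x end

    local cache = {}
    local function cached_compute(key)
      local out = cache[key]
      if out == nil then
        out = compute(key) -- computed once per key
        cache[key] = out
      end
      return out
    end

    print(cached_compute(8), cached_compute(8)) -- second call hits the cache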

Pre-computation

A kind of caching: costly elements that will or might be needed later are computed ahead of time.

Idiomatic Code

LuaJIT's optimizations are probably made with idiomatic Lua code in mind; thus, seeking to write that kind of code can lead to performance improvements.

An example of that is the code generation of mgl: I tried to generate code that I would have written manually.

It is still possible to generate non-idiomatic code that could be more efficient, but idiomatic code has more stability guarantees.

Algorithmic Improvements

Improve algorithmic complexity.

Overhead Reduction

Remove simple unnecessary operations. E.g. use lower-level constructs, avoid checks at different layers of abstractions, etc.

Approximation

Use an approximated process instead of a costly accurate one (heuristics, etc).

Data Layout

Inspired[1] by Data Oriented Design (if not the actual thing). This optimization can be very effective in my methodology!

The way I design things in Lua is inspired by Object Oriented Programming, or at least I am using objects, more specifically a directed graph of objects. Each object/table has fields. When dealing with lots of elements, this means creating a lot of tables. Instead of doing that, the data layout can be transposed / inverted, i.e. instead of having an array of tables with fields, we have an array for each field of the tables. This is a simplification, because we can put multiple fields in the same Lua or FFI array if they tend to be accessed together.
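
A minimal sketch of the transposition, with illustrative fields:

    -- Array of objects: one table per element.
    local objects = {}
    for i = 1, 1000 do objects[i] = {x = 0, y = 0, health = 100} end

    -- Transposed layout: one array per field (or per group of fields
    -- accessed together), no per-element table.
    local xs, ys, healths = {}, {}, {}
    for i = 1, 1000 do xs[i], ys[i], healths[i] = 0, 0, 100 end

    -- A hot loop then only touches the arrays it needs.
    for i = 1, #xs do xs[i] = xs[i] + 1 end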

I have found in testing that it can be much more efficient even without the FFI, and that the increase in efficiency is still present even without JIT compilation. It also seems very effective without having to go into the specifics of the hardware (e.g. cache line size).

It has a significant effect on multiple metrics:

The drawback is that this transposition / inversion of the data layout seems to imply the removal of the object oriented abstraction and requires manual bookkeeping. But, this removal of abstraction also comes with access to the context, i.e. how the objects (now just data) are used, which can reveal possible optimizations.

If we believe the heuristic that hotspots are mostly in loops, this data layout fits well, and it fits LuaJIT's heuristics.

It's also not just an optimization; it has interesting design properties. For example, it can be used as an alternative to a hierarchy of classes by using dynamic "components" which are assigned to entities, a design commonly called Entity Component System (but in my case I just care about systems and entities).

Distributed Computations (Parallelism)

Multi-threading, networks, etc.

For Multi-Threading:

Concurrent Computations

E.g. coroutines. These concurrent computations do not happen in parallel; the point is to reduce the synchronous latency of an operation, which will probably increase its asynchronous latency and use more compute overall. But, in a lot of cases, the synchronous latency is the problem. An example of that is the Lua incremental GC, which works concurrently with user code and uses (slightly?) more compute than a stop-the-world GC, but with much less synchronous latency, i.e. it does not freeze the program like the latter.

Another way to see it is that the slow synchronous operation is divided into smaller ones and it becomes a slow asynchronous operation.
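
A minimal sketch of that division using a coroutine; the work items and the step budget are illustrative.

    -- Divide a slow synchronous operation into smaller steps with a coroutine.
    -- The items and the yield interval are illustrative.
    local items = {}
    for i = 1, 10000 do items[i] = i end

    local function make_task(list)
      local co = coroutine.create(function()
        for i = 1, #list do
          -- ...process list[i] here...
          if i % 100 == 0 then coroutine.yield() end -- hand control back regularly
        end
      end)
      return function() -- step function, to be called e.g. once per frame
        if coroutine.status(co) == "dead" then return false end
        assert(coroutine.resume(co))
        return true
      end
    end

    local step = make_task(items)
    while step() do end -- driven by a plain loop here; by the frame loop in a game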

Distributing Computations Over Time

The following is based on my experience in Factorio. Resources can be produced and consumed with different profiles. For example, the consumption can have spikes; in that case, storage (a buffer) may be used to mitigate those spikes.

Available computing resources for a specific system, if we ignore energy saving, are a function of time: a specific amount of time is an amount of available computing units. But those units fundamentally cannot be stored in a buffer in preparation for a consumption spike. Therefore, the periods when the processor is unused are computing units lost forever.

By using the idle time of the processor to compute elements that will or might be used in the future, time can be saved and latency reduced.

Fluidity

Most video games need to produce a certain number of frames per second, at a steady pace. For the game to be fluid, each frame cannot exceed a certain time budget. A spike in computing consumption would result in lost frames (stuttering) or higher latency for concurrent operations.

Ideally, instead of consuming a low or medium amount of compute most of the time and spiky amounts sporadically, the game would consume a steady amount of compute all the time. This not only preserves the fluidity of the experience, but can also increase the amount of compute available to build the experience, because those computing units are no longer lost in idle times.

Profiling

For general CPU profiling, use a statistical/sampling profiler. Avoid as much as possible the profiling of a modified version of the original system, e.g. avoid instrumentation. If the profile doesn't sufficiently match the original system, it can't be used to guide optimizations.

When the concern is synchronous latency, use manual and specific instrumentation, progressing from the shallowest to the deepest level of detail, searching for the culprit with the smallest amount of change to the system.
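
A rough sketch of such shallow, targeted timing; the function names are illustrative stand-ins.

    -- Shallow manual instrumentation of a frame: time the two main operations,
    -- then instrument deeper only where the time actually goes.
    -- update_world/draw_world are illustrative stand-ins.
    local function update_world(dt) --[[ ...game update... ]] end
    local function draw_world() --[[ ...rendering... ]] end

    local t0 = os.clock()
    update_world(1/60)
    local t1 = os.clock()
    draw_world()
    local t2 = os.clock()
    print(("update: %.3f ms  draw: %.3f ms"):format((t1-t0)*1e3, (t2-t1)*1e3))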

For non-CPU profiling, like network bandwidth, use automatic/general instrumentation (e.g. count the number of bytes sent by a specific line of code).

Versioning

This section describes how I should version my projects.

There are two kinds of versions: release and development.

Release

A release is a named specific point in time of a project to be distributed to users.

The version has the form <major>.<minor>, where major and minor are continuously incremented integers (starting at 1 for the major number).

Major versions are not backwards compatible and should be considered different branches of the project (which may be simultaneously maintained).

Minor versions are backwards compatible. The notion of backwards compatibility is subtle; thus it is not described here.

The frequency of major version releases must be inversely proportional to the number of users.

Development

A development version is a commit / check-in hash. Don't use the 0.<minor> form to describe a pre-version of the project.

FFI Bindings

In the case of bindings, don't see the LuaJIT FFI as a replacement for a C compiler, but as a better method than the Lua C API (less friction with C, higher-level) to write a wrapper. The FFI has limitations and is not able to load/check the C interface (headers) of loaded libraries. But if a C interface (a wrapper) is designed specifically for the FFI around a C library, the C compiler can do the work (e.g. checking, use of complex or platform-specific headers, etc.).

Making a C wrapper may be more work, but is safer and can be adapted to FFI needs.
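
A rough sketch of the Lua side of such a binding; the "mywrap" library and its function are hypothetical, the wrapper itself being compiled by a C compiler against the real headers.

    -- Bind a hypothetical C wrapper designed for the FFI.
    -- The wrapper keeps declarations simple and FFI-friendly.
    local ffi = require("ffi")

    ffi.cdef[[
      int mywrap_do_work(const char *input, int len);
    ]]

    local lib = ffi.load("mywrap") -- hypothetical shared library
    print(lib.mywrap_do_work("hello", 5))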

Coroutines

As a rule for interoperability in the general case: only the thing that yielded the coroutine should be responsible for resuming it. This prevents resuming conflicts, which allows composition of yielding operations.

Also, use a [concurrent] tag in source documentation to warn that a function can yield.
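
A minimal sketch of the rule with an illustrative timer system: it is the thing that yields the calling coroutine and the only thing that resumes it.

    -- [concurrent] suspends the calling coroutine for `delay` seconds;
    -- only the timer system below resumes it. Names are illustrative.
    local timers = {} -- list of {deadline = ..., co = ...}

    local function wait(delay)
      local co = assert(coroutine.running(), "must be called from a coroutine")
      timers[#timers + 1] = {deadline = os.clock() + delay, co = co}
      coroutine.yield()
    end

    -- called regularly (e.g. each frame) by the timer system
    local function update_timers()
      local now = os.clock()
      for i = #timers, 1, -1 do
        local t = timers[i]
        if now >= t.deadline then
          table.remove(timers, i)
          assert(coroutine.resume(t.co))
        end
      end
    end

    local co = coroutine.create(function()
      print("start"); wait(0.1); print("done")
    end)
    coroutine.resume(co)
    while #timers > 0 do update_timers() end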

A Strategy Without OOP

This section describes a new way of developing my projects that is radically different from what I have done for the past years. The radical change is the avoidance of Object Oriented Programming (or my version of it). The change has been triggered by ideas coming from Data Oriented Design.

Instead of thinking in terms of objects, the fundamental concepts are processes, data and systems. More specifically, I work with lexical scoping, locals, functions, upvalues, data structures (tables, etc.), modules (files) and interfaces (sets of functions).

We start from a simple Lua file / module and we add as much local data and as many functions as we need to achieve our goal. The file will grow larger, maybe to thousands of lines, and the hope is that at some point we may realize that some processes are independent, that is, they depend on completely different kinds of data, or with minimal needs, i.e. they use different upvalues. As an analogy, we can call these independent sets of data and processes islands, and we can call bridges the minimal dependencies which may exist. If these islands exist, we can move them to other modules, reducing the amount of context present in a single module, and the bridge can be a function call. Conversely, if we realize later on that a module depends a lot on the data of another module, we can merge them. The idea is to perceive the relationships between the processes and the data and to decompose the code in a way that reflects these relationships.

What I called a module can also be a dynamically created system. A module or system exposes an interface; it implements processes and manages the data related to them. The system is an abstraction over its data: it can manipulate its structures as necessary. A system can use other systems. It is important to note that the interface is not about having an object, but a set of functions; the functions hold the data. In this view, a simple callback can be seen as an interface. Ideally, this interface must not expose references to the internal data structures, to allow for easy changes of the internal data layout. This design also allows systems to be composed easily: a system's interface can be created and used to add more general capabilities to another system. Thanks to upvalues, they are just functions which can be added to the other system.
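
A minimal sketch of such a system, with illustrative names: the interface is a set of functions holding their data through upvalues.

    -- A dynamically created system: internal data is held by upvalues and
    -- never exposed; the interface is just a set of functions.
    local function new_counter_system()
      local counts = {} -- internal data structure, free to change layout

      local function add(name, n)
        counts[name] = (counts[name] or 0) + (n or 1)
      end
      local function total()
        local sum = 0
        for _, n in pairs(counts) do sum = sum + n end
        return sum
      end

      return {add = add, total = total}
    end

    local stats = new_counter_system()
    stats.add("jump"); stats.add("jump"); stats.add("shoot", 3)
    print(stats.total()) --> 5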

An important goal for this strategy compared to OOP is to grant the ability to perceive the context of processes and data, which can lead to simplification and optimization opportunities.

This strategy should have a positive effect on these criteria:

Over-generalization seems to be a very important problem for me; by thinking about objects in isolation, I tend to think about all the things they could be used for, and I start thinking about stuff that I may never need.

Another important point that is more personal: I feel like it is much more enjoyable to develop this way. I guess it is because it allows me to focus on the concrete needs of the system, with a better understanding of it as a whole and more opportunities to refine it. I now want to describe my old way of writing code as "too corporate", probably because of too many protocols, conventions or other things that I considered necessary instead of actually making the thing. I feel like I wasted my time, but maybe OOP was for me a stand that I no longer need.

Entity Systems

As mentioned in the performances section, changing the data layout can be more than an optimization. This section explains a method which removes my need for OOP to model entities. I should never use a class system again; I can still occasionally use objects, e.g. file handles, but certainly not a class system with the intent to build some kind of class hierarchy.

Entity systems are more specific systems dealing with unique shared entities. In a specific context we could call a world, each entity is identified by a unique identifier which cannot be reused: a Lua number incremented at each entity creation. This number is then given to various entity systems to register the entity into them. The components of an entity are implicitly given by the systems it is registered in.

It implies that components are dynamic and use-after-free / double-free are not really an issue. But, there is still bookkeeping to be done to avoid memory leaks.

Entity systems can have different purposes and they can own multiple data structures associated to different processes. An important purpose would be to perform a specific computation on all of its entities. To make that efficient, the system can use arrays, with or without the FFI.

The allocation strategy of these arrays is: add a new element at the end and remove an element by moving the last element into the removed element's slot. This is simple and prevents holes. To do that, a map from entity_id to internal_id and a map from internal_id to entity_id (effectively an array) are used. This is possible because the system's data is behind an abstraction. The internal_id can then be used to index the different arrays.
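
A sketch of this strategy with illustrative fields: dense arrays with swap-remove and the two id maps.

    -- Entity system with dense field arrays. Removal moves the last element
    -- into the freed slot (no holes). Field names are illustrative.
    local function new_position_system()
      local xs, ys = {}, {}          -- one array per field
      local entity_to_internal = {}  -- entity_id -> internal_id
      local internal_to_entity = {}  -- internal_id -> entity_id (an array)

      local function register(entity_id, x, y)
        local id = #internal_to_entity + 1 -- add at the end
        xs[id], ys[id] = x, y
        entity_to_internal[entity_id] = id
        internal_to_entity[id] = entity_id
      end

      local function unregister(entity_id)
        local id = entity_to_internal[entity_id]
        local last = #internal_to_entity
        xs[id], ys[id] = xs[last], ys[last] -- move last element into the slot
        local moved = internal_to_entity[last]
        internal_to_entity[id] = moved
        entity_to_internal[moved] = id
        xs[last], ys[last], internal_to_entity[last] = nil, nil, nil
        entity_to_internal[entity_id] = nil
      end

      local function update(dt) -- batch processing over dense arrays
        for id = 1, #internal_to_entity do xs[id] = xs[id] + dt end
      end

      return {register = register, unregister = unregister, update = update}
    end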

In preliminary testing, performance is better on multiple metrics compared to objects, even without the FFI and without the JIT. The following are my observations.

Memory usage should be lower:

GC collection should be faster:

For processing, the optimization opportunities seem to be much better as I understand it from Data Oriented Design. E.g. batch operations, weave together related data or split unrelated data, improve cache density, decrease (instruction-)cache misses, remove some branches, do manual loop hoisting to optimize for the interpreter, etc.

In conclusion, it should not be less efficient, and it gives the opportunity to be much more efficient. But the impact of this method with regard to simplicity and performances still needs to be determined in a real project.

Recommendations:

Algorithmic Generation and Chaos

I use the expression "algorithmic generation" in place of "procedural generation" because I consider it more accurate: it doesn't matter much if the language is procedural, what matters is the use of algorithms to generate data.

I want to invest myself deeply in algorithmic generation, and when it comes to randomness in the context of games, I just see it as a specific kind of generation: the algorithmic generation of chaos. There is no concept of true or false randomness in this view; it's all about order and chaos. E.g. sometimes I may need chaos that has less discrepancy, something biased.

I don't believe that I will ever need the randomness of physics in my projects. On the contrary, I want the determinism.

Then, these chaotic functions can be used to explore the field of possibilities with a chaotic sampling instead of an ordered sampling.
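
A tiny sketch of deterministic chaotic sampling, assuming LÖVE's seeded generator (love.math.newRandomGenerator):

    -- The same seed always produces the same "chaotic" sequence, usable to
    -- sample a field of possibilities deterministically.
    local rng = love.math.newRandomGenerator(12345)
    for i = 1, 5 do
      print(rng:random()) -- deterministic value in [0, 1)
    end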

Documentation

Data

LÖVE and Simplicity

I don't consider making (and playing) multiplayer experiences, especially something like a game as a service or even a dedicated server like Minecraft, as essential anymore. I like that an experience can be enjoyed alone, offline, customized, and with the data stored locally. It also simplifies things a lot, because multiplayer is more complex (synchronization, authentication, security, data responsibility, etc.).

I can still make single player games with multiplayer features, but at a lower scale, with a server running with LÖVE, for LAN (see ZeroTier) or Internet, whether it is embedded in the game, launched as a separate application on the desktop or on a dedicated GNU/Linux server. In all cases, the server and the client would use the same methodology, which simplifies things a lot, especially for distribution.

This means that it is possible to no longer use a system event loop like ljuv and to rely instead on LÖVE threads, channels and elscheduler to build what is needed for a specific game, which is simpler.

I can probably do all I want with LÖVE, LuaJIT and the FFI, without any AOT compilation, which would dramatically increase the distribution complexity. I want to follow this constraint of not doing AOT compilation or using additional native dependencies, for simplicity and to master optimizations restricted to a deep understanding of the interpreter, the tracing JIT, C data semantics, processors (Linux perf), etc.

(As a last resort, I could do AOT compilation of C code for performance reasons and it could be an optional library loaded at runtime with a fallback path written in Lua.)

Furthermore, a good chunk of computation is or can be performed on the GPU dynamically with GLSL, making this constraint even more reasonable.

All I Need Is LÖVE. Not exactly, because Lua/LuaJIT and C are still at the core of my methodology and LÖVE is not completely aligned with how I want to do things, but it sounds good.

This allows me to focus on what really matters to me: developing interactive and non-interactive audiovisual experiences, prototypes and games in a way that seeks simplicity. It is about programming, simulation, algorithmic generation, fiction, lore, world building, storytelling, visuals, animations, text, music, audio atmospheres, sound design, etc.

A lot can already be done with this constraint.


  [1] Mostly from CppCon 2014: Mike Acton "Data-Oriented Design and C++". I don't agree with everything; I just took the fundamental ideas.