This document is primarily for myself. It may not be up to date or well structured. I wanted to improve on that before sharing, but I'm afraid I might never do it with this way of thinking. Just like my projects, this document is important to me; it can be improved over time and is shared even if not advertised to others. It may be of interest to people who are trying to understand what I'm doing in my projects, especially if I am no longer here to tell.
The purpose of this methodology is to create things in the context of video game development, e.g. standalone music, visuals, audiomations or, of course, games. It is designed under the premise of working alone, on small codebases, with infrequent changes.
The methodology seeks simplicity, efficiency and, through progressive refinements, quality.
General
- Try to work at the highest levels of abstraction to produce sources that are as understandable and concise as possible. Mix editing and algorithms (which are also a form of editing). E.g. in the case of sound synthesis, don't use data produced by captures (sampling, etc.). Same for the production of textures, animations, geometry, etc. Focus on synthesis from algorithms, parameters and other high-level authored data instead of versioning concrete, indecipherable blobs into repositories (this excludes high-level blobs like encodings of Lua data model objects). Concrete data can then be generated from those high-level sources.
Licensing
- Use the MIT license for modules / libraries. They are designed to be included into existing codebases.
- Use the GPL for end user projects (applications). E.g. a standalone editor or a video game.
- Use the CC-BY-SA for data which are not covered by free software licenses.
Programming
General:
- Lua/LuaJIT (deep understanding of the JIT, interpreter, VM, etc.)
- C data model / FFI
- Processor/CPU behavior (understanding of instructions, cache, memory, etc.)
Extended:
- LÖVE
- GLSL
- GPU behavior
Heuristics
Heuristics that I consider true or useful to apply in my projects.
- Most time is spent reading code, not writing it. Optimize for that first.
- Similarly, in a database or state, most time is spent reading, not writing.
Guidelines
- Work at different levels of abstraction, composition and detail. Balance the graph of function dependencies so that it is neither too wide (not enough composition, functions are too long, too much context in the body) nor too deep (too many indirections to understand what is actually happening).
- Find a balance between compression and expansion. I.e. cryptic vs. verbose statements, or what I want to call qualitative vs. quantitative complexity. If the code is too compressed, it can make things hard to understand, because there is too much mental processing required. If the code is not compressed enough, it becomes less readable because there is too much to read to get it.
- Interface abstraction: define an interface somewhere (e.g. source comments or web documentation) and use it as the reference for usage and implementation, so that the user doesn't need to think about the implementation and vice versa.
- Be careful of over-generalization (e.g. using something to solve everything, when something specific would be better) and over-specialization (e.g. using something too specific when a more general solution is good).
- Error propagation, warnings, logging and feedback in general are very important for correctness and the user experience.
- Take the time to make good debugging tools. When I need to diagnose a problem, I tend to use the quickest approach because I want to stay focused on the problem instead of thinking about improving the method. But, bugs are common. Investing time in making better debugging tools specific to the project will help to improve its quality and save time in the long run.
- Let's say that there are 3 stages when programming: research, expert and forget. At the research stage, the code is changing a lot and it is not clear how to achieve the objectives. At the expert stage, the code is well structured and here to stay and there is implicit knowledge about its inner workings. At the forget stage, the code hasn't been touched for a while and the implicit knowledge has to be reconstructed from it. The best moment to document is at the expert stage.
- Limit pre-formatted text lines to 80 characters (with exceptions). Better for side-by-side diffs, pocket editing and readability.
- Most of the time, put the MIT license in a LICENSE and main source file (with short comments, watch for the 80 characters limit). For the other files, just put a license name header (with references to local files containing the entire license) and a copyright notice.
- Use spaces instead of tabs. See textual programming as the editing of a matrix of characters (with minor nitpicks like avoiding trailing spaces), which gives formatting power. Tabs not only take multiple cells of the matrix, which impairs that idea, but they also take a variable number of them (dependent on a specific configuration).
- Use snake_case (lower case only, with the exception of constants) for variable, field, function and method identifiers. I was using camelCase for methods to distinguish them from the rest, but a single-word camelCase is equivalent to a single-word snake_case, and I try to use single words most of the time, thus it is not relevant. Those identifiers are keywords: it is possible to avoid underscores (spaces) when relevant or to shorten words (seldom). For literal string keywords (labels), use a hyphen instead of the underscore. The general idea is to build a keyword by concatenating words with a separator (or none), all in lower case or upper case, with the separator dependent on the context (underscore, hyphen, etc.). For modules, use a hyphen for the root module (same as the project name) and an underscore for the sub-modules. (A short example follows this list.)
- Don't use plain text Fossil check-in comments, but use the timeline-hard-newlines and timeline-truncate-at-blank options, without inserting line breaks for pre-formatting purposes when editing the comment. This allows 3 levels of detail for check-ins: a first-line summary, an optional comment body (after a blank line) and an optional wiki page. Plain text formatting, e.g. hyphen lists, can still be used to mostly avoid the Fossil wiki format. Prefer an active form for the first-line summary unless there is a better descriptive form.
- Use a Fossil wiki page to track a tree of tasks (to work at different levels of composition); the Fossil tickets are the leaf nodes of the tree. E.g. for a roadmap.
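As an illustration of the naming conventions above, here is a minimal sketch with hypothetical identifiers:

    local MAX_VOICES = 8                  -- constant: upper case
    local sample_rate = 48000             -- variable: snake_case, single words preferred
    local function mix(voices, gain) end  -- function: a single word when possible
    local event = {kind = "note-on"}      -- literal string keyword (label): hyphen separator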
Dependencies
Apart from fundamental dependencies like LÖVE or LuaJIT, all of the other dependencies should be Lua code (see LÖVE and Simplicity at the end of this document).
Instead of reusing a package manager like LuaRocks or building a very simple one for my use case, just add the dependencies to the source tree.
Advantages I can think of:
- Simplicity: No need for a package manager or anything else.
- Stability: Because the dependencies are versioned with the project, they will not suddenly break; updating them requires a manual change which can be reviewed like any other change to the project. Simply put, the project and its dependencies are in sync.
- Security: For the same reasons as above, a third-party update can be reviewed, and malicious changes have a better chance of being detected and rejected.
- Durability: Because the dependencies are in the project, no external resources need to be accessed for it to still function in the future.
- Customization: Each dependency can be adapted to the project. But, this is rarely needed and should probably be avoided. If done, make sure to keep track of the upstream version used to facilitate 3-way-merges.
The only drawback I can think of is the need to store more data, but this is negligible in my case. Also, be careful about the licenses and stewards of third-party dependencies, but they should be rare (especially to avoid the need to shun artifacts).
Code Complexity (experimental)
Use gzip/DEFLATE to measure code complexity. The rough reasoning is as follows:
- Hypothesis: code complexity is related to "information complexity".
- The same identifier or construct used multiple times can be simple to understand (matches compression).
- Comments are more abstract information, additional details and explanations about the code. They are not optional and can be seen as part of the overall complexity.
- Use general purpose lossless compression to measure irreducible information complexity.
- Improvements in lossless compression are logarithmic, thus gzip is probably already a relatively good measure of irreducible information complexity.
- Compress files separately to match per file understanding of the information and avoid global compression (limited context). Use the resulting byte count as the measure. Do the sum of all files of a project for total complexity.
This method measures information complexity; it doesn't take into account reasoning and the relations within the code. But I believe it is still much better than line count, and it is simple to perform reliably on any kind of language or format (gzip/DEFLATE is everywhere).
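As a sketch of how this measure could be computed, assuming a POSIX shell with gzip and wc available (the file list is hypothetical):

    -- Sum the gzip-compressed size of each file, compressed separately.
    local files = {"main.lua", "audio.lua", "render.lua"}
    local total = 0
    for _, path in ipairs(files) do
      local pipe = io.popen("gzip -c '" .. path .. "' | wc -c")
      local bytes = tonumber(pipe:read("*a"))
      pipe:close()
      print(path, bytes)
      total = total + bytes
    end
    print("total complexity (bytes):", total)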
Correctness
There are multiple ways to reach a certain degree of correctness for a program; this is my model.
In my current methodology, the necessary degree of correctness is low, and formalism and static analysis are almost absent. The preferred ways are relation testing and consistency testing (assertions).
Analysis
- Manual: understanding the code.
- Automatic: static analysis, etc.
Testing
- Manual testing: not formalized.
- Automatic testing:
- By relation.
- By consistency / coherence.
Testing by relation is about checking a relation in the field of possible inputs / outputs of an operation. Cases can be generated or they can be produced manually (commonly called unit testing).
Examples of relation types:
- Identity: Check that function A has the same behavior as function B. E.g. two versions of the same function; one implementation is naive and easy to understand, the other is optimized and hard to understand. The first is used to check the behavior of the other.
- Inverse: Check that A(B(x)) == x, or other combinations. E.g. pack and unpack functions.
- ... or anything else.
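For example, a minimal sketch of relation testing with generated cases (the pack/unpack pair and the two sum functions are hypothetical):

    local function pack(x, y) return x * 65536 + y end
    local function unpack_xy(v) return math.floor(v / 65536), v % 65536 end

    -- Naive version: easy to understand, used as the reference.
    local function sum_naive(t)
      local s = 0
      for i = 1, #t do s = s + t[i] end
      return s
    end

    -- "Optimized" version: partially unrolled, harder to read.
    local function sum_fast(t)
      local n, s, i = #t, 0, 1
      while i + 3 <= n do
        s = s + t[i] + t[i + 1] + t[i + 2] + t[i + 3]
        i = i + 4
      end
      while i <= n do s = s + t[i]; i = i + 1 end
      return s
    end

    math.randomseed(42) -- deterministic generated cases
    for _ = 1, 1000 do
      -- Inverse relation: unpack_xy(pack(x, y)) == x, y
      local x, y = math.random(0, 65535), math.random(0, 65535)
      local ux, uy = unpack_xy(pack(x, y))
      assert(ux == x and uy == y)

      -- Identity relation: both implementations agree on the same input.
      local t = {}
      for i = 1, math.random(1, 16) do t[i] = math.random(0, 100) end
      assert(sum_naive(t) == sum_fast(t))
    end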
Testing by consistency / coherence is about checking that the behavior of an operation or program is consistent with its rules and constraints. This can be done with ahead-of-time testing or with runtime checks: a lot of assertions about the state of a program can be inserted to detect issues as soon as possible. For example, if a program crashes, it is considered incoherent and a test failure.
To facilitate testing, don't hesitate to expose internal functions; just don't document them. The documentation defines the API, but other things can be exposed in the actual code; if someone uses these undocumented features, it is at their own risk.
Performance
This section lists domains, approaches and principles to follow when dealing with performance (computation, storage, networking, etc.).
Locality
Manage things to increase the locality of information when computing.
From closest to furthest: registers, L1-2-3 caches, RAM, drives, network, etc.
Caching
Caching can be used to improve locality, but also to factorize computations (prevent recomputing reused outputs).
Pre-computation
A kind of caching: costly elements which will or might be needed later are computed ahead of time.
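A minimal sketch of both ideas, with a hypothetical costly glyph computation:

    local glyph_cache = {}

    local function expensive_glyph(char) -- stand-in for a costly computation
      local weight = 0
      for i = 1, 1e6 do weight = weight + i end
      return {char = char, weight = weight}
    end

    -- Caching: reused outputs are computed only once.
    local function get_glyph(char)
      local glyph = glyph_cache[char]
      if not glyph then
        glyph = expensive_glyph(char)
        glyph_cache[char] = glyph
      end
      return glyph
    end

    -- Pre-computation: fill the cache during loading, before it is needed.
    for char in ("abcdefghijklmnopqrstuvwxyz"):gmatch(".") do
      get_glyph(char)
    end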
Idiomatic Code
LuaJIT's optimizations are probably made with idiomatic Lua code in mind; thus, seeking to write that kind of code can lead to performance improvements.
An example of that is the code generation of mgl. I tried to generate code that I would manually write.
It is still possible to generate non-idiomatic code that could be more efficient, but idiomatic code has more stability guarantees.
Algorithmic Improvements
Improve algorithmic complexity.
Overhead Reduction
Remove simple unnecessary operations. E.g. use lower-level constructs, avoid checks at different layers of abstraction, etc.
Approximation
Use an approximate process instead of a costly accurate one (heuristics, etc.).
Data Layout
Inspired[1] by Data Oriented Design (if not the actual thing). This optimization can be very effective in my methodology!
The way I design things in Lua is inspired by Object Oriented Programming, or at least, I am using objects, more specifically a directed graph of objects. Each object/table has fields. When dealing with a lot of elements, it means creating a lot of tables. Instead of doing that, the data layout can be transposed / inverted, i.e. instead of having an array of tables with fields, we have an array for each field of the tables. This is a simplification, because we can put multiple fields in the same Lua or FFI array if they tend to be accessed together.
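A minimal sketch of the transposition, with a hypothetical particle example:

    -- Object-like layout: one table per particle.
    local particles = {}
    for i = 1, 1000 do
      particles[i] = {x = 0, y = 0, vx = 1, vy = 2}
    end

    -- Transposed layout: one Lua array per field (fields that are accessed
    -- together could also share an array, e.g. an FFI array of structs).
    local n = 1000
    local xs, ys, vxs, vys = {}, {}, {}, {}
    for i = 1, n do
      xs[i], ys[i], vxs[i], vys[i] = 0, 0, 1, 2
    end

    -- Bulk processing loop over the transposed layout.
    local function update(dt)
      for i = 1, n do
        xs[i] = xs[i] + vxs[i] * dt
        ys[i] = ys[i] + vys[i] * dt
      end
    end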
I have found in testing that it can be much more efficient even without the FFI and that the increase in efficiency is still present even without JIT compilation. And it seems already very effective without having to go into the specifics of the hardware (e.g. cache line size).
It has a significant effect on multiple metrics:
- Memory usage: It can reduce the amount of memory used, probably because tables and their management have a significant overhead.
- Allocation cost: It seems to be faster even for one item, and may be improved further with bulk allocation.
- Processing: By composing/arranging data for bulk processing, cache usage is improved. I suspect that it also helps the JIT compiler which seems to excel with "linear" bulk processing loops (e.g. decomposition of the data can remove unbiased branches).
- Garbage Collection: Because the number of tables/objects is a function of the number of fields and not the number of elements, the GC has few objects and references to manage, which greatly decreases the work it has to perform for a collection cycle.
The drawback is that this transposition / inversion of the data layout seems to imply the removal of the object oriented abstraction and requires manual bookkeeping. But, this removal of abstraction also comes with access to the context, i.e. how the objects (now just data) are used, which can reveal possible optimizations.
If we believe the heuristic that hotspots are mostly in loops, this data layout fits well, and it fits LuaJIT's heuristics.
It's also not just an optimization, it has interesting design properties. For example, it can be used as an alternative to a hierarchy of classes by using dynamic "components" which are assigned to entities, a design commonly called Entity Component System (but in my case I just care about systems and entities).
Distributed Computations (Parallelism)
Multi-threading, networks, etc.
For Multi-Threading:
- Don't do it yet; try to improve single-thread performance first.
- Do it, but avoid the need for shared memory and synchronization as much as possible. Copy the data by default.
Concurrent Computations
E.g. coroutines. These concurrent computations do not happen in parallel; the point is to reduce the synchronous latency of an operation, which will probably increase its asynchronous latency and use more compute overall. But, in a lot of cases, the synchronous latency is the problem. An example of that is the Lua incremental GC, which works concurrently with user code and uses (slightly?) more compute than a stop-the-world GC, but with much less synchronous latency, i.e. it will not freeze the program like the latter does.
Another way to see it is that the slow synchronous operation is divided into smaller ones and it becomes a slow asynchronous operation.
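A minimal sketch of dividing a slow synchronous operation into slices with a coroutine (the workload and the update function are hypothetical):

    local work = coroutine.create(function()
      local sum = 0
      for i = 1, 1e8 do
        sum = sum + i
        if i % 1e6 == 0 then coroutine.yield() end -- hand control back each slice
      end
      print("done:", sum)
    end)

    -- Called once per frame; resumes one slice of the work.
    local function update()
      if coroutine.status(work) ~= "dead" then
        assert(coroutine.resume(work))
      end
    end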
Distributing Computations Over Time
The following is based on my experience in Factorio. Resources can be produced and consumed with different profiles. For example, the consumption can have spikes; in that case, storage (a buffer) may be used to mitigate those spikes.
The computing resources available to a specific system, if we ignore saving energy, are a function of time: a given amount of time corresponds to an amount of computing units available. But those units fundamentally cannot be stored in a buffer in preparation for a consumption spike. Therefore, the periods when the processor is unused are computing units lost forever.
By using the idle time of the processor to compute elements that will or might be used in the future, time can be saved and latency reduced.
Fluidity
Most video games need to produce a certain number of frames per second, at a steady pace. For the game to be fluid, each frame cannot exceed a certain time budget. A spike in computing consumption would result in lost frames (stuttering) or higher latency for concurrent operations.
Ideally, instead of consuming a low or medium amount of compute most of the time and a spiky amount sporadically, the game would consume a steady amount of compute all the time. This not only preserves the fluidity of the experience, but can also increase the amount of compute available to build the experience, because those computing units are no longer lost in idle times.
Profiling
For general CPU profiling, use a statistical/sampling profiler. Avoid as much as possible the profiling of a modified version of the original system, e.g. avoid instrumentation. If the profile doesn't sufficiently match the original system, it can't be used to guide optimizations.
When it is about synchronous latency, use manual and specific instrumentations progressively from the shallowest to the deepest level of details, searching for the culprit with the smallest amount of change to the system.
For non-CPU profiling, like network bandwidth, use automatic/general instrumentation (e.g. count the number of bytes sent by a specific line of code).
Versioning
This section describes how I should version my projects.
There are two kinds of versions: release and development.
Release
A release is a named specific point in time of a project to be distributed to users.
The version has the form <major>.<minor>, where major and minor are
continuously incremented integers (starting at 1 for the major number).
Major versions are not backwards compatible and should be considered different branches of the project (which may be simultaneously maintained).
Minor versions are backwards compatible. The notion of backwards compatibility is subtle; thus it is not described here.
The frequency of major version releases must be inversely proportional to the number of users.
Development
A development version is a commit / check-in hash. Don't use the 0.<minor>
form to describe a pre-version of the project.
FFI Bindings
In the case of bindings, don't see the LuaJIT FFI as a replacement for a C compiler, but as a better method than the Lua C API (less friction with C and higher-level) to write a wrapper. The FFI has limitations and is not able to load/check the C interface (headers) of loaded libraries. But if a C interface (a wrapper) is designed specifically for the FFI on top of a C library, the C compiler can do the work (e.g. checking, use of complex or platform-specific headers, etc.).
Making a C wrapper may be more work, but is safer and can be adapted to FFI needs.
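A minimal sketch of binding such a wrapper, where the library name and declarations are hypothetical:

    local ffi = require("ffi")

    -- C declarations copied from the wrapper's header, designed for the FFI.
    ffi.cdef[[
      typedef struct wrap_buffer wrap_buffer;
      wrap_buffer* wrap_buffer_new(size_t size);
      void wrap_buffer_free(wrap_buffer *buf);
      int wrap_buffer_fill(wrap_buffer *buf, unsigned char value);
    ]]

    local lib = ffi.load("mylib_wrap")

    local buf = lib.wrap_buffer_new(1024)
    lib.wrap_buffer_fill(buf, 0)
    lib.wrap_buffer_free(buf)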
Coroutines
As a rule for interoperability in the general case: only the thing that yielded the coroutine should be responsible for resuming it. This prevents resuming conflicts, which allows composition of yielding operations.
Also, use a [concurrent] tag in source documentation to warn that a function
can yield.
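A minimal sketch of that rule, with a hypothetical sleep operation and its scheduler; the scheduler that registered the yield is the only thing that resumes it:

    local pending = {} -- entries of the form {co = coroutine, wake_at = time}

    -- [concurrent] Suspend the calling coroutine for `delay` seconds.
    local function sleep(delay)
      pending[#pending + 1] = {co = coroutine.running(), wake_at = os.clock() + delay}
      coroutine.yield()
    end

    -- Called regularly; resumes only the coroutines it put to sleep.
    local function update()
      for i = #pending, 1, -1 do
        local entry = pending[i]
        if os.clock() >= entry.wake_at then
          table.remove(pending, i)
          assert(coroutine.resume(entry.co))
        end
      end
    end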
A Strategy Without OOP
This section describes a new way for developing my projects that is radically different from what I have done for the past years. The radical change is the avoidance of Object Oriented Programming (or my version of it). The change has been triggered by ideas coming from Data Oriented Design.
Instead of thinking in terms of objects, the fundamental concepts are processes, data and systems. More specifically, by using lexical scoping, locals, functions, upvalues, data structures (tables, etc.), modules (files) and interfaces (set of functions).
We start from a simple Lua file / module and we add as much local data and as many functions as we need to achieve our goal. The file will grow larger, maybe to thousands of lines, and the hope is that at some point we may realize that some processes are independent, that is, they depend on completely different kinds of data, or have minimal needs, i.e. they use different upvalues. As an analogy, we can call these independent sets of data and processes islands, and we can call the minimal dependencies which may exist bridges. If these islands exist, we can move them to other modules, reducing the amount of context present in a single module, and the bridge can be a function call. Conversely, if we realize later on that a module depends a lot on the data of another module, we can merge them. The idea is to perceive the relationships between the processes and the data and to decompose the code in a way that reflects these relationships.
What I called a module can also be a dynamically created system. A module or system exposes an interface; it implements processes and manages data related to them. The system is an abstraction over its data: it can manipulate its structures as necessary. A system can use other systems. It is important to note that the interface is not about having an object, but a set of functions; the functions hold the data. In this view, a simple callback can be seen as an interface. Ideally, this interface must not expose references to the internal data structures, to allow for easy changes of the internal data layout. This design also makes it easy to compose systems. A system's interface can be created and used to add more general capabilities to another system. Thanks to upvalues, they are just functions which can be added to the other system.
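A minimal sketch of such a system, with hypothetical names; the interface is the returned set of functions, and the data lives in upvalues:

    local function new_score_system()
      local scores = {} -- internal data, never exposed directly

      local function register(name) scores[name] = 0 end
      local function add(name, points) scores[name] = scores[name] + points end
      local function best()
        local best_name, best_score
        for name, score in pairs(scores) do
          if not best_score or score > best_score then best_name, best_score = name, score end
        end
        return best_name, best_score
      end

      return {register = register, add = add, best = best}
    end

    local scores = new_score_system()
    scores.register("alice")
    scores.add("alice", 10)
    print(scores.best()) --> alice 10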
An important goal for this strategy compared to OOP is to grant the ability to perceive the context of processes and data, which can lead to simplification and optimization opportunities.
This strategy should have a positive effect on these criteria:
- Simplicity: Removes the verbosity of self. prefixes and other OOP red tape. New simplification opportunities by specializing the code to the actual use cases; less over-generalization.
- Performance: New optimization opportunities by changing the data layout and other improvements described in the Entity Systems section. Also, because systems can have a simple interface, they may be moved to another thread for parallel computation.
- Correctness: Although the effect is probably minor, there are more static analysis opportunities.
Over-generalization seems to be a very important problem for me; by thinking about objects in isolation, I tend to think about all the things they could be used for, and I start thinking about stuff that I may never need.
Another important point that is more personal: I feel like it is much more enjoyable to develop in this way. I guess it is because it allows me to focus on the concrete needs of the system, with a better understanding of it as a whole and more opportunities to refine it. I now want to describe my old way of writing code as "too corporate"; probably because of too many protocols, conventions or other things that I considered necessary instead of actually making the thing. I feel like I wasted my time, but maybe OOP was for me a stand that I no longer need.
Entity Systems
As mentioned in the performance section, changing the data layout can be more than an optimization. This section explains a method which removes my need for OOP to model entities. I should not use a class system ever again; I can still seldom use objects, e.g. file handles, but certainly not a class system with the intent to build some kind of class hierarchy.
Entity systems are more specific systems dealing with unique shared entities. In a specific context we could call a world, each entity is identified by a unique identifier which cannot be reused: a Lua number incremented at each entity creation. This number is then given to various entity systems to register the entity into them. The components of an entity are implicitly given by the systems it is registered in.
It implies that components are dynamic and use-after-free / double-free are not really an issue. But, there is still bookkeeping to be done to avoid memory leaks.
Entity systems can have different purposes and they can own multiple data structures associated to different processes. An important purpose would be to perform a specific computation on all of its entities. To make that efficient, the system can use arrays, with or without the FFI.
The allocation strategy of these arrays is: add a new element at the end and
remove an element by moving the last element into the removed element's slot.
This is simple and prevents holes. To do that, a map from entity_id to
internal_id and a map from internal_id to entity_id (effectively an array)
are used. This is possible because the system's data is behind an abstraction.
The internal_id can then be used to index the different arrays.
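A minimal sketch of one such entity system, with hypothetical names, following the allocation strategy above:

    local function new_position_system()
      local n = 0
      local internal_ids = {} -- entity_id -> internal_id
      local entity_ids = {}   -- internal_id -> entity_id
      local xs, ys = {}, {}   -- one Lua array per field

      local function register(entity_id, x, y)
        n = n + 1
        internal_ids[entity_id], entity_ids[n] = n, entity_id
        xs[n], ys[n] = x, y
      end

      local function unregister(entity_id)
        local id = internal_ids[entity_id]
        local last_entity = entity_ids[n]
        -- Move the last element into the removed slot to keep the arrays dense.
        xs[id], ys[id], entity_ids[id] = xs[n], ys[n], last_entity
        internal_ids[last_entity] = id
        internal_ids[entity_id] = nil
        xs[n], ys[n], entity_ids[n] = nil, nil, nil
        n = n - 1
      end

      -- Bulk processing over all registered entities.
      local function translate(dx, dy)
        for i = 1, n do xs[i], ys[i] = xs[i] + dx, ys[i] + dy end
      end

      return {register = register, unregister = unregister, translate = translate}
    end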
In preliminary testing, performance is better on multiple metrics compared to objects, even without the FFI and without the JIT. The following are my observations.
Memory usage should be lower:
- Let's call a slot what can store a Lua value, i.e. 8 bytes.
- A key-value pair in the hash part can be considered a cost of around 3 slots (3*8 bytes) in my tests.
- The bookkeeping of one entity system is 4 (3 + 1) slots per entity.
- A table has a high initial overhead, but let's say it is negligible if the table has lot of fields. For each field of an object, 3 slots are used.
- At only 2 fields stored in one entity system or one object, the number of slots is now the same: 6 slots per entity (system: 4 + 2, object: 3 * 2).
- With more than 2 fields in one entity system, the amount of memory will go towards 3x less than for objects (system: +1, object: +3).
GC collection should be faster:
- Less memory used overall.
- For the content of an FFI array or a Lua array, there seems to be a small overhead. (It should be better with the FFI.)
- For each table (object), there seems to be a significant overhead.
For processing, the optimization opportunities seem to be much better as I understand it from Data Oriented Design. E.g. batch operations, weave together related data or split unrelated data, improve cache density, decrease (instruction-)cache misses, remove some branches, do manual loop hoisting to optimize for the interpreter, etc.
In conclusion, it should not be less efficient, and it gives the opportunity to be much more efficient. But the impact of this method with regard to simplicity and performance still needs to be determined in a real project.
Recommendations:
- Don't use the FFI by default. Using the FFI is harder (although not that much in that case) and is commonly much slower in interpreted code while not being necessarily faster in compiled code.
- Don't weave together fields by default; keep them separated. It's easier to design (especially without the FFI) and gives good results with sequential accesses.
Algorithmic Generation and Chaos
I use the expression "algorithmic generation" in place of "procedural generation" because I consider it more accurate: it doesn't matter much if the language is procedural, what matters is the use of algorithms to generate data.
I want to invest myself deeply into algorithmic generation, and when it comes to randomness in the context of games, I just see it as a specific kind of generation: the algorithmic generation of chaos. There is no concept of true or false randomness in this view, it's all about order and chaos. E.g. sometimes, I may need chaos that has less discrepancy, something biased.
I don't believe that I will ever need the randomness of physics in my projects. On the contrary, I want the determinism.
Then, these chaotic functions can be used to explore the field of possibilities with a chaotic sampling instead of an ordered sampling.
Documentation
- Lua
- Markdown
- HTML / CSS
- Pikchr / SVG
Data
- Filesystem model
- Lua data model
- C data model
LÖVE and Simplicity
I don't consider making (and playing) multiplayer experiences, especially like a game as a service or even a dedicated server like Minecraft, as essential anymore. I like that an experience can be enjoyed alone, offline, customized and with the data stored locally. It also simplifies things a lot, because multiplayer is more complex (synchronization, authentication, security, data responsibility, etc.).
I can still make single player games with multiplayer features, but at a lower scale, with a server running with LÖVE, for LAN (see ZeroTier) or Internet, whether it is embedded in the game, launched as a separate application on the desktop or on a dedicated GNU/Linux server. In all cases, the server and the client would use the same methodology, which simplifies things a lot, especially for distribution.
This means that it is possible to no longer use a system event loop like ljuv
and rely instead on LÖVE threads, channels and elscheduler to build what is
needed for a specific game, which is simpler.
I can probably do all I want with LÖVE, LuaJIT and the FFI, without any AOT compilation, which would dramatically increase the distribution complexity. I want to follow this constraint of not doing AOT compilation or using additional native dependencies, for simplicity and to master optimizations grounded in a deep understanding of the interpreter, tracing JIT, C data semantics, processors (Linux perf), etc.
(As a last resort, I could do AOT compilation of C code for performance reasons and it could be an optional library loaded at runtime with a fallback path written in Lua.)
Furthermore, a good chunk of computation is or can be performed on the GPU dynamically with GLSL, making this constraint even more reasonable.
All I Need Is LÖVE. Not exactly, because Lua/LuaJIT and C are still at the core of my methodology and LÖVE is not completely aligned with how I want to do things, but it sounds good.
This allows me to focus on what really matters to me: developing interactive and non-interactive audiovisual experiences, prototypes and games in a way that seeks simplicity. It is about programming, simulation, algorithmic generation, fiction, lore, world building, storytelling, visuals, animations, text, music, audio atmospheres, sound design, etc.
A lot can already be done with this constraint.
[1] Mostly from CppCon 2014: Mike Acton "Data-Oriented Design and C++". I don't agree with everything; I just took the fundamental ideas.