This document outlines the MCI’s take on concurrency (atomicity, threading, and so on) during code execution.
The virtual machine makes few guarantees in a concurrent environment. In particular, managed code should not depend on atomicity guarantees made by the underlying hardware, as doing so makes code unportable in very subtle and hard-to-detect ways.
In other words, we do not guarantee that reads and writes of word-sized values are atomic, as many other virtual machines do. While in practice you may find that they actually are (due to how the hardware works), it is not something we guarantee, and the MCI will not preserve atomicity of such operations when reordering instructions and performing other such optimizations.
The one thing that the virtual machine does guarantee is the consistency of reference values (this includes array and vector references). What this means is that dereferencing a reference (or an array/vector reference) will never result in an invalid memory access due to concurrency (save for the null case, naturally). Note that this is only guaranteed for references that are correctly aligned on a native word-size boundary (which is required for references to work correctly in any case).
The virtual machine provides a number of intrinsics to perform various atomic operations on word-sized values (i.e. int and uint). Most basic arithmetic and logic operations are supported, as are simple loads and stores, and the CAS (compare-and-swap) operation.
The reason for not supporting fixed-size integer types is that implementing atomic operations for these across all supported architectures is hard, and may in some cases result in very inefficient code.
These intrinsics guarantee full sequencing (acquire/release semantics) on all supported architectures.
In general, the virtual machine relies heavily on the D runtime for its threading infrastructure. This is because the D runtime provides machinery to suspend and resume all threads for garbage collection runs (only relevant for stop-the-world GCs), and also provides a cross-platform TLS (thread-local storage) mechanism.
All threads that somehow execute managed code (be it via the interpreter, by executing JIT-emitted code, or by calling through a trampoline) must therefore be attached to the runtime. There are some other subtle details too, such as running D module TLS constructors, attaching to the current garbage collector, and invoking managed thread entry points.
All trampolines generated by the virtual machine (via the tramp instruction) contain code to do all of the above. However, if threads jump directly into JIT-emitted code (this should be avoided whenever possible), they will have to perform the attachment sequence manually before entering the managed code area.
All threads created through the intrinsic threading API are implicitly registered with the D runtime, hooked up to the garbage collector, etc. For such threads, this entire section can be ignored.
It is worth noting that once the entry point function in the program returns, the virtual machine will wait for all intrinsic threads that aren’t daemon threads to terminate. This does not include threads created outside of the virtual machine. As such, it is the programmer’s responsibility to ensure that threads outside of the virtual machine do not call into managed code once the entry point function has returned and the virtual machine has been shut down.
Intrinsic daemon threads will be forcefully terminated by the virtual machine. It is important that such threads can cope with this, and that they do not rely on any termination code to run.