Impact of 64bits vs 32bits when using non precise GC.

As ARM release its intention to go 64bits, we can be pretty much sure that almost every device will be 64bits soon, except the one on which it is unrealistic to run a garbage collector.

A garbage collector can be precise like in Java, or not, like D’s GC or Boehm’s GC for C++. It means that theses GC don’t know what it a pointer and what isn’t. Non precise GC cannot move data to compact the heap, and are also prone to false positive. On a theoretical perspective, switching from 32 bits to 64 bits should improve things, but what it is in practice ?

Non precise GC, and how false positive happens

When you use a non precise GC, the GC is not aware of what is a pointer and what isn’t. It has to suppose that any data it encounter may be a pointer. Pointers are just a numbers stored in memory, and storing the address of something else in the memory. If you have data in memory, even if theses data aren’t pointers, they may have the value of an existing memory address, where something else is stored. This something else will be marked as « alive » by the GC, even if it is actually garbage. This is a false positive. Thing can cascade, because that memory marked as alive can also keep alive some other pieces of memory, and so on.

The bigger the block of memory are, the worse the problem becomes. A large chunk of memory is more likely to be kept as a false positive, because if it is bigger, more address point to it, and so more data can eventually have the same value of one of theses addresses. So the bigger the block is, the more likely it will become a false positive but it get worse ! If the block is bigger, it has more data in it, so it will also cascade more. Finally, the block is bigger, so the direct consequence of the false positive is bigger.

On a 64 bits system, memory addresses are 64 bits long. As the memory address space is way larger, it is less likely that false positive happen.

A program that generate garbage and store how much memory it use

To get numbers, lets generate garbage !

void main() {
    import std.random;
    Mt19937 gen;
    import std.process;
    import std.conv;
    string cmd = "cat /proc/" ~ to!string(getpid()) ~ "/statm | cut -d ' ' -f 2";
    alias void* T;
    alias uint T;

    // Variable block size.
    enum BLOCK_SIZE = 32 * 1024 * 1024 / T.sizeof;
    T[] prevdatas;
    T[] datas = new T[BLOCK_SIZE];
    T[] crap;
    for(uint i = 0; i < 8192; i++) {
        foreach(ref data; datas) {
            static if(T.sizeof > 4) {
                data = cast(T) gen.front;
                data = cast(T) ((cast(long) data << 32) | gen.front);
            } else {
                data = cast(T) gen.front;
        crap = prevdatas ~ datas;
        prevdatas = datas;
        datas = new T[BLOCK_SIZE];
        if((i % 16) == 0) {
            import std.stdio;

This program allocate arrays, and put random value in them. And start again and again. At some point, it does measure how much memory it use.

This program doesn’t keep any reference to the old arrays, so it should generate a lot of garbage but if the memory usage grow, we have a leak (so false positives).

Result on 32 bits

The results on a 32bits system are scary. I did measure the memory consumption of this program with different block sizes. First of all, I could not run it on block larger than 128Kb. At this point, the amount of false positive is that high that the program end up early because it can’t allocate anymore.

It is clear that the memory usage isn’t stable at all. We have a massive leak when the block size is 16Kb and bigger. To know how many blocks did leak, I have rendered the same graph, but with the number of page used divided by the size of the block.

This graph tells us that the amount of false positive is the same for 16, 32 and 64Kb, but then the number of false positive increase when the block size is 128Kb. As the block is bigger, and may itself lead to more false positives, this cascade and put the system over its limits.

Lets switch to 64 bits !

With 64 bits, the situation is much nicer. I did run the tests to block size up to 32Mb.

In 64bits, the program quickly reach a point where its memory doesn’t grow anymore. It means the the amount of false positive is negligible.

In the first part, the amount of data allocated is negligible compared to the size of the code itself and data from the standard lib and runtime. Then, the memory consumption grow linearly with he memory block size. This second graph confirms that the program is not leaking, and that the amount of floating garbage is proportional with the total amount of data.


Switching to 64 bits solved the issue of false positive in the test program. These data must be confirmed with real world software, but I think it is safe to assume that the situation is way better in the 64 bits case.

As 64 bits is the future, it is unreasonable to invest into precise GC unless some very strong short term incitations comes up. However, precise GC has some other advantages that we may want to exploit. This also tells us that semi precise GC – for example a GC that is precise on the heap, but not on the stack – is a very valid solution and is compliant with hardware of the future. This also means that GC in system language get way more interesting that it used to be.

Additionally, this tells us that the support of 64 bits in DMD is a key feature, and a lot of attention should be put on it, especially on windows. 64 bit is, to this regard, a key feature when it comes to non precise GC.

At some point, false positive will be a problem even in 64bits, but I could not reach the point where it is in a reasonable time and efforts. If someone have data about this limit, please tell me. You’ll find the raw results, if someone want to study them throw this link : gc_measures_d2.058

Leave a Reply

Your email address will not be published. Required fields are marked *