Type-safe tagged unions in the D programming language

A tagged union

Sometimes you have data that can be of type A or B. Common use cases include decoding file formats or network protocols, communicating with memory-mapped devices, processing that returns an A or a B depending on its result, and many more.

To handle this, it is common to use a union. A union is an aggregate in which all members share the same memory. It is very handy, but use it wrongly and you end up badly corrupting your memory. I’ll demonstrate in this article how to make this safe in D.
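To make "share the same memory" concrete, here is a minimal sketch. The `IntOrFloat` union and the `bitsOfOne` function are hypothetical names chosen for illustration; they are not part of the code developed in this article.

```d
// Hypothetical union, used only to illustrate shared storage.
union IntOrFloat {
    int i;
    float f;
}

// Both members start at offset 0, and the union is only as large
// as its largest member.
static assert(IntOrFloat.i.offsetof == 0);
static assert(IntOrFloat.f.offsetof == 0);
static assert(IntOrFloat.sizeof == 4);

// Writing one member overwrites the bytes seen through the other.
int bitsOfOne() {
    IntOrFloat u;
    u.f = 1.0f;
    return u.i; // reinterprets the float's bits as an int
}
```

Reading `i` after writing `f` is harmless with two plain numeric types, but the same reinterpretation applied to pointers or references is how a misused union corrupts memory.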

The first step toward this goal is to create a tagged union: a union associated with a tag that indicates which element of the union is currently valid.

struct TaggedUnion {
    union {
        A a;
        B b;
    }

    enum Tag {
        A, B
    }

    Tag tag;
}

Now we have a nice struct that contains either an A or a B, and a tag that indicates which one it is. But this is still unsafe: use it wrong and you’ll wreck your memory. We need to make it safe.
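The following sketch shows the kind of misuse that the plain struct permits. The article does not define A and B, so the two payload types here are assumptions made purely for illustration, as is the `misuse` function.

```d
// Hypothetical payload types; the article leaves A and B undefined.
struct A { size_t value; }
struct B { int* ptr; }

struct TaggedUnion {
    union {
        A a;
        B b;
    }

    enum Tag {
        A, B
    }

    Tag tag;
}

// Nothing forces the caller to consult the tag before touching a member.
size_t misuse() {
    TaggedUnion t;
    t.tag = TaggedUnion.Tag.A;
    t.a = A(0xDEAD);

    // Bug: the tag says A, yet b is read. The "pointer" obtained here is
    // just the integer stored above; dereferencing it would be invalid.
    return cast(size_t) t.b.ptr;
}
```

The compiler accepts this without complaint, which is why the next step is to hide the members and route every access through the tag.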

Some encapsulation

The next obvious step is to build the struct in the proper state right away. Let’s put some constructors in place, make all the data private, and provide access to it in controlled, typed ways. We need to provide a method that checks the tag and calls the right user code with the correct type. Thankfully, D allows you to do that, without performance penalty, using template alias parameters.

struct TaggedUnion {
private:
    union {
        A a;
        B b;
    }

    enum Tag {
        A, B
    }

    Tag tag;

public:
    this(A a) {
        tag = Tag.A;
        this.a = a;
    }

    this(B b) {
        tag = Tag.B;
        this.b = b;
    }

    auto ref apply(alias fun)() {
        final switch(tag) with(Tag) {
            case A:
                return fun(a);

            case B:
                return fun(b);
        }
    }
}

At this point you probably wonder how you can use this. It is dead simple: you call the function apply with the code you want to run as a template parameter. Your code will be instantiated with all the possible types.

TaggedUnion t = ...;

void process(T)(T data) {
    alias Type = typeof(data);

    static if(is(Type == A)) {
        // Code that handles the case where it is an A.
    } else static if(is(Type == B)) {
        // Code that handles the case where it is a B.
    } else {
        static assert(0, "You must handle type " ~ Type.stringof);
    }
}

t.apply!process();

Be careful, as type inference can get in your way here. If the A and B cases don’t return the same type, you may want to specify the return type explicitly. Another pain point is that dmd assumes a function that never returns has return type void (when it could be anything). If you want to throw in some cases, you’ll also need to be explicit. You also can’t use a local function, for reasons that are unclear to me; most likely a dmd bug (2.062).
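As a hedged illustration of the explicit return type advice, here is a self-contained sketch. The payload types A and B, the handler `describe` and the wrapper `demo` are all assumptions introduced for this example; the struct itself follows the one shown earlier. The handler is declared at module scope, in line with the remark about local functions.

```d
import std.conv : to;

// Hypothetical payload types; the article leaves A and B undefined.
struct A { int x; }
struct B { string s; }

struct TaggedUnion {
private:
    union {
        A a;
        B b;
    }

    enum Tag {
        A, B
    }

    Tag tag;

public:
    this(A a) {
        tag = Tag.A;
        this.a = a;
    }

    this(B b) {
        tag = Tag.B;
        this.b = b;
    }

    auto ref apply(alias fun)() {
        final switch(tag) with(Tag) {
            case A:
                return fun(a);

            case B:
                return fun(b);
        }
    }
}

// The return type is spelled out, so both instantiations agree on string
// and apply has a single type to infer.
string describe(T)(T data) {
    static if(is(T == A)) {
        return "A holding " ~ data.x.to!string;
    } else static if(is(T == B)) {
        return "B holding " ~ data.s;
    } else {
        static assert(0, "You must handle type " ~ T.stringof);
    }
}

// Small wrapper (hypothetical) showing a call site.
string demo(TaggedUnion t) {
    return t.apply!describe();
}
```

Calling `demo(TaggedUnion(A(42)))` should yield "A holding 42", and `demo(TaggedUnion(B("hi")))` should yield "B holding hi".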

A generic solution

Now we can repeat this pattern all over the place, but at some point a more generic solution becomes worthwhile. D is quite powerful at this game, so let’s leverage this. We will use some string mixins to construct the exact same union as we did before, but generated at compile time from an arbitrary list of types. Nothing new here, we simply ask the compiler to write the code for us.

import std.traits : fullyQualifiedName;

template TaggedUnion(Types ...) {
    // Builds the union body: one member declaration per type.
    private auto getUnionContent() {
        string s;
        foreach(T; Types) {
            s ~= fullyQualifiedName!T ~ " member_" ~ T.mangleof ~ ";";
        }

        return s;
    }

    // Builds the Tag enum: one enumerator per type, named after its mangled name.
    private auto getTag() {
        string s;
        foreach(T; Types) {
            s ~= T.mangleof ~ ",";
        }

        return "enum Tag {" ~ s ~ "}";
    }

    // Builds the switch body: one case per tag, passing the matching member to fun.
    private auto getSwitchContent() {
        string s;
        foreach(T; Types) {
            s ~= "case Tag." ~ T.mangleof;
            s ~= ": return fun(member_" ~ T.mangleof ~ ");";
        }

        return s;
    }

    struct TaggedUnion {
    private:
        union {
            // The generated string is compiled here as member declarations.
            mixin(getUnionContent());
        }

        mixin(getTag());

        Tag tag;

    public:
        // Accepts only types for which a tag exists.
        this(T)(T t) if(is(typeof(mixin("Tag." ~ T.mangleof)))) {
            mixin("tag = Tag." ~ T.mangleof ~ ";");
            mixin("member_" ~ T.mangleof ~ " = t;");
        }

        auto ref apply(alias fun)() {
            final switch(tag) {
                mixin(getSwitchContent());
            }
        }
    }
}
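To see what such a helper actually produces, the sketch below isolates the tag generation step. `makeTagEnum`, `generated` and `tagName` are hypothetical names introduced for this example; the loop itself mirrors getTag above.

```d
// Hypothetical stand-alone version of the tag generator.
string makeTagEnum(Types...)() {
    string s;
    foreach(T; Types) {
        s ~= T.mangleof ~ ",";
    }

    return "enum Tag {" ~ s ~ "}";
}

// The helper runs at compile time; int mangles to "i" and double to "d".
enum generated = makeTagEnum!(int, double)();
static assert(generated == "enum Tag {i,d,}");

// Mixing the string in declares a real enum with those members.
mixin(generated);

// The generated enumerators behave like hand-written ones.
string tagName(Tag t) {
    final switch(t) {
        case Tag.i: return "int";
        case Tag.d: return "double";
    }
}
```

Printing the generated string during development (for example with `pragma(msg, generated);`) is a convenient way to check what the compiler is being asked to compile.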

D allows for very expressive type construction, and that is awesome! This small example is simple and shows perfectly how to do it. Unlike many abstractions in programming, this one doesn’t cost anything at runtime, as it uses compile-time features of D. This is probably one of the biggest advantages of D: it lets you build nice abstractions entirely at compile time and have the optimizer remove the excess for you.

PS: Once again, a big thanks to John Colvin for his help in correcting the article.

One thought on “Type-safe tagged unions in the D programming language”

  1. Hi,

    Great article! To make the entry a little less steep for D newbies some small comments in the code section would help a lot.
    Something like e.g:
    // create a string at compile time which declares a member of every single Type which may be contained in the tagged union struct.
    s ~= fullyQualifiedName!T ~ " member_" ~ T.mangleof ~ ";";

    private:
    // now the compiler is instructed to ‘place/expand’ the string contents as _source code_ in the union, the definition of the union is done by the compiler at compile time based on the string create at compile!:
    union {
    mixin(getUnionContent());
    }

    mixin(getTag());

    Anyways great you’re doing articles like this! I really hope to see some more.
