Visitor pattern revisited in D

OOP is known (or should be known) as a behavioral abstraction. It nicely solves a vast variety of problems, but enforces on the programmer an abstraction that doesn’t fit every problem. The fact is, you sometime need objects as data abstractions and not behavioral abstractions.

This is commonly the case for tools that manipulate code. A parser typically returns an AST that will be processed. The processing can be compilation, code formating, code analysis, and basically any other use related to source code manipulation.

Following regular OOP principles, manipulations have to be implemented by the AST node classes. This is a problem, because it prevents any 3rd party developer tool to reuse an existing parser and AST. The known solution is to use the Visitor pattern, which allows to dispatch in our code according to the AST node’s type.

The Visitor pattern has many known problems, to the point that many consider it an antipattern. A better solution using D has to be proposed.

The visitor pattern and its limitations

Let’s consider the following class hierarchy :

1
2
3
4
class Node {}

class A : Node {}
class B : Node {}

Now the visitor :

1
2
3
4
interface NodeVisitor {
    Base visit(A a);
    Base visit(B b);
}

Finally, we need a way to call the right function in the visitor according to the class type. To do that, we use the virtual dispatch mechanism in each Node :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
class Node {
    abstract Base accept(NodeVisitor v);
}

class A : Node {
    obverride Base accept(NodeVisitor v) {
        return v.visit(this);
    }
}

class B : Node {
    obverride Base accept(NodeVisitor v) {
        return v.visit(this);
    }
}

Now we can create any class implementing NodeVisitor and perform some specific processing according to the Node it encounters. Great ? In fact, this pattern has severe limitations.

First of all, you have quite a lot of code duplication. In every node, the same method must be implemented. It is impossible for 3rd party tools to create their own Node implementations, because the Visitor will no be able to dispatch them accordingly. Or the NodeVisitor would have to be modified, which involves breaking all the Visitor implementations for Node that are not used. Secondly, each visitor has to implement something for all Nodes, even if it doesn’t make any sense. Lastly, you have no control on what is returned by the visit methods.

Compile time reflection will save us all !

Fortunately, D provides a very useful feature : compile time reflection. It allows code to be crafted according to reflection on other objects. We will implement a dispatch function that reflects a visitor object, and calls the correct method according to a parameter.

Here is a prototype for the function.

1
2
3
4
5
6
7
auto dispatch(
    alias unhandled = function typeof(null)(t) {
        throw new Exception(typeid(t).toString() ~ " is not supported by visitor " ~ typeid(V).toString() ~ " .");
    }, V, T
)(ref V visitor, T t) if(is(T == class) || is(T == interface)) {
    // Implementation
}

The dispatch function has 2 parameters : a visitor and an object that will be visited known as t. This object is passed in a polymorphic manner, so T is one of its base classes or an interface, not the actual type of t.

An alias parameter is used to specify the behavior to adopt when the visitor is unable to visit t. By default, an Exception is thrown.

So, the first thing that needs to be done is to get the object reference if T is an interface and get the typeid of the parameter. Then we skip the code that actually dispatches and fall back to the failing case.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
auto dispatch(
    alias unhandled = function typeof(null)(t) {
        throw new Exception(typeid(t).toString() ~ " is not supported by visitor " ~ typeid(V).toString() ~ " .");
    }, V, T
)(ref V visitor, T t) if(is(T == class) || is(T == interface)) {
    static if(is(T == class)) {
        alias t o;
    } else {
        auto o = cast(Object) t;
    }

    auto tid = typeid(o);

    // Try to dispatch . . .

    // Dispatch isn't possible.
    static if(is(typeof(return) == void)) {
        unhandled(t);
    } else {
        return unhandled(t);
    }
}

Now we have the typeid of the object we want to dispatch. With reflection on the visitor, we can determine the method to call. But beforehand, an helper function is needed to cast t without having to perform the regular checks. They are useless because it had to be done before anyway.

1
2
3
private U fastCast(U, T)(T t) if(is(T == class) && is(U == class) && is(U : T)) {
    return *(cast(U*) &t);
}

Let’s get back to our dispatch. We need to reflect the visitor to retrieve all its visit methods. Then we loop through all of them to find the one that has one parameter with the same type as t.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
auto dispatch(
    alias unhandled = function typeof(null)(t) {
        throw new Exception(typeid(t).toString() ~ " is not supported by visitor " ~ typeid(V).toString() ~ " .");
    }, V, T
)(ref V visitor, T t) if(is(T == class) || is(T == interface)) {
    static if(is(T == class)) {
        alias t o;
    } else {
        auto o = cast(Object) t;
    }

    auto tid = typeid(o);

    import std.traits;
    foreach (visit; MemberFunctionsTuple!(V, "visit")) {
        alias ParameterTypeTuple!visit parameters;

        static if(parameters.length == 1) {
            alias parameters[0] parameter;

            static if(is(parameter == class) && !__traits(isAbstractClass, parameter) && is(parameter : T)) {
                if(tid is typeid(parameter)) {
                    return visitor.visit(fastCast!parameter(o));
                }
            }
        }
    }

    // Dispatch isn't possible.
    static if(is(typeof(return) == void)) {
        unhandled(t);
    } else {
        return unhandled(t);
    }
}

Note that the reflection is a compile time feature, so the foreach will be run at compile time too. In other terms, it will be unrolled. The code boils down to a series of ifs at run time.

Now visitor can return anything. It doesn’t have to support every existing node type, and any node type can be added. In fact, most of the problems of the visitor pattern have gone away. By the way, we don’t have any accept method now, so let’s write one :

1
2
3
4
5
6
7
auto accept(T, V)(T t, ref V visitor) if(is(T == class) || is(T == interface)) {
    static if(is(typeof(visitor.visit(t)))) {
        return visitor.visit(t);
    } else {
        visitor.dispatch(t);
    }
}

This is purely a syntaxic sugar method, because it is not needed anymore.

The final solution and drawback.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class Node {}

class A : Node {}
class B : Node {}

// And the visitor

import util.visitor;
class NodeVisitor {
final:
    void visit(Node n) {
        return this.dispatch(n);
    }

    void visit(A a) {}
    void visit(B b) {}
}

And voila ! You can do any processing, and return anything from the processing of the node, without the code repetition and the usual limitations of the visitor pattern.

Obviously, this is likely to be slower, because of all the if’s done. On the other hand, you avoid the double virtual dispatch, so it is likely to be a win if you don’t have a lot of node types. In the general case, expect it to be slower.

It is definitively possible to precompute an associative array of methods to call at program startup, which will reduce the dispatch to an AA lookup an an indirect function call. The current implementation doesn’t do this.

This implementation have hard time to comply with D’s @safe, @system and @trusted system. The fastCast function is obviously @system, but called like it is in dispatch, safety is ensured. dispatch cannot be qualified as @trusted because it depends on visit method called. The limitation is known since David Nadlinger explained it in @trusted considered harmful.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>