Thursday, October 25, 2012

Refactoring-Safe Switch on Enum

A switch statement in Java on an enumerated type looks something like this:

enum TrafficLight {
    GREEN,
    ORANGE,
    RED
}

TrafficLight trafficLight = TrafficLight.GREEN;

switch (trafficLight) {
    case GREEN:
        doGreen();
        break;
    case ORANGE:
        doOrange();
        break;
    case RED:
        doRed();
        break;
}

This works, job done, I'm going home. Hopefully no one ever changes the enum.

The java manual writes:
You should use enum types any time you need to represent a fixed set of constants. That includes natural enum types such as the planets in our solar system and data sets where you know all possible values at compile time—for example, the choices on a menu, command line flags, and so on.
I see 2 possible problems with this definition:
  1. This text proves that what we think is set in stone can still change... planets in our solar system? It was probably written before 2006 when Pluto still counted as the 9th planet in our solar system. 
  2. "Compile time" vs. programming time. The enum values are known when writing the switch code. Who's compile time? Of the switch or the enum? "Choises of a menu" or "command line flags" can surely be modified and recompiled later on, without touching every line of code that made use of those enums. Especially when the enum comes from another code base (library).
Most enums I come across are not as unshiftable as the Months (JANUARY to DECEMBER).

So: What happens if someone changes TrafficLight after I wrote and compiled my switch? Possible scenarios:
  1. Element is renamed: someone decides to rename ORANGE to YELLOW.  
    1. Enum is defined in an external lib: compilation of switch code fails, we need to change. That's good, the switch statement must be updated and won't cause problems.
    2. Enum is in same code base: refactoring renames switch block too, case closed.
  2. Element is removed: That's good, the switch statement must be updated to compile, case closed. 
  3. Element is added: someone adds GREEN_BLINKING. We won't notice. If the developer who added the enum value does not go through all the usages then it's a possible cause of bugs.

Basic solution

switch (trafficLight) {
    case GREEN:
        doGreen();
        break;
    case ORANGE:
        doOrange();
        break;
    case RED:
        doRed();
        break;
    default:
        throw new UnsupportedOperationException("Unsupported enum value: "+trafficLight+"!");
}

Now we notice it as soon as the code is executed with TrafficLight.GREEN_BLINKING.
This still has the potential to go into production with a bug in the case that GREEN_BLINKING did not occur when testing. Maybe it's a very rarely used value. That's likely, because if it was an obvious element of the enum then it would have been in right from the start. And it's likely too that there is no unit test covering this value... if someone went to add one now then he would have updated the switch code too.

Thus a safer syntax is:

Better solution

enum TrafficLight {
    GREEN,
    ORANGE,
    RED;
    public static void assertSize(int expectedItems) {
        assert values().length == expectedItems : "Update the code calling this with "+expectedItems+"!";
    }
}

TrafficLight trafficLight = TrafficLight.GREEN;

TrafficLight.assertSize(3);
switch (trafficLight) {
    case GREEN:
        doGreen();
        break;
    case ORANGE:
        doOrange();
        break;
    case RED:
        doRed();
        break;
    default:
        throw new UnsupportedOperationException("Unsupported enum value: "+trafficLight+"!");
}

This guarantees that, as long as this path is executed and assertions are turned on, we'll see.

Alternative solution

Java's switch still bears potential for bugs. Unfortunately there is no switch syntax that automatically breaks. Forgetting a break statement is a common programming error. If-then-else could be used, but that's ugly and has other issues.

Another technique is to force the enum values to implement the functionality directly:

enum TrafficLight {
    GREEN {
        @Override public void handle() {
            handleGreen();
        }
    },
    ORANGE {
        @Override public void handle() {
            handleOrange();
        }
    },
    RED {
        @Override public void handle() {
            handleRed();
        }
    };
    public abstract void handle();
}

This way one can't simply add a value and break existing code. Unfortunately this technique isn't always feasible, for example when the enum is from an external dependency.

Conclusions

Suggestion for Java: add an assertSize(), expect() or similar method to enum.
Suggestion for IDEs: auto-generate such code when creating an enum or when switching on one.

History and evolution

My original use of this pattern was to have the enum's assertSize() method throw AssertionError directly instead of using the assert statement, and then throw AssertionError("Dead code reached!") again in the switch default block. That combination works, however, it always aborts the program - even in production. After rethinking this topic I've come to realize that it's bad.

There are many scenarios where a program could recover and go on from such a situation. For example a server program serving clients. A web server or web service, where each individual request might fail. If one enters such an execution path, the main routine can still catch that, and return an internal server error to the user. An abort would be fatal.

No comments:

Post a Comment