Flowstopper: November 2013

Friday, November 29, 2013

Tabs and Saved Session Folders for PuTTY

For many years I've been using the PuTTY network terminal to connect to remote servers when doing sysadmin tasks. And when I do, my monitors quickly fill up with those black windows. Remember web browsing before tabs? Like that. I totally lose track.

The other thing that's annoying is that it does not let me categorize my saved sessions. All are in one alphabetical list. That might work when you just have a handful, but my list grew to dozens. Meh.

So I checked again today on the official site for an update:

With a site interface like that, it would be too much to expect gimmicks in the program. Yes, there was an update: security fixes.

So I googled for what else there is. And apparently there are plenty of wrappers around PuTTY that bring exactly what I want: Tabs and Session Folders. I don't know how they could hide from me for so long - maybe because neither the official site nor the Wikipedia page mentions any.

I've decided to go with this open source and actively maintained one: superputty

Conclusion

Same putty, more gui. Awesome.

Friday, November 8, 2013

Java: Should I assert or throw AssertionError?

Assertions were introduced to the Java language in JDK 1.4, back in 2002. That release brought the assert statement

assert x != null : "x was null!";

and the throwable error class

if (x==null) throw new AssertionError("x was null!");

Which one should I use in my code?

Well then, what's the difference?

The assert statement

Unless assertions are actively turned on when running the program (-ea), they are not executed. They are still in the bytecode, but they don't run.

Bugs go unnoticed.
These bugs should have been detected already in unit tests and the testing phase before deployment when running with assertions on. But experience taught us that some slip through.
No performance loss.
There are a few rare scenarios where it matters.
Side effects are possible.
Badly programmed assertions can cause nasty side effects. Because the assert code is executed in some runs and not in others, the program may behave slightly differently.

Using the assert statement leaves it to the runtime configuration whether assertions run or not. The choice belongs to the person who runs the program. It can be made from run to run, and no recompile is needed.
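Here's a minimal sketch of such a badly programmed assertion. The class, the method names and the "job-1" data are made up for illustration. The first method does real work inside the assert expression, so its result depends on whether the JVM was started with -ea. The second one does the work unconditionally and only asserts on the outcome.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssertSideEffectDemo {

    /** Bad: the assert expression itself removes the element. */
    static int sizeAfterBadAssert(List<String> queue) {
        assert queue.remove("job-1") : "job-1 was not queued";
        return queue.size();
    }

    /** Better: do the work unconditionally, assert only on the result. */
    static int sizeAfterGoodAssert(List<String> queue) {
        boolean removed = queue.remove("job-1");
        assert removed : "job-1 was not queued";
        return queue.size();
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("job-1", "job-2"));
        List<String> b = new ArrayList<>(Arrays.asList("job-1", "job-2"));
        System.out.println("bad:  " + sizeAfterBadAssert(a));  // 1 with -ea, 2 without
        System.out.println("good: " + sizeAfterGoodAssert(b)); // 1 either way
    }
}
```

Run it once with java -ea and once without to see the difference.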

The AssertionError

The asserting code is always executed. Changing your mind about execution means a recompile and re-deployment.

Bugs always show up.
Even in production they abort the program.
Performance loss.
In most cases it's irrelevant.
No configuration-dependent side effects.
The executed code is always the same, no matter how the program is started.

Vital or nice to have?

Programs were written in Java long before assertions were available. Some developers came up with their own self-made assertion logic, which was not as clean and short as the real thing. Most probably didn't bother.
Assertions are not allowed to alter the program logic, were not always available, and are not mandatory. One can very well run the same program with them turned off. Conclusion: nice to have.

However, it is not guaranteed that a specific bug ever shows up without assertions running. Let's take a flight simulator for example. Assume a computation bug that only occurs in a rarely executed code path. The bug can cause the airplane to fly at slightly reduced speed, and no one ever notices. Or it can be a number overflow, causing the plane to fly backwards. That will surely be seen each time.

Would assertions help in production code?

It depends on the domain. Assertion means abort. Is that what you'd want, to prevent the worst? Or would you rather try to go on, hoping that it's a minor, negligible bug?

End-of-day accounting program: rather abort, alert the technicians, they fix it, and go on. No damage done, and bug fixed.
Real-time program that thousands of employees depend on, where an abort means a big loss of work time: no abort. Hope that the bug eventually shows up as a side effect, and that it can be traced and fixed.
Program where an abort is the worst case scenario, like a flight simulator: go on.

Conclusions

assert, not AssertionError

When writing library code, assert is the right choice. It lets the user decide whether assertions are on or off: whether to favor detection plus abort, or to let a possible bug go unnoticed.

When writing an application, use assert as well. If you want to enforce assertions, you can still hack in this piece of code, and if you change your mind later, you don't have to go through all assertions and replace them:

static {
    boolean assertsEnabled = false;
    assert assertsEnabled = true; // Intentional side effect
    if (!assertsEnabled)
        throw new RuntimeException("Asserts must be enabled (-ea)!");
}

Also, assert statements can be enabled on a per-package level:

java -ea:com.example... MyApp

AssertionError to satisfy compilation

Sometimes, when reaching code that should never be reached, an AssertionError is thrown to make the code compile. Example:

switch (trafficLight) {
    case GREEN:
        return doGreen();
    case ORANGE:
        return doOrange();
    case RED:
        return doRed();
    default: 
        //Ugh! I gotta do something, but have no clue what to do!
        //Let's abort the app and turn the semaphore offline.
        throw new AssertionError("Dead code reached");
}

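An assert statement (assert false;) would not compile here when a return is required: since assertions may be disabled, the compiler assumes execution can continue past the assert and complains about a missing return statement.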
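A self-contained version of the switch, as a sketch: the enum values follow the example above, while the class name and the returned strings are placeholders I made up.

```java
class TrafficLightDemo {

    enum TrafficLight { GREEN, ORANGE, RED }

    static String handle(TrafficLight light) {
        switch (light) {
            case GREEN:
                return "go";
            case ORANGE:
                return "slow down";
            case RED:
                return "stop";
            default:
                // Replacing this throw with "assert false;" does not compile:
                // the compiler reports a missing return statement, because
                // an assert statement is allowed to complete normally.
                throw new AssertionError("Unknown traffic light: " + light);
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(TrafficLight.RED)); // prints "stop"
    }
}
```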

However, in some cases it may be more appropriate to throw an exception instead of an error, for example an UnsupportedOperationException. In the above case an exception barrier could catch that, then turn all semaphores at the intersection to blinking orange for a minute, and then restart the green interval as it would from a clean program start and continue normal operation. That would have several advantages:

it is cheaper than sending a technician
the semaphore is out of service for a short time only
while it is out of service it blinks orange rather than being totally dark, which seems safer

The problem would be detected in both cases. With an AssertionError, the stack trace of the uncaught error ends up in the output for free. With an UnsupportedOperationException, it is the task of whoever catches the exception to log it.

So again, the choice is between a hard abort, or giving the program the chance to recover and continue if a higher level decides to do so.
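To make the exception barrier idea concrete, here's a rough sketch. Everything in it is hypothetical: the IntersectionController class, its Mode enum and the string states are invented for illustration, and the one-minute blinking interval and the restart are left out.

```java
class IntersectionController {

    enum Mode { NORMAL, BLINKING_ORANGE }

    private Mode mode = Mode.NORMAL;

    /** Hypothetical control step; fails on a state it cannot handle. */
    void step(String lightState) {
        switch (lightState) {
            case "GREEN":
            case "ORANGE":
            case "RED":
                mode = Mode.NORMAL;
                return;
            default:
                throw new UnsupportedOperationException("Unknown state: " + lightState);
        }
    }

    /** Exception barrier: log the problem, fall back to a safe mode, keep running. */
    Mode runGuarded(String lightState) {
        try {
            step(lightState);
        } catch (UnsupportedOperationException e) {
            // The barrier is responsible for logging; nothing is printed for free here.
            System.err.println("Recovering from: " + e.getMessage());
            mode = Mode.BLINKING_ORANGE;
        }
        return mode;
    }
}
```

Had step() thrown an AssertionError instead, the catch clause would not match and the error would propagate up and abort.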

What I'm missing: detection and logging!

In the production scenarios 2 and 3 from above I'd want a 3rd way of handling assertions, which is not offered by Java's assertion feature.

Java has 2 strategies:

Don't even check, hence no abort
Check, and conditionally abort or continue

And I'm missing the 3rd one with the behavior:

Run assertion code to detect bugs and don't abort, but instead log.
It's a clear bug, it's detected, so log it. There's no cheaper way to detect it than right here, in the assert code that was written already.
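Nothing like this ships with Java's assertion feature, but a hand-rolled helper could look roughly like the sketch below. SoftAssert, check() and failureCount() are names I made up, and java.util.logging is used only because it comes with the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SoftAssert {

    private static final Logger LOG = Logger.getLogger(SoftAssert.class.getName());
    private static final AtomicInteger FAILURES = new AtomicInteger();

    private SoftAssert() {}

    /** Always evaluated; logs a failure with a stack trace instead of aborting. */
    static boolean check(boolean condition, String message) {
        if (!condition) {
            FAILURES.incrementAndGet();
            // The AssertionError is created only for its stack trace; it is never thrown.
            LOG.log(Level.SEVERE, "Assertion failed: " + message, new AssertionError(message));
        }
        return condition;
    }

    /** Number of failed checks since class initialization. */
    static int failureCount() {
        return FAILURES.get();
    }
}
```

Usage would be SoftAssert.check(x != null, "x was null!"). The drawback compared to the real assert statement: the condition and the message are always computed, so the performance cost is always paid.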

Configure your IDE to run with assertions

It strikes me that the default runconfig in the most common IDEs does not have assertions enabled. It actually happened to me that I had spent way too much time chasing a bug, when the assertion would have shown it instantly - but they weren't on.

In IntelliJ IDEA: You need to modify the Defaults for all kinds that you're using: Application, JUnit, TestNG, ... and changing the defaults does not change the runconfigs you've created earlier. And you need to do this in every project, separately. (Please add a comment if you know how to set this once and for all!)

For Eclipse there's an explanation here http://stackoverflow.com/questions/5509082/eclipse-enable-assertions also see the 2nd answer about JUnit tests.

Monday, November 4, 2013

Java toString(): the Program Logic vs. Debug Dilemma

I'll start with my real world problem of how to implement toString() for my new class, followed by an analysis of how Java uses toString(), and finishing with my conclusions.

Real world problem: my class

Here's the (stripped down for simplicity) version of my new class:

import com.ibm.icu.lang.UCharacter; // ICU4J

/**
 * Represents a possibly multi-byte character and provides information about it.
 * <p>A UnicodeCharacter is the equivalent of a Java "code point".</p>
 */
public final class UnicodeCharacter {
  private final int codepoint;
  public UnicodeCharacter(int codepoint) {
    this.codepoint = codepoint;
  }
  public int getCodepoint() {
    return codepoint;
  }
  /**
   * @return Array with usually 1 character, 2 characters for multi-byte.
   */
  public char[] getChars() {
    return UCharacter.toChars(codepoint);
  }
  /**
   * @return Tells if the {@link #getCodepoint codepoint} is for a surrogate pair.
   */
  public boolean isMultiByte() {
    return getChars().length>1;
  }
  /**
   * @return For example the category UCharacterCategory.UPPERCASE_LETTER
   */
  public Byte getCategory() {
    return (byte) UCharacter.getType(codepoint);
  }
  /**
   * See {@link UCharacter#getName}
   */
  public String getName() {
    return UCharacter.getName(codepoint);
  }
}

Now for toString(), what should it be?

Suggestion 1: for the human user, debug output

  public String toString() {
    return "UnicodeCharacter{" +
        "cp=" + codepoint +
        ",string='" + getString() + '\'' +
        ",name=" + getName() +
        '}';
  }

Example output:

UnicodeCharacter{cp=65,string='A',name=LATIN CAPITAL LETTER A}
UnicodeCharacter{cp=1040,string='А',name=CYRILLIC CAPITAL LETTER A}
UnicodeCharacter{cp=32,string=' ',name=SPACE}

Suggestion 2: for the machine, program logic, concatenable

  public String toString() {
    return UTF16.valueOf(codepoint);
  }

Example output for the same 3 (yes 3) characters:

Both have their advantages and drawbacks. The class certainly needs a concatenable method, but that could be named "getString()". The debug method is nice to have - you see, just by looking at the characters you can't tell whether it's the Latin or Cyrillic A, or what kind of whitespace it is.

Unfortunately, in this case, expanding the object in the debugger doesn't help, because the code point is the only field of the object. The other information (character and name) is computed.

Analysis of Java's toString()

Here's what Java has to say about Object.toString():

Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read.

Both my suggestions follow the specification. The one with debug info is easier to read for a person. But I'm not sold yet.

Who calls toString()?

Java itself:
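When doing string concatenation: String s = myString + myCar;
Is roughly the same as doing String s = myString.concat(String.valueOf(myCar)); and String.valueOf() calls myCar.toString() unless myCar is null, in which case it yields "null".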
JDK methods:
String.valueOf(Object obj), and thus every method that uses this such as PrintStream.print(Object o).
Arrays.toString(Object[] a) for every object in the array a.
StringBuilder and StringBuffer: the toString() method is used as the build method.
Logging:
System.out.println().
Your favorite logging framework.
Debugging:
Your favorite IDE in the debugger.
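The callers listed above can be tried out with a few lines. This is just a demo with a made-up Car class; the expected strings are in the comments.

```java
import java.util.Arrays;

class ToStringCallers {

    static final class Car {
        @Override
        public String toString() {
            return "Car";
        }
    }

    public static void main(String[] args) {
        Car myCar = new Car();
        Car noCar = null;

        System.out.println("My " + myCar);                                   // My Car
        System.out.println(String.valueOf(myCar));                           // Car
        System.out.println(Arrays.toString(new Object[] {myCar, myCar}));    // [Car, Car]
        System.out.println(new StringBuilder("My ").append(myCar).toString()); // My Car
        // Concatenation is null-safe, unlike calling noCar.toString() directly.
        System.out.println("My " + noCar);                                   // My null
    }
}
```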

Hrm. So there are mainly 2 uses:

String representation: toString() returns the object's value "as string" as close as possible.
It is absolutely required to override toString(), and to do it in this way.
Debug information: the object's values for the human.
For example IntelliJ IDEA's default toString() template generates this kind.
It's just nice to have.

Sometimes, as an additional benefit, the object provides a constructor accepting that string as a parameter to re-create it. Example: Integer.toString() and new Integer(String).

How does Java in the JDK define toString() in their classes?

For some simple value classes there's not much choice. Integer for example: returning "-43" makes sense.

Character could return more than just the character as string, but it does not. String could tell the length and cut it if it's too much, but it does not. StringBuilder and StringBuffer could report the appended chunks separately, and tell how many, and cut, but they don't. If they did, the classes would need a separate method for getting the string value, and concatenation with + would no longer produce the content. Now here's an observation: String, StringBuilder and StringBuffer all implement CharSequence (Character does not), which was added in JDK 1.4, and it redeclares the toString() method just to say something about it:

Returns a string containing the characters in this sequence in the same order as this sequence. The length of the string will be the length of this sequence.

So that's why.
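The quoted contract is easy to turn into a check. The helper below is my own illustration, not something from the JDK: it verifies for a given instance that toString() has the same length and the same characters in the same order.

```java
class CharSequenceContract {

    /** Checks the documented toString() contract of CharSequence for one instance. */
    static boolean honorsContract(CharSequence cs) {
        String s = cs.toString();
        if (s.length() != cs.length()) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (s.charAt(i) != cs.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(honorsContract("abc"));                                // true
        System.out.println(honorsContract(new StringBuilder("a").append("bc"))); // true
    }
}
```

A debug-style toString() like suggestion 1 would fail this check for any CharSequence implementation.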

Conclusions

My class UnicodeCharacter is a wrapper around a Unicode code point just like Character is a wrapper around the char primitive. It also supports the characters that don't fit into a single char. And as such it really should implement the CharSequence interface. Then the decision is made: toString() must be suggestion 2, only returning the character's string value.
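What that could look like, as a sketch only: to keep it free of dependencies it uses java.lang.Character instead of ICU4J's UCharacter, and the class name UnicodeCharacterSketch and the constructor check are my assumptions, not the real class.

```java
final class UnicodeCharacterSketch implements CharSequence {

    private final int codepoint;

    UnicodeCharacterSketch(int codepoint) {
        if (!Character.isValidCodePoint(codepoint)) {
            throw new IllegalArgumentException("Not a code point: " + codepoint);
        }
        this.codepoint = codepoint;
    }

    @Override
    public int length() {
        return Character.charCount(codepoint); // 1, or 2 for a surrogate pair
    }

    @Override
    public char charAt(int index) {
        // Throws an IndexOutOfBoundsException subclass for a bad index.
        return Character.toChars(codepoint)[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }

    /** Suggestion 2: only the character's string value, as CharSequence demands. */
    @Override
    public String toString() {
        return new String(Character.toChars(codepoint));
    }
}
```

With this in place, "x" + new UnicodeCharacterSketch(65) gives "xA", and a code point above U+FFFF reports a length of 2.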

In some rare cases it would be nice to have 2 different methods: one for the string value (toString()) and one for the debug info (toDebug() or toDebugString()). The method could be defined in Object, with a default implementation: calling toString().
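Object itself can't be changed, but the same idea can be approximated with an interface and a default method. Note the assumptions: default methods need Java 8, and DebugPrintable, Percent and PlainLabel are names invented for this sketch.

```java
interface DebugPrintable {

    /** Falls back to toString() unless the implementing class overrides it. */
    default String toDebugString() {
        return toString();
    }
}

/** Has a separate string value and debug output. */
final class Percent implements DebugPrintable {
    private final int value;

    Percent(int value) {
        this.value = value;
    }

    @Override
    public String toString() {
        return value + "%";
    }

    @Override
    public String toDebugString() {
        return "Percent{value=" + value + '}';
    }
}

/** Relies on the default: debug output equals the string value. */
final class PlainLabel implements DebugPrintable {
    private final String text;

    PlainLabel(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}
```

Loggers and debuggers would of course have to know about toDebugString() for this to pay off, which is why a method on Object would be the nicer solution.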