Tuesday, March 11, 2014

Software Versioning and Bugfixes

There are many ways how to give version identifiers to software builds.

The goals of software versioning, as I understand them:
  1. Give each build (release, snapshot) a unique identifier.
  2. Apparent order: it must be obvious which is newer when comparing 2.
  3. Apparent changes: contains breaking changes, new functionality, or just bug fixes.
  4. Possibility to release updates later on that "fit in between".
I like the increasingly popular semantic versioning. It accommodates all these goals.

Apparent Order

1.9.0 -> 1.10.0 -> 1.11.0
This is important for me as a user, it lets me select the newest version quickly.

Apparent Changes

Going from 1.9.0 to 2.0.0 may mean that some of my use cases need to be adjusted, some things don't work the same anymore. I might prefer to stay on the "1" branch.

Going from 1.9.0, 1.10.0 contains more functionality, it seems pretty safe. Experience taught me that new features bring new bugs - even if I don't make use of the new stuff.

Going from 1.9.0 to 1.9.1 means there are just bug fixes. It is almost a safe bet to upgrade this. A quick look over the change-list helps me to decide where I might be affected. The risk is very low that this introduces new bugs.

Release Updates

When after 1.9 the version 1.10 has been released, and then bugs are found in 1.9.0 that should be fixed, not everyone will be happy with a 1.11 that now contains new functionality and bug fixes. Hence the 3 numbers.

H2 Database thinks otherwise

Unfortunately this software library that I use, like and value follows a different version system. Just like Google Chrome, they have been publishing new versions with what is effectively a single continuous version number.

For a recent history see http://www.h2database.com/html/changelog.html
And longer: http://mvnrepository.com/artifact/com.h2database/h2

There are no bug fix releases. Each new version brings new features, improvements, bug fixes, and new bugs, which becomes clear when you follow the issues. But to learn in which version they were introduced, and so on, you need to dig and read the bug tracker.

For a web browser, this release cycle OK. Give me the latest, I can cope with some level of annoyances and crashes now and then.

The database however is the back bone of many applications. It is the piece of software that one relies on, that is the worst to break. And as such, I really wish it would maintain two code branches. One for latest developments, and one that lags behind where just fixes are applied.

Whenever I consider updating this library in my code, I hesitate. Is it really necessary? Most times I answer no and move on. Updating means reading all recent changes, and running tests.

I only use this database for read-only scenarios, not as a live database with active changes.

My suggestion to H2 and in general

  1. Move code from Google Code to GitHub
  2. Change to semantic versioning
  3. Create a branch that lags behind, and let contributors do the work of maintaining it (applying just the fixes)

Monday, December 2, 2013

IntelliJ IDEA: add Project Files to Version Control?

Do the contents of the .idea project folder and the .iml module files belong into version control?

Yes, says the IntelliJ team:
Here is what you need to share:
  • All the files under .idea directory in the project root except the workspace.xml and tasks.xml files which store user specific settings
  • All the .iml module files that can be located in different module directories

No for .iml files say Maven users who are annoyed by ever-changing iml files, see these 2 older but still open bugs:

What are the drawbacks if I don't share these files? None, from my experience. When doing a fresh checkout of a project, IDEA asks whether I'd like to create a new project. Yes please, and all files are generated correctly. 

The only annoyance we have had at the company from time to time is when someone accidentally adds an .iml file to the VCS. There's a feature in IDEA to stop it from asking:

Settings -> Version Control -> Ignored Files

This setting has to be applied in every project separately. And thus it's not very helpful in this case. Because every new project I check out, IDEA asks me whether I want to add the .iml file to version control (one by one), and I can just tell it to stop asking:


Conclusion

For the time being we keep IDEA files out of the VCS. Our projects are usually multi-module maven, with lots of iml files. Your mileage may vary.



Friday, November 29, 2013

Tabs and Saved Session Folders for PuTTY

For many years I've been using the PuTTY network terminal to connect to remote servers when doing sysadmin tasks. And when I do, my monitors quickly fill up with those black windows. Remember web browsing before tabs? Like that. I totally lose track.

The other thing that's annoying is that it does not let me categorize my saved sessions. All are in one alphabetical list. That might work when you just have a handful, but my list grew to dozens. Meh.


So I checked again today on the official site for an update: 



With such an user interface of the site, it would be too much to expect gimmicks in the program. Yes there was an update. Security fixes.

So I googled for what else there is. And apparently there are plenty of wrappers around PuTTY that bring exactly what I want: Tabs and Session Folders. I don't know how they could hide from me for so long - maybe because neither the official site nor the Wikipedia page mention any.

I've decided to go with this open source and actively maintained one: superputty


Conclusion

Same putty, more gui. Awesome.

Friday, November 8, 2013

Java: Should I assert or throw AssertionError?

Assertions were introduced to the Java language in JDK 1.4, that was in 2002. It brought the assert statement
assert x != null : "x was null!";
and the throwable error class
if (x==null) throw new AssertionError("x was null!");
Which one should I use in my code?



Well then, what's the difference?

The assert statement

Unless assertions are actively turned on when running the program (-ea), they are not executed. They are still in the bytecode, but they don't run.
  • Bugs go unnoticed.
    These bugs should have been detected already in unit tests and the testing phase before deployment when running with assertions on. But experience taught us that some slip through.
  • No performance loss.
    There are a few rare scenarios where it matters.
  • Side effects are possible.
    Badly programmed assertions can cause nasty side effects. Because sometimes the assert code is executed, and other times it's not, the program may behave slightly different.
Using the assert statement gives control to the runtime config whether they run or not. The choice is left to the person who runs the program. The choice can be made from run to run, no recompile is needed.

The AssertionError

The asserting code is always executed. Changing your mind about execution means a recompile and re-deployment.
  • Bugs always show up.
    Even in production they abort the program.
  • Performance loss.
    In most cases it's irrelevant.
  • Guaranteed to have no side effects.
    The executed code is always the same.

 

Vital or nice to have?

Programs were written in Java long before assertions were available. Some came up with their own self-made assertions logic - which was not as clean and short as the real thing. And most probably didn't.
Assertions are not allowed to alter the program logic, were not always available, and are not mandatory. One can very well run the same program with them turned off. Conclusion: nice to have.

However, it is not guaranteed that a specific bug ever shows up without assertions running. Let's take a flight simulator for example. Assume a computation bug that only occurs in a rarely executed code path. The bug can cause the airplane to fly at slightly reduced speed, and no one ever notices. Or it can be a number overflow, causing the plane to fly backwards. That will surely be seen each time.

Would assertions help in production code?

It depends on the domain. Assertion means abort. Is that what you'd want, to prevent the worst? Or would you rather try to go on, hoping that it's a minor, neglegtible bug?
  1. End-of-day accounting program: rather abort, alert the technicians, they fix it, and go on. No damage done, and bug fixed.
  2. Real-time program where thousands of employees depend on it, and an abort means a big loss of work time. No abort. Hope that the bug eventually shows up as a side effect, and it can be traced and fixed.
  3. Program where an abort is the worst case scenario, like a flight simulator: go on.

Conclusions

assert, not AssertionError

When writing library code, assert is the right choice. It gives the power to the user whether assertions should be on or off, whether he favors detection + abort, or to go unnoticed.

When writing an application, use assert as well. If you want to enforce assertions, you can still hack in this piece of code, and if you change your mind later, you don't have to go through all assertions and replace them:
static {
    boolean assertsEnabled = false;
    assert assertsEnabled = true; // Intentional side effect
    if (!assertsEnabled)
        throw new RuntimeException("Asserts must be enabled (-ea)!");
} 
Also, with assert, they can be enabled on a per-package level
java -ea:com.example... MyApp

AssertionError to satisfy compilation

Sometimes, when reaching code that should never be reached, an AssertionError is thrown to make the code compile. Example:
switch (TrafficLight) {
    case GREEN:
        return doGreen();
    case ORANGE:
        return doOrange();
    case RED:
        return doRed();
    default: 
        //Ugh! I gotta do something, but have no clue what to do!
        //Let's abort the app and turn the semaphore offline.
        throw new AssertionError("Dead code reached");
} 
An assert statement would not compile when a return is required.

However, in some cases, it might be more suited to throw an exception instead of an error, for example an UnsupportedOperationException. In the above case an exception barrier could catch that, then turn all semaphores at the intersection to blinking orange for a minute, and then restart the green interval as it would from a clean program start and continue normal operation. That would have several advantages:
  • be cheaper than sending a technician
  • the semaphore is off for short time only
  • while it's off it's blinking orange, rather than being totally off, that seems more secure
The problem would be detected in both cases. With an AssertionError it's in the output for free. And with UnsupportedOperationException it's the task of the one catching the exception to log it.

So again, the choice is between a hard abort, or giving the program the chance to recover and continue if a higher level decides to do so.


What I'm missing: detection and logging!

In the production scenarios 2 and 3 from above I'd want a 3rd way of handling assertions, which is not offered by Java's assertion feature.

Java has 2 strategies:
  1. Don't even check, hence no abort
  2. Check, and conditionally abort or continue
And I'm missing the 3rd one with the behavior:
  1. Run assertion code to detect bugs and don't abort, but instead log.
    It's a clear bug, it's detected, so log it. There's no cheaper way to detect it than right here, in the assert code that was written already.
 

Configure your IDE to run with assertions

It strikes me that the default runconfig in the most common IDEs does not have assertions enabled. It actually happened to me that I had spent way too much time chasing a bug, when the assertion would have shown it instantly - but they weren't on.

In IntelliJ IDEA: You need to modify the Defaults for all kinds that you're using: Application, JUnit, TestNG, ... and changing the defaults does not change the runconfigs you've created earlier. And you need to do this in every project, separately. (Please add a comment if you know how to set this once and for all!)


For Eclipse there's an explanation here http://stackoverflow.com/questions/5509082/eclipse-enable-assertions also see the 2nd answer about JUnit tests.

Monday, November 4, 2013

Java toString(): the Program Logic vs. Debug Dilemma

I'll start with my real world problem of how to implement toString() for my new class, followed by an analysis of how Java uses toString(), and finishing with my conclusions.


Real world problem: my class

Here's the (stripped down for simplicity) version of my new class:

1:  /**  
2:   * Represents a possibly multi-byte character and provides information about it.  
3:   * <p>A UnicodeCharacter is the equivalent of a Java "code point".</p>  
4:   */  
5:  public final class UnicodeCharacter {  
6:    private final int codepoint;  
7:    public UnicodeCharacter(int codepoint) {  
8:      this.codepoint = codepoint;  
9:    }  
10:    public int getCodepoint() {  
11:      return codepoint;  
12:    }  
13:    /**  
14:     * @return Array with usually 1 character, 2 characters for multi-byte.  
15:     */  
16:    public char[] getChars() {  
17:      return UCharacter.toChars(codepoint);  
18:    }  
19:    /**  
20:     * @return Tells if the {@link #getCodepoint codepoint} is for a surrogate pair. 
21:     */  
22:    public boolean isMultiByte() {  
23:      return getChars().length>1;  
24:    }  
25:    /**  
26:     * @return For example the category UCharacterCategory.UPPERCASE_LETTER  
27:     */  
28:    public Byte getCategory() {  
29:      return (byte) UCharacter.getType(codepoint);  
30:    }  
31:    /**  
32:     * See {UCharacter#getName}  
33:     */  
34:    public String getName() {  
35:      return UCharacter.getName(codepoint);  
36:    }  
37:  }  


Now for toString(), what should it be?

Suggestion 1: for the human user, debug output

1:    public String toString() {  
2:      return "UnicodeCharacter{" +  
3:          "cp=" + codepoint +  
4:          " ,string='" + getString() +  
5:          " ,name=" + getName() +  
6:          '}';  
7:    }  
Example output:
  • UnicodeCharacter{cp=65,string='A',name=LATIN CAPITAL LETTER A}
  • UnicodeCharacter{cp=1040,string='А',name=CYRILLIC CAPITAL LETTER A}
  • UnicodeCharacter{cp=32,string=' ',name=SPACE} 

Suggestion 2: for the machine, program logic, concatenateable

1:    public String toString() {  
2:      return UTF16.valueOf(codepoint);  
3:    }  

Example output for the same 3 (yes 3) characters:
  • A
  • А
  •  

Both have their advantages and drawbacks. It certainly needs a concatenateable method, but that can be named "getString()". The debug method is nice to have - you see, just by looking at the characters you can't tell whether it's the Latin or Cyrillic A, or what kind of whitespace it is.

Unfortunately, in this case, expanding the object in the debugger doesn't help, because solely the code point is a property of the object. The other information (character and name) are computed:




Analysis of Java's toString()

Here's what Java has to say about Object.toString():
Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read.
Both my suggestions follow the specification. The one with debug info is easier to read for a person. But I'm not sold yet.

Who calls toString()?

  • Java itself:
    When doing string concatenation: String s = myString + myCar;
    Is the same as doing String s = myString.toString().concat(myCar.toString());
  • JDK methods:
    String.valueOf(Object obj), and thus every method that uses this such as PrintStream.print(Object o).
    Arrays.toString(Object[] a) for every object in the array a.
    StringBuilder and StringBuffer: the toString() method is used as the build method.
  • Logging:
    System.out.println().
    Your favorite logging framework.
  • Debugging:
    Your favorite IDE in the debugger.

Hrm. So there are mainly 2 uses:
  1. String representation: toString() returns the object's value "as string" as close as possible.
    It is absolutely required to override toString(), and to do it in this way.
  2. Debug information: the object's values for the human.
    For example IntelliJ IDEA's default toString() template generates this kind.
    It's just nice to have.
Sometimes, as an additional benefit, the object provides a constructor accepting that string as a parameter to re-create it. Example: Integer.toString() and new Integer(String).

How does Java in the JDK define toString() in their classes?

For some simple value classes there's not much choice. Integer for example: returning "-43" makes sense.

Character could return more than just the character as string, but it does not. String could tell the length and cut it if it's too much, but it does not. StringBuilder and StringBuffer could report the appended chunks separately, and tell how many, and cut, but they don't. If they would, the classes would need a separate method for string concatenation, and concatenation with + would not work anymore. Now here's an observation: They all implement CharSequence, which was added in JDK 1.4, and it overrides the toString() method signature just to say something about it:
Returns a string containing the characters in this sequence in the same order as this sequence. The length of the string will be the length of this sequence.
So that's why.

Conclusions

My class UnicodeCharacter is a wrapper around a unicode codepoint just like Character is a wrapper around the char primitive. It's a character supporting those that don't fit into a char. And as such it really should implement the CharSequence interface. Then the decision is made: toString() must be suggestion 2, only returning the character's string value.

In some rare cases it would be nice to have 2 different methods: one for the string value (toString()) and one for the debug info (toDebug() or toDebugString()). The method could be defined in Object, with a default implementation: calling toString().


Wednesday, October 30, 2013

Password-protect Play2 Framework Webapp

There are numerous extensions to bring authentication and access control to your Play based website. For example the Deadbolt 2 authorization system lets you define access rights per controller and method. Or you can roll your own, as this step by step guide shows.

What to do if you just want basic http authentication for the whole site? As of today there's no such thing built in. (If this changes, let me know...)

If you're serving your Play pages through Apache as reverse proxy, you're lucky.

Going through a reverse proxy is a good idea anyway:
  • You get the option for load balancing and failover: run multiple instances.
  •  Run multiple Play sites on the same machine, on whatever port number, and expose them all on port 80 to the outside.
 Your apache website definition looks simething like this:
<VirtualHost *:80>
  ServerName my-play-app.example.com
  ProxyPreserveHost on
  <Location />
    ProxyPass http://192.168.56.100:9000/ connectiontimeout=9999999 timeout=9999999
    ProxyPassReverse http://192.168.56.100:9000/

    AuthType Basic
    AuthName "whatever" 
    AuthUserFile /etc/apache2/sites-available/.htpasswd-mysite
    Require valid-user
  </Location>
</VirtualHost>
I'm running my Play app in a virtual machine (192.168.56.100) on standard port 9000. The connection timeout is there so that long running tasks are not aborted by the proxy.

Create the password file as usual:
cd /etc/apache2/sites-available/
htpasswd -c .htpasswd-mysite newuser

And then refresh Apache, and you're done:
service apache2 reload


Wednesday, October 9, 2013

IntelliJ IDEA stopped working after Windows 7 Updates

(This post is only of interest to people having the same situation, probably finding this through Google search.)

Problem

After finishing the last batch of Windows 7 updates with some issues (blue screen), my computer was fine, all programs worked normally... except for one: IntelliJ IDEA just wouldn't start. I was running version 12.1.3, and upgrading to 12.1.6 would not fix it. Also, the earlier and still installed versions 11 would not work.

The process started normally, as seen here, but the application was missing:


The Windows event viewer had no message about it.

There was a thread dump in the IntelliJ IDEA log folder c:\users\my-username\.IntelliJIdea12\system\log\threadDumps-timestamp\...txt with a time of the program start, but the text within the file was not helpful to me.

Solution

What solves the problem is to right-click the program in the Windows start menu, and select "Run as administrator".



Thanks to my wife for figuring it out.