Sunday, October 15, 2006

Code Advice #14: Don't initialize fields to default values

WARNING: This blog entry was imported from my old blog on blogs.sun.com (which used different blogging software), so formatting and links may not be correct.


(See intro for a background and caveats on these coding advice blog entries.)



If you've been coding in C, you've probably picked up the habit of initializing all your fields:


char *foo = NULL;
int bar = 0;




This is necessary because in C, memory can be left uninitialized: a local variable or a freshly malloc'ed block holds whatever happened to be there, so there's no telling what value foo will have before it is assigned something.



In Java, however, the language specification clearly defines default values, so virtual machines will for example always initialize reference fields to null.



Specifically, this means that code like the following is redundant:


private int foo = 0;
private Bar bar = null;
private boolean baz = false;




The alternative form, initializing the fields explicitly in the constructor, is equally redundant:


private int foo;
private Bar bar;
private boolean baz;

public Foo() {
    foo = 0;
    bar = null;
    baz = false;
}



You can leave out the above initializations, and the program will behave the same way. Carl Quinn once convinced me that omitting them is more readable, so I picked up the habit, and I now swear by it. On the one hand, you can argue that leaving the explicit initializations in is more readable because you're making your intent really clear. On the other hand, Java programmers quickly learn what the default values are and understand that an uninitialized field is the same as a nulled-out field.
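As a quick reminder of what those defaults are, here is a minimal sketch. The class and field names are made up for illustration and do not come from any real code base.

```java
// Hypothetical class for illustration only.
class Defaults {
    int count;        // numeric fields default to zero
    long total;       // 0L
    double ratio;     // 0.0
    boolean enabled;  // false
    char letter;      // the NUL character
    String name;      // reference fields default to null
    int[] values;     // arrays are references too, so also null

    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count);    // prints 0
        System.out.println(d.enabled);  // prints false
        System.out.println(d.name);     // prints null
        // This applies to fields and array elements only. Local variables
        // have no default value and must be assigned before they are read,
        // otherwise the code does not compile.
    }
}
```

Note that none of the fields above has an initializer, and no constructor assigns them.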



It turns out that explicit and implicit initialization are not exactly identical! If you disassemble the above code (for example with javap -c), you'll find the following bytecode is generated for the explicit assignments:


4: aload_0
5: iconst_0
6: putfield #2; //Field foo:I
9: aload_0
10: aconst_null
11: putfield #3; //Field bar:LBar;
14: aload_0
15: iconst_0
16: putfield #4; //Field baz:Z



If you leave out the initializations, none of the above bytecode is generated. In a simple microbenchmark there was a measurable time difference (about 10%) between initializing a handful of fields and leaving them uninitialized, so it's clear that HotSpot doesn't completely eliminate the difference here. (<speculation>Perhaps it just zeroes out the whole object memory block when allocating a new object, relying on the default values to all be represented as zeroes natively, and this is done even when all fields are later initialized?</speculation>)
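The benchmark code itself is not reproduced here, so the following is only a rough sketch of how such a comparison might be set up. All class and method names are hypothetical, and hand-rolled timing loops like this are notoriously sensitive to JIT warm-up and garbage collection; a harness such as JMH would give more trustworthy numbers.

```java
// Hypothetical classes; this is NOT the benchmark used for the 10% figure.
class Explicit {
    private int foo = 0;
    private Object bar = null;
    private boolean baz = false;
}

class Implicit {
    private int foo;
    private Object bar;
    private boolean baz;
}

class InitBenchmark {
    // Storing each object in a static field keeps it reachable, which makes
    // it less likely that the JIT optimizes the allocation away entirely.
    static Object sink;

    static long timeExplicit(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = new Explicit();
        }
        return System.nanoTime() - start;
    }

    static long timeImplicit(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = new Implicit();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 10_000_000;
        // Warm-up rounds so both loops are compiled before measuring.
        for (int round = 0; round < 5; round++) {
            timeExplicit(iterations);
            timeImplicit(iterations);
        }
        System.out.println("explicit: " + timeExplicit(iterations) / 1_000_000 + " ms");
        System.out.println("implicit: " + timeImplicit(iterations) / 1_000_000 + " ms");
    }
}
```

Results will vary by JVM version and hardware, so treat any difference you see as indicative at best.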



Obviously, speed should not be the deciding factor here. This was a microbenchmark that did nothing but construct objects with and without initialization; it's unlikely that you'll find a measurable difference in a real program. What really matters is readability. I should also point out that FindBugs treats these two scenarios differently: if you print out a field value in the constructor, it will warn about an uninitialized field unless the field was explicitly initialized in its declaration.
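To make the FindBugs scenario concrete, here is a small sketch of the kind of code in question. The class and its fields are hypothetical, and whether a particular FindBugs version actually reports the first read has not been verified here; the point is only that both reads behave identically at runtime.

```java
// Hypothetical class for illustration only.
class Widget {
    private int size;        // no initializer: relies on the default value
    private int weight = 0;  // explicit initializer with the same value

    Widget() {
        // Both fields are read before the constructor assigns them.
        // Per the description above, only the read of 'size' would be
        // flagged as an uninitialized read.
        System.out.println(size);    // prints 0
        System.out.println(weight);  // prints 0
        size = 10;
        weight = 20;
    }

    int getSize() { return size; }
    int getWeight() { return weight; }

    public static void main(String[] args) {
        new Widget();
    }
}
```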



I can see arguments both for and against explicit field initialization, but I think this is one of those cases where convention wins. I personally find code cleaner and more readable when you leave your fields with implicit rather than explicit initialization.



P.S. Remember that null fields are often a bad idea; consider initializing them to null objects instead!
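For readers who haven't run into null objects before, here is a minimal sketch of the idea. The Notifier interface and Service class are invented for this example: instead of leaving the field null and checking it everywhere, the field starts out pointing at a do-nothing implementation.

```java
// Hypothetical types for illustration only.
interface Notifier {
    void send(String message);
}

// The null object: a valid Notifier that deliberately does nothing.
final class NullNotifier implements Notifier {
    static final Notifier INSTANCE = new NullNotifier();

    private NullNotifier() {}

    @Override
    public void send(String message) {
        // intentionally empty
    }
}

class Service {
    // Initialized to the null object rather than left as null.
    private Notifier notifier = NullNotifier.INSTANCE;

    void setNotifier(Notifier notifier) {
        // Fall back to the null object so the field is never null.
        this.notifier = (notifier != null) ? notifier : NullNotifier.INSTANCE;
    }

    void run() {
        // No null check needed here.
        notifier.send("running");
    }
}
```

This is one case where an explicit field initializer earns its keep, since the value differs from the language default.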