Friday, December 26, 2008

Saving on memory usage in Java #1 - the Byte.valueOf method

Say you wanna keep in memory a list of martial arts experts and their respective shoe size. One way to implement it would be to populate a Map structure with the following sets of key and value:

Map map = ...
map.put("Jean-Claude Van Damme", new Byte(45));
map.put("Jet Li", new Byte(45));
map.put("Chuck Norris", new Byte(112));
...
map.put("person number million", new Byte(45));

What if your JVM runs on a Lego mechanical computer that has a very limited amount of memory, you would probably want to save on memory wherever possible.

[caption id="attachment_65" align="alignnone" width="300" caption="A state of the art 3Hz Lego computer"]A Lego computer[/caption]

Autoboxing anybody?


Keeping in mind that an object instance weights much more than just the primitive it holds, as it hold additional "plumbing" data (monitor, etc). Even an Object class instance weighs 8 bytes while not holding to any application information. What about keeping only primitives as the map value?
Autoboxing, introduced in Java 5 onwards, allows to pass a byte primitive argument instead of a Byte object instance argument in the following manner:

map.put("Bruce Lee", 42);

Does this help us avoid the costly Byte Objects? Not really, the auto-boxing feature, as the name hints, just statically replaces the 42 literal with a new Byte object instance, this is done during compilation. So there's no real saving opportunity here, and we're back where we started.

[caption id="attachment_66" align="alignnone" width="300" caption="AutoBoxing"]AutoBoxing[/caption]

How about a plain old cache?


Examining the code above, you notice that you are creating one million unique Byte objects to hold the fighters' shoe size, even though there are only 256 different shoe size values. Is this a venue for saving?
Considering the fact that Byte objects are immutable, why not have just a single Byte object for each distinct byte value (we'll need only 256 instance to cover all values). This way we'll pass the same Byte instance to all people with a 45 shoe size, Jean-Claude and Jet-Li map in our case. This will reduce the number of Byte instance from a million to only 256. Sounds super!

[caption id="attachment_67" align="alignnone" width="201" caption="Too much objects in memory"]Memorizing too many objects is hard[/caption]

How do you implement this? You'll might rush into initializing an array of 256 Byte objects during application start-up, giving birth to something of this sort:

// init instances array
int RANGE_OF_VALUES = 2^8; // we don't care about negatives
Static Byte constShoeSizes = new Byte[
RANGE_OF_VALUES];
for (byte b=0; b<
RANGE_OF_VALUES; b++) {
constShoeSizes[b] = new Byte(B
yte.MIN_VALUE + b);
}
map.put("Jean-Claude Van Damme", constShoeSizes[45]);
map.put("Jet Li", constShoeSizes[45]);
map.put("Chuck Norris", constShoeSizes[112]);

Enter the valueOf() method


WHOA! Hold you horses! Doesn't this use case seems to be just too common and trivial?! haven't the Java language designers and implemented came accross the same problem? Surely, some of the JRE classes themselves must have Byte instances data members. In an effort to reduce the JRE memory footprint, won't the JRE programmers cache instances using something very much like the static Byte array we implemented ourselves?
The short answer of course is YES! Java 5 presents a new overloaded Byte.valueOf(byte b) method. This method returns a reference to a Byte instance taken from a shared cache. This trivial cache strategy save memory and CPU, as there's no need to construct new objects and later on garbage collect them.
Here's the relevant Byte.valueOf method source code taken from Byte.java source:

private static class ByteCache {
private ByteCache(){}


static final Byte cache[] = new Byte[-(-128) + 127 + 1];

static {
for(int i = 0; i < cache.length; i++)
cache[i] = new Byte((byte)(i - 128));
}
}
...
public static Byte valueOf(byte b) {
final int offset = 128;
return ByteCache.cache[(int)b + offset];
}


Using the valueOf method, here's how the final version of our code will look like:

map.put("Jean-Claude Van Damme", Byte.valueOf(45));
map.put("Jet Li", Byte.valueOf(45));
map.put("Chuck Norris", Byte.valueOf(112));

Wrapping up quickly:



  1. From Java 5 onwards, use the valueOf method for Number extenders like: Byte, Short, and Integer. Notice that as the Integer object has 2^32 different values, only the (-128) to 127 values range is cached. Meaning that expression (Integer.valueOf(129)==Integer.valueOf(129)) will always be false, since it returns a new Integer object on every call.
    Other object types (Double,Float, etc...) valueOf method does not implement a cache at all. If your value range is limited in nature, you might choose to create a caching scheme of your own.

  2. Always be on the lookout and Inspect repetetive Instance creation closely, see if you can avoid it by referencing an shared immutable object, or by borrowing an instance from an object pool.

  3. Strings can have an even larger space and time performance gains than numbers objects, though at the same time they are inherently harder to reuse. You might want to take time to learn about Strings instances reuse strategies; start with the String.intern() method.