Tuesday, March 23, 2010

ConcurrentHashMap fat memory footprint

While running product sizing tests, we've found that an over enthusiastic usage of ConcurrentHashMap (CHM) had evaporated  a good ~170MB of much needed heap space (we ran with a 1.5GB heap).

As it turns out, a empty CHM weighs around 1700B. Yes, I'm talking about a map with no entries at all, just the plumbing!
We used a CHM to store user session attributes, having 100,000 user sessions generated 100K CHM instances worth 170MB of heap (100K times 1.7KB).
We took measurements using the super Eclipse MAT.

The obvious solution for saving these scares 170MB, was to switch from a CHM to a Hashtable. A Hashtable cost only around 150B per instance (8% of a CHM).
Other possible solutions could have been: moving to a list structure (seek time is not an issue as we rarely have more than 4-5 attributes per session), or resorting to a an array of Objects.

Change implications:

1. Performance - The product doesn't have any user scenario that cause multiple threads to concurrently access the same session attributes map, so we don't expect any performance loss, on the contrary, I'm expecting a hashtable to prove faster for single thread access, over a CHM.

2. Thread safety is a low risk aspect, as both CHM and HT provide the same basic guarantees for a single API operation (e.g., map.get(key)).

To conclude, a CHM is a good idea when you have a shared map structure suffering from a high R/W thread access contention. But dragging behind itself such a large memory footprint, CHM is not ideal to use in masses, or when concurrency performance is not the focus.

P.S
A CHM automatically allocates 16 segments, each with a 16-element array - one best practice is to measure the average map population during your product's sizing tests, and initialize the CHM with the minimum initialCapcity and loadFactor, required to contain your usage.