Having recommended podcasts before, I now want to recommend Ran Tavori's and Ori Lahav's great Hebrew-language software development podcast: רברס עם פלטפורמה.
Some of my favorite episodes include:
50. Content acceleration and CDNs
46. Multiple data-centers
45. References in Java
42. Garbage Collection (including myself with a guest appearance)
39. Designing products for the military
28. MySQL
25. Data centers
22. Internet products and usability in general
20. Introduction to Django
18. Erlang
17. Key-Value DB products
15. ASP.NET with Yossi Tagori
13. Scalability
10. SundaySky real-time video generation
8. Think twice before you debug instead of trace
If you know any other good software development podcasts in Hebrew, please comment here to let me and the world know.
Wednesday, March 24, 2010
Tuesday, March 23, 2010
ConcurrentHashMap fat memory footprint
While running product sizing tests, we found that overenthusiastic use of ConcurrentHashMap (CHM) had evaporated a good ~170MB of much-needed heap space (we ran with a 1.5GB heap).
As it turns out, an empty CHM weighs around 1700B. Yes, I'm talking about a map with no entries at all, just the plumbing!
We used a CHM to store user session attributes, so 100,000 user sessions generated 100K CHM instances worth 170MB of heap (100K times 1.7KB).
We took measurements using the super Eclipse MAT.
The obvious solution for saving this scarce 170MB was to switch from a CHM to a Hashtable. A Hashtable costs only around 150B per instance (roughly 9% of a CHM).
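As a quick back-of-the-envelope check of these numbers, here is a minimal sketch. The class and method names are made up for illustration, and the per-instance sizes are simply the figures quoted above; the code does not measure anything itself.

```java
// Back-of-the-envelope heap estimate. FootprintEstimate and totalBytes are
// hypothetical names; the byte counts are the measured figures from the post.
final class FootprintEstimate {

    /** Total bytes consumed by `instances` objects of `bytesPerInstance` each. */
    static long totalBytes(long instances, long bytesPerInstance) {
        return instances * bytesPerInstance;
    }

    public static void main(String[] args) {
        long chm = totalBytes(100_000, 1_700); // one empty CHM per session
        long ht = totalBytes(100_000, 150);    // one empty Hashtable per session
        System.out.println("CHM:       " + chm / 1_000_000 + " MB");
        System.out.println("Hashtable: " + ht / 1_000_000 + " MB");
        System.out.println("Saved:     " + (chm - ht) / 1_000_000 + " MB");
    }
}
```

By this estimate the empty maps shrink from about 170MB to about 15MB, so the switch frees roughly 155MB of the 170MB.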
Other possible solutions could have been moving to a list structure (seek time is not an issue, as we rarely have more than 4-5 attributes per session) or resorting to an array of Objects.
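To give a feel for the array-of-Objects alternative, here is a minimal sketch. It is not what we shipped; SmallAttributeStore is a hypothetical class, and it assumes non-null String keys and only a handful of attributes per session, so a linear scan is cheap. The methods are synchronized so that a single call keeps the same basic guarantee a Hashtable gives.

```java
import java.util.Arrays;

// Hypothetical array-backed attribute holder for a handful of entries.
// Keys live at even indexes, their values at the following odd index.
final class SmallAttributeStore {
    private Object[] slots = new Object[0];

    /** Returns the value stored under key, or null if the key is absent. */
    synchronized Object get(String key) {
        for (int i = 0; i < slots.length; i += 2) {
            if (slots[i].equals(key)) {
                return slots[i + 1];
            }
        }
        return null;
    }

    /** Stores the value and returns the previous one, or null if the key is new. */
    synchronized Object put(String key, Object value) {
        for (int i = 0; i < slots.length; i += 2) {
            if (slots[i].equals(key)) {
                Object old = slots[i + 1];
                slots[i + 1] = value;
                return old;
            }
        }
        // New key: grow by exactly one key/value pair to keep the footprint minimal.
        slots = Arrays.copyOf(slots, slots.length + 2);
        slots[slots.length - 2] = key;
        slots[slots.length - 1] = value;
        return null;
    }

    synchronized int size() {
        return slots.length / 2;
    }
}
```

Growing by one pair at a time trades a little copying on insert for the smallest possible footprint, which seems a fair deal at 4-5 attributes per session.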
Change implications:
1. Performance - The product doesn't have any user scenario that causes multiple threads to concurrently access the same session attributes map, so we don't expect any performance loss. On the contrary, I expect a Hashtable to prove faster than a CHM for single-thread access.
2. Thread safety - This is a low-risk aspect, as both CHM and Hashtable (HT) provide the same basic guarantees for a single API operation (e.g., map.get(key)).
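Note that the guarantee covers a single call only. A sequence of calls (check, then act) is not atomic on either map unless you make it so. The sketch below shows one way this could look with a Hashtable; SessionAttributes and setIfMissing are hypothetical names, not code from our product.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical wrapper around a per-session Hashtable.
final class SessionAttributes {
    private final Map<String, Object> attrs = new Hashtable<String, Object>();

    // Single operations are already synchronized inside Hashtable.
    Object get(String key) {
        return attrs.get(key);
    }

    void set(String key, Object value) {
        attrs.put(key, value);
    }

    /**
     * Check-then-act: stores value only if key is absent and returns the
     * value that ends up in the map. Hashtable's own methods lock on the
     * table instance, so holding that same monitor keeps other threads from
     * interleaving between the get and the put.
     */
    Object setIfMissing(String key, Object value) {
        synchronized (attrs) {
            Object existing = attrs.get(key);
            if (existing != null) {
                return existing;
            }
            attrs.put(key, value);
            return value;
        }
    }
}
```

With a CHM the same step would typically be a single putIfAbsent call, so this is worth a look when swapping one map for the other.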
To conclude, a CHM is a good idea when you have a shared map structure suffering from high read/write thread contention. But since it drags such a large memory footprint behind it, a CHM is not ideal to use en masse, or when concurrent performance is not the focus.
P.S.
A CHM automatically allocates 16 segments, each with a 16-element array. One best practice is to measure the average map population during your product's sizing tests, and initialize the CHM with the minimum initialCapacity and loadFactor required to contain your usage.
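A minimal sketch of such an initialization follows. CompactMaps and newCompactMap are hypothetical names, and expectedEntries stands for whatever figure your own sizing tests produce. The three-argument constructor also takes a concurrencyLevel; in the segment-based implementations (before Java 8) this controls how many segments are created, while later versions treat it only as a sizing hint, so it is worth re-measuring on your own JVM.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical factory for small, rarely contended maps.
final class CompactMaps {

    /**
     * Creates a CHM sized for expectedEntries (an assumed figure taken from
     * sizing tests) and for a single updating thread.
     */
    static <K, V> ConcurrentMap<K, V> newCompactMap(int expectedEntries) {
        float loadFactor = 0.75f;  // the documented default load factor
        int concurrencyLevel = 1;  // estimated number of concurrently updating threads
        return new ConcurrentHashMap<K, V>(expectedEntries, loadFactor, concurrencyLevel);
    }
}
```

For our session attributes, something like `CompactMaps.<String, Object>newCompactMap(5)` would be the call to measure against the default constructor.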