Wednesday, June 04, 2008

An interesting leak when using WeakHashMaps

Bharath Ganesh
describes in the blog post
Thoughts around Java, Web Services, IT: The interesting leak

a very interesting leak, where a WeakHashMap doesn't seem to release entries that no longer seem to be referenced.
Specifically, when interned String literals are used as keys, the entries stay in the WeakHashMap even after all hard references seem to be removed.

I say "seem to be removed" because there actually is still a reference from the class, which is still loaded, to the interned String literal.

Unfortunately, using a heap dump to find this out does not work, because this implicit reference is (currently) not written to the heap dump file.

Changing the example to:

import java.io.IOException;
import java.util.Map;
import java.util.WeakHashMap;
import junit.framework.*;

public class TestWeakHashMap extends TestCase {
    private String str1 = new String("newString1");
    private String str2 = "literalString2";
    private String str3 = "literalString3";
    private String str4 = new String("newString4");
    // Interned at runtime; no literal with this value exists in the class
    private String str5 = (str4 + str1).intern();

    private Map<String, Object> map = new WeakHashMap<String, Object>();

    public void testGC() throws IOException {
        map.put(str1, new Object());
        map.put(str2, new Object());
        map.put(str3, new Object());
        map.put(str4, new Object());
        map.put(str5, new Object());

        // Discard the strong reference to all the keys
        str1 = null;
        str2 = null;
        str3 = null;
        str4 = null;
        str5 = null;

        // Runs until interrupted; watch the output while it loops
        while (true) {
            // Request a collection on each pass (only a hint to the VM)
            System.gc();
            /*
             * Verify Full GC with the -verbose:gc option.
             * We expect the map to be emptied as the strong references to
             * all the keys are discarded.
             */
            System.out.println("map.size(); = " + map.size() + " " + map);
        }
    }
}

You will find that the entry for str5 is reclaimed by the garbage collector even though str5 is interned, while the entries for the literals str2 and str3 stay.

Using String literals defined in your classes as keys in a WeakHashMap might not do what you want, but using interned Strings in general as keys for a WeakHashMap is safe.
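
A smaller, JUnit-free sketch of the same contrast may help. It assumes a VM on which System.gc() actually triggers a collection (it is only a hint), and the class name WeakKeyDemo and both key values are made up for illustration:

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeyDemo {

    /**
     * Puts one literal key and one heap-allocated key into a WeakHashMap.
     * Both local references are gone once the method returns.
     */
    static Map<String, Object> buildMap() {
        Map<String, Object> map = new WeakHashMap<>();
        String literal = "literalKey";          // the loaded class keeps this instance reachable
        String copy = new String("copiedKey");  // distinct object, referenced only by this local
        map.put(literal, new Object());
        map.put(copy, new Object());
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Object> map = buildMap();
        // Ask for a collection a few times; stop early once only one entry is left.
        for (int i = 0; i < 10 && map.size() > 1; i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("keys still present: " + map.keySet());
    }
}
```

The literal key should still be present however often the collector runs, because the loaded class keeps it reachable. Whether and when the copied key disappears depends on the collector.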


Anonymous said...

Good point, but I don't think you describe the problem well enough, nor do I agree with the conclusion. The WeakHashMap is "A hashtable-based Map implementation with weak keys". What anyone should know is that there is something called a constant pool (and I assume that is what you mean by "... still a reference from the Class... "). And as far as I can recall from my compiler course, these are not guaranteed to be purged. This is the reason the literal strings (constants) are still in the map. The same is the case if you use Integer int1 = 2; as a key (the 2 is autoboxed). It too is stored in the constant pool (and whatever they decide to stuff into the language in the future). With this in mind it makes perfect sense why they are not removed.

And when you say it's safe to use interned strings for keys... what if some other class defines "literalString3" before your intern() call is executed? Then you are suddenly using a key allocated in an unknown constant pool!

Anonymous said...

In the last section I obviously mean, "newString4newString1" (which is equal to str5) and not "literalString3".
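
The scenario raised here can be sketched in a few lines. This is only an illustration: the class name InternIdentityDemo and the constant LITERAL are hypothetical, with LITERAL standing in for "some other class" that happens to define the same text as a literal:

```java
public class InternIdentityDemo {

    // Hypothetical literal with the same characters as str5 in the post's example.
    static final String LITERAL = "newString4newString1";

    /** Returns true when intern() hands back the same instance that backs the literal. */
    static boolean internReturnsLiteralInstance() {
        String str1 = new String("newString1");
        String str4 = new String("newString4");
        String concatenated = str4 + str1;        // built at runtime, so a separate object
        return concatenated.intern() == LITERAL;  // identity comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println(internReturnsLiteralInstance());
    }
}
```

Since intern() returns the canonical instance and string literals are themselves interned, the key obtained this way is the literal's instance. A WeakHashMap entry keyed by it would then be expected to stay for as long as the class defining the literal remains loaded.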

Unknown said...

Hi Johannes,
Sure there is a constant pool used for String literals.

But the main point is that the class can be unloaded when a full GC runs, since a full GC also cleans up the perm area where classes and the constant pool are stored. If the class is not referenced anymore, the full GC will unload it (on any recent VM) and also reclaim the constant.

To be honest, I don't know whether the spec mandates that behavior, but in practice that's the way it works.
I still believe that String.intern() can be safely used in this scenario because it doesn't hurt.
If the constant was there in the first place, it will stay as long as the class is there. So you don't lose anything by calling String.intern().

Anonymous said...

Your follow-up post surprised me:-)
I have continued my post -

Anonymous said...

Keep up the good work.