So what exactly is the dominator tree?
Dominators
An object A dominates on an object B if all the paths to object B pass through object A.
Remember that the Garbage Collector removes all objects that are not referenced anymore. If A dominates B and A could be removed from memory, that means that there's no path anymore that leads to B. B is therefore unreachable and would be reclaimed by the Garbage Collector.
One could also say that A is the single object that is responsible for B still being there!
The Dominator Tree
Using the the "dominates" relationship we can create a dominator tree out of the the graph of objects in memory. At each node of this tree we store the amount of memory that would be freed (= retained size).
At the top of the tree we have a "virtual" root object, which we also use to represent objects that don't have "real" single dominator.
Here's an example of an object tree (on the left) and the corresponding dominator tree (on the right) :
- Note that A, B and C are dominated by a "virtual" root object.
- Note that the dominator relationship is transitive;C dominates E which dominates G therefore C also dominates G.
Because of the transitivity of "dominated", the retained size of a parent object within the dominator tree is always greater than the sum of it's child objects.
To get the biggest objects we just need to sort the second level of the dominator tree (the first level is the "virtual" root object) by retained size.
Now if you are looking to find a memory leak, and you have no a priori knowledge that could help you, the typical approach is to run a test that reproduces the leak, and then try to somehow figure out what is leaking.
Do we really need Object allocations tracing?
In my experience people often seem to believe that finding leaks requires recording object creations, also called "object allocations tracing "sometimes, because you want to know where in the code objects are always allocates but never released.
Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft) has an example of how this method for finding leaks can be applied to .NET applications. Dear Microsoft I encourage you to look at the Eclipse Memory Analyzer ;)
All you need is the dominator tree
With the dominator tree finding these kind of leaks is much easier and you don't need the high overhead of allocation tracing, which is typically not acceptable in a production environment. Allocation tracing has it's uses, but in general IMHO it is overrated.
It's almost guaranteed that a significant memory leak will show up at the very top of the dominator tree!
The Eclipse Memory Analyzer not only supports the dominator tree but can even find memory leaks automatically, based on some heuristics.
The dominator tree can also be used to find high memory usage in certain areas, which is a much harder problem. More on this in the next post.