Monday, December 15, 2008

How much memory is used by my Java object?

There's currently some buzz around the size of certain Java objects (Update: more on this below). Here is a very good description of what you have to take into account to compute the (shallow) size of a Java object.
I repost here the rules from my old blog at SDN, because I think they are a more compact description of the memory usage of Java objects.

The general rules for computing the size of an object on the SUN/SAP VM are:

32 bit

Arrays of boolean, byte, char, short, int: 2 * 4 (Object header) + 4 (length-field) + sizeof(primitiveType) * length -> align result up to a multiple of 8

Arrays of objects: 2 * 4 (Object header) + 4 (length-field) + 4 * length -> align result up to a multiple of 8

Arrays of longs and doubles: 2 * 4 (Object header) + 4 (length-field) + 4 (dead space due to alignment restrictions) + 8 * length

java.lang.Object: 2 * 4 (Object header)

other objects: sizeofSuperClass + 8 * nrOfLongAndDoubleFields + 4 * nrOfIntFloatAndObjectFields + 2 * nrOfShortAndCharFields + 1 * nrOfByteAndBooleanFields -> align result up to a multiple of 8


64 bit

Arrays of boolean, byte, char, short, int: 2 * 8 (Object header) + 4 (length-field) + sizeof(primitiveType) * length -> align result up to a multiple of 8

Arrays of objects: 2 * 8 (Object header) + 4 (length-field) + 4 (dead space due to alignment restrictions) + 8 * length

Arrays of longs and doubles: 2 * 8 (Object header) + 4 (length-field) + 4 (dead space due to alignment restrictions) + 8 * length

java.lang.Object: 2 * 8 (Object header)

other objects: sizeofSuperClass + 8 * nrOfLongDoubleAndObjectFields + 4 * nrOfIntAndFloatFields + 2 * nrOfShortAndCharFields + 1 * nrOfByteAndBooleanFields -> align result up to a multiple of 8



Note that an object might have unused space due to alignment at every inheritance level (e.g. imagine a class A with just a byte field, and a class B that has A as its superclass and declares a byte field itself -> 14 bytes wasted on a 64-bit system).
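To make the alignment waste concrete, here is that example worked out with the 64-bit rules above (a back-of-the-envelope sketch, not a measurement):

class A { byte a; }            // 16 (object header) + 1 (byte field) = 17 -> aligned up to 24 bytes
class B extends A { byte b; }  // 24 (sizeof A) + 1 (byte field) = 25 -> aligned up to 32 bytes

// B stores only 2 bytes of field data but occupies 32 bytes:
// 32 - 16 (header) - 2 (fields) = 14 bytes are lost to alignment padding.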

In practice, 64 bit needs 30 to 50% more memory, mainly because references are twice as large.


[UPDATE]


How much memory does a boolean consume?


Coming back to the question of how much memory a boolean consumes: yes, it does consume at least one byte, but due to alignment rules it may consume much more. IMHO it is more interesting to know that a boolean[] will consume one byte per entry and not one bit, plus some overhead due to alignment and for the size field of the array. There are graph algorithms where large fields of bits are useful, and you need to be aware that, if you use a boolean[], you need almost exactly 8 times more memory than really needed (1 byte versus 1 bit).

Alternatives to boolean[]

To store large sets of booleans more efficiently, the standard java.util.BitSet can be used. It packs the bits into a compact representation that uses only one bit per entry (plus some fixed overhead).
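A minimal usage sketch (the numbers are just for illustration):

import java.util.BitSet;

public class BitSetExample
{
    public static void main(String[] args)
    {
        int entries = 10000000;            // ten million flags
        BitSet visited = new BitSet(entries);

        visited.set(42);                   // mark entry 42
        boolean isVisited = visited.get(42);
        visited.clear(42);                 // unmark it again

        // As a boolean[] the same flags would need roughly 10 Mbyte,
        // the BitSet needs roughly 1.25 Mbyte (one bit per entry).
        System.out.println("entry 42 was visited: " + isVisited);
    }
}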

For the Eclipse Memory Analyzer we needed large fields of booleans (millions of entries), and we know in advance how many entries we need.
We therefore implemented a class BitField, which is faster than BitSet.

Still, we only use this class on single-core machines, because its disadvantage is that it is not thread safe. On multicore machines we use boolean[], and I will explain in a later post how that works, because it's pretty tricky.


Note that these rules tell you only the flat (= shallow) size of an object, which on its own is in practice pretty useless.
I will explain in my next post how you can better measure how much memory your Java objects consume.

Thursday, November 27, 2008

Slides for my MAJUG talk about the Eclipse Memory Analyzer available

I just uploaded the slides that I used two days ago at my talk for the Mannheim Java User Group (in German) to Slideshare. A little bit of information is lost because of the conversion to PDF, but overall the content should be there:

Eclipse Memory Analyzer MAJUG November 2008 [removed the live view because of performance problems with Slideshare]


The Slides are also available from here in PDF format.

Thursday, November 06, 2008

Is Eclipse bloated? Some numbers and a first analysis

On the e4 (Eclipse 4) mailing list there were lately some discussions about whether Eclipse is bloated or not, and what could be done to minimize bloat for e4. Some summary information can be found here.

I promised to get some numbers about how much memory is used by the JIT for the classes.

I finally managed it :)

So I took Eclipse 3.5M2 (without the web tools), configured it to use the SAP JVM (which is based on SUN's HotSpot, so the numbers should be comparable), and used the Eclipse Memory Analyzer (plus the non-open-source but free SAP extensions) to generate a CSV file that I then imported into IBM's Many Eyes (great stuff!) to visualize the data as a treemap.

The numbers do not include the overhead of the compiled code cache, which was around 10 Mbyte.

Here is the memory usage (in bytes) per bundle for Eclipse 3.5M2 after clicking around to activate as many plugins as possible (a more well-defined test would be a good idea):



A detailed analysis would be very time-consuming, because you would need to check for a lot of classes whether anything could be done to make them smaller. So for now here are just some interesting observations that I made when quickly looking over the results:
  • 12649 classes were loaded and consumed around 64 Mbyte in perm space
  • during the test I ran out of perm space, which was configured by default to 64 Mbyte, and the UI would freeze :(
  • there does not seem to be any large amount of generated code, and therefore optimizing classes might be difficult
  • the help system uses JSPs (generated classes) which are relatively big, although only a few of them were in memory
  • 247 relatively big classes in org.eclipse.jdt.internal.compiler were loaded twice, once by the Jasper plugin and once by the JDT core plugin
I also made a detailed (per class) visualization of the relatively large JDT bundle:




So is Eclipse really bloated?
To me it seems it's overall not very bloated. Remember that a developer PC these days probably has 2 Gbyte of RAM, so 64 Mbyte hardly matters.

The help system could probably be optimized, because it uses a complete servlet implementation (Jetty) including JSPs, as well as a complete text search engine (Lucene).

The heap consumption could be a bigger issue. I've seen things that you should almost always avoid, like holding parsed DOM trees in memory. I may explain these issues in a later post, and I will probably show some examples in my next talk.

Tuesday, November 04, 2008

Eclipse Memory Analyzer Talk

On 25 November I will give a talk (in German) about the Eclipse Memory Analyzer at the Mannheim Java User Group.
See here for the details.

I will explain how MAT works and will also show some real-world examples.

Monday, October 27, 2008

The knowledge about how much memory things need in Java is surprisingly low

I just came across the question "Size of a byte in memory - Java" at Stack Overflow.
OK, Stack Overflow might not be the place for high-quality answers, but that almost nobody gets close to a meaningful answer for this day-to-day question is a little bit shocking.

I documented the rules for Java memory usage of the SUN JVM at my old blog two years ago.

I don't expect people to know the rules exactly (I don't either), but some feeling for where a JVM aligns objects and what kind of overhead an object has is essential knowledge for Java programmers.
It's not really necessary to know all the rules, because the Eclipse Memory Analyzer knows about them.

Still, I think I understand that people might not know the details, because they are platform-dependent (32 bit versus 64 bit) as well as JVM-implementation-dependent, and have not been documented for a long time.

Not everybody has access to JVM hackers ;)


Let's come back to the question of how much a byte costs. A byte really costs 8 bits = 1 byte (that is defined by the Java spec), but the SUN JVM (other JVMs do that as well) aligns objects to 8 bytes.
So it all depends on what other fields you have defined in your object. If you have eight byte fields and nothing else, everything is properly aligned and you do not waste memory.

On the other hand, you could add a byte field to an object and not consume any more memory. You could, for example, have one field that only consumes 4 bytes; due to alignment you already waste 4 bytes, so adding the byte field costs nothing, because it fills in the padded bytes.

The same is true for byte arrays: they are aligned as well, and there's additional space needed for the length of the array. But for large byte arrays the average consumption per entry is very close to one byte.
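As a quick sanity check, here is the 32-bit array rule from those documented rules applied to a byte[] (a back-of-the-envelope sketch, not a measurement):

public class ByteArrayCost
{
    public static void main(String[] args)
    {
        int length = 1000;
        // Rule: 2 * 4 (object header) + 4 (length field) + 1 * length,
        // rounded up to the next multiple of 8.
        long raw = 2 * 4 + 4 + length;
        long aligned = (raw + 7) / 8 * 8;  // 1016 bytes for byte[1000]
        System.out.println(aligned + " bytes, "
                + ((double) aligned / length) + " bytes per entry");
    }
}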



New language for Google App Engine coming before Q2/2009

There were rumors lately that Google will support Java on its App Engine. Now at least the time frame is clear. Google just announced the roadmap for App Engine:

10/08 - 3/09

  • Service for storing and serving large files
  • Datastore import and export utility for large datasets
  • Billing: developers can pay for more resource usage
  • Support for a new runtime language
  • Uptime monitoring site




Given that Google internally only allows the use of C++, Python, Java and JavaScript, I think it's safe to bet that either Java and/or JavaScript is coming. My guess is that Java will be coming, and on top of it they will offer a Rhino-based web framework (Rhino on Rails).
The rationale for this is that they have already shown that they can make Java work in a virtual environment. Android's Dalvik VM allows VMs running in separate processes to share data, which is something that is crucial for reducing the costs of running Java (securely) in a hosted environment. They also already run some of their applications on Java.
It's also very likely that they will support Java (at least a subset of it) because GWT, although it can be used on the server with other languages, is Java-based.

Tuesday, October 21, 2008

Funny comment in the Android sources

Android is now open source.

Just by accident I ran into the following funny comment:

/*
 * Count the number of '1' bits in a word.
 *
 * Having completed this, I'm ready for an interview at Google.
 *
 * TODO? there's a parallel version w/o loops. Performance not currently
 * important.
 */
static int countOnes(u4 val)
{
    int count = 0;

    while (val != 0) {
        val &= val-1;
        count++;
    }

    return count;
}


:)

Support for IBM JVM's now available for the Eclipse Memory Analyzer

Just a short note.
It was just brought to my attention that the Eclipse Memory Analyzer now also supports system dumps from IBM Virtual Machines for Java version 6, version 5.0 and version 1.4.2.

The IBM DTFJ adapter is available as an Eclipse plugin and can be downloaded from developerworks.

This is great news, because a lot of people had asked for IBM support, and because it means that MAT is now modular enough to support other Heap dump formats as well.

Maybe Google would like to support Dalvik (Android)? ;)

Monday, September 29, 2008

It's official, Neal Gafter works for Microsoft

There were already rumors that Neal Gafter would leave Google and join Microsoft.

It's now official

"I work for Microsoft on the dotNet platform languages. To balance my life, my hobby is designing and developing the future of the Java programming language."


and on twitter

"working on Microsoft Visual Studio Managed Languages with Anders Hejlsberg, on C# and other languages."



The Java Posse seems to have an interview with Neal (I haven't checked yet).

Neal had a great influence on the Java language and led one of the closures proposals.
Unfortunately Closures will probably not make it into Java 7.

This is really a bad day for Java. Microsoft's .NET is already moving quickly to support new interesting languages
whereas Java is falling behind. For example, a real LINQ for Java will not be possible without closures.
With Neal working on .NET/C# Microsoft will probably advance even further.

Friday, September 19, 2008

Re: AMF vs. JSON vs. XML

Richard Monson-Haefel blogged about the advantages and disadvantages of AMF versus JSON versus XML on InsideRIA.

Although I agree with some of his points, IMHO he also misses some important ones.

For example, there is the "batch" pattern which says that it is sometimes cheaper to batch many I/O operation than to do them individually. But again using that pattern requires planning and a specific context to make it effective.

Batching is an important pattern, but the point is that the current implementations using AMF (BlazeDS) are simply remote procedure calls over HTTP POST.

As I said before, the consequence is that BlazeDS does not play very well with the HTTP caching infrastructure.
In short, the POSTs are never cached.

I just cannot see how automatic batching of RPC calls can be easily implemented. RESTful APIs based on JSON or XML can more easily support automatic batching, and could even potentially (I'm not sure whether the limitations of the Flex HTTPService would allow this) make use of HTTP pipelining. RESTful APIs could also make better use of HTTP's caching features, as well as lead to more scalable implementations on the server side, because too much state there would be avoided.

The truth is that performance differences between JSON and AMF are not that wide.

I agree, and would add "in practice" and "for most use cases".

I think that some benchmarks are a bit misleading. Who would really want to load that much data in one request to the client, if the client typically cannot show all the data? What you want to do is to use some kind of paging or on demand loading support, because most users in typical applications are not going to look at all the data anyway.

I bet that the differences between the formats become minor and probably negligible as soon as you use paging or a similar mechanism.
The reference to another performance analysis that Richard provides seems to go in this direction as well.

So at the end I would rather agree with one of the commenters on Richard's blog

Yeah, having a JSON implementation natively in Flash would be great.

There is one other requirement that you need to take into account, and that is security.
Using web service remoting with authentication doesn't seem to work very well with Flex at the moment.
Therefore, if you are going to use HTTP's authentication mechanism, you currently have to use BlazeDS (or the like), or use HTTPService and parse the XML manually.

Wednesday, September 10, 2008

Google Chrome tuning

Google's Chrome browser by default uses one process for each tab to isolate crashes.
This has the disadvantage that it could require more memory, because for each tab some data and code will not be shared.

Fortunately, this behaviour can be configured.
Check Google's process models documentation.


You can even set Chrome to use only one process.

Tuesday, August 26, 2008

Latency is Everywhere and it Costs You Sales

Link of the day:


Latency is Everywhere and it Costs You Sales - How to Crush it | High Scalability

"Analysis of sources of latency in downloading web pages by Marc
Abrams. The study examines several sources of latency: DNS, TCP, Web
server, network links, and routers. Conclusion: In most cases, roughly
half of the time is spent from the moment the browser sends the
acknowledgment completing the TCP connection establishment until the
first packet containing page content arrives. The bulk of this time is
the round trip delay, and only a tiny portion is delay at the server.
This implies that the bottleneck in accessing pages over the Internet
is due to the Internet itself, and not the server speed."

A must have tool for all web developers who care about performance

When optimizing web applications it is often a good idea to focus on the frontend.

YSlow is a great Firefox add-on that analyzes web pages and checks some rules for high-performance web sites. Unfortunately, for IE there was so far no good solution.

Now there is!

AOL Pagetest is a great new free application that does similar things for IE.

Check
http://www.artzstudio.com/2008/07/optimizing-web-performance-with-aol-pagetest/
for a very nice video introduction to AOL Pagetest.

AOL Pagetest even computes the page loading time, which YSlow doesn't (it's tricky). It seems to use an approach similar to the one we use.

Great tool!

Wednesday, August 20, 2008

BlazeDS does not make use of the HTTP caching infrastructure

There were several comments to my post about my doubts about BlazeDS scaling well.

James Ward
said...


Hi Markus,

You have to also consider that RIAs in Flex are
architected very differently than typical web applications. In a
typical Flex application most requests to BlazeDS's servlet handler
will be to either get data (which is then usually held in memory on the
client until the user closes the page) or to update data. Most of the
time when these operations are performed the response will be different
so caching doesn't provide much. If a developer decides that something
should be cached they can easily store that data in a Local Shared
Object (a big, binary cookie in Flash) - or if the app is using AIR
then it can save the data in the local SQLite DB. There are also
emerging open source frameworks that assist in handling this caching.

-James (Adobe)

Thanks, James, for responding. Yes, I understand that Flex comes from a somewhat different angle. A lot of the early Flex applications might have been used to show "real time" data, such as stock tickers.
And yes, with Flex you usually only get data from the server, but the same is true for modern web applications. GWT uses a similar approach for remoting. PURE is a JavaScript framework that also only sends data. Both GWT and PURE work well within the web infrastructure (web caching proxies, for example), because the server can set how long the data should be valid. This metadata about the lifetime is sent together with the response.
I don't see how I can do the same with BlazeDS.
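For comparison, this is roughly what such a lifetime looks like with a plain servlet serving JSON or XML (a minimal sketch; the servlet name and the payload are made up for illustration):

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RatesServlet extends HttpServlet
{
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws IOException
    {
        // Tell browsers and caching proxies that this data stays valid for one day.
        response.setHeader("Cache-Control", "public, max-age=86400");
        response.setContentType("application/json");
        response.getWriter().write("{\"rate\": 1.23}");
    }
}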

Yes, of course I can build a cache on the Flex side. But that is something I would have to do in addition, and the lifetime of the objects in this cache could probably not be controlled by the server.

I therefore still believe that there's room for improvement for BlazeDS.

In a similiar response Stephen Beattie
said...

Hmmm. With a 'stateful' Flash/Flex front-end, there will be less
requests made to the server as only the data is generally requested
once the interface SWF has been downloaded. For the sort of application
where the data is changing frequently, not caching makes sense to me.
Besides you can always implement a level of caching on the server-side
to prepare the data. I fail to see how this affects scalability. If
your data isn't going to change then there's no real need for BlazeDS.
Just load it as gzipped XML or something you can server up with a cache
HTTP header.


Basically he says that BlazeDS is for "real time" data only. IMHO that is a major limitation, because I can't see a technical reason why BlazeDS could not support the HTTP caching infrastructure.

Anonymous
said...

[snip] I wouldn't want my transport to arbitrarily
decide what data to cache and what not to. I would want to build and
control that caching code myself. Not doing that can break
transactional isolation in an application.

[snip]

I want to be able to control from my server how long the data that was just sent is valid, because the server, for example, might know that the data will only be updated once a day.

Tuesday, August 19, 2008

I doubt that BlazeDS scales

I'm currently researching whether BlazeDS for accessing Java Objects from a Flex client makes sense.
BlazeDS is supposed to be much more efficient than other alternatives such as JSON or SOAP.

Unfortunately, it seems that Adobe's open source BlazeDS does not allow one to use HTTP's caching infrastructure. This renders the more efficient binary AMF protocol pretty much useless for scaling to many users. Fewer requests are almost always better than smaller requests.
LiveCycle Data Services, Adobe's commercial variant of BlazeDS, has sophisticated offline support. This might help to get things cached, but first it's commercial, and second it might be too complicated if you don't intend to develop full offline capabilities.

[Update:] More information in this blog post.


Wednesday, June 25, 2008

Robust Java benchmarking

Doing microbenchmarks on the JVM is very hard, because the HotSpot compiler (assuming you are on a SUN/SAP JVM) does very advanced optimizations at runtime.

I usually recommend being very careful about doing any microbenchmarks without having a real JVM hacker nearby ;)

Brent Boyer has written an article series at developerWorks that explains how to do it correctly, and he also presents a framework that makes it easier to write robust microbenchmarks:

Robust Java benchmarking, Part 1

Robust Java benchmarking, Part 2


Although I have not yet tried the framework, it looks promising.

Tuesday, June 24, 2008

Eclipse Memory Analyzer considered to be a must have tool

Philip Jacob thinks that the Eclipse Memory Analyzer is a must-have tool:

I also had a little incident with a 1.5Gb heap dump yesterday. I wanted to analyze it after one of our app servers coughed it up (right before it crashed hard) to find out what the problem was. I tried jhat, which seemed to require more memory than could possibly fit into my laptop (with 4Gb). I tried Yourkit, which also stalled trying to read this large dump file (actually, Yourkit’s profiler looked pretty cool, so I shall probably revisit that). I even tried firing up jhat on an EC2 box with 15Gb of memory… but that also didn’t work. Finally, I ran across the Eclipse Memory Analyzer. Based on my previous two experiences, I didn’t expect this one to work…. but, holy cow, it did. Within just a few minutes, I had my culprit nailed (big memory leak in XStream 1.2.2) and I was much further along than I was previously.
Thanks Philip for the positive feedback!
I didn't know that EC2 supports big multi-core boxes. That is very interesting, because the Eclipse Memory Analyzer does take advantage of multiple cores and of the available memory on 64-bit operating systems. It will "fly" on one of these boxes.

Wednesday, June 04, 2008

An interesting leak when using WeakHashmaps

Bharath Ganesh
describes in the blog post
Thoughts around Java, Web Services, IT: The interesting leak

a very interesting leak, where WeakHashMap doesn't seem to release entries that no longer appear to be referenced.
Actually, when interned String literals are used, the entries stay in the WeakHashMap even after all hard references seem to be removed.

I say "seemed to be removed" because there actually is still a reference from the Class, that is still loaded, to the interned String literal.

Unfortunately, using a heap dump to find this out does not work, because this implicit reference is (currently) not written to the heap dump file.

Changing the example to:


import java.io.IOException;
import java.util.Map;
import java.util.WeakHashMap;
import junit.framework.*;

public class TestWeakHashMap extends TestCase
{
    private String str1 = new String("newString1");
    private String str2 = "literalString2";
    private String str3 = "literalString3";
    private String str4 = new String("newString4");
    private String str5 = (str4 + str1).intern();

    private Map map = new WeakHashMap();

    public void testGC() throws IOException
    {
        map.put(str1, new Object());
        map.put(str2, new Object());
        map.put(str3, new Object());
        map.put(str4, new Object());
        map.put(str5, new Object());

        /**
         * Discard the strong reference to all the keys
         */
        str1 = null;
        str2 = null;
        str3 = null;
        str4 = null;
        str5 = null;

        while (true) {
            System.gc();
            /**
             * Verify Full GC with the -verbose:gc option
             * We expect the map to be emptied as the strong references to
             * all the keys are discarded.
             */
            System.out.println("map.size(); = " + map.size() + " " + map);
        }
    }
}


You will find that str5 will be reclaimed by the Garbage Collector.

Conclusion
Using String literals defined in your classes as keys in a WeakHashMap might not do what you want, but using interned Strings in general as keys for a WeakHashMap is safe.


Tuesday, June 03, 2008

A classical Finalizer problem in Netbeans 6.1

Recently I tried the Netbeans UML module to sketch some simple use case diagrams.
It worked pretty well, but it didn't always feel very responsive. I quickly checked the memory consumption and found that it was much higher than during my last test.

I therefore took another heap dump. Here comes the overview:
Finalizers?

So this time Netbeans needed 74,2 Mbyte, much more than last time.
Surprisingly, 15,5 Mbyte alone are consumed by instances of the class java.lang.ref.Finalizer.
Such a high memory usage caused by Finalizer instances is not normal.
Usually you would see Finalizer instances using a few hundred Kbyte.
Next I simply checked the retained set (the objects that would be reclaimed if I could remove the Finalizer instances from memory) of these Finalizer instances:

So int[] arrays are consuming most of the memory. I again used the "immediate dominator" query on those int[] arrays to see who is keeping them in memory:

So let's take a look at those sun.awt.image.IntegerInterleavedRaster instances and see who is referencing them:

Can we blame Tom Sawyer?

We see again that java.awt.image.BufferedImage is involved, as well as Java2D.
The surfaceData of sun.java2d.SunGraphics2D is referenced by com.tomsawyer.editor.graphics.TSEDefaultGraphics (what a nice package name).
Let's look at the code of sun.java2d.SunGraphics2D:
public void dispose()
{
    surfaceData = NullSurfaceData.theInstance;
    invalidatePipe();
}

public void finalize()
{
}

"dispose" should clean surfaceData, but at least to to me it seems that nobody has called it.
So I decompiled TSEDefaultGraphics and found dispose to be empty:

public void dispose()
{
}


So my guess is (without digging deeply into the code) that TSEDefaultGraphics needs to be fixed to call dispose on its surfaceData instance variable.

At the End


What this shows is that you not only need to be very careful when implementing finalize(), but you also need to check whether you use objects that implement finalize().
Objects that really need to implement finalize() should be small and should not reference large objects.
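A minimal sketch of what I mean (my own illustration, not code from Netbeans or Java2D): only a tiny holder of the native resource implements finalize(), so a pending finalizer never keeps the large object alive.

class SurfaceHandle
{
    private long nativePointer;           // small: just the handle to the native resource

    SurfaceHandle(long pointer)
    {
        this.nativePointer = pointer;
    }

    void dispose()
    {
        // release the native resource here
        nativePointer = 0;
    }

    protected void finalize()
    {
        dispose();                        // safety net only, dispose() should be called explicitly
    }
}

class BigImage
{
    private final int[] pixels = new int[1024 * 1024];  // the large payload, no finalize() here
    private final SurfaceHandle handle = new SurfaceHandle(0L);

    void dispose()
    {
        handle.dispose();
    }
}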


Wednesday, May 28, 2008

Memory consumption of Netbeans versus Eclipse an analysis

I recently analyzed the memory consumption of Eclipse and found that it should be easy to optimize it.

This time I will take a look at Netbeans (6.1, the Java SE pack) using the exact same setup.

First, the overall memory consumption of Netbeans is only a little bit higher: 24 Mbyte versus 22,7 Mbyte for Eclipse:



Keep in mind that the Eclipse memory usage includes the spell checker, which needs 5,6 Mbyte and can easily be turned off. Without the spell checker, Eclipse would need only 17,1 Mbyte.

The overview of the Memory Analyzer shows that the biggest memory consumer, with around 5.4 Mbyte, is sun.awt.image.BufImgSurfaceData:


Swing overhead(?)

This seems to be caused by the fact that Swing uses Java2D, which does its own image buffering independent of the OS. I could easily figure this out by using the Memory Analyzer's "path to GC roots" query:
So maybe we pay here for the platform independence of Swing?
I quickly checked using Google whether there are ways around this image buffering, but I couldn't find any clear guidance on how to avoid it. If there are Swing experts reading this, please let me know your advice.

Duplicated Strings again

Again I checked for duplicated Strings using the "group by value" feature of the Memory Analyzer.
Again some Strings are there many times:


This time I selected all Strings that are there more than once, then called the "immediate dominators" query, and afterwards used the "group by package" feature in the resulting view:

This view shows you the sum of the number of duplicates in each package and therefore gives you a good overview of which packages waste the most memory because of objects keeping duplicates of Strings alive.
You can see, for example, that
org.netbeans.modules.java.source.parsing.FileObjects$CachedZipFileObject keeps alive a lot of duplicated Strings.
Looking at some of these objects, you can see that one problem is the instance variable ext, which very often contains duplicates of the String "class".



Summary

So in the end we found, for this probably simplistic scenario, that both Netbeans and Eclipse don't consume that much memory. 24 Mbyte is really not that much these days.
Eclipse seems to have a certain advantage, because turning off the spell checker is easy, and then it needs almost 30% less memory than Netbeans.

To be clear, it is still much too early to declare a real winner here. My scenario is just too simple.

The only conclusion that we can draw for sure for now is that this kind of analysis is pretty easy with the Eclipse Memory Analyzer :)

Thursday, May 22, 2008

Eclipse memory leaks?

"ecamacho" picked up my post about the memory consumption of Eclipse on the spanish website http://javahispano.org/
Google translation works pretty well for the post and the comments are quite interesting.

Leaks in Eclipse?

One question was whether I was talking about leaks in Eclipse.
Actually, I myself did not, but my colleague and project lead of the Eclipse Memory Analyzer project, Andreas Buchen, blogged about analyzing an actual leak in Eclipse. I can highly recommend the article, because it also shows how powerful the Memory Analyzer is.

Does turning off the spell checker help?

From what I have seen, yes it should help. Check http://bugs.eclipse.org/bugs/show_bug.cgi?id=233156 for the progress on the spellchecker issue.

Should we wait until the end to analyze the memory consumption of our applications?

IMHO we should not, because, as we all know, fixes at the end of a software project are much more expensive than at the beginning.

I often hear:

"premature optimization is the root of all evil."

This has been misquoted just too often.

Hoare said (according to Wikipedia):

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

"small efficiencies"
That quote clearly doesn't say that you should not do any investigation of how many resources your application consumes in the early phases of your project.

Actually, we have some experience with the automatic evaluation of heap dumps, and we found that it pays off very well. This could be a topic for another blog post.

Monday, May 19, 2008

Analyzing the Memory Consumption of Eclipse

During my talk on May 7 at the Java User Group Karlsruhe about the Eclipse Memory Analyzer, I used the latest build of Eclipse 3.4 to show live that there's room for improvement regarding the memory consumption of Eclipse.
Today I will show you how easy this kind of analysis is with the Eclipse Memory Analyzer.

I first started Eclipse 3.4 M7 (running on JDK 1.6_10) with one project, "winstone", which includes the source of the Winstone project (version 0.9.10):



Then I did a heap dump using the JDK 1.6 jmap command:
Since this was a relatively small dump (around 45 Mbyte), the Memory Analyzer would parse and load it in a couple of seconds:
On the "Overview" page I already found one suspect. The spell checker (marked in red on the screenshot) takes 5.6 Mbyte (24,6%) out of 22,7 Mbyte overall memory consumption!
That's certainly too much for a "non-core" feature.
[update:] In the meantime I submitted a bug (https://bugs.eclipse.org/bugs/show_bug.cgi?id=233156)
Looking at the spell checker in the dominator tree:

reveals that the implementation of the dictionary used by the spell checker is rather simplistic.
No trie, no Bloom filter, just a simple HashMap mapping from a String to a List of spell-checked Strings:
There's certainly room for improvement here by using one of the advanced data structures mentioned above.
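Just to illustrate the idea (my own sketch, not the spell checker's actual code): a simple Bloom filter answers "is this word in the dictionary?" with a few bits per word instead of keeping every String in a HashMap, at the price of occasional false positives.

import java.util.BitSet;

public class BloomFilter
{
    private final BitSet bits;
    private final int sizeInBits;
    private final int hashCount;

    public BloomFilter(int sizeInBits, int hashCount)
    {
        this.bits = new BitSet(sizeInBits);
        this.sizeInBits = sizeInBits;
        this.hashCount = hashCount;
    }

    public void add(String word)
    {
        for (int i = 0; i < hashCount; i++)
            bits.set(index(word, i));
    }

    // May return a false positive, but never a false negative.
    public boolean mightContain(String word)
    {
        for (int i = 0; i < hashCount; i++)
            if (!bits.get(index(word, i)))
                return false;
        return true;
    }

    private int index(String word, int seed)
    {
        int h = word.hashCode() * (31 * seed + 1);
        return Math.abs(h % sizeInBits);
    }
}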

My favorite memory consumption analysis trick

Now comes my favorite trick, which almost always works to find some memory to optimize in a complex Java application.
I went to the histogram and checked how much memory is retained by String instances:
12 Mbyte (out of 22,7), quite a lot! Note that 4 Mbyte are from the spell checker above (not shown here how I computed that), but that still leaves 8 Mbyte for Strings.
The next step was to call the "magic" "group by value" query on all those Strings:
This showed us how many duplicates of those Strings there are:
Duplicates of Strings everywhere

What does this table tell us? It tells us, for example, that there are 1988 duplicates of the same String "id" and 504 duplicates of the String "true". Yes, I'm serious. Before you laugh and comment how silly this is, I recommend you take a look at your own Java application :] In my experience (over the past few years) this is one of the most common memory consumption problems in complex Java applications.
"id" or "name", for example, are clearly constant unique identifiers (UIDs). There's simply no reason why you would want that many duplicates of UIDs. I don't even have to check the source code to claim that.

Let's check which instances of which class are responsible for these Strings.
I called the immediate dominator function on the top 30 dominated Strings:

org.eclipse.core.internal.registry.ConfigurationElement seems to cause most of the duplicates, 13.242!

If you look at the instances of ConfigurationElement, it's pretty clear that there's a systematic problem in this class. So this should be easy to fix, for example by using String.intern() or a Map to avoid the duplicates.
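A minimal sketch of the Map variant (my own illustration, not the actual Eclipse code): a small pool returns a canonical instance, so equal attribute values like "id" or "true" share a single String.

import java.util.HashMap;
import java.util.Map;

public class StringPool
{
    private final Map<String, String> pool = new HashMap<String, String>();

    // Returns a canonical instance for the given value.
    public String canonicalize(String value)
    {
        if (value == null)
            return null;

        String cached = pool.get(value);
        if (cached == null)
        {
            pool.put(value, value);
            cached = value;
        }
        return cached;
    }
}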

Bashing Eclipse?

Now you may think that this guy is bashing Eclipse, but that's really not the case.

If you beg enough, I might also take a closer look at Netbeans :]

Wednesday, May 14, 2008

GMail got faster, one trick that they didn't tell us

GMail got even faster, and here are some details about how they did it:
Official Gmail Blog: A need for speed: the path to a faster loading sequence

One trick that they obviously did, but didn't tell us about, is that they send your user name asynchronously in the background as soon as you enter the password field.
This is a GMail-specific optimization. It's not available on the generic Google account page (yet?).

My guess is that they prefetch some stuff under the assumption that most logon attempts will succeed. Even if the password is wrong, the minimum work that has to be done is to check whether the account exists.

The funny thing is that I saw this just a few days ago, when I checked how much data Google Mail sends, to get a better feeling for what would be a good goal for a high-performance web application.

Regards,
Markus