Thursday, November 06, 2008

Is Eclipse bloated? Some numbers and a first analysis

On the e4 (Eclipse 4) mailing list there has recently been some discussion about whether Eclipse is bloated, and what could be done to minimize bloat for e4. Some summary information can be found here.

I promised to get some numbers about how much memory is used by the JIT for the classes.

I finally managed it :)

So I took Eclipse 3.5M2 (without the web tools), configured it to use the SAP JVM (which is based on Sun's HotSpot, so the numbers should be comparable), and used the Eclipse Memory Analyzer (plus the SAP extensions, which are free but not open source) to generate a CSV file. I then imported that file into IBM's manyeyes (great stuff!) to visualize the data as a treemap.

The numbers do not include the overhead of the compiled code cache which was around 10 Mbyte.

Here is the memory usage (in bytes) per bundle for Eclipse 3.5M2 after clicking around to activate as many plugins as possible (a better-defined test would be a good idea):



A detailed analysis would be very time consuming, because for a lot of classes you would need to check whether anything could be done to make them smaller. So for now, here are just some interesting observations I made while quickly looking over the results:
  • 12649 classes were loaded and consumed around 64 Mbyte in perm space
  • during the test I ran out of perm space, which was configured to 64 Mbyte by default, and the UI would freeze :(
  • there does not seem to be any large amount of generated code, so optimizing classes might be difficult
  • the help system uses JSPs (which generate classes) that are relatively big, although only a few of them were in memory
  • 247 relatively big classes in org.eclipse.jdt.internal.compiler were loaded twice, once by the jasper plugin and once by the jdt core plugin
I also made a detailed (per class) visualization of the relatively large jdt bundle:
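The totals in the list above (number of loaded classes, memory used outside the Java heap) can be roughly cross-checked from inside any JVM with the standard java.lang.management API. The following is only a minimal sketch, not the method used for this post: the class name ClassMemoryReport is made up for illustration, the pool names depend on the JVM (for example a perm gen pool on older HotSpot versions), and it reports totals only, not the per-bundle or per-class figures that came from the Memory Analyzer.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
public class ClassMemoryReport {

    /** Number of classes currently loaded in the JVM this code runs in. */
    static int loadedClassCount() {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        return classLoading.getLoadedClassCount();
    }

    /** Used bytes per non-heap memory pool, keyed by the pool name the JVM reports. */
    static Map<String, Long> nonHeapPoolUsage() {
        Map<String, Long> usage = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage current = pool.getUsage(); // may be null if the pool is not valid
            if (pool.getType() == MemoryType.NON_HEAP && current != null) {
                usage.put(pool.getName(), current.getUsed());
            }
        }
        return usage;
    }

    public static void main(String[] args) {
        System.out.println("Loaded classes: " + loadedClassCount());
        nonHeapPoolUsage().forEach((name, used) ->
                System.out.println(name + ": " + used + " bytes"));
    }
}
```

To get numbers for Eclipse itself, code like this would have to run inside the Eclipse JVM (for example from a plugin), or the same MXBeans could be read remotely over JMX.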




So is Eclipse really bloated?
To me, it does not seem very bloated overall. Remember that a developer PC these days probably has 2 Gbyte of RAM, so 64 Mbyte hardly matters.

The help system could probably be optimized, because it uses a complete servlet implementation (Jetty) including JSPs as well as a complete text search engine (Lucene).

The heap consumption could be a bigger issue. I've seen things that you should almost always avoid, like holding parsed DOM trees in memory. I may explain these issues in a later post, and I will probably show some examples in my next talk.

18 comments:

David Carver said...

Bloat is a very broad term. Number of classes and their size isn't really the issue. It's the amount of memory when running the application that adds to the bloat in my opinion. You can have a very efficient system with lots of small classes that are garbage collected from time to time.

So I think this is only a start to the bloat problem...but there are many classes spread across the various eclipse projects that are duplicated because they are marked internal, but people need their functionality.

Just looking at the platform isn't going to be a good review. Take a look at something like Eclipse for Java EE or another packaged project that includes multiple projects, to get a better idea.

Markus Kohler said...

Hi David,
Thanks for your comment!
Yes, I agree that code size might not be the real problem, although some people on the e4 mailing list may think otherwise.

Regarding the Eclipse WST, I agree it would be interesting to see how it compares, and I may do that in the next few days.

Still we can also see that some components are relatively bloated.
Do we really need a complete servlet implementation including JSPs in a Java IDE (without Web tools)?

eckes said...

Can we actually see the Treemap and other artifacts you created somewhere?

Markus Kohler said...

Hi
It seems that there are sometimes problems with manyeyes.
Your browser needs to support Java applets.
I will check whether showing an image would work better.

Regards,
Markus

eckes said...

Ah OK, applets. When I turn off NoScript I can see them. I hadn't realized that there were graphics in the article :)

Anonymous said...

I didn't know many-eyes. Thanks for that.

For the less exotic things I use http://www.chartle.net/

All this is written in Java. I wonder if that means Applets are finally coming back?

Markus Kohler said...

Thanks for the hint about chartle.
IMHO treemaps are great for hierarchical data and should be used more.

Unfortunately good (interactive) implementations of treemaps are rare.

I didn't find a flash based alternative to the manyeyes treemap implementation.

Rick said...

You are welcome. I played now with many-eyes for a while and really like it.

I think I will still use Chartle for 95% of my work since charts are just easier to create and most importantly embed in my website.

Flash seems to trump Java on the web. Still, with Chartle I can also do things like save the chart as an image on my desktop, and I do not need to upload data. Normally with Flash one has to jump through a lot of hoops.

William Louth said...

Why not show the object cost allocation (size and sizes) per activity indirect/direct & inherent/non-inherent/ & transient / non-transient......

I can do this.

It is also worth noting that one should really have a set of activity profiles (user usage) because all of this needs to be weighted otherwise you are wasting time or at best not being very productive.

William

Markus Kohler said...

Hi William,
Thanks for your comments.

I would really like to see some numbers of object cost allocation :)

This post was about the memory that classes consume (perm space). As far as I'm aware, there is currently no way to do this with the Sun JVM (maybe with OpenJDK).
I guess this is therefore the first time someone has looked at this for Eclipse.



Regarding your comment about profiles, I completely agree.
This was a first attempt to see how much memory the code would need.

The user was more or less a normal Java developer not doing any J2EE stuff.
I would really like to have some automated test for this.
If anyone knows how this can be done easily with Eclipse, please let me know.

Regards,
Markus

Anonymous said...

Did you also test this under Linux? The same version of Eclipse with the same version of Java has almost twice the footprint under Linux than when I run it under Vista.

Why do you think that is?

Markus Kohler said...

Regarding Linux,
No, I didn't try, but I don't think it should make a significant difference. Theoretically the native UI toolkit on Linux (GTK) could use more memory (I didn't measure that), but I somehow doubt that. I remember that during my test on my Windows machine, memory usage in the JVM's native area was reported to be very low.

Are you sure that the hardware is exactly the same, e.g. both are Intel 32 bit?
If Linux is 64 bit I would expect it to need up to 50% more memory.

You also shouldn't trust what the task manager tells you about memory usage.

Anonymous said...

Well it is the same hardware since it is dual boot.

Both are 32-bit, although the machine could theoretically do 64.

I checked the memory footprint using the Eclipse Heap View (activate under Preferences).

Can't say anything beyond that.

Markus Kohler said...

Ok,
I guess you should know whether your Linux is 64 bit or not.
The JVM preallocates a certain amount of memory (you can configure the maximum using the -Xmx option).
The Eclipse Heap View shows the amount of the preallocated memory as well as how much is currently used.

What you would need to do is to compare those "currently used" values after you have triggered several Garbage Collector runs.
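As a rough illustration of what "currently used after several GC runs" means, here is a minimal sketch using the standard Runtime API. The class name UsedHeapAfterGc and the pause of 100 ms between runs are assumptions made for this example, System.gc() is only a request that the JVM may ignore, and the code measures the JVM it runs in, so for Eclipse it would have to run inside the Eclipse process. The Heap View with its garbage collection button gives the same kind of number without any code.

```java
// Hypothetical helper class, named here only for illustration.
public class UsedHeapAfterGc {

    /** Heap bytes currently in use: allocated heap minus the free part of it. */
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Requests several garbage collections, then reports the used heap in bytes. */
    static long usedHeapAfterGc(int gcRuns) {
        for (int i = 0; i < gcRuns; i++) {
            System.gc(); // only a hint; the JVM decides whether to collect
            try {
                Thread.sleep(100); // assumed pause to let the collector finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory(); // roughly the -Xmx limit
        System.out.println("Used after GC: " + usedHeapAfterGc(3) + " bytes");
        System.out.println("Maximum heap:  " + maxBytes + " bytes");
    }
}
```

Comparing the "used after GC" value from both installations, with the same workspace and the same -Xmx setting, avoids comparing the preallocated heap size by mistake.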

Of course you should have exactly the same setup for both Eclipse installations.

If you really want to know what is consuming the memory, get MAT to analyze a heap dump.
See my other blog posts in the "memory" category on this page.

Anonymous said...

Linux and Vista are both 32bit.

Both Eclipse installs are the same Ganymede, differing only in the SWT implementation.

-Xmx256M is identical.

Same Workspace, same Project open.

After many GCs (actually it didn't change after 3 anymore)

Linux: ~190MB
Vista: ~110MB

Not really twice as much, but significant. Also, it feels more sluggish under Linux, but that is probably SWT-GTK.

More work went into the Windows SWT implementation than into all other implementations combined. Ask Steve Northover ;)

Markus Kohler said...

Strange.
You could take heap dumps on both Linux and Windows, run them through MAT (http://www.eclipse.org/mat/) and run the "leak suspects" report.
Under Top Consumers you should see how much memory is consumed by classes/packages etc.

You can then compare (manually) where the difference comes from.
Should be easy :)
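For a Java process you control, a heap dump in the .hprof format that MAT reads can also be triggered from code. The sketch below assumes a HotSpot-based JVM, because it uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; the class name HeapDumper and the file name are made up for illustration. It dumps the JVM it runs in, so for Eclipse it would have to run inside the Eclipse process.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named here only for illustration.
public class HeapDumper {

    /**
     * Writes a heap dump of the current JVM to the given file.
     * The file must not exist yet and should end in ".hprof".
     */
    static void dumpHeap(Path target, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            diagnostics.dumpHeap(target.toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: a fresh temporary directory, so the file cannot already exist.
        Path dump = Files.createTempDirectory("heapdump").resolve("example.hprof");
        dumpHeap(dump, true); // true = dump only objects that are still reachable
        System.out.println("Wrote " + Files.size(dump) + " bytes to " + dump);
    }
}
```

The resulting file can then be opened in MAT on either operating system and the two reports compared side by side.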

Anonymous said...

I have enough RAM to run 20 instances of Eclipse, so why bother? With the -Xmx option I ensure that memory leaks don't go mental and affect my system.

I just thought it is interesting to note.
