Wednesday, January 07, 2009

Is java.lang.String.intern() really evil?

Domingos Neto just posted
Busting java.lang.String.intern() Myths.
In general I like the post, because I think this is an important topic: in my experience, Strings typically consume about 20% to 50% of the memory in Java applications. It is therefore important to avoid useless copies of the same String in order to reduce memory usage.
But first, some comments on that post:

Myth 1: Comparing strings with == is much faster than with equals()

busted! Yes, == is faster than String.equals(), but in general it isn't near a performance improvement as it is cracked up to be.

I agree: it doesn't make sense to intern Strings just to be able to use == instead of equals(). But the real reason is that String.equals() already performs the == check as its first step. If your Strings are identical, you automatically get the speed advantage, because equals() will usually be inlined.
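To make the difference between identity and equality concrete, here is a minimal sketch. The class name EqualsDemo and the helper sameObject are made up for illustration; only String.equals() and String.intern() are real API.

```java
// Hypothetical demo class; the names are illustrative only.
final class EqualsDemo {
    /** Returns true only when both references point at the very same String object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same content.
        String a = new String("Berlin");
        String b = new String("Berlin");

        System.out.println(sameObject(a, b));                   // false: two objects
        System.out.println(a.equals(b));                        // true: same content
        System.out.println(sameObject(a.intern(), b.intern())); // true: one canonical instance
    }
}
```

When both references already point to the same object, equals() returns after the identity check, so there is little left to gain from calling == yourself.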

Myth 2: String.intern() saves a lot of memory

Here I disagree. String.intern() can help you to save a lot of memory because it can be used to avoid holding duplicates of Strings in memory.
Imagine you read a lot of Strings from some file, and some (or a lot) of these Strings are actually identifiers, such as the name of a city or a type (class). If you don't use String.intern() (or a similar mechanism using a Set), you will hold copies of those Strings in memory. The number of these unnecessary copies will often grow with the number of Strings you read, so removing them can save a significant amount of memory.
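The effect can be sketched with a small simulation. Everything here is an assumption made for illustration: the class DuplicateStringsDemo, the three city names, and the use of new String(...) to stand in for a parser that creates a fresh String per record.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names and data are illustrative only.
final class DuplicateStringsDemo {
    /** Counts distinct String objects by identity, not by content. */
    static int distinctObjects(List<String> values) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        identities.addAll(values);
        return identities.size();
    }

    /** Simulates reading records; each record yields a fresh String copy, as a parser would. */
    static List<String> readCities(int records, boolean intern) {
        String[] cities = {"Berlin", "Paris", "Rome"};
        List<String> result = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            String parsed = new String(cities[i % cities.length]);
            result.add(intern ? parsed.intern() : parsed);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(readCities(9000, false))); // one object per record
        System.out.println(distinctObjects(readCities(9000, true)));  // one object per city
    }
}
```

Without interning, the number of String objects grows with the number of records; with interning, it is bounded by the number of distinct values.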

In my experience duplicated Strings are one of the most common memory usage problems in Java applications.
Check for example my blog post about the Memory usage of Netbeans versus Eclipse.

That those interned Strings end up in perm space is IMHO not a big issue. You need to set perm space to pretty high values these days anyway; check, for example, my blog post about the perm space requirements of Eclipse.
Still, in the SAP JVM we also introduced a new option to store those interned Strings in the old space.
Maybe someone wants to implement this option for the OpenJDK as well ;)

Issues with String.intern()
Now you might think that String.intern() is not problematic at all, but unfortunately there are a few issues.

  • Not all JVMs have fast implementations of String.intern(). For example, HP's JVM used to have problems until recently.

  • Additional contention is introduced, and you have no control over it because String.intern() is native.

Unfortunately I'm not aware of a good pure Java replacement for String.intern(), because what really would be needed is a memory efficient ConcurrentWeakHashSet. Such a Collection would need to use WeakReferences which have a relatively high memory overhead. Therefore my advice is still to use String.intern() if you need to avoid duplicated Strings.
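For comparison, this is roughly what the simplest pure Java alternative looks like. It is only a sketch under a loud assumption: it holds strong references, so pooled Strings are never reclaimed, which is exactly the gap a weak collection would have to close. The class SimpleStringPool and the method canonicalize are hypothetical names; ConcurrentHashMap.putIfAbsent is standard API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical pool; NOT weak, so entries live as long as the pool itself.
final class SimpleStringPool {
    // Key and value are the same object; the value only exists so that
    // the canonical instance can be handed back to the caller.
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    /** Returns the first instance seen with this content, registering s if there was none. */
    String canonicalize(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }
}
```

A pool like this is only reasonable when the set of distinct values is known to be small and long-lived; otherwise it turns into a memory leak of its own.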


gsporar said...

Excellent post! (I posted a tweet for it.) As you mentioned, performance can be very different amongst the various JVMs. To expand on that just a bit, not all JVMs even support the concept of permanent space. In other words, perm. gen. is a JVM-specific implementation detail. IIRC, older versions of JRockit did not even have perm. gen. (I think it put class metadata in the "regular" heap.)

Things get more complicated when you think about the JVM's garbage collector, which of course is also very implementation dependent. As the original post pointed out, Sun's JVM does garbage collect perm. gen., but keep in mind that the algorithm used to determine when to collect it, how much, etc. is *not* the same as the algorithm for the "regular" heap.

Markus Kohler said...

Hi Greg(?),

You are right, there are completely different JVM implementations, and I always tend to forget that :]

It's also true that storing the interned Strings in old space does not help much performance-wise.
It's "only" less likely that you will run into OOM errors, because old space is usually larger than perm space. OK, maybe you would have fewer full GCs if old space is larger than perm space.

Yes, in most JVMs today you also have a new space, where short-lived objects get reclaimed more efficiently. But IMHO this is not an advantage here, because I would only intern Strings that are long-lived.

I guess you are javaperformance on twitter?


Stu said...

So, guess intern() would be a nono in a j2me environment...

Or if it is set to null, does it get removed from the perm space ?

Markus Kohler said...

If you set it to null it will get removed after a full GC (typically, depends on the VM).

Stanimir Simeonoff said...

because what really would be needed is a memory efficient ConcurrentWeakHashSet.
You need a map-like interface, not just a set, since you have to be able to retrieve the interned version from the collection, although technically you don't need the values.

That's relatively easy to implement with a linear-probe table (Object[]/String[]) and either Unsafe.cas or AtomicReferenceArray, although the latter may have more memory fences than needed (and an extra cast).

As for the weak part, I think it's easier to just cap the maximum number of references (or the total string length) in the table and be done with it, rather than using weak refs. An LRU eviction policy would cost dearly, though (2xCAS + loops on get to keep the stamp).

Markus Kohler said...

Hi Stanimir,
Sure, you need a slightly different interface than Set offers, but you won't need keys *and* values, i.e. a ConcurrentWeakHashMap implementation would waste memory. I think we agree on that. The issue with weak references is that they have a relatively high memory overhead (I don't remember the number). My JVM gurus here told me that the table used for String.intern() has a much lower overhead. I don't have numbers on whether the WeakReference overhead would be significant in practice, but it could be, because you would only want to "intern" rather small Strings.
