You are on page 1of 6

How SubString method works in Java - Memory Leak Fixed in JDK 1.

Javarevisited

Blog about Java programming language, FIX Protocol, Tibco Rendezvous and related Java technology stack.

SATURDAY, OCTOBER 8, 2011

Best of Javarevisited

How SubString method works in Java - Memory Leak Fixed in JDK 1.7
Substring method from String class is one of most used method in Java, and it's also part of an interesting String interview
question e.g. How substring works in Java or sometime asked as how does substring creates memory leak in Java. In
order to answer these questions, you knowledge of implementation details is required. Recently one of my friend was
drilled on substring method in Java during a Java interview, he was using substring() method from long time, and of
course all of us has used this, but what surprises him was interviewer's obsession on Java substring, and deep dive till the
implementation level. Though String is a special class in Java, and subject of many interview questions e.g. Why char array

How Android works, Introduction for Java


Programmers
Difference between Java and Scala Programming
Top 5 Java Programming Books for Developers
Top 10 JDBC Best Practices for Java programmers
Tips and Best practices to avoid
NullPointerException in Java

is better than String for storing password. In this case it was, substring method, which took center stage. Most of us rather
just use substring(..), and than forgot. Not every Java programmer go into code, and see how exactly it's working. To

10 Object Oriented Design Principles Java


Programmer Should Know

get a feel of how his interview was let's start .

10 HotSpot JVM Options, Every Java Programmer


Should Know

Subscribe To This Blog Free

Update: This issue was actually a bug http://bugs.sun.com/view_bug.do?bug_id=6294060,


which is fixed in substring implementation of Java 7. Now, Instead of sharing original character array,
substring method creates a copy of it. In short, substring method only retains as much data, as it needed.
Thanks to Yves
Gillet for pointing this. As some of my readers pointed out, java.lang.String class has
also grown into some change in Java 1.7 version and offset and count variable which is used to track
positions are removed from String. This may save some bytes with each String instance, but not sharing
original array makes substring perform linearly, as compared to constant time previously. Anyway, it's worth
to remove any string related memory leak in Java. Having said that, if you have not yet upgraded your Server
to Java 7 and still working on Java 1.6 updates, this is one thing, which is worth knowing.


Posts


Comments
Follow Us
Follow @javinpaul

Question starts with normal chit chat, and Interviewer ask, "Have you used substring method in Java", and my friend
proudly said Yes, lot many times, which brings a smile on interviewer's face. He says well, thats good. Next question was
Can you explain what does substring do? My friend got an opportunity to show off his talent, and how much he knows
about Java API; He said substring method is used to get parts of String in Java. Its defined in java.lang.String class,
and it's an overloaded method. One version of substring method takes just beginIndex, and returns part of String started
from beginIndex till end, while other takes two parameters, beginIndex and endIndex, and returns part of String
starting from beginIndex to endIndex-1. He also stressed that every time you call substring() method in Java, it will
return a new String because String is immutable in Java.
Next question was, what will happen if beginIndex is
equal to length in substring(int beginIndex), no it won't
throw IndexOutOfBoundException instead it will return

Followers
Subscribe by email:

empty String. Same is the case when beginIndex and


Subscribe

endIndex is equal, in case of second method. It will only


throw StringIndexBoundException when
beginIndex is negative, larger than endIndex or larger
than length of String.
So far so good, my friend was happy and interview seems
going good, until Interviewee asked him, Do you know
how substring works in Java? Most of Java developers
fail here, because they don't know how exactly substring
method works, until they have not seen the code of
java.lang.String. If you look substring method inside
String class, you will figure out that it calls String (int offset, int count, char value []) constructor to
create new String object. What is interesting here is, value[], which is the same character array used to represent original
string. So what's wrong with this?

By Javin Paul
Loading

Blog Archive

2014
(
76
)

2013
(
136
)

2012
(
217
)

2011
(
145
)

December
(
28
)

November
(
14
)

October
(
14
)

How to Convert Double to String to Double in


Java ...

In case If you have still not figured it out, If the original string is very long, and has array of size 1GB, no matter how small a
substring is, it will hold 1GB array. This will also stop original string to be garbage collected, in case if doesn't have any live

How to Stop Thread in Java Code Example

reference. This is clear case of memory leak in Java, where memory is retained even if it's not required. That's how
substring method creates memory leak.

What is Iterator and ListIterator in Java


Program ...

How SubString in Java works

Difference between Truncate and Delete


command in ...

Obviously next question from interviewer would be, how do you deal with this problem? Though you can not go, and
change Java substring method, you can still make some work around, in case you are creating substring of significant
longer String. Simple solution is to trim the string, and keep size of character array according to length of substring. Luckily
java.lang.String has constructor to do this, as shown in below example.

How to convert decimal to binary, octal and


hex St...

How to write Helloworld in Google Dart with


Exampl...

10 Example Queries of SQL Select Command

// comma separated stock symbols from NYSE


String listOfStockSymbolsOnNYSE = getStockSymbolsForNYSE();
//calling String(string) constructor

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

Dennis Ritchie, Creator of C language and Unix


h...

How SubString method works in Java - Memory Leak Fixed in JDK 1.7
String apple = new String(
listOfStockSymbolsOnNYSE.substring(appleStartIndex, appleEndIndex)
);

What is Class in Java Programming with


General Exa...

Google Dart: New Programming language to


replace J...

If you look code on java.lang.String class, you will see that this constructor trim the array, if its bigger than String
itself.

Top 10 JSP Interview Questions Answers for


Java Pr...

How SubString method works in Java - Memory


Leak F...

public String(String original) {


...
if (originalValue.length > size) {
// The array representing the String is bigger than the new
// String itself. Perhaps this constructor is being called
// in order to trim the baggage, so make a copy of the array.
int off = original.offset;
v = Arrays.copyOfRange(originalValue, off, off+size);
} else {

How to Set Path for Java Unix Linux and


Windows

How to override hashcode in Java example Tutoria...

September
(
22
)

August
(
11
)

// The array representing the String is the same


// size as the String, so no point in making a copy.
v = originalValue;
}
...

July
(
7
)

June
(
9
)

May
(
6
)

Another way to solve this problem is to call intern() method on substring, which will than fetch an existing string from pool

April
(
10
)

or add it if necessary. Since the String in the pool is a real string it only take space as much it requires. Its also worth noting
that sub-strings are not internalized, when you call intern() method on original String. Most developer successfully

March
(
4
)

answers first three questions, which is related to usage of substring, but they get stuck on last two, How substring creates
memory leak or How substring works. It's not completely there fault, because what you know is that every time
substring() returns new String which is not exactly true, since its backed by same character array.

February
(
10
)

January
(
10
)

2010
(
33
)

This was the only interview question, which bothers my friend little otherwise, its standard service level company Java
interview in India. By the way, he got the call a day after ,even though he struggled little bit on How SubString method works
in Java, and that was the reason he shared this interview experience with me.

References

Java API documentation JDK 6

Related Java tutorials


Difference between StringBuffer and StringBuilder
How to Set ClassPath for Java in Windows and Unix
How to Solve Java.lang.OutOfMemoryError: PermGen Space
How to use Generic in Java Collection
Difference between Comparator and Comparable in Java
How to deal with ClassNotFoundException in Java
10 Examples of grep command in UNIX
10 Advanced Examples of Java Enum

Spring framework doc


Struts
JDK 7 API
MySQL
Linux

Posted by
Javin Paul
at
6:09 AM

Labels:
core java
,
core java interview question

Location:
United States

21 comments
:

Anonymous
said...

In the substring method of String class, the code is as follows:


return ((beginIndex == 0) && (endIndex == count)) ? this : new String(offset + beginIndex, endIndex beginIndex, value);

The code is returning a new String object which contains the same backing array. As the array is an object
reference, the two string objects will be pointing to the same backing array. No other instance of array is
created. So irrespective of number of substrings created from that string, memory will be used for only one
character array. The objects will be different but the backing array will be only one.

October 10, 2011 at 6:00 AM

Anonymous
said...

I understand how substring works in Java but what is point of keeping original character array inside substring
object ? Since string is immutable you can not change a String once created and SubString are String.

March 22, 2012 at 12:06 AM

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

Eclipse
jQuery

Copyright by Javin Paul 2012. Powered by Blogger.

How SubString method works in Java - Memory Leak Fixed in JDK 1.7

Anonymous
said...

Hi,
I read your article and i cant able to understand completely. Why the empty string is returned, when begin
index and start index is same, i checked Java api i didn't get clear idea. Why they are doing like this. So i
started searching in internet. I found this post.
http://www.coderanch.com/t/564948/java/java/does-substring-method-works
Can you give more explanation on this one?

May 29, 2012 at 7:09 AM

Apoorv
said...

Regarding the first comment related to "return ((beginIndex == 0) && (endIndex == count)) ? this : new
String(offset + beginIndex, endIndex - beginIndex, value)". This is correct observation, and in the constructor
used while creating a new string in above code of substring, only required array contents are copied and not
the entire array. So the assumption that creating a small substring of a 1GB sized String will occupy 1GB space
is wrong.

June 14, 2012 at 12:05 AM

Anonymous
said...

@Apoorv: No array contents are copied. Several Strings can point at the same character array with different
lengths and offsets since this array will never change (which is the whole point of the String class being
immutable). What the new String will contain is a pointer to the 1 GB array with a different length and offset.
What I assume the interviewer was aiming towards was what happens when the original String gets garbaged
collected and there is only your tiny substring looking at 6 characters of this (unbelievably) large array.
I would hope that there is some internal mechanism that handles this in the JVM (there must be)! If I was asked
this question I'd focus on the massive design flaw of a system that relies on substring on a character array of
over a billion elements.

September 20, 2012 at 3:39 PM

papajohn
said...

I was asked exactly this question in a Google telephone interview. Of course, I failed the last part which
assumed that I knew how substring is implemented internally.
When my reviewer mentioned this implementation, I told him that this is definitely implementation-dependant
and that other JVMs should behave differently.
In the end, I passed that interview.

October 6, 2012 at 3:33 PM

Dave Conrad
said...

The String(String) constructor has changed between Java 6, 7, and (the early access version of) 8. That bit
about stripping out the baggage is no longer in there.
What you need to do, if your small sub-string of a large string is in variable "name", is:
name = new String(name.toCharArray());

January 24, 2013 at 1:22 PM

Anonymous
said...

What I don't understand is how this 1G array is backed up by calling substring. The substring will call new
String(value, beginIndex, subLen) and inside this constructor the following code exists: this.value =
Arrays.copyOfRange(value, offset, offset+count). So our five characters are copied into a new array and the
original 1G is not backed up.

February 15, 2013 at 7:47 AM

Yves Gillet
said...

Yea beware the implementation changed since java 7 at least, the substring no longer backs up the original
char[] array, it creates a new copy of it. Please reflect this in you article.

March 6, 2013 at 7:20 AM

Yves Gillet
said...

what you are referring to the substring problem was in fact a bug:
http://bugs.sun.com/view_bug.do?bug_id=6294060

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

How SubString method works in Java - Memory Leak Fixed in JDK 1.7

I hope interviewers are at least aware that some of their questions are not valid anymore :)

March 6, 2013 at 7:23 AM

Javin @ Java Classloder Working


said...

@Yves Gillet, you are correct mate, It seems, substring method is now free from memory leak. Will update this
post. Thanks

March 7, 2013 at 3:22 AM

Anonymous
said...

Is it really true? "no it won't throw Index OutOfBoundException instead it will return empty String, same is the
case when beginIndex and endIndex is same in case of second method."
Source - http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#substring(int)
"
substring
public String substring(int beginIndex)
Returns a new string that is a substring of this string. The substring begins with the character at the specified
index and extends to the end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Parameters:
beginIndex - the beginning index, inclusive.
Returns:
the specified substring.
Throws:
IndexOutOfBoundsException - if beginIndex is negative or larger than the length of this String object."

March 9, 2013 at 9:23 AM

Javin @ BlockingQueue in Java


said...

@Anonymous, What is your questions? isn't you answering your own question, if I understood correctly, in
"emptiness".substring(9) returns "" (an empty string)

March 9, 2013 at 5:25 PM

Anonymous
said...

how to search for a given pattern in a string without using regex?


e.g.
search "aab" total no of occurances and the index at which this pattern in string "aabcbbacbaabaaaabbabaab"?

June 6, 2013 at 8:06 PM

Anonymous
said...

@javin, can you please move the update note at the top of the article? I fear that readers who will only scan
this article will be misinformed.
- Michael

June 27, 2013 at 8:46 AM

George
said...

Indeed substring in Java will not create any more memory leak from Java Java 1.7.0_06, issue has been fixed
by not allowing sharing of char array. They have even made some changes in java.lang.String class to remove
count and offset int variable, which can further save some memory, but this comes with a cost, by
compromising speed. Earlier substring was O(1) operation because of sharing same char array, but now it's
linear. I still think it's a good move to avoid memory leak, given excessive use of String and SubString method in
Java code.

August 15, 2013 at 8:40 PM

Anonymous
said...

Hey Javin,
I've been reading your articles and although they are very helpful. This article does have much content on the
topic of sub-string and whatever it does have is quite confusing. Which actually goes for a lot of your articles.

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

How SubString method works in Java - Memory Leak Fixed in JDK 1.7

On reading this i was under the impression that a brand new string gets created when substring method is called
and because of memory leak, apart from the original string holding 1 gb of data, the substring too holds the
same size of 1 gb in memory which is wrong. When substring is called no new string that gets created and there
is only a reference to the old string.
Please explain the concepts more extensively and please don't take this in any bad way.
Thanks

August 31, 2013 at 2:11 PM

Soumitra Pathak
said...

Hi,
while debugging we can see that even after substring it had all the content in the backing array.
so if we don't want to keep the same we can use
String new2 = new String(new1.toCharArray());
this will copy the actual array to the backing array.
I am not sure about what will happen to String 'new1' with data 1 GB.
Thanks,
S

October 20, 2013 at 7:52 AM

Anonymous
said...

SubString will always create a new object, but in 6 it was still referring to the 1gb array, even after performing
the sub string. In 7 we create a copy of new char array to hold the substring and leave the huge array for
garbage collection if not needed.

December 11, 2013 at 9:28 AM

Anonymous
said...

U have mentioned above that " This will also stop original string to be garbage collected, in case if doesn't have
any live reference.". can you please elaborate it more. I didn't understood it.

May 4, 2014 at 4:59 AM

Anonymous
said...

Hey guys, this is Uma. Think this is not quite correct: -.what will happen if beginIndex is equal to length in
substring(int beginIndex),for the answer provided - empty string will be returned. Question ought to have been
what will happen if beginIndex is equal to length of string from which you want a piece of. That is what the
person who gave this example: "emptiness".substring(9) returns "" (an empty string) was trying to say.. Cheers!
Nice questions and answers.

June 6, 2014 at 10:05 AM

Post a Comment

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

How SubString method works in Java - Memory Leak Fixed in JDK 1.7

Newer
Post

Home

Subscribe to:
Post Comments
(
Atom
)

About
Me

Privacy
Policy

http://javarevisited.blogspot.in/2011/10/how-substring-in-java-works.html[8/23/2014 8:40:23 AM]

Older
Post

You might also like