Java String.substring method potential memory leak?
Asked Answered
X

3

16

I was going through the String class API and looks like there is a potential memory leak caused by substring method as it shares same character array as original String.

If original string is huge then small string returned by substring can prevent original string(backed up by large array) from garbage collection in Java.

Any thoughts or did I read the API wrong.

Xyloid answered 4/1, 2013 at 16:34 Comment(3)
This is not technically a memory leak at all, since the character array is still referenced and could be collected later when all strings that reference it are collected. Part of the character array might not be used anymore, but that does not make it a leak.Inandin
if you have a 100 large string each 100MB and you have a substring(0,1) you are technically holding that value[] used in String class and never in your application huge strings are eligible for garbage collectionXyloid
best link javarevisited.blogspot.com/2012/03/…Torrefy
A
21

There is a potential for a memory leak, if you take a substring of a sizable string and not make a copy (usually via the String(String) constructor).

Note that this has changed since Java 7u6. See https://bugs.openjdk.java.net/browse/JDK-7197183.

The original assumptions around the String object implementing a flyweight pattern are no longer regarded as valid.

See this answer for more info.

Alienate answered 4/1, 2013 at 16:36 Comment(6)
both strategies are valid. the original impl was fine; the new impl is fine. but it is not fine to change the impl.... it is an astonishing break of compatibility. they couldn't have done it for the apparent reason. there's something else going on.Figured
@Figured This was admittedly an implementation detail that was not part of the specification => it should not have been relied on. The change has been made for performance reasons.Embolism
@Embolism that's more of a cop-out. people have established belief that substring()/trim() is O(1). silently changing it to O(n) is not very nice. And how do you write code the behave consistently on different Java versions?Figured
they did it for performance reason, but not the apparent ones, which are very weak.Figured
@Figured they did it to improve the performance of Strings in general, with the drawback that the performance of substring has been deteriorated...Embolism
they did it for something more conspiratory. the "bug" has been filed for more than a decade (4513622 etc); if the bug is valid for the apparent reasons, they should have fixed it long time ago.Figured
E
5
  1. It was the case until Java 7u6 - you would generally deal with the issue by doing:

    String sub = new String(s.substring(...)); // create a new string
    

    That effectively removes the dependency and the original string is now available for GC. This is by the way one of the only scenarios where using the string constructor makes sense.

  2. Since Java 7u6, a new String is created and there is no memory issue any longer.

Embolism answered 4/1, 2013 at 16:43 Comment(5)
yeah but it creates a new problem. if you trim() off one white space, which is a very common case, you end up copying N-1 chars.Figured
@Figured You can always find corner cases where it will perform worse. The goal of a general purpose ilbrary is to perform well on average and it seems that many different use cases have been taken into account before making that change.Embolism
Also copying chars is very fast (not saying it has not cost).Embolism
trim() is not a corner case. actually str.substring().trim() is a very common case, which involves copying twice. O(1)->O(n) is a big deal. and no, oracle does not know how this will impact the existing applications in the wild.Figured
@Figured I understand your point. There are many variables to take into account (in particular in this case better use of CPU's caches) which could balance this even if you use substring often. You can dig in the discussions to get more info.Embolism
M
1

In Java 7, String's subString is modified to :

/**
     * Returns a new string that is a substring of this string. The
     * substring begins with the character at the specified index and
     * extends to the end of this string. <p>
     * Examples:
     * <blockquote><pre>
     * "unhappy".substring(2) returns "happy"
     * "Harbison".substring(3) returns "bison"
     * "emptiness".substring(9) returns "" (an empty string)
     * </pre></blockquote>
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if
     *             <code>beginIndex</code> is negative or larger than the
     *             length of this <code>String</code> object.
     */
    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }

Hence, everytime you do subString with beginIndex NOT equal to 0, we have new String Object.

Midterm answered 9/10, 2015 at 11:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.