iOS: How to avoid autoreleased copies when manipulating a large NSString instance?

I have a scenario in an iOS application where manipulating a very large NSString instance (an HTTP response, upwards of 11MB) results in multiple large intermediaries being in memory at once, since the SDK methods I am calling return new autoreleased instances. What is the best approach to take here?

For example, assuming that largeString is an autoreleased NSString instance:

NSArray *partsOfLargeString = [largeString componentsSeparatedByString:separator];

for (NSString *part in partsOfLargeString) {
    NSString *trimmedPart = [part stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

    NSData *data = [trimmedPart dataUsingEncoding:NSUTF8StringEncoding];
}

It would be great if there were non-autoreleased equivalents to componentsSeparatedByString or stringByTrimmingCharactersInSet, but I'm not looking to implement these myself.

To my knowledge, there isn't a way to "force" release an object that has already been added to an autorelease pool. I know that I can create and use my own autorelease pool here, but I'd like to be extremely granular and having autorelease pools around individual statements definitely isn't a very scalable approach.

Any suggestions are much appreciated.

Tyrrell answered 13/9, 2011 at 15:17 Comment(2)
You’ll have to surround your code with autorelease pools. Could you elaborate on why you think that’s not scalable? If there were an alternative, it’d probably be a method so you’d have to send an extra message to each object you want to force-release. Sounds like pretty much the same to me. - Limb
Using one autorelease pool in the above example would not be sufficient for my needs. I would have to add an autorelease pool around each of componentsSeparatedByString:, stringByTrimmingCharactersInSet: and dataUsingEncoding:, which seems messy. - Tyrrell

As Bill said, I’d first try to have an autorelease pool for each loop iteration, e.g.:

for (NSString *part in partsOfLargeString) {
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    NSString *trimmedPart = [part stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    NSData *data = [trimmedPart dataUsingEncoding:NSUTF8StringEncoding];
    …

    [pool drain];
}

or, if you’re using a recent enough compiler:

for (NSString *part in partsOfLargeString) {
    @autoreleasepool {
        NSString *trimmedPart = [part stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSData *data = [trimmedPart dataUsingEncoding:NSUTF8StringEncoding];
        …
    }
}

If that’s still not acceptable and you do need to release objects in a more granular fashion, you could use something like:

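// Evaluates the block inside a temporary autorelease pool and returns its
// result retained (+1), so the caller owns it. Manual reference counting only.
static inline __attribute__((ns_returns_retained))
id BICreateDrainedPoolObject(id (^expression)(void)) {
    NSAutoreleasePool *pool = [NSAutoreleasePool new];
    id object = expression();
    [object retain];  // keep the result alive past the drain
    [pool drain];     // release intermediates autoreleased by the expression
    return object;
}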

#define BIOBJ(expression) BICreateDrainedPoolObject(^{return (expression);})

which evaluates the expression, retains its result, releases any ancillary autoreleased objects and returns the result; and then:

for (NSString *part in partsOfLargeString) {
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    NSString *trimmedPart = BIOBJ([part stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]);
    NSData *data = BIOBJ([trimmedPart dataUsingEncoding:NSUTF8StringEncoding]);
    [trimmedPart release];

    // do something with data
    [data release];

    …

    [pool drain];
}

Note that, since the function returns a retained object, you’re responsible for releasing it. You’ll have control over when to do that.

Feel free to choose better names for the function and macro. There might be some corner cases that should be handled but it should work for your particular example. Suggestions are welcome!
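
The function above uses retain and NSAutoreleasePool, neither of which is available under ARC. As a rough sketch only, an ARC counterpart might look like the following. The names BIDrainedPoolObject and BIARCOBJ are made up for illustration, and the code assumes compilation with -fobjc-arc. Depending on what the compiler does with the return value, the result itself may still end up in the enclosing pool, so treat this as a way to bound the intermediates rather than a guarantee about the result.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical ARC counterpart of BICreateDrainedPoolObject (illustrative name).
// Assumes the file is compiled with -fobjc-arc.
static inline id BIDrainedPoolObject(id (^expression)(void)) {
    id object = nil;
    @autoreleasepool {
        // The strong local keeps the result alive while the pool drains
        // any intermediates that the expression autoreleased.
        object = expression();
    }
    return object;
}

// Hypothetical convenience macro, mirroring BIOBJ.
#define BIARCOBJ(expression) BIDrainedPoolObject(^id(void) { return (expression); })
```

Usage would be the same as with BIOBJ, minus the explicit release calls, since ARC releases the result when the variable holding it goes out of scope.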

Limb answered 13/9, 2011 at 16:35 Comment(2)
Wouldn't trying to call isKindOfClass: on the result of the expression cause an exception before the assertion is even tested? - Conveyance
@ugh Yeah, you’re right. Also, I think block return type inference should take care of non-object expressions. - Limb

First, you shouldn't need to parse responses from an HTTP server in this fashion. Parsing HTTP responses (including parsing HTML) is a solved problem, and attempting to parse them using raw string manipulation will lead to fragile code that can easily crash after seemingly innocuous server-side changes.

Autorelease pools are pretty cheap. You could surround the body of the for loop with @autoreleasepool { ... that code ... } and it will probably both fix your high-water-mark issue and have negligible performance impact compared to the raw string manipulation itself.

Beyond that, your summary is correct -- if there isn't a non-autoreleasing variant in the 'kit, then you'd have to re-invent the wheel. With that said, it is fairly typical that the lack of a non-autoreleasing variant is not an oversight on the part of the designer. Instead, it is likely because there are better tools available for achieving the sort of high-volume solution that would also require finer grained memory management.
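
To give an idea of what "re-inventing the wheel" could look like here, the sketch below walks the string with rangeOfString:options:range: instead of calling componentsSeparatedByString:, so the full array of parts is never built and each part's intermediates are drained before the next one is created. The function name BIEnumerateTrimmedParts and its handler parameter are hypothetical, not part of any SDK, and this is only one possible approach under the assumption that the parts can be processed one at a time.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: calls `handler` once per separator-delimited part,
// trimmed and UTF-8 encoded, without building an array of all parts first.
static void BIEnumerateTrimmedParts(NSString *largeString,
                                    NSString *separator,
                                    void (^handler)(NSData *data)) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSUInteger length = [largeString length];
    NSUInteger start = 0;

    while (start <= length) {
        @autoreleasepool {
            // Find the next separator at or after `start`.
            NSRange searchRange = NSMakeRange(start, length - start);
            NSRange found = [largeString rangeOfString:separator
                                               options:0
                                                 range:searchRange];
            NSUInteger end = (found.location == NSNotFound) ? length : found.location;

            // Only this part and its trimmed/encoded copies live in the pool.
            NSString *part = [largeString substringWithRange:NSMakeRange(start, end - start)];
            NSString *trimmedPart = [part stringByTrimmingCharactersInSet:whitespace];
            handler([trimmedPart dataUsingEncoding:NSUTF8StringEncoding]);

            // Step past the separator, or past the end to finish the loop.
            start = (found.location == NSNotFound) ? length + 1 : NSMaxRange(found);
        }
    }
}
```

The handler runs inside the per-part pool, so anything it needs to keep beyond the call has to be retained (or stored in a strong reference under ARC) by the handler itself.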

Nones answered 13/9, 2011 at 16:09 Comment(2)
Thanks for your response. For what it's worth, the custom parsing code is for handling multi-part MIME responses. - Tyrrell
That should be a solved problem, too, though you might have to grab a 3rd party library for it. - Nones
