How to read data from NSFileHandle line by line?
Asked Answered
B

11

54

I have a text file having data as given

e.g.

PUFGUjVRallYZDNaazFtVjVObU1zWm5ZcUJUYU5ORk4zbGthNHNDVUdSMlFVQmpSVEoxUUNSallYaFhkanBITXBGR1NTQnpZRTltZE1OalVzSkdXQ0Z6WXR0V2RpTmpTdXgwTWs5V1lZSkZiWjFXT29OV2JSVlhaSTUwYUpwR040UUZXTzVHVXFoWFVRcFdWNHdVTUJ0Q1VHSmxXVlJVTlJCMVE1VTFWV
PUFGUjVRallYZDNaazFtVjVObU1zWm5ZcUJUYU5ORk4zbGthNHNDVUdSMlFVQmpSVEoxUUNSallYaFhkanBITXBGR1NTQnpZRTltZE1OalVzSkdXQ0Z6WXR0V2RpTmpTdXgwTWs5V1lZSkZiWjFXT29OV2JSVlhaSTUwYUpwR040UUZXTzVHVXFoWFVRcFdWNHdVTUJ0Q1VHSmxXVlJVTlJCMVE1VTFWV

Now I want to read data line by line. That means first I want to read

PUFGUjVRallYZDNaazFtVjVObU1zWm5ZcUJUYU5ORk4zbGthNHNDVUdSMlFVQmpSVEoxUUNSallYaFhkanBITXBGR1NTQnpZRTltZE1OalVzSkdXQ0Z6WXR0V2RpTmpTdXgwTWs5V1lZSkZiWjFXT29OV2JSVlhaSTUwYUpwR040UUZXTzVHVXFoWFVRcFdWNHdVTUJ0Q1VHSmxXVlJVTlJCMVE1VTFWV

and then next remaining. anyone have any idea??

Beutner answered 14/9, 2010 at 9:4 Comment(3)
I logically able to implement that by the use of NSArray, and separating component on the basis new line character.Beutner
But is there is any other way? any API?Beutner
Dupe of #1044834Mousetail
F
151

If your file is small, then @mipadi's method will probably be just fine. However, if your file is large (> 1MB, perhaps?), then you may want to consider reading the file line-by-line. I wrote a class once to do that, which I'll paste here:

//DDFileReader.h

@interface DDFileReader : NSObject {
    NSString * filePath;
    
    NSFileHandle * fileHandle;
    unsigned long long currentOffset;
    unsigned long long totalFileLength;
    
    NSString * lineDelimiter;
    NSUInteger chunkSize;
}

@property (nonatomic, copy) NSString * lineDelimiter;
@property (nonatomic) NSUInteger chunkSize;

- (id) initWithFilePath:(NSString *)aPath;

- (NSString *) readLine;
- (NSString *) readTrimmedLine;

#if NS_BLOCKS_AVAILABLE
- (void) enumerateLinesUsingBlock:(void(^)(NSString*, BOOL *))block;
#endif

@end


//DDFileReader.m

#import "DDFileReader.h"

@interface NSData (DDAdditions)

- (NSRange) rangeOfData_dd:(NSData *)dataToFind;

@end

@implementation NSData (DDAdditions)

- (NSRange) rangeOfData_dd:(NSData *)dataToFind {
    
    const void * bytes = [self bytes];
    NSUInteger length = [self length];
    
    const void * searchBytes = [dataToFind bytes];
    NSUInteger searchLength = [dataToFind length];
    NSUInteger searchIndex = 0;
    
    NSRange foundRange = {NSNotFound, searchLength};
    for (NSUInteger index = 0; index < length; index++) {
        if (((char *)bytes)[index] == ((char *)searchBytes)[searchIndex]) {
            //the current character matches
            if (foundRange.location == NSNotFound) {
                foundRange.location = index;
            }
            searchIndex++;
            if (searchIndex >= searchLength) { return foundRange; }
        } else {
            searchIndex = 0;
            foundRange.location = NSNotFound;
        }
    }
    return foundRange;
}

@end

@implementation DDFileReader
@synthesize lineDelimiter, chunkSize;

- (id) initWithFilePath:(NSString *)aPath {
    if (self = [super init]) {
        fileHandle = [NSFileHandle fileHandleForReadingAtPath:aPath];
        if (fileHandle == nil) {
            [self release]; return nil;
        }
        
        lineDelimiter = [[NSString alloc] initWithString:@"\n"];
        [fileHandle retain];
        filePath = [aPath retain];
        currentOffset = 0ULL;
        chunkSize = 10;
        [fileHandle seekToEndOfFile];
        totalFileLength = [fileHandle offsetInFile];
        //we don't need to seek back, since readLine will do that.
    }
    return self;
}

- (void) dealloc {
    [fileHandle closeFile];
    [fileHandle release], fileHandle = nil;
    [filePath release], filePath = nil;
    [lineDelimiter release], lineDelimiter = nil;
    currentOffset = 0ULL;
    [super dealloc];
}

- (NSString *) readLine {
    if (currentOffset >= totalFileLength) { return nil; }
    
    NSData * newLineData = [lineDelimiter dataUsingEncoding:NSUTF8StringEncoding];
    [fileHandle seekToFileOffset:currentOffset];
    NSMutableData * currentData = [[NSMutableData alloc] init];
    BOOL shouldReadMore = YES;
    
    NSAutoreleasePool * readPool = [[NSAutoreleasePool alloc] init];
    while (shouldReadMore) {
        if (currentOffset >= totalFileLength) { break; }
        NSData * chunk = [fileHandle readDataOfLength:chunkSize];
        NSRange newLineRange = [chunk rangeOfData_dd:newLineData];
        if (newLineRange.location != NSNotFound) {
            
            //include the length so we can include the delimiter in the string
            chunk = [chunk subdataWithRange:NSMakeRange(0, newLineRange.location+[newLineData length])];
            shouldReadMore = NO;
        }
        [currentData appendData:chunk];
        currentOffset += [chunk length];
    }
    [readPool release];
    
    NSString * line = [[NSString alloc] initWithData:currentData encoding:NSUTF8StringEncoding];
    [currentData release];
    return [line autorelease];
}

- (NSString *) readTrimmedLine {
    return [[self readLine] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

#if NS_BLOCKS_AVAILABLE
- (void) enumerateLinesUsingBlock:(void(^)(NSString*, BOOL*))block {
  NSString * line = nil;
  BOOL stop = NO;
  while (stop == NO && (line = [self readLine])) {
    block(line, &stop);
  }
}
#endif

@end

Then to use this, you'd do:

DDFileReader * reader = [[DDFileReader alloc] initWithFilePath:pathToMyFile];
NSString * line = nil;
while ((line = [reader readLine])) {
  NSLog(@"read line: %@", line);
}
[reader release];

Or (for 10.6+ and iOS 4+):

DDFileReader * reader = [[DDFileReader alloc] initWithFilePath:pathToMyFile];
[reader enumerateLinesUsingBlock:^(NSString * line, BOOL * stop) {
  NSLog(@"read line: %@", line);
}];
[reader release];
Flagwaving answered 14/9, 2010 at 16:52 Comment(26)
Nice solution. For bonus points, how about updating it to support block-based enumeration, like -[NSString enumerateLinesUsingBlock:]? :-)Xylidine
Is there any specific reason for the delimiter being a retain instead of a copy property?Affirmation
@Georg other than this was written a long time ago? No. :) (fixed)Flagwaving
@Dave You have probably saved a lot of people a lot of time, thanks a ton!!Carry
I needed to parse a big file which used \n as a separator and had two words per line. I just tried using this solution in a 40mb file and it is very, very slow. If I, instead, get an NSString representation of the file via stringWithContentsOfURL:encoding:error, then separate it into lines by using componentsSeparatedByString and then separate each line into two words, it's a LOT faster.Holoblastic
@OlivaresF: Sure, but then you have all 40MB of the file in memory. The code posted in my answer will read the file chunk-by-chunk, and leaves it up to you to decide how much you want to stick around (by periodically creating and draining NSAutoreleasePools).Flagwaving
@Dave That's definitely true, just thought I'd comment that it's a tradeoff.Holoblastic
i have a 70 mb file , using this code to read file doesn't hep me it increases memory linearly. can any one help me?Pianola
Can you post a disclaimer or a license for this code so that others can (or can not, as you wish) use it with confidence?Finke
@Will Hartung "use at your own risk and peril"Flagwaving
@Dave Delong I know it's pedantic, but, simply, the fact that you don't specifically release the code means that no one can, at least legally, use it as you have sole copyright. I understand the intent, but adding a disclaimer or a release (such as in to the public domain, or whatever) is actually kind of important if your intent is for the code to be used.Finke
@Will everything posted on this site is under the CC-wiki with attribution license. That's at the footer of every page.Flagwaving
@Dave Ah, ok. CC isn't really a good license for software, so the lawyers will have kittens. But thank you anyway.Finke
@Will if you really want I can post the code on github under a different license.Flagwaving
when i try to use this code, the loop ends after the 65,858th line whereas my file has a little over 1.1 million lines. Any clue why this might be the case?Lananna
i narrowed down the cause of my problem. the file stops being read once it hits a character like É even if the document is UTF-8 encoded. any idea why?Lananna
For what it's worth, I found that making the chunk size to 256 doubled the speed of my program. I tried a large range of sizes from 2 to 4000 and found that to be the best in my quick tests.Renz
Note that JJD started a gitHub project from this #3707927. Please contribute!Lanham
Note to anyone having problems: the "to use this" example code returns strings with linebreaks! Use the readTrimmedLine method instead if you don't want them.Boehike
@Lananna my loop is also ending well before the last line in the file. Anyone have any ideas?Boehike
@cksubs open an issue on the github project linked to in a previous comment.Flagwaving
Issue opened. The problem is that it stops working when it hits an accented character like "é".Boehike
@cksubs - readTrimmedLine will remove both leading and trailing whitespace which my interfere with parsing (e.g., I have a tab-delimited file and leading tabs are important). I used: NSRange eol = [line rangeOfCharacterFromSet: [NSCharacterSet newlineCharacterSet] options: NSBackwardsSearch]; if( eol.location != NSNotFound ) line = [line substringToIndex: eol.location]; to only strip the newline at the end.Vella
I am having the same problem with an accented character, the á. It stops reading after that, any idea why? I'm parsing a Minecraft server log, and my friend just haaaaaad to go on and start speaking in French.Septuagint
@DaveDeLong Would you mind making a statement here that you're licensing this under the Apache 2.0 license, or in the public domain? I can't use this under CC-BY-SA.Ihram
Sorry about the removal of "Merry Christmas. :)"Emaemaciate
K
22

I rewrote this to be ARC compliant:

//
//  DDFileReader.m
//  PBX2OPML
//
//  Created by michael isbell on 11/6/11.
//  Copyright (c) 2011 BlueSwitch. All rights reserved.
//

//DDFileReader.m

#import "DDFileReader.h"

@interface NSData (DDAdditions)

- (NSRange) rangeOfData_dd:(NSData *)dataToFind;

@end

@implementation NSData (DDAdditions)

- (NSRange) rangeOfData_dd:(NSData *)dataToFind {

    const void * bytes = [self bytes];
    NSUInteger length = [self length];

    const void * searchBytes = [dataToFind bytes];
    NSUInteger searchLength = [dataToFind length];
    NSUInteger searchIndex = 0;

    NSRange foundRange = {NSNotFound, searchLength};
    for (NSUInteger index = 0; index < length; index++) {
        if (((char *)bytes)[index] == ((char *)searchBytes)[searchIndex]) {
            //the current character matches
            if (foundRange.location == NSNotFound) {
                foundRange.location = index;
            }
            searchIndex++;
            if (searchIndex >= searchLength) { return foundRange; }
        } else {
            searchIndex = 0;
            foundRange.location = NSNotFound;
        }
    }
    return foundRange;
}

@end

@implementation DDFileReader
@synthesize lineDelimiter, chunkSize;

- (id) initWithFilePath:(NSString *)aPath {
    if (self = [super init]) {
        fileHandle = [NSFileHandle fileHandleForReadingAtPath:aPath];
        if (fileHandle == nil) {
            return nil;
        }

        lineDelimiter = @"\n";
        currentOffset = 0ULL; // ???
        chunkSize = 10;
        [fileHandle seekToEndOfFile];
        totalFileLength = [fileHandle offsetInFile];
        //we don't need to seek back, since readLine will do that.
    }
    return self;
}

- (void) dealloc {
    [fileHandle closeFile];
    currentOffset = 0ULL;

}

- (NSString *) readLine {
    if (currentOffset >= totalFileLength) { return nil; }

    NSData * newLineData = [lineDelimiter dataUsingEncoding:NSUTF8StringEncoding];
    [fileHandle seekToFileOffset:currentOffset];
    NSMutableData * currentData = [[NSMutableData alloc] init];
    BOOL shouldReadMore = YES;

    @autoreleasepool {

    while (shouldReadMore) {
        if (currentOffset >= totalFileLength) { break; }
        NSData * chunk = [fileHandle readDataOfLength:chunkSize];
        NSRange newLineRange = [chunk rangeOfData_dd:newLineData];
        if (newLineRange.location != NSNotFound) {

            //include the length so we can include the delimiter in the string
            chunk = [chunk subdataWithRange:NSMakeRange(0, newLineRange.location+[newLineData length])];
            shouldReadMore = NO;
        }
        [currentData appendData:chunk];
        currentOffset += [chunk length];
    }
    }

    NSString * line = [[NSString alloc] initWithData:currentData encoding:NSUTF8StringEncoding];
    return line;  
}

- (NSString *) readTrimmedLine {
    return [[self readLine] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

#if NS_BLOCKS_AVAILABLE
- (void) enumerateLinesUsingBlock:(void(^)(NSString*, BOOL*))block {
    NSString * line = nil;
    BOOL stop = NO;
    while (stop == NO && (line = [self readLine])) {
        block(line, &stop);
    }
}
#endif

@end
Kilah answered 6/11, 2011 at 14:18 Comment(1)
Note that if you change the chunkSize from 10 to around the maximum line length of the lines in the file you're reading, this code will perform much faster. In my case, changing chunkSize from 10 to 128 doubled performance.Notus
I
17

I started a GitHub project based on the source code of Dave DeLong. You are welcome to improve the code. By now I can read files forwards and backwards.

Impiety answered 11/10, 2010 at 21:17 Comment(0)
M
13
NSString *fh = [NSString stringWithContentsOfFile:filePath encoding:fileEncoding error:NULL];
for (NSString *line in [fh componentsSeparatedByString:@"\n"]) {
    // Do something with the line
}

There's no API in Cocoa, or built-in language constructs, for reading a file line by line.

Mylesmylitta answered 14/9, 2010 at 16:15 Comment(3)
Don't forget -[NSString enumerateLinesUsingBlock:], available in OS X 10.6 and iOS 4. I'd trust it to be more robust in the presence of newlines other than just \n.Xylidine
Thanks for answer I did this already, but because I m not sure about whether an API exists or not ,thats why I ask this question.Beutner
typo -> componenents should read components. Otherwise it's a clean and simple solution for reading small files. thanks :DTabbatha
P
6

The answer to this question for LARGE text files does not require a custom function. Objective-C is a superset of c, and therefore has the c methods to do just this.

FILE* file = fopen("path to my file", "r");

size_t length;
char *cLine = fgetln(file,&length);

while (length>0) {
    char str[length+1];
    strncpy(str, cLine, length);
    str[length] = '\0';

    NSString *line = [NSString stringWithFormat:@"%s",str];        
    % Do what you want here.

    cLine = fgetln(file,&length);
}

Note that fgetln will not keep your newline character. Also, We +1 the length of the str because we want to make space for the NULL termination.

Pinkerton answered 7/3, 2016 at 19:46 Comment(2)
This answer is faster and more code-efficient than the "best answer"Frostbitten
@Frostbitten ...unless you have to use NSFileHandle for some reason, like in the case of reading data from NSPipeAmes
S
3

Here is a method which I used for reading an individual line from an NSInputStream. Note that it is optimized for readability and not for speed. ;-)

- (NSString*) readLine: (NSInputStream*) inputStream {
    NSMutableData* data = [NSMutableData data];
    uint8_t oneByte;
    do {
        int actuallyRead = [inputStream read: &oneByte maxLength: 1];
        if (actuallyRead == 1) {
            [data appendBytes: &oneByte length: 1];
        }        
    } while (oneByte != '\n');

    return [[NSString alloc] initWithData: data encoding: NSUTF8StringEncoding];
Stealing answered 13/7, 2012 at 14:11 Comment(0)
I
2

I found out that GitX uses a line reader as well.
Checkout the brotherbard's repository on GitHub or the website of the Michael Stapelberg.

@Joe Yang
Nice! I will take a closer look the next days.
I would be glad if you want to fork my repository on GitHub and send me a pull request.

Impiety answered 10/11, 2010 at 18:49 Comment(1)
@Joe Yang: Could you please explain how readLineBackwards works? I tried to integrate it but alltimes it returns seek to offset 0, offset in file is 0. I cannot see that you shift the offset to the end of the file to move backwards.Impiety
P
1

I've modified the FileReader to an NSFileHandle category, hope it could help others

@interface NSFileHandle (Readline)
- (NSString*)readLine;
- (NSString*)readLineBackwards;
@end

#import "NSFileHandle+Readline.h"
#import "NSDataExtensions.h"

@implementation NSFileHandle (Readline)

- (NSString*)readLine {

    NSString * _lineDelimiter = @"\n";

    NSData* newLineData = [_lineDelimiter dataUsingEncoding:NSUTF8StringEncoding];
    NSMutableData* currentData = [[NSMutableData alloc] init];
    BOOL shouldReadMore = YES;

    NSUInteger _chunkSize = 10;

    while (shouldReadMore) {
        NSData* chunk = [self readDataOfLength:_chunkSize]; // always length = 10

        if ([chunk length] == 0) {
            break;
        }

        // Find the location and length of the next line delimiter.
        NSRange newLineRange = [chunk rangeOfData:newLineData];
        if (newLineRange.location != NSNotFound) {
            // Include the length so we can include the delimiter in the string.
            NSRange subDataRange = NSMakeRange(0, newLineRange.location + [newLineData length]);
            unsigned long long newOffset = [self offsetInFile] - [chunk length] + newLineRange.location + [newLineData length];
            [self seekToFileOffset:newOffset];
            chunk = [chunk subdataWithRange:subDataRange];
            shouldReadMore = NO;
        }
        [currentData appendData:chunk];
    }

    NSString* line = [currentData stringValueWithEncoding:NSASCIIStringEncoding];
    return line;
}

- (NSString*)readLineBackwards {

    NSString * _lineDelimiter = @"\n";

    NSData* newLineData = [_lineDelimiter dataUsingEncoding:NSUTF8StringEncoding];
    NSUInteger _chunkSize = 10;

    NSMutableData* currentData = [[NSMutableData alloc] init];
    BOOL shouldReadMore = YES;

    while (shouldReadMore) {

        unsigned long long offset;

        NSUInteger currentChunkSize = _chunkSize;

        if ([self offsetInFile] <= _chunkSize) {
            offset = 0;
            currentChunkSize = [self offsetInFile];
            shouldReadMore = NO;
        } else {
            offset = [self offsetInFile] - _chunkSize;
        }

        NSLog(@"seek to offset %qu, offset in file is %qu", offset, [self offsetInFile]);

        [self seekToFileOffset:offset];

        NSData* chunk = [self readDataOfLength:currentChunkSize];

        NSRange newLineRange = [chunk rangeOfDataBackwardsSearch:newLineData];

        if (newLineRange.location == NSNotFound) {
            [self seekToFileOffset:offset];
        }

        if (newLineRange.location != NSNotFound) {
            NSUInteger subDataLoc = newLineRange.location;
            NSUInteger subDataLen = currentChunkSize - subDataLoc;
            chunk = [chunk subdataWithRange:NSMakeRange(subDataLoc, subDataLen)];
            NSLog(@"got chunk data %@", [chunk stringValueWithEncoding:NSASCIIStringEncoding]);
            shouldReadMore = NO;
            [self seekToFileOffset:offset + newLineRange.location];
        }
        [currentData prepend:chunk];
    }

    NSString* line = [[NSString alloc] initWithData:currentData encoding:NSASCIIStringEncoding];
    return [line autorelease];
}

@end





//
//  NSDataExtensions.m
//  LineReader
//
//  Created by Tobias Preuss on 08.10.10.
//  Copyright 2010 Tobias Preuss. All rights reserved.
//

#import "NSDataExtensions.h"



// -----------------------------------------------------------------------------
// NSData additions.
// -----------------------------------------------------------------------------


/**
 Extension of the NSData class. 
 Data can be found forwards or backwards. Further the extension supplies a function 
 to convert the contents to string for debugging purposes.
 @param Additions Category labeled Additions.
 @returns An initialized NSData object or nil if the object could not be created.
 */
@implementation NSData (Additions)




/**
 Returns a range of data.
 @param dataToFind Data object specifying the delimiter and encoding.
 @returns A range.
 */
- (NSRange)rangeOfData:(NSData*)dataToFind {

    const void* bytes = [self bytes];
    NSUInteger length = [self length];
    const void* searchBytes = [dataToFind bytes];
    NSUInteger searchLength = [dataToFind length];
    NSUInteger searchIndex = 0;

    NSRange foundRange = {NSNotFound, searchLength};
    for (NSUInteger index = 0; index < length; index++) {
        // The current character matches.
        if (((char*)bytes)[index] == ((char*)searchBytes)[searchIndex]) {
            // Store found location if not done earlier.
            if (foundRange.location == NSNotFound) {
                foundRange.location = index;
            }
            // Increment search character index to check for match.
            searchIndex++;
            // All search character match.
            // Break search routine and return found position.
            if (searchIndex >= searchLength) {
                return foundRange;
            }
        }
        // Match does not continue.
        // Return to the first search character.
        // Discard former found location.
        else {
            searchIndex = 0;
            foundRange.location = NSNotFound;
        }
    }
    return foundRange;
}


- (NSRange)rangeOfDataBackwardsSearch:(NSData*)dataToFind {

    const void* bytes = [self bytes];
    NSUInteger length = [self length];
    const void* searchBytes = [dataToFind bytes];
    NSUInteger searchLength = [dataToFind length];
    NSUInteger searchIndex = 0;

    NSRange foundRange = {NSNotFound, searchLength};
    if (length < searchLength) {
        return foundRange;
    }
    for (NSUInteger index = length - searchLength; index >= 0;) {
//      NSLog(@"%c == %c", ((char*)bytes)[index], ((char*)searchBytes)[searchIndex]); /* DEBUG LOG */
        if (((char*)bytes)[index] == ((char*)searchBytes)[searchIndex]) {
            // The current character matches.
            if (foundRange.location == NSNotFound) {
                foundRange.location = index;
            }
            index++;
            searchIndex++;
            if (searchIndex >= searchLength) {
                return foundRange;
            }
        }
        else {
            // Decrement to search backwards.
            if (foundRange.location == NSNotFound) {
                // Skip if first byte has been reached.
                if (index == 0) {
                    foundRange.location = NSNotFound;
                    return foundRange;
                }
                index--;
            }
            // Jump over the former found location
            // to avoid endless loop.
            else {
                index = index - 2;
            }
            searchIndex = 0;
            foundRange.location = NSNotFound;
        }
    }
    return foundRange;
}

- (NSString*)stringValueWithEncoding:(NSStringEncoding)encoding {
    return [[NSString alloc] initWithData:self encoding:encoding];
}

@end




// -----------------------------------------------------------------------------
// NSMutableData additions.
// -----------------------------------------------------------------------------


/**
 Extension of the NSMutableData class. 
 Data can be prepended in addition to the append function of the framework.
 @param Additions Category labeled Additions.
 @returns An initialized NSMutableData object or nil if the object could not be created.
 */
@implementation NSMutableData (Additions)

/**
    Inserts the data before the data of the object.
    @param data Data to be prepended.
 */
- (void)prepend:(NSData*)data {


    NSMutableData* concat = [NSMutableData dataWithData:data];
    [concat appendData:self];
    [self setData:concat];
}

@end
Paramour answered 8/11, 2010 at 11:14 Comment(0)
G
1

You can also check out the CGIStream library I created for my HTTP server project at https://github.com/xcvista/ohttpd2/tree/master/CGIStream. Instead of file descriptors, this code operates on NSInputStream. It is essentially an Objective-C clone of System.IO.StreamReader and System.IO.StreamWriter from Microsoft's .net framework.

It will work with not only files but also network sockets. I use it to handle HTTP protocol, which is the namesake of the CGI prefix.

Gilpin answered 17/5, 2013 at 14:32 Comment(0)
H
1

I run into the similar situation with some other circumstances, and here is my solution in Swift 3. Assuming text file to be utf8.

extension FileHandle {

    func enumerateLines(_ block: @escaping (String, UnsafeMutablePointer<Bool>) -> Void) {

        // find the end of file
        var offset = self.offsetInFile
        let eof = self.seekToEndOfFile()
        self.seek(toFileOffset: offset)
        let blockSize = 1024
        var buffer = Data()

        // process to the end of file
        while offset + UInt64(buffer.count) < eof {
            var found = false

            // make sure buffer contains at least one CR, LF or null
            while !found && offset + UInt64(buffer.count) < eof {
                let block = self.readData(ofLength: blockSize)
                buffer.append(block)
                for byte in block {
                    if [0x0d, 0x0a, 0x00].contains(byte) {
                        found = true ; break
                    }
                }
            }

            // retrieve lines within the buffer
            var index = 0
            var head = 0 // head of line
            var done = false
            buffer.enumerateBytes({ (pointer, count, stop) in
                while index < count {
                    // find a line terminator
                    if [0x0d, 0x0a, 0x00].contains(pointer[index]) {
                        let lineData = Data(pointer[head ..< index])
                        if let line = String(bytes: lineData, encoding: .utf8) {
                            block(line, &stop) // stop requested
                            if pointer[index] == 0x0d && index+1 < count && pointer[index+1] == 0x0a {
                                index += 2 ; head = index
                            }
                            else { index += 1 ; head = index }
                            if stop { done = true ; return } // end of enumerateLines
                        }
                        else { return } // end of enumerateLines
                    }
                    else { index += 1 }
                }
            })

            offset += UInt64(head)
            buffer.replaceSubrange(0 ..< head, with: Data())
            if done { // stop requested
                self.seek(toFileOffset: offset)
                return
            }
        }
    }

Here is the usage:

    let fileURL = Bundle.main.url(forResource: "huge_file", withExtension: "txt")!
    let fileHandle = try! FileHandle(forReadingFrom: fileURL)

    fileHandle.enumerateLines { (line, stop) in
        if someCondition { stop.pointee = true }
        print(line)
    }
    /* let remaining = fileHandle.readDataToEndOfFile() */

https://gist.github.com/codelynx/c1de603a85e7503fe9597d027e93f4de

Herman answered 22/11, 2016 at 4:55 Comment(0)
A
0

This worked for me on Swift 5 .

https://gist.github.com/sooop/a2b110f8eebdf904d0664ed171bcd7a2

Aliform answered 10/2, 2020 at 13:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.