Are there alternate implementations of GNU getline interface?
S

6

18

The experiment I am currently working on uses a software base with a complicated source history and no well-defined license. It would be a considerable amount of work to rationalize things and release under a fixed license.

It is also intended to run on any random Unix-like platform, and only some of the libcs we support have GNU getline, but right now the code expects it.

Does anyone know of a re-implementation of the GNU getline semantics that is available under a less restrictive license?

Edit:: I ask because Google didn't help, and I'd like to avoid writing one if possible (it might be a fun exercise, but it can't be the best use of my time.)

To be more specific, the interface in question is:

ssize_t getline (char **lineptr, size_t *n, FILE *stream);
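
For reference, the calling convention that any replacement has to honour can be sketched roughly as follows. This assumes a libc that already provides getline; the helper name longest_line is hypothetical and only there to give the loop something to do.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: a libc that provides getline */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line in `stream`,
   counting the trailing newline when present. */
size_t longest_line(FILE *stream) {
    char *line = NULL;   /* NULL + 0 asks getline to allocate the buffer */
    size_t cap = 0;
    size_t longest = 0;
    ssize_t len;

    /* getline grows `line` as needed and updates `cap`; it returns the
       number of bytes read, or -1 on EOF or error. */
    while ((len = getline(&line, &cap, stream)) != -1) {
        if ((size_t)len > longest) {
            longest = (size_t)len;
        }
    }
    free(line);          /* the caller owns the buffer, even after a failure */
    return longest;
}
```

The points a re-implementation has to preserve are the in/out buffer and size arguments, the signed return value, and the fact that the caller frees the buffer.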
Steradian answered 9/4, 2009 at 17:18 Comment(3)
Prompted by this question, I've corrected the declaration; getline returns ssize_t, not size_t.Highclass
A public domain implementation of getline(): https://mcmap.net/q/669259/-how-do-i-read-an-arbitrarily-long-line-in-cConsols
Would you consider accepting another answer to this question?Fluoroscope
O
22

The code by Will Hartung suffers from a very serious problem. realloc will most probably free the old block and allocate a new one, but the p pointer within the code will continue to point to the original. This one tries to fix that by using array indexing instead. It also tries to more closely replicate the standard POSIX logic.

/* The original code is public domain -- Will Hartung 4/9/09 */
/* Modifications, public domain as well, by Antti Haapala, 11/10/17
   - Switched to getc on 5/23/19 */

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <stdint.h>

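// ssize_t is a POSIX type: take it from <sys/types.h> where available,
// and define it only where it is missing (MSVC)
#ifdef _MSC_VER
typedef intptr_t ssize_t;
#else
#include <sys/types.h>
#endif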

ssize_t getline(char **lineptr, size_t *n, FILE *stream) {
    size_t pos;
    int c;

    if (lineptr == NULL || stream == NULL || n == NULL) {
        errno = EINVAL;
        return -1;
    }

    c = getc(stream);
    if (c == EOF) {
        return -1;
    }

    if (*lineptr == NULL) {
        *lineptr = malloc(128);
        if (*lineptr == NULL) {
            return -1;
        }
        *n = 128;
    }

    pos = 0;
    while(c != EOF) {
        if (pos + 1 >= *n) {
            size_t new_size = *n + (*n >> 2);
            if (new_size < 128) {
                new_size = 128;
            }
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                return -1;
            }
            *n = new_size;
            *lineptr = new_ptr;
        }

        ((unsigned char *)(*lineptr))[pos ++] = c;
        if (c == '\n') {
            break;
        }
        c = getc(stream);
    }

    (*lineptr)[pos] = '\0';
    return pos;
}

The performance can be increased for a platform by locking the stream once and using the equivalent of getc_unlocked(3) - but these are not standardized in C; and if you're using the POSIX version, then you probably will have getline(3) already.
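
As a rough illustration of that idea, here is a sketch that takes the stream lock once per call and reads with getc_unlocked. It assumes a POSIX platform that has flockfile, getc_unlocked and funlockfile; the name getline_locked is hypothetical, chosen so that it cannot clash with a libc getline.

```c
#define _POSIX_C_SOURCE 200809L  /* for flockfile, getc_unlocked, funlockfile */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical name, chosen so it cannot clash with a libc getline. */
ssize_t getline_locked(char **lineptr, size_t *n, FILE *stream) {
    size_t pos = 0;
    ssize_t result = -1;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL) {
        errno = EINVAL;
        return -1;
    }

    flockfile(stream);               /* take the stream lock once */
    c = getc_unlocked(stream);
    if (c == EOF) {
        goto done;
    }
    if (*lineptr == NULL) {
        *n = 0;                      /* ignore a stale size for a NULL buffer */
    }
    while (c != EOF) {
        if (pos + 1 >= *n) {         /* need room for c and the '\0' */
            size_t new_size = *n < 128 ? 128 : *n + (*n >> 2);
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                goto done;           /* old buffer stays valid for the caller */
            }
            *lineptr = new_ptr;
            *n = new_size;
        }
        (*lineptr)[pos++] = (char)c;
        if (c == '\n') {
            break;
        }
        c = getc_unlocked(stream);
    }
    (*lineptr)[pos] = '\0';
    result = (ssize_t)pos;
done:
    funlockfile(stream);             /* release on every exit path */
    return result;
}
```

The single exit label is there so that the lock is released on the EOF and allocation-failure paths as well as on success.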

Oligoclase answered 10/11, 2017 at 18:48 Comment(4)
I got these errors: error: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive] for your malloc(128) and realloc(*lineptr, new_size). I fixed it by casting them to (char*); see "invalid conversion from `void*' to `char*' when using malloc?".Triclinic
When I tested with Cygwin C the performance was 10X worse than the builtin getline()Triclinic
As for the performance, that is expected as I am using fgetc which needs to lock the stream for each read character. Unfortunately there is no standards-compliant way of avoiding the lock-unlock. There is for POSIX, but if you have POSIX you probably will have getline too.Fluoroscope
@user on Windows you can use _lock_file and _getc_nolockFluoroscope
B
15

I'm puzzled.

I looked at the link, read the description, and this is a fine utility.

But are you saying you simply can't rewrite this function to spec? The spec seems quite clear.

Here:

/* This code is public domain -- Will Hartung 4/9/09 */
#include <stdio.h>
#include <stdlib.h>

size_t getline(char **lineptr, size_t *n, FILE *stream) {
    char *bufptr = NULL;
    char *p = bufptr;
    size_t size;
    int c;

    if (lineptr == NULL) {
        return -1;
    }
    if (stream == NULL) {
        return -1;
    }
    if (n == NULL) {
        return -1;
    }
    bufptr = *lineptr;
    size = *n;

    c = fgetc(stream);
    if (c == EOF) {
        return -1;
    }
    if (bufptr == NULL) {
        bufptr = malloc(128);
        if (bufptr == NULL) {
            return -1;
        }
        size = 128;
    }
    p = bufptr;
    while(c != EOF) {
        if ((p - bufptr) > (size - 1)) {
            size = size + 128;
            bufptr = realloc(bufptr, size);
            if (bufptr == NULL) {
                return -1;
            }
        }
        *p++ = c;
        if (c == '\n') {
            break;
        }
        c = fgetc(stream);
    }

    *p++ = '\0';
    *lineptr = bufptr;
    *n = size;

    return p - bufptr - 1;
}

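int main(int argc, char** args) {
    char *buf = NULL; /*malloc(10);*/
    size_t bufSize = 0; /*10;*/

    printf("%zu\n", bufSize);
    /* getline takes a size_t *, so bufSize must be a size_t, not an int */
    size_t charsRead = getline(&buf, &bufSize, stdin);
    if (charsRead == (size_t)-1) {
        /* EOF or error: buf may still be NULL, so do not print it */
        free(buf);
        return 1;
    }

    printf("'%s'", buf);
    printf("%zu\n", bufSize);
    free(buf);
    return 0;
}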

15 minutes, and I haven't written C in 10 years. It minorly breaks the getline contract in that it only checks if the lineptr is NULL, rather than NULL and n == 0. You can fix that if you like. (The other case didn't make a whole lot of sense to me, I guess you could return -1 in that case.)

Replace the '\n' with a variable to implement "getdelim".
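
A minimal sketch of that idea follows. It uses array indexing and a signed return type rather than the pointer arithmetic in the code above, and it assumes ssize_t is available from <sys/types.h>. The names compat_getdelim and compat_getline are hypothetical, chosen to avoid clashing with a libc that already has both functions.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* assumption: ssize_t is available here */

/* Hypothetical names, chosen to avoid clashing with a libc that has these. */
ssize_t compat_getdelim(char **lineptr, size_t *n, int delim, FILE *stream) {
    size_t pos = 0;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL) {
        errno = EINVAL;
        return -1;
    }
    if (*lineptr == NULL) {
        *n = 0;                       /* ignore a stale size for a NULL buffer */
    }
    while ((c = getc(stream)) != EOF) {
        if (pos + 1 >= *n) {          /* room for c plus the terminating '\0' */
            size_t new_size = *n < 128 ? 128 : *n * 2;
            char *new_ptr = realloc(*lineptr, new_size);
            if (new_ptr == NULL) {
                return -1;            /* *lineptr is still valid and caller-owned */
            }
            *lineptr = new_ptr;
            *n = new_size;
        }
        (*lineptr)[pos++] = (char)c;
        if (c == delim) {
            break;
        }
    }
    if (pos == 0) {
        return -1;                    /* nothing read: EOF or read error */
    }
    (*lineptr)[pos] = '\0';
    return (ssize_t)pos;
}

/* getline is then just getdelim with a newline delimiter. */
ssize_t compat_getline(char **lineptr, size_t *n, FILE *stream) {
    return compat_getdelim(lineptr, n, '\n', stream);
}
```

Keeping the result of realloc in a separate pointer means a failed reallocation does not lose the caller's buffer.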

Do people still write code any more?

Brookweed answered 9/4, 2009 at 18:55 Comment(8)
This works fine for short strings but may fail after reallocation. bufptr may get a new address and p needs to be kept at the same relative offset. In my tests (with MinGW), realloc may return several times with the same pointer (if there happens to be enough memory at that spot) or may return a new address on the first reallocation. The new address can be near in memory or a ways away, and can also be before the first address as well as after. IE it can make p a random number. To fix, put "offset = p - bufptr;" under the while EOF line, and "p = bufptr + offset;" after the if NULL block.Shinn
((p - bufptr) > (size - 1)) is a problem if size == 0 (and *lineptr was uncharacteristically non-NULL) as size - 1 is a large number. Suggest ((p - bufptr + 1) > size).Kermanshah
On my system malloc and realloc return void* pointers, so I had to add (char*) casts on those two lines.Macnair
You can't return -1 in size_t. This will fail miserably if something goes wrong.Pood
NOTICE THAT THIS getline IMPLEMENTATION IS VERY BROKEN as pointed out by @Todd. DO NOT USE ANYWHERE.Fluoroscope
See the one from my answer insteadFluoroscope
use ssize_t instead of size_t, and fix the broken "p" around the realloc.Livesay
Note that if realloc() fails, the memory is leaked. It is not safe to use oldptr = realloc(oldptr, newsize); — always use newptr = realloc(oldptr, newsize); if (newptr == NULL) …error handling…; oldptr = newptr;.Mcgruter
W
7

Use these portable versions from NetBSD: getdelim() and getline()

These come from libnbcompat in pkgsrc, and each file has a BSD license at the top. You need both because getline() calls getdelim(). Fetch the latest versions of both files, then modify them to fit into your program: you might need to declare getline() and getdelim() in one of your header files, and change both files to include your header instead of the nbcompat headers.

This version of getdelim() is portable because it calls fgetc(). For contrast, a getdelim() from a libc (like BSD libc or musl libc) would probably use private features of that libc, so it would not work across platforms.

In the years since POSIX 2008 specified getline(), more Unixish platforms have added the getline() function. It is rare that getline() is missing, but it can still happen on old platforms. A few people try to bootstrap NetBSD pkgsrc on old platforms (like PowerPC Mac OS X), so they want libnbcompat to provide missing POSIX functions like getline().

Wireworm answered 2/11, 2017 at 3:31 Comment(0)
L
3

If you are compiling for BSD, use fgetln() instead.

Lambart answered 9/7, 2011 at 19:33 Comment(0)
B
1

Try using fgets() instead of getline(). I was using getline() on Linux and it worked well until I migrated to Windows, where Visual Studio did not recognize getline(). So I replaced the character pointer with a character array, and the check against -1 with a check against NULL. See below:

#define CHARCOUNT 1000

Before:

char *line = (char*) malloc(CHARCOUNT);
size_t size = CHARCOUNT; /* getline reads this, so it must match the allocation */
FILE *fp = fopen(file, "r");
while(getline(&line, &size, fp) != -1) {
   ...
}
free(line);

After:

char line[CHARCOUNT];
while(fgets(line, CHARCOUNT, fp) != NULL) {
   ...
}
Bengali answered 1/7, 2020 at 8:53 Comment(1)
This will not handle lines longer than the maximum length passed to fgets(), lines longer than that will be split up. So it significantly changes the semantics of the program. Code using getline() implicitly expects to be able to read a line of any reasonable length, so replacing getline() with fgets() is a latent bug.Lieutenant
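
One way to keep fgets() while avoiding the split-line problem is to keep reading into a growing buffer until a newline appears. The following is only a sketch under that assumption: the helper name read_line is hypothetical, and unlike getline() it relies on strlen(), so it cannot handle input containing NUL bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd line (newline included when
   present), or NULL on EOF or allocation failure. The caller frees it. */
char *read_line(FILE *fp) {
    size_t cap = 128;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL) {
        return NULL;
    }
    /* fgets fills whatever space is left; keep going until a newline shows up */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            return buf;               /* complete line */
        }
        if (cap - len < 2) {          /* buffer full: double it and read on */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len == 0) {                   /* EOF before any character was read */
        free(buf);
        return NULL;
    }
    return buf;                       /* last line, no trailing newline */
}
```

This uses only standard C, so it should build with Visual Studio as well, at the cost of one allocation per line.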
M
0

Better Answer with no Bug Here:

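#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t; typedef it yourself where this header is missing */

ssize_t getline(char **lineptr, size_t *n, FILE *stream)
    {
        char *bufptr = NULL;
        char *p = bufptr;
        size_t size;
        int c;

        if (lineptr == NULL)
        {
            return -1;
        }
        if (stream == NULL)
        {
            return -1;
        }
        if (n == NULL)
        {
            return -1;
        }
        bufptr = *lineptr;
        size = *n;

        c = fgetc(stream);
        if (c == EOF)
        {
            return -1;
        }
        if (bufptr == NULL)
        {
            bufptr = malloc(128);
            if (bufptr == NULL)
            {
                return -1;
            }
            size = 128;
        }
        p = bufptr;
        while (c != EOF)
        {
            /* keep room for this character and the terminating '\0' */
            if ((size_t)(p - bufptr) + 2 > size)
            {
                size_t offset = p - bufptr;
                char *newptr = realloc(bufptr, size + 128);
                if (newptr == NULL)
                {
                    /* the old block is still valid: hand it back instead of leaking it */
                    *lineptr = bufptr;
                    *n = size;
                    return -1;
                }
                bufptr = newptr;
                size = size + 128;
                /* realloc may have moved the block, so recompute p */
                p = bufptr + offset;
            }
            *p++ = c;
            if (c == '\n')
            {
                break;
            }
            c = fgetc(stream);
        }

        *p = '\0';
        *lineptr = bufptr;
        *n = size;

        return p - bufptr;
    }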
Michaelmichaela answered 29/11, 2022 at 8:55 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Shafting