C getline() - how to deal with buffers / how to read unknown number of values into array
First of all, some background: I'm trying to read a list of integers from an external file and put them into an array. I am using getline to parse the input file line by line:

int lines = 0;
size_t * inputBuffer = (size_t *) malloc(sizeof(size_t));
char * storage = NULL;

I am calling getline like so:

getline(&storage, &s, input)

I heard from the man page on getline that if you provide a size_t * buffer, you can have getline resize it for you when it exceeds the byte allocation. My question is, what can you use this buffer for? Will it contain all of the items that you read with getline()? Is it simpler to read from this buffer, or to traverse the input in a different way when putting these integers into an array? Thanks!

Ferrick answered 7/2, 2012 at 5:35 Comment(0)
T
6

The buffer will only contain the last line you read with getline. The purpose is just to take a little bit of the effort of managing memory off your code.

If you call getline repeatedly with the same buffer, the buffer grows to fit the longest line read so far and stays that size. Each call replaces its contents with the next line.

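Note that the buffer itself is a char* (you pass its address, so getline receives a char**), not a size_t*. The size_t* argument only tells getline how large the buffer currently is.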
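
To make the reuse concrete, here is a minimal sketch, assuming a POSIX system where getline is available. The helper name sum_lines and its final_cap parameter are made up for illustration. It handles each line before the next call overwrites the buffer, and reports how large the buffer ended up.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: sums one integer per line, using ONE buffer
 * for every line. If final_cap is not NULL, it receives the buffer's
 * allocated size after the last call. */
long sum_lines(FILE *fp, size_t *final_cap)
{
    assert(fp != NULL);
    char *line = NULL;  /* NULL and 0 ask getline to allocate for us */
    size_t cap = 0;
    long sum = 0;

    while (getline(&line, &cap, fp) != -1) {
        /* line holds only the current line here, so it must be
         * consumed before the next call overwrites it */
        sum += strtol(line, NULL, 10);
    }

    if (final_cap != NULL)
        *final_cap = cap;  /* at least as large as the longest line */
    free(line);            /* one free for the whole loop */
    return sum;
}
```

Anything you want to keep from a line has to be parsed or copied inside the loop, because the buffer never accumulates earlier lines.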

Typecast answered 7/2, 2012 at 5:39 Comment(2)
Oh ok, I see now. So then is it possible to use getline() to iterate through the input once and count the number of lines, and then use getline to pass through once again and store the values into a malloc'ed array?Ferrick
@Ferrick If you fseek back to the start of the file (or close it and then open it again), yes. You could also use a linked list or other dynamically-growing structure to only need one pass.Typecast
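
A rough sketch of the two-pass idea from this comment, assuming a seekable file with one integer per line. The function name read_ints_two_pass is hypothetical, and strtol is used for parsing without any error checking.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: pass 1 counts the lines, pass 2 stores them.
 * Returns a malloc'ed array the caller must free, or NULL on failure.
 * *count receives the number of values stored. */
long *read_ints_two_pass(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);
    char *line = NULL;
    size_t cap = 0;
    size_t n = 0;

    /* Pass 1: only count the lines */
    while (getline(&line, &cap, fp) != -1)
        n++;

    /* Allocate at least one element so an empty file still works */
    long *values = malloc((n ? n : 1) * sizeof *values);
    if (values == NULL || fseek(fp, 0L, SEEK_SET) != 0) {
        free(values);
        free(line);
        return NULL;
    }

    /* Pass 2: fseek went back to the start, now parse each line */
    size_t i = 0;
    while (i < n && getline(&line, &cap, fp) != -1)
        values[i++] = strtol(line, NULL, 10);

    free(line);
    *count = i;
    return values;
}
```

This only works on streams that can be rewound, so it would fail on a pipe or on stdin from a terminal.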
S
13

This is not the correct use of getline. I strongly suggest reading its man page carefully.

You could have some code like

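FILE *inputfile = fopen("yourinput.txt", "r");  // check for NULL in real code
size_t linesiz = 0;    // allocated size of linebuf, maintained by getline
char *linebuf = NULL;  // getline allocates the buffer when this is NULL
ssize_t linelen = 0;   // length of the line just read, or -1 at EOF/error
while ((linelen = getline(&linebuf, &linesiz, inputfile)) > 0) {
  process_line(linebuf, linelen);
  // etc
  free(linebuf);
  linebuf = NULL;
}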

BTW, you could (and probably should) instead put

  free(linebuf);
  linebuf=NULL;

... after the while loop rather than inside it. That keeps the line buffer allocated from one line to the next, which is preferable in most cases because it avoids a fresh malloc inside getline for every line.

Note that getline is specified in the ISO/IEC TR 24731-2:2010 extension (see n1248); it is also part of POSIX.1-2008.

Sines answered 7/2, 2012 at 5:40 Comment(6)
This is basically what I had done in my code, but my main problem is that since the array's size is unknown, I first need to go through the file and count how many lines there are. Is it possible to do this somehow? while ((linelen=getline(&linebuf, &linesiz, inputfile))>0) { numOfLines++; } for(i = 0; i < numOfLines; i++){ addToArray(); }Ferrick
The process_line can actually be whatever you want. You could have a char**linearr pointer, and malloc it etc... Then use linearr[i++] = strdup(linebuf);Sines
The problem I am concerned with is that when you malloc you must have a defined size, correct? And if I am both counting the elements in the array and adding them to the array in one pass, I am unsure how to get the size if I am not using something like linked lists.Ferrick
But you can grow the array dynamically at runtime (by malloc-ing a bigger one, copying the contents of the old one into it, then freeing the old one and using the new, bigger one). You definitely don't need to count lines... Just grow the array when needed!Sines
Doesn't that take up a lot of processing time, to copy the whole array each pass through the for loop? I can definitely see that working though.Ferrick
No, you only occasionally copy an array of pointers (only when growing linearr). You won't copy each line (only a pointer to it).Sines
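
The growing-array approach from these comments could look roughly like the sketch below. It uses realloc in place of the manual malloc, copy, and free described above, which has the same effect. The function name read_all_lines, the starting capacity of 8, and the doubling policy are assumptions chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() and strdup() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: reads every line in a single pass.
 * Returns an array of strdup'ed lines and stores how many in *count.
 * The caller frees each line and then the array itself. */
char **read_all_lines(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);
    char **linearr = NULL;      /* array of pointers to lines */
    size_t used = 0, room = 0;  /* slots filled / slots allocated */
    char *linebuf = NULL;
    size_t linesiz = 0;

    while (getline(&linebuf, &linesiz, fp) != -1) {
        if (used == room) {
            /* Array is full: double it. Only pointers are moved,
             * never the line contents. */
            size_t newroom = room ? room * 2 : 8;
            char **bigger = realloc(linearr, newroom * sizeof *bigger);
            if (bigger == NULL)
                break;          /* keep what we have so far */
            linearr = bigger;
            room = newroom;
        }
        char *copy = strdup(linebuf);  /* own copy; linebuf gets reused */
        if (copy == NULL)
            break;
        linearr[used++] = copy;
    }

    free(linebuf);
    *count = used;
    return linearr;
}
```

Because the capacity doubles each time, the pointer array is reallocated only a handful of times even for large files, which is why the copying cost stays small.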