Freeing an array after it has been written to by MPI_Recv

I have a malloc'd array of integers that I fill with MPI_Recv

MPI_Recv(d.current, n, MPI_INT, 0, TAG_CURRENT_ARRAY, MPI_COMM_WORLD, &status);

I have tested the value of d.current both before and after the MPI_Recv and it doesn't change (which is correct).

My data arrives correctly.

However if I try to free the data I get an error:

*** Error in `./bin/obddhe-mpi': free(): invalid next size (fast): 0x0965e988 ***

The exact same free before the receive works perfectly.

I.e., this works:

free(d.current);
//MPI_Recv(d.current, n, MPI_INT, 0, TAG_CURRENT_ARRAY, MPI_COMM_WORLD, &status);

This fails:

MPI_Recv(d.current, n, MPI_INT, 0, TAG_CURRENT_ARRAY, MPI_COMM_WORLD, &status);
free(d.current);

What could MPI_Recv be doing that invalidates the free!?

Alain answered 20/1, 2014 at 10:44 Comment(0)

An SSCCE would be very helpful.

That said, I'll try to answer as well as I can:

I have a malloc'd array of integers that I fill with MPI_Recv

MPI_Recv(d.current, n, MPI_INT, 0, TAG_CURRENT_ARRAY, MPI_COMM_WORLD, &status);

How large is that array? How exactly did you malloc() it? What is n in this case and how is it related to the malloc()ed size?

Your observations show that MPI_Recv() triggers this error. For that to happen, MPI_Recv() must have written beyond the end of the malloc()ed memory area, which it must not do. This corrupts the linked list used internally by the memory manager, the size of the block that follows yours, or both, and that leads to the error you see.
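One way to catch this kind of overrun early is to keep the allocated element count next to the pointer and compare it with the count before every receive. The following is only a minimal sketch under that assumption; the `int_buffer` type and its helper functions are hypothetical names made up for illustration, not part of MPI or of the asker's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper type: stores the capacity (in ints) together with
   the pointer, so a receive count can be checked against it. */
typedef struct {
    int    *data;
    size_t  capacity;   /* number of ints that data can hold */
} int_buffer;

/* Allocate room for `capacity` ints; returns 0 on success, -1 on failure. */
int int_buffer_alloc(int_buffer *b, size_t capacity) {
    assert(b != NULL);
    b->data = malloc(capacity * sizeof *b->data);
    b->capacity = b->data ? capacity : 0;
    return b->data ? 0 : -1;
}

/* Returns 1 if n ints fit into the buffer, 0 otherwise. */
int int_buffer_fits(const int_buffer *b, int n) {
    return n >= 0 && (size_t)n <= b->capacity;
}

/* Release the memory and reset the bookkeeping. */
void int_buffer_free(int_buffer *b) {
    free(b->data);
    b->data = NULL;
    b->capacity = 0;
}
```

With something like this in place, the receive would only be issued when `int_buffer_fits(&buf, n)` returns 1, and `buf.data` would be passed as the receive buffer. A count that no longer matches the allocation would then be reported before any memory is overwritten.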

I have tested the value of d.current both before and after the MPI_Recv and it doesn't change (which is correct).

(How could it? You are passing the pointer itself to the function, not the pointer's address, so the function cannot change it.)

However if I try to free the data I get an error:

*** Error in `./bin/obddhe-mpi': free(): invalid next size (fast): 0x0965e988 ***

The exact same free before the receive works perfectly.

That is another clue for what I wrote above: the memory directly after the block you use is free and contains a pointer to the next free area. When you free() your memory, the library tries to merge the two free blocks; the second of them is corrupt, which leads to this error.

Imagine you have the following situation:

  • Your memory manager prepends each memory block, be it free or allocated, with its length.
  • The free blocks have the address of the next free block at their start - this is the linked list I mentioned.
  • Your allocated block, prepended with its length, is followed by
    • a free block, prepended with its length and containing the address of the next free block, or NULL if there is no next free block.

Then, if you write past the end of your memory block, the length and content of the next block will be touched and tampered with.

So far, this has no visible effect.

But if you call free() on your block, this block will be merged with the free block after it.

In order to do so, the following actions must occur:

  • Traverse the linked list in order to find adjacent free blocks - which already might lead to this error because the "next" pointer of the 2nd free block is garbage.
  • Calculate the size of the bigger free block from the sizes of the blocks being merged. If one of these contains garbage, the garbage is used to calculate the new, bigger free block size, and the confusion is complete.
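The imagined layout above can be modelled in a few lines of C. This is a toy sketch only: real allocators such as glibc's malloc use different and more elaborate bookkeeping, and every name here (`toy_arena`, `toy_fill`, `toy_free_ok`) is hypothetical. The whole "heap" is one int array, so the overrun stays inside memory the program owns and the example itself is well defined.

```c
#include <assert.h>

/* Toy model only; not how any real allocator lays out its memory. */
enum { ARENA_WORDS = 8, ALLOC_WORDS = 3, NO_NEXT = -1 };

typedef struct {
    /* [0]    length of the allocated block
       [1..3] payload of the allocated block
       [4]    length of the free block that follows
       [5]    index of the next free block (NO_NEXT if none)
       [6..7] rest of the free block */
    int words[ARENA_WORDS];
} toy_arena;

void toy_init(toy_arena *a) {
    for (int i = 0; i < ARENA_WORDS; i++) a->words[i] = 0;
    a->words[0] = ALLOC_WORDS;
    a->words[1 + ALLOC_WORDS] = ARENA_WORDS - ALLOC_WORDS - 2;
    a->words[2 + ALLOC_WORDS] = NO_NEXT;
}

/* Stand-in for a receive: writes n ints into the payload without
   checking n against ALLOC_WORDS, like a call with an oversized count. */
void toy_fill(toy_arena *a, int n) {
    assert(n >= 0 && 1 + n <= ARENA_WORDS);  /* stay inside the toy arena */
    for (int i = 0; i < n; i++) a->words[1 + i] = 1000 + i;
}

/* Stand-in for the check made while freeing: the header of the
   following block must still hold what toy_init wrote there. */
int toy_free_ok(const toy_arena *a) {
    int next = 1 + a->words[0];             /* header of the block after ours */
    int expected = ARENA_WORDS - next - 1;  /* length that block should have */
    return a->words[next] == expected && a->words[next + 1] == NO_NEXT;
}
```

Calling `toy_fill(&a, 3)` leaves the neighbouring header intact, so `toy_free_ok` returns 1. Calling `toy_fill(&a, 4)` overwrites the length of the free block, and nothing goes wrong until `toy_free_ok` inspects it and returns 0, which mirrors why the failure only shows up at free().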
Pestilent answered 20/1, 2014 at 10:54 Comment(9)
Thanks for the clues! You are right that I am overflowing the buffer (I was using a size value from a previous loop iteration). Dumb mistake :/ - Alain
+1 for that answer, but I have a doubt. Even if you write past the allocated memory, free() should only free the memory that was allocated with malloc(), i.e. the size given to malloc(). Further, this is an int array and not a character array terminated by '\0'. Isn't it possible that the error would crop up from MPI_Recv() alone, even without free()? - Flanch
@GauravSaxena No. If you overwrite a part of memory which doesn't belong to you, anything may happen. As said, most memory allocators manage their blocks of free and allocated memory with a linked list. If this is destroyed, such things may happen. - Pestilent
@glglgl: I agree, it's a bad, bad world of hardware/software out there. - Flanch
I didn't flag this, since you clearly spent time on this answer, but is it really an answer? Isn't it more of a suggestion or a comment? - Firstrate
@Zaibis How so? It explains how the action of MPI_Recv() could be responsible for this error. - Pestilent
@Pestilent That's the point: how it COULD be. So it is a suggestion, which isn't wanted as an answer; it belongs in a comment. But never mind, you did your job well ;) - Firstrate
@Zaibis a) A comment of this size isn't possible, and b) it is all we can say without knowing the internals of the said function. And the question was exactly "What could MPI_Recv be doing that invalidates the free!?", so what else could I answer? - Pestilent
@Pestilent You couldn't have done better; it's just a badly styled question by the OP. And of course you couldn't post this as a comment. As I said, that's why I didn't flag it, since it's an informative text anyway. But never mind now, I just wanted to add that. - Firstrate
