Segfault reading lazy bytestring past 2^18 bytes

Consider the following code: http://hpaste.org/90394

I am memory-mapping a large 460 MB file to a lazy ByteString. The ByteString reports a length of 471053056 bytes.

When nxNodeFromID file 110000 is changed to a lower node ID, e.g. 10000, it works perfectly. However, as soon as I try to serialize anything past exactly 2^18 bytes (262144) of the ByteString, I get "Segmentation fault/access violation in generated code" and the program terminates.

I'm running Windows and using GHC 7.4.2.

Please advise whether this is my fault, an issue with the laziness, or an issue with Haskell itself.
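One way to narrow this down is to check whether the 2^18-byte threshold lines up with a chunk boundary of the lazy ByteString. The following is only a diagnostic sketch: chunkEnds is a hypothetical helper name, and it relies on nothing beyond the bytestring package. It forces just the first n chunks, so the rest of the string stays untouched.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)

-- | Cumulative end offsets of the first n chunks of a lazy ByteString.
-- Only the first n chunks are inspected; later chunks are never forced.
-- (chunkEnds is a hypothetical helper, not part of any library.)
chunkEnds :: Int -> BL.ByteString -> [Int64]
chunkEnds n =
  scanl1 (+) . map (fromIntegral . BS.length) . take n . BL.toChunks
```

If 262144 shows up in chunkEnds 4 file for the mapped string, the crash coincides with the transition to a new chunk, which would point at the mapping code rather than at the parser.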

Jermaine answered 25/6, 2013 at 7:56 Comment(10)
Your getNXNode doesn't match the NXNode data definition. If that's intentional, it would be worth a comment. But I don't see how that would cause a segfault here.Lubricate
@DanielFischer NXNode 0 <$> ... :)Jermaine
Yes, but you skip 20 bytes, and read only 12 per node.Lubricate
Sorry, I misunderstood. getNXMetadata contributes to 2 (type) + 8 (data) optional bytes (which are null if they aren't used) totaling 20 for the size of the node (when added to the 4 + 4 + 2 read in getNXNode).Jermaine
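To make the 20-byte layout described above concrete, here is a minimal sketch using Data.Binary.Get. The field names, the little-endian byte order, and the tableOffset parameter are all assumptions for illustration; only the field widths (4 + 4 + 2, then 2 + 8) come from the comment above. This is not the code from the paste.

```haskell
import Control.Applicative ((<$>), (<*>))
import Data.Binary.Get (Get, getWord16le, getWord32le, getWord64le, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)
import Data.Word (Word16, Word32, Word64)

-- Hypothetical field names; only the widths are taken from the discussion.
data RawNode = RawNode
  { rnFieldA   :: Word32  -- 4 bytes
  , rnFieldB   :: Word32  -- 4 bytes
  , rnFieldC   :: Word16  -- 2 bytes
  , rnMetaType :: Word16  -- 2 bytes, zero when unused
  , rnMetaData :: Word64  -- 8 bytes, zero when unused
  } deriving (Eq, Show)

-- Size of one node record in bytes: 4 + 4 + 2 + 2 + 8.
nodeSize :: Int64
nodeSize = 20

-- Reads one full 20-byte record (little-endian is an assumption).
getRawNode :: Get RawNode
getRawNode =
  RawNode <$> getWord32le <*> getWord32le <*> getWord16le
          <*> getWord16le <*> getWord64le

-- Decodes the node with the given ID from a table starting at tableOffset.
nodeAt :: Int64 -> Int64 -> BL.ByteString -> RawNode
nodeAt tableOffset nodeId file =
  runGet getRawNode (BL.drop (tableOffset + nodeId * nodeSize) file)
```

Reading all 20 bytes in one parser, rather than reading 12 and skipping the rest, keeps the record size and the parser visibly in agreement.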
Makes sense. I suppose once getNXMetadata is complete, it becomes more or less obvious.Lubricate
Does the problem persist if you use mmapFileByteString instead of mmapFileByteStringLazy? (You'd need to wrap the returned strict ByteString of that in a fromChunks [buf] or so to get a lazy ByteString for the rest of your code to work with.)Lubricate
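The suggestion above can be sketched roughly as follows. It assumes the mmap package's System.IO.MMap module, where passing Nothing as the range maps the whole file; mmapWholeFileAsLazy is a hypothetical helper name.

```haskell
import qualified Data.ByteString.Lazy as BL
import System.IO.MMap (mmapFileByteString)

-- | Maps the whole file as one strict ByteString and wraps it in a
-- single-chunk lazy ByteString, so code written against the lazy API
-- keeps working. (Hypothetical helper name.)
mmapWholeFileAsLazy :: FilePath -> IO BL.ByteString
mmapWholeFileAsLazy path = do
  strict <- mmapFileByteString path Nothing  -- Nothing: map the entire file
  return (BL.fromChunks [strict])
```

The result has exactly one chunk, so the whole file is mapped up front instead of chunk by chunk.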
Yes, that works (however, I think I lose laziness now?). Is this segfault a weird glitch in GHC?Jermaine
Sure, you lose laziness. But the idea was to get a hint what the problem could be. I suspect it may be "-- FIXME: might be we need NOINLINE pragma here, investigate later" (concerning mapChunk handle (offset,size) = unsafePerformIO $), but it could be something else. Can't really investigate, though. (Well, if I had to, I could boot into Windows; but I'd still need a file to work on, and a great incentive to expose myself to the inconvenience.)Lubricate
Sorry, that's not my area of expertise. Besides, I don't think a Windows core-dump would work with gdb on Linux. You could try to contact the maintainer of mmap, he should have a better idea what the problem might be.Lubricate
BTW you should make the fields of your type strict. Better semantics.Truditrudie
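As a small illustration of the strict-fields advice, compare a record with lazy fields to one with strictness annotations. The type and field names here are hypothetical and are not taken from the paste.

```haskell
import Data.Word (Word32)

-- Lazy fields: each field may hold an unevaluated thunk, which can keep
-- a reference to the input buffer alive until it is forced.
data LazyNode = LazyNode
  { lazyName  :: Word32
  , lazyChild :: Word32
  }

-- Strict fields: the bang forces each field when the constructor is
-- evaluated, so a constructed node holds plain values.
data StrictNode = StrictNode
  { strictName  :: !Word32
  , strictChild :: !Word32
  }
```

With LazyNode, an undefined field goes unnoticed until something reads it; with StrictNode, evaluating the constructor fails immediately.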

Note that I have updated mmap to include a NOINLINE pragma at the strategic point in the code. mmap-0.5.9 is up for grabs. Let me know if the issue persists. Edit: yes, I'm the author of mmap.
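For readers unfamiliar with the pattern, here is a generic sketch of why NOINLINE matters around unsafePerformIO. It is not the code from mmap; fakeMapChunk and mapCount are hypothetical stand-ins for a function that performs a side effect, such as mapping a file region, behind a pure interface.

```haskell
import Data.IORef (IORef, atomicModifyIORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Counts how often the side effect runs. NOINLINE ensures there is only
-- one IORef rather than a fresh one at every use site.
{-# NOINLINE mapCount #-}
mapCount :: IORef Int
mapCount = unsafePerformIO (newIORef 0)

-- Hypothetical stand-in for a mapping function with a pure type.
-- Without NOINLINE, GHC may inline the body at each call site and the
-- hidden I/O could then be performed more than once.
{-# NOINLINE fakeMapChunk #-}
fakeMapChunk :: Int -> Int -> Int
fakeMapChunk offset size = unsafePerformIO $ do
  atomicModifyIORef mapCount (\n -> (n + 1, ()))  -- the hidden side effect
  return (offset + size)
```

The documentation for unsafePerformIO recommends a NOINLINE pragma on any function that calls it, for exactly this reason.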

Tiny answered 5/9, 2013 at 17:1 Comment(2)
While it appears you're the mmap author, this is not totally clear from your answer. I would consider adding more information.Signally
Hi, I have the same error with the latest version of your library. #53715638Cirri
