Read file as bytestring and write this bytestring to a file: issue on a network drive

Consider the following simple Haskell program, which reads a file as a bytestring and writes the file tmp.tmp from this bytestring:

module Main
  where
import System.Environment (getArgs)
import qualified Data.ByteString.Lazy as B

main :: IO ()
main = do
  [file] <- getArgs
  bs <- B.readFile file      -- lazy read: the file may stay open after this returns
  B.writeFile "tmp.tmp" bs
  putStrLn "done"

It is compiled to an executable named tmptmp.

I have two drives on my computer: the C drive and the U drive. The U drive is a network drive, and it is currently offline.

Now, let's try tmptmp.

When I run it from C, there is no problem. Below I run it twice, first with a file on C and then with a file on U:

C:\HaskellProjects\imagelength> tmptmp LICENSE
done

C:\HaskellProjects\imagelength> tmptmp U:\Data\ztemp\test.xlsx
done

Now I run it from U, with a file on the C drive, no problem:

U:\Data\ztemp> tmptmp C:\HaskellProjects\imagelength\LICENSE
done

The problem occurs when I run it from U with a file on the U drive:

U:\Data\ztemp> tmptmp test.xlsx
tmptmp: tmp.tmp: openBinaryFile: resource busy (file is locked)

If in my program I use strict bytestrings instead of lazy bytestrings (by replacing Data.ByteString.Lazy with Data.ByteString), this problem does not occur anymore.

I'd like to understand that. Any explanation? (I would particularly like to know how to solve this issue while still using lazy bytestrings.)

EDIT

To be perhaps more precise, the problem still occurs with this program:

import System.Environment (getArgs)
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

main :: IO ()
main = do
  [file] <- getArgs
  bs <- LB.readFile file     -- lazy read
  SB.writeFile "tmp.tmp" (LB.toStrict bs)
  putStrLn "done"

while the problem disappears with:

  bs <- SB.readFile file
  LB.writeFile "tmp.tmp" (LB.fromStrict bs)

It looks like the point causing the problem is the laziness of readFile.

Oceania answered 17/6, 2017 at 18:47 Comment(7)
1. Does it work if you give it an absolute path (i.e. cd U:/ ; tmptmp U:/<..>/test.xlsx)? (Who knows, this could be it. Windows is weird sometimes.) 2. What do you mean by "this network drive is offline"? I'd like to try to reproduce but I'm not sure how one accesses a network drive which is offline (clearly I misunderstand the meaning of 'offline' here!). 3. Why do you need to use lazy BS? It seems you've discovered that Strict is the right tool for the job. 4. Does it work if you force the input (i.e. evaluate (length bs) before the write)?Smasher
Hi @user2407038. 1) No. 2) This is my work laptop and I'm not connected to the domain. In Windows Explorer there is a button "Work offline / Work online". Click on "Work offline" if you want to reproduce. 3) This is just a minimal reproducible example. In real life, I'm using the xlsx library, which deals with lazy bytestrings. 4) I didn't know the evaluate function, I'll try.Typehigh
2) Or simply disconnect your computer from Internet.Typehigh
I've just solved my real-life issue by using the strategy of the last point of my edit, with SB.readFile then LB.fromStrict. But obviously that does not provide an explanation.Typehigh
Unfortunately, I can't reproduce (on W7). I think it is because I don't have an actual remote location which I can access this way, but Windows allowed me to "Map network drive" with a local (shared) folder. With this setup, there is no "Work offline" button, and it worked just fine with the lazy ByteString.Smasher
Thank you for trying to reproduce @user2407038. I'm using W7 as well. There's something more I did not mention, which could be the reason, although that would be weird. It seems that this network drive is misconfigured. For example, according to the file.access function of R, the drive U is not writeable (while it is writeable). But it would be strange if U were recognized as unwriteable only when I use a lazy bytestring.Typehigh
The same with getPermissions of System.Directory: getPermissions "U:/Data" indicates that U:/Data is not writable... hmm but however it indicates that U:/Data/zTemp is writable.Typehigh

As per the most recent Data.ByteString.Lazy docs:

Using lazy I/O functions like readFile or hGetContents means that the order of operations such as closing the file handle is left at the discretion of the RTS.

The example given with the offline network drive presumably leads to the RTS continuing from readFile without closing the file. The docs, which have an almost identical example, say that

When writeFile is executed next, [tmp.tmp] is still open for reading and the RTS takes care to avoid simultaneously opening it for writing, instead returning the error.
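
If the open handle is indeed the cause, one way to keep lazy bytestrings while controlling when the handle closes is to force the whole read inside withBinaryFile. The following is only a sketch: readFileForced and copyViaLazy are hypothetical helper names (not part of bytestring), and whether this avoids the error on an offline network drive is an untested assumption.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString.Lazy as LB
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | Read a whole file as a lazy ByteString, but force every chunk
-- while the handle is still open, so the handle is closed on return.
-- (readFileForced is a hypothetical helper, not part of bytestring.)
readFileForced :: FilePath -> IO LB.ByteString
readFileForced path =
  withBinaryFile path ReadMode $ \h -> do
    contents <- LB.hGetContents h
    -- LB.length walks every chunk, which drives the lazy read to EOF.
    _ <- evaluate (LB.length contents)
    return contents

-- | Copy a file via a lazy ByteString without leaving the source open.
-- (copyViaLazy is also a hypothetical name used for illustration.)
copyViaLazy :: FilePath -> FilePath -> IO ()
copyViaLazy src dst = readFileForced src >>= LB.writeFile dst
```

In the question's program this would amount to replacing B.readFile file with readFileForced file. The cost is that the whole file is held in memory before the write starts, so the streaming benefit of lazy I/O is lost.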

As far as I am aware, there is no solution to this in Data.ByteString.Lazy; both your solution (using the strict read) and other packages are suggested in the docs. Sometimes reading and writing the same file can work, but you have no guarantee.
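
For reference, the strict-read workaround can be wrapped in a small helper so that the rest of the program (for example, code using a library that expects lazy bytestrings) does not change. This is a minimal sketch; readFileAsLazy is a hypothetical name, not a function from bytestring.

```haskell
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

-- | Read the file strictly (the handle is closed before SB.readFile
-- returns), then wrap the bytes as a lazy ByteString for APIs that
-- expect one.
-- (readFileAsLazy is a hypothetical name used for illustration.)
readFileAsLazy :: FilePath -> IO LB.ByteString
readFileAsLazy path = LB.fromStrict <$> SB.readFile path
```

The result is a lazy ByteString made of a single strict chunk, so the whole file sits in memory; that is the price for a deterministic close.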

Haematogenous answered 15/5, 2021 at 7:5 Comment(0)
