How to use a BufferedWriter in Python?

I am facing the following problem: I'm trying to implement a simulator for supply chains. These will produce a lot of EPCIS Events (events that occur at RFID readers). These events should then be written to a csv file in order to load them into any database and run analytical algorithms on them.

The simulator is implemented in Python and works fine. What I'm now trying to do is buffer the writing of events to the file in order to reduce the time spent accessing the disk. Browsing the Python documentation I stumbled upon io.BufferedWriter, which sounds like exactly the thing I was looking for. However, I can't quite get it to work.

Here's what I have done so far. I implemented my CsvWriter class, which inherits from RawIOBase and manages the file handle. When it is instantiated, it creates a BufferedWriter, passing itself as the raw parameter (might that already be a problem?).

class CsvWriter(AbstractWriter):

    def __init__(self, filename):    
        self.filename = filename
        self.file = self.openFile()
        self.buffer = BufferedWriter(self, settings.WRITE_THRESHOLD)

When I now want to write something, I call write_buffered, which looks like this:

def write_buffered(self, data_dict):    
        self.buffer.write(b';'.join(map(str, data_dict.values())) + '\n')

The actual write method, which (as I figured) needs to be implemented on the CsvWriter class itself, looks like this:

def write(self, data):
        if self.file.closed:
            self.file = self.openFile()

        return self.file.write(data)

The problem is that when I try to run the simulator, I get the following error:

IOError: raw write() returned invalid length -1 (should have been between 0 and 78)

Do any of you have a clue for me how to fix this?

Westernism answered 10/1, 2012 at 15:47 Comment(3)
Ordinary file I/O (e.g. open) should already be buffered (open(...) gives me a Buffered{Reader,Writer} instance), the underlying C library probably has a buffer, the OS probably has a buffer, and the disk probably has a cache. Do you really need to add anything yourself? - Choreography
OK, but will this use as much RAM as possible before writing to disk? That is basically what I'm trying to achieve: keeping as much in memory as possible before I write out. - Westernism
You can specify the desired buffer size for a file when opening it (the third argument to open). There is no need to create your own buffering if you only have basic needs. Note: if you don't specify a size, it will use the system's default. See the documentation for open. - Flieger
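
The buffer-size suggestion in the comments above can be sketched roughly as follows. This is only an illustration, not the asker's code: the function name write_events, the sample events, the 1 MiB buffer size and the temporary path are all assumptions made up for the example. It uses the standard csv module with ';' as delimiter instead of joining strings by hand.

```python
import csv
import os
import tempfile

def write_events(path, events, buffer_size=1024 * 1024):
    """Write dict rows to a ';'-separated file using open()'s own buffering."""
    # buffering=N (N > 1) sets the size in bytes of the underlying binary buffer.
    # newline="" is what the csv documentation recommends for csv.writer files.
    with open(path, "w", buffering=buffer_size, newline="") as f:
        writer = csv.writer(f, delimiter=";")
        for event in events:
            writer.writerow(event.values())

# Hypothetical sample events, for illustration only.
events = [
    {"epc": "urn:epc:id:sgtin:1", "reader": "dock-1"},
    {"epc": "urn:epc:id:sgtin:2", "reader": "dock-2"},
]
path = os.path.join(tempfile.mkdtemp(), "events.csv")
write_events(path, events)
```

Note that a larger buffer only reduces how often data is handed to the operating system; whether that is measurably faster depends on the workload, so it is worth timing before committing to a size.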

To actually answer the question, in case you really do want to use BufferedWriter:

import io

buffer_size = 64 * 1024  # number of bytes; example value, pick your own

# Open the file unbuffered because we don't want to buffer twice.
with open("myfile.bin", "wb+", 0) as raw_file:
  with io.BufferedWriter(raw_file, buffer_size) as buffered_file:
    buffered_file.write(b"<your data>")

This works for binary files. If you want to do the same with text, you also need a TextIOWrapper.

import io

buffer_size = 64 * 1024  # number of bytes; example value, pick your own

# Open the file unbuffered because we don't want to buffer twice.
with open("myfile.txt", "wb+", 0) as raw_file:
  with io.BufferedWriter(raw_file, buffer_size) as buffered_file:
    # Again, turn off buffering because we're handling that.
    with io.TextIOWrapper(buffered_file, write_through=True) as buffered_text_file:
      buffered_text_file.write("<your data>")
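
Applied to the CSV use case from the question, the same stack might look like the sketch below. The sample rows, the buffer size, the UTF-8 encoding and the temporary path are assumptions for illustration; the only addition to the code above is a csv.writer on top of the text wrapper.

```python
import csv
import io
import os
import tempfile

buffer_size = 64 * 1024  # example value (an assumption); tune for your workload
path = os.path.join(tempfile.mkdtemp(), "events.csv")  # hypothetical location

# Hypothetical sample rows standing in for the simulator's events.
rows = [{"epc": "A1", "reader": "dock-1"}, {"epc": "B2", "reader": "dock-2"}]

# Open the file unbuffered so that BufferedWriter is the only buffer.
with open(path, "wb", 0) as raw_file:
    with io.BufferedWriter(raw_file, buffer_size) as buffered_file:
        # newline="" leaves the csv module's own line endings untouched.
        with io.TextIOWrapper(buffered_file, encoding="utf-8", newline="",
                              write_through=True) as text_file:
            writer = csv.writer(text_file, delimiter=";")
            for row in rows:
                writer.writerow(row.values())
```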

On Python 3.10 and later, nested with statements can be written as

with (
  context_manager1 as foo,
  context_manager2 as bar,
  ...
):
  ...

to avoid the deep nesting, but if you don't know the shorthand it can look confusing.
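
As for the approach in the question of passing your own object as the raw parameter: I can't tell from the error message alone what went wrong there, but the documented contract is that RawIOBase.write returns the number of bytes written, and BufferedWriter relies on that value. A minimal sketch of a raw object that honours this contract is shown below. The class name CountingRaw, its raw_writes counter, the buffer size and the temporary path are all made up for the example.

```python
import io
import os
import tempfile

class CountingRaw(io.RawIOBase):
    """Hypothetical raw writer that forwards bytes to an unbuffered file."""

    def __init__(self, path):
        super().__init__()
        self._file = open(path, "wb", buffering=0)
        self.raw_writes = 0  # how many times the buffer reached the file

    def writable(self):
        # BufferedWriter refuses raw objects that do not report as writable.
        return True

    def write(self, data):
        self.raw_writes += 1
        # Pass on the byte count returned by the unbuffered file,
        # because BufferedWriter uses this return value.
        return self._file.write(data)

    def close(self):
        if not self.closed:
            self._file.close()
        super().close()

path = os.path.join(tempfile.mkdtemp(), "events.csv")  # hypothetical location
raw = CountingRaw(path)
with io.BufferedWriter(raw, buffer_size=4096) as buffered:
    for i in range(100):
        buffered.write(b"event;%d\n" % i)

print(raw.raw_writes, "raw write(s) for 100 buffered writes")
```

The counter is only there to make the effect of the buffer visible: the 100 small writes are collected in memory and handed to the raw object in far fewer calls.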

Eleonoreeleoptene answered 30/5 at 23:44
