Using "with" statement for CSV files in Python

Is it possible to use the with statement directly with CSV files? It seems natural to be able to do something like this:

import csv
with csv.reader(open("myfile.csv")) as reader:
    # do things with reader

But csv.reader doesn't provide the __enter__ and __exit__ methods, so this doesn't work. I can however do it in two steps:

import csv
with open("myfile.csv") as f:
    reader = csv.reader(f)
    # do things with reader

Is this second way the ideal way to do it? Why wouldn't they make csv.reader directly compatible with the with statement?

Remarkable answered 13/1, 2009 at 22:36 Comment(1)
As mentioned below, it doesn't really make sense for a csv reader. But it sure does for a writer !Fazeli
A
22

The primary use of the with statement is exception-safe cleanup of an object used in the statement. with makes sure that files are closed, locks are released, contexts are restored, etc.

Does csv.reader have anything to clean up in case of an exception?

I'd go with:

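import csv

# newline="" is what the csv module recommends when opening files
with open("myfile.csv", newline="") as f:
    for row in csv.reader(f):
        print(row)  # process row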
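
As a rough illustration of the cleanup that with provides here, the sketch below writes a small temporary file (the file name and contents are made up for the example), raises an error part-way through reading it, and checks that the file was still closed:

```python
import csv
import os
import tempfile

# Create a throwaway CSV file; its name and contents are purely illustrative.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([["a", "b"], ["1", "2"]])

try:
    with open(path, newline="") as f:
        for row in csv.reader(f):
            raise RuntimeError("simulated failure while processing a row")
except RuntimeError:
    pass

# The with block closed the file even though the loop raised.
file_was_closed = f.closed
print(file_was_closed)
```

The reader itself holds nothing that needs releasing; the only resource in play is the file object, and that is what the with block manages.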

You don't need to submit a patch to use csv.reader and the with statement together; contextlib.contextmanager lets you build the context manager yourself.

import contextlib

Help on function contextmanager in module contextlib:

contextmanager(func)
    @contextmanager decorator.

Typical usage:

    @contextmanager
    def some_generator(<arguments>):
        <setup>
        try:
            yield <value>
        finally:
            <cleanup>

This makes this:

    with some_generator(<arguments>) as <variable>:
        <body>

equivalent to this:

    <setup>
    try:
        <variable> = <value>
        <body>
    finally:
        <cleanup>

Here's a concrete example of how I've used it: curses_screen.

Americano answered 14/1, 2009 at 0:24 Comment(4)
@J.F.Sebastion: I think that all the "Data Compression" and "File Format" library modules should directly support with.Lilithe
@S.Lott: I agree that the standard library should create context managers itself, where applicable. In the case of the csv module it could be reader = csv.open(path) but not reader = csv.reader(iterable).Americano
@J.F. Sebastian +1 for in-depth explanation on how one might use contextlib to accomplish this, but check the response from @bluce for an actual implementation to use with csv.Hamo
@technomalogical: Such an implementation is interesting only if it is in the stdlib. @bluce's implementation can't be in the stdlib. Its interface is too simplistic (some things to consider: encoding, buffering, restrictions on file-mode values, etc.). It saves 1 line of code but complicates the interface.Americano
L
4

Yes. The second way is correct.

As to why? Who knows. You're right, it's probably an easy change; it's just not as high a priority as other things.

You can easily make your own patch kit and submit it.

Lilithe answered 13/1, 2009 at 22:45 Comment(2)
I'm not sure this is a "patchable offense". The CSV reader is meant to act on an open file object and provide an iterable of rows -- there's no real resource acquisition and release going on. If you want to get out of the with block quickly, do rows = list(csv.reader(file_)) and use rows outside it.Tyler
@cdleary: I think that the response to with does not have to reflect ACTUAL resource use, but only "resource"-like. All the "Data Compression" and "File Format" library modules should do this for simple consistency.Lilithe
R
2

The problem is csv.reader doesn't really manage a context. It can accept any iterable, not just a file. Therefore it doesn't call close on its input (incidentally if it did you could use contextlib.closing). So it's not obvious what context support for csv.reader would actually do.
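
A short sketch of both points, using made-up data: csv.reader parses a plain list of strings, where there is nothing to close, and contextlib.closing only helps for objects that do have a close method, such as the io.StringIO used here as a stand-in for a file.

```python
import csv
import io
from contextlib import closing

# Any iterable of strings works; there is no resource to release here.
lines = ["name,age", "Ada,36"]
rows_from_list = list(csv.reader(lines))

# closing() calls .close() on exit, so it applies to the underlying
# file-like object, not to the csv reader itself.
with closing(io.StringIO("name,age\nAda,36\n")) as buf:
    rows_from_buffer = list(csv.reader(buf))

print(rows_from_list)
print(buf.closed)
```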

Reverberation answered 14/1, 2009 at 4:57 Comment(0)
D
1
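import csv

class CSV(object):
    def __init__(self, path, mode):
        self.path = path
        self.mode = mode
        self.file = None

    def __enter__(self):
        # newline='' lets the csv module handle line endings itself
        self.file = open(self.path, self.mode, newline='')
        if self.mode == 'r':
            return csv.reader(self.file)
        elif self.mode == 'w':
            return csv.writer(self.file)
        # any other mode is unsupported; close the file instead of leaking it
        self.file.close()
        raise ValueError("mode must be 'r' or 'w'")

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.file.close()

with CSV('data.csv', 'r') as reader:
    for row in reader:
        print(row)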
Delorenzo answered 1/6, 2011 at 21:11 Comment(0)
S
0

It's easy to create what you want using a generator function:


import csv
from contextlib import contextmanager

@contextmanager
def opencsv(path):
   yield csv.reader(open(path))

with opencsv("myfile.csv") as reader:
   # do stuff with your csvreader
Shirker answered 14/1, 2009 at 2:1 Comment(1)
That doesn't actually close the file in the case of an exception. It makes the with statement valid... but that's all it does.Neodymium
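
One way to address that, sketched here under the assumption that the helper should own the file it opens (opencsv_closing and its reader_kwargs parameter are hypothetical names, and the temporary file is made up for the demonstration), is to open the file inside a with block in the generator, so it is closed whether or not the body raises:

```python
import csv
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opencsv_closing(path, **reader_kwargs):
    # Hypothetical helper: the inner with block closes the file when the
    # caller's with block exits, including when it exits via an exception.
    with open(path, newline="") as f:
        yield csv.reader(f, **reader_kwargs)

# Throwaway input file so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as out:
    csv.writer(out).writerows([["x", "y"], ["1", "2"]])

with opencsv_closing(path) as reader:
    rows = list(reader)

print(rows)
```

This still saves only one line over the plain two-step form from the question, so it is mostly useful when the same opening options are repeated in many places.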
H
0

As pointed out by other answers, the second way is the better option. I also don't think a file context handle will be added to csv in the future, because the csv module expects an Iterable[str], not a typing.IO.

However, what is missing is a csv_open that provides full functionality and correct file-context handling.

import csv
import typing
from pathlib import Path
from typing import Optional


class csv_open:
    """
    A class that is useless in most cases, but Stack Overflow threads have
    triggered me.
    """

    def __init__(
        self,
        file: str | Path,
        mode: str = "rt",
        buffering: int = 1,
        encoding: Optional[str] = None,
        errors: Optional[str] = None,
        newline: Optional[typing.Literal["", "\n", "\r", "\r\n"]] = "",
    ):
        """

        Notes:
            If you do not specify a reader or writer, a normal csv.reader is
            created for mode r and a csv.writer is created for modes w,
            a and x.

            If you want a dict-reader or dict-writer, or want to open the
            file in reading and writing mode, you have to specify the reader
            or writer explicitly.

        Examples:
            >>> with csv_open('names.csv', mode='rt') as reader:
            ...    for row in reader:
            ...        print(row)

            >>> with csv_open("names.csv", mode="rt").dict_reader(
            ...     fieldnames=['firstname', 'surname']) as reader:
            ...     for row in reader:
            ...         print(f"{row['firstname']} {row['surname']}")


        Args:
            file: path of the csv file
            mode: r-reading, w-writing, a-appending, x-create-write
            buffering: 0 is forbidden, 1 is per-line buffering,
                greater than 1 is byte-block buffering.
            encoding: any supported python encoding, defaults to the
                platform-dependent encoding used by open().
            errors: Indicates the error handling for encoding errors; run
                help(codecs.Codec) for more information.
            newline: indicates if line endings should be converted; the
                csv module recommends newline='', which is the default here
        """
        self._newline = newline
        self._errors = errors
        self._encoding = encoding
        self._buffering = buffering
        self._mode = mode
        self._file = Path(file)
        self._fd = None
        self._reader = None
        self._writer = None

    def reader(self, dialect: str = "excel", **kwargs):
        """
        Indicates that a csv-reader should be created. This method is for
        setting parameters explicitly. The method is called implicitly if no
        other reader or writer is set and the file opening mode is `r`.

        Args:
            dialect: the dialect that is used by the reader
            kwargs: used to overwrite variables set in dialect

        Returns:
            Self
        """
        self._reader = csv.reader(self._open(), dialect=dialect, **kwargs)
        return self

    def writer(self, dialect: str = "excel", **kwargs):
        """
        Indicates that a csv-writer should be created. This method is for
        setting parameters explicitly. The method is called implicitly if no
        other reader or writer is set and the file opening mode is `w`, `a`
        or `x`.

        Args:
            dialect: the dialect that is used by the writer
            **kwargs: used to overwrite variables set in dialect

        Returns:
            Self
        """
        self._writer = csv.writer(self._open(), dialect=dialect, **kwargs)
        return self

    def dict_reader(
        self,
        fieldnames: Optional[list[str]] = None,
        restkey: typing.Any = None,
        restval: typing.Any = None,
        dialect: str = "excel",
        **kwargs,
    ):
        """
        Indicates that a csv-DictReader should be created. This method is
        never called implicitly.

        Args:
            fieldnames: the fieldnames of the dict, if None the first line
                of the csv-file is used.
            restkey: the key name for values that have no field-name mapping
            restval: the value used for field-names without a value
            dialect: the dialect that is used by the reader
            **kwargs: used to overwrite dialect variables

        Returns:
            Self
        """
        self._reader = csv.DictReader(
            self._open(),
            fieldnames=fieldnames,
            restkey=restkey,
            restval=restval,
            dialect=dialect,
            **kwargs,
        )
        return self

    def dict_writer(
        self,
        fieldnames: list[str],
        restval: typing.Any = None,
        dialect: str = "excel",
        extrasaction: typing.Literal["raise", "ignore"] = "raise",
        **kwargs,
    ) -> typing.Self:
        """
        Indicates that a csv-DictWriter should be created. This method is
        never called implicitly.

        Args:
            fieldnames: the fieldnames of the dict, in the order the columns
                are written to the csv file.
            restval: the value used for fieldnames that do not appear in the
                dict.
            dialect: the dialect that is used by the writer
            extrasaction: failure strategy for dict keys that are not
                fieldnames
            **kwargs: used to overwrite variables set in dialect

        Returns:
            Self
        """
        self._writer = csv.DictWriter(
            self._open(),
            fieldnames=fieldnames,
            restval=restval,
            extrasaction=extrasaction,
            dialect=dialect,
            **kwargs,
        )
        return self

    def _open(self):
        if self._fd is not None:
            raise ValueError("Only one reader or writer is allowed.")

        self._fd = self._file.open(
            mode=self._mode,
            buffering=self._buffering,
            encoding=self._encoding,
            errors=self._errors,
            newline=self._newline,
        )
        return self._fd

    # csv.reader and csv.writer are factory functions, not classes, so they
    # cannot be combined with | in an annotation; typing.Any is used instead.
    def __enter__(self) -> typing.Any:
        if self._reader is not None:
            return self._reader
        elif self._writer is not None:
            return self._writer
        elif "w" in self._mode or "a" in self._mode or "x" in self._mode:
            self.writer()
            return self._writer
        elif "r" in self._mode:
            self.reader()
            return self._reader
        # handle any forgotten or invalid modes
        raise RuntimeError("Please call a reader or writer constructor")

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self._fd is not None:
            self._fd.close()
Hostility answered 28/6, 2024 at 13:10 Comment(0)
