fread and a quoted multi-line column value

> fread('col1,col2\n')
Empty data.table (0 rows) of 2 cols: col1,col2
> fread('col1,col2\n5,4')
   col1 col2
1:    5    4
> fread('col1,col2\n5,"4\n3"')
Error in fread("col1,col2\n5,\"4\n3\"") : 
  Unbalanced quote (") observed on this line: 3"
> 

read.csv can import this csv as long as the value that spans multiple lines is wrapped in quotes.

Should fread be able to import it as well? Using read.csv is actually fine for my use case. I can just convert the resulting data frame into a data table. But I just wanted to make sure that not having this functionality was a design decision, and not something that just wasn't yet tested.
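The read.csv workaround described above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the variable names (csv_text, df, dt) are made up for the example, and the conversion step assumes the data.table package is installed.

```r
# Same CSV content as in the failing fread call: the second field of the
# data row is quoted and contains a line break.
csv_text <- 'col1,col2\n5,"4\n3"'

# Base R's read.csv keeps the quoted line break inside the field.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Convert to a data.table only if the package is available (assumption:
# data.table is installed; otherwise this step is skipped).
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(df)
}
```

The text argument just avoids writing a temporary file; passing a file path to read.csv should behave the same way.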

Tripletail answered 8/1, 2014 at 21:14

UPDATE: Now fixed in v1.9.3 on GitHub:

  • fread() now accepts line breaks inside quoted fields. Thanks to Clayton Stanley for highlighting.
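For anyone wanting to check the fix locally, a sketch along these lines should do. It assumes data.table v1.9.3 or later is installed; csv_text and dt are illustrative names, and the exact printed layout of the result is not shown because it can vary between versions.

```r
library(data.table)  # assumption: v1.9.3 or later, where the fix landed

# The input from the question: one data row whose second field is quoted
# and spans two lines.
csv_text <- 'col1,col2\n5,"4\n3"'

# With the fix in place this should parse instead of raising the
# "Unbalanced quote" error.
dt <- fread(csv_text)

nrow(dt)   # number of data rows parsed
dt$col2    # the quoted field, line break included
```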



This error has been reported before and is on the to-do list. What's new here is the \n inside the quotes: I hadn't realised that was a use case giving rise to the error.

Many thanks for reporting. It'll be fixed.

A similar, but not identical, question is here:

data.table::fread and Unbalanced "

and the bug report is here:

https://r-forge.r-project.org/tracker/?group_id=240&atid=975&func=detail&aid=2694

Silage answered 9/1, 2014 at 0:10 Comment(5)
@Matt: I've seen that the referenced bug has been fixed, but could you please verify whether the fix also covers the multi-line \n issue? I think I just hit this exact problem with data.table_1.9.2 on R 3.1.0 x86_64-unknown-linux-gnu. - Tarantass
@Tarantass Sorry, no. \n inside a quoted field isn't fixed yet. I just checked and it is still on the list at the top of the fread.c source file. - Silage
@Tarantass \n inside a quoted field is now fixed and tests have been added. - Silage
@Matt Dowle Awesome, thank you very much for such great news! See you at useR! 2014 soon :) - Tarantass
@Tarantass Ok great, see you at useR!, please say hello. - Silage
