'Incomplete final line' warning when trying to read a .csv file into R
I

17

138

I'm trying to read a .csv file into R, and when I run this command:

pheasant<-read.table(file.choose(),header=TRUE,sep=",")

I get this warning message:

"incomplete final line found by readTableHeader on 'C:\Documents and Settings..."

There are a couple of things I thought may have caused this warning, but unfortunately I don't know enough about R to diagnose the problem myself so I thought I'd post here in the hope someone else can diagnose it for me!

  • the .csv file was originally an Excel file, which I saved into .csv format
  • the file comprises three columns of data
  • each data column is of a differing length, i.e. there are a different number of values in each column
  • I want to compare the means (using t-test or equivalent depending on normal / not normal distribution) of two of the columns at a time, so for example, t-test between column 1 values and column 2 values, then a t-test of column 1 and column 3 values, etc.

Any help or suggestions would be seriously appreciated!

Inoue answered 13/5, 2011 at 10:35 Comment(9)
@Inoue : could you link us to the file itself? I have some ideas, but it's difficult to say which problem it is without having the file.Subterfuge
Hi Joris - I'm not sure how to do that, sorry...Inoue
The first column has 1045 values, the second has 623 values and the third has 871 if that helps...? They are all numeric values in whole and half numbers, i.e. 23, 24.5 etc...Inoue
i think that's the problem, because read.table puts your data in a data frame, which needs to have equal columnlengths.Portrait
@Inoue : see eg yousendit.com , dropbox.com , ... Or, if it really doesn't work, find my contact info on my profile (click on my name) and mail me the file.Subterfuge
@Inoue : can you otherwise just copy-paste the first 6 lines from the csv file in your question? I'm rather confident the problem lies somewhere in the column names or other specifications.Subterfuge
@Joris Meys @Inoue the final line from the csv file would be useful to see also, as this is what the error refers to.Rask
@Rask : Nope, it's not. readTableHead (the underlying c function) reads the first 5 lines. The error originates there.Subterfuge
@Eduardo: it can be both, but both warnings come from the same internal function and originate from the first attempt to determine the types and structure of the data. That's what readTableHead is for.Subterfuge
M
161

The message indicates that the last line of the file doesn't end with an End Of Line (EOL) character (linefeed (\n) or carriage return+linefeed (\r\n)). The original intention of this message was to warn you that the file may be incomplete; most datafiles have an EOL character as the very last character in the file.

The remedy is simple:

  1. Open the file
  2. Navigate to the very last line of the file
  3. Place the cursor at the end of that line
  4. Press return
  5. Save the file
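The same check and fix can also be done from R without opening an editor. The following is only a sketch under stated assumptions: `ends_with_newline()` and `ensure_final_newline()` are hypothetical helper names (not base R functions), and the file is assumed to be small enough to read into memory as raw bytes.

```r
# Hypothetical helper: TRUE if the last byte of the file is a line feed (0x0A).
# This covers both \n and \r\n endings, since both finish with 0x0A.
ends_with_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size == 0) return(FALSE)
  bytes <- readBin(path, what = "raw", n = size)
  identical(bytes[length(bytes)], as.raw(0x0A))
}

# Hypothetical helper: append a newline only when one is missing.
ensure_final_newline <- function(path) {
  if (!ends_with_newline(path)) cat("\n", file = path, append = TRUE)
  invisible(path)
}

# Demo on a temporary file whose last line has no EOL character.
tmp <- tempfile(fileext = ".csv")
cat("a,b\n1,2", file = tmp)
before <- ends_with_newline(tmp)   # state before the fix
ensure_final_newline(tmp)
after <- ends_with_newline(tmp)    # state after the fix
dat <- read.table(tmp, header = TRUE, sep = ",")
```

Checking first avoids adding a second blank line to files that already end correctly.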
Mccormick answered 13/5, 2011 at 18:51 Comment(3)
It's not the last line of the file. It's the header he's reading, which is your first five lines.Subterfuge
@JorisMeys The error message, however, refers to the last line of the file. Taking the steps above does indeed remove the warning.Lydialydian
@Lydialydian "incomplete final line" is a warning (not an error) that can pop up due to different causes. In your case that's the lack of a final EOL. There's no way that in your case the warning was thrown by the function readTableHeader, because that one doesn't read the final line. Hence your problem is not the same as that of the OP.Subterfuge
P
24

The problem is easy to resolve: the file must end with a newline, so the last line MUST be empty.

For example, if your content is

line 1,
line2

change it to

line 1,
line2
(empty line here)

I ran into this kind of problem today when trying to read a JSON file in R with the command below:

json_data<-fromJSON(paste(readLines("json01.json"), collapse=""))

and I resolved it with the method above.
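For readLines() specifically, the warning does not mean that data was lost: the last line is still returned. The sketch below illustrates this on a throwaway temporary file (the file content and the variable names are made up for the example) and shows the `warn = FALSE` argument as an alternative to editing the file.

```r
# A temporary file whose final line lacks an EOL character.
tmp <- tempfile(fileext = ".txt")
cat("line 1\nline 2", file = tmp)

# Capture the warning instead of printing it, while still keeping the result.
msgs <- character(0)
lines <- withCallingHandlers(
  readLines(tmp),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Same read with the warning switched off.
quiet <- readLines(tmp, warn = FALSE)
```

Both calls return the same two lines; only the first one raises the "incomplete final line" warning.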

Pliant answered 4/5, 2017 at 9:16 Comment(1)
using plumber for hosting an R API I had the same issue. Warning message: In readLines(file) : incomplete final line found on 'apiAnaheim.R' warning was resolved by adding one empty line in the end. Not sure why this is happening.Astrodome
S
15

Are you really sure that you selected the .csv file and not the .xls file? I can only reproduce the warning if I try to read in an .xls or .xlsx file. If I read in a .csv file or any other text file, I cannot recreate the warning you get.

> Data <- read.table("test.csv",header=T,sep=",")
> Data <- read.table("test.xlsx",header=T,sep=",")
Warning message:
In read.table("test.xlsx", header = T, sep = ",") :
  incomplete final line found by readTableHeader on 'test.xlsx'

readTableHead is the C function that gives the warning. It tries to read the first n lines (by default the first 5) to determine the type of the data. The rest of the data is read using scan(). So the problem is the format of the file.

One way of finding out is to set the working directory to the directory where the file is. That way you see the extension of the file you read in. On Windows, file extensions are not shown by default, so you might believe the file is a csv when it isn't.
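Besides looking at the extension, the first bytes of the file can be inspected, since a renamed workbook keeps its binary signature. This is a rough sketch, not a complete file-type detector: `sniff_spreadsheet()` is a hypothetical helper name, and it only relies on the facts that .xlsx files are zip archives (starting with "PK") and that legacy .xls files start with the OLE2 bytes D0 CF 11 E0.

```r
# Hypothetical helper: classify a file by its first bytes rather than its name.
sniff_spreadsheet <- function(path) {
  magic <- readBin(path, what = "raw", n = 4)
  if (length(magic) >= 2 && identical(magic[1:2], charToRaw("PK"))) {
    "zip container (e.g. .xlsx)"
  } else if (length(magic) >= 4 &&
             identical(magic, as.raw(c(0xD0, 0xCF, 0x11, 0xE0)))) {
    "OLE2 container (e.g. legacy .xls)"
  } else {
    "no spreadsheet signature (possibly plain text)"
  }
}

# Demo: a file named .csv that actually starts with the zip signature,
# and a genuine text file.
fake_csv <- tempfile(fileext = ".csv")
writeBin(as.raw(c(0x50, 0x4B, 0x03, 0x04)), fake_csv)
real_csv <- tempfile(fileext = ".csv")
writeLines(c("Test1,Test2", "1,2"), real_csv)

ext_fake  <- tools::file_ext(fake_csv)      # extension alone says "csv"
kind_fake <- sniff_spreadsheet(fake_csv)
kind_real <- sniff_spreadsheet(real_csv)
```

A text file that happens to start with the letters "PK" would be misclassified, so the result is a hint rather than proof.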

The next thing to do is open the file in Notepad or WordPad (or another editor) and check that the format is equivalent to my file test.csv:

Test1,Test2,Test3
1,1,1
2,2,2
3,3,3
4,4,
5,5,
,6,

This file will give you the following dataframe :

> read.table("test.csv",header=T,sep=",")
  Test1 Test2 Test3
1     1     1     1
2     2     2     2
3     3     3     3
4     4     4    NA
5     5     5    NA
6    NA     6    NA

The csv format saved by Excel separates all cells with a comma. Empty cells simply have no value. read.table() handles this easily and reads the empty cells as NA.
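Once the columns are padded with NA like this, the comparisons described in the question can be run directly on the data frame columns, because t.test() drops missing values from each sample. A minimal sketch, using the small test data from above rather than the real pheasant data (the variable names are chosen for the example):

```r
# The same content as test.csv, supplied inline via the text argument.
csv_text <- "Test1,Test2,Test3
1,1,1
2,2,2
3,3,3
4,4,
5,5,
,6,
"
dat <- read.table(text = csv_text, header = TRUE, sep = ",")

# Number of non-missing values per column (the columns differ in length).
n_values <- colSums(!is.na(dat))

# Two-sample t-tests (Welch by default); NA values are removed per column.
tt_12 <- t.test(dat$Test1, dat$Test2)
tt_13 <- t.test(dat$Test1, dat$Test3)
```

For data that are not normally distributed, wilcox.test() accepts the same two-vector form.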

Subterfuge answered 13/5, 2011 at 13:6 Comment(1)
Assuming this is a Windows 7 environment, if Kate looks at the file either copied to the desktop or inside the folder, the icon for a .csv file has an "a" on it, whereas an .xlsx file has an icon that looks more like a worksheet. This is a quick visual way of determining file type. Much easier to see when saved onto the desktop as the icons are larger. :)Clearness
S
14

Use readLines() (with warn = FALSE) to read the file into a character vector first.

After that use the text = option to read the vector into a data frame with read.table()

    pheasant <- read.table( 
        text = readLines(file.choose(), warn = FALSE), 
        header = TRUE,  
        sep = "," 
    )
Selfservice answered 1/5, 2018 at 22:26 Comment(0)
C
6

I realize that several answers have been provided, but no real fix yet.

The reason, as mentioned above, is an "End of line" missing at the end of the CSV file.

While the real fix should come from Microsoft, the workaround is to open the CSV file in a text editor and add a line at the end of the file (i.e. press the Return key). I use Atom as a text/code editor, but virtually any basic text editor would do.

In the meantime, please report the bug to Microsoft.

Question: it seems to me that this is an Office 2016 problem. Does anyone have the issue on a PC?

Camera answered 31/5, 2016 at 21:14 Comment(1)
Basically a repost of this answer.Rhombencephalon
L
4

I solved this problem by changing the encoding argument of read.table from fileEncoding = "UTF-16" to fileEncoding = "UTF-8".
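Rather than guessing the encoding, one option is to look for a byte-order mark (BOM) at the start of the file. This is a sketch only: `guess_bom_encoding()` is a hypothetical helper name, it recognises just three common BOMs, and a file without a BOM cannot be identified this way (it returns NA).

```r
# Hypothetical helper: suggest a fileEncoding value from the file's BOM, if any.
guess_bom_encoding <- function(path) {
  b <- readBin(path, what = "raw", n = 3)
  if (length(b) >= 3 && identical(b[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))) {
    return("UTF-8-BOM")
  }
  if (length(b) >= 2 && identical(b[1:2], as.raw(c(0xFF, 0xFE)))) {
    return("UTF-16LE")
  }
  if (length(b) >= 2 && identical(b[1:2], as.raw(c(0xFE, 0xFF)))) {
    return("UTF-16BE")
  }
  NA_character_
}

# Demo files written byte by byte so the content is unambiguous.
utf16_file <- tempfile()
writeBin(as.raw(c(0xFF, 0xFE, 0x61, 0x00)), utf16_file)   # BOM + "a" in UTF-16LE
plain_file <- tempfile()
writeBin(charToRaw("a,b\n1,2\n"), plain_file)              # no BOM

enc_utf16 <- guess_bom_encoding(utf16_file)
enc_plain <- guess_bom_encoding(plain_file)
```

The returned string could then be passed as `fileEncoding` to read.table(); whether a given encoding name is accepted depends on the platform's iconv.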

Lozada answered 16/9, 2015 at 19:53 Comment(0)
B
2

I received the same message. My fix was as follows: I deleted all the additional sheets (tabs) in the workbook, eliminated non-numeric characters, resaved the file as comma delimited, and loaded it in R v 2.15.0 using the standard call:

filename<-read.csv("filename",header=TRUE)

As an additional safeguard, I closed the software and reopened before I loaded the csv.

Bibliographer answered 18/5, 2012 at 23:53 Comment(0)
H
2

In various European locales, as the comma character serves as decimal point, the read.csv2 function should be used instead.
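As an illustration, read.csv2() defaults to `sep = ";"` and `dec = ","`, which matches what Excel writes in those locales. The data below are invented for the example.

```r
# Semicolon-separated values with decimal commas, supplied inline.
csv_text <- "length;weight\n23,5;1,2\n24;1,5\n"
dat <- read.csv2(text = csv_text)

# Both columns are parsed as numbers, not as text.
col_types <- vapply(dat, class, character(1))
```

The same result can be obtained with `read.table(..., header = TRUE, sep = ";", dec = ",")`.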

Hodden answered 9/11, 2013 at 13:42 Comment(0)
T
2

I got this problem once when I had a single quote as part of the header. When I removed it (i.e. renamed the respective column header from Jimmy's data to Jimmys data), the function returned no warnings.
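If renaming the column is not an option, another approach is to tell read.table() that a single quote is not a quoting character. By default read.table() treats both " and ' as quotes, whereas read.csv() only treats " as one. A small sketch with made-up data:

```r
# Header containing an apostrophe, supplied inline.
csv_text <- "Jimmy's data,Other\n1,2\n3,4\n"

# Restrict quoting to double quotes so the apostrophe is read literally.
dat <- read.table(text = csv_text, header = TRUE, sep = ",",
                  quote = "\"", check.names = FALSE)
```

With `check.names = FALSE` the column keeps its original name; without it, R converts the name to a syntactically valid one.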

Tartrazine answered 11/5, 2016 at 10:8 Comment(0)
V
2

In my case, it was literally the final line. The issue was fixed by literally adding a blank row at the bottom of the CSV file.

FROM

cola,colb,colc
1,2,3
4,5,6
7,8,9

INTO

cola,colb,colc
1,2,3
4,5,6
7,8,9
(empty line here)

Look closely at the extra empty line at the very end. Just add that blank line and it will fix the issue.

NOTE

It seems that R's CSV parser expects that final newline character as the line terminator. Programmers know it as \n (or \r\n on Windows).

Vevina answered 20/1, 2022 at 1:46 Comment(1)
Basically a repost of this answer.Rhombencephalon
O
1

The problem that you're describing occurred for me when I renamed a .xlsx as .csv.

What fixed it for me was going "Save As" and then saving it as a .csv again.

Oar answered 6/1, 2013 at 11:32 Comment(0)
D
1

To fix this issue through R itself, I just used read.xlsx(..) instead of read.csv(). Works like a charm! You do not even have to rename the file. Renaming an xlsx to csv is not a viable solution.

Duodiode answered 3/5, 2018 at 18:56 Comment(2)
#Digvijay_Sawant, not sure what you mean by your last comment, but unlike every other solution here (I tried almost all of them: maddening!), yours was the only one that worked.Thwack
@WBarker In the original question author saved the Excel into a csv and then tried to read it. Well converting an excel to csv might change things like data formats, loss of data might occur etc. Excel might store an "end of file" in a different format than a csv which might make the function difficult to figure out where file ends. Well I am no expert but just a thought :-)Duodiode
B
0

Open the file in TextWrangler or Notepad++ and show the formatting characters (in TextWrangler, use "Show Invisibles"). That way you can see the newline and tab characters. Excel will often add tabs in the wrong places and omit the final newline character, but you need to show the symbols to see this.
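The invisible characters can also be inspected from R, since print() shows tabs, carriage returns and line feeds as escape sequences. A minimal sketch on a throwaway file (the content is invented, and the whole file is read at once, so this suits small files only):

```r
# A temporary file with tabs, a Windows line ending, and no final newline.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("a\tb\r\n1\t2"), tmp)

# Read the file as one string without any line-ending translation.
raw_text <- readChar(tmp, nchars = file.size(tmp), useBytes = TRUE)
print(raw_text)   # [1] "a\tb\r\n1\t2"

has_tabs      <- grepl("\t", raw_text, fixed = TRUE)
ends_with_eol <- endsWith(raw_text, "\n")
```

Here `ends_with_eol` being FALSE is the condition that triggers the "incomplete final line" warning.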

Beeves answered 30/5, 2014 at 23:31 Comment(0)
H
0

My workaround was to open the csv file in a text editor, remove the excess commas after the last value, and save the file. For example, for the following file

Test1,Test2,Test3
1,1,1
2,2,2
3,3,3
4,4,
5,5,
,6,,

Remove the extra comma after 6 so that the row has the same number of fields as the header, then save the file.
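To find such rows without scanning the file by eye, count.fields() from the utils package reports the number of fields on each line. The sketch below recreates the example file in a temporary location; the variable names are chosen for the illustration.

```r
# Recreate the example file, including the row with one comma too many.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Test1,Test2,Test3", "1,1,1", "2,2,2", "3,3,3",
             "4,4,", "5,5,", ",6,,"), tmp)

# Number of comma-separated fields on every line, header included.
n_fields <- count.fields(tmp, sep = ",")

# Line numbers whose field count differs from the header's.
bad_lines <- which(n_fields != n_fields[1])
```

Any line number in `bad_lines` is a candidate for stray or missing separators.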

Hobo answered 18/2, 2015 at 4:38 Comment(0)
S
0

I've experienced a similar problem. However, this appears to be a generic warning and may not in fact be related to the line-end character. In my case the warning appeared because the file I was using contained Cyrillic characters; once I replaced them with Latin characters, the warning disappeared.

Scheck answered 1/4, 2018 at 20:14 Comment(0)
P
0

I tried different solutions, such as using a text editor to insert a new line and get the End Of Line character as recommended in the top answer above. None of these worked, unfortunately.

The solution that did finally work for me was very simple: I copy-pasted the content of a CSV file into a new blank CSV file, saved it, and the problem was gone.

Pawl answered 16/5, 2018 at 8:51 Comment(0)
O
0

There is quite a simple solution (if it is indeed the final line that is causing trouble) that doesn't require opening the file before reading it:

cat("\n", file = "your/File/Dir", append = TRUE)

Found this solution here.

Omsk answered 29/6, 2022 at 14:32 Comment(0)
