I am stumped. Normally, read.csv
works as expected, but I have come across an issue where the behavior is unexpected. It most likely is user error on my part, but any help will be appreciated.
Here is the URL for the file
http://nces.ed.gov/ipeds/datacenter/data/SFA0910.zip
Here is my code to get the file, unzip, and read it in:
URL <- "http://nces.ed.gov/ipeds/datacenter/data/SFA0910.zip"
download.file(URL, destfile="temp.zip")
unzip("temp.zip")
tmp <- read.table("sfa0910.csv",
header=T, stringsAsFactors=F, sep=",", row.names=NULL)
Here is my problem. When I open the data csv data in Excel, the data look as expected. When I read the data into R, the first column is actually named row.names. R is reading in one extra row of data, but I can't figure out where the "error" occurs that is causing row.names to be a column. Simply, it looks like the data shifted over.
However, what is strange is that the last column in R does appear to contain the proper data.
Here are a few rows from the first few columns:
tmp[1:5,1:7]
row.names UNITID XSCUGRAD SCUGRAD XSCUGFFN SCUGFFN XSCUGFFP
1 100654 R 4496 R 1044 R 23
2 100663 R 10646 R 1496 R 14
3 100690 R 380 R 5 R 1
4 100706 R 6119 R 774 R 13
5 100724 R 4638 R 1209 R 26
Any thoughts on what I could be doing wrong?
row.names = NULL
argument. – Crossgraineddim(read.csv("sfa0910.csv", header = F, skip = 1))
is6852 452
whereaslength(unlist(strsplit(readLines('sfa0910.csv',1), ',')))
is451
. – Crossgrained