Sample data (emp.data):
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
I can read it into a data.frame using read.table, then convert it to a data.table:
library(data.table)
df <- read.table("emp.data", col.names = c("Name", "PayRate", "HoursWorked"))
DT <- as.data.table(df)
Calculate the pay (filter out zero hours):
DT[HoursWorked > 0, .(Name, Pay = PayRate * HoursWorked)]
Name Pay
1: Kathy 40.0
2: Mark 100.0
3: Mary 121.0
4: Susie 76.5
That works fine, but the conversion feels like an unnecessary extra step. Since data.table has fread(), why not use it directly?
readDT <- fread("emp.data", header=FALSE, sep="\t")
V1
1: Beth 4.00 0
2: Dan 3.75 0
3: Kathy 4.00 10
4: Mark 5.00 20
5: Mary 5.50 22
6: Susie 4.25 18
str(readDT)
Classes 'data.table' and 'data.frame': 6 obs. of 1 variable:
$ V1: chr "Beth 4.00 0" "Dan 3.75 0" "Kathy 4.00 10" "Mark 5.00 20" ...
- attr(*, ".internal.selfref")=<externalptr>
The data is recognized as one column; obviously this doesn't work.
Question
How can I read this data properly using fread()? (If possible, set the column names as well.)
Don't set sep; leave it at "auto" (let fread decide). In other words, just do fread("emp.data", header=FALSE) – Palladic
> readDT <- fread("emp.data", header=FALSE) Error in fread("emp.data", header = FALSE) : Not positioned correctly after testing format of header row. ch=' ' – Nesline
Can you post the dput of your data set? Maybe also try without specifying header – Palladic
I tried fread("awk '{$1=$1}1' emp.data") and it worked for me – Galliard
[...] header. And the dput of the data: dput(readDT) structure(list(V1 = c("Beth 4.00 0", "Dan 3.75 0", "Kathy 4.00 10", "Mark 5.00 20", "Mary 5.50 22", "Susie 4.25 18")), .Names = "V1", row.names = c(NA, -6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000000320788>) – Nesline
[...] awk command? – Nesline
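For reference, here is a minimal sketch of reading the file with an explicit space separator instead of "\t". It assumes the fields in emp.data are separated by single spaces (as the dput output suggests) and a reasonably recent data.table version in which fread() accepts sep = " " and col.names; older versions may still fail with the error quoted above. The temporary file only stands in for emp.data so the example is self-contained.

```r
library(data.table)

# Recreate the sample file so the sketch runs on its own
# (in practice, point fread() at "emp.data" directly).
path <- tempfile(fileext = ".data")
writeLines(c("Beth 4.00 0", "Dan 3.75 0", "Kathy 4.00 10",
             "Mark 5.00 20", "Mary 5.50 22", "Susie 4.25 18"), path)

# Assumption: fields are separated by single spaces, not tabs.
readDT <- fread(path, header = FALSE, sep = " ",
                col.names = c("Name", "PayRate", "HoursWorked"))

# Same pay calculation as in the question, skipping zero hours.
pay <- readDT[HoursWorked > 0, .(Name, Pay = PayRate * HoursWorked)]
print(pay)
```

If the real file uses runs of spaces for alignment, this may not split cleanly; squeezing the whitespace first, as the awk comment does, is one way around that.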