How to avoid: read.table truncates numeric values beginning with 0
Asked Answered
C

3

14

I want to import a table (.txt file) in R with read.table(). One column in my table is an ID with nine numerals - some ids begin with a 0, other with 1 or 2.

R truncates the first 0 (012345678 becomes 12345678) which leads to problems when using this ID to merge another table.

Can someone give me a hint how to solve the problem?

Cohin answered 13/2, 2013 at 13:16 Comment(0)
S
17

As said in Ben's answer, colClasses is the easier way to do it. Here is an example:

read.table(text = 'col1 col2
           0012 0001245',
           head=T,
           colClasses=c('character','numeric'))

  col1 col2
1 0012 1245      ## col1 keep 00 but not col2
Suzetta answered 13/2, 2013 at 13:44 Comment(1)
cool I did not know about colClasses strange how setting as.is=T did not work for me.Galore
C
3

A reproducible example would be nice, but: use the colClasses argument to read.table() to specify that you want this column to be read as a character variable, not numeric. Or make them back into character variables after reading them in, using sprintf to pad the numbers with leading zeros. (The former is probably easier.)

Culet answered 13/2, 2013 at 13:25 Comment(0)
Y
0

Here is a for loop to add leading zeros to rows based on a condition. Although this is a post-hoc solution (adding leading 0's after reading the table), it worked for me so thought I'd share:

Let's take the example of a column of zip codes. All values should contain 5 digits (e.g. 01234), but R removes leading zeros (so '01234' becomes '1234'). You can add a trailing zero to all cells that contain only 4 characters with this code:

for (i in 1:nrow(df)){
  if(nchar(df$zipCode[i])<5){
    df$zipCode[i]<- paste0('0',df$zipCode[i])
  }
}
Yonatan answered 10/7, 2017 at 18:43 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.