How to convert time (mm:ss) to decimal form
M

5

21

I've imported a CSV file into R using RStudio, and I am trying to plot points per game against minutes per game. However, the minutes per game column is in mm:ss format, and I'm having a hard time finding out how to convert it to decimal form.

Mera answered 3/3, 2011 at 21:45 Comment(1)
What is the class of the variable holding time? – Dav
F
29

Given that you start with a character vector, this is relatively easy:

minPerGame <- c("4:30","2:20","34:10")

sapply(strsplit(minPerGame,":"),
  function(x) {
    x <- as.numeric(x)
    x[1]+x[2]/60
    }
)

gives

[1]  4.500000  2.333333 34.166667

Make sure you read the file with read.csv() using the option as.is=TRUE, so the column stays a character vector rather than becoming a factor. Otherwise you'll have to convert it first using as.character().
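As a rough sketch of the as.character() point above, the conversion can be wrapped in a small helper that also accepts factor input. The name mmss_to_minutes is hypothetical (it is not from the original answer), and the helper assumes every value has exactly one colon:

```r
# Hypothetical helper: convert "mm:ss" strings (or factors) to decimal minutes.
# Assumes each value contains exactly one ":".
mmss_to_minutes <- function(x) {
  x <- as.character(x)                        # no-op for characters, needed for factors
  minutes <- as.numeric(sub(":.*$", "", x))   # text before the colon
  seconds <- as.numeric(sub("^.*:", "", x))   # text after the colon
  minutes + seconds / 60
}

# Works even if read.csv() turned the column into a factor
minPerGame <- factor(c("4:30", "2:20", "34:10"))
dec <- mmss_to_minutes(minPerGame)
dec
```

Because sub() and as.numeric() are vectorised, this avoids the per-element function call in sapply().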

Fluoridate answered 3/3, 2011 at 22:20 Comment(2)
I use stuckey <- read.csv("C:/kalle/R/stuckey.csv", stringsAsFactors=FALSE) so I won't get the values as factors, and I can't seem to get as.is=TRUE to work. – Mera
@Mera: So you figured out that's the same ;-) Don't forget to accept whichever answer you found most helpful as the correct one by using the V-sign on the left. This site serves as a reference for other people as well (see also the FAQ). Cheers – Fluoridate
P
9

Do you need to decimalise it? If you store the data in the correct format, for example as an object of class POSIXlt, one of R's date-time classes, R will take care of handling the times numerically. Here is an example of what I mean:

First we create some dummy data for illustration purposes:

set.seed(1)
DF <- data.frame(Times = seq(as.POSIXlt("10:00", format = "%M:%S"), 
                             length = 100, by = 10),
                 Points = cumsum(rpois(100, lambda = 1)))
head(DF)

Ignore the fact that there are dates here; the date part has no effect on the plot because all observations share the same date. Next we plot this using R's formula interface:

plot(Points ~ Times, data = DF, type = "o")

Which produces this:

[Image: points per game time]

Panthia answered 3/3, 2011 at 22:12 Comment(3)
Conversion with as.numeric() to calculate mean game duration becomes tedious though, as POSIXt classes count seconds from 1970-01-01 00:00:00 UTC, and the current date is added when a time is parsed without a date. So a naive mean(as.numeric(Times)) will give a wrong result today, and a different wrong result tomorrow... – Fluoridate
@Joris Agreed, but @Mera asked about plotting, hence I asked if he needed to decimalise. After I'd written my answer I realised you dealt with that explicitly, so I didn't bother with it, as between us we cover most bases. – Panthia
Oops, I missed that question about plotting. :-) Then very much +1 indeed. – Fluoridate
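If decimal minutes are still wanted after parsing to a date-time class, one possible way around the date issue raised in the comments is to subtract midnight of the same day, so the date part cancels out. This is only a sketch, not part of the original answer: the tz = "UTC" argument is an assumption added to sidestep daylight-saving shifts, and %M only accepts 00 to 59, so values of 60 minutes or more would come back as NA.

```r
minPerGame <- c("4:30", "2:20", "34:10")

# Parse as date-times; R fills in today's date for the missing date part.
# tz = "UTC" is an assumption made here to avoid daylight-saving effects.
times <- as.POSIXct(minPerGame, format = "%M:%S", tz = "UTC")

# Subtract midnight of the same day, leaving only the time of day in minutes.
mins <- as.numeric(difftime(times, trunc(times, units = "days"), units = "mins"))

mins
mean(mins)   # no longer depends on the day the code is run
```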
D
2

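Some tuning of the first solution:

# 100,000 random "mm:ss" strings for benchmarking
minPerGame <- paste(sample(1:89, 100000, TRUE), sample(0:59, 100000, TRUE), sep = ":")

# Original approach: split each string and convert element by element
f1 <- function() {
  sapply(strsplit(minPerGame, ":"),
    function(x) {
      x <- as.numeric(x)
      x[1] + x[2] / 60
    }
  )
}

# Tuned approach: one as.numeric() call, then a matrix product with weights (1, 1/60)
f2 <- function() {
  w <- matrix(c(1, 1/60), ncol = 1)
  as.vector(matrix(as.numeric(unlist(strsplit(minPerGame, ":"))), ncol = 2, byrow = TRUE) %*% w)
}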

system.time(f1())
#   user  system elapsed
#   0.88    0.00    0.86

system.time(f2())
#   user  system elapsed
#   0.25    0.00    0.27
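To make the matrix trick in f2 easier to follow, here is a minimal sketch on three values. The variable names x, parts and dec are illustrative only:

```r
x <- c("4:30", "2:20", "34:10")

# Flatten all pieces into one numeric vector, then refill row by row:
# column 1 holds the minutes, column 2 the seconds.
parts <- matrix(as.numeric(unlist(strsplit(x, ":"))), ncol = 2, byrow = TRUE)
parts

# Weighting the columns by (1, 1/60) and summing is a matrix-vector product.
w <- c(1, 1/60)
dec <- drop(parts %*% w)
dec
```

The speed-up most likely comes from calling as.numeric() once on the whole vector instead of once per element inside sapply().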

Disbelieve answered 4/3, 2011 at 20:20 Comment(0)
D
0

I had data with times like so:

  • 22:49:20+1100
  • 19:29:11+1000
  • 20:01:26+0930

And this seemed to work for me:

library(dplyr)   # mutate(), %>%
library(tidyr)   # separate()

my_df <- my_df %>%
separate(col = eventTime, into = c("H", "M", "S"), sep = "\\:", remove = FALSE) %>% 
separate(col = S, into = c("S", "Z"), sep = "\\+", remove = TRUE) %>% 
separate(col = Z, into = c("ZH", "ZM"), sep = 2, remove = TRUE) %>% 
mutate(H = as.numeric(H)/24) %>% 
mutate(M = as.numeric(M)/24/60) %>% 
mutate(S = as.numeric(S)/24/60/60) %>% 
mutate(ZH = as.numeric(ZH)/24) %>% 
mutate(ZM = as.numeric(ZM)/24/60) %>% 
mutate(H = H-ZH) %>% 
mutate(M = M-ZM) %>% 
mutate(time_num = H+M+S)

H: hours, M: minutes, S: seconds, Z: zone, ZH: zone hours, ZM: zone minutes

If you don't care about the time zones, then use this:

my_df <- my_df %>%
separate(col = eventTime, into = c("H", "M", "S"), sep = "\\:", remove = FALSE) %>% 
separate(col = S, into = c("S", "Z"), sep = "\\+", remove = TRUE) %>% 
mutate(H = as.numeric(H)/24) %>% 
mutate(M = as.numeric(M)/24/60) %>% 
mutate(S = as.numeric(S)/24/60/60) %>% 
mutate(time_num = H+M+S)

With the first method you may end up with negative values. With the second method you should get values between 0 and 1, with time_num being the fraction of the day.

For example (second method, ignoring the time zone):

  • 22:49:20+1100 = 0.950925926

  • 07:26:10+1100 = 0.309837963

Note that all of my time data came from time zones with a positive (+) offset, so the code above does not handle negative offsets.
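For data that may also contain negative offsets, a base R sketch along the same lines could look like the following. The function name day_fraction and its to_utc argument are hypothetical, and the pattern assumes the exact HH:MM:SS+ZZZZ (or -ZZZZ) layout shown above:

```r
# Hypothetical helper: fraction of the day for strings like "22:49:20+1100".
# With to_utc = TRUE the zone offset is subtracted (result can fall outside 0..1).
day_fraction <- function(x, to_utc = FALSE) {
  pattern <- "^([0-9]{2}):([0-9]{2}):([0-9]{2})([+-])([0-9]{2})([0-9]{2})$"
  stopifnot(all(grepl(pattern, x)))                       # fail early on unexpected input
  piece <- function(i) sub(pattern, paste0("\\", i), x)   # i-th captured group
  secs <- as.numeric(piece(1)) * 3600 +
          as.numeric(piece(2)) * 60 +
          as.numeric(piece(3))
  if (to_utc) {
    sgn    <- ifelse(piece(4) == "-", -1, 1)
    offset <- sgn * (as.numeric(piece(5)) * 3600 + as.numeric(piece(6)) * 60)
    secs   <- secs - offset                               # local time minus offset gives UTC
  }
  secs / 86400
}

local_frac <- day_fraction(c("22:49:20+1100", "07:26:10+1100"))
utc_frac   <- day_fraction(c("22:49:20+1100", "20:00:00-0500"), to_utc = TRUE)
```

Working in seconds and dividing once at the end keeps the arithmetic in a single place instead of spreading it over several mutate() steps.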

Dewie answered 7/6, 2021 at 12:23 Comment(0)
P
0

I like lubridate for this. The same logic works for hours and minutes as well, by using hm() in place of ms() and hour()/minute() in place of minute()/second().

minPerGame <- c("4:30","2:20","34:10")

library(lubridate)
minPerGame_ms <- ms(minPerGame)
(minPerGame_dec = minute(minPerGame_ms) + second(minPerGame_ms)/60)
[1]  4.500000  2.333333 34.166667
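
As a sketch of the hours-and-minutes variant mentioned above, assuming the lubridate package is installed and using made-up "hh:mm" data (the variable names are illustrative):

```r
library(lubridate)

hoursMinutes <- c("1:30", "0:45", "12:15")   # hypothetical "hh:mm" values
hm_parsed <- hm(hoursMinutes)

# Decimal hours from the hour and minute components
dec_hours <- hour(hm_parsed) + minute(hm_parsed) / 60

# Alternative route: total seconds in the period, converted to hours
dec_hours2 <- period_to_seconds(hm_parsed) / 3600

dec_hours
```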
Physostomous answered 12/3, 2024 at 19:10 Comment(0)
