dbplyr mutate character to date format in temp table
Asked Answered
M

1

8

I have extracted data to a temporary table in SQL Server using DBI::dbGetQuery.

Even though, in the real query (not the play query below), I select convert(date, date_value) as date_value, the dates are still stored as character.

I then try to mutate the character representing the date using lubridate::ymd, however I obtain a message saying

date_value not found

I have also tried, convert(date, date_value) and as.Date to no avail.

require(dplyr)
if (dbExistsTable(con, "##temp", catalog_name = "tempdb")){
  dbRemoveTable(con, "##temp")
}
DBI::dbGetQuery(con, paste(              
"select 
    convert(date, '2013-05-25') as date_value
into ##temp
"))

tbl(con, "##temp")

# Error - date_value not found
tbl(con, "##temp") %>%   mutate(date_value= lubridate::ymd(date_value))

# this works
tbl(con, "##temp") %>%   mutate(temp= date_value) 

# this doesn't work - date value not found
tbl(con, "##temp") %>%   mutate(temp= lubridate::ymd(date_value))

How can I work this field as a date?

Note: When I write the following in SQL Server, date_value shows as a date Type

select 
convert(date, '2013-05-25') as date_value
into #hello

select *
from #hello

exec tempdb..sp_help #hello

in response to the comment from @Kevin Arseneau, the following image shows the results from executing a show_query() error message

Moffett answered 7/2, 2018 at 0:21 Comment(8)
Could you add show_query to your dplyr pipes so that we can see what is going across to SQL?Rodrigorodrigue
have just played around with the real data a little more. I can group_by year(date_value), but I can't use R/Tidyverse functions. I had originally pulled the data into R, but as it grew, it took longer, so I converted to using a temp table and dbplyr (as opposed to local data with dplyr). This has meant that I had to change stringr::str_c to concat. Should the tidyverse functions work with dbplyr? I have also just changed the question tag from dplyr to dbplyr! Didn't note the subtle change in package names ...Moffett
Not all tidyverse verbs are DBI compliant, you may need to collect first and then mutate. You could try tbl(con, "##temp") %>% collect %>% mutate(temp = lubridate::ymd(date_value)).Rodrigorodrigue
Thanks.. must be the problem. I haven't been able to find any documentation on which are compliant and which are not - have you come across this documentation? I wanted to avoid collecting until after summarising (would that not be better for speed in terms of a larger number of rows?) I have gotten around the problem by using the year and concat functions; although I am only at the beginning of my analysis...Moffett
Because compliance is based on the DBI back-end and db driver, there is no universal resource to my knowledge. You can read here and here for more information.Rodrigorodrigue
@Moffett did you ever figure out a solution to this question? I am stuck on the same thing.Vtol
@epi_n00b. I believe I stuck to the workaround, as per comment from Kevin Arseneau above (feb 17, 2018). Also see comment below the proposed solution by gavg712 - it seems that the field could still be used for filtering and issue was only on collect. I no longer use SQL server so cannot test further.Moffett
Same here using sql server @KevinArseneau Isn't the problem rather about the driver not pulling the data in as date (as described in the database schema) in the first place? The user does not want to use any possible non-DBI compliant method here, but seeks to to simply preserve the database type when using dplyr/dbi/dbplyr. Is there a github ticket for the problem?Pilloff
S
6

Some months ago I was looking a solution for use lubridate functions + dplyr on PostgreSQL unsuccessfully. Accidentally I found the easy solution using DBMS functions directly on dbplyr coding.

Sorry, I will use PostgreSQL example because I don't know about SQL server functions. In this example I will create a temporal table in a PostgreSQL DBMS, then I will compute a new column with the function to_date() provided by PostgreSQL. The result is the date that was looking for:

# Easy example on postgreSQL
library(tidyverse)
library(dbplyr)
library(RPostgreSQL)

con <- dbConnect(PostgreSQL(), 
                 dbname="postgre",
                 host="localhost",
                 port=5432,
                 user="user",
                 password="pass")

date <- data_frame(date_str = c("20180212", "20180213"))

dbWriteTable(con, "tmp", date, temporary = TRUE)

tbl(con, "tmp") %>% 
# The DBMS function is used here
  mutate(date = to_date(date_str, "YYYYMMDD")) %>% 
# Finally, I collect the data from database to R session
  collect()

#># A tibble: 2 x 3
#>  row.names date_str date      
#>* <chr>     <chr>    <date>    
#>1 1         20180212 2018-02-12
#>2 2         20180213 2018-02-13

You can try with settings for SQL Server, and the CAST() function could convert your strings to date, as is explained in this answer. I hope this help you.

I hope someday dplyr/dbplyr can translate the lubridate functions into SQL queries.

Supposition answered 13/2, 2018 at 23:30 Comment(2)
DBMS cast and convert functions don't seem to work for SQL server (both return errors) As the date is still stored correctly in the temp table, can use for filtering and extracting months, years, etc; it is only on collection that the date appears as a character. tbl(con, "##temp") %>% mutate(date = datefromparts(year(date_value), month(date_value), day(date_value))) %>% collect() %>% # date and date_value are returned as a character mutate(date=ymd(date)) # can convert here as requiredMoffett
What do you get when try tbl(con, "##temp") %>% head ()? Did you see the column date_value? If it appears, Which class is it?Supposition

© 2022 - 2024 — McMap. All rights reserved.