I'm trying to pull data from a PostgreSQL database and the results for a timestamp field are inconsistent. I'm not sure if I'm handling POSIXct results properly. Otherwise, I think I found a bug in the RPostgreSQL package. Here is the way to replicate the issue:
Suppose there is a table in a postgres database with one field (run this in PostgreSQL):
CREATE DATABASE mydb;
CREATE TABLE test_table
(
"DateTime" timestamp without time zone NOT NULL,
CONSTRAINT "pk_test_table" PRIMARY KEY ("DateTime")
)
WITH (
OIDS=FALSE
);
ALTER TABLE test_table
OWNER TO postgres;
And let’s say there are a few hundred records. I will populate them in R. Here is the code:
library(RPostgreSQL)
# Let's feed the table with some sequence of date/time values
date_values <- as.chron(seq(10000, 10500, 1/24))
format.chron <- function(z) {
sprintf("%04.0f-%02.0f-%02.0f %02.0f:%02.0f:00",
as.numeric(as.character(years(z))),
months(z),
as.numeric(as.character(days(z))),
as.numeric(as.character(hours(z))),
as.numeric(as.character(minutes(z))))
}
.generateInsertQuery <- function(date_values, field_name, table_name) {
insert_val <- paste(paste0("(", sQuote(format(date_values)), ")"), collapse=',')
qry <- paste("INSERT INTO", dQuote(table_name), paste0("(", dQuote(field_name), ")"), "VALUES", insert_val)
qry
}
drv <- dbDriver('PostgreSQL')
con <- dbConnect(drv, user='postgres', dbname='mydb')
qry <- .generateInsertQuery(date_values, "DateTime", "test_table")
dbSendQuery(con, qry)
If I try to get the values, the time component gets stripped out of the resulting data
res <- dbGetQuery(con, "SELECT * FROM test_table")
res[1:20,1]
The class of the result, however, is POSIXct
class(res[,1])
If the result is fetched one record at a time, the values with hour:min equal to 00:00 loose the time component:
rs <- dbSendQuery(con, "SELECT \"DateTime\" FROM test_table")
res_list <- list()
for(i in 1:100) res_list[i] <- fetch(rs,1)
res_list
As a workaround, I'm fetching the result 1 record at a time, fixing, and aggregating them into a data.frame. But this is very time-consuming, especially for large data sets. Any ideas of why this is happening and how to deal with this issue?
Thanks in advance!
timestamp with timezone
in theCREATE TABLE
statement did the trick for me. Though it's not really part of the question, in my experience I have found thatdbWriteTable
does not like any capital letters when specifying the SQL table names to write to. – Tropophilous