converting to timestamp with time zone failed on Athena
Asked Answered
R

5

37

I'm trying to create to following view:

CREATE OR REPLACE VIEW view_events AS 
(
   SELECT
     "rank"() OVER (PARTITION BY "tb1"."innerid" ORDER BY "tb1"."date" ASC) "r"
   , "tb2"."opcode"
   , "tb1"."innerid"
   , "tb1"."date"
   , From_iso8601_timestamp(tb1.date) as "real_date"
   , "tb2"."eventtype"
   , "tb1"."fuelused"
   , "tb1"."mileage"
   , "tb1"."latitude"
   , "tb1"."longitude"
   FROM
     rt_message_header tb1
   , rt_messages tb2
   WHERE ((("tb1"."uuid" = "tb2"."header_uuid") AND ("tb2"."opcode" = '39')) AND ("tb2"."type" = 'event'))
   ORDER BY "tb1"."innerid" ASC, "tb1"."date" ASC
)

and it gives me the following error:

Your query has the following error(s): Unsupported Hive type: timestamp with time zone

however when I run the query on it's own it works fine, and the From_iso8601_timestamp is mentioned here as a valid date function.

can anyone tell me what I'm doing wrong?

Remsen answered 13/6, 2018 at 8:52 Comment(0)
H
59

Short summary:

SELECT CAST(From_iso8601_timestamp('2018-01-01T15:00:00Z') AS timestamp) as "real_date"

Full story:

Unfortunately Athena doesn't fully support all Presto features, it has limitations and is technically a few versions behind Presto. There's some attempt to make Athena integrate closely with the AWS Glue Metastore, which while based on Hive's metastore has some inconsistencies. I wish that Spark, Hive, Glue, Athena, Presto et al would just work with the same metastore, it would make life easier, but back to your issue:

This document about an older teradata fork of Presto mentions some issues with timestamp in presto:

Presto’s method for declaring timestamps with/with out timezone is not sql standard. In Presto, both are declared using the word TIMESTAMP, e.g. TIMESTAMP '2003-12-10 10:32:02.1212' or TIMESTAMP '2003-12-10 10:32:02.1212 UTC'. The timestamp is determined to be with or without timezone depending on whether you include a time zone at the end of the timestamp. In other systems, timestamps are explicitly declared as TIMESTAMP WITH TIME ZONE or TIMESTAMP WITHOUT TIME ZONE

The version of Presto that Athena is forked from does support both timestamp and timestamp with timezone but with that quirk as mentioned in the teradata docs which shouldn't be an issue. The real issue is that Athena does not support timestamp with timezone.

The presto docs you've linked to show that the function returns a value of that unsupported type timestamp with timezone, so you need to cast it as something else that is supported. It's an oversight that Athena allows functions and casting to a datatype that is then not supported, and hopefully that will be remedied, but for now you have to work around it.

What you need to do is use the CAST() function around that function call, which will change the type from timestamp with time zone into timestamp

Unfortunately you probably can't cast the string directly to a timestamp, although it depends on how the string is formatted. You also can't use the style of casting where you write timestamp before the string e.g. can't do timestamp '2018-01-01 15:00:00' for reasons I will explain below.

Type returned by the from_iso1601_timestamp() function

SELECT typeof("real_date") AS real_date_type
FROM
(
SELECT From_iso8601_timestamp('2018-01-01T15:00:00Z') as "real_date"
)

timestamp with time zone

This doesn't work

SELECT typeof("real_date") AS real_date_type
FROM
(
SELECT CAST('2018-01-01T15:00:00Z' AS timestamp) as "real_date"
)

SQL Error [FAILED]: INVALID_CAST_ARGUMENT: Value cannot be cast to timestamp

This style of Casting also returns timestamp with timezone :(

Note that the SELECT part of this works, and it says that it is a timestamp, but for some internal inconsistency reason you can't create a view and you'll get an error.

CREATE OR replace VIEW test 
AS 
SELECT typeof( "real_date" ) AS real_date_type
FROM
(
SELECT  timestamp '2018-01-01 15:00:00' as "real_date"
)

SQL Error [FAILED]: Could not initialize class com.facebook.presto.util.DateTimeZoneIndex

For whatever reason, creating a view requires that java class while parsing the value in the select doesn't. It's a bug that should be addressed.

This works yay

CREATE OR REPLACE VIEW test
AS
SELECT typeof("real_date") AS real_date_type
FROM
(
SELECT CAST(From_iso8601_timestamp('2018-01-01T15:00:00Z') AS timestamp) as "real_date"
)
Heatstroke answered 29/6, 2018 at 7:13 Comment(0)
O
10

You can use the following syntax in Athena over Timestamp data type (dt):

SELECT id,dt,dt AT TIME ZONE 'America/New_York' as dateTimeNY FROM Table
Ornstead answered 13/6, 2019 at 7:2 Comment(0)
S
6

Ran into something similar on something that I was working on recently. AWS Support pointed me to Davos solution but it didn't end up working for my case. The solution that ended up working from me was:

create or replace view db_name.vw_name AS
select
    from_unixtime(cast(to_unixtime(current_timestamp) AS bigint)) as field_name
from db_name.tbl_name

This will convert the output of current_timestamp which is timestamp with time zone to timestamp

If you want to verify the data type of the field, you can use:

select typeof(field_name) from db_name.vw_name

Hope that helps!

Sideboard answered 4/3, 2019 at 20:16 Comment(1)
Thanks for this addition, it seems like a useful alternative.Heatstroke
H
0
select 
cast(replace(cast(at_timezone(<YOUR_DATE_FIELD> , 'US/Eastern') as varchar) ,'America/New_York', '') as timestamp),
typeof(cast(replace(cast(at_timezone(<YOUR_DATE_FIELD> , 'US/Eastern') as varchar) ,'America/New_York', '') as timestamp))
from 
<YOUR_TABLE>

Lets say <YOUR_DATE_FIELD> is in UTC . The first at_timezone function will change the date to EST format ,but will have the extra literal 'US/Eastern' appended to it.

The next step is to cast it as varchar so as to remove the literal. However , when casting to varchar , it changes to 'America/New_York' which is what needs to be replaced with ''. Finally cast it to timestamp

Halflength answered 5/7, 2022 at 10:9 Comment(1)
Please explain your answer. Why does it work ?Canikin
M
0

Casting like Koustav answer in my case is the only solution to get the real time paris_without_tz and it works with both athena select and ctas

CREATE TABLE test WITH (format = 'PARQUET', write_compression = 'SNAPPY') AS 
select timestamp '2022-11-30 23:00:00.000' as utc
, at_timezone(timestamp '2022-11-30 23:00:00.000', 'Europe/Paris') as tz_paris
, CAST(at_timezone(timestamp '2022-11-30 23:00:00.000', 'Europe/Paris') as timestamp) AS remains_utc
, CAST(replace(CAST(at_timezone(timestamp '2022-11-30 23:00:00.000', 'Europe/Paris') as varchar), ' Europe/Paris', '') as timestamp) AS paris_without_tz
Mellicent answered 25/5, 2023 at 15:2 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.