How to get today - "1 day" date in Spark SQL?
How do I get current_date - 1 day in Spark SQL, the same as CURDATE() - 1 in MySQL?

Haase answered 13/12, 2016 at 6:28 Comment(0)

The arithmetic functions allow you to perform arithmetic operations on columns containing dates.

For example, you can calculate the difference between two dates, add days to a date, or subtract days from a date. The built-in date arithmetic functions include datediff, date_add, date_sub, add_months, last_day, next_day, and months_between.
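For a quick illustration, here are sketches of a few of those functions in Spark SQL (the date literals are only illustrative):

```sql
SELECT datediff('2016-12-13', '2016-12-12');  -- 1 (days between the two dates)
SELECT date_add('2016-12-13', 1);             -- 2016-12-14
SELECT date_sub('2016-12-13', 1);             -- 2016-12-12
SELECT add_months('2016-12-13', 1);           -- 2017-01-13
SELECT last_day('2016-12-13');                -- 2016-12-31
```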

Of the functions above, what we need is

date_sub(timestamp startdate, int days). Purpose: Subtracts a specified number of days from a TIMESTAMP value. The first argument can be a string, which is automatically cast to TIMESTAMP if it uses the recognized format, as described in TIMESTAMP Data Type. Return type: the date that is days days before startdate.

and we have

current_timestamp() Purpose: Alias for the now() function. Return type: timestamp

So you can run:

date_sub(CAST(current_timestamp() as DATE), 1)

See https://spark.apache.org/docs/1.6.2/api/java/org/apache/spark/sql/functions.html

Recalcitrate answered 13/12, 2016 at 6:49 Comment(2)
Fair warning: the functions will return a DATE only; all time components will be lost. – Appendant
Keep current_timestamp() as a timestamp if you want to keep the time. – Espinosa

You can try

date_add(current_date(), -1)

I don't know Spark either, but I found this on Google.

Lead answered 13/12, 2016 at 9:45 Comment(0)

You can easily perform this task; there are many date-related functions, and the one you can use here is date_sub.

Example on Spark-REPL:

 scala> spark.sql("select date_sub(current_timestamp(), 1)").show
+----------------------------------------------+
|date_sub(CAST(current_timestamp() AS DATE), 1)|
+----------------------------------------------+
|                                    2016-12-12|
+----------------------------------------------+
Roubaix answered 13/12, 2016 at 6:52 Comment(0)

Spark SQL also supports the INTERVAL keyword. You can get yesterday's date with this query:

SELECT current_date - INTERVAL 1 day;

For more details, have a look at the interval literals documentation. I tested the above with Spark 3.x, but I am not sure in which release this syntax was first supported.
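As the comment notes, the same arithmetic keeps the time-of-day component when applied to a timestamp; a sketch, assuming Spark 3.x interval support:

```sql
SELECT current_timestamp - INTERVAL 1 day;  -- yesterday at the current time of day
```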

Ineradicable answered 22/3, 2022 at 14:15 Comment(1)
It worked great for me because it keeps the full timestamp as output. – Trio
SELECT DATE_FORMAT(DATE_ADD(CURRENT_DATE(), -1), 'yyyy-MM-dd')
Indict answered 14/9, 2022 at 4:13 Comment(0)

Yes, date_sub() is the right function for the question; however, there's an error in the selected answer:

Return type: timestamp

The return type should be date instead: the date_sub() function trims any hh:mm:ss part of the timestamp and returns only a date.
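For example (a sketch; the timestamp literal is only illustrative):

```sql
SELECT date_sub(timestamp '2016-12-13 06:49:00', 1);  -- 2016-12-12 (a DATE; the hh:mm:ss part is gone)
```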

Vaginitis answered 26/11, 2018 at 14:20 Comment(0)

date_add and date_sub work best in Databricks + Python.
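For reference, a minimal PySpark sketch of both functions (this assumes an active SparkSession bound to the name spark, as in Databricks notebooks; it is a sketch, not tested outside such an environment):

```python
from pyspark.sql import functions as F

# Dummy one-row DataFrame, just to have something to select against.
df = spark.range(1)

df.select(
    F.date_sub(F.current_date(), 1).alias("yesterday"),        # today - 1 day
    F.date_add(F.current_date(), -1).alias("also_yesterday"),  # same result via date_add
).show()
```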

Matildematin answered 12/10, 2023 at 22:7 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – Retrospection
