Spark Dataframe Minus Minutes Operation In Scala

How to subtract minutes from (or add minutes to) a date or timestamp column.

Assume that you have the following data set and you would like to subtract from, or add to, the date/timestamp field.

id | cr_date
1 | 2017-03-17 11:12:00
2 | 2017-03-17 15:10:00

First convert the field to a Unix timestamp (seconds since the epoch), then subtract or add the offset in seconds, and finally convert the result back to the appropriate format.

df.select(col("id"), from_unixtime(unix_timestamp(col("cr_date")).minus(5 * 60), "yyyy-MM-dd HH:mm:ss").as("cr_date"))

The result will appear as below:
id | cr_date
1 | 2017-03-17 11:07:00
2 | 2017-03-17 15:05:00
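
For readers who want to run the steps above end to end, here is a minimal sketch. It assumes Spark SQL is on the classpath and that a local session is acceptable; the object name MinusMinutesExample and the helper minusMinutes are hypothetical names introduced only for this illustration, and the sample rows are the two from the table above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_unixtime, unix_timestamp}

object MinusMinutesExample {
  // Hypothetical helper (not part of Spark): shifts a "yyyy-MM-dd HH:mm:ss"
  // string column back by the given number of minutes.
  def minusMinutes(df: DataFrame, column: String, minutes: Int): DataFrame =
    df.withColumn(
      column,
      from_unixtime(
        // unix_timestamp parses the string into seconds since the epoch
        unix_timestamp(col(column)).minus(minutes * 60),
        // lowercase "yyyy" is the calendar year
        "yyyy-MM-dd HH:mm:ss"
      )
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for trying this out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("minus-minutes")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (1, "2017-03-17 11:12:00"),
      (2, "2017-03-17 15:10:00")
    ).toDF("id", "cr_date")

    minusMinutes(df, "cr_date", 5).show(false)

    spark.stop()
  }
}
```

Passing a negative number of minutes to the helper adds time instead of subtracting it, which covers the plus case with the same code.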

There is another important point to remember when subtracting time. For example, suppose the data frame contains the timestamp "2015-01-01 00:00:00" and you apply the expression with the pattern "YYYY-MM-dd HH:mm:ss":

df.select(from_unixtime(unix_timestamp(col("cr_date")).minus(5 * 60), "YYYY-MM-dd HH:mm:ss"))

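The result is "2015-12-31 23:55:00", but the expected result is "2014-12-31 23:55:00". This is caused by using "YYYY" instead of "yyyy" in the pattern: "YYYY" is the week-based year, and December 31, 2014 falls in the first week of 2015, whereas "yyyy" is the calendar year. Making this change: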

df.select(from_unixtime(unix_timestamp(col("cr_date")).minus(5 * 60), "yyyy-MM-dd HH:mm:ss"))

gives the result we are looking for, "2014-12-31 23:55:00".
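
The difference between the two pattern letters can be reproduced without Spark. The sketch below uses java.time from the JDK rather than Spark's own formatter, purely to isolate the effect of "YYYY" versus "yyyy"; the object name WeekYearPitfall and its members are hypothetical, and Locale.US is an assumption made so the week rules are fixed.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.util.Locale

object WeekYearPitfall {
  // The timestamp from the example above, shifted back by five minutes
  val shifted: LocalDateTime =
    LocalDateTime.of(2015, 1, 1, 0, 0, 0).minusMinutes(5)

  // "YYYY" formats the week-based year
  val weekYear: String =
    DateTimeFormatter.ofPattern("YYYY-MM-dd HH:mm:ss", Locale.US).format(shifted)

  // "yyyy" formats the calendar year
  val calendarYear: String =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss", Locale.US).format(shifted)

  def main(args: Array[String]): Unit = {
    println(weekYear)     // 2015-12-31 23:55:00
    println(calendarYear) // 2014-12-31 23:55:00
  }
}
```

The two results differ only for dates near a year boundary, which is why the "YYYY" pattern can appear to work on most data and still produce wrong years for a handful of rows.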