How to create Date and Hour columns from Seconds column using SQL

Question

I have a column called Time with float values giving time in seconds after the first event occurred. I was wondering how to create columns called Date and Hour using this column in SQL. My dataset is ...

Accepted Answer

You can use the functions timestamp, unix_timestamp and hour:

    from pyspark.sql.functions import expr, hour

    # Parentheses allow the chained calls to span several lines.
    (df.withColumn('Date', expr("timestamp(unix_timestamp('2019-01-01 00:00:00') + Time)"))
       .withColumn('hour', hour('Date'))
       .show(truncate=False))

    +--------+----------------------+----+
    |Time    |Date                  |hour|
    +--------+----------------------+----+
    |10.0    |2019-01-01 00:00:10   |0   |
    |61.0    |2019-01-01 00:01:01   |0   |
    |3500.0  |2019-01-01 00:58:20   |0   |
    |3600.0  |2019-01-01 01:00:00   |1   |
    |3700.54 |2019-01-01 01:01:40.54|1   |
    |7000.22 |2019-01-01 01:56:40.22|1   |
    |7200.22 |2019-01-01 02:00:00.22|2   |
    |15000.55|2019-01-01 04:10:00.55|4   |
    |86400.22|2019-01-02 00:00:00.22|0   |
    +--------+----------------------+----+

Note: use the timestamp function to keep the fractional seconds (microseconds). unix_timestamp returns whole seconds, so the fraction comes from adding the float Time column before converting.

Using SQL syntax:

    df.createOrReplaceTempView('t_df')

    spark.sql("""
        WITH d AS (SELECT *, timestamp(unix_timestamp('2019-01-01 00:00:00') + Time) AS Date FROM t_df)
        SELECT *, hour(d.Date) AS hour FROM d
    """).show(truncate=False)
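To sanity-check a few rows outside Spark, the same arithmetic (a fixed base timestamp plus an offset in seconds) can be sketched in plain Python. This is only an illustration, not part of the answer's Spark code: the base timestamp 2019-01-01 00:00:00 is taken from the answer, the helper name date_and_hour is hypothetical, and the sketch assumes a naive datetime with no time zone handling.

```python
from datetime import datetime, timedelta

# Base timestamp taken from the answer above. Treated as a naive datetime
# (no time zone), which is an assumption of this sketch.
BASE = datetime(2019, 1, 1, 0, 0, 0)


def date_and_hour(seconds):
    """Return (timestamp, hour) for an offset in seconds after BASE.

    date_and_hour is a hypothetical helper name, not a Spark function.
    """
    # timedelta keeps the fractional part of the offset as microseconds.
    ts = BASE + timedelta(seconds=seconds)
    return ts, ts.hour


if __name__ == "__main__":
    # A few Time values from the answer's example output.
    for t in [10.0, 3600.0, 3700.54, 86400.22]:
        ts, hr = date_and_hour(t)
        print(t, ts, hr)
```

The hours returned for these offsets can be compared with the hour column in the Spark output. Spark applies the session time zone when parsing and displaying timestamps, which this sketch does not model.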
