My table looks like this –
DateTime | ID |
---|---|
2010-12-01 08:26:00 | 34 |
2010-12-01 09:41:00 | 42 |
I want to extract the time from DateTime
and create a third column of it and then group it with frequency counts. Is there a way to do this in SQL? I’m using Apache Spark with inline SQL. I have achieved the equivalent using Spark functions like this –
df2 = df.withColumn('Time', date_format('DateTime', 'HH:mm:ss')) df3 = df2.groupBy("Time").count() // To get count under every hour df3.show()
Advertisement
Answer
If you want to do it in SQL, you can do:
df.createOrReplaceTempView('df') result = spark.sql("select date_format(DateTime, 'HH:mm:ss') as Time, count(*) as cnt from df group by Time")