Spark.sql Filter rows by MAX

Below is part of a source file which you could imagine being much bigger:

After the following code:

I would like to obtain this result:

The aim is to:

  1. Select, for each cityname, the dates on which it has its MAX total (note: a city can appear twice if it reaches its MAX total on two different dates),
  2. Sort by total descending, then date ascending, then cityname ascending.

Thanks!

Answer

You can get this result by using a SQL window function in your query, as follows:
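The original snippet did not survive on this page, so here is a minimal sketch of the window-function approach rather than the answer's exact code. It assumes three columns named `date`, `cityname` and `total` (taken from the question's wording); the table name `city_totals`, the sample rows and the helper `rows_with_max_total` are hypothetical. To keep it runnable without a Spark cluster, it executes the query with Python's built-in `sqlite3` as a stand-in; the query itself relies only on `MAX(...) OVER (PARTITION BY ...)`, which Spark SQL also supports.

```python
import sqlite3

# Hypothetical sample data: the question's original table was not preserved.
rows = [
    ("2020-01-01", "Paris", 10),
    ("2020-01-02", "Paris", 30),
    ("2020-01-03", "Paris", 30),  # Paris reaches its max on two dates
    ("2020-01-01", "Lyon", 20),
    ("2020-01-02", "Lyon", 5),
]

# For every row, compute the max total of its city with a window function,
# keep only the rows that reach that max, then apply the requested ordering.
QUERY = """
SELECT date, cityname, total
FROM (
    SELECT date, cityname, total,
           MAX(total) OVER (PARTITION BY cityname) AS max_total
    FROM city_totals
) AS t
WHERE total = max_total
ORDER BY total DESC, date ASC, cityname ASC
"""


def rows_with_max_total(data):
    """Load `data` into an in-memory table and return the filtered rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE city_totals (date TEXT, cityname TEXT, total INTEGER)"
    )
    conn.executemany("INSERT INTO city_totals VALUES (?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_with_max_total(rows):
        print(row)
```

In Spark, the same idea would presumably be to register the DataFrame with `createOrReplaceTempView("city_totals")` and pass the query string to `spark.sql(...)`. Because the filter compares each row to its city's maximum instead of keeping a single row per city, ties are preserved, which is what allows a city to appear twice.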

User contributions licensed under: CC BY-SA