Translating PySpark into SQL

I’m having trouble with the following function. I’d like to translate it into a SQL statement so I can see exactly what it does and work on my actual issue more effectively.

I know that it contains a join between valid_data and ri_data, a filter, and a select statement. I’m mainly having trouble understanding how to write the join.

Any help is appreciated.

Answer

You will need to make some substitutions, such as the actual column names for the join keys, but the general structure looks like this in SQL:
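As a minimal sketch of that join, filter, and select structure: the table names valid_data and ri_data come from the question, but the key column `id`, the columns `amount` and `status`, the filter condition, the inner join type, and the sample rows are all assumptions made for illustration. The query is run against an in-memory SQLite database only so the example is self-contained; Spark SQL accepts the same basic syntax.

```python
import sqlite3

# Hypothetical PySpark chain being translated (column names and the
# filter condition are assumptions, not taken from the question):
#   valid_data.join(ri_data, on="id", how="inner")
#             .filter("status = 'active'")
#             .select("id", "amount", "status")

QUERY = """
SELECT v.id, v.amount, r.status      -- .select(...)
FROM valid_data AS v
INNER JOIN ri_data AS r              -- .join(ri_data, on="id", how="inner")
    ON v.id = r.id
WHERE r.status = 'active'            -- .filter(...)
ORDER BY v.id
"""


def run_example():
    """Build two small tables with made-up rows and run the translated query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE valid_data (id INTEGER, amount REAL);
        CREATE TABLE ri_data (id INTEGER, status TEXT);
        INSERT INTO valid_data VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO ri_data VALUES (1, 'active'), (2, 'inactive'), (4, 'active');
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_example())
```

With an inner join, only rows whose key appears in both tables survive, and the WHERE clause then plays the role of the filter. If the original function uses a different `how` argument (for example "left"), the INNER JOIN keyword would change accordingly.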

User contributions licensed under: CC BY-SA