Order of the tables in a JOIN

Question

In spark-sql I have a query that uses several tables (both large and small) in joins. My question: does the order of these tables matter for query performance? E.g. select ...

Accepted Answer

Spark appears to change the join order itself as part of query optimization. Several reordering rules may apply:

- Reorder JOIN optimizer
- Reorder JOIN optimizer – star schema
- Reorder JOIN optimizer – cost based optimization

The following posts appear to shed some light on this topic:

https://www.waitingforcode.com/apache-spark-sql/reorder-join-optimizer-star-schema/read
https://www.waitingforcode.com/apache-spark-sql/reorder-join-optimizer/read
https://www.waitingforcode.com/apache-spark-sql/reorder-join-optimizer-cost-based-optimization/read
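
One way to check whether Spark reorders a particular query is to compare its EXPLAIN output with the order written in the SQL. The sketch below is not from the linked posts. It assumes Spark 2.2 or later, where cost-based join reordering exists but, as far as I know, is disabled by default. The tables `sales`, `stores` and `products` and their columns are hypothetical.

```sql
-- Assumption: Spark 2.2+; both settings are believed to default to false.
SET spark.sql.cbo.enabled=true;
SET spark.sql.cbo.joinReorder.enabled=true;

-- Cost-based reordering relies on statistics, so collect them first.
-- Hypothetical tables: one large fact table and two small dimension tables.
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS store_id, product_id;
ANALYZE TABLE stores COMPUTE STATISTICS FOR COLUMNS id;
ANALYZE TABLE products COMPUTE STATISTICS FOR COLUMNS id;

-- Inspect the plan and compare its join order with the order written here.
EXPLAIN
SELECT s.amount, st.city, p.name
FROM sales s
JOIN stores st ON s.store_id = st.id
JOIN products p ON s.product_id = p.id;
```

Running the same EXPLAIN with the two settings switched off, or with the tables listed in a different order, should show whether the written order still affects the physical plan for that query.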


Answer