Skip to content
Advertisement

SQL Join – Limit to base table

I have 4 tables that I want to link together based on the information in the sales_details table which will serve as the base table (limiting results based on the sales_details table)

1)I want to pull all columns from the sales_detail table that are mastercard orders but since the sales_detail table doesn’t have a column to identity the type of transaction it is, I have to:

  • link the sales_detail table to the sales_total table by transaction_ID to get the pay_id
  • use the pay_id from the sales_total table and link to payment type table in order to filter for payment type
  • link product_id from sales_detail table to product_detail table to get product name

**I want the output to look something like this: **

im trying to get all orders from the sales_detail table that have paid by mastercard and from the usa. I tried using left joins and not only it takes forever because of the large tables (doesnt even load) and it also gives me duplicates. What am i doing wrong?

This is the code i used but failed as it took over 20+hrs and timed out:

there is no output for this code as after 20hrs it could never finish 🙁

Thanks in advance! I am a beginner so still learning the ins and outs of sql! (am using hive)

Advertisement

Answer

There is nothing seems to be wrong with your query and the reason you failed could be due to huge data present in sales table I would presume.

How ever lets give it a try by considering few points, as You only need columns from t1 and t4 , I would go for exists rather joining the tables without taking any columns to the select clause and second point , we can do a inner join between t2 and t3 considering t2 will never have a record with value null in pay_id column

If you still face performance issue try creating index on “transaction_id” and “product_id “.

I hope this help you taking a step forward. Thanks.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement