Hive DBMS. Two tables, A and B. Question: I am trying to execute a query that joins table A with table B, first on prnt_id; if that is “unknown”, then on sub_id; and if that is also “unknown”, on ac_nm. Answer: You must use LEFT joins of TableB to 3 copies of TableA and filter
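The fallback join described above can be sketched as follows. This is only an illustration under stated assumptions, not the original answer's query: it runs on Python's built-in sqlite3 rather than Hive, the tables table_a and table_b and the columns a_val and b_id are hypothetical, it assumes the “unknown” markers sit in TableB's key columns, and it puts the conditions in the ON clauses instead of a separate filter step.

```python
import sqlite3

def fallback_join(rows_a, rows_b):
    """Join table_b to table_a on prnt_id, falling back to sub_id, then ac_nm.

    Hypothetical schema (not from the original question):
      table_a(prnt_id, sub_id, ac_nm, a_val)
      table_b(b_id, prnt_id, sub_id, ac_nm)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (prnt_id TEXT, sub_id TEXT, ac_nm TEXT, a_val TEXT)")
    conn.execute("CREATE TABLE table_b (b_id INTEGER, prnt_id TEXT, sub_id TEXT, ac_nm TEXT)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO table_b VALUES (?, ?, ?, ?)", rows_b)
    # Three copies of table_a, one per candidate key. Each copy only matches
    # when every higher-priority key in table_b is 'unknown'.
    query = """
        SELECT b.b_id,
               COALESCE(a1.a_val, a2.a_val, a3.a_val) AS a_val
        FROM table_b b
        LEFT JOIN table_a a1
               ON b.prnt_id <> 'unknown' AND a1.prnt_id = b.prnt_id
        LEFT JOIN table_a a2
               ON b.prnt_id = 'unknown' AND b.sub_id <> 'unknown'
              AND a2.sub_id = b.sub_id
        LEFT JOIN table_a a3
               ON b.prnt_id = 'unknown' AND b.sub_id = 'unknown'
              AND a3.ac_nm = b.ac_nm
        ORDER BY b.b_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Made-up sample data for illustration only.
SAMPLE_A = [("p1", "s1", "n1", "A1"), ("p2", "s2", "n2", "A2"), ("p3", "s3", "n3", "A3")]
SAMPLE_B = [
    (1, "p1", "sX", "nX"),            # matches on prnt_id
    (2, "unknown", "s2", "nX"),       # falls back to sub_id
    (3, "unknown", "unknown", "n3"),  # falls back to ac_nm
    (4, "unknown", "unknown", "zz"),  # no match at all
]
```

Calling fallback_join(SAMPLE_A, SAMPLE_B) returns one row per TableB record with the value from whichever key matched. Older Hive releases restrict non-equality conditions in ON clauses, so on Hive the same conditions may need to move into the SELECT or WHERE logic.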
Tag: hive
HIVE converting unix timestamp for calculation
I’m trying to subtract timestamps and would like to convert them into a form that can be expressed in minutes. I used regexp_replace to convert a timestamp into this form: The following code will convert it to seconds. I have two other timestamps that I wish to convert to seconds, such as: How should I convert these two timestamp
SQL Join – Limit to base table
I have 4 tables that I want to link together based on the information in the sales_details table, which will serve as the base table (limiting results based on the sales_details table). 1) I want to pull all columns from the sales_detail table that are mastercard orders, but since the sales_detail table doesn’t have a column to identify the type of
How do I find first value in every last 3 months in Hive
I have a table like the one below. I need to get the first Refresh_value (based on Refresh_date) from every last 3 months, starting from the last date, and there should be 2 additional columns (Group and Refresh_Value_Min), where the first column will have the first value from every last 3 months and the other column will have values that say in which group these
Optimize Hive Query. java.lang.OutOfMemoryError: Java heap space/GC overhead limit exceeded
How can I optimize a query of this form since I keep running into this OOM error? Or come up with a better execution plan? If I removed the substring clause, the query would work fine, suggesting that this takes a lot of memory. When the job fails, the beeline output shows the OOM Java heap space. Readings online suggested
Hive Query to insert a value conditionally
I have a Table1 containing some blacklisted names. Now suppose I receive a record “def”. The Hive query should check whether “def” is present in Table1. If not, the name_status should be set to blacklisted, otherwise null. The name “def” will be inserted in both cases. The problem I am facing is that in Hive we cannot use
SQL filter elements of array
I have an employee table similar to this: I want to get the department, name, and age of all employees, something similar to this: How can I achieve this using an SQL query? Answer: I assume you are using Hive/Spark and the datatype of the column is an array of maps. Use the explode, collect_list, and map functions. Note that
Hive query conditional statement in same select query
Is there a way to do an if-else kind of setup in one single Hive query? In my data below, I want to ensure that if Model is empty or contains ‘-’, I populate the Final column with Device; otherwise it should be populated with Model. Can someone please help here? Answer: Read about CASE operator syntax.
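A minimal sketch of the CASE approach the answer points to, run here on Python's built-in sqlite3 rather than Hive. The table name devices, the alias final_value, and the sample rows are assumptions for illustration; treating NULL as "empty" is also an assumption, since the original data is not shown.

```python
import sqlite3

def fill_final(rows):
    """Return (device, model, final_value) rows, where final_value is the
    device when model is NULL, blank, or '-', and the model otherwise."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE devices (device TEXT, model TEXT)")
    conn.executemany("INSERT INTO devices VALUES (?, ?)", rows)
    # The CASE expression is the if-else: the first matching WHEN wins,
    # and ELSE covers every remaining row.
    query = """
        SELECT device,
               model,
               CASE
                   WHEN model IS NULL OR TRIM(model) IN ('', '-') THEN device
                   ELSE model
               END AS final_value
        FROM devices
        ORDER BY rowid
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Made-up sample rows for illustration only.
SAMPLE_ROWS = [("iPhone", "A1"), ("Pixel", "-"), ("Galaxy", ""), ("Nokia", None)]
```

The CASE expression itself uses only standard SQL, so it should carry over to a Hive SELECT largely unchanged; ORDER BY rowid is SQLite-specific and is there only to keep the output order stable.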
How to select all the values in Hive with distinct on 2 columns
I have a Hive table that looks like this (total 460 columns):
colA  colB  …  ce_id  filename  …  dt
v     j     …  4      gg        …  40
v     j     …  5      gg        …
Get List of Last 15 Days Date in SQL
Can SQL return a list of the last 15 days' dates in a single query? We can get today's date with select current_date(). We can also get the date 15 days ago with select date_add(current_date(), -15). But how do we show the list of the last 15 days' dates? For example, the output is: Answer: In Hive or Spark-SQL: Result: See also this