Tag: hive

How to combine two tables to get a single table in Hive

I have the following tables and need to combine them in Hive. Could anyone please help me achieve this? I tried the date part with coalesce and it works fine, but I am not able to merge the fam part into a single column. I really appreciate your help. Thanks, Babu Answer: You can use a full outer join. However, union with left …

SQL: Expression Not in GROUP BY Key

aggregate-functions average hive sql window-functions

I have a transaction table t1 in Hive that looks like this: store_id cust_id zip_code transaction_count spend 1000 100 123 3 50 2000 200 …

How to merge multiple rows into a single row in MSSQL

database hive sql sql-server

This is my data:

id        segment     country  product    status     month  year
83916512  Government  Null     Null       Null       Null   2014
83916512  Null        Germany  Null       Null       Null   2014
83916512  Null        Null     Carretera  Null       Null   2014
83916512  Null        Null     Null       completed  Null   2014
83916512  Null        Null     Null       Null       June   2014
83916512  Null        Null     Null       Null       Null   2014

I want the output below. Can anybody help?

How to join two Hive tables with an embedded array of struct and array in PySpark

dataframe hive pyspark python sql

I am trying to join two Hive tables on Databricks. tab1: The schema of “some_questions”: “some_questions” example: tab2: I need to join tab1 and tab2 by “question_id” so that I get a new table. I tried to join them with PySpark, but I am not sure how to decompose the array with the embedded struct/array. Thanks. Answer: For SparkSQL, you can …

Hive: randomly select N values from distinct values of one column

hive inner-join random sql subquery

Suppose I have a dataset like this: I would like to randomly select, say, 3 values from the distinct ID values. One possibility is to get a table like this: How shall I do that in Hive? Answer: Here is one option using a join and rand(): The subquery randomly selects 3 ids, then the outer query brings all related …
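
The two steps the answer describes (pick N distinct ids at random, then bring back every row for those ids) can be sketched outside Hive. This is a minimal Python illustration of that logic, not the answer's actual query; the function name `sample_rows_by_id`, the `id`/`value` column names, and the sample rows are all hypothetical.

```python
import random

def sample_rows_by_id(rows, n, seed=None):
    """Return all rows whose id is among n randomly chosen distinct ids."""
    rng = random.Random(seed)
    # Step 1 (the "subquery"): collect the distinct ids and pick n at random.
    distinct_ids = sorted({row["id"] for row in rows})
    chosen = set(rng.sample(distinct_ids, n))
    # Step 2 (the "outer query"): keep every row that belongs to a chosen id.
    return [row for row in rows if row["id"] in chosen]

# Hypothetical data: five distinct ids, some with several rows.
rows = [
    {"id": "a", "value": 1}, {"id": "a", "value": 2},
    {"id": "b", "value": 3},
    {"id": "c", "value": 4}, {"id": "c", "value": 5},
    {"id": "d", "value": 6},
    {"id": "e", "value": 7},
]

picked = sample_rows_by_id(rows, 3, seed=0)
picked_ids = {row["id"] for row in picked}
```

Sampling from the distinct ids first, rather than from the rows, keeps ids with many rows from being more likely to be chosen.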

Getting NULL after combining strings between date functions

datetime hive hiveql sql string

Given a date column with a value 2020-05-01, I want to return 2020-Q2. The QUARTER() function is not available due to the Hive version we are using. I can get the quarter number with: (INT((MONTH(yyyy_mm_dd)-1)/3)+1). When I try to combine this with the YEAR() function and strings, I get null: How can I properly concatenate this to get the desired
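
The quarter arithmetic quoted in the question can be checked on its own. The sketch below reproduces `(INT((MONTH(d)-1)/3)+1` in Python and joins it with the year as a string; the function name `quarter_label` is hypothetical, and the sketch does not reproduce the elided Hive query that returned NULL.

```python
from datetime import date

def quarter_label(d: date) -> str:
    """Build a 'YYYY-Qn' label from a date."""
    # Months 1-3 map to 1, 4-6 to 2, 7-9 to 3, 10-12 to 4.
    quarter = (d.month - 1) // 3 + 1
    # Both numbers are converted to text before they are joined.
    return f"{d.year}-Q{quarter}"

label = quarter_label(date(2020, 5, 1))  # "2020-Q2"
```

The point to carry over is that the year and quarter are turned into strings and concatenated, not added as numbers.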

Convert PostgreSQL query to Hive/MySQL

count hive hiveql mysql sql

I have this table: I want a situation where each footballer appears only once in a new table. For instance, Messi appears twice, but I want to take any one occurrence of Messi in the new table. I am not sure how to convert it to either Hive or MySQL. This is what I want the desired results to look …

Create Missing Data Hive SQL

date hive hiveql sql window-functions

I have a table that has an activity date for when things change, such as: 2020-08-13 123 Upgrade; 2020-08-17 123 Downgrade; 2020-08-21 123 Upgrade. Basically, this is in relation to a line; there are 3 …

How to transform data into a map using group by in Hive SQL?

dictionary hive sql string

I have data like below …and I want to create a map with lecture as the key and count as the value. How can I get an output like below? Answer: If you can live with count being a string, you will probably be able to use the Hive str_to_map() function to get the desired map. That will require a couple of …

Cross Join in Hive

datetime hive join sql window-functions

I’m trying to create a new column time_period while running the query below. If the date difference between a given transaction and the most recent transaction in the reference table is fewer than 7 days, then mark it as a recent transaction, else mark it as an old transaction. However, the query below is generating an error in the subquery
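
The labelling rule in the question can be sketched independently of the failing query. This Python illustration assumes the difference is measured from the transaction to the most recent reference date; the function name `time_period`, the labels `"recent"` and `"old"`, and the sample dates are hypothetical.

```python
from datetime import date

def time_period(txn_date, most_recent_ref, window_days=7):
    """Label a transaction by how far it is from the latest reference date."""
    days_apart = (most_recent_ref - txn_date).days
    # Fewer than window_days apart counts as recent; otherwise old.
    return "recent" if days_apart < window_days else "old"

# Hypothetical reference table dates; only the most recent one is needed.
reference_dates = [date(2020, 8, 1), date(2020, 8, 20)]
most_recent = max(reference_dates)

# Hypothetical transactions: 2 days and 10 days before the latest reference.
transactions = [date(2020, 8, 18), date(2020, 8, 10)]
labels = [time_period(t, most_recent) for t in transactions]
```

Because the reference table contributes a single value (its maximum date), it can be computed once and compared against every transaction, which is the role a cross join with a one-row subquery would play.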