Tag: hive

How to find the desired output using hive/sql

I have a table like this:

col1    col2
First   row
Second  a
First   b
Second  row
First   c
Second  row

The output required is like below:

col1    col2  col3
First   row   1
Second  a     1
First   b     1
Second  row   2
First   c     2
Second  row   3

The logic is: whenever we are getting the value “row” in col2,
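The excerpt cuts off before the rule is fully stated, but the sample output is consistent with col3 being a running count of the rows where col2 equals “row”. A minimal sketch under that reading follows. The table name `t` and the ordering column `seq` are assumptions: the excerpt shows no column that fixes the row order, and a running total is only well defined with one.

```sql
-- Assumed: table t(seq, col1, col2), where seq (hypothetical) gives the row order.
SELECT
    col1,
    col2,
    -- Count the "row" markers seen so far, including the current row.
    SUM(CASE WHEN col2 = 'row' THEN 1 ELSE 0 END)
        OVER (ORDER BY seq
              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS col3
FROM t;
```

Against the six sample rows in the order shown, this yields 1, 1, 1, 2, 2, 3, which matches the requested col3.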

How to get overall summary of grouped columns in SQL in a separate column?

hive sql

I have a grouped result that I got from the following query: The result is the following: I measure performance as the percentage of times there is a late shipment by a warehouse, so I would like to have a column that shows, at the company level, the average percentage across every wareh…

hive get percentages of count column not working

cloudera hive sql

I have the following query in Hive to get the counts for each of those columns (cluster, country and airline) as a percentage, but my percentage column contains only 0s. What am I doing wrong below? Answer: First, you should use window functions. Second, beware of integer division. I would phrase t…
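The original query and the full answer are not shown, so the following is only a sketch of the two points the answer names: a window function for the grand total and a non-integer literal in the division. The table name `flights` is an assumption, and the sketch computes the share per (cluster, country, airline) combination, which may differ from what the asker wanted.

```sql
-- Assumed: table flights(cluster, country, airline); the table name is hypothetical.
SELECT
    `cluster`,
    country,
    airline,
    cnt,
    -- SUM(cnt) OVER () is the total across all groups.
    -- 100.0 keeps the arithmetic out of integer division.
    100.0 * cnt / SUM(cnt) OVER () AS pct
FROM (
    SELECT `cluster`, country, airline, COUNT(*) AS cnt
    FROM flights
    GROUP BY `cluster`, country, airline
) t;
```

The backticks around `cluster` are a precaution because CLUSTER is also a HiveQL keyword. Aggregating in a subquery first keeps the window function separate from the GROUP BY, which is the more portable form.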

Can’t find the problem within this Hive query [closed]

google-cloud-platform hive hiveql sql

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers. This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers. Closed las…

Hive Explode the Array of Struct key: value:

arrays explode hive hiveql sql

This is the Hive table below, and this is the data in that table. Is there any way I can get the output below using HiveQL? I tried to use explode(), but I get a result like this: Answer: Use lateral view [outer] inline to get the struct elements already extracted, and use conditional aggregation to get the values corres…
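The table definition and data are not included in the excerpt, so here is a minimal sketch of the approach the answer describes, under assumed names: a table `events` with an `id` column and an `attrs` column of type `ARRAY<STRUCT<key:STRING, value:STRING>>`, and the attribute keys 'color' and 'shape'. All of these are hypothetical.

```sql
-- Assumed: events(id INT, attrs ARRAY<STRUCT<key:STRING, value:STRING>>).
SELECT
    e.id,
    -- Conditional aggregation: pick the value belonging to each key of interest.
    MAX(CASE WHEN a.key = 'color' THEN a.value END) AS color,
    MAX(CASE WHEN a.key = 'shape' THEN a.value END) AS shape
FROM events e
-- inline() turns each struct in the array into a row with one column per field.
-- OUTER keeps ids whose array is empty or NULL.
LATERAL VIEW OUTER inline(e.attrs) a AS key, value
GROUP BY e.id;
```

Unlike explode(), which returns each struct as a single column, inline() exposes the struct fields as separate columns, so the CASE expressions can test `key` directly.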

Compare two SQL tables and return count of rows with changes

hive hiveql python sql

I have two partitions from an SQL table containing num_key records. I need to compare and count changes in the February records versus the January records.

SAMPLE DATA AND DESIRED RESULTS:

ptn_dt = ‘2019-01-31’ (January)

num_key  active_indicator
111      true
112      false
113      false
114      false
115      true
116      …

Why does Hive throw me an error while using Order by date?

hive hiveql sql window-functions

I am trying to write a query in Hive and I am seeing the following error: “Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: Primitve …

Grouping by a Range of Numbers with Aggregate Function

hive sql

I have data in Hive that has two columns of interest. The first is a column (int) that represents a date (YYYYMM) and the second is a column (int) that represents a number of people for that date. I’m trying to write a query that sums up the number of people for comparison across two quarters. For examp…

How to use an alias in Hive?

apache-spark hive sql

I am trying to find unique cities using a window function, but I am not able to use an alias in this query. Answer: You cannot have a window function in the where clause. Put it in a subquery and do the filter afterwards:
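The asker's query is not shown, so this is a sketch of the subquery pattern only, under assumed names: a table `customers` with a `city` column (both hypothetical), using ROW_NUMBER() as the window function.

```sql
-- Assumed: customers(city); table and column names are hypothetical.
SELECT city
FROM (
    SELECT
        city,
        -- Number the rows within each city; the alias rn exists only
        -- once this inner query has been evaluated.
        ROW_NUMBER() OVER (PARTITION BY city ORDER BY city) AS rn
    FROM customers
) t
-- The outer WHERE can now filter on the alias.
WHERE rn = 1;
```

The filter works in the outer query because WHERE is evaluated before window functions and select-list aliases within the same query level.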

Is there a way to parse csv string with escapings via HQL/SQL?

csv hive hiveql postgresql sql

I have a problem parsing CSV-formatted data that is stored in a Hive table column and is loaded into a PostgreSQL DB afterwards. What I need to do is to retrieve some fields from there; however, if a comma is enclosed in quotes, it should be treated as a part of the data to retrieve. On top of that, quotes can be escaped th…