I have a table like this:

col1    col2
First   row
Second  a
First   b
Second  row
First   c
Second  row

The required output is like below:

col1    col2  col3
First   row   1
Second  a     1
First   b     1
Second  row   2
First   c     2
Second  row   3

The logic is: whenever we get the value “row” in col2,
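The rule is cut off in this excerpt, but the expected col3 values are consistent with a running count of the rows whose col2 is “row”. Below is a minimal sketch of that reading, run through Python's sqlite3 as a stand-in for Hive. The table name `t` and the `id` ordering column are assumptions: the question shows no ordering column, and a running total needs one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `id` is a hypothetical column that fixes the row order.
conn.execute("CREATE TABLE t (id INTEGER, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "First", "row"), (2, "Second", "a"), (3, "First", "b"),
     (4, "Second", "row"), (5, "First", "c"), (6, "Second", "row")],
)

# Running count of 'row' values up to and including the current row.
running_count = conn.execute("""
    SELECT col1, col2,
           SUM(CASE WHEN col2 = 'row' THEN 1 ELSE 0 END)
               OVER (ORDER BY id
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS col3
    FROM t
    ORDER BY id
""").fetchall()

for row in running_count:
    print(row)
```

The windowed `SUM(CASE ...)` is standard SQL, so the same shape should carry over to HiveQL, though that has not been checked here.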
Tag: hive
How to get overall summary of grouped columns in SQL in a separate column?
I have a grouped result that I got from the following query: The result is the following: I measure performance as the percentage of times a warehouse ships late, so I would like a column that shows the company-level average of that percentage across every warehouse. How can
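The original query and result are not included in this excerpt. One common way to put an overall figure next to per-group figures is a window function with an empty `OVER ()` clause. Below is a minimal sketch, run through Python's sqlite3 as a stand-in for Hive. The `shipments` table and its `warehouse` and `is_late` columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per shipment, is_late is 1 or 0.
conn.execute("CREATE TABLE shipments (warehouse TEXT, is_late INTEGER)")
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?)",
    [("A", 1), ("A", 0), ("A", 0), ("A", 0),   # warehouse A: 25% late
     ("B", 1), ("B", 1), ("B", 0), ("B", 0)],  # warehouse B: 50% late
)

# Inner query: late percentage per warehouse.
# Outer query: the same rows plus the average of those percentages.
summary = conn.execute("""
    SELECT warehouse,
           pct_late,
           AVG(pct_late) OVER () AS company_avg_pct_late
    FROM (
        SELECT warehouse,
               100.0 * SUM(is_late) / COUNT(*) AS pct_late
        FROM shipments
        GROUP BY warehouse
    ) AS per_warehouse
    ORDER BY warehouse
""").fetchall()

for row in summary:
    print(row)
```

This averages the per-warehouse percentages, so each warehouse counts equally regardless of shipment volume. A volume-weighted company figure would need a different expression.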
hive get percentages of count column not working
I have the following query in Hive to get the counts for each of those columns (cluster, country and airline) as a percentage, but my percentage column contains only 0s. What am I doing wrong? Answer: First, you should use window functions. Second, beware of integer division. I would phrase this as:
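The answer's query is not included in this excerpt. The following is only a sketch of the two points it makes, run through Python's sqlite3 as a stand-in for Hive. The `flights` table and its sample rows are hypothetical. In SQLite, dividing one integer by another truncates, which reproduces the all-zeros symptom. Whether the same truncation occurs in Hive depends on the expression and types involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table using the three columns named in the question.
conn.execute("CREATE TABLE flights (cluster TEXT, country TEXT, airline TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [("c1", "US", "AA"), ("c1", "US", "AA"), ("c1", "US", "AA"),
     ("c2", "FR", "AF")],
)

# SUM(cnt) OVER () is the grand total, repeated on every grouped row.
shares = conn.execute("""
    SELECT cluster, country, airline, cnt,
           cnt / SUM(cnt) OVER ()         AS int_ratio,  -- integer / integer
           cnt * 100.0 / SUM(cnt) OVER () AS pct         -- forced to floating point
    FROM (
        SELECT cluster, country, airline, COUNT(*) AS cnt
        FROM flights
        GROUP BY cluster, country, airline
    ) AS grouped
    ORDER BY cluster
""").fetchall()

for row in shares:
    print(row)
```

Multiplying by `100.0` before dividing keeps the arithmetic in floating point. It is harmless in engines where `/` already returns a floating-point result.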
Can’t find the problem within this Hive query [closed]
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers. Closed last year. EDIT:
Hive Explode the Array of Struct key: value:
Below is the Hive table: And this is the data in the table: Is there any way I can get the output below using HiveQL? I tried to use explode() but I get a result like this: Answer: Use lateral view [outer] inline to get the struct elements already extracted, and use conditional aggregation to get the values corresponding to specific keys.
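The table, data and expected output are not included in this excerpt. The sketch below illustrates only the conditional-aggregation half of the answer, run through Python's sqlite3. SQLite has no counterpart to Hive's lateral view inline, so the `exploded` table stands in for the rows that step would produce. Its columns (`id`, `item_key`, `item_value`) and the sample keys are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the output of the explode step: one row per struct element.
conn.execute("CREATE TABLE exploded (id INTEGER, item_key TEXT, item_value TEXT)")
conn.executemany(
    "INSERT INTO exploded VALUES (?, ?, ?)",
    [(1, "color", "red"), (1, "size", "L"),
     (2, "color", "blue")],  # id 2 has no "size" element
)

# Conditional aggregation: one output column per key of interest.
pivoted = conn.execute("""
    SELECT id,
           MAX(CASE WHEN item_key = 'color' THEN item_value END) AS color,
           MAX(CASE WHEN item_key = 'size'  THEN item_value END) AS size
    FROM exploded
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in pivoted:
    print(row)
```

A key that is missing for an id comes back as NULL (`None` in Python), because the `CASE` expression has no `ELSE` branch.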
Compare two SQL tables and return count of rows with changes
I have two partitions from an SQL table containing num_key records. I need to compare and count changes in the February records versus the January records.

SAMPLE DATA AND DESIRED RESULTS:

ptn_dt = ‘2019-01-31’ (January)
num_key  active_indicator
111      true
112      false
113      false
114      false
115      true
116      true

ptn_dt = ‘2019-02-28’ (February)
num_key  active_indicator
111      true
112      false
113
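The February data and the desired results are cut off in this excerpt, so the sketch below uses its own small sample. It shows one way to count changed rows, a self-join on num_key between the two partitions, run through Python's sqlite3 as a stand-in. The table name `records` is an assumption, and the February values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (ptn_dt TEXT, num_key INTEGER, active_indicator TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [("2019-01-31", 111, "true"),
     ("2019-01-31", 112, "false"),
     ("2019-01-31", 113, "false"),
     # February values below are invented for this example.
     ("2019-02-28", 111, "true"),
     ("2019-02-28", 112, "true"),
     ("2019-02-28", 113, "false")],
)

# Pair each January row with the February row for the same key,
# then count the pairs whose indicator differs.
changed = conn.execute("""
    SELECT COUNT(*)
    FROM records AS jan
    JOIN records AS feb
      ON feb.num_key = jan.num_key
    WHERE jan.ptn_dt = '2019-01-31'
      AND feb.ptn_dt = '2019-02-28'
      AND jan.active_indicator <> feb.active_indicator
""").fetchone()[0]

print(changed)
```

An inner join ignores keys that exist in only one partition. If added or dropped keys should count as changes, a full outer join would be needed instead.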
Why does Hive throw me an error while using Order by date?
I am trying to write a query in Hive and I am seeing the following error: “Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: Primitve type DATE not supported in Value Boundary expression.” I used the same query
Grouping by a Range of Numbers with Aggregate Function
I have data in Hive that has two columns of interest. The first is a column (int) that represents a date (YYYYMM) and the second is a column (int) that represents a number of people for that date. I’m trying to write a query that sums up the number of people for comparison across two quarters. For example, I want
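The question is cut off before the example. One way to group an integer YYYYMM column by quarter is to derive the year and quarter arithmetically. Below is a minimal sketch, run through Python's sqlite3 as a stand-in. The `monthly` table, its column names and the sample figures are hypothetical. It relies on SQLite truncating integer division. In Hive that step may need `DIV` or `floor()` rather than `/`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: yyyymm is an int such as 201904.
conn.execute("CREATE TABLE monthly (yyyymm INTEGER, people INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [(201901, 10), (201902, 20), (201903, 30),
     (201904, 5), (201906, 15)],
)

# Year = yyyymm / 100; quarter = (month - 1) / 3 + 1, using integer division.
by_quarter = conn.execute("""
    SELECT yr, qtr, SUM(people) AS total_people
    FROM (
        SELECT yyyymm / 100                 AS yr,
               ((yyyymm % 100) - 1) / 3 + 1 AS qtr,
               people
        FROM monthly
    ) AS with_quarter
    GROUP BY yr, qtr
    ORDER BY yr, qtr
""").fetchall()

for row in by_quarter:
    print(row)
```

Comparing two quarters then comes down to filtering or pivoting these grouped rows.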
How to use an alias in Hive?
I am trying to find unique cities using a window function, but I am not able to use an alias in this query. Answer: You cannot have a window function in the WHERE clause. Put it in a subquery and do the filter afterwards:
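The answer's query is not included in this excerpt. The pattern it describes can be sketched as follows, run through Python's sqlite3 as a stand-in for Hive. The `customers` table and its columns are hypothetical. The window function is computed and aliased in the inner query, and the outer query filters on that alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ann", "Pune"), ("Bob", "Pune"), ("Cy", "Oslo")],
)

# rn numbers the rows within each city; keeping rn = 1 leaves one row per city.
unique_cities = conn.execute("""
    SELECT city
    FROM (
        SELECT city,
               ROW_NUMBER() OVER (PARTITION BY city ORDER BY name) AS rn
        FROM customers
    ) AS ranked
    WHERE rn = 1
    ORDER BY city
""").fetchall()

print(unique_cities)
```

The filter has to sit in the outer query because `WHERE` is evaluated before window functions are computed, so the alias `rn` does not exist yet at that point in the inner query.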
Is there a way to parse csv string with escapings via HQL/SQL?
I have a problem parsing csv-formatted data that is stored in a Hive table column and loaded into a PostgreSQL DB afterwards. What I need to do is retrieve some fields from it; however, if a comma is inside quotes, it should be treated as part of the data to retrieve. On top of that, quotes can themselves be escaped.