How to UPDATE a value in a Hive table?

I have a flag column in a Hive table that I want to update after some processing. I tried the query below in both Hive and Impala, but it didn’t work; the error said the table needs to be a Kudu table …

In Hive, how to read through NULL / empty tags present within an XML using explode(XPATH(..)) function?

In the Hive query below, I need to read the null / empty “string” tags from the XML content as well. Currently, only the non-null “string” tags are included in the XPATH() list….

AWS Athena custom data format?

I’d like to query my app logs on S3 with AWS Athena but I’m having trouble creating the table/specifying the data format. This is how the log lines look: 2020-12-09T18:08:48.789Z {“reqid”:&…

SQL: Expression Not in GROUP BY Key

I have a transaction table t1 in Hive that looks like this: store_id cust_id zip_code transaction_count spend 1000 100 123 3 50 2000 200 …

How to select all the values in Hive with distinct on 2 columns

I have a hive table that looks like this (total 460 columns) colA colB ……. ce_id filename ……… dt v j 4 gg 40 v j 5 gg …

Hive – find 2 characters anywhere in the string/row – RLIKE

How do I get ONLY the “_WA” data assigned to “USA_RBB_WA_BU”? The column I look at has rows that contain both _WA and _SA (USA_CA_SAWANT). I used, select…. …

Add missing monthly rows

I would like to list the missing dates between two dates in a query. For example, my data: YEAR_MONTH | AMOUNT 202001 | 500 202001 | 600 201912 | 100 201910 | 200 201910 | …

Why am I getting an error while performing GROUP BY in Hive?

I am executing the command below in Hive: Select child.data_volume_gprs_dl + child.data_volume_gprs_ul as data_usage, parent.file_name, parent.record_number from table1 as parent left …

Array operation on Hive collect_set

I am working in Hive on a large dataset. I have a table with an array column, and the content of the column is as follows. [“20190302Prod4” “20190303Prod1” “20190303Prod4” “20190304Prod4” “20190305Prod3” “…

Merge update records in a final table

I have a user table in Hive of the form: User: Id String, Name String, Col1 String, UpdateTimestamp Timestamp. I’m inserting data into this table from a file which has the following format: I/U,…