This is an extension of a previous question I asked: Is it possible to change an existing column’s metadata on an EXTERNAL table that is defined by an AVRO schema file? Question: In Hive 2.1.1, how do I INSERT data FROM a PARTITIONED table INTO a PARTITIONED table? What is the correct syntax? I have seen material all over the
Tag: hive
In Hive, how to convert an array of string to an array of numeric numbers
I have a hive table that looks like the following: I want the result to be the following: I need to convert them into an array of float so that I can use them in ST_Contains(ST_MultiPolygon(), st_point()) to determine if a point is in an area. I am new to Hive and not sure if that is possible; any help would
Extracting unique values with SQL [closed]
I’m new to SQL and would greatly appreciate your help with extracting data from a hive table. The table contains two relevant columns: host and url. The url column has a lot of duplicates and …
Join three tables based on one key, putting data into same column
I have three tables that I am trying to join together to check that the proper data matches. I have Table A, which is a list of all accounts that a commission was paid on and what that commission amount was. I have Table B and Table C, which are two tables that have commission calculations in them. The goal
Array operation on hive collect_set
I am working in Hive on a large dataset. I have a table with an array column, and the content of the column is as follows: [“20190302Prod4” “20190303Prod1” “20190303Prod4” “20190304Prod4” “20190305Prod3” “…
Generate range hours and range numbers SQL/HQL
I have a problem with a table. I currently have this empty hours table and I need to fill it automatically with a query in HiveQL. The idea is to generate, in the first column “key”, values between 000000 and 235959 and, in the second column “hours”, values between 00:00:00 and 23:59:59. Now my table is empty: Future table that I need
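A minimal Python sketch of the rows such a table would hold, which may be handy for checking the output of whatever HiveQL query ends up filling it. It assumes one row per second of the day, with “key” as HHMMSS and “hours” as HH:MM:SS; the function name `build_hours_rows` is made up for illustration and is not part of the question.

```python
def build_hours_rows():
    """Return one (key, hours) pair per second of the day.

    Assumption: "key" is the time as HHMMSS and "hours" is HH:MM:SS,
    so only valid clock times appear (no 006000, for example).
    """
    rows = []
    for second_of_day in range(24 * 60 * 60):
        hour, remainder = divmod(second_of_day, 3600)
        minute, second = divmod(remainder, 60)
        key = f"{hour:02d}{minute:02d}{second:02d}"
        hours = f"{hour:02d}:{minute:02d}:{second:02d}"
        rows.append((key, hours))
    return rows


rows = build_hours_rows()
# Row count, first row, and last row of the generated table.
print(len(rows), rows[0], rows[-1])
```

Under that assumption the table has 86,400 rows, running from ('000000', '00:00:00') to ('235959', '23:59:59').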
Merge update records in a final table
I have a user table in Hive of the form: User: Id String, Name String, Col1 String, UpdateTimestamp Timestamp I’m inserting data in this table from a file which has the following format: I/U,…
PutHiveQL NiFi Processor extremely slow – misconfiguration?
I am currently setting up a simple NiFi flow that reads from an RDBMS source and writes to a Hive sink. The flow works as expected until the PutHiveQL processor, which is running extremely slowly. It inserts approximately one record every minute. It is currently set up as a standalone instance running on one node. The logs showing the insert every 1
Hive Most Popular in each group
I have three tables: BX-Books.csv (ISBN, Book-Title, Book-Author, Year-Of-Publication, Publisher), BX-Book-Ratings.csv (User-ID, ISBN, Book-Rating), and BX-Users.csv (User-ID, Location, Age). I have to find most …
SQL: Select the Running Total of Column C for pairwise combinations of two other columns A and B
I’m trying to query a table and calculate the running sum of a column’s values for pairwise combinations of two other columns. Specifically, given the following table: CREATE TABLE test ( bucket int(…
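A rough sketch of one way to get a per-pair running total, using a window function partitioned by the two grouping columns. It is run here against an in-memory SQLite database from Python (window functions need SQLite 3.25 or newer) purely so it can be tried locally; the `SUM(...) OVER (PARTITION BY ... ORDER BY ...)` clause is standard SQL that Hive's windowing support should also accept, but that is untested here. The table `demo`, the columns `a`, `b`, `c`, and the ordering column `seq` are all assumptions for illustration, since the original table definition is cut off.

```python
import sqlite3


def running_totals(rows):
    """Return (seq, a, b, c, running_total) with the total restarting per (a, b) pair.

    Assumption: `seq` defines the order in which values of c accumulate.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (seq INTEGER, a TEXT, b TEXT, c INTEGER)")
    conn.executemany("INSERT INTO demo VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT seq, a, b, c,
               SUM(c) OVER (
                   PARTITION BY a, b
                   ORDER BY seq
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS running_total
        FROM demo
        ORDER BY a, b, seq
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (seq, a, b, c)
sample = [
    (1, "x", "p", 10),
    (2, "x", "p", 5),
    (3, "x", "q", 7),
    (4, "y", "p", 1),
    (5, "x", "q", 3),
]
for row in running_totals(sample):
    print(row)
```

Each distinct (a, b) pair forms its own partition, so the sum starts over whenever either column changes.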