Tag: hive

Join three tables based on one key, putting data into same column

I have three tables that I am trying to join together to check that the data matches. Table A is a list of all accounts that a commission was paid on, along with the commission amount. Table B and Table C are two tables that contain commission calculations. The goal

Array operation on Hive collect_set

I am working in Hive on a large dataset. I have a table with an array column, and the content of the column is as follows: [“20190302Prod4” “20190303Prod1” “20190303Prod4” “20190304Prod4” “20190305Prod3” “…

Generate a range of hours and a range of numbers in SQL/HQL

I have a problem with a table. I currently have this empty hours table, and I need to fill it automatically with a HiveQL query. The idea is to generate values between 000000 and 235959 in the first column, “key”, and values between 00:00:00 and 23:59:59 in the second column, “hours”. Now my table is empty: Future table that I need

Merge update records in a final table

I have a user table in Hive of the form: User: Id String, Name String, Col1 String, UpdateTimestamp Timestamp I’m inserting data in this table from a file which has the following format: I/U,…

PutHiveQL NiFi Processor extremely slow – misconfiguration?

I am currently setting up a simple NiFi flow that reads from an RDBMS source and writes to a Hive sink. The flow works as expected until the PutHiveQL processor, which runs extremely slowly. It inserts approximately one record every minute. It is currently set up as a standalone instance running on one node. The logs showing the insert every 1
