
Tag: hiveql

Extract number between two characters in Hive SQL

The query below outputs 1642575.0, but I only want 1642575 (just the number, without the decimal point and the zero following it). The number of delimited values in the field varies. The only constant is that there is always exactly one number with a decimal. I was trying to write a regexp function to extract the number between " and the decimal point. How
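A minimal sketch of one way to approach the extraction described above, using Hive's built-in `regexp_extract(subject, pattern, index)`. The sample string is hypothetical, since the original query and data are not shown in the excerpt; the pattern assumes the wanted value is the run of digits immediately before the only decimal point in the field.

```sql
-- Hypothetical input: a delimited field containing one decimal number.
-- The backslashes are doubled because Hive string literals treat
-- a single backslash as an escape character.
SELECT regexp_extract('abc,"1642575.0",def', '(\\d+)\\.\\d', 1) AS whole_part;
-- Capture group 1 holds the digits before the decimal point.
```

Note that `regexp_extract` returns a string, so a cast (for example to `bigint`) may be needed if a numeric type is required.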

Is there a best way to join multiple tables

Can someone please help with joining/merging the tables as shown below? I know how to do it if the department columns (depart_1, depart_2, depart_3) are in one table, but I cannot work out this scenario because they are in different tables. I have almost 100 fields like department, so I am a little concerned about performance as well. Answer: By using JOIN and UNION

User Defined Column name in select statement in Hivesql

I need to create a user-defined column name in Hive SQL, as in the PostgreSQL query below. Could you please help me with this? Answer: Use backticks. However, it is not possible to preserve case because of a Hive limitation; the resulting column name will be in lower case: total customers. See this answer: https://stackoverflow.com/a/57183048/2700344
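The backtick approach from the answer can be sketched as follows. The table name `customers` is hypothetical, since the original query is not shown in the excerpt; only the alias `total customers` comes from the answer text.

```sql
-- Hypothetical table name, for illustration only.
-- Backticks allow a column alias that contains a space.
SELECT COUNT(*) AS `Total Customers`
FROM customers;
-- As the answer notes, Hive does not preserve case here:
-- the column comes back named "total customers".
```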

insert extra rows in query result sql

Given a table with entries at irregular timestamps, “breaks” must be inserted at regular 5-minute intervals (the associated data can/will be NULL). I was thinking of getting the start time and making a subquery that uses a window function to add 5-minute intervals to the start time, but I only could think of

Hive SQL cast string as timestamp without losing the milliseconds

I have string data in the form 2020-10-21 12:49:27.090 and I want to cast it as a timestamp. When I do this: select cast(column_name as timestamp) as column_name from table_name all of the milliseconds are dropped, like this: 2020-10-21 12:49:27. I also tried this: select cast(date_format(column_name, 'yyyy-MM-dd HH:mm:ss.SSS') as timestamp) as column_name from table_name and the same problem persists, it drops the

Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns

I have a dataset of hotel bookings. date_in has the format "yyyy-MM-dd". I need to select the top 10 most visited hotels by month. I get the following error: Error: Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line
