
Tag: hive

Hive – Query to get Saturday as week start date for a given date

I have a requirement in Hive SQL to calculate the week start date for a given date, with Saturday as the first day of the week. Eg) I tried using pmod and other date functions but am not getting the desired output. Any insight is much appreciated. Answer: Hive offers next_day(), which can be adapted for this purpose. I think the logic you want is: This is a
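A minimal sketch of how next_day() could be adapted here, not necessarily the logic the truncated answer goes on to give. It assumes Hive 1.2.0 or later (where next_day() is available) and a hypothetical table my_table with a date (or 'yyyy-MM-dd' string) column dt. next_day() returns the first Saturday strictly after dt, so stepping back 7 days lands on the Saturday on or before dt:

```sql
-- my_table and dt are hypothetical names used only for illustration.
SELECT dt,
       -- next_day(dt, 'SAT') is the first Saturday strictly after dt;
       -- subtracting 7 days gives the Saturday on or before dt.
       date_sub(next_day(dt, 'SAT'), 7) AS week_start_saturday
FROM my_table;
```

With this expression a Saturday maps to itself, and Sunday 2021-01-03 maps to Saturday 2021-01-02.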

Extract number between two characters in Hive SQL

The query below outputs 1642575.0, but I only want 1642575 (just the number, without the decimal point and the zero following it). The number of delimited values in the field varies. The only constant is that there is always exactly one number with a decimal. I was trying to write a regexp function to extract the number between the characters " and . How
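One possible approach, sketched under assumptions because the original query and data are not shown: regexp_extract() can capture the digits that sit immediately before the decimal point. The table my_table and column raw_field are hypothetical names. Note that backslashes must be doubled inside a Hive string literal.

```sql
-- my_table and raw_field are hypothetical; the real field layout is not shown in the excerpt.
SELECT raw_field,
       -- Capture group 1 holds the digits before the decimal point of the
       -- first "digits.digits" match; the fractional part is matched but discarded.
       regexp_extract(raw_field, '(\\d+)\\.\\d+', 1) AS whole_number
FROM my_table;
```

regexp_extract() returns a string, so wrap it in CAST(... AS BIGINT) if a numeric type is needed. This relies on the stated constant that only one value in the field contains a decimal.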

determine duplication rate per group

EDIT: The previous sample data included the duplicate visits column; I need that calculated in the solution. I am trying to determine: total_visits = total visits per website per sub_group; duplicate_visits = visits - 1; duplication_rate = duplicate_visits / total_visits; distinct_users_subgroup = distinct users per website per sub_group; distinct_users_total = distinct users per website, for the sample data below, which I hope to
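The metrics listed above could be computed roughly as follows. This is only a sketch: the sample data is not shown, so the table visits (one row per visit) and its columns website, sub_group and user_id are hypothetical. It also assumes that "visits - 1" is meant per user, so that duplicate visits in a group equal total visits minus distinct users; adjust if the intended definition differs.

```sql
-- Hypothetical input: visits(website, sub_group, user_id), one row per visit.
WITH per_group AS (
  SELECT website,
         sub_group,
         COUNT(*)                AS total_visits,
         COUNT(DISTINCT user_id) AS distinct_users_subgroup
  FROM visits
  GROUP BY website, sub_group
),
per_site AS (
  SELECT website,
         COUNT(DISTINCT user_id) AS distinct_users_total
  FROM visits
  GROUP BY website
)
SELECT g.website,
       g.sub_group,
       g.total_visits,
       -- Assumption: every visit by a user beyond their first counts as a duplicate.
       g.total_visits - g.distinct_users_subgroup                    AS duplicate_visits,
       (g.total_visits - g.distinct_users_subgroup) / g.total_visits AS duplication_rate,
       g.distinct_users_subgroup,
       s.distinct_users_total
FROM per_group g
JOIN per_site s
  ON g.website = s.website;
```

The two aggregation levels are computed separately and joined, which avoids relying on COUNT(DISTINCT ...) inside a window function.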

Is there a best way to join multiple tables

Can someone please help with joining/merging tables like the ones below? I know how to do it if the department columns (depart_1, depart_2, depart_3) are in one table, but I am not able to handle this scenario because they are in different tables. I have almost 100 fields like department, so I am a little concerned about performance as well. Answer: By using JOIN and UNION

User Defined Column name in select statement in Hivesql

I need to create a user-defined column name in Hive SQL, as in the PostgreSQL query below. Could you please help me with this? Answer: Use backticks: But it is not possible to preserve case due to a Hive limitation. The resulting column name will be in lower case: total customers See this answer: https://stackoverflow.com/a/57183048/2700344
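A small illustration of the backtick approach, since the original queries are not reproduced in the excerpt. The table customers is a hypothetical name; only the alias syntax matters here.

```sql
-- customers is a hypothetical table used only to show the alias syntax.
-- Backticks allow a space in the column alias.
SELECT COUNT(*) AS `Total Customers`
FROM customers;
```

As the answer notes, Hive does not preserve the case, so the column comes back as total customers.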

insert extra rows in query result sql

Given a table with entries at irregular timestamps, “breaks” must be inserted at regular 5 min intervals (the associated data can / will be NULL). I was thinking of getting the start time, making a subquery that has a window function and adds 5 min intervals to the start time, but I could only think of
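The idea of adding 5 min intervals to the start time could be sketched in Hive like this. It is an illustration under assumptions, not the answer from the original thread: the table events and its timestamp column ts are hypothetical, and the series is generated with the common space/split/posexplode idiom rather than a window function.

```sql
-- events(ts) is a hypothetical table with irregular timestamps.
WITH bounds AS (
  SELECT MIN(unix_timestamp(ts)) AS start_s,
         MAX(unix_timestamp(ts)) AS end_s
  FROM events
)
SELECT from_unixtime(b.start_s + p.pos * 300) AS break_ts  -- 300 seconds = 5 min
FROM bounds b
-- space(n) yields n spaces; splitting on ' ' yields n + 1 elements,
-- so posexplode produces positions 0..n, one per 5 min step.
LATERAL VIEW posexplode(
  split(space(CAST((b.end_s - b.start_s) / 300 AS INT)), ' ')
) p AS pos, val;
```

The generated break_ts rows could then be combined with the original rows (for example with UNION ALL and NULL for the data columns) to produce the result with breaks inserted.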
