Tag: hive

Hive – Query to get Saturday as week start date for a given date

I have a requirement in Hive SQL to calculate the week start date for a given date, with Saturday as the first day of the week. Eg) I tried using pmod and other date functions but did not get the desired output. Any insight is much appreciated. Answer: Hive offers next_day(), which can be adapted for this purpose. I think the logic you want is: This is a
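The next_day() idea from the answer can be checked outside Hive. The following is a minimal Python sketch of the date arithmetic only, assuming the intended rule is "take the first Saturday strictly after the given date, then step back seven days". The function name week_start_saturday is hypothetical, and the answer's actual query is not shown in this excerpt.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday(): Monday is 0, so Saturday is 5


def week_start_saturday(d: date) -> date:
    """Return the Saturday that starts the week containing d."""
    # Days until the first Saturday strictly after d (always 1..7).
    # A Saturday input gives 7, so it maps back to itself below.
    days_ahead = (SATURDAY - d.weekday() - 1) % 7 + 1
    next_saturday = d + timedelta(days=days_ahead)
    return next_saturday - timedelta(days=7)


# 2024-01-10 is a Wednesday; its week started on Saturday 2024-01-06.
print(week_start_saturday(date(2024, 1, 10)))  # prints 2024-01-06
```

In Hive the same rule would presumably be written along the lines of date_sub(next_day(dt, 'SAT'), 7), where dt is a placeholder column name. That is an assumption, not the answer's own query.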

How to get min and max from 7 columns in Hive Hue excluding zeros

hive hiveql hue sql

I have a table which has 9 columns. Below is its structure. I need the min and max of these columns for each row, excluding zeros. Below is the required table structure. If you look at the columns min and max, min is the minimum of the 7 columns (col1 to col7) in a particular row excluding zeros, and max is
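The row-wise rule described in the question can be sketched in Python to make the expected result concrete. This is an illustration of the logic only, not a Hive query. The function name nonzero_min_max is hypothetical, and returning None for a row made up entirely of zeros is an assumption, since the excerpt does not say what should happen in that case.

```python
from typing import Optional, Sequence, Tuple


def nonzero_min_max(
    row: Sequence[Optional[float]],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (min, max) of the row's values, ignoring zeros and missing values."""
    nonzero = [v for v in row if v is not None and v != 0]
    if not nonzero:
        # Assumption: a row with nothing but zeros has no min or max.
        return None, None
    return min(nonzero), max(nonzero)


# col1..col7 for one hypothetical row
print(nonzero_min_max([0, 3, 7, 0, 1, 9, 4]))  # prints (1, 9)
```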

Extract number between two characters in Hive SQL

hive hiveql numeric regex sql

The query below outputs 1642575.0, but I only want 1642575 (just the number, without the decimal point and the zero following it). The number of delimited values in the field varies. The only constant is that there’s always only one number with a decimal. I was trying to write a regexp function to extract the number between " and the decimal point (.). How

determine duplication rate per group

hive sql

EDIT: Previous sample data included the duplicate visits column; I need that calculated in the solution. I am trying to determine: total_visits = total visits per website per sub_group; duplicate_visits = visits - 1; duplication_rate = duplicate_visits / total_visits; distinct_users_subgroup = distinct users per website per sub_group; distinct_users_total = distinct users per website; for the sample data below which I hope to

Is there a best way to join multiple tables

hive hiveql sql

Can someone please help with joining/merging the tables as shown below? I know how to do it if the department columns (depart_1, depart_2, depart_3) are in one table, but I am not able to handle this scenario because they are in different tables. I have almost 100 fields like department, so I am a little concerned about performance as well. Answer: By using JOIN and UNION

User Defined Column name in select statement in Hivesql

hive hiveql postgresql sql

I need to create a user-defined column name in HiveSql, as in the PostgreSQL query below. Could you please help me with this? Answer: Use backticks: But it is not possible to preserve case due to a Hive limitation. The resulting column name will be in lower case: total customers. See this answer: https://stackoverflow.com/a/57183048/2700344

Find Unique Count Postgresql query into hivesql

hive hiveql postgresql sql

I want to get unique customer counts. I have a PostgreSQL query for reference. Could you please convert this query into HiveSql? Answer: Use case expressions: One more method for counting distinct values is size(collect_set()):
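The two ideas in the answer, a CASE expression to pick out the rows of interest and a set whose size gives the distinct count, can be mimicked in Python to show what the result should look like. This is a sketch under assumptions: the (customer_id, status) row layout, the status values, and the function name distinct_customers are all hypothetical, since the original query is not shown in this excerpt.

```python
from typing import Iterable, Tuple


def distinct_customers(rows: Iterable[Tuple[str, str]], status: str) -> int:
    """Count unique customer ids among rows with the given status."""
    # Keep the id only for matching rows (the role of the CASE expression),
    # collect the ids into a set, and take its size.
    matching = {customer_id for customer_id, row_status in rows if row_status == status}
    return len(matching)


# Hypothetical sample rows: (customer_id, status)
rows = [
    ("c1", "active"),
    ("c1", "active"),
    ("c2", "inactive"),
    ("c3", "active"),
]
print(distinct_customers(rows, "active"))  # prints 2
```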

Convert Postgresql into HiveSql

hive hiveql mysql postgresql sql

How do I convert the PostgreSQL query below into HiveSql? Answer: Use CASE expressions like this: Also, in Hive version >= 1.3 you can use the quarter function:
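Since the answer mentions both CASE expressions and the quarter function, one plausible reading is that the query derives a calendar quarter from a date. The Python sketch below spells out that month-to-quarter mapping as explicit branches, the way a CASE expression would. It rests on that assumption, because the original query is not shown here, and the function name quarter_from_month is hypothetical.

```python
def quarter_from_month(month: int) -> int:
    """Map a calendar month (1..12) to its quarter (1..4)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # One branch per quarter, mirroring a CASE WHEN ... THEN ... chain.
    if month <= 3:
        return 1
    if month <= 6:
        return 2
    if month <= 9:
        return 3
    return 4


print([quarter_from_month(m) for m in (1, 4, 8, 12)])  # prints [1, 2, 3, 4]
```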

Setting transactional-table properties results in external table

hadoop hive impala parquet sql

I am creating a managed table via Impala as follows: This should result in a managed table which does not support HIVE-ACID. However, when I run the command I still end up with an external table. Why is this? Answer: I found out in the Cloudera documentation that omitting the EXTERNAL keyword when creating the table does not mean that the

insert extra rows in query result sql

date-range hive hiveql sql timestamp

Given a table with entries at irregular timestamps, “breaks” must be inserted at regular 5-minute intervals (the associated data can/will be NULL). I was thinking of getting the start time, then making a subquery that has a window function and adds 5-minute intervals to the start time – but I could only think of
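The approach outlined in the question (take the start time, step forward in 5-minute increments, and add a row with empty data at each step) can be sketched in Python to show the expected output shape. This illustrates the logic only and is not a Hive query. It assumes the breaks are anchored at the earliest timestamp and stop at the latest one, and that a break coinciding with an existing entry keeps that entry's data; the function name insert_breaks is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


def insert_breaks(
    entries: Dict[datetime, str],
    step: timedelta = timedelta(minutes=5),
) -> List[Tuple[datetime, Optional[str]]]:
    """Merge existing entries with regularly spaced break rows (data = None)."""
    if not entries:
        return []
    start, end = min(entries), max(entries)
    merged: Dict[datetime, Optional[str]] = dict(entries)
    tick = start
    while tick <= end:
        # Add a break row only where no real entry already exists.
        merged.setdefault(tick, None)
        tick += step
    return sorted(merged.items(), key=lambda item: item[0])


# Hypothetical sample entries at irregular timestamps
sample = {
    datetime(2024, 1, 1, 10, 0): "a",
    datetime(2024, 1, 1, 10, 7): "b",
    datetime(2024, 1, 1, 10, 12): "c",
}
for ts, data in insert_breaks(sample):
    print(ts.strftime("%H:%M"), data)
```

With the sample above, break rows appear at 10:05 and 10:10, interleaved with the three original entries in time order.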