This is probably a simple problem but I am quite a noob in SQL. I am using Impala. So I have data like this:
New_ID  Date                 Old_ID
1       2020-11-14 12:41:21  0
1       2020-11-14 12:50:40  1
2       2020-10-14 15:22:00  1.5
2       2020-12-18 11:31:05  2
3       2020-11-14 12:42:25  3
Assuming that I group by New_ID, I need to check that the difference
Tag: impala
SQL (HUE): Is there any way to convert 24-hour time into 12-hour AM/PM format with hour buckets?
I have table A, which contains a time column stored as a timestamp datatype. Table A: contains the time column in HH:MM:SS 24-hour format. Answer: Please use the code below. Replace now() with time for your query. Explanation: first I check whether the hour is greater than 12. If it is, I subtract 12 to get the hour. Then I set AM/PM based on the hour.
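The answer's original code is not reproduced in this excerpt. As a rough illustration of the logic it describes, here is a minimal sketch that runs a CASE expression through Python's sqlite3 module. SQLite only stands in for Impala here, so the hour extraction with strftime is an assumption and would need Impala's own hour function instead. The table name a, the column name ts and the helper to_12_hour are hypothetical. The sketch also handles hours 0 and 12, which the explanation above does not mention.

```python
import sqlite3

# Portable CASE logic: subtract 12 from hours above 12, then label AM/PM.
# The strftime() call is SQLite-specific and only stands in for Impala.
QUERY = """
SELECT ts,
       CASE WHEN h = 0 THEN 12        -- midnight is 12 AM
            WHEN h > 12 THEN h - 12   -- 13..23 become 1..11
            ELSE h END AS hour12,
       CASE WHEN h >= 12 THEN 'PM' ELSE 'AM' END AS meridiem
FROM (SELECT ts, CAST(strftime('%H', ts) AS INTEGER) AS h FROM a)
ORDER BY ts
"""


def to_12_hour(timestamps):
    """Load timestamps into a throwaway table and return (ts, hour12, meridiem)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (ts TEXT)")  # hypothetical table and column
    conn.executemany("INSERT INTO a VALUES (?)", [(t,) for t in timestamps])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in to_12_hour(["2020-11-14 00:05:00", "2020-11-14 15:22:00"]):
        print(row)
```

The hour is computed once in the inner query so that both CASE expressions can reuse it.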
Combining Aggregate Function with resampling in Impala
I have a table in Hadoop containing data for different sensor units with a sampling time ts of 1 ms. I can resample the data for a single unit with a combination of different aggregate functions using the following query in Impala (let’s say I want to resample the data every 5 minutes using LAST_VALUE() as the aggregate
Setting transactional-table properties results in external table
I am creating a managed table via Impala as follows: This should result in a managed table which does not support Hive ACID. However, when I run the command I still end up with an external table. Why is this? Answer: I found out in the Cloudera documentation that omitting the EXTERNAL keyword when creating the table does not mean that the
Impala Last_Value() Not giving result as expected
I have a table in Impala in which I have time information as Unix time (with a frequency of 1 ms) and information about three variables, as given below: ts Val1 Val2 Val3 …
AVG over time Window in Impala … OVER (PARTITION BY … ORDER BY)
I have a table in Impala in which I have time information as Unix time with a frequency of 1 ms. I am trying to get the AVG(), MIN() and MAX() for a window of 10 seconds (but I do not want to fix the window size; it could be 20 seconds, 30 seconds, etc.). I am doing it using sub-queries but I am not getting the
Impala incompatible return types in case when statement
I am running an Impala query and trying to use a CASE WHEN statement: It complains: This, however, works fine: As the error message indicates, PRTCTN_ALLCTD_VL is of type decimal(38,10). Any advice is appreciated. Answer: This is a curious problem, one that I would not expect. A CASE expression returns a single type, so all the conditions have to be converted
Alternative way to run a query with join
I have the below query: select m.name, m.surname,m.teacher, c.classroom,c.floor from table1 as m inner join table2 as c on (m.name=c.name or m.surname = c.surname); But it takes a lot of time to …
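One commonly suggested alternative for a join with OR in its condition is to split it into two equi-joins combined with UNION. The sketch below runs both forms through Python's sqlite3 module on a few made-up rows, which stands in for Impala here. The sample data and the helper run_both are hypothetical, and the excerpt does not say whether this is the answer the author received. Note that UNION removes duplicate rows, so the two forms match only when the original result has no duplicates.

```python
import sqlite3

# The query from the question, with OR in the join condition.
OR_JOIN = """
SELECT m.name, m.surname, m.teacher, c.classroom, c.floor
FROM table1 AS m
INNER JOIN table2 AS c
  ON (m.name = c.name OR m.surname = c.surname)
"""

# Rewrite as two equi-joins; UNION removes rows matched by both conditions.
UNION_JOIN = """
SELECT m.name, m.surname, m.teacher, c.classroom, c.floor
FROM table1 AS m INNER JOIN table2 AS c ON m.name = c.name
UNION
SELECT m.name, m.surname, m.teacher, c.classroom, c.floor
FROM table1 AS m INNER JOIN table2 AS c ON m.surname = c.surname
"""


def run_both():
    """Run both queries on made-up sample rows and return the sorted results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (name TEXT, surname TEXT, teacher TEXT);
        CREATE TABLE table2 (name TEXT, surname TEXT, classroom TEXT, floor INTEGER);
        INSERT INTO table1 VALUES ('Ann', 'Lee', 'T1'), ('Bob', 'Ray', 'T2');
        INSERT INTO table2 VALUES ('Ann', 'Lee', 'C1', 1),
                                  ('Ann', 'Kim', 'C2', 2),
                                  ('Zed', 'Ray', 'C3', 3);
    """)
    or_rows = sorted(conn.execute(OR_JOIN).fetchall())
    union_rows = sorted(conn.execute(UNION_JOIN).fetchall())
    conn.close()
    return or_rows, union_rows


if __name__ == "__main__":
    or_rows, union_rows = run_both()
    print(or_rows == union_rows)
```

Whether this rewrite is actually faster depends on the engine and the data, so it is worth comparing the query plans before adopting it.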
How to UPDATE a value in hive table?
I have a flag column in a Hive table that I want to update after some processing. I tried the query below in both Hive and Impala, but it didn’t work; I got an error saying that it needs to be a Kudu table …
How does Impala implement GROUP BY extensions (CUBE, ROLLUP and GROUPING SETS) in a distributed way?
I’m learning how to implement the GROUP BY extensions (CUBE, ROLLUP and GROUPING SETS). I’ve looked at the FE several times, but I still can’t understand how grouping_ids are used to implement the GROUP BY extensions in collaboration with the BE in a distributed way. How does this work together with ExchangeNode? Does it involve ExchangeNode at all? Can someone help me through the maze? Answer: Impala introduced the