Tag: hiveql

Performance difference with WHERE condition in subquery/CTE

Is there a performance difference between applying the WHERE condition inside a subquery data source and applying it after the join? Let’s say I have two Hive tables, A and B, which are both partitioned on the field date. Is that query’s p…

Filtering records not containing numbers

I have a table that stores numbers in string format. Ideally, each value should be a 10-digit number in string format, but the table contains many junk values. I want to filter out the records that are not ideal …
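
The excerpt is cut off, so the exact rule for an "ideal" record is not known. As a rough sketch, assuming "ideal" means a string of exactly ten ASCII digits and nothing else, the check can be written as a full-string regular expression match. The names `is_ideal`, `split_records`, and the sample values are hypothetical and for illustration only. In Hive, the same idea might be expressed with an anchored pattern such as `^[0-9]{10}$` in an `RLIKE` predicate.

```python
import re

# Assumption: an "ideal" value is exactly ten ASCII digits, with no
# surrounding whitespace, signs, or separators.
TEN_DIGITS = re.compile(r"[0-9]{10}")


def is_ideal(value):
    """Return True only for strings made of exactly ten ASCII digits."""
    return value is not None and TEN_DIGITS.fullmatch(value) is not None


def split_records(values):
    """Partition values into (ideal, junk) lists, preserving input order."""
    ideal, junk = [], []
    for v in values:
        (ideal if is_ideal(v) else junk).append(v)
    return ideal, junk


# Hypothetical sample data: two well-formed values and several junk ones.
sample = ["0123456789", "12345", "12345abcde", "", None, " 1234567890", "9876543210"]
ideal, junk = split_records(sample)
```

Using `fullmatch` rather than `search` matters here: a value such as `12345678901` contains ten consecutive digits but is not itself a 10-digit number, so only a whole-string match rejects it.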

Convert PostgreSQL query to Hive/MySQL

I have this table: I want each footballer to appear only once in a new table. For instance, Messi appears twice, but I want to keep just one occurrence of Messi (any of them) in the new table. I am not sure how to convert it to either Hive or MySQL. This is what I want the desired results to look