I meet a strange problem in hive
The table a
:
id, domain 1, m.taobao.com 2, m.tmall.com
The table b
domain %taobao\.com% %tmall\.com%
When I use:
Select a.id, a.domain from a where a like '%taobao\.com%' or a.like '%tmall\.com%'
It works well
But when I use
select a.id, a.domain from a, b where a.domain like b.domain
I got return null.
The b.domain
is '%taobao\.com%'
or '%tmall\.com%'
. What’s the different using them direct in SQL query? What makes the second query failed?
Advertisement
Answer
Without '\'
in the template it works fine:
select a.* from (--Use your table instead of this subquery select stack(2, 1, 'm.taobao.com', 2, 'm.tmall.com') as (id, domain) )a cross join (--Use your table instead of this subquery select stack(2, '%taobao.com%', '%tmall.com%' ) as domain ) b where a.domain like b.domain;
Result:
OK 1 m.taobao.com 2 m.tmall.com
And the same with constant template. This does not produce rows:
select a.* from (--Use your table instead of this subquery select stack(2, 1, 'm.taobao.com', 2, 'm.tmall.com') as (id, domain) )a where a.domain like '%taobao\.com%'
And this works fine:
where a.domain like '%taobao.com%'
Documentation says that LIKE
operator recognizes only _
and %
templates: “The _ character in B matches any character in A (similar to . in posix regular expressions) while the % character in B matches an arbitrary number of characters in A (similar to .* in posix regular expressions). For example, ‘foobar’ like ‘foo’ evaluates to FALSE whereas ‘foobar’ like ‘foo_ _ _’ evaluates to TRUE and so does ‘foobar’ like ‘foo%’.” So, you do not need to escape '.'
for LIKE operator.
You need to escape '.'
to match dot character it only for RLIKE
operator because it uses Java regular expression.