Skip to content
Advertisement

How to count all rows in raw data file using Hive?

I am reading some raw input which looks something like this:

Note the first two rows are “good” rows and the last two rows are “bad” rows since they are missing some data.

Here is the snippet of my hive query which is reading this raw data into a readonly external table:

I need to get the count of ALL the rows, both “good” and “bad.” The problem is some of the data is missing, and if I do SELECT count(id) as total_rows for example, that doesn’t work since not all the rows have an id.

Any suggestions on how I can count ALL the rows in this raw data file?

Advertisement

Answer

Hmmm . . . You can use:

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement