I have a SQL output something like below from the output of a custom tool. Would appreciate any help in finding what I am doing incorrectly.
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11 --------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+---------------- cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00 (1 row)
I am trying to pipe this output the columns I need in my case column1, column2, and column7. I have tried piping out like this but it just prints column1
tool check | awk '{print $1, $2}'
column1 | --------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+------------------------------- cec75 | (1 row)
It would be nice to have something like this.
ce7c5,1234,SOME_TEXT
My file contents
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11 --------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+------------------------------- 6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00 (1 row) column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11 --------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+------------------------------- cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00 (1 row)
Advertisement
Answer
Description:
Command line switches…
- The delimiter is
|
surrounded by spaces. (Note that we need to use a couple of‘s to escape
|
if we feed the regex for the delimiter in from the command line.) - In addition to input delimiter (input field separator) the output delimiter (output field separator) can also be set using a command line switch.
The awk script…
- If a header is encountered or a
(
is seen on a line, it’s not a valid line; so, just ignore it. - If the line now has any alphanumeric characters, it’s now a valid line to operate on; so, and we strip the leading spaces off the line, and then print the columns we want.
tool check | awk -F' *\| *' -v OFS=, '/column|(/ { next } /[[:alnum:]]/ { sub(/^ +/, ""); print $1, $2, $7 }'
Examining the data more closely… It looks as though the date-stamp (which always has a :
in it) might be present on all valid records… If so, the script can be reduced to something much more simple.
tool check | awk -F' *\| *' -v OFS=, '$10 ~ /:/ { sub(/^ +/, ""); print $1, $2, $7 }'