I have 3 tables:
CREATE TABLE ticket ( ticket_id int, description varchar(1000)); CREATE TABLE document ( ticket_id int, file_name varchar(1000)); CREATE TABLE flex_fields ( ticket_id int, field_name varchar(1000), field_value varchar(1000));
When I join this tables by ticket_id and query, i get result like this:
ticket_id description file_name field_name field_value 4374959 some_description1 file1.txt created_date 12-12-12 4374959 some_description1 file1.txt created_by Johnny D. 4374959 some_description1 image.png add_info Empty 4374959 some_description1 image.png country_id UK 1111111 text report.xls created_date 12-12-12 1111111 text image2.png created_by Sarah C. 1111111 text image2.png report_size 123123 1111111 text image3.png modified_date 13-12-12 1111111 text image3.png modified_by James B.
But I want to blank duplicated values and get result like this:
ticket_id description file_name field_name field_value 4374959 some_description1 file1.txt created_date 12-12-12 image.png created_by Johnny D. add_info Empty country_id UK 1111111 text report.xls created_date 12-12-12 image2.png created_by Sarah C. image3.png report_size 123123 modified_date 13-12-12 modified_by James B.
Is it possible to get result set like this? I’m working in AWS Athena(Presto SQL), but I would really appreciate your ideas/suggestions on this regardless of the dbms. Thanks!
Advertisement
Answer
With the caveat that you should not be doing this in SQL because this is a presentation layer concern, not a data-layer concern – instead you should filter-out repeated column data only at the point where you’re rendering the HTML table or rendering to the screen somehow.
However if for some horrible reason you need to do this in SQL, something like this should work.
- Note the repeated use of explicit
ORDER BY
clauses to ensure correctness: never rely on implicit ordering. - Also note that CTEs and other sub-queries cannot have an
ORDER BY
: only top-levelSELECT
queries can have anORDER BY
clause, hence the need forROW_NUMBER() OVER ( ORDER BY ...)
in the CTE. - My query uses
ticket_id
for thedescription
column because I assume thatdescription
is keyed byticket_id
, if it is not then you need to comparedescription
directly.- Be-aware that textual comparisons are case-insensitive, which may or may not be desirable, otherwise use an explicit
COLLATION
to specify a case-sensitive comparison.
- Be-aware that textual comparisons are case-insensitive, which may or may not be desirable, otherwise use an explicit
WITH t AS ( SELECT ROW_NUMBER() OVER ( ORDER BY ticket_id, file_name, field_name ) AS n, ticket_id, description, file_name, field_name, field_value FROM etc ) SELECT NULLIF( LAG( t.ticket_id , 1 ) OVER ( ORDER BY t.n ) = t.ticket_id, t.ticket_id ) AS ticket_id, NULLIF( LAG( t.ticket_id , 1 ) OVER ( ORDER BY t.n ) = t.ticket_id, t.description ) AS description, NULLIF( LAG( t.file_name , 1 ) OVER ( ORDER BY t.n ) = t.file_name, t.file_name ) AS file_name, NULLIF( LAG( t.field_name , 1 ) OVER ( ORDER BY t.n ) = t.field_name, t.field_name ) AS field_name, NULLIF( LAG( t.field_value, 1 ) OVER ( ORDER BY t.n ) = t.field_value, t.field_value ) AS field_value FROM t ORDER BY t.ticket_id, t.file_name, t.field_name