Skip to content
Advertisement

SQL many-to-one join – how to get blank fields instead of duplicated values

I have 3 tables:

CREATE TABLE ticket (
       ticket_id int,
       description varchar(1000));

CREATE TABLE document (
       ticket_id int,
       file_name varchar(1000));

CREATE TABLE flex_fields (
       ticket_id int,
       field_name varchar(1000),
       field_value varchar(1000));

When I join this tables by ticket_id and query, i get result like this:

ticket_id   description         file_name   field_name     field_value

4374959     some_description1   file1.txt   created_date   12-12-12
4374959     some_description1   file1.txt   created_by     Johnny D.
4374959     some_description1   image.png   add_info       Empty
4374959     some_description1   image.png   country_id     UK
1111111     text                report.xls  created_date   12-12-12
1111111     text                image2.png  created_by     Sarah C.
1111111     text                image2.png  report_size    123123
1111111     text                image3.png  modified_date  13-12-12
1111111     text                image3.png  modified_by    James B.

But I want to blank duplicated values and get result like this:

ticket_id   description         file_name   field_name     field_value

4374959     some_description1   file1.txt   created_date   12-12-12
                                image.png   created_by     Johnny D.
                                            add_info       Empty
                                            country_id     UK
1111111     text                report.xls  created_date   12-12-12
                                image2.png  created_by     Sarah C.
                                image3.png  report_size    123123
                                            modified_date  13-12-12
                                            modified_by    James B.

Is it possible to get result set like this? I’m working in AWS Athena(Presto SQL), but I would really appreciate your ideas/suggestions on this regardless of the dbms. Thanks!

Advertisement

Answer

With the caveat that you should not be doing this in SQL because this is a presentation layer concern, not a data-layer concern – instead you should filter-out repeated column data only at the point where you’re rendering the HTML table or rendering to the screen somehow.

However if for some horrible reason you need to do this in SQL, something like this should work.

  • Note the repeated use of explicit ORDER BY clauses to ensure correctness: never rely on implicit ordering.
  • Also note that CTEs and other sub-queries cannot have an ORDER BY: only top-level SELECT queries can have an ORDER BY clause, hence the need for ROW_NUMBER() OVER ( ORDER BY ...) in the CTE.
  • My query uses ticket_id for the description column because I assume that description is keyed by ticket_id, if it is not then you need to compare description directly.
    • Be-aware that textual comparisons are case-insensitive, which may or may not be desirable, otherwise use an explicit COLLATION to specify a case-sensitive comparison.

WITH t AS (
    SELECT
        ROW_NUMBER() OVER ( ORDER BY ticket_id, file_name, field_name ) AS n,
        ticket_id,
        description,
        file_name,
        field_name,
        field_value
    FROM
        etc
)
SELECT
    NULLIF( LAG( t.ticket_id  , 1 ) OVER ( ORDER BY t.n ) = t.ticket_id, t.ticket_id     ) AS ticket_id,
    NULLIF( LAG( t.ticket_id  , 1 ) OVER ( ORDER BY t.n ) = t.ticket_id, t.description   ) AS description,
    NULLIF( LAG( t.file_name  , 1 ) OVER ( ORDER BY t.n ) = t.file_name, t.file_name     ) AS file_name,
    NULLIF( LAG( t.field_name , 1 ) OVER ( ORDER BY t.n ) = t.field_name, t.field_name   ) AS field_name, 
    NULLIF( LAG( t.field_value, 1 ) OVER ( ORDER BY t.n ) = t.field_value, t.field_value ) AS field_value
FROM
    t
ORDER BY
    t.ticket_id,
    t.file_name,
    t.field_name
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement