I am trying to learn SQL-SERVER
and I have created the below query:
WITH T AS ( SELECT ROW_NUMBER() OVER(ORDER BY d.DIALOG_ID) as row_num, * FROM test.db as d INNER JOIN test.dbs as ds ON d.DIALOG_ID = ds.DIALOG_ID ) SELECT * FROM T WHERE row_num <=10;
I found that the only way to limit is with ROW_NUMBER()
.
Although when I try to run the join I have this error:
org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [8156] [S0001]: The column ‘DIALOG_ID’ was specified multiple times for ‘T’.
Advertisement
Answer
The problem: In the WITH
, you do SELECT *
which gets all columns from both tables db
and dbs
. Both have a column DIALOG_ID
, so a column by that name ends up twice in the result set of the WITH.
Although until here that is all allowed, it is not good practice: why have the same data twice?
Things go wrong when SQL Server has to determine what SELECT * FROM T
means: it expands SELECT * to the actual columns of T, but it finds a duplicate column name, and then it refuses to continue.
The fix (and also highly recommended in general): be specific about the columns that you want to output. If T
has no duplicate columns, then SELECT * FROM T
will succeed.
Note that the even-more-pure variant is to also be specific about what columns you select from T. By doing that it becomes clear at a glance what the SELECT produces, instead of having to guess or investigate when you look at the query later on (or when someone else does).
The updated code would look like this (fill in your column names as we don’t know them):
WITH T AS ( SELECT ROW_NUMBER() OVER(ORDER BY d.DIALOG_ID) as row_num, d.DIALOG_ID, d.SOME_OTHER_COL, ds.DS_ID, ds.SOME_OTHER_COL_2 FROM test.db AS d INNER JOIN test.dbs AS ds ON d.DIALOG_ID = ds.DIALOG_ID ) SELECT row_num, DIALOG_ID, SOME_OTHER_COL, DS_ID, SOME_OTHER_COL_2 FROM T WHERE row_num <= 10;