Skip to content
Advertisement

How can I get rid of having to prefix a WHERE query with ‘N’ for Unicode strings?

When searching for a string in our database where the column is of type nvarchar, specifying the ‘N’ prefix in the query nets some results. Leaving it out does not. I am trying the search for a Simplified Chinese string in a database that previously did not store any Chinese strings yet.

The EntityFramework application that uses the database, correctly retrieves the strings and the LINQ queries also work in the application. However, in SQL Server 2014 Management Studio, when I do a an SQL query for the string it does not show up unless I specify the ‘N’ prefix for unicode. (Even though the column is nvarchar type)

Works:

var text = from asd in Translations.TranslationStrings
            where asd.Text == "嗄法吖无上几"
            select asd;

            MessageBox.Show(text.FirstOrDefault().Text);

Does not work:

SELECT *
  FROM TranslationStrings
  where Text = '嗄法吖无上几'

If I prefix the Chinese characters with ‘N’ it works.

Works:

SELECT *
  FROM TranslationStrings
  where Text = N'嗄法吖无上几'

Please excuse the Chinese characters, I just typed something random. My question is, is there something I can do to not have to include the ‘N’ prefix when doing a query?

Thank you very much!

Advertisement

Answer

As @sworkalot has mentioned below:

The default for .Net is Unicode, that’s why you don’t need to specify it. This is not the case for Sql Manager.

If not specified Sql will assume that you work with asci according to the collation specified in your DB.

Hence, when working from Sql Server you need to use N’

https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/

Check out these examples, pay close attention to the data types and the values being assigned:

DECLARE @Varchar VARCHAR(100) = '嗄'
DECLARE @VarcharWithN VARCHAR(100) = N'嗄' -- Has N prefix

DECLARE @NVarchar NVARCHAR(100) = '嗄'
DECLARE @NVarcharWithN NVARCHAR(100) = N'嗄' -- Has N prefix


SELECT
    Varchar = @Varchar,
    VarcharWithN = @VarcharWithN,
    NVarchar = @NVarchar,
    NVarcharWithN = @NVarcharWithN

SELECT
    Varchar = CONVERT(VARBINARY, @Varchar),
    VarcharWithN = CONVERT(VARBINARY, @VarcharWithN),
    NVarchar = CONVERT(VARBINARY, @NVarchar),
    NVarcharWithN = CONVERT(VARBINARY, @NVarcharWithN)

Results:

Varchar VarcharWithN    NVarchar    NVarcharWithN
?       ?               ?           嗄

Varchar VarcharWithN    NVarchar    NVarcharWithN
0x3F    0x3F            0x3F00      0xC455

NVARCHAR data type stores 2 bytes for each character while VARCHAR only stores 1 (you can see this on the VARBINARY cast on the 2nd SELECT). Since chinese characters representation need 2 bytes to be stored, you have to use NVARCHAR to store them. If you try to stuff them in a VARCHAR it will be stored as ? and you will lose the original character information. This also happens on the 3rd example, because the literal doesn’t have the N so it’s converted to VARCHAR before actually assigning the value to the variable.

It’s because of this that you need to add the N prefix when typing these characters as literals, so the SQL engine knows that you are typing characters that need 2 byte representation. So if you are doing a comparison against a NVARCHAR column always add the N prefix. You can change the database collation, but it’s recommended to always use the proper data type independent of the collation so you don’t have problems when using coding on different databases.

If you could explain the reason why you want to omit the N prefix we might address that, although I believe there is no work around in this particular case.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement