Skip to content
Advertisement

What are the downsides of making a numeric column a string in a database?

I have members on my team who are tossing around the idea to make every column in the database a string including numeric columns.

I know that sorting becomes an issue.

What are the other downsides of making a numeric column a string?

Advertisement

Answer

The major issue is that users can put broken data into the columns — data that is not numeric. That is simply not possible with the correct type. Although you could add a check constraint for every numeric column, that seems like a lot of work.

The scenario is: You have a query that works and has worked for a long time. All of a sudden, someone puts a non-numeric value into the column. The query breaks. And because the query was (probably) using implicit conversion, it is really hard to tell where the problem is.

Let me just say: I am speaking from experience here.

Other problems are:

  • Comparisons don’t work as expected: '0' <> '0.0'.
  • Comparisons don’t work as expected: '9' > '100'.
  • Comparisons don’t work as expected: '.1' < '0.01'.
  • Sorting doesn’t work as expected.
  • The code is filled with (unnecessary and typically implicit) conversions.
  • Some databases, such as SQL Server, overload operators so '1' + '1' <> '2'.
  • Some databases overload operators, so current_timestamp + 1 is valid but current_timestamp + '1' is invalid.
  • A comparison in a query can affect index usage. So, strcol = 1 ends up converting strcol to a number, which typically precludes the use of an index. On the other hand, intcol = '1' ends up converting the constant to a number, which still allows the index to be used. I do not recommend mixing types in comparisons, though.

Space is a wash, because in many cases the string representation might be smaller than the number representation. It depends in that case. There is a slight hit on indexing, because fixed length keys are usually more efficient.

If you mix types, things get worse — because that affects the optimizer.

Some things that are composed of numbers are not necessarily numeric. You can usually tell the difference easily: does it make sense to perform arithmetic operations on the value? Or another indicator: do leading zeros make sense?

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement