Let’s say for example we have the following table:
Salary
table:
id | name | sex | salary |
---|---|---|---|
1 | A | m | 2500 |
2 | B | f | 1500 |
3 | C | m | 5500 |
4 | D | f | 500 |
If I write such query that swaps values from the column sex
where m
should be f
and f
should be m
this query works:
update Salary set sex = case when sex = 'm' then 'f' when sex = 'f' then 'm' else sex end
I’m used to some programming languages like R (data.table)
and Python (pandas)
where such operation would likely occur sequentially, first for when sex = 'm' then 'f'
and then for when sex = 'f' then 'm'
, in this case such approach would not work and one would need to make a copy of the column when performing the transformation. Does the SQL
query treats the whole statement as one ? SQL newbie here.
Advertisement
Answer
Yes. The case statement is evaluated as a whole in one go per row, and the result is then applied. Think of it as like a function that returns a single value based on which condition is first matched (it’s obviously not a function, but that framing is helpful).
This is true regardless of any details about row execution or set based operations. Even if you were iterating through a database table, as with a cursor, the case statement would still function this way.
The larger question of execution order for rows is an important one, so I’ll take a minute to go into it.
Most relational database systems have atomic queries by default, this means that to other queries, it looks as if an update was applied to the whole table ‘in one go.’ This allows us to behave as if query operations were fully set based, operations on a whole table, rather than sequentially on rows.
Under the hood, set based operations basically never translate into full concurrent disk writes. At some level, the set based SQL operation needs to be translated into one or more sequential write operations. The exact mechanism varies by RDBMS. Different databases also use different mechanisms for creating the illusion of atomicity, including locks and write ahead logs.
Now here’s where it gets complex. Some databases are not atomic. Old versions of MySQL using the MyISAM engine are not, and many databases can be configured to be non atomic. In these databases, a second query can see an intermediary state where only some rows are updated. This is generally considered very bad in most use cases.