T1 reads a row. It sees attribute x is set to 100. T2 does the same, adds 20 to attribute x in that row, and then commits. T1 adds 20 to attribute x, and then commits. What will happen?

The answer is as it is for everything else in software engineering: it depends. This time it depends on 100 different things, and every time I try to remember them all I have to do an hour of reading, so I wanted to summarize it here to check later when I need to, in just the amount of detail I’m interested in. You can read it too though. Just don’t complain if you find it too shallow, or too deep.

The SQL standard defines four transaction isolation levels:

  • Read Uncommitted
  • Read Committed
  • Repeatable Read
  • Serializable

Read Uncommitted

This practically doesn’t exist in Postgres. You can ask for it, but Postgres treats it as Read Committed: with MVCC you never get to see uncommitted data anyway, so there was nothing weaker to offer.

Read Committed

Default transaction isolation level in Postgres. Whatever you read is the latest committed version, so reads are not guaranteed to return the same result within the same transaction. What does this practically mean? You read value x in T1, T2 updates (and commits) that x, and now in T1 you add 20 to the value you read before T2’s change and save it. Postgres will have zero complaints, and T2’s change is simply lost. So you are investigating a bug where some updates get lost? Check your transaction isolation level.
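A minimal sketch of that scenario as two psql sessions, assuming a hypothetical table accounts(id, x) where the row with id = 1 starts at x = 100:

    -- T1:
    BEGIN;
    SELECT x FROM accounts WHERE id = 1;          -- sees 100

    -- T2, in another connection:
    BEGIN;
    UPDATE accounts SET x = x + 20 WHERE id = 1;
    COMMIT;                                       -- x is now 120

    -- T1 again, writing back 100 + 20 computed in the application:
    UPDATE accounts SET x = 120 WHERE id = 1;
    COMMIT;                                       -- no error; x ends up 120, T2's +20 is lost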

One question you may have: why is it called Read Committed, and how is the example here relevant to that name? Well, since the only guarantee is that you read the latest committed value, Postgres won’t check whether a row you read earlier has been concurrently updated since.

Repeatable Read

Repeatable Read is what its name suggests: you can repeat the reads of rows in the transaction and you’ll always get the same values. OK, but what does this practically mean? In the example above, T2 would still update the value and commit, but when T1 then tries to update the same row, Postgres aborts it with a serialization error (“could not serialize access due to concurrent update”), because the row changed after T1’s snapshot was taken. So less availability here, but more consistency, and your application has to be ready to retry.
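The same two sessions against the same hypothetical accounts table, but this time T1 runs at Repeatable Read:

    -- T1:
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT x FROM accounts WHERE id = 1;          -- sees 100, snapshot taken

    -- T2, in another connection (Read Committed is fine for it):
    UPDATE accounts SET x = x + 20 WHERE id = 1;  -- autocommits, x is now 120

    -- T1 again:
    UPDATE accounts SET x = 120 WHERE id = 1;
    -- ERROR:  could not serialize access due to concurrent update
    ROLLBACK;                                     -- T1 has to retry from the beginning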

Serializable

Strongest transaction isolation level. Per the SQL standard, it adds to Repeatable Read by disallowing phantom reads: new rows matching your earlier queries won’t appear mid-transaction. Roughly, Repeatable Read guarantees stability for the rows you already read, Serializable for the result of the whole query. In Postgres, Repeatable Read’s snapshot already prevents phantoms; Serializable goes further and aborts transactions whose combined result couldn’t have come from running them one at a time. Probably would be useful for critical aggregations. Never had to use it.
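A small sketch of the phantom part, again with the hypothetical accounts table (in Postgres the repeated count would already be stable under Repeatable Read; Serializable is about also catching interleavings that have no serial equivalent):

    -- T1:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM accounts;                -- say it returns 5

    -- T2, in another connection:
    INSERT INTO accounts (id, x) VALUES (2, 50);  -- autocommits

    -- T1 again:
    SELECT count(*) FROM accounts;                -- still 5, no phantom row
    COMMIT;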

Result

So, with the default transaction isolation level in Postgres, you can have lost updates? Yes, indeed.

Will you always have lost updates with Read Committed though? Not necessarily - you can avoid them with your application logic, and perhaps with advisory locks.
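For example, a sketch with a transaction-scoped advisory lock, using the row id as the lock key (any application-chosen bigint key works, as long as every writer agrees on it):

    -- T1:
    BEGIN;
    SELECT pg_advisory_xact_lock(1);              -- held until commit or rollback
    SELECT x FROM accounts WHERE id = 1;          -- 100
    UPDATE accounts SET x = 120 WHERE id = 1;
    COMMIT;                                       -- lock released

    -- T2 does the same; its pg_advisory_xact_lock(1) call blocks until T1 commits,
    -- so it reads 120 and writes 140 instead of overwriting T1's update.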

You can also lock the rows you read using select for update. It adds row-level locks to what you read, so in the example above T2 will wait until T1 is done, hence no lost update.
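The same lost-update scenario with select for update, sketched against the hypothetical accounts table:

    -- T1:
    BEGIN;
    SELECT x FROM accounts WHERE id = 1 FOR UPDATE;  -- takes a row lock, sees 100

    -- T2, in another connection:
    BEGIN;
    SELECT x FROM accounts WHERE id = 1 FOR UPDATE;  -- blocks here until T1 finishes

    -- T1 again:
    UPDATE accounts SET x = 120 WHERE id = 1;
    COMMIT;                                          -- T2's select now returns 120, so it writes 140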

Another way is to not depend on read + update at all, using event sourcing for example. If you only append to a table and derive the actual value by aggregating over it, you can’t have lost updates. Maybe you can. I think you cannot though.
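A rough sketch of the append-only idea, with a hypothetical account_entries table instead of a single x column:

    -- Both T1 and T2 just append their own +20; nothing gets overwritten:
    INSERT INTO account_entries (account_id, delta) VALUES (1, 20);

    -- The current value is derived whenever you need it:
    SELECT sum(delta) FROM account_entries WHERE account_id = 1;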

Remember you can also set the isolation level per transaction if you need to, so your default can stay Read Committed but you can run a particular transaction as Serializable.
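In Postgres that looks like either of these:

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    -- ... statements ...
    COMMIT;

    -- or, as the first thing inside a plain BEGIN:
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- ... statements ...
    COMMIT;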

OK, if Read Committed allows lost updates, why is it the default level? Because it’s the one with the highest availability: the fewest blocked and aborted transactions. If you want more consistency you give up availability. Nothing is free in this world unfortunately, except for walking in the mountains. Beware the bears though.

What takes a row-level lock and what takes a table-level lock?

Row level: insert, update, delete, select for update, select for share

Table level: Lock table (obviously), alter table, drop table, truncate, reindex

Do table-level locks allow reads? It depends on the lock type. Too many details here, but keep in mind: if you will add an index to a large table and you’ve verified it’s going to take a long time, do it with create index concurrently. A plain create index holds a lock that blocks writes to the table for the whole build (reads still work); the concurrent variant blocks neither. For the rest of the changes that require a table-level lock, check the lock mode, and if it’s access exclusive (which blocks even reads), well, find a way to not break the entire system. Or break it, if your customers and bosses are very nice people. Or just do it at midnight.
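For example, on a hypothetical large events table:

    -- Plain CREATE INDEX blocks writes to events for the whole build (reads still work):
    CREATE INDEX events_created_at_idx ON events (created_at);

    -- CONCURRENTLY builds without blocking reads or writes; it's slower, can't run
    -- inside a transaction block, and if it fails it leaves an INVALID index behind
    -- that you have to drop and retry:
    CREATE INDEX CONCURRENTLY events_created_at_idx ON events (created_at);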

Pessimist or Optimist

Postgres uses pessimistic locking by default. What are the definitions? Optimistic: assume nothing will go wrong, just check for a version mismatch at commit time. Pessimistic: assume everyone will try to update the same resource I’m updating, so block everyone else till I’m done with it. Pessimistic is better if conflicts are frequent, because under optimistic locking every conflict means a rollback and a retry, which is expensive. Optimistic is better when conflicts are rare.
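Postgres gives you the row locks; if you want the optimistic flavor you typically build it in the application, for example with a version column. A sketch, assuming the hypothetical accounts table also has a version column:

    -- Read the value together with its version:
    SELECT x, version FROM accounts WHERE id = 1;   -- say x = 100, version = 7

    -- Write back only if nobody bumped the version in the meantime:
    UPDATE accounts
       SET x = 120, version = version + 1
     WHERE id = 1 AND version = 7;
    -- 0 rows updated means someone else got there first: re-read and retry.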

Eventual Consistency vs Strong Consistency

This is irrelevant to the subject here. I don’t even know why I’m mentioning it.

Eventual consistency = if there are no new updates, reads from different replicas will eventually return the same result.

Strong consistency = every read returns the latest committed write, whichever replica serves it. Of course this means reads may have to wait until the write has propagated to the replica they hit.

Summary

If you want consistency get ready for latency.

There is NO WAY in the world to build systems that are both FAST and CONSISTENT. Every day, we are trying to do the IMPOSSIBLE.

That’s why every software engineer looking at a screen looks angry and disappointed.