Franck Pachot for MongoDB

Posted on Jun 6 • Edited on Jul 1

Isolation Level for MongoDB Multi-Document Transactions (Strong Consistency)

#mongodb #document #database #acid

Many outdated or imprecise claims about transaction isolation levels in MongoDB persist. They are outdated because they are often based on MongoDB 4.0, the old version that introduced multi-document transactions (the old Jepsen report, for example), and the issues found then have been fixed since. They are imprecise because people attempt to map MongoDB's transaction isolation to SQL isolation levels, which is inappropriate: the SQL standard definitions ignore Multi-Version Concurrency Control (MVCC), which most databases use, including MongoDB.
Martin Kleppmann has discussed this issue and provided tests to assess transaction isolation and potential anomalies. I will run these tests on MongoDB to explain how multi-document transactions work and avoid anomalies.

I followed the structure of Martin Kleppmann's tests on PostgreSQL and ported them to MongoDB. The read isolation level in MongoDB is controlled by the Read Concern. The "snapshot" read concern is the only one comparable to other Multi-Version Concurrency Control SQL databases: it maps to Snapshot Isolation, which is improperly called Repeatable Read when using the closest SQL standard term. As I test on a single-node lab, I use "majority" to show that it does more than Read Committed. The write concern should also be set to "majority" to ensure that at least one node is common between the read and write quorums.

Recap on Isolation Levels in MongoDB

Let me explain quickly the other isolation levels and why they cannot be mapped to the SQL standard:

readConcern: { level: "local" } is sometimes compared to Read Uncommitted because it may show a state that can later be rolled back in case of failure. However, some SQL databases may show the same behavior in some rare conditions (example here) and still call that Read Committed.
readConcern: { level: "majority" } is sometimes compared to Read Committed because it avoids uncommitted reads. However, Read Committed was defined for wait-on-conflict databases to reduce the lock duration in two-phase locking, whereas MongoDB multi-document transactions use fail-on-conflict to avoid waits. Some databases consider that Read Committed can allow reads from multiple states (example here) while others consider that it must be a statement-level snapshot isolation (examples here). In a multi-shard transaction, majority may show a result from multiple states, because only snapshot is timeline consistent across shards.
readConcern: { level: "snapshot" } is the real equivalent to Snapshot Isolation, and prevents more anomalies than Read Committed. Some databases even call that "serializable" (example here) because the SQL standard ignores the write-skew anomaly.
readConcern: { level: "linearizable" } is comparable to serializable, but only for a single document. It is not available for multi-document transactions, similar to many SQL databases that do not provide serializable because it re-introduces the scalability problems of read locks that MVCC avoids.

Read Committed basic requirements (G0, G1a, G1b, G1c)

Here are some tests for anomalies typically prevented in Read Committed. I'll run them with readConcern: { level: "majority" } but keep in mind that readConcern: { level: "snapshot" } may be better if you want a consistent snapshot across multiple shards.

MongoDB prevents Write Cycles (G0) with a conflict error

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

T2.test.updateOne({ _id: 1 }, { $set: { value: 12 } });

MongoServerError[WriteConflict]: Caused by :: Write conflict during plan execution and yielding is disabled. :: Please retry your operation or multi-document transaction.

In a two-phase locking database, with wait-on-conflict behavior, the second transaction would wait for the first one to avoid anomalies. However, MongoDB with transactions is fail-on-conflict and raises a retriable error to avoid the anomaly.

Each transaction touched only one document, but it was declared explicitly with a session and startTransaction() to allow multi-document transactions. This is why we observed the fail-on-conflict behavior, which lets the application apply its retry logic for complex transactions.

If the conflicting update had been run as a single-document transaction, equivalent to an auto-commit statement, it would have used a wait-on-conflict behavior. I can test it by immediately running this while the T1 transaction is still active:


const db = db.getMongo().getDB("test_db");
print(`Elapsed time: ${
    ((startTime = new Date())
    && db.test.updateOne({ _id: 1 }, { $set: { value: 12 } }))
    && (new Date() - startTime)
} ms`);

Elapsed time: 72548 ms

I've run the updateOne({ _id: 1 }) without an explicit transaction. It waited for the other transaction to terminate, which happened after a 60-second timeout, and then the update succeeded. The first transaction, which timed out, was aborted:

session1.commitTransaction();

MongoServerError[NoSuchTransaction]: Transaction with { txnNumber: 2 } has been aborted.

Conflict handling differs between the two kinds of transactions:

wait-on-conflict for implicit single-document transactions
fail-on-conflict for explicit multi-document transactions: the conflicting write fails immediately with a transient error, without waiting, so that the application can roll back and retry.
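
The retry logic that fail-on-conflict expects from the application can be sketched as follows. This is a minimal sketch, not the author's code: runWithRetry and isTransient are hypothetical helper names, the cap of 5 attempts is arbitrary, and the code assumes that a retriable error carries the "TransientTransactionError" label, exposed either through hasErrorLabel() or through an errorLabels array. It is written in the synchronous-looking mongosh style used in this post; with the Node.js driver, the session calls would need await.

```javascript
// Hypothetical helper: true when the error is labeled as a transient transaction error.
// Assumption: the error exposes hasErrorLabel() or an errorLabels array.
function isTransient(err) {
  if (err && typeof err.hasErrorLabel === "function") {
    return err.hasErrorLabel("TransientTransactionError");
  }
  return Array.isArray(err && err.errorLabels) &&
    err.errorLabels.includes("TransientTransactionError");
}

// Hypothetical helper: runs body(session) in a transaction and retries the
// whole transaction when it fails with a transient error such as WriteConflict.
function runWithRetry(session, txnOptions, body, maxAttempts = 5) {
  for (let attempt = 1; ; attempt++) {
    session.startTransaction(txnOptions);
    try {
      const result = body(session);
      session.commitTransaction();
      return result;
    } catch (err) {
      // The server may already have aborted the transaction, so ignore abort errors
      try { session.abortTransaction(); } catch (abortErr) { /* nothing to undo */ }
      if (!isTransient(err) || attempt >= maxAttempts) throw err;
    }
  }
}
```

With the sessions from the test above, a call could look like runWithRetry(session2, { readConcern: { level: "majority" }, writeConcern: { w: "majority" } }, () => T2.test.updateOne({ _id: 1 }, { $set: { value: 12 } })). In real applications, the withTransaction() callback API offered by the drivers covers the same need, and a production loop would usually add a backoff between attempts.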

MongoDB prevents Aborted Reads (G1a)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateOne({ _id: 1 }, { $set: { value: 101 } });

T2.test.find();

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

session1.abortTransaction();

T2.test.find();

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

session2.commitTransaction();

MongoDB prevents reading an aborted transaction by reading only the committed value when Read Concern is 'majority' or 'snapshot'.

MongoDB prevents Intermediate Reads (G1b)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateOne({ _id: 1 }, { $set: { value: 101 } });

T2.test.find();

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

The non-committed change from T1 is not visible to T2.


T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

session1.commitTransaction();  // T1 commits

T2.test.find();

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

The committed change from T1 is still not visible to T2 because it happened after T2 started.

This is different from the majority of Multi-Version Concurrency Control SQL databases. To minimize the performance impact of wait-on-conflict, they reset the read time before each statement in Read Committed, as phantom reads are allowed. They would have displayed the newly committed value in this example.
MongoDB never does that: the read time is always the start of the transaction, so no phantom read anomaly happens. In return, on a write conflict it doesn't wait to see whether the conflict is resolved or ends in a deadlock. It fails immediately to let the application retry.
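
To make these two rules concrete (reads as of the transaction start, writes that fail immediately on conflict), here is a toy in-memory model. It is only an illustration of the behavior observed in these tests, not a description of how MongoDB or its storage engine is implemented, and every name in it (ToyStore, ToyTxn, readTs) is hypothetical.

```javascript
// Toy model: each document keeps a list of committed versions with a logical timestamp.
class ToyStore {
  constructor(docs) {
    this.clock = 0;            // logical commit timestamp
    this.versions = new Map(); // _id -> [{ ts, value }]
    this.locks = new Map();    // _id -> transaction holding an uncommitted write
    for (const doc of docs) this.versions.set(doc._id, [{ ts: 0, value: doc.value }]);
  }
  begin() { return new ToyTxn(this); }
}

class ToyTxn {
  constructor(store) {
    this.store = store;
    this.readTs = store.clock; // read time fixed at transaction start
    this.writes = new Map();   // uncommitted writes of this transaction
  }
  read(id) {
    if (this.writes.has(id)) return this.writes.get(id); // read own writes
    const visible = (this.store.versions.get(id) || []).filter(v => v.ts <= this.readTs);
    return visible.length ? visible[visible.length - 1].value : undefined;
  }
  write(id, value) {
    const holder = this.store.locks.get(id);
    const history = this.store.versions.get(id) || [];
    const latestTs = history.length ? history[history.length - 1].ts : -1;
    // Fail immediately, without waiting, if another transaction holds the
    // document or has committed a change after our read time
    if ((holder && holder !== this) || latestTs > this.readTs) {
      throw new Error("WriteConflict");
    }
    this.store.locks.set(id, this);
    this.writes.set(id, value);
  }
  commit() {
    const ts = ++this.store.clock;
    for (const [id, value] of this.writes) {
      if (!this.store.versions.has(id)) this.store.versions.set(id, []);
      this.store.versions.get(id).push({ ts, value });
      this.store.locks.delete(id);
    }
  }
}

// Same sequence as the G1b test: T1 updates and commits, T2 keeps its snapshot
const store = new ToyStore([{ _id: 1, value: 10 }, { _id: 2, value: 20 }]);
const t1 = store.begin();
const t2 = store.begin();
t1.write(1, 101);
t1.write(1, 11);
t1.commit();
console.log(t2.read(1)); // 10: the commit happened after T2 started
```

In this model, a later t2.write(1, 12) throws "WriteConflict" instead of waiting, because document 1 has a version committed after T2's read time.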

MongoDB prevents Circular Information Flow (G1c)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

T2.test.updateOne({ _id: 2 }, { $set: { value: 22 } });

T1.test.find({ _id: 2 });

[ { _id: 2, value: 20 } ]

T2.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

session1.commitTransaction();

session2.commitTransaction();

In both transactions, the uncommitted changes are not visible to the other.

MongoDB prevents Observed Transaction Vanishes (OTV)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T3
const session3 = db.getMongo().startSession();
const T3 = session3.getDatabase("test_db");
session3.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

T1.test.updateOne({ _id: 2 }, { $set: { value: 19 } });

T2.test.updateOne({ _id: 1 }, { $set: { value: 12 } });

MongoServerError[WriteConflict]: Caused by :: Write conflict during plan execution and yielding is disabled. :: Please retry your operation or multi-document transaction.

This anomaly is prevented by fail-on-conflict with an explicit transaction. With an implicit single-document transaction, the update would have waited for the conflicting transaction to end.

MongoDB prevents Predicate-Many-Preceders (PMP)

With a SQL database, preventing this anomaly would require the Snapshot Isolation level because Read Committed uses a different read time per statement. However, I can show that MongoDB prevents it with the 'majority' read concern, 'snapshot' being required only to get cross-shard snapshot consistency.

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.find({ value: 30 }).toArray();

[]

T2.test.insertOne(  { _id: 3, value: 30 }  );

session2.commitTransaction();

T1.test.find({ value: { $mod: [3, 0] } }).toArray();

[]

The newly inserted document is not visible because it was committed by T2 after the start of T1.

Martin Kleppmann's tests include some variations with a delete statement and a write predicate:

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.updateMany({}, { $inc: { value: 10 } });

T2.test.deleteMany({ value: 20 });

MongoServerError[WriteConflict]: Caused by :: Write conflict during plan execution and yielding is disabled. :: Please retry your operation or multi-document transaction.

As it is an explicit transaction, rather than blocking, the delete detects the conflict and raises a retriable exception to prevent the anomaly. Compared to PostgreSQL, which prevents this anomaly in Repeatable Read, it saves the waiting time before failure but requires the application to implement retry logic.

MongoDB prevents Lost Update (P4)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

T2.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

T2.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

MongoServerError[WriteConflict]: Caused by :: Write conflict during plan execution and yielding is disabled. :: Please retry your operation or multi-document transaction.

As it is an explicit transaction, the update doesn't wait for the other transaction to complete: it raises a retriable exception immediately, which makes it impossible to overwrite the other update.

MongoDB prevents Read Skew (G-single)

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

T2.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

T2.test.find({ _id: 2 });

[ { _id: 2, value: 20 } ]

T2.test.updateOne({ _id: 1 }, { $set: { value: 12 } });

T2.test.updateOne({ _id: 2 }, { $set: { value: 18 } });

session2.commitTransaction();

T1.test.find({ _id: 2 });

[ { _id: 2, value: 20 } ]

In SQL databases with Read Committed isolation, a read skew anomaly could display the value 18. However, MongoDB avoids this issue by reading the same value of 20 consistently throughout the transaction, as it reads data as of the start of the transaction.

Martin Kleppmann's tests include a variation with predicate dependency:

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.findOne({ value: { $mod: [5, 0] } });

{ _id: 1, value: 10 }

T2.test.updateOne({ value: 10 }, { $set: { value: 12 } });

session2.commitTransaction();

T1.test.find({ value: { $mod: [3, 0] } }).toArray();

[]

The value 12, which is a multiple of 3, was committed by T2 but is not visible to T1 because T1 started before that commit.

Another test includes a variation with a write predicate in a delete statement:

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

T1.test.find({ _id: 1 });

[ { _id: 1, value: 10 } ]

T2.test.find();

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

T2.test.updateOne({ _id: 1 }, { $set: { value: 12 } });

T2.test.updateOne({ _id: 2 }, { $set: { value: 18 } });

session2.commitTransaction();

T1.test.deleteMany({ value: 20 });

MongoServerError[WriteConflict]: Caused by :: Write conflict during plan execution and yielding is disabled. :: Please retry your operation or multi-document transaction.

This read skew anomaly is prevented by the fail-on-conflict behavior: T1 tries to delete a document that T2 changed and committed after T1's read snapshot was taken.

Write Skew (G2-item) must be managed by the application

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "majority" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "snapshot" },
  writeConcern: { w: "majority" }
});

T1.test.find({ _id: { $in: [1, 2] } })

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

T2.test.find({ _id: { $in: [1, 2] } })

[ { _id: 1, value: 10 }, { _id: 2, value: 20 } ]

T1.test.updateOne({ _id: 1 }, { $set: { value: 11 } });

T2.test.updateOne({ _id: 2 }, { $set: { value: 21 } });

session1.commitTransaction();

session2.commitTransaction();

MongoDB doesn't detect the read/write conflict when one transaction has read a value updated by the other, and then writes something that may have depended on this value. The Read Concern doesn't provide the Serializable guarantee. Such isolation requires acquiring range or predicate locks during reads, and doing it prematurely would hinder the performance of a database designed to scale.

For the transactions that need to avoid this, the application can transform the read/write conflict into a write/write conflict by updating a field in the document that was read, to be sure that other transactions do not modify it. Alternatively, it can re-check the value when updating.
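
As a minimal sketch of the first approach, similar in spirit to SELECT ... FOR UPDATE in SQL, a transaction can write to the documents before reading them. The helper name readForUpdate and the field name lockVersion are assumptions chosen for illustration, not MongoDB features. The collection argument is expected to be bound to the transaction's session, like T1.test in the tests above.

```javascript
// Hypothetical helper: touch the documents, then read them in the same transaction.
// Assumption: `lockVersion` is a field reserved by the application for this purpose.
function readForUpdate(collection, filter) {
  // The write turns a later read/write conflict into a write/write conflict:
  // a concurrent explicit transaction updating these documents gets a WriteConflict
  collection.updateMany(filter, { $inc: { lockVersion: 1 } });
  // The read now returns documents that no other transaction can modify until we end
  return collection.find(filter).toArray();
}
```

In the write skew test, T1 would call readForUpdate(T1.test, { _id: { $in: [1, 2] } }) instead of a plain find(), and the update of { _id: 2 } by T2 would then be expected to fail with a retriable WriteConflict rather than commit silently. The cost is additional writes and more conflicts, so it is worth applying only to the documents that the transaction's decision really depends on.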

Anti-Dependency Cycles (G2) must be managed by the application

// init
use test_db;
db.test.drop();
db.test.insertMany([
  { _id: 1, value: 10 },
  { _id: 2, value: 20 }
]);

// T1
const session1 = db.getMongo().startSession();
const T1 = session1.getDatabase("test_db");
session1.startTransaction({
  readConcern: { level: "snapshot" },
  writeConcern: { w: "majority" }
});

// T2
const session2 = db.getMongo().startSession();
const T2 = session2.getDatabase("test_db");
session2.startTransaction({
  readConcern: { level: "snapshot" },
  writeConcern: { w: "majority" }
});

T1.test.find({ value: { $mod: [3, 0] } }).toArray();

[]

T2.test.find({ value: { $mod: [3, 0] } }).toArray();

[]

T1.test.insertOne(  { _id: 3, value: 30 }  );

T2.test.insertOne(  { _id: 4, value: 42 }  );


session1.commitTransaction();

session2.commitTransaction();

T1.test.find({ value: { $mod: [3, 0] } }).toArray();

[ { _id: 3, value: 30 }, { _id: 4, value: 42 } ]

The read/write conflict was not detected, and both transactions were able to write even though each may have depended on a previous read that had been invalidated by the other transaction. MongoDB does not acquire read locks. If you run a multi-document transaction where the writes depend on the reads, the application must explicitly write to the read set in order to avoid the anomaly.

All those tests were based on https://github.com/ept/hermitage.
There's lots of information about MongoDB transactions in the MongoDB Multi-Document ACID Transactions whitepaper from 2020.

While the document model offers simplicity and performance when a single document matches the business transaction, MongoDB supports multi-statement transactions with Snapshot Isolation, similar to many SQL databases using Multi-Version Concurrency Control (MVCC) but favoring fail-on-conflict rather than wait. Despite myths that surround NoSQL or are based on old versions, its transaction implementation is robust and effectively prevents common transactional anomalies.

Top comments (2)

Galloway Developer • Jun 12

This is a great walkthrough of how MongoDB handles various isolation anomalies. However, since MongoDB multi-document transactions do not guarantee full serializable isolation and leave write skew and anti-dependency cycles to be managed by the application, how would you recommend developers address these limitations in real-world scenarios? Are there patterns or best practices you suggest to mitigate such anomalies?

Franck Pachot • Jul 13

Many databases do not implement serializability because it requires acquiring a lock for all read operations. To prevent write skew, SQL developers should only lock the necessary elements using a SELECT FOR UPDATE statement to escalate a read-write conflict to a write conflict. A similar approach can be taken in MongoDB using an update operation. For an example, refer to How To SELECT ... FOR UPDATE inside MongoDB Transactions