Idempotency and Patterns for MongoDB

Idempotency is one of my favorite things in all of computing. It’s kind of like a warm safety blanket that assures you that “If at first you don’t succeed, try, try, try again”.

In computer science, an idempotent operation is one that can be repeated without changing the state of the system beyond what the first application did. In other words, the operation could be repeated several times, and the system would look the same as if it had happened only once.
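To make the definition concrete, here is a minimal sketch in plain JavaScript. The `account` object and both function names are made up for illustration and are not tied to any database. Setting a field to a fixed value is idempotent; incrementing it is not.

```javascript
// Hypothetical in-memory "document", used only for illustration.
const account = { plan: 'free', credits: 0 };

// Idempotent: applying it once or many times leaves the same state.
function setPlan(doc, plan) {
  doc.plan = plan;
}

// Not idempotent: every call changes the state again.
function addCredits(doc, amount) {
  doc.credits += amount;
}

setPlan(account, 'pro');
setPlan(account, 'pro');  // the repeat has no further effect
addCredits(account, 10);
addCredits(account, 10);  // the repeat credits the account a second time

console.log(account); // { plan: 'pro', credits: 20 }
```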

Idempotent operations are great for NoSQL databases like MongoDB that don’t support multi-record ACID transactions like traditional SQL databases do*. When your updates are idempotent, you can detect and elegantly recover from failed operations, or bugs in your code.

Idempotent Writes and their calming properties

How can you build some safety into your MongoDB writes without ACID guarantees? Especially for multi-document writes, we want to be able to recover from errors. In the SQL world, you would abort the transaction in progress and be reset to the known good state from before the operations happened. In MongoDB, I like to design critical writes as idempotent operations, so that if we need to retry the write, everything ends up in a consistent state.

Consider the following example:

You want to add a credit to the document representing a user’s account, but debit the value from a different document that represents available monthly promotions. It could look something like this:

    await creditAccount(amount);
    await debitFromPromotions(amount);
    respondOk(); //everything is good.

Now imagine the debit step fails.

    await creditAccount(amount);

    //throws error
    await debitFromPromotions(amount);
    ...
    respondError();
    // If client retries, account will get double credited.

Let’s rerun the example with functions designed to be idempotent.

    await creditAccount(amount, transactionId);

    //throws error
    await debitFromPromotions(amount, transactionId);
    ...
    respondError();


    //... a bit later, client retries with same transaction ID ...

    await creditAccount(amount, transactionId); //Has no effect.
    await debitFromPromotions(amount, transactionId);
    respondOk(); //everything is good.

Idempotent retries are far more elegant than trying to write your own “roll back” code when your database doesn’t support it.
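One way the transaction ID could make each step safe to repeat is sketched below. This is an in-memory, synchronous stand-in, not the real implementation: the `appliedTransactions` array, the `applyOnce` helper, the `handleRequest` wrapper, and the `failNextDebit` flag are all assumptions made for illustration. Each document remembers which transaction IDs it has already applied and ignores repeats.

```javascript
// In-memory stand-ins for the two documents; a real version would live in MongoDB.
const account = { balance: 0, appliedTransactions: [] };
const promotions = { available: 100, appliedTransactions: [] };

// Applies `delta` to doc[field] only if this transactionId is new to the document.
function applyOnce(doc, field, delta, transactionId) {
  if (doc.appliedTransactions.includes(transactionId)) {
    return false; // already applied: the retry has no effect
  }
  doc[field] += delta;
  doc.appliedTransactions.push(transactionId);
  return true;
}

function creditAccount(amount, transactionId) {
  return applyOnce(account, 'balance', amount, transactionId);
}

let failNextDebit = true; // simulates the one-off error from the example above
function debitFromPromotions(amount, transactionId) {
  if (failNextDebit) {
    failNextDebit = false;
    throw new Error('simulated failure');
  }
  return applyOnce(promotions, 'available', -amount, transactionId);
}

function handleRequest(amount, transactionId) {
  try {
    creditAccount(amount, transactionId);
    debitFromPromotions(amount, transactionId);
    return 'ok';
  } catch (err) {
    return 'error';
  }
}

const first = handleRequest(25, 'txn-1'); // credit applied, debit fails
const retry = handleRequest(25, 'txn-1'); // credit skipped, debit applied

console.log(first, retry, account.balance, promotions.available); // error ok 25 75
```

In a real database the "have I seen this ID?" check and the write would need to happen in a single atomic update per document. Otherwise two concurrent retries could both pass the check.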

Upserting

One of the nice things about MongoDB is the ability to ‘upsert’ a document. This is shorthand for “if it exists, update, otherwise insert a new record.” I’ve used upserting with great success while importing customer data into our own database, a job that sometimes needs to be run several times during testing. Strictly speaking, an upsert is not the same thing as an idempotent write, but I included it because it shares some of the same nice properties.

With upserting, you could turn clunky code like this:

    //findAndModify takes a single options document; it returns null when nothing matched.
    const updateResult = await db.myCollection.findAndModify({query: query, update: documentData});
    if (!updateResult) {
        //doesn't exist yet.
        await doInsert(query, documentData);
    }

into this:

    const result = await db.myCollection.findAndModify({query: query, update: documentData, upsert: true});

The benefits of a single “upsert” path could be summarized as:

Single round trip from app server to database.
Simpler code path, less to maintain and test.
For data import scripts, upserting lets you avoid dropping and recreating the database before each run.
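The import-script benefit can be sketched without a database. The `customers` map, the `upsertCustomer` helper, and the sample records below are hypothetical; the point is only that running the same import twice through an upsert leaves the same records behind.

```javascript
// Hypothetical in-memory collection, keyed by customer email.
const customers = new Map();

// Update the record if it exists, otherwise insert it.
function upsertCustomer(record) {
  const existing = customers.get(record.email) || {};
  customers.set(record.email, { ...existing, ...record });
}

const importBatch = [
  { email: 'a@example.com', name: 'Ada' },
  { email: 'b@example.com', name: 'Bob' },
];

// Running the import twice leaves the same two records, with no duplicates.
importBatch.forEach((record) => upsertCustomer(record));
importBatch.forEach((record) => upsertCustomer(record));

console.log(customers.size); // 2
```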

Idempotent MongoDB Array Push

In SQL you can prevent duplicate records from being inserted into a table with a unique constraint, but how might one accomplish something similar with MongoDB? A clever use of the $elemMatch and $not operators in the query can ensure that an item is appended to an array idempotently: the update only matches the document when the item is not already there.

    //Will only add newItem to itemArray if no element with the same itemId already exists.
    db.myRecords.findAndModify({
        query: {
          _id: recordId,
          itemArray: {$not: { $elemMatch: {itemId: newItem.itemId } }}
        },
        update: {
          $push: {itemArray: newItem}
        }
    });

This works great with array items that have a natural primary key. You may need to ‘synthesize’ one from other keys in the item if no primary key exists.
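As a rough sketch of what ‘synthesizing’ a key might look like, the example below builds an `itemId` from two other fields and then applies the same "push only if absent" rule in memory. The field names (`sku`, `purchasedAt`) and both helpers are hypothetical; which fields uniquely identify an item depends entirely on your data.

```javascript
// Builds a synthetic key from fields that together identify the item.
// The field names here are assumptions made for illustration.
function synthesizeItemId(item) {
  return `${item.sku}:${item.purchasedAt}`;
}

// In-memory analogue of the guarded push: append only if no element has the same itemId.
function pushOnce(record, item) {
  const itemId = synthesizeItemId(item);
  if (record.itemArray.some((existing) => existing.itemId === itemId)) {
    return false; // already present, nothing to do
  }
  record.itemArray.push({ ...item, itemId });
  return true;
}

const record = { _id: 1, itemArray: [] };
const newItem = { sku: 'widget-9', purchasedAt: '2018-03-01T10:00:00Z' };

const firstPush = pushOnce(record, newItem);
const secondPush = pushOnce(record, newItem);

console.log(firstPush, secondPush, record.itemArray.length); // true false 1
```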

You could also have the client generate a unique request ID and include it in the payload of the write. In the event of a failure, the client retries the request with the same unique request ID. AWS commonly uses this pattern. One example is the ClientRequestToken parameter in the AWS CloudFormation API.
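Here is a minimal sketch of the request ID pattern, assuming a service that remembers the response it gave for each ID. The `createOrder` function, the in-memory `responsesByRequestId` map, and the ID format are all invented for illustration; a real service would persist the IDs and would likely use a proper UUID.

```javascript
// Remembers the response for each request ID; a real service would persist this.
const responsesByRequestId = new Map();
let ordersCreated = 0;

function createOrder(requestId, payload) {
  if (responsesByRequestId.has(requestId)) {
    return responsesByRequestId.get(requestId); // retry: replay the stored response
  }
  ordersCreated += 1;
  const response = { orderId: ordersCreated, item: payload.item };
  responsesByRequestId.set(requestId, response);
  return response;
}

// Stand-in for a real UUID; the client generates it once per logical request.
const requestId = `req-${Date.now()}-${Math.random().toString(36).slice(2)}`;

const firstResponse = createOrder(requestId, { item: 'coffee' });
const retryResponse = createOrder(requestId, { item: 'coffee' }); // same ID, e.g. after a timeout

console.log(ordersCreated);                   // 1
console.log(firstResponse === retryResponse); // true
```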

Building with Idempotency in mind reduces headaches

Wherever possible, looking for ways to make operations idempotent can pay dividends by making systems more resilient to failures and easier to troubleshoot. By assuming from the outset that things will fail, and ensuring your systems can cope, you set yourself up for success. While your small MVP service may seem totally reliable, transient machine and network issues can become frequent as you scale up. Idempotency baked in from the start will give you confidence that you can retry and keep your data in good shape.

In certain circumstances, you may find that idempotency is too “expensive” for a particular workload. The overhead of the additional checks that your machines or database servers need to perform could lower throughput on a hot path. In situations like this, it may be more appropriate to “clean up” double writes later, if your application can tolerate eventual consistency.
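A later clean-up pass could look something like the sketch below. It assumes each write carries a `transactionId` that identifies duplicates; the `ledger` data and the `removeDuplicateWrites` helper are hypothetical, and a real job would run against the database rather than an array.

```javascript
// Hypothetical ledger where a retried write landed twice.
const ledger = [
  { transactionId: 'txn-1', amount: 25 },
  { transactionId: 'txn-2', amount: 10 },
  { transactionId: 'txn-1', amount: 25 }, // duplicate from a retry
];

// Keeps the first entry for each transactionId and drops later repeats.
function removeDuplicateWrites(entries) {
  const seen = new Set();
  return entries.filter((entry) => {
    if (seen.has(entry.transactionId)) {
      return false;
    }
    seen.add(entry.transactionId);
    return true;
  });
}

const cleaned = removeDuplicateWrites(ledger);

console.log(cleaned.length); // 2
```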

Notes:

* MongoDB is slated to get multi-document transactions in the 4.0 release.

Cody Hanson
