Forem: Artur Garcia Costa

The Cost of Not Knowing MongoDB - Part 3: appV6R0 to appV6R4

Artur Garcia Costa — Thu, 22 Jan 2026 16:52:07 +0000

Application Version 6 Revision 0: A Dynamic Monthly Bucket Document
Application Version 6 Revision 1: A Dynamic Quarter Bucket Document
Application Version 6 Revision 2: A Dynamic Bucket and Computed Document
Application Version 6 Revision 3: Getting Everything at Once
Application Version 6 Revision 4: The zstd Compression Algorithm

Article Introduction

Welcome to the third and final part of the series, "The Cost of Not Knowing MongoDB". Building upon the foundational optimizations explored in Part 1 and Part 2, this article delves into advanced MongoDB design patterns that can dramatically transform application performance.

In Part 1, we improved application performance by concatenating fields, changing data types, and shortening field names. In Part 2, we implemented the Bucket and Computed Patterns and optimized the aggregation pipeline to achieve even better performance.

In this final article, we address the Issues and Improvements identified in AppV5R4. Specifically, we focus on reducing the document size in our application to alleviate the disk throughput bottleneck on the MongoDB server. This reduction will be accomplished by adopting a Dynamic Schema and modifying the storage compression algorithm.

All the application versions and revisions in this article would have been developed by a senior MongoDB developer, as they build on all the previous versions and use the Dynamic Schema pattern, which is not very common to see.

Application Version 6 Revision 0 (appV6R0): A Dynamic Monthly Bucket Document

Introduction

As mentioned in the Issues and Improvements of appV5R4 from the previous article, the primary limitation of our MongoDB server is its disk throughput. To address this, we need to reduce the size of the documents being stored.

Consider the following document from appV5R3, which has provided the best performance so far:

const document = {
  _id: Buffer.from("...01202202"),
  items: [
    { date: new Date("2022-06-05"), a: 10, n: 3 },
    { date: new Date("2022-06-16"), p: 1, r: 1 },
    { date: new Date("2022-06-27"), a: 5, r: 1 },
    { date: new Date("2022-06-29"), p: 1 },
  ],
};

The items array in this document contains only four elements, but on average, it will have around 10 elements, and in the worst-case scenario, it could have up to 90 elements. These elements are the primary contributors to the document size, so they should be the focus of our optimization efforts.

One commonality among the elements is the date field and part of its value: in the document above, every date shares the same year and month. By rethinking how this field and its value are stored, we can reduce storage requirements.

An unconventional solution we could use is:

Changing the items field type from an array to a document.
Using the date value as the field name in the items document.
Storing the status totals as the value for each date field.

Here is the previous document represented using the new schema idea:

const document = {
  _id: Buffer.from("...01202202"),
  items: {
    20220605: { a: 10, n: 3 },
    20220616: { p: 1, r: 1 },
    20220627: { a: 5, r: 1 },
    20220629: { p: 1 },
  },
};

While this schema may not significantly reduce the document size compared to appV5R3, we can further optimize it by leveraging the fact that the year is already embedded in the _id field. This eliminates the need to repeat the year in the field names of the items document.

With this approach, the items document adopts a Dynamic Schema, where field names encode information and are not predefined.

To demonstrate various implementation possibilities, we will revisit all the bucketing criteria used in the appV5RX implementations, starting with appV5R0.

For appV6R0, which builds upon appV5R0 but uses a dynamic schema, data is bucketed by year and month. The field names in the items document represent only the day of the date, as the year and month are already stored in the _id field.

A detailed explanation of the bucketing logic and functions used to implement the current application can be found in the appV5R0 introduction.

The following document stores data for January 2022 (2022-01-XX), applying the newly presented idea:

const document = {
  _id: Buffer.from("...01202201"),
  items: {
    "05": { a: 10, n: 3 },
    16: { p: 1, r: 1 },
    27: { a: 5, r: 1 },
    29: { p: 1 },
  },
};
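The transformation described above can be sketched as a small helper. This is only an illustration, not code from the application: the function name toMonthlyDynamicItems is hypothetical, and it assumes the input array already holds at most one element per day, as the appV5R3 items array does.

```javascript
// Hypothetical helper: turn an array of dated items into a day-keyed document.
// Assumption: the array holds at most one element per day of the month.
function toMonthlyDynamicItems(itemsArray) {
  const items = {};
  for (const { date, ...status } of itemsArray) {
    // Zero-padded day of the month (UTC), e.g. "05"
    const DD = String(date.getUTCDate()).padStart(2, "0");
    // The date is no longer stored as a value; it lives in the field name.
    items[DD] = status;
  }
  return items;
}

const dynamicItems = toMonthlyDynamicItems([
  { date: new Date("2022-01-05"), a: 10, n: 3 },
  { date: new Date("2022-01-16"), p: 1, r: 1 },
]);
console.log(dynamicItems);
```

Each element loses its date field, which is the per-element saving the dynamic schema is after.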

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV6R0:

export type SchemaV6R0 = {
  _id: Buffer;
  items: Record<
    string,
    {
      a?: number;
      n?: number;
      p?: number;
      r?: number;
    }
  >;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const DD = getDD(event.date); // Extract the `day` from the `event.date`

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) }, // key + year + month
    update: {
      $inc: {
        [`items.${DD}.a`]: event.approved,
        [`items.${DD}.n`]: event.noFunds,
        [`items.${DD}.p`]: event.pending,
        [`items.${DD}.r`]: event.rejected,
      },
    },
    upsert: true,
  },
};

filter:

Target the document where the _id field matches the concatenated value of key, year, and month.
The buildId function converts the key+year+month into a binary format.

update:

Uses the $inc operator to increment the fields corresponding to the same DD as the event by the status values provided.
If a field does not exist in the items document and the event provides a value for it, $inc treats the non-existent field as having a value of 0 and performs the operation.
If a field exists in the items document but the event does not provide a value for it (i.e., undefined), $inc treats it as 0 and performs the operation.

upsert:

Ensures a new document is created if no matching document exists.
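The helpers getDD and buildId are named above but not shown in this section. The following is a minimal sketch of what they might look like, under two assumptions of mine: the key is a hex string, and the binary _id is built from the hex digits of key + YYYYMM. The buildUpsert wrapper and the STATUS_FIELDS map are also hypothetical; the wrapper only emits the statuses an event actually carries, which is a defensive choice rather than something the article states.

```javascript
// Hypothetical helper: zero-padded day of the month (UTC), e.g. "05".
function getDD(date) {
  return String(date.getUTCDate()).padStart(2, "0");
}

// Hypothetical helper. Assumption: `key` is a hex string and the monthly
// bucket suffix is YYYYMM, all packed into a binary value.
function buildId(key, date) {
  const YYYY = String(date.getUTCFullYear());
  const MM = String(date.getUTCMonth() + 1).padStart(2, "0");
  return Buffer.from(`${key}${YYYY}${MM}`, "hex");
}

// Event field name -> shortened status name used in the stored document.
const STATUS_FIELDS = { approved: "a", noFunds: "n", pending: "p", rejected: "r" };

function buildUpsert(event) {
  const DD = getDD(event.date);
  const $inc = {};
  for (const [eventField, shortName] of Object.entries(STATUS_FIELDS)) {
    // Only include statuses that the event provides as numbers.
    if (typeof event[eventField] === "number") {
      $inc[`items.${DD}.${shortName}`] = event[eventField];
    }
  }
  return {
    updateOne: {
      filter: { _id: buildId(event.key, event.date) },
      update: { $inc },
      upsert: true,
    },
  };
}

const op = buildUpsert({
  key: "0001",
  date: new Date("2022-01-05"),
  approved: 10,
  noFunds: 3,
});
console.log(op.updateOne.update);
```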

Get Reports

To fulfill the Get Reports operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the $match stage:

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: buildTotalsField },
  { $group: groupSumTotals },
  { $project: { _id: 0 } },
];

The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.

{ $match: docsFromKeyBetweenDate }

Range-filters documents by _id to retrieve only buckets within the report date range. It has the exact same logic as appV5R0.
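As a rough illustration of such a range filter, the sketch below builds the lower and upper _id bounds for one key. It assumes a hypothetical hex-based id layout (key + YYYYMM) and is not the appV5R0 code. MongoDB orders binary values of equal length and subtype byte by byte, so fixed-length ids of this shape sort chronologically within a key; Buffer.compare is used here only as a local stand-in for that comparison.

```javascript
// Hypothetical id builder (assumption): hex `key` followed by YYYYMM.
function buildMonthlyId(key, date) {
  const YYYY = String(date.getUTCFullYear());
  const MM = String(date.getUTCMonth() + 1).padStart(2, "0");
  return Buffer.from(`${key}${YYYY}${MM}`, "hex");
}

// Range filter on _id covering every monthly bucket that may hold report data.
function docsFromKeyBetweenDate(key, reportStartDate, reportEndDate) {
  return {
    _id: {
      $gte: buildMonthlyId(key, reportStartDate),
      $lte: buildMonthlyId(key, reportEndDate),
    },
  };
}

// Local stand-in for the server-side comparison of equal-length binary ids.
function idInRange(id, filter) {
  return (
    Buffer.compare(id, filter._id.$gte) >= 0 &&
    Buffer.compare(id, filter._id.$lte) <= 0
  );
}

const filter = docsFromKeyBetweenDate(
  "0001",
  new Date("2021-06-15"),
  new Date("2022-06-15")
);
const insideRange = idInRange(buildMonthlyId("0001", new Date("2022-01-20")), filter);
const otherKey = idInRange(buildMonthlyId("0002", new Date("2022-01-20")), filter);
console.log({ insideRange, otherKey });
```

The buckets at both ends of the range are included on purpose, since they can hold days that fall inside the report interval; the per-day filtering happens in the next stage.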

{ $addFields: buildTotalsField }

The logic is similar to the one used in the Get Reports of appV5R3.
The $objectToArray operator is used to convert the items document into an array, enabling a $reduce operation.
Filtering the items fields within the report's range involves extracting the year and month from the _id field and the day from the field names in the items document.

The following JavaScript code is logically equivalent to the real aggregation pipeline code.

 // Equivalent JavaScript logic:
 const MM = _id.slice(-2).toString(); // Get month from _id
 const YYYY = _id.slice(-6, -2).toString(); // Get year from _id
 const items_array = Object.entries(items); // Convert the object to an array of [key, value]

 const totals = items_array.reduce(
   (accumulator, [DD, status]) => {
     let statusDate = new Date(`${YYYY}-${MM}-${DD}`);

     if (statusDate >= reportStartDate && statusDate < reportEndDate) {
       accumulator.a += status.a || 0;
       accumulator.n += status.n || 0;
       accumulator.p += status.p || 0;
       accumulator.r += status.r || 0;
     }

     return accumulator;
   },
   { a: 0, n: 0, p: 0, r: 0 }
 );

{ $group: groupSumTotals }

Group the totals of each document in the pipeline into final status totals using $sum operations.

{ $project: { _id: 0 } }

Formats the resulting document to match the reports format.
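To give a feel for the shape of these stages, here is a deliberately simplified sketch. It sums every item in each document and omits the date filtering, which is the complicated part of the real pipeline. The names ending in Sketch are mine; the operators used ($objectToArray, $reduce, $add, $ifNull, $sum) are standard aggregation operators.

```javascript
const STATUSES = ["a", "n", "p", "r"];

// Simplified $addFields stage: sums every item, with no date filtering.
const buildTotalsFieldSketch = {
  totals: {
    $reduce: {
      // $objectToArray yields [{ k: "05", v: { a: 10, n: 3 } }, ...]
      input: { $objectToArray: "$items" },
      initialValue: { a: 0, n: 0, p: 0, r: 0 },
      // For each status: running value + (item value, or 0 when absent)
      in: Object.fromEntries(
        STATUSES.map((s) => [
          s,
          { $add: [`$$value.${s}`, { $ifNull: [`$$this.v.${s}`, 0] }] },
        ])
      ),
    },
  },
};

// $group stage: one output document with the sum of every per-document total.
const groupSumTotalsSketch = {
  _id: null,
  ...Object.fromEntries(STATUSES.map((s) => [s, { $sum: `$totals.${s}` }])),
};

const pipelineSketch = [
  { $addFields: buildTotalsFieldSketch },
  { $group: groupSumTotalsSketch },
  { $project: { _id: 0 } },
];
console.log(JSON.stringify(pipelineSketch, null, 2));
```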

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV6R0, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Document Size	Storage Size	Indexes	Index Size
appV5R0	95,350,431	19.19GB	217B	5.06GB	1	2.95GB
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB
appV6R0	95,350,319	11.1GB	125B	3.33GB	1	3.13GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/events	Index Size/events	Total Size/events
appV5R0	41.2B	6.3B	47.5B
appV5R3	25.7B	2.4B	28.1B
appV6R0	23.8B	6.7B	30.5B
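The per-event figures can be reproduced with a few lines of arithmetic. This is a sketch under one assumption: the "GB" values in the Collection Statistics table are binary gigabytes (2^30 bytes), which is the reading that matches the published numbers.

```javascript
const EVENTS = 500_000_000;
const GIB = 1024 ** 3; // assumption: "GB" in the tables means 2^30 bytes

// Average bytes per event, rounded to one decimal place.
function bytesPerEvent(sizeInGb) {
  return Math.round(((sizeInGb * GIB) / EVENTS) * 10) / 10;
}

const appV6R0Stats = {
  dataPerEvent: bytesPerEvent(11.1), // Data Size 11.1GB
  indexPerEvent: bytesPerEvent(3.13), // Index Size 3.13GB
};
const appV5R3Stats = {
  dataPerEvent: bytesPerEvent(11.96), // Data Size 11.96GB
  indexPerEvent: bytesPerEvent(1.11), // Index Size 1.11GB
};
console.log({ appV6R0Stats, appV5R3Stats });
```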

It is challenging to make a direct comparison between appV6R0 and appV5R0 from a storage perspective. The appV5R0 implementation is the simplest bucketing possible, where event documents were merely appended to the items array without bucketing by day, as is done in appV6R0.

However, we can attempt a comparison between appV6R0 and appV5R3, the best solution so far. In appV6R0, data is bucketed by month, whereas in appV5R3, it is bucketed by quarter. Assuming document size scales linearly with the bucketing criteria (though this is not entirely accurate), the appV6R0 document would be approximately 3 * 125 = 375 bytes, which is about 2.6% smaller than the 385 bytes of appV5R3.

Another indicator of improvement is the Data Size/events metric in the Event Statistics table. For appV6R0, each event uses an average of 23.8 bytes, compared to 25.7 bytes for appV5R3, representing a 7.4% reduction in size.

Load Test Results

Executing the load test for appV6R0 and plotting it alongside the results for appV5R0 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

The two versions have very similar rate performance, with appV6R0 being slightly better in the second and third quarter, while appV5R0 is better in the first and fourth quarter.

Get Reports Latency

The two versions have very similar latency performance, with appV6R0 being slightly better in the second and third quarter, while appV5R0 is better in the first and fourth quarter.

Bulk Upsert Rates

Both versions have similar rate values, but it can be seen that appV6R0 has a small edge compared to appV5R0.

Bulk Upsert Latency

Although both versions have similar latency values for the first quarter of the test, appV6R0 has a clear advantage over appV5R0 for the final three quarters.

Performance Summary

Despite the significant reduction in document and storage size achieved by appV6R0, the performance improvement was not as substantial as expected. This suggests that the bottleneck in the application when bucketing data by month may not be related to disk throughput.

Examining the collection stats table reveals that the index size for both versions is close to 3GB. This is near the 4GB of available memory on the machine running the database and exceeds the 1.5GB allocated by WiredTiger for cache. Therefore, it is likely that the limiting factor in this case is memory/cache rather than document size, which explains the lack of a significant performance improvement.
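The 1.5GB figure is consistent with WiredTiger's documented default internal cache size, which is the larger of 50% of (RAM - 1GB) and 256MB. A minimal sketch of that check, assuming the server runs with the default setting (the function name is mine):

```javascript
// WiredTiger default internal cache: max(50% of (RAM - 1GB), 256MB).
function defaultWiredTigerCacheGb(ramGb) {
  return Math.max(0.5 * (ramGb - 1), 0.25);
}

const cacheGb = defaultWiredTigerCacheGb(4); // machine with 4GB of memory

// Index sizes taken from the Collection Statistics table.
const indexSizesGb = { appV5R0: 2.95, appV6R0: 3.13 };

const indexFitsInCache = Object.fromEntries(
  Object.entries(indexSizesGb).map(([app, sizeGb]) => [app, sizeGb <= cacheGb])
);
console.log({ cacheGb, indexFitsInCache });
```

In both versions the _id index alone is roughly twice the cache size, before counting any of the documents themselves.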

Issues and Improvements

To address the limitations observed in appV6R0, we propose adopting the same line of improvements applied from appV5R0 to appV5R1. Specifically, we will bucket the events by quarter in appV6R1. This approach not only follows the established pattern of enhancements but also aligns with the need to optimize performance further.

As highlighted in the Load Test Results, the current bottleneck lies in the size of the index relative to the available cache/memory. By increasing the bucketing interval from month to quarter, we can reduce the number of documents by approximately a factor of three. This reduction will, in turn, decrease the number of index entries by the same factor, leading to a smaller index size.

Application Version 6 Revision 1 (appV6R1): A Dynamic Quarter Bucket Document

Introduction

As discussed in the previous Issues and Improvements section, the primary bottleneck in appV6R0 was the index size nearing the memory capacity of the machine running MongoDB. To mitigate this issue, we propose to increase the bucketing interval from month to quarter for appV6R1, the same way we did in appV5R1.

This adjustment aims to reduce the number of documents and index entries by approximately a factor of three, thereby decreasing the overall index size. By adopting a quarter-based bucketing strategy, we follow the same pattern of enhancements applied in appV5R1 while addressing the specific memory/cache constraints identified in appV6R0.

The implementation of appV6R1 retains most of the code from appV6R0, with the following key differences:

The _id field will now be composed of key+year+quarter.
The field names in the items document will encode both month and day, as this information is necessary for filtering date ranges in the Get Reports operation.

The following example demonstrates how data for June 2022 (2022-06-XX), within the second quarter (Q2), is stored using the new schema:

const document = {
  _id: Buffer.from("...01202202"),
  items: {
    "0605": { a: 10, n: 3 },
    "0616": { p: 1, r: 1 },
    "0627": { a: 5, r: 1 },
    "0629": { p: 1 },
  },
};
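A minimal sketch of how the quarter-based identifier and the MMDD field name might be derived from an event date. The helper bodies are assumptions of mine: the key is treated as a hex string, and the quarter is written as two digits, matching the "...01202202" example above (key ending in 01, year 2022, quarter 02).

```javascript
// Hypothetical helper: month and day as a four-character field name, e.g. "0605".
function getMMDD(date) {
  const MM = String(date.getUTCMonth() + 1).padStart(2, "0");
  const DD = String(date.getUTCDate()).padStart(2, "0");
  return `${MM}${DD}`;
}

// Quarter of the year, 1 to 4.
function getQuarter(date) {
  return Math.floor(date.getUTCMonth() / 3) + 1;
}

// Hypothetical id builder (assumption): hex `key` followed by YYYY and a
// two-digit quarter.
function buildQuarterId(key, date) {
  const YYYY = String(date.getUTCFullYear());
  const QQ = String(getQuarter(date)).padStart(2, "0");
  return Buffer.from(`${key}${YYYY}${QQ}`, "hex");
}

const eventDate = new Date("2022-06-05");
const fieldName = getMMDD(eventDate);
const quarterId = buildQuarterId("0001", eventDate);
console.log(fieldName, quarterId.toString("hex"));
```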

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV6R0:

export type SchemaV6R0 = {
  _id: Buffer;
  items: Record<
    string,
    {
      a?: number;
      n?: number;
      p?: number;
      r?: number;
    }
  >;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const MMDD = getMMDD(event.date); // Extract the month (MM) and day (DD) from the `event.date`

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) }, // key + year + quarter
    update: {
      $inc: {
        [`items.${MMDD}.a`]: event.approved,
        [`items.${MMDD}.n`]: event.noFunds,
        [`items.${MMDD}.p`]: event.pending,
        [`items.${MMDD}.r`]: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV6R0, with the only differences being the filter and update criteria.

filter:

Target the document where the _id field matches the concatenated value of key, year, and quarter.
The buildId function converts the key+year+quarter into a binary format.

update:

Uses the $inc operator to increment the fields corresponding to the same MMDD as the event by the status values provided.

Get Reports

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: buildTotalsField },
  { $group: groupSumTotals },
  { $project: { _id: 0 } },
];

This aggregation operation has a similar logic to the one in appV6R0, with the only differences being the implementation in the $addFields stage.

{ $addFields: buildTotalsField }:

A similar implementation to the one in appV6R0.
The difference lies in extracting the year (YYYY) from the _id field and the month and day (MMDD) from the field name.

The following JavaScript code is logically equivalent to the real aggregation pipeline code.

 const YYYY = _id.slice(-6, -2).toString(); // Get year from _id
 const items_array = Object.entries(items); // Convert the object to an array of [key, value]

 const totals = items_array.reduce(
   (accumulator, [MMDD, status]) => {
     let [MM, DD] = [MMDD.slice(0, 2), MMDD.slice(2, 4)];
     let statusDate = new Date(`${YYYY}-${MM}-${DD}`);

     if (statusDate >= reportStartDate && statusDate < reportEndDate) {
       accumulator.a += status.a || 0;
       accumulator.n += status.n || 0;
       accumulator.p += status.p || 0;
       accumulator.r += status.r || 0;
     }

     return accumulator;
   },
   { a: 0, n: 0, p: 0, r: 0 }
 );

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV6R1, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Document Size	Storage Size	Indexes	Index Size
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB
appV6R0	95,350,319	11.1GB	125B	3.33GB	1	3.13GB
appV6R1	33,429,366	8.19GB	264B	2.34GB	1	1.22GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/events	Index Size/events	Total Size/events
appV5R3	25.7B	2.4B	28.1B
appV6R0	23.8B	6.7B	30.5B
appV6R1	17.6B	2.6B	20.2B

In the previous Initial Scenario Statistics analysis, we assumed that document size would scale linearly with the bucketing range. However, this assumption proved inaccurate: the average document size in appV6R1 is only about twice that of appV6R0 (264B versus 125B), even though each document stores three times more data. That is already a win for the new implementation.

Since appV6R1 buckets data by quarter at the document level and by day within the items sub-document, a fair comparison would be with appV5R3, the best-performing version so far. From the tables above, we observe a significant improvement in Document Size and consequently Data Size when transitioning from appV5R3 to appV6R1. Specifically, there was a 31.4% reduction in Document Size. From an index size perspective, the change was small (1.11GB versus 1.22GB), as both versions bucket events by quarter.
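Both ratios quoted in this analysis come straight from the Document Size column and can be checked with a short calculation (the helper name percentChange is mine):

```javascript
// Percentage change from one size to another, rounded to one decimal place.
function percentChange(from, to) {
  return Math.round(((to - from) / from) * 1000) / 10;
}

// Average document sizes in bytes, from the Collection Statistics table.
const documentSize = { appV5R3: 385, appV6R0: 125, appV6R1: 264 };

// Quarter bucket versus month bucket in the dynamic schema.
const growthV6R0toV6R1 =
  Math.round((documentSize.appV6R1 / documentSize.appV6R0) * 100) / 100;

// Array-based quarter bucket versus dynamic quarter bucket.
const changeV5R3toV6R1 = percentChange(documentSize.appV5R3, documentSize.appV6R1);

console.log({ growthV6R0toV6R1, changeV5R3toV6R1 });
```

A growth factor close to 2 for three times the data suggests that fixed per-document overhead, such as the _id and the field-name bytes, accounts for a large share of a small bucket document.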

Load Test Results

Executing the load test for appV6R1 and plotting it alongside the results for appV5R3 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

For the first three quarters of the test, both versions have similar rate values, but for the final quarter, appV6R1 has a notable edge over appV5R3.

Get Reports Latency

As in the rates graph, both versions have similar values for the first three quarters, with appV6R1 being better than appV5R3 for the final quarter.

Bulk Upsert Rates

Both versions have very similar rate values throughout the test. appV6R1 achieves better values than appV5R3 in the final 20 minutes, but it is still not able to reach the desired rate.

Bulk Upsert Latency

Even though both versions have similar rate values, we can see that appV6R1 has considerably better latency values than appV5R3, being almost two times faster for the last three quarters of the test.

Issues and Improvements

Looking at the Get Reports graphs in the last Load Test Results, we are still not able to reach the desired rates for this functionality. One way we could try to improve these operations is by using our old and well-known friend, the Computed Pattern.

Applying the Computed Pattern in the current version would be the same improvement tried from appV5R3 to appV5R4, which, instead of improving the performance, made it worse. Why would this solution work this time? The only way to know if it will work or not is to try, but before cracking our knuckles and starting to work on the implementation, it's always a good idea to do a sanity check and see if there is at least one good reason to believe that this time things will be different (cough, cough).

When we applied the Computed Pattern from appV5R3 to appV5R4, we got an 8.2% increase in the document size and a slight degradation in performance in the Bulk Upsert functionality, with no performance gains in Get Reports. From appV5R3 to appV6R1, we got a 31.4% reduction in the document size, so it could make sense to trade some of this reduction in favor of storing some pre-computed values. Another point is that the Bulk Upsert functionality in appV6R1 has its best performance so far, so maybe the extra cost of pre-computing the document totals for this version is not as big a deal as it was for appV5R4.

With these two "maybes" and a scientific spirit of always trying to test things to see where they'll break, let's give the Computed Pattern another chance.

Application Version 6 Revision 2 (appV6R2): A Dynamic Bucket and Computed Document

Introduction

As discussed in the previous Issues and Improvements section, in this revision we'll give the Computed Pattern another try and pre-compute the status totals for each document. This implementation is practically equal to the one tried in appV5R4, with the only difference being that we are using a Dynamic Schema for the items field instead of an array.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV6R1:

export type SchemaV6R1 = {
  _id: Buffer;
  totals: {
    a?: number; // Quarter total approved
    n?: number; // Quarter total noFunds
    p?: number; // Quarter total pending
    r?: number; // Quarter total rejected
  };
  items: Record<
    string,
    {
      a?: number; // Daily total approved
      n?: number; // Daily total noFunds
      p?: number; // Daily total pending
      r?: number; // Daily total rejected
    }
  >;
};

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Bulk Upsert

Based on the specifications, the following bulk updateOne operation is used for each event generated by the application:

const MMDD = getMMDD(event.date); // Extract the month (MM) and day (DD) from the `event.date`

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) }, // key + year + quarter
    update: {
      $inc: {
        "totals.a": event.approved,
        "totals.n": event.noFunds,
        "totals.p": event.pending,
        "totals.r": event.rejected,
        [`items.${MMDD}.a`]: event.approved,
        [`items.${MMDD}.n`]: event.noFunds,
        [`items.${MMDD}.p`]: event.pending,
        [`items.${MMDD}.r`]: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne has almost the same logic as the one for appV6R1, with the difference being that we also increment the totals field to pre-compute the quarter totals for the document. From a logic perspective, this operation is equal to the Bulk Upsert of appV5R4. From an implementation perspective, it is much easier to write and understand, and from an execution perspective, it is less costly because it has fewer stages and operations. This simplicity may also contribute to a better performance of the Computed Pattern when compared to appV5R4.
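To illustrate why a single $inc keeps the pre-computed totals and the daily items in step, the sketch below applies a few sample events to an empty bucket. The applyInc function is only a tiny in-memory stand-in for $inc on dotted paths, written for this example; it is not MongoDB code, and it skips statuses an event does not carry. The sample events and the buildInc helper are also hypothetical.

```javascript
// Tiny in-memory stand-in for $inc on dotted paths (illustration only).
function applyInc(doc, inc) {
  for (const [path, amount] of Object.entries(inc)) {
    if (typeof amount !== "number") continue; // this sketch skips missing statuses
    const parts = path.split(".");
    const last = parts.pop();
    let target = doc;
    for (const part of parts) {
      if (target[part] === undefined) target[part] = {};
      target = target[part];
    }
    // A missing field is treated as 0 before incrementing.
    target[last] = (target[last] === undefined ? 0 : target[last]) + amount;
  }
  return doc;
}

// Same shape as the appV6R2 update: quarter totals and daily items move together.
function buildInc(event) {
  const { MMDD } = event;
  return {
    "totals.a": event.approved,
    "totals.n": event.noFunds,
    "totals.p": event.pending,
    "totals.r": event.rejected,
    [`items.${MMDD}.a`]: event.approved,
    [`items.${MMDD}.n`]: event.noFunds,
    [`items.${MMDD}.p`]: event.pending,
    [`items.${MMDD}.r`]: event.rejected,
  };
}

const bucket = {};
const sampleEvents = [
  { MMDD: "0605", approved: 10, noFunds: 3 },
  { MMDD: "0616", pending: 1, rejected: 1 },
  { MMDD: "0605", approved: 2 },
];
for (const event of sampleEvents) applyInc(bucket, buildInc(event));

// The quarter total always equals the sum of the daily values.
const sumOfDailyApproved = Object.values(bucket.items).reduce(
  (sum, day) => sum + (day.a || 0),
  0
);
console.log(bucket, sumOfDailyApproved);
```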

Get Reports

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: buildTotalsField },
  { $group: groupSumTotals },
  { $project: { _id: 0 } },
];

This aggregation operation has a similar logic to the one in appV5R4, because of the pre-computed totals field, and to the one in appV6R1, because the items field is of type document. The difference when compared to appV6R1 lies only in the $addFields stage. The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.

{ $addFields: buildTotalsField }:

A similar implementation to the one in appV6R1.
The main difference is that, if the quarter’s date range is within the limits of the report’s date range, we can use the pre-computed totals instead of calculating the value through a $reduce operation.

The following JavaScript code is logically equivalent to the real aggregation pipeline code.

 let totals;

 if (documentQuarterWithinReportDateRange) {
   // Use pre-computed quarterly totals
   totals = document.totals;
 } else {
   // Fall back to item-level aggregation
   const YYYY = _id.slice(-6, -2).toString(); // Get year from _id
   const items_array = Object.entries(items); // Convert the object to an array of [key, value]

   totals = items_array.reduce(
     (accumulator, [MMDD, status]) => {
       let [MM, DD] = [MMDD.slice(0, 2), MMDD.slice(2, 4)];
       let statusDate = new Date(`${YYYY}-${MM}-${DD}`);

       if (statusDate >= reportStartDate && statusDate < reportEndDate) {
         accumulator.a += status.a || 0;
         accumulator.n += status.n || 0;
         accumulator.p += status.p || 0;
         accumulator.r += status.r || 0;
       }

       return accumulator;
     },
     { a: 0, n: 0, p: 0, r: 0 }
   );
 }
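The documentQuarterWithinReportDateRange condition is not spelled out above. One way it might be computed, assuming the year and quarter have already been decoded from the _id and that report ranges are half-open (start inclusive, end exclusive, as in the item-level filter), is sketched here with hypothetical function names:

```javascript
// First day of the quarter and first day of the next quarter (exclusive end), in UTC.
function quarterBounds(year, quarter) {
  const start = new Date(Date.UTC(year, (quarter - 1) * 3, 1));
  const end = new Date(Date.UTC(year, quarter * 3, 1)); // month 12 rolls into next year
  return { start, end };
}

// True only when every day of the quarter falls inside [reportStartDate, reportEndDate).
function quarterWithinReport(year, quarter, reportStartDate, reportEndDate) {
  const { start, end } = quarterBounds(year, quarter);
  return start >= reportStartDate && end <= reportEndDate;
}

const reportStart = new Date("2021-02-15");
const reportEnd = new Date("2022-02-15");

const firstQuarter2021 = quarterWithinReport(2021, 1, reportStart, reportEnd); // starts before the report
const lastQuarter2021 = quarterWithinReport(2021, 4, reportStart, reportEnd); // fully inside
const firstQuarter2022 = quarterWithinReport(2022, 1, reportStart, reportEnd); // ends after the report
console.log({ firstQuarter2021, lastQuarter2021, firstQuarter2022 });
```

Only the buckets at the edges of a report range need the item-level fallback; every bucket in between can be answered from its totals field.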

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV6R2, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Document Size	Storage Size	Indexes	Index Size
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB
appV6R1	33,429,366	8.19GB	264B	2.34GB	1	1.22GB
appV6R2	33,429,207	9.11GB	293B	2.8GB	1	1.26GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/events	Index Size/events	Total Size/events
appV5R3	25.7B	2.4B	28.1B
appV6R1	17.6B	2.6B	20.2B
appV6R2	19.6B	2.7B	22.3B

As expected, we had an 11.2% increase in the Document Size by adding a totals field in each document of appV6R2. Compared to appV5R3, we still have a reduction of 23.9% in the Document Size. Let's go to the Load Test Results and see if the trade-off between storage and computation cost will be worth it.

Load Test Results

Executing the load test for appV6R2 and plotting it alongside the results for appV6R1 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

We can clearly see that appV6R2 has better rates than appV6R1 throughout the test, but still not reaching the top rate of 250 reports per second.

Get Reports Latency

As happened in the rates graph, appV6R2 provides lower latency than appV6R1 throughout the test.

Bulk Upsert Rates

Both versions have very similar rate values throughout the test, with appV6R2 being slightly better than appV6R1 in the final 20 minutes of the test, but still not able to reach the desired rate.

Bulk Upsert Latency

Even though appV6R2 had better rate values than appV6R1, when looking at their latency it's not possible to pick a winner, with appV6R2 being better in the first and final quarters and appV6R1 being better in the second and third quarters.

Performance Summary

The two "maybes" from the previous Issues and Improvements lived up to their promise, and appV6R2 delivered better performance than appV6R1. This is the redemption of the Computed Pattern applied at the document level. This revision is one of my favorites because it shows that the same optimization on very similar applications can lead to different results. In our case, the difference was caused by the application being heavily bottlenecked by disk throughput.

Issues and Improvements

Let's tackle the last improvement at the application level. Those paying close attention through the application versions may have already questioned it. Every Get Reports section states: "To fulfill the Get Reports operation, five aggregation pipelines are required, one for each date interval". Do we really need to run five aggregation pipelines to generate the reports document? Isn't there a way to calculate everything in just one operation? The answer is "Yes", there is.

The reports document is composed of the fields oneYear, threeYears, fiveYears, sevenYears, and tenYears, where, until now, each one was generated by its own aggregation pipeline. Generating the reports this way wastes processing power because part of the calculation is repeated multiple times. For example, to calculate the status totals for tenYears, we also have to calculate the status totals for the other fields, as from a date range perspective, they are all contained in the tenYears date range.

So, for our next application revision, we'll condense the Get Reports five aggregation pipelines into one, avoiding wasting processing power on repeated calculation.

Application Version 6 Revision 3 (appV6R3): Getting Everything at Once

Introduction

As discussed in the previous Issues and Improvements section, in this revision, we'll improve the performance of our application by changing the Get Reports functionality to generate the reports document using only one aggregation pipeline instead of five.

The rationale behind this improvement is that when we generate the tenYears totals, we have also calculated the other totals, oneYear, threeYears, fiveYears, and sevenYears. As an example, when we make a request to Get Reports with the key ...0001 with the date 2022-01-01, the totals will be calculated with the following date range:

oneYear: From 2021-01-01 to 2022-01-01
threeYears: From 2019-01-01 to 2022-01-01
fiveYears: From 2017-01-01 to 2022-01-01
sevenYears: From 2015-01-01 to 2022-01-01
tenYears: From 2012-01-01 to 2022-01-01

As we can see from the list above, the date range for tenYears includes all the other date ranges.
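The nesting of the date ranges can be sketched as a function that, for a single item date, lists every report it contributes to. This is an illustration under stated assumptions: each range is taken to start the named number of years before the report date and to be half-open (start inclusive, end exclusive), and the names REPORT_SPANS, yearsBefore, and reportsContaining are hypothetical.

```javascript
// Span in years for each report field (field names taken from the article).
const REPORT_SPANS = {
  oneYear: 1,
  threeYears: 3,
  fiveYears: 5,
  sevenYears: 7,
  tenYears: 10,
};

// Same calendar date, `years` years earlier (UTC).
function yearsBefore(date, years) {
  const shifted = new Date(date.getTime());
  shifted.setUTCFullYear(shifted.getUTCFullYear() - years);
  return shifted;
}

// Names of every report whose range [start, reportDate) contains itemDate.
function reportsContaining(itemDate, reportDate) {
  return Object.entries(REPORT_SPANS)
    .filter(
      ([, years]) =>
        itemDate >= yearsBefore(reportDate, years) && itemDate < reportDate
    )
    .map(([name]) => name);
}

const reportDate = new Date("2022-01-01");
const recentItem = reportsContaining(new Date("2021-06-05"), reportDate);
const oldItem = reportsContaining(new Date("2014-03-01"), reportDate);
console.log({ recentItem, oldItem });
```

An item inside the narrowest range contributes to all five totals, while an older item contributes only to the widest one, which is why a single pass over the tenYears buckets is enough to fill every field.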

Although we have successfully implemented the Computed Pattern in the previous revision, appV6R2, and got better results than appV6R1, we won't be using it as a base for this revision. There were two reasons for that:

Based on the results of our previous implementation of the Computed Pattern on a document level, from appV5R3 to appV5R4, I didn't expect it to get better results.
The implementation of the Get Reports to get the reports document through just one aggregation pipeline and also using the pre-computed field totals generated by the Computed Pattern would require a lot of work, and by the time of the latest versions of this series, I just wanted to finish it.

So, this revision will be built on appV6R1.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV6R0:

export type SchemaV6R0 = {
  _id: Buffer;
  items: Record<
    string,
    {
      a?: number;
      n?: number;
      p?: number;
      r?: number;
    }
  >;
};

Bulk Upsert

Based on the specifications, the following bulk updateOne operation is used for each event generated by the application:

const YYYYMMDD = getYYYYMMDD(event.date); // Extract the year (YYYY), month (MM), and day (DD) from the `event.date`

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) }, // key + year + quarter
    update: {
      $inc: {
        [`items.${YYYYMMDD}.a`]: event.approved,
        [`items.${YYYYMMDD}.n`]: event.noFunds,
        [`items.${YYYYMMDD}.p`]: event.pending,
        [`items.${YYYYMMDD}.r`]: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne has almost exactly the same logic as the one for appV6R1. The difference is that the name of the fields in the items document will be created based on year, month, and day (YYYYMMDD) instead of just month and day (MMDD). This change was made to reduce the complexity of the aggregation pipeline of the Get Reports.

Get Reports

To fulfill the Get Reports operation, only one aggregation pipeline is required:

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: buildTotalsField },
  { $group: groupCountTotals },
  { $project: format },
];

This aggregation operation has a similar logic to the one in appV6R1, with the only difference being the implementation of the $addFields stage.

{ $addFields: buildTotalsField }

It has a similar logic to the previous revision, where we first convert the items document into an array using the $objectToArray and then we use the reduce function to iterate over the array, accumulating the status.
The difference lies in the initial value and the logic of the reduce function.
The initial value in this case is an object/document with one field for each of the report date ranges. These fields for each report date range are also an object/document, with their fields being the possible status set to zero, as this is the initial value.
The logic in this case checks which date range the item is in and, based on that, increments the totals. If the item isInOneYearDateRange(...), it is also in all the other date ranges: three, five, seven, and ten years. If the item isInThreeYearsDateRange(...), it is also in all the wider date ranges: five, seven, and ten years.

The following JavaScript code is logically equivalent to the real aggregation pipeline code. Senior developers could make the argument that this implementation could be less verbose or more optimized, but due to how MongoDB aggregation pipeline operators are specified, this is how it was implemented.

 const itemsArray = Object.entries(items); // Convert the object to an array of [key, value]

 const totals = itemsArray.reduce(
   (totals, [YYYYMMDD, status]) => {
     const YYYY = YYYYMMDD.slice(0, 4); // Get year
     const MM = YYYYMMDD.slice(4, 6); // Get month
     const DD = YYYYMMDD.slice(6, 8); // Get day
     const statusDate = new Date(`${YYYY}-${MM}-${DD}`);

     if (isInOneYearDateRange(statusDate)) {
       totals.oneYear = incrementTotals(totals.oneYear, status);
       totals.threeYears = incrementTotals(totals.threeYears, status);
       totals.fiveYears = incrementTotals(totals.fiveYears, status);
       totals.sevenYears = incrementTotals(totals.sevenYears, status);
       totals.tenYears = incrementTotals(totals.tenYears, status);
     } else if (isInThreeYearsDateRange(statusDate)) {
       totals.threeYears = incrementTotals(totals.threeYears, status);
       totals.fiveYears = incrementTotals(totals.fiveYears, status);
       totals.sevenYears = incrementTotals(totals.sevenYears, status);
       totals.tenYears = incrementTotals(totals.tenYears, status);
     } else if (isInFiveYearsDateRange(statusDate)) {
       totals.fiveYears = incrementTotals(totals.fiveYears, status);
       totals.sevenYears = incrementTotals(totals.sevenYears, status);
       totals.tenYears = incrementTotals(totals.tenYears, status);
     } else if (isInSevenYearsDateRange(statusDate)) {
       totals.sevenYears = incrementTotals(totals.sevenYears, status);
       totals.tenYears = incrementTotals(totals.tenYears, status);
     } else if (isInTenYearsDateRange(statusDate)) {
       totals.tenYears = incrementTotals(totals.tenYears, status);
     }

     return totals;
   },
   {
     oneYear: { a: 0, n: 0, p: 0, r: 0 },
     threeYears: { a: 0, n: 0, p: 0, r: 0 },
     fiveYears: { a: 0, n: 0, p: 0, r: 0 },
     sevenYears: { a: 0, n: 0, p: 0, r: 0 },
     tenYears: { a: 0, n: 0, p: 0, r: 0 },
   },
 );
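The code above relies on `incrementTotals` and on the `isIn...DateRange` predicates without listing them. A minimal sketch of what they could look like follows; the bodies, the `makeIsInYearsRange` factory, and the half-open range `[start, end)` are assumptions of this example, not the article's implementation.

```typescript
type Status = { a?: number; n?: number; p?: number; r?: number };
type Totals = { a: number; n: number; p: number; r: number };

// Adds one day's status counts to a running total; missing fields count as 0.
const incrementTotals = (totals: Totals, status: Status): Totals => ({
  a: totals.a + (status.a ?? 0),
  n: totals.n + (status.n ?? 0),
  p: totals.p + (status.p ?? 0),
  r: totals.r + (status.r ?? 0),
});

// Hypothetical factory: builds a predicate that is true when a date falls
// within the `years` years that end at reportEndDate (end excluded).
const makeIsInYearsRange = (reportEndDate: Date, years: number) => {
  const start = new Date(reportEndDate.getTime());
  start.setUTCFullYear(start.getUTCFullYear() - years);
  return (date: Date): boolean =>
    date.getTime() >= start.getTime() && date.getTime() < reportEndDate.getTime();
};

const reportEndDate = new Date("2022-06-25T00:00:00.000Z");
const isInOneYearDateRange = makeIsInYearsRange(reportEndDate, 1);
const isInThreeYearsDateRange = makeIsInYearsRange(reportEndDate, 3);

console.log(isInOneYearDateRange(new Date("2022-01-10T00:00:00.000Z"))); // true
console.log(isInOneYearDateRange(new Date("2021-06-24T00:00:00.000Z"))); // false
console.log(isInThreeYearsDateRange(new Date("2021-06-24T00:00:00.000Z"))); // true
```

Since every range ends at the same report date, the ranges are nested, which is why the if/else chain can start from the narrowest range and update all the wider ones in the same branch.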

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV6R3, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Document Size	Storage Size	Indexes	Index Size
appV6R1	33,429,366	8.19GB	264B	2.34GB	1	1.22GB
appV6R2	33,429,207	9.11GB	293B	2.8GB	1	1.26GB
appV6R3	33,429,694	9.53GB	307B	2.56GB	1	1.19GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/events	Index Size/events	Total Size/events
appV6R1	17.6B	2.6B	20.2B
appV6R2	19.6B	2.7B	22.3B
appV6R3	20.5B	2.6B	23.1B

Because we are adding the year (YYYY) information to the name of each items document field, we got a 16.3% increase in document size when compared to appV6R1 and a 4.8% increase when compared to appV6R2. This increase may be compensated by the gains in the Get Reports function, as we saw when going from appV6R1 to appV6R2.
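The per-event figures and the percentages above can be reproduced with a few lines of arithmetic. The sketch below assumes the sizes in the Collection Statistics table are binary gigabytes (1024³ bytes), which is the interpretation that matches the published per-event values; the percentages match the Document Size column (264B to 307B and 293B to 307B). The function names are made up for this example.

```typescript
const BYTES_PER_GB = 1024 * 1024 * 1024; // assumption: binary gigabytes
const EVENTS = 500000000;

// Size per event in bytes, rounded to one decimal place.
const bytesPerEvent = (sizeGB: number): number =>
  Math.round(((sizeGB * BYTES_PER_GB) / EVENTS) * 10) / 10;

// Relative increase in percent, rounded to one decimal place.
const percentIncrease = (from: number, to: number): number =>
  Math.round(((to - from) / from) * 1000) / 10;

console.log(bytesPerEvent(9.53)); // 20.5 (appV6R3 Data Size/events)
console.log(bytesPerEvent(1.19)); // 2.6 (appV6R3 Index Size/events)
console.log(percentIncrease(264, 307)); // 16.3 (appV6R1 -> appV6R3)
console.log(percentIncrease(293, 307)); // 4.8 (appV6R2 -> appV6R3)
```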

Load Test Results

Executing the load test for appV6R3 and plotting it alongside the results for appV6R2, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rate

We have a huge improvement here when going from appV6R2 to appV6R3. For the first time, the application was able to reach all the desired rates in one phase.

Get Reports Latency

The latency also got huge improvements, with the peak value being reduced by 71% in the first phase, 67% in the second phase, 47% in the third phase, and 30% in the fourth phase.

Bulk Upsert Rate

As had happened in the previous version, the application was able to reach all the desired rates.

Bulk Upsert Latency

Here we have one of the biggest gains of this series: the latency went from being measured in seconds to being measured in milliseconds. We went from a peak of 1.8 seconds to 250ms in the first phase, from 2.3 seconds to 400ms in the second phase, from 2 seconds to 600ms in the third phase, and from 2.2 seconds to 800ms in the fourth phase.

Issues and Improvements

The main bottleneck in our MongoDB server is still the disk throughput. As stated in the previous Issues and Improvements, this was the last improvement at the application level, so how can we extract more from our current hardware?

If we take a closer look at the MongoDB documentation, we'll find out that by default it uses block compression with the snappy compression library for all collections. Before the data is written to disk, it'll be compressed using the snappy library to reduce its size and speed up the writing process.

Would it be possible to use a different and more effective compression library to reduce the size of the data even further and, as a consequence, reduce the load on the server's disk? Yes, it is, and in the next application revision, we will use the zstd compression library instead of the default snappy compression library.

Application Version 6 Revision 4 (appV6R4): The `zstd` Compression Algorithm

Introduction

As discussed in the previous Issues and Improvements section, the performance gains of this version will be provided by changing the algorithm of the collection block compressor. By default, MongoDB uses snappy, which we will change to zstd to get better compression at the expense of more CPU usage.

All the schemas, functions, and code from this version are exactly the same as the appV6R3.

To create a collection that uses the zstd compression algorithm, the following command can be used.

db.createCollection("<collection-name>", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zstd" } },
});
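Since the application in this series is written in TypeScript, the same options document could also be built in application code and handed to the driver. The sketch below only builds the options; `blockCompressorOptions` and the collection name in the comment are hypothetical names for this example, and the commented call assumes an already connected `db` object from the Node.js driver.

```typescript
// Block compressors accepted by MongoDB's WiredTiger storage engine.
type BlockCompressor = "none" | "snappy" | "zlib" | "zstd";

// Builds the same options document that is passed to db.createCollection above.
const blockCompressorOptions = (compressor: BlockCompressor) => ({
  storageEngine: {
    wiredTiger: { configString: `block_compressor=${compressor}` },
  },
});

const options = blockCompressorOptions("zstd");
console.log(options.storageEngine.wiredTiger.configString); // block_compressor=zstd

// Possible usage with the Node.js driver (not executed here):
// await db.createCollection("appV6R4", options);
```

The block compressor is set when the collection is created, so an existing snappy collection is not converted by this call; the data has to be written into a newly created collection.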

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV6R0:

export type SchemaV6R0 = {
  _id: Buffer;
  items: Record<
    string,
    {
      a?: number;
      n?: number;
      p?: number;
      r?: number;
    }
  >;
};

Bulk Upsert

Based on the specifications, the following bulk updateOne operation is used for each event generated by the application:

const YYYYMMDD = getYYYYMMDD(event.date); // Extract the year(YYYY), month(MM), and day(DD) from the `event.date`

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) }, // key + year + quarter
    update: {
      $inc: {
        [`items.${YYYYMMDD}.a`]: event.approved,
        [`items.${YYYYMMDD}.n`]: event.noFunds,
        [`items.${YYYYMMDD}.p`]: event.pending,
        [`items.${YYYYMMDD}.r`]: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne has exactly the same logic as the one for appV6R3.

Get Reports

Based on what was presented in the Introduction, we have the following aggregation pipeline to generate the reports document.

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: buildTotalsField },
  { $group: groupCountTotals },
  { $project: format },
];

This pipeline has exactly the same logic as the one for appV6R3.

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV6R4, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Document Size	Storage Size	Indexes	Index Size
appV6R3	33,429,694	9.53GB	307B	2.56GB	1	1.19GB
appV6R4	33,429,372	9.53GB	307B	1.47GB	1	1.34GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Storage Size and Index Size by the 500 million events.

Collection	Storage Size/events	Index Size/events	Total Storage Size/events
appV6R3	5.5B	2.6B	8.1B
appV6R4	3.2B	2.8B	6.0B

As the application implementation of appV6R4 is the same as appV6R3, the values for Data Size and Document Size are the same, and the Index Size is similar. The difference lies in Storage Size, which represents the Data Size after compression. Going from snappy to zstd decreased the Storage Size by a jaw-dropping 43%. Looking at the Event Statistics, there was a reduction of 26% in the storage required to register each event, going from 8.1 bytes to 6 bytes. These considerable reductions in size will probably translate to better performance in this version, as our main bottleneck is disk throughput.

Load Test Results

Executing the load test for appV6R4 and plotting it alongside the results for appV6R3, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rate

Even though we weren't able to reach all the desired rates, we got another huge improvement when going from appV6R3 to appV6R4. We could almost consider that, in this revision, we were also able to reach the desired rates in the first, second, and third quarters.

Get Reports Latency

The latency also got huge improvements, with the peak value being reduced by 30% in the first phase, 57% in the second phase, 61% in the third phase, and 57% in the fourth phase.

Bulk Upsert Rate

As had happened in the previous version, the application was able to reach all the desired rates.

Bulk Upsert Latency

Here we also got considerable improvements, with the peak value being reduced by 48% in the first phase, 39% in the second phase, 43% in the third phase, and 47% in the fourth phase.

Issues and Improvements

Although this is the last version and revision of the series, there is still room for improvement. For those willing to try them on their own, here are the ones I was able to think of:

Use the Computed Pattern in the appV6R4.
Optimize the aggregation pipeline logic for Get Reports in the appV6R4.
Change the zstd compression level from its default value 6 to a higher value.

Conclusion

This final part of "The Cost of Not Knowing MongoDB" series has explored the ultimate evolution of MongoDB application optimization, demonstrating how revolutionary design patterns and infrastructure-level improvements can transcend traditional performance boundaries. The journey through appV6R0 to appV6R4 represents the culmination of sophisticated MongoDB development practices, achieving performance levels that seemed impossible with the baseline appV1 implementation.

Series Transformation Summary

From Foundation to Revolution:

The complete series showcases a remarkable transformation across three distinct optimization phases:

Part 1 (appV1-appV4): Document-level optimizations achieving 51% storage reduction through schema refinement, data type optimization, and strategic indexing
Part 2 (appV5R0-appV5R4): Advanced pattern implementation with Bucket and Computed patterns, delivering 89% index size reduction and first-time achievement of target rates
Part 3 (appV6R0-appV6R4): Revolutionary Dynamic Schema Pattern with infrastructure optimization, culminating in sub-second latencies and comprehensive target rate achievement

Performance Evolution:

The progression reveals exponential improvements across all metrics:

Get Reports Latency: From 6.5 seconds (appV1) to 200-800ms (appV6R4) - a 92% improvement
Bulk Upsert Latency: From 62 seconds (appV1) to 250-800ms (appV6R4) - a 99% improvement
Storage Efficiency: From 128.1B per event (appV1) to 6.0B per event (appV6R4) - a 95% reduction
Target Rate Achievement: From consistent failures to sustained success across all operational phases

Architectural Paradigm Shifts

The Dynamic Schema Pattern Revolution:

appV6R0 through appV6R4 introduced the most sophisticated MongoDB design pattern explored in this series. The Dynamic Schema Pattern fundamentally redefined data organization by:

Eliminating Array Overhead: Replacing MongoDB arrays with computed object structures to minimize storage and processing costs
Single-Pipeline Optimization: Consolidating five separate aggregation pipelines into one optimized operation, reducing computational overhead by 80%
Infrastructure-Level Optimization: Implementing zstd compression, achieving 43% additional storage reduction over default snappy compression

Query Optimization Breakthroughs:

The implementation of intelligent date range calculation within aggregation pipelines eliminated redundant operations while maintaining data accuracy. This approach demonstrates senior-level MongoDB development by leveraging advanced aggregation framework capabilities to achieve both performance and maintainability.

Critical Technical Insights

Performance Bottleneck Evolution:

Throughout the series, we observed how optimization focus shifted as bottlenecks were resolved:

Initial Phase: Index size and query inefficiency dominated performance
Intermediate Phase: Document retrieval count became the limiting factor
Advanced Phase: Aggregation pipeline complexity constrained throughput
Final Phase: Disk I/O emerged as the ultimate hardware limitation

Pattern Application Maturity:

The series demonstrates the progression from junior to senior MongoDB development practices:

Junior Level: Schema design without understanding indexing implications (appV1)
Intermediate Level: Applying individual optimization techniques (appV2-appV4)
Advanced Level: Implementing established MongoDB patterns (appV5RX)
Senior Level: Creating custom patterns and infrastructure optimization (appV6RX)

Production Implementation Guidelines

When to Apply Each Pattern:

Based on the comprehensive analysis, the following guidelines emerge for production implementations:

Document-Level Optimizations: Essential for all MongoDB applications, providing 40-60% improvement with minimal complexity
Bucket Pattern: Optimal for time-series data with 10:1 or greater read-to-write ratios
Computed Pattern: Most effective in read-heavy scenarios with predictable aggregation requirements
Dynamic Schema Pattern: Reserved for high-performance applications where development complexity trade-offs are justified

Infrastructure Considerations:

The zstd compression implementation in appV6R4 demonstrates that infrastructure-level optimizations can provide substantial benefits (40%+ storage reduction) with minimal application changes. However, these optimizations require careful CPU utilization monitoring and may not be suitable for CPU-constrained environments.

The True Cost of Not Knowing MongoDB

This series reveals that the "cost" extends far beyond mere performance degradation:

Quantifiable Impacts:

Resource Utilization: Up to 20x more storage requirements for equivalent functionality
Infrastructure Costs: Potentially 10x higher hardware requirements due to inefficient patterns
Developer Productivity: Months of optimization work that could be avoided with proper initial design
Scalability Limitations: Fundamental architectural constraints that become exponentially expensive to resolve

Hidden Complexities:

More critically, the series demonstrates that MongoDB's apparent simplicity can mask sophisticated optimization requirements. The transition from appV1 to appV6R4 required a deep understanding of:

Aggregation framework internals and optimization strategies
Index behavior with different data types and query patterns
Storage engine compression algorithms and trade-offs
Memory management and cache utilization patterns

Final Recommendations

For Development Teams:

Invest in MongoDB Education: The performance differences documented in this series justify substantial training investments
Establish Pattern Libraries: Codify successful patterns like those demonstrated to prevent anti-pattern adoption
Implement Performance Testing: Regular load testing reveals optimization opportunities before they become production issues
Plan for Iteration: Schema evolution is inevitable; design systems that accommodate architectural improvements

For Architectural Decisions:

Start with Fundamentals: Proper indexing and schema design provide the foundation for all subsequent optimizations
Measure Before Optimizing: Each optimization phase in this series was guided by comprehensive performance measurement
Consider Total Cost of Ownership: The development complexity of advanced patterns must be weighed against performance requirements
Plan Infrastructure Scaling: Understanding that hardware limitations will eventually constrain software optimizations

Closing Reflection

The journey from appV1 to appV6R4 demonstrates that MongoDB mastery requires understanding not just the database itself, but the intricate relationships between schema design, query patterns, indexing strategies, aggregation frameworks, and infrastructure capabilities. The 99% performance improvements documented in this series are achievable, but they demand dedication to continuous learning and sophisticated engineering practices.

For organizations serious about MongoDB performance, this series provides both a roadmap for optimization and a compelling case for investing in advanced MongoDB expertise. The cost of not knowing MongoDB extends far beyond individual applications—it impacts entire technology strategies and competitive positioning in data-driven markets.

The patterns, techniques, and insights presented throughout this three-part series offer a comprehensive foundation for building high-performance MongoDB applications that can scale efficiently while maintaining operational excellence. Most importantly, they demonstrate that with proper knowledge and application, MongoDB can deliver extraordinary performance that justifies its position as a leading database technology for modern applications.

The Cost of Not Knowing MongoDB - Part 2: appV5R0 to appV5R4

Artur Garcia Costa — Thu, 22 Jan 2026 16:51:29 +0000

Application Version 5 Revision 0 and Revision 1: A simple way to use the Bucket Pattern
Application Version 5 Revision 2: Using the Bucket Pattern with the Computed Pattern
Application Version 5 Revision 3: Removing an aggregation pipeline anti-pattern
Application Version 5 Revision 4: Doubling down on the Computed Pattern

Article Introduction

Welcome to the second part of the series, "The Cost of Not Knowing MongoDB". Building upon the foundational optimizations explored in Part 1, this article delves into advanced MongoDB design patterns that can dramatically transform application performance.

In Part 1, we achieved significant improvements through field concatenation, data type optimization, and strategic field naming. However, as identified in the Issues and Improvements of appV4, these approaches represent only the beginning of what's possible with MongoDB schema design. This part introduces a paradigm shift from micro-optimizations to architectural patterns that fundamentally change how data is stored and retrieved.

The journey through appV5R0 to appV5R4 demonstrates the progressive implementation of two powerful MongoDB design patterns: the Bucket Pattern and the Computed Pattern.

Through comprehensive performance analysis and detailed implementation examples, this part reveals both the tremendous potential and important limitations of these advanced patterns, setting the stage for the revolutionary approaches explored in Part 3.

Application Version 5 Revision 0 and Revision 1 (appV5R0 and appV5R1): A simple way to use the `Bucket Pattern`

Introduction

When generating the oneYear totals report, the Get Reports function will need to retrieve an average of 60 documents and, in the worst-case scenario, 365 documents. To access each document, one index entry must be visited, and one disk read operation must be performed.

One way to reduce the number of index entries and documents retrieved to generate the report is to use the Bucket Pattern. According to the MongoDB documentation, "The bucket pattern separates long series of data into distinct objects. Separating large data series into smaller groups can improve query access patterns and simplify application logic."

Looking at our application from the perspective of the Bucket Pattern, so far, we have bucketed our data by day and user, with each document containing the status totals for one user in one day. For the two application versions presented in this section, appV5R0 and appV5R1, we’ll bucket the data by month (appV5R0) and by quarter (appV5R1).

As these are our first implementations using the Bucket Pattern, let’s make it as simple as possible.

For appV5R0, each document groups the events by month and user. Every document will have a field of type array called items to which each event document will be pushed. The event document pushed to the array will have its status field names shorthanded to its first letter, the same way we did in appV3 and appV4, and the date to which the event refers.

The _id field will have a logic similar to the one used in appV4, with the values of key and date concatenated and stored as hexadecimal/binary information. The difference is that the date value, instead of being composed of year, month, and day (YYYYMMDD), will only have year and month (YYYYMM), as we are bucketing the data by month.

For appV5R1, we have almost the same implementation as appV5R0, with the difference being that we’ll bucket the events by quarter, and the date value used to generate the _id field will be composed of year and quarter (YYYYQQ) instead of year and month (YYYYMM).

To build the _id field based on the key and date values for the appV5R0, the following TypeScript function was created:

const buildId = (key: string, date: Date): Buffer => {
  const [YYYY, MM] = date.toISOString().split("-");

  return Buffer.from(`${key}${YYYY}${MM}`, "hex");
};

To build the _id field based on the key and date values for the appV5R1, the following TypeScript functions were created:

const getQQ = (date: Date): string => {
  const month = Number(getMM(date));

  if (month >= 1 && month <= 3) return "01";
  else if (month >= 4 && month <= 6) return "02";
  else if (month >= 7 && month <= 9) return "03";
  else return "04";
};

const buildId = (key: string, date: Date): Buffer => {
  const [YYYY] = date.toISOString().split("-");
  const QQ = getQQ(date);

  return Buffer.from(`${key}${YYYY}${QQ}`, "hex");
};

This implementation reflects the knowledge of an intermediate MongoDB developer, as it uses the Bucket Pattern in its simplest possible form.
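To make the resulting ids easier to picture, the sketch below builds the hexadecimal text that the two buildId functions would hand to `Buffer.from(..., "hex")`. The `getMM` body is an assumption (the function is used above but not listed in this section), the compact `getQQ` is meant to be equivalent to the one shown earlier, and `monthIdHex` and `quarterIdHex` are hypothetical names that return strings instead of Buffers so the values can be read directly. The short key is also made up for readability.

```typescript
// Assumed helper: zero-padded UTC month, e.g. "06" for June.
const getMM = (date: Date): string => ("0" + (date.getUTCMonth() + 1)).slice(-2);

// Compact equivalent of the getQQ shown above: months 1-3 -> "01", ..., 10-12 -> "04".
const getQQ = (date: Date): string => `0${Math.ceil(Number(getMM(date)) / 3)}`;

// Hex text of a monthly bucket id (appV5R0): key + YYYY + MM.
const monthIdHex = (key: string, date: Date): string =>
  `${key}${date.getUTCFullYear()}${getMM(date)}`;

// Hex text of a quarterly bucket id (appV5R1): key + YYYY + QQ.
const quarterIdHex = (key: string, date: Date): string =>
  `${key}${date.getUTCFullYear()}${getQQ(date)}`;

const date = new Date("2022-06-25T00:00:00.000Z");
console.log(monthIdHex("0001", date)); // 0001202206
console.log(quarterIdHex("0001", date)); // 0001202202
```

For a fixed key, these ids sort in chronological order, which is what allows a range filter on `_id` to select all the buckets that overlap a report period.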

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV5R0:

type SchemaV5R0 = {
  _id: Buffer; // Concatenated user key + time period (YYYYMM or YYYYQQ)
  items: Array<{
    date: Date;
    a?: number; // approved count
    n?: number; // noFunds count
    p?: number; // pending count
    r?: number; // rejected count
  }>;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) },
    update: {
      $push: {
        items: {
          date: event.date,
          a: event.approved,
          n: event.noFunds,
          p: event.pending,
          r: event.rejected,
        },
      },
    },
    upsert: true,
  },
};

filter:

Target the document where the _id field matches the concatenated value of key, year, and month/quarter.
The buildId function converts the key+year+month/quarter into a binary format.

update:

Uses $push to append the new event to the items array.

upsert:

Ensures a new document is created if no matching document exists.

Get Reports

const pipeline = [
  {
    $match: {
      _id: {
        $gte: buildId(request.key, reportStartDate),
        $lte: buildId(request.key, reportEndDate),
      },
    },
  },
  {
    $unwind: {
      path: "$items",
    },
  },
  {
    $match: {
      "items.date": {
        $gte: reportStartDate,
        $lt: reportEndDate,
      },
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$items.a" },
      noFunds: { $sum: "$items.n" },
      pending: { $sum: "$items.p" },
      rejected: { $sum: "$items.r" },
    },
  },
  { $project: { _id: 0 } },
];

{ $match: {...} }

The _id field is a binary representation of the concatenated key and date values.
The $gte operator specifies the start of the date range, while $lte specifies the end.
The result of buildId contains information by month/quarter, not by day as we need to build the report, so further filtering by day will be necessary.

{ $unwind: {...} }

Deconstructs the items array, creating separate documents for each event within the matched buckets.

{ $match: {...} }

Applies precise date filtering at the individual event level, ensuring only events within the exact report date range are included.
It may look like we have already filtered by date, but as explained for the first stage, that filter works at the month/quarter level, and to generate the report, we need to filter by day.

{ $group: {...} }

Groups the filtered documents into a single result.
The _id field is set to null to group all matching documents from the previous stage together.
Computes the sum of the a, n, p, and r fields using the $sum operator.

{ $project: {...} }

Removes the _id field from the final result.
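The stages above can be mirrored with plain array operations, which may help to see why the second date filter is needed. This is an in-memory sketch for illustration only, not the code that runs on the server: ids are kept as hex strings instead of Buffers, the sample data is invented, and `getReport` is a hypothetical name.

```typescript
type Item = { date: Date; a?: number; n?: number; p?: number; r?: number };
type Bucket = { _id: string; items: Item[] }; // _id as hex text, e.g. key + YYYYMM
type Report = { approved: number; noFunds: number; pending: number; rejected: number };

const getReport = (
  buckets: Bucket[],
  startId: string,
  endId: string,
  reportStartDate: Date,
  reportEndDate: Date,
): Report =>
  buckets
    // First $match: select whole buckets by _id range (month/quarter precision).
    .filter((bucket) => bucket._id >= startId && bucket._id <= endId)
    // $unwind: one element per event stored in the matched buckets.
    .reduce<Item[]>((items, bucket) => items.concat(bucket.items), [])
    // Second $match: keep only events inside the exact report date range.
    .filter(
      (item) =>
        item.date.getTime() >= reportStartDate.getTime() &&
        item.date.getTime() < reportEndDate.getTime(),
    )
    // $group + $project: sum each status; missing fields count as 0.
    .reduce<Report>(
      (report, item) => ({
        approved: report.approved + (item.a ?? 0),
        noFunds: report.noFunds + (item.n ?? 0),
        pending: report.pending + (item.p ?? 0),
        rejected: report.rejected + (item.r ?? 0),
      }),
      { approved: 0, noFunds: 0, pending: 0, rejected: 0 },
    );

const buckets: Bucket[] = [
  {
    _id: "0001202205",
    items: [
      { date: new Date("2022-05-10T00:00:00.000Z"), a: 1 },
      { date: new Date("2022-05-25T00:00:00.000Z"), r: 1 },
    ],
  },
  { _id: "0001202206", items: [{ date: new Date("2022-06-03T00:00:00.000Z"), a: 2 }] },
  { _id: "0001202207", items: [{ date: new Date("2022-07-01T00:00:00.000Z"), p: 1 }] },
];

const report = getReport(
  buckets,
  "0001202205",
  "0001202206",
  new Date("2022-05-20T00:00:00.000Z"),
  new Date("2022-06-25T00:00:00.000Z"),
);
console.log(report); // { approved: 2, noFunds: 0, pending: 0, rejected: 1 }
```

In this example, the May bucket is selected by the `_id` range, but its event from May 10 lies before the report start date, so only the day-level filter keeps it out of the totals.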

Indexes

These implementations leverage the existing _id index exclusively, eliminating the need for additional compound indexes. The Bucket Pattern's consolidation of multiple events into a single document reduces index size and improves cache efficiency.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV5R0 and appV5R1, we inserted 500 million event documents into the collections using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV4	359,615,279	19.66GB	59B	6.69GB	1	9.50GB
appV5R0	95,350,431	19.19GB	217B	5.06GB	1	2.95GB
appV5R1	33,429,649	15.75GB	506B	4.04GB	1	1.09GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV4	42.2B	20.4B	62.6B
appV5R0	41.2B	6.3B	47.5B
appV5R1	33.8B	2.3B	36.1B

Analyzing the tables above, we can see that going from appV4 to appV5R0, we practically didn’t have improvements when looking at Data Size, but when considering the Index Size, the improvement was quite considerable. The index size for appV5R0 is 69% smaller than that of appV4.

When considering going from appV4 to appV5R1, the gains are even more impressive. In this case, we reduced the Data Size by 20% and the Index Size by 89%.

Looking at the event stats, we had considerable improvements in the Total Size/Event, but what really catches the eye is the improvement in the Index Size/Event, which is three times smaller for appV5R0 and nine times smaller for appV5R1.

This huge reduction in the index size is due to the use of the Bucket Pattern, where one document will store data for many events, reducing the total number of documents and, as a consequence, reducing the number of index entries.

With these impressive improvements regarding index size, it’s quite probable that we’ll also see impressive improvements in the application performance. One point of attention in the values presented above is that the index size of the two new versions is smaller than the memory size of the machine running the database, allowing the whole index to be kept in the cache, which is very good from a performance point of view.

Load Test Results

Executing the load test for appV5R0 and appV5R1 and plotting it alongside the results for appV4 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

For the first time, the application is able to reach the target rate for both appV5R0 and appV5R1. appV5R1 nearly reaches all desired rates during the initial test quarter. Both versions demonstrate a clear performance advantage when compared to appV4, and appV5R1 shows significantly better results than appV5R0.

Get Reports Latency

Both new versions have notably lower latencies than appV4, and without degrading in the final half of the test. The appV5R1 reaches a peak latency of 211ms while appV5R0 reaches a peak latency of 530ms.

Bulk Upsert Rates

Both new versions almost reach all the desired rates throughout the test duration, degrading only in the final 20 minutes. It's possible to see that appV5R1 has a better performance than appV5R0.

Bulk Upsert Latency

Even though appV4 is able to reach lower latencies than appV5R1 and appV5R0 for some parts of the first half of the test, this lower value is due to the requests being queued instead of the implementation being better. For the final half of the test, the two new versions are clearly a better solution with better values. Both new versions have the same value for peak latency, but the average latency for appV5R1 is lower than appV5R0.

Performance Analysis

The results clearly establish that quarterly bucketing (appV5R1) provides superior performance compared to monthly bucketing (appV5R0), validating the principle that larger bucket sizes can improve performance when appropriately balanced against query complexity.

Issues and Improvements

Because we made this first implementation of the Bucket Pattern as simple as possible, some clear possible optimizations weren’t considered. The main one is how we handle the items array field. In the current implementation, we just push the event documents to it, even when we already have events for a specific day.

A clear optimization here is one that we have been using from appV1 to appV4, where we create just one document per key and date/day, and when we have many events for the same key and date/day, we just increment the status of the document based on the status of the event.

Applying this optimization, we’ll reduce the size of the documents because the array of items will have fewer elements. We’ll also reduce the computational cost of generating the reports because we are pre-computing the status totals by day. This practice of pre-computing is so common that it has its own name: the Computed Pattern.
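The effect of this optimization on the items array can be pictured with a small in-memory sketch. It only illustrates the idea of merging events that share a day; it is not the update operation sent to MongoDB, and `addEvent` and the sample events are invented for this example.

```typescript
type DayItem = { date: Date; a?: number; n?: number; p?: number; r?: number };

// Sums two optional counters, treating a missing one as 0.
const sum = (x?: number, y?: number): number => (x ?? 0) + (y ?? 0);

// Returns a new items array: merges the event into the element with the same
// date if one exists, otherwise appends the event as a new element.
const addEvent = (items: DayItem[], event: DayItem): DayItem[] => {
  const index = items.findIndex(
    (item) => item.date.getTime() === event.date.getTime(),
  );
  if (index === -1) return [...items, event];
  return items.map((item, i) =>
    i === index
      ? {
          date: item.date,
          a: sum(item.a, event.a),
          n: sum(item.n, event.n),
          p: sum(item.p, event.p),
          r: sum(item.r, event.r),
        }
      : item,
  );
};

const day1 = new Date("2022-06-25T00:00:00.000Z");
const day2 = new Date("2022-06-26T00:00:00.000Z");
const events: DayItem[] = [
  { date: day1, a: 1 },
  { date: day1, a: 1 },
  { date: day2, r: 1 },
];

const items = events.reduce<DayItem[]>(addEvent, []);
console.log(items.length); // 2
console.log(items[0].a); // 2
```

With the plain `$push` approach, the same three events would produce three array elements; here, the two events from the same day collapse into one element holding the pre-computed total.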

Application Version 5 Revision 2 (appV5R2): Using the Bucket Pattern with the Computed Pattern

Introduction

As discussed in the Issues and Improvements of appV5R0 and appV5R1, we can use the Computed Pattern to pre-compute the total status by day in the items array field when inserting a new event. This reduces the computation cost of generating the reports and also reduces the document size by having fewer elements in the items array field.

Most of this application version will be similar to appV5R1, where we bucketed the events by quarter. The only difference will be in the Bulk Upsert operation, where we will update an element in the items array field if an element with the same date as the new event already exists, or insert a new element in items if no such element exists.

The implementation showcases senior-level MongoDB development practices, utilizing advanced aggregation pipeline features within update operations.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV5R0:

type SchemaV5R0 = {
  _id: Buffer; // User key + quarter (YYYYQQ)
  items: Array<{
    date: Date;
    a?: number; // approved total for the day
    n?: number; // noFunds total for the day
    p?: number; // pending total for the day
    r?: number; // rejected total for the day
  }>;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) },
    update: [
      { $set: { result: sumIfItemExists } },
      { $set: { items: returnItemsOrCreateNew } },
      { $unset: ["result"] },
    ],
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV5R1, with the only difference being the update logic.

update:

The complete code for this update logic is quite big, hard to get your head around quickly, and would also make the process of browsing through the article a little cumbersome. Because of that, here we have a pseudocode of it.
Our goal in this update operation is to increment the status of an element in the items array if an element with the same date of the new event already exists, or create a new element if there isn’t one with the same date. It’s not possible to achieve this functionality with the MongoDB Update Operators. The way around it is to use Update with Aggregation Pipeline, which allows a more expressive update statement.
To facilitate the understanding of the logic used in each stage of the aggregation pipeline, a simplified JavaScript version of the functionalities will be provided:

$set: { result: sumIfItemExists }:

Set the field result to the logic of the variable sumIfItemExists. As the name suggests, this logic will iterate through the items array looking for elements with the same date as the event. If there is one, this element will have the status present in the event added to it. To keep track of whether an element with the same date as the event was found and the event status was registered, the logic carries a boolean flag called found.
```
 const result = items.reduce(
   (accumulator, element) => {
     // Compare timestamps: two distinct Date objects are never === in JavaScript
     if (element.date.getTime() === event.date.getTime()) {
       // Treat a missing status as 0 so the sum never becomes NaN
       element.a = (element.a || 0) + (event.a || 0);
       element.n = (element.n || 0) + (event.n || 0);
       element.p = (element.p || 0) + (event.p || 0);
       element.r = (element.r || 0) + (event.r || 0);

       accumulator.found = true;
     }

     accumulator.items.push(element);

     return accumulator;
   },
   { found: false, items: [] }
 );
```
The result field is generated by a reduce over the items array field of the document we want to update. The initial value of the reduce is an object with the fields found and items. The field accumulator.found starts as false and signals whether an element visited during the reduce execution had the same date as the event we want to register. When an element has the same date as the event, its status values are incremented by the status of the event and accumulator.found is set to true, indicating that the event was registered. Every element, updated or not, is pushed to accumulator.items, which becomes the new items array field.
$set: { items: returnItemsOrCreateNew }:
Set the field items to the result of the logic in the variable returnItemsOrCreateNew. As the name suggests, this logic returns the items field of the previous stage when an element with the same date as the event was found (found == true). Otherwise (found == false), it returns a new array generated by concatenating the items array field of the previous stage with an array containing the event element.
```
 let items = [];

 if (result.found == true) {
   items = result.items;
 } else {
   items = result.items.concat([event]);
 }
```
$unset: ["result"]:
Removes the temporary result field created during the aggregation process.

This sophisticated update operation achieves the equivalent of an "upsert within an array" - functionality that requires careful orchestration of MongoDB's aggregation capabilities.
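To make the combined effect of the two $set stages easier to follow, here is a minimal in-memory sketch of the same "upsert within an array" behavior. It is plain TypeScript, not the article's aggregation pipeline: the helper name upsertItem and the sample dates are assumptions made for illustration, and unlike the simplified snippets above it leaves statuses that are absent from the event untouched.

```typescript
// Daily status totals; every status is optional, as in the schema above.
type StatusTotals = { a?: number; n?: number; p?: number; r?: number };
type Item = StatusTotals & { date: Date };

const STATUS_KEYS = ["a", "n", "p", "r"] as const;

// Hypothetical helper: returns a new items array with the event merged in.
function upsertItem(currentItems: Item[], newEvent: Item): Item[] {
  const result: Item[] = [];
  let found = false;

  for (const element of currentItems) {
    // Compare timestamps, because distinct Date objects are never ===.
    if (element.date.getTime() === newEvent.date.getTime()) {
      found = true;
      const updated: Item = { ...element };
      for (const statusKey of STATUS_KEYS) {
        const increment = newEvent[statusKey];
        // Only touch statuses present in the event, so absent fields stay absent.
        if (increment !== undefined) {
          updated[statusKey] = (element[statusKey] ?? 0) + increment;
        }
      }
      result.push(updated);
    } else {
      result.push(element);
    }
  }

  // No element for that day yet: append the event as a new element.
  if (!found) result.push({ ...newEvent });
  return result;
}

const day1 = new Date("2022-06-15T00:00:00Z");
const day2 = new Date("2022-06-16T00:00:00Z");

let items: Item[] = [];
items = upsertItem(items, { date: day1, a: 1 });
items = upsertItem(items, { date: new Date(day1.getTime()), a: 1, r: 1 });
items = upsertItem(items, { date: day2, p: 1 });
// items now holds two elements: day1 with a = 2 and r = 1, and day2 with p = 1.
```

Three events produce only two array elements, which is where the document size reduction of this revision comes from.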

Get Reports

const pipeline = [
  {
    $match: {
      _id: {
        $gte: buildId(request.key, reportStartDate),
        $lte: buildId(request.key, reportEndDate),
      },
    },
  },
  { $unwind: { path: "$items" } },
  {
    $match: {
      "items.date": {
        $gte: reportStartDate,
        $lt: reportEndDate,
      },
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$items.a" },
      noFunds: { $sum: "$items.n" },
      pending: { $sum: "$items.p" },
      rejected: { $sum: "$items.r" },
    },
  },
  { $project: { _id: 0 } },
];

This aggregation operation has the same logic as the one in appV5R1.
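The first $match stage relies on bucket ids for the same key sorting by quarter, so a $gte/$lte range on _id selects every bucket that may contain days of the report. The sketch below illustrates that idea under simplifying assumptions: ids are plain strings instead of Buffer, buildIdSketch is a hypothetical stand-in for buildId, and the YYYYQQ suffix is assumed to be the year followed by a zero-padded quarter number.

```typescript
// Simplification: string ids (key + YYYYQQ); the article stores them as Buffer.
function buildIdSketch(userKey: string, date: Date): string {
  const year = date.getUTCFullYear();
  const quarter = Math.floor(date.getUTCMonth() / 3) + 1; // 1..4
  return `${userKey}${year}0${quarter}`;
}

const userKey = "0001"; // hypothetical user key
const bucketIds = [
  buildIdSketch(userKey, new Date("2021-11-10T00:00:00Z")), // 0001202104
  buildIdSketch(userKey, new Date("2022-02-10T00:00:00Z")), // 0001202201
  buildIdSketch(userKey, new Date("2022-05-10T00:00:00Z")), // 0001202202
  buildIdSketch(userKey, new Date("2022-08-10T00:00:00Z")), // 0001202203
];

const rangeStart = new Date("2022-01-20T00:00:00Z");
const rangeEnd = new Date("2022-06-20T00:00:00Z");
const lowerId = buildIdSketch(userKey, rangeStart);
const upperId = buildIdSketch(userKey, rangeEnd);

// Same comparison the $match stage performs with $gte / $lte on _id.
const matchedIds = bucketIds.filter((id) => id >= lowerId && id <= upperId);
// matchedIds: ["0001202201", "0001202202"]
```

The matched buckets can still contain days outside the report range (for example, January 1 to January 19 here), which is why the pipeline needs the second, per-day filter.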

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV5R2, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV5R1	33,429,468	15.75GB	506B	4.04GB	1	1.09GB
appV5R2	33,429,649	11.96GB	385B	3.26GB	1	1.16GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV5R1	33.8B	2.3B	36.1B
appV5R2	25.7B	2.5B	28.2B

Analyzing the tables above, we have the expected result presented in the introduction, from appV5R1 to appV5R2. The only noticeable difference is the 24% reduction in the Data Size.
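As a quick check of the arithmetic behind these tables, the sketch below recomputes the per-event sizes and the Data Size reduction. It assumes the GB values are binary gigabytes (1024^3 bytes), which is an assumption on my part that happens to reproduce the published figures.

```typescript
const BYTES_PER_GB = 1024 ** 3; // assumption: the tables use binary gigabytes
const TOTAL_EVENTS = 500_000_000;

// Size per event in bytes, rounded to one decimal place as in the tables.
function bytesPerEvent(sizeGb: number): number {
  return Math.round(((sizeGb * BYTES_PER_GB) / TOTAL_EVENTS) * 10) / 10;
}

const appV5R1Stats = { data: bytesPerEvent(15.75), index: bytesPerEvent(1.09) };
const appV5R2Stats = { data: bytesPerEvent(11.96), index: bytesPerEvent(1.16) };

// Relative Data Size reduction from appV5R1 to appV5R2, in percent.
const dataSizeReductionPct = Math.round(((15.75 - 11.96) / 15.75) * 100);
// appV5R1Stats: { data: 33.8, index: 2.3 }
// appV5R2Stats: { data: 25.7, index: 2.5 }
// dataSizeReductionPct: 24
```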

This reduction in the Data Size and Document Size will help in the performance of our application by reducing the time spent reading the document from the disk and the processing cost of decompressing the document from its compressed state.

Load Test Results

Executing the load test for appV5R2 and plotting it alongside the results for appV5R1 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

Both versions have very similar rates, with appV5R2 being slightly better than appV5R1 for the final half of the test.

Get Reports Latency

Both versions have very similar latencies, with appV5R2 reaching lower peak values when compared to appV5R1 for the final half of the test.

Bulk Upsert Rates

Both versions have very similar rates, with appV5R2 being slightly better than appV5R1 for the final 20 minutes of the test, but still not reaching the desired rate.

Bulk Upsert Latency

Both versions have very similar latencies, with appV5R1 reaching lower peak values when compared to appV5R2 for the final 20 minutes of the test.

Performance Analysis

The results show modest improvements in Get Reports performance but slight degradation in Bulk Upsert performance. This outcome reflects the fundamental trade-off inherent in the Computed Pattern: increased write complexity in exchange for simplified read operations.

With writes occurring 4.5 times more frequently than reads, the increased computational cost of the complex aggregation pipeline during writes roughly balances the reduced computational cost during reads. The MongoDB documentation confirms this expectation: "If reads are significantly more common than writes, the computed pattern reduces the frequency of data computation."

In our load testing scenario, writes significantly outnumber reads, making the Computed Pattern's benefits less pronounced. However, this implementation provides a valuable reference architecture for applications with different read/write patterns.

Issues and Improvements

Let’s try to extract more performance from our application by searching for improvements in our current operations. Looking at the aggregation pipeline of Get Reports, we find a very common anti-pattern when fields of type array are involved. This anti-pattern is the $unwind followed by a $match, which happens in the second and third stages of our aggregation pipeline.

This combination of stages can hurt the performance of the aggregation pipeline because we are increasing the number of documents in the pipeline with the $unwind stage to later filter the documents with the $match. In other words, to get to a final state with fewer documents, we’re going through an intermediate state where we increase the number of documents.

In the next application revision, we’ll see how we can achieve the same final result using only one stage and without having an intermediate stage with more documents.

Application Version 5 Revision 3 (appV5R3): Removing an aggregation pipeline anti-pattern

Introduction

As presented in the Issues and Improvements of appV5R2, we have an anti-pattern in the aggregation pipeline of Get Reports that can harm the query performance. This anti-pattern is characterized by a $unwind stage followed by a $match. This combination of stages will first increase the number of documents, $unwind, to later filter them, $match. In a simplified way, to get to a final state, we’re going through a costly intermediary state.

One possible solution around this anti-pattern is to use the $addFields stage with the $filter operator on the items array field. With this combination, we would replace the items array field using the $addFields stage with a new array field generated by the $filter operator in the items array, where we would filter all elements where the date is inside the report's date range.

But, considering our aggregation pipeline with the optimization presented above, there is an even better solution. With the $filter operator, we will loop through all elements in the items field and only compare their dates with the report dates to filter the elements. As the final goal of our aggregation pipeline is to get the status totals of all elements within the report's date range, instead of just looping through the elements in items to filter them, we could already start to calculate the status totals. We can obtain this functionality by using the $reduce operator instead of the $filter.
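The difference between the two approaches can be illustrated in memory. The sketch below is not MongoDB code; it mimics the $unwind + $match + $group flow and the single-pass reduce flow over two hypothetical quarter buckets, and counts how many intermediate documents each flow creates.

```typescript
type Totals = { a: number; n: number; p: number; r: number };
type DayItem = Partial<Totals> & { date: Date };
type Bucket = { items: DayItem[] };

const ZERO: Totals = { a: 0, n: 0, p: 0, r: 0 };

const inRange = (d: Date, start: Date, end: Date): boolean =>
  d.getTime() >= start.getTime() && d.getTime() < end.getTime();

// Returns a new totals object; missing statuses count as 0.
const add = (acc: Totals, x: Partial<Totals>): Totals => ({
  a: acc.a + (x.a ?? 0),
  n: acc.n + (x.n ?? 0),
  p: acc.p + (x.p ?? 0),
  r: acc.r + (x.r ?? 0),
});

// Mimics $unwind + $match + $group: one intermediate document per array element.
function viaUnwind(buckets: Bucket[], start: Date, end: Date) {
  const unwound = ([] as DayItem[]).concat(...buckets.map((b) => b.items));
  const totals = unwound.filter((i) => inRange(i.date, start, end)).reduce(add, ZERO);
  return { totals, intermediateDocs: unwound.length };
}

// Mimics $addFields with $reduce + $group: one intermediate document per bucket.
function viaReduce(buckets: Bucket[], start: Date, end: Date) {
  const perBucket = buckets.map((b) =>
    b.items.reduce((acc, i) => (inRange(i.date, start, end) ? add(acc, i) : acc), ZERO)
  );
  return { totals: perBucket.reduce(add, ZERO), intermediateDocs: perBucket.length };
}

const day = (iso: string) => new Date(`${iso}T00:00:00Z`);
const sampleBuckets: Bucket[] = [
  { items: [{ date: day("2022-01-10"), a: 1 }, { date: day("2022-02-10"), a: 2, r: 1 }, { date: day("2022-03-10"), p: 1 }] },
  { items: [{ date: day("2022-04-10"), n: 3 }, { date: day("2022-05-10"), a: 1 }] },
];

const unwindResult = viaUnwind(sampleBuckets, day("2022-02-01"), day("2022-05-01"));
const reduceResult = viaReduce(sampleBuckets, day("2022-02-01"), day("2022-05-01"));
// Both totals are { a: 2, n: 3, p: 1, r: 1 }; intermediateDocs is 5 versus 2.
```

Both flows return the same totals, but the reduce flow never grows the number of documents in the pipeline beyond the number of buckets.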

The implementation represents senior-level MongoDB development practices, showcasing how sophisticated operators can eliminate performance bottlenecks while maintaining code clarity and functionality.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV5R0:

type SchemaV5R0 = {
  _id: Buffer;
  items: Array<{
    date: Date;
    a?: number;
    n?: number;
    p?: number;
    r?: number;
  }>;
};

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Bulk Upsert

Based on the specification presented, we have the following bulk updateOne operation for each event generated by the application:

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) },
    update: [
      { $set: { result: sumIfItemExists } },
      { $set: { items: returnItemsOrCreateNew } },
      { $unset: ["result"] },
    ],
    upsert: true,
  },
};

This updateOne operation has the same logic as the one in appV5R2.

Get Reports

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: itemsReduceAccumulator },
  { $group: groupSumStatus },
  { $project: { _id: 0 } },
];

This aggregation operation has a similar logic to the one in appV5R1, with the differences being the change of the second stage from $unwind to $addFields and the change of a variable name in the $group stage. The complete code for this aggregation pipeline is quite complicated. Because of that, we present only pseudocode for it here.

{ $addFields: itemsReduceAccumulator }:

Adds a new field to the document called totals that will have the status totals.
Uses $reduce to iterate through the items array, applying date filtering and status accumulation in a single operation.

The following JavaScript code is logically equivalent to the real aggregation pipeline code.

 // Equivalent JavaScript logic:
 const totals = items.reduce(
   (accumulator, element) => {
     if (element.date >= reportStartDate && element.date < reportEndDate) {
       accumulator.a += element.a || 0;
       accumulator.n += element.n || 0;
       accumulator.p += element.p || 0;
       accumulator.r += element.r || 0;
     }
     return accumulator;
   },
   { a: 0, n: 0, p: 0, r: 0 }
 );

{ $group: groupSumStatus }:

Group the totals of each document in the pipeline into final status totals using $sum operations.

 const groupSumStatus = {
   _id: null,
   approved: { $sum: "$totals.a" },
   noFunds: { $sum: "$totals.n" },
   pending: { $sum: "$totals.p" },
   rejected: { $sum: "$totals.r" },
 };
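For readers who want to see how the JavaScript logic of the $addFields stage might translate into aggregation operators, here is one possible sketch using $reduce, $cond, $add, and $ifNull. It is an illustration written for this walkthrough, not the article's actual code; the builder function name buildItemsReduceAccumulator is an assumption.

```typescript
// Hypothetical builder for the value passed to the $addFields stage.
function buildItemsReduceAccumulator(reportStartDate: Date, reportEndDate: Date) {
  // True when the current array element ($$this) is inside the report range.
  const inReportRange = {
    $and: [
      { $gte: ["$$this.date", reportStartDate] },
      { $lt: ["$$this.date", reportEndDate] },
    ],
  };

  // Accumulated value ($$value) plus the element's status, missing status = 0.
  const addStatus = (field: string) => ({
    $add: [`$$value.${field}`, { $ifNull: [`$$this.${field}`, 0] }],
  });

  return {
    totals: {
      $reduce: {
        input: "$items",
        initialValue: { a: 0, n: 0, p: 0, r: 0 },
        in: {
          // $cond in array form: [if, then, else]
          $cond: [
            inReportRange,
            { a: addStatus("a"), n: addStatus("n"), p: addStatus("p"), r: addStatus("r") },
            "$$value", // element outside the range: keep the accumulator as is
          ],
        },
      },
    },
  };
}

const addFieldsStage = {
  $addFields: buildItemsReduceAccumulator(
    new Date("2021-06-15T00:00:00Z"),
    new Date("2022-06-15T00:00:00Z")
  ),
};
```

The resulting totals field is what the $group stage then sums across the matched buckets.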

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV5R3, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV5R2	33,429,649	11.96GB	385B	3.26GB	1	1.16GB
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV5R2	25.7B	2.5B	28.2B
appV5R3	25.7B	2.4B	28.1B

As the document schema and Bulk Upsert operations for appV5R3 are the same as appV5R2, there is nothing to reason about in this section between the two revisions.

Load Test Results

Executing the load test for appV5R3 and plotting it alongside the results for appV5R2 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

Both versions have similar performance, with each one reaching better rates than the other at different points of the test.

Get Reports Latency

It's almost indistinguishable which version has better latency values.

Bulk Upsert Rates

Both versions have similar performance, with appV5R3 being slightly better in the final 20 minutes of the test.

Bulk Upsert Latency

Even though both versions have very similar latency values, we can see that appV5R2 has slightly lower latency values for the first three quarters of the test, while appV5R3 has considerably better latency values for the final quarter of the test.

Performance Analysis

While the optimized aggregation pipeline is demonstrably more efficient in terms of CPU and memory usage, the performance improvements are minimal. This outcome reveals that the current bottleneck is not a computational overhead, but a disk I/O limitation.

MongoDB Atlas metrics show the IOWAIT metric reaching nearly 15% of CPU usage, indicating that the CPU frequently waits for disk operations to complete. This disk bottleneck will become more apparent in subsequent versions and represents a fundamental infrastructure limitation that cannot be resolved through schema optimization alone.

The relatively modest performance gains demonstrate that optimizing beyond the current bottleneck yields diminishing returns, highlighting the importance of identifying and addressing the primary constraint in any system optimization effort.

Issues and Improvements

We’ve just seen that our implementation's limitation is the disk. To solve that, we have two options: Upgrade the disk where MongoDB stores data or change our implementation to reduce disk usage.

As the goal of this series is to show how much performance we can achieve with the same hardware by modeling how our application stores and reads data from MongoDB, we won’t upgrade the disk. A change in the application modeling for MongoDB will be left for the next article, appV6Rx.

For appV5R4, we will double down on the Computed Pattern and pre-compute the status totals by quarter, not just by day. Even though we know it probably won’t provide better performance, for the reasons discussed in the Performance Analysis of appV5R2, let’s flex our MongoDB and aggregation pipeline knowledge, and also provide a reference code example for the cases where the Computed Pattern is a good fit.

Application Version 5 Revision 4 (appV5R4): Doubling down on the Computed Pattern

Introduction

As presented in the issues and improvements of appV5R3, for this revision, we’ll double down on the Computed Pattern even though we have good evidence that it won’t provide a better performance—but, you know, for science.

We’ll also use the Computed Pattern to pre-compute the status totals for each document. As each document stores the events per quarter and user, our application will have on each document the status totals per quarter and user. These pre-computed totals will be stored in a field called totals.

One point of attention in this implementation is that we are adding a new field to the document, which will also increase the average document size. As seen in the previous revision, appV5R3, our current bottleneck is disk, another indication that this implementation won’t have better performance.

The implementation complexity increases significantly, requiring careful coordination between daily item management and quarterly total maintenance, showcasing the sophisticated techniques employed by senior MongoDB developers.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV5R1:

type SchemaV5R1 = {
  _id: Buffer;
  totals: {
    a?: number; // Quarter total approved
    n?: number; // Quarter total noFunds
    p?: number; // Quarter total pending
    r?: number; // Quarter total rejected
  };
  items: Array<{
    date: Date;
    a?: number; // Daily total approved
    n?: number; // Daily total noFunds
    p?: number; // Daily total pending
    r?: number; // Daily total rejected
  }>;
};

Indexes

No additional indexes are required, maintaining the single _id index approach established in the appV4 implementation.

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) },
    update: [
      { $set: newReportFields }, // Update quarterly totals
      { $set: { result: sumIfItemExists } }, // Process daily items
      { $set: { items: returnItemsOrCreateNew } }, // Update items array
      { $unset: ["result"] }, // Cleanup temporary field
    ],
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV5R3, with the only difference being an extra stage in the update aggregation pipeline logic to pre-compute the document status totals.

update:

To facilitate the understanding of the logic used in the aggregation pipeline, a simplified JavaScript version of the functionalities will be provided:

{ $set: newReportFields }:

Set the field totals to the resulting operation of incrementing each one of the possible status fields by the status provided in the event document.

 if (totals.a != null) {
   totals.a += event.a || 0;
 } else {
   totals.a = event.a || 0;
 }

 if (totals.n != null) {
   totals.n += event.n || 0;
 } else {
   totals.n = event.n || 0;
 }

 if (totals.p != null) {
   totals.p += event.p || 0;
 } else {
   totals.p = event.p || 0;
 }

 if (totals.r != null) {
   totals.r += event.r || 0;
 } else {
   totals.r = event.r || 0;
 }
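One way the newReportFields logic could be written with aggregation operators is sketched below, using dotted field paths with $add and $ifNull. This is an assumption-labeled illustration rather than the article's code: the builder name buildNewReportFields and the StatusEvent type are hypothetical.

```typescript
type StatusEvent = { a?: number; n?: number; p?: number; r?: number };

// Hypothetical builder for the value passed to the first $set stage.
function buildNewReportFields(statusEvent: StatusEvent) {
  // Current total (0 when the field or the document does not exist yet)
  // plus the event's value for that status (0 when the event lacks it).
  const increment = (field: "a" | "n" | "p" | "r") => ({
    $add: [{ $ifNull: [`$totals.${field}`, 0] }, statusEvent[field] ?? 0],
  });

  return {
    "totals.a": increment("a"),
    "totals.n": increment("n"),
    "totals.p": increment("p"),
    "totals.r": increment("r"),
  };
}

const newReportFieldsSketch = buildNewReportFields({ a: 1 });
const setStage = { $set: newReportFieldsSketch };
```

Like the JavaScript version above, this sketch writes a 0 for statuses the event does not carry, so every totals field is always present after the first event.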

Get Reports

const pipeline = [
  { $match: docsFromKeyBetweenDate },
  { $addFields: itemsReduceAccumulator },
  { $group: groupSumStatus },
  { $project: { _id: 0 } },
];

This aggregation operation has a similar logic to the one in appV5R3, with the only difference being the implementation of the $addFields stage. The complete code for this aggregation pipeline is quite complicated. Because of that, we present only pseudocode for it here.

{ $addFields: itemsReduceAccumulator }:

A similar implementation to the one in appV5R3.
The main difference is that, when the quarter’s date range is entirely within the report’s date range, we can use the pre-computed totals instead of calculating the value through a $reduce operation.

The following JavaScript code is logically equivalent to the real aggregation pipeline code.

 // Equivalent JavaScript logic:
 let totals;

 if (documentQuarterWithinReportDateRange) {
   // Use pre-computed quarterly totals
   totals = document.totals;
 } else {
   // Fall back to item-level aggregation
   totals = document.items.reduce(
     (accumulator, element) => {
       if (
         element.date >= reportStartDate &&
         element.date < reportEndDate
       ) {
         accumulator.a += element.a || 0;
         accumulator.n += element.n || 0;
         accumulator.p += element.p || 0;
         accumulator.r += element.r || 0;
       }

       return accumulator;
     },
     { a: 0, n: 0, p: 0, r: 0 }
   );
 }
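The condition documentQuarterWithinReportDateRange is not spelled out above. A minimal sketch of how such a check could work is shown below, assuming UTC dates, a report range that is inclusive at the start and exclusive at the end, and hypothetical helper names quarterRange and isQuarterWithinReport.

```typescript
type Quarter = 1 | 2 | 3 | 4;

// Hypothetical helper: UTC start (inclusive) and end (exclusive) of a quarter.
function quarterRange(year: number, quarter: Quarter): { start: Date; end: Date } {
  const firstMonth = (quarter - 1) * 3; // 0, 3, 6 or 9
  return {
    start: new Date(Date.UTC(year, firstMonth, 1)),
    // Month index 12 rolls over to January of the following year.
    end: new Date(Date.UTC(year, firstMonth + 3, 1)),
  };
}

// True only when every day of the quarter falls inside the report range.
function isQuarterWithinReport(
  year: number,
  quarter: Quarter,
  reportStartDate: Date,
  reportEndDate: Date
): boolean {
  const { start, end } = quarterRange(year, quarter);
  return (
    start.getTime() >= reportStartDate.getTime() &&
    end.getTime() <= reportEndDate.getTime()
  );
}

const reportStart = new Date("2021-06-15T00:00:00Z");
const reportEnd = new Date("2022-06-15T00:00:00Z");

const q2of2021 = isQuarterWithinReport(2021, 2, reportStart, reportEnd); // false: starts before the report
const q3of2021 = isQuarterWithinReport(2021, 3, reportStart, reportEnd); // true
const q1of2022 = isQuarterWithinReport(2022, 1, reportStart, reportEnd); // true
const q2of2022 = isQuarterWithinReport(2022, 2, reportStart, reportEnd); // false: ends after the report
```

For a one-year report, only the first and last buckets usually need the item-level fallback; the buckets in between can use their pre-computed totals.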

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV5R4, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB
appV5R4	33,429,470	12.88GB	414B	3.72GB	1	1.24GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV5R3	25.7B	2.4B	28.1B
appV5R4	27.7B	2.7B	30.4B

As discussed in this revision's introduction, the additional totals field on each document in the collection increased the document size and the overall storage size. The Data Size of appV5R4 is 7.7% bigger than that of appV5R3, and the Total Size/Event is 8.2% bigger. Because disk is our limiting factor, the performance of appV5R4 will probably be worse than appV5R3.

Load Test Results

Executing the load test for appV5R4 and plotting it alongside the results for appV5R3 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

It's clear that appV5R4 has worse rate values when compared to appV5R3, only slightly beating the previous version for the first quarter of the test.

Get Reports Latency

For the first two quarters of the test, appV5R4 has lower latency, but for the final two quarters, appV5R3 takes the lead.

Bulk Upsert Rates

Both versions have very similar rate values and also fall short of the desired rate in the final 20 minutes.

Bulk Upsert Latency

The new version, appV5R4, is only able to match the latency values of appV5R3 for the first quarter of the test, falling short for the remaining three quarters.

Issues and Improvements

As spoiled in the previous Issues and improvements, to improve our application’s performance, we need to change our MongoDB implementation in a way that reduces disk usage. To achieve this, we need to reduce the document size.

You may think it is not possible to reduce our document size and overall collection/index size even more because we are already using just one index, concatenating two fields into one, using shorthand field names, and using the Bucket Pattern. But there is one thing called the Dynamic Schema that can help us.

In the Dynamic Schema, the values of a field become field names. Thus, field names also store data and, as a consequence, reduce the document size. As this pattern will require big changes in our current application schema, we’ll start a new version, appV6Rx, which we’ll play around with in the third part of this series.
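To give a rough idea of what "values become field names" means, the sketch below converts an items array into an object keyed by day. The key format (MMDD in UTC) and the helper name toDynamicItems are assumptions made purely for illustration; the schema actually adopted in the next part may differ.

```typescript
type DayTotals = { a?: number; n?: number; p?: number; r?: number };
type BucketItem = DayTotals & { date: Date };
type DynamicItems = Record<string, DayTotals>;

// Hypothetical conversion: the date value of each element becomes a field name.
function toDynamicItems(bucketItems: BucketItem[]): DynamicItems {
  const dynamic: DynamicItems = {};
  for (const { date, ...totals } of bucketItems) {
    // Assumed key format: MMDD. The year and quarter already live in the bucket _id.
    const month = String(date.getUTCMonth() + 1).padStart(2, "0");
    const dayOfMonth = String(date.getUTCDate()).padStart(2, "0");
    dynamic[`${month}${dayOfMonth}`] = totals;
  }
  return dynamic;
}

const dynamicItems = toDynamicItems([
  { date: new Date("2022-06-15T00:00:00Z"), a: 2, r: 1 },
  { date: new Date("2022-06-16T00:00:00Z"), p: 1 },
]);
// dynamicItems: { "0615": { a: 2, r: 1 }, "0616": { p: 1 } }
```

Each element no longer needs a separate date field, because the field name itself carries that information.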

Conclusion

That is the end of the second part of the series. We covered Bucket Pattern and Computed Pattern, and the many ways we can use these patterns to model how our application stores its data in MongoDB, and the big performance gains it can provide when used properly.

Here is a quick review of the improvements made between the application versions:

appV4 to appV5R0/appV5R1: This is the simplest possible implementation of the Bucket Pattern, grouping the events by month for appV5R0 and by quarter for appV5R1.
appV5R1 to appV5R2: Instead of just pushing the event document to the items array, we started to pre-compute the status totals by day, using the Computed Pattern.
appV5R2 to appV5R3: This improved the aggregation pipeline for Get Reports, preventing a costly intermediary stage. It didn’t provide performance improvements because our MongoDB instance is currently disk-limited.
appV5R3 to appV5R4: We doubled down on Computed Pattern to pre-calculate the totals field even though we knew the performance wouldn’t be better—but, just for science.

We had noticeable improvements in the versions presented in this second part of the series when compared to the versions from the first part, appV0 to appV4. appV5R3 showed the best performance of them all, but it still can’t reach all the desired rates. For the third and final part of this series, our application versions will be developed around the Dynamic Schema Pattern, which will reduce the overall document size and help with the current disk limitation.

For any further questions, you can go to the MongoDB Community Forum, or if you want to build your application using MongoDB, the MongoDB Developer Center has lots of examples and tutorials in many different programming languages.

The Cost of Not Knowing MongoDB - Part 1: appV0 to appV4

Artur Garcia Costa — Thu, 22 Jan 2026 16:50:40 +0000

Application Version 1: The baseline implementation
Application Version 2: Better Understanding Indexing
Application Version 3: Better Data Types and Field Name Shorthanding
Application Version 4: Taking Advantage of the _id Index

Article Introduction

Welcome to the first part of the series, "The Cost of Not Knowing MongoDB". This comprehensive analysis explores how different MongoDB schema design decisions can dramatically impact application performance, demonstrating the critical importance of understanding MongoDB's underlying mechanisms.

In this first article, we examine four progressive application versions, appV1 through appV4, each representing common approaches developers take when working with MongoDB. Through detailed performance testing and analysis, we reveal how seemingly minor schema modifications can lead to significant improvements in throughput, latency, and resource utilization.

The journey begins with appV1, a baseline implementation that reflects typical patterns used by junior MongoDB developers. We then progress through increasingly optimized versions, introducing concepts such as field concatenation, data type optimization, and strategic field abbreviation. Each version builds upon the lessons learned from its predecessor, culminating in appV4.

This foundational knowledge sets the stage for Part 2, where we explore advanced patterns like the Bucket Pattern and Computed Pattern to achieve even greater performance improvements.

Application Version 1 (appV1): The baseline implementation

Introduction

The first application version and the base case for our comparison would have been developed by someone with a junior knowledge level of MongoDB who just took a quick look at the documentation and learned that every document in a collection must have an _id field and that this field always has a unique index.

To take advantage of the _id obligatory field and index, the developer decides to store the values of key and date in an embedded document in the _id field. With that, each document will register the status totals for one user, specified by the field _id.key, in one day, specified by the field _id.date.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV1:

type SchemaV1 = {
  _id: {
    key: string;
    date: Date;
  };
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: {
      _id: { date: event.date, key: event.key },
    },
    update: {
      $inc: {
        approved: event.approved,
        noFunds: event.noFunds,
        pending: event.pending,
        rejected: event.rejected,
      },
    },
    upsert: true,
  },
};

filter:

Target the document where the _id field matches { date: event.date, key: event.key }.

update:

Uses the $inc operator to increment counters (approved, noFunds, pending, rejected) based on the event data.

upsert:

Ensures a new document is created if no matching document exists.
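The combined effect of filter, $inc, and upsert can be mimicked in memory, which helps explain why the number of documents ends up lower than the number of events. The sketch below is not driver code; the Map-based collection, the upsertEvent helper, and the sample events are all hypothetical.

```typescript
type DailyStatus = { approved?: number; noFunds?: number; pending?: number; rejected?: number };
type DailyEvent = DailyStatus & { key: string; date: Date };

const STATUSES = ["approved", "noFunds", "pending", "rejected"] as const;

// In-memory stand-in for the collection, keyed by the (date, key) pair of _id.
const collection = new Map<string, DailyStatus>();

function upsertEvent(dailyEvent: DailyEvent): void {
  const id = `${dailyEvent.date.toISOString()}|${dailyEvent.key}`;
  // upsert: start from an empty document when no match exists.
  const doc = collection.get(id) ?? {};
  for (const statusName of STATUSES) {
    const increment = dailyEvent[statusName];
    // $inc semantics: a missing counter starts at 0 before incrementing.
    if (increment !== undefined) doc[statusName] = (doc[statusName] ?? 0) + increment;
  }
  collection.set(id, doc);
}

const firstDay = new Date("2022-06-15T00:00:00Z");
const secondDay = new Date("2022-06-16T00:00:00Z");

upsertEvent({ key: "user-1", date: firstDay, approved: 1 });
upsertEvent({ key: "user-1", date: firstDay, approved: 1, rejected: 1 });
upsertEvent({ key: "user-1", date: secondDay, pending: 1 });

const firstDayDoc = collection.get(`${firstDay.toISOString()}|user-1`);
// collection.size is 2; firstDayDoc is { approved: 2, rejected: 1 }
```

Two events for the same user and day collapse into a single document, so three events produce only two documents.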

Get Reports

const pipeline = [
  {
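    $match: {
      "_id.key": request.key,
      // oneYear is assumed to be in milliseconds; wrap the result in a Date so it compares against the stored dates
      "_id.date": { $gte: new Date(request.date.getTime() - oneYear), $lt: request.date },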
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$approved" },
      noFunds: { $sum: "$noFunds" },
      pending: { $sum: "$pending" },
      rejected: { $sum: "$rejected" },
    },
  },
];

{ $match: {...} }:

Filters documents based on the key and date fields.
The "_id.key" field matches the user key provided in the request.
The "_id.date" field filters documents within the specified date range using $gte (greater than or equal to) and $lt (less than).

{ $group: {...} }:
Groups the filtered documents into a single result.
The _id field is set to null to group all matching documents from the previous stage together.
Computes the sum of the approved, noFunds, pending, and rejected fields using the $sum operator.

Indexes

Initially, appV1 aimed to use the default index on the _id field (which contained an embedded document with key and date). However, this default index on the embedded _id field was not sufficient to efficiently support the query patterns, particularly for the Get Reports function, which filters by _id.key and _id.date.

To address this, an additional compound index was created:

const keys = { "_id.key": 1, "_id.date": 1 };
const options = { unique: true };

db.appV1.createIndex(keys, options);

This explicit index on _id.key and _id.date ensures that queries filtering and sorting on these fields can be performed efficiently. The unique: true option enforces that the combination of _id.key and _id.date is unique across all documents in the collection. For a more detailed explanation of why an index on an embedded document's fields might be needed even if the top-level field is indexed, refer to Appendices - Index on Embedded Documents.

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV1, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier.

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV1	359,639,622	39.58GB	119B	8.78GB	2	20.06GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV1	85B	43.1B	128.1B
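The per-event figures can be reproduced with a few lines of arithmetic. This sketch assumes that the sizes in the Collection Statistics table are binary gigabytes (1024³ bytes), which is the interpretation that matches the published numbers; the helper names are illustrative.

```typescript
const GIB = 1024 * 1024 * 1024; // assumption: table sizes are binary gigabytes
const EVENTS = 500_000_000;

const bytesPerEvent = (sizeInGiB: number): number => (sizeInGiB * GIB) / EVENTS;
const round1 = (value: number): number => Math.round(value * 10) / 10;

const dataPerEvent = bytesPerEvent(39.58); // appV1 Data Size
const indexPerEvent = bytesPerEvent(20.06); // appV1 Index Size
const totalPerEvent = dataPerEvent + indexPerEvent;

console.log(round1(dataPerEvent), round1(indexPerEvent), round1(totalPerEvent)); // 85 43.1 128.1
```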

Load Test Results

Executing the load test for appV1 and plotting it alongside the Desired values, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rate

The application never reaches the target rate of 25 reports per second during the first 10-minute phase, peaking at only 16.5 reports per second. During the rest of the test, the rate stays around 6 reports per second.

Get Reports Latency

Begins at 2 seconds and progressively increases throughout the test duration, reaching a maximum of 6.5 seconds with an average of 4.5 seconds.

Bulk Upsert Rate

The application only reaches the desired rate of 250 events per second during the first 10 minutes of the test. During the rest of the test, the rate degrades to around 200 events per second.

Bulk Upsert Latency

Starts at 10 seconds and exhibits similar degradation patterns, escalating to a maximum of 62 seconds with an average of 42 seconds.

Issues and Improvements

The first issue that can be pointed out and improved in this implementation is the document schema in combination with the two indexes. Because the fields key and date are in an embedded document in the field _id, their values are indexed twice: by the default/obligatory index in the _id field and by the index we created to support the Bulk Upserts and Get Reports operations.

As the key field is a 64-character string and the date field is of type date, these two values use at least 68 bytes of storage. Because they are stored in two indexes, each document contributes at least 136 bytes of index data in a non-compressed scenario.

The improvement here is to extract the fields key and date from the _id field and let the _id field keep its default value of type ObjectId. The ObjectId data type takes only 12 bytes of storage.

This first implementation can be seen as a forced worst-case scenario to make the more optimized solutions look better. Unfortunately, that is not the case. It's not hard to find implementations like this on the internet, and I've worked on a big project with a schema like this one, from which I got the idea for this first case.

Application Version 2 (appV2): Better Understanding Indexing

Introduction

As discussed in the issues and improvements of appV1, embedding the fields key and date as a document in the _id field to take advantage of its obligatory index is not a good solution for our application: we would still need to create an extra index, and the index on the _id field would take more storage than needed.

To solve the issue of the index on the _id field being bigger than needed, the solution is to move out the fields key and date from the embedded document in the _id field, and let the _id field have its default value of type ObjectId. Each document would still register the status totals for one user, specified by the field key, in one day, specified by the field date, the same way it's done in appV1.

The second application version and the improvements to get to it would still have been developed by someone with a junior knowledge level of MongoDB, but who has gone more in-depth in the documentation related to indexes in MongoDB, especially when indexing fields of type documents.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV2:

type SchemaV2 = {
  _id: ObjectId;
  key: string;
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { key: event.key, date: event.date },
    update: {
      $inc: {
        approved: event.approved,
        noFunds: event.noFunds,
        pending: event.pending,
        rejected: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV1, with the only difference being the filter criteria.

filter:

Target the document where the fields key and date match the key and date of the event document.

Get Reports

const pipeline = [
  {
    $match: {
      key: request.key,
      // oneYear is assumed to be a duration in milliseconds
      date: { $gte: new Date(request.date.getTime() - oneYear), $lt: request.date },
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$approved" },
      noFunds: { $sum: "$noFunds" },
      pending: { $sum: "$pending" },
      rejected: { $sum: "$rejected" },
    },
  },
];

This aggregation operation has a similar logic to the one in appV1, with the only difference being the filtering criteria in the $match stage.

{ $match: {...} }:

The key field matches the user key provided in the request.
The date field filters documents within the specified date range using $gte (greater than or equal to) and $lt (less than).

Indexes

In appV2, the key and date fields were moved out of the _id field and became top-level fields. To support efficient querying for both Bulk Upsert (filtering by key and date) and Get Reports (filtering by key and a date range), a compound index was created on these two fields:

const keys = { key: 1, date: 1 };
const options = { unique: true };

db.appV2.createIndex(keys, options);

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV2, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV1	359,639,622	39.58GB	119B	8.78GB	2	20.06GB
appV2	359,614,536	41.92GB	126B	10.46GB	2	16.66GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV1	85B	43.1B	128.1B
appV2	90B	35.8B	125.8B

Analyzing the tables above, we can see that from appV1 to appV2, we increased the data size by 6% and decreased the index size by 17%. We can say that our goal of making the index on the _id field smaller was accomplished.
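The percentages quoted above follow directly from the Collection Statistics table. A short sketch of the calculation, with percentChange as an illustrative helper name:

```typescript
// Relative change between two measurements, in percent.
const percentChange = (before: number, after: number): number =>
  ((after - before) / before) * 100;

const dataChange = percentChange(39.58, 41.92); // Data Size in GB, appV1 -> appV2
const indexChange = percentChange(20.06, 16.66); // Index Size in GB, appV1 -> appV2

console.log(dataChange.toFixed(1), indexChange.toFixed(1)); // 5.9 -16.9
```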

Looking at the Event Statistics, the total size per event value decreased only by 1.8%, from 128.1B to 125.8B. With this difference being so small, there is a good chance that we won’t see significant improvements from a performance point of view.

Load Test Results

Executing the load test for appV2 and plotting it alongside the results for appV1 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

Performance remains suboptimal, reaching only 17 reports per second compared to the target of 25 reports per second for the first 10 minutes of the test, slightly better than appV1. For the rest of the test, both versions have equally bad performance.

Get Reports Latency

appV2 demonstrates considerably worse latency performance compared to appV1, indicating that the schema changes negatively impacted read operations.

Bulk Upsert Rates

Similar to appV1, appV2 achieves the target rate of 250 events per second only during the first 10 minutes of testing. For the rest of the test, appV2 has a slightly better performance than appV1, but still way below the desired rates.

Bulk Upsert Latency

appV2 shows marginal improvement over appV1, suggesting some benefit from the reduced index size for write operations.

Performance Summary

The results align with the modest 1.8% improvement observed in the Initial Scenario Statistics. appV2's performance characteristics demonstrate that simply restructuring the _id field provides minimal benefits. The marginal improvements in Bulk Upsert operations (attributed to smaller indexes) are offset by degraded Get Reports performance (attributed to larger document sizes), resulting in negligible overall performance gains.

Issues and Improvements

The following document is a sample from the collection appV2:

const document = {
  _id: ObjectId("6685c0dfc2445d3c5913008f"),
  key: "0000000000000000000000000000000000000000000000000000000000000001",
  date: new Date("2022-06-25T00:00:00.000Z"),
  approved: 10,
  noFunds: 3,
  pending: 1,
  rejected: 1,
};

Analyzing it, aiming to reduce its size, two points of improvement can be found. One is the field key, which is of type string and will always have 64 characters of hexadecimal data, and the other is the name of the status fields, which combined can have up to 30 characters.

The field key, as presented in the scenario section, is composed of hexadecimal data, in which each character requires four bits to be represented. In our implementation so far, we have stored this data as strings using UTF-8 encoding, in which each character requires eight bits to be represented. So, we are using double the storage we need. One way around this issue is to store the hexadecimal data in its raw format using the binary data type.

For the status field names, we can see that the names of the fields use more storage than the value itself. The field names are strings with at least 7 UTF-8 characters, which takes at least 7 bytes. The value of the status fields is a 32-bit integer, which takes 4 bytes. We can shorthand the status names by their first character, where approved becomes a, noFunds becomes n, pending becomes p, and rejected becomes r.
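Both savings are easy to check in Node.js. The sketch below only counts the raw bytes of the values and the characters of the field names; BSON adds its own per-field overhead, which is not modeled here. The variable names are illustrative.

```typescript
import { Buffer } from "node:buffer";

// A 64-character hexadecimal key, like the ones used by the application.
const key = "1".padStart(64, "0");

const asUtf8Bytes = Buffer.byteLength(key, "utf8"); // one byte per hexadecimal character
const asBinaryBytes = Buffer.from(key, "hex").length; // two hexadecimal characters per byte

const longNames = ["approved", "noFunds", "pending", "rejected"];
const shortNames = longNames.map((name) => name.charAt(0));
const countChars = (names: string[]): number =>
  names.reduce((sum, name) => sum + name.length, 0);

console.log(asUtf8Bytes, asBinaryBytes); // 64 32
console.log(countChars(longNames), countChars(shortNames)); // 30 4
```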

Application Version 3 (appV3): Better Data Types and Field Name Shorthanding

Introduction

As discussed in the issues and improvements of appV2, to reduce the document size, two improvements were proposed. One is to convert the data type of the field key from string to binary, requiring four bits to represent each hexadecimal character instead of the eight bits of a UTF-8 character. The other is to shorthand the name of the status fields by their first letter, requiring one byte for each field name instead of seven bytes. Each document would still register the status totals for one user, specified by the field key, in one day, specified by the field date, the same way it was done in the previous implementations.

To convert the key value from string to binary/buffer, the following TypeScript function was created:

const buildKey = (key: string): Buffer => {
  return Buffer.from(key, "hex");
};

The third application version has two improvements compared to the second version. The improvement of storing the field key as binary data to reduce its storage need would have been thought of by an intermediate to senior MongoDB developer. The improvement of shortening the names of the status fields would have been thought of by an intermediate MongoDB developer.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV3:

type SchemaV3 = {
  _id: ObjectId;
  key: Buffer;
  date: Date;
  a?: number;
  n?: number;
  p?: number;
  r?: number;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { key: buildKey(event.key), date: event.date },
    update: {
      $inc: {
        a: event.approved,
        n: event.noFunds,
        p: event.pending,
        r: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV2, with the differences being the filter criteria and the $inc operation.

filter:

Target the document where the fields key and date match the key and date of the event document.
The key is converted to binary format using the buildKey function.

update:

Uses the $inc operator to increment counters (a, n, p, r) based on the event data.

Get Reports

const pipeline = [
  {
    $match: {
      key: buildKey(request.key),
      // oneYear is assumed to be a duration in milliseconds
      date: { $gte: new Date(request.date.getTime() - oneYear), $lt: request.date },
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$a" },
      noFunds: { $sum: "$n" },
      pending: { $sum: "$p" },
      rejected: { $sum: "$r" },
    },
  },
];

This aggregation operation has a similar logic to the one in appV2, with the differences being the filtering criteria in the $match stage and the names of the status fields in the $group stage.

{ $match: {...} }:

The key field is converted to binary format using the buildKey function.

{ $group: {...} }:

Computes the sum of the a, n, p, and r fields using the $sum operator.

Indexes

Similar to appV2, appV3 relies on a compound index on the key and date fields to optimize Bulk Upsert and Get Reports operations. Even though the key field is now stored as binary data and status field names are shortened, the query patterns remain the same, necessitating the following index:

const keys = { key: 1, date: 1 };
const options = { unique: true };

db.appV3.createIndex(keys, options);

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV3, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV2	359,614,536	41.92GB	126B	10.46GB	2	16.66GB
appV3	359,633,376	28.7GB	86B	8.96GB	2	16.37GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV2	90B	35.8B	125.8B
appV3	62B	35.2B	96.8B

Analyzing the tables above, we can see that from appV2 to appV3, there was practically no change in the index size and a decrease of 32% in the data size. Our goal of reducing the document size was accomplished.

Looking at the Event Statistics, the total size per event value decreased by 23%, from 125.8B to 96.8B. With this reduction, we’ll probably see considerable improvements.

Load Test Results

Executing the load test for appV3 and plotting it alongside the results for appV2 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

While still falling short of the 25 reports per second target, appV3 demonstrates some improvement, maintaining approximately 16 reports per second for half the test duration, an enhancement over appV2.

Get Reports Latency

Maintains approximately 1.2 seconds for the first 100 minutes before degrading to levels similar to previous versions.

Bulk Upsert Rates

Successfully maintains target rates for the first 100 minutes of testing - achieving 250 events per second from 0-50 minutes and 500 events per second from 50-100 minutes. This marks the first version to sustain target performance for extended periods.

Bulk Upsert Latency

Sustains approximately 2.5 seconds during the first 100 minutes, considerably better than previous implementations, before experiencing degradation in the final test phase.

Issues and Improvements

Looking at the collection stats of appV3 and thinking about how MongoDB is executing our queries and what indexes are being used, we can see that the _id field and its index aren't being used in our application. The field by itself is not a big deal from a performance standpoint, but its obligatory unique index is: every time a new document is inserted in the collection, the index structure on the _id field has to be updated.

Going back to the idea from appV1 of trying to take advantage of the obligatory _id field and its index, is there a way that we can use it in our application?

Let's take a look at our filtering criteria in the Get Reports and Bulk Upsert functions:

const bulkUpsertFilter = {
  key: event.key,
  date: event.date,
};

const getReportsFilter = {
  key: request.key,
  date: {
    $gte: new Date("2021-06-15"),
    $lt: new Date("2022-06-15"),
  },
};

In both filtering criteria, the key field is compared using equality. The date field is compared using equality in the Bulk Upsert and range in the Get Reports. What if we combine these two field values in just one, concatenating them, and store it in _id?

To guide us on how we should order the fields in the resulting concatenated value and get the best performance of the index on it, let's follow the Equality, Sort, and Range rule (ESR).

As seen above, the key field is compared by equality in both cases, and the date field is compared by equality in just one case, so let's choose the key field for the first part of our concatenated value and the date field for the second part. As we don't have a Sort operation in our queries, we can skip it. Next, we have the Range comparison, which is used on the date field, so it makes sense to keep it as the second part of our concatenated value. With that, the optimal way of concatenating the two values and getting the best performance from the index is key+date.

One point of attention is how we are going to format the date field in this concatenation in a way that the range filter works, and we don't store more data than we really need. One possible implementation will be presented and tested in the next application version, appV4.

Application Version 4 (appV4): Taking Advantage of the `_id` Index

Introduction

As presented in the issues and improvements of appV3, one way to take advantage of the obligatory field and index on _id is to store on it the concatenated value of key + date. One thing that we need to cover now is what data type the _id field will have and how we are going to format the date field.

As seen in previous implementations, storing the key field as binary/hexadecimal data improved the performance. So, let's see if we can also store the resulting concatenated field, key + date, as binary/hexadecimal.

To store the date field in a binary/hexadecimal type, we have some options. One could be converting it to a 4-byte timestamp that measures the seconds since the Unix epoch, and the other could be converting it to the format YYYYMMDD, which stores year, month, and day. Both cases would require the same 32 bits/8 hexadecimal characters.
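A small sketch comparing the two encodings, to show that both fit in four bytes. The variable names are illustrative, and the big-endian layout for the timestamp is an assumption made here so that byte order follows numeric order.

```typescript
import { Buffer } from "node:buffer";

const date = new Date("2022-01-01T00:00:00.000Z");

// Option 1: seconds since the Unix epoch as an unsigned 32-bit big-endian integer.
const asTimestamp = Buffer.alloc(4);
asTimestamp.writeUInt32BE(Math.floor(date.getTime() / 1000));

// Option 2: the digits YYYYMMDD read as eight hexadecimal characters.
const yearMonthDay = date.toISOString().slice(0, 10).replace(/-/g, "");
const asYearMonthDay = Buffer.from(yearMonthDay, "hex");

console.log(asTimestamp.length, asYearMonthDay.length); // 4 4
console.log(asYearMonthDay.toString("hex")); // 20220101
```

The second form stays readable when the value is printed as hexadecimal, which makes stored ids easier to inspect.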

For our case, let's use the second option and store the date value as YYYYMMDD because it'll help in future implementation/improvements. Considering a key field with the value 0001 and a date field with the value 2022-01-01, we would have the following _id field:

const _id = Buffer.from("000120220101", "hex");

To concatenate and convert the key and date fields to their desired format and type, the following TypeScript function was created:

const buildId = (key: string, date: Date): Buffer => {
  const day = date.toISOString().split("T")[0].replace(/-/g, ""); // YYYYMMDD
  return Buffer.from(`${key}${day}`, "hex");
};
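The reason this format supports range filters can be checked locally. The sketch below repeats an equivalent buildId so that it is self-contained, then uses Buffer.compare as a stand-in for a byte-by-byte comparison of ids of the same length; the short key values and the inRange helper are illustrative.

```typescript
import { Buffer } from "node:buffer";

// Equivalent to buildId above, repeated so the sketch is self-contained.
const buildId = (key: string, date: Date): Buffer => {
  const day = date.toISOString().slice(0, 10).replace(/-/g, ""); // YYYYMMDD
  return Buffer.from(`${key}${day}`, "hex");
};

const lower = buildId("0001", new Date("2021-06-15"));
const upper = buildId("0001", new Date("2022-06-15"));
const inside = buildId("0001", new Date("2021-12-31"));
const otherUser = buildId("0002", new Date("2021-12-31"));

// Buffer.compare orders byte by byte and returns -1, 0, or 1.
const inRange = (id: Buffer): boolean =>
  Buffer.compare(lower, id) <= 0 && Buffer.compare(id, upper) < 0;

console.log(inRange(inside), inRange(otherUser)); // true false
```

Because the key comes first and has a fixed width, ids from another user fall outside the range regardless of their date part.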

Each document would still register the status totals for one user in one day, specified by the _id field, the same way it's done in the previous implementations.

These changes reflect an advanced understanding of MongoDB's indexing strategies and storage optimization techniques, demonstrating the expertise of a very experienced senior developer with deep knowledge of BSON data types and compound key design patterns.

Schema

The application implementation presented above would have the following TypeScript document schema denominated SchemaV4:

type SchemaV4 = {
  _id: Buffer;
  a?: number;
  n?: number;
  p?: number;
  r?: number;
};

Bulk Upsert

Based on the specification presented, we have the following updateOne operation for each event generated by this application version:

const operation = {
  updateOne: {
    filter: { _id: buildId(event.key, event.date) },
    update: {
      $inc: {
        a: event.approved,
        n: event.noFunds,
        p: event.pending,
        r: event.rejected,
      },
    },
    upsert: true,
  },
};

This updateOne operation has a similar logic to the one in appV3, with the only difference being the filter criteria.

filter:

Target the document where the _id field matches the concatenated value of key and date.
The buildId function converts the key and date into a binary format.

Get Reports

const pipeline = [
  {
    $match: {
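      _id: {
        // buildId expects a Date; oneYear is assumed to be a duration in milliseconds
        $gte: buildId(request.key, new Date(request.date.getTime() - oneYear)),
        $lt: buildId(request.key, request.date),
      },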
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$a" },
      noFunds: { $sum: "$n" },
      pending: { $sum: "$p" },
      rejected: { $sum: "$r" },
    },
  },
];

This aggregation operation has a similar logic to the one in appV3, with the only difference being the filtering criteria in the $match stage.

{ $match: {...} }:

The _id field is a binary representation of the concatenated key and date values.
The $gte operator specifies the start of the date range, while $lt specifies the end.

Indexes

The key design goal of appV4 was to leverage the mandatory, default index on the _id field. By storing the concatenated key and date (formatted as YYYYMMDD and converted to binary) directly in the _id field, appV4 eliminates the need for any additional custom indexes.

The default index on _id is automatically created by MongoDB and is unique. This index now directly supports the filtering requirements for both Bulk Upsert (equality match on the full _id) and Get Reports (range queries on the _id based on key and date ranges).

Initial Scenario Statistics

Collection Statistics

To evaluate the performance of appV4, we inserted 500 million event documents into the collection using the schema and Bulk Upsert function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV3	359,633,376	28.7GB	86B	8.96GB	2	16.37GB
appV4	359,615,279	19.66GB	59B	6.69GB	1	9.5GB

Event Statistics

To evaluate the storage efficiency per event, the Event Statistics are calculated by dividing the total Data Size and Index Size by the 500 million events.

Collection	Data Size/Event	Index Size/Event	Total Size/Event
appV3	62B	35.2B	96.8B
appV4	42.2B	20.4B	62.6B

Analyzing the tables above, we can see that from appV3 to appV4, we reduced the data size by 32% and the index size by 42%—big improvements. We also have one less index to maintain now.

Looking at the Event Statistics, the total size per event value decreased by 35%, from 96.8B to 62.6B. With this reduction, we’ll probably see some significant improvements in performance.

Load Test Results

Executing the load test for appV4 and plotting it alongside the results for appV3 and Desired rates, we have the following results for Get Reports and Bulk Upsert.

Get Reports Rates

While still not achieving the target of 25 reports per second, appV4 shows consistently better average rates compared to appV3, representing incremental progress toward optimal performance.

Get Reports Latency

Both versions exhibit comparable latency behavior throughout most of the test duration. However, during the final 100 minutes when performance degrades, appV4 demonstrates better resilience with smaller latency increases compared to appV3.

Bulk Upsert Rates

Both versions successfully maintain target rates during the first 100 minutes, but appV4 demonstrates superior performance during the degraded final 100 minutes, sustaining higher rates than appV3.

Bulk Upsert Latency

Performance Summary

Despite the substantial improvements observed in the Initial Scenario Statistics (35% reduction in Total Size per Event from 96.8B to 62.6B), the performance gains in appV4 are more modest than anticipated. This suggests that while index optimization and storage reduction provide measurable benefits, the fundamental architectural constraints require more significant changes. The results indicate that appV4 has reached the optimization ceiling for the current document-per-day approach.

Issues and Improvements

Enough of looking at our documents to get a better performance. Let's focus on the application behavior.

When generating the oneYear totals, the Get Reports function will need to retrieve something close to 60 documents on average, and in the worst-case scenario, 365 documents. To access each one of these documents, one index entry will have to be visited, and one disk read operation will have to be performed. How can we increase the data density of the documents in our application and, with that, reduce the index entries and read operations needed to perform the desired operation?

One way of doing that is using the Bucket Pattern. According to the MongoDB documentation, "The bucket pattern separates long series of data into distinct objects. Separating large data series into smaller groups can improve query access patterns and simplify application logic."

Looking at our application from the perspective of the bucket pattern, so far, we have bucketed our data by user and day, with each document containing the status totals for one user in one day. We can increase the bucketing range of our schema and, in one document, store events or status totals from a week, month, or even quarter.
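To get a feel for the effect of the bucketing range, the sketch below counts how many distinct buckets a one-year report would touch at different granularities. It gives an upper bound that assumes the user has events on every day, and the bucket labels are hypothetical: they are not the id format used later in the series.

```typescript
type Granularity = "day" | "month" | "quarter";

const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical bucket label for a date, computed in UTC.
const bucketOf = (date: Date, granularity: Granularity): string => {
  const year = date.getUTCFullYear();
  const month = date.getUTCMonth() + 1;
  if (granularity === "quarter") return `${year}Q${Math.ceil(month / 3)}`;
  const yearMonth = `${year}${String(month).padStart(2, "0")}`;
  if (granularity === "month") return yearMonth;
  return `${yearMonth}${String(date.getUTCDate()).padStart(2, "0")}`;
};

// Counts the distinct buckets touched by the half-open range [start, end).
const bucketsInRange = (start: Date, end: Date, granularity: Granularity): number => {
  const buckets = new Set<string>();
  for (let time = start.getTime(); time < end.getTime(); time += DAY_MS) {
    buckets.add(bucketOf(new Date(time), granularity));
  }
  return buckets.size;
};

const start = new Date("2021-06-15T00:00:00.000Z");
const end = new Date("2022-06-15T00:00:00.000Z");

console.log(bucketsInRange(start, end, "day")); // 365
console.log(bucketsInRange(start, end, "month")); // 13
console.log(bucketsInRange(start, end, "quarter")); // 5
```

Fewer buckets means fewer index entries to visit and fewer documents to read for the same report, at the cost of larger documents.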

Conclusion

That is the end of the first part of the series. We covered how indexes work on fields of type documents and saw some small changes that we can make to our application to reduce its storage and index needs, and as a consequence, improve its performance.

Here is a quick review of the improvements made between the application versions:

appV1 to appV2: Moved out the fields key and date from an embedded document in the _id field and let it have its default value, ObjectId
appV2 to appV3: Reduced the document size by short-handing the name of status fields and changed the data type of the key field from string to binary/hexadecimal
appV3 to appV4: Removed the need for an extra index by concatenating the values of key and date and storing them on the _id field

So far, none of our applications have gotten even close to the desired rates, but let's not give up. As presented in the issues and improvements of appV4, we can still improve our application by using the Bucket Pattern. The Bucket Pattern with the Computed Pattern will be the main points of improvement for the next application version, appV5, and its revisions.

Appendices

Index on Embedded Documents

This section examines how MongoDB indexes embedded document fields and explains why the appV1 implementation requires an additional index beyond the default _id index.

Index Behavior Analysis

To understand MongoDB's indexing behavior with embedded documents, we'll analyze how the default _id index performs with our appV1 query patterns. The following tests demonstrate the difference between exact document matching and embedded field queries through the explain functionality.

// A sample document
const doc = {
  _id: { key: "0001", date: new Date("2020-01-01") },
  approved: 2,
  rejected: 1,
};

// Making sure we have an empty collection
db.appV1.drop();

// Inserting the document in the `appV1` collection
db.appV1.insertOne(doc);

// Finding a document using `Bulk Upsert` filtering criteria
const bulkUpsertFilter = {
  _id: { key: "0001", date: new Date("2020-01-01") },
};
db.appV1.find(bulkUpsertFilter).explain("executionStats");
/*{
...
  executionStats: {
    nReturned: 1,
    totalKeysExamined: 1,
    totalDocsExamined: 1,
    ...
    executionStages: {
      stage: 'EXPRESS_IXSCAN',
      ...
    }
    ...
  },
  ...
}*/

// Finding a document using `Get Reports` filtering criteria
const getReportsFilter = {
  "_id.key": "0001",
  "_id.date": { $gte: new Date("2019-01-01"), $lte: new Date("2021-01-01") },
};
db.appV1.find(getReportsFilter).explain("executionStats");
/*{
...
  executionStats: {
    nReturned: 1,
    totalKeysExamined: 0,
    totalDocsExamined: 1,
    ...
    executionStages: {
      stage: 'COLLSCAN',
      ...
    }
    ...
  },
  ...
}*/

Index Utilization Results

The execution statistics reveal a critical performance difference:

Bulk Upsert Query: Uses the index efficiently (EXPRESS_IXSCAN) because it matches the entire embedded document exactly
Get Reports Query: Performs a collection scan (COLLSCAN) because it queries individual fields within the embedded document

This behavior occurs because MongoDB treats embedded documents as atomic values when indexing, not as collections of individual fields.

MongoDB's Embedded Document Indexing Strategy

MongoDB handles different data types with varying indexing approaches:

Primitive Types: Directly indexed with their native values
Arrays: Special indexing that creates entries for each array element
Embedded Documents: Indexed as serialized, atomic values

For embedded documents, MongoDB creates index entries using a stringified representation of the entire document structure:

const documentValue = { key: "0001", date: new Date("2010-01-01T00:00:00.000Z") };
const indexValue = "{key:0001,date:2010-01-01T00:00:00.000Z}";

Index Limitation Implications

This indexing strategy creates a fundamental limitation: since the index stores the embedded document as a serialized blob, MongoDB cannot access or search individual fields within that structure. Consequently:

Queries matching the entire embedded document can use the index effectively
Queries targeting specific embedded fields (like _id.key or _id.date) cannot utilize the index
Range queries on embedded fields require full collection scans

This explains why the appV1 implementation requires an additional compound index on _id.key and _id.date to support efficient querying of individual embedded document fields.
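The same limitation can be illustrated with a plain Map. This is only an analogy for the simplified model described above, not MongoDB's actual index implementation: the Map keyed by the serialized _id stands in for the default index, and all names are hypothetical.

```typescript
// Analogy only: a Map keyed by the serialized _id stands in for the default _id index.
type Id = { key: string; date: string };
type Doc = { _id: Id; approved: number };

const docs: Doc[] = [
  { _id: { key: "0001", date: "2020-01-01" }, approved: 2 },
  { _id: { key: "0001", date: "2020-01-02" }, approved: 1 },
  { _id: { key: "0002", date: "2020-01-01" }, approved: 5 },
];

const idIndex = new Map<string, Doc>();
for (const doc of docs) idIndex.set(JSON.stringify(doc._id), doc);

// Whole-value match: a single lookup, like the Bulk Upsert filter.
const exact = idIndex.get(JSON.stringify({ key: "0001", date: "2020-01-02" }));

// Field-level match: the serialized keys cannot answer it, so every document is checked.
let examined = 0;
const byField = docs.filter((doc) => {
  examined += 1;
  return doc._id.key === "0001" && doc._id.date >= "2020-01-01";
});

console.log(exact?.approved, byField.length, examined); // 1 2 3
```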

The Cost of Not Knowing MongoDB – Introduction

Artur Garcia Costa — Thu, 22 Jan 2026 16:49:52 +0000

The primary focus of this series is to demonstrate the significant performance gains you can achieve—and the costs you can save—by using MongoDB properly. This includes following best practices, studying your application's specific needs, and using those insights to model your data effectively.

To illustrate these potential gains, we will present a sample application. We will then develop and load-test various MongoDB implementations for this application. These implementations will cater to different levels of MongoDB expertise: beginner, intermediate, senior, and mind-blowing (🤯).

All code and supplementary information used throughout this series are available in the GitHub repository.

The Application: Finding Fraudulent Behavior in Transactions

The application's goal is to identify fraudulent behavior within a financial transaction system. It achieves this by analyzing the status of transactions for a specific user over a defined time period. The possible transaction statuses are approved, noFunds, pending, and rejected. Each user is uniquely identifiable by a 64-character hexadecimal key value.

The application receives details of each transaction through an event document. Each event document contains information for a single transaction, for one user, on a specific day. Consequently, it will include only one of the possible status fields, with this field having a numeric value of 1. For example, the following event document represents a pending transaction for the user with the key ...0001, which occurred on the date 2022-02-01:

const event = {
  key: "0000000000000000000000000000000000000000000000000000000000000001",
  date: new Date("2022-02-01"),
  pending: 1,
};

Transaction statuses are analyzed by comparing the total counts of each status for a given user over several trailing periods: oneYear, threeYears, fiveYears, sevenYears, and tenYears. These totals are provided in a reports document, which can be requested by providing the user's key and the end date for the report.

The following is an example of a reports document for the user with key ...0001 and an end date of 2022-06-15:

export const reports = [
  {
    id: "oneYear",
    end: new Date("2022-06-15T00:00:00.000Z"),
    start: new Date("2021-06-15T00:00:00.000Z"),
    totals: { approved: 4, noFunds: 1, pending: 1, rejected: 1 },
  },
  {
    id: "threeYears",
    end: new Date("2022-06-15T00:00:00.000Z"),
    start: new Date("2019-06-15T00:00:00.000Z"),
    totals: { approved: 8, noFunds: 2, pending: 2, rejected: 2 },
  },
  {
    id: "fiveYears",
    end: new Date("2022-06-15T00:00:00.000Z"),
    start: new Date("2017-06-15T00:00:00.000Z"),
    totals: { approved: 12, noFunds: 3, pending: 3, rejected: 3 },
  },
  {
    id: "sevenYears",
    end: new Date("2022-06-15T00:00:00.000Z"),
    start: new Date("2015-06-15T00:00:00.000Z"),
    totals: { approved: 16, noFunds: 4, pending: 4, rejected: 4 },
  },
  {
    id: "tenYears",
    end: new Date("2022-06-15T00:00:00.000Z"),
    start: new Date("2012-06-15T00:00:00.000Z"),
    totals: { approved: 20, noFunds: 5, pending: 5, rejected: 5 },
  },
];
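
To make the relationship between event documents and the reports document concrete, here is a minimal in-memory sketch in plain JavaScript. The names buildReports, PERIODS, and STATUSES are hypothetical, the keys are shortened for readability, and the boundary handling (start inclusive, end exclusive) is an assumption; the implementations in this series compute these totals inside MongoDB, not over an array.

```javascript
// Hypothetical helper data: trailing periods expressed in years.
const PERIODS = [
  { id: "oneYear", years: 1 },
  { id: "threeYears", years: 3 },
  { id: "fiveYears", years: 5 },
  { id: "sevenYears", years: 7 },
  { id: "tenYears", years: 10 },
];
const STATUSES = ["approved", "noFunds", "pending", "rejected"];

// Builds the reports array for one user from an in-memory list of events.
// Assumption: an event counts when start <= event.date < end.
function buildReports(events, key, end) {
  return PERIODS.map(({ id, years }) => {
    const start = new Date(end);
    start.setUTCFullYear(start.getUTCFullYear() - years);
    const totals = { approved: 0, noFunds: 0, pending: 0, rejected: 0 };
    for (const event of events) {
      if (event.key !== key || event.date < start || event.date >= end) continue;
      // Each event carries exactly one status field with the value 1.
      for (const status of STATUSES) totals[status] += event[status] ?? 0;
    }
    return { id, end, start, totals };
  });
}

// Made-up sample events with shortened keys.
const sampleEvents = [
  { key: "0001", date: new Date("2022-02-01"), pending: 1 },
  { key: "0001", date: new Date("2020-03-10"), approved: 1 },
  { key: "0002", date: new Date("2022-02-01"), rejected: 1 },
];

const sampleReports = buildReports(sampleEvents, "0001", new Date("2022-06-15"));
console.log(sampleReports.map((report) => [report.id, report.totals]));
```

Only the pending event falls inside the oneYear window here, while the threeYears window and every longer one also pick up the approved event from 2020.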

Load Testing Methodology

To evaluate the performance of each application version, two functions were designed to run concurrently under load:

Bulk Upsert: Inserts event documents.
Get Reports: Generates the reports document for a specific user key and date.

Parallel execution of these functions was achieved using worker threads, with 20 workers allocated to each. Each application version was tested for 200 minutes, with varying execution parameters applied throughout this period.

`Bulk Upsert` Function

The Bulk Upsert function receives batches of 250 event documents for registration. As its name suggests, these registrations are performed using MongoDB's upsert functionality, which attempts to update a document or creates a new one if it doesn't exist, using the data from the update operation. Each Bulk Upsert iteration is timed and its duration is recorded in a secondary database.

The batch processing rate is divided into four 50-minute phases, totaling 200 minutes. The rate begins at one batch insert per second and is incremented by one batch insert per second every 50 minutes, ultimately reaching four batch inserts per second (equivalent to 1,000 event documents per second).

`Get Reports` Function

The Get Reports function generates one reports document per execution. The duration of each execution is timed and recorded in the secondary database.

The rate of reports generation is divided into 40 phases, distributed as 10 sub-phases within each of the four Bulk Upsert phases. Within each Bulk Upsert phase, the Get Reports rate starts at 25 report requests per second and increases by 25 requests per second every five minutes. This culminates in 250 report requests per second by the end of that Bulk Upsert phase.
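
The two rate schedules described above can be written as small functions of the elapsed test minute. This is only a sketch of the arithmetic; the function names are hypothetical and the load-test code in the repository may be organized differently.

```javascript
const BATCH_SIZE = 250; // event documents per Bulk Upsert batch

// Target Bulk Upsert rate (batches per second) at a given minute, 0 <= minute < 200.
function bulkUpsertBatchesPerSecond(minute) {
  return Math.floor(minute / 50) + 1; // 1, 2, 3, 4 across the four 50-minute phases
}

// Target Get Reports rate (requests per second) at a given minute, 0 <= minute < 200.
function getReportsPerSecond(minute) {
  const minuteInPhase = minute % 50; // position inside the current Bulk Upsert phase
  return (Math.floor(minuteInPhase / 5) + 1) * 25; // 25, 50, ..., 250
}

for (const minute of [0, 49, 50, 199]) {
  const eventsPerSecond = bulkUpsertBatchesPerSecond(minute) * BATCH_SIZE;
  console.log({ minute, eventsPerSecond, reportsPerSecond: getReportsPerSecond(minute) });
}
```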

The following graph depicts the target rates for Bulk Upsert and Get Reports throughout the test scenario:

Initial Scenario and Data Generation

For a fair comparison across application versions, the initial dataset (working set) for the tests was designed to be larger than the available memory on the MongoDB server. This approach ensures significant cache activity and prevents the entire working set from residing in memory.

The following parameters were established for the initial dataset:

Data spanning 10 years: from 2010-01-01 to 2020-01-01.
50 million events per year, resulting in a total working set of 500 million events.
An average of 60 events per user (key) per year.

Given 50 million events per year and 60 events per user per year, the total number of unique users is approximately 833,333 (50,000,000 / 60). The user's key generator was configured to produce keys following an approximately normal (Gaussian) distribution. This simulates a real-world scenario where some users generate more events than others. The following graph illustrates the distribution of 50 million keys generated:

To further simulate a real-world scenario, the distribution of event statuses was set as follows:

80% approved
10% noFunds
7.5% pending
2.5% rejected

Initial Scenario Collection Statistics

Collection	Documents	Data Size	Avg. Document Size	Storage Size	Indexes	Index Size
appV1	359,639,622	39.58GB	119B	8.78GB	2	20.06GB
appV2	359,614,536	41.92GB	126B	10.46GB	2	16.66GB
appV3	359,633,376	28.7GB	86B	8.96GB	2	16.37GB
appV4	359,615,279	19.66GB	59B	6.69GB	1	9.5GB
appV5R0	95,350,431	19.19GB	217B	5.06GB	1	2.95GB
appV5R1	33,429,649	15.75GB	506B	4.04GB	1	1.09GB
appV5R2	33,429,649	11.96GB	385B	3.26GB	1	1.16GB
appV5R3	33,429,492	11.96GB	385B	3.24GB	1	1.11GB
appV5R4	33,429,470	12.88GB	414B	3.72GB	1	1.24GB
appV6R0	95,350,319	11.1GB	125B	3.33GB	1	3.13GB
appV6R1	33,429,366	8.19GB	264B	2.34GB	1	1.22GB
appV6R2	33,429,207	9.11GB	293B	2.8GB	1	1.26GB
appV6R3	33,429,694	9.53GB	307B	2.56GB	1	1.19GB
appV6R4	33,429,372	9.53GB	307B	1.47GB	1	1.34GB
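
One way to read this table is to divide Data Size by Storage Size for each collection, which gives a rough on-disk compression ratio. The sketch below does that for three rows copied from the table; the helper name compressionRatio is hypothetical, and the ratio is only as precise as the rounded figures above.

```javascript
// Rows copied from the table above (sizes in GB).
const collectionStats = [
  { collection: "appV5R3", dataGB: 11.96, storageGB: 3.24 },
  { collection: "appV6R3", dataGB: 9.53, storageGB: 2.56 },
  { collection: "appV6R4", dataGB: 9.53, storageGB: 1.47 },
];

// Uncompressed data size divided by the size actually stored on disk.
function compressionRatio({ dataGB, storageGB }) {
  return dataGB / storageGB;
}

for (const row of collectionStats) {
  console.log(row.collection, compressionRatio(row).toFixed(2));
}
```

appV6R3 and appV6R4 hold the same amount of data, so the difference between their ratios comes entirely from how that data is stored.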

Infrastructure Configuration

MongoDB Server Instance

The MongoDB server ran on an AWS EC2 c7a.large instance, equipped with 2 vCPUs and 4GB of memory. Two disks were attached:

A 15GB GP3 disk for the operating system.
A 300GB IO2 disk with 10,000 IOPS for MongoDB data storage.

The instance ran Ubuntu 22.04, fully updated at the time of testing. All recommended production settings were applied to optimize MongoDB performance on the available hardware.

Application Server Instance

The application server ran on an AWS EC2 c6a.xlarge instance, featuring 4 vCPUs and 8GB of memory. Two disks were attached:

A 10GB GP3 disk for the operating system.
A 10GB GP3 disk for a secondary MongoDB server, used for storing load test metrics.

This instance also ran Ubuntu 22.04, fully updated. Recommended production settings were applied to optimize its performance.

Unique Indexes Quirks and Unique Documents In an Array of Documents

Artur Garcia Costa — Tue, 06 Jan 2026 13:02:13 +0000

This article was reviewed and approved by MongoDB.

We are developing an application to summarize a user's financial situation. The main page of this application shows us the user's identification and the balances on all banking accounts synced with our application.

As we've seen in blog posts and recommendations of how to get the most out of MongoDB, "Data that is accessed together should be stored together", we thought of the following document/structure to store the data used on the main page of the application:

const user = {
  _id: 1,
  name: { first: "john", last: "smith" },
  accounts: [
    { balance: 500, bank: "abc", number: "123" },
    { balance: 2500, bank: "universal bank", number: "9029481" },
  ],
};

Based on the functionality of our application, we determined the following rules:

A user can register in the application and not sync a bank account;
An account is identified by its bank and number fields;
The same account shouldn't be registered for two different users;
The same account shouldn't be registered multiple times for the same user.

To enforce what was presented above, we decided to create an index with the following characteristics:

Given that the fields bank and number must not repeat, this index must be set as Unique;
Since we are indexing more than one field, it'll be of type Compound;
Since we are indexing documents inside of an array, it'll also be of type Multikey;

As a result of that, we have a Compound Multikey Unique Index with the following specification and options:

const specification = { "accounts.bank": 1, "accounts.number": 1 };
const options = { name: "Unique Account", unique: true };

To validate that our index works as we intended, we'll use the following data on our tests:

const user1 = { _id: 1, name: { first: "john", last: "smith" } };
const user2 = { _id: 2, name: { first: "john", last: "appleseed" } };
const account1 = { balance: 500, bank: "abc", number: "123" };

First, let's add the users to the collection:

db.users.createIndex(specification, options); // Unique Account

db.users.insertOne(user1); // { acknowledged: true, insertedId: 1 }

db.users.insertOne(user2); /* MongoServerError: E11000 duplicate key...
...error collection: test.users index: Unique Account dup key: ...
...{ accounts.bank: null, accounts.number: null } */

Pretty good, we haven't even started working with the accounts, and we already have an error. Let's see what is going on.

Analyzing the error message, it says we have a duplicate key for the index Unique Account with the value of null for the fields accounts.bank and accounts.number. This is due to how indexing works in MongoDB: when we insert a document into an indexed collection and the document lacks one or more of the fields specified in the index, the missing fields are treated as null, and an entry is still added to the index.

Using this logic to analyze our previous test, when we inserted user1, it didn't have the fields accounts.bank and accounts.number and generated an entry in the index Unique Account with the value of null for both. When we tried to insert the user2 in the collection, we had the same behavior, and another entry in the index Unique Account would have been created if we hadn't specified this index as unique. More info about missing fields and unique indexes can be found here.
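
The behavior just described can be imitated with a toy model in plain JavaScript. This is a sketch for building intuition only: ToyUniqueIndex and indexKeysFor are hypothetical names, and the model says nothing about how MongoDB actually stores its indexes.

```javascript
// Toy model of a unique index on { "accounts.bank": 1, "accounts.number": 1 }.
// A document without accounts still produces one key, with null for each missing field.
function indexKeysFor(doc) {
  const accounts =
    Array.isArray(doc.accounts) && doc.accounts.length > 0 ? doc.accounts : [{}];
  return accounts.map((account) =>
    JSON.stringify([account.bank ?? null, account.number ?? null])
  );
}

class ToyUniqueIndex {
  constructor() {
    this.entries = new Map(); // index key -> _id of the document that owns it
  }

  insert(doc) {
    const keys = indexKeysFor(doc);
    for (const key of keys) {
      if (this.entries.has(key)) {
        throw new Error(`duplicate key (toy model): ${key}`);
      }
    }
    for (const key of keys) this.entries.set(key, doc._id);
  }
}

const toyIndex = new ToyUniqueIndex();
toyIndex.insert({ _id: 1, name: { first: "john", last: "smith" } }); // stores the key [null,null]

try {
  toyIndex.insert({ _id: 2, name: { first: "john", last: "appleseed" } });
} catch (error) {
  console.log(error.message); // the second user collides on the same [null,null] key
}
```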

The solution for this issue is to index only the documents that have the fields accounts.bank and accounts.number. To accomplish that, we can specify a Partial Filter Expression in our index options. Now we have a Compound Multikey Unique Partial Index (fancy name, huh? Who are we trying to impress here?) with the following specification and options:

const specification = { "accounts.bank": 1, "accounts.number": 1 };
const optionsV2 = {
  name: "Unique Account V2",
  partialFilterExpression: {
    "accounts.bank": { $exists: true },
    "accounts.number": { $exists: true },
  },
  unique: true,
};

Back to our tests:

// Cleaning our environment
db.users.drop({}); // Delete documents and indexes definitions

/* Tests */
db.users.createIndex(specification, optionsV2); // Unique Account V2
db.users.insertOne(user1); // { acknowledged: true, insertedId: 1 }
db.users.insertOne(user2); // { acknowledged: true, insertedId: 2 }

Our new index implementation worked, and now we can insert those two users without accounts. Let's test account duplication, starting with the same account for two different users:

// Cleaning the collection
db.users.deleteMany({}); // Delete documents, keep indexes
db.users.insertMany([user1, user2]);

/* Test */
db.users.updateOne({ _id: user1._id }, { $push: { accounts: account1 } });
// { ... matchedCount: 1, modifiedCount: 1 ...}

db.users.updateOne({ _id: user2._id }, { $push: { accounts: account1 } });
/* MongoServerError: E11000 duplicate key error collection: test.users 
index: Unique Account V2 dup key:...
... { accounts.bank: "abc", accounts.number: "123" } */

We couldn't insert the same account into different users as we expected. Now we'll try the same account for the same user.

// Cleaning the collection
db.users.deleteMany({}); // Delete documents, keep indexes
db.users.insertMany([user1, user2]);

/* Test */
db.users.updateOne({ _id: user1._id }, { $push: { accounts: account1 } });
// { ... matchedCount: 1, modifiedCount: 1 ...}

db.users.updateOne({ _id: user1._id }, { $push: { accounts: account1 } });
// { ... matchedCount: 1, modifiedCount: 1 ...}

db.users.findOne({ _id: user1._id });
/*{
 _id: 1,
 name: { first: 'john', last: 'smith' },
 accounts: [
   { balance: 500, bank: 'abc', number: '123' },
   { balance: 500, bank: 'abc', number: '123' }
 ]
}*/

When we don't expect things to work, they do. This is another surprise caused by not knowing or considering how indexes work in MongoDB. Looking at this part of the MongoDB documentation, we learn that an index doesn't store duplicate entries that have the same key values and point to the same document. So when we pushed account1 a second time to the same user, no new index entry was created, and the unique constraint had no duplicate to reject.
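
A toy model in plain JavaScript makes this rule easier to see. It is only a sketch with hypothetical names (owners, indexDocument): each index key remembers which document owns it, and a conflict is raised only when a different document claims the same key.

```javascript
// Toy model: index key -> _id of the document that owns the entry.
const owners = new Map();

function indexDocument(doc) {
  for (const { bank, number } of doc.accounts ?? []) {
    const key = `${bank}|${number}`;
    const owner = owners.get(key);
    // Uniqueness is violated only when ANOTHER document already owns this key.
    if (owner !== undefined && owner !== doc._id) {
      throw new Error(`duplicate key (toy model): ${key}`);
    }
    owners.set(key, doc._id); // same key and same document: still a single entry
  }
}

const repeatedAccount = { balance: 500, bank: "abc", number: "123" };

// The same account twice inside one document is accepted.
indexDocument({ _id: 1, accounts: [repeatedAccount, repeatedAccount] });
console.log(owners.size); // one entry, despite two array elements

// The same account in a different document is rejected.
try {
  indexDocument({ _id: 2, accounts: [repeatedAccount] });
} catch (error) {
  console.log(error.message);
}
```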

Some of you more knowledgeable about MongoDB may think that using $addToSet instead of $push would resolve our problem. Not this time, young padawan. The $addToSet operator compares all the fields in the account's document, so the same account with a different balance would still be added. As we specified at the beginning of our journey, an account must be unique and identifiable by the fields bank and number alone.

Ok, what can we do now? Our index has a ton of options and compound names, and our application doesn't behave as we hoped.

A simple way out of this situation is to change how our update function is structured, changing its filter parameter to only match the user's documents where the account we want to insert isn't in the accounts array.

// Cleaning the collection
db.users.deleteMany({}); // Delete documents, keep indexes
db.users.insertMany([user1, user2]);

/* Test */
const bankFilter = {
  $not: { $elemMatch: { bank: account1.bank, number: account1.number } },
};

db.users.updateOne(
  { _id: user1._id, accounts: bankFilter },
  { $push: { accounts: account1 } }
); // { ... matchedCount: 1, modifiedCount: 1 ...}

db.users.updateOne(
  { _id: user1._id, accounts: bankFilter },
  { $push: { accounts: account1 } }
); // { ... matchedCount: 0, modifiedCount: 0 ...}

db.users.findOne({ _id: user1._id });
/*{
 _id: 1,
 name: { first: 'john', last: 'smith' },
 accounts: [ { balance: 500, bank: 'abc', number: '123' } ]
}*/

Problem solved. We tried to insert the same account for the same user, and it wasn't inserted, but the operation also didn't error out.

This behavior doesn't meet our expectations because it doesn't make clear to the user that this operation is prohibited. Another point of concern is that this solution assumes that every piece of code inserting a new account into the database will use the correct update filter.

We've worked in some companies and know that as people come and go, some knowledge about the implementation is lost, interns will try to reinvent the wheel, and some nasty shortcuts will be taken. We want a solution that will error out in any case and stop even the most unscrupulous developer/administrator who dares to change data directly on the production database 😱.

MongoDB Schema Validation for the win.

A quick note before we go down this rabbit hole: MongoDB best practices recommend implementing schema validation at the application level and using MongoDB Schema Validation as a backstop.

In MongoDB Schema Validation, it's possible to use the operator $expr to write an aggregation expression to validate the data of a document when it has been inserted or updated. With that, we can write an expression to verify if the items inside an array are unique.

After some consideration, we get the following expression:

const accountsSet = {
  $setIntersection: {
    $map: {
      input: "$accounts",
      in: { bank: "$$this.bank", number: "$$this.number" },
    },
  },
};

const uniqueAccounts = {
  $eq: [{ $size: "$accounts" }, { $size: accountsSet }],
};

const accountsValidator = {
  $expr: {
    $cond: {
      if: { $isArray: "$accounts" },
      then: uniqueAccounts,
      else: true,
    },
  },
};

The first operation inside $expr is a $cond. When the logic specified in the if field results in true, the logic within the then field is executed; when the result is false, the logic within the else field is executed.

Using this knowledge to interpret our code: when the accounts array exists in the document, { $isArray: "$accounts" }, we execute the logic within uniqueAccounts; when the array doesn't exist, we return true, signaling that the document passed the schema validation.

Inside the uniqueAccounts variable, we verify if the $size of two things is $eq. The first thing is the size of the array field $accounts, and the second thing is the size of accountsSet, which is generated by the $setIntersection operator. If the two arrays have the same size, the logic returns true, and the document passes the validation. Otherwise, the logic returns false, the document fails validation, and the operation errors out.

The $setIntersection operator performs a set operation on the array passed to it, removing duplicate entries. That array is generated by a $map operator, which maps each account in $accounts to a document containing only the fields bank and number.
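
For readers more comfortable with application code than with aggregation expressions, the same check can be sketched in plain JavaScript. The function name hasUniqueAccounts is hypothetical, and this is an equivalent of the logic only; the validator itself still runs inside MongoDB.

```javascript
// Plain-JavaScript equivalent of the validator's logic.
function hasUniqueAccounts(doc) {
  // Mirrors the `else: true` branch: documents without an accounts array pass.
  if (!Array.isArray(doc.accounts)) return true;

  // Like $map: keep only the identifying fields of each account.
  const identities = doc.accounts.map(({ bank, number }) =>
    JSON.stringify({ bank, number })
  );

  // Like the set operation: duplicates collapse into one element.
  const distinct = new Set(identities);

  // Like $eq over the two $size values.
  return doc.accounts.length === distinct.size;
}

console.log(hasUniqueAccounts({ _id: 1 })); // no accounts array
console.log(
  hasUniqueAccounts({
    _id: 1,
    accounts: [
      { balance: 500, bank: "abc", number: "123" },
      { balance: 900, bank: "abc", number: "123" }, // same account, different balance
    ],
  })
);
```

The second call returns false even though the balances differ, which is exactly the case that $addToSet could not catch.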

Let's see if this is witchcraft or science:

// Cleaning the collection
db.users.drop({}); // Delete documents and indexes definitions
db.createCollection("users", { validator: accountsValidator });
db.users.createIndex(specification, optionsV2);
db.users.insertMany([user1, user2]);

/* Test */
db.users.updateOne({ _id: user1._id }, { $push: { accounts: account1 } });
// { ... matchedCount: 1, modifiedCount: 1 ...}

db.users.updateOne(
  { _id: user1._id },
  { $push: { accounts: account1 } }
); 
/* MongoServerError: Document failed validation
Additional information: {
 failingDocumentId: 1,
 details: {
   operatorName: '$expr',
   specifiedAs: {
     '$expr': {
       '$cond': {
         if: { '$and': '$accounts' },
         then: { '$eq': [ [Object], [Object] ] },
         else: true
       }
     }
   },
   reason: 'expression did not match',
   expressionResult: false
 }
}*/

Mission accomplished.

Improving Storage and Read Performance for Free: Flat vs Structured Schemas

Artur Garcia Costa — Mon, 08 Dec 2025 18:52:37 +0000

This article was reviewed and approved by MongoDB.

When developers or administrators who had previously only been "followers of the word of relational data modeling" start to use MongoDB, it is common to see documents with flat schemas. This behavior happens because relational data modeling makes you think about data and schemas in a flat, two-dimensional structure called tables.

In MongoDB, data is stored as BSON documents, almost a binary representation of JSON documents, with slight differences. Because of this, we can create schemas with more dimensions/levels. More details about BSON implementation can be found in its specification. You can also learn more about its differences from JSON.

MongoDB documents are composed of one or more key/value pairs, where the value of a field can be any of the BSON data types, including other documents, arrays, or arrays of documents.

Using documents, arrays, or arrays of documents as values for fields enables the creation of a structured schema, where one field can represent a group of related information. This structured schema is an alternative to a flat schema.

Let's see an example of how to write the same user document using the two schemas:

The two documents above contain the same data. The one on the left, flatUser, uses a flat schema where all the field-and-value pairs are on the same level. The one on the right, structuredUser, employs a structured schema where the field and values have nested levels according to related information inside the document.

So, what are the advantages of using a structured schema rather than a flat one? The quick answer for those in a hurry is that a structured schema may require less storage and be faster to traverse than a flat schema. For those who want to know why, we need a better understanding of BSON.

For the purpose of this article, a BSON document can be seen as a list of items, where each item represents a field-and-value pair of the document. An item is composed of the field’s type, name, length, and data in a serialized form. The field type is one byte long and indicates the data type in the data field. The field name is the field's name in a string form. The field length is four bytes long and indicates the length of the data field for those types where the size is not fixed. The data field is the actual data of the field-and-value pair. Putting this definition in a graphical representation, we have:

Let's see how a structured schema uses less storage than a flat schema by analyzing the field-and-value pair related to the user's name.

In the flatUser, we have the following table from a storage perspective:

field-and-value	Type	Field Name	Field Length	Field Data	Total
name_first: "john"	1 byte	10 bytes	4 bytes	4 bytes	19 bytes
name_last: "smith"	1 byte	9 bytes	4 bytes	5 bytes	19 bytes
name_middle: "oliver"	1 byte	11 bytes	4 bytes	6 bytes	22 bytes

Adding up the table's total sizes, the flat document uses 60 bytes to store the field and value related to the user's name.

To analyze the storage of the structuredUser, let's divide it into two tables. In the first table, we'll have the storage used by the document of the field name, and in the second table, we'll have the storage utilized by the field-and-value name.

Let’s build the first table for the value/content of the field name:

field-and-value	Type	Field Name	Field Length	Field Data	Total Size
first: "john"	1 byte	5 bytes	4 bytes	4 bytes	14 bytes
last: "smith"	1 byte	4 bytes	4 bytes	5 bytes	14 bytes
middle: "oliver"	1 byte	6 bytes	4 bytes	6 bytes	17 bytes

Adding up the previous table's total sizes, the value/Field Data of the field name uses 45 bytes. Building the second table for the field-and-value name, we get:

field-and-value	Type	Field Name	Field Length	Field Data	Total Size
name: { … }	1 byte	4 bytes	4 bytes	45 bytes	54 bytes

The structured document uses 54 bytes to store the values related to the user's name.

Comparing the tables, we see the main difference is the "Field Name" storage size. The flat schema uses 30 bytes to store the names of its fields, while the structured schema uses 19 bytes to store the names of its fields. This is due to the repetition of the sub-string "name_" in the "Field Name" of the flat schema.
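
The arithmetic in the three tables can be reproduced with a few lines of JavaScript. This sketch follows the article's simplified accounting (type, field name, length, and data); it is not a BSON encoder and, like the tables, it leaves out details of the full BSON specification such as terminator bytes. The function name fieldSize is hypothetical.

```javascript
// Simplified per-field cost used in the tables above:
// 1 byte (type) + field name + 4 bytes (length) + data.
function fieldSize(fieldName, dataBytes) {
  return 1 + fieldName.length + 4 + dataBytes;
}

// Flat schema: three top-level fields that repeat the "name_" prefix.
const flatBytes =
  fieldSize("name_first", "john".length) +
  fieldSize("name_last", "smith".length) +
  fieldSize("name_middle", "oliver".length);

// Structured schema: three short fields inside one embedded document.
const nestedBytes =
  fieldSize("first", "john".length) +
  fieldSize("last", "smith".length) +
  fieldSize("middle", "oliver".length);
const structuredBytes = fieldSize("name", nestedBytes);

console.log({ flatBytes, nestedBytes, structuredBytes });
// { flatBytes: 60, nestedBytes: 45, structuredBytes: 54 }
```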

Storing the two documents in a MongoDB instance, we will get a size of 403 bytes for the flat schema and 307 bytes for the structured schema. Not bad getting a 24% improvement in storage just by changing the schema, and a structured document is easier to read and more pleasant to look at.

Now, let's see how a structured schema is faster to traverse than a flat schema by getting the zip code of the work address.

In the flatUser document, to get to the field address_work_zip starting at the beginning of the document, a cursor would need to perform 12 field-name comparisons before it reaches the desired field.

In the structuredUser document, to get to the field address.work.zip starting at the beginning of the document, a cursor would need to perform only 8 field-name comparisons. The smaller number of comparisons is due to some values of a field-and-value pair being a document. When the cursor checks the field name, it can skip three fields/comparisons (first, middle, and last) because it knows that address.work.zip won't be inside of name.<field>. When the cursor checks the field address.home, it can also skip five fields/comparisons (street, number, zip, state, and country).
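
The two counts can be reproduced with a small sketch. The documents below are assumptions: their field names follow the ones mentioned in the text, but the field order, the leading _id, and all values are made up so that the walk matches the 12 and 8 comparisons described above. The function countComparisons is a hypothetical model of a sequential field scan, not MongoDB's actual traversal code.

```javascript
// Assumed layout of the flat document (values are placeholders).
const flatUser = {
  _id: 1,
  name_first: "john",
  name_middle: "oliver",
  name_last: "smith",
  address_home_street: "first street",
  address_home_number: 10,
  address_home_zip: "00001",
  address_home_state: "ny",
  address_home_country: "us",
  address_work_street: "second street",
  address_work_number: 20,
  address_work_zip: "00002",
};

// Assumed layout of the structured document with the same data.
const structuredUser = {
  _id: 1,
  name: { first: "john", middle: "oliver", last: "smith" },
  address: {
    home: { street: "first street", number: 10, zip: "00001", state: "ny", country: "us" },
    work: { street: "second street", number: 20, zip: "00002" },
  },
};

// Walks the fields in order, counting name comparisons, and only descends
// into an embedded document when its name matches the next path segment.
function countComparisons(doc, path) {
  let comparisons = 0;
  let current = doc;
  for (const segment of path) {
    for (const fieldName of Object.keys(current)) {
      comparisons += 1;
      if (fieldName === segment) {
        current = current[fieldName];
        break;
      }
    }
  }
  return comparisons;
}

console.log(countComparisons(flatUser, ["address_work_zip"])); // 12
console.log(countComparisons(structuredUser, ["address", "work", "zip"])); // 8
```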

To quantify the performance gain on traversing a structured schema instead of a flat schema in MongoDB, a test with the following methodology was used:

To isolate the result to be influenced just by the traversing of the documents, the MongoDB instance used was configured with in-memory storage.
Documents with 10, 25, 50, and 100 fields were utilized for the flat schema.
Documents with 2x5, 5x5, 10x5, and 20x5 fields were used for the structured schema, where 2x5 means two fields of type document with five fields for each document.
Each collection had 10,000 documents generated using faker/npm.
To force the MongoDB engine to loop through all documents and all fields inside each document, all queries were made searching for a field and value that wasn't present in the documents.
Each query was executed 100 times in a row for each document size and schema.
No concurrent operation was executed during each test.

Now, to the test results:

Documents	Flat	Structured	Difference	Improvement
10 / 2x5	487 ms	376 ms	111 ms	22.8%
25 / 5x5	624 ms	434 ms	190 ms	30.4%
50 / 10x5	915 ms	617 ms	298 ms	32.6%
100 / 20x5	1384 ms	891 ms	493 ms	35.6%

As our theory predicted, traversing a structured document is faster than traversing a flat one. However, the gains presented in this test shouldn't be assumed for every comparison between structured and flat schemas; the improvement in traversal depends on how the nested fields and documents are organized.

This article showed how to better use your MongoDB deployment by changing the schema of your document for the same data/information. Another option to extract more performance from your MongoDB deployment is to apply the common schema patterns of MongoDB. In this case, you will analyze which data you should put in your document/schema. The article Building with Patterns has the most common patterns and will significantly help.

The code used to get the above results is available in the GitHub repository.

The Pitfall of Increasing Read Capacity by Reading From Secondary Nodes in a MongoDB Replica Set

Artur Garcia Costa — Wed, 03 Dec 2025 18:43:37 +0000

This article was reviewed and approved by MongoDB and was originally published in foojay.io and Delbridge Solutions.

The scenario

Imagine we are responsible for managing the MongoDB cluster that supports our country's national financial payment system, similar to Pix in Brazil. Our application was designed to be read-heavy, with one write operation for every 20 read operations.

With Black Friday approaching, a critical period for our national financial payment system, we have been entrusted with the crucial task of creating a scaling plan for our cluster to handle the increased demand during this shopping spree. Given that our system is read-heavy, we are exploring ways to enhance the read performance and capacity of our cluster.

We're in charge of the national financial payment system that powers a staggering 60% of all transactions across the nation. That's why ensuring the highest availability of this MongoDB cluster is absolutely critical—it's the backbone of our economy!

A solution from AI Models

As a database administrator or database developer in 2025, our first step when searching for solutions is to consult AI. These AI models, including GPT-5, Grok Code Fast 1, Claude Sonnet 4, and Gemini 2.5 Pro, are advanced tools that can provide insights and recommendations based on the specific query we ask. I asked these AI models the question, "How can I increase read performance and capacity in a MongoDB replica set cluster?"

A standard recommendation across all responses was to distribute read operations to secondary nodes using the readPreference setting to enhance performance and increase the number of secondary nodes to boost read capacity.

An interesting observation is that nearly all AI models correctly warned that reading from secondary nodes could yield stale information, which means the data might not be the most up-to-date, as the replication of write operations between nodes requires some time.

The pitfall of scaling capacity by reading from secondary nodes

Let's imagine we have a replica set cluster consisting of three nodes: one primary node and two secondaries. Each node can handle up to 100 read operations per second. If we distribute the read operations equally among the nodes, the entire replica set cluster should be able to accommodate a total of 300 read operations per second.

Our application requires 240 read operations per second. Since we have configured it to balance the operations across the replica set nodes, each node will handle 80 read operations per second, which is below its capacity of 100 reads per second.

However, a potential risk lurks in the shadows. Imagine a network outage in one of the availability zones where one of our replica set nodes is deployed, causing this primary or secondary node to go down. Now, our application is still requesting its 240 read operations per second, but with only two nodes remaining, each node needs to process 120 read operations per second.

Since each node can only handle 100 read operations per second, this overloads their hardware, which may lead to further failures. As a result, the remaining nodes may go down, taking down the entire MongoDB cluster along with the application.
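
The numbers in this scenario are easy to check with a short calculation. The sketch below uses the figures from the text; the function names readsPerNode and maxSafeReads are hypothetical, and the model assumes reads are spread perfectly evenly across the healthy nodes.

```javascript
const NODE_CAPACITY = 100; // reads per second a single node can handle
const APPLICATION_READS = 240; // reads per second the application requests
const TOTAL_NODES = 3;

// Per-node load when reads are spread evenly across the healthy nodes.
function readsPerNode(totalReads, healthyNodes) {
  return totalReads / healthyNodes;
}

// Largest demand the cluster can serve while tolerating the loss of one node.
function maxSafeReads(totalNodes, nodeCapacity) {
  return (totalNodes - 1) * nodeCapacity;
}

for (const healthyNodes of [3, 2]) {
  const load = readsPerNode(APPLICATION_READS, healthyNodes);
  console.log({ healthyNodes, load, overloaded: load > NODE_CAPACITY });
}

console.log(maxSafeReads(TOTAL_NODES, NODE_CAPACITY)); // 200
```

With three nodes, any demand above 200 reads per second depends on every node staying up, which is the situation the 240 reads per second in this scenario falls into.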

Increasing read capacity vs increasing read performance

Let's first clarify the difference between read capacity and read performance in a MongoDB cluster:

Read capacity: How many read operations the cluster can manage without overloading its hardware or significantly increasing the time required to complete these operations
Read performance: How quickly a read operation can be fulfilled by the cluster

As discussed previously, utilizing secondary nodes to enhance the cluster’s read capacity may inadvertently reduce its availability. This availability reduction occurs because if one node fails, the remaining nodes could become overloaded with read requests.

Therefore, when high availability is crucial for your application, reading from secondary nodes should be used only to improve read performance, not to increase read capacity. Two ways of doing that are:

Proximity: Locating the secondary node closer to the application reduces the latency of requests and responses between them.
Caching: Consistently executing the same queries on the same node allows its cache to retain the necessary data, leading to faster query fulfillment.
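
As a rough illustration of the proximity case, the sketch below builds a connection string that asks the driver to read from the member with the lowest latency. The host names, the replica set name rs0, and the helper buildReplicaSetUri are hypothetical; replicaSet and readPreference are standard connection string options, and nearest is one of the standard read preference modes.

```javascript
// Builds a replica set connection string with an explicit read preference.
function buildReplicaSetUri(hosts, replicaSet, readPreference) {
  const params = new URLSearchParams({ replicaSet, readPreference });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

// Hypothetical hosts, one per availability zone.
const exampleUri = buildReplicaSetUri(
  ["node-a.example.net:27017", "node-b.example.net:27017", "node-c.example.net:27017"],
  "rs0",
  "nearest"
);

console.log(exampleUri);
```

Whatever read preference is chosen, the capacity planning from the previous section still applies: the primary alone should be able to absorb the full read load if the other nodes become unavailable.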

Properly increasing read capacity

To safely and reliably increase read capacity without sacrificing availability, the best approach is to scale your cluster—either vertically (scaling up) or horizontally (scaling out).

Vertical scaling (scale up)

This method involves increasing the resources of existing nodes, such as CPU, RAM, storage, and IOPS.

Advantages:
- Operational simplicity: No changes are needed for data distribution or query routing.
- Minimal application change: Connection strings and query patterns typically remain the same.
- Immediate performance improvement: It’s particularly effective for workloads that are limited by CPU or memory.
Disadvantages:
- Upper limits: Eventually, you will reach the maximum instance size available; a single machine's resources can cap throughput.
- Non-linear performance growth: The performance of your application usually doesn't grow linearly with the instance size, meaning that doubling your resources might not double your throughput.
- Single-node bottlenecks: Hot documents or collections and heavy aggregation can still face contention for a primary node's resources.
- [MongoDB EA only] Obtain and provision additional resources: While MongoDB Atlas offers simple methods for vertical scaling, on-premises deployments often face limitations due to resource availability.

Horizontal scaling (scale out via sharding)

This approach distributes data and workload across multiple shards by partitioning the data.

Advantages:
- Near-linear throughput growth: Adding shards can increase capacity for both reads and writes, in addition to total storage.
- Hotspot mitigation: Proper shard keys can help evenly spread the load to avoid bottlenecks on individual nodes.
- Geographic flexibility: Zone sharding keeps data close to users and meets data residency requirements.

Disadvantages:
- Design complexity: Selecting the right shard key is crucial; poorly chosen shard keys can lead to imbalances or inefficient scatter-gather queries.
- Operational overhead: Tasks such as chunk balancing, resharding, and managing cross-shard queries or transactions can add complexity.
- Query pattern considerations: To maximize targeted reads and avoid fan-out, applications may need to include the shard key in their queries.
- [MongoDB EA only] Obtain and provision additional resources: While MongoDB Atlas offers simple methods for horizontal scaling, on-premises deployments often face limitations due to resource availability.
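
To make the difference between targeted reads and scatter-gather queries concrete, here is a minimal Python sketch. It is a toy model, not MongoDB's actual routing or hashing logic: the shard count, the `shard_for` and `shards_to_query` helpers, and the `key` and `year` field names are all assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for illustration


def shard_for(shard_key_value: str) -> int:
    """Map a shard key value to a shard index (toy stand-in for hashed sharding)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_to_query(query: dict, shard_key: str) -> list[int]:
    """Return the shards a router would have to contact for an equality query."""
    if shard_key in query:
        # Targeted read: the shard key value pins the query to a single shard.
        return [shard_for(str(query[shard_key]))]
    # Scatter-gather: without the shard key, every shard has to be asked.
    return list(range(NUM_SHARDS))


# A query that includes the shard key touches one shard.
targeted = shards_to_query({"key": "0001", "year": 2022}, shard_key="key")
# A query that omits it fans out to all shards.
fan_out = shards_to_query({"year": 2022}, shard_key="key")
print(len(targeted), len(fan_out))  # 1 4
```

In this model, adding shards only adds read capacity for queries of the first kind. Queries of the second kind load every shard no matter how many there are.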

For more information on scaling in MongoDB, refer to the articles "A Guide to Horizontal vs Vertical Scaling" and "Database Scaling," or check the official documentation on scaling strategies.

Maybe other ways around it?

Some readers who are more familiar with MongoDB cluster topology and node types may think that, at least in MongoDB Atlas, we could have increased our cluster's read capacity by using read-only or analytics nodes. As Master Yoda would say, "Much to learn you still have, my young padawan." First, let's understand what these nodes are and what they are for, and then we can assess whether they fit our needs.

Read-only node

In reviewing the official MongoDB documentation for Atlas read-only nodes, I've identified two key points that are particularly relevant to our case:

“Use read-only nodes to optimize local reads in the nodes' respective service areas.”
“Read-only nodes don't provide high availability because they don't participate in elections.”

The first point indicates that read-only nodes can enhance performance by being located closer to the application, thereby reducing read latency. However, lower latency is not the same as more read capacity, which is our goal, so this solution is not ideal.

The second point emphasizes that read-only nodes do not contribute to high availability, which is a critical requirement for our application. Therefore, this aspect does not provide any advantage for us.

Analytics node

The official MongoDB documentation for Atlas analytics nodes raises two points that closely mirror the read-only case:

“Use analytics nodes to isolate queries which you do not wish to contend with your operational workload.”
“Read-only nodes don't provide high availability because they don't participate in elections.”

The second point is the same as in the read-only case, so there’s no need for further discussion on it. The first point implies that analytics nodes exist to isolate analytical queries that would otherwise compete with the everyday queries of your application. Their purpose is workload isolation, so they do not contribute to increasing read capacity for the operational workload.
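
As a sketch of how that isolation is usually wired up, the snippet below builds a connection string that sends reads to analytics nodes through a read preference tag. It assumes Atlas's predefined `nodeType:ANALYTICS` replica set tag; the `analytics_uri` helper, the host, and the database name are hypothetical placeholders, so check the Atlas documentation before relying on it.

```python
from urllib.parse import urlencode


def analytics_uri(host: str, database: str) -> str:
    """Build a URI whose reads target nodes tagged as analytics nodes."""
    params = {
        "readPreference": "secondary",
        # Assumed Atlas predefined tag; verify against the Atlas docs.
        "readPreferenceTags": "nodeType:ANALYTICS",
    }
    # safe=":" keeps the colon in the tag instead of percent-encoding it.
    return f"mongodb+srv://{host}/{database}?{urlencode(params, safe=':')}"


# Hypothetical host and database, used only for illustration.
uri = analytics_uri("cluster0.example.mongodb.net", "reports")
print(uri)
```

Only clients that connect with this string are routed to the analytics nodes. The application's regular connections keep reading from the operational nodes, which is why this setup isolates a workload rather than adding capacity to it.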

Conclusion

While distributing read operations across secondary MongoDB nodes to boost capacity might sound appealing, it can inadvertently impact availability—something that's crucial for systems like our national financial payment network. Such an approach could lead to cascading failures during outages, which we definitely want to avoid!

Instead, focus on scaling strategies. Consider vertical scaling for immediate performance enhancements, or horizontal sharding to ensure consistent throughput and address hotspot concerns. While read-only and analytics nodes offer certain benefits, they don't address our need for high availability and additional read capacity.
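
The availability risk described in this conclusion can be made concrete with some simple arithmetic. The sketch below assumes a three-node replica set whose read load is spread evenly across all nodes and redistributed evenly when one fails. The utilization figures and the `utilization_after_failure` helper are illustrative assumptions, not measurements from our load tests.

```python
def utilization_after_failure(nodes: int, utilization: float, failed: int = 1) -> float:
    """Per-node utilization once the load of the failed nodes moves to the survivors."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes to absorb the load")
    total_load = nodes * utilization  # load expressed in units of one node's capacity
    return total_load / survivors


# Illustrative utilization levels for a three-node replica set.
for per_node in (0.5, 0.6, 0.8):
    after = utilization_after_failure(nodes=3, utilization=per_node)
    status = "overloaded" if after > 1.0 else "ok"
    print(f"{per_node:.0%} per node -> {after:.0%} after one failure ({status})")
```

Under these assumptions, three nodes running at 80% each carry more load than two nodes can absorb, so losing one node pushes the survivors to about 120% of their capacity. That is how an outage of a single node can cascade.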

Forem: Artur Garcia Costa

The Cost of Not Knowing MongoDB - Part 3: appV6R0 to appV6R4

Table Of Contents

Article Introduction

Application Version 6 Revision 0 (appV6R0): A Dynamic Monthly Bucket Document

Introduction

Schema

Bulk Upsert

Get Reports

Indexes

Initial Scenario Statistics

Collection Statistics

Event Statistics

Load Test Results

Get Reports Rates

Get Reports Latency

Bulk Upsert Rates

Bulk Upsert Latency

Performance Summary

Issues and Improvements

Application Version 6 Revision 1 (appV6R1): A Dynamic Quarter Bucket Document

Introduction

Schema

Bulk Upsert

Get Reports

Indexes

Initial Scenario Statistics

Collection Statistics

Event Statistics

Load Test Results

Get Reports Rates

Get Reports Latency

Bulk Upsert Rates

Bulk Upsert Latency

Issues and Improvements

Application Version 6 Revision 2 (appV6R2): A Dynamic Bucket and Computed Document

Introduction

Schema

Indexes

Bulk Upsert

Get Reports

Indexes

Initial Scenario Statistics

Collection Statistics

Event Statistics

Load Test Results

Get Reports Rates

Get Reports Latency

Bulk Upsert Rates

Bulk Upsert Latency

Performance Summary

Issues and Improvements

Application Version 6 Revision 3 (appV6R3): Getting Everything at Once

Introduction

Schema

Bulk Upsert

Get Reports

Indexes

Initial Scenario Statistics

Collection Statistics

Event Statistics

Load Test Results

Get Reports Rate

Get Reports Latency

Bulk Upsert Rate

Bulk Upsert Latency

Issues and Improvements

Application Version 6 Revision 4 (appV6R4): The zstd Compression Algorithm

Introduction

Schema

Bulk Upsert

Get Reports

Indexes

Initial Scenario Statistics

Collection Statistics

Event Statistics

Load Test Results

Get Reports Rate

Get Reports Latency

Bulk Upsert Rate

Application Version 6 Revision 4 (appV6R4): The `zstd` Compression Algorithm

Application Version 5 Revision 0 and Revision 1 (appV5R0 and appV5R1): A simple way to use the `Bucket Pattern`