<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Artur Garcia Costa</title>
    <description>The latest articles on Forem by Artur Garcia Costa (@arturgc).</description>
    <link>https://forem.com/arturgc</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1060790%2F486d30b1-ed55-4662-83ae-e60d886bfa43.jpeg</url>
      <title>Forem: Artur Garcia Costa</title>
      <link>https://forem.com/arturgc</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/arturgc"/>
    <language>en</language>
    <item>
      <title>The Cost of Not Knowing MongoDB - Part 3: appV6R0 to appV6R4</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Thu, 22 Jan 2026 16:52:07 +0000</pubDate>
      <link>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4-22an</link>
      <guid>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4-22an</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Application Version 6 Revision 0: A Dynamic Monthly Bucket Document&lt;/li&gt;
&lt;li&gt;Application Version 6 Revision 1: A Dynamic Quarter Bucket Document&lt;/li&gt;
&lt;li&gt;Application Version 6 Revision 2: A Dynamic Bucket and Computed Document&lt;/li&gt;
&lt;li&gt;Application Version 6 Revision 3: Getting everything at once&lt;/li&gt;
&lt;li&gt;Application Version 6 Revision 4: The &lt;code&gt;zstd&lt;/code&gt; Compression Algorithm&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Article Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome to the third and final part of the series, &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-introduction-335h"&gt;"The Cost of Not Knowing MongoDB"&lt;/a&gt;. Building upon the foundational optimizations explored in &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-1-appv0-to-appv4-2p66"&gt;Part 1&lt;/a&gt; and &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7"&gt;Part 2&lt;/a&gt;, this article delves into advanced MongoDB design patterns that can dramatically transform application performance.&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-1-appv0-to-appv4-2p66"&gt;Part 1&lt;/a&gt;, we improved application performance by concatenating fields, changing data types, and shortening field names. In &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7"&gt;Part 2&lt;/a&gt;, we implemented the &lt;code&gt;Bucket Pattern&lt;/code&gt; and the &lt;code&gt;Computed Pattern&lt;/code&gt; and optimized the aggregation pipeline to achieve even better performance.&lt;/p&gt;

&lt;p&gt;In this final article, we address the &lt;code&gt;Issues and Improvements&lt;/code&gt; identified in &lt;code&gt;AppV5R4&lt;/code&gt;. Specifically, we focus on reducing the document size in our application to alleviate the disk throughput bottleneck on the MongoDB server. This reduction will be accomplished by adopting a &lt;code&gt;Dynamic Schema&lt;/code&gt; and modifying the storage compression algorithm.&lt;/p&gt;

&lt;p&gt;All the application versions and revisions in this article would likely be the work of a senior MongoDB developer, since they build on all the previous versions and use the &lt;code&gt;Dynamic Schema&lt;/code&gt; pattern, which is not commonly seen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 6 Revision 0 (appV6R0): A Dynamic Monthly Bucket Document
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As mentioned in the &lt;code&gt;Issues and Improvements&lt;/code&gt; of &lt;code&gt;appV5R4&lt;/code&gt; from the &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7#:~:text=the%20three%20quarters.-,Issues%20and%20Improvements,-As%20spoiled%20in"&gt;previous article&lt;/a&gt;, the primary limitation of our MongoDB server is its disk throughput. To address this, we need to reduce the size of the documents being stored.&lt;/p&gt;

&lt;p&gt;Consider the following document from &lt;code&gt;appV5R3&lt;/code&gt;, which has provided the best performance so far:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...01202202&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-16&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-27&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-29&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;items&lt;/code&gt; array in this document contains only four elements, but on average, it will have around 10 elements, and in the worst-case scenario, it could have up to 90 elements. These elements are the primary contributors to the document size, so they should be the focus of our optimization efforts.&lt;/p&gt;

&lt;p&gt;One thing the elements have in common is the &lt;code&gt;date&lt;/code&gt; field and, in the document above, part of its value: the year and month. By rethinking how this field and its value are stored, we can reduce storage requirements.&lt;/p&gt;

&lt;p&gt;An unconventional solution we could use is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Changing the &lt;code&gt;items&lt;/code&gt; field type from an array to a document.&lt;/li&gt;
&lt;li&gt;Using the &lt;code&gt;date&lt;/code&gt; value as the field name in the &lt;code&gt;items&lt;/code&gt; document.&lt;/li&gt;
&lt;li&gt;Storing the status totals as the value for each &lt;code&gt;date&lt;/code&gt; field.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is the previous document represented using the new schema idea:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...01202202&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="mi"&gt;20220605&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;20220616&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;20220627&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;20220629&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While this schema may not significantly reduce the document size compared to &lt;code&gt;appV5R3&lt;/code&gt;, we can further optimize it by leveraging the fact that the year is already embedded in the &lt;code&gt;_id&lt;/code&gt; field. This eliminates the need to repeat the year in the field names of the &lt;code&gt;items&lt;/code&gt; document.&lt;/p&gt;

&lt;p&gt;With this approach, the &lt;code&gt;items&lt;/code&gt; document adopts a &lt;code&gt;Dynamic Schema&lt;/code&gt;, where field names encode information and are not predefined.&lt;/p&gt;

&lt;p&gt;To demonstrate various implementation possibilities, we will revisit all the bucketing criteria used in the &lt;code&gt;appV5RX&lt;/code&gt; implementations, starting with &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;appV6R0&lt;/code&gt;, which builds upon &lt;code&gt;appV5R0&lt;/code&gt; but uses a dynamic schema, data is bucketed by year and month. The field names in the &lt;code&gt;items&lt;/code&gt; document represent only the day of the date, as the year and month are already stored in the &lt;code&gt;_id&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;A detailed explanation of the bucketing logic and functions used to implement the current application can be found in the &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7#:~:text=Application%20Version%205%20Revision%200%20and%20Revision%201%20(appV5R0%20and%20appV5R1)%3A%20A%20simple%20way%20to%20use%20the%20Bucket%20Pattern"&gt;&lt;code&gt;appV5R0 introduction&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The following document stores data for January 2022 (2022-01-XX), applying the newly presented idea:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...01202201&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;27&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="mi"&gt;29&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
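
&lt;p&gt;To make the transformation concrete, the following is a minimal sketch of how an array-style &lt;code&gt;items&lt;/code&gt; field could be converted into the day-keyed document shown above. The names &lt;code&gt;ArrayItem&lt;/code&gt;, &lt;code&gt;DynamicItems&lt;/code&gt;, &lt;code&gt;dayKey&lt;/code&gt;, and &lt;code&gt;toDynamicItems&lt;/code&gt; are hypothetical and not part of the application code. The sketch also assumes dates are interpreted in UTC.&lt;/p&gt;

```typescript
// Hypothetical types for illustration; not taken from the application code.
type StatusTotals = { a?: number; n?: number; p?: number; r?: number };
interface ArrayItem extends StatusTotals {
  date: Date;
}
type DynamicItems = { [day: string]: StatusTotals };

const STATUSES = ["a", "n", "p", "r"] as const;

// Zero-padded day of month in UTC (assumption), e.g. "05".
function dayKey(date: Date): string {
  return String(date.getUTCDate()).padStart(2, "0");
}

// Converts an array of dated status totals into a document keyed by day.
// Entries that share the same day are summed into one field.
function toDynamicItems(items: ArrayItem[]): DynamicItems {
  const result: DynamicItems = {};
  for (const item of items) {
    const key = dayKey(item.date);
    const bucket = result[key] ?? {};
    for (const status of STATUSES) {
      const value = item[status];
      if (value !== undefined) {
        bucket[status] = (bucket[status] ?? 0) + value;
      }
    }
    result[key] = bucket;
  }
  return result;
}

const arrayItems: ArrayItem[] = [
  { date: new Date("2022-01-05"), a: 10, n: 3 },
  { date: new Date("2022-01-16"), p: 1, r: 1 },
  { date: new Date("2022-01-27"), a: 5, r: 1 },
  { date: new Date("2022-01-29"), p: 1 },
];

const dynamicItems = toDynamicItems(arrayItems);
console.log(dynamicItems);
```

&lt;p&gt;Each element loses its &lt;code&gt;date&lt;/code&gt; field name and its eight-byte date value and gains only a two-character field name, which is where the size reduction comes from.&lt;/p&gt;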



&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above has the following TypeScript document schema, named &lt;code&gt;SchemaV6R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV6R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
    &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getDD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Extract the `day` from the `event.date`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// key + year + month&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.a`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.n`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.p`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.r`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Target the document where the &lt;code&gt;_id&lt;/code&gt; field matches the concatenated value of &lt;code&gt;key&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, and &lt;code&gt;month&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;buildId&lt;/code&gt; function converts the &lt;code&gt;key&lt;/code&gt;+&lt;code&gt;year&lt;/code&gt;+&lt;code&gt;month&lt;/code&gt; into a binary format.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses the &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/update/inc/" rel="noopener noreferrer"&gt;$inc&lt;/a&gt; operator to increment the fields corresponding to the same &lt;code&gt;DD&lt;/code&gt; as the &lt;code&gt;event&lt;/code&gt; by the status values provided.&lt;/li&gt;
&lt;li&gt;If a field does not exist in the &lt;code&gt;items&lt;/code&gt; document and the &lt;code&gt;event&lt;/code&gt; provides a value for it, &lt;code&gt;$inc&lt;/code&gt; treats the non-existent field as having a value of 0 and performs the operation.&lt;/li&gt;
&lt;li&gt;If a field exists in the &lt;code&gt;items&lt;/code&gt; document but the &lt;code&gt;event&lt;/code&gt; does not provide a value for it (i.e., &lt;code&gt;undefined&lt;/code&gt;), &lt;code&gt;$inc&lt;/code&gt; treats it as 0 and performs the operation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;upsert&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensures a new document is created if no matching document exists.&lt;/li&gt;
&lt;/ul&gt;
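
&lt;p&gt;As a complement to the explanation above, here is a hedged sketch of one way to build the &lt;code&gt;$inc&lt;/code&gt; document so that it only contains the statuses an event actually carries, instead of relying on how &lt;code&gt;undefined&lt;/code&gt; values are handled. The &lt;code&gt;AppEvent&lt;/code&gt; type, the &lt;code&gt;buildInc&lt;/code&gt; function, and this &lt;code&gt;getDD&lt;/code&gt; implementation are assumptions made for illustration. The application's real helpers are not shown in this article.&lt;/p&gt;

```typescript
// Hypothetical event shape, inferred from the fields used in the operation above.
type AppEvent = {
  key: string;
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

// Assumed implementation: zero-padded UTC day of month, e.g. "05".
function getDD(date: Date): string {
  return String(date.getUTCDate()).padStart(2, "0");
}

// Maps each event status to its shortened field name in the document.
const STATUS_FIELDS = [
  ["approved", "a"],
  ["noFunds", "n"],
  ["pending", "p"],
  ["rejected", "r"],
] as const;

// Builds the $inc document, keeping only the statuses present in the event.
function buildInc(event: AppEvent): { [path: string]: number } {
  const DD = getDD(event.date);
  const inc: { [path: string]: number } = {};
  for (const [status, short] of STATUS_FIELDS) {
    const value = event[status];
    if (value !== undefined) {
      inc[`items.${DD}.${short}`] = value;
    }
  }
  return inc;
}

const inc = buildInc({
  key: "0001",
  date: new Date("2022-01-05"),
  approved: 10,
  noFunds: 3,
});
console.log(inc);
```

&lt;p&gt;The resulting object could be used as the value of &lt;code&gt;$inc&lt;/code&gt; in the &lt;code&gt;updateOne&lt;/code&gt; operation. For the sample event it contains only the &lt;code&gt;items.05.a&lt;/code&gt; and &lt;code&gt;items.05.n&lt;/code&gt; paths.&lt;/p&gt;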

&lt;h3&gt;
  
  
  Get Reports
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;buildTotalsField&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupSumTotals&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $match: docsFromKeyBetweenDate }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Range-filters documents by &lt;code&gt;_id&lt;/code&gt; to retrieve only buckets within the report date range. It has the exact same logic as &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: buildTotalsField }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The logic is similar to the one used in the &lt;code&gt;Get Reports&lt;/code&gt; of &lt;code&gt;appV5R3&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/objectToArray/" rel="noopener noreferrer"&gt;&lt;code&gt;$objectToArray&lt;/code&gt;&lt;/a&gt; operator is used to convert the &lt;code&gt;items&lt;/code&gt; document into an array, enabling a &lt;code&gt;$reduce&lt;/code&gt; operation.&lt;/li&gt;
&lt;li&gt;Filtering the &lt;code&gt;items&lt;/code&gt; fields within the report's range involves extracting the &lt;code&gt;year&lt;/code&gt; and &lt;code&gt;month&lt;/code&gt; from the &lt;code&gt;_id&lt;/code&gt; field and the &lt;code&gt;day&lt;/code&gt; from the field names in the &lt;code&gt;items&lt;/code&gt; document.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="c1"&gt;// Equivalent JavaScript logic:&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Get month from _id&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;YYYY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Get year from _id&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Convert the object to an array of [key, value]&lt;/span&gt;

 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

     &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;

     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $group: groupSumTotals }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Group the &lt;code&gt;totals&lt;/code&gt; of each document in the pipeline into final status totals using &lt;code&gt;$sum&lt;/code&gt; operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $project: { _id: 0 } }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Format the resulting document to have the &lt;code&gt;reports&lt;/code&gt; format.&lt;/li&gt;
&lt;/ul&gt;
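
&lt;p&gt;As a rough illustration of these last two stages, the sketch below shows what a &lt;code&gt;$group&lt;/code&gt; stage summing the per-document &lt;code&gt;totals&lt;/code&gt; could look like, next to a plain TypeScript function that mimics it in memory. The output field names (&lt;code&gt;approved&lt;/code&gt;, &lt;code&gt;noFunds&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt;, &lt;code&gt;rejected&lt;/code&gt;) and the helper &lt;code&gt;sumTotals&lt;/code&gt; are assumptions made for this example, not necessarily the code used by the application.&lt;/p&gt;

```typescript
// Per-document totals produced by the $addFields stage.
type Totals = { a: number; n: number; p: number; r: number };

// Assumed shape of the final report (field names are hypothetical).
type Report = { approved: number; noFunds: number; pending: number; rejected: number };

// A possible $group stage: `_id: null` collapses every document into one group.
const groupSumTotals = {
  _id: null,
  approved: { $sum: "$totals.a" },
  noFunds: { $sum: "$totals.n" },
  pending: { $sum: "$totals.p" },
  rejected: { $sum: "$totals.r" },
};

// In-memory imitation of the $group stage above, for illustration only.
function sumTotals(docs: { totals: Totals }[]): Report {
  const report: Report = { approved: 0, noFunds: 0, pending: 0, rejected: 0 };
  for (const doc of docs) {
    report.approved += doc.totals.a;
    report.noFunds += doc.totals.n;
    report.pending += doc.totals.p;
    report.rejected += doc.totals.r;
  }
  return report;
}

const exampleReport = sumTotals([
  { totals: { a: 10, n: 3, p: 0, r: 0 } },
  { totals: { a: 5, n: 0, p: 1, r: 2 } },
]);
console.log(Object.keys(groupSumTotals), exampleReport);
```

&lt;p&gt;The trailing &lt;code&gt;$project&lt;/code&gt; stage then only has to drop the &lt;code&gt;_id: null&lt;/code&gt; field that &lt;code&gt;$group&lt;/code&gt; leaves behind.&lt;/p&gt;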

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV6R0&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R0&lt;/td&gt;
&lt;td&gt;95,350,431&lt;/td&gt;
&lt;td&gt;19.19GB&lt;/td&gt;
&lt;td&gt;217B&lt;/td&gt;
&lt;td&gt;5.06GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2.95GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R0&lt;/td&gt;
&lt;td&gt;95,350,319&lt;/td&gt;
&lt;td&gt;11.1GB&lt;/td&gt;
&lt;td&gt;125B&lt;/td&gt;
&lt;td&gt;3.33GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3.13GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/events&lt;/th&gt;
&lt;th&gt;Index Size/events&lt;/th&gt;
&lt;th&gt;Total Size/events&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R0&lt;/td&gt;
&lt;td&gt;41.2B&lt;/td&gt;
&lt;td&gt;6.3B&lt;/td&gt;
&lt;td&gt;47.5B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.4B&lt;/td&gt;
&lt;td&gt;28.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R0&lt;/td&gt;
&lt;td&gt;23.8B&lt;/td&gt;
&lt;td&gt;6.7B&lt;/td&gt;
&lt;td&gt;30.5B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
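
&lt;p&gt;The per-event figures can be reproduced from the &lt;code&gt;Collection Statistics&lt;/code&gt; table with a few lines of arithmetic. The sketch below assumes that the reported GB are binary gigabytes (2^30 bytes), which is the interpretation that matches the table values; &lt;code&gt;bytesPerEvent&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
const TOTAL_EVENTS = 500000000; // events inserted in the initial scenario
const BYTES_PER_GB = 1024 * 1024 * 1024; // assumption: sizes are binary gigabytes

// Average number of bytes per event, rounded to one decimal place.
function bytesPerEvent(sizeInGB: number): number {
  const bytes = (sizeInGB * BYTES_PER_GB) / TOTAL_EVENTS;
  return Math.round(bytes * 10) / 10;
}

// appV6R0: 11.1GB of data and 3.13GB of index.
const appV6R0PerEvent = { data: bytesPerEvent(11.1), index: bytesPerEvent(3.13) };
console.log(appV6R0PerEvent); // { data: 23.8, index: 6.7 }
```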

&lt;p&gt;It is challenging to make a direct comparison between &lt;code&gt;appV6R0&lt;/code&gt; and &lt;code&gt;appV5R0&lt;/code&gt; from a storage perspective. The &lt;code&gt;appV5R0&lt;/code&gt; implementation is the simplest bucketing possible, where &lt;code&gt;event&lt;/code&gt; documents were merely appended to the &lt;code&gt;items&lt;/code&gt; array without bucketing by &lt;code&gt;day&lt;/code&gt;, as is done in &lt;code&gt;appV6R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;However, we can attempt a comparison between &lt;code&gt;appV6R0&lt;/code&gt; and &lt;code&gt;appV5R3&lt;/code&gt;, the best solution so far. In &lt;code&gt;appV6R0&lt;/code&gt;, data is bucketed by month, whereas in &lt;code&gt;appV5R3&lt;/code&gt;, it is bucketed by quarter. Assuming document size scales linearly with the bucketing criteria (though this is not entirely accurate), the &lt;code&gt;appV6R0&lt;/code&gt; document would be approximately &lt;code&gt;3 * 125 = 375 bytes&lt;/code&gt;, which is about 2.6% smaller than the 385 bytes of &lt;code&gt;appV5R3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Another indicator of improvement is the &lt;code&gt;Data Size/events&lt;/code&gt; metric in the &lt;code&gt;Event Statistics&lt;/code&gt; table. For &lt;code&gt;appV6R0&lt;/code&gt;, each event uses an average of 23.8 bytes, compared to 25.7 bytes for &lt;code&gt;appV5R3&lt;/code&gt;, a reduction of roughly 7%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV6R0&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The two versions have very similar rate performance, with &lt;code&gt;appV6R0&lt;/code&gt; being slightly better in the second and third quarters of the test, while &lt;code&gt;appV5R0&lt;/code&gt; is better in the first and fourth.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5uq78qy2ki2t2xqm1bu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5uq78qy2ki2t2xqm1bu.png" alt="Get Reports Rate - appV5R0 vs appV6R0" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The two versions have very similar latency performance, with &lt;code&gt;appV6R0&lt;/code&gt; being slightly better in the second and third quarters of the test, while &lt;code&gt;appV5R0&lt;/code&gt; is better in the first and fourth.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffr6n98b734g07lar6m2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffr6n98b734g07lar6m2f.png" alt="Get Reports Latency - appV5R0 vs appV6R0" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions reach similar rates, with &lt;code&gt;appV6R0&lt;/code&gt; holding a small edge over &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp21u7fhngxxbnxsmmsha.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp21u7fhngxxbnxsmmsha.png" alt="Bulk Upsert Rate - appV5R0 vs appV6R0" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Although both versions have similar latency values for the first quarter of the test, &lt;code&gt;appV6R0&lt;/code&gt; has a clear advantage over &lt;code&gt;appV5R0&lt;/code&gt; for the final three quarters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yv64i3z9k8pi3tz39qz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yv64i3z9k8pi3tz39qz.png" alt="Bulk Upsert Latency - appV5R0 vs appV6R0" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Summary&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Despite the significant reduction in document and storage size achieved by &lt;code&gt;appV6R0&lt;/code&gt;, the performance improvement was not as substantial as expected. This suggests that the bottleneck in the application when bucketing data by month may not be related to disk throughput.&lt;/p&gt;

&lt;p&gt;Examining the &lt;code&gt;collection stats&lt;/code&gt; table reveals that the index size for both versions is close to 3GB. This is near the 4GB of available memory on the machine running the database and exceeds the &lt;a href="https://www.mongodb.com/docs/manual/core/wiredtiger/#memory-use" rel="noopener noreferrer"&gt;1.5GB allocated by WiredTiger for cache&lt;/a&gt;. Therefore, it is likely that the limiting factor in this case is memory/cache rather than document size, which explains the lack of a significant performance improvement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;To address the limitations observed in &lt;code&gt;appV6R0&lt;/code&gt;, we propose adopting the same line of improvements applied from &lt;code&gt;appV5R0&lt;/code&gt; to &lt;code&gt;appV5R1&lt;/code&gt;. Specifically, we will bucket the events by quarter in &lt;code&gt;appV6R1&lt;/code&gt;. This approach not only follows the established pattern of enhancements but also aligns with the need to optimize performance further.&lt;/p&gt;

&lt;p&gt;As highlighted in the &lt;code&gt;Load Test Results&lt;/code&gt;, the current bottleneck lies in the size of the index relative to the available cache/memory. By increasing the bucketing interval from month to quarter, we can reduce the number of documents by approximately a factor of three. This reduction will, in turn, decrease the number of index entries by the same factor, leading to a smaller index size.&lt;/p&gt;
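
&lt;p&gt;The memory argument can be checked with simple arithmetic. The sketch below uses the default WiredTiger cache rule from the MongoDB documentation (the larger of 50% of RAM minus 1GB, or 256MB) and then projects the index size after a threefold reduction in documents. The projection assumes that index size scales linearly with the number of documents, which is only an approximation, and both function names are hypothetical.&lt;/p&gt;

```typescript
// Default WiredTiger internal cache: the larger of 50% of (RAM - 1GB) or 256MB.
function defaultWiredTigerCacheGB(ramGB: number): number {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

// Rough projection, assuming index size scales linearly with the document count.
function projectedIndexGB(currentIndexGB: number, reductionFactor: number): number {
  return currentIndexGB / reductionFactor;
}

const cacheGB = defaultWiredTigerCacheGB(4); // 1.5 on the 4GB test machine
const monthlyIndexGB = 3.13; // appV6R0 index size from the collection statistics
const quarterlyIndexGB = projectedIndexGB(monthlyIndexGB, 3);

console.log(monthlyIndexGB > cacheGB); // true: the monthly index does not fit in the cache
console.log(cacheGB > quarterlyIndexGB); // true: the projected quarterly index would fit
```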

&lt;h2&gt;
  
  
  Application Version 6 Revision 1 (appV6R1): A Dynamic Quarter Bucket Document &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; section, the primary bottleneck in &lt;code&gt;appV6R0&lt;/code&gt; was the index size nearing the memory capacity of the machine running MongoDB. To mitigate this issue, we propose to increase the bucketing interval from month to quarter for &lt;code&gt;appV6R1&lt;/code&gt;, the same way we did in &lt;code&gt;appV5R1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This adjustment aims to reduce the number of documents and index entries by approximately a factor of three, thereby decreasing the overall index size. By adopting a quarter-based bucketing strategy, we follow the pattern of enhancements already applied in &lt;code&gt;appV5R1&lt;/code&gt; while addressing the specific memory/cache constraints identified in &lt;code&gt;appV6R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The implementation of &lt;code&gt;appV6R1&lt;/code&gt; retains most of the code from &lt;code&gt;appV6R0&lt;/code&gt;, with the following key differences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;_id&lt;/code&gt; field will now be composed of &lt;code&gt;key&lt;/code&gt;+&lt;code&gt;year&lt;/code&gt;+&lt;code&gt;quarter&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The field names in the &lt;code&gt;items&lt;/code&gt; document will encode both &lt;code&gt;month&lt;/code&gt; and &lt;code&gt;day&lt;/code&gt;, as this information is necessary for filtering date ranges in the &lt;code&gt;Get Reports&lt;/code&gt; operation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following example demonstrates how data for June 2022 (2022-06-XX), within the second quarter (Q2), is stored using the new schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...01202202&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0605&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0616&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0627&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0629&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above uses the following TypeScript document schema, named &lt;code&gt;SchemaV6R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV6R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
    &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MMDD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getMMDD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Extract the month (MM) and day(DD) from the `event.date`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// key + year + quarter&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.a`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.n`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.p`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.r`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV6R0&lt;/code&gt;, with the only differences being the &lt;code&gt;filter&lt;/code&gt; and &lt;code&gt;update&lt;/code&gt; criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Target the document where the &lt;code&gt;_id&lt;/code&gt; field matches the concatenated value of &lt;code&gt;key&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, and &lt;code&gt;quarter&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;buildId&lt;/code&gt; function converts the &lt;code&gt;key&lt;/code&gt;+&lt;code&gt;year&lt;/code&gt;+&lt;code&gt;quarter&lt;/code&gt; into a binary format.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses the &lt;code&gt;$inc&lt;/code&gt; operator to increment the fields corresponding to the same &lt;code&gt;MMDD&lt;/code&gt; as the &lt;code&gt;event&lt;/code&gt; by the status values provided.&lt;/li&gt;
&lt;/ul&gt;
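
&lt;p&gt;To make the key construction and the effect of &lt;code&gt;$inc&lt;/code&gt; with &lt;code&gt;upsert: true&lt;/code&gt; more concrete, here is a minimal in-memory sketch. The implementations of &lt;code&gt;getMMDD&lt;/code&gt; and of the &lt;code&gt;_id&lt;/code&gt; builder are not shown in this article, so the versions below, along with &lt;code&gt;getQuarter&lt;/code&gt;, &lt;code&gt;buildIdString&lt;/code&gt; and &lt;code&gt;applyEvent&lt;/code&gt;, are hypothetical. The sketch keeps the &lt;code&gt;_id&lt;/code&gt; as a plain string instead of binary data, uses UTC dates, and skips status fields that are absent from the event.&lt;/p&gt;

```typescript
type AppEvent = {
  key: string;
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};
type Status = { a?: number; n?: number; p?: number; r?: number };
type Bucket = { _id: string; items: { [mmdd: string]: Status } };

// Left-pad a number to two digits, e.g. 6 -> "06".
const pad2 = (value: number): string => ("0" + value).slice(-2);

// Month and day of the event as "MMDD" (UTC is an assumption of this sketch).
function getMMDD(date: Date): string {
  return pad2(date.getUTCMonth() + 1) + pad2(date.getUTCDate());
}

// Quarter of the year as "01" to "04".
function getQuarter(date: Date): string {
  return pad2(Math.floor(date.getUTCMonth() / 3) + 1);
}

// key + year + quarter, kept as a string here instead of binary data.
function buildIdString(key: string, date: Date): string {
  return key + date.getUTCFullYear() + getQuarter(date);
}

// In-memory imitation of updateOne with $inc and upsert: true.
function applyEvent(store: { [id: string]: Bucket }, event: AppEvent): void {
  const id = buildIdString(event.key, event.date);
  const bucket = store[id] || (store[id] = { _id: id, items: {} }); // upsert
  const mmdd = getMMDD(event.date);
  const status = bucket.items[mmdd] || (bucket.items[mmdd] = {});
  if (event.approved) status.a = (status.a || 0) + event.approved;
  if (event.noFunds) status.n = (status.n || 0) + event.noFunds;
  if (event.pending) status.p = (status.p || 0) + event.pending;
  if (event.rejected) status.r = (status.r || 0) + event.rejected;
}

const exampleStore: { [id: string]: Bucket } = {};
applyEvent(exampleStore, { key: "ab01", date: new Date("2022-06-05T00:00:00Z"), approved: 10, noFunds: 3 });
applyEvent(exampleStore, { key: "ab01", date: new Date("2022-04-16T00:00:00Z"), pending: 1 });
console.log(JSON.stringify(exampleStore)); // both events land in the same Q2 bucket
```

&lt;p&gt;Events from April and June share one document because they fall in the same quarter, while the &lt;code&gt;MMDD&lt;/code&gt; field names keep the daily granularity needed by &lt;code&gt;Get Reports&lt;/code&gt;.&lt;/p&gt;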

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;buildTotalsField&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupSumTotals&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV6R0&lt;/code&gt;, with the only difference being the implementation of the &lt;code&gt;$addFields&lt;/code&gt; stage.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: buildTotalsField }&lt;/code&gt;:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;A similar implementation to the one in &lt;code&gt;appV6R0&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The difference lies in extracting the value of year (&lt;code&gt;YYYY&lt;/code&gt;) from the &lt;code&gt;_id&lt;/code&gt; field and the month and day (&lt;code&gt;MMDD&lt;/code&gt;) from the field name&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Get year from _id&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Convert the object to an array of [key, value]&lt;/span&gt;

 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)];&lt;/span&gt;
     &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

     &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;

     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
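
&lt;p&gt;As a worked example, the logic above can be applied to the June 2022 sample document shown earlier. This is a sketch under a few assumptions: the &lt;code&gt;_id&lt;/code&gt; is handled as its string form rather than as binary data, dates are compared in UTC, and &lt;code&gt;buildTotals&lt;/code&gt; is a hypothetical function name.&lt;/p&gt;

```typescript
type Status = { a?: number; n?: number; p?: number; r?: number };
type Totals = { a: number; n: number; p: number; r: number };

// `id` is the key + year + quarter string; the stored `_id` is binary data.
function buildTotals(
  id: string,
  items: { [mmdd: string]: Status },
  reportStartDate: Date,
  reportEndDate: Date
): Totals {
  const YYYY = id.slice(-6, -2); // the year sits right before the two-digit quarter
  const totals: Totals = { a: 0, n: 0, p: 0, r: 0 };

  for (const MMDD of Object.keys(items)) {
    const status = items[MMDD];
    const statusDate = new Date(`${YYYY}-${MMDD.slice(0, 2)}-${MMDD.slice(2, 4)}`);
    const time = statusDate.getTime();

    // Keep only days inside [reportStartDate, reportEndDate).
    if (time >= reportStartDate.getTime()) {
      if (reportEndDate.getTime() > time) {
        totals.a += status.a || 0;
        totals.n += status.n || 0;
        totals.p += status.p || 0;
        totals.r += status.r || 0;
      }
    }
  }

  return totals;
}

const juneTotals = buildTotals(
  "...01202202",
  {
    "0605": { a: 10, n: 3 },
    "0616": { p: 1, r: 1 },
    "0627": { a: 5, r: 1 },
    "0629": { p: 1 },
  },
  new Date("2022-06-10"),
  new Date("2022-06-28")
);
console.log(juneTotals); // { a: 5, n: 0, p: 1, r: 2 }
```

&lt;p&gt;Only the entries for June 16 and June 27 fall inside the requested range, so the other two days do not contribute to the totals.&lt;/p&gt;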

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV6R1&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R0&lt;/td&gt;
&lt;td&gt;95,350,319&lt;/td&gt;
&lt;td&gt;11.1GB&lt;/td&gt;
&lt;td&gt;125B&lt;/td&gt;
&lt;td&gt;3.33GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3.13GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;33,429,366&lt;/td&gt;
&lt;td&gt;8.19GB&lt;/td&gt;
&lt;td&gt;264B&lt;/td&gt;
&lt;td&gt;2.34GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.22GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/events&lt;/th&gt;
&lt;th&gt;Index Size/events&lt;/th&gt;
&lt;th&gt;Total Size/events&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.4B&lt;/td&gt;
&lt;td&gt;28.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R0&lt;/td&gt;
&lt;td&gt;23.8B&lt;/td&gt;
&lt;td&gt;6.7B&lt;/td&gt;
&lt;td&gt;30.5B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;17.6B&lt;/td&gt;
&lt;td&gt;2.6B&lt;/td&gt;
&lt;td&gt;20.2B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
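
&lt;p&gt;As a rough check, the per-event figures above can be reproduced from the &lt;code&gt;Collection Statistics&lt;/code&gt; table. The sketch below is only an illustration: the names &lt;code&gt;perEvent&lt;/code&gt; and &lt;code&gt;eventStatistics&lt;/code&gt; are hypothetical, and it assumes that "GB" means 2^30 bytes, which is the interpretation that matches the published values.&lt;/p&gt;

```typescript
// Assumption: sizes in the Collection Statistics table use GB = 2^30 bytes.
const BYTES_PER_GB = 1024 * 1024 * 1024;
const TOTAL_EVENTS = 500000000; // 500 million events were inserted

type CollectionStats = { name: string; dataSizeGB: number; indexSizeGB: number };

// Bytes per event, rounded to one decimal place as in the table.
function perEvent(sizeGB: number): number {
  return Math.round(((sizeGB * BYTES_PER_GB) / TOTAL_EVENTS) * 10) / 10;
}

function eventStatistics(stats: CollectionStats) {
  const data = perEvent(stats.dataSizeGB);
  const index = perEvent(stats.indexSizeGB);
  // The total is the sum of the two rounded values, rounded again to avoid float noise.
  return { name: stats.name, data, index, total: Math.round((data + index) * 10) / 10 };
}

const appV6R1Stats = eventStatistics({ name: "appV6R1", dataSizeGB: 8.19, indexSizeGB: 1.22 });
console.log(appV6R1Stats); // data: 17.6, index: 2.6, total: 20.2
```

&lt;p&gt;The same function applied to the &lt;code&gt;appV5R3&lt;/code&gt; row (11.96GB of data, 1.11GB of index) gives 25.7B, 2.4B, and 28.1B per event.&lt;/p&gt;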

&lt;p&gt;In the previous &lt;code&gt;Initial Scenario Statistics&lt;/code&gt; analysis, we assumed that document size would scale linearly with the bucketing range. However, this assumption proved inaccurate. The average document size in &lt;code&gt;appV6R1&lt;/code&gt; (264B) is only about twice that of &lt;code&gt;appV6R0&lt;/code&gt; (125B), even though each document covers three times the date range. That is already a win for the new implementation.&lt;/p&gt;

&lt;p&gt;Since &lt;code&gt;appV6R1&lt;/code&gt; buckets data by quarter at the document level and by day within the &lt;code&gt;items&lt;/code&gt; sub-document, a fair comparison would be with &lt;code&gt;appV5R3&lt;/code&gt;, the best-performing version so far. From the tables above, we observe a significant improvement in &lt;code&gt;Document Size&lt;/code&gt;, and consequently &lt;code&gt;Data Size&lt;/code&gt;, when transitioning from &lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV6R1&lt;/code&gt;. Specifically, there was a 31.4% reduction in &lt;code&gt;Document Size&lt;/code&gt; (from 385B to 264B). The index size stayed practically the same (1.11GB versus 1.22GB), as both versions bucket events by quarter.&lt;/p&gt;
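
&lt;p&gt;The size comparisons in this section are simple relative changes. A minimal sketch of the arithmetic, using the average document sizes from the &lt;code&gt;Collection Statistics&lt;/code&gt; table (the helper name &lt;code&gt;percentChange&lt;/code&gt; is hypothetical):&lt;/p&gt;

```typescript
// Relative change between two values as a percentage, rounded to one decimal place.
// Negative results are reductions, positive results are increases.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 1000) / 10;
}

// Average Document Size in bytes, taken from the Collection Statistics table.
const appV5R3DocSize = 385;
const appV6R0DocSize = 125;
const appV6R1DocSize = 264;

console.log(percentChange(appV5R3DocSize, appV6R1DocSize)); // -31.4

// Quarter buckets cover three times the date range of month buckets,
// yet the average document is only about 2.1 times larger.
console.log((appV6R1DocSize / appV6R0DocSize).toFixed(1)); // "2.1"
```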

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV6R1&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV5R3&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;For the first three quarters of the test, both versions have similar rates, but in the final quarter &lt;code&gt;appV6R1&lt;/code&gt; has a notable edge over &lt;code&gt;appV5R3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkij3iotd6g6qcd5fpqpq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkij3iotd6g6qcd5fpqpq.png" alt="Get Reports Rate - appV5R3 vs appV6R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;As in the rates graph, both versions have similar values for the first three quarters, with &lt;code&gt;appV6R1&lt;/code&gt; performing better than &lt;code&gt;appV5R3&lt;/code&gt; in the final quarter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frh0z0hurd8ns9f3oywbx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frh0z0hurd8ns9f3oywbx.png" alt="Get Reports Latency - appV5R3 vs appV6R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar rates throughout the test. &lt;code&gt;appV6R1&lt;/code&gt; achieves better values than &lt;code&gt;appV5R3&lt;/code&gt; in the final 20 minutes, but it still does not reach the desired rate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c8keq38gdozmrydq8ey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c8keq38gdozmrydq8ey.png" alt="Bulk Upsert Rate - appV5R3 vs appV6R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Even though both versions have similar rates, &lt;code&gt;appV6R1&lt;/code&gt; has considerably better latency than &lt;code&gt;appV5R3&lt;/code&gt;, being almost twice as fast for the last three quarters of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzz7zipfs9ukybp87bzk3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzz7zipfs9ukybp87bzk3.png" alt="Bulk Upsert Latency - appV5R3 vs appV6R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Looking at the &lt;code&gt;Get Reports&lt;/code&gt; graphs in the last &lt;code&gt;Load Test Results&lt;/code&gt;, we still can't reach the desired rates for this functionality. One way we could try to improve these operations is by using our well-known old friend, the &lt;code&gt;Computed Pattern&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Applying the &lt;code&gt;Computed Pattern&lt;/code&gt; to the current version would be the same improvement we tried from &lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV5R4&lt;/code&gt;, which made performance worse instead of better. Why would this solution work this time? The only way to know is to try, but before cracking our knuckles and starting the implementation, it's always a good idea to do a sanity check and see if there is at least one good reason to believe that this time things will be different (cough, cough).&lt;/p&gt;

&lt;p&gt;When we applied the &lt;code&gt;Computed Pattern&lt;/code&gt; from &lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV5R4&lt;/code&gt;, we got an 8.2% increase in the document size and a slight degradation in the performance of the &lt;code&gt;Bulk Upsert&lt;/code&gt; functionality, with no performance gains in &lt;code&gt;Get Reports&lt;/code&gt;. From &lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV6R1&lt;/code&gt;, we got a 31.4% reduction in the document size, so it could make sense to trade some of this reduction for storing pre-computed values. Another point is that the &lt;code&gt;Bulk Upsert&lt;/code&gt; functionality in &lt;code&gt;appV6R1&lt;/code&gt; has its best performance so far, so maybe the extra cost of pre-computing the document totals is not as big a deal for this version as it was for &lt;code&gt;appV5R4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;With these two "maybes" and a scientific spirit of always trying to test things to see where they'll break, let's give the &lt;code&gt;Computed Pattern&lt;/code&gt; another chance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 6 Revision 2 (appV6R2): A Dynamic Bucket and Computed Document &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; section, in this revision we'll give the &lt;code&gt;Computed Pattern&lt;/code&gt; another try and pre-compute the status totals for each document. This implementation is practically the same as the one tried in &lt;code&gt;appV5R4&lt;/code&gt;, with the only difference being that we are using a &lt;code&gt;Dynamic Schema&lt;/code&gt; for the &lt;code&gt;items&lt;/code&gt; field instead of an array.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV6R2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV6R2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total approved&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total noFunds&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total pending&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total rejected&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
    &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total approved&lt;/span&gt;
      &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total noFunds&lt;/span&gt;
      &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total pending&lt;/span&gt;
      &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total rejected&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specifications, the following bulk &lt;code&gt;updateOne&lt;/code&gt; operation is used for each &lt;code&gt;event&lt;/code&gt; generated by the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MMDD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getMMDD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Extract the month (MM) and day(DD) from the `event.date`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// key + year + quarter&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totals.a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totals.n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totals.p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totals.r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.a`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.n`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.p`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.r`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; has almost the same logic as the one for &lt;code&gt;appV6R1&lt;/code&gt;, the only difference being that we also increment the &lt;code&gt;totals&lt;/code&gt; fields to pre-compute the quarter totals for the document. From a logic perspective, this operation is equivalent to the &lt;code&gt;Bulk Upsert&lt;/code&gt; of &lt;code&gt;appV5R4&lt;/code&gt;, but from an implementation perspective, it's far easier to write and understand, and from an execution perspective, it's less costly because it has fewer stages and operations. This simplicity may also contribute to better performance of the &lt;code&gt;Computed Pattern&lt;/code&gt; when compared to &lt;code&gt;appV5R4&lt;/code&gt;.&lt;/p&gt;
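
&lt;p&gt;To make the effect of this &lt;code&gt;$inc&lt;/code&gt; update easier to picture, here is a small in-memory sketch that applies events to a plain object in the same way. It does not talk to MongoDB. The &lt;code&gt;getMMDD&lt;/code&gt; body, the &lt;code&gt;applyEvent&lt;/code&gt; and &lt;code&gt;inc&lt;/code&gt; helpers, the event shape, the use of UTC, and the choice to skip undefined status values are all assumptions made for illustration, not the article's actual implementation.&lt;/p&gt;

```typescript
type StatusCounts = { a?: number; n?: number; p?: number; r?: number };
type BucketDoc = { totals: StatusCounts; items: { [mmdd: string]: StatusCounts } };
type AppEvent = { date: Date; approved?: number; noFunds?: number; pending?: number; rejected?: number };

// Hypothetical stand-in for the article's getMMDD helper: returns "MMDD" (UTC assumed).
function getMMDD(date: Date): string {
  const mm = ("0" + (date.getUTCMonth() + 1)).slice(-2);
  const dd = ("0" + date.getUTCDate()).slice(-2);
  return mm + dd;
}

// Mimics $inc on a single field: a missing field is treated as 0.
// Assumption: undefined status values are skipped entirely.
function inc(counts: StatusCounts, field: keyof StatusCounts, by: number | undefined): void {
  if (by === undefined) return;
  counts[field] = (counts[field] || 0) + by;
}

// Increments the quarter totals and the daily entry together, as the update above does.
function applyEvent(doc: BucketDoc, event: AppEvent): BucketDoc {
  const mmdd = getMMDD(event.date);
  const day = doc.items[mmdd] || (doc.items[mmdd] = {});
  for (const target of [doc.totals, day]) {
    inc(target, "a", event.approved);
    inc(target, "n", event.noFunds);
    inc(target, "p", event.pending);
    inc(target, "r", event.rejected);
  }
  return doc;
}

const bucket: BucketDoc = { totals: {}, items: {} };
applyEvent(bucket, { date: new Date("2022-06-25T00:00:00Z"), approved: 1 });
applyEvent(bucket, { date: new Date("2022-06-25T00:00:00Z"), approved: 2, rejected: 1 });
applyEvent(bucket, { date: new Date("2022-06-26T00:00:00Z"), noFunds: 3 });

// totals: a = 3, n = 3, r = 1; items["0625"]: a = 3, r = 1; items["0626"]: n = 3
console.log(JSON.stringify(bucket));
```

&lt;p&gt;Because both targets are incremented by the same values in a single update, &lt;code&gt;totals&lt;/code&gt; always equals the sum of the daily entries in &lt;code&gt;items&lt;/code&gt;.&lt;/p&gt;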

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;buildTotalsField&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupSumTotals&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has logic similar to the one in &lt;code&gt;appV5R4&lt;/code&gt;, because of the pre-computed &lt;code&gt;totals&lt;/code&gt; field, and to the one in &lt;code&gt;appV6R1&lt;/code&gt;, because the &lt;code&gt;items&lt;/code&gt; field is a document. The difference from &lt;code&gt;appV6R1&lt;/code&gt; lies only in the &lt;code&gt;$addFields&lt;/code&gt; stage. The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: buildTotalsField }&lt;/code&gt;:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;A similar implementation to the one in &lt;code&gt;appV6R1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The main difference is if the quarter’s date range is within the limits of the report’s date range, we can use the pre-computed &lt;code&gt;totals&lt;/code&gt; instead of calculating the value through a &lt;code&gt;$reduce&lt;/code&gt; operation.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;documentQuarterWithinReportDateRange&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c1"&gt;// Use pre-computed quarterly totals&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c1"&gt;// Fall back to item-level aggregation&lt;/span&gt;
   &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;YYYY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Get year from _id&lt;/span&gt;
   &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Convert the object to an array of [key, value]&lt;/span&gt;

   &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items_array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;MMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)];&lt;/span&gt;
       &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

       &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;

       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;);&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
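
&lt;p&gt;The condition &lt;code&gt;documentQuarterWithinReportDateRange&lt;/code&gt; in the pseudocode above is left abstract. One possible way to evaluate it is sketched below, assuming calendar quarters in UTC and a report range whose end date is exclusive, as the day-level comparison in the pseudocode suggests. The function names and parameters are hypothetical.&lt;/p&gt;

```typescript
// Bounds of a calendar quarter as [start, end) in UTC; `quarter` is 1 to 4.
function quarterBounds(year: number, quarter: number): { start: Date; end: Date } {
  const firstMonth = (quarter - 1) * 3; // 0-based month index
  return {
    start: new Date(Date.UTC(year, firstMonth, 1)),
    // First day of the next quarter; month 12 rolls over to January of the next year.
    end: new Date(Date.UTC(year, firstMonth + 3, 1)),
  };
}

// True when the whole quarter lies inside the report range [reportStart, reportEnd).
function quarterWithinReportRange(
  year: number,
  quarter: number,
  reportStart: Date,
  reportEnd: Date
): boolean {
  const bounds = quarterBounds(year, quarter);
  // The report starts after the quarter does, so the pre-computed totals would overcount.
  if (reportStart.getTime() > bounds.start.getTime()) return false;
  return reportEnd.getTime() >= bounds.end.getTime();
}

const reportStart = new Date("2021-06-15T00:00:00Z");
const reportEnd = new Date("2022-06-15T00:00:00Z");

console.log(quarterWithinReportRange(2022, 1, reportStart, reportEnd)); // true
console.log(quarterWithinReportRange(2022, 2, reportStart, reportEnd)); // false
console.log(quarterWithinReportRange(2021, 2, reportStart, reportEnd)); // false
```

&lt;p&gt;In this example, only the documents for the first and last quarters of the report range need the item-level &lt;code&gt;$reduce&lt;/code&gt;; the fully covered quarters in between can use &lt;code&gt;totals&lt;/code&gt; directly.&lt;/p&gt;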

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV6R2&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;33,429,366&lt;/td&gt;
&lt;td&gt;8.19GB&lt;/td&gt;
&lt;td&gt;264B&lt;/td&gt;
&lt;td&gt;2.34GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.22GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R2&lt;/td&gt;
&lt;td&gt;33,429,207&lt;/td&gt;
&lt;td&gt;9.11GB&lt;/td&gt;
&lt;td&gt;293B&lt;/td&gt;
&lt;td&gt;2.8GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.26GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/events&lt;/th&gt;
&lt;th&gt;Index Size/events&lt;/th&gt;
&lt;th&gt;Total Size/events&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.4B&lt;/td&gt;
&lt;td&gt;28.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;17.6B&lt;/td&gt;
&lt;td&gt;2.6B&lt;/td&gt;
&lt;td&gt;20.2B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R2&lt;/td&gt;
&lt;td&gt;19.6B&lt;/td&gt;
&lt;td&gt;2.7B&lt;/td&gt;
&lt;td&gt;22.3B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
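&lt;p&gt;The per-event figures can be reproduced with a few lines of TypeScript. This is only a sketch: it assumes the sizes are reported in binary gigabytes (1 GB = 1,073,741,824 bytes), which is the reading that matches the rounded values in the table, and &lt;code&gt;bytesPerEvent&lt;/code&gt; and &lt;code&gt;round1&lt;/code&gt; are hypothetical helper names, not part of the application code.&lt;/p&gt;

```typescript
// Assumption: table sizes are binary gigabytes (1 GB = 1024 * 1024 * 1024 bytes).
const BYTES_PER_GB = 1024 * 1024 * 1024;
const TOTAL_EVENTS = 500000000;

// Converts a collection-level size in GB to bytes per event.
function bytesPerEvent(sizeInGB: number): number {
  return (sizeInGB * BYTES_PER_GB) / TOTAL_EVENTS;
}

// Rounds to one decimal place, as in the Event Statistics table.
function round1(value: number): number {
  return Math.round(value * 10) / 10;
}

const dataPerEvent = bytesPerEvent(9.11); // appV6R2 Data Size
const indexPerEvent = bytesPerEvent(1.26); // appV6R2 Index Size

console.log(round1(dataPerEvent)); // 19.6
console.log(round1(indexPerEvent)); // 2.7
console.log(round1(dataPerEvent + indexPerEvent)); // 22.3
```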

&lt;p&gt;As expected, adding the &lt;code&gt;totals&lt;/code&gt; field to each document of &lt;code&gt;appV6R2&lt;/code&gt; gave us an 11.2% increase in the &lt;code&gt;Document Size&lt;/code&gt;. Compared to &lt;code&gt;appV5R3&lt;/code&gt;, we still have a reduction of 23.9% in the &lt;code&gt;Document Size&lt;/code&gt;. Let's go to the &lt;code&gt;Load Test Results&lt;/code&gt; and see whether the trade-off between storage and computation cost is worth it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV6R2&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV6R1&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;We can clearly see that &lt;code&gt;appV6R2&lt;/code&gt; has better rates than &lt;code&gt;appV6R1&lt;/code&gt; throughout the test, although it still doesn't reach the top rate of 250 reports per second.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e35o9swquhoybka0k6s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e35o9swquhoybka0k6s.png" alt="Get Reports Rate - appV6R1 vs appV6R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;As in the rates graph, &lt;code&gt;appV6R2&lt;/code&gt; provides lower latency than &lt;code&gt;appV6R1&lt;/code&gt; throughout the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb83f9227b47elhqf5k3r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb83f9227b47elhqf5k3r.png" alt="Get Reports Latency - appV6R1 vs appV6R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar rate values throughout the test, with &lt;code&gt;appV6R2&lt;/code&gt; being slightly better than &lt;code&gt;appV6R1&lt;/code&gt; in the final 20 minutes, but still unable to reach the desired rate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxzmtn8tjfojzkcf8m34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxzmtn8tjfojzkcf8m34.png" alt="Bulk Upsert Rate - appV6R1 vs appV6R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Even though &lt;code&gt;appV6R2&lt;/code&gt; had better rate values than &lt;code&gt;appV6R1&lt;/code&gt;, their latency doesn't point to a clear winner: &lt;code&gt;appV6R2&lt;/code&gt; is better in the first and final quarters, and &lt;code&gt;appV6R1&lt;/code&gt; is better in the second and third quarters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu76mz7n3m2cduj8g4zfv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu76mz7n3m2cduj8g4zfv.png" alt="Bulk Upsert Latency - appV6R1 vs appV6R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Summary&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The two "maybes" from the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; made up for their promises, and we got the best performance for &lt;code&gt;appV6R2&lt;/code&gt; when comparing to &lt;code&gt;appV6R1&lt;/code&gt;. This is the redemption of the &lt;code&gt;Computed Pattern&lt;/code&gt; applied on a document level. This revision is one of my favorites because it shows that the same optimization on very similar applications can lead to different results. In our case, the difference was caused by the application being very bottlenecked by the disk throughput.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Let's tackle the last improvement on an application level. Those paying close attention through the application versions may have already questioned it. Every &lt;code&gt;Get Reports&lt;/code&gt; section so far has stated: "To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval". Do we really need to run five aggregation pipelines to generate the &lt;code&gt;reports&lt;/code&gt; document? Isn't there a way to calculate everything in just one operation? The answer is "Yes", there is.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;reports&lt;/code&gt; document is composed of the fields &lt;code&gt;oneYear&lt;/code&gt;, &lt;code&gt;threeYears&lt;/code&gt;, &lt;code&gt;fiveYears&lt;/code&gt;, &lt;code&gt;sevenYears&lt;/code&gt;, and &lt;code&gt;tenYears&lt;/code&gt;, where, until now, each one was generated by its own aggregation pipeline. Generating the &lt;code&gt;reports&lt;/code&gt; this way is a waste of processing power because we repeat part of the calculation several times. For example, to calculate the status totals for &lt;code&gt;tenYears&lt;/code&gt;, we also have to calculate the status totals for the other fields because, from a date range perspective, they are all contained in the &lt;code&gt;tenYears&lt;/code&gt; date range.&lt;/p&gt;

&lt;p&gt;So, for our next application revision, we'll condense the five aggregation pipelines of &lt;code&gt;Get Reports&lt;/code&gt; into one, so that no processing power is wasted on repeated calculations.&lt;/p&gt;
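&lt;p&gt;To picture the idea outside of MongoDB, here is a minimal in-memory sketch of a single pass that fills all report totals at once. It is not the aggregation pipeline itself: the types &lt;code&gt;DailyItem&lt;/code&gt; and &lt;code&gt;ReportWindow&lt;/code&gt; and the functions &lt;code&gt;addStatus&lt;/code&gt; and &lt;code&gt;buildReports&lt;/code&gt; are hypothetical names, and the sketch assumes that every window shares the same end date, with an inclusive start and an exclusive end.&lt;/p&gt;

```typescript
// Hypothetical shapes used only for this sketch.
type Totals = { a: number; n: number; p: number; r: number };
type DailyItem = { date: Date; a?: number; n?: number; p?: number; r?: number };
type ReportWindow = { name: string; start: Date };

// Adds the status counters of one item to a running total.
function addStatus(totals: Totals, item: DailyItem): void {
  totals.a += item.a || 0;
  totals.n += item.n || 0;
  totals.p += item.p || 0;
  totals.r += item.r || 0;
}

// Visits every item once and updates each window that contains it.
function buildReports(
  items: DailyItem[],
  windows: ReportWindow[],
  end: Date
): { [name: string]: Totals } {
  const reports: { [name: string]: Totals } = {};
  for (const w of windows) {
    reports[w.name] = { a: 0, n: 0, p: 0, r: 0 };
  }
  for (const item of items) {
    const time = item.date.getTime();
    if (time >= end.getTime()) continue; // on or after the report date
    for (const w of windows) {
      // An item inside a narrow window is also inside every wider one.
      if (time >= w.start.getTime()) {
        addStatus(reports[w.name], item);
      }
    }
  }
  return reports;
}

const day = (iso: string): Date => new Date(iso + "T00:00:00Z");
const reports = buildReports(
  [
    { date: day("2021-06-01"), a: 1 },
    { date: day("2013-06-01"), r: 4 },
  ],
  [
    { name: "oneYear", start: day("2021-01-01") },
    { name: "tenYears", start: day("2012-01-01") },
  ],
  day("2022-01-01")
);
console.log(reports.oneYear.a, reports.tenYears.a, reports.tenYears.r); // 1 1 4
```

&lt;p&gt;Each item is read a single time, regardless of how many report windows exist, which is the saving the next revision is after.&lt;/p&gt;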

&lt;h2&gt;
  
  
  Application Version 6 Revision 3 (appV6R3): Getting Everything at Once &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; section, in this revision, we'll improve the performance of our application by changing the &lt;code&gt;Get Reports&lt;/code&gt; functionality to generate the &lt;code&gt;reports&lt;/code&gt; document using only one aggregation pipeline instead of five.&lt;/p&gt;

&lt;p&gt;The rationale behind this improvement is that when we generate the &lt;code&gt;tenYears&lt;/code&gt; totals, we have also calculated the other totals, &lt;code&gt;oneYear&lt;/code&gt;, &lt;code&gt;threeYears&lt;/code&gt;, &lt;code&gt;fiveYears&lt;/code&gt;, and &lt;code&gt;sevenYears&lt;/code&gt;. As an example, when we make a request to &lt;code&gt;Get Reports&lt;/code&gt; for the &lt;code&gt;key&lt;/code&gt; &lt;code&gt;...0001&lt;/code&gt; and the &lt;code&gt;date&lt;/code&gt; &lt;code&gt;2022-01-01&lt;/code&gt;, the totals will be calculated with the following date ranges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;oneYear&lt;/code&gt;: From &lt;code&gt;2021-01-01&lt;/code&gt; to &lt;code&gt;2022-01-01&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;threeYears&lt;/code&gt;: From &lt;code&gt;2019-01-01&lt;/code&gt; to &lt;code&gt;2022-01-01&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fiveYears&lt;/code&gt;: From &lt;code&gt;2017-01-01&lt;/code&gt; to &lt;code&gt;2022-01-01&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sevenYears&lt;/code&gt;: From &lt;code&gt;2015-01-01&lt;/code&gt; to &lt;code&gt;2022-01-01&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tenYears&lt;/code&gt;: From &lt;code&gt;2012-01-01&lt;/code&gt; to &lt;code&gt;2022-01-01&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As we can see from the list above, the date range for &lt;code&gt;tenYears&lt;/code&gt; includes all the other date ranges.&lt;/p&gt;
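&lt;p&gt;The nesting of the ranges follows directly from how they are built: every range ends at the requested date and only the start moves further back. The sketch below derives the ranges from a report date. The names &lt;code&gt;REPORT_WINDOWS&lt;/code&gt;, &lt;code&gt;toIsoDay&lt;/code&gt;, and &lt;code&gt;buildDateRanges&lt;/code&gt; are hypothetical, and dates are assumed to be handled in UTC.&lt;/p&gt;

```typescript
type DateRange = { name: string; start: string; end: string };

// Report windows ordered from narrowest to widest.
const REPORT_WINDOWS = [
  { name: "oneYear", years: 1 },
  { name: "threeYears", years: 3 },
  { name: "fiveYears", years: 5 },
  { name: "sevenYears", years: 7 },
  { name: "tenYears", years: 10 },
];

// Formats a Date as YYYY-MM-DD using its UTC calendar fields.
function toIsoDay(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Builds one range per report window, all ending at the report date.
// Note: a report date of February 29 rolls over to March 1 in non-leap years.
function buildDateRanges(reportDate: Date): DateRange[] {
  return REPORT_WINDOWS.map((w) => {
    const start = new Date(reportDate.getTime());
    start.setUTCFullYear(start.getUTCFullYear() - w.years);
    return { name: w.name, start: toIsoDay(start), end: toIsoDay(reportDate) };
  });
}

const ranges = buildDateRanges(new Date("2022-01-01T00:00:00Z"));
for (const range of ranges) {
  console.log(range.name + ": " + range.start + " to " + range.end);
}
```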

&lt;p&gt;Although we have successfully implemented the &lt;code&gt;Computed Pattern&lt;/code&gt; in the previous revision, &lt;code&gt;appV6R2&lt;/code&gt;, and got better results than &lt;code&gt;appV6R1&lt;/code&gt;, we won't be using it as a base for this revision. There were two reasons for that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Based on the results of our previous implementation of the &lt;code&gt;Computed Pattern&lt;/code&gt; on a document level, from &lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV5R4&lt;/code&gt;, I didn't expect it to get better results.&lt;/li&gt;
&lt;li&gt;Implementing &lt;code&gt;Get Reports&lt;/code&gt; so that it builds the &lt;code&gt;reports&lt;/code&gt; document through just one aggregation pipeline while also using the pre-computed field &lt;code&gt;totals&lt;/code&gt; generated by the &lt;code&gt;Computed Pattern&lt;/code&gt; would require a lot of work, and by the time I reached the last versions of this series, I just wanted to finish it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, this revision will be built on top of &lt;code&gt;appV6R1&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The implementation described above uses the following TypeScript document schema, named &lt;code&gt;SchemaV6R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV6R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
    &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specifications, the following bulk &lt;code&gt;updateOne&lt;/code&gt; operation is used for each &lt;code&gt;event&lt;/code&gt; generated by the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getYYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Extract the year(YYYY), month(MM), and day(DD) from the `event.date`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// key + year + quarter&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.a`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.n`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.p`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.r`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; has almost exactly the same logic as the one for &lt;code&gt;appV6R1&lt;/code&gt;. The difference is that the names of the fields in the &lt;code&gt;items&lt;/code&gt; document are built from the year, month, and day (&lt;code&gt;YYYYMMDD&lt;/code&gt;) instead of just the month and day (&lt;code&gt;MMDD&lt;/code&gt;). This change was made to reduce the complexity of the aggregation pipeline of the &lt;code&gt;Get Reports&lt;/code&gt;.&lt;/p&gt;
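&lt;p&gt;The article does not show the body of &lt;code&gt;getYYYYMMDD&lt;/code&gt;, so the following is only one possible sketch of it, together with a hypothetical &lt;code&gt;buildIncFields&lt;/code&gt; helper that produces the dotted field paths used in &lt;code&gt;$inc&lt;/code&gt;. It assumes dates are read in UTC, that years have four digits, and, as a choice of this sketch, it leaves out statuses the event does not carry. The &lt;code&gt;StatusEvent&lt;/code&gt; type is also an assumption.&lt;/p&gt;

```typescript
// Hypothetical event shape; only the fields used by the update are listed.
type StatusEvent = {
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

// Left-pads a month or day number to two digits.
function pad2(value: number): string {
  return value > 9 ? String(value) : "0" + value;
}

// Formats the UTC calendar date as "YYYYMMDD" (assumes a four-digit year).
function getYYYYMMDD(date: Date): string {
  return (
    String(date.getUTCFullYear()) +
    pad2(date.getUTCMonth() + 1) +
    pad2(date.getUTCDate())
  );
}

// Builds the $inc document with one dotted path per status present in the event.
function buildIncFields(event: StatusEvent): { [path: string]: number } {
  const prefix = "items." + getYYYYMMDD(event.date);
  const inc: { [path: string]: number } = {};
  if (event.approved !== undefined) inc[prefix + ".a"] = event.approved;
  if (event.noFunds !== undefined) inc[prefix + ".n"] = event.noFunds;
  if (event.pending !== undefined) inc[prefix + ".p"] = event.pending;
  if (event.rejected !== undefined) inc[prefix + ".r"] = event.rejected;
  return inc;
}

const example = buildIncFields({
  date: new Date("2022-06-15T10:30:00Z"),
  approved: 1,
});
console.log(JSON.stringify(example)); // {"items.20220615.a":1}
```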

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, only one aggregation pipeline is required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;buildTotalsField&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupCountTotals&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;format&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV6R1&lt;/code&gt;, with the only difference being the implementation of the &lt;code&gt;$addFields&lt;/code&gt; stage.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: buildTotalsField }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;It has a similar logic to the previous revision, where we first convert the &lt;code&gt;items&lt;/code&gt; document into an array using &lt;code&gt;$objectToArray&lt;/code&gt; and then use the &lt;code&gt;reduce&lt;/code&gt; function to iterate over the array, accumulating the status totals.&lt;/li&gt;
&lt;li&gt;The difference lies in the initial value and the logic of the &lt;code&gt;reduce&lt;/code&gt; function.&lt;/li&gt;
&lt;li&gt;The initial value in this case is an object/document with one field for each of the report date ranges. Each of these fields is also an object/document whose fields are the possible statuses, all set to zero, as this is the initial value.&lt;/li&gt;
&lt;li&gt;The logic in this case checks which date range the item falls in and, based on that, increments the totals. If the item &lt;code&gt;isInOneYearDateRange(...)&lt;/code&gt;, it is also in all the other date ranges: three, five, seven, and ten years. If the item &lt;code&gt;isInThreeYearsDateRange(...)&lt;/code&gt;, it is also in all the wider date ranges: five, seven, and ten years.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code. Senior developers could argue that this implementation could be less verbose or more optimized, but due to how MongoDB aggregation pipeline operators are specified, this is how it was implemented.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;itemsArray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Convert the object to an array of [key, value]&lt;/span&gt;

 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;itemsArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;YYYY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Get year&lt;/span&gt;
     &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Get month&lt;/span&gt;
     &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Get day&lt;/span&gt;
     &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;statusDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;DD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

     &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isInOneYearDateRange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;oneYear&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;threeYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;threeYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isInThreeYearsDateRange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;threeYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;threeYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isInFiveYearsDateRange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fiveYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isInSevenYearsDateRange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sevenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isInTenYearsDateRange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusDate&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;incrementTotals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;

     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="na"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="na"&gt;threeYears&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="na"&gt;fiveYears&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="na"&gt;sevenYears&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="na"&gt;tenYears&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV6R3&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;33,429,366&lt;/td&gt;
&lt;td&gt;8.19GB&lt;/td&gt;
&lt;td&gt;264B&lt;/td&gt;
&lt;td&gt;2.34GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.22GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R2&lt;/td&gt;
&lt;td&gt;33,429,207&lt;/td&gt;
&lt;td&gt;9.11GB&lt;/td&gt;
&lt;td&gt;293B&lt;/td&gt;
&lt;td&gt;2.8GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.26GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R3&lt;/td&gt;
&lt;td&gt;33,429,694&lt;/td&gt;
&lt;td&gt;9.53GB&lt;/td&gt;
&lt;td&gt;307B&lt;/td&gt;
&lt;td&gt;2.56GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.19GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/events&lt;/th&gt;
&lt;th&gt;Index Size/events&lt;/th&gt;
&lt;th&gt;Total Size/events&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;17.6B&lt;/td&gt;
&lt;td&gt;2.6B&lt;/td&gt;
&lt;td&gt;20.2B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R2&lt;/td&gt;
&lt;td&gt;19.6B&lt;/td&gt;
&lt;td&gt;2.7B&lt;/td&gt;
&lt;td&gt;22.3B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R3&lt;/td&gt;
&lt;td&gt;20.5B&lt;/td&gt;
&lt;td&gt;2.6B&lt;/td&gt;
&lt;td&gt;23.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
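
&lt;p&gt;The per-event figures above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper name &lt;code&gt;bytesPerEvent&lt;/code&gt; is hypothetical, and it assumes that the sizes reported as GB are binary gigabytes (2^30 bytes), an assumption that happens to reproduce the rounded values in the table.&lt;/p&gt;

```typescript
// Hypothetical helper, not part of the application code.
// Assumption: "GB" in the statistics tables means 2^30 bytes.
const BYTES_PER_GB = 1024 ** 3;
const TOTAL_EVENTS = 500_000_000;

// Returns the bytes used per event, rounded to one decimal place.
function bytesPerEvent(sizeInGB: number): string {
  return ((sizeInGB * BYTES_PER_GB) / TOTAL_EVENTS).toFixed(1);
}

// appV6R3: Data Size of 9.53GB and Index Size of 1.19GB.
console.log(`Data Size/events: ${bytesPerEvent(9.53)}B`);
console.log(`Index Size/events: ${bytesPerEvent(1.19)}B`);
```

&lt;p&gt;With the &lt;code&gt;appV6R3&lt;/code&gt; values, this gives 20.5B and 2.6B per event, matching the table.&lt;/p&gt;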

&lt;p&gt;Because we are adding the year (&lt;code&gt;YYYY&lt;/code&gt;) information to the name of each &lt;code&gt;items&lt;/code&gt; document field, the average document size grew by 16.3% compared to &lt;code&gt;appV6R1&lt;/code&gt; (307B versus 264B) and by 4.8% compared to &lt;code&gt;appV6R2&lt;/code&gt; (307B versus 293B). This increase may be compensated by the gains in the &lt;code&gt;Get Reports&lt;/code&gt; function, as we saw when going from &lt;code&gt;appV6R1&lt;/code&gt; to &lt;code&gt;appV6R2&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV6R3&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV6R2&lt;/code&gt;, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;We have a huge improvement here when going from &lt;code&gt;appV6R2&lt;/code&gt; to &lt;code&gt;appV6R3&lt;/code&gt;. For the first time, the application was able to reach all the desired rates in one phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8jp4dgray9ferlmv2525.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8jp4dgray9ferlmv2525.png" alt="Get Reports Rate - appV6R2 vs appV6R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The latency also got huge improvements, with the peak value being reduced by 71% in the first phase, 67% in the second phase, 47% in the third phase, and 30% in the fourth phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2nqakew2a5rggwo1fzmw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2nqakew2a5rggwo1fzmw.png" alt="Get Reports Latency - appV6R2 vs appV6R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;As had happened in the previous version, the application was able to reach all the desired rates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuf7s1qxzjuyn0qe7e1qn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuf7s1qxzjuyn0qe7e1qn.png" alt="Bulk Upsert Rate - appV6R2 vs appV6R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here we have one of the biggest gains of this series: the latency went from being measured in seconds to being measured in milliseconds. We went from a peak of 1.8 seconds to 250ms in the first phase, from 2.3 seconds to 400ms in the second phase, from 2 seconds to 600ms in the third phase, and from 2.2 seconds to 800ms in the fourth phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffw2lkeogwm7lcku8mwdy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffw2lkeogwm7lcku8mwdy.png" alt="Bulk Upsert Latency - appV6R2 vs appV6R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;The main bottleneck in our MongoDB server is still the disk throughput. As noted in the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; section, this was the last improvement at the application level, so how can we extract more from our current hardware?&lt;/p&gt;

&lt;p&gt;If we take a closer look at the &lt;a href="https://www.mongodb.com/docs/manual/core/wiredtiger/#compression" rel="noopener noreferrer"&gt;MongoDB documentation&lt;/a&gt;, we'll find out that by default it uses block compression with the &lt;code&gt;snappy&lt;/code&gt; compression library for all collections. Before the data is written to disk, it'll be compressed using the &lt;code&gt;snappy&lt;/code&gt; library to reduce its size and speed up the writing process.&lt;/p&gt;

&lt;p&gt;Would it be possible to use a different and more effective compression library to reduce the size of the data even further and, as a consequence, reduce the load on the server's disk? Yes, it is, and in the next application revision, we will use the &lt;code&gt;zstd&lt;/code&gt; compression library instead of the default &lt;code&gt;snappy&lt;/code&gt; compression library.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 6 Revision 4 (appV6R4): The &lt;code&gt;zstd&lt;/code&gt; Compression Algorithm &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the previous &lt;code&gt;Issues and Improvements&lt;/code&gt; section, the performance gains of this version come from changing the algorithm of the &lt;a href="https://www.mongodb.com/docs/manual/reference/configuration-options/#mongodb-setting-storage.wiredTiger.collectionConfig.blockCompressor" rel="noopener noreferrer"&gt;collection block compressor&lt;/a&gt;. By default, MongoDB uses &lt;a href="https://www.mongodb.com/docs/manual/reference/glossary/#std-term-snappy" rel="noopener noreferrer"&gt;&lt;code&gt;snappy&lt;/code&gt;&lt;/a&gt;, which we will change to &lt;a href="https://www.mongodb.com/docs/manual/reference/glossary/#std-term-zstd" rel="noopener noreferrer"&gt;&lt;code&gt;zstd&lt;/code&gt;&lt;/a&gt; to get a better compression ratio at the expense of more CPU usage.&lt;/p&gt;

&lt;p&gt;All the schemas, functions, and code in this version are exactly the same as in &lt;code&gt;appV6R3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To create a collection that uses the &lt;code&gt;zstd&lt;/code&gt; compression algorithm, the following command can be used.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createCollection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;collection-name&amp;gt;&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;storageEngine&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;wiredTiger&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;configString&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;block_compressor=zstd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
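
&lt;p&gt;One way to confirm that a collection was really created with the intended compressor is to look at the WiredTiger creation string reported in the collection statistics, which, as far as I know, can be read in &lt;code&gt;mongosh&lt;/code&gt; with &lt;code&gt;db.getCollection("&amp;lt;collection-name&amp;gt;").stats().wiredTiger.creationString&lt;/code&gt;. The sketch below only parses such a string. The helper name &lt;code&gt;getBlockCompressor&lt;/code&gt; and the sample value are hypothetical and shown for illustration.&lt;/p&gt;

```typescript
// Hypothetical helper: extracts the block compressor from a WiredTiger
// creation string, which is a comma-separated list of key=value options.
function getBlockCompressor(creationString: string): string | undefined {
  for (const option of creationString.split(",")) {
    const [key, value] = option.split("=");
    if (key.trim() === "block_compressor") {
      return value;
    }
  }
  return undefined;
}

// Shortened, made-up sample of a creation string, for illustration only.
const sample = "access_pattern_hint=none,block_compressor=zstd,cache_resident=false";
console.log(getBlockCompressor(sample)); // zstd
```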



&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV6R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV6R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
    &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specifications, the following bulk &lt;code&gt;updateOne&lt;/code&gt; operation is used for each &lt;code&gt;event&lt;/code&gt; generated by the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getYYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Extract the year(YYYY), month(MM), and day(DD) from the `event.date`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// key + year + quarter&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.a`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.n`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.p`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`items.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;YYYYMMDD&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.r`&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; has exactly the same logic as the one for &lt;code&gt;appV6R3&lt;/code&gt;.&lt;/p&gt;
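
&lt;p&gt;To make the shape of the dynamic field names more concrete, here is a minimal sketch of how a helper like &lt;code&gt;getYYYYMMDD&lt;/code&gt; could be written. It is not necessarily the implementation used by the application, and it assumes that dates are interpreted in UTC.&lt;/p&gt;

```typescript
// Hypothetical sketch of the helper used above; the real implementation may differ.
// Assumption: the event date is interpreted in UTC.
function getYYYYMMDD(date: Date): string {
  const yyyy = String(date.getUTCFullYear());
  const mm = ("0" + (date.getUTCMonth() + 1)).slice(-2); // months are zero-based
  const dd = ("0" + date.getUTCDate()).slice(-2);
  return yyyy + mm + dd;
}

// Field path incremented for an approved event on June 25, 2022.
const fieldPath = `items.${getYYYYMMDD(new Date(Date.UTC(2022, 5, 25)))}.a`;
console.log(fieldPath); // items.20220625.a
```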

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on what was presented in the &lt;code&gt;Introduction&lt;/code&gt;, we have the following aggregation pipeline to generate the &lt;code&gt;reports&lt;/code&gt; document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;buildTotalsField&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupCountTotals&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;format&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;pipeline&lt;/code&gt; has exactly the same logic as the one for &lt;code&gt;appV6R3&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV6R4&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV6R3&lt;/td&gt;
&lt;td&gt;33,429,694&lt;/td&gt;
&lt;td&gt;9.53GB&lt;/td&gt;
&lt;td&gt;307B&lt;/td&gt;
&lt;td&gt;2.56GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.19GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R4&lt;/td&gt;
&lt;td&gt;33,429,372&lt;/td&gt;
&lt;td&gt;9.53GB&lt;/td&gt;
&lt;td&gt;307B&lt;/td&gt;
&lt;td&gt;1.47GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.34GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

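&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Storage Size and Index Size by the 500 million events. Storage Size is used here instead of Data Size because this revision only changes how the data is compressed on disk.&lt;/p&gt;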

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Storage Size/events&lt;/th&gt;
&lt;th&gt;Index Size/events&lt;/th&gt;
&lt;th&gt;Total Storage Size/events&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV6R3&lt;/td&gt;
&lt;td&gt;5.5B&lt;/td&gt;
&lt;td&gt;2.6B&lt;/td&gt;
&lt;td&gt;8.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R4&lt;/td&gt;
&lt;td&gt;3.2B&lt;/td&gt;
&lt;td&gt;2.8B&lt;/td&gt;
&lt;td&gt;6.0B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As the application implementation of &lt;code&gt;appV6R4&lt;/code&gt; is the same as &lt;code&gt;appV6R3&lt;/code&gt;, the values for &lt;code&gt;Data Size&lt;/code&gt; and &lt;code&gt;Document Size&lt;/code&gt; are the same, and the &lt;code&gt;Index Size&lt;/code&gt; is close (1.34GB versus 1.19GB). The main difference lies in &lt;code&gt;Storage Size&lt;/code&gt;, which represents the &lt;code&gt;Data Size&lt;/code&gt; after compression. Going from &lt;code&gt;snappy&lt;/code&gt; to &lt;code&gt;zstd&lt;/code&gt; decreased the &lt;code&gt;Storage Size&lt;/code&gt; by a jaw-dropping 43%. Looking at the &lt;code&gt;Event Statistics&lt;/code&gt;, there was a 26% reduction in the storage required to register each event, going from 8.1 bytes to 6 bytes. These considerable reductions in size will probably translate to better performance in this version, as our main bottleneck is disk throughput.&lt;/p&gt;
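
&lt;p&gt;The two percentages quoted above follow directly from the tables. As a quick check, the sketch below recomputes them with a hypothetical &lt;code&gt;percentReduction&lt;/code&gt; helper, using the rounded values from the tables as input.&lt;/p&gt;

```typescript
// Hypothetical helper: relative reduction between two values, as a whole percent.
function percentReduction(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// Storage Size in GB, appV6R3 (snappy) versus appV6R4 (zstd).
console.log(percentReduction(2.56, 1.47)); // 43

// Total storage per event in bytes, appV6R3 versus appV6R4.
console.log(percentReduction(8.1, 6.0)); // 26
```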

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV6R4&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV6R3&lt;/code&gt;, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Even though we weren't able to reach all the desired rates, we got another huge improvement when going from &lt;code&gt;appV6R3&lt;/code&gt; to &lt;code&gt;appV6R4&lt;/code&gt;. We could almost consider that, in this revision, we reached the desired rates in the first, second, and third phases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx75mma5ipa1y1gbbijhl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx75mma5ipa1y1gbbijhl.png" alt="Get Reports Rate - appV6R3 vs appV6R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The latency also got huge improvements, with the peak value being reduced by 30% in the first phase, 57% in the second phase, 61% in the third phase, and 57% in the fourth phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkn19vjdqmhg8p4dectu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwkn19vjdqmhg8p4dectu.png" alt="Get Reports Latency - appV6R3 vs appV6R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;As had happened in the previous version, the application was able to reach all the desired rates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkn0ocwo4dlwzkyofydpt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkn0ocwo4dlwzkyofydpt.png" alt="Bulk Upsert Rate - appV6R3 vs appV6R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here we also got considerable improvements, with the peak value being reduced by 48% in the first phase, 39% in the second phase, 43% in the third phase, and 47% in the fourth phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo55x9lup0nlvp8fei0r2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo55x9lup0nlvp8fei0r2.png" alt="Bulk Upsert Latency - appV6R3 vs appV6R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Although this is the last version and revision of the series, there is still room for improvement. For those willing to try them on their own, here are the ones I was able to think of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the &lt;code&gt;Computed Pattern&lt;/code&gt; in the &lt;code&gt;appV6R4&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Optimize the aggregation pipeline logic for &lt;code&gt;Get Reports&lt;/code&gt; in the &lt;code&gt;appV6R4&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Change the &lt;a href="https://www.mongodb.com/docs/manual/reference/configuration-options/#mongodb-setting-storage.wiredTiger.engineConfig.zstdCompressionLevel" rel="noopener noreferrer"&gt;&lt;code&gt;zstd&lt;/code&gt; compression level&lt;/a&gt; from its default value 6 to a higher value.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This final part of "The Cost of Not Knowing MongoDB" series has explored the ultimate evolution of MongoDB application optimization, demonstrating how revolutionary design patterns and infrastructure-level improvements can transcend traditional performance boundaries. The journey through &lt;code&gt;appV6R0&lt;/code&gt; to &lt;code&gt;appV6R4&lt;/code&gt; represents the culmination of sophisticated MongoDB development practices, achieving performance levels that seemed impossible with the baseline &lt;code&gt;appV1&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Series Transformation Summary
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;From Foundation to Revolution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The complete series showcases a remarkable transformation across three distinct optimization phases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1 (&lt;code&gt;appV1&lt;/code&gt;-&lt;code&gt;appV4&lt;/code&gt;)&lt;/strong&gt;: Document-level optimizations achieving 51% storage reduction through schema refinement, data type optimization, and strategic indexing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2 (&lt;code&gt;appV5R0&lt;/code&gt;-&lt;code&gt;appV5R4&lt;/code&gt;)&lt;/strong&gt;: Advanced pattern implementation with Bucket and Computed patterns, delivering 89% index size reduction and first-time achievement of target rates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3 (&lt;code&gt;appV6R0&lt;/code&gt;-&lt;code&gt;appV6R4&lt;/code&gt;)&lt;/strong&gt;: Revolutionary Dynamic Schema Pattern with infrastructure optimization, culminating in sub-second latencies and comprehensive target rate achievement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Performance Evolution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The progression reveals order-of-magnitude improvements across all metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Get Reports Latency&lt;/strong&gt;: From 6.5 seconds (&lt;code&gt;appV1&lt;/code&gt;) to 200-800ms (&lt;code&gt;appV6R4&lt;/code&gt;) - a 92% improvement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;: From 62 seconds (&lt;code&gt;appV1&lt;/code&gt;) to 250-800ms (&lt;code&gt;appV6R4&lt;/code&gt;) - a 99% improvement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Efficiency&lt;/strong&gt;: From 128.1B per event (&lt;code&gt;appV1&lt;/code&gt;) to 6.0B per event (&lt;code&gt;appV6R4&lt;/code&gt;) - a 95% reduction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Target Rate Achievement&lt;/strong&gt;: From consistent failures to sustained success across all operational phases&lt;/li&gt;
&lt;/ul&gt;
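&lt;p&gt;The percentages above follow from the usual relative-reduction formula, &lt;code&gt;(before - after) / before&lt;/code&gt;. The sketch below recomputes them from the figures quoted in the list. The &lt;code&gt;reductionPercent&lt;/code&gt; helper is hypothetical, and taking 500 ms as the midpoint of the 200-800ms range is an assumption made only for this illustration.&lt;/p&gt;

```typescript
// Hypothetical helper: relative reduction in percent, rounded to one decimal place.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 1000) / 10;
}

// Get Reports: 6.5 s down to an assumed 0.5 s midpoint of the 200-800ms range.
console.log(reductionPercent(6.5, 0.5)); // 92.3

// Bulk Upsert: 62 s down to 0.8 s (slow end) and 0.25 s (fast end).
console.log(reductionPercent(62, 0.8)); // 98.7
console.log(reductionPercent(62, 0.25)); // 99.6

// Storage: 128.1 B per event down to 6.0 B per event.
console.log(reductionPercent(128.1, 6.0)); // 95.3
```

&lt;p&gt;Because the final latencies are ranges, the reduction is a range as well; the single percentages in the list are best read as representative values.&lt;/p&gt;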

&lt;h3&gt;
  
  
  Architectural Paradigm Shifts
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Dynamic Schema Pattern Revolution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;appV6R0&lt;/code&gt; through &lt;code&gt;appV6R4&lt;/code&gt; introduced the most sophisticated MongoDB design pattern explored in this series. The Dynamic Schema Pattern fundamentally redefined data organization by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Eliminating Array Overhead&lt;/strong&gt;: Replacing MongoDB arrays with computed object structures to minimize storage and processing costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single-Pipeline Optimization&lt;/strong&gt;: Consolidating five separate aggregation pipelines into one optimized operation, reducing computational overhead by 80%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure-Level Optimization&lt;/strong&gt;: Implementing &lt;code&gt;zstd&lt;/code&gt; compression, achieving 43% additional storage reduction over default &lt;code&gt;snappy&lt;/code&gt; compression&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Query Optimization Breakthroughs:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The implementation of intelligent date range calculation within aggregation pipelines eliminated redundant operations while maintaining data accuracy. This approach demonstrates senior-level MongoDB development by leveraging advanced aggregation framework capabilities to achieve both performance and maintainability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Critical Technical Insights
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Performance Bottleneck Evolution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Throughout the series, we observed how optimization focus shifted as bottlenecks were resolved:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initial Phase&lt;/strong&gt;: Index size and query inefficiency dominated performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermediate Phase&lt;/strong&gt;: Document retrieval count became the limiting factor&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced Phase&lt;/strong&gt;: Aggregation pipeline complexity constrained throughput&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Final Phase&lt;/strong&gt;: Disk I/O emerged as the ultimate hardware limitation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pattern Application Maturity:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The series demonstrates the progression from junior to senior MongoDB development practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Junior Level&lt;/strong&gt;: Schema design without understanding indexing implications (&lt;code&gt;appV1&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermediate Level&lt;/strong&gt;: Applying individual optimization techniques (&lt;code&gt;appV2&lt;/code&gt;-&lt;code&gt;appV4&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced Level&lt;/strong&gt;: Implementing established MongoDB patterns (&lt;code&gt;appV5RX&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Senior Level&lt;/strong&gt;: Creating custom patterns and infrastructure optimization (&lt;code&gt;appV6RX&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Production Implementation Guidelines
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;When to Apply Each Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Based on the comprehensive analysis, the following guidelines emerge for production implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document-Level Optimizations&lt;/strong&gt;: Essential for all MongoDB applications, providing 40-60% improvement with minimal complexity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bucket Pattern&lt;/strong&gt;: Optimal for time-series data with 10:1 or greater read-to-write ratios&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Computed Pattern&lt;/strong&gt;: Most effective in read-heavy scenarios with predictable aggregation requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Schema Pattern&lt;/strong&gt;: Reserved for high-performance applications where development complexity trade-offs are justified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure Considerations:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;zstd&lt;/code&gt; compression implementation in &lt;code&gt;appV6R4&lt;/code&gt; demonstrates that infrastructure-level optimizations can provide substantial benefits (40%+ storage reduction) with minimal application changes. However, these optimizations require careful CPU utilization monitoring and may not be suitable for CPU-constrained environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  The True Cost of Not Knowing MongoDB
&lt;/h3&gt;

&lt;p&gt;This series reveals that the "cost" extends far beyond mere performance degradation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quantifiable Impacts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resource Utilization&lt;/strong&gt;: Up to 20x more storage requirements for equivalent functionality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure Costs&lt;/strong&gt;: Potentially 10x higher hardware requirements due to inefficient patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer Productivity&lt;/strong&gt;: Months of optimization work that could be avoided with proper initial design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Limitations&lt;/strong&gt;: Fundamental architectural constraints that become exponentially expensive to resolve&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hidden Complexities:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;More critically, the series demonstrates that MongoDB's apparent simplicity can mask sophisticated optimization requirements. The transition from &lt;code&gt;appV1&lt;/code&gt; to &lt;code&gt;appV6R4&lt;/code&gt; required a deep understanding of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Aggregation framework internals and optimization strategies&lt;/li&gt;
&lt;li&gt;Index behavior with different data types and query patterns&lt;/li&gt;
&lt;li&gt;Storage engine compression algorithms and trade-offs&lt;/li&gt;
&lt;li&gt;Memory management and cache utilization patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final Recommendations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;For Development Teams:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Invest in MongoDB Education&lt;/strong&gt;: The performance differences documented in this series justify substantial training investments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Establish Pattern Libraries&lt;/strong&gt;: Codify successful patterns like those demonstrated to prevent anti-pattern adoption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement Performance Testing&lt;/strong&gt;: Regular load testing reveals optimization opportunities before they become production issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan for Iteration&lt;/strong&gt;: Schema evolution is inevitable; design systems that accommodate architectural improvements&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;For Architectural Decisions:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with Fundamentals&lt;/strong&gt;: Proper indexing and schema design provide the foundation for all subsequent optimizations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measure Before Optimizing&lt;/strong&gt;: Each optimization phase in this series was guided by comprehensive performance measurement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consider Total Cost of Ownership&lt;/strong&gt;: The development complexity of advanced patterns must be weighed against performance requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan Infrastructure Scaling&lt;/strong&gt;: Recognize that hardware limitations will eventually constrain software optimizations&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Closing Reflection
&lt;/h3&gt;

&lt;p&gt;The journey from &lt;code&gt;appV1&lt;/code&gt; to &lt;code&gt;appV6R4&lt;/code&gt; demonstrates that MongoDB mastery requires understanding not just the database itself, but the intricate relationships between schema design, query patterns, indexing strategies, aggregation frameworks, and infrastructure capabilities. The improvements of up to 99% documented in this series are achievable, but they demand dedication to continuous learning and sophisticated engineering practices.&lt;/p&gt;

&lt;p&gt;For organizations serious about MongoDB performance, this series provides both a roadmap for optimization and a compelling case for investing in advanced MongoDB expertise. The cost of not knowing MongoDB extends far beyond individual applications: it affects entire technology strategies and competitive positioning in data-driven markets.&lt;/p&gt;

&lt;p&gt;The patterns, techniques, and insights presented throughout this three-part series offer a comprehensive foundation for building high-performance MongoDB applications that can scale efficiently while maintaining operational excellence. Most importantly, they demonstrate that with proper knowledge and application, MongoDB can deliver extraordinary performance that justifies its position as a leading database technology for modern applications.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mongodb</category>
      <category>performance</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>The Cost of Not Knowing MongoDB - Part 2: appV5R0 to appV5R4</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Thu, 22 Jan 2026 16:51:29 +0000</pubDate>
      <link>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7</link>
      <guid>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Application Version 5 Revision 0 and Revision 1: A simple way to use the &lt;code&gt;Bucket Pattern&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Application Version 5 Revision 2: Using the Bucket Pattern with the Computed Pattern&lt;/li&gt;
&lt;li&gt;Application Version 5 Revision 3: Removing an aggregation pipeline anti-pattern&lt;/li&gt;
&lt;li&gt;Application Version 5 Revision 4: Doubling down on the Computed Pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Article Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome to the second part of the series, &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-introduction-335h"&gt;"The Cost of Not Knowing MongoDB"&lt;/a&gt;. Building upon the foundational optimizations explored in &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-1-appv0-to-appv4-2p66"&gt;Part 1&lt;/a&gt;, this article delves into advanced MongoDB design patterns that can dramatically transform application performance.&lt;/p&gt;

&lt;p&gt;In Part 1, we achieved significant improvements through field concatenation, data type optimization, and strategic field naming. However, as identified in the &lt;code&gt;Issues and Improvements&lt;/code&gt; of &lt;code&gt;appV4&lt;/code&gt;, these approaches represent only the beginning of what's possible with MongoDB schema design. This part introduces a paradigm shift from micro-optimizations to architectural patterns that fundamentally change how data is stored and retrieved.&lt;/p&gt;

&lt;p&gt;The journey through &lt;code&gt;appV5R0&lt;/code&gt; to &lt;code&gt;appV5R4&lt;/code&gt; demonstrates the progressive implementation of two powerful MongoDB design patterns: the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern" rel="noopener noreferrer"&gt;Bucket Pattern&lt;/a&gt; and the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-computed-pattern" rel="noopener noreferrer"&gt;Computed Pattern&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Through comprehensive performance analysis and detailed implementation examples, this part reveals both the tremendous potential and important limitations of these advanced patterns, setting the stage for the revolutionary approaches explored in &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4-22an"&gt;Part 3&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 5 Revision 0 and Revision 1 (appV5R0 and appV5R1): A simple way to use the &lt;code&gt;Bucket Pattern&lt;/code&gt; &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;When generating the &lt;code&gt;oneYear&lt;/code&gt; totals report, the &lt;code&gt;Get Reports&lt;/code&gt; function will need to retrieve an average of 60 documents and, in the worst-case scenario, 365 documents. To access each document, one index entry must be visited, and one disk read operation must be performed.&lt;/p&gt;

&lt;p&gt;One way to reduce the number of index entries and documents retrieved to generate the report is to use the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern" rel="noopener noreferrer"&gt;&lt;code&gt;Bucket Pattern&lt;/code&gt;&lt;/a&gt;. According to the &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/design-patterns/group-data/bucket-pattern/" rel="noopener noreferrer"&gt;MongoDB documentation&lt;/a&gt;, "The bucket pattern separates long series of data into distinct objects. Separating large data series into smaller groups can improve query access patterns and simplify application logic."&lt;/p&gt;

&lt;p&gt;Looking at our application from the perspective of the &lt;code&gt;Bucket Pattern&lt;/code&gt;, so far we have bucketed our data by day and user, with each document containing the status totals for one user on one day. For the two application versions presented in this section, &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;appV5R1&lt;/code&gt;, we’ll bucket the data by month (&lt;code&gt;appV5R0&lt;/code&gt;) and by quarter (&lt;code&gt;appV5R1&lt;/code&gt;).&lt;/p&gt;
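&lt;p&gt;To make the reduction concrete, the sketch below counts how many documents a report window touches under each bucketing scheme. It is a minimal illustration rather than code from the application: the &lt;code&gt;countBuckets&lt;/code&gt; helper and the sample dates are hypothetical, and it assumes the user has at least one event in every period of the window.&lt;/p&gt;

```typescript
// Hypothetical helper: how many bucket documents does a date range touch?
type Granularity = "day" | "month" | "quarter";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function countBuckets(start: Date, end: Date, granularity: Granularity): number {
  if (granularity === "day") {
    // One document per calendar day, both ends inclusive.
    return Math.floor((end.getTime() - start.getTime()) / MS_PER_DAY) + 1;
  }
  // Number the months consecutively, then group them by 1 (month) or 3 (quarter).
  const monthsPerBucket = granularity === "month" ? 1 : 3;
  const startIndex = start.getUTCFullYear() * 12 + start.getUTCMonth();
  const endIndex = end.getUTCFullYear() * 12 + end.getUTCMonth();
  return (
    Math.floor(endIndex / monthsPerBucket) -
    Math.floor(startIndex / monthsPerBucket) +
    1
  );
}

// A one-year window that starts and ends mid-month.
const windowStart = new Date("2022-06-16T00:00:00.000Z");
const windowEnd = new Date("2023-06-15T00:00:00.000Z");

console.log(countBuckets(windowStart, windowEnd, "day")); // 365
console.log(countBuckets(windowStart, windowEnd, "month")); // 13
console.log(countBuckets(windowStart, windowEnd, "quarter")); // 5
```

&lt;p&gt;In this example, a worst-case one-year report drops from 365 documents to at most 13 with monthly buckets and 5 with quarterly buckets. The price is that the first and last buckets can contain events outside the requested range, which then have to be filtered out.&lt;/p&gt;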

&lt;p&gt;As these are our first implementations using the Bucket Pattern, let’s make it as simple as possible.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;appV5R0&lt;/code&gt;, each document groups the events by month and user. Every document has an array field called &lt;code&gt;items&lt;/code&gt;, to which each event is pushed. Each pushed event carries the date it refers to, and its status field names are shortened to their first letter, as in &lt;code&gt;appV3&lt;/code&gt; and &lt;code&gt;appV4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;_id&lt;/code&gt; field follows a logic similar to the one used in &lt;code&gt;appV4&lt;/code&gt;, with the values of &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; concatenated and stored as hexadecimal/binary information. The difference is that the &lt;code&gt;date&lt;/code&gt; value, instead of being composed of year, month, and day (&lt;code&gt;YYYYMMDD&lt;/code&gt;), only has year and month (&lt;code&gt;YYYYMM&lt;/code&gt;), as we are bucketing the data by month.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;appV5R1&lt;/code&gt;, we have almost the same implementation as &lt;code&gt;appV5R0&lt;/code&gt;, with the difference being that we’ll bucket the events by quarter, and the &lt;code&gt;date&lt;/code&gt; value used to generate the &lt;code&gt;_id&lt;/code&gt; field will be composed of year and quarter (&lt;code&gt;YYYYQQ&lt;/code&gt;) instead of year and month (&lt;code&gt;YYYYMM&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;To build the &lt;code&gt;_id&lt;/code&gt; field based on the key and date values for the &lt;code&gt;appV5R0&lt;/code&gt;, the following TypeScript function was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buildId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;MM&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To build the &lt;code&gt;_id&lt;/code&gt; field based on the key and date values for the &lt;code&gt;appV5R1&lt;/code&gt;, the following TypeScript functions were created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;getQQ&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getMM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;02&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;03&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;04&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buildId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;QQ&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getQQ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;YYYY&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;QQ&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
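&lt;p&gt;The &lt;code&gt;getMM&lt;/code&gt; helper called by &lt;code&gt;getQQ&lt;/code&gt; is not shown in this section. A minimal sketch, assuming it returns the zero-padded UTC month taken from the ISO string, could look like the following. The &lt;code&gt;getQQCompact&lt;/code&gt; and &lt;code&gt;buildIdHex&lt;/code&gt; functions are hypothetical and not part of the original application: the first derives the quarter arithmetically instead of chaining comparisons, and the second returns the hex string that would be passed to &lt;code&gt;Buffer.from&lt;/code&gt;.&lt;/p&gt;

```typescript
// Assumed helper: zero-padded month ("01".."12") from the ISO-8601 UTC string.
function getMM(date: Date): string {
  return date.toISOString().split("-")[1];
}

// Hypothetical alternative to getQQ: months 1-3 map to "01", 4-6 to "02", and so on.
function getQQCompact(date: Date): string {
  const quarter = Math.ceil(Number(getMM(date)) / 3);
  return "0" + quarter; // quarter is always a single digit (1 to 4)
}

// Hex string that buildId would hand to Buffer.from(..., "hex").
function buildIdHex(key: string, date: Date): string {
  const [YYYY] = date.toISOString().split("-");
  return `${key}${YYYY}${getQQCompact(date)}`;
}

// "0a1b" is a made-up hex key used only for illustration.
console.log(getMM(new Date("2022-05-15T00:00:00.000Z"))); // "05"
console.log(getQQCompact(new Date("2022-05-15T00:00:00.000Z"))); // "02"
console.log(buildIdHex("0a1b", new Date("2022-12-31T00:00:00.000Z"))); // "0a1b202204"
```

&lt;p&gt;Both helpers read the date in UTC, so an event is always assigned to the same bucket regardless of the server's local time zone.&lt;/p&gt;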



&lt;p&gt;This implementation reflects the knowledge of an intermediate MongoDB developer, as it uses the &lt;code&gt;Bucket Pattern&lt;/code&gt; in its simplest possible form.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The implementation presented above has the following TypeScript document schema, named &lt;code&gt;SchemaV5R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV5R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Concatenated user key + time period (YYYYMM or YYYYQQ)&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// approved count&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// noFunds count&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// pending count&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// rejected count&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Target the document where the &lt;code&gt;_id&lt;/code&gt; field matches the concatenated value of &lt;code&gt;key&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, and &lt;code&gt;month/quarter&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;buildId&lt;/code&gt; function converts the &lt;code&gt;key&lt;/code&gt;+&lt;code&gt;year&lt;/code&gt;+&lt;code&gt;month/quarter&lt;/code&gt; into a binary format.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/update/push/" rel="noopener noreferrer"&gt;&lt;code&gt;$push&lt;/code&gt;&lt;/a&gt; to append the new event to the &lt;code&gt;items&lt;/code&gt; array.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;upsert&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensures a new document is created if no matching document exists.&lt;/li&gt;
&lt;/ul&gt;
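&lt;p&gt;To illustrate the effect of this operation, the sketch below mimics &lt;code&gt;$push&lt;/code&gt; with &lt;code&gt;upsert: true&lt;/code&gt; on a plain in-memory object instead of a real collection. The &lt;code&gt;applyUpsert&lt;/code&gt; and &lt;code&gt;monthlyId&lt;/code&gt; functions, the &lt;code&gt;bucketStore&lt;/code&gt; object, and the sample events are hypothetical, and a hex string stands in for the binary &lt;code&gt;_id&lt;/code&gt;. The point is only to show that events from the same user and month accumulate in a single document.&lt;/p&gt;

```typescript
type BucketItem = { date: Date; a?: number; n?: number; p?: number; r?: number };
type BucketStore = { [id: string]: { items: BucketItem[] } };

// In-memory stand-in for updateOne with $push and upsert: true.
function applyUpsert(store: BucketStore, id: string, item: BucketItem): void {
  const bucket = store[id] ?? { items: [] }; // upsert: start a new bucket if none exists
  bucket.items.push(item); // $push: append the event to the bucket
  store[id] = bucket;
}

// Monthly bucket id as a hex string: key + YYYYMM.
function monthlyId(key: string, date: Date): string {
  const [YYYY, MM] = date.toISOString().split("-");
  return `${key}${YYYY}${MM}`;
}

const bucketStore: BucketStore = {};

// "0a1b" is a made-up hex key; the counts are arbitrary sample values.
const sampleEvents: { key: string; item: BucketItem }[] = [
  { key: "0a1b", item: { date: new Date("2022-06-01T00:00:00.000Z"), a: 2 } },
  { key: "0a1b", item: { date: new Date("2022-06-20T00:00:00.000Z"), r: 1 } },
  { key: "0a1b", item: { date: new Date("2022-07-03T00:00:00.000Z"), p: 4 } },
];

for (const entry of sampleEvents) {
  applyUpsert(bucketStore, monthlyId(entry.key, entry.item.date), entry.item);
}

console.log(Object.keys(bucketStore)); // two buckets: "0a1b202206" and "0a1b202207"
console.log(bucketStore["0a1b202206"].items.length); // 2
```

&lt;p&gt;Note that, like &lt;code&gt;$push&lt;/code&gt;, this sketch appends unconditionally: two events for the same day end up as two separate entries in &lt;code&gt;items&lt;/code&gt; rather than being merged.&lt;/p&gt;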

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;$lte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$unwind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;items.date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;_id&lt;/code&gt; field is a binary representation of the concatenated &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; values.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;$gte&lt;/code&gt; operator specifies the start of the range, while &lt;code&gt;$lte&lt;/code&gt; specifies the end. The upper bound is inclusive so that the bucket containing the report end date is also selected.&lt;/li&gt;
&lt;li&gt;The value returned by &lt;code&gt;buildId&lt;/code&gt; has month/quarter granularity, not the day granularity the report needs, so further filtering by day is necessary.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $unwind: {...} }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Deconstructs the &lt;code&gt;items&lt;/code&gt; array, creating separate documents for each event within the matched buckets.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Applies precise date filtering at the individual event level, ensuring only events within the exact report date range are included.&lt;/li&gt;
&lt;li&gt;This may look redundant, since the first stage already filtered by date. However, that stage only filtered by month/quarter, and the report requires filtering by day.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $group: {...} }&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Groups the filtered documents into a single result.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;_id&lt;/code&gt; field is set to &lt;code&gt;null&lt;/code&gt; to group all matching documents from the previous stage together.&lt;/li&gt;
&lt;li&gt;Computes the sum of the &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, and &lt;code&gt;r&lt;/code&gt; fields using the &lt;code&gt;$sum&lt;/code&gt; operator.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;$project&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Removes the &lt;code&gt;_id&lt;/code&gt; field from the final result.&lt;/li&gt;
&lt;/ul&gt;
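&lt;p&gt;To make the stages above more concrete, here is a minimal in-memory sketch of what they compute. It is not MongoDB code: the &lt;code&gt;report&lt;/code&gt; function, the &lt;code&gt;Bucket&lt;/code&gt; type, and the sample data are made up for this illustration, and the buckets passed in stand for the documents already selected by the first &lt;code&gt;$match&lt;/code&gt; stage.&lt;/p&gt;

```typescript
// In-memory approximation of the pipeline; names and data are illustrative.
type Item = { date: Date; a?: number; n?: number; p?: number; r?: number };
type Bucket = { items: Item[] };

function report(buckets: Bucket[], start: Date, end: Date) {
  const totals = { approved: 0, noFunds: 0, pending: 0, rejected: 0 };
  for (const bucket of buckets) {
    // $unwind: visit every element of every bucket's items array.
    for (const item of bucket.items) {
      // Second $match: keep only start <= date < end.
      const time = item.date.getTime();
      const tooEarly = start.getTime() > time;
      const tooLate = time >= end.getTime();
      if (tooEarly || tooLate) continue;
      // $group with _id: null: sum each status, treating missing fields as 0.
      totals.approved += item.a ?? 0;
      totals.noFunds += item.n ?? 0;
      totals.pending += item.p ?? 0;
      totals.rejected += item.r ?? 0;
    }
  }
  return totals;
}

const day = (iso: string) => new Date(`${iso}T00:00:00Z`);
const sampleBuckets: Bucket[] = [
  { items: [{ date: day("2022-06-30"), a: 1 }, { date: day("2022-07-01"), a: 2, r: 1 }] },
  { items: [{ date: day("2022-07-15"), p: 3 }, { date: day("2022-09-30"), n: 4 }] },
];

const totals = report(sampleBuckets, day("2022-07-01"), day("2022-09-30"));
console.log(totals);
```

&lt;p&gt;In this sample, the events on 2022-06-30 and 2022-09-30 sit in the selected buckets but fall outside the report range, which is why the day-level filter after &lt;code&gt;$unwind&lt;/code&gt; is needed.&lt;/p&gt;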

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;These implementations leverage the existing &lt;code&gt;_id&lt;/code&gt; index exclusively, eliminating the need for additional compound indexes. The Bucket Pattern's consolidation of multiple events into a single document reduces index size and improves cache efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;appV5R1&lt;/code&gt;, we inserted 500 million event documents into the collections using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV4&lt;/td&gt;
&lt;td&gt;359,615,279&lt;/td&gt;
&lt;td&gt;19.66GB&lt;/td&gt;
&lt;td&gt;59B&lt;/td&gt;
&lt;td&gt;6.69GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;9.50GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R0&lt;/td&gt;
&lt;td&gt;95,350,431&lt;/td&gt;
&lt;td&gt;19.19GB&lt;/td&gt;
&lt;td&gt;217B&lt;/td&gt;
&lt;td&gt;5.06GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2.95GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R1&lt;/td&gt;
&lt;td&gt;33,429,649&lt;/td&gt;
&lt;td&gt;15.75GB&lt;/td&gt;
&lt;td&gt;506B&lt;/td&gt;
&lt;td&gt;4.04GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.09GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV4&lt;/td&gt;
&lt;td&gt;42.2B&lt;/td&gt;
&lt;td&gt;20.4B&lt;/td&gt;
&lt;td&gt;62.6B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R0&lt;/td&gt;
&lt;td&gt;41.2B&lt;/td&gt;
&lt;td&gt;6.3B&lt;/td&gt;
&lt;td&gt;47.5B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R1&lt;/td&gt;
&lt;td&gt;33.8B&lt;/td&gt;
&lt;td&gt;2.3B&lt;/td&gt;
&lt;td&gt;36.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
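&lt;p&gt;The per-event values can be reproduced with a short calculation. The sketch below assumes that the "GB" figures in the Collection Statistics table are binary gigabytes (1024&lt;sup&gt;3&lt;/sup&gt; bytes); the function and constant names are only illustrative.&lt;/p&gt;

```typescript
// Assumption: "GB" in the tables means 1024 * 1024 * 1024 bytes.
const BYTES_PER_GB = 1024 * 1024 * 1024;
const TOTAL_EVENTS = 500000000;

// Size per event in bytes, rounded to one decimal place as in the table.
function bytesPerEvent(sizeGB: number): number {
  const value = (sizeGB * BYTES_PER_GB) / TOTAL_EVENTS;
  return Math.round(value * 10) / 10;
}

const perEvent = {
  appV4: { data: bytesPerEvent(19.66), index: bytesPerEvent(9.5) },
  appV5R0: { data: bytesPerEvent(19.19), index: bytesPerEvent(2.95) },
  appV5R1: { data: bytesPerEvent(15.75), index: bytesPerEvent(1.09) },
};
console.log(perEvent);
```

&lt;p&gt;With that assumption, the computed values match the Data Size/Event and Index Size/Event columns above.&lt;/p&gt;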

&lt;p&gt;Analyzing the tables above, we can see that going from &lt;code&gt;appV4&lt;/code&gt; to &lt;code&gt;appV5R0&lt;/code&gt; brought practically no improvement in &lt;code&gt;Data Size&lt;/code&gt;, but the improvement in &lt;code&gt;Index Size&lt;/code&gt; was considerable: the index for &lt;code&gt;appV5R0&lt;/code&gt; is 69% smaller than the one for &lt;code&gt;appV4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When considering going from &lt;code&gt;appV4&lt;/code&gt; to &lt;code&gt;appV5R1&lt;/code&gt;, the gains are even more impressive. In this case, we reduced the &lt;code&gt;Data Size&lt;/code&gt; by 20% and the &lt;code&gt;Index Size&lt;/code&gt; by 89%.&lt;/p&gt;

&lt;p&gt;Looking at the &lt;code&gt;Event Statistics&lt;/code&gt;, the &lt;code&gt;Total Size/Event&lt;/code&gt; improved considerably, but what really catches the eye is the &lt;code&gt;Index Size/Event&lt;/code&gt;, which is roughly three times smaller for &lt;code&gt;appV5R0&lt;/code&gt; and &lt;em&gt;nine&lt;/em&gt; times smaller for &lt;code&gt;appV5R1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This huge reduction in the index size is due to the use of the Bucket Pattern, where one document will store data for many events, reducing the total number of documents and, as a consequence, reducing the number of index entries.&lt;/p&gt;

&lt;p&gt;With these impressive improvements regarding index size, it’s quite probable that we’ll also see impressive improvements in the application performance. One point of attention in the values presented above is that the index size of the two new versions is smaller than the memory size of the machine running the database, allowing the whole index to be kept in the cache, which is very good from a performance point of view.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;appV5R1&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV4&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;For the first time, the application is able to reach the target rate for both &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;appV5R1&lt;/code&gt;. &lt;code&gt;appV5R1&lt;/code&gt; nearly reaches all desired rates during the initial test quarter. Both versions demonstrate a clear performance advantage when compared to &lt;code&gt;appV4&lt;/code&gt;, and &lt;code&gt;appV5R1&lt;/code&gt; shows significantly better results than &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xc4sfbk0pjm9i961arz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xc4sfbk0pjm9i961arz.png" alt="Get Reports Rate - appV4 vs appV5R0 vs appV5R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both new versions have notably lower latencies than &lt;code&gt;appV4&lt;/code&gt;, and without degrading in the final half of the test. The &lt;code&gt;appV5R1&lt;/code&gt; reaches a peak latency of 211ms while &lt;code&gt;appV5R0&lt;/code&gt; reaches a peak latency of 530ms.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uit8cia559z9plcc2zo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uit8cia559z9plcc2zo.png" alt="Get Reports Latency - appV4 vs appV5R0 vs appV5R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both new versions almost reach all the desired rates throughout the test duration, degrading only in the final 20 minutes. It's possible to see that &lt;code&gt;appV5R1&lt;/code&gt; has a better performance than &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9zy94hm9p36u9hb1u9a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9zy94hm9p36u9hb1u9a.png" alt="Bulk Upsert Rate - appV4 vs appV5R0 vs appV5R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Even though &lt;code&gt;appV4&lt;/code&gt; reaches lower latencies than &lt;code&gt;appV5R1&lt;/code&gt; and &lt;code&gt;appV5R0&lt;/code&gt; for some parts of the first half of the test, these lower values are due to requests being queued rather than to a better implementation. In the final half of the test, the two new versions are clearly the better solution. Both new versions have the same peak latency, but the average latency of &lt;code&gt;appV5R1&lt;/code&gt; is lower than that of &lt;code&gt;appV5R0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8cggaofek0wh5xtrigl9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8cggaofek0wh5xtrigl9.png" alt="Bulk Upsert Latency - appV4 vs appV5R0 vs appV5R1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Analysis&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The results clearly establish that quarterly bucketing (&lt;code&gt;appV5R1&lt;/code&gt;) provides superior performance compared to monthly bucketing (&lt;code&gt;appV5R0&lt;/code&gt;), validating the principle that larger bucket sizes can improve performance when appropriately balanced against query complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Because we made this first implementation of the Bucket Pattern as simple as possible, some clear possible optimizations weren’t considered. The main one is how we handle the items array field. In the current implementation, we just push the event documents to it, even when we already have events for a specific day.&lt;/p&gt;

&lt;p&gt;A clear optimization here is one that we have been using from appV1 to appV4, where we create just one document per key and date/day, and when we have many events for the same key and date/day, we just increment the status of the document based on the status of the event.&lt;/p&gt;

&lt;p&gt;Applying this optimization, we'll reduce the size of the documents because the &lt;code&gt;items&lt;/code&gt; array will have fewer elements. We'll also reduce the computational cost of generating the reports because the status totals are pre-computed by day. This &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-a-summary" rel="noopener noreferrer"&gt;design pattern of pre-computing&lt;/a&gt; is so common that it has its own name, the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-computed-pattern" rel="noopener noreferrer"&gt;Computed Pattern&lt;/a&gt;.&lt;/p&gt;
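&lt;p&gt;The effect of pre-computing by day can be sketched in plain TypeScript. The &lt;code&gt;computeDailyTotals&lt;/code&gt; function and the sample events below are hypothetical and only illustrate how several per-event elements for the same day collapse into a single element with summed statuses.&lt;/p&gt;

```typescript
// Illustration only: folds raw per-event items into one element per day.
type Item = { date: Date; a?: number; n?: number; p?: number; r?: number };

function computeDailyTotals(items: Item[]): Item[] {
  const byDay: { [day: string]: Item } = {};
  for (const item of items) {
    const dayKey = item.date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    if (!(dayKey in byDay)) {
      byDay[dayKey] = { date: new Date(`${dayKey}T00:00:00Z`) };
    }
    const total = byDay[dayKey];
    // Add each status present in the event to that day's running total.
    for (const status of ["a", "n", "p", "r"] as const) {
      const increment = item[status];
      if (increment !== undefined) {
        total[status] = (total[status] ?? 0) + increment;
      }
    }
  }
  // ISO day keys sort chronologically when sorted as strings.
  return Object.keys(byDay).sort().map((dayKey) => byDay[dayKey]);
}

const rawItems: Item[] = [
  { date: new Date("2022-07-01T00:00:00Z"), a: 1 },
  { date: new Date("2022-07-01T00:00:00Z"), a: 1 },
  { date: new Date("2022-07-01T00:00:00Z"), r: 1 },
  { date: new Date("2022-07-02T00:00:00Z"), p: 1 },
];
const dailyTotals = computeDailyTotals(rawItems);
console.log(rawItems.length, dailyTotals.length);
```

&lt;p&gt;In this sample, four pushed events become two array elements, one per day, which is the reduction in array size that the optimization aims for.&lt;/p&gt;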

&lt;h2&gt;
  
  
  Application Version 5 Revision 2 (appV5R2): Using the Bucket Pattern with the Computed Pattern &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the &lt;code&gt;Issues and Improvements&lt;/code&gt; of &lt;code&gt;appV5R0&lt;/code&gt; and &lt;code&gt;appV5R1&lt;/code&gt;, we can use the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-computed-pattern" rel="noopener noreferrer"&gt;Computed Pattern&lt;/a&gt; to pre-compute the total status by day in the items array field when inserting a new &lt;code&gt;event&lt;/code&gt;. This reduces the computation cost of generating the reports and also reduces the document size by having fewer elements in the &lt;code&gt;items&lt;/code&gt; array field.&lt;/p&gt;

&lt;p&gt;Most of this application version will be similar to &lt;code&gt;appV5R1&lt;/code&gt;, where we bucketed the events by quarter. The only difference will be in the &lt;code&gt;Bulk Upsert&lt;/code&gt; operation: if the &lt;code&gt;items&lt;/code&gt; array field already has an element with the same &lt;code&gt;date&lt;/code&gt; as the new &lt;code&gt;event&lt;/code&gt;, we update that element; otherwise, we insert a new element into &lt;code&gt;items&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The implementation showcases senior-level MongoDB development practices, utilizing advanced aggregation pipeline features within update operations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation described above uses the following TypeScript document schema, named &lt;code&gt;SchemaV5R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV5R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// User key + quarter (YYYYQQ)&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// approved total for the day&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// noFunds total for the day&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// pending total for the day&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// rejected total for the day&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sumIfItemExists&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;returnItemsOrCreateNew&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$unset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV5R1&lt;/code&gt;, with the only difference being the &lt;code&gt;update&lt;/code&gt; logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The complete code for this &lt;code&gt;update&lt;/code&gt; logic is long and hard to grasp at a glance, and including it would make the article cumbersome to browse. For that reason, only pseudocode is presented here.&lt;/li&gt;
&lt;li&gt;Our goal in this update operation is to increment the &lt;code&gt;status&lt;/code&gt; of an &lt;code&gt;element&lt;/code&gt; in the &lt;code&gt;items&lt;/code&gt; array if an &lt;code&gt;element&lt;/code&gt; with the same &lt;code&gt;date&lt;/code&gt; of the new &lt;code&gt;event&lt;/code&gt; already exists, or create a new &lt;code&gt;element&lt;/code&gt; if there isn’t one with the same date. It’s not possible to achieve this functionality with the &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/update/#update-operators-1" rel="noopener noreferrer"&gt;MongoDB Update Operators&lt;/a&gt;. The way around it is to use &lt;a href="https://www.mongodb.com/docs/manual/reference/method/db.collection.update/#update-with-aggregation-pipeline" rel="noopener noreferrer"&gt;Update with Aggregation Pipeline&lt;/a&gt;, which allows a more expressive update statement.&lt;/li&gt;
&lt;li&gt;To facilitate the understanding of the logic used in each stage of the aggregation pipeline, a simplified JavaScript version of the functionalities will be provided:&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;$set: { result: sumIfItemExists }&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;Sets the field &lt;code&gt;result&lt;/code&gt; to the value produced by the logic in &lt;code&gt;sumIfItemExists&lt;/code&gt;. As the name suggests, this logic iterates through the &lt;code&gt;items&lt;/code&gt; array looking for an element with the same &lt;code&gt;date&lt;/code&gt; as the &lt;code&gt;event&lt;/code&gt;. If there is one, the status present in the &lt;code&gt;event&lt;/code&gt; is added to that &lt;code&gt;element&lt;/code&gt;. To keep track of whether such an &lt;code&gt;element&lt;/code&gt; was found and the &lt;code&gt;event&lt;/code&gt; status was registered, the logic maintains a boolean flag called &lt;code&gt;found&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;found&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;

     &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;found&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The &lt;code&gt;result&lt;/code&gt; variable/field is generated by applying a &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/reduce" rel="noopener noreferrer"&gt;reduce&lt;/a&gt; method to the &lt;code&gt;items&lt;/code&gt; array field of the document we want to update. The initial value for the reduce method is an object with the fields &lt;code&gt;found&lt;/code&gt; and &lt;code&gt;items&lt;/code&gt;. The field &lt;code&gt;accumulator.found&lt;/code&gt; starts as &lt;code&gt;false&lt;/code&gt; and signals whether any &lt;code&gt;element&lt;/code&gt; visited during the reduce execution had the same &lt;code&gt;date&lt;/code&gt; as the &lt;code&gt;event&lt;/code&gt; we want to register. When an &lt;code&gt;element&lt;/code&gt; has the same date as the &lt;code&gt;event&lt;/code&gt;, its status values are incremented by the status of the &lt;code&gt;event&lt;/code&gt; and &lt;code&gt;accumulator.found&lt;/code&gt; is set to &lt;code&gt;true&lt;/code&gt;, indicating that the &lt;code&gt;event&lt;/code&gt; was registered. The &lt;code&gt;element&lt;/code&gt; of each iteration is pushed to the &lt;code&gt;accumulator.items&lt;/code&gt; array field, which becomes the new &lt;code&gt;items&lt;/code&gt; array field.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;$set: { items: returnItemsOrCreateNew }&lt;/code&gt;:&lt;br&gt;
 Sets the field &lt;code&gt;items&lt;/code&gt; to the value produced by the logic in the variable &lt;code&gt;returnItemsOrCreateNew&lt;/code&gt;. As the name suggests, this logic handles two cases. If an element with the same &lt;code&gt;date&lt;/code&gt; as the &lt;code&gt;event&lt;/code&gt; was found, &lt;code&gt;found == true&lt;/code&gt;, it returns the &lt;code&gt;items&lt;/code&gt; field produced by the previous stage. If no such element was found during the reduce iterations, &lt;code&gt;found == false&lt;/code&gt;, it returns a new array built by concatenating the &lt;code&gt;items&lt;/code&gt; array of the previous stage with a new array containing the &lt;code&gt;event&lt;/code&gt; element.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;found&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;$unset: ["result"]&lt;/code&gt;:&lt;br&gt;
 Removes the temporary &lt;code&gt;result&lt;/code&gt; field created during the aggregation process.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This sophisticated update operation achieves the equivalent of an "upsert within an array" - functionality that requires careful orchestration of MongoDB's aggregation capabilities.&lt;/p&gt;
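&lt;p&gt;The three stages described above can be condensed into one plain TypeScript function, which may help when reasoning about the update outside the database. This is only a sketch of the equivalent logic, not the aggregation code the application runs: the &lt;code&gt;Item&lt;/code&gt; type and the &lt;code&gt;upsertIntoItems&lt;/code&gt; name are hypothetical, and dates are compared by timestamp because &lt;code&gt;===&lt;/code&gt; on JavaScript &lt;code&gt;Date&lt;/code&gt; objects compares references.&lt;/p&gt;

```typescript
// Hypothetical item shape, mirroring the elements of the `items` array field.
type Item = { date: Date; a?: number; n?: number; p?: number; r?: number };

// Sketch of the pipeline logic: reduce over the items, then keep or append.
function upsertIntoItems(items: Item[], event: Item): Item[] {
  // Equivalent of the first $set stage: build the temporary `result` value.
  const result = items.reduce(
    (accumulator: { found: boolean; items: Item[] }, element: Item) => {
      if (element.date.getTime() === event.date.getTime()) {
        // Same date: add the event's status counters to the existing element.
        accumulator.found = true;
        accumulator.items.push({
          date: element.date,
          a: (element.a ?? 0) + (event.a ?? 0),
          n: (element.n ?? 0) + (event.n ?? 0),
          p: (element.p ?? 0) + (event.p ?? 0),
          r: (element.r ?? 0) + (event.r ?? 0),
        });
      } else {
        // Different date: carry the element over unchanged.
        accumulator.items.push(element);
      }
      return accumulator;
    },
    { found: false, items: [] as Item[] }
  );

  // Equivalent of the second $set stage: reuse the merged array when a date
  // matched, otherwise append the event as a new element.
  return result.found ? result.items : result.items.concat([event]);
}
```

&lt;p&gt;Calling &lt;code&gt;upsertIntoItems&lt;/code&gt; with an event whose date already exists returns an array of the same length with summed counters, while an event with a new date returns an array that is one element longer. The temporary &lt;code&gt;result&lt;/code&gt; value never leaves the function, which plays the role of the &lt;code&gt;$unset&lt;/code&gt; stage.&lt;/p&gt;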

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;$lte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$unwind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;items.date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$items.r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has the same logic as the one in &lt;code&gt;appV5R1&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV5R2&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R1&lt;/td&gt;
&lt;td&gt;33,429,468&lt;/td&gt;
&lt;td&gt;15.75GB&lt;/td&gt;
&lt;td&gt;506B&lt;/td&gt;
&lt;td&gt;4.04GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.09GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R2&lt;/td&gt;
&lt;td&gt;33,429,649&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.26GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.16GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R1&lt;/td&gt;
&lt;td&gt;33.8B&lt;/td&gt;
&lt;td&gt;2.3B&lt;/td&gt;
&lt;td&gt;36.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R2&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.5B&lt;/td&gt;
&lt;td&gt;28.2B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Analyzing the tables above, we have the expected result presented in the introduction, from &lt;code&gt;appV5R1&lt;/code&gt; to &lt;code&gt;appV5R2&lt;/code&gt;. The only noticeable difference is the 24% reduction in the &lt;code&gt;Data Size&lt;/code&gt;.&lt;/p&gt;
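&lt;p&gt;The per-event figures and the 24% reduction can be reproduced with a few lines of arithmetic. The sketch below assumes that the sizes reported as GB are binary gigabytes (1024³ bytes), an assumption that matches the values in the tables; the function names are illustrative only.&lt;/p&gt;

```typescript
// Number of events loaded in the initial scenario (from the article).
const TOTAL_EVENTS = 500000000;
// Assumption: the reported "GB" values are binary gigabytes.
const BYTES_PER_GB = 1024 ** 3;

// Size per event in bytes, rounded to one decimal place.
function bytesPerEvent(sizeInGb: number): number {
  const bytes = sizeInGb * BYTES_PER_GB;
  return Math.round((bytes / TOTAL_EVENTS) * 10) / 10;
}

// Relative reduction between two sizes, as a whole percentage.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

console.log(bytesPerEvent(15.75)); // appV5R1 Data Size/Event: 33.8
console.log(bytesPerEvent(11.96)); // appV5R2 Data Size/Event: 25.7
console.log(bytesPerEvent(1.09)); // appV5R1 Index Size/Event: 2.3
console.log(bytesPerEvent(1.16)); // appV5R2 Index Size/Event: 2.5
console.log(reductionPercent(15.75, 11.96)); // Data Size reduction: 24
```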

&lt;p&gt;This reduction in the &lt;code&gt;Data Size&lt;/code&gt; and &lt;code&gt;Document Size&lt;/code&gt; will help in the performance of our application by reducing the time spent reading the document from the disk and the processing cost of decompressing the document from its compressed state.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV5R2&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV5R1&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar rates, with &lt;code&gt;appV5R2&lt;/code&gt; being slightly better than &lt;code&gt;appV5R1&lt;/code&gt; for the final half of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9bq98z33adr0zmplotof.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9bq98z33adr0zmplotof.png" alt="Get Reports Rate - appV5R1 vs appV5R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar latencies, with &lt;code&gt;appV5R2&lt;/code&gt; reaching lower peak values when compared to &lt;code&gt;appV5R1&lt;/code&gt; for the final half of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3qrwt7kqoys0mop6e3w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3qrwt7kqoys0mop6e3w.png" alt="Get Reports Latency - appV5R1 vs appV5R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar rates, with &lt;code&gt;appV5R2&lt;/code&gt; being slightly better than &lt;code&gt;appV5R1&lt;/code&gt; for the final 20 minutes of the test, but still not reaching the desired rate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl257dp7x6yxuamfn64cv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl257dp7x6yxuamfn64cv.png" alt="Bulk Upsert Rate - appV5R1 vs appV5R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar latencies, with &lt;code&gt;appV5R1&lt;/code&gt; reaching lower peak values when compared to &lt;code&gt;appV5R2&lt;/code&gt; for the final 20 minutes of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlnh2fyqps8kjy0gsaxc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlnh2fyqps8kjy0gsaxc.png" alt="Bulk Upsert Latency - appV5R1 vs appV5R2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Analysis&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The results show modest improvements in &lt;code&gt;Get Reports&lt;/code&gt; performance but slight degradation in &lt;code&gt;Bulk Upsert&lt;/code&gt; performance. This outcome reflects the fundamental trade-off inherent in the Computed Pattern: increased write complexity in exchange for simplified read operations.&lt;/p&gt;

&lt;p&gt;With writes occurring 4.5 times more frequently than reads, the increased computational cost of the complex aggregation pipeline during writes roughly balances the reduced computational cost during reads. The MongoDB documentation confirms this expectation: "If reads are significantly more common than writes, the computed pattern reduces the frequency of data computation."&lt;/p&gt;

&lt;p&gt;In our load testing scenario, writes significantly outnumber reads, making the Computed Pattern's benefits less pronounced. However, this implementation provides a valuable reference architecture for applications with different read/write patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Let’s try to extract more performance from our application by looking for improvements in our current operations. The aggregation pipeline of &lt;code&gt;Get Reports&lt;/code&gt; contains a very common anti-pattern for array fields: a &lt;code&gt;$unwind&lt;/code&gt; stage followed by a &lt;code&gt;$match&lt;/code&gt; stage, which are the second and third stages of our pipeline.&lt;/p&gt;

&lt;p&gt;This combination can hurt the performance of the aggregation pipeline because the &lt;code&gt;$unwind&lt;/code&gt; stage increases the number of documents in the pipeline, only for the &lt;code&gt;$match&lt;/code&gt; stage to filter them afterwards. In other words, to reach a final state with fewer documents, we go through an intermediate state with more documents.&lt;/p&gt;

&lt;p&gt;In the next application revision, we’ll see how to achieve the same final result using only one stage and without an intermediate state that holds more documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 5 Revision 3 (appV5R3): Removing an aggregation pipeline anti-pattern &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As presented in the &lt;code&gt;Issues and Improvements&lt;/code&gt; of appV5R2, we have an anti-pattern in the aggregation pipeline of &lt;code&gt;Get Reports&lt;/code&gt; that can harm the query performance. This anti-pattern is characterized by a &lt;code&gt;$unwind&lt;/code&gt; stage followed by a &lt;code&gt;$match&lt;/code&gt;. This combination of stages will first increase the number of documents, &lt;code&gt;$unwind&lt;/code&gt;, to later filter them, &lt;code&gt;$match&lt;/code&gt;. In a simplified way, to get to a final state, we’re going through a costly intermediary state.&lt;/p&gt;

&lt;p&gt;One possible solution around this anti-pattern is to use the &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/addFields/" rel="noopener noreferrer"&gt;&lt;code&gt;$addFields&lt;/code&gt;&lt;/a&gt; stage with the &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/filter/" rel="noopener noreferrer"&gt;&lt;code&gt;$filter&lt;/code&gt;&lt;/a&gt; operator on the &lt;code&gt;items&lt;/code&gt; array field. With this combination, we would replace the &lt;code&gt;items&lt;/code&gt; array field using the &lt;code&gt;$addFields&lt;/code&gt; stage with a new array field generated by the &lt;code&gt;$filter&lt;/code&gt; operator in the &lt;code&gt;items&lt;/code&gt; array, where we would filter all elements where the &lt;code&gt;date&lt;/code&gt; is inside the report's date range.&lt;/p&gt;
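&lt;p&gt;As a rough illustration of this intermediate option, the stage could look like the sketch below. It is not code from the application: the &lt;code&gt;buildFilterItemsStage&lt;/code&gt; helper is hypothetical, and the date bounds are assumed to follow the same inclusive start and exclusive end used by the &lt;code&gt;$match&lt;/code&gt; on &lt;code&gt;items.date&lt;/code&gt; in the previous revision.&lt;/p&gt;

```typescript
// Hypothetical helper: builds an $addFields stage that replaces `items`
// with only the elements whose date falls inside the report's date range.
function buildFilterItemsStage(reportStartDate: Date, reportEndDate: Date) {
  return {
    $addFields: {
      items: {
        $filter: {
          input: "$items", // the array field to iterate over
          as: "item", // each element is exposed as $$item
          cond: {
            $and: [
              { $gte: ["$$item.date", reportStartDate] },
              { $lt: ["$$item.date", reportEndDate] },
            ],
          },
        },
      },
    },
  };
}
```

&lt;p&gt;With a stage like this in place of the &lt;code&gt;$unwind&lt;/code&gt; and second &lt;code&gt;$match&lt;/code&gt;, each document keeps a single, smaller &lt;code&gt;items&lt;/code&gt; array, so the number of documents flowing through the pipeline does not grow. A later stage would still have to sum the statuses of the remaining elements.&lt;/p&gt;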

&lt;p&gt;But, considering our aggregation pipeline with the optimization presented above, there is an even better solution. With the &lt;code&gt;$filter&lt;/code&gt; operator, we will loop through all elements in the &lt;code&gt;items&lt;/code&gt; field and only compare their dates with the report dates to filter the elements. As the final goal of our aggregation pipeline is to get the status totals of all elements within the report's date range, instead of just looping through the elements in &lt;code&gt;items&lt;/code&gt; to filter them, we could already start to calculate the status totals. We can obtain this functionality by using the &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/reduce/" rel="noopener noreferrer"&gt;&lt;code&gt;$reduce&lt;/code&gt;&lt;/a&gt; operator instead of the &lt;code&gt;$filter&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The implementation represents senior-level MongoDB development practices, showcasing how sophisticated operators can eliminate performance bottlenecks while maintaining code clarity and functionality.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above uses the following TypeScript document schema, named &lt;code&gt;SchemaV5R0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV5R0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following bulk &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sumIfItemExists&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;returnItemsOrCreateNew&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$unset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has the same logic as the one in &lt;code&gt;appV5R2&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;itemsReduceAccumulator&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupSumStatus&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has similar logic to the one in &lt;code&gt;appV5R1&lt;/code&gt;. The differences are the change of the second stage from &lt;code&gt;$unwind&lt;/code&gt; to &lt;code&gt;$addFields&lt;/code&gt; and the change of a variable name in the &lt;code&gt;$group&lt;/code&gt; stage. The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: itemsReduceAccumulator }&lt;/code&gt;:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Adds a new field to the document called &lt;code&gt;totals&lt;/code&gt; that will have the status totals.&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;$reduce&lt;/code&gt; to iterate through the &lt;code&gt;items&lt;/code&gt; array, applying date filtering and status accumulation in a single operation.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="c1"&gt;// Equivalent JavaScript logic:&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;
     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
 &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $group: groupSumStatus }&lt;/code&gt;:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Group the &lt;code&gt;totals&lt;/code&gt; of each document in the pipeline into final status totals using &lt;code&gt;$sum&lt;/code&gt; operations.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;groupSumStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$totals.a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$totals.n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$totals.p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
   &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$totals.r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV5R3&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R2&lt;/td&gt;
&lt;td&gt;33,429,649&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.26GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.16GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R2&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.5B&lt;/td&gt;
&lt;td&gt;28.2B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.4B&lt;/td&gt;
&lt;td&gt;28.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As the document schema and Bulk Upsert operations for appV5R3 are the same as for appV5R2, the collection and event statistics are practically identical, and there is nothing new to analyze in this section.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV5R3&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV5R2&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions perform similarly, with each one reaching better rates at different moments of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldcliuxnmoue490xmvy8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldcliuxnmoue490xmvy8.png" alt="Get Reports Rate - appV5R2 vs appV5R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The latency values of the two versions are almost indistinguishable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq94wi5qtuynaltlhwzed.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq94wi5qtuynaltlhwzed.png" alt="Get Reports Latency - appV5R2 vs appV5R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions perform similarly, with &lt;code&gt;appV5R3&lt;/code&gt; being slightly better in the final 20 minutes of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fss7ojqd46gxa92vw74ks.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fss7ojqd46gxa92vw74ks.png" alt="Bulk Upsert Rate - appV5R2 vs appV5R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Even though both versions have very similar latency values, &lt;code&gt;appV5R2&lt;/code&gt; has slightly lower latency for the first three quarters of the test, while &lt;code&gt;appV5R3&lt;/code&gt; has considerably better latency in the final quarter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff24dkxo6ybg3k4bv8fua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff24dkxo6ybg3k4bv8fua.png" alt="Bulk Upsert Latency - appV5R2 vs appV5R3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Analysis&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;While the optimized aggregation pipeline is more efficient in terms of CPU and memory usage, the performance improvements are minimal. This outcome reveals that the current bottleneck is not computational overhead but disk I/O.&lt;/p&gt;

&lt;p&gt;MongoDB Atlas metrics show the &lt;code&gt;IOWAIT&lt;/code&gt; metric reaching nearly 15% of CPU usage, indicating that the CPU frequently waits for disk operations to complete. This disk bottleneck will become more apparent in subsequent versions and represents a fundamental infrastructure limitation that cannot be resolved through schema optimization alone.&lt;/p&gt;

&lt;p&gt;The relatively modest performance gains demonstrate that optimizing beyond the current bottleneck yields diminishing returns, highlighting the importance of identifying and addressing the primary constraint in any system optimization effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;We’ve just seen that our implementation's limitation is the disk. To solve that, we have two options: upgrade the disk where MongoDB stores data, or change our implementation to reduce disk usage.&lt;/p&gt;

&lt;p&gt;As the goal of this series is to show how much performance we can achieve with the same hardware by modeling how our application stores and reads data from MongoDB, we won’t upgrade the disk. A change in the application modeling for MongoDB will be left for the next article, appV6Rx.&lt;/p&gt;

&lt;p&gt;For appV5R4, we will double down on the Computed Pattern and pre-compute the status totals by quarter, not just by day. Even though we know it probably won’t provide better performance, for the reasons discussed in the "Load Test Results" of &lt;code&gt;appV5R2&lt;/code&gt;, let’s flex our MongoDB and aggregation pipeline knowledge and provide a reference code example for the cases where the Computed Pattern is a good fit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 5 Revision 4 (appV5R4): Doubling down on the Computed Pattern &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As presented in the issues and improvements of &lt;code&gt;appV5R3&lt;/code&gt;, for this revision, we’ll double down on the &lt;code&gt;Computed Pattern&lt;/code&gt; even though we have good evidence that it won’t provide better performance. But, you know, for science.&lt;/p&gt;

&lt;p&gt;We’ll also use the &lt;code&gt;Computed Pattern&lt;/code&gt; to pre-compute the status totals for each document. As each document stores the events per quarter and user, our application will have on each document the status totals per quarter and user. These pre-computed totals will be stored in a field called &lt;code&gt;totals&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;One point of attention in this implementation is that we are adding a new field to the document, which will also increase the average document size. As seen in the previous revision, &lt;code&gt;appV5R3&lt;/code&gt;, our current bottleneck is disk, another indication that this implementation won’t have better performance.&lt;/p&gt;

&lt;p&gt;The implementation complexity increases significantly, requiring careful coordination between daily item management and quarterly total maintenance, showcasing the sophisticated techniques employed by senior MongoDB developers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV5R1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV5R1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total approved&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total noFunds&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total pending&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Quarter total rejected&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nl"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total approved&lt;/span&gt;
    &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total noFunds&lt;/span&gt;
    &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total pending&lt;/span&gt;
    &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Daily total rejected&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;No additional indexes are required, maintaining the single &lt;code&gt;_id&lt;/code&gt; index approach established in the &lt;code&gt;appV4&lt;/code&gt; implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;newReportFields&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// Update quarterly totals&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sumIfItemExists&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// Process daily items&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;returnItemsOrCreateNew&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// Update items array&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$unset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;result&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// Cleanup temporary field&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV5R3&lt;/code&gt;, with the only difference being an extra stage in the &lt;code&gt;update&lt;/code&gt; aggregation pipeline logic to pre-compute the document status &lt;code&gt;totals&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To facilitate the understanding of the logic used in the aggregation pipeline, a simplified JavaScript version of the functionalities will be provided:&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;{ $set: newReportFields }&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;Set the &lt;code&gt;totals&lt;/code&gt; field to the result of incrementing each of the possible &lt;code&gt;status&lt;/code&gt; counters by the corresponding value provided in the &lt;code&gt;event&lt;/code&gt; document.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;
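
&lt;p&gt;As a complement to the simplified logic above, the sketch below shows one way the object passed to the first &lt;code&gt;$set&lt;/code&gt; stage could be written with the aggregation operators &lt;code&gt;$add&lt;/code&gt; and &lt;code&gt;$ifNull&lt;/code&gt;. The helper name &lt;code&gt;buildNewReportFields&lt;/code&gt; and the event shape are assumptions made for illustration; the application's actual expression may differ.&lt;/p&gt;

```typescript
// Assumed event shape: each event carries a counter for its status only.
type StatusCounts = { a?: number; n?: number; p?: number; r?: number };
type StatusField = "a" | "n" | "p" | "r";

// Hypothetical helper (not from the article): builds the object for the
// first `$set` stage of the update pipeline.
function buildNewReportFields(event: StatusCounts) {
  // totals.field = (current totals.field, or 0 when missing) + (event.field, or 0)
  const increment = (field: StatusField) => ({
    $add: [{ $ifNull: [`$totals.${field}`, 0] }, event[field] ?? 0],
  });

  return {
    "totals.a": increment("a"),
    "totals.n": increment("n"),
    "totals.p": increment("p"),
    "totals.r": increment("r"),
  };
}

// Example: an event that adds one approved transaction.
const newReportFields = buildNewReportFields({ a: 1 });
console.log(JSON.stringify(newReportFields["totals.a"]));
// {"$add":[{"$ifNull":["$totals.a",0]},1]}
```

&lt;p&gt;Like the simplified JavaScript version, this sketch writes &lt;code&gt;0&lt;/code&gt; for the statuses that the event does not carry.&lt;/p&gt;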

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docsFromKeyBetweenDate&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$addFields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;itemsReduceAccumulator&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;groupSumStatus&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV5R3&lt;/code&gt;, with the only difference being the implementation of the &lt;code&gt;$addFields&lt;/code&gt; stage. The complete code for this aggregation pipeline is quite complicated, so only pseudocode is presented here.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;{ $addFields: itemsReduceAccumulator }&lt;/code&gt;:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;A similar implementation to the one in &lt;code&gt;appV5R3&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The main difference is that, when the quarter’s date range falls entirely within the report’s date range, we can use the pre-computed &lt;code&gt;totals&lt;/code&gt; instead of calculating the value through a &lt;code&gt;$reduce&lt;/code&gt; operation.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following JavaScript code is logically equivalent to the real aggregation pipeline code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt; &lt;span class="c1"&gt;// Equivalent JavaScript logic:&lt;/span&gt;
 &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

 &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;documentQuarterWithinReportDateRange&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c1"&gt;// Use pre-computed quarterly totals&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c1"&gt;// Fall back to item-level aggregation&lt;/span&gt;
   &lt;span class="nx"&gt;totals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;reportStartDate&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
         &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;reportEndDate&lt;/span&gt;
       &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
         &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;

       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;accumulator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="p"&gt;},&lt;/span&gt;
     &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="p"&gt;);&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
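
&lt;p&gt;To make the branch above easier to try out, here is a minimal, self-contained TypeScript sketch of the same decision. The function name &lt;code&gt;documentTotals&lt;/code&gt; and the &lt;code&gt;Ranges&lt;/code&gt; parameter are hypothetical: the quarter boundaries are passed in explicitly, whereas the real pipeline has to derive them from the document, which is not shown here. End dates are treated as exclusive.&lt;/p&gt;

```typescript
type Counts = { a: number; n: number; p: number; r: number };
type Item = { date: Date; a?: number; n?: number; p?: number; r?: number };
type QuarterDoc = {
  totals: { a?: number; n?: number; p?: number; r?: number };
  items: Item[];
};
// Hypothetical parameter object; both end dates are exclusive.
type Ranges = { quarterStart: Date; quarterEnd: Date; reportStart: Date; reportEnd: Date };

function documentTotals(doc: QuarterDoc, ranges: Ranges): Counts {
  const quarterStart = ranges.quarterStart.getTime();
  const quarterEnd = ranges.quarterEnd.getTime();
  const reportStart = ranges.reportStart.getTime();
  const reportEnd = ranges.reportEnd.getTime();

  // True when the whole quarter lies inside the report range.
  const quarterCovered = [quarterStart >= reportStart, reportEnd >= quarterEnd].every(Boolean);

  if (quarterCovered) {
    // Use the pre-computed quarterly totals, defaulting missing statuses to 0.
    const t = doc.totals;
    return { a: t.a ?? 0, n: t.n ?? 0, p: t.p ?? 0, r: t.r ?? 0 };
  }

  // Fall back to summing only the daily items inside the report range.
  return doc.items.reduce(
    (acc, item) => {
      const time = item.date.getTime();
      const inReport = [time >= reportStart, reportEnd > time].every(Boolean);
      if (inReport) {
        acc.a += item.a ?? 0;
        acc.n += item.n ?? 0;
        acc.p += item.p ?? 0;
        acc.r += item.r ?? 0;
      }
      return acc;
    },
    { a: 0, n: 0, p: 0, r: 0 },
  );
}

// Example document for the first quarter of 2024 (made-up values).
const exampleDoc: QuarterDoc = {
  totals: { a: 5, r: 1 },
  items: [
    { date: new Date("2024-01-10T00:00:00Z"), a: 2 },
    { date: new Date("2024-02-10T00:00:00Z"), a: 3, r: 1 },
  ],
};
const quarter = {
  quarterStart: new Date("2024-01-01T00:00:00Z"),
  quarterEnd: new Date("2024-04-01T00:00:00Z"),
};

const fullYear = documentTotals(exampleDoc, {
  ...quarter,
  reportStart: new Date("2024-01-01T00:00:00Z"),
  reportEnd: new Date("2025-01-01T00:00:00Z"),
});
const february = documentTotals(exampleDoc, {
  ...quarter,
  reportStart: new Date("2024-02-01T00:00:00Z"),
  reportEnd: new Date("2024-03-01T00:00:00Z"),
});

console.log(fullYear); // { a: 5, n: 0, p: 0, r: 1 } (pre-computed totals)
console.log(february); // { a: 3, n: 0, p: 0, r: 1 } (item-level sum)
```

&lt;p&gt;The full-year report covers the whole quarter, so the pre-computed &lt;code&gt;totals&lt;/code&gt; are used as they are. The February report covers only part of the quarter, so the daily items are summed instead.&lt;/p&gt;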

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV5R4&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R4&lt;/td&gt;
&lt;td&gt;33,429,470&lt;/td&gt;
&lt;td&gt;12.88GB&lt;/td&gt;
&lt;td&gt;414B&lt;/td&gt;
&lt;td&gt;3.72GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.24GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;25.7B&lt;/td&gt;
&lt;td&gt;2.4B&lt;/td&gt;
&lt;td&gt;28.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R4&lt;/td&gt;
&lt;td&gt;27.7B&lt;/td&gt;
&lt;td&gt;2.7B&lt;/td&gt;
&lt;td&gt;30.4B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As discussed in the introduction of this revision, the additional &lt;code&gt;totals&lt;/code&gt; field on each document in the collection increased the document size and the overall storage size. The &lt;code&gt;Data Size&lt;/code&gt; of &lt;code&gt;appV5R4&lt;/code&gt; is 7.7% bigger than that of &lt;code&gt;appV5R3&lt;/code&gt;, and the &lt;code&gt;Total Size/Event&lt;/code&gt; is 8.2% bigger. Because disk is our limiting factor, the performance of &lt;code&gt;appV5R4&lt;/code&gt; will probably be worse than &lt;code&gt;appV5R3&lt;/code&gt;.&lt;/p&gt;
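
&lt;p&gt;The per-event values and percentages above can be reproduced with a little arithmetic. The following is a minimal sketch, assuming the GB figures in the tables are binary gigabytes (2^30 bytes), which is what makes the numbers line up; the helper names &lt;code&gt;bytesPerEvent&lt;/code&gt; and &lt;code&gt;percentIncrease&lt;/code&gt; are hypothetical and only used for illustration.&lt;/p&gt;

```typescript
// Assumption: the "GB" values in the tables are binary gigabytes (2^30 bytes).
const BYTES_PER_GB = 1024 ** 3;
// Number of events loaded in the initial scenario, as stated in the text.
const TOTAL_EVENTS = 500e6;

// Average number of bytes a single event costs, given a size in GB.
function bytesPerEvent(sizeInGb: number): number {
  return (sizeInGb * BYTES_PER_GB) / TOTAL_EVENTS;
}

// Relative growth from one measurement to another, in percent.
function percentIncrease(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

const dataPerEventR3 = bytesPerEvent(11.96);
const dataPerEventR4 = bytesPerEvent(12.88);
const dataGrowth = percentIncrease(11.96, 12.88);
const totalGrowth = percentIncrease(28.1, 30.4);

console.log(dataPerEventR3.toFixed(1), dataPerEventR4.toFixed(1)); // 25.7 27.7
console.log(dataGrowth.toFixed(1), totalGrowth.toFixed(1)); // 7.7 8.2
```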

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV5R4&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV5R3&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;It's clear that &lt;code&gt;appV5R4&lt;/code&gt; has worse rate values when compared to &lt;code&gt;appV5R3&lt;/code&gt;, only slightly beating the previous version for the first quarter of the test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z23oif6rsg7sa5xhq5b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z23oif6rsg7sa5xhq5b.png" alt="Get Reports Rate - appV5R3 vs appV5R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;For the first two quarters of the test, &lt;code&gt;appV5R4&lt;/code&gt; has a lower latency, but for the final two quarters, &lt;code&gt;appV5R3&lt;/code&gt; takes the lead.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzx7zccja4qzpz0hsqc3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzx7zccja4qzpz0hsqc3.png" alt="Get Reports Latency - appV5R3 vs appV5R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions have very similar rate values, and both fall short of the desired rate in the final 20 minutes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczjcm6he5isnmgmkdiqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczjcm6he5isnmgmkdiqn.png" alt="Bulk Upsert Rate - appV5R3 vs appV5R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The new version, &lt;code&gt;appV5R4&lt;/code&gt;, is only able to match the latency values of &lt;code&gt;appV5R3&lt;/code&gt; for the first quarter of the test, falling short for the remaining three quarters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrl59mryxqp7s0svdbk0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrl59mryxqp7s0svdbk0.png" alt="Bulk Upsert Latency - appV5R3 vs appV5R4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;As hinted in the previous Issues and Improvements section, to improve our application’s performance, we need to change our MongoDB implementation in a way that reduces disk usage. To achieve this, we need to reduce the document size.&lt;/p&gt;

&lt;p&gt;You may think it is not possible to reduce our document size and overall collection/index size even more because we are already using just one index, concatenating two fields into one, using shorthand field names, and using the Bucket Pattern. But there is one thing called the Dynamic Schema that can help us.&lt;/p&gt;

&lt;p&gt;In the Dynamic Schema, the values of a field become field names. Thus, field names also store data and, as a consequence, reduce the document size. As this pattern will require big changes in our current application schema, we’ll start a new version, appV6Rx, which we’ll play around with in the third part of this series.&lt;/p&gt;
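
&lt;p&gt;To make the idea concrete, here is a minimal sketch of turning an array of dated totals into an object whose field names are the dates. The types &lt;code&gt;DatedTotals&lt;/code&gt; and &lt;code&gt;DynamicItems&lt;/code&gt;, the function &lt;code&gt;toDynamicItems&lt;/code&gt;, and the date strings are hypothetical and only illustrate the pattern; they are not the actual appV6Rx schema.&lt;/p&gt;

```typescript
// Hypothetical shapes, used only to illustrate the idea.
type Totals = { a?: number; n?: number; p?: number; r?: number };
type DatedTotals = { date: string; a?: number; n?: number; p?: number; r?: number };
type DynamicItems = { [date: string]: Totals };

// Turns an array of dated totals into an object keyed by date,
// so the date is carried by the field name instead of a separate value.
function toDynamicItems(items: DatedTotals[]): DynamicItems {
  const dynamic: DynamicItems = {};
  for (const item of items) {
    const { date, ...totals } = item;
    dynamic[date] = totals;
  }
  return dynamic;
}

const bucketItems: DatedTotals[] = [
  { date: "20220605", a: 1 },
  { date: "20220606", p: 2, r: 1 },
];
const dynamicItems = toDynamicItems(bucketItems);

console.log(JSON.stringify(dynamicItems));
// {"20220605":{"a":1},"20220606":{"p":2,"r":1}}
```

&lt;p&gt;In this sketch, each entry no longer needs a &lt;code&gt;date&lt;/code&gt; field name plus its value, which is where the size reduction comes from.&lt;/p&gt;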

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;That is the end of the second part of the series. We covered the Bucket Pattern and the Computed Pattern, the many ways we can use these patterns to model how our application stores its data in MongoDB, and the big performance gains they can provide when used properly.&lt;/p&gt;

&lt;p&gt;Here is a quick review of the improvements made between the application versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;appV4&lt;/code&gt; to &lt;code&gt;appV5R0&lt;/code&gt;/&lt;code&gt;appV5R1&lt;/code&gt;: This is the simplest possible implementation of the Bucket Pattern, grouping the events by month for &lt;code&gt;appV5R0&lt;/code&gt; and by quarter for &lt;code&gt;appV5R1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;appV5R1&lt;/code&gt; to &lt;code&gt;appV5R2&lt;/code&gt;: Instead of just pushing the event document to the items array, we started to pre-compute the status totals by day, using the Computed Pattern.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;appV5R2&lt;/code&gt; to &lt;code&gt;appV5R3&lt;/code&gt;: This improved the aggregation pipeline for Get Reports, preventing a costly intermediary stage. It didn’t provide performance improvements because our MongoDB instance is currently disk-limited.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;appV5R3&lt;/code&gt; to &lt;code&gt;appV5R4&lt;/code&gt;: We doubled down on the Computed Pattern to pre-calculate the totals field, even though we knew the performance wouldn’t be better. We did it just for science.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We had noticeable improvements in the versions presented in this second part of the series when compared to the versions from the first part, &lt;code&gt;appV0&lt;/code&gt; to &lt;code&gt;appV4&lt;/code&gt;. &lt;code&gt;appV5R3&lt;/code&gt; showed the best performance of them all, but it still can’t reach all the desired rates. For the third and final part of this series, our application versions will be developed around the &lt;code&gt;Dynamic Schema Pattern&lt;/code&gt;, which will reduce the overall document size and help with the current disk limitation.&lt;/p&gt;

&lt;p&gt;For any further questions, you can go to the MongoDB Community Forum, or if you want to build your application using MongoDB, the MongoDB Developer Center has lots of examples and tutorials in many different programming languages.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mongodb</category>
      <category>performance</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>The Cost of Not Knowing MongoDB - Part 1: appV0 to appV4</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Thu, 22 Jan 2026 16:50:40 +0000</pubDate>
      <link>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-1-appv0-to-appv4-2p66</link>
      <guid>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-part-1-appv0-to-appv4-2p66</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Application Version 1: The baseline implementation&lt;/li&gt;
&lt;li&gt;Application Version 2: Better Understanding Indexing&lt;/li&gt;
&lt;li&gt;Application Version 3: Better Data Types and Field Name Shorthanding&lt;/li&gt;
&lt;li&gt;Application Version 4: Taking Advantage of the &lt;code&gt;_id&lt;/code&gt; Index&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Article Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome to the first part of the series, &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-introduction-335h"&gt;"The Cost of Not Knowing MongoDB"&lt;/a&gt;. This comprehensive analysis explores how different MongoDB schema design decisions can dramatically impact application performance, demonstrating the critical importance of understanding MongoDB's underlying mechanisms.&lt;/p&gt;

&lt;p&gt;In this first article, we examine four progressive application versions, &lt;code&gt;appV1&lt;/code&gt; through &lt;code&gt;appV4&lt;/code&gt;, each representing common approaches developers take when working with MongoDB. Through detailed performance testing and analysis, we reveal how seemingly minor schema modifications can lead to significant improvements in throughput, latency, and resource utilization.&lt;/p&gt;

&lt;p&gt;The journey begins with &lt;code&gt;appV1&lt;/code&gt;, a baseline implementation that reflects typical patterns used by junior MongoDB developers. We then progress through increasingly optimized versions, introducing concepts such as field concatenation, data type optimization, and strategic field abbreviation. Each version builds upon the lessons learned from its predecessor, culminating in &lt;code&gt;appV4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This foundational knowledge sets the stage for &lt;a href="https://dev.to/arturgc/the-cost-of-not-knowing-mongodb-part-2-appv5r0-to-appv5r4-40p7"&gt;Part 2&lt;/a&gt;, where we explore advanced patterns like the &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/design-patterns/group-data/bucket-pattern/" rel="noopener noreferrer"&gt;&lt;code&gt;Bucket Pattern&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/design-patterns/computed-values/computed-schema-pattern/" rel="noopener noreferrer"&gt;&lt;code&gt;Computed Pattern&lt;/code&gt;&lt;/a&gt; to achieve even greater performance improvements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 1 (appV1): The baseline implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;The first application version and the base case for our comparison would have been developed by someone with junior-level knowledge of MongoDB who just took a quick look at the documentation and learned that every document in a collection &lt;a href="https://www.mongodb.com/docs/manual/indexes/#default-index" rel="noopener noreferrer"&gt;must have an &lt;code&gt;_id&lt;/code&gt;&lt;/a&gt; field and that &lt;a href="https://www.mongodb.com/docs/manual/indexes/#default-index" rel="noopener noreferrer"&gt;this field always has a unique index&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To take advantage of the obligatory &lt;code&gt;_id&lt;/code&gt; field and its index, the developer decides to store the values of &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; in an embedded document in the &lt;code&gt;_id&lt;/code&gt; field. With that, each document will register the status totals for one user, specified by the field &lt;code&gt;_id.key&lt;/code&gt;, in one day, specified by the field &lt;code&gt;_id.date&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nl"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Targets the document where the &lt;code&gt;_id&lt;/code&gt; field matches &lt;code&gt;{ date: event.date, key: event.key }&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses the &lt;code&gt;$inc&lt;/code&gt; operator to increment counters (&lt;code&gt;approved&lt;/code&gt;, &lt;code&gt;noFunds&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt;, &lt;code&gt;rejected&lt;/code&gt;) based on the event data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;upsert&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensures a new document is created if no matching document exists.&lt;/li&gt;
&lt;/ul&gt;
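
&lt;p&gt;The operation above can be wrapped in a small builder so that a whole batch of events becomes one list of operations. This is a minimal sketch under stated assumptions: the &lt;code&gt;AppEvent&lt;/code&gt; type and the &lt;code&gt;buildUpsert&lt;/code&gt; function are hypothetical names, and the sketch assumes an event may carry only some of the four counters, so it leaves the absent ones out of &lt;code&gt;$inc&lt;/code&gt;.&lt;/p&gt;

```typescript
// Event shape assumed from the fields used in the operation above.
type AppEvent = {
  key: string;
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

const COUNTERS = ["approved", "noFunds", "pending", "rejected"] as const;

// Builds one updateOne operation, keeping only the counters the event carries.
function buildUpsert(event: AppEvent) {
  const increments: { [counter: string]: number } = {};
  for (const counter of COUNTERS) {
    const value = event[counter];
    if (typeof value === "number") {
      increments[counter] = value;
    }
  }
  return {
    updateOne: {
      filter: { _id: { date: event.date, key: event.key } },
      update: { $inc: increments },
      upsert: true,
    },
  };
}

const events: AppEvent[] = [
  { key: "user-1", date: new Date("2022-06-25T00:00:00Z"), approved: 1 },
  { key: "user-1", date: new Date("2022-06-25T00:00:00Z"), rejected: 2 },
];
const operations = events.map(buildUpsert);

console.log(JSON.stringify(operations[0].updateOne.update)); // {"$inc":{"approved":1}}
```

&lt;p&gt;The resulting array could then be passed to the driver's &lt;code&gt;bulkWrite&lt;/code&gt; method on the collection, so the whole batch travels in one call.&lt;/p&gt;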

&lt;h3&gt;
  
  
  Get Reports
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$approved&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$noFunds&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$rejected&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Filters documents based on the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;"_id.key"&lt;/code&gt; field matches the user key provided in the request.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;"_id.date"&lt;/code&gt; field filters documents within the specified date range using &lt;code&gt;$gte&lt;/code&gt; (greater than or equal to) and &lt;code&gt;$lt&lt;/code&gt; (less than).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $group: {...} }&lt;/code&gt;&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Group the filtered documents into a single result.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;_id&lt;/code&gt; field is set to &lt;code&gt;null&lt;/code&gt; to group all matching documents from the previous stage together.&lt;/li&gt;
&lt;li&gt;Computes the sum of the &lt;code&gt;approved&lt;/code&gt;, &lt;code&gt;noFunds&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt;, and &lt;code&gt;rejected&lt;/code&gt; fields using the &lt;code&gt;$sum&lt;/code&gt; operator.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
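
&lt;p&gt;Since the five pipelines differ only in their &lt;code&gt;$match&lt;/code&gt; stage, they can be generated from a list of intervals. The sketch below is an illustration, not the application's actual code: the interval lengths in &lt;code&gt;INTERVALS_IN_YEARS&lt;/code&gt; are an assumption, and &lt;code&gt;yearsBefore&lt;/code&gt;, &lt;code&gt;buildMatchStage&lt;/code&gt;, and &lt;code&gt;buildPipelines&lt;/code&gt; are hypothetical helper names. It also computes the lower bound with explicit date arithmetic in place of the shorthand &lt;code&gt;request.date - oneYear&lt;/code&gt;.&lt;/p&gt;

```typescript
// Assumption: the five report intervals, expressed in years.
const INTERVALS_IN_YEARS = [1, 3, 5, 7, 10];

// Returns a copy of the date moved back by the given number of years (UTC).
function yearsBefore(date: Date, years: number): Date {
  const copy = new Date(date.getTime());
  copy.setUTCFullYear(copy.getUTCFullYear() - years);
  return copy;
}

// $match stage for one user and one date interval ending at `end` (exclusive).
function buildMatchStage(key: string, end: Date, years: number) {
  return {
    $match: {
      "_id.key": key,
      "_id.date": { $gte: yearsBefore(end, years), $lt: end },
    },
  };
}

// $group stage shared by all five pipelines.
const groupStage = {
  $group: {
    _id: null,
    approved: { $sum: "$approved" },
    noFunds: { $sum: "$noFunds" },
    pending: { $sum: "$pending" },
    rejected: { $sum: "$rejected" },
  },
};

// One pipeline per interval; only the $match stage changes.
function buildPipelines(key: string, end: Date) {
  return INTERVALS_IN_YEARS.map((years) => [buildMatchStage(key, end, years), groupStage]);
}

const pipelines = buildPipelines("user-1", new Date("2022-06-25T00:00:00Z"));
console.log(pipelines.length); // 5
```

&lt;p&gt;Each generated pipeline could then be handed to the driver's &lt;code&gt;aggregate&lt;/code&gt; method on the collection, one call per interval.&lt;/p&gt;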

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;Initially, &lt;code&gt;appV1&lt;/code&gt; aimed to use the default index on the &lt;code&gt;_id&lt;/code&gt; field (which contained an embedded document with &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt;). However, this default index on the embedded &lt;code&gt;_id&lt;/code&gt; field was not sufficient to efficiently support the query patterns, particularly for the &lt;code&gt;Get Reports&lt;/code&gt; function, which filters by &lt;code&gt;_id.key&lt;/code&gt; and &lt;code&gt;_id.date&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To address this, an additional compound index was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This explicit index on &lt;code&gt;_id.key&lt;/code&gt; and &lt;code&gt;_id.date&lt;/code&gt; ensures that queries filtering and sorting on these fields can be performed efficiently. The &lt;code&gt;unique: true&lt;/code&gt; option enforces that the combination of &lt;code&gt;_id.key&lt;/code&gt; and &lt;code&gt;_id.date&lt;/code&gt; is unique across all documents in the collection. For a more detailed explanation of why an index on an embedded document's fields might be needed even if the top-level field is indexed, refer to Appendices - Index on Embedded Documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV1&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV1&lt;/td&gt;
&lt;td&gt;359,639,622&lt;/td&gt;
&lt;td&gt;39.58GB&lt;/td&gt;
&lt;td&gt;119B&lt;/td&gt;
&lt;td&gt;8.78GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;20.06GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV1&lt;/td&gt;
&lt;td&gt;85B&lt;/td&gt;
&lt;td&gt;43.1B&lt;/td&gt;
&lt;td&gt;128.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV1&lt;/code&gt; and plotting it alongside the &lt;code&gt;Desired&lt;/code&gt; values, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The application never reaches the target rate of 25 reports per second during the first 10-minute phase, peaking at only 16.5 reports per second. During the rest of the test, the rate stays around 6 reports per second.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjobfxl3q6x4m3xgkmldr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjobfxl3q6x4m3xgkmldr.png" alt="Get Reports Rate - appV1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Latency begins at 2 seconds and progressively increases throughout the test duration, reaching a maximum of 6.5 seconds with an average of 4.5 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qft1i8j1dalkasx34o6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qft1i8j1dalkasx34o6.png" alt="Get Reports Latency - appV1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rate&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The application only reaches the desired rate of 250 events per second during the first 10 minutes of the test. During the rest of the test, the rate degrades to around 200 events per second.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbva7w05xaaua5klr891.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbva7w05xaaua5klr891.png" alt="Bulk Upsert Rate - appV1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Latency starts at 10 seconds and degrades in a similar way, escalating to a maximum of 62 seconds with an average of 42 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjnyhrajcdg1s9qn69psa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjnyhrajcdg1s9qn69psa.png" alt="Bulk Upsert Latency - appV1" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;The first issue that can be pointed out and improved in this implementation is the document schema in combination with the two indexes. Because the fields &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; are in an embedded document in the field &lt;code&gt;_id&lt;/code&gt;, their values are indexed twice: by the default/obligatory index on the &lt;code&gt;_id&lt;/code&gt; field and by the index we created to support the &lt;code&gt;Bulk Upsert&lt;/code&gt; and &lt;code&gt;Get Reports&lt;/code&gt; operations.&lt;/p&gt;

&lt;p&gt;As the &lt;code&gt;key&lt;/code&gt; field is a 64-character string and the &lt;code&gt;date&lt;/code&gt; field is of type date (an 8-byte value), these two values use at least 72 bytes of storage. As we have two indexes, each document contributes at least 144 index bytes in a non-compressed scenario.&lt;/p&gt;

&lt;p&gt;The improvement here is to extract the fields &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; from the &lt;code&gt;_id&lt;/code&gt; field and let the &lt;code&gt;_id&lt;/code&gt; field keep its default value of type ObjectId. The ObjectId data type takes only 12 bytes of storage.&lt;/p&gt;

&lt;p&gt;This first implementation may look like a contrived worst-case scenario, set up to make the more optimized solutions look better. Unfortunately, that is not the case. It's not hard to find implementations like this on the internet, and I've worked on a big project with a schema like this one, which is where the idea for this first case came from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 2 (appV2): Better Understanding Indexing &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the issues and improvements of &lt;code&gt;appV1&lt;/code&gt;, embedding the fields &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; as a document in the &lt;code&gt;_id&lt;/code&gt; field to take advantage of its obligatory index is not a good solution for our application. We would still need to create an extra index, and the index on the &lt;code&gt;_id&lt;/code&gt; field would take more storage than needed.&lt;/p&gt;

&lt;p&gt;To keep the index on the &lt;code&gt;_id&lt;/code&gt; field from being bigger than needed, we move the fields &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; out of the embedded document in the &lt;code&gt;_id&lt;/code&gt; field and let the &lt;code&gt;_id&lt;/code&gt; field have its default value of type &lt;a href="https://www.mongodb.com/docs/manual/reference/bson-types/?tck=mongodb_ai_chatbot#objectid" rel="noopener noreferrer"&gt;&lt;code&gt;ObjectId&lt;/code&gt;&lt;/a&gt;. Each document would still register the status totals for one user, specified by the field &lt;code&gt;key&lt;/code&gt;, in one day, specified by the field &lt;code&gt;date&lt;/code&gt;, the same way it's done in &lt;code&gt;appV1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The second application version, and the improvements that lead to it, could still have been developed by someone with a junior level of MongoDB knowledge who has read more deeply into the documentation on indexes, especially on indexing fields of type document.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV1&lt;/code&gt;, with the only difference being the &lt;code&gt;filter&lt;/code&gt; criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Targets the document whose &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields match the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields of the &lt;code&gt;event&lt;/code&gt; document.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Get Reports &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$approved&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$noFunds&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$rejected&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV1&lt;/code&gt;, with the only difference being the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;key&lt;/code&gt; field matches the user key provided in the request.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;date&lt;/code&gt; field filters documents within the specified date range using &lt;code&gt;$gte&lt;/code&gt; (greater than or equal to) and &lt;code&gt;$lt&lt;/code&gt; (less than).&lt;/li&gt;
&lt;/ul&gt;
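&lt;p&gt;The &lt;code&gt;request.date - oneYear&lt;/code&gt; expression in the pipeline above is shorthand. A minimal sketch of how one such pipeline could be built with explicit date arithmetic follows. The helper names &lt;code&gt;subtractYears&lt;/code&gt; and &lt;code&gt;buildReportPipeline&lt;/code&gt;, the &lt;code&gt;ReportRequest&lt;/code&gt; type, and the interval expressed in whole years are assumptions made for illustration, not the application's actual code:&lt;/p&gt;

```typescript
// Hypothetical request shape, assumed for this sketch.
type ReportRequest = { key: string; date: Date };

// Returns a copy of `date` moved back by `years` calendar years (in UTC).
const subtractYears = (date: Date, years: number): Date => {
  const result = new Date(date.getTime());
  result.setUTCFullYear(result.getUTCFullYear() - years);
  return result;
};

// Builds the $match + $group pipeline for a single report interval.
const buildReportPipeline = (request: ReportRequest, years: number) => [
  {
    $match: {
      key: request.key,
      // Half-open range: start date included, request date excluded.
      date: { $gte: subtractYears(request.date, years), $lt: request.date },
    },
  },
  {
    $group: {
      _id: null,
      approved: { $sum: "$approved" },
      noFunds: { $sum: "$noFunds" },
      pending: { $sum: "$pending" },
      rejected: { $sum: "$rejected" },
    },
  },
];

const sampleRequest: ReportRequest = {
  key: "0000000000000000000000000000000000000000000000000000000000000001",
  date: new Date("2022-06-25T00:00:00.000Z"),
};
const oneYearPipeline = buildReportPipeline(sampleRequest, 1);
```

&lt;p&gt;Calling the helper once per interval would yield the five pipelines that &lt;code&gt;Get Reports&lt;/code&gt; needs, each differing only in the lower bound of the &lt;code&gt;date&lt;/code&gt; range.&lt;/p&gt;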

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;appV2&lt;/code&gt;, the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields were moved out of the &lt;code&gt;_id&lt;/code&gt; field and became top-level fields. To support efficient querying for both &lt;code&gt;Bulk Upsert&lt;/code&gt; (filtering by &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt;) and &lt;code&gt;Get Reports&lt;/code&gt; (filtering by &lt;code&gt;key&lt;/code&gt; and a &lt;code&gt;date&lt;/code&gt; range), a compound index was created on these two fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV2&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV1&lt;/td&gt;
&lt;td&gt;359,639,622&lt;/td&gt;
&lt;td&gt;39.58GB&lt;/td&gt;
&lt;td&gt;119B&lt;/td&gt;
&lt;td&gt;8.78GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;20.06GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV2&lt;/td&gt;
&lt;td&gt;359,614,536&lt;/td&gt;
&lt;td&gt;41.92GB&lt;/td&gt;
&lt;td&gt;126B&lt;/td&gt;
&lt;td&gt;10.46GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.66GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV1&lt;/td&gt;
&lt;td&gt;85B&lt;/td&gt;
&lt;td&gt;43.1B&lt;/td&gt;
&lt;td&gt;128.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV2&lt;/td&gt;
&lt;td&gt;90B&lt;/td&gt;
&lt;td&gt;35.8B&lt;/td&gt;
&lt;td&gt;125.8B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
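&lt;p&gt;The per-event figures in the table above can be reproduced from the &lt;code&gt;Collection Statistics&lt;/code&gt; with a few lines of arithmetic. The sketch below assumes the reported sizes are binary gigabytes (1024³ bytes), which is the reading that matches the published numbers. The names &lt;code&gt;bytesPerEvent&lt;/code&gt; and &lt;code&gt;roundToTenth&lt;/code&gt; are illustrative:&lt;/p&gt;

```typescript
// Assumption: the "GB" values in the statistics are binary gigabytes.
const BYTES_PER_GB = 1024 * 1024 * 1024;
const TOTAL_EVENTS = 500e6; // 500 million events loaded in the initial scenario

// Average number of bytes that each event contributes to a given size.
const bytesPerEvent = (sizeGB: number): number =>
  (sizeGB * BYTES_PER_GB) / TOTAL_EVENTS;

// Rounds to one decimal place, as the table does.
const roundToTenth = (value: number): number => Math.round(value * 10) / 10;

// Data size and index size as reported in Collection Statistics.
const appV1 = { dataGB: 39.58, indexGB: 20.06 };
const appV2 = { dataGB: 41.92, indexGB: 16.66 };

const appV1Total = roundToTenth(
  bytesPerEvent(appV1.dataGB) + bytesPerEvent(appV1.indexGB)
);
const appV2Total = roundToTenth(
  bytesPerEvent(appV2.dataGB) + bytesPerEvent(appV2.indexGB)
);

console.log({ appV1Total, appV2Total });
```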

&lt;p&gt;Analyzing the tables above, we can see that from &lt;code&gt;appV1&lt;/code&gt; to &lt;code&gt;appV2&lt;/code&gt;, we increased the data size by 6% and decreased the index size by 17%. We can say that our goal of making the index on the &lt;code&gt;_id&lt;/code&gt; field smaller was accomplished.&lt;/p&gt;

&lt;p&gt;Looking at the &lt;code&gt;Event Statistics&lt;/code&gt;, the total size per event value decreased only by 1.8%, from 128.1B to 125.8B. With this difference being so small, there is a good chance that we won’t see significant improvements from a performance point of view.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV2&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV1&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Performance remains suboptimal. During the first 10 minutes of the test, &lt;code&gt;appV2&lt;/code&gt; reaches only 17 reports per second against a target of 25 reports per second, slightly better than &lt;code&gt;appV1&lt;/code&gt;. For the rest of the test, both versions perform equally badly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmmthz60mq5rxpbwsj90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmmthz60mq5rxpbwsj90.png" alt="Get Reports Rate - appV1 vs appV2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;appV2&lt;/code&gt; demonstrates considerably worse latency performance compared to &lt;code&gt;appV1&lt;/code&gt;, indicating that the schema changes negatively impacted read operations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegg0kce16qs3v850ca98.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegg0kce16qs3v850ca98.png" alt="Get Reports Latency - appV1 vs appV2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Similar to &lt;code&gt;appV1&lt;/code&gt;, &lt;code&gt;appV2&lt;/code&gt; achieves the target rate of 250 events per second only during the first 10 minutes of testing. For the rest of the test, &lt;code&gt;appV2&lt;/code&gt; performs slightly better than &lt;code&gt;appV1&lt;/code&gt;, but still well below the desired rate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnpq9suhpe3zu4ydkn3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcnpq9suhpe3zu4ydkn3t.png" alt="Bulk Upsert Rate - appV1 vs appV2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;appV2&lt;/code&gt; shows marginal improvement over &lt;code&gt;appV1&lt;/code&gt;, suggesting some benefit from the reduced index size for write operations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo136f70hrl7r1k5truf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo136f70hrl7r1k5truf.png" alt="Bulk Upsert Latency - appV1 vs appV2" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Summary&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The results align with the modest 1.8% improvement observed in the Initial Scenario Statistics. &lt;code&gt;appV2&lt;/code&gt;'s performance characteristics demonstrate that simply restructuring the &lt;code&gt;_id&lt;/code&gt; field provides minimal benefits. The marginal improvements in &lt;code&gt;Bulk Upsert&lt;/code&gt; operations (attributed to smaller indexes) are offset by degraded &lt;code&gt;Get Reports&lt;/code&gt; performance (attributed to larger document sizes), resulting in negligible overall performance gains.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;The following document is a sample from the collection &lt;code&gt;appV2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;6685c0dfc2445d3c5913008f&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0000000000000000000000000000000000000000000000000000000000000001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-25T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looking at this document with the aim of reducing its size, two points of improvement stand out. One is the field &lt;code&gt;key&lt;/code&gt;, which is of type string and always holds 64 characters of hexadecimal data. The other is the names of the status fields, which combined can take up to 30 characters.&lt;/p&gt;

&lt;p&gt;The field &lt;code&gt;key&lt;/code&gt;, as presented in the scenario section, is composed of hexadecimal data, in which each character requires four bits to be represented. In our implementation so far, we have stored this data as strings using UTF-8 encoding, in which each character requires eight bits to be represented. So, we are using double the storage we need. One way around this issue is to store the hexadecimal data in its raw format using the binary data type.&lt;/p&gt;
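&lt;p&gt;To make the halving concrete, here is a small dependency-free sketch that decodes a hexadecimal string into raw bytes and compares the lengths. It uses a plain &lt;code&gt;Uint8Array&lt;/code&gt; and a hypothetical &lt;code&gt;hexToBytes&lt;/code&gt; helper purely for illustration, and it does not validate that every character is a hexadecimal digit:&lt;/p&gt;

```typescript
// Decodes a hexadecimal string into raw bytes: two hex characters per byte.
// Illustration only: characters are not checked for being valid hex digits.
const hexToBytes = (hex: string): Uint8Array => {
  if (hex.length % 2 !== 0) {
    throw new Error("hex string must have an even length");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i !== bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
};

// Joining 64 empty slots with "0" yields 63 zeros; appending "1" gives 64 characters.
const sampleKey = new Array(64).join("0") + "1";
const rawKey = hexToBytes(sampleKey);

// Hex digits are ASCII, so the UTF-8 string takes one byte per character.
console.log({ stringBytes: sampleKey.length, binaryBytes: rawKey.length });
```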

&lt;p&gt;For the status field names, we can see that the names use more storage than the values themselves. The field names are strings with at least 7 UTF-8 characters, which take at least 7 bytes. The value of each status field is a 32-bit integer, which takes 4 bytes. We can shorten each status name to its first character, so &lt;code&gt;approved&lt;/code&gt; becomes &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;noFunds&lt;/code&gt; becomes &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt; becomes &lt;code&gt;p&lt;/code&gt;, and &lt;code&gt;rejected&lt;/code&gt; becomes &lt;code&gt;r&lt;/code&gt;.&lt;/p&gt;
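&lt;p&gt;The effect of the shorter names can be estimated from the BSON layout, where each element is stored as a one-byte type tag, the field name as a null-terminated string, and then the value. The sketch below applies that layout to the four status fields; the helper names are illustrative, and it assumes ASCII field names and 32-bit integer values as described above:&lt;/p&gt;

```typescript
// Bytes used by one BSON element holding a 32-bit integer:
// 1 type byte + field name + 1 null terminator + 4 value bytes.
// Assumes ASCII field names, so each character takes one byte.
const int32ElementSize = (fieldName: string): number =>
  1 + fieldName.length + 1 + 4;

// Sums the element sizes for a set of status fields.
const totalSize = (names: string[]): number =>
  names.reduce((sum, name) => sum + int32ElementSize(name), 0);

const longNames = ["approved", "noFunds", "pending", "rejected"];
const shortNames = ["a", "n", "p", "r"];

const longTotal = totalSize(longNames);
const shortTotal = totalSize(shortNames);

console.log({ longTotal, shortTotal, saved: longTotal - shortTotal });
```

&lt;p&gt;These totals apply to a document in which all four status fields are present. Since the status fields are optional, the average saving per document will be smaller.&lt;/p&gt;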

&lt;h2&gt;
  
  
  Application Version 3 (appV3): Better Data Types and Field Name Shorthanding &lt;a&gt;&lt;/a&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As discussed in the issues and improvements of &lt;code&gt;appV2&lt;/code&gt;, two improvements were proposed to reduce the document size. One is to convert the data type of the field &lt;code&gt;key&lt;/code&gt; from string to binary, requiring four bits to represent each hexadecimal character instead of the eight bits of a UTF-8 character. The other is to shorten the name of each status field to its first letter, requiring one byte for each field name instead of at least seven bytes. Each document would still register the status totals for one user, specified by the field &lt;code&gt;key&lt;/code&gt;, in one day, specified by the field &lt;code&gt;date&lt;/code&gt;, the same way it was done in the previous implementations.&lt;/p&gt;

&lt;p&gt;To convert the &lt;code&gt;key&lt;/code&gt; value from string to binary/buffer, the following TypeScript function was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buildKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The third application version has two improvements compared to the second version. The improvement of storing the field &lt;code&gt;key&lt;/code&gt; as binary data to reduce its storage need would have been thought of by an intermediate to senior MongoDB developer. The improvement of shortening the names of the status fields would have been thought of by an intermediate MongoDB developer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV3&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV2&lt;/code&gt;, with the differences being the &lt;code&gt;filter&lt;/code&gt; criteria and the &lt;code&gt;$inc&lt;/code&gt; operation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Targets the document whose &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields match the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; of the &lt;code&gt;event&lt;/code&gt; document.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;key&lt;/code&gt; is converted to binary format using the &lt;code&gt;buildKey&lt;/code&gt; function.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;update&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses the &lt;code&gt;$inc&lt;/code&gt; operator to increment counters (&lt;code&gt;a&lt;/code&gt;, &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, &lt;code&gt;r&lt;/code&gt;) based on the event data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Get Reports
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV2&lt;/code&gt;, with the differences being the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage and the names of the status fields in the &lt;code&gt;$group&lt;/code&gt; stage.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;key&lt;/code&gt; field is converted to binary format using the &lt;code&gt;buildKey&lt;/code&gt; function.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $group: {...} }&lt;/code&gt;&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Computes the sum of the &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, and &lt;code&gt;r&lt;/code&gt; fields using the &lt;code&gt;$sum&lt;/code&gt; operator.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;Similar to &lt;code&gt;appV2&lt;/code&gt;, &lt;code&gt;appV3&lt;/code&gt; relies on a compound index on the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields to optimize &lt;code&gt;Bulk Upsert&lt;/code&gt; and &lt;code&gt;Get Reports&lt;/code&gt; operations. Even though the &lt;code&gt;key&lt;/code&gt; field is now stored as binary data and status field names are shortened, the query patterns remain the same, necessitating the following index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV3&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV2&lt;/td&gt;
&lt;td&gt;359,614,536&lt;/td&gt;
&lt;td&gt;41.92GB&lt;/td&gt;
&lt;td&gt;126B&lt;/td&gt;
&lt;td&gt;10.46GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.66GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV3&lt;/td&gt;
&lt;td&gt;359,633,376&lt;/td&gt;
&lt;td&gt;28.7GB&lt;/td&gt;
&lt;td&gt;86B&lt;/td&gt;
&lt;td&gt;8.96GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.37GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV2&lt;/td&gt;
&lt;td&gt;90B&lt;/td&gt;
&lt;td&gt;35.8B&lt;/td&gt;
&lt;td&gt;125.8B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV3&lt;/td&gt;
&lt;td&gt;62B&lt;/td&gt;
&lt;td&gt;35.2B&lt;/td&gt;
&lt;td&gt;96.8B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Analyzing the tables above, we can see that from &lt;code&gt;appV2&lt;/code&gt; to &lt;code&gt;appV3&lt;/code&gt;, there was practically no change in the index size and a decrease of 32% in the data size. Our goal of reducing the document size was accomplished.&lt;/p&gt;

&lt;p&gt;Looking at the &lt;code&gt;Event Statistics&lt;/code&gt;, the total size per event decreased by 23%, from 125.8B to 96.8B. With this reduction, we’ll probably see considerable improvements in the load test results.&lt;/p&gt;
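&lt;p&gt;The per-event figures in the tables can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes the "GB" values are binary gigabytes (2^30 bytes), and the names &lt;code&gt;perEvent&lt;/code&gt;, &lt;code&gt;round1&lt;/code&gt;, and &lt;code&gt;eventStats&lt;/code&gt; are hypothetical helpers, not part of the application code.&lt;/p&gt;

```typescript
// Assumption: the "GB" values in the tables are binary gigabytes (2^30 bytes).
const BYTES_PER_GB = 1024 ** 3;
const TOTAL_EVENTS = 500_000_000;

type CollectionStats = { name: string; dataSizeGB: number; indexSizeGB: number };

// Bytes per event for a size given in GB.
const perEvent = (sizeGB: number): number => (sizeGB * BYTES_PER_GB) / TOTAL_EVENTS;

// Rounds to one decimal place, as in the Event Statistics table.
const round1 = (value: number): number => Math.round(value * 10) / 10;

const eventStats = (stats: CollectionStats) => {
  const data = perEvent(stats.dataSizeGB);
  const index = perEvent(stats.indexSizeGB);
  return { name: stats.name, data: round1(data), index: round1(index), total: round1(data + index) };
};

const appV2 = eventStats({ name: "appV2", dataSizeGB: 41.92, indexSizeGB: 16.66 });
const appV3 = eventStats({ name: "appV3", dataSizeGB: 28.7, indexSizeGB: 16.37 });

// Relative reduction of the total size per event, in whole percent.
const totalReduction = Math.round((1 - appV3.total / appV2.total) * 100);
console.log(appV2, appV3, `${totalReduction}% smaller per event`);
```

&lt;p&gt;Under that assumption the totals come out at 125.8B and 96.8B per event, which matches the table and the 23% reduction quoted above.&lt;/p&gt;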

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV3&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV2&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;While still falling short of the 25 reports per second target, &lt;code&gt;appV3&lt;/code&gt; demonstrates some improvement, maintaining approximately 16 reports per second for half the test duration, an enhancement over &lt;code&gt;appV2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjwgw9z2pg7y3nz7o7sf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjwgw9z2pg7y3nz7o7sf.png" alt="Get Reports Rate - appV2 vs appV3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Latency holds at approximately 1.2 seconds for the first 100 minutes before degrading to levels similar to previous versions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cgjhxwygwh65eubm1h6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cgjhxwygwh65eubm1h6.png" alt="Get Reports Latency - appV2 vs appV3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Successfully maintains target rates for the first 100 minutes of testing - achieving 250 events per second from 0-50 minutes and 500 events per second from 50-100 minutes. This marks the first version to sustain target performance for extended periods.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4itc8qpi9vcvpvck67v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4itc8qpi9vcvpvck67v.png" alt="Bulk Upsert Rate - appV2 vs appV3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Latency holds at approximately 2.5 seconds during the first 100 minutes, considerably better than previous implementations, before degrading in the final test phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdyup9jsw0mja3hmqy2k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdyup9jsw0mja3hmqy2k.png" alt="Bulk Upsert Latency - appV2 vs appV3" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;Looking at the &lt;code&gt;collection stats&lt;/code&gt; of &lt;code&gt;appV3&lt;/code&gt; and thinking about how MongoDB is executing our queries and what indexes are being used, we can see that the &lt;code&gt;_id&lt;/code&gt; field and its index aren't being used in our application. The field by itself is not a big deal from a performance standpoint, but its obligatory unique index is: every time a new document is inserted in the collection, the index structure on the &lt;code&gt;_id&lt;/code&gt; field has to be updated.&lt;/p&gt;

&lt;p&gt;Going back to the idea from &lt;code&gt;appV1&lt;/code&gt; of trying to take advantage of the obligatory &lt;code&gt;_id&lt;/code&gt; field and its index, is there a way that we can use it in our application?&lt;/p&gt;

&lt;p&gt;Let's take a look at our filtering criteria in the &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt; functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bulkUpsertFilter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;getReportsFilter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2021-06-15&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In both filtering criteria, the &lt;code&gt;key&lt;/code&gt; field is compared using equality. The &lt;code&gt;date&lt;/code&gt; field is compared using equality in the &lt;code&gt;Bulk Upsert&lt;/code&gt; and range in the &lt;code&gt;Get Reports&lt;/code&gt;. What if we combine these two field values in just one, concatenating them, and store it in &lt;code&gt;_id&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;To guide us on how we should order the fields in the resulting concatenated value and get the best performance of the index on it, let's follow the Equality, Sort, and Range rule (ESR).&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;key&lt;/code&gt; field is compared by equality in both filters, so it goes first in the concatenated value. Our queries have no Sort operation, so that step can be skipped. The &lt;code&gt;date&lt;/code&gt; field is compared by equality in &lt;code&gt;Bulk Upsert&lt;/code&gt; but by range in &lt;code&gt;Get Reports&lt;/code&gt;, so it goes second. With that, the concatenation order that gets the best performance from the index is &lt;code&gt;key&lt;/code&gt;+&lt;code&gt;date&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;One point of attention is how we are going to format the &lt;code&gt;date&lt;/code&gt; field in this concatenation in a way that the range filter works, and we don't store more data than we really need. One possible implementation will be presented and tested in the next application version, &lt;code&gt;appV4&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Version 4 (appV4): Taking Advantage of the &lt;code&gt;_id&lt;/code&gt; Index
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As presented in the issues and improvements of &lt;code&gt;appV3&lt;/code&gt;, one way to take advantage of the obligatory &lt;code&gt;_id&lt;/code&gt; field and its index is to store in it the concatenated value of &lt;code&gt;key&lt;/code&gt; + &lt;code&gt;date&lt;/code&gt;. What we still need to decide is what data type the &lt;code&gt;_id&lt;/code&gt; field will have and how we are going to format the &lt;code&gt;date&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;As seen in previous implementations, storing the &lt;code&gt;key&lt;/code&gt; field as binary/hexadecimal data improved the performance. So, let's see if we can also store the resulting concatenated field, &lt;code&gt;key&lt;/code&gt; + &lt;code&gt;date&lt;/code&gt;, as binary/hexadecimal.&lt;/p&gt;

&lt;p&gt;To store the &lt;code&gt;date&lt;/code&gt; field in a binary/hexadecimal type, we have some options. One could be converting it to a 4-byte timestamp that measures the seconds since the Unix epoch, and the other could be converting it to the format &lt;code&gt;YYYYMMDD&lt;/code&gt;, which stores year, month, and day. Both cases would require the same 32 bits/8 hexadecimal characters.&lt;/p&gt;
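&lt;p&gt;As a rough illustration of the two options, the sketch below encodes the same date both ways and checks that each result is 4 bytes long. The helper names &lt;code&gt;toTimestampBytes&lt;/code&gt; and &lt;code&gt;toDayBytes&lt;/code&gt; are hypothetical, and the big-endian byte order for the timestamp is an assumption made so that byte order follows chronological order.&lt;/p&gt;

```typescript
// Option 1: seconds since the Unix epoch as an unsigned 32-bit big-endian integer.
const toTimestampBytes = (date: Date): Buffer => {
  const bytes = Buffer.alloc(4);
  bytes.writeUInt32BE(Math.floor(date.getTime() / 1000));
  return bytes;
};

// Option 2: the digits of YYYYMMDD read as 8 hexadecimal characters.
const toDayBytes = (date: Date): Buffer =>
  Buffer.from(date.toISOString().slice(0, 10).replace(/-/g, ""), "hex");

const date = new Date("2022-01-01T00:00:00Z");
const timestampBytes = toTimestampBytes(date);
const dayBytes = toDayBytes(date);

// Both encodings occupy 4 bytes.
console.log(timestampBytes.length, dayBytes.length);
// In the second option the calendar date stays readable in the hexadecimal form.
console.log(dayBytes.toString("hex"));
```

&lt;p&gt;The second encoding is not a number in the usual sense: each decimal digit of the date simply occupies one hexadecimal character, which keeps the value human-readable when the &lt;code&gt;_id&lt;/code&gt; is inspected as hex.&lt;/p&gt;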

&lt;p&gt;For our case, let's use the second option and store the &lt;code&gt;date&lt;/code&gt; value as &lt;code&gt;YYYYMMDD&lt;/code&gt; because it'll help in future implementation/improvements. Considering a &lt;code&gt;key&lt;/code&gt; field with the value &lt;code&gt;0001&lt;/code&gt; and a &lt;code&gt;date&lt;/code&gt; field with the value &lt;code&gt;2022-01-01&lt;/code&gt;, we would have the following &lt;code&gt;_id&lt;/code&gt; field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;000120220101&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To concatenate and convert the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields to their desired format and type, the following TypeScript function was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buildId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;T&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/-/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// YYYYMMDD&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;day&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each document would still register the status totals for one user in one day, now identified by the &lt;code&gt;_id&lt;/code&gt; field, the same way it's done in the previous implementations.&lt;/p&gt;

&lt;p&gt;These changes reflect an advanced understanding of MongoDB's indexing strategies and storage optimization techniques, demonstrating the expertise of a very experienced senior developer with deep knowledge of BSON data types and compound key design patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;p&gt;The application implementation presented above would have the following TypeScript document schema, named &lt;code&gt;SchemaV4&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SchemaV4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;a&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;n&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;p&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;r&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Upsert
&lt;/h3&gt;

&lt;p&gt;Based on the specification presented, we have the following &lt;code&gt;updateOne&lt;/code&gt; operation for each &lt;code&gt;event&lt;/code&gt; generated by this application version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;updateOne&lt;/code&gt; operation has a similar logic to the one in &lt;code&gt;appV3&lt;/code&gt;, with the only difference being the &lt;code&gt;filter&lt;/code&gt; criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;filter&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Target the document where the &lt;code&gt;_id&lt;/code&gt; field matches the concatenated value of &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;buildId&lt;/code&gt; function converts the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; into a binary format.&lt;/li&gt;
&lt;/ul&gt;
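&lt;p&gt;A minimal sketch of how one event could be turned into this operation is shown below. It assumes each event carries only some of the status counters; the &lt;code&gt;StatusEvent&lt;/code&gt; type and the &lt;code&gt;toUpsertOperation&lt;/code&gt; helper are hypothetical names introduced for illustration, and &lt;code&gt;buildId&lt;/code&gt; is repeated in an equivalent form so the block is self-contained. Counters that are not present are left out of &lt;code&gt;$inc&lt;/code&gt; so the result does not depend on how the driver serializes &lt;code&gt;undefined&lt;/code&gt; values.&lt;/p&gt;

```typescript
// Hypothetical event shape: one optional counter per status.
type StatusEvent = {
  key: string; // hexadecimal string
  date: Date;
  approved?: number;
  noFunds?: number;
  pending?: number;
  rejected?: number;
};

// Equivalent to the article's buildId: key + YYYYMMDD, stored as binary.
const buildId = (key: string, date: Date): Buffer => {
  const day = date.toISOString().slice(0, 10).replace(/-/g, ""); // YYYYMMDD
  return Buffer.from(`${key}${day}`, "hex");
};

// Builds the updateOne operation, skipping counters the event does not carry.
const toUpsertOperation = (event: StatusEvent) => {
  const increments: { [field: string]: number } = {};
  if (event.approved !== undefined) increments.a = event.approved;
  if (event.noFunds !== undefined) increments.n = event.noFunds;
  if (event.pending !== undefined) increments.p = event.pending;
  if (event.rejected !== undefined) increments.r = event.rejected;

  return {
    updateOne: {
      filter: { _id: buildId(event.key, event.date) },
      update: { $inc: increments },
      upsert: true,
    },
  };
};

const operation = toUpsertOperation({ key: "0001", date: new Date("2022-01-01"), approved: 1 });
console.log(operation.updateOne.filter._id.toString("hex"), operation.updateOne.update.$inc);
```

&lt;p&gt;An array of such operations could then be handed to the driver's bulk write call, which is how the rest of the series batches events.&lt;/p&gt;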

&lt;h3&gt;
  
  
  Get Reports
&lt;/h3&gt;

&lt;p&gt;To fulfill the &lt;code&gt;Get Reports&lt;/code&gt; operation, five aggregation pipelines are required, one for each date interval. Each pipeline follows the same structure, differing only in the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;oneYear&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;$lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;aggregation&lt;/code&gt; pipeline follows the same logic as the one in &lt;code&gt;appV1&lt;/code&gt;; the only difference is the filtering criteria in the &lt;code&gt;$match&lt;/code&gt; stage.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{ $match: {...} }&lt;/code&gt;&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;_id&lt;/code&gt; field is a binary representation of the concatenated &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; values.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;$gte&lt;/code&gt; operator sets the inclusive start of the date range, while &lt;code&gt;$lt&lt;/code&gt; sets the exclusive end.&lt;/li&gt;
&lt;/ul&gt;
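&lt;p&gt;Range filtering on a binary &lt;code&gt;_id&lt;/code&gt; works because every identifier has the same length: MongoDB compares binary values by length first, then by subtype, then byte by byte, so fixed-length identifiers sort by &lt;code&gt;key&lt;/code&gt; and then by date. The sketch below illustrates the idea under stated assumptions: &lt;code&gt;buildIdSketch&lt;/code&gt;, &lt;code&gt;toHex&lt;/code&gt;, and &lt;code&gt;inRange&lt;/code&gt; are hypothetical helpers written for this example, not the series' actual &lt;code&gt;buildId&lt;/code&gt;, and hex strings stand in for the binary values so the comparison can run without a database.&lt;/p&gt;

```typescript
// Hypothetical stand-in for buildId: packs the hex `key` followed by the
// date as YYYYMMDD into raw bytes. Assumes `key` is an even-length hex string.
const buildIdSketch = (key: string, date: Date): Uint8Array => {
  const yyyymmdd = date.toISOString().slice(0, 10).replace(/-/g, "");
  const pairs = (key + yyyymmdd).match(/../g) ?? [];
  return Uint8Array.from(pairs.map((pair) => parseInt(pair, 16)));
};

// Fixed-length lowercase hex strings order exactly like the bytes they encode.
const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes, (byte) => byte.toString(16).padStart(2, "0")).join("");

const sketchKey = "0".repeat(63) + "1";
const otherKey = "0".repeat(63) + "2";

const rangeStart = toHex(buildIdSketch(sketchKey, new Date("2021-06-15")));
const rangeEnd = toHex(buildIdSketch(sketchKey, new Date("2022-06-15")));

// Mirrors the filter { _id: { $gte: rangeStart, $lt: rangeEnd } }.
const inRange = (id: string): boolean =>
  id >= rangeStart ? rangeEnd > id : false;

const sameUserInside = toHex(buildIdSketch(sketchKey, new Date("2022-02-01")));
const sameUserAtEnd = toHex(buildIdSketch(sketchKey, new Date("2022-06-15")));
const differentUser = toHex(buildIdSketch(otherKey, new Date("2022-02-01")));

console.log(inRange(sameUserInside)); // true
console.log(inRange(sameUserAtEnd)); // false: $lt excludes the end
console.log(inRange(differentUser)); // false: another key sorts outside
```

&lt;p&gt;Because the &lt;code&gt;key&lt;/code&gt; occupies the leading bytes, all documents for one user sit next to each other in the index, ordered by date, which is what lets a single range scan serve the report.&lt;/p&gt;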

&lt;h3&gt;
  
  
  Indexes
&lt;/h3&gt;

&lt;p&gt;The key design goal of &lt;code&gt;appV4&lt;/code&gt; was to leverage the mandatory, default index on the &lt;code&gt;_id&lt;/code&gt; field. By storing the concatenated &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; (formatted as &lt;code&gt;YYYYMMDD&lt;/code&gt; and converted to binary) directly in the &lt;code&gt;_id&lt;/code&gt; field, &lt;code&gt;appV4&lt;/code&gt; eliminates the need for any additional custom indexes.&lt;/p&gt;

&lt;p&gt;The default index on &lt;code&gt;_id&lt;/code&gt; is automatically created by MongoDB and is unique. This index now directly supports the filtering requirements for both &lt;code&gt;Bulk Upsert&lt;/code&gt; (equality match on the full &lt;code&gt;_id&lt;/code&gt;) and &lt;code&gt;Get Reports&lt;/code&gt; (range queries on the &lt;code&gt;_id&lt;/code&gt; based on &lt;code&gt;key&lt;/code&gt; and date ranges).&lt;/p&gt;

&lt;h3&gt;
  
  
  Initial Scenario Statistics
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Collection Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the performance of &lt;code&gt;appV4&lt;/code&gt;, we inserted 500 million event documents into the collection using the schema and &lt;code&gt;Bulk Upsert&lt;/code&gt; function described earlier. For comparison, the tables below also include statistics from previous comparable application versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV3&lt;/td&gt;
&lt;td&gt;359,633,376&lt;/td&gt;
&lt;td&gt;28.7GB&lt;/td&gt;
&lt;td&gt;86B&lt;/td&gt;
&lt;td&gt;8.96GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.37GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV4&lt;/td&gt;
&lt;td&gt;359,615,279&lt;/td&gt;
&lt;td&gt;19.66GB&lt;/td&gt;
&lt;td&gt;59B&lt;/td&gt;
&lt;td&gt;6.69GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;9.5GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Event Statistics
&lt;/h4&gt;

&lt;p&gt;To evaluate the storage efficiency per event, the &lt;code&gt;Event Statistics&lt;/code&gt; are calculated by dividing the total Data Size and Index Size by the 500 million events.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Data Size/Event&lt;/th&gt;
&lt;th&gt;Index Size/Event&lt;/th&gt;
&lt;th&gt;Total Size/Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV3&lt;/td&gt;
&lt;td&gt;62B&lt;/td&gt;
&lt;td&gt;35.2B&lt;/td&gt;
&lt;td&gt;96.8B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV4&lt;/td&gt;
&lt;td&gt;42.4B&lt;/td&gt;
&lt;td&gt;20.4B&lt;/td&gt;
&lt;td&gt;62.6B&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Analyzing the tables above, we can see that from &lt;code&gt;appV3&lt;/code&gt; to &lt;code&gt;appV4&lt;/code&gt; we reduced the data size by 32% and the index size by 42%, both substantial improvements. We also have one fewer index to maintain.&lt;/p&gt;

&lt;p&gt;Looking at the &lt;code&gt;Event Statistics&lt;/code&gt;, the total size per event decreased by 35%, from 96.8B to 62.6B. With this reduction, we would expect to see a meaningful improvement in performance.&lt;/p&gt;
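&lt;p&gt;The per-event totals and the overall reduction can be reproduced from the collection statistics with a few lines of arithmetic. This sketch assumes that the "GB" values are binary gigabytes (2^30 bytes), an assumption that matches the published totals to within rounding; &lt;code&gt;bytesPerEvent&lt;/code&gt; and &lt;code&gt;percentReduction&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```typescript
// Assumption: "GB" in the tables means 2^30 bytes.
const BYTES_PER_GB = 1024 ** 3;
const TOTAL_EVENTS = 500_000_000;

const bytesPerEvent = (sizeInGB: number): number =>
  (sizeInGB * BYTES_PER_GB) / TOTAL_EVENTS;

const percentReduction = (before: number, after: number): number =>
  ((before - after) / before) * 100;

// Data Size and Index Size taken from the Collection Statistics table.
const appV3Sizes = { data: bytesPerEvent(28.7), index: bytesPerEvent(16.37) };
const appV4Sizes = { data: bytesPerEvent(19.66), index: bytesPerEvent(9.5) };

const appV3Total = appV3Sizes.data + appV3Sizes.index;
const appV4Total = appV4Sizes.data + appV4Sizes.index;

console.log(appV3Total.toFixed(1), appV4Total.toFixed(1)); // 96.8 62.6
console.log(percentReduction(appV3Total, appV4Total).toFixed(1)); // 35.3
```

&lt;p&gt;Since the inputs are themselves rounded, individual per-event values computed this way can differ from the published ones by a tenth of a byte or so.&lt;/p&gt;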

&lt;h3&gt;
  
  
  Load Test Results
&lt;/h3&gt;

&lt;p&gt;Executing the load test for &lt;code&gt;appV4&lt;/code&gt; and plotting it alongside the results for &lt;code&gt;appV3&lt;/code&gt; and &lt;code&gt;Desired&lt;/code&gt; rates, we have the following results for &lt;code&gt;Get Reports&lt;/code&gt; and &lt;code&gt;Bulk Upsert&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;While still not achieving the target of 25 reports per second, &lt;code&gt;appV4&lt;/code&gt; shows consistently better average rates compared to &lt;code&gt;appV3&lt;/code&gt;, representing incremental progress toward optimal performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2q2ko2lkrb1sx12sqnh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2q2ko2lkrb1sx12sqnh.png" alt="Get Reports Rate - appV3 vs appV4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Get Reports Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions exhibit comparable latency behavior throughout most of the test duration. However, during the final 100 minutes when performance degrades, &lt;code&gt;appV4&lt;/code&gt; demonstrates better resilience with smaller latency increases compared to &lt;code&gt;appV3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjl6y8evnamtsbruvrv0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjl6y8evnamtsbruvrv0y.png" alt="Get Reports Latency - appV3 vs appV4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Rates&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions successfully maintain target rates during the first 100 minutes, but &lt;code&gt;appV4&lt;/code&gt; demonstrates superior performance during the degraded final 100 minutes, sustaining higher rates than &lt;code&gt;appV3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fah7gmywbc5s18f9go07l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fah7gmywbc5s18f9go07l.png" alt="Bulk Upsert Rate - appV3 vs appV4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Bulk Upsert Latency&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Both versions exhibit comparable latency behavior throughout most of the test duration. However, during the final 100 minutes when performance degrades, &lt;code&gt;appV4&lt;/code&gt; demonstrates better resilience with smaller latency increases compared to &lt;code&gt;appV3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g00tg8775i36lo8rfir.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g00tg8775i36lo8rfir.png" alt="Bulk Upsert Latency - appV3 vs appV4" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Performance Summary&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Despite the substantial improvements observed in the &lt;code&gt;Initial Scenario Statistics&lt;/code&gt; (35% reduction in Total Size per Event from 96.8B to 62.6B), the performance gains in &lt;code&gt;appV4&lt;/code&gt; are more modest than anticipated. This suggests that while index optimization and storage reduction provide measurable benefits, the fundamental architectural constraints require more significant changes. The results indicate that &lt;code&gt;appV4&lt;/code&gt; has reached the optimization ceiling for the current document-per-day approach.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issues and Improvements
&lt;/h3&gt;

&lt;p&gt;We have spent enough time tuning the documents themselves. Let's now focus on how the application behaves.&lt;/p&gt;

&lt;p&gt;When generating the &lt;code&gt;oneYear&lt;/code&gt; totals, the &lt;code&gt;Get Reports&lt;/code&gt; function needs to retrieve about 60 documents on average and, in the worst case, 365. Accessing each of these documents means visiting one index entry and performing one disk read. How can we increase the data density of our documents and thereby reduce the index entries and read operations needed to build the same report?&lt;/p&gt;

&lt;p&gt;One way of doing that is using the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern" rel="noopener noreferrer"&gt;Bucket Pattern&lt;/a&gt;. According to the &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/design-patterns/group-data/bucket-pattern/" rel="noopener noreferrer"&gt;MongoDB documentation&lt;/a&gt;, "The bucket pattern separates long series of data into distinct objects. Separating large data series into smaller groups can improve query access patterns and simplify application logic."&lt;/p&gt;

&lt;p&gt;Seen through the lens of the bucket pattern, our data has so far been bucketed by user and day: each document holds the status totals for one user on one day. We can widen the bucketing range of our schema and store, in a single document, the events or status totals for a week, a month, or even a quarter.&lt;/p&gt;
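&lt;p&gt;To make the effect of a wider bucket concrete, the following sketch groups one user's daily totals by month and compares how many documents a report would have to touch. It is only an illustration under stated assumptions: &lt;code&gt;DailyTotals&lt;/code&gt;, &lt;code&gt;bucketByMonth&lt;/code&gt;, and the &lt;code&gt;YYYY-MM&lt;/code&gt; bucket key are hypothetical, not the schema adopted later in the series. The short status fields &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, and &lt;code&gt;r&lt;/code&gt; are the ones used by the aggregation pipeline.&lt;/p&gt;

```typescript
// Shortened status fields, as used by the $group stage: a, n, p, r.
type DailyTotals = { date: string; a?: number; n?: number; p?: number; r?: number };

// Hypothetical bucketing rule: one bucket per month, keyed by "YYYY-MM".
const bucketByMonth = (days: DailyTotals[]): { [month: string]: DailyTotals[] } => {
  const buckets: { [month: string]: DailyTotals[] } = {};
  for (const day of days) {
    const month = day.date.slice(0, 7);
    buckets[month] = [...(buckets[month] ?? []), day];
  }
  return buckets;
};

// Six daily documents for one user, spread over three months.
const dailyDocs: DailyTotals[] = [
  { date: "2022-01-03", a: 1 },
  { date: "2022-01-17", p: 1 },
  { date: "2022-01-29", a: 2, r: 1 },
  { date: "2022-02-01", p: 1 },
  { date: "2022-02-14", n: 1 },
  { date: "2022-03-09", a: 1 },
];

const monthlyBuckets = bucketByMonth(dailyDocs);

console.log(dailyDocs.length); // 6 documents (and index entries) before
console.log(Object.keys(monthlyBuckets).length); // 3 after bucketing by month
```

&lt;p&gt;The trade-off is that a report whose range starts or ends in the middle of a bucket still has to filter the individual days inside the first and last bucket it reads.&lt;/p&gt;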

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;That concludes the first part of the series. We covered how indexes work on fields that hold embedded documents and saw some small changes we can make to our application to reduce its storage and index needs and, as a consequence, improve its performance.&lt;/p&gt;

&lt;p&gt;Here is a quick review of the improvements made between the application versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;appV1&lt;/code&gt; to &lt;code&gt;appV2&lt;/code&gt;: Moved the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; fields out of the embedded document in the &lt;code&gt;_id&lt;/code&gt; field, letting &lt;code&gt;_id&lt;/code&gt; take its default value, an ObjectId&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;appV2&lt;/code&gt; to &lt;code&gt;appV3&lt;/code&gt;: Reduced the document size by shortening the names of the status fields and changing the data type of the &lt;code&gt;key&lt;/code&gt; field from a hexadecimal string to binary&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;appV3&lt;/code&gt; to &lt;code&gt;appV4&lt;/code&gt;: Removed the need for an extra index by concatenating the values of &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; and storing them in the &lt;code&gt;_id&lt;/code&gt; field&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So far, none of our application versions has come close to the desired rates, but let's not give up. As presented in the issues and improvements of &lt;code&gt;appV4&lt;/code&gt;, we can still improve our application by using the Bucket Pattern. The Bucket Pattern, combined with the &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-the-computed-pattern" rel="noopener noreferrer"&gt;Computed Pattern&lt;/a&gt;, will be the main source of improvement for the next application version, &lt;code&gt;appV5&lt;/code&gt;, and its revisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Index on Embedded Documents
&lt;/h3&gt;

&lt;p&gt;This section examines how MongoDB indexes embedded document fields and explains why the &lt;code&gt;appV1&lt;/code&gt; implementation requires an additional index beyond the default &lt;code&gt;_id&lt;/code&gt; index.&lt;/p&gt;

&lt;h4&gt;
  
  
  Index Behavior Analysis
&lt;/h4&gt;

&lt;p&gt;To understand MongoDB's indexing behavior with embedded documents, we'll analyze how the default &lt;code&gt;_id&lt;/code&gt; index performs with our &lt;code&gt;appV1&lt;/code&gt; query patterns. The following tests demonstrate the difference between exact document matching and embedded field queries through the &lt;a href="https://www.mongodb.com/docs/manual/reference/command/explain/" rel="noopener noreferrer"&gt;&lt;code&gt;explain&lt;/code&gt;&lt;/a&gt; functionality.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// A sample document&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2020-01-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Making sure we have an empty collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Inserting the document in the `appV1` collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Finding a document using `Bulk Upsert` filtering criteria&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bulkUpsertFilter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2020-01-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;bulkUpsertFilter&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;explain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;executionStats&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="cm"&gt;/*{
...
  executionStats: {
    nReturned: 1,
    totalKeysExamined: 1,
    totalDocsExamined: 1,
    ...
    executionStages: {
      stage: 'EXPRESS_IXSCAN',
      ...
    }
    ...
  },
  ...
}*/&lt;/span&gt;

&lt;span class="c1"&gt;// Finding a document using `Get Reports` filtering criteria&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;getReportsFilter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id.date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2019-01-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;$lte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2021-01-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appV1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;getReportsFilter&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;explain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;executionStats&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="cm"&gt;/*{
...
  executionStats: {
    nReturned: 1,
    totalKeysExamined: 0,
    totalDocsExamined: 1,
    ...
    executionStages: {
      stage: 'COLLSCAN',
      ...
    }
    ...
  },
  ...
}*/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Index Utilization Results
&lt;/h4&gt;

&lt;p&gt;The execution statistics reveal a critical performance difference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bulk Upsert Query&lt;/strong&gt;: Uses the index efficiently (&lt;code&gt;EXPRESS_IXSCAN&lt;/code&gt;) because it matches the entire embedded document exactly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get Reports Query&lt;/strong&gt;: Performs a collection scan (&lt;code&gt;COLLSCAN&lt;/code&gt;) because it queries individual fields within the embedded document&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This behavior occurs because MongoDB treats embedded documents as atomic values when indexing, not as collections of individual fields.&lt;/p&gt;

&lt;h4&gt;
  
  
  MongoDB's Embedded Document Indexing Strategy
&lt;/h4&gt;

&lt;p&gt;MongoDB handles different data types with varying indexing approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primitive Types&lt;/strong&gt;: Directly indexed with their native values&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.mongodb.com/docs/manual/core/indexes/index-types/index-multikey/create-multikey-index-basic/" rel="noopener noreferrer"&gt;&lt;strong&gt;Arrays&lt;/strong&gt;&lt;/a&gt;: Special indexing that creates entries for each array element&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.mongodb.com/docs/manual/core/indexes/index-types/index-single/create-embedded-object-index/" rel="noopener noreferrer"&gt;&lt;strong&gt;Embedded Documents&lt;/strong&gt;&lt;/a&gt;: Indexed as serialized, atomic values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For embedded documents, MongoDB builds each index entry from the entire embedded value, field names and field order included. Conceptually, the entry behaves like a single serialized value. The string below is only an illustration of that idea, not MongoDB's actual index key format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;documentValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0001&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2010-01-01T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="c1"&gt;// Illustration only: the whole embedded value is indexed as one unit&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;indexValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;{key:0001,date:2010-01-01T00:00:00.000Z}&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Index Limitation Implications
&lt;/h4&gt;

&lt;p&gt;This indexing strategy creates a fundamental limitation: since the index stores the embedded document as a single serialized value, MongoDB cannot use that index to search individual fields within the structure. Consequently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queries matching the entire embedded document can use the index effectively&lt;/li&gt;
&lt;li&gt;Queries targeting specific embedded fields (like &lt;code&gt;_id.key&lt;/code&gt; or &lt;code&gt;_id.date&lt;/code&gt;) cannot utilize the index&lt;/li&gt;
&lt;li&gt;Range queries on embedded fields fall back to a full collection scan unless another index covers those fields&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This explains why the &lt;code&gt;appV1&lt;/code&gt; implementation requires an additional compound index on &lt;code&gt;_id.key&lt;/code&gt; and &lt;code&gt;_id.date&lt;/code&gt; to support efficient querying of individual embedded document fields.&lt;/p&gt;
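&lt;p&gt;The behavior described in this appendix can be mimicked with a toy model: a lookup table keyed by the whole embedded &lt;code&gt;_id&lt;/code&gt; value. This is a simplified sketch and not how MongoDB stores index entries; &lt;code&gt;toyDocs&lt;/code&gt;, &lt;code&gt;toyIndex&lt;/code&gt;, and the use of &lt;code&gt;JSON.stringify&lt;/code&gt; as the serialization are assumptions made for the illustration.&lt;/p&gt;

```typescript
// Toy model only: a lookup table keyed by the whole embedded `_id` value.
// MongoDB's real index format is different; this only mirrors the behavior.
const toyDocs = [
  { _id: { key: "0001", date: "2020-01-01" }, approved: 2 },
  { _id: { key: "0001", date: "2020-01-02" }, rejected: 1 },
  { _id: { key: "0002", date: "2020-01-01" }, pending: 1 },
];

const toyIndex: { [wholeId: string]: number } = {};
toyDocs.forEach((doc, position) => {
  toyIndex[JSON.stringify(doc._id)] = position;
});

// Whole-value match: a single direct lookup, like the Bulk Upsert filter.
const exactHit = toyIndex[JSON.stringify({ key: "0001", date: "2020-01-01" })];

// Same fields in a different order form a different value, so no hit.
const reorderedHit = toyIndex[JSON.stringify({ date: "2020-01-01", key: "0001" })];

// Field-level match: the table offers no entry point, so every document
// has to be examined, like the Get Reports filter on `_id.key`.
let examined = 0;
const matchesForKey = toyDocs.filter((doc) => {
  examined += 1;
  return doc._id.key === "0001";
});

console.log(exactHit); // 0
console.log(reorderedHit); // undefined
console.log(matchesForKey.length, examined); // 2 3
```

&lt;p&gt;A compound index on &lt;code&gt;_id.key&lt;/code&gt; and &lt;code&gt;_id.date&lt;/code&gt; plays the role of a second table keyed by those individual fields, which gives field-level queries the entry point they are missing here.&lt;/p&gt;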

</description>
      <category>database</category>
      <category>mongodb</category>
      <category>performance</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>The Cost of Not Knowing MongoDB – Introduction</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Thu, 22 Jan 2026 16:49:52 +0000</pubDate>
      <link>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-introduction-335h</link>
      <guid>https://forem.com/arturgc/the-cost-of-not-knowing-mongodb-introduction-335h</guid>
      <description>&lt;p&gt;The primary focus of this series is to demonstrate the significant performance gains you can achieve—and the costs you can save—by using MongoDB properly. This includes following best practices, studying your application's specific needs, and using those insights to model your data effectively.&lt;/p&gt;

&lt;p&gt;To illustrate these potential gains, we will present a sample application. We will then develop and load-test various MongoDB implementations for this application. These implementations will cater to different levels of MongoDB expertise: beginner, intermediate, senior, and mind-blowing (🤯).&lt;/p&gt;

&lt;p&gt;All code and supplementary information used throughout this series are available in the &lt;a href="https://github.com/ArturGC/the-cost-of-not-knowing-mongodb" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Application: Finding Fraudulent Behavior in Transactions
&lt;/h2&gt;

&lt;p&gt;The application's goal is to identify fraudulent behavior within a financial transaction system. It achieves this by analyzing the status of transactions for a specific user over a defined time period. The possible transaction statuses are &lt;code&gt;approved&lt;/code&gt;, &lt;code&gt;noFunds&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt;, and &lt;code&gt;rejected&lt;/code&gt;. Each user is uniquely identifiable by a 64-character hexadecimal &lt;code&gt;key&lt;/code&gt; value.&lt;/p&gt;

&lt;p&gt;The application receives details of each transaction through an &lt;code&gt;event&lt;/code&gt; document. Each &lt;code&gt;event&lt;/code&gt; document contains information for a single transaction, for one user, on a specific day. Consequently, it will include only one of the possible status fields, with this field having a numeric value of 1. For example, the following &lt;code&gt;event&lt;/code&gt; document represents a &lt;code&gt;pending&lt;/code&gt; transaction for the user with the &lt;code&gt;key&lt;/code&gt; &lt;code&gt;...0001&lt;/code&gt;, which occurred on the &lt;code&gt;date&lt;/code&gt; &lt;code&gt;2022-02-01&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0000000000000000000000000000000000000000000000000000000000000001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-02-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Transaction statuses are analyzed by comparing the total counts of each status for a given user over several trailing periods: &lt;code&gt;oneYear&lt;/code&gt;, &lt;code&gt;threeYears&lt;/code&gt;, &lt;code&gt;fiveYears&lt;/code&gt;, &lt;code&gt;sevenYears&lt;/code&gt;, and &lt;code&gt;tenYears&lt;/code&gt;. These totals are provided in a &lt;code&gt;reports&lt;/code&gt; document, which can be requested by providing the user's &lt;code&gt;key&lt;/code&gt; and the end &lt;code&gt;date&lt;/code&gt; for the report.&lt;/p&gt;

&lt;p&gt;The following is an example of a &lt;code&gt;reports&lt;/code&gt; document for the user with &lt;code&gt;key&lt;/code&gt; &lt;code&gt;...0001&lt;/code&gt; and an end &lt;code&gt;date&lt;/code&gt; of &lt;code&gt;2022-06-15&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;oneYear&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2021-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;threeYears&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2019-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fiveYears&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2017-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sevenYears&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2015-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tenYears&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2022-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2012-06-15T00:00:00.000Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;totals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;noFunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rejected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Load Testing Methodology
&lt;/h2&gt;

&lt;p&gt;To evaluate the performance of each application version, two functions were designed to run concurrently under load:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;code&gt;Bulk Upsert&lt;/code&gt;: Inserts event documents.&lt;/li&gt;
&lt;li&gt; &lt;code&gt;Get Reports&lt;/code&gt;: Generates the &lt;code&gt;reports&lt;/code&gt; document for a specific user &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Parallel execution of these functions was achieved using worker threads, with 20 workers allocated to each. Each application version was tested for 200 minutes, with varying execution parameters applied throughout this period.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Bulk Upsert&lt;/code&gt; Function
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;Bulk Upsert&lt;/code&gt; function receives batches of 250 event documents for registration. As its name suggests, these registrations are performed using MongoDB's &lt;a href="https://www.mongodb.com/docs/manual/reference/method/db.collection.update/#insert-a-new-document-if-no-match-exists--upsert-" rel="noopener noreferrer"&gt;&lt;code&gt;upsert&lt;/code&gt;&lt;/a&gt; functionality, which attempts to update a document or creates a new one if it doesn't exist, using the data from the update operation. Each &lt;code&gt;Bulk Upsert&lt;/code&gt; iteration is timed and its duration is recorded in a secondary database.&lt;/p&gt;
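
&lt;p&gt;To make the upsert idea concrete, here is a minimal sketch of how one batch could be turned into &lt;code&gt;bulkWrite&lt;/code&gt; operations. It is not the code used in the tests: the helper name &lt;code&gt;buildUpsertOperations&lt;/code&gt;, the event shape, the filter, and the &lt;code&gt;$inc&lt;/code&gt; update are all assumptions chosen for illustration, and the event &lt;code&gt;date&lt;/code&gt; is ignored for brevity.&lt;/p&gt;

```javascript
// Hypothetical event shape: { key, date, status }. The filter and update below
// are illustrative assumptions, not the schema of any application version.
function buildUpsertOperations(events) {
  return events.map(function (event) {
    return {
      updateOne: {
        filter: { _id: event.key },
        update: { $inc: { ["totals." + event.status]: 1 } },
        upsert: true, // create the document when the filter matches nothing
      },
    };
  });
}

const operations = buildUpsertOperations([
  { key: "user-1", date: new Date("2022-06-15T00:00:00.000Z"), status: "approved" },
  { key: "user-2", date: new Date("2022-06-15T00:00:00.000Z"), status: "noFunds" },
]);

console.log(JSON.stringify(operations[0]));

// With the Node.js driver, the whole batch could then be sent in one call
// (`collection` is a hypothetical, already connected collection handle):
// await collection.bulkWrite(operations, { ordered: false });
```

&lt;p&gt;Building the operations in a pure function keeps the batch logic testable without a running database.&lt;/p&gt;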

&lt;p&gt;The batch processing rate is divided into four 50-minute phases, totaling 200 minutes. The rate begins at one batch insert per second and is incremented by one batch insert per second every 50 minutes, ultimately reaching four batch inserts per second (equivalent to 1,000 event documents per second).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Get Reports&lt;/code&gt; Function
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;Get Reports&lt;/code&gt; function generates one &lt;code&gt;reports&lt;/code&gt; document per execution. The duration of each execution is timed and recorded in the secondary database.&lt;/p&gt;

&lt;p&gt;The rate of &lt;code&gt;reports&lt;/code&gt; generation is divided into 40 phases, distributed as 10 sub-phases within each of the four &lt;code&gt;Bulk Upsert&lt;/code&gt; phases. Within each &lt;code&gt;Bulk Upsert&lt;/code&gt; phase, the &lt;code&gt;Get Reports&lt;/code&gt; rate starts at 25 report requests per second and increases by 25 requests per second every five minutes. This culminates in 250 report requests per second by the end of that &lt;code&gt;Bulk Upsert&lt;/code&gt; phase.&lt;/p&gt;

&lt;p&gt;The following graph depicts the target rates for &lt;code&gt;Bulk Upsert&lt;/code&gt; and &lt;code&gt;Get Reports&lt;/code&gt; throughout the test scenario:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fccpscpm19v4dgt6ekco8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fccpscpm19v4dgt6ekco8.png" alt="Desired rates for Bulk Upsert and Get Reports" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;
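
&lt;p&gt;The same schedule can be sketched as small functions of elapsed time. This is only an illustration of the rates described above, not the test harness itself: the function names are hypothetical, and time is assumed to be measured in whole minutes from the start of the test (0 to 199).&lt;/p&gt;

```javascript
// Hypothetical helpers; names and the minute-based clock are assumptions.
const PHASE_MINUTES = 50; // length of one Bulk Upsert phase
const SUB_PHASE_MINUTES = 5; // length of one Get Reports sub-phase
const EVENTS_PER_BATCH = 250;

// Target batch inserts per second: 1, 2, 3, then 4.
function bulkUpsertRate(minute) {
  return Math.floor(minute / PHASE_MINUTES) + 1;
}

// Target report requests per second: 25 up to 250, restarting every phase.
function getReportsRate(minute) {
  const minuteInPhase = minute % PHASE_MINUTES;
  return (Math.floor(minuteInPhase / SUB_PHASE_MINUTES) + 1) * 25;
}

// Event documents per second implied by the batch rate.
function eventsPerSecond(minute) {
  return bulkUpsertRate(minute) * EVENTS_PER_BATCH;
}

console.log(bulkUpsertRate(0), getReportsRate(0)); // 1 25
console.log(bulkUpsertRate(199), getReportsRate(199), eventsPerSecond(199)); // 4 250 1000
```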

&lt;h2&gt;
  
  
  Initial Scenario and Data Generation
&lt;/h2&gt;

&lt;p&gt;For a fair comparison across application versions, the initial dataset (working set) for the tests was designed to be larger than the available memory on the MongoDB server. This approach ensures significant cache activity and prevents the entire working set from residing in memory.&lt;/p&gt;

&lt;p&gt;The following parameters were established for the initial dataset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data spanning 10 years: from &lt;code&gt;2010-01-01&lt;/code&gt; to &lt;code&gt;2020-01-01&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;50 million events per year, resulting in a total working set of 500 million events.&lt;/li&gt;
&lt;li&gt;An average of 60 events per user (&lt;code&gt;key&lt;/code&gt;) per year.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given 50 million events per year and 60 events per user per year, the total number of unique users is approximately 833,333 (50,000,000 / 60). The user's &lt;code&gt;key&lt;/code&gt; generator was configured to produce keys following an approximately normal (Gaussian) distribution. This simulates a real-world scenario where some users generate more events than others. The following graph illustrates the distribution of 50 million keys generated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftod5eeqtg3xbz9exonaa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftod5eeqtg3xbz9exonaa.png" alt="Keys Distribution" width="717" height="367"&gt;&lt;/a&gt;&lt;/p&gt;
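
&lt;p&gt;The generator used in the tests is not shown here, so the following is only one possible way to produce approximately normal keys. It averages several uniform random draws, which gives a bell-shaped value, and maps it onto the key range. The names &lt;code&gt;approxNormal01&lt;/code&gt; and &lt;code&gt;generateKey&lt;/code&gt;, the sample count of 6, and the integer key format are all assumptions.&lt;/p&gt;

```javascript
// Hypothetical generator; the sample count and key range are assumptions.
const TOTAL_USERS = 833333; // approximately 50,000,000 / 60

// Averages several uniform draws; the mean of independent uniform values is
// approximately bell-shaped (central limit theorem).
function approxNormal01(random, samples) {
  let sum = 0;
  Array.from({ length: samples }).forEach(function () {
    sum += random();
  });
  return sum / samples;
}

// Maps the bell-shaped value to an integer key between 0 and TOTAL_USERS - 1.
function generateKey(random = Math.random, samples = 6) {
  const key = Math.floor(approxNormal01(random, samples) * TOTAL_USERS);
  return Math.min(key, TOTAL_USERS - 1); // guard against rounding at the edge
}

console.log(generateKey()); // keys near the middle of the range are the most likely
```

&lt;p&gt;Passing the random source as a parameter makes the generator deterministic under test, while the default keeps everyday usage simple.&lt;/p&gt;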

&lt;p&gt;To further simulate a real-world scenario, the distribution of event statuses was set as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;80% &lt;code&gt;approved&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;10% &lt;code&gt;noFunds&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;7.5% &lt;code&gt;pending&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;2.5% &lt;code&gt;rejected&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
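
&lt;p&gt;A weighted pick like this can be sketched with a small lookup table. Because every percentage above is a multiple of 2.5%, 40 slots are enough to represent the distribution exactly. The names &lt;code&gt;STATUS_SLOTS&lt;/code&gt; and &lt;code&gt;pickStatus&lt;/code&gt; are hypothetical; the actual data generator may work differently.&lt;/p&gt;

```javascript
// Each slot represents 2.5% of the distribution: 32 + 4 + 3 + 1 = 40 slots.
const STATUS_SLOTS = [].concat(
  Array(32).fill("approved"), // 80%
  Array(4).fill("noFunds"), // 10%
  Array(3).fill("pending"), // 7.5%
  Array(1).fill("rejected") // 2.5%
);

// Picks a status by choosing one slot uniformly at random.
function pickStatus(random = Math.random) {
  return STATUS_SLOTS[Math.floor(random() * STATUS_SLOTS.length)];
}

console.log(pickStatus());
```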

&lt;h3&gt;
  
  
  Initial Scenario Collection Statistics
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Documents&lt;/th&gt;
&lt;th&gt;Data Size&lt;/th&gt;
&lt;th&gt;Avg. Document Size&lt;/th&gt;
&lt;th&gt;Storage Size&lt;/th&gt;
&lt;th&gt;Indexes&lt;/th&gt;
&lt;th&gt;Index Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;appV1&lt;/td&gt;
&lt;td&gt;359,639,622&lt;/td&gt;
&lt;td&gt;39.58GB&lt;/td&gt;
&lt;td&gt;119B&lt;/td&gt;
&lt;td&gt;8.78GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;20.06GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV2&lt;/td&gt;
&lt;td&gt;359,614,536&lt;/td&gt;
&lt;td&gt;41.92GB&lt;/td&gt;
&lt;td&gt;126B&lt;/td&gt;
&lt;td&gt;10.46GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.66GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV3&lt;/td&gt;
&lt;td&gt;359,633,376&lt;/td&gt;
&lt;td&gt;28.7GB&lt;/td&gt;
&lt;td&gt;86B&lt;/td&gt;
&lt;td&gt;8.96GB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;16.37GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV4&lt;/td&gt;
&lt;td&gt;359,615,279&lt;/td&gt;
&lt;td&gt;19.66GB&lt;/td&gt;
&lt;td&gt;59B&lt;/td&gt;
&lt;td&gt;6.69GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;9.5GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R0&lt;/td&gt;
&lt;td&gt;95,350,431&lt;/td&gt;
&lt;td&gt;19.19GB&lt;/td&gt;
&lt;td&gt;217B&lt;/td&gt;
&lt;td&gt;5.06GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2.95GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R1&lt;/td&gt;
&lt;td&gt;33,429,649&lt;/td&gt;
&lt;td&gt;15.75GB&lt;/td&gt;
&lt;td&gt;506B&lt;/td&gt;
&lt;td&gt;4.04GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.09GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R2&lt;/td&gt;
&lt;td&gt;33,429,649&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.26GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.16GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R3&lt;/td&gt;
&lt;td&gt;33,429,492&lt;/td&gt;
&lt;td&gt;11.96GB&lt;/td&gt;
&lt;td&gt;385B&lt;/td&gt;
&lt;td&gt;3.24GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.11GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV5R4&lt;/td&gt;
&lt;td&gt;33,429,470&lt;/td&gt;
&lt;td&gt;12.88GB&lt;/td&gt;
&lt;td&gt;414B&lt;/td&gt;
&lt;td&gt;3.72GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.24GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R0&lt;/td&gt;
&lt;td&gt;95,350,319&lt;/td&gt;
&lt;td&gt;11.1GB&lt;/td&gt;
&lt;td&gt;125B&lt;/td&gt;
&lt;td&gt;3.33GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3.13GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R1&lt;/td&gt;
&lt;td&gt;33,429,366&lt;/td&gt;
&lt;td&gt;8.19GB&lt;/td&gt;
&lt;td&gt;264B&lt;/td&gt;
&lt;td&gt;2.34GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.22GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R2&lt;/td&gt;
&lt;td&gt;33,429,207&lt;/td&gt;
&lt;td&gt;9.11GB&lt;/td&gt;
&lt;td&gt;293B&lt;/td&gt;
&lt;td&gt;2.8GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.26GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R3&lt;/td&gt;
&lt;td&gt;33,429,694&lt;/td&gt;
&lt;td&gt;9.53GB&lt;/td&gt;
&lt;td&gt;307B&lt;/td&gt;
&lt;td&gt;2.56GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.19GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;appV6R4&lt;/td&gt;
&lt;td&gt;33,429,372&lt;/td&gt;
&lt;td&gt;9.53GB&lt;/td&gt;
&lt;td&gt;307B&lt;/td&gt;
&lt;td&gt;1.47GB&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.34GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Infrastructure Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  MongoDB Server Instance
&lt;/h3&gt;

&lt;p&gt;The MongoDB server ran on an AWS EC2 &lt;code&gt;c7a.large&lt;/code&gt; instance, equipped with 2 vCPUs and 4GB of memory. Two disks were attached:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A 15GB GP3 disk for the operating system.&lt;/li&gt;
&lt;li&gt;A 300GB IO2 disk with 10,000 IOPS for MongoDB data storage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The instance ran Ubuntu 22.04, fully updated at the time of testing. All recommended production settings were applied to optimize MongoDB performance on the available hardware.&lt;/p&gt;

&lt;h3&gt;
  
  
  Application Server Instance
&lt;/h3&gt;

&lt;p&gt;The application server ran on an AWS EC2 &lt;code&gt;c6a.xlarge&lt;/code&gt; instance, featuring 4 vCPUs and 8GB of memory. Two disks were attached:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A 10GB GP3 disk for the operating system.&lt;/li&gt;
&lt;li&gt;A 10GB GP3 disk for a secondary MongoDB server, used for storing load test metrics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This instance also ran Ubuntu 22.04, fully updated. Recommended production settings were applied to optimize its performance.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mongodb</category>
      <category>performance</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Unique Indexes Quirks and Unique Documents In an Array of Documents</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Tue, 06 Jan 2026 13:02:13 +0000</pubDate>
      <link>https://forem.com/arturgc/unique-indexes-quirks-and-unique-documents-in-an-array-of-documents-570a</link>
      <guid>https://forem.com/arturgc/unique-indexes-quirks-and-unique-documents-in-an-array-of-documents-570a</guid>
      <description>&lt;p&gt;This article was reviewed and approved by MongoDB.&lt;/p&gt;

&lt;p&gt;We are developing an application to summarize a user's financial situation. The main page of this application shows us the user's identification and the balances on all banking accounts synced with our application.&lt;/p&gt;

&lt;p&gt;Following the recommendation, repeated in blog posts and guides on getting the most out of MongoDB, that &lt;a href="https://www.mongodb.com/developer/products/mongodb/schema-design-anti-pattern-separating-data/" rel="noopener noreferrer"&gt;"Data that is accessed together should be stored together"&lt;/a&gt;, we came up with the following document structure to store the data used on the main page of the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;john&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;last&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;smith&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;balance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;abc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;balance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;universal bank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;9029481&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Based on the functionality of our application, we determined the following rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A user can register in the application and not sync a bank account;&lt;/li&gt;
&lt;li&gt;An account is identified by its &lt;code&gt;bank&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt; fields;&lt;/li&gt;
&lt;li&gt;The same account shouldn't be registered for two different users;&lt;/li&gt;
&lt;li&gt;The same account shouldn't be registered multiple times for the same user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To enforce what was presented above, we decided to create an index with the following characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Given that the fields &lt;code&gt;bank&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt; must not repeat, this index must be set as &lt;a href="https://www.mongodb.com/docs/manual/core/index-unique/" rel="noopener noreferrer"&gt;Unique&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;Since we are indexing more than one field, it'll be of type &lt;a href="https://www.mongodb.com/docs/manual/core/index-compound/" rel="noopener noreferrer"&gt;Compound&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;Since we are indexing documents inside of an array, it'll also be of type &lt;a href="https://www.mongodb.com/docs/manual/core/index-multikey/" rel="noopener noreferrer"&gt;Multikey&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a result of that, we have a &lt;code&gt;Compound Multikey Unique Index&lt;/code&gt; with the following specification and options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;specification&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.bank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unique Account&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To validate that our index works as we intended, we'll use the following data on our tests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;john&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;last&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;smith&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;john&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;last&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;appleseed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;balance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;abc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, let's create the index and add the users to the collection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;specification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Unique Account&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// { acknowledged: true, insertedId: 1 }&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* MongoServerError: E11000 duplicate key...
...error collection: test.users index: Unique Account dup key: ...
...{ accounts.bank: null, accounts.number: null } */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pretty good, we haven't even started working with the accounts, and we already have an error. Let's see what is going on.&lt;/p&gt;

&lt;p&gt;The error message says we have a duplicate key for the index &lt;code&gt;Unique Account&lt;/code&gt;, with the value &lt;code&gt;null&lt;/code&gt; for the fields &lt;code&gt;accounts.bank&lt;/code&gt; and &lt;code&gt;accounts.number&lt;/code&gt;. This is due to how indexing works in MongoDB: when we insert a document into an indexed collection and the document lacks one or more of the fields specified in the index, the missing fields are treated as &lt;code&gt;null&lt;/code&gt;, and an entry is still added to the index.&lt;/p&gt;

&lt;p&gt;Applying this logic to our previous test: when we inserted &lt;code&gt;user1&lt;/code&gt;, it didn't have the fields &lt;code&gt;accounts.bank&lt;/code&gt; and &lt;code&gt;accounts.number&lt;/code&gt;, so it generated an entry in the index &lt;code&gt;Unique Account&lt;/code&gt; with the value &lt;code&gt;null&lt;/code&gt; for both. When we tried to insert &lt;code&gt;user2&lt;/code&gt;, the same thing happened: a second entry with &lt;code&gt;null&lt;/code&gt; for both fields would have been created if we hadn't specified this index as &lt;code&gt;unique&lt;/code&gt;, so the insert was rejected. More info about missing fields and unique indexes can be found &lt;a href="https://www.mongodb.com/docs/v5.0/core/index-unique/#unique-index-and-missing-field" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
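
&lt;p&gt;The behavior can be reproduced with a toy model in plain JavaScript. This is only a sketch of the idea and not how MongoDB implements indexes: the helpers &lt;code&gt;indexKeysFor&lt;/code&gt; and &lt;code&gt;createToyUniqueIndex&lt;/code&gt; are hypothetical, and the model only covers documents where &lt;code&gt;accounts&lt;/code&gt; is either missing or an array of sub-documents.&lt;/p&gt;

```javascript
// Toy model only: NOT MongoDB's implementation. It mimics one documented
// behavior: a field missing from the document is indexed as null.
function indexKeysFor(doc) {
  if (doc.accounts === undefined) {
    return [JSON.stringify([null, null])];
  }
  return doc.accounts.map(function (account) {
    return JSON.stringify([account.bank ?? null, account.number ?? null]);
  });
}

// Keeps the index entries in a Set and rejects keys seen in earlier documents.
function createToyUniqueIndex() {
  const entries = new Set();
  return {
    insert: function (doc) {
      const keys = indexKeysFor(doc);
      const duplicate = keys.find(function (key) {
        return entries.has(key);
      });
      if (duplicate !== undefined) {
        throw new Error("duplicate key (toy model): " + duplicate);
      }
      keys.forEach(function (key) {
        entries.add(key);
      });
    },
  };
}

const index = createToyUniqueIndex();
index.insert({ _id: 1, name: { first: "john", last: "smith" } }); // stores [null,null]

let message = "no error";
try {
  index.insert({ _id: 2, name: { first: "john", last: "appleseed" } });
} catch (error) {
  message = error.message;
}
console.log(message); // duplicate key (toy model): [null,null]
```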

&lt;p&gt;The solution for this issue is to index only documents that have the fields &lt;code&gt;accounts.bank&lt;/code&gt; and &lt;code&gt;accounts.number&lt;/code&gt;. To accomplish that, we can specify a &lt;a href="https://www.mongodb.com/docs/manual/core/index-partial/#std-label-partial-index-with-unique-constraints" rel="noopener noreferrer"&gt;Partial Filter Expression&lt;/a&gt; in our index options. Now we have a &lt;code&gt;Compound Multikey Unique Partial Index&lt;/code&gt; (fancy name, huh? Who are we trying to impress here?) with the following specification and options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;specification&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.bank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;optionsV2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unique Account V2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;partialFilterExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.bank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$exists&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accounts.number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$exists&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Back to our tests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cleaning our environment&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt; &lt;span class="c1"&gt;// Delete documents and indexes definitions&lt;/span&gt;

&lt;span class="cm"&gt;/* Tests */&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;specification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;optionsV2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Unique Account V2&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// { acknowledged: true, insertedId: 1)}&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// { acknowledged: true, insertedId: 2)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our new index implementation worked, and now we can insert those two users without accounts. Let's test account duplication, starting with the same account for two different users:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cleaning the collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deleteMany&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt; &lt;span class="c1"&gt;// Delete documents, keep indexes&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="cm"&gt;/* Test */&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// { ... matchedCount: 1, modifiedCount: 1 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="cm"&gt;/* MongoServerError: E11000 duplicate key error collection: test.users 
index: Unique Account V2 dup key:...
... { accounts.bank: "abc", accounts.number: "123" } */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As expected, we couldn't insert the same account into two different users. Now we'll try inserting the same account twice for the same user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cleaning the collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deleteMany&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt; &lt;span class="c1"&gt;// Delete documents, keep indexes&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="cm"&gt;/* Test */&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// { ... matchedCount: 1, modifiedCount: 1 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// { ... matchedCount: 1, modifiedCount: 1 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="cm"&gt;/*{
 _id: 1,
 name: { first: 'john', last: 'smith' },
 accounts: [
   { balance: 500, bank: 'abc', number: '123' },
   { balance: 500, bank: 'abc', number: '123' }
 ]
}*/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just when we expect things to fail, they work. This is another error caused by not knowing or considering how indexes work in MongoDB. &lt;a href="https://www.mongodb.com/docs/manual/core/index-unique/#unique-constraint-across-separate-documents" rel="noopener noreferrer"&gt;This part&lt;/a&gt; of the MongoDB documentation explains that the unique constraint applies across separate documents: an index doesn't store duplicate entries that have the same key values and point to the same document. So when we inserted &lt;code&gt;account1&lt;/code&gt; into our user a second time, no new index entry was created, and the index never held a duplicate value to reject.&lt;/p&gt;
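
&lt;p&gt;One rough way to picture this is the plain JavaScript sketch below. It assumes that an index entry can be modeled as the pair of key values plus the &lt;code&gt;_id&lt;/code&gt; of the owning document. The helper &lt;code&gt;indexEntries&lt;/code&gt; is hypothetical and only illustrates the behavior described above; it is not how MongoDB stores indexes internally.&lt;/p&gt;

```javascript
// Simplified model, not MongoDB internals: an index entry is the pair
// (key values, _id of the owning document). Identical pairs collapse into one.
function indexEntries(doc) {
  const entries = new Set();
  for (const account of doc.accounts || []) {
    entries.add(JSON.stringify([account.bank, account.number, doc._id]));
  }
  return entries;
}

const userWithDuplicate = {
  _id: 1,
  accounts: [
    { balance: 500, bank: "abc", number: "123" },
    { balance: 500, bank: "abc", number: "123" },
  ],
};

// Both array items yield the same entry, so the model holds only one
// and there is nothing for the unique constraint to reject.
console.log(indexEntries(userWithDuplicate).size); // 1
```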

&lt;p&gt;Those of you more knowledgeable about MongoDB may think that using &lt;a href="https://www.mongodb.com/docs/v6.0/reference/operator/update/addToSet/" rel="noopener noreferrer"&gt;$addToSet&lt;/a&gt; instead of &lt;a href="https://www.mongodb.com/docs/v6.0/reference/operator/update/push/" rel="noopener noreferrer"&gt;$push&lt;/a&gt; would resolve our problem. Not this time, young padawan. The &lt;code&gt;$addToSet&lt;/code&gt; operator compares all the fields of the account document, but as we specified at the beginning of our journey, an account must be unique and identifiable by the fields &lt;code&gt;bank&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt; alone.&lt;/p&gt;
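
&lt;p&gt;To make the limitation concrete, here is a minimal sketch that models the comparison in plain JavaScript, assuming a value counts as a duplicate only when the whole document matches. The function &lt;code&gt;addToSetModel&lt;/code&gt; and the balance of &lt;code&gt;900&lt;/code&gt; are hypothetical and exist only for this illustration.&lt;/p&gt;

```javascript
// Simplified model of $addToSet on an array of documents: a value is a
// duplicate only when the whole document matches (same fields, values, order).
function addToSetModel(array, value) {
  const serialized = JSON.stringify(value);
  const alreadyThere = array.some(function (item) {
    return JSON.stringify(item) === serialized;
  });
  return alreadyThere ? array : array.concat([value]);
}

const stored = [{ balance: 500, bank: "abc", number: "123" }];

// Same bank and number, different balance: not an exact match, so it is added.
const result = addToSetModel(stored, { balance: 900, bank: "abc", number: "123" });
console.log(result.length); // 2
```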

&lt;p&gt;Ok, what can we do now? Our index has a ton of options and compound names, and our application doesn't behave as we hoped.&lt;/p&gt;

&lt;p&gt;A simple way out of this situation is to change how our update function is structured, changing its filter parameter to only match the user's documents where the account we want to insert isn't in the &lt;code&gt;accounts&lt;/code&gt; array.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cleaning the collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deleteMany&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt; &lt;span class="c1"&gt;// Delete documents, keep indexes&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="cm"&gt;/* Test */&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bankFilter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;$not&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$elemMatch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;bankFilter&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// { ... matchedCount: 1, modifiedCount: 1 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;bankFilter&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// { ... matchedCount: 0, modifiedCount: 0 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="cm"&gt;/*{
 _id: 1,
 name: { first: 'john', last: 'smith' },
 accounts: [ { balance: 500, bank: 'abc', number: '123' } ]
}*/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Problem solved. We tried to insert the same account for the same user, and it wasn't inserted, but the operation also didn't error out.&lt;/p&gt;

&lt;p&gt;This behavior doesn't meet our expectations because it doesn't make clear to the user that this operation is prohibited. Another point of concern is that this solution assumes that every time a new account is inserted in the database, the code will use the correct update filter parameters.&lt;/p&gt;
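
&lt;p&gt;At the application level, the silent no-op could at least be surfaced by inspecting the update result. The sketch below assumes a hypothetical helper, &lt;code&gt;assertAccountInserted&lt;/code&gt;, that receives the object returned by &lt;code&gt;updateOne&lt;/code&gt;. Note that it can't tell a missing user apart from a duplicate account, and it still depends on every caller remembering to use it.&lt;/p&gt;

```javascript
// Hypothetical helper: turns the silent no-op into an explicit error.
// `result` is the object returned by updateOne.
function assertAccountInserted(result) {
  if (result.matchedCount === 0) {
    // Either the user does not exist or the account is already in the array.
    throw new Error("Account not inserted: user not found or account already registered");
  }
  return result;
}

// Same shape as the result of the second update above.
try {
  assertAccountInserted({ matchedCount: 0, modifiedCount: 0 });
} catch (error) {
  console.log(error.message);
}
```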

&lt;p&gt;We've worked in some companies and know that as people come and go, some knowledge about the implementation is lost, interns will try to reinvent the wheel, and some nasty shortcuts will be taken. We want a solution that will error out in any case and stop even the most unscrupulous developer/administrator who dares to change data directly on the production database 😱.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/docs/manual/core/schema-validation/" rel="noopener noreferrer"&gt;MongoDB Schema Validation&lt;/a&gt; for the win.&lt;/p&gt;

&lt;p&gt;A quick note before we go down this rabbit hole: MongoDB best practices recommend implementing schema validation at the application level and using MongoDB Schema Validation as a backstop.&lt;/p&gt;

&lt;p&gt;In MongoDB Schema Validation, it's possible to use the operator &lt;code&gt;$expr&lt;/code&gt; to write an aggregation expression to validate the data of a document when it has been inserted or updated. With that, we can write an expression to verify if the items inside an array are unique.&lt;/p&gt;

&lt;p&gt;After some consideration, we get the following expression:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;accountsSet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;$setIntersection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$map&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$accounts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$$this.bank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$$this.number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;uniqueAccounts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;$eq&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;$size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$accounts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;accountsSet&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;accountsValidator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;$expr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$cond&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$isArray&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$accounts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;uniqueAccounts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first operation we have inside of &lt;code&gt;$expr&lt;/code&gt; is a &lt;code&gt;$cond&lt;/code&gt;. When the logic specified in the &lt;code&gt;if&lt;/code&gt; field results in &lt;code&gt;true&lt;/code&gt;, the logic within the &lt;code&gt;then&lt;/code&gt; field is executed; when the result is &lt;code&gt;false&lt;/code&gt;, the logic within the &lt;code&gt;else&lt;/code&gt; field is executed.&lt;/p&gt;

&lt;p&gt;Using this knowledge to interpret our code: when the accounts array exists in the document, &lt;code&gt;{ $isArray: "$accounts" }&lt;/code&gt;, we execute the logic within &lt;code&gt;uniqueAccounts&lt;/code&gt;; when the array doesn't exist, we return &lt;code&gt;true&lt;/code&gt;, signaling that the document passed the schema validation.&lt;/p&gt;

&lt;p&gt;Inside the &lt;code&gt;uniqueAccounts&lt;/code&gt; variable, we verify whether the &lt;code&gt;$size&lt;/code&gt; of two things is &lt;code&gt;$eq&lt;/code&gt;. The first is the size of the array field &lt;code&gt;$accounts&lt;/code&gt;, and the second is the size of &lt;code&gt;accountsSet&lt;/code&gt;, which is generated by the &lt;code&gt;$setIntersection&lt;/code&gt; operator. If the two arrays have the same size, the expression returns &lt;code&gt;true&lt;/code&gt; and the document passes validation; otherwise, it returns &lt;code&gt;false&lt;/code&gt;, the document fails validation, and the operation errors out.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;$setIntersection&lt;/code&gt; operator performs a set operation on the array passed to it, removing duplicate entries. The array passed to &lt;code&gt;$setIntersection&lt;/code&gt; is generated by a &lt;code&gt;$map&lt;/code&gt; operator, which maps each account in &lt;code&gt;$accounts&lt;/code&gt; to a document containing only the fields &lt;code&gt;bank&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt;.&lt;/p&gt;
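
&lt;p&gt;For readers who find aggregation expressions hard to follow, the same logic can be sketched in plain JavaScript. This is only an illustrative equivalent: the function &lt;code&gt;hasUniqueAccounts&lt;/code&gt; and the sample documents are hypothetical, and a &lt;code&gt;Set&lt;/code&gt; stands in for the deduplication done by &lt;code&gt;$setIntersection&lt;/code&gt;.&lt;/p&gt;

```javascript
// Plain JavaScript equivalent of the validator logic, for illustration only.
function hasUniqueAccounts(doc) {
  // else branch of $cond: documents without an accounts array pass.
  if (!Array.isArray(doc.accounts)) return true;

  // $map: keep only the identifying fields of each account.
  const keys = doc.accounts.map(function (account) {
    return JSON.stringify({ bank: account.bank, number: account.number });
  });

  // The Set plays the role of $setIntersection here: duplicates collapse.
  const distinct = new Set(keys);

  // $eq on both sizes.
  return doc.accounts.length === distinct.size;
}

const noAccounts = { _id: 2 };
const oneAccount = { _id: 1, accounts: [{ balance: 500, bank: "abc", number: "123" }] };
const repeatedAccount = {
  _id: 1,
  accounts: [
    { balance: 500, bank: "abc", number: "123" },
    { balance: 900, bank: "abc", number: "123" },
  ],
};

console.log(hasUniqueAccounts(noAccounts)); // true
console.log(hasUniqueAccounts(oneAccount)); // true
console.log(hasUniqueAccounts(repeatedAccount)); // false
```

&lt;p&gt;Unlike &lt;code&gt;$addToSet&lt;/code&gt;, this check ignores &lt;code&gt;balance&lt;/code&gt;, so two accounts with the same &lt;code&gt;bank&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt; are treated as duplicates even when their other fields differ.&lt;/p&gt;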

&lt;p&gt;Let's see if this is witchcraft or science:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cleaning the collection&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt; &lt;span class="c1"&gt;// Delete documents and indexes definitions&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createCollection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;validator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;accountsValidator&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;specification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;optionsV2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;user2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="cm"&gt;/* Test */&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// { ... matchedCount: 1, modifiedCount: 1 ...}&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;account1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt; 
&lt;span class="cm"&gt;/* MongoServerError: Document failed validation
Additional information: {
 failingDocumentId: 1,
 details: {
   operatorName: '$expr',
   specifiedAs: {
     '$expr': {
       '$cond': {
         if: { '$isArray': '$accounts' },
         then: { '$eq': [ [Object], [Object] ] },
         else: true
       }
     }
   },
   reason: 'expression did not match',
   expressionResult: false
 }
}*/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mission accomplished.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>indexes</category>
      <category>schemavalidation</category>
      <category>datamodeling</category>
    </item>
    <item>
      <title>Improving Storage and Read Performance for Free: Flat vs Structured Schemas</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Mon, 08 Dec 2025 18:52:37 +0000</pubDate>
      <link>https://forem.com/arturgc/improving-storage-and-read-performance-for-free-flat-vs-structured-schemas-7gk</link>
      <guid>https://forem.com/arturgc/improving-storage-and-read-performance-for-free-flat-vs-structured-schemas-7gk</guid>
      <description>&lt;p&gt;This article was reviewed and approved by MongoDB.&lt;/p&gt;

&lt;p&gt;When developers or administrators who had previously only been &lt;code&gt;"followers of the word of relational data modeling"&lt;/code&gt; start to use MongoDB, it is common to see documents with flat schemas. This behavior happens because relational data modeling makes you think about data and schemas in a flat, two-dimensional structure called tables.&lt;/p&gt;

&lt;p&gt;In MongoDB, data is stored as BSON documents, almost a binary representation of JSON documents, with slight differences. Because of this, we can create schemas with more dimensions/levels. More details about BSON implementation can be found in its &lt;a href="https://bsonspec.org/spec.html" rel="noopener noreferrer"&gt;specification&lt;/a&gt;. You can also learn more about its &lt;a href="https://www.mongodb.com/json-and-bson" rel="noopener noreferrer"&gt;differences from JSON&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;MongoDB documents are composed of one or more key/value pairs, where the value of a field can be any of the BSON data types, including other documents, arrays, or arrays of documents.&lt;/p&gt;

&lt;p&gt;Using documents, arrays, or arrays of documents as values for fields enables the creation of a structured schema, where one field can represent a group of related information. This structured schema is an alternative to a flat schema.  &lt;/p&gt;

&lt;p&gt;Let's see an example of how to write the same &lt;code&gt;user&lt;/code&gt; document using the two schemas:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi6tmc2oxok3vc1gvm58l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi6tmc2oxok3vc1gvm58l.png" alt=" " width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The two documents above contain the same data. The one on the left, &lt;code&gt;flatUser&lt;/code&gt;, uses a flat schema where all the field-and-value pairs are on the same level. The one on the right, &lt;code&gt;structuredUser&lt;/code&gt;, employs a structured schema where the field and values have nested levels according to related information inside the document.&lt;/p&gt;

&lt;p&gt;So, what are the advantages of using a structured schema rather than a flat one? The quick answer for those in a hurry is that a structured schema may require &lt;strong&gt;less storage&lt;/strong&gt; and be &lt;strong&gt;faster to traverse&lt;/strong&gt; than a flat schema. For those who want to know why, we need a better understanding of BSON.&lt;/p&gt;

&lt;p&gt;For the purpose of this article, a BSON document can be seen as a list of items, where each item represents a field-and-value pair of the document. An item is composed of the field’s &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;length&lt;/code&gt;, and &lt;code&gt;data&lt;/code&gt; in a serialized form. The field &lt;code&gt;type&lt;/code&gt; is one byte long and indicates the data type in the &lt;code&gt;data&lt;/code&gt; field. The field &lt;code&gt;name&lt;/code&gt; is the field's name in a string form. The field &lt;code&gt;length&lt;/code&gt; is four bytes long and indicates the length of the &lt;code&gt;data&lt;/code&gt; field for those &lt;code&gt;types&lt;/code&gt; where the size is not fixed. The &lt;code&gt;data&lt;/code&gt; field is the actual data of the field-and-value pair. Putting this definition in a graphical representation, we have:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F144zum43ad06z2wtzqz6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F144zum43ad06z2wtzqz6.jpg" alt=" " width="800" height="156"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's see how a structured schema uses less storage than a flat schema by analyzing the field-and-value pair related to the user's name.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;flatUser&lt;/code&gt;, we have the following table from a storage perspective:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;field-and-value&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Field Name&lt;/th&gt;
&lt;th&gt;Field Length&lt;/th&gt;
&lt;th&gt;Field Data&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;name_first: "john"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;10 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;19 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;name_last: "smith"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;9 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;5 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;19 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;name_middle: "oliver"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;11 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;6 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Adding up the table's total sizes, the flat document uses 60 bytes to store the fields and values related to the user's name.&lt;/p&gt;

&lt;p&gt;To analyze the storage of the &lt;code&gt;structuredUser&lt;/code&gt;, let's divide it into two tables. The first table covers the embedded document stored as the value of the field &lt;code&gt;name&lt;/code&gt;, and the second table covers the field-and-value pair &lt;code&gt;name&lt;/code&gt; itself.&lt;/p&gt;

&lt;p&gt;Let’s build the first table for the value/content of the field &lt;code&gt;name&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;field-and-value&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Field Name&lt;/th&gt;
&lt;th&gt;Field Length&lt;/th&gt;
&lt;th&gt;Field Data&lt;/th&gt;
&lt;th&gt;Total Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;first: "john"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;5 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;last: "smith"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;5 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;middle: "oliver"&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;6 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;6 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;17 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Adding up the previous table's total sizes, the value (the Field Data) of the field &lt;code&gt;name&lt;/code&gt; uses 45 bytes. Building the second table for the field-and-value pair &lt;code&gt;name&lt;/code&gt;, we get:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;field-and-value&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Field Name&lt;/th&gt;
&lt;th&gt;Field Length&lt;/th&gt;
&lt;th&gt;Field Data&lt;/th&gt;
&lt;th&gt;Total Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;name: { … }&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;td&gt;45 bytes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;54 bytes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The structured document uses 54 bytes to store the values related to the user's name.&lt;/p&gt;

&lt;p&gt;Comparing the tables, we see the main difference is the "Field Name" storage size. The flat schema uses 30 bytes to store the names of its fields, while the structured schema uses 19 bytes to store the names of its fields. This is due to the repetition of the sub-string "name_" in the "Field Name" of the flat schema.&lt;/p&gt;
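
&lt;p&gt;The arithmetic behind the tables above can be reproduced with a short Python sketch. It follows this article's simplified item layout (one byte for the type, the field name, four bytes for the length, then the data) rather than the exact BSON specification, and the helper name &lt;code&gt;item_size&lt;/code&gt; is made up for illustration:&lt;/p&gt;

```python
TYPE_BYTES = 1    # one byte identifying the data type
LENGTH_BYTES = 4  # four bytes holding the size of the data


def item_size(name, data_bytes):
    """Size of one field-and-value item under the article's simplified model."""
    return TYPE_BYTES + len(name) + LENGTH_BYTES + data_bytes


# Flat schema: every field name repeats the "name_" prefix.
flat_fields = {"name_first": "john", "name_last": "smith", "name_middle": "oliver"}
flat_total = sum(item_size(k, len(v)) for k, v in flat_fields.items())
flat_name_bytes = sum(len(k) for k in flat_fields)

# Structured schema: short names inside an embedded document called "name".
nested_fields = {"first": "john", "last": "smith", "middle": "oliver"}
nested_data = sum(item_size(k, len(v)) for k, v in nested_fields.items())
structured_total = item_size("name", nested_data)
structured_name_bytes = len("name") + sum(len(k) for k in nested_fields)

print(flat_total, structured_total)            # prints: 60 54
print(flat_name_bytes, structured_name_bytes)  # prints: 30 19
```

&lt;p&gt;The six-byte gap between the two totals is smaller than the eleven-byte gap in field names because the structured schema pays for one extra item, the embedded document &lt;code&gt;name&lt;/code&gt; with its own type and length.&lt;/p&gt;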

&lt;p&gt;Storing the two documents in a MongoDB instance gives a size of 403 bytes for the flat schema and 307 bytes for the structured schema. Not bad: a 24% reduction in storage just by changing the schema, and a structured document is also easier to read and more pleasant to look at.&lt;/p&gt;

&lt;p&gt;Now, let's see how a structured schema is faster to traverse than a flat schema by getting the zip code of the work address.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;flatUser&lt;/code&gt; document, to get to the field &lt;code&gt;address_work_zip&lt;/code&gt; starting at the beginning of the document, a cursor would need to perform &lt;strong&gt;12 field-name comparisons&lt;/strong&gt; before it reaches the desired field.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;structuredUser&lt;/code&gt; document, to get to the field &lt;code&gt;address.work.zip&lt;/code&gt; starting at the beginning of the document, a cursor would need to perform only &lt;strong&gt;8 field-name comparisons&lt;/strong&gt;. The number is smaller because some field values are themselves documents, which the cursor can skip as a whole. When the cursor checks the field &lt;code&gt;name&lt;/code&gt;, it can skip three fields (&lt;code&gt;first&lt;/code&gt;, &lt;code&gt;middle&lt;/code&gt;, and &lt;code&gt;last&lt;/code&gt;) because it knows that &lt;code&gt;address.work.zip&lt;/code&gt; won't be inside &lt;code&gt;name.&amp;lt;field&amp;gt;&lt;/code&gt;. When the cursor checks the field &lt;code&gt;address.home&lt;/code&gt;, it can likewise skip five fields: &lt;code&gt;street&lt;/code&gt;, &lt;code&gt;number&lt;/code&gt;, &lt;code&gt;zip&lt;/code&gt;, &lt;code&gt;state&lt;/code&gt;, and &lt;code&gt;country&lt;/code&gt;.&lt;/p&gt;
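
&lt;p&gt;The two counts can be checked with a small Python sketch that models the cursor as a linear scan over field names. This is a simplification for illustration, not MongoDB's actual traversal code. The field order and the sample values below are assumptions chosen to match the counts described in the text, and &lt;code&gt;comparisons_to_reach&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```python
def comparisons_to_reach(document, path):
    """Count field-name comparisons needed to reach a dotted path."""
    count = 0
    current = document
    for part in path.split("."):
        for field_name in current:  # dicts keep insertion order
            count += 1
            if field_name == part:
                current = current[field_name]  # step into the matched value
                break
        else:
            raise KeyError(path)
    return count


# Assumed layout: values are placeholders, only the field order matters.
flat_user = {
    "_id": 1,
    "name_first": "john", "name_middle": "oliver", "name_last": "smith",
    "address_home_street": "a", "address_home_number": "1",
    "address_home_zip": "z", "address_home_state": "s",
    "address_home_country": "c",
    "address_work_street": "b", "address_work_number": "2",
    "address_work_zip": "w",
}

structured_user = {
    "_id": 1,
    "name": {"first": "john", "middle": "oliver", "last": "smith"},
    "address": {
        "home": {"street": "a", "number": "1", "zip": "z",
                 "state": "s", "country": "c"},
        "work": {"street": "b", "number": "2", "zip": "w"},
    },
}

print(comparisons_to_reach(flat_user, "address_work_zip"))        # prints: 12
print(comparisons_to_reach(structured_user, "address.work.zip"))  # prints: 8
```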

&lt;p&gt;To quantify the performance gain on traversing a structured schema instead of a flat schema in MongoDB, a test with the following methodology was used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To isolate the result to be influenced just by the traversing of the documents, the MongoDB instance used was configured with &lt;a href="https://www.mongodb.com/docs/manual/core/inmemory/" rel="noopener noreferrer"&gt;in-memory storage&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Documents with 10, 25, 50, and 100 fields were utilized for the flat schema.&lt;/li&gt;
&lt;li&gt;Documents with 2x5, 5x5, 10x5, and 20x5 fields were used for the structured schema, where 2x5 means two fields of type document with five fields for each document.&lt;/li&gt;
&lt;li&gt;Each collection had 10,000 documents generated using &lt;a href="https://www.npmjs.com/package/@faker-js/faker" rel="noopener noreferrer"&gt;faker/npm&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;To force the MongoDB engine to loop through all documents and all fields inside each document, all queries were made searching for a field and value that wasn't present in the documents.&lt;/li&gt;
&lt;li&gt;Each query was executed 100 times in a row for each document size and schema.&lt;/li&gt;
&lt;li&gt;No concurrent operation was executed during each test.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, to the test results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fields (flat / structured)&lt;/th&gt;
&lt;th&gt;Flat&lt;/th&gt;
&lt;th&gt;Structured&lt;/th&gt;
&lt;th&gt;Difference&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10 / 2x5&lt;/td&gt;
&lt;td&gt;487 ms&lt;/td&gt;
&lt;td&gt;376 ms&lt;/td&gt;
&lt;td&gt;111 ms&lt;/td&gt;
&lt;td&gt;22.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;25 / 5x5&lt;/td&gt;
&lt;td&gt;624 ms&lt;/td&gt;
&lt;td&gt;434 ms&lt;/td&gt;
&lt;td&gt;190 ms&lt;/td&gt;
&lt;td&gt;30.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50 / 10x5&lt;/td&gt;
&lt;td&gt;915 ms&lt;/td&gt;
&lt;td&gt;617 ms&lt;/td&gt;
&lt;td&gt;298 ms&lt;/td&gt;
&lt;td&gt;32.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 / 20x5&lt;/td&gt;
&lt;td&gt;1384 ms&lt;/td&gt;
&lt;td&gt;891 ms&lt;/td&gt;
&lt;td&gt;493 ms&lt;/td&gt;
&lt;td&gt;35.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As our theory predicted, traversing a structured document is faster than traversing a flat one. The gains measured in this test shouldn't be assumed for every comparison between structured and flat schemas; the improvement in traversal depends on how the nested fields and documents are organized.&lt;/p&gt;
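
&lt;p&gt;The Difference and Improvement columns of the results table follow directly from the two timing columns. A minimal Python sketch of that calculation, using the timings from the table and a hypothetical helper named &lt;code&gt;improvement&lt;/code&gt;:&lt;/p&gt;

```python
# Timings copied from the results table: (flat ms, structured ms).
results = {
    "10 / 2x5": (487, 376),
    "25 / 5x5": (624, 434),
    "50 / 10x5": (915, 617),
    "100 / 20x5": (1384, 891),
}


def improvement(flat_ms, structured_ms):
    """Return (difference in ms, improvement as a percentage of the flat time)."""
    difference = flat_ms - structured_ms
    return difference, round(difference / flat_ms * 100, 1)


for label, (flat_ms, structured_ms) in results.items():
    difference, percent = improvement(flat_ms, structured_ms)
    print(f"{label}: {difference} ms, {percent}%")
```

&lt;p&gt;The improvement is expressed relative to the flat timing, which is why it grows from 22.8% to 35.6% as the documents get wider.&lt;/p&gt;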

&lt;p&gt;This article showed how to better use your MongoDB deployment by changing the schema of your document for the same data/information. Another option to extract more performance from your MongoDB deployment is to apply the common schema patterns of MongoDB. In this case, you will analyze which data you should put in your document/schema. The article &lt;a href="https://www.mongodb.com/blog/post/building-with-patterns-a-summary" rel="noopener noreferrer"&gt;Building with Patterns&lt;/a&gt; has the most common patterns and will significantly help.&lt;/p&gt;

&lt;p&gt;The code used to get the above results is available in the &lt;a href="https://github.com/ArturGC/flat-vs-structured" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>performance</category>
      <category>schemadesign</category>
      <category>datamodeling</category>
    </item>
    <item>
      <title>The Pitfall of Increasing Read Capacity by Reading From Secondary Nodes in a MongoDB Replica Set</title>
      <dc:creator>Artur Garcia Costa</dc:creator>
      <pubDate>Wed, 03 Dec 2025 18:43:37 +0000</pubDate>
      <link>https://forem.com/arturgc/the-pitfall-of-increasing-read-capacity-by-reading-from-secondary-nodes-in-a-mongodb-replica-set-49cn</link>
      <guid>https://forem.com/arturgc/the-pitfall-of-increasing-read-capacity-by-reading-from-secondary-nodes-in-a-mongodb-replica-set-49cn</guid>
      <description>&lt;p&gt;This article was reviewed and approved by MongoDB and was originally published in &lt;a href="https://foojay.io/today/the-pitfall-of-increasing-read-capacity-by-reading-from-secondary-nodes-in-a-mongodb-replica-set/" rel="noopener noreferrer"&gt;foojay.io&lt;/a&gt; and &lt;a href="https://delbridge.solutions/read-capacity-mongodb-replica-set/" rel="noopener noreferrer"&gt;Delbridge Solutions&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The scenario
&lt;/h2&gt;

&lt;p&gt;Imagine we are responsible for managing the MongoDB cluster that supports our country's national financial payment system, similar to &lt;a href="https://en.wikipedia.org/wiki/Pix_(payment_system)" rel="noopener noreferrer"&gt;Pix&lt;/a&gt; in Brazil. Our application was designed to be read-heavy, with one write operation for every 20 read operations.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://en.wikipedia.org/wiki/Black_Friday_(shopping)" rel="noopener noreferrer"&gt;Black Friday&lt;/a&gt; approaching, a critical period for our national financial payment system, we have been entrusted with the crucial task of creating a scaling plan for our cluster to handle the increased demand during this shopping spree. Given that our system is read-heavy, we are exploring ways to enhance the read performance and capacity of our cluster.&lt;/p&gt;

&lt;p&gt;We're in charge of the national financial payment system that powers a staggering 60% of all transactions across the nation. That's why &lt;strong&gt;ensuring the highest availability&lt;/strong&gt; of this MongoDB cluster is &lt;strong&gt;absolutely critical&lt;/strong&gt;—it's the backbone of our economy!&lt;/p&gt;

&lt;h2&gt;
  
  
  A solution from AI Models
&lt;/h2&gt;

&lt;p&gt;As a database administrator or database developer in 2025, our first step when searching for solutions is to consult AI. These AI models, including GPT-5, Grok Code Fast 1, Claude Sonnet 4, and Gemini 2.5 Pro, are advanced tools that can provide insights and recommendations based on the specific query we ask. I asked these AI models the question &lt;code&gt;"How can I increase read performance and capacity in a MongoDB replica set cluster?"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A recommendation common to all responses was to distribute read operations to secondary nodes using the &lt;code&gt;readPreference&lt;/code&gt; setting to enhance performance, and to increase the number of secondary nodes to boost read capacity.&lt;/p&gt;

&lt;p&gt;An interesting observation is that nearly all AI models correctly warned that reading from secondary nodes could yield &lt;code&gt;stale information&lt;/code&gt;, which means the data might not be the most up-to-date, as the &lt;a href="https://www.mongodb.com/docs/manual/replication/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;replication of write operations&lt;/a&gt; between nodes requires some time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pitfall of scaling capacity by reading from secondary nodes
&lt;/h2&gt;

&lt;p&gt;Let's imagine we have a replica set cluster consisting of three nodes: one &lt;a href="https://www.mongodb.com/docs/manual/core/replica-set-primary/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;primary&lt;/a&gt; node and two &lt;a href="https://www.mongodb.com/docs/manual/core/replica-set-secondary/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;secondaries&lt;/a&gt;. &lt;strong&gt;Each node can handle up to 100 read operations per second&lt;/strong&gt;. If we distribute the read operations equally among the nodes, the &lt;strong&gt;entire replica set cluster should be able to accommodate a total of 300 read operations per second&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldc4hg1g87ag8ui83bsv.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldc4hg1g87ag8ui83bsv.jpg" alt="The image illustrating an application server performing 240 read operations per second in a MongoDB Replica Set cluster. This cluster consists of three nodes: one primary node and two secondary nodes. Each node handles 80 read operations per second" width="720" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our &lt;strong&gt;application requires 240 read operations per second&lt;/strong&gt;. Since we have configured it to balance the operations across the replica set nodes, &lt;strong&gt;each node will handle 80 read operations per second&lt;/strong&gt;, which is below its capacity of 100 reads per second.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F47ewp11orhel990a539v.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F47ewp11orhel990a539v.jpg" alt="An image shows an application server executing 80 read operations per second on a MongoDB Replica Set cluster, which consists of three nodes: one primary and two secondary nodes, each capable of handling a maximum of 100 read operations per second" width="717" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, a potential risk lurks in the shadows. Imagine a network outage in one of the availability zones where one of our replica set nodes is deployed, &lt;strong&gt;causing this primary or secondary node to go down&lt;/strong&gt;. Now, our application is &lt;strong&gt;still requesting its 240 read operations per second&lt;/strong&gt;, but with only two nodes remaining, &lt;strong&gt;each node needs to process 120 read operations per second&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvruybjshywhir3qsh3vt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvruybjshywhir3qsh3vt.jpg" alt="The image depicts an application server executing 240 read operations per second in a MongoDB Replica Set cluster. This cluster is made up of three nodes: one primary node and two secondary nodes. When one of the secondary nodes experiences a network outage and goes down, the remaining nodes each handle 120 read operations per second" width="717" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since &lt;strong&gt;each node can only handle 100 read operations per second&lt;/strong&gt;, this overloads their hardware, which may lead to further failures. As a result, the remaining nodes may go down, taking down the entire MongoDB cluster along with the application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ne2pv8nwsb1lz802yo4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ne2pv8nwsb1lz802yo4.jpg" alt="The image shows an application server performing zero read operations per second within a MongoDB Replica Set cluster. This issue is caused by a network outage in one of the secondary nodes, followed by read overload on the remaining nodes, leading to the cluster being down" width="720" height="406"&gt;&lt;/a&gt;&lt;/p&gt;
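
&lt;p&gt;The failure scenario above is simple arithmetic, sketched here in Python. The numbers are the ones from the scenario, reads are assumed to be spread evenly across healthy nodes, and the function names are hypothetical:&lt;/p&gt;

```python
NODE_CAPACITY = 100  # reads per second one node can serve (from the scenario)
APP_DEMAND = 240     # reads per second the application issues


def load_per_node(demand, healthy_nodes):
    """Reads per second each node receives when reads are spread evenly."""
    return demand / healthy_nodes


def safe_demand(total_nodes, tolerated_failures=1):
    """Highest demand the cluster can still serve after losing some nodes."""
    return (total_nodes - tolerated_failures) * NODE_CAPACITY


for healthy in (3, 2):
    load = load_per_node(APP_DEMAND, healthy)
    status = "overloaded" if load > NODE_CAPACITY else "ok"
    print(f"{healthy} nodes: {load:.0f} reads/s per node ({status})")

print(f"Demand that survives one node failure: {safe_demand(3)} reads/s")
```

&lt;p&gt;Under these assumptions, a three-node replica set that must survive the loss of one node can only be trusted with 200 reads per second, not 300. Any demand above that depends on all three nodes staying up.&lt;/p&gt;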

&lt;h2&gt;
  
  
  Increasing read capacity vs increasing read performance
&lt;/h2&gt;

&lt;p&gt;Let's first clarify the difference between read capacity and read performance in a MongoDB cluster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read capacity: &lt;strong&gt;How many read operations&lt;/strong&gt; the cluster can manage without overloading its hardware or significantly increasing the time required to complete these operations
&lt;/li&gt;
&lt;li&gt;Read performance: &lt;strong&gt;How quickly a read operation&lt;/strong&gt; can be fulfilled by the cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As discussed previously, utilizing secondary nodes to enhance the cluster’s read capacity may inadvertently reduce its availability. This availability reduction occurs because if one node fails, the remaining nodes could become overloaded with read requests.&lt;/p&gt;

&lt;p&gt;Therefore, when high availability is crucial for your application, reading from secondary nodes should be used only to improve read performance, not to add read capacity. Two ways of doing that are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Proximity&lt;/strong&gt;: Locating the secondary node closer to the application reduces the latency of requests and responses between them.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt;: Consistently executing the same queries on the same node allows its cache to retain the necessary data, leading to faster query fulfillment.&lt;/li&gt;
&lt;/ul&gt;
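
&lt;p&gt;As one possible illustration of the proximity case, the Python sketch below builds a connection string that uses the &lt;code&gt;nearest&lt;/code&gt; read preference, which lets the driver read from a low-latency member whether it is the primary or a secondary. The host names and the replica set name are placeholders, and whether &lt;code&gt;nearest&lt;/code&gt; suits a given deployment is an assumption to validate against its own latency and staleness requirements:&lt;/p&gt;

```python
from urllib.parse import urlencode

# Hypothetical host names; replace them with the members of your replica set.
hosts = [
    "node-a.example.net:27017",
    "node-b.example.net:27017",
    "node-c.example.net:27017",
]

options = {
    "replicaSet": "rs0",          # hypothetical replica set name
    "readPreference": "nearest",  # prefer the lowest-latency member
}

# Standard MongoDB connection string: hosts, then options as a query string.
uri = "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)
print(uri)
```

&lt;p&gt;This changes where reads are served from, not how many reads the cluster can absorb, so the capacity planning from the previous section still applies.&lt;/p&gt;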

&lt;h2&gt;
  
  
  Properly increasing read capacity
&lt;/h2&gt;

&lt;p&gt;To safely and reliably increase read capacity without sacrificing availability, the best approach is to scale your cluster—either vertically (scaling up) or horizontally (scaling out).&lt;/p&gt;

&lt;h3&gt;
  
  
  Vertical scaling (scale up)
&lt;/h3&gt;

&lt;p&gt;This method involves increasing the resources of existing nodes, such as CPU, RAM, storage, and IOPS.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Advantages&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Operational simplicity&lt;/strong&gt;: No changes are needed for data distribution or query routing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal application change&lt;/strong&gt;: Connection strings and query patterns typically remain the same.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immediate performance improvement&lt;/strong&gt;: It’s particularly effective for workloads that are limited by CPU or memory.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Disadvantages&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Upper limits&lt;/strong&gt;: Eventually, you will reach the maximum instance size available; a single machine's resources can cap throughput.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-linear performance growth&lt;/strong&gt;: The performance of your application usually doesn't grow linearly with the instance size, meaning that doubling your resources might not double your throughput.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single-node bottlenecks&lt;/strong&gt;: Hot documents or collections and heavy aggregation can still face contention for a primary node's resources.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;[MongoDB EA only] Obtain and provision additional resources:&lt;/strong&gt; While MongoDB Atlas offers simple methods for vertical scaling, on-premises deployments often face limitations due to resource availability.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Horizontal scaling (scale out via sharding)
&lt;/h3&gt;

&lt;p&gt;This approach distributes data and workload across multiple shards by &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-data-partitioning/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;partitioning the data&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Advantages&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Near-linear throughput growth&lt;/strong&gt;: Adding shards can increase capacity for both reads and writes, in addition to total storage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hotspot mitigation&lt;/strong&gt;: Proper &lt;a href="https://www.mongodb.com/docs/manual/sharding/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;shard&lt;/a&gt; keys can help evenly spread the load to avoid bottlenecks on individual nodes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geographic flexibility&lt;/strong&gt;: Zone sharding keeps data close to users and meets data residency requirements.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Disadvantages&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Design complexity&lt;/strong&gt;: &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-choose-a-shard-key/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;Selecting the right shard key is crucial&lt;/a&gt;; poorly chosen shard keys can lead to imbalances or inefficient scatter-gather queries.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational overhead&lt;/strong&gt;: Tasks such as chunk balancing, resharding, and managing cross-shard queries or transactions can add complexity.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query pattern considerations&lt;/strong&gt;: To maximize targeted reads and avoid fan-out, applications may need to include the shard key in their queries.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;[MongoDB EA only] Obtain and provision additional resources:&lt;/strong&gt; While MongoDB Atlas offers simple methods for horizontal scaling, on-premises deployments often face limitations due to resource availability.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;For more information on scaling in MongoDB, refer to the articles &lt;a href="https://www.mongodb.com/resources/basics/horizontal-vs-vertical-scaling/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;"A Guide to Horizontal vs Vertical Scaling"&lt;/a&gt; and &lt;a href="https://www.mongodb.com/resources/basics/scaling/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;"Database Scaling,"&lt;/a&gt; or check the official documentation on &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-scaling-strategies/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant" rel="noopener noreferrer"&gt;scaling strategies&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Maybe other ways around it?
&lt;/h2&gt;

&lt;p&gt;Some readers who are more knowledgeable about MongoDB cluster topology and node types may think that, at least in MongoDB Atlas, we could have increased our cluster's read capacity by utilizing &lt;a href="https://www.mongodb.com/docs/atlas/cluster-config/multi-cloud-distribution/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#read-only-nodes-for-optimal-local-reads" rel="noopener noreferrer"&gt;read-only&lt;/a&gt; or &lt;a href="https://www.mongodb.com/docs/atlas/cluster-config/multi-cloud-distribution/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#analytics-nodes-for-workload-isolation" rel="noopener noreferrer"&gt;analytics&lt;/a&gt; nodes. As Master Yoda would say, "Much to learn you still have, my young padawan." First, let's understand what these nodes are and their purpose, and then we can assess whether they fit our needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Read-only node
&lt;/h3&gt;

&lt;p&gt;In reviewing the &lt;a href="https://www.mongodb.com/docs/atlas/cluster-config/multi-cloud-distribution/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#read-only-nodes-for-optimal-local-reads" rel="noopener noreferrer"&gt;official MongoDB documentation for Atlas read-only nodes&lt;/a&gt;, I've identified two key points that are particularly relevant to our case:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;“Use read-only nodes to optimize local reads in the nodes' respective service areas.”
&lt;/li&gt;
&lt;li&gt;“Read-only nodes don't provide high availability because they don't participate in &lt;a href="https://www.mongodb.com/docs/manual/reference/glossary/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#std-term-election" rel="noopener noreferrer"&gt;elections&lt;/a&gt;.”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first point indicates that &lt;code&gt;read-only&lt;/code&gt; nodes can enhance performance by being located closer to the application, thereby reducing read latency. However, since our goal is to increase read capacity, this solution is not ideal.&lt;/p&gt;

&lt;p&gt;The second point emphasizes that &lt;code&gt;read-only&lt;/code&gt; nodes do not contribute to high availability, which is a critical requirement for our application. Therefore, this aspect does not provide any advantage for us.&lt;/p&gt;

&lt;h3&gt;
  
  
  Analytics node
&lt;/h3&gt;

&lt;p&gt;In reviewing the &lt;a href="https://www.mongodb.com/docs/atlas/cluster-config/multi-cloud-distribution/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#analytics-nodes-for-workload-isolation" rel="noopener noreferrer"&gt;official MongoDB documentation for Atlas analytics nodes&lt;/a&gt;, we can find very similar relevant points of attention to the &lt;code&gt;read-only&lt;/code&gt; case:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;“Use &lt;a href="https://www.mongodb.com/docs/atlas/reference/faq/deployment/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#std-label-analytics-nodes-overview" rel="noopener noreferrer"&gt;analytics nodes&lt;/a&gt; to isolate queries which you do not wish to contend with your operational workload.”
&lt;/li&gt;
&lt;li&gt;“Read-only nodes don't provide high availability because they don't participate in &lt;a href="https://www.mongodb.com/docs/manual/reference/glossary/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=pitfall-mongodb-foojay&amp;amp;utm_term=megan.grant#std-term-election" rel="noopener noreferrer"&gt;elections&lt;/a&gt;.”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The second point is the same as in the &lt;code&gt;read-only&lt;/code&gt; case, so there’s no need for further discussion on it. The first point implies that &lt;code&gt;analytics&lt;/code&gt; nodes exist to isolate analytical queries that would otherwise degrade the performance of your application's everyday queries. Therefore, they do not contribute to increasing the read capacity of the operational workload.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While distributing read operations across secondary MongoDB nodes to boost capacity might sound appealing, it can inadvertently impact availability—something that's crucial for systems like our national financial payment network. Such an approach could lead to cascading failures during outages, which we definitely want to avoid! &lt;/p&gt;

&lt;p&gt;Instead, focus on scaling strategies. Consider vertical scaling for immediate performance enhancements, or horizontal sharding to ensure consistent throughput and address hotspot concerns. While &lt;code&gt;read-only&lt;/code&gt; and &lt;code&gt;analytics&lt;/code&gt; nodes offer certain benefits, they don't fully address the need for high availability and read capacity.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>performance</category>
      <category>replicaset</category>
      <category>availability</category>
    </item>
  </channel>
</rss>
