<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Piter Adyson</title>
    <description>The latest articles on Forem by Piter Adyson (@piteradyson).</description>
    <link>https://forem.com/piteradyson</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3695030%2Ff4b91be7-ecc2-40b8-a468-e8051784554b.png</url>
      <title>Forem: Piter Adyson</title>
      <link>https://forem.com/piteradyson</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/piteradyson"/>
    <language>en</language>
    <item>
      <title>MongoDB vs PostgreSQL — 6 factors to consider when choosing your database</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Fri, 13 Feb 2026 17:38:20 +0000</pubDate>
      <link>https://forem.com/piteradyson/mongodb-vs-postgresql-6-factors-to-consider-when-choosing-your-database-35mj</link>
      <guid>https://forem.com/piteradyson/mongodb-vs-postgresql-6-factors-to-consider-when-choosing-your-database-35mj</guid>
      <description>&lt;p&gt;Choosing between MongoDB and PostgreSQL is one of the most important decisions you'll make for your project. Both databases are mature, reliable and widely used. But they're fundamentally different in how they store, query and scale data. This choice affects your development speed, operational costs and how easily your system can grow.&lt;/p&gt;

&lt;p&gt;Many developers pick a database based on what's familiar or what's trending. That's fine for small projects. But if you're building something that needs to scale or handle complex data relationships, you need to understand the real differences. This article breaks down six key factors to help you make an informed decision: data model, query complexity, scalability, consistency, performance and backup strategies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd9qyym521ma10c79mr7y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd9qyym521ma10c79mr7y.png" alt="PostgreSQL vs MongoDB" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Data model and schema flexibility
&lt;/h2&gt;

&lt;p&gt;The data model is probably the biggest difference between these two databases. PostgreSQL is a relational database that uses tables with strict schemas. You define columns, types and relationships upfront. MongoDB is a document database that stores JSON-like documents with flexible schemas. Each document can have different fields, and you can change the structure on the fly.&lt;/p&gt;

&lt;p&gt;PostgreSQL's structured approach works great when your data has clear relationships and you know the schema in advance. Think user accounts, orders, inventory or financial records. The strict schema catches errors early and ensures data integrity. But changing the schema later requires migrations, which can be painful on large datasets.&lt;/p&gt;

&lt;p&gt;MongoDB's flexibility is useful when you're building something new and your data model is still evolving. Or when you're dealing with semi-structured data like logs, events or user-generated content. You can store different document shapes in the same collection. No migrations needed. But that flexibility comes at a cost: unless you opt into MongoDB's optional schema validation (JSON Schema rules attached to a collection), you have to handle data validation in your application code instead of relying on the database.&lt;/p&gt;
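
&lt;p&gt;To make the trade-off concrete, here's a sketch of the PostgreSQL side (the table and column names are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- PostgreSQL: structure and constraints are declared up front
CREATE TABLE users (
    id         BIGSERIAL PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- This fails at the database level: email violates NOT NULL
-- INSERT INTO users (id) VALUES (1);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A MongoDB collection would accept documents with or without any of these fields unless you opt into validation.&lt;/p&gt;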

&lt;p&gt;Here's a quick comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;PostgreSQL&lt;/th&gt;
&lt;th&gt;MongoDB&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Schema&lt;/td&gt;
&lt;td&gt;Strict, predefined table structure&lt;/td&gt;
&lt;td&gt;Flexible, documents can vary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data relationships&lt;/td&gt;
&lt;td&gt;Built-in foreign keys and joins&lt;/td&gt;
&lt;td&gt;Manual references or embedding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema changes&lt;/td&gt;
&lt;td&gt;Requires migrations&lt;/td&gt;
&lt;td&gt;No migrations needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data validation&lt;/td&gt;
&lt;td&gt;Enforced at database level&lt;/td&gt;
&lt;td&gt;Application level (optional validators)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Well-defined, relational data&lt;/td&gt;
&lt;td&gt;Evolving, semi-structured data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The reality is most applications have structured data that benefits from PostgreSQL's relational model. User profiles, orders, products and analytics usually have predictable relationships. MongoDB makes sense when you're prototyping quickly or dealing with truly flexible data structures.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Query capabilities and complexity
&lt;/h2&gt;

&lt;p&gt;PostgreSQL has one of the most powerful query engines available. It supports complex joins, subqueries, window functions, common table expressions and full SQL. You can express almost any data relationship or transformation in a single query. Need to join five tables, aggregate by multiple dimensions and filter with complex conditions? PostgreSQL handles it.&lt;/p&gt;
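
&lt;p&gt;As an illustrative sketch (the &lt;code&gt;customers&lt;/code&gt; and &lt;code&gt;orders&lt;/code&gt; tables are hypothetical), a report that joins, aggregates and ranks is a single statement:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Join, aggregate and rank customers by paid revenue in one query
SELECT c.name,
       SUM(o.total) AS revenue,
       RANK() OVER (ORDER BY SUM(o.total) DESC) AS revenue_rank
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.status = 'paid'
GROUP BY c.name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;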

&lt;p&gt;MongoDB's query language is simpler and more limited. It's based on JSON-like syntax and works well for basic queries on single collections. But once you need to join data across collections or perform complex aggregations, things get awkward. MongoDB added a $lookup operator for joins, but it's slower and less flexible than SQL joins. You often end up making multiple queries or denormalizing your data to avoid joins entirely.&lt;/p&gt;

&lt;p&gt;For most business applications, query complexity matters. You'll need reports, analytics and ad-hoc queries. PostgreSQL makes this easy. MongoDB makes it painful unless you carefully structure your data to avoid joins.&lt;/p&gt;

&lt;p&gt;Here's a comparison of common query patterns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query type&lt;/th&gt;
&lt;th&gt;PostgreSQL&lt;/th&gt;
&lt;th&gt;MongoDB&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple filters&lt;/td&gt;
&lt;td&gt;Fast and straightforward&lt;/td&gt;
&lt;td&gt;Fast and straightforward&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-table joins&lt;/td&gt;
&lt;td&gt;Native and efficient&lt;/td&gt;
&lt;td&gt;Limited, slower with $lookup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aggregations&lt;/td&gt;
&lt;td&gt;Full SQL power&lt;/td&gt;
&lt;td&gt;Aggregation pipeline (good but limited)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full-text search&lt;/td&gt;
&lt;td&gt;Built-in with GIN indexes&lt;/td&gt;
&lt;td&gt;Text indexes available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex analytics&lt;/td&gt;
&lt;td&gt;Excellent with window functions&lt;/td&gt;
&lt;td&gt;Requires careful data modeling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ad-hoc queries&lt;/td&gt;
&lt;td&gt;Easy with standard SQL&lt;/td&gt;
&lt;td&gt;More difficult without joins&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your application needs complex queries or you're not sure what queries you'll need later, PostgreSQL gives you more flexibility. MongoDB works if your access patterns are known upfront and you can structure your data accordingly.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Scalability and sharding
&lt;/h2&gt;

&lt;p&gt;MongoDB was built for horizontal scaling from the start. It has native sharding support that distributes data across multiple servers automatically. You can add more machines to handle more data or traffic. You choose a shard key, and MongoDB handles data distribution and routing queries to the right shards. This makes it easier to scale out without changing application code.&lt;/p&gt;

&lt;p&gt;PostgreSQL's strength is vertical scaling. You get better performance by upgrading to a bigger server with more CPU, RAM and faster storage. PostgreSQL can handle massive datasets on a single node if you have enough hardware. Horizontal scaling is possible through manual sharding or tools like Citus, but it's more complex and less mature than MongoDB's built-in approach.&lt;/p&gt;

&lt;p&gt;For most projects, vertical scaling is simpler and cheaper than horizontal scaling. Modern servers are powerful. A single PostgreSQL instance can handle millions of records and thousands of queries per second. You only need horizontal scaling if you're dealing with truly massive datasets (hundreds of terabytes) or extreme traffic levels (hundreds of thousands of concurrent users).&lt;/p&gt;

&lt;p&gt;The key questions to ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How much data will you have in 1-2 years?&lt;/li&gt;
&lt;li&gt;What's your traffic growth projection?&lt;/li&gt;
&lt;li&gt;Can you predict your bottlenecks?&lt;/li&gt;
&lt;li&gt;Do you have the expertise to manage sharded databases?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building a typical web application or SaaS product, PostgreSQL's vertical scaling will probably be enough. MongoDB's sharding makes sense if you're building something that needs to scale globally from day one or you know you'll hit horizontal scaling limits quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Consistency and ACID guarantees
&lt;/h2&gt;

&lt;p&gt;PostgreSQL provides full ACID guarantees for all transactions. Atomicity, Consistency, Isolation and Durability are guaranteed out of the box. Transactions spanning many rows and tables work correctly. If something fails, everything rolls back. Your data stays consistent even under high load or system failures.&lt;/p&gt;
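
&lt;p&gt;A minimal sketch, assuming a hypothetical &lt;code&gt;accounts&lt;/code&gt; table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Both updates commit together or not at all
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If either statement fails, a rollback (explicit or automatic on error) leaves both balances untouched.&lt;/p&gt;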

&lt;p&gt;MongoDB added multi-document ACID transactions in version 4.0 and extended them to sharded clusters in 4.2, but they're slower and more limited than PostgreSQL's transactions. Single-document operations in MongoDB are atomic, but cross-document consistency requires explicit transactions. In practice, many MongoDB users avoid transactions entirely by denormalizing data into single documents.&lt;/p&gt;

&lt;p&gt;For financial applications, e-commerce or anything where data integrity is critical, PostgreSQL's consistency guarantees are valuable. You can trust that your data won't end up in an inconsistent state. MongoDB works fine for use cases where eventual consistency is acceptable, like logging, caching or analytics pipelines.&lt;/p&gt;

&lt;p&gt;MongoDB does offer tunable consistency levels (write concerns and read concerns), which gives you flexibility. But that flexibility also means you need to think carefully about consistency trade-offs and configure them correctly. PostgreSQL just works consistently by default.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Performance characteristics
&lt;/h2&gt;

&lt;p&gt;Performance depends heavily on your use case and access patterns. MongoDB is generally faster for simple reads and writes on single documents. If you're doing lots of inserts or updates on independent records, MongoDB can outperform PostgreSQL. Document databases avoid join overhead by storing related data together.&lt;/p&gt;

&lt;p&gt;PostgreSQL is faster when you need complex queries, joins or aggregations. Its query planner is extremely sophisticated and can optimize complicated queries that would be slow or impossible in MongoDB. PostgreSQL also has better support for indexes, including partial indexes, expression indexes and various index types (B-tree, hash, GIN, GiST).&lt;/p&gt;
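
&lt;p&gt;For instance (table and column names here are hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Partial index: only indexes rows the hot-path query touches
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';

-- Expression index: serves case-insensitive email lookups
CREATE INDEX idx_users_email_lower ON users (lower(email));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;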

&lt;p&gt;Both databases can be fast if you design your schema and indexes properly. But they optimize for different workloads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB: fast single-document operations, high write throughput&lt;/li&gt;
&lt;li&gt;PostgreSQL: fast complex queries, efficient joins, flexible indexing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your application is read-heavy with complex queries, PostgreSQL will likely be faster. If you're write-heavy with simple access patterns, MongoDB might have an edge. In practice, most bottlenecks come from poor schema design or missing indexes, not the database choice itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Backup strategies and operational complexity
&lt;/h2&gt;

&lt;p&gt;Backups are critical for production databases. PostgreSQL has mature backup tools like pg_dump for logical backups and pg_basebackup for physical backups. Point-in-time recovery is available through WAL archiving. Most managed PostgreSQL services (AWS RDS, Google Cloud SQL, Azure Database) include automated backups with easy restore.&lt;/p&gt;
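
&lt;p&gt;As a quick sketch (the database name and paths are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Logical backup in custom format (restore with pg_restore)
pg_dump -Fc -d myapp -f /backups/myapp.dump

# Physical base backup; pairs with WAL archiving for point-in-time recovery
pg_basebackup -D /backups/base -X stream -P
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;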

&lt;p&gt;MongoDB has mongodump for logical backups and filesystem snapshots for physical backups. Backups are straightforward for single-node deployments. But backing up sharded MongoDB clusters requires careful coordination to ensure consistency across shards. You need to stop the balancer and take snapshots at the same time on all shards.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://databasus.com/mongodb-backup" rel="noopener noreferrer"&gt;Databasus&lt;/a&gt; is an industry standard backup tool that supports both PostgreSQL and MongoDB. It handles scheduled backups, multiple storage destinations (S3, Google Drive, local storage) and notifications across Slack, Discord and email. Whether you're running PostgreSQL or MongoDB, Databasus simplifies backup management with a clean interface and reliable scheduling.&lt;/p&gt;

&lt;p&gt;Operational complexity also differs between these databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt;: simpler operations, well-understood tooling, extensive documentation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MongoDB&lt;/strong&gt;: more complex operations with sharding, requires specialized knowledge for production deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PostgreSQL has been around since 1996 and has decades of operational experience built into its tooling and documentation. MongoDB is newer (2009) and still evolving. If you don't have dedicated database administrators, PostgreSQL is easier to operate reliably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which one should you choose?
&lt;/h2&gt;

&lt;p&gt;After comparing these six factors, here's a practical decision framework:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose PostgreSQL if you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have structured, relational data with clear relationships&lt;/li&gt;
&lt;li&gt;Need complex queries, joins or analytical workloads&lt;/li&gt;
&lt;li&gt;Want full ACID guarantees and strong consistency&lt;/li&gt;
&lt;li&gt;Prefer simpler operations and well-established tooling&lt;/li&gt;
&lt;li&gt;Can scale vertically and don't need global horizontal scaling immediately&lt;/li&gt;
&lt;li&gt;Are building typical web applications, SaaS products or business software&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Choose MongoDB if you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have flexible, semi-structured data with evolving schemas&lt;/li&gt;
&lt;li&gt;Need high write throughput with simple access patterns&lt;/li&gt;
&lt;li&gt;Know your access patterns upfront and can denormalize data&lt;/li&gt;
&lt;li&gt;Need native horizontal sharding for massive scale&lt;/li&gt;
&lt;li&gt;Are building applications like content management, catalogs or logging systems&lt;/li&gt;
&lt;li&gt;Have expertise to manage distributed database operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most developers building standard applications, PostgreSQL is the safer choice. It's more flexible for queries, easier to operate and handles most workloads efficiently. MongoDB makes sense for specific use cases where its strengths (flexibility, sharding, document model) align with your requirements.&lt;/p&gt;

&lt;p&gt;The good news is both databases are excellent. Either choice can work if you design your schema properly and use the database's strengths. But understanding these differences helps you pick the right tool and avoid fighting against your database later.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
      <category>mongodb</category>
    </item>
    <item>
      <title>8 MySQL security mistakes that expose your database to attackers</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Wed, 11 Feb 2026 19:55:32 +0000</pubDate>
      <link>https://forem.com/piteradyson/8-mysql-security-mistakes-that-expose-your-database-to-attackers-d3a</link>
      <guid>https://forem.com/piteradyson/8-mysql-security-mistakes-that-expose-your-database-to-attackers-d3a</guid>
      <description>&lt;p&gt;MySQL is one of the most deployed databases in the world, which also makes it one of the most targeted. A lot of MySQL installations in the wild are running with default settings, overly permissive user accounts and no encryption. Some of these are dev setups that accidentally went to production. Others are production systems that nobody ever hardened because "it's behind a firewall."&lt;/p&gt;

&lt;p&gt;This article covers eight real security mistakes that leave MySQL databases exposed. Not abstract threat models, but concrete misconfigurations that attackers actually look for and exploit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph0ywdz2adykebe6jrin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph0ywdz2adykebe6jrin.png" alt="MySQL security mistakes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Running with default credentials and the root account
&lt;/h2&gt;

&lt;p&gt;This sounds obvious, but it still happens constantly. Fresh MySQL installations often ship with a root account that has no password or a well-known default password. Automated scanners specifically look for MySQL instances on port 3306 with empty root passwords. It takes seconds to find and exploit.&lt;/p&gt;

&lt;p&gt;The root account in MySQL has unrestricted access to everything: all databases, all tables, all administrative commands. Using it for application connections means your app has full control over the server, including the ability to drop databases, create users and modify grants.&lt;/p&gt;

&lt;p&gt;Fix the root password immediately after installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="s1"&gt;'root'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'localhost'&lt;/span&gt; &lt;span class="n"&gt;IDENTIFIED&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="s1"&gt;'a-strong-random-password-here'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;FLUSH&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then create separate accounts for each application with only the privileges it needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'10.0.1.%'&lt;/span&gt; &lt;span class="n"&gt;IDENTIFIED&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="s1"&gt;'another-strong-password'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;myapp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'10.0.1.%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;FLUSH&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;10.0.1.%&lt;/code&gt; host restriction means this user can only connect from your application subnet. If someone steals the credentials, they can't use them from an arbitrary machine.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;mysql_secure_installation&lt;/code&gt; on every new MySQL instance. It removes anonymous users, disables remote root login and drops the test database. This takes thirty seconds and closes the most common attack vectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Granting excessive privileges to application users
&lt;/h2&gt;

&lt;p&gt;Most MySQL applications need SELECT, INSERT, UPDATE and DELETE on specific databases. That's it. Yet it's common to see application accounts with &lt;code&gt;GRANT ALL PRIVILEGES ON *.*&lt;/code&gt; because someone copied a Stack Overflow answer during initial setup and never revisited it.&lt;/p&gt;

&lt;p&gt;The damage from excessive privileges scales with the access level. An application account with &lt;code&gt;FILE&lt;/code&gt; privilege can read any file the MySQL process can access on the server filesystem. &lt;code&gt;PROCESS&lt;/code&gt; lets it see all running queries, including those from other users. &lt;code&gt;SUPER&lt;/code&gt; lets it kill connections and change global variables.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Privilege&lt;/th&gt;
&lt;th&gt;What it allows&lt;/th&gt;
&lt;th&gt;Risk if compromised&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ALL PRIVILEGES ON *.*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full administrative access&lt;/td&gt;
&lt;td&gt;Complete server takeover&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FILE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Read/write server filesystem&lt;/td&gt;
&lt;td&gt;Credential theft, data exfiltration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PROCESS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;View all running queries&lt;/td&gt;
&lt;td&gt;Exposure of sensitive queries and data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SUPER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kill connections, change configs&lt;/td&gt;
&lt;td&gt;Denial of service, configuration tampering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SELECT, INSERT, UPDATE, DELETE ON app.*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Standard CRUD on one database&lt;/td&gt;
&lt;td&gt;Limited to application data only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Audit your current grants to see what's actually assigned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Super_priv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;File_priv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Process_priv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Grant_priv&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;mysql&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;Super_priv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Y'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;File_priv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Y'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;Process_priv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Y'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your application user shows up in this list, something is wrong. Revoke what it doesn't need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;REVOKE&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'10.0.1.%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;myapp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'10.0.1.%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;FLUSH&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A good rule of thumb: if you can't explain why an account needs a specific privilege, it shouldn't have it.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Exposing MySQL to the internet without network restrictions
&lt;/h2&gt;

&lt;p&gt;By default, MySQL listens on all network interfaces. That means if your server has a public IP address, MySQL is reachable from the entire internet. Combined with weak credentials (mistake #1), this is how most MySQL breaches happen.&lt;/p&gt;

&lt;p&gt;Check your current binding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;VARIABLES&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'bind_address'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If it shows &lt;code&gt;0.0.0.0&lt;/code&gt; or &lt;code&gt;*&lt;/code&gt;, MySQL is accepting connections from everywhere.&lt;/p&gt;

&lt;p&gt;Restrict it in &lt;code&gt;my.cnf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqld]&lt;/span&gt;
&lt;span class="py"&gt;bind-address&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;127.0.0.1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This limits MySQL to local connections only. If your application runs on a different server, bind to the private network interface instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqld]&lt;/span&gt;
&lt;span class="py"&gt;bind-address&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.0.1.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But network binding alone isn't enough. Add firewall rules to restrict port 3306:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# iptables example: only allow MySQL connections from app server&lt;/span&gt;
iptables &lt;span class="nt"&gt;-A&lt;/span&gt; INPUT &lt;span class="nt"&gt;-p&lt;/span&gt; tcp &lt;span class="nt"&gt;--dport&lt;/span&gt; 3306 &lt;span class="nt"&gt;-s&lt;/span&gt; 10.0.1.10 &lt;span class="nt"&gt;-j&lt;/span&gt; ACCEPT
iptables &lt;span class="nt"&gt;-A&lt;/span&gt; INPUT &lt;span class="nt"&gt;-p&lt;/span&gt; tcp &lt;span class="nt"&gt;--dport&lt;/span&gt; 3306 &lt;span class="nt"&gt;-j&lt;/span&gt; DROP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also consider two related options. &lt;code&gt;skip-networking&lt;/code&gt; disables TCP connections entirely (only socket connections work), which is fine if the application is on the same host. &lt;code&gt;skip-name-resolve&lt;/code&gt; prevents DNS lookups for connecting hosts, which speeds up connections and removes DNS spoofing as an attack vector.&lt;/p&gt;

&lt;p&gt;If your application must reach MySQL over the internet, use an SSH tunnel or VPN instead of opening port 3306 directly. Never expose MySQL to the public internet, even with strong passwords.&lt;/p&gt;
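
&lt;p&gt;For example, a local port forward over SSH (hostnames and users are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Forward local port 3307 to MySQL on the database host over SSH
ssh -N -L 3307:127.0.0.1:3306 deploy@db.example.com

# Then connect locally; traffic to the server is encrypted by SSH
mysql -h 127.0.0.1 -P 3307 -u app_user -p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;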

&lt;h2&gt;
  
  
  4. Not encrypting connections with TLS
&lt;/h2&gt;

&lt;p&gt;MySQL connections transmit data in plaintext by default. This includes queries, result sets, usernames and passwords. Anyone who can capture network traffic between your application and MySQL can read everything.&lt;/p&gt;

&lt;p&gt;This isn't just a theoretical concern. On shared hosting, cloud VPCs with misconfigured security groups and corporate networks, packet sniffing is a real threat. Even "private" networks aren't always as isolated as you think.&lt;/p&gt;

&lt;p&gt;Check if TLS is currently enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;VARIABLES&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%ssl%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To enable TLS, generate or obtain certificates and configure MySQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqld]&lt;/span&gt;
&lt;span class="py"&gt;ssl-ca&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/mysql/ssl/ca-cert.pem&lt;/span&gt;
&lt;span class="py"&gt;ssl-cert&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/mysql/ssl/server-cert.pem&lt;/span&gt;
&lt;span class="py"&gt;ssl-key&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/mysql/ssl/server-key.pem&lt;/span&gt;
&lt;span class="py"&gt;require_secure_transport&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;ON&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;require_secure_transport = ON&lt;/code&gt; setting forces all connections to use TLS. Without it, clients can still connect unencrypted.&lt;/p&gt;

&lt;p&gt;You can also enforce TLS on a per-user basis, which is useful for a gradual rollout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="s1"&gt;'10.0.1.%'&lt;/span&gt; &lt;span class="n"&gt;REQUIRE&lt;/span&gt; &lt;span class="n"&gt;SSL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;FLUSH&lt;/span&gt; &lt;span class="k"&gt;PRIVILEGES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify that connections are actually encrypted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ssl_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ssl_cipher&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;mysql&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'app_user'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And from the client side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;STATUS&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'Ssl_cipher'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;Ssl_cipher&lt;/code&gt; returns an empty string, the connection is unencrypted.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Leaving the binary log and data directory unprotected
&lt;/h2&gt;

&lt;p&gt;MySQL's binary log contains every data-modifying statement that runs against the database. If an attacker gains access to the filesystem, they can read the binary log and reconstruct your entire data history: every insert, update and delete.&lt;/p&gt;

&lt;p&gt;The data directory itself contains the actual table files. Depending on the storage engine, these might be readable with basic tools. InnoDB files can be parsed with specialized utilities to extract raw data, bypassing MySQL authentication entirely.&lt;/p&gt;

&lt;p&gt;Check your current file permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /var/lib/mysql/
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /var/log/mysql/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MySQL data directory and log directory should be owned by the &lt;code&gt;mysql&lt;/code&gt; user and group, with no world-readable permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; mysql:mysql /var/lib/mysql
&lt;span class="nb"&gt;chmod &lt;/span&gt;750 /var/lib/mysql
&lt;span class="nb"&gt;chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; mysql:mysql /var/log/mysql
&lt;span class="nb"&gt;chmod &lt;/span&gt;750 /var/log/mysql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also protect the MySQL configuration file, which may contain passwords:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;600 /etc/mysql/my.cnf
&lt;span class="nb"&gt;chown &lt;/span&gt;root:root /etc/mysql/my.cnf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're running MySQL in Docker, make sure the volume mounts for data and logs aren't world-readable on the host filesystem. Default Docker volume permissions can be more permissive than you expect.&lt;/p&gt;
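
&lt;p&gt;A minimal sketch of tightening that up (the host path is an example; the official Debian-based &lt;code&gt;mysql&lt;/code&gt; image runs as UID 999, but verify this for the image you use):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Create the host directory with restrictive permissions before first start
# (999 is the mysql UID in the official image; check with: docker run --rm mysql:8.0 id mysql)
install -d -m 750 -o 999 -g 999 /srv/mysql-data

# Pass the root password via a Docker secret instead of a plaintext env var
docker run -d --name db -v /srv/mysql-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysql_root mysql:8.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;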

&lt;p&gt;For the binary log specifically, consider encrypting it. MySQL 8.0+ supports binary log encryption:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqld]&lt;/span&gt;
&lt;span class="py"&gt;binlog_encryption&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;ON&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This encrypts the binary log files at rest. Even if someone copies the files, they can't read the contents without the encryption key.&lt;/p&gt;
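
&lt;p&gt;After restarting, confirm that encryption is actually active. Since MySQL 8.0.14, &lt;code&gt;SHOW BINARY LOGS&lt;/code&gt; includes an &lt;code&gt;Encrypted&lt;/code&gt; column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SHOW VARIABLES LIKE 'binlog_encryption';
SHOW BINARY LOGS;  -- the Encrypted column should say 'Yes' for new log files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;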

&lt;h2&gt;
  
  
  6. Ignoring SQL injection in application code
&lt;/h2&gt;

&lt;p&gt;SQL injection has been the number one database attack vector for over two decades, and it still works because developers keep building queries by concatenating user input directly into SQL strings. MySQL doesn't have a built-in defense against this. The protection has to come from application code.&lt;/p&gt;

&lt;p&gt;An injectable query looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Vulnerable: user input directly in the query string
&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM users WHERE email = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;user_input&lt;/code&gt; is &lt;code&gt;' OR '1'='1' -- &lt;/code&gt; (the trailing space matters: MySQL treats &lt;code&gt;--&lt;/code&gt; as a comment only when it's followed by whitespace), the query becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="s1"&gt;'1'&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'1'&lt;/span&gt; &lt;span class="c1"&gt;--'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns every row in the users table. More destructive payloads can drop tables, read files from disk (if the MySQL user has &lt;code&gt;FILE&lt;/code&gt; privilege) or create new admin accounts.&lt;/p&gt;

&lt;p&gt;The fix is parameterized queries. Every database library supports them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Safe: parameterized query
&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM users WHERE email = %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Node.js with mysql2&lt;/span&gt;
&lt;span class="nx"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SELECT * FROM users WHERE email = ?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Go with database/sql&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SELECT * FROM users WHERE email = ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parameterized queries separate the SQL structure from the data. The database engine knows that the parameter is a value, not SQL code, regardless of what it contains.&lt;/p&gt;

&lt;p&gt;On the MySQL side, you can reduce the blast radius by removing the &lt;code&gt;FILE&lt;/code&gt; privilege from application accounts (see mistake #2) and by running MySQL with &lt;code&gt;--local-infile=0&lt;/code&gt; to disable &lt;code&gt;LOAD DATA LOCAL INFILE&lt;/code&gt;, which attackers use for file reading through SQL injection.&lt;/p&gt;
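
&lt;p&gt;Auditing and revoking that privilege is quick (the account matches the earlier example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Find accounts that can read and write server-side files
SELECT user, host FROM mysql.user WHERE File_priv = 'Y';

-- Remove the privilege from application accounts
REVOKE FILE ON *.* FROM 'app_user'@'10.0.1.%';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;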

&lt;h2&gt;
  
  
  7. Not auditing or monitoring database access
&lt;/h2&gt;

&lt;p&gt;If someone is accessing your MySQL database in ways they shouldn't, how quickly would you know? Most MySQL installations have no audit logging enabled. An attacker could be reading sensitive tables for weeks before anyone notices.&lt;/p&gt;

&lt;p&gt;MySQL Enterprise Edition includes an audit plugin, but the community edition requires other approaches. The general query log is one option, though it captures everything and creates enormous log files on busy servers.&lt;/p&gt;

&lt;p&gt;A more practical approach for the community edition is to enable specific logging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqld]&lt;/span&gt;
&lt;span class="py"&gt;log_error&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/var/log/mysql/error.log&lt;/span&gt;
&lt;span class="py"&gt;slow_query_log&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;1&lt;/span&gt;
&lt;span class="py"&gt;slow_query_log_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/var/log/mysql/slow.log&lt;/span&gt;
&lt;span class="py"&gt;log_queries_not_using_indexes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For connection monitoring, regularly check who is connected and what they're doing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;processlist&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'system user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'event_scheduler'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Track failed login attempts by checking the error log. Repeated failed logins from the same IP usually mean a brute force attack is underway.&lt;/p&gt;
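
&lt;p&gt;A quick way to summarize failed logins per source host (the log path is the Debian/Ubuntu default; note that MySQL 8.0 writes access-denied messages to the error log only when &lt;code&gt;log_error_verbosity&lt;/code&gt; is 3):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Count "Access denied" entries grouped by client host
grep "Access denied" /var/log/mysql/error.log | \
  grep -oE "@'[^']+'" | sort | uniq -c | sort -rn | head
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;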

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Monitoring area&lt;/th&gt;
&lt;th&gt;What to watch for&lt;/th&gt;
&lt;th&gt;How to check&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Failed logins&lt;/td&gt;
&lt;td&gt;Brute force attempts&lt;/td&gt;
&lt;td&gt;Error log entries with "Access denied"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unusual connections&lt;/td&gt;
&lt;td&gt;Unknown hosts or users&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;SHOW PROCESSLIST&lt;/code&gt; or processlist table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema changes&lt;/td&gt;
&lt;td&gt;Unauthorized DDL&lt;/td&gt;
&lt;td&gt;General log or trigger-based auditing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privilege escalation&lt;/td&gt;
&lt;td&gt;New grants or users&lt;/td&gt;
&lt;td&gt;Periodic diff of &lt;code&gt;mysql.user&lt;/code&gt; table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large data reads&lt;/td&gt;
&lt;td&gt;Bulk exfiltration&lt;/td&gt;
&lt;td&gt;Slow query log, network monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For production systems, consider deploying a third-party audit plugin like &lt;code&gt;audit_log&lt;/code&gt; from Percona or MariaDB's audit plugin (which works with MySQL forks). These provide structured, filterable audit trails without the overhead of the general query log.&lt;/p&gt;

&lt;p&gt;Set up alerts for critical events: new user creation, privilege changes, connections from unexpected hosts and queries against sensitive tables. The goal is to detect unusual activity before it becomes a full breach.&lt;/p&gt;
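
&lt;p&gt;The "periodic diff" idea from the table can be a simple cron job (file paths are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Snapshot the account list daily, then diff against yesterday's copy
mysql -N -e "SELECT user, host FROM mysql.user ORDER BY user, host" &amp;gt; /var/backups/mysql-users.today
diff /var/backups/mysql-users.yesterday /var/backups/mysql-users.today \
  || echo "ALERT: MySQL accounts changed"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;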

&lt;h2&gt;
  
  
  8. Skipping backups or storing them insecurely
&lt;/h2&gt;

&lt;p&gt;Security isn't just about preventing unauthorized access. It's also about recovery. Ransomware attacks against MySQL databases are real: attackers gain access, drop all tables and leave a ransom note. Without backups, you're negotiating with criminals.&lt;/p&gt;

&lt;p&gt;But having backups isn't enough if they're stored insecurely. Unencrypted backup files sitting on the same server as MySQL are useless in a ransomware scenario because the attacker deletes them too. Backups on an S3 bucket with public read access are just a different kind of data breach.&lt;/p&gt;

&lt;p&gt;A secure backup strategy covers three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Encryption&lt;/strong&gt; — Backup files should be encrypted at rest so they're useless if stolen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offsite storage&lt;/strong&gt; — At least one copy should be on a separate system or cloud storage that the MySQL server doesn't have delete access to&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regular testing&lt;/strong&gt; — A backup you've never restored is a backup you hope works&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For MySQL, &lt;code&gt;mysqldump&lt;/code&gt; is the basic tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysqldump &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--routines&lt;/span&gt; &lt;span class="nt"&gt;--triggers&lt;/span&gt; myapp | &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nb"&gt;gzip&lt;/span&gt; | openssl enc &lt;span class="nt"&gt;-aes-256-cbc&lt;/span&gt; &lt;span class="nt"&gt;-salt&lt;/span&gt; &lt;span class="nt"&gt;-pbkdf2&lt;/span&gt; &lt;span class="nt"&gt;-out&lt;/span&gt; /backup/myapp_&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%F&lt;span class="si"&gt;)&lt;/span&gt;.sql.gz.enc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a compressed, encrypted backup. But managing encryption keys, scheduling and offsite storage manually is tedious and error-prone.&lt;/p&gt;
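
&lt;p&gt;Whichever tool produces the backup, exercise the restore path regularly. For the dump above, a scratch-database restore looks like this (the filename and database name are placeholders; the &lt;code&gt;openssl&lt;/code&gt; flags mirror the encryption command):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Decrypt and restore into a throwaway database to prove the backup works
mysql -e "CREATE DATABASE myapp_restore_test"
openssl enc -d -aes-256-cbc -pbkdf2 -in /backup/myapp_YYYY-MM-DD.sql.gz.enc | \
  gunzip | mysql myapp_restore_test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;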

&lt;p&gt;&lt;a href="https://databasus.com/mysql-backup" rel="noopener noreferrer"&gt;MySQL backup&lt;/a&gt; tools like Databasus automate the entire process. It's an industry standard for MySQL backup tools that handles scheduling, compression, AES-256-GCM encryption and storage to multiple destinations like S3, Google Drive and SFTP. It's suitable for individual developers and enterprise teams, with workspace-based access management and audit logs.&lt;/p&gt;

&lt;p&gt;Whatever approach you choose, make sure your backups are not accessible from the MySQL server with the same credentials. If the database server is compromised, the attacker shouldn't be able to delete your backups.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern behind these mistakes
&lt;/h2&gt;

&lt;p&gt;Looking at these eight mistakes together, a pattern emerges. Most MySQL security failures come from defaults that were never changed, permissions that were never reviewed and monitoring that was never set up. None of these fixes are complex. They don't require expensive tools or deep security expertise.&lt;/p&gt;

&lt;p&gt;Start with the basics: strong credentials, minimal privileges, network restrictions and encrypted connections. Then add monitoring so you know when something unusual happens. And keep tested, encrypted backups so you can recover when prevention fails.&lt;/p&gt;

&lt;p&gt;The best time to secure your MySQL database was when you first set it up. The second best time is now.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mysql</category>
    </item>
    <item>
      <title>PostgreSQL slow queries — 7 ways to find and fix performance bottlenecks</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Tue, 10 Feb 2026 19:52:03 +0000</pubDate>
      <link>https://forem.com/piteradyson/postgresql-slow-queries-7-ways-to-find-and-fix-performance-bottlenecks-2app</link>
      <guid>https://forem.com/piteradyson/postgresql-slow-queries-7-ways-to-find-and-fix-performance-bottlenecks-2app</guid>
      <description>&lt;p&gt;Every PostgreSQL database eventually develops slow queries. It might start small: a dashboard that takes a bit longer to load, an API endpoint that times out during peak traffic, a report that used to run in seconds and now takes minutes. The tricky part is that slow queries rarely announce themselves. They creep in as data grows, schemas change and new features pile on.&lt;/p&gt;

&lt;p&gt;This article covers seven practical ways to find the queries that are hurting your database and fix them. Not theoretical advice, but actual tools and techniques you can apply to a running PostgreSQL instance today.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy4cg9sjdsyiw01qzgqb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy4cg9sjdsyiw01qzgqb.png" alt="PostgreSQL query optimization" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Enable pg_stat_statements to find your worst offenders
&lt;/h2&gt;

&lt;p&gt;The single most useful extension for tracking slow queries in PostgreSQL is &lt;code&gt;pg_stat_statements&lt;/code&gt;. It records execution statistics for every query that runs against your database, including how many times it ran, total execution time, rows returned and more.&lt;/p&gt;

&lt;p&gt;Most performance problems come from a handful of queries. pg_stat_statements lets you find them without guessing.&lt;/p&gt;

&lt;p&gt;To enable it, add the extension to your &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After restarting PostgreSQL, create the extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then query it to find the most time-consuming queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_time_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_time_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;total_exec_time&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_exec_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;())::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;percent_total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_exec_time&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows you which queries consume the most cumulative time. A query that runs 50,000 times a day at 100ms each (roughly 83 minutes of cumulative database time) is a bigger problem than a query that runs once at 5 seconds. The &lt;code&gt;percent_total&lt;/code&gt; column makes this obvious.&lt;/p&gt;

&lt;p&gt;You can also find queries with the highest average execution time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_time_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;max_time_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;calls&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;mean_exec_time&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;WHERE calls &amp;gt; 10&lt;/code&gt; filter avoids one-off admin queries that would distort the results.&lt;/p&gt;

&lt;p&gt;Reset statistics periodically to keep the data relevant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements_reset&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;pg_stat_statements is the starting point. Everything else in this article builds on knowing which queries to focus on.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Use EXPLAIN ANALYZE to understand what's actually happening
&lt;/h2&gt;

&lt;p&gt;Once you know which queries are slow, &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; tells you why. It runs the query and shows the execution plan PostgreSQL actually used, including the time spent at each step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash Join  (cost=12.50..345.00 rows=150 actual time=0.82..15.43 rows=143 loops=1)
  Hash Cond: (o.customer_id = c.id)
  -&amp;gt;  Seq Scan on orders o  (cost=0.00..310.00 rows=150 actual time=0.02..14.20 rows=143 loops=1)
        Filter: ((created_at &amp;gt; '2026-01-01') AND (status = 'completed'))
        Rows Removed by Filter: 99857
  -&amp;gt;  Hash  (cost=10.00..10.00 rows=200 actual time=0.45..0.45 rows=200 loops=1)
        -&amp;gt;  Seq Scan on customers c  (cost=0.00..10.00 rows=200 actual time=0.01..0.20 rows=200 loops=1)
Planning Time: 0.15 ms
Execution Time: 15.60 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important things to look for:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Warning sign&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Seq Scan&lt;/code&gt; on a large table&lt;/td&gt;
&lt;td&gt;No index is being used, every row is read&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Rows Removed by Filter: 99857&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The scan reads far more rows than it returns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;actual rows&lt;/code&gt; much higher than &lt;code&gt;rows&lt;/code&gt; (estimate)&lt;/td&gt;
&lt;td&gt;Statistics are stale, run &lt;code&gt;ANALYZE&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Nested Loop&lt;/code&gt; with high &lt;code&gt;loops&lt;/code&gt; count&lt;/td&gt;
&lt;td&gt;The inner side runs thousands of times&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Sort&lt;/code&gt; with &lt;code&gt;external merge&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Not enough &lt;code&gt;work_mem&lt;/code&gt;, sorting spills to disk&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;Seq Scan&lt;/code&gt; on orders above is the bottleneck. It reads 100,000 rows to return 143. An index on &lt;code&gt;(status, created_at)&lt;/code&gt; would fix this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_status_created&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the index, run &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; again. You should see an &lt;code&gt;Index Scan&lt;/code&gt; or &lt;code&gt;Bitmap Index Scan&lt;/code&gt; replacing the sequential scan, and the execution time dropping significantly.&lt;/p&gt;

&lt;p&gt;One thing people miss: &lt;code&gt;EXPLAIN&lt;/code&gt; without &lt;code&gt;ANALYZE&lt;/code&gt; shows the plan but doesn't execute the query. It gives you estimates, not actual numbers. Always use &lt;code&gt;ANALYZE&lt;/code&gt; when debugging performance, unless the query modifies data (in that case, wrap it in a transaction and roll back).&lt;/p&gt;
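
&lt;p&gt;For a data-modifying statement, that transaction-and-rollback pattern looks like this (the table and column values are illustrative, reusing the schema from the example above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;BEGIN;
EXPLAIN ANALYZE
UPDATE orders SET status = 'archived' WHERE created_at &amp;lt; '2025-01-01';
ROLLBACK;  -- the UPDATE really executed, but its changes are discarded
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;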

&lt;h2&gt;
  
  
  3. Configure the slow query log
&lt;/h2&gt;

&lt;p&gt;pg_stat_statements gives you aggregate data, but sometimes you need to see individual slow queries as they happen. PostgreSQL's built-in slow query log captures every query that exceeds a time threshold.&lt;/p&gt;

&lt;p&gt;Add these settings to &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;log_min_duration_statement = 500
log_statement = 'none'
log_duration = off
log_line_prefix = '%t [%p] %u@%d '
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This logs any query that takes longer than 500 milliseconds. The &lt;code&gt;log_line_prefix&lt;/code&gt; adds the timestamp, process ID, username and database name to each log entry, which is essential for debugging.&lt;/p&gt;

&lt;p&gt;Setting &lt;code&gt;log_min_duration_statement = 0&lt;/code&gt; logs every query. This is useful for short debugging sessions but generates enormous log files on busy databases. For production, start with 500ms or 1000ms and lower it as you fix the worst offenders.&lt;/p&gt;
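&lt;p&gt;You can also adjust the threshold at runtime without editing the file or restarting, which makes the lower-it-gradually approach painless:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;ALTER SYSTEM SET log_min_duration_statement = '500ms';
SELECT pg_reload_conf();  -- applies the change to running sessions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;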

&lt;p&gt;The log entries look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2026-02-10 14:23:45 UTC [12345] app_user@mydb LOG: duration: 2345.678 ms  statement: 
    SELECT u.*, p.* FROM users u JOIN purchases p ON p.user_id = u.id 
    WHERE u.region = 'eu' ORDER BY p.created_at DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For more structured analysis, tools like pgBadger can parse these logs and generate reports showing the slowest queries, most frequent queries and query patterns over time. But the raw log is often enough to spot problems.&lt;/p&gt;

&lt;p&gt;A practical approach: enable the slow query log in production at 1000ms, review it weekly, fix the top offenders, then lower the threshold to 500ms. Repeat until the log is mostly quiet.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Fix missing and misused indexes
&lt;/h2&gt;

&lt;p&gt;Missing indexes are the most common cause of slow queries in PostgreSQL. But "add more indexes" isn't always the answer. Sometimes existing indexes aren't being used, or the wrong type of index was created.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding missing indexes.&lt;/strong&gt; Start with the query from pg_stat_statements, then check if the tables involved have appropriate indexes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;schemaname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seq_scan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seq_tup_read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idx_scan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_live_tup&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;row_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_tables&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;seq_scan&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;seq_tup_read&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tables with a high &lt;code&gt;seq_tup_read&lt;/code&gt; and low &lt;code&gt;idx_scan&lt;/code&gt; are being scanned sequentially when they probably shouldn't be. A table with 10 million rows and zero index scans is almost certainly missing an index.&lt;/p&gt;
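&lt;p&gt;Once a suspect table surfaces, check which indexes it already has before creating new ones (swap in your own table name):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'orders';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;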

&lt;p&gt;&lt;strong&gt;Finding unused indexes.&lt;/strong&gt; Indexes you never use still cost write performance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;indexrelname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;times_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexrelid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_size&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_indexes&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;indexrelid&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;indexrelid&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_index&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;indisprimary&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexrelid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows indexes that have never been scanned (excluding primary keys). If an index is 500 MB and has zero scans, it's slowing down every write for nothing. Drop it.&lt;/p&gt;
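&lt;p&gt;On a live system, drop it with &lt;code&gt;CONCURRENTLY&lt;/code&gt; so the operation doesn't block concurrent queries (note that it can't run inside a transaction block; the index name here is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;DROP INDEX CONCURRENTLY idx_orders_legacy_status;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;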

&lt;p&gt;&lt;strong&gt;Common indexing mistakes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Indexing a column with very low cardinality (like a boolean &lt;code&gt;is_active&lt;/code&gt; column with 99% true values). The planner often prefers a sequential scan because the index doesn't filter out enough rows.&lt;/li&gt;
&lt;li&gt;Creating single-column indexes when your queries filter on multiple columns. A composite index on &lt;code&gt;(status, created_at)&lt;/code&gt; is much better than separate indexes on &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;created_at&lt;/code&gt; when your &lt;code&gt;WHERE&lt;/code&gt; clause uses both.&lt;/li&gt;
&lt;li&gt;Forgetting partial indexes. If 95% of queries filter for active records, create a partial index:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This index is smaller, faster to scan and faster to maintain than a full index.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Tune PostgreSQL memory and planner settings
&lt;/h2&gt;

&lt;p&gt;Default PostgreSQL configuration is deliberately conservative. It assumes the server has 128 MB of RAM and a single spinning disk. If you're running on a modern server with 16 GB of RAM and SSDs, the defaults are leaving performance on the table.&lt;/p&gt;

&lt;p&gt;The key settings that affect query performance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Recommended starting point&lt;/th&gt;
&lt;th&gt;What it controls&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;shared_buffers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;128 MB&lt;/td&gt;
&lt;td&gt;25% of total RAM&lt;/td&gt;
&lt;td&gt;PostgreSQL's shared memory cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;work_mem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4 MB&lt;/td&gt;
&lt;td&gt;64-256 MB&lt;/td&gt;
&lt;td&gt;Memory for sorts and hash operations per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;effective_cache_size&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4 GB&lt;/td&gt;
&lt;td&gt;50-75% of total RAM&lt;/td&gt;
&lt;td&gt;Planner's estimate of available OS cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;random_page_cost&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4.0&lt;/td&gt;
&lt;td&gt;1.1 for SSD, 2.0 for HDD&lt;/td&gt;
&lt;td&gt;Cost of random disk reads (affects index usage)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;effective_io_concurrency&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;200 for SSD&lt;/td&gt;
&lt;td&gt;Number of concurrent disk I/O operations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
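
&lt;p&gt;Before changing anything, check what your server is actually running with; &lt;code&gt;pg_settings&lt;/code&gt; shows the live values and their units:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size',
               'random_page_cost', 'effective_io_concurrency');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;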

&lt;p&gt;&lt;strong&gt;shared_buffers&lt;/strong&gt; is the most important one. PostgreSQL uses this as its primary data cache. Too low and it constantly re-reads data from disk. Too high and it competes with the OS page cache. 25% of total RAM is a good starting point for most workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;work_mem&lt;/strong&gt; is tricky because it's per-operation, not per-query. A complex query with five sort operations and three hash joins could allocate up to 8x &lt;code&gt;work_mem&lt;/code&gt;. Setting it to 256 MB sounds reasonable until 50 concurrent connections each allocate multiple chunks. Start with 64 MB and monitor.&lt;/p&gt;
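&lt;p&gt;For the occasional heavy report, you can raise &lt;code&gt;work_mem&lt;/code&gt; for a single transaction instead of globally (the &lt;code&gt;SELECT&lt;/code&gt; here is a stand-in for your expensive query):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;BEGIN;
SET LOCAL work_mem = '256MB';  -- reverts automatically at COMMIT/ROLLBACK
SELECT region, sum(total) FROM orders GROUP BY region ORDER BY 2 DESC;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;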

&lt;p&gt;&lt;strong&gt;random_page_cost&lt;/strong&gt; is the one that catches most people. The default of 4.0 tells the planner that random disk reads are four times more expensive than sequential reads. That was true for spinning disks. On SSDs, random and sequential reads are nearly identical. Lowering this to 1.1 makes the planner much more willing to use indexes, which is usually what you want on SSD storage.&lt;/p&gt;

&lt;p&gt;You can change these without a restart (except &lt;code&gt;shared_buffers&lt;/code&gt;) using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;SYSTEM&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;work_mem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'64MB'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;SYSTEM&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;random_page_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;SYSTEM&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;effective_cache_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'12GB'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;pg_reload_conf&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After changing settings, test with &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; on your slow queries. You should see different plan choices, especially more index scans and in-memory sorts.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Rewrite problematic query patterns
&lt;/h2&gt;

&lt;p&gt;Sometimes the query itself is the problem. No amount of indexing or tuning will fix a fundamentally inefficient query. Here are patterns that consistently cause performance issues and how to fix them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SELECT * when you only need a few columns.&lt;/strong&gt; This forces PostgreSQL to read and transfer every column, including large text or JSONB fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Slow: reads everything, including a 10 KB description column&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Better: only fetches what you need&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters more than people think, especially with TOAST (The Oversized Attribute Storage Technique). Large columns are stored separately, and fetching them requires additional disk reads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correlated subqueries that run once per row.&lt;/strong&gt; The planner sometimes can't flatten these:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Slow: subquery executes for each order row&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Better: explicit JOIN&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using OFFSET for pagination on large datasets.&lt;/strong&gt; &lt;code&gt;OFFSET 100000&lt;/code&gt; means PostgreSQL fetches and discards 100,000 rows before returning results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Slow: scans and discards 100,000 rows&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="mi"&gt;100000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Better: keyset pagination using the last seen value&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-15T10:30:00Z'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keyset pagination is consistently fast regardless of how deep into the result set you go. It requires an index on the column you're paginating by.&lt;/p&gt;
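&lt;p&gt;One caveat: if &lt;code&gt;created_at&lt;/code&gt; isn't unique, duplicate timestamps can make pages skip or repeat rows. Adding a unique tie-breaker (here, &lt;code&gt;id&lt;/code&gt;; the literal values are illustrative) with a row comparison fixes that, backed by a matching composite index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE INDEX idx_events_created_id ON events (created_at DESC, id DESC);

SELECT * FROM events
WHERE (created_at, id) &amp;lt; ('2026-01-15T10:30:00Z', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 20;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;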

&lt;p&gt;&lt;strong&gt;Unnecessary DISTINCT or GROUP BY.&lt;/strong&gt; If you're adding &lt;code&gt;DISTINCT&lt;/code&gt; because a JOIN produces duplicates, the JOIN is probably wrong. Fix the JOIN condition instead of papering over it with &lt;code&gt;DISTINCT&lt;/code&gt;.&lt;/p&gt;
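&lt;p&gt;When the join only exists to check that related rows exist, &lt;code&gt;EXISTS&lt;/code&gt; avoids producing duplicates in the first place (table names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Instead of: SELECT DISTINCT c.* FROM customers c JOIN orders o ON o.customer_id = c.id
SELECT c.* FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;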

&lt;p&gt;&lt;strong&gt;Functions in WHERE clauses that prevent index usage:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Index on created_at won't be used&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Rewrite to use the index&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2027-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7. Keep statistics up to date with ANALYZE and VACUUM
&lt;/h2&gt;

&lt;p&gt;PostgreSQL's query planner relies on table statistics to make decisions. How many rows does a table have? What's the distribution of values in each column? How many distinct values are there? If these statistics are wrong, the planner makes bad choices.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ANALYZE&lt;/code&gt; collects fresh statistics about table contents. &lt;code&gt;VACUUM&lt;/code&gt; reclaims space from deleted or updated rows (dead tuples) that PostgreSQL can't reuse. Both are essential for sustained query performance.&lt;/p&gt;

&lt;p&gt;Autovacuum handles this automatically by default, but it doesn't always keep up. Large batch operations, bulk deletes and rapidly growing tables can outpace the default autovacuum settings.&lt;/p&gt;
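&lt;p&gt;You can watch what autovacuum is doing at this moment through &lt;code&gt;pg_stat_progress_vacuum&lt;/code&gt; (available since PostgreSQL 9.6); an empty result means no vacuum is currently running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT pid, relid::regclass AS table_name, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;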

&lt;p&gt;Check if your statistics are stale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_analyze&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_autoanalyze&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_live_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_dead_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;n_dead_tup&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_live_tup&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;n_dead_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dead_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_tables&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;n_dead_tup&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tables with a high percentage of dead tuples need vacuuming. Tables that haven't been analyzed recently may have stale statistics.&lt;/p&gt;
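&lt;p&gt;For a table autovacuum has fallen behind on, a manual vacuum reclaims the dead tuples online; unlike &lt;code&gt;VACUUM FULL&lt;/code&gt;, it doesn't take an exclusive lock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;VACUUM (ANALYZE, VERBOSE) orders;  -- cleans up dead tuples and refreshes statistics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;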

&lt;p&gt;If a table had 10,000 rows when statistics were collected but now has 10 million, the planner might choose a sequential scan based on the old row count when an index scan would be far more efficient. Running &lt;code&gt;ANALYZE&lt;/code&gt; fixes this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For autovacuum tuning, the defaults are cautious. On busy databases, consider adjusting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02
autovacuum_vacuum_cost_delay = 2ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scale factors control when autovacuum kicks in. The default &lt;code&gt;vacuum_scale_factor&lt;/code&gt; of 0.2 means autovacuum runs after 20% of rows have been modified. On a 100 million row table, that's 20 million dead tuples before cleanup starts. Lowering it to 0.05 (5%) keeps things cleaner.&lt;/p&gt;
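&lt;p&gt;The trigger condition, per the PostgreSQL documentation, combines a flat threshold with the scale factor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vacuum runs when:  dead tuples &amp;gt; autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row count
with defaults:     dead tuples &amp;gt; 50 + 0.2 * row count
at 100M rows:      dead tuples &amp;gt; 50 + 20,000,000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;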

&lt;p&gt;For large tables with specific requirements, you can set per-table autovacuum settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;autovacuum_vacuum_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;autovacuum_analyze_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;005&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Keeping your data safe while you optimize
&lt;/h2&gt;

&lt;p&gt;Tuning queries and tweaking PostgreSQL configuration is relatively safe work. But mistakes happen. A dropped index on a production table during peak hours, a configuration change that causes out-of-memory crashes, a &lt;code&gt;VACUUM FULL&lt;/code&gt; on a massive table that takes an exclusive lock at the wrong moment.&lt;/p&gt;

&lt;p&gt;Having reliable backups means you can optimize with confidence. &lt;a href="https://databasus.com" rel="noopener noreferrer"&gt;PostgreSQL backup&lt;/a&gt; tools like Databasus handle automated scheduled backups with compression, encryption and multiple storage destinations, and work for individual developers and enterprise teams alike.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;Fixing slow queries in PostgreSQL isn't a one-time task. It's a cycle: identify the slow queries with pg_stat_statements, understand why they're slow with EXPLAIN ANALYZE, fix the root cause (missing index, bad query pattern, stale statistics or wrong configuration) and then monitor to make sure the fix holds.&lt;/p&gt;

&lt;p&gt;Start with pg_stat_statements if you haven't already. It takes five minutes to set up and immediately shows you where your database is spending its time. From there, work through the list: check your indexes, review your configuration settings, look for problematic query patterns and make sure autovacuum is keeping up.&lt;/p&gt;

&lt;p&gt;Most PostgreSQL performance problems have straightforward solutions. The hard part is knowing where to look.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>PostgreSQL indexing explained — 5 index types and when to use each</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Mon, 09 Feb 2026 13:56:40 +0000</pubDate>
      <link>https://forem.com/piteradyson/postgresql-indexing-explained-5-index-types-and-when-to-use-each-45ae</link>
      <guid>https://forem.com/piteradyson/postgresql-indexing-explained-5-index-types-and-when-to-use-each-45ae</guid>
      <description>&lt;p&gt;Indexes are one of those things that everybody knows they should use, but few people actually understand beyond the basics. You create an index, the query gets faster, done. Except when it doesn't. Or when the wrong index makes things slower. Or when you're running five indexes on a table and none of them are being used.&lt;/p&gt;

&lt;p&gt;PostgreSQL ships with five distinct index types, each designed for different access patterns. Picking the right one is the difference between a query that takes 2 milliseconds and one that takes 20 seconds. This article covers all five, when they actually help and when they're a waste of disk space.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1j315g9tavuvx15etrae.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1j315g9tavuvx15etrae.png" alt="PostgreSQL indexes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How PostgreSQL indexes work under the hood
&lt;/h2&gt;

&lt;p&gt;Before jumping into specific types, it helps to understand what an index actually does. A PostgreSQL index is a separate data structure that maps column values to the physical location of rows on disk. When you run a query with a &lt;code&gt;WHERE&lt;/code&gt; clause, the planner checks whether an index exists that can narrow down the search instead of scanning every row.&lt;/p&gt;

&lt;p&gt;Without an index, PostgreSQL performs a sequential scan. It reads the entire table, row by row, checking each one against your filter. For a table with 100 rows, that's fine. For a table with 100 million rows, it's a problem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Without an index, this scans the entire table&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'abc-123'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- With an index on customer_id, PostgreSQL jumps directly to matching rows&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Indexes aren't free though. Every index takes disk space and slows down &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt; operations because PostgreSQL has to maintain the index alongside the table data. A table with ten indexes means every write operation updates ten additional data structures.&lt;/p&gt;
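&lt;p&gt;To see what that overhead looks like on disk, compare a table's heap size with the combined size of its indexes (swap in your own table name):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT pg_size_pretty(pg_relation_size('orders')) AS table_size,
       pg_size_pretty(pg_indexes_size('orders')) AS total_index_size;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;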

&lt;p&gt;The goal is to have the right indexes for your query patterns and nothing more.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. B-tree — the default workhorse
&lt;/h2&gt;

&lt;p&gt;B-tree is the default index type in PostgreSQL. If you run &lt;code&gt;CREATE INDEX&lt;/code&gt; without specifying a type, you get a B-tree. It handles equality and range queries on sortable data, which covers the vast majority of real-world use cases.&lt;/p&gt;

&lt;p&gt;B-tree indexes store data in a balanced tree structure. Each node contains sorted keys and pointers to child nodes, allowing PostgreSQL to find any value in O(log n) time. They support &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;=&lt;/code&gt;, &lt;code&gt;&amp;gt;=&lt;/code&gt;, &lt;code&gt;BETWEEN&lt;/code&gt; and &lt;code&gt;IS NULL&lt;/code&gt; operators efficiently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- All of these use B-tree indexes effectively&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_created_at&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2026-02-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-02-08'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Multi-column B-tree indexes&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_date&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- This uses the index (leftmost prefix rule)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'abc-123'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- This also uses the index (first column matches)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'abc-123'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- This does NOT use the index efficiently (skips the first column)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The column order in multi-column B-tree indexes matters a lot. PostgreSQL uses the index most effectively when your query filters on a leftmost prefix of the indexed columns. If your query only filters on the second column, the index likely won't help.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;B-tree works?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Exact match (&lt;code&gt;WHERE status = 'active'&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Range queries (&lt;code&gt;WHERE price &amp;gt; 100&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sorting (&lt;code&gt;ORDER BY created_at DESC&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pattern matching (&lt;code&gt;WHERE name LIKE 'John%'&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Yes (prefix only; requires the C collation or a &lt;code&gt;text_pattern_ops&lt;/code&gt; index)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pattern matching (&lt;code&gt;WHERE name LIKE '%John%'&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array or JSON containment&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;B-tree is the right choice for primary keys, foreign keys, timestamp columns used in range filters and any column you frequently sort on. If you're unsure which index type to use, B-tree is almost always a safe starting point.&lt;/p&gt;
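&lt;p&gt;Sorting is worth spelling out: because a B-tree stores keys in order, the planner can walk the index instead of running a separate sort step. A sketch, again assuming a hypothetical &lt;code&gt;orders&lt;/code&gt; table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- A descending index matches the common "newest first" query pattern
CREATE INDEX idx_orders_created_at_desc ON orders (created_at DESC);

-- The planner can read the index in order and stop after 20 rows,
-- avoiding a sort over the whole table
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;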

&lt;h2&gt;
  
  
  2. Hash — fast equality lookups
&lt;/h2&gt;

&lt;p&gt;Hash indexes build a hash table mapping each value to the row locations that contain it. They only support equality comparisons (&lt;code&gt;=&lt;/code&gt;), but they do it in O(1) average time rather than the O(log n) of a B-tree.&lt;/p&gt;

&lt;p&gt;Before PostgreSQL 10, hash indexes were not crash-safe because they weren't WAL-logged. That made them basically unusable in production. Since PostgreSQL 10, they're fully crash-safe and a reasonable option for specific workloads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_sessions_token&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;hash&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- This uses the hash index&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'a1b2c3d4e5f6'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- This does NOT use the hash index (not an equality check)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'a1b2c3d4e5f6'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hash indexes are smaller than B-tree indexes for the same data, which can matter for large tables with high-cardinality columns. If you have a table with 50 million rows and you only ever look up by an exact session token or API key, a hash index uses less memory and disk.&lt;/p&gt;

&lt;p&gt;In practice, the difference is often marginal. B-tree handles equality just fine, and it also supports range queries as a bonus. Most PostgreSQL users never create a hash index. But if you're optimizing a high-throughput lookup table where every byte of index size matters, it's worth benchmarking.&lt;/p&gt;

&lt;p&gt;When to use hash over B-tree:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exact match queries only, no range scans&lt;/li&gt;
&lt;li&gt;Very high cardinality columns (UUIDs, tokens, hashes)&lt;/li&gt;
&lt;li&gt;You want the smallest possible index size&lt;/li&gt;
&lt;li&gt;You've benchmarked and confirmed it outperforms B-tree for your workload&lt;/li&gt;
&lt;/ul&gt;
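&lt;p&gt;You can check the size claim for yourself by building both index types on the same column and comparing. A sketch assuming the hypothetical &lt;code&gt;sessions&lt;/code&gt; table from above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE INDEX idx_sessions_token_btree ON sessions (token);
CREATE INDEX idx_sessions_token_hash  ON sessions USING hash (token);

-- Compare the on-disk size of the two indexes
SELECT relname, pg_size_pretty(pg_relation_size(oid))
FROM pg_class
WHERE relname IN ('idx_sessions_token_btree', 'idx_sessions_token_hash');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;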

&lt;h2&gt;
  
  
  3. GIN — for full-text search, arrays and JSONB
&lt;/h2&gt;

&lt;p&gt;GIN stands for Generalized Inverted Index. It's designed for values that contain multiple elements, like arrays, JSONB documents and full-text search vectors. Where a B-tree maps one value to one row, a GIN index maps each element inside a composite value to the rows that contain it.&lt;/p&gt;

&lt;p&gt;Think of it like a book index at the back of a textbook. You look up a word and it tells you all the pages where that word appears. GIN does the same thing for array elements, JSON keys and text lexemes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Full-text search&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_articles_search&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'postgresql &amp;amp; indexing'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- JSONB containment&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_events_data&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;@&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'{"source": "api", "version": 2}'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Array containment&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_products_tags&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;@&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ARRAY&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'electronics'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'wireless'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GIN indexes are slower to build and update than B-tree indexes. Every insert potentially needs to update many entries in the inverted index. For write-heavy tables, this can be a noticeable overhead. PostgreSQL mitigates this with "fastupdate", which batches new entries in a pending list. Query results stay correct, because lookups also scan the unmerged pending entries, but that extra scan means lookups can slow down during heavy writes until the list is merged.&lt;/p&gt;
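&lt;p&gt;If the pending-list behavior matters for your workload, both knobs are adjustable per index. These are real GIN storage parameters; the index names reuse the examples above and the limit value is illustrative only:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Disable the pending list entirely: each write pays the full index
-- update cost, but lookups never have extra pending entries to scan
CREATE INDEX idx_events_data_nofast ON events USING gin (metadata)
WITH (fastupdate = off);

-- Or keep fastupdate but cap the pending list size (in kB) for one index
ALTER INDEX idx_events_data SET (gin_pending_list_limit = 512);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;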

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;B-tree&lt;/th&gt;
&lt;th&gt;GIN&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Equality and range queries&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full-text search (&lt;code&gt;@@&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array containment (&lt;code&gt;@&amp;gt;&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSONB containment (&lt;code&gt;@&amp;gt;&lt;/code&gt;, &lt;code&gt;?&lt;/code&gt;, &lt;code&gt;?&amp;amp;&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Index build speed&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Write overhead&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium to high&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Index size&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Large&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;GIN is the correct choice whenever you need to search within composite values. If you're running &lt;code&gt;WHERE tags @&amp;gt; ...&lt;/code&gt;, &lt;code&gt;WHERE metadata @&amp;gt; ...&lt;/code&gt; or &lt;code&gt;WHERE tsvector @@ tsquery&lt;/code&gt;, a GIN index is what you want. Just be aware that it comes with higher write costs and larger disk usage compared to B-tree.&lt;/p&gt;
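&lt;p&gt;For JSONB specifically, there's also a narrower operator class worth knowing. &lt;code&gt;jsonb_path_ops&lt;/code&gt; supports only the containment operator, but the resulting index is typically smaller and faster for that one operation than the default &lt;code&gt;jsonb_ops&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Supports only @&amp;gt;, in exchange for a smaller, faster index
CREATE INDEX idx_events_data_path ON events USING gin (metadata jsonb_path_ops);

SELECT * FROM events
WHERE metadata @&amp;gt; '{"source": "api"}';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;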

&lt;h2&gt;
  
  
  4. GiST — for geometric, range and proximity queries
&lt;/h2&gt;

&lt;p&gt;GiST stands for Generalized Search Tree. It's a framework for building custom index types, but in practice it's mostly used for geometric data (points, polygons, circles), range types (date ranges, integer ranges) and full-text search (as an alternative to GIN).&lt;/p&gt;

&lt;p&gt;GiST indexes work by recursively partitioning the search space. For geometric data, imagine dividing a map into progressively smaller regions. To find all restaurants within 500 meters, the index eliminates entire regions that are too far away without checking individual rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostGIS spatial queries&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_locations_geo&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coordinates&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_DWithin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_MakePoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;73&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;985&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;748&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;geography&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Range overlap queries&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_reservations_period&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;during&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;during&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2026-02-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2026-02-15'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Nearest-neighbor search&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coordinates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_MakePoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;73&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;985&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;748&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;geography&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;coordinates&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ST_MakePoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;73&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;985&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;748&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;geography&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GiST also supports full-text search, but with different trade-offs compared to GIN. GiST full-text indexes are faster to build and smaller on disk, but slower for queries, especially when a search term appears in many documents. GIN is generally preferred for full-text search unless you're combining it with other GiST-supported operations.&lt;/p&gt;

&lt;p&gt;When to use GiST:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostGIS and geographic data (finding nearby points, intersecting polygons)&lt;/li&gt;
&lt;li&gt;Range type operations (overlapping date ranges, integer ranges)&lt;/li&gt;
&lt;li&gt;Nearest-neighbor queries (&lt;code&gt;ORDER BY ... &amp;lt;-&amp;gt;&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Exclusion constraints (preventing overlapping ranges in a table)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Exclusion constraint using GiST&lt;/span&gt;
&lt;span class="c1"&gt;-- Prevents overlapping room reservations&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;room_id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;during&lt;/span&gt; &lt;span class="n"&gt;TSTZRANGE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;EXCLUDE&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;room_id&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;during&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exclusion constraint example is particularly useful. It guarantees at the database level that no two bookings for the same room can overlap. This is something you can't do with B-tree indexes.&lt;/p&gt;
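&lt;p&gt;To see it in action, try inserting two overlapping bookings for the same room. This sketch assumes the &lt;code&gt;room_bookings&lt;/code&gt; table above exists:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;INSERT INTO room_bookings (room_id, during)
VALUES (1, tstzrange('2026-03-01 09:00+00', '2026-03-01 11:00+00'));

-- Overlaps the first booking for room 1, so PostgreSQL rejects it with
-- a "conflicting key value violates exclusion constraint" error
INSERT INTO room_bookings (room_id, during)
VALUES (1, tstzrange('2026-03-01 10:00+00', '2026-03-01 12:00+00'));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;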

&lt;h2&gt;
  
  
  5. BRIN — for large, naturally ordered tables
&lt;/h2&gt;

&lt;p&gt;BRIN stands for Block Range Index. It's the most space-efficient index type PostgreSQL offers, but it only works well under a specific condition: the physical order of rows on disk must correlate with the column values.&lt;/p&gt;

&lt;p&gt;Instead of indexing every row, BRIN indexes store summary information (min and max values) for each block range, which is a group of consecutive physical pages. When PostgreSQL scans for rows, it checks the block summaries and skips entire ranges that can't contain matching data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Perfect for a time-series table where rows are inserted in chronological order&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_logs_created_at&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;access_logs&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;brin&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- This can skip huge portions of the table&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;access_logs&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2026-02-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2026-02-02'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The size difference is dramatic. A B-tree index on a 100 GB table might be 2 GB. A BRIN index on the same table could be 100 KB. That's not a typo. BRIN indexes are orders of magnitude smaller because they store one summary per block range instead of one entry per row.&lt;/p&gt;

&lt;p&gt;But this efficiency has a hard prerequisite. If the data isn't physically ordered on disk by the indexed column, BRIN is useless. If you insert rows with random timestamps, the min/max summaries for each block range will span the entire value space, and PostgreSQL won't be able to skip anything.&lt;/p&gt;

&lt;p&gt;Good candidates for BRIN:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Append-only tables with timestamp columns (logs, events, audit trails)&lt;/li&gt;
&lt;li&gt;Tables where rows are inserted in natural order of some column&lt;/li&gt;
&lt;li&gt;Very large tables (millions or billions of rows) where B-tree index size is a concern&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bad candidates for BRIN:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tables with frequent updates that change the indexed column&lt;/li&gt;
&lt;li&gt;Tables where rows are inserted in random order&lt;/li&gt;
&lt;li&gt;Small tables (B-tree is more efficient for small datasets)&lt;/li&gt;
&lt;/ul&gt;
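&lt;p&gt;When BRIN does fit, its main tuning knob is how many pages each summary covers. &lt;code&gt;pages_per_range&lt;/code&gt; is a real BRIN storage parameter (default 128); the value here is illustrative. Smaller ranges mean finer-grained skipping at the cost of a somewhat larger index:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Smaller block ranges = more precise min/max summaries
CREATE INDEX idx_logs_created_at_fine ON access_logs
USING brin (created_at) WITH (pages_per_range = 32);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;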

&lt;p&gt;BRIN is a specialized tool. When it fits, it's incredible. When it doesn't, it won't help at all. Check the correlation between physical row order and column values using &lt;code&gt;pg_stats&lt;/code&gt; before deciding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tablename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;correlation&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stats&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tablename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'access_logs'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;attname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A correlation value close to 1 or -1 means BRIN will work well. Values near 0 mean the data is randomly distributed and BRIN won't help.&lt;/p&gt;
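&lt;p&gt;If correlation has degraded, you can physically reorder the table once with &lt;code&gt;CLUSTER&lt;/code&gt;. Be aware that it needs a B-tree index to sort by, takes an exclusive lock while it runs, and the correlation decays again as new out-of-order writes arrive:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- One-time physical reorder; requires a B-tree index on the column
CREATE INDEX idx_logs_created_at_btree ON access_logs (created_at);
CLUSTER access_logs USING idx_logs_created_at_btree;
ANALYZE access_logs;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;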

&lt;h2&gt;
  
  
  Practical indexing tips
&lt;/h2&gt;

&lt;p&gt;Knowing which index types exist is half the story. The other half is using them effectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check if your indexes are actually being used.&lt;/strong&gt; PostgreSQL tracks index usage statistics. If an index hasn't been scanned in months, it's costing you write performance for no benefit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;indexrelname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;times_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexrelid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_size&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_indexes&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;schemaname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; before and after creating indexes.&lt;/strong&gt; Don't assume an index will help. Verify it. Sometimes the planner chooses a sequential scan because the table is small enough that the index adds no value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'abc-123'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Consider partial indexes for filtered queries.&lt;/strong&gt; If you only ever query active orders, index only the active rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This index is smaller and faster than indexing all orders because it only covers rows matching the condition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't forget about covering indexes.&lt;/strong&gt; If a query only needs columns that are all in the index, PostgreSQL can answer it entirely from the index without touching the table. This is called an index-only scan.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_covering&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;INCLUDE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- This can be served entirely from the index&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'abc-123'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
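&lt;p&gt;Index-only scans also depend on the visibility map, which &lt;code&gt;VACUUM&lt;/code&gt; maintains: if many pages aren't marked all-visible, PostgreSQL still has to fetch heap tuples to check row visibility. It's worth confirming the plan actually says "Index Only Scan":&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;VACUUM ANALYZE orders;

-- Look for "Index Only Scan" and a low "Heap Fetches" count in the output
EXPLAIN ANALYZE
SELECT total, created_at FROM orders WHERE customer_id = 'abc-123';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;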



&lt;h2&gt;
  
  
  Keeping your data safe while you optimize
&lt;/h2&gt;

&lt;p&gt;Experimenting with indexes is relatively low-risk since you can always drop an index and try again. But schema changes, large data migrations and production experiments can go wrong in ways that are harder to undo.&lt;/p&gt;

&lt;p&gt;Having a reliable &lt;a href="https://databasus.com" rel="noopener noreferrer"&gt;PostgreSQL backup&lt;/a&gt; strategy means you can experiment with confidence. Databasus is a dedicated PostgreSQL backup tool that handles automated scheduled backups with compression, encryption and multiple storage destinations, and it suits individual developers and enterprise teams alike.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the right index for your workload
&lt;/h2&gt;

&lt;p&gt;There's no universal "best" index type. The right choice depends entirely on your data and your queries. B-tree covers most common scenarios. GIN handles arrays, JSONB and full-text data. GiST solves geometric and range problems. Hash optimizes pure equality lookups. BRIN saves substantial disk space on naturally ordered data.&lt;/p&gt;

&lt;p&gt;Start with &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; on your slowest queries, identify what kind of operations they perform and match those operations to the appropriate index type. One well-chosen index beats five poorly chosen ones every time.&lt;/p&gt;
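&lt;p&gt;A minimal version of that workflow, reusing the &lt;code&gt;orders&lt;/code&gt; table from the earlier examples, might look like this:&lt;/p&gt;

```sql
-- Step 1: check how the slow query actually executes.
-- A "Seq Scan" node in the plan usually means no suitable index exists.
EXPLAIN ANALYZE
SELECT total, created_at
FROM orders
WHERE customer_id = 'abc-123' AND status = 'active';

-- Step 2: add an index matched to the predicate, then re-run
-- EXPLAIN ANALYZE and compare "Execution Time" before and after.
CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);
```

&lt;p&gt;If the plan still shows a sequential scan, check that the query's operators actually match what the index type supports before adding more indexes.&lt;/p&gt;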

</description>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>MongoDB schema design — 6 patterns every developer should master</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Sun, 08 Feb 2026 19:35:01 +0000</pubDate>
      <link>https://forem.com/piteradyson/mongodb-schema-design-6-patterns-every-developer-should-master-1dha</link>
      <guid>https://forem.com/piteradyson/mongodb-schema-design-6-patterns-every-developer-should-master-1dha</guid>
      <description>&lt;p&gt;MongoDB gives you flexibility that relational databases don't. No rigid tables, no mandatory schemas, no upfront column definitions. You just throw documents into a collection and go. That freedom is exactly what makes schema design in MongoDB so important and so easy to get wrong.&lt;/p&gt;

&lt;p&gt;The problem is that "schemaless" doesn't mean "no design needed." Without a good schema strategy, you end up with slow queries, bloated documents and data that's hard to work with as your application grows. These six patterns solve the most common problems developers hit when designing MongoDB schemas.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzwaumim1fc47scrwws5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzwaumim1fc47scrwws5.png" alt="MongoDB schema" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Embedding vs referencing
&lt;/h2&gt;

&lt;p&gt;This is the first decision you'll make for every relationship in your data model. Should related data live inside the same document or in a separate collection with a reference? The answer depends on how you read and write the data.&lt;/p&gt;

&lt;p&gt;Embedding means nesting related data directly within a document. If you have a blog post with comments, embedding puts the comments array inside the post document. One read gets everything. No joins needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Embedded comments inside a blog post&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MongoDB schema tips&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jane&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Great article!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Very helpful&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-02&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Referencing stores related data in a separate collection and links the documents with an ObjectId. You fetch the post first, then the comments in a second query (or use &lt;code&gt;$lookup&lt;/code&gt; for a server-side join).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Post document&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;post1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MongoDB schema tips&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jane&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Separate comment documents&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;c1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;postId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;post1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Great article!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;c2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;postId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;post1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Very helpful&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
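&lt;p&gt;To make the two-step read concrete, here is a plain JavaScript sketch of the application-side join. The in-memory arrays stand in for the posts and comments collections; against MongoDB itself this would be two &lt;code&gt;find()&lt;/code&gt; calls or a single &lt;code&gt;$lookup&lt;/code&gt; stage.&lt;/p&gt;

```javascript
// In-memory stand-ins for the two collections from the example above.
const posts = [
  { _id: "post1", title: "MongoDB schema tips", author: "Jane" }
];
const comments = [
  { _id: "c1", postId: "post1", user: "Bob", text: "Great article!" },
  { _id: "c2", postId: "post1", user: "Alice", text: "Very helpful" }
];

// Application-side join: fetch the post, then its comments by postId.
function getPostWithComments(postId) {
  const post = posts.find(p => p._id === postId);
  if (!post) return null;
  const postComments = comments.filter(c => c.postId === postId);
  return { ...post, comments: postComments };
}

const result = getPostWithComments("post1");
console.log(result.comments.length); // 2
```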



&lt;p&gt;When to embed vs reference:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;Embed&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Read pattern&lt;/td&gt;
&lt;td&gt;Data is always read together&lt;/td&gt;
&lt;td&gt;Data is read independently&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array growth&lt;/td&gt;
&lt;td&gt;Bounded (won't grow indefinitely)&lt;/td&gt;
&lt;td&gt;Unbounded (could grow to thousands)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document size&lt;/td&gt;
&lt;td&gt;Stays well under 16 MB limit&lt;/td&gt;
&lt;td&gt;Would approach size limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Update frequency&lt;/td&gt;
&lt;td&gt;Nested data rarely changes&lt;/td&gt;
&lt;td&gt;Nested data changes frequently&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data reuse&lt;/td&gt;
&lt;td&gt;Used only in this context&lt;/td&gt;
&lt;td&gt;Shared across multiple documents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Embedding works well for one-to-few relationships where the nested data is tightly coupled to the parent. Think user profiles with addresses, products with a small list of variants or orders with line items. Referencing is better when the related data grows without bound, gets accessed independently or is shared across multiple parent documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The subset pattern
&lt;/h2&gt;

&lt;p&gt;Documents in MongoDB have a 16 MB size limit, but you'll hit performance problems long before that. Loading a 2 MB document when you only need a few fields from it wastes network bandwidth and memory. The subset pattern solves this by keeping the most-accessed data in the main document and moving the rest to a secondary collection.&lt;/p&gt;

&lt;p&gt;A common example is an e-commerce product page. The product listing shows the name, price, main image and the three most recent reviews. But the product might have 500 reviews total. Loading all 500 reviews every time someone views the product page is wasteful.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Main product document (fast reads for product listings)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;prod1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Wireless Headphones&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;79.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;headphones-main.jpg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;recentReviews&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sound quality is excellent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sam&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Comfortable for long use&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-03&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jordan&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Best in this price range&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-01-28&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;reviewCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;487&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;averageRating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.3&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Full reviews in a separate collection (loaded only on "See all reviews")&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rev1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;prod1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sound quality is excellent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trade-off is data duplication. The three recent reviews exist in both the product document and the reviews collection, and you need to keep them in sync when reviews are added. The read performance gain is worth it when the vast majority of your traffic only needs the subset.&lt;/p&gt;
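&lt;p&gt;One way to handle the sync is to maintain the embedded subset at write time. MongoDB can do this atomically with &lt;code&gt;$push&lt;/code&gt; using the &lt;code&gt;$each&lt;/code&gt;, &lt;code&gt;$sort&lt;/code&gt; and &lt;code&gt;$slice&lt;/code&gt; modifiers; the plain JavaScript sketch below shows the same logic (field names follow the product example above):&lt;/p&gt;

```javascript
// Keep only the 3 newest reviews embedded in the product document.
// MongoDB itself can do this atomically in a single update:
//   db.products.updateOne({ _id: productId }, {
//     $push: { recentReviews: { $each: [newReview],
//                               $sort: { date: -1 }, $slice: 3 } },
//     $inc: { reviewCount: 1 }
//   })
function addReview(product, review) {
  const updated = [...product.recentReviews, review]
    .sort((a, b) => b.date - a.date) // newest first
    .slice(0, 3);                    // keep the 3 most recent
  return {
    ...product,
    recentReviews: updated,
    reviewCount: product.reviewCount + 1
  };
}

const product = {
  name: "Wireless Headphones",
  reviewCount: 487,
  recentReviews: [
    { user: "Alex", rating: 5, date: new Date("2026-02-05") },
    { user: "Sam", rating: 4, date: new Date("2026-02-03") },
    { user: "Jordan", rating: 5, date: new Date("2026-01-28") }
  ]
};

const next = addReview(product, { user: "Kim", rating: 4, date: new Date("2026-02-07") });
console.log(next.recentReviews.map(r => r.user)); // Kim, Alex, Sam
```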

&lt;p&gt;This pattern applies anywhere you have a one-to-many relationship where most reads only need a small portion of the "many" side. User activity feeds, article comments and notification lists all benefit from it.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The bucket pattern
&lt;/h2&gt;

&lt;p&gt;Time-series and event data can generate enormous numbers of documents. If your IoT sensors send readings every second, that's 86,400 documents per sensor per day. Storing each reading as an individual document creates index bloat and makes range queries slower than they need to be.&lt;/p&gt;

&lt;p&gt;The bucket pattern groups multiple data points into a single document based on a time range. Instead of one document per reading, you store one document per hour (or per minute, depending on your granularity).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Without bucket pattern: one document per reading&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;sensorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;temp-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:00Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;sensorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;temp-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:01Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;sensorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;temp-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:02Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// ... 86,397 more documents for this sensor today&lt;/span&gt;

&lt;span class="c1"&gt;// With bucket pattern: one document per hour&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sensorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;temp-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;startDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:00Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;endDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:59:59Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;readings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:00Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:01Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T10:00:02Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// ... 3597 more readings&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;22.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;min&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;21.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;23.1&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits of the bucket pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer documents means smaller indexes and faster queries&lt;/li&gt;
&lt;li&gt;Pre-computed summaries (avg, min, max) avoid full scans for common aggregations&lt;/li&gt;
&lt;li&gt;Range queries only touch a handful of bucket documents instead of thousands of individual ones&lt;/li&gt;
&lt;li&gt;Deleting old data is simpler since you drop entire bucket documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The bucket size depends on your access pattern. If most queries ask for hourly summaries, use hourly buckets. If users typically look at daily dashboards, daily buckets work better. The key is to match bucket granularity to how the data gets consumed.&lt;/p&gt;
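&lt;p&gt;As a sketch of the write-side logic, the following plain JavaScript groups per-second readings into hourly buckets with a pre-computed summary. In MongoDB itself each reading would typically be an upserted &lt;code&gt;updateOne&lt;/code&gt; that pushes the reading and maintains &lt;code&gt;count&lt;/code&gt; and the summary via &lt;code&gt;$inc&lt;/code&gt;, &lt;code&gt;$min&lt;/code&gt; and &lt;code&gt;$max&lt;/code&gt;:&lt;/p&gt;

```javascript
// Group per-second readings into hourly bucket documents.
function bucketReadings(readings) {
  const buckets = new Map();
  for (const r of readings) {
    // Truncate the timestamp to the start of the hour.
    const hour = new Date(r.timestamp);
    hour.setUTCMinutes(0, 0, 0);
    const key = r.sensorId + "|" + hour.toISOString();
    if (!buckets.has(key)) {
      buckets.set(key, {
        sensorId: r.sensorId,
        startDate: hour,
        count: 0,
        readings: [],
        summary: { min: Infinity, max: -Infinity, sum: 0 }
      });
    }
    const b = buckets.get(key);
    b.readings.push({ value: r.value, timestamp: r.timestamp });
    b.count += 1;
    b.summary.min = Math.min(b.summary.min, r.value);
    b.summary.max = Math.max(b.summary.max, r.value);
    b.summary.sum += r.value;
  }
  // Derive the average once per bucket instead of recomputing per read.
  for (const b of buckets.values()) {
    b.summary.avg = b.summary.sum / b.count;
  }
  return [...buckets.values()];
}

const out = bucketReadings([
  { sensorId: "temp-01", value: 22.5, timestamp: new Date("2026-02-08T10:00:00Z") },
  { sensorId: "temp-01", value: 22.6, timestamp: new Date("2026-02-08T10:00:01Z") },
  { sensorId: "temp-01", value: 22.4, timestamp: new Date("2026-02-08T11:00:00Z") }
]);
console.log(out.length); // 2 buckets: one for 10:00, one for 11:00
```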

&lt;p&gt;Note that MongoDB 5.0+ introduced native time series collections, which handle much of this bucketing automatically. The bucket pattern is still useful for custom aggregations and for storing pre-computed summaries alongside the raw data.&lt;/p&gt;
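&lt;p&gt;For reference, a native time series collection is created by passing a &lt;code&gt;timeseries&lt;/code&gt; option to &lt;code&gt;db.createCollection&lt;/code&gt;. The snippet below only builds the options object; the TTL is an illustrative 30 days:&lt;/p&gt;

```javascript
// Options for db.createCollection("readings", options) in mongosh.
// MongoDB 5.0+ manages bucketing internally for such collections.
const options = {
  timeseries: {
    timeField: "timestamp",   // required: when each measurement was taken
    metaField: "sensorId",    // identifies the source of each measurement
    granularity: "seconds"    // hint for the server's internal bucket size
  },
  expireAfterSeconds: 2592000 // optional TTL: 30 days, illustrative value
};
console.log(Object.keys(options.timeseries)); // timeField, metaField, granularity
```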

&lt;h2&gt;
  
  
  4. The polymorphic pattern
&lt;/h2&gt;

&lt;p&gt;Not every document in a collection needs to look the same. The polymorphic pattern handles entities that share some common fields but differ in their details. Instead of creating separate collections for each variation, you store them all in one collection with a &lt;code&gt;type&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;A content management system is a good example. You might have articles, videos, podcasts and image galleries. They all share a title, author, publish date and tags, but an article has a body field, a video has a duration and URL, and a podcast has an audio file and an episode number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Article&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;article&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Getting started with MongoDB&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jane&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;publishDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mongodb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tutorial&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MongoDB is a document database...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;wordCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Video&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;video&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MongoDB schema design workshop&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jane&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;publishDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mongodb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;videoUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/videos/mongo-schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;resolution&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1080p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Podcast&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;podcast&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Database trends in 2026&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;publishDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-07&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;databases&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;trends&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;audioUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/podcasts/db-trends&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;episodeNumber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1800&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The advantage is that queries across all content types are simple. Want all content by Jane sorted by date? One query on one collection. Want only videos? Add a filter on the type field. The shared fields make indexing straightforward, and you can create partial indexes for type-specific fields.&lt;br&gt;
&lt;/p&gt;
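&lt;p&gt;For example, both of those queries are one-liners against the single collection (mongosh sketch; collection and field names follow the documents shown above, and these need a live database to run):&lt;/p&gt;

```javascript
// All content by Jane, newest first -- one query, one collection
db.content.find({ author: "Jane" }).sort({ publishDate: -1 })

// Only Jane's videos -- the same query plus a filter on the shared type field
db.content.find({ author: "Jane", type: "video" })
```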

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Index for type-specific queries&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;publishDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Partial index only for videos&lt;/span&gt;
&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;partialFilterExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;video&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern works when the entities share enough common fields to justify a single collection and when you frequently query across types. If different types are always queried separately and share almost nothing, separate collections might be cleaner.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. The extended reference pattern
&lt;/h2&gt;

&lt;p&gt;When you reference data in another collection, sometimes you need a few fields from that referenced document on almost every read. The extended reference pattern copies those frequently-needed fields into the referencing document to avoid a second lookup.&lt;/p&gt;

&lt;p&gt;Consider an order system. Every order references a customer. When you display the order list, you need the customer name and email. Without the extended reference, every order list page requires a $lookup or a second query to the customers collection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Instead of just storing customerId&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;customerId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cust1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;product&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Widget&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;9.99&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;29.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;orderDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Store frequently-needed customer fields directly in the order&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;customer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cust1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alice Johnson&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;alice@example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;product&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Widget&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;9.99&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;29.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;orderDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trade-off is data staleness. If Alice changes her email, the orders still show the old one until you update them. For many use cases this is acceptable. An order should probably reflect the customer information at the time it was placed anyway.&lt;/p&gt;

&lt;p&gt;When to use the extended reference pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The referenced fields are read frequently but updated rarely&lt;/li&gt;
&lt;li&gt;Join operations ($lookup) are causing performance issues&lt;/li&gt;
&lt;li&gt;The copied fields are small relative to the document size&lt;/li&gt;
&lt;li&gt;Slight staleness in the copied data is acceptable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This pattern is different from full embedding. You're not copying the entire customer document into every order. You're selectively copying only the fields that the most common queries need. The full customer record still lives in its own collection for detailed views and updates.&lt;/p&gt;
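&lt;p&gt;When one of the copied fields does change, a single &lt;code&gt;updateMany&lt;/code&gt; brings the copies back in sync (mongosh sketch; &lt;code&gt;custId&lt;/code&gt; and &lt;code&gt;newEmail&lt;/code&gt; are placeholders, and this needs a live database to run):&lt;/p&gt;

```javascript
// Update the source of truth first
db.customers.updateOne({ _id: custId }, { $set: { email: newEmail } })

// Then propagate to the denormalized copies; a short lag here is
// exactly the staleness trade-off the pattern accepts
db.orders.updateMany(
  { "customer._id": custId },
  { $set: { "customer.email": newEmail } }
)
```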

&lt;h2&gt;
  
  
  6. The computed pattern
&lt;/h2&gt;

&lt;p&gt;Some values are expensive to calculate on the fly. If you're counting the number of views on a video, computing the average rating from thousands of reviews or aggregating daily sales totals, doing that calculation on every read is wasteful.&lt;/p&gt;

&lt;p&gt;The computed pattern pre-calculates these values and stores them in the document. You update them when the underlying data changes, not when someone reads the result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Product with pre-computed statistics&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;prod1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Wireless Headphones&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;79.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;totalReviews&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;487&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;averageRating&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;ratingDistribution&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;5&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;203&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;4&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;156&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;78&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;totalSold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2341&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;lastPurchaseDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-02-08T14:30:00Z&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a new review comes in, you update the stats using atomic operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;prod1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stats.totalReviews&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stats.ratingDistribution.4&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stats.averageRating&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;4.28&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
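&lt;p&gt;The 4.28 written by &lt;code&gt;$set&lt;/code&gt; stands in for a value the application computes before issuing the update. The new average follows from the old average, the old count and the incoming rating, so no re-aggregation over all reviews is needed (plain JavaScript sketch):&lt;/p&gt;

```javascript
// Incrementally fold one new rating into a running average.
// oldCount is stats.totalReviews before the $inc; oldAvg is stats.averageRating.
function updatedAverage(oldAvg, oldCount, newRating) {
  const newAvg = (oldAvg * oldCount + newRating) / (oldCount + 1);
  return Math.round(newAvg * 100) / 100; // keep two decimals, as stored
}
```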



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Read cost&lt;/th&gt;
&lt;th&gt;Write cost&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Calculate on read&lt;/td&gt;
&lt;td&gt;High (aggregation every time)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Always current&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Computed pattern&lt;/td&gt;
&lt;td&gt;Low (pre-stored value)&lt;/td&gt;
&lt;td&gt;Low (incremental update)&lt;/td&gt;
&lt;td&gt;Eventually consistent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background job&lt;/td&gt;
&lt;td&gt;Low (pre-stored value)&lt;/td&gt;
&lt;td&gt;Batch update on schedule&lt;/td&gt;
&lt;td&gt;Delayed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The computed pattern is the right choice when reads vastly outnumber writes and the computation is non-trivial. Product ratings, follower counts, dashboard metrics and leaderboards are all good candidates.&lt;/p&gt;

&lt;p&gt;For background computation jobs, you need reliable scheduling. If the computation updates stall because a cron job dies silently, your users see stale data indefinitely. Monitoring and alerting on these jobs matters.&lt;/p&gt;
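&lt;p&gt;A batch job of that kind can rebuild the whole &lt;code&gt;stats&lt;/code&gt; object from the raw reviews and write it back with a single &lt;code&gt;$set&lt;/code&gt;, which also self-heals any drift left by missed incremental updates (plain JavaScript sketch; the field names follow the product document above):&lt;/p&gt;

```javascript
// Recompute review statistics from scratch, as a scheduled job would
// before writing the result back to the product document.
function computeStats(reviews) {
  const ratingDistribution = { "5": 0, "4": 0, "3": 0, "2": 0, "1": 0 };
  let sum = 0;
  for (const review of reviews) {
    ratingDistribution[String(review.rating)] += 1;
    sum += review.rating;
  }
  const totalReviews = reviews.length;
  const averageRating =
    totalReviews === 0 ? 0 : Math.round((sum / totalReviews) * 100) / 100;
  return { totalReviews, averageRating, ratingDistribution };
}
```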

&lt;h2&gt;
  
  
  Combining patterns in practice
&lt;/h2&gt;

&lt;p&gt;Real applications rarely use a single pattern in isolation. A product catalog might use the subset pattern for reviews, the computed pattern for aggregate statistics, embedding for product variants and the extended reference pattern for category information. The patterns compose well.&lt;/p&gt;

&lt;p&gt;The key principle behind all of them is the same: design your schema around your queries, not around your entities. In relational databases, you normalize first and optimize later. In MongoDB, you start by listing your most frequent queries and design the schema to serve those queries efficiently.&lt;/p&gt;

&lt;p&gt;Here are a few practical guidelines for combining patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start simple.&lt;/strong&gt; Embed first. Only introduce references and patterns when you hit a specific problem like document size, update complexity or query performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Know your read-to-write ratio.&lt;/strong&gt; High-read workloads benefit from denormalization (embedding, computed, extended reference). High-write workloads favor normalization (referencing) to avoid updating data in multiple places.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor document growth.&lt;/strong&gt; If a document's embedded array keeps growing, apply the subset or bucket pattern before it becomes a problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As your MongoDB deployment grows, a reliable &lt;a href="https://databasus.com/mongodb-backup" rel="noopener noreferrer"&gt;MongoDB backup&lt;/a&gt; strategy becomes critical. Schema changes and data migrations can go wrong, and recovering from a bad migration without a backup means data loss. Tools such as Databasus automate scheduled backups with compression, encryption and multiple storage destinations, for solo developers and enterprise teams alike.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the right pattern
&lt;/h2&gt;

&lt;p&gt;There's no single correct schema for any application. The right choice depends on your query patterns, data volume, update frequency and consistency requirements. These six patterns cover the scenarios that come up most often in practice.&lt;/p&gt;

&lt;p&gt;Start with the simplest design that works. Add complexity only when you have evidence that the simple approach isn't performing. Profile your queries, watch your document sizes and pay attention to how your data grows over time. The best schema is the one that makes your most common operations fast and your least common operations possible.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mongodb</category>
    </item>
    <item>
      <title>MariaDB vs MySQL — 8 reasons developers are switching in 2026</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Fri, 06 Feb 2026 21:03:49 +0000</pubDate>
      <link>https://forem.com/piteradyson/mariadb-vs-mysql-8-reasons-developers-are-switching-in-2026-19cg</link>
      <guid>https://forem.com/piteradyson/mariadb-vs-mysql-8-reasons-developers-are-switching-in-2026-19cg</guid>
      <description>&lt;p&gt;MariaDB started as a fork of MySQL back in 2009 when Oracle acquired Sun Microsystems. At the time, people weren't sure if the fork would survive long-term or just become another abandoned open source project. Fast forward to 2026 and MariaDB has become a serious alternative that many developers now prefer over the original. This article looks at why.&lt;/p&gt;

&lt;p&gt;The split wasn't just a copy. MariaDB took a different path on storage engines, performance optimization, licensing and community governance. Some of those decisions are paying off now, especially for teams that care about open source principles and technical independence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6w8otqtpjxqjziib80r3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6w8otqtpjxqjziib80r3.png" alt="MySQL vs MariaDB" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Truly open source, no asterisks
&lt;/h2&gt;

&lt;p&gt;The biggest reason developers switch to MariaDB is licensing clarity. MySQL uses a dual licensing model under Oracle. The Community Edition is GPL, but Oracle reserves certain features for MySQL Enterprise Edition, which requires a commercial license. Thread pool, audit plugins, advanced security features and some backup tools are locked behind that paywall.&lt;/p&gt;

&lt;p&gt;MariaDB is fully open source under GPL. Every feature ships in a single edition. There's no "enterprise only" tier hiding the good stuff. What you download is what everyone gets.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;MariaDB&lt;/th&gt;
&lt;th&gt;MySQL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;GPL v2 (fully open)&lt;/td&gt;
&lt;td&gt;GPL + Commercial dual license&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise features&lt;/td&gt;
&lt;td&gt;All included in one edition&lt;/td&gt;
&lt;td&gt;Some locked behind Enterprise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Corporate owner&lt;/td&gt;
&lt;td&gt;MariaDB Foundation (non-profit)&lt;/td&gt;
&lt;td&gt;Oracle Corporation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feature restrictions&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Thread pool, audit log, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For companies doing compliance reviews or avoiding vendor lock-in, this difference alone can drive the decision. You don't need to worry about Oracle changing terms or restricting features in a future release.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Better storage engine options
&lt;/h2&gt;

&lt;p&gt;MariaDB ships with storage engines that MySQL either doesn't have or charges extra for. The most notable one is Aria, a crash-safe alternative to MyISAM. But the real story is about ColumnStore and the overall engine diversity.&lt;/p&gt;

&lt;p&gt;MariaDB ColumnStore provides columnar storage for analytical workloads. If you need to run reports or aggregations over large datasets alongside your transactional workload, ColumnStore handles that without requiring a separate analytical database. MySQL doesn't have a built-in columnar engine.&lt;/p&gt;

&lt;p&gt;The default storage engine for both is InnoDB (or MariaDB's fork of it), so basic compatibility isn't an issue. But MariaDB's InnoDB fork includes optimizations that aren't in upstream MySQL InnoDB, particularly around buffer pool management and compression.&lt;/p&gt;

&lt;p&gt;MariaDB also includes the S3 storage engine, which lets you archive old tables directly to S3-compatible object storage. That's useful for keeping historical data accessible without eating local disk space. Try doing that natively with MySQL.&lt;/p&gt;

&lt;p&gt;For teams running mixed workloads or managing large datasets, MariaDB's engine diversity is a practical advantage that saves you from bolting on third-party tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Thread pool that doesn't cost extra
&lt;/h2&gt;

&lt;p&gt;MySQL's built-in thread pool is an Enterprise-only feature. The Community Edition uses a one-thread-per-connection model. Under heavy concurrency this causes performance degradation because the operating system spends more time context-switching between threads than doing actual work.&lt;/p&gt;

&lt;p&gt;MariaDB includes thread pooling in its open source edition. It handles thousands of concurrent connections efficiently by grouping them into a pool and processing them in batches. The performance difference shows up clearly when you have hundreds or thousands of simultaneous connections.&lt;/p&gt;

&lt;p&gt;This matters in practice. Web applications behind load balancers, microservice architectures with many small services connecting to the same database and serverless environments that create connections rapidly all benefit from thread pooling. With MySQL Community, you either accept the performance hit or pay for Enterprise.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MariaDB: Thread pool included, configurable, production-ready&lt;/li&gt;
&lt;li&gt;MySQL Community: One-thread-per-connection, no built-in pool&lt;/li&gt;
&lt;li&gt;MySQL Enterprise: Thread pool available, requires commercial license&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For high-concurrency environments, this is not a minor difference. It directly affects response times and database stability under load.&lt;/p&gt;
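&lt;p&gt;Turning the pool on is a small server configuration change (a sketch; the sizing values are illustrative and depend on your hardware and workload):&lt;/p&gt;

```ini
# my.cnf -- enable MariaDB's thread pool
[mysqld]
thread_handling = pool-of-threads
thread_pool_size = 8            # typically the number of CPU cores
thread_pool_max_threads = 1000  # upper bound on worker threads
```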

&lt;h2&gt;
  
  
  4. Oracle-free governance
&lt;/h2&gt;

&lt;p&gt;MySQL development happens primarily inside Oracle. The roadmap is set internally, feature priorities are decided behind closed doors and external contributors have limited influence on the project's direction. You can submit patches, but whether they get reviewed or merged depends on Oracle's priorities.&lt;/p&gt;

&lt;p&gt;MariaDB is governed by the MariaDB Foundation, a non-profit organization. Development happens in the open with public discussions, accessible roadmaps and meaningful community input. Multiple companies contribute to MariaDB, and no single entity controls its future.&lt;/p&gt;

&lt;p&gt;This isn't just philosophical. Oracle has a track record of deprioritizing open source projects after acquisition: OpenSolaris was discontinued, Hudson's community walked away to the Jenkins fork and Java's open source trajectory shifted. MySQL hasn't been abandoned, but features increasingly land in Enterprise Edition rather than Community.&lt;/p&gt;

&lt;p&gt;Developers who've been burned by corporate stewardship issues tend to prefer MariaDB's governance model. It's the same reason many prefer PostgreSQL over MySQL in general: community-driven projects are more predictable long-term.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Faster query optimizer
&lt;/h2&gt;

&lt;p&gt;MariaDB's query optimizer has diverged significantly from MySQL's. Several optimizations that MariaDB implements are either absent from MySQL or arrived years later.&lt;/p&gt;

&lt;p&gt;Key optimizer improvements in MariaDB include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Subquery optimizations&lt;/strong&gt;: MariaDB converts subqueries to joins more aggressively, which often dramatically improves query performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Table elimination&lt;/strong&gt;: If a joined table doesn't contribute to the result, MariaDB removes it from the execution plan automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash joins&lt;/strong&gt;: MariaDB supported hash joins before MySQL added them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Condition pushdown&lt;/strong&gt;: Pushes WHERE conditions closer to the data access layer for earlier filtering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't benchmarketing tricks. They affect real queries that developers write every day. A complex reporting query with subqueries and multiple joins can run significantly faster on MariaDB without any query rewriting.&lt;/p&gt;

&lt;p&gt;That said, MySQL has been closing the gap. MySQL 8.0+ added hash joins and improved its optimizer. But MariaDB still tends to handle complex query patterns more efficiently, particularly when subqueries are involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Smoother replication features
&lt;/h2&gt;

&lt;p&gt;Both databases support replication, but MariaDB has added features that make replication management easier in production environments.&lt;/p&gt;

&lt;p&gt;MariaDB's Global Transaction ID (GTID) implementation is simpler to work with than MySQL's. Switching a replica to follow a different primary is straightforward with MariaDB GTIDs, while MySQL's implementation works but has quirks around purged transactions that can cause headaches during failover.&lt;/p&gt;

&lt;p&gt;MariaDB also supports parallel replication with more granular control. You can configure how transactions are parallelized on replicas, which helps replicas keep up with high-write primaries. MySQL has parallel replication too, but MariaDB's implementation gives operators more knobs to tune.&lt;/p&gt;
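&lt;p&gt;For example, repointing a replica and tuning parallel apply are both short operations in MariaDB (sketch to run on the replica; the hostname and thread count are placeholders):&lt;/p&gt;

```sql
-- Follow a new primary using MariaDB's domain-based GTIDs
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST = 'new-primary.example.com',
  MASTER_USE_GTID = slave_pos;

-- Tune how aggressively the replica applies transactions in parallel
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_parallel_mode = 'optimistic';

START SLAVE;
```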

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;MariaDB&lt;/th&gt;
&lt;th&gt;MySQL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GTID format&lt;/td&gt;
&lt;td&gt;Domain-based, simpler failover&lt;/td&gt;
&lt;td&gt;UUID-based, purge complications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallel replication&lt;/td&gt;
&lt;td&gt;Group commit based, configurable&lt;/td&gt;
&lt;td&gt;Logical clock based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-source replication&lt;/td&gt;
&lt;td&gt;Supported since MariaDB 10.0&lt;/td&gt;
&lt;td&gt;Added in MySQL 5.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delayed replication&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Replication filters&lt;/td&gt;
&lt;td&gt;More flexible&lt;/td&gt;
&lt;td&gt;More limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For teams managing replicated setups across multiple datacenters or running read replicas at scale, MariaDB's replication features reduce operational friction. The difference is most noticeable during failovers and topology changes.&lt;/p&gt;

&lt;p&gt;Reliable backups are essential when running replicated databases. If a replication chain breaks or data gets corrupted, your last good backup is what saves you. Automated &lt;a href="https://databasus.com/mysql-backup" rel="noopener noreferrer"&gt;MariaDB backup&lt;/a&gt; tools like Databasus provide scheduled backups with encryption and multiple storage destinations, covering the core requirements of MariaDB backup management.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Temporal tables built in
&lt;/h2&gt;

&lt;p&gt;MariaDB supports system-versioned temporal tables natively. The database automatically tracks the history of every row: when it was inserted, updated or deleted. You can query the state of any table at any point in time without writing audit triggers or maintaining history tables yourself.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a system-versioned table in MariaDB&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;SYSTEM&lt;/span&gt; &lt;span class="n"&gt;VERSIONING&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Query historical state&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_TIME&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;OF&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-15 10:00:00'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MySQL doesn't have this feature. If you need historical data tracking in MySQL, you build it yourself with triggers, shadow tables and application logic. It works, but it's tedious and error-prone.&lt;/p&gt;

&lt;p&gt;Temporal tables are useful for audit requirements, regulatory compliance and debugging production issues. Being able to ask "what did this row look like yesterday at 3 PM?" without any application changes is genuinely powerful. Financial applications, healthcare systems and any application subject to regulatory audits benefit from this.&lt;/p&gt;
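&lt;p&gt;Building on the example above, a few more temporal queries MariaDB supports out of the box (timestamps and the row id are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Full change history of one row, including versions that were later deleted
SELECT * FROM products FOR SYSTEM_TIME ALL WHERE id = 42;

-- Every version that was current at some point within a time window
SELECT * FROM products
FOR SYSTEM_TIME BETWEEN '2026-01-01 00:00:00' AND '2026-01-31 23:59:59'
WHERE id = 42;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;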

&lt;h2&gt;
  
  
  8. Backward compatibility with MySQL
&lt;/h2&gt;

&lt;p&gt;Here's the practical part that makes switching feasible. MariaDB maintains wire protocol compatibility with MySQL. Most MySQL client libraries, ORMs and tools work with MariaDB without changes. Your application code and database drivers typically work as-is, and connection strings need at most minor adjustments.&lt;/p&gt;

&lt;p&gt;MariaDB can read MySQL data files for migration. The SQL syntax is almost entirely compatible. Stored procedures, views, triggers and most SQL features work identically. The differences are mostly in newer features that MariaDB added and MySQL either doesn't have or implements differently.&lt;/p&gt;

&lt;p&gt;Migration is not zero-effort, but it's close for most applications. The typical process is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dump your MySQL database with mysqldump&lt;/li&gt;
&lt;li&gt;Import into MariaDB&lt;/li&gt;
&lt;li&gt;Test your application against the new database&lt;/li&gt;
&lt;li&gt;Adjust any MySQL-specific syntax that doesn't have a MariaDB equivalent (rare)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The compatibility means you're not starting over. You're swapping the engine rather than rebuilding the car, which is exactly how a database fork should work.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to stay with MySQL
&lt;/h2&gt;

&lt;p&gt;MariaDB isn't universally better. There are valid reasons to stick with MySQL.&lt;/p&gt;

&lt;p&gt;If your team already has deep MySQL expertise and established operational procedures, the switching cost might not be worth it. If you're using MySQL-specific features like MySQL Shell, MySQL Router or Group Replication heavily, the MariaDB equivalents may not be drop-in replacements. Some cloud providers offer better managed MySQL support than MariaDB support, particularly AWS RDS where MySQL gets more attention.&lt;/p&gt;

&lt;p&gt;And if you're running a simple application that works fine on MySQL Community, switching databases for theoretical benefits doesn't make much sense. Solve real problems, not hypothetical ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making the switch
&lt;/h2&gt;

&lt;p&gt;The trend is clear: MariaDB keeps adding features and maintaining openness while MySQL's open source edition gets more constrained relative to Enterprise. For new projects, MariaDB is worth serious consideration. For existing MySQL deployments, switching makes sense when you're hitting limitations that MariaDB addresses, whether that's thread pooling, temporal tables, optimizer performance or licensing concerns.&lt;/p&gt;

&lt;p&gt;Both databases will continue to work for most applications. The question is which trajectory you'd rather be on. MariaDB is betting on open source and community-driven development. MySQL's direction depends on Oracle's priorities. For many developers, that distinction is enough.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mysql</category>
      <category>mariadb</category>
    </item>
    <item>
      <title>MySQL vs PostgreSQL in 2026 — 7 key differences you should know before choosing</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Thu, 05 Feb 2026 15:07:09 +0000</pubDate>
      <link>https://forem.com/piteradyson/mysql-vs-postgresql-in-2026-7-key-differences-you-should-know-before-choosing-4l9d</link>
      <guid>https://forem.com/piteradyson/mysql-vs-postgresql-in-2026-7-key-differences-you-should-know-before-choosing-4l9d</guid>
      <description>&lt;p&gt;Choosing between MySQL and PostgreSQL isn't straightforward. Both are mature, production-ready databases used by companies of all sizes. But they solve problems differently and each has strengths that matter depending on your use case. This article breaks down the actual differences that affect day-to-day development and operations in 2026.&lt;/p&gt;

&lt;p&gt;The comparison covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL standards compliance and data integrity&lt;/li&gt;
&lt;li&gt;JSON and document handling&lt;/li&gt;
&lt;li&gt;Replication approaches&lt;/li&gt;
&lt;li&gt;Licensing and ownership&lt;/li&gt;
&lt;li&gt;Performance characteristics&lt;/li&gt;
&lt;li&gt;Extension ecosystems&lt;/li&gt;
&lt;li&gt;Community and tooling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc0plowng83hoicore8z.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc0plowng83hoicore8z.jpg" alt="MySQL vs PostgreSQL" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. SQL standards compliance and data integrity
&lt;/h2&gt;

&lt;p&gt;PostgreSQL follows the SQL standard more strictly than MySQL. This matters more than it might seem at first. When PostgreSQL says a transaction is ACID compliant, it means it. MySQL has improved significantly over the years, but some default behaviors still surprise developers coming from other databases.&lt;/p&gt;

&lt;p&gt;PostgreSQL enforces data types strictly. If you try to insert a string into an integer column, it fails. MySQL historically performed silent type conversions, which could lead to data corruption. Strict mode has been the default since MySQL 5.7, but many existing installations still run with older, more permissive settings.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;PostgreSQL&lt;/th&gt;
&lt;th&gt;MySQL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default SQL mode&lt;/td&gt;
&lt;td&gt;Strict, standards-compliant&lt;/td&gt;
&lt;td&gt;Strict since 5.7, permissive in older versions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Silent type conversions&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;Depends on SQL mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CHECK constraints&lt;/td&gt;
&lt;td&gt;Fully enforced&lt;/td&gt;
&lt;td&gt;Enforced since MySQL 8.0.16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Foreign key enforcement&lt;/td&gt;
&lt;td&gt;Always enforced&lt;/td&gt;
&lt;td&gt;Only with InnoDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TRUNCATE in transactions&lt;/td&gt;
&lt;td&gt;Transactional&lt;/td&gt;
&lt;td&gt;Not transactional&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For applications where data integrity is critical, PostgreSQL provides stronger guarantees out of the box. MySQL can be configured to behave similarly, but you need to verify your settings and understand which features require InnoDB specifically.&lt;/p&gt;
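&lt;p&gt;Verifying MySQL's behavior takes one query, and tightening it takes one more (the mode list below is an example; tailor it to your application):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Check which SQL modes are active
SELECT @@GLOBAL.sql_mode;

-- Enable strict behavior if it's missing (lasts until restart;
-- set it in my.cnf to make it permanent)
SET GLOBAL sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;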

&lt;h2&gt;
  
  
  2. JSON and document handling
&lt;/h2&gt;

&lt;p&gt;Both databases support JSON, but they approach it differently. PostgreSQL has a native JSONB type that stores JSON in a binary format with full indexing support. MySQL added JSON support in version 5.7 and has improved it since, but the implementation has limitations.&lt;/p&gt;

&lt;p&gt;PostgreSQL's JSONB allows you to create indexes on specific JSON paths, query nested structures efficiently and use JSON in complex queries alongside relational data. You can also use operators like &lt;code&gt;@&amp;gt;&lt;/code&gt; (contains) and &lt;code&gt;?&lt;/code&gt; (key exists) that make JSON queries concise and readable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL: Create index on JSON field&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_users_metadata_country&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="s1"&gt;'country'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="c1"&gt;-- PostgreSQL: Query with containment operator&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;@&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'{"role": "admin"}'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MySQL's JSON functions work but feel more verbose. Indexing JSON in MySQL requires generated columns, which adds complexity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL: Requires generated column for indexing&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
  &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JSON_UNQUOTE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="s1"&gt;'$.country'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_country&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your application heavily uses semi-structured data or you're building something that mixes relational and document patterns, PostgreSQL handles this more elegantly. MySQL works fine for basic JSON storage and retrieval, but advanced querying gets awkward.&lt;/p&gt;
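&lt;p&gt;For comparison, the closest MySQL counterpart to PostgreSQL's containment query above uses JSON_CONTAINS. It works, but it reads less naturally:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- MySQL: find users whose metadata contains {"role": "admin"}
SELECT * FROM users WHERE JSON_CONTAINS(metadata, '{"role": "admin"}');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;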

&lt;h2&gt;
  
  
  3. Replication and high availability
&lt;/h2&gt;

&lt;p&gt;Both databases support replication, but they use fundamentally different approaches. Understanding these differences matters when planning for high availability and read scaling.&lt;/p&gt;

&lt;p&gt;MySQL uses binary log replication. The primary server writes changes to a binary log, and replicas read from it. This approach is well-understood and has been battle-tested for decades. MySQL also supports Group Replication for multi-primary setups, though it comes with trade-offs around consistency.&lt;/p&gt;

&lt;p&gt;PostgreSQL uses Write-Ahead Log (WAL) streaming replication. It's conceptually similar but operates at the storage level rather than the query level. PostgreSQL's logical replication (added in version 10) allows selective table replication and cross-version replication.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL binary replication is simpler to set up initially&lt;/li&gt;
&lt;li&gt;PostgreSQL logical replication offers more flexibility for complex topologies&lt;/li&gt;
&lt;li&gt;MySQL Group Replication provides multi-primary but with consistency caveats&lt;/li&gt;
&lt;li&gt;PostgreSQL synchronous replication guarantees zero data loss at the cost of latency&lt;/li&gt;
&lt;/ul&gt;
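&lt;p&gt;PostgreSQL's logical replication is set up entirely in SQL. A minimal sketch (table name and connection details are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- On the publisher: replicate just one table
CREATE PUBLICATION orders_pub FOR TABLE orders;

-- On the subscriber: connect and start streaming
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=primary.example.com dbname=shop user=repl password=secret'
    PUBLICATION orders_pub;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;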

&lt;p&gt;For most applications, both approaches work well. MySQL's tooling ecosystem for replication is more mature, with tools like Orchestrator and ProxySQL being widely used. PostgreSQL's tooling has caught up significantly with Patroni, PgBouncer and others.&lt;/p&gt;

&lt;p&gt;Backup strategies differ too. Both support logical and physical backups, but the tools and workflows vary. For automated database backups with scheduling, encryption and multiple storage destinations, &lt;a href="https://databasus.com/mysql-backup" rel="noopener noreferrer"&gt;MySQL backup&lt;/a&gt; tools like Databasus handle both MySQL and PostgreSQL, providing a unified approach regardless of which database you choose.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Licensing and corporate ownership
&lt;/h2&gt;

&lt;p&gt;This is often overlooked but increasingly important. MySQL is owned by Oracle. PostgreSQL is a community project with no single corporate owner.&lt;/p&gt;

&lt;p&gt;MySQL uses a dual licensing model. The Community Edition is GPL-licensed, which means if you modify MySQL and distribute it, you must release your changes. Oracle also sells commercial licenses for those who want to avoid GPL obligations. Some MySQL forks exist (MariaDB, Percona Server) partly because of licensing and governance concerns.&lt;/p&gt;

&lt;p&gt;PostgreSQL uses the PostgreSQL License, which is similar to BSD/MIT. You can do essentially anything with it, including building proprietary products without releasing source code. There's no commercial entity that could change the terms or create uncertainty about the project's direction.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;PostgreSQL&lt;/th&gt;
&lt;th&gt;MySQL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;PostgreSQL License (BSD-like)&lt;/td&gt;
&lt;td&gt;GPL + Commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Owner&lt;/td&gt;
&lt;td&gt;Community project&lt;/td&gt;
&lt;td&gt;Oracle Corporation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Major forks&lt;/td&gt;
&lt;td&gt;None needed&lt;/td&gt;
&lt;td&gt;MariaDB, Percona Server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feature restrictions&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Some features in Enterprise only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For companies evaluating long-term risk, PostgreSQL's licensing and governance model provides more predictability. Oracle has added features to MySQL Enterprise that aren't in the Community Edition, and this trend could continue.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Performance characteristics
&lt;/h2&gt;

&lt;p&gt;Performance comparisons are tricky because results depend heavily on workload type, hardware configuration and tuning. Both databases can handle millions of transactions per day when properly configured. But their performance profiles differ in important ways.&lt;/p&gt;

&lt;p&gt;PostgreSQL historically performed better on complex queries with many joins, subqueries and analytical operations. Its query planner is sophisticated and handles complicated query patterns well. The cost-based optimizer has decades of refinement.&lt;/p&gt;

&lt;p&gt;MySQL traditionally excelled at simple read-heavy workloads with straightforward queries. If your application does mostly primary key lookups and simple filters, MySQL can be extremely fast. The InnoDB storage engine is highly optimized for these patterns.&lt;/p&gt;

&lt;p&gt;In 2026, both databases have narrowed these gaps. PostgreSQL 17 and 18 have improved simple query performance. MySQL 8.x has better handling of complex queries than earlier versions. The differences are less dramatic than they were five years ago.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read-heavy OLTP workloads: Both perform well, slight edge to MySQL&lt;/li&gt;
&lt;li&gt;Complex analytical queries: PostgreSQL generally faster&lt;/li&gt;
&lt;li&gt;Write-heavy workloads: Depends on transaction patterns and indexing&lt;/li&gt;
&lt;li&gt;Mixed workloads: PostgreSQL handles variety better&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real performance factors are usually configuration, indexing and query design rather than the database engine choice. Both require tuning for production workloads. Neither works optimally with default settings.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Extension ecosystem
&lt;/h2&gt;

&lt;p&gt;PostgreSQL's extension system is one of its biggest advantages. Extensions can add new data types, index types, functions and even modify core behavior. The ecosystem is rich and actively maintained.&lt;/p&gt;

&lt;p&gt;Popular PostgreSQL extensions include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostGIS&lt;/strong&gt; — Spatial and geographic data support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pg_stat_statements&lt;/strong&gt; — Query performance monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TimescaleDB&lt;/strong&gt; — Time-series data optimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Citus&lt;/strong&gt; — Distributed PostgreSQL for horizontal scaling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pgvector&lt;/strong&gt; — Vector similarity search for AI applications&lt;/li&gt;
&lt;/ul&gt;
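&lt;p&gt;Installing an extension, once its package is present on the server, is a single statement:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Enable query statistics (ships with PostgreSQL; also requires
-- shared_preload_libraries = 'pg_stat_statements' and a restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Enable vector search after installing the pgvector package
CREATE EXTENSION IF NOT EXISTS vector;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;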

&lt;p&gt;MySQL doesn't have an equivalent extension system. Feature additions require either MySQL updates or forking the source code. Some functionality exists through plugins (like authentication plugins) but the scope is limited compared to PostgreSQL.&lt;/p&gt;

&lt;p&gt;This extensibility matters more than it might seem. If you need geographic queries, PostgreSQL with PostGIS is significantly better than trying to work around MySQL's limited spatial support. If you're building AI features that need vector search, pgvector is a mature solution while MySQL has no comparable option.&lt;/p&gt;

&lt;p&gt;The extension ecosystem also means PostgreSQL can adapt to new use cases without waiting for core team priorities. When vector databases became important for AI applications, the community built pgvector. MySQL users had to wait for Oracle's roadmap or use a separate database.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Community and tooling
&lt;/h2&gt;

&lt;p&gt;Both databases have active communities, but they feel different. MySQL's community is larger in raw numbers but fragmented across MySQL, MariaDB and Percona variants. PostgreSQL's community is more unified around a single codebase.&lt;/p&gt;

&lt;p&gt;PostgreSQL's development is transparent. Mailing lists are public, design discussions happen in the open and anyone can propose patches. The code review process is rigorous and the community has high standards for what gets merged. Release cycles are predictable: one major version per year.&lt;/p&gt;

&lt;p&gt;MySQL development is primarily done inside Oracle. While the code is open source, the roadmap and priorities are set internally. External contributions exist but aren't as central to the project's direction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool availability:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GUI clients: Both have excellent options (pgAdmin, DBeaver, TablePlus for PostgreSQL; MySQL Workbench, DBeaver for MySQL)&lt;/li&gt;
&lt;li&gt;ORMs and drivers: Comprehensive support for both in all major languages&lt;/li&gt;
&lt;li&gt;Cloud offerings: Both available as managed services (RDS, Cloud SQL, Azure Database)&lt;/li&gt;
&lt;li&gt;Monitoring tools: Mature options for both&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For backup and disaster recovery, the tooling landscape varies. Both support native dump tools (pg_dump, mysqldump), but their capabilities differ. For automated backup management, tools such as Databasus support both PostgreSQL and MySQL with unified scheduling, encryption and storage options.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making the choice
&lt;/h2&gt;

&lt;p&gt;There's no universally correct answer. Both databases power successful applications at every scale. But certain patterns emerge when you look at typical use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose PostgreSQL when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need advanced data types (JSONB, arrays, custom types)&lt;/li&gt;
&lt;li&gt;Your queries are complex with many joins and subqueries&lt;/li&gt;
&lt;li&gt;Data integrity is non-negotiable&lt;/li&gt;
&lt;li&gt;You want extensibility (PostGIS, pgvector, TimescaleDB)&lt;/li&gt;
&lt;li&gt;Licensing simplicity matters to your organization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Choose MySQL when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your workload is primarily simple CRUD operations&lt;/li&gt;
&lt;li&gt;You need maximum compatibility with existing tools and hosting&lt;/li&gt;
&lt;li&gt;Your team already has MySQL expertise&lt;/li&gt;
&lt;li&gt;You're building something that will run on shared hosting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Neither choice is wrong for most applications. The differences matter most at the extremes: very complex analytical workloads, very high write volumes or specialized data types. For a typical web application, both will work fine with proper setup.&lt;/p&gt;

&lt;p&gt;What actually matters is understanding whichever database you choose deeply enough to configure it properly, design your schema correctly and troubleshoot problems when they occur. A well-tuned MySQL installation will outperform a misconfigured PostgreSQL one, and vice versa.&lt;/p&gt;

&lt;p&gt;Start with whichever one you or your team knows better. Switch if you hit real limitations, not theoretical ones. Both databases have been solving real problems for decades and both will continue improving.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
      <category>mysql</category>
    </item>
    <item>
      <title>10 PostgreSQL performance tuning tips that actually work in production</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Wed, 04 Feb 2026 19:50:29 +0000</pubDate>
      <link>https://forem.com/piteradyson/10-postgresql-performance-tuning-tips-that-actually-work-in-production-4996</link>
      <guid>https://forem.com/piteradyson/10-postgresql-performance-tuning-tips-that-actually-work-in-production-4996</guid>
      <description>&lt;p&gt;Performance tuning isn't about following a checklist. It's about understanding what's actually slowing down your database and fixing those specific problems. These are techniques that consistently deliver real improvements in production environments. Some of them are obvious but frequently misconfigured. Others are less known but surprisingly effective.&lt;/p&gt;

&lt;p&gt;The tips in this article cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory configuration (shared_buffers, work_mem)&lt;/li&gt;
&lt;li&gt;Index strategy and maintenance&lt;/li&gt;
&lt;li&gt;Connection management&lt;/li&gt;
&lt;li&gt;Vacuum and maintenance tuning&lt;/li&gt;
&lt;li&gt;Query optimization techniques&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2xkm34tpcvbpf0zzqwiz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2xkm34tpcvbpf0zzqwiz.png" alt="PostgreSQL tuning" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Configure shared_buffers properly
&lt;/h2&gt;

&lt;p&gt;PostgreSQL uses shared_buffers to cache frequently accessed data in memory. The default setting is usually way too low for production workloads. Setting this value correctly can dramatically reduce disk I/O and improve query performance.&lt;/p&gt;

&lt;p&gt;The general recommendation is to set shared_buffers to about 25% of your total system RAM. If you have 16 GB of RAM, start with 4 GB. If you're on a dedicated database server with lots of memory, you can go higher, but there are diminishing returns above 8-10 GB because PostgreSQL also relies on the operating system's file cache.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- In postgresql.conf&lt;/span&gt;
&lt;span class="n"&gt;shared_buffers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;GB&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After changing this setting, you need to restart PostgreSQL. Monitor your cache hit ratio to see if the change helped. A cache hit ratio above 99% is good. You can check it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_read&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;heap_read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;heap_hit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_read&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_statio_user_tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Tune work_mem for complex queries
&lt;/h2&gt;

&lt;p&gt;The work_mem setting controls how much memory PostgreSQL can use for internal sort operations and hash tables before it has to write to disk. If you're running complex queries with sorts, joins or aggregations, increasing work_mem can prevent expensive disk operations.&lt;/p&gt;

&lt;p&gt;Be careful though. work_mem is allocated per operation, not per query. A complex query with multiple sorts can use work_mem several times over. If you set it too high and have many concurrent queries, you can run out of memory.&lt;/p&gt;

&lt;p&gt;Start conservative. The default is usually 4 MB. Try 16-64 MB for analytical workloads. For specific heavy queries, you can increase it temporarily in the session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;work_mem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'256MB'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;large_table&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;some_column&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;RESET&lt;/span&gt; &lt;span class="n"&gt;work_mem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use pg_stat_statements to spot queries that write temporary files (high temp_blks_written), then run EXPLAIN ANALYZE on them: a "Sort Method: external merge" line means the sort spilled to disk. Those queries are candidates for work_mem tuning.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Add the right indexes
&lt;/h2&gt;

&lt;p&gt;Indexes speed up reads but slow down writes. The trick is finding the right balance. Start by identifying slow queries using pg_stat_statements or your query logs. Look at queries with high execution time or high call counts.&lt;/p&gt;

&lt;p&gt;For most cases, B-tree indexes work well. Create indexes on columns used in WHERE clauses, JOIN conditions and ORDER BY statements. But don't go overboard. Every index adds overhead during INSERTs and UPDATEs.&lt;/p&gt;
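
&lt;p&gt;As a sketch, with hypothetical table and column names, that looks like this. CREATE INDEX CONCURRENTLY builds the index without blocking writes, which matters on busy tables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Single-column index for queries filtering on customer_id
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- Composite index for WHERE status = ... ORDER BY created_at
CREATE INDEX CONCURRENTLY idx_orders_status_created ON orders (status, created_at);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;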

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Index Type&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;When to Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;B-tree&lt;/td&gt;
&lt;td&gt;General purpose, equality and range queries&lt;/td&gt;
&lt;td&gt;Most common scenarios, default choice&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GIN&lt;/td&gt;
&lt;td&gt;Full-text search, JSONB, arrays&lt;/td&gt;
&lt;td&gt;Searching within complex data types&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GiST&lt;/td&gt;
&lt;td&gt;Geometric data, full-text search&lt;/td&gt;
&lt;td&gt;Spatial queries, complex searches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BRIN&lt;/td&gt;
&lt;td&gt;Very large tables with natural ordering&lt;/td&gt;
&lt;td&gt;Time-series data, append-only tables&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Use EXPLAIN ANALYZE to verify your indexes are actually being used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'test@example.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see a Seq Scan when you expected an Index Scan, something's wrong. Maybe the index doesn't exist, or PostgreSQL thinks it's not worth using (which happens on small tables or when selecting most of the table).&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Use connection pooling
&lt;/h2&gt;

&lt;p&gt;Every PostgreSQL connection has overhead. Opening and closing connections repeatedly wastes resources. If your application creates a new database connection for each request, you're probably experiencing unnecessary latency and resource consumption.&lt;/p&gt;

&lt;p&gt;Connection poolers like PgBouncer sit between your application and PostgreSQL. They maintain a pool of connections and reuse them across multiple client requests. This reduces connection overhead significantly.&lt;/p&gt;

&lt;p&gt;PgBouncer supports three pooling modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Session pooling keeps a connection for the entire client session&lt;/li&gt;
&lt;li&gt;Transaction pooling releases connections after each transaction (more efficient for web apps)&lt;/li&gt;
&lt;li&gt;Statement pooling releases after each statement (use with caution, limited functionality)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most web applications, transaction pooling works well. Install PgBouncer, point your application to it instead of directly to PostgreSQL and configure the pool size based on your workload. A good starting point is 2-3 connections per CPU core on your database server.&lt;/p&gt;
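
&lt;p&gt;A minimal pgbouncer.ini sketch for transaction pooling might look like this (the host, database name and pool size are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[databases]
myapp_production = host=127.0.0.1 port=5432 dbname=myapp_production

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Your application then connects to port 6432 instead of 5432, with no other code changes.&lt;/p&gt;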

&lt;h2&gt;
  
  
  5. Analyze and vacuum regularly
&lt;/h2&gt;

&lt;p&gt;PostgreSQL uses MVCC (Multi-Version Concurrency Control) which creates row versions when you update or delete data. Over time, dead rows accumulate. VACUUM removes these dead rows and frees up space. ANALYZE updates statistics that the query planner uses to make decisions.&lt;/p&gt;

&lt;p&gt;Modern PostgreSQL versions have autovacuum enabled by default, but it might not be aggressive enough for high-write workloads. If you're seeing table bloat or degraded query performance over time, you probably need to tune autovacuum settings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- In postgresql.conf&lt;/span&gt;
&lt;span class="n"&gt;autovacuum_vacuum_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;-- Vacuum when 10% of table is dead rows&lt;/span&gt;
&lt;span class="n"&gt;autovacuum_analyze_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;05&lt;/span&gt;  &lt;span class="c1"&gt;-- Analyze when 5% has changed&lt;/span&gt;
&lt;span class="n"&gt;autovacuum_naptime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;  &lt;span class="c1"&gt;-- Check for work every 30 seconds&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For very active tables, you can also set table-specific settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;your_busy_table&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autovacuum_vacuum_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;05&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check for bloat using queries from pg_stat_user_tables. If you see tables with high n_dead_tup, autovacuum isn't keeping up.&lt;/p&gt;
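
&lt;p&gt;A quick way to spot tables where dead rows are piling up, using only the built-in statistics views:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;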

&lt;h2&gt;
  
  
  6. Optimize your queries
&lt;/h2&gt;

&lt;p&gt;Sometimes the database configuration is fine, but the queries themselves are inefficient. Use EXPLAIN ANALYZE to understand query execution plans. Look for sequential scans on large tables, nested loops with high costs or sorts that spill to disk.&lt;/p&gt;

&lt;p&gt;Common query optimizations include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding WHERE clauses to filter data early&lt;/li&gt;
&lt;li&gt;Using JOIN instead of subqueries when appropriate&lt;/li&gt;
&lt;li&gt;Avoiding SELECT * and only fetching columns you need&lt;/li&gt;
&lt;li&gt;Using LIMIT when you don't need all results&lt;/li&gt;
&lt;li&gt;Avoiding functions on indexed columns in WHERE clauses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's an example of a problematic query pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Bad: Function on indexed column prevents index usage&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Good: Range comparison allows index usage&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2027-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also consider using prepared statements. They're parsed and planned once, then executed multiple times with different parameters. This reduces overhead for frequently executed queries.&lt;/p&gt;
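
&lt;p&gt;A minimal sketch using the users table from the earlier example (the id column is assumed):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;PREPARE get_user (text) AS
    SELECT id, email FROM users WHERE email = $1;

EXECUTE get_user('test@example.com');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Most client libraries do this automatically when you use parameterized queries, so explicit PREPARE is mainly useful in SQL scripts and for testing.&lt;/p&gt;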

&lt;h2&gt;
  
  
  7. Partition large tables
&lt;/h2&gt;

&lt;p&gt;If you have tables with millions or billions of rows, partitioning can improve performance and manageability. PostgreSQL's declarative partitioning splits a large table into smaller physical pieces based on ranges, lists or hash values.&lt;/p&gt;

&lt;p&gt;Time-based partitioning is common for logs or event data. You create partitions by month or year, and older partitions can be archived or dropped easily. Queries that filter by the partition key only scan relevant partitions, not the entire table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="n"&gt;JSONB&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events_2026_01&lt;/span&gt; &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;OF&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2026-02-01'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events_2026_02&lt;/span&gt; &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;OF&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2026-02-01'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2026-03-01'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Partitioning also makes backups more manageable. Instead of backing up one massive table, you can backup or restore individual partitions. Tools like &lt;a href="https://databasus.com" rel="noopener noreferrer"&gt;PostgreSQL backup&lt;/a&gt; handle partitioned tables automatically, treating each partition appropriately during backup and restore operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Enable query logging for slow queries
&lt;/h2&gt;

&lt;p&gt;You can't optimize what you can't measure. PostgreSQL's slow query log captures queries that exceed a specified duration. This helps you identify problematic queries in production without impacting performance significantly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- In postgresql.conf&lt;/span&gt;
&lt;span class="n"&gt;log_min_duration_statement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;  &lt;span class="c1"&gt;-- Log queries taking more than 1 second&lt;/span&gt;
&lt;span class="n"&gt;log_line_prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The log will show you the full query text, execution time and context. Combine this with pg_stat_statements for aggregated statistics across all queries. You'll quickly see which queries are consuming the most resources.&lt;/p&gt;

&lt;p&gt;For production systems, start with a higher threshold (1-5 seconds) to avoid excessive logging. Once you've optimized the obvious slow queries, you can lower it to catch smaller issues.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. Use read replicas for reporting workloads
&lt;/h2&gt;

&lt;p&gt;If you're running heavy analytical queries or reports on your primary database, they can interfere with transactional workloads. Read replicas solve this by offloading read-only queries to separate servers.&lt;/p&gt;

&lt;p&gt;PostgreSQL's streaming replication creates one or more standby servers that continuously apply changes from the primary. Your application can send SELECT queries to these replicas, distributing the load.&lt;/p&gt;

&lt;p&gt;Setting up replication requires some configuration, but it's straightforward:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Primary Server&lt;/th&gt;
&lt;th&gt;Replica Server&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wal_level&lt;/td&gt;
&lt;td&gt;replica or logical&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;max_wal_senders&lt;/td&gt;
&lt;td&gt;Number of replicas + 1&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hot_standby&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;on&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
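
&lt;p&gt;On the primary, you also need a role with replication privileges. A minimal sketch (the role name and password are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'choose_a_strong_password';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The replica itself is typically initialized with pg_basebackup -R, which copies the data directory and writes the standby connection settings for you.&lt;/p&gt;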

&lt;p&gt;The replica will lag slightly behind the primary (typically milliseconds to seconds). If your application can tolerate this, replicas are a cheap way to scale read capacity.&lt;/p&gt;

&lt;p&gt;You can also use replicas for backup purposes. Taking backups from a replica instead of the primary reduces load on your production database.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Monitor and adjust autovacuum costs
&lt;/h2&gt;

&lt;p&gt;Autovacuum runs in the background to clean up dead rows, but it can consume I/O and CPU resources. If autovacuum runs too aggressively, it can slow down your application queries. If it doesn't run enough, tables bloat and performance degrades.&lt;/p&gt;

&lt;p&gt;The cost-based vacuum delay system controls how aggressively autovacuum uses resources. By default, it's fairly conservative. On modern hardware with SSDs, you can usually make it more aggressive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- In postgresql.conf&lt;/span&gt;
&lt;span class="n"&gt;autovacuum_vacuum_cost_delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;  &lt;span class="c1"&gt;-- Lower = faster vacuum&lt;/span&gt;
&lt;span class="n"&gt;autovacuum_vacuum_cost_limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;  &lt;span class="c1"&gt;-- Higher = more work per cycle&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For specific high-write tables, you might need custom settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;busy_table&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autovacuum_vacuum_cost_delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting cost_delay to 0 removes throttling entirely for that table. Use this carefully and monitor I/O.&lt;/p&gt;

&lt;p&gt;Watch the pg_stat_all_tables view for tables where autovacuum is falling behind (last_autovacuum is old and n_dead_tup is high). Those tables need tuning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;Performance tuning is iterative. Start by measuring your current state with pg_stat_statements and query logs. Identify the biggest bottlenecks first. A few slow queries might account for 80% of your database load.&lt;/p&gt;

&lt;p&gt;Apply one change at a time and measure the results. What works for one workload might not work for another. OLTP systems (lots of small transactions) need different tuning than OLAP systems (complex analytical queries).&lt;/p&gt;

&lt;p&gt;Before making any changes, establish a baseline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current query response times (p50, p95, p99)&lt;/li&gt;
&lt;li&gt;Cache hit ratio and buffer usage&lt;/li&gt;
&lt;li&gt;Connection counts and wait times&lt;/li&gt;
&lt;li&gt;Disk I/O and CPU utilization&lt;/li&gt;
&lt;/ul&gt;
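
&lt;p&gt;With pg_stat_statements enabled, the query-timing part of that baseline is one query away (the columns are named total_exec_time and mean_exec_time in PostgreSQL 13 and later; older versions use total_time and mean_time):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Queries consuming the most total execution time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;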

&lt;p&gt;Keep your PostgreSQL version updated. Each release includes performance improvements and better defaults. PostgreSQL 17 and 18 have significantly better query planning and execution than older versions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tuning Area&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;th&gt;Difficulty&lt;/th&gt;
&lt;th&gt;When to Do It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Indexes&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Early, based on query patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;shared_buffers&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;During initial setup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection pooling&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;When connections become bottleneck&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Partitioning&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;When tables exceed 50-100 million rows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autovacuum tuning&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;When seeing table bloat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Read replicas&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;When reads exceed write capacity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And remember: backups don't fix performance problems, but they let you experiment safely. Before making major changes, ensure you have reliable backups. Databasus is a dedicated PostgreSQL backup tool that offers automated backups with flexible scheduling and multiple storage options, for both small projects and large enterprises.&lt;/p&gt;

&lt;p&gt;These tuning techniques work because they address real bottlenecks: memory usage, disk I/O, connection overhead and query efficiency. Apply them based on your specific bottlenecks, not just because they're on a list.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>PostgreSQL automated backups — How to set up automated PostgreSQL backup schedules</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Tue, 03 Feb 2026 18:39:41 +0000</pubDate>
      <link>https://forem.com/piteradyson/postgresql-automated-backups-how-to-set-up-automated-postgresql-backup-schedules-4d0k</link>
      <guid>https://forem.com/piteradyson/postgresql-automated-backups-how-to-set-up-automated-postgresql-backup-schedules-4d0k</guid>
      <description>&lt;p&gt;Losing data hurts. Whether it's a corrupted disk, accidental deletion, or a bad deployment that wipes your production database, recovery without backups means starting from scratch. Automated PostgreSQL backups remove the human factor from the equation. You set them up once, and they run reliably while you focus on other things.&lt;/p&gt;

&lt;p&gt;This guide covers practical approaches to scheduling PostgreSQL backups, from simple cron jobs to dedicated backup tools. We'll look at what actually matters for different scenarios and how to avoid common mistakes that make backups useless when you need them most.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1dle53499mwkfzfvdoa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1dle53499mwkfzfvdoa.jpg" alt="PostgreSQL scheduled backups" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why automate PostgreSQL backups
&lt;/h2&gt;

&lt;p&gt;Manual backups work until they don't. Someone forgets, someone's on vacation, someone assumes the other person did it. Automation eliminates these failure modes.&lt;/p&gt;

&lt;h3&gt;
  
  
  The cost of manual backup processes
&lt;/h3&gt;

&lt;p&gt;Manual processes introduce variability. One day you run the backup at 2 AM, the next week at 6 PM. Sometimes you compress the output, sometimes you don't. The backup script lives on someone's laptop instead of version control. When disaster strikes, you discover the last backup was three weeks ago and nobody noticed.&lt;/p&gt;

&lt;p&gt;Automated backups run consistently. Same time, same configuration, same destination. They either succeed or they alert you immediately. There's no ambiguity about whether yesterday's backup happened.&lt;/p&gt;

&lt;h3&gt;
  
  
  What good backup automation looks like
&lt;/h3&gt;

&lt;p&gt;Reliable backup automation has a few key characteristics. It runs without intervention once configured. It stores backups in locations separate from the source database. It notifies you of failures immediately. And it maintains enough historical backups to recover from problems you discover days or weeks later.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Characteristic&lt;/th&gt;
&lt;th&gt;Manual process&lt;/th&gt;
&lt;th&gt;Automated process&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Consistency&lt;/td&gt;
&lt;td&gt;Varies by person&lt;/td&gt;
&lt;td&gt;Same every time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;td&gt;Often gaps&lt;/td&gt;
&lt;td&gt;Continuous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure detection&lt;/td&gt;
&lt;td&gt;Often delayed&lt;/td&gt;
&lt;td&gt;Immediate alerts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documentation&lt;/td&gt;
&lt;td&gt;Usually missing&lt;/td&gt;
&lt;td&gt;Built into config&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Good automation also handles retention. You don't want unlimited backups consuming storage forever, but you do want enough history to recover from slow-developing problems like data corruption that goes unnoticed for a week.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using pg_dump with cron
&lt;/h2&gt;

&lt;p&gt;The simplest automation approach combines PostgreSQL's native &lt;code&gt;pg_dump&lt;/code&gt; utility with cron scheduling. This works for small to medium databases where backup windows aren't tight.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic pg_dump script
&lt;/h3&gt;

&lt;p&gt;Create a backup script that handles the actual dump process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BACKUP_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/var/backups/postgresql"&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"myapp_production"&lt;/span&gt;
&lt;span class="nv"&gt;BACKUP_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BACKUP_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DATABASE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;

&lt;span class="c"&gt;# Create backup directory if it doesn't exist&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Run pg_dump with compression&lt;/span&gt;
pg_dump &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DATABASE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Check if backup succeeded&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup completed: &lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup failed!"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Remove backups older than 7 days&lt;/span&gt;
find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.sql.gz"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; +7 &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save this as &lt;code&gt;/usr/local/bin/pg-backup.sh&lt;/code&gt; and make it executable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x /usr/local/bin/pg-backup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script creates timestamped, compressed backups and removes old ones automatically. The &lt;code&gt;gzip&lt;/code&gt; compression typically reduces the size of a plain-text dump by 80-90%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up cron schedules
&lt;/h3&gt;

&lt;p&gt;Add a cron entry to run the backup at your preferred time. Edit the crontab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crontab &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add a line for daily backups at 3 AM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 3 * * * /usr/local/bin/pg-backup.sh &amp;gt;&amp;gt; /var/log/pg-backup.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For hourly backups during business hours:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 9-18 * * 1-5 /usr/local/bin/pg-backup.sh &amp;gt;&amp;gt; /var/log/pg-backup.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The log redirect captures both stdout and stderr, so you can troubleshoot failures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling authentication
&lt;/h3&gt;

&lt;p&gt;Avoid putting passwords in scripts. Use a &lt;code&gt;.pgpass&lt;/code&gt; file instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"localhost:5432:myapp_production:postgres:yourpassword"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.pgpass
&lt;span class="nb"&gt;chmod &lt;/span&gt;600 ~/.pgpass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PostgreSQL reads credentials from this file automatically when the connection parameters match. The strict permissions (600) are required; PostgreSQL ignores the file if others can read it.&lt;/p&gt;

&lt;p&gt;Cron jobs run with a minimal environment rather than your full shell setup, so use absolute paths in scripts and crontab entries. This basic approach works, but you'll want monitoring to know when backups fail.&lt;/p&gt;
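
&lt;p&gt;Because cron provides almost no environment, commands your script relies on may not be on its PATH. Setting it explicitly at the top of the crontab avoids surprises:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PATH=/usr/local/bin:/usr/bin:/bin
0 3 * * * /usr/local/bin/pg-backup.sh &amp;gt;&amp;gt; /var/log/pg-backup.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;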

&lt;h2&gt;
  
  
  Adding monitoring and alerts
&lt;/h2&gt;

&lt;p&gt;A backup that fails silently is worse than no backup at all. You think you're protected, but you're not. Add monitoring to catch problems early.&lt;/p&gt;

&lt;h3&gt;
  
  
  Email notifications
&lt;/h3&gt;

&lt;p&gt;Modify the backup script to send email on failure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BACKUP_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/var/backups/postgresql"&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"myapp_production"&lt;/span&gt;
&lt;span class="nv"&gt;BACKUP_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BACKUP_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DATABASE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;
&lt;span class="nv"&gt;ADMIN_EMAIL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"admin@example.com"&lt;/span&gt;

&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

pg_dump &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DATABASE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup completed: &lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"PostgreSQL backup failed at &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"ALERT: Database backup failed"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ADMIN_EMAIL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi

&lt;/span&gt;find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.sql.gz"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; +7 &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sends an email when the dump pipeline exits with a non-zero status. Because &lt;code&gt;$?&lt;/code&gt; after a plain pipeline reflects only the last command, make sure the script enables &lt;code&gt;set -o pipefail&lt;/code&gt; so a &lt;code&gt;pg_dump&lt;/code&gt; failure isn't masked by a successful &lt;code&gt;gzip&lt;/code&gt;. You might also want success notifications for critical databases, just to confirm everything's working.&lt;/p&gt;

&lt;h3&gt;
  
  
  Webhook integration
&lt;/h3&gt;

&lt;p&gt;For team chat notifications, curl to a webhook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;send_notification&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;webhook_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://hooks.slack.com/services/YOUR/WEBHOOK/URL"&lt;/span&gt;

    curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$message&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$webhook_url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# After the pg_dump pipeline in the backup script:&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;send_notification &lt;span class="s2"&gt;"PostgreSQL backup completed: &lt;/span&gt;&lt;span class="nv"&gt;$DATABASE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;send_notification &lt;span class="s2"&gt;"ALERT: PostgreSQL backup failed for &lt;/span&gt;&lt;span class="nv"&gt;$DATABASE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace the webhook URL with your Slack, Discord, or other service endpoint. Slack accepts this JSON shape directly; Discord expects a &lt;code&gt;content&lt;/code&gt; field instead of &lt;code&gt;text&lt;/code&gt;, and other platforms have similar small variations, so check your service's webhook documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Verifying backup integrity
&lt;/h3&gt;

&lt;p&gt;A backup file existing doesn't mean it's usable. Add verification steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check file size (should be at least some minimum)&lt;/span&gt;
&lt;span class="nv"&gt;MIN_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000
&lt;span class="nv"&gt;FILE_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt;%z &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt;%s &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE_SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-lt&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MIN_SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;send_notification &lt;span class="s2"&gt;"WARNING: Backup file suspiciously small (&lt;/span&gt;&lt;span class="nv"&gt;$FILE_SIZE&lt;/span&gt;&lt;span class="s2"&gt; bytes)"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Verify gzip integrity&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;send_notification &lt;span class="s2"&gt;"ALERT: Backup file appears corrupted"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The size check catches cases where the database connection failed but the script didn't error properly. The gzip test verifies the compression is intact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Remote storage for backups
&lt;/h2&gt;

&lt;p&gt;Backups stored on the same server as the database don't protect against disk failures, server compromises, or datacenter issues. Store copies remotely.&lt;/p&gt;

&lt;h3&gt;
  
  
  S3 and compatible storage
&lt;/h3&gt;

&lt;p&gt;Add S3 upload to your backup script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;BUCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"s3://my-backup-bucket/postgresql"&lt;/span&gt;

&lt;span class="c"&gt;# Upload to S3&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt; &lt;span class="nt"&gt;--storage-class&lt;/span&gt; STANDARD_IA

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-ne&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;send_notification &lt;span class="s2"&gt;"ALERT: S3 upload failed for &lt;/span&gt;&lt;span class="nv"&gt;$DATABASE&lt;/span&gt;&lt;span class="s2"&gt; backup"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Optionally remove local file after successful upload&lt;/span&gt;
&lt;span class="c"&gt;# rm "$BACKUP_FILE"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;STANDARD_IA&lt;/code&gt; storage class costs less for infrequently accessed files like backups. Configure the AWS CLI with &lt;code&gt;aws configure&lt;/code&gt; before running the script.&lt;/p&gt;
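
&lt;p&gt;As an alternative to &lt;code&gt;aws configure&lt;/code&gt;, the CLI also reads credentials from standard environment variables, which is convenient in cron jobs and CI. The values below are placeholders:&lt;/p&gt;

```shell
# Standard AWS CLI environment variables; values here are placeholders
export AWS_ACCESS_KEY_ID="AKIAEXAMPLEKEY"
export AWS_SECRET_ACCESS_KEY="examplesecretkey"
export AWS_DEFAULT_REGION="us-east-1"
```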

&lt;p&gt;For S3-compatible services like Cloudflare R2 or MinIO, add the endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt; &lt;span class="nt"&gt;--endpoint-url&lt;/span&gt; https://your-endpoint.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Retention policies
&lt;/h3&gt;

&lt;p&gt;Remote storage should have its own retention rules. S3 lifecycle policies can automatically expire old backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ExpireOldBackups"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Filter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgresql/"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Expiration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api put-bucket-lifecycle-configuration &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--bucket&lt;/span&gt; my-backup-bucket &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--lifecycle-configuration&lt;/span&gt; file://lifecycle.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This automatically deletes backups older than 30 days. Adjust based on your recovery requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automated backups with Databasus
&lt;/h2&gt;

&lt;p&gt;Writing and maintaining backup scripts takes time. Monitoring, remote storage integration, retention management, and team notifications all add complexity. Databasus, a dedicated &lt;a href="https://databasus.com" rel="noopener noreferrer"&gt;PostgreSQL backup&lt;/a&gt; tool, handles this out of the box with a web interface.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;

&lt;p&gt;Run Databasus using Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; databasus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 4005:4005 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ./databasus-data:/databasus-data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; unless-stopped &lt;span class="se"&gt;\&lt;/span&gt;
  databasus/databasus:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with Docker Compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;databasus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus/databasus:latest&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4005:4005"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;databasus-data:/databasus-data&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;databasus-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuration steps
&lt;/h3&gt;

&lt;p&gt;Access the web interface at &lt;code&gt;http://your-server:4005&lt;/code&gt;, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add your database&lt;/strong&gt; — Click "New Database", select PostgreSQL, and enter your connection details (host, port, database name, credentials)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select storage&lt;/strong&gt; — Choose where backups should go: local storage, S3, Google Drive, SFTP, or other supported destinations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select schedule&lt;/strong&gt; — Pick a backup frequency: hourly, daily, weekly, monthly, or define a custom cron expression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click "Create backup"&lt;/strong&gt; — Databasus validates the configuration and starts the backup schedule&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Databasus handles compression automatically, supports multiple notification channels (Slack, Discord, Telegram, email), and provides a dashboard showing backup history and status. It works for both self-hosted PostgreSQL and cloud-managed databases like AWS RDS and Google Cloud SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing backup frequency
&lt;/h2&gt;

&lt;p&gt;How often you back up depends on how much data you can afford to lose. This is your Recovery Point Objective (RPO).&lt;/p&gt;

&lt;h3&gt;
  
  
  Matching frequency to requirements
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Acceptable data loss&lt;/th&gt;
&lt;th&gt;Recommended frequency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Development database&lt;/td&gt;
&lt;td&gt;Days&lt;/td&gt;
&lt;td&gt;Weekly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Internal tools&lt;/td&gt;
&lt;td&gt;Hours&lt;/td&gt;
&lt;td&gt;Daily&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Customer-facing app&lt;/td&gt;
&lt;td&gt;Minutes to an hour&lt;/td&gt;
&lt;td&gt;Hourly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Financial/compliance&lt;/td&gt;
&lt;td&gt;Near zero&lt;/td&gt;
&lt;td&gt;Continuous (WAL archiving)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For most applications, daily backups at off-peak hours work well. Hourly backups suit applications with frequent writes where losing an hour of data would be painful.&lt;/p&gt;
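
&lt;p&gt;The table rows map to simple cron expressions. A sketch, where &lt;code&gt;backup.sh&lt;/code&gt; is a placeholder for your backup script:&lt;/p&gt;

```shell
# Weekly: Sunday at 03:00
0 3 * * 0 /usr/local/bin/backup.sh
# Daily: every day at 03:00
0 3 * * * /usr/local/bin/backup.sh
# Hourly: on the hour
0 * * * * /usr/local/bin/backup.sh
```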

&lt;h3&gt;
  
  
  Timing considerations
&lt;/h3&gt;

&lt;p&gt;Schedule backups during low-traffic periods. &lt;code&gt;pg_dump&lt;/code&gt; reads from a consistent snapshot and doesn't block normal reads and writes, but it still generates load. A large dump during peak hours can slow down your application.&lt;/p&gt;

&lt;p&gt;Consider time zones. If your users are mostly in one region, schedule backups when they're sleeping. For global applications, find the least-busy period in your analytics.&lt;/p&gt;

&lt;p&gt;Database size matters too. A 100 GB database might take 30 minutes to dump. If you want hourly backups, you need that process to complete well within the hour.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing your recovery process
&lt;/h2&gt;

&lt;p&gt;Backups you've never tested are assumptions, not guarantees. Regular restore tests catch problems before they matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Restore verification steps
&lt;/h3&gt;

&lt;p&gt;Create a test environment and restore periodically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a test database&lt;/span&gt;
createdb &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres myapp_restore_test

&lt;span class="c"&gt;# Restore the backup&lt;/span&gt;
&lt;span class="nb"&gt;gunzip&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; /var/backups/postgresql/myapp_production_20240115_030000.sql.gz | &lt;span class="se"&gt;\&lt;/span&gt;
    psql &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; myapp_restore_test

&lt;span class="c"&gt;# Run basic validation&lt;/span&gt;
psql &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; myapp_restore_test &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"SELECT count(*) FROM users;"&lt;/span&gt;

&lt;span class="c"&gt;# Clean up&lt;/span&gt;
dropdb &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres myapp_restore_test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Automate this as a weekly job and alert on failures. A backup that can't be restored is worthless.&lt;/p&gt;

&lt;h3&gt;
  
  
  Documenting recovery procedures
&lt;/h3&gt;

&lt;p&gt;Write down the exact steps to recover. Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Where backups are stored (all locations)&lt;/li&gt;
&lt;li&gt;How to access storage credentials&lt;/li&gt;
&lt;li&gt;Commands to restore&lt;/li&gt;
&lt;li&gt;Expected recovery time&lt;/li&gt;
&lt;li&gt;Who to contact if issues arise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Test the documentation by having someone unfamiliar with the system follow it. Gaps become obvious quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common automation mistakes
&lt;/h2&gt;

&lt;p&gt;Even well-intentioned backup automation fails in predictable ways.&lt;/p&gt;

&lt;h3&gt;
  
  
  Storage on the same disk
&lt;/h3&gt;

&lt;p&gt;Backing up to the same physical disk as the database protects against accidental deletion but not hardware failure. Always include remote storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  No retention limits
&lt;/h3&gt;

&lt;p&gt;Unlimited backup retention eventually fills your storage. Set explicit retention policies and monitor disk usage.&lt;/p&gt;
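
&lt;p&gt;A minimal disk-usage check you could run alongside the backup job; the 90% threshold and default directory are assumptions to adjust for your setup:&lt;/p&gt;

```shell
#!/bin/bash
# Warn when the filesystem holding backups passes a usage threshold
BACKUP_DIR="${1:-/var/backups}"   # pass your backup directory as the first argument
THRESHOLD=90
# df --output=pcent prints e.g. " 42%"; strip everything but the digits
USAGE=$(df --output=pcent "$BACKUP_DIR" 2>/dev/null | tail -1 | tr -dc '0-9')
if [ -n "$USAGE" ]; then
    if [ "$USAGE" -ge "$THRESHOLD" ]; then
        echo "WARNING: ${BACKUP_DIR} is ${USAGE}% full"
    fi
fi
```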

&lt;h3&gt;
  
  
  Ignoring backup duration
&lt;/h3&gt;

&lt;p&gt;A backup that takes 4 hours can't run hourly. Monitor how long your backups take and adjust schedules accordingly. Alert when duration exceeds thresholds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hardcoded credentials
&lt;/h3&gt;

&lt;p&gt;Passwords in scripts end up in version control, logs, and process listings. Use &lt;code&gt;.pgpass&lt;/code&gt; files, environment variables, or secrets management.&lt;/p&gt;
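
&lt;p&gt;A &lt;code&gt;.pgpass&lt;/code&gt; file keeps the password out of the command line and process listings; each line is &lt;code&gt;hostname:port:database:username:password&lt;/code&gt;. The password below is a placeholder:&lt;/p&gt;

```shell
# ~/.pgpass — must be chmod 600, or PostgreSQL ignores it
localhost:5432:myapp_production:postgres:yourpassword
```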

&lt;h3&gt;
  
  
  Missing failure notifications
&lt;/h3&gt;

&lt;p&gt;By default, cron only emails a job's output to the crontab owner, and only when a mail agent is configured. Failures that exit silently go unnoticed. Always add explicit failure handling and notifications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Automated PostgreSQL backups prevent the kind of data loss that damages businesses and ruins weekends. Start with cron and &lt;code&gt;pg_dump&lt;/code&gt; for simple setups, add monitoring and remote storage as your requirements grow, or use a dedicated tool like Databasus to handle the complexity. Whatever approach you choose, test your restores regularly. A backup strategy is only as good as your ability to recover from it.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>MongoDB Docker setup — Running MongoDB in Docker containers complete guide</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Mon, 02 Feb 2026 12:15:55 +0000</pubDate>
      <link>https://forem.com/piteradyson/mongodb-docker-setup-running-mongodb-in-docker-containers-complete-guide-3p7a</link>
      <guid>https://forem.com/piteradyson/mongodb-docker-setup-running-mongodb-in-docker-containers-complete-guide-3p7a</guid>
      <description>&lt;p&gt;Running MongoDB in Docker simplifies deployment and makes environments reproducible across development, testing and production. You can spin up a database in seconds without dealing with complex installation procedures. This guide covers everything from basic container setup to production configurations with replica sets, persistence, custom settings and proper backup strategies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74ra8385d7w8d3kwy9f1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74ra8385d7w8d3kwy9f1.png" alt="MongoDB in Docker" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why run MongoDB in Docker
&lt;/h2&gt;

&lt;p&gt;Traditional MongoDB installation requires adding repositories, managing versions, and cleaning up when things break. Docker containers provide isolation and consistency that native installations struggle to match.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benefits of containerized MongoDB
&lt;/h3&gt;

&lt;p&gt;Docker containers bundle MongoDB with all dependencies into a single package. You get identical behavior on your laptop, CI pipeline, and production servers. The classic "works on my machine" problem disappears.&lt;/p&gt;

&lt;p&gt;Containers start fast. Launching a fresh MongoDB instance takes about 5-10 seconds versus several minutes for traditional installation. This matters for integration tests and rapid development cycles.&lt;/p&gt;

&lt;p&gt;Cleanup is simple. Delete the container and it's gone completely. No leftover config files, no orphaned data directories cluttering your system.&lt;/p&gt;
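
&lt;p&gt;Cleanup is a two-command sequence; if you attached a named volume, remove it separately with &lt;code&gt;docker volume rm&lt;/code&gt;:&lt;/p&gt;

```shell
# Stop the running container, then remove it and its writable layer
docker stop mongodb
docker rm mongodb
```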

&lt;h3&gt;
  
  
  When Docker makes sense for MongoDB
&lt;/h3&gt;

&lt;p&gt;Docker works well for development environments where quick setup and teardown matter. It's also solid for microservices architectures where each service might need its own database instance. CI/CD pipelines benefit significantly from reproducible database containers.&lt;/p&gt;

&lt;p&gt;For production use, Docker adds a bit of complexity but provides consistency across environments. The performance overhead is typically 1-3% for database workloads, which most applications can easily absorb.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick start with Docker run
&lt;/h2&gt;

&lt;p&gt;The fastest way to get MongoDB running is a single Docker command. This approach works for testing and development scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic container setup
&lt;/h3&gt;

&lt;p&gt;Start MongoDB with minimal configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts MongoDB 8 in detached mode. The container runs until you stop it explicitly.&lt;/p&gt;

&lt;p&gt;Check if it's running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker ps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connect to the database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; mongodb mongosh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment variables for initial setup
&lt;/h3&gt;

&lt;p&gt;MongoDB's Docker image supports environment variables for first-run configuration (they only take effect when the data directory is empty, and the root username and password must be set together):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Required&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MONGO_INITDB_ROOT_USERNAME&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Admin username&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Admin password&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MONGO_INITDB_DATABASE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Initial database name&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Create an admin user on startup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connect with authentication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; mongodb mongosh &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Exposing ports
&lt;/h3&gt;

&lt;p&gt;MongoDB runs on port 27017 inside the container by default. Map it to your host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 27017:27017 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can connect from your host machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mongosh &lt;span class="s2"&gt;"mongodb://admin:secretpassword@127.0.0.1:27017/admin"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use a different host port if 27017 is already in use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 27018:27017 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Data persistence with volumes
&lt;/h2&gt;

&lt;p&gt;Without volumes, your data lives in the container's writable layer and vanishes when the container is removed. That's acceptable for throwaway test databases, but anything beyond that needs persistence.&lt;/p&gt;

&lt;h3&gt;
  
  
  Named volumes
&lt;/h3&gt;

&lt;p&gt;Docker named volumes are the simplest approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; mongodb-data:/data/db &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The volume &lt;code&gt;mongodb-data&lt;/code&gt; persists even after you delete the container. List your volumes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker volume &lt;span class="nb"&gt;ls&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inspect volume details:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker volume inspect mongodb-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bind mounts
&lt;/h3&gt;

&lt;p&gt;Bind mounts map a host directory directly into the container. This is useful when you need direct access to data files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /path/to/data:/data/db &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure the directory exists and has proper permissions. On Linux, the MongoDB user inside the container needs write access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /path/to/data
&lt;span class="nb"&gt;chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 999:999 /path/to/data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The UID 999 corresponds to the MongoDB user inside the container.&lt;/p&gt;
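&lt;p&gt;If you want to confirm the UID on the image you're running (it can vary across image variants), check from inside the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run --rm mongo:8 id mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;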

&lt;h3&gt;
  
  
  Volume backup
&lt;/h3&gt;

&lt;p&gt;Back up a named volume by running a temporary container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; mongodb-data:/source:ro &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/backup &lt;span class="se"&gt;\&lt;/span&gt;
  alpine &lt;span class="nb"&gt;tar &lt;/span&gt;czf /backup/mongodb-backup.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /source &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a compressed archive of the data directory. For proper database backups, use &lt;code&gt;mongodump&lt;/code&gt; instead, which we'll cover later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Docker Compose for MongoDB
&lt;/h2&gt;

&lt;p&gt;Docker Compose makes multi-container setups manageable and keeps configurations under version control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic compose file
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;docker-compose.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_DATABASE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;27017:27017"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stop and remove:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Remove including volumes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Application with MongoDB
&lt;/h3&gt;

&lt;p&gt;A typical setup includes your application and MongoDB together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb://appuser:apppassword@mongodb:27017/myapp?authSource=admin&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service_healthy&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;

  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mongosh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--eval"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db.adminCommand('ping')"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;depends_on&lt;/code&gt; with &lt;code&gt;condition: service_healthy&lt;/code&gt; ensures your application waits for MongoDB to be ready before starting.&lt;/p&gt;
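&lt;p&gt;Note that &lt;code&gt;service_healthy&lt;/code&gt; only gates the initial startup; if MongoDB restarts later, your application still needs its own retry logic. Here's a minimal sketch with the official Node.js driver (the retry counts and the &lt;code&gt;MONGODB_URI&lt;/code&gt; lookup are illustrative, not part of the compose setup above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;const { MongoClient } = require('mongodb');

// Retry the initial connection a few times before giving up.
async function connectWithRetry(uri, attempts = 5, delayMs = 2000) {
  for (let i = 1; i &lt;= attempts; i++) {
    try {
      const client = new MongoClient(uri);
      await client.connect();
      return client;
    } catch (err) {
      if (i === attempts) throw err;
      console.log(`Attempt ${i} failed, retrying in ${delayMs} ms`);
      await new Promise((resolve) =&gt; setTimeout(resolve, delayMs));
    }
  }
}

connectWithRetry(process.env.MONGODB_URI)
  .then(() =&gt; console.log('Connected to MongoDB'));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;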

&lt;h2&gt;
  
  
  Custom configuration
&lt;/h2&gt;

&lt;p&gt;Default settings work for development, but production workloads often need tuning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration file mount
&lt;/h3&gt;

&lt;p&gt;Create a custom configuration file &lt;code&gt;mongod.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;storage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dbPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/data/db&lt;/span&gt;
  &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;wiredTiger&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;engineConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;cacheSizeGB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;

&lt;span class="na"&gt;systemLog&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;destination&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;file&lt;/span&gt;
  &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/log/mongodb/mongod.log&lt;/span&gt;
  &lt;span class="na"&gt;logAppend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;net&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;27017&lt;/span&gt;
  &lt;span class="na"&gt;bindIp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0&lt;/span&gt;

&lt;span class="na"&gt;security&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;authorization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;enabled&lt;/span&gt;

&lt;span class="na"&gt;operationProfiling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;slowOpThresholdMs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slowOp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mount it into the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ./mongod.conf:/etc/mongod.conf:ro &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; mongodb-data:/data/db &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; mongodb-logs:/var/log/mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secretpassword &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8 &lt;span class="nt"&gt;--config&lt;/span&gt; /etc/mongod.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Docker Compose with custom config
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--config"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/etc/mongod.conf"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-logs:/var/log/mongodb&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./mongod.conf:/etc/mongod.conf:ro&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;27017:27017"&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Common configuration options
&lt;/h3&gt;

&lt;p&gt;Key settings to consider for production:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Production recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;storage.wiredTiger.engineConfig.cacheSizeGB&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;50% of (RAM - 1 GB), minimum 256 MB&lt;/td&gt;
&lt;td&gt;Set explicitly based on available memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;operationProfiling.slowOpThresholdMs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;Tune based on your performance requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net.maxIncomingConnections&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;65536&lt;/td&gt;
&lt;td&gt;Set based on expected concurrent connections&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;security.authorization&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;disabled&lt;/td&gt;
&lt;td&gt;Always enable in production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Verify your configuration is applied:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongosh &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s2"&gt;"db.adminCommand({getParameter: '*'})"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Initialization scripts
&lt;/h2&gt;

&lt;p&gt;The MongoDB Docker image can run scripts on first startup. This is useful for creating users, collections, and seed data.&lt;/p&gt;

&lt;h3&gt;
  
  
  JavaScript initialization
&lt;/h3&gt;

&lt;p&gt;Place &lt;code&gt;.js&lt;/code&gt; or &lt;code&gt;.sh&lt;/code&gt; files in &lt;code&gt;/docker-entrypoint-initdb.d/&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;init/01-create-users.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getSiblingDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;myapp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createUser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;appuser&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apppassword&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;readWrite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;myapp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createUser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;readonly&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;readonlypassword&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;read&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;myapp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create &lt;code&gt;init/02-create-collections.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getSiblingDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;myapp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createCollection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;validator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;$jsonSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;bsonType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;email&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;createdAt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;bsonType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;must be a string and is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;bsonType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;date&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;must be a date and is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mount the init directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./init:/docker-entrypoint-initdb.d:ro&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Scripts run in alphabetical order, only on first container start when the data directory is empty.&lt;/p&gt;
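&lt;p&gt;A consequence is that editing an init script has no effect on an existing deployment. To re-run the scripts in development, remove the container together with its data volume first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down -v   # warning: this deletes the data volume
docker compose up -d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;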

&lt;h3&gt;
  
  
  Shell script initialization
&lt;/h3&gt;

&lt;p&gt;For more complex setup, use shell scripts:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;init/00-setup.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

mongosh &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
use admin
db.auth('&lt;/span&gt;&lt;span class="nv"&gt;$MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="sh"&gt;', '&lt;/span&gt;&lt;span class="nv"&gt;$MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="sh"&gt;')

use myapp
db.createCollection('config')
db.config.insertOne({
  key: 'version',
  value: '1.0.0',
  createdAt: new Date()
})
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make it executable (the entrypoint runs executable &lt;code&gt;.sh&lt;/code&gt; files and sources non-executable ones):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x init/00-setup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Networking
&lt;/h2&gt;

&lt;p&gt;Docker networking controls how containers communicate with each other and the outside world.&lt;/p&gt;

&lt;h3&gt;
  
  
  Default bridge network
&lt;/h3&gt;

&lt;p&gt;Containers on the default bridge network can communicate via IP address but not hostname. For basic development this works fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pw mongo:8
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; mongo:8 mongosh &lt;span class="s2"&gt;"mongodb://admin:pw@&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;docker inspect &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}'&lt;/span&gt; mongodb&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;:27017/admin"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Custom networks
&lt;/h3&gt;

&lt;p&gt;Custom networks allow hostname-based communication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker network create myapp-network

docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mongodb &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; myapp-network &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pw &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8

docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; myapp-network &lt;span class="se"&gt;\&lt;/span&gt;
  mongo:8 &lt;span class="se"&gt;\&lt;/span&gt;
  mongosh &lt;span class="s2"&gt;"mongodb://admin:pw@mongodb:27017/admin"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second container can reach MongoDB using hostname &lt;code&gt;mongodb&lt;/code&gt;.&lt;/p&gt;
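&lt;p&gt;You can confirm that name resolution works on the custom network with any small image (&lt;code&gt;busybox&lt;/code&gt; here is just an example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run --rm --network myapp-network busybox nslookup mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;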

&lt;h3&gt;
  
  
  Compose networking
&lt;/h3&gt;

&lt;p&gt;Docker Compose creates a network automatically. Services communicate by service name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb://admin:pw@mongodb:27017/myapp?authSource=admin&lt;/span&gt;

  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pw&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Health checks and monitoring
&lt;/h2&gt;

&lt;p&gt;Proper health checks ensure containers are actually ready to serve traffic, not just running.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic health check
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mongosh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--eval"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db.adminCommand('ping')"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check health status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker inspect &lt;span class="nt"&gt;--format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{{.State.Health.Status}}'&lt;/span&gt; mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
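
&lt;p&gt;In deployment scripts it helps to block until the container actually reports healthy, not just running. A minimal polling sketch (the function name, 30-attempt default and 2-second interval are arbitrary choices, not part of Docker):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Poll a container's health status until it reports "healthy".
# Usage: wait_healthy mongodb
wait_healthy() {
  container="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(docker inspect --format='{{.State.Health.Status}}' "$container")
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "$container did not become healthy"
  return 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Call &lt;code&gt;wait_healthy mongodb&lt;/code&gt; before running migrations or smoke tests.&lt;/p&gt;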



&lt;h3&gt;
  
  
  Health check with authentication
&lt;/h3&gt;

&lt;p&gt;When authentication is enabled, include credentials in the health check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mongosh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-u"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-p"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;secretpassword"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--authenticationDatabase"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--eval"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db.adminCommand('ping')"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
  &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
  &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Monitoring with logs
&lt;/h3&gt;

&lt;p&gt;View container logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Follow logs in real-time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs &lt;span class="nt"&gt;-f&lt;/span&gt; mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Limit output to recent entries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs &lt;span class="nt"&gt;--tail&lt;/span&gt; 100 mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable profiling in your configuration to catch slow operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;operationProfiling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;slowOpThresholdMs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slowOp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mount a volume for logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-logs:/var/log/mongodb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
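
&lt;p&gt;Mounting the log volume alone isn't enough: the official image logs to stdout by default, so point &lt;code&gt;mongod&lt;/code&gt; at a file under that path in its configuration (a sketch):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that with file logging enabled, &lt;code&gt;docker logs&lt;/code&gt; will no longer show &lt;code&gt;mongod&lt;/code&gt; output.&lt;/p&gt;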



&lt;h2&gt;
  
  
  Backup strategies for Docker MongoDB
&lt;/h2&gt;

&lt;p&gt;Data in containers needs the same backup discipline as a traditional installation. Docker adds a few considerations, but the fundamentals remain the same.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using mongodump in Docker
&lt;/h3&gt;

&lt;p&gt;Run &lt;code&gt;mongodump&lt;/code&gt; inside the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongodump &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="nt"&gt;--out&lt;/span&gt; /dump
docker &lt;span class="nb"&gt;cp &lt;/span&gt;mongodb:/dump ./backup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a specific database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongodump &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="nt"&gt;--db&lt;/span&gt; myapp &lt;span class="nt"&gt;--out&lt;/span&gt; /dump
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compressed backup directly to host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongodump &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="nt"&gt;--archive&lt;/span&gt; &lt;span class="nt"&gt;--gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; backup.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
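
&lt;p&gt;A backup only counts once a restore has been tested. The archive can be streamed back through &lt;code&gt;mongorestore&lt;/code&gt;; this sketch wraps the command in a function and reuses the example credentials from above. The &lt;code&gt;--drop&lt;/code&gt; flag replaces existing collections, so use it deliberately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Stream a gzip archive from the host back into the container.
# Usage: restore_backup ./backup.gz
restore_backup() {
  cat "$1" | docker exec -i mongodb mongorestore \
    -u admin -p secretpassword \
    --authenticationDatabase admin \
    --archive --gzip --drop
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;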



&lt;h3&gt;
  
  
  Scheduled backups with cron
&lt;/h3&gt;

&lt;p&gt;Create a backup script. It reads the root password from the &lt;code&gt;MONGO_ROOT_PASSWORD&lt;/code&gt; environment variable; since cron runs with a minimal environment, set that variable in the crontab or inside the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BACKUP_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/backups"&lt;/span&gt;
&lt;span class="nv"&gt;CONTAINER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mongodb"&lt;/span&gt;

docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nv"&gt;$CONTAINER&lt;/span&gt; mongodump &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MONGO_ROOT_PASSWORD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--archive&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt;&lt;span class="s2"&gt;/mongodb_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.gz"&lt;/span&gt;

&lt;span class="c"&gt;# Keep only last 7 days&lt;/span&gt;
find &lt;span class="nv"&gt;$BACKUP_DIR&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"mongodb_*.gz"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; +7 &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add to crontab for daily 3 AM backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 3 * * * /usr/local/bin/mongodb-backup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
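
&lt;p&gt;Retention scripts tend to rot quietly, so it's worth sanity-checking that the newest archive is at least a valid gzip file. A small sketch (the function name is a choice; the file layout matches the script above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Verify the newest archive in a backup directory is valid gzip.
# Usage: check_latest_backup /backups
check_latest_backup() {
  latest=$(ls -t "$1"/mongodb_*.gz | head -n 1)
  if gzip -t "$latest"; then
    echo "backup OK: $latest"
  else
    echo "backup CORRUPT: $latest"
    return 1
  fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This only proves the archive is readable, not that its contents are complete; periodic restore tests remain the real verification.&lt;/p&gt;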



&lt;h3&gt;
  
  
  Using Databasus for automated backups
&lt;/h3&gt;

&lt;p&gt;Manual backup scripts work, but they require ongoing maintenance and lack built-in monitoring. Databasus (a dedicated tool for &lt;a href="https://databasus.com/mongodb-backup" rel="noopener noreferrer"&gt;MongoDB backup&lt;/a&gt;) provides automated backups with a web interface, scheduling and notifications.&lt;/p&gt;

&lt;p&gt;Install Databasus on a separate server using Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; databasus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 4005:4005 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ./databasus-data:/databasus-data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; unless-stopped &lt;span class="se"&gt;\&lt;/span&gt;
  databasus/databasus:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with Docker Compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;databasus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus/databasus:latest&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4005:4005"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;databasus-data:/databasus-data&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;databasus-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Access the web interface at &lt;code&gt;http://your-databasus-server:4005&lt;/code&gt;, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add your database&lt;/strong&gt; — Click "New Database", select MongoDB, enter your MongoDB server's connection details (host, port, credentials)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select storage&lt;/strong&gt; — Choose local storage, S3, Google Cloud Storage, or other supported destinations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select schedule&lt;/strong&gt; — Set backup frequency: hourly, daily, weekly, or custom cron expression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click "Create backup"&lt;/strong&gt; — Databasus handles backup execution, compression, retention and notifications&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Databasus supports multiple notification channels including Slack, Discord, Telegram and email, so you know immediately when backups succeed or fail.&lt;/p&gt;

&lt;h2&gt;
  
  
  Replica sets in Docker
&lt;/h2&gt;

&lt;p&gt;For production environments, running MongoDB as a replica set provides high availability and data redundancy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Single-node replica set
&lt;/h3&gt;

&lt;p&gt;Even a single-node replica set is useful because it enables change streams and transactions. Note that when access control is enabled, &lt;code&gt;mongod&lt;/code&gt; requires a keyfile even for a single-member replica set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--replSet"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rs0"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--bind_ip_all"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;27017:27017"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-data:/data/db&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initialize the replica set after starting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; mongodb mongosh &lt;span class="nt"&gt;-u&lt;/span&gt; admin &lt;span class="nt"&gt;-p&lt;/span&gt; secretpassword &lt;span class="nt"&gt;--authenticationDatabase&lt;/span&gt; admin &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s2"&gt;"rs.initiate()"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
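
&lt;p&gt;One caveat: &lt;code&gt;rs.initiate()&lt;/code&gt; records the container's hostname as the member address, so a driver connecting from the host may fail during topology discovery. Connecting with &lt;code&gt;directConnection=true&lt;/code&gt; bypasses discovery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mongodb://admin:secretpassword@localhost:27017/admin?directConnection=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;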



&lt;h3&gt;
  
  
  Three-node replica set
&lt;/h3&gt;

&lt;p&gt;For actual high availability, run three nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-primary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb-primary&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--replSet"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rs0"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--bind_ip_all"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--keyFile"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/etc/mongodb/keyfile"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-primary-data:/data/db&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./keyfile:/etc/mongodb/keyfile:ro&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-network&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

  &lt;span class="na"&gt;mongodb-secondary1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb-secondary1&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--replSet"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rs0"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--bind_ip_all"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--keyFile"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/etc/mongodb/keyfile"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-secondary1-data:/data/db&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./keyfile:/etc/mongodb/keyfile:ro&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-network&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-primary&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

  &lt;span class="na"&gt;mongodb-secondary2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongodb-secondary2&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--replSet"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rs0"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--bind_ip_all"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--keyFile"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/etc/mongodb/keyfile"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-secondary2-data:/data/db&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./keyfile:/etc/mongodb/keyfile:ro&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-network&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongodb-primary&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-primary-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-secondary1-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb-secondary2-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate the keyfile for internal authentication. Inside the official image, &lt;code&gt;mongod&lt;/code&gt; runs as UID 999, so the file must be owned by that user as well as locked down to mode 400:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl rand &lt;span class="nt"&gt;-base64&lt;/span&gt; 756 &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; keyfile
&lt;span class="nb"&gt;chmod &lt;/span&gt;400 keyfile
&lt;span class="nb"&gt;sudo chown&lt;/span&gt; 999:999 keyfile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initialize the replica set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; mongodb-primary mongosh &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s2"&gt;"
rs.initiate({
  _id: 'rs0',
  members: [
    { _id: 0, host: 'mongodb-primary:27017', priority: 2 },
    { _id: 1, host: 'mongodb-secondary1:27017', priority: 1 },
    { _id: 2, host: 'mongodb-secondary2:27017', priority: 1 }
  ]
})
"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
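
&lt;p&gt;Applications on the same Docker network should connect with a multi-host URI so the driver can discover the replica set and follow failovers (&lt;code&gt;myapp&lt;/code&gt; is an example database name; add credentials once you have created users):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mongodb://mongodb-primary:27017,mongodb-secondary1:27017,mongodb-secondary2:27017/myapp?replicaSet=rs0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;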



&lt;h2&gt;
  
  
  Security considerations
&lt;/h2&gt;

&lt;p&gt;Running databases in containers doesn't reduce security requirements. If anything, you need more attention to configuration details.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enable authentication
&lt;/h3&gt;

&lt;p&gt;Never run MongoDB without authentication in any environment beyond local development:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secretpassword&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Secure passwords with secrets
&lt;/h3&gt;

&lt;p&gt;Use environment variables from secrets management:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_USERNAME_FILE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/run/secrets/mongo_username&lt;/span&gt;
      &lt;span class="na"&gt;MONGO_INITDB_ROOT_PASSWORD_FILE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/run/secrets/mongo_password&lt;/span&gt;
    &lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongo_username&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mongo_password&lt;/span&gt;

&lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongo_username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./secrets/mongo_username.txt&lt;/span&gt;
  &lt;span class="na"&gt;mongo_password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./secrets/mongo_password.txt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Network isolation
&lt;/h3&gt;

&lt;p&gt;Don't expose database ports to the public internet. Use internal Docker networks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;frontend&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt;

  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt;

&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;internal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resource limits
&lt;/h3&gt;

&lt;p&gt;Prevent runaway containers from consuming all system resources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mongo:8&lt;/span&gt;
    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;cpus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2"&lt;/span&gt;
          &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;4G&lt;/span&gt;
        &lt;span class="na"&gt;reservations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;cpus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
          &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2G&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
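
&lt;p&gt;When capping memory, consider pinning the WiredTiger cache explicitly rather than relying on the size &lt;code&gt;mongod&lt;/code&gt; detects. Roughly half the container limit is a common starting point (a sketch, not a tuned value):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mongodb:
    image: mongo:8
    command: ["--wiredTigerCacheSizeGB", "2"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;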



&lt;h2&gt;
  
  
  Production checklist
&lt;/h2&gt;

&lt;p&gt;Before running MongoDB Docker containers in production, verify these items:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data persistence configured with volumes&lt;/li&gt;
&lt;li&gt;Authentication enabled with strong passwords&lt;/li&gt;
&lt;li&gt;Custom configuration tuned for workload&lt;/li&gt;
&lt;li&gt;Health checks enabled&lt;/li&gt;
&lt;li&gt;Automated backup strategy in place&lt;/li&gt;
&lt;li&gt;Secrets managed securely (not hardcoded)&lt;/li&gt;
&lt;li&gt;Network properly isolated&lt;/li&gt;
&lt;li&gt;Resource limits set&lt;/li&gt;
&lt;li&gt;Monitoring and alerting configured&lt;/li&gt;
&lt;li&gt;Restart policy set to &lt;code&gt;unless-stopped&lt;/code&gt; or &lt;code&gt;always&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Container image version pinned (not using &lt;code&gt;latest&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Troubleshooting common issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Container exits immediately
&lt;/h3&gt;

&lt;p&gt;Check logs for errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common causes: permission issues on mounted volumes, corrupt data directory, or invalid configuration file syntax.&lt;/p&gt;

&lt;h3&gt;
  
  
  Permission denied on bind mount
&lt;/h3&gt;

&lt;p&gt;Ensure the host directory has correct ownership:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 999:999 /path/to/data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or run MongoDB with your user ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Can't connect from host
&lt;/h3&gt;

&lt;p&gt;Verify port mapping:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker port mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check if MongoDB is listening on all interfaces. The default bind address should be &lt;code&gt;0.0.0.0&lt;/code&gt; in Docker, but verify with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongosh &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s2"&gt;"db.adminCommand({getCmdLineOpts: 1})"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Replica set won't initialize
&lt;/h3&gt;

&lt;p&gt;Ensure all nodes can resolve each other's hostnames. When using Docker Compose, services communicate by service name. If using custom hostnames, add them to &lt;code&gt;/etc/hosts&lt;/code&gt; or use Docker's &lt;code&gt;--add-host&lt;/code&gt; option.&lt;/p&gt;
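&lt;p&gt;As a sketch, Compose's &lt;code&gt;extra_hosts&lt;/code&gt; key (the Compose equivalent of &lt;code&gt;--add-host&lt;/code&gt;) can pin custom replica set hostnames to addresses; the hostnames and IPs below are hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mongo1:
    image: mongo:8
    # Hypothetical names -- use the hostnames that appear in your
    # replica set configuration (rs.initiate members).
    extra_hosts:
      - "mongo2.internal:192.0.2.12"
      - "mongo3.internal:192.0.2.13"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;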

&lt;h3&gt;
  
  
  Slow performance
&lt;/h3&gt;

&lt;p&gt;Check if WiredTiger cache is sized correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec &lt;/span&gt;mongodb mongosh &lt;span class="nt"&gt;--eval&lt;/span&gt; &lt;span class="s2"&gt;"db.serverStatus().wiredTiger.cache"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Docker Desktop on macOS and Windows, file system performance through volumes can be slow. Use named volumes instead of bind mounts for better performance.&lt;/p&gt;
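&lt;p&gt;A minimal sketch of that switch, assuming an illustrative volume name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mongodb:
    image: mongo:8
    volumes:
      # Named volume: data stays inside Docker's VM-backed storage,
      # avoiding the slow host file-sharing layer on macOS/Windows.
      - mongo-data:/data/db

volumes:
  mongo-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;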

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Running MongoDB in Docker provides consistent environments across development, testing and production. Start with simple &lt;code&gt;docker run&lt;/code&gt; commands for quick setups, then move to Docker Compose for more complex configurations. Always configure data persistence with volumes, set up proper health checks, and implement automated backups. The overhead of containerization is minimal compared to the operational benefits of reproducibility and isolation. Pin your image versions, tune your configuration for your workload, and treat container security with the same rigor as traditional deployments.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mongodb</category>
    </item>
    <item>
      <title>MariaDB Docker setup — Running MariaDB in Docker containers complete guide</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Sun, 01 Feb 2026 18:34:39 +0000</pubDate>
      <link>https://forem.com/piteradyson/mariadb-docker-setup-running-mariadb-in-docker-containers-complete-guide-17ba</link>
      <guid>https://forem.com/piteradyson/mariadb-docker-setup-running-mariadb-in-docker-containers-complete-guide-17ba</guid>
      <description>&lt;p&gt;Running MariaDB in Docker simplifies deployment, makes environments reproducible, and allows you to spin up databases in seconds. Whether you need a quick dev environment or a production-ready setup, Docker handles the complexity of installation and configuration. This guide covers everything from basic container setup to production configurations with persistence, custom settings and proper backup strategies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4n485bf59umkeyf4hbdr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4n485bf59umkeyf4hbdr.jpg" alt="MariaDB in Docker" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Why run MariaDB in Docker&lt;/h2&gt;

&lt;p&gt;Traditional MariaDB installation means dealing with package managers, version conflicts, and cleanup when things go wrong. Docker containers provide isolation and consistency that bare-metal installations can't match.&lt;/p&gt;

&lt;h3&gt;Benefits of containerized databases&lt;/h3&gt;

&lt;p&gt;Docker containers package MariaDB with all its dependencies. You get the same behavior on your laptop, CI server, and production environment. No more "works on my machine" problems.&lt;/p&gt;

&lt;p&gt;Containers start in seconds. Spinning up a fresh MariaDB instance takes about 5-10 seconds compared to minutes for traditional installation. This matters when running integration tests or switching between projects.&lt;/p&gt;

&lt;p&gt;Cleanup is trivial. Delete the container and it's gone. No leftover configuration files, no orphaned data directories, no package conflicts with other software.&lt;/p&gt;

&lt;h3&gt;When Docker makes sense&lt;/h3&gt;

&lt;p&gt;Docker works well for development environments where you need quick setup and teardown. It's also suitable for microservices architectures where each service gets its own database. Testing and CI pipelines benefit from reproducible database instances.&lt;/p&gt;

&lt;p&gt;For production, Docker adds complexity but provides consistency across environments. You trade some raw performance for operational benefits. The overhead is typically 1-3% for database workloads, which is acceptable for most applications.&lt;/p&gt;

&lt;h2&gt;Quick start with Docker run&lt;/h2&gt;

&lt;p&gt;The fastest way to get MariaDB running is a single Docker command. This works for testing and development.&lt;/p&gt;

&lt;h3&gt;Basic container setup&lt;/h3&gt;

&lt;p&gt;Start MariaDB with minimal configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This starts MariaDB 11 in the background with the root password set. The container runs until you stop it.&lt;/p&gt;

&lt;p&gt;Check if it's running:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker ps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Connect to the database:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec -it mariadb mariadb -u root -p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Environment variables for initial setup&lt;/h3&gt;

&lt;p&gt;MariaDB's Docker image supports several environment variables for first-run configuration:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Required&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_ROOT_PASSWORD&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Root user password&lt;/td&gt;&lt;td&gt;Yes (or use one of the alternatives)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_ALLOW_EMPTY_PASSWORD&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Allow empty root password&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_RANDOM_ROOT_PASSWORD&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Generate random root password&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_DATABASE&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Create database on startup&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_USER&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Create non-root user&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;MYSQL_PASSWORD&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Password for non-root user&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Create a database and user on startup:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -e MYSQL_ROOT_PASSWORD=rootpassword \
  -e MYSQL_DATABASE=myapp \
  -e MYSQL_USER=appuser \
  -e MYSQL_PASSWORD=apppassword \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Exposing ports&lt;/h3&gt;

&lt;p&gt;By default, MariaDB runs on port 3306 inside the container. Map it to your host:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -p 3306:3306 \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now you can connect from your host machine:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mariadb -h 127.0.0.1 -u root -p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Use a different host port if 3306 is already in use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -p 3307:3306 \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Data persistence with volumes&lt;/h2&gt;

&lt;p&gt;Without volumes, your data disappears when the container is removed. That's fine for throwaway test databases, but production needs persistence.&lt;/p&gt;

&lt;h3&gt;Named volumes&lt;/h3&gt;

&lt;p&gt;Docker named volumes are the simplest approach:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -v mariadb-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The volume &lt;code&gt;mariadb-data&lt;/code&gt; persists even after container deletion. List your volumes:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker volume ls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Inspect volume details:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker volume inspect mariadb-data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Bind mounts&lt;/h3&gt;

&lt;p&gt;Bind mounts map a host directory into the container. Useful when you need direct access to data files:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -v /path/to/data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Make sure the directory exists and has proper permissions. On Linux, the MySQL user inside the container needs write access:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mkdir -p /path/to/data
chown -R 999:999 /path/to/data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The UID 999 is the MySQL user inside the MariaDB container.&lt;/p&gt;

&lt;h3&gt;Volume backup&lt;/h3&gt;

&lt;p&gt;Back up a named volume by running a temporary container:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run --rm \
  -v mariadb-data:/source:ro \
  -v $(pwd):/backup \
  alpine tar czf /backup/mariadb-backup.tar.gz -C /source .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This creates a compressed archive of the entire data directory. Stop the container first if you need a consistent file-level copy of a busy database. For proper database backups, use mysqldump or mariabackup instead, which we'll cover later.&lt;/p&gt;

&lt;h2&gt;Docker Compose for MariaDB&lt;/h2&gt;

&lt;p&gt;Docker Compose makes multi-container setups manageable and configurations version-controlled.&lt;/p&gt;

&lt;h3&gt;Basic compose file&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;docker-compose.yml&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    container_name: mariadb
    environment:
      MYSQL_ROOT_PASSWORD: my-secret-pw
      MYSQL_DATABASE: myapp
      MYSQL_USER: appuser
      MYSQL_PASSWORD: apppassword
    ports:
      - "3306:3306"
    volumes:
      - mariadb-data:/var/lib/mysql
    restart: unless-stopped

volumes:
  mariadb-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Start the service:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up -d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Stop and remove:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Remove including volumes:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down -v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Application with MariaDB&lt;/h3&gt;

&lt;p&gt;A typical setup includes your application and MariaDB together:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  app:
    build: .
    environment:
      DATABASE_URL: mysql://appuser:apppassword@mariadb:3306/myapp
    depends_on:
      mariadb:
        condition: service_healthy
    ports:
      - "8080:8080"

  mariadb:
    image: mariadb:11
    container_name: mariadb
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: myapp
      MYSQL_USER: appuser
      MYSQL_PASSWORD: apppassword
    volumes:
      - mariadb-data:/var/lib/mysql
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

volumes:
  mariadb-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;depends_on&lt;/code&gt; with &lt;code&gt;condition: service_healthy&lt;/code&gt; ensures your app waits for MariaDB to be ready before starting.&lt;/p&gt;

&lt;h2&gt;Custom configuration&lt;/h2&gt;

&lt;p&gt;Default settings work for development but production often needs tuning.&lt;/p&gt;

&lt;h3&gt;Configuration file mount&lt;/h3&gt;

&lt;p&gt;Create a custom configuration file &lt;code&gt;my.cnf&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;[mysqld]
# InnoDB settings
innodb_buffer_pool_size = 1G
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT

# Connection settings
max_connections = 200
wait_timeout = 600

# Query cache (disabled in MariaDB 10.1.7+)
query_cache_type = 0

# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2

# Character set
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Mount it into the container:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name mariadb \
  -v ./my.cnf:/etc/mysql/conf.d/custom.cnf:ro \
  -v mariadb-data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  mariadb:11
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Files in &lt;code&gt;/etc/mysql/conf.d/&lt;/code&gt; are read automatically.&lt;/p&gt;

&lt;h3&gt;Docker Compose with custom config&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    container_name: mariadb
    environment:
      MYSQL_ROOT_PASSWORD: my-secret-pw
    volumes:
      - mariadb-data:/var/lib/mysql
      - ./my.cnf:/etc/mysql/conf.d/custom.cnf:ro
    ports:
      - "3306:3306"
    restart: unless-stopped

volumes:
  mariadb-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Common configuration options&lt;/h3&gt;

&lt;p&gt;Key settings to consider for production:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Production recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;innodb_buffer_pool_size&lt;/code&gt;&lt;/td&gt;&lt;td&gt;128M&lt;/td&gt;&lt;td&gt;50-70% of available RAM&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;innodb_log_file_size&lt;/code&gt;&lt;/td&gt;&lt;td&gt;48M&lt;/td&gt;&lt;td&gt;256M-1G depending on write load&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;max_connections&lt;/code&gt;&lt;/td&gt;&lt;td&gt;151&lt;/td&gt;&lt;td&gt;Based on expected concurrent connections&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;code&gt;innodb_flush_log_at_trx_commit&lt;/code&gt;&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;1 for durability, 2 for performance&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Verify your configuration is applied:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mariadb -u root -p -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Initialization scripts&lt;/h2&gt;

&lt;p&gt;The MariaDB Docker image can run SQL scripts on first startup. This is useful for schema creation and seed data.&lt;/p&gt;

&lt;h3&gt;SQL initialization&lt;/h3&gt;

&lt;p&gt;Place &lt;code&gt;.sql&lt;/code&gt;, &lt;code&gt;.sql.gz&lt;/code&gt;, or &lt;code&gt;.sql.xz&lt;/code&gt; files in &lt;code&gt;/docker-entrypoint-initdb.d/&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;init/01-schema.sql&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE TABLE IF NOT EXISTS users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS posts (
    id INT AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    title VARCHAR(255) NOT NULL,
    content TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id)
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Create &lt;code&gt;init/02-seed.sql&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;INSERT INTO users (email) VALUES ('admin@example.com');
INSERT INTO users (email) VALUES ('user@example.com');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Mount the init directory:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    environment:
      MYSQL_ROOT_PASSWORD: my-secret-pw
      MYSQL_DATABASE: myapp
    volumes:
      - mariadb-data:/var/lib/mysql
      - ./init:/docker-entrypoint-initdb.d:ro
    restart: unless-stopped

volumes:
  mariadb-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Scripts run in alphabetical order, only on first container start (when the data directory is empty).&lt;/p&gt;
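&lt;p&gt;Numeric prefixes make that order explicit. The init directory above executes as:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;init/
├── 01-schema.sql   # runs first
└── 02-seed.sql     # runs second
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;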

&lt;h3&gt;Shell script initialization&lt;/h3&gt;

&lt;p&gt;For more complex setup, use shell scripts:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;init/00-setup.sh&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/bin/bash
set -e

mariadb -u root -p"$MYSQL_ROOT_PASSWORD" &amp;lt;&amp;lt;-EOSQL
    CREATE USER IF NOT EXISTS 'readonly'@'%' IDENTIFIED BY 'readonlypassword';
    GRANT SELECT ON myapp.* TO 'readonly'@'%';
    FLUSH PRIVILEGES;
EOSQL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Make it executable:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;chmod +x init/00-setup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Networking&lt;/h2&gt;

&lt;p&gt;Docker networking controls how containers communicate with each other and the outside world.&lt;/p&gt;

&lt;h3&gt;Default bridge network&lt;/h3&gt;

&lt;p&gt;Containers on the default bridge network can communicate via IP address but not hostname. For development this usually works fine:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d --name mariadb -e MYSQL_ROOT_PASSWORD=pw mariadb:11
docker run -it --rm mariadb:11 mariadb -h $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' mariadb) -u root -p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Custom networks&lt;/h3&gt;

&lt;p&gt;Custom networks allow hostname-based communication:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker network create myapp-network

docker run -d \
  --name mariadb \
  --network myapp-network \
  -e MYSQL_ROOT_PASSWORD=pw \
  mariadb:11

docker run -it --rm \
  --network myapp-network \
  mariadb:11 \
  mariadb -h mariadb -u root -p
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second container can reach MariaDB using the hostname &lt;code&gt;mariadb&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;Compose networking&lt;/h3&gt;

&lt;p&gt;Docker Compose creates a network automatically. Services communicate by service name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  app:
    image: myapp
    environment:
      DB_HOST: mariadb # Use service name as hostname
      DB_PORT: 3306

  mariadb:
    image: mariadb:11
    environment:
      MYSQL_ROOT_PASSWORD: pw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Health checks and monitoring&lt;/h2&gt;

&lt;p&gt;Proper health checks ensure containers are actually ready to serve traffic, not just running.&lt;/p&gt;

&lt;h3&gt;Built-in health check&lt;/h3&gt;

&lt;p&gt;MariaDB's Docker image includes a health check script:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    environment:
      MYSQL_ROOT_PASSWORD: my-secret-pw
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Check health status:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker inspect --format='{{.State.Health.Status}}' mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Custom health check&lt;/h3&gt;

&lt;p&gt;For specific requirements, write custom checks. Use &lt;code&gt;CMD-SHELL&lt;/code&gt; so the password variable expands inside the container (the exec form of &lt;code&gt;CMD&lt;/code&gt; does not run a shell):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;healthcheck:
  # $$ escapes the dollar sign in Compose; the container's shell
  # then expands $MYSQL_ROOT_PASSWORD at check time.
  test: ["CMD-SHELL", "mariadb -u root -p$$MYSQL_ROOT_PASSWORD -e 'SELECT 1'"]
  interval: 10s
  timeout: 5s
  retries: 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Monitoring with logs&lt;/h3&gt;

&lt;p&gt;View container logs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Follow logs in real-time:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs -f mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Limit output to recent entries:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs --tail 100 mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Enable slow query logging in your configuration to catch performance issues:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Mount a volume for logs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;volumes:
  - mariadb-data:/var/lib/mysql
  - mariadb-logs:/var/log/mysql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Backup strategies for Docker MariaDB&lt;/h2&gt;

&lt;p&gt;Data in containers needs the same backup discipline as traditional installations. Docker adds some nuances but the fundamentals remain.&lt;/p&gt;

&lt;h3&gt;Using mysqldump in Docker&lt;/h3&gt;

&lt;p&gt;Run mysqldump inside the container:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" myapp &amp;gt; backup.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For all databases:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases &amp;gt; all_databases.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Compressed backup:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" myapp | gzip &amp;gt; backup.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Scheduled backups with cron&lt;/h3&gt;

&lt;p&gt;Create a backup script:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/bin/bash

TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups"
CONTAINER="mariadb"

docker exec $CONTAINER mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases | \
  gzip &amp;gt; "$BACKUP_DIR/mariadb_${TIMESTAMP}.sql.gz"

# Keep only last 7 days
find $BACKUP_DIR -name "mariadb_*.sql.gz" -mtime +7 -delete
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Add to crontab for daily 3 AM backups:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;0 3 * * * /usr/local/bin/mariadb-backup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Using Databasus for automated backups&lt;/h3&gt;

&lt;p&gt;Manual backup scripts work but require maintenance and lack monitoring. Databasus provides automated MariaDB backups with a web interface, scheduling, and notifications.&lt;/p&gt;

&lt;p&gt;Install Databasus on a separate server using Docker:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d \
  --name databasus \
  -p 4005:4005 \
  -v ./databasus-data:/databasus-data \
  --restart unless-stopped \
  databasus/databasus:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or with Docker Compose:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  databasus:
    image: databasus/databasus:latest
    container_name: databasus
    ports:
      - "4005:4005"
    volumes:
      - databasus-data:/databasus-data
    restart: unless-stopped

volumes:
  databasus-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Access the web interface at &lt;a href="http://localhost:4005" rel="noopener noreferrer"&gt;http://localhost:4005&lt;/a&gt;, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add your database — Click "New Database", select MariaDB, enter your MariaDB server's connection details (host, port, credentials)&lt;/li&gt;
&lt;li&gt;Select storage — Choose local storage, S3, Google Cloud Storage, or other supported destinations&lt;/li&gt;
&lt;li&gt;Select schedule — Set backup frequency: hourly, daily, weekly, or custom cron expression&lt;/li&gt;
&lt;li&gt;Click "Create backup" — Databasus handles backup execution, compression, retention and notifications&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Databasus supports multiple notification channels including Slack, Discord, Telegram and email, so you know immediately when backups succeed or fail.&lt;/p&gt;

&lt;h2&gt;Security considerations&lt;/h2&gt;

&lt;p&gt;Running databases in containers doesn't change security requirements. If anything, you need to be more careful about configuration.&lt;/p&gt;

&lt;h3&gt;Secure root password&lt;/h3&gt;

&lt;p&gt;Never use default or weak passwords. Use environment variables from secrets management:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    environment:
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/mariadb_root_password
    secrets:
      - mariadb_root_password

secrets:
  mariadb_root_password:
    file: ./secrets/mariadb_root_password.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For Docker Swarm, use proper Docker secrets:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;echo "my-secret-pw" | docker secret create mariadb_root_password -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Network isolation&lt;/h3&gt;

&lt;p&gt;Don't expose database ports to the public internet. Use internal Docker networks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  app:
    networks:
      - frontend
      - backend

  mariadb:
    networks:
      - backend # Only accessible from backend network

networks:
  frontend:
  backend:
    internal: true # No external access
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Read-only root filesystem&lt;/h3&gt;

&lt;p&gt;For extra security, run with a read-only root filesystem:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    read_only: true
    tmpfs:
      - /tmp
      - /run/mysqld
    volumes:
      - mariadb-data:/var/lib/mysql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Resource limits&lt;/h3&gt;

&lt;p&gt;Prevent runaway containers from consuming all resources:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;services:
  mariadb:
    image: mariadb:11
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "1"
          memory: 2G
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;Production checklist&lt;/h2&gt;

&lt;p&gt;Before running MariaDB Docker containers in production, verify these items:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data persistence configured with volumes&lt;/li&gt;
&lt;li&gt;Custom configuration tuned for workload&lt;/li&gt;
&lt;li&gt;Health checks enabled&lt;/li&gt;
&lt;li&gt;Automated backup strategy in place&lt;/li&gt;
&lt;li&gt;Secrets managed securely (not hardcoded)&lt;/li&gt;
&lt;li&gt;Network properly isolated&lt;/li&gt;
&lt;li&gt;Resource limits set&lt;/li&gt;
&lt;li&gt;Monitoring and alerting configured&lt;/li&gt;
&lt;li&gt;Restart policy set to &lt;code&gt;unless-stopped&lt;/code&gt; or &lt;code&gt;always&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Container image version pinned (not using &lt;code&gt;latest&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Troubleshooting common issues&lt;/h2&gt;

&lt;h3&gt;Container exits immediately&lt;/h3&gt;

&lt;p&gt;Check logs for errors:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Common causes: missing &lt;code&gt;MYSQL_ROOT_PASSWORD&lt;/code&gt;, corrupt data directory, or permission issues on mounted volumes.&lt;/p&gt;

&lt;h3&gt;Permission denied on bind mount&lt;/h3&gt;

&lt;p&gt;Ensure the host directory has correct ownership:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sudo chown -R 999:999 /path/to/data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or run MariaDB with your user ID:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run -d --user $(id -u):$(id -g) ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Can't connect from host&lt;/h3&gt;

&lt;p&gt;Verify port mapping:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker port mariadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Check if MariaDB is listening on all interfaces. By default it should, but verify with:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mariadb -u root -p -e "SHOW VARIABLES LIKE 'bind_address';"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Slow performance&lt;/h3&gt;

&lt;p&gt;Check if the InnoDB buffer pool is sized correctly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker exec mariadb mariadb -u root -p -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For Docker Desktop on macOS/Windows, file system performance through volumes can be slow. Use named volumes instead of bind mounts for better performance.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Running MariaDB in Docker provides consistent environments across development, testing and production. Start with simple &lt;code&gt;docker run&lt;/code&gt; commands for quick setups, then move to Docker Compose for more complex configurations. Always configure data persistence with volumes, set up proper health checks, and implement automated backups. The overhead of containerization is minimal compared to the operational benefits of reproducibility and isolation. Pin your image versions, tune your configuration for your workload, and treat container security with the same rigor as traditional deployments.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mariadb</category>
    </item>
    <item>
      <title>MySQL backup to cloud — Backing up MySQL databases to AWS S3 and Google Cloud</title>
      <dc:creator>Piter Adyson</dc:creator>
      <pubDate>Thu, 29 Jan 2026 08:24:10 +0000</pubDate>
      <link>https://forem.com/piteradyson/mysql-backup-to-cloud-backing-up-mysql-databases-to-aws-s3-and-google-cloud-4fhc</link>
      <guid>https://forem.com/piteradyson/mysql-backup-to-cloud-backing-up-mysql-databases-to-aws-s3-and-google-cloud-4fhc</guid>
      <description>&lt;p&gt;Storing MySQL backups on the same server as your database is asking for trouble. If the server fails, you lose both your data and your backups. Cloud storage solves this by keeping backups offsite, automatically replicated across multiple data centers. This guide covers how to back up MySQL databases to AWS S3 and Google Cloud Storage, from manual uploads to fully automated pipelines.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsv8u3wrhgvtg4447rs1w.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsv8u3wrhgvtg4447rs1w.jpg" alt="MySQL cloud backup" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why cloud storage for MySQL backups
&lt;/h2&gt;

&lt;p&gt;Local backups work until they don't. A disk failure, ransomware attack, or datacenter issue can wipe out everything on one machine. Cloud storage provides geographic redundancy and durability that local storage can't match.&lt;/p&gt;

&lt;h3&gt;
  
  
  Durability and availability
&lt;/h3&gt;

&lt;p&gt;AWS S3 offers 99.999999999% (11 nines) durability. Google Cloud Storage provides similar guarantees. That means if you store 10 million objects, you'd statistically lose one every 10,000 years. Compare that to a single hard drive with 1-3% annual failure rate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost efficiency
&lt;/h3&gt;

&lt;p&gt;Cloud storage costs less than maintaining redundant local infrastructure. A terabyte on S3 Standard costs about $23/month. Infrequent access tiers drop to $12.50/month. Glacier deep archive goes as low as $0.99/month. For backups you rarely access, cold storage is remarkably cheap.&lt;/p&gt;

&lt;h3&gt;
  
  
  Operational simplicity
&lt;/h3&gt;

&lt;p&gt;No hardware to manage, no capacity planning, no failed disks to replace. Upload your backups and the cloud provider handles replication, durability and availability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating MySQL backups with mysqldump
&lt;/h2&gt;

&lt;p&gt;Before uploading to cloud storage, you need a backup file. &lt;code&gt;mysqldump&lt;/code&gt; is the standard tool for MySQL logical backups.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic mysqldump usage
&lt;/h3&gt;

&lt;p&gt;Create a full database backup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mydb_backup.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For InnoDB tables, the &lt;code&gt;--single-transaction&lt;/code&gt; flag takes a consistent snapshot without locking them; the guarantee does not extend to non-transactional engines such as MyISAM.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compressed backups
&lt;/h3&gt;

&lt;p&gt;Compress the backup to reduce upload time and storage costs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; mydb | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mydb_backup.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compression typically reduces MySQL dump files by 70-90%, depending on data content. A 10GB dump might compress to 1-2GB.&lt;/p&gt;
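You can get a feel for the ratio on synthetic data; real-world ratios depend heavily on how repetitive your rows are (the table and file names below are made up for the demonstration):

```shell
# Generate a few MB of repetitive INSERT statements and compare sizes after gzip.
seq 1 50000 | sed 's/^/INSERT INTO orders VALUES (/; s/$/, "pending", "2024-01-01");/' > sample.sql
gzip -c sample.sql > sample.sql.gz
orig=$(wc -c < sample.sql)
comp=$(wc -c < sample.sql.gz)
echo "original: ${orig} bytes, compressed: ${comp} bytes"
rm sample.sql sample.sql.gz
```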

&lt;h3&gt;
  
  
  All databases backup
&lt;/h3&gt;

&lt;p&gt;Back up all databases on the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--all-databases&lt;/span&gt; | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; all_databases.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Backup with timestamp
&lt;/h3&gt;

&lt;p&gt;Include timestamps in filenames for easier management:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; mydb | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mydb_&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.sql.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Uploading to AWS S3
&lt;/h2&gt;

&lt;p&gt;AWS S3 is the most widely used object storage service. Getting backups there requires the AWS CLI and proper credentials.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up AWS CLI
&lt;/h3&gt;

&lt;p&gt;Install the AWS CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux/macOS&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"awscliv2.zip"&lt;/span&gt;
unzip awscliv2.zip
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./aws/install

&lt;span class="c"&gt;# Verify installation&lt;/span&gt;
aws &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure credentials:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws configure
&lt;span class="c"&gt;# Enter your AWS Access Key ID&lt;/span&gt;
&lt;span class="c"&gt;# Enter your AWS Secret Access Key&lt;/span&gt;
&lt;span class="c"&gt;# Enter default region (e.g., us-east-1)&lt;/span&gt;
&lt;span class="c"&gt;# Enter output format (json)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating an S3 bucket
&lt;/h3&gt;

&lt;p&gt;Create a bucket for your backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 mb s3://my-mysql-backups &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable versioning to protect against accidental deletions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api put-bucket-versioning &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--bucket&lt;/span&gt; my-mysql-backups &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--versioning-configuration&lt;/span&gt; &lt;span class="nv"&gt;Status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Enabled
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Uploading backup files
&lt;/h3&gt;

&lt;p&gt;Upload a single backup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;mydb_backup.sql.gz s3://my-mysql-backups/daily/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upload with metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;mydb_backup.sql.gz s3://my-mysql-backups/daily/ &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--metadata&lt;/span&gt; &lt;span class="s2"&gt;"database=mydb,created=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-Iseconds&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Multipart uploads for large files
&lt;/h3&gt;

&lt;p&gt;S3 caps single PUT uploads at 5GB, so larger backups must go through multipart upload, which also lets failed parts be retried individually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;large_backup.sql.gz s3://my-mysql-backups/ &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--expected-size&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt;%z large_backup.sql.gz&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AWS CLI switches to multipart uploads automatically for large files. The &lt;code&gt;--expected-size&lt;/code&gt; hint matters mainly when streaming from stdin, where the CLI cannot determine the size in advance.&lt;/p&gt;
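One portability note: `stat -f%z` is the BSD/macOS syntax and fails on GNU/Linux, where the equivalent is `stat -c%s`. A form that works on both (the filename is illustrative):

```shell
# GNU stat uses -c%s, BSD/macOS stat uses -f%z; try GNU first, fall back to BSD.
size=$(stat -c%s large_backup.sql.gz 2>/dev/null || stat -f%z large_backup.sql.gz)
echo "$size"
```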

&lt;h3&gt;
  
  
  Combined backup and upload script
&lt;/h3&gt;

&lt;p&gt;Backup and upload in one operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BUCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"s3://my-mysql-backups"&lt;/span&gt;
&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mydb"&lt;/span&gt;

&lt;span class="c"&gt;# Backup and compress&lt;/span&gt;
mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MYSQL_PASSWORD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; &lt;span class="nv"&gt;$DB_NAME&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/daily/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup uploaded to &lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/daily/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-&lt;/code&gt; passed to &lt;code&gt;aws s3 cp&lt;/code&gt; tells it to read from stdin, so the dump is piped straight from mysqldump to S3 without first writing a local file. This saves both disk space and time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Uploading to Google Cloud Storage
&lt;/h2&gt;

&lt;p&gt;Google Cloud Storage (GCS) offers similar capabilities with different tooling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up gsutil
&lt;/h3&gt;

&lt;p&gt;Install the Google Cloud SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux/macOS&lt;/span&gt;
curl https://sdk.cloud.google.com | bash
&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; &lt;span class="nv"&gt;$SHELL&lt;/span&gt;
gcloud init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Authenticate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud auth login
gcloud config &lt;span class="nb"&gt;set &lt;/span&gt;project your-project-id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating a GCS bucket
&lt;/h3&gt;

&lt;p&gt;Create a bucket:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil mb &lt;span class="nt"&gt;-l&lt;/span&gt; us-central1 gs://my-mysql-backups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable versioning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil versioning &lt;span class="nb"&gt;set &lt;/span&gt;on gs://my-mysql-backups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Uploading backup files
&lt;/h3&gt;

&lt;p&gt;Upload a backup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil &lt;span class="nb"&gt;cp &lt;/span&gt;mydb_backup.sql.gz gs://my-mysql-backups/daily/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upload with parallel composite uploads for large files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil &lt;span class="nt"&gt;-o&lt;/span&gt; GSUtil:parallel_composite_upload_threshold&lt;span class="o"&gt;=&lt;/span&gt;100M &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;cp &lt;/span&gt;large_backup.sql.gz gs://my-mysql-backups/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Streaming upload to GCS
&lt;/h3&gt;

&lt;p&gt;Stream directly from mysqldump to GCS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BUCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"gs://my-mysql-backups"&lt;/span&gt;
&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mydb"&lt;/span&gt;

mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MYSQL_PASSWORD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; &lt;span class="nv"&gt;$DB_NAME&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    gsutil &lt;span class="nb"&gt;cp&lt;/span&gt; - &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/daily/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Storage classes and cost optimization
&lt;/h2&gt;

&lt;p&gt;Both AWS and GCS offer multiple storage classes at different price points. Choosing the right class can significantly reduce costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS S3 storage classes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Storage class&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Price per GB/month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Standard&lt;/td&gt;
&lt;td&gt;Frequently accessed backups&lt;/td&gt;
&lt;td&gt;$0.023&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Standard-IA&lt;/td&gt;
&lt;td&gt;Backups accessed monthly&lt;/td&gt;
&lt;td&gt;$0.0125&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 One Zone-IA&lt;/td&gt;
&lt;td&gt;Non-critical backups&lt;/td&gt;
&lt;td&gt;$0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Glacier Instant Retrieval&lt;/td&gt;
&lt;td&gt;Archive with quick access&lt;/td&gt;
&lt;td&gt;$0.004&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Glacier Deep Archive&lt;/td&gt;
&lt;td&gt;Long-term archive&lt;/td&gt;
&lt;td&gt;$0.00099&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For most backup use cases, S3 Standard-IA provides a good balance: you get immediate access when needed but pay less for storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google Cloud Storage classes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Storage class&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Price per GB/month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;td&gt;Frequently accessed&lt;/td&gt;
&lt;td&gt;$0.020&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Nearline&lt;/td&gt;
&lt;td&gt;Accessed once per month&lt;/td&gt;
&lt;td&gt;$0.010&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coldline&lt;/td&gt;
&lt;td&gt;Accessed once per quarter&lt;/td&gt;
&lt;td&gt;$0.004&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Archive&lt;/td&gt;
&lt;td&gt;Accessed once per year&lt;/td&gt;
&lt;td&gt;$0.0012&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Nearline works well for regular backup retention. Archive suits compliance requirements where you keep backups for years but rarely restore.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting storage class on upload
&lt;/h3&gt;

&lt;p&gt;Upload directly to a specific storage class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# AWS S3&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;backup.sql.gz s3://my-mysql-backups/ &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--storage-class&lt;/span&gt; STANDARD_IA

&lt;span class="c"&gt;# Google Cloud Storage&lt;/span&gt;
gsutil &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"GSUtil:default_storage_class=NEARLINE"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;cp &lt;/span&gt;backup.sql.gz gs://my-mysql-backups/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Automating backups with cron
&lt;/h2&gt;

&lt;p&gt;Manual backups get forgotten. Cron automation ensures consistent execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic cron backup script
&lt;/h3&gt;

&lt;p&gt;Create a backup script at &lt;code&gt;/usr/local/bin/mysql-backup-to-s3.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;BUCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"s3://my-mysql-backups"&lt;/span&gt;
&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="nv"&gt;LOG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/var/log/mysql-backup.log"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;: Starting backup of &lt;/span&gt;&lt;span class="nv"&gt;$DB_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;

mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; backup_user &lt;span class="nt"&gt;-p&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MYSQL_BACKUP_PASSWORD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--routines&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--triggers&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--databases&lt;/span&gt; &lt;span class="nv"&gt;$DB_NAME&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET&lt;/span&gt;&lt;span class="s2"&gt;/daily/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--storage-class&lt;/span&gt; STANDARD_IA

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;: Backup completed"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make it executable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x /usr/local/bin/mysql-backup-to-s3.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cron schedule
&lt;/h3&gt;

&lt;p&gt;Add to crontab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crontab &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run daily at 3 AM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 3 * * * MYSQL_BACKUP_PASSWORD='yourpassword' /usr/local/bin/mysql-backup-to-s3.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For hourly backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 * * * * MYSQL_BACKUP_PASSWORD='yourpassword' /usr/local/bin/mysql-backup-to-s3.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Handling credentials securely
&lt;/h3&gt;

&lt;p&gt;Avoid putting passwords in crontab. Use a MySQL option file instead:&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;~/.my.cnf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mysqldump]&lt;/span&gt;
&lt;span class="py"&gt;user&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;backup_user&lt;/span&gt;
&lt;span class="py"&gt;password&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;yourpassword&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restrict permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;600 ~/.my.cnf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then remove the password from the mysqldump command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mysqldump &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; &lt;span class="nv"&gt;$DB_NAME&lt;/span&gt; | &lt;span class="nb"&gt;gzip&lt;/span&gt; | aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Lifecycle policies for automatic cleanup
&lt;/h2&gt;

&lt;p&gt;Without cleanup, backup storage grows forever. Cloud lifecycle policies automate deletion of old backups.&lt;/p&gt;

&lt;h3&gt;
  
  
  S3 lifecycle policy
&lt;/h3&gt;

&lt;p&gt;Save a lifecycle policy as &lt;code&gt;lifecycle.json&lt;/code&gt;. This one deletes daily backups after 30 days, and transitions monthly backups to Glacier after 7 days before expiring them after a year:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Delete old MySQL backups"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Filter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"daily/"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Expiration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Move to Glacier after 7 days"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Filter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly/"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Transitions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"StorageClass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GLACIER"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Expiration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;365&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api put-bucket-lifecycle-configuration &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--bucket&lt;/span&gt; my-mysql-backups &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--lifecycle-configuration&lt;/span&gt; file://lifecycle.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
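If `put-bucket-lifecycle-configuration` rejects the file, a JSON syntax error is the usual cause; a quick local validity check before applying:

```shell
# Validate lifecycle.json before applying it to the bucket.
python3 -m json.tool lifecycle.json > /dev/null && echo "lifecycle.json is valid JSON"
```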



&lt;h3&gt;
  
  
  GCS lifecycle policy
&lt;/h3&gt;

&lt;p&gt;Create a similar lifecycle configuration for GCS and save it as &lt;code&gt;lifecycle.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"lifecycle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Delete"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"matchesPrefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"daily/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SetStorageClass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"storageClass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"COLDLINE"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"matchesPrefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"monthly/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the policy with gsutil; you can confirm it took effect afterwards with &lt;code&gt;gsutil lifecycle get gs://my-mysql-backups&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil lifecycle &lt;span class="nb"&gt;set &lt;/span&gt;lifecycle.json gs://my-mysql-backups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using Databasus for automated cloud backups
&lt;/h2&gt;

&lt;p&gt;Manual scripts work, but they need ongoing maintenance: cron jobs fail silently, credential management gets complicated, and monitoring requires extra setup. Databasus (a dedicated tool for &lt;a href="https://databasus.com/mysql-backup" rel="noopener noreferrer"&gt;MySQL backup&lt;/a&gt;) handles all of this with a web interface for configuration and monitoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing Databasus
&lt;/h3&gt;

&lt;p&gt;Using Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; databasus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 4005:4005 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ./databasus-data:/databasus-data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; unless-stopped &lt;span class="se"&gt;\&lt;/span&gt;
  databasus/databasus:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with Docker Compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;databasus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;databasus/databasus:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4005:4005"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./databasus-data:/databasus-data&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuring MySQL backup to S3 or GCS
&lt;/h3&gt;

&lt;p&gt;Access the web interface at &lt;code&gt;http://localhost:4005&lt;/code&gt; and create your account, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add your database&lt;/strong&gt; — Click "New Database", select MySQL, and enter your connection details (host, port, username, password, database name)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select storage&lt;/strong&gt; — Choose AWS S3 or Google Cloud Storage. Enter your bucket name and credentials. Databasus supports both IAM roles and access keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select schedule&lt;/strong&gt; — Set the backup frequency: hourly, daily, weekly, or a custom cron expression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click "Create backup"&lt;/strong&gt; — Databasus handles backup execution, compression, upload, retention and notifications automatically&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Databasus also provides email, Slack, Telegram and Discord notifications for backup success and failure, eliminating the need for separate monitoring scripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Restoring from cloud backups
&lt;/h2&gt;

&lt;p&gt;Backups are worthless if you can't restore them. Practice restoration before you need it in an emergency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Downloading from S3
&lt;/h3&gt;

&lt;p&gt;List available backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;ls &lt;/span&gt;s3://my-mysql-backups/daily/ &lt;span class="nt"&gt;--human-readable&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Download a specific backup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://my-mysql-backups/daily/mydb_20240115_030000.sql.gz ./
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
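&lt;p&gt;Before piping a downloaded archive into mysql, it's worth checking that it isn't truncated or corrupt; &lt;code&gt;gunzip -t&lt;/code&gt; does this without writing anything to disk. A small self-contained sketch (the stand-in file here substitutes for the archive you actually pulled from S3 or GCS):&lt;/p&gt;

```shell
# Stand-in for the downloaded backup; in practice, skip this line and
# run the check on the real mydb_20240115_030000.sql.gz from S3/GCS.
printf 'CREATE TABLE t (id INT);\n' | gzip > mydb_20240115_030000.sql.gz

# -t tests archive integrity without decompressing to disk
if gunzip -t mydb_20240115_030000.sql.gz; then
  echo "archive OK"
else
  echo "archive corrupt, re-download before restoring" >&2
fi
```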



&lt;h3&gt;
  
  
  Downloading from GCS
&lt;/h3&gt;

&lt;p&gt;List backups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; gs://my-mysql-backups/daily/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Download:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gsutil &lt;span class="nb"&gt;cp &lt;/span&gt;gs://my-mysql-backups/daily/mydb_20240115_030000.sql.gz ./
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Restoring the backup
&lt;/h3&gt;

&lt;p&gt;Decompress and restore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;gunzip&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; mydb_20240115_030000.sql.gz | mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or in one command directly from S3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://my-mysql-backups/daily/mydb_20240115_030000.sql.gz - | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gunzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing restores regularly
&lt;/h3&gt;

&lt;p&gt;Create a test restore script that runs monthly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Get the latest backup&lt;/span&gt;
&lt;span class="nv"&gt;LATEST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3 &lt;span class="nb"&gt;ls &lt;/span&gt;s3://my-mysql-backups/daily/ | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 1 | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $4}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Create test database&lt;/span&gt;
mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"CREATE DATABASE restore_test;"&lt;/span&gt;

&lt;span class="c"&gt;# Restore&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"s3://my-mysql-backups/daily/&lt;/span&gt;&lt;span class="nv"&gt;$LATEST&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; - | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gunzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; restore_test

&lt;span class="c"&gt;# Verify (check row count on a known table)&lt;/span&gt;
&lt;span class="nv"&gt;ROWS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;-N&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"SELECT COUNT(*) FROM restore_test.users;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Restored &lt;/span&gt;&lt;span class="nv"&gt;$ROWS&lt;/span&gt;&lt;span class="s2"&gt; rows from users table"&lt;/span&gt;

&lt;span class="c"&gt;# Cleanup&lt;/span&gt;
mysql &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"DROP DATABASE restore_test;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
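&lt;p&gt;One caveat: every &lt;code&gt;mysql&lt;/code&gt; invocation above uses &lt;code&gt;-p&lt;/code&gt;, which prompts for a password interactively, so the script will hang when run unattended from cron. A common fix is a client option file; a sketch (written to a local &lt;code&gt;my.cnf&lt;/code&gt; here for demonstration; on the backup host it would normally live at &lt;code&gt;~/.my.cnf&lt;/code&gt;, and the credentials are placeholders):&lt;/p&gt;

```shell
# Create a client option file so mysql/mysqldump run without prompting.
# backup_user / change-me are placeholders for your real credentials.
cat > my.cnf <<'EOF'
[client]
user=backup_user
password=change-me
EOF

# Keep it readable only by the owner, since it contains a password.
chmod 600 my.cnf

# The script's calls then become e.g.:
#   mysql --defaults-extra-file=my.cnf -e "CREATE DATABASE restore_test;"
echo "wrote my.cnf with permissions $(stat -c %a my.cnf)"
```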



&lt;h2&gt;
  
  
  Security considerations
&lt;/h2&gt;

&lt;p&gt;Cloud backups require careful security configuration to avoid exposing your data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Encryption at rest
&lt;/h3&gt;

&lt;p&gt;Both S3 and GCS encrypt data at rest by default. For additional control, request an encryption mode explicitly, including server-side encryption with customer-managed KMS keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# S3 with SSE-S3 (Amazon-managed keys)&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;backup.sql.gz s3://my-mysql-backups/ &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--sse&lt;/span&gt; AES256

&lt;span class="c"&gt;# S3 with SSE-KMS (customer-managed keys)&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;backup.sql.gz s3://my-mysql-backups/ &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--sse&lt;/span&gt; aws:kms &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--sse-kms-key-id&lt;/span&gt; &lt;span class="nb"&gt;alias&lt;/span&gt;/my-backup-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Client-side encryption
&lt;/h3&gt;

&lt;p&gt;For the strongest guarantee, encrypt before uploading so the plaintext never reaches the cloud provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Encrypt with gpg&lt;/span&gt;
mysqldump &lt;span class="nt"&gt;-u&lt;/span&gt; root &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; &lt;span class="nt"&gt;--databases&lt;/span&gt; mydb | &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;gzip&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
    gpg &lt;span class="nt"&gt;--symmetric&lt;/span&gt; &lt;span class="nt"&gt;--cipher-algo&lt;/span&gt; AES256 &lt;span class="nt"&gt;-o&lt;/span&gt; mydb_backup.sql.gz.gpg

&lt;span class="c"&gt;# Upload encrypted file&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;mydb_backup.sql.gz.gpg s3://my-mysql-backups/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
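&lt;p&gt;The flip side is the restore path: the &lt;code&gt;.gpg&lt;/code&gt; file has to be decrypted before it can be piped into mysql (&lt;code&gt;gpg --decrypt mydb_backup.sql.gz.gpg | gunzip | mysql -u root -p mydb&lt;/code&gt;). A self-contained round-trip sketch using a batch passphrase; in real use, gpg would prompt for the passphrase rather than take it on the command line:&lt;/p&gt;

```shell
# Encrypt a tiny stand-in dump with a batch passphrase, then decrypt it back.
# "demo" stands in for your real passphrase; never pass real ones on the CLI.
printf 'CREATE TABLE t (id INT);\n' | gzip | \
    gpg --batch --yes --pinentry-mode loopback --passphrase demo \
        --symmetric --cipher-algo AES256 -o demo.sql.gz.gpg

# Decrypt and decompress; in practice, pipe this into mysql to restore.
gpg --batch --quiet --pinentry-mode loopback --passphrase demo \
    --decrypt demo.sql.gz.gpg | gunzip
```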



&lt;h3&gt;
  
  
  IAM policies
&lt;/h3&gt;

&lt;p&gt;Restrict backup credentials to minimum required permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"s3:PutObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3:ListBucket"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::my-mysql-backups"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::my-mysql-backups/*"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Don't use root credentials or overly permissive policies for backup operations. Note that the policy above deliberately omits &lt;code&gt;s3:DeleteObject&lt;/code&gt;, so leaked backup credentials can't be used to wipe your backup history. On GCS, the equivalent is granting the backup service account only &lt;code&gt;roles/storage.objectCreator&lt;/code&gt; and &lt;code&gt;roles/storage.objectViewer&lt;/code&gt; on the bucket.&lt;/p&gt;
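&lt;p&gt;A hand-edited policy file with a stray comma produces a confusing error from the AWS CLI, so it's worth validating the JSON locally before attaching it. A sketch using Python's built-in validator (the file name is an assumption):&lt;/p&gt;

```shell
# Write out the backup policy from above, then validate it locally
# before attaching it with the AWS CLI.
cat > backup-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-mysql-backups",
        "arn:aws:s3:::my-mysql-backups/*"
      ]
    }
  ]
}
EOF

# json.tool exits nonzero on any syntax error
python3 -m json.tool backup-policy.json > /dev/null && echo "policy JSON is valid"
```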

&lt;h3&gt;
  
  
  Network security
&lt;/h3&gt;

&lt;p&gt;Transfer backups over encrypted connections only. Both AWS CLI and gsutil use HTTPS by default. If you're backing up from within AWS or GCP, use VPC endpoints to keep traffic off the public internet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create S3 VPC endpoint (via AWS Console or CLI)&lt;/span&gt;
aws ec2 create-vpc-endpoint &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; vpc-abc123 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--service-name&lt;/span&gt; com.amazonaws.us-east-1.s3 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--route-table-ids&lt;/span&gt; rtb-abc123
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Cloud storage transforms MySQL backups from local files vulnerable to single points of failure into durable, geographically distributed archives. Start with basic scripts using mysqldump and the AWS CLI or gsutil for simple setups. Add cron scheduling for automation and lifecycle policies for retention management. For production systems, consider dedicated backup tools like Databasus that handle scheduling, monitoring and notifications in one package. Whatever approach you choose, test restores regularly. Backups that can't be restored provide no protection.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mysql</category>
    </item>
  </channel>
</rss>
