<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: YukiOnodera</title>
    <description>The latest articles on Forem by YukiOnodera (@yukionodera).</description>
    <link>https://forem.com/yukionodera</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1233037%2F06e91f94-7604-4892-b9fe-ce75dd21243b.jpeg</url>
      <title>Forem: YukiOnodera</title>
      <link>https://forem.com/yukionodera</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yukionodera"/>
    <language>en</language>
    <item>
      <title>Instrumenting Python Apps with Datadog APM: A Docker Setup Guide</title>
      <dc:creator>YukiOnodera</dc:creator>
      <pubDate>Fri, 01 May 2026 13:51:47 +0000</pubDate>
      <link>https://forem.com/yukionodera/instrumenting-python-apps-with-datadog-apm-a-docker-setup-guide-3oo2</link>
      <guid>https://forem.com/yukionodera/instrumenting-python-apps-with-datadog-apm-a-docker-setup-guide-3oo2</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;In this post, we'll walk through the components required to instrument a Python application with Datadog APM, and how to configure them in a Docker environment.&lt;/p&gt;

&lt;p&gt;A common shorthand is "just install the Datadog Agent and the tracing library and you're good to go" — and while that's essentially correct, understanding what each component is actually doing under the hood makes troubleshooting dramatically easier when things go wrong. If you've ever stared at an empty APM dashboard wondering why your traces aren't showing up, this guide should help you build a clearer mental model of how the pieces fit together.&lt;/p&gt;

&lt;h1&gt;
  
  
  The Building Blocks of Datadog APM
&lt;/h1&gt;

&lt;p&gt;To instrument an app with Datadog APM, you need to set up two things: the &lt;strong&gt;Agent side&lt;/strong&gt; and the &lt;strong&gt;library side&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent Side: The Trace Agent
&lt;/h2&gt;

&lt;p&gt;When you enable APM in the Datadog Agent, an internal component called the &lt;strong&gt;Trace Agent&lt;/strong&gt; starts up.&lt;/p&gt;

&lt;p&gt;The Trace Agent is responsible for receiving trace data sent from your application and forwarding it to Datadog's backend. By default, it listens for traces on port &lt;strong&gt;8126/tcp&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In a Docker environment, you enable APM with the following environment variables:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DD_APM_ENABLED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enables APM (the Trace Agent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DD_APM_NON_LOCAL_TRAFFIC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allows trace submissions from other containers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;If you don't set &lt;code&gt;DD_APM_NON_LOCAL_TRAFFIC=true&lt;/code&gt;, traces from other containers on the same Docker network won't be accepted — watch out for this.&lt;/p&gt;
&lt;/blockquote&gt;
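
&lt;p&gt;Once the Agent container is up, it's worth a quick connectivity check before you start debugging the library side. A minimal sketch (the hostname &lt;code&gt;datadog-agent&lt;/code&gt; is an assumption; substitute your Agent container's name or IP):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# check_agent.py - confirm the Trace Agent port is reachable from this container.
# "datadog-agent" is a hypothetical Compose service name; adjust as needed.
import socket

HOST = "datadog-agent"
PORT = 8126  # the Trace Agent's default trace intake port

try:
    with socket.create_connection((HOST, PORT), timeout=3):
        print(f"Trace Agent reachable at {HOST}:{PORT}")
except OSError as exc:
    print(f"Cannot reach {HOST}:{PORT}: {exc}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;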

&lt;h2&gt;
  
  
  Library Side: ddtrace
&lt;/h2&gt;

&lt;p&gt;On the Python application side, you'll use a library called &lt;strong&gt;ddtrace&lt;/strong&gt;. It's Datadog's official Python APM client and provides automatic instrumentation for over 80 libraries, including Flask, Django, and SQLAlchemy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ddtrace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  How Auto Instrumentation Works: Monkey Patching
&lt;/h1&gt;

&lt;p&gt;ddtrace's auto instrumentation works through a technique called &lt;strong&gt;Monkey Patching&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monkey Patching&lt;/strong&gt; is a method of dynamically replacing existing classes or functions at runtime. ddtrace uses this approach to inject trace instrumentation into supported libraries without requiring any changes to your application code.&lt;/p&gt;
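
&lt;p&gt;To make that concrete, here's a toy sketch of the technique itself (illustration only, not ddtrace's actual internals):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Toy monkey patching: replace requests.get at runtime so every caller
# picks up timing "instrumentation" without changing their code.
import time

import requests

_original_get = requests.get  # keep a reference to the real function

def _traced_get(url, **kwargs):
    start = time.time()
    try:
        return _original_get(url, **kwargs)
    finally:
        print(f"GET {url} took {time.time() - start:.3f}s")

requests.get = _traced_get  # from here on, requests.get is the wrapped version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;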

&lt;p&gt;There are two ways to enable it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ddtrace-run Command (Recommended)
&lt;/h2&gt;

&lt;p&gt;Just prepend &lt;code&gt;ddtrace-run&lt;/code&gt; to your application's startup command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ddtrace-run python app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a Dockerfile, modify the CMD like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CMD ["ddtrace-run", "python", "app.py"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  import ddtrace.auto
&lt;/h2&gt;

&lt;p&gt;Alternatively, you can import it at the very top of your entry point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ddtrace.auto&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;
&lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ Using &lt;code&gt;ddtrace-run&lt;/code&gt; and &lt;code&gt;import ddtrace.auto&lt;/code&gt; at the same time will cause monkey patching to be applied twice, so use only one of them.&lt;/p&gt;
&lt;/blockquote&gt;
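
&lt;p&gt;If you need spans beyond what auto instrumentation creates, the tracer also lets you open custom spans by hand. A minimal sketch (the function and tag names are made up for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from ddtrace import tracer

def process_order(order_id):
    # Open a custom span; it is finished and queued for the Trace Agent
    # when the "with" block exits.
    with tracer.trace("order.process", resource=str(order_id)) as span:
        span.set_tag("order.id", order_id)
        # ... business logic ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;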

&lt;h1&gt;
  
  
  Configuration in a Docker Environment
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Agent Container
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:latest
    environment:
      DD_API_KEY: ${DD_API_KEY}
      DD_SITE: datadoghq.com
      DD_APM_ENABLED: "true"
      DD_APM_NON_LOCAL_TRAFFIC: "true"
    ports:
      - "8126:8126/tcp"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Application Container
&lt;/h2&gt;

&lt;p&gt;On the application container side, point the tracer at the Agent container via environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services:
  myapp:
    build: .
    environment:
      DD_AGENT_HOST: datadog-agent
      DD_TRACE_AGENT_PORT: "8126"
      DD_SERVICE: my-python-app
      DD_ENV: production
      DD_VERSION: 1.0.0
    depends_on:
      - datadog-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use the Docker Compose service name directly as the value of &lt;code&gt;DD_AGENT_HOST&lt;/code&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Unified Service Tagging
&lt;/h1&gt;

&lt;p&gt;The three variables &lt;code&gt;DD_SERVICE&lt;/code&gt; / &lt;code&gt;DD_ENV&lt;/code&gt; / &lt;code&gt;DD_VERSION&lt;/code&gt; are part of a mechanism called &lt;strong&gt;Unified Service Tagging&lt;/strong&gt; — standard tags that link telemetry across all of Datadog.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment Variable&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DD_SERVICE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Service name&lt;/td&gt;
&lt;td&gt;&lt;code&gt;my-python-app&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DD_ENV&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Environment name&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;production&lt;/code&gt;, &lt;code&gt;staging&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DD_VERSION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Version&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1.0.0&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By setting these three, you'll be able to navigate from APM trace views to related logs and metrics with a single click. I strongly recommend configuring them.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Datadog APM instrumentation works through the cooperation of two pieces: the Trace Agent on the Agent side, and ddtrace (with its monkey patching) on the library side. In a Docker environment, &lt;code&gt;DD_APM_NON_LOCAL_TRAFFIC=true&lt;/code&gt; and &lt;code&gt;DD_AGENT_HOST&lt;/code&gt; are particularly common gotchas, so keep them in mind during setup.&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://docs.datadoghq.com/containers/docker/apm/" rel="noopener noreferrer"&gt;Tracing Docker Applications | Datadog&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.datadoghq.com/tracing/trace_collection/automatic_instrumentation/dd_libraries/python/" rel="noopener noreferrer"&gt;Tracing Python Applications | Datadog&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/" rel="noopener noreferrer"&gt;Unified Service Tagging | Datadog&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Subscribe for more Datadog &amp;amp; Observability deep-dives.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>datadog</category>
      <category>apm</category>
      <category>python</category>
      <category>docker</category>
    </item>
    <item>
      <title>Inside Datadog's Log Pipeline: How "Logging without Limits" Actually Works</title>
      <dc:creator>YukiOnodera</dc:creator>
      <pubDate>Mon, 27 Apr 2026 09:54:07 +0000</pubDate>
      <link>https://forem.com/yukionodera/inside-datadogs-log-pipeline-how-logging-without-limits-actually-works-50od</link>
      <guid>https://forem.com/yukionodera/inside-datadogs-log-pipeline-how-logging-without-limits-actually-works-50od</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;In this post, I want to walk through how Datadog processes logs internally — from raw ingestion all the way to indexed, queryable data.&lt;/p&gt;

&lt;p&gt;If you've spent time clicking around Datadog's log management UI, you've probably noticed something satisfying: raw, messy log lines gradually get enriched and structured as they flow through the pipeline. It's a really elegant design, and once you understand the order in which things happen, it becomes clear why Datadog can offer both cost control and deep observability at the same time. Let me break it down.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This article focuses on the main steps I studied this time around. In practice, there are more detailed processes — sensitive data scanning, Error Tracking, Live Tail, and so on. For the full picture, please refer to the &lt;a href="https://docs.datadoghq.com/logs/" rel="noopener noreferrer"&gt;official documentation | Datadog&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  The Overall Log Processing Flow
&lt;/h1&gt;

&lt;p&gt;Datadog's log management is built around a design philosophy called &lt;strong&gt;Logging without Limits™&lt;/strong&gt;, which lets you independently control "ingestion," "storage," and "analysis."&lt;/p&gt;

&lt;p&gt;The high-level flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ingest
  ↓
Pipelines (Parse &amp;amp; Enrich)
  ↓
Generate Metrics
  ↓
Exclusion Filters
  ↓
Index
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Walking Through Each Step
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Ingest
&lt;/h2&gt;

&lt;p&gt;First, logs are collected into Datadog from a wide variety of sources.&lt;/p&gt;

&lt;p&gt;Datadog offers &lt;strong&gt;over 500 log integrations&lt;/strong&gt;, covering AWS, GCP, Kubernetes, and all kinds of middleware. I was honestly surprised by just how many there are.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pipelines (Parse &amp;amp; Enrich)
&lt;/h2&gt;

&lt;p&gt;Once raw logs are ingested, they pass through &lt;strong&gt;pipelines&lt;/strong&gt; that structure and enrich them (adding extra information).&lt;/p&gt;

&lt;p&gt;Using processors like the &lt;strong&gt;Grok parser&lt;/strong&gt;, unstructured text logs get broken down into fields, and additional attributes can be attached.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Before&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;parsing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(raw&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;log)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;2024-04-27&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ERROR&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;app&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Connection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;timeout:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;host=db&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;duration=&lt;/span&gt;&lt;span class="mi"&gt;5002&lt;/span&gt;&lt;span class="err"&gt;ms&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;After&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;parsing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(structured)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-04-27T12:00:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ERROR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Connection timeout"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"db01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"duration_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5002&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watching unformatted logs get cleaned up and enriched is honestly the most fun part of the whole pipeline.&lt;/p&gt;
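
&lt;p&gt;For reference, a parsing rule along these lines could produce that structure. This is a sketch in Datadog's Grok matcher syntax; the exact rule depends on your log format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Sketch of a Grok parsing rule for the raw log line above (adjust to taste)
my_rule %{date("yyyy-MM-dd HH:mm:ss"):timestamp} %{word:level} \[%{word:service}\] %{regex("[^:]*"):message}: host=%{notSpace:host} duration=%{integer:duration_ms}ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;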

&lt;h2&gt;
  
  
  Generate Metrics
&lt;/h2&gt;

&lt;p&gt;This is the most interesting part of the Logging without Limits design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log-based metrics are generated &lt;em&gt;before&lt;/em&gt; exclusion filters run.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In other words, even for logs that will later be discarded and never make it to the index, you can still retain statistical information as metrics.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The benefit of this design is that even if you're aggressively dropping logs to keep costs down, you still get reliable metrics on trends, error rates, and the like.&lt;/p&gt;
&lt;/blockquote&gt;
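
&lt;p&gt;As a concrete example: you could define a log-based metric that counts logs matching &lt;code&gt;service:app status:error&lt;/code&gt;, grouped by &lt;code&gt;service&lt;/code&gt;. Even if a later exclusion filter drops most of those error logs from the index, the per-service error count keeps accumulating. (The query and grouping here are illustrative, not from a real setup.)&lt;/p&gt;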

&lt;h2&gt;
  
  
  Exclusion Filters
&lt;/h2&gt;

&lt;p&gt;After metric generation, &lt;strong&gt;exclusion filters&lt;/strong&gt; decide which logs are &lt;em&gt;not&lt;/em&gt; saved to the index.&lt;/p&gt;

&lt;p&gt;Debug logs, high-volume boilerplate logs, and anything that isn't needed for ongoing search can be dropped here, helping keep indexing costs under control.&lt;/p&gt;
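
&lt;p&gt;For example, an exclusion filter with the query &lt;code&gt;status:debug&lt;/code&gt; and a 100% exclusion rate drops every debug log before indexing, while any metrics generated from those logs in the previous step survive. (Illustrative query; exclusion filters are configured per index.)&lt;/p&gt;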

&lt;h2&gt;
  
  
  Index
&lt;/h2&gt;

&lt;p&gt;Logs that pass through the filters are finally stored in the &lt;strong&gt;Index&lt;/strong&gt;. Once a log is indexed, you can use Datadog's UI for facet search and analysis.&lt;/p&gt;

&lt;h1&gt;
  
  
  Why This Ordering Matters
&lt;/h1&gt;

&lt;p&gt;The key insight in this processing order is the design principle: &lt;strong&gt;"extract metrics before throwing logs away."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Log storage costs balloon quickly, so indexing every single log is rarely realistic. But if you just drop logs, you lose visibility into trends within the discarded data.&lt;/p&gt;

&lt;p&gt;Logging without Limits solves this by placing metric generation &lt;em&gt;before&lt;/em&gt; exclusion filters. You can lower storage costs while still maximizing observability.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrap-up
&lt;/h1&gt;

&lt;p&gt;Datadog's log pipeline has clearly separated stages: ingest, parse, generate metrics, exclude, and index. The design choice to run metric generation before exclusion filters strikes me as especially important — it's what allows you to balance cost and observability rather than trade one off against the other.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Subscribe for more Datadog &amp;amp; Observability deep-dives.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>datadog</category>
      <category>observability</category>
      <category>logging</category>
    </item>
    <item>
      <title>ECR Costs Had Increased Over 10 Times Without Me Noticing</title>
      <dc:creator>YukiOnodera</dc:creator>
      <pubDate>Fri, 16 Aug 2024 12:10:15 +0000</pubDate>
      <link>https://forem.com/yukionodera/ecr-costs-had-increased-over-10-times-without-me-noticing-15l1</link>
      <guid>https://forem.com/yukionodera/ecr-costs-had-increased-over-10-times-without-me-noticing-15l1</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The other day, while I was looking through Cost Explorer, I discovered that the cost of ECR had ballooned to &lt;strong&gt;more than 10 times&lt;/strong&gt; what it was a few months ago.&lt;/p&gt;

&lt;h2&gt;
  
  
  Investigation Begins
&lt;/h2&gt;

&lt;p&gt;Realizing this was a serious issue, I &lt;strong&gt;immediately began investigating the cause&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Confirming ECR Pricing
&lt;/h3&gt;

&lt;p&gt;I started by &lt;strong&gt;reviewing the pricing structure for ECR&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/ecr/pricing/" rel="noopener noreferrer"&gt;https://aws.amazon.com/ecr/pricing/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ECR costs come from two components: &lt;strong&gt;storage charges based on the amount of image data stored, and charges for data transfer out&lt;/strong&gt;.&lt;/p&gt;
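
&lt;p&gt;For a rough sense of scale (illustrative numbers; check the pricing page for current rates): data transfer out to the internet starts at about $0.09/GB, so a CI pipeline that pulls a 1 GB image 50 times a day from outside AWS moves roughly 1,500 GB a month, on the order of $135 in transfer charges alone.&lt;/p&gt;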

&lt;h2&gt;
  
  
  Reviewing the Invoice
&lt;/h2&gt;

&lt;p&gt;Next, I &lt;strong&gt;checked last month’s invoice&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The goal was to &lt;strong&gt;identify whether the increased costs were due to storage or data transfer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;At this point, I noticed that in the ECR section of the invoice, only the storage costs were visible. Data transfer costs are listed under a separate data transfer section, so make sure to check that. I initially overlooked this, which left me puzzled about the discrepancy.&lt;/p&gt;

&lt;p&gt;In my case, &lt;strong&gt;data transfer costs had skyrocketed&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Checking AWS Accounts
&lt;/h3&gt;

&lt;p&gt;Since I was using AWS Organizations, I &lt;strong&gt;checked which member account was seeing the increased ECR costs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Fortunately, only one account had significantly higher costs, so I was able to quickly pinpoint the source.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reviewing ECR Repositories
&lt;/h3&gt;

&lt;p&gt;However, just identifying the AWS account wasn’t enough to solve the problem.&lt;/p&gt;

&lt;p&gt;I decided to open the list of ECR repositories and take a look.&lt;/p&gt;

&lt;p&gt;I noticed that &lt;strong&gt;several repositories had been created around the time costs started increasing&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Cause of the Cost Increase
&lt;/h3&gt;

&lt;p&gt;After discussing it with team members and digging deeper, we discovered that this was due to a repository that had been migrated from Docker Hub for use in CI.&lt;/p&gt;

&lt;p&gt;CI runs on GitHub Actions on every commit, and each run pulled multiple images of considerable size from ECR. Because GitHub-hosted runners live outside AWS, every one of those pulls counts as data transfer out to the internet, so &lt;strong&gt;data transfer costs had surged&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I'm glad I found the cause. In fact, I consider myself &lt;strong&gt;lucky&lt;/strong&gt; to have noticed it through Cost Explorer.&lt;/p&gt;

&lt;p&gt;Cloud costs can sometimes spike unexpectedly, so it’s important to stay vigilant.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.docker.com/docker-hub/download-rate-limit/" rel="noopener noreferrer"&gt;https://docs.docker.com/docker-hub/download-rate-limit/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ecr</category>
      <category>container</category>
    </item>
  </channel>
</rss>
