<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Patrick Londa</title>
    <description>The latest articles on Forem by Patrick Londa (@patrick_londa_1477353d65e).</description>
    <link>https://forem.com/patrick_londa_1477353d65e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3933240%2Fd3977bd7-c0bf-43fb-8f47-e2b395a8d5b9.jpeg</url>
      <title>Forem: Patrick Londa</title>
      <link>https://forem.com/patrick_londa_1477353d65e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/patrick_londa_1477353d65e"/>
    <language>en</language>
    <item>
      <title>Breaking Logging's Flywheel of Compromises</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Tue, 19 May 2026 18:26:09 +0000</pubDate>
      <link>https://forem.com/bronto_io/breaking-loggings-flywheel-of-compromises-5gmb</link>
      <guid>https://forem.com/bronto_io/breaking-loggings-flywheel-of-compromises-5gmb</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Mike Neville-O'Neill&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let's face it — logging is broken. Not just a little broken, but fundamentally misaligned with the needs of modern engineering teams. At a recent AWS Summit talk in London, Benoit Gaudin (our Head of Infrastructure) and I shared Bronto's vision for fixing this mess once and for all.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem We're All Living In
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcayts215d8de7o18qt84.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcayts215d8de7o18qt84.png" alt="The 3C flywheel of compromises" width="800" height="303"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're running any significant infrastructure today, you're probably stuck in what we call the &lt;strong&gt;"3C flywheel of compromises"&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt; — Logging at scale has become ridiculously expensive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coverage&lt;/strong&gt; — So you cut corners, dropping those infra logs and long-tail workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt; — And end up with a Frankenstein's monster of 5–8 different systems duct-taped together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't just inefficient — it's actively harmful. Engineers end up building parallel solutions just to get basic visibility because the main tool is too limited, too slow, or too expensive.&lt;/p&gt;




&lt;h2&gt;
  
  
  Logs Matter More Than Ever
&lt;/h2&gt;

&lt;p&gt;Logs aren't just a compliance checkbox anymore. They're your &lt;strong&gt;operational ground truth in the AI era&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;They feed your LLMs. They power your agents. They're your audit trail, your RAG source, your behavioral training set. And one log message from an LLM-based system might contain 50–100 nested events in a single payload.&lt;/p&gt;

&lt;p&gt;Try scaling that with a solution built before the separation of compute and storage was even a thing.&lt;/p&gt;




&lt;h2&gt;
  
  
  How We're Breaking the Cycle
&lt;/h2&gt;

&lt;p&gt;Bronto was built to tackle this head-on with three non-negotiable capabilities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Subsecond search on all logs&lt;/strong&gt; — whether they're two seconds or two years old&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Petabyte-scale retention&lt;/strong&gt; — no infrastructure for you to manage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completely different pricing&lt;/strong&gt; — think cents per GB, not dollars&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The platform is built natively on AWS (S3, Lambda, DynamoDB), but engineered so you don't have to deal with pipelines, pre-processing, or glue code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bronto's Architectural Advantage
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kjhzflbss411kbl3w3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kjhzflbss411kbl3w3z.png" alt="Bronto architecture diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ingestion layer accepts data from standard sources (OpenTelemetry Collector, Fluentd, and Fluent Bit) through HTTP endpoints, with AWS load balancers doing the heavy lifting. Data is buffered through Kafka (Amazon MSK), but then things diverge from the standard playbook.&lt;/p&gt;

&lt;p&gt;Instead of following the traditional approach, the platform processes data from Kafka and writes it to S3 in a proprietary format that borrows techniques from data analytics: &lt;strong&gt;data partitioning, Bloom filtering, predicate pushdown, compression, and columnar storage&lt;/strong&gt;. Metadata lives in DynamoDB for speed.&lt;/p&gt;
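
&lt;p&gt;To make the Bloom filtering idea concrete, here is a minimal Python sketch. It is not Bronto's implementation (the storage format is proprietary). The &lt;code&gt;BloomFilter&lt;/code&gt; class, the &lt;code&gt;partitions_to_scan&lt;/code&gt; helper, and the hourly partition names are hypothetical, and the sketch only illustrates how a per-partition filter can let a search skip partitions that cannot contain a term.&lt;/p&gt;

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_slots=1024, num_hashes=3):
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        # One byte per slot keeps the sketch readable; real filters pack bits.
        self.slots = bytearray(num_slots)

    def _positions(self, term):
        # Derive several slot indexes by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_slots

    def add(self, term):
        for pos in self._positions(term):
            self.slots[pos] = 1

    def might_contain(self, term):
        # False means the term is definitely absent from this partition.
        return all(self.slots[pos] for pos in self._positions(term))


def partitions_to_scan(filters, term):
    """Return only the partitions whose filter cannot rule the term out."""
    return [name for name, bf in filters.items() if bf.might_contain(term)]


if __name__ == "__main__":
    # Hypothetical hourly partitions and the terms stored in each one.
    data = {
        "2025-05-01T10": ["status=200", "status=404"],
        "2025-05-01T11": ["status=200", "status=503"],
    }
    filters = {}
    for name, terms in data.items():
        bf = BloomFilter()
        for term in terms:
            bf.add(term)
        filters[name] = bf
    print(partitions_to_scan(filters, "status=503"))
```

&lt;p&gt;A partition whose filter answers "no" can be skipped without reading it from object storage. A "yes" only means the partition has to be read and checked, since Bloom filters allow false positives.&lt;/p&gt;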

&lt;p&gt;The real magic happens at search time. When you query through the UI or API, Lambda functions launch in parallel and process data directly from S3. No overprovisioning for big queries — horizontal scaling on demand, paying only while functions run.&lt;/p&gt;

&lt;p&gt;This architecture is what enables both the performance (subsecond on terabytes, seconds on petabytes) and the pricing model. No expensive clusters running 24/7 — just cloud resources used exactly when and where they're needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real Teams, Real Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  API-First Content Platform
&lt;/h3&gt;

&lt;p&gt;This team runs a massive content delivery platform, serving APIs behind a global CDN for websites, mobile apps, and e-commerce systems. Every request hits their API with a unique key, and they need to trace errors, group by status codes, and export logs to their own customers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before Bronto&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;40TB monthly ingestion cap&lt;/li&gt;
&lt;li&gt;30+ minute query times (when they worked at all)&lt;/li&gt;
&lt;li&gt;Dashboards that routinely failed&lt;/li&gt;
&lt;li&gt;Constant budget pressure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;After Bronto&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Boosted ingestion to 60TB monthly&lt;/li&gt;
&lt;li&gt;Cut their logging bill in half&lt;/li&gt;
&lt;li&gt;Complex multi-day queries now return in under a second&lt;/li&gt;
&lt;li&gt;Built reliable log exports for their own customers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Their exact words? &lt;em&gt;"Bronto changed our lives."&lt;/em&gt; A logging tool. Actually improving engineers' lives.&lt;/p&gt;
&lt;h3&gt;
  
  
  Global SaaS Project Management Platform
&lt;/h3&gt;

&lt;p&gt;This company runs a suite of SaaS tools across distributed cloud services and product lines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before Bronto&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Graylog for live logs&lt;/li&gt;
&lt;li&gt;S3 for long-term storage&lt;/li&gt;
&lt;li&gt;HAProxy logs dumped into S3 with gnarly Athena queries&lt;/li&gt;
&lt;li&gt;A mix of Athena, Superset, and QuickSight for analytics&lt;/li&gt;
&lt;li&gt;Just 1–2 days of retention across most systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;After Bronto&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Everything centralized: HAProxy, Kubernetes, application logs, audit trails&lt;/li&gt;
&lt;li&gt;Extended to 90-day hot retention&lt;/li&gt;
&lt;li&gt;Real dashboards tracking error spikes, traffic anomalies, and app version drift&lt;/li&gt;
&lt;li&gt;Engineers focused on product, not maintaining logging infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They went from managing logs to actually using them.&lt;/p&gt;


&lt;h2&gt;
  
  
  Logs as Your Secret Weapon
&lt;/h2&gt;

&lt;p&gt;Your log data is massively undervalued — not because it lacks signal, but because current tooling hides that signal behind cost barriers, friction, and compromises.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Logs used to be a liability. With the right approach, they can be your secret weapon.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We're building Bronto to be for logging what Dyson was for vacuum cleaners, what the iPhone was for smartphones, and what Tesla was for electric cars: a complete reimagining of what's possible when you refuse to accept the status quo.&lt;/p&gt;

&lt;p&gt;After all, when was the last time your logging tool made your life better instead of worse?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bronto.io/book-a-demo" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;See Bronto in Action&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>devops</category>
      <category>observability</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The CDN Logging Crisis</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Tue, 19 May 2026 17:39:56 +0000</pubDate>
      <link>https://forem.com/bronto_io/the-cdn-logging-crisis-3d1g</link>
      <guid>https://forem.com/bronto_io/the-cdn-logging-crisis-3d1g</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Benoit Gaudin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Every second, your CDN is generating thousands of logs that tell a critical story about your application's performance, security, and user experience. For large enterprises, this can mean terabytes of log data every day — data that contains invaluable insights about your business.&lt;/p&gt;

&lt;p&gt;But here's the uncomfortable truth: most organizations capture only a small fraction of their CDN logs, and retain that limited data for just days or weeks. This isn't because engineering teams don't understand the value. It's because the economics of traditional logging solutions make comprehensive CDN logging prohibitively expensive.&lt;/p&gt;

&lt;p&gt;The result? Critical blind spots that can be extremely costly during outages, security breaches, or major events.&lt;/p&gt;

&lt;p&gt;Welcome to the &lt;strong&gt;flywheel of compromises&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt; — Traditional logging vendors charge egregious per-GB rates that make comprehensive CDN logging unaffordable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coverage&lt;/strong&gt; — Companies respond by severely limiting what logs they collect and how long they retain them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt; — To compensate for coverage gaps, teams cobble together 5–8 different logging solutions, creating a management nightmare&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Current State of CDN Logging
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6835cf3bc14ffe346a1da51c_image%2520%282%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6835cf3bc14ffe346a1da51c_image%2520%282%29.png" alt="CDN logging landscape" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The observability sector today resembles markets before transformative innovation — vacuum cleaners before Dyson, mobile phones before iPhone, electric cars before Tesla. Existing solutions were designed for a completely different era: before the separation of compute and storage, before the explosion of log data volumes, and certainly before the demands of the AI era.&lt;/p&gt;

&lt;p&gt;Consider how most logging vendors operate today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Datadog&lt;/strong&gt; charges around $2–5 per GB for log ingestion with 15-day retention. At the low end of that range, a company generating 10TB of CDN logs daily (about 300TB per month) would pay roughly $600,000 per month, and considerably more at higher rates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Splunk&lt;/strong&gt; forces customers into complex licensing schemes that effectively limit how much data they can realistically log&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New Relic&lt;/strong&gt; and other vendors offer marginally better pricing but still force unacceptable trade-offs between cost and coverage&lt;/li&gt;
&lt;/ul&gt;
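
&lt;p&gt;The Datadog figure quoted above can be reproduced with simple arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and a 30-day month. Neither assumption comes from a vendor price sheet, and the function name is only illustrative.&lt;/p&gt;

```python
# Back-of-envelope check of the monthly ingestion figure.
# Assumptions: 1 TB = 1,000 GB and a 30-day month.
GB_PER_TB = 1_000
DAYS_PER_MONTH = 30


def monthly_ingest_cost(tb_per_day, usd_per_gb):
    """Return the monthly ingestion cost in USD for a daily volume in TB."""
    gb_per_month = tb_per_day * GB_PER_TB * DAYS_PER_MONTH
    return gb_per_month * usd_per_gb


if __name__ == "__main__":
    # The two ends of the quoted $2 to $5 per GB range.
    for rate in (2, 5):
        cost = monthly_ingest_cost(10, rate)
        print(f"10 TB/day at ${rate}/GB: ${cost:,.0f} per month")
```

&lt;p&gt;At $2 per GB this comes to $600,000 per month, and at $5 per GB to $1,500,000, which is why the lower figure is best read as a floor rather than an estimate.&lt;/p&gt;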

&lt;p&gt;What's most frustrating is that these pricing models persist despite dramatic changes in the underlying technology. The separation of compute and storage has revolutionized data economics across virtually every other category of software, yet logging vendors continue to operate on business models created 15 years ago.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Hypothetical (But Entirely Plausible) Scenario
&lt;/h2&gt;

&lt;p&gt;To illustrate the real-world impact of incomplete CDN logging, consider this:&lt;/p&gt;

&lt;p&gt;A week before a major live streaming event, a provider's engineering team makes a routine CDN configuration change. Under normal traffic loads, the misconfiguration goes unnoticed — cache hit ratios remain stable and performance appears normal.&lt;/p&gt;

&lt;p&gt;After a week, any trace of the configuration change disappears from their logs due to their 7-day retention policy. Capacity planning teams review infrastructure and assume current backend capacity can handle the anticipated load — after all, it worked fine during the last similar event. Unfortunately, the now-invisible change makes that assumption dangerously wrong.&lt;/p&gt;

&lt;p&gt;During the live event, CDN cache efficiency plummets under heavy load. Backend servers get hit much harder than expected. Users experience buffering and connection problems, but the operations team struggles to diagnose the root cause.&lt;/p&gt;

&lt;p&gt;By the time they identify the issue — tracing it back to the forgotten configuration change — the damage is done. Over a million viewers have abandoned the stream, social media is flooded with complaints, and the company's stock takes a hit.&lt;/p&gt;

&lt;p&gt;With complete CDN logging and longer retention, they could have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identified when the degradation trend first appeared, correlating it to the configuration change&lt;/li&gt;
&lt;li&gt;Maintained visibility throughout the planning period&lt;/li&gt;
&lt;li&gt;Quickly correlated the performance issues with the earlier change during the incident&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Limited logging coverage transformed a minor configuration error into a major business incident. The cost of their logging "savings"? Potentially millions in lost ad revenue and subscription cancellations.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Horsemen of the Logging Apocalypse
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cost Explosion
&lt;/h3&gt;

&lt;p&gt;Traditional logging vendors price their products based on data volume, charging premium rates for both ingestion and storage. This pricing model was created when storage was genuinely expensive. In 2025, with cloud storage costs continuing to plummet, this model serves primarily to protect vendor margins.&lt;/p&gt;

&lt;p&gt;For CDN logs — which are high-volume by nature — this creates an impossible equation. When faced with estimates of $500,000+ monthly for complete CDN logging, even the most data-driven organizations are forced to compromise.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coverage Sacrifice
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmf8zwflo6l28735jf7c4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmf8zwflo6l28735jf7c4.png" alt="Coverage gaps diagram" width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The inevitable result of cost pressure is reduced coverage. Organizations typically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ingest only a sample of the data&lt;/li&gt;
&lt;li&gt;Limit retention to days instead of months&lt;/li&gt;
&lt;li&gt;Exclude high-volume CDNs or regions entirely&lt;/li&gt;
&lt;li&gt;Drop detailed fields that would aid troubleshooting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These compromises create dangerous blind spots. Intermittent issues, security threats that develop over time, and regional performance problems remain invisible. When an incident occurs, teams often discover they're missing exactly the data they need.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complexity Creep
&lt;/h3&gt;

&lt;p&gt;To compensate for coverage limitations, organizations implement a patchwork of supplementary solutions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Self-hosted ELK stacks for longer-term storage (with all the maintenance overhead)&lt;/li&gt;
&lt;li&gt;Cloud provider-specific logging solutions (AWS CloudWatch, GCP Logging)&lt;/li&gt;
&lt;li&gt;Custom scripts to archive logs to object storage with rehydration workflows&lt;/li&gt;
&lt;li&gt;Open-source tools for log analysis and visualization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a Frankenstein's monster of logging infrastructure that no one fully understands, requires constant maintenance, and still fails to provide comprehensive visibility.&lt;/p&gt;




&lt;h2&gt;
  
  
  CDN Logging for the AI Era
&lt;/h2&gt;

&lt;p&gt;These challenges are escalating as we enter the AI era:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Exploding volumes&lt;/strong&gt; — Microservices, containers, and edge computing are all contributing to the data deluge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-powered analysis&lt;/strong&gt; — ML systems require comprehensive, long-term data to identify patterns and anomalies effectively&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic applications&lt;/strong&gt; — Autonomous applications require complete historical data to make intelligent decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Legacy logging business models simply cannot accommodate these realities. They weren't designed for terabytes of daily log ingestion, years of retention, or a world where AI agents might need to analyze months of historical CDN patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Different Approach
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3935r3z66gcmnxtqh6a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3935r3z66gcmnxtqh6a.png" alt="Bronto CDN logging dashboard" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Solving the CDN logging crisis requires rebuilding the logging stack from the ground up — not incremental improvements on broken foundations. Three core principles drive the right approach:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Economics Aligned with Modern Infrastructure
&lt;/h3&gt;

&lt;p&gt;Leveraging the separation of compute and storage to deliver CDN logging at a fraction of traditional costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;90% cost reduction&lt;/strong&gt; compared to Datadog and similar vendors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;12-month retention&lt;/strong&gt; by default&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No charges for search&lt;/strong&gt; or compute resources&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Lightning-Fast Search Across Petabytes
&lt;/h3&gt;

&lt;p&gt;"Tracey's Law": the faster you make log search, the more valuable logging becomes to an organization.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sub-second search across terabytes of CDN logs&lt;/li&gt;
&lt;li&gt;Seconds-long queries across petabytes&lt;/li&gt;
&lt;li&gt;No rehydration from cold storage, ever&lt;/li&gt;
&lt;li&gt;Fast dashboards even across months of data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When queries return in seconds instead of taking minutes (or timing out entirely), teams use logging data proactively rather than as a last resort.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. A Single Unified Logging Layer
&lt;/h3&gt;

&lt;p&gt;Eliminating the patchwork by providing one comprehensive logging layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All CDN providers in one place&lt;/li&gt;
&lt;li&gt;Drop-in replacement for existing solutions&lt;/li&gt;
&lt;li&gt;Two-line configuration change for implementation&lt;/li&gt;
&lt;li&gt;Automatic parsing and PII removal&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Breaking Free from the Flywheel
&lt;/h2&gt;

&lt;p&gt;The CDN logging crisis isn't just a technical problem — it's a business problem with real implications for reliability, security, and user experience. For too long, organizations have accepted a dysfunctional status quo because there seemed to be no alternative.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Every single word about the logging crisis resonates. We were spending over $400,000 monthly on CDN logging with Datadog, and still only capturing about 20% of our logs. With Bronto, we now have 100% coverage, 12-month retention, and our bill is under $40,000."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This isn't an incremental improvement — it's a fundamental reinvention of how logging works. Just as Apple reinvented the smartphone, Dyson reinvented the vacuum cleaner, and Tesla reinvented the electric car, the logging industry is overdue for the same transformation.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Bronto is reinventing logging from the ground up for the AI era. The team brings 150+ years of collective logging domain expertise, with previous experience building and scaling logging platforms at IBM, Rapid7, and Logentries.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bronto.io/book-a-demo" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;See What 100% CDN Log Coverage Looks Like&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>devops</category>
      <category>observability</category>
      <category>cdn</category>
    </item>
    <item>
      <title>Logging Your AI Events (from Ollama) in Bronto</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Tue, 19 May 2026 16:16:59 +0000</pubDate>
      <link>https://forem.com/bronto_io/logging-your-ai-events-from-ollama-in-bronto-477j</link>
      <guid>https://forem.com/bronto_io/logging-your-ai-events-from-ollama-in-bronto-477j</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by David Tracey&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Many software companies are investigating the use of Large Language Models (LLMs) in their products. At Bronto we've announced &lt;a href="https://www.bronto.io/blog/introducing-bronto-labs" rel="noopener noreferrer"&gt;our Bronto Labs initiative&lt;/a&gt;, with AI features including auto-parsing, AI dashboard creation, and Bronto Scope for error investigation.&lt;/p&gt;

&lt;p&gt;This post explores a different angle: &lt;strong&gt;using logs in the development of AI applications&lt;/strong&gt;. We'll focus on &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; — an open source tool for running LLMs locally — and show how to pipe its logs into Bronto for search and analysis.&lt;/p&gt;

&lt;p&gt;LLMs are complex, non-deterministic systems. Beyond traditional logging use cases (performance monitoring, API usage), their unpredictable nature increases the need for logging, particularly to record and track responses to prompts. Individual log events can be large when they include a full prompt or response. Meta found this problem significant enough at their scale to build &lt;a href="https://engineering.fb.com/2024/03/18/data-infrastructure/logarithm-logging-engine-ai-training-workflows-services-meta/" rel="noopener noreferrer"&gt;Logarithm&lt;/a&gt;, a dedicated logging engine for its AI training workflows and services.&lt;/p&gt;

&lt;p&gt;The fundamental requirements for logging AI applications are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ability to handle &lt;strong&gt;large log events&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Ability to handle &lt;strong&gt;high volumes at low cost&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Ability to &lt;strong&gt;search across high volumes&lt;/strong&gt; quickly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are exactly the requirements Bronto was designed to meet.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setting Up Ollama
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Recommended specs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;16GB RAM (8GB works for smaller models)&lt;/li&gt;
&lt;li&gt;12GB disk space for Ollama and basic models&lt;/li&gt;
&lt;li&gt;Modern CPU with at least 4 cores (8 preferred)&lt;/li&gt;
&lt;li&gt;Optional: GPU for improved performance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Install and Run the Server
&lt;/h3&gt;

&lt;p&gt;Install from &lt;a href="https://ollama.com/download" rel="noopener noreferrer"&gt;ollama.com/download&lt;/a&gt; for your OS, then start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see output including the default port it's listening on (&lt;code&gt;11434&lt;/code&gt;).&lt;/p&gt;
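
&lt;p&gt;If you prefer to confirm from code that the server is listening, a small check along these lines can help. It is a sketch that assumes the server answers a plain HTTP GET on its root path at the default port. The helper names &lt;code&gt;ollama_base_url&lt;/code&gt; and &lt;code&gt;server_is_up&lt;/code&gt; are made up for this example and are not part of Ollama.&lt;/p&gt;

```python
import urllib.error
import urllib.request


def ollama_base_url(host="localhost", port=11434):
    """Build the base URL for a local Ollama server (11434 is the default port)."""
    return f"http://{host}:{port}"


def server_is_up(base_url, timeout=2.0):
    """Return True if an HTTP GET on the base URL succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = ollama_base_url()
    print(url, "reachable" if server_is_up(url) else "not reachable")
```

&lt;p&gt;Only the standard library is used, so the check runs anywhere Python 3 is installed.&lt;/p&gt;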

&lt;h3&gt;
  
  
  Download and Run a Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull a model from the registry&lt;/span&gt;
ollama pull gemma:2b

&lt;span class="c"&gt;# List downloaded models&lt;/span&gt;
ollama list

&lt;span class="c"&gt;# Run a model interactively&lt;/span&gt;
ollama run gemma:2b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;run&lt;/code&gt; command gives you a &lt;code&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/code&gt; prompt where you can enter prompts or &lt;code&gt;/help&lt;/code&gt; for commands.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sending Ollama Logs to Bronto
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Configure Ollama Logging to File
&lt;/h3&gt;

&lt;p&gt;Stop the server, then restart it with its output redirected to a log file (create the log directory first if it does not exist):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /your_log_path/.ollama/logs/server.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For more detailed debug logs, add to your shell profile (&lt;code&gt;.zprofile&lt;/code&gt; etc.):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_LOG_LEVEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;DEBUG
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_DEBUG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To redirect model client logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# stderr only (keeps console interactive)&lt;/span&gt;
ollama run gemma:2b 2&amp;gt;&amp;gt;/your_log_path/.ollama/logs/gemma.log

&lt;span class="c"&gt;# both stdout and stderr (API use only — disables console input)&lt;/span&gt;
ollama run gemma:2b &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /your_log_path/.ollama/logs/gemma.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify logs are flowing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /your_log_path/.ollama/logs/server.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Install OpenTelemetry Collector
&lt;/h3&gt;

&lt;p&gt;Download for your platform from &lt;a href="https://opentelemetry.io/docs/collector/installation/" rel="noopener noreferrer"&gt;opentelemetry.io&lt;/a&gt;. Example for Mac ARM64:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--proto&lt;/span&gt; &lt;span class="s1"&gt;'=https'&lt;/span&gt; &lt;span class="nt"&gt;--tlsv1&lt;/span&gt;.2 &lt;span class="nt"&gt;-fOL&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.114.0/otelcol-contrib_0.114.0_darwin_arm64.tar.gz

&lt;span class="nb"&gt;chmod&lt;/span&gt; +x otelcol-contrib
&lt;span class="nb"&gt;mv &lt;/span&gt;otelcol-contrib /usr/local/bin/otelcol

&lt;span class="c"&gt;# Verify&lt;/span&gt;
otelcol &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Configure OpenTelemetry to Forward to Bronto
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;/etc/otelcol/config.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;filelog/Ollama_Server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/your_log_path/.ollama/logs/server.log&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service.name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LaptopServer&lt;/span&gt;
      &lt;span class="na"&gt;service.namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ollama&lt;/span&gt;

  &lt;span class="na"&gt;filelog/Ollama_Gemma&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/your_log_path/.ollama/logs/gemma.log&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service.name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LaptopGemma&lt;/span&gt;
      &lt;span class="na"&gt;service.namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ollama&lt;/span&gt;

&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;batch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;otlphttp/brontobytes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;logs_endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://ingestion.us.bronto.io/v1/logs"&lt;/span&gt;
    &lt;span class="na"&gt;compression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;none&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;x-bronto-api-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replace_this_with_your_bronto_apikey&lt;/span&gt;

&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pipelines&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;filelog/Ollama_Server&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;filelog/Ollama_Gemma&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;batch&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;otlphttp/brontobytes&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="c1"&gt;# Useful for debugging:&lt;/span&gt;
  &lt;span class="c1"&gt;# telemetry:&lt;/span&gt;
  &lt;span class="c1"&gt;#   logs:&lt;/span&gt;
  &lt;span class="c1"&gt;#     level: "debug"&lt;/span&gt;
  &lt;span class="c1"&gt;#     output_paths: [/your_log_path/otelcol/debug.log]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validate the configuration, then start the collector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;otelcol validate &lt;span class="nt"&gt;--config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/otelcol/config.yaml
otelcol &lt;span class="nt"&gt;--config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/otelcol/config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  A Simple Ollama API Program
&lt;/h2&gt;

&lt;p&gt;The Python script below (&lt;code&gt;ollama-log-demo.py&lt;/code&gt;) uses the &lt;a href="https://github.com/ollama/ollama/blob/main/docs/api.md" rel="noopener noreferrer"&gt;Ollama API&lt;/a&gt; to send a prompt, followed by the contents of a log file, to a local model and print the response as it streams back. Example usage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Summarize 100 lines of CDN logs&lt;/span&gt;
python3 ollama-log-demo.py 100lines-CDN-log.csv &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; &lt;span class="s2"&gt;"gemma:2b"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--prompt&lt;/span&gt; &lt;span class="s2"&gt;"You have been given 100 lines from a CDN log in CSV format. Summarise the logs provided."&lt;/span&gt;

&lt;span class="c"&gt;# Find errors and suggest fixes&lt;/span&gt;
python3 ollama-log-demo.py 100lines-search-log.csv &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; &lt;span class="s2"&gt;"gemma:2b"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--prompt&lt;/span&gt; &lt;span class="s2"&gt;"Find errors in this log and suggest how to fix them"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final line of each streamed Ollama response is a JSON object that includes useful performance metadata:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;total_duration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Total time spent generating the response (nanoseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;load_duration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Time spent loading the model (nanoseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prompt_eval_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number of tokens in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prompt_eval_duration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Time spent evaluating the prompt (nanoseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;eval_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number of tokens in the response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;eval_duration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Time spent generating the response (nanoseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;context&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Conversation encoding — pass in next request to maintain memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;response&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Empty if streamed; full response if not streamed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
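
&lt;p&gt;Because the durations are reported in nanoseconds, the raw values are hard to read at a glance. The sketch below is one way to convert them to seconds and derive tokens per second from the final JSON object. The function name &lt;code&gt;summarize_ollama_stats&lt;/code&gt; and the sample values are made up for illustration and are not part of &lt;code&gt;ollama-log-demo.py&lt;/code&gt;.&lt;/p&gt;

```python
NANOSECONDS_PER_SECOND = 1_000_000_000

DURATION_FIELDS = (
    "total_duration",
    "load_duration",
    "prompt_eval_duration",
    "eval_duration",
)


def summarize_ollama_stats(final_chunk):
    """Convert nanosecond durations to seconds and derive token rates."""
    summary = {}
    for field in DURATION_FIELDS:
        value = final_chunk.get(field)
        if value is not None:
            summary[field + "_s"] = value / NANOSECONDS_PER_SECOND

    # Tokens per second = token count / duration in seconds.
    # Skip a rate when its count or duration is missing or zero.
    for label, count_key, duration_key in (
        ("prompt", "prompt_eval_count", "prompt_eval_duration"),
        ("response", "eval_count", "eval_duration"),
    ):
        count = final_chunk.get(count_key)
        seconds = summary.get(duration_key + "_s")
        if count and seconds:
            summary[label + "_tokens_per_s"] = count / seconds
    return summary


# Made-up values for illustration only, not measurements.
example = {
    "total_duration": 5_000_000_000,
    "load_duration": 1_000_000_000,
    "prompt_eval_count": 200,
    "prompt_eval_duration": 2_000_000_000,
    "eval_count": 100,
    "eval_duration": 2_000_000_000,
}

for key, value in summarize_ollama_stats(example).items():
    print(f"{key} = {value}")
```

&lt;p&gt;A helper along these lines could be called on the last parsed chunk when comparing models, since a tokens-per-second figure is easier to compare across runs than raw durations.&lt;/p&gt;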

&lt;p&gt;&lt;strong&gt;Model notes from testing:&lt;/strong&gt; &lt;code&gt;gemma:2b&lt;/code&gt; is good for summarizing but tends to give high-level summaries even when asked for specifics. &lt;code&gt;mistral&lt;/code&gt; takes longer but produces more detailed, data-specific responses. Defining the right prompt for your use case is key.&lt;/p&gt;




&lt;h2&gt;
  
  
  Searching Ollama Logs in Bronto
&lt;/h2&gt;

&lt;p&gt;Ollama server logs include a mix of structured and unstructured entries:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard log levels:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFO [main] HTTP server listening | hostname="127.0.0.1" port="11434"
level=INFO source=sched.go:714 msg="new model will fit in available VRAM"
level=DEBUG source=memory.go:103 msg=evaluating library=metal gpu_count=1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Model and resource logs:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;llm_load_print_meta: max token length = 93
llama_model_loader: - kv 0: general.architecture str = gemma
level=INFO source=server.go:105 msg="system memory" total="8.0 GiB" free="1.2 GiB"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even a small test with short prompts generates surprisingly large log volumes — 244 events totaling ~2MB in our test. Bronto handles these unstructured and semi-structured formats natively, and you can add a custom parser to make them more convenient to search and view.&lt;/p&gt;
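
&lt;p&gt;To make the mixed formats concrete, here is a minimal sketch of how the &lt;code&gt;key=value&lt;/code&gt; style lines shown above could be split into searchable fields. It is an illustration in plain Python, not Bronto's parser, and the function name &lt;code&gt;parse_key_value_line&lt;/code&gt; is hypothetical. Lines without any pairs, such as the &lt;code&gt;llm_load_print_meta&lt;/code&gt; output, come back as an empty dict and would be kept as unstructured text.&lt;/p&gt;

```python
import shlex


def parse_key_value_line(line):
    """Split a key=value log line into a dict; return {} if no pairs are found."""
    fields = {}
    try:
        # shlex keeps quoted values such as msg="system memory" together.
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the whole line as unstructured.
        return fields
    for token in tokens:
        key, separator, value = token.partition("=")
        if separator and key:
            fields[key] = value
    return fields


# Sample lines taken from the log excerpts above.
samples = [
    'level=INFO source=sched.go:714 msg="new model will fit in available VRAM"',
    'level=INFO source=server.go:105 msg="system memory" total="8.0 GiB" free="1.2 GiB"',
    "llm_load_print_meta: max token length = 93",
]

for sample in samples:
    print(parse_key_value_line(sample))
```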

&lt;p&gt;&lt;strong&gt;Example searches in Bronto:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Fig.1: Searching for log events containing "tokens"&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaels41zdygfj9igr6ll.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flaels41zdygfj9igr6ll.png" alt="Searching for log events containing 'tokens'" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig.2: Searching for log events containing "prompt"&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmefwxd7s4aufb1lmk0j6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmefwxd7s4aufb1lmk0j6.png" alt="Searching for log events containing 'prompt'" width="799" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig.3: Grouping by prompt evaluation time per task_id&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhyvj6xmw7xnmmh4jpu84.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhyvj6xmw7xnmmh4jpu84.png" alt="Grouping by prompt evaluation time per task_id" width="800" height="433"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This post introduced Ollama as an example of an LLM system and explained why AI applications create unique logging challenges — large events, high volumes, non-deterministic outputs, and distributed agents. We walked through setting up Ollama locally, configuring OpenTelemetry to forward logs to Bronto, and writing a simple Python API program to experiment with prompts against log data.&lt;/p&gt;

&lt;p&gt;Future posts will develop the theme further with other AI systems including AWS Bedrock.&lt;/p&gt;


&lt;h2&gt;
  
  
  Appendix: &lt;code&gt;ollama-log-demo.py&lt;/code&gt;
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_ollama_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;load_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;load_duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;load_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- load_duration = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;load_duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;total_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;total_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- total_duration = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;eval_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eval_duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;eval_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- eval_duration = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eval_duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt_eval_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_eval_duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prompt_eval_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- prompt_eval_duration = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_eval_duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt_eval_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_eval_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prompt_eval_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- prompt_eval_count = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_eval_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;eval_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eval_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;eval_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- eval_count = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eval_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;examine_log_with_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;log_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;req_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;input_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;input_prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log_data&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Update localhost URL to match your Ollama API endpoint
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/api/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req_params&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- Processing Successful Ollama Response ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
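            &lt;span class="c1"&gt;# Start empty so the stats call below is safe if no chunk is parsed&lt;/span&gt;
            &lt;span class="n"&gt;json_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
            &lt;span class="n"&gt;line_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;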
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;iter_lines&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;json_line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;line_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                        &lt;span class="n"&gt;json_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error decoding JSON on line &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;line_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;UnicodeDecodeError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error decoding line to UTF-8 on line &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;line_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line_count&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No JSON lines found or response was empty.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--------------------------------------------------&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print_ollama_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--------------------------------------------------&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Error - Response Status code: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ollama API Demo for Logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Path to the log file to be examined&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Both options are required: the API call needs a model name and a prompt&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Model to use in analysis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Prompt to send to model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;examine_log_with_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://www.bronto.io/bronto-labs" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Explore Bronto's AI Features&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>ai</category>
      <category>logging</category>
      <category>ollama</category>
      <category>observability</category>
    </item>
    <item>
      <title>The Log Management Cost Trap: Part III — Search</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Tue, 19 May 2026 13:26:43 +0000</pubDate>
      <link>https://forem.com/bronto_io/the-log-management-cost-trap-part-iii-search-2lgo</link>
      <guid>https://forem.com/bronto_io/the-log-management-cost-trap-part-iii-search-2lgo</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Benoit Gaudin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://www.bronto.io/blog/cost-trap-ingestion" rel="noopener noreferrer"&gt;Part I&lt;/a&gt; (Ingestion) and &lt;a href="https://www.bronto.io/blog/cost-trap-storage" rel="noopener noreferrer"&gt;Part II&lt;/a&gt; (Storage) of this series, I explored the challenges of designing, running, and managing a centralised log management solution. In Part III, I'll focus on &lt;strong&gt;search&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Competing Requirements of Log Search
&lt;/h2&gt;

&lt;p&gt;Log data search has two distinct use cases with fundamentally different requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-time troubleshooting&lt;/strong&gt; — when a system outage occurs, engineers need visibility into what caused the issue immediately. Log data must be searchable almost as soon as it's generated. This imposes a hard constraint: batch windows must be short. And short batch windows tend to produce small files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Large-scale historical analysis&lt;/strong&gt; — analyzing web or CDN access logs to identify patterns in API usage, track slowly degrading performance trends, or audit activity over weeks or months. Here, data freshness is irrelevant. What matters is the ability to efficiently scan large datasets.&lt;/p&gt;

&lt;p&gt;These two use cases create a direct tension. Making data available quickly often means processing small batches and creating many small files — which severely degrades performance when running queries across long time ranges. This is the classic &lt;a href="https://blog.min.io/challenge-big-data-small-files/" rel="noopener noreferrer"&gt;small file problem&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6866a45d13683d5184b5a200_The%2520Small%2520File%2520Problem%2520%281%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6866a45d13683d5184b5a200_The%2520Small%2520File%2520Problem%2520%281%29.png" alt="The small file problem illustrated" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A good log management solution must balance both: newly ingested data searchable immediately, stored in a format that also supports efficient querying over time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performant and Cost-Effective Search
&lt;/h2&gt;

&lt;p&gt;As covered in Part II, the right data format and storage strategy are the foundation. Key techniques include indexing, Bloom filtering, and data partitioning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Needle-in-a-haystack queries
&lt;/h3&gt;

&lt;p&gt;Indexing and Bloom filtering shine when searching for data that appears infrequently across a large time range — for example, finding a specific &lt;code&gt;trace_id&lt;/code&gt; across several terabytes of log data. As explained in &lt;a href="https://www.bronto.io/blog/why-is-bronto-so-fast" rel="noopener noreferrer"&gt;Why is Bronto so fast at searching logs&lt;/a&gt;, well-designed indexing and Bloom filtering can dramatically reduce the volume of data scanned, narrowing the dataset to a much smaller subset more likely to contain the target value.&lt;/p&gt;

&lt;h3&gt;
  
  
  Full-scan analytical queries
&lt;/h3&gt;

&lt;p&gt;Some queries can't be narrowed. If you want the maximum response time per endpoint over the past few months, every log entry must be examined — there's no rare value to isolate, no filter to push down, no partition to skip.&lt;/p&gt;

&lt;p&gt;Pre-aggregated summaries could help &lt;em&gt;if&lt;/em&gt; you know in advance exactly how users will slice their data. But general-purpose log management systems can't predict every analytical angle users will need. Full dataset scans are unavoidable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6866b23e38a451227489d03c_Brute%2520Force%2520Search%2520%282%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6866b23e38a451227489d03c_Brute%2520Force%2520Search%2520%282%29.png" alt="Brute force search at scale" width="799" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For these cases, the only viable solution is &lt;strong&gt;brute-force compute&lt;/strong&gt;: massive parallelism and high-performance processing to deliver results even when every record must be touched.&lt;/p&gt;
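
&lt;p&gt;The scatter/gather shape of such a query can be sketched as follows. This is a minimal illustration, not Bronto's implementation: it assumes records are already parsed into &lt;code&gt;(endpoint, response_ms)&lt;/code&gt; pairs and split into chunks, the names &lt;code&gt;scan_chunk&lt;/code&gt; and &lt;code&gt;merge_maxima&lt;/code&gt; are hypothetical, and a local thread pool stands in for a fleet of remote workers.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk):
    """Scan one chunk and return the max response time seen per endpoint."""
    maxima = {}
    for endpoint, response_ms in chunk:
        maxima[endpoint] = max(response_ms, maxima.get(endpoint, response_ms))
    return maxima


def merge_maxima(partials):
    """Combine per-chunk maxima into a single result."""
    merged = {}
    for partial in partials:
        for endpoint, response_ms in partial.items():
            merged[endpoint] = max(response_ms, merged.get(endpoint, response_ms))
    return merged


def max_response_per_endpoint(chunks, workers=8):
    # Every record is still touched once; parallelism only shortens wall-clock time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge_maxima(pool.map(scan_chunk, chunks))


# Hypothetical sample data: three chunks of (endpoint, response_ms) pairs.
chunks = [
    [("/login", 120), ("/search", 340)],
    [("/login", 95), ("/search", 810)],
    [("/checkout", 410), ("/login", 230)],
]
result = max_response_per_endpoint(chunks)
```

&lt;p&gt;Because taking a maximum is associative, each chunk can be reduced independently and the partial results merged in any order, which is what makes this kind of query easy to spread across many workers.&lt;/p&gt;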

&lt;h3&gt;
  
  
  Bronto's approach: AWS Lambda for bursty workloads
&lt;/h3&gt;

&lt;p&gt;To support demanding full-scan queries while keeping costs in check, Bronto uses AWS Lambda functions. Lambda enables high concurrency — large volumes of data stored in S3 can be processed in parallel, on demand, with no infrastructure to provision or manage in advance.&lt;/p&gt;

&lt;p&gt;The cost model is key: you only pay for compute time used. Even when running many functions concurrently, short execution times keep overall cost low. This makes it ideal for &lt;strong&gt;bursty, unpredictable workloads&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That said, Lambda isn't always the right tool. When query volume consistently exceeds a certain threshold, sustained compute options like AWS EC2 become more cost-effective. The right architecture uses both: Lambda for bursts, EC2 for the baseline.&lt;/p&gt;
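
&lt;p&gt;The threshold can be reasoned about with simple arithmetic. The sketch below is a deliberate simplification: it assumes one second of pay-per-use compute does the same work as one second on the always-on instance, and every price in it is a made-up placeholder, not a real AWS rate.&lt;/p&gt;

```python
def pay_per_use_cost(busy_seconds, memory_gb, price_per_gb_second):
    """Pay-per-use: cost scales with the compute time actually consumed."""
    return busy_seconds * memory_gb * price_per_gb_second


def always_on_cost(hours, price_per_hour):
    """Always-on: cost is fixed whether or not queries are running."""
    return hours * price_per_hour


def break_even_busy_seconds(hours, price_per_hour, memory_gb, price_per_gb_second):
    """Busy seconds at which pay-per-use costs as much as the always-on option."""
    return always_on_cost(hours, price_per_hour) / (memory_gb * price_per_gb_second)


# Placeholder numbers for illustration only (not real prices).
break_even = break_even_busy_seconds(
    hours=24, price_per_hour=0.20, memory_gb=4, price_per_gb_second=0.00002
)
busy_hours_per_day = break_even / 3600
```

&lt;p&gt;With these placeholder values the break-even point is roughly 16.7 busy hours per day: below that, paying per use is cheaper; above it, the always-on option wins. Real figures depend on actual pricing and on how the two compute options compare in throughput.&lt;/p&gt;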




&lt;h2&gt;
  
  
  High Cardinality
&lt;/h2&gt;

&lt;p&gt;Log data frequently contains high-cardinality fields — client IP addresses, trace IDs, user IDs. Queries over these fields (e.g. counting unique IP addresses across a large dataset) can lead to slow performance, high memory consumption, and a &lt;a href="https://www.datacamp.com/tutorial/cardinality" rel="noopener noreferrer"&gt;poor user experience&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A naive solution is to cap the number of unique values the system handles — but that means users simply can't get value from their data beyond the cap.&lt;/p&gt;

&lt;p&gt;A better approach: compute &lt;strong&gt;exact results up to a certain cardinality threshold&lt;/strong&gt;, then switch to &lt;strong&gt;approximations&lt;/strong&gt; when cardinality genuinely becomes too large to handle exactly. Several probabilistic data structures make this practical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/" rel="noopener noreferrer"&gt;HyperLogLog&lt;/a&gt; — approximate distinct counts&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://redis.io/blog/count-min-sketch-the-art-and-science-of-estimating-stuff/" rel="noopener noreferrer"&gt;Count-Min Sketch&lt;/a&gt; — approximate frequency counts&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Cuckoo_filter" rel="noopener noreferrer"&gt;Cuckoo Filter&lt;/a&gt; — approximate set membership&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.vldb.org/conf/2004/RS17P3.PDF" rel="noopener noreferrer"&gt;Top-K&lt;/a&gt; — approximate top values by frequency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach keeps resource consumption bounded while still giving users meaningful, actionable insights from high-cardinality data.&lt;/p&gt;
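
&lt;p&gt;A minimal sketch of the "exact first, approximate later" idea for distinct counts, using a small HyperLogLog written from scratch. The class name &lt;code&gt;DistinctCounter&lt;/code&gt;, the threshold, and the register count are illustrative assumptions, not Bronto's actual parameters.&lt;/p&gt;

```python
import hashlib
import math


class DistinctCounter:
    """Exact distinct count up to `threshold`, HyperLogLog estimate beyond it."""

    def __init__(self, threshold=10_000, precision=12):
        self.threshold = threshold
        self.precision = precision      # number of hash bits used to pick a register
        self.m = 2 ** precision         # number of registers
        self.exact = set()              # used while cardinality is small
        self.registers = None           # allocated when switching to approximate mode

    @staticmethod
    def _hash64(value):
        digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _add_to_sketch(self, value):
        h = self._hash64(value)
        rest_bits = 64 - self.precision
        index = h >> rest_bits                      # top bits choose the register
        rest = h % (2 ** rest_bits)                 # remaining bits
        rank = rest_bits - rest.bit_length() + 1    # position of the first 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def add(self, value):
        if self.registers is not None:
            self._add_to_sketch(value)
            return
        self.exact.add(value)
        if len(self.exact) > self.threshold:
            # Too many distinct values: replay them into the sketch and free the set.
            self.registers = [0] * self.m
            for seen in self.exact:
                self._add_to_sketch(seen)
            self.exact = None

    def count(self):
        if self.registers is None:
            return len(self.exact)
        alpha = 0.7213 / (1 + 1.079 / self.m)
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if zeros and 2.5 * self.m >= estimate:
            estimate = self.m * math.log(self.m / zeros)  # small-range correction
        return round(estimate)


# Hypothetical workload: 50,000 distinct client identifiers.
counter = DistinctCounter(threshold=1000)
for i in range(50_000):
    counter.add(f"client-{i}")
estimate = counter.count()
```

&lt;p&gt;Once the counter switches modes, memory stays fixed at the size of the register array regardless of how many more distinct values arrive; the price is a small relative error (on the order of a couple of percent with 4,096 registers).&lt;/p&gt;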




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This wraps up the three-part &lt;em&gt;Log Management Cost Trap&lt;/em&gt; series. Across ingestion, storage, and search, the same theme emerges: design decisions in one layer constrain and shape what's possible in the others. Trade-offs are unavoidable, and navigating toward an optimal solution requires deep experience across all three.&lt;/p&gt;

&lt;p&gt;Bronto brings 150+ years of combined experience in log management at scale and builds that experience into a platform designed to be cost-efficient, high-performance, and &lt;a href="https://www.bronto.io/manifesto" rel="noopener noreferrer"&gt;ready for logging in the AI era&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bronto.io/book-a-demo" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;See Bronto in Action&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>devops</category>
      <category>infrastructure</category>
      <category>observability</category>
    </item>
    <item>
      <title>The Log Management Cost Trap: Part II — Storage</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Mon, 18 May 2026 21:07:26 +0000</pubDate>
      <link>https://forem.com/bronto_io/the-log-management-cost-trap-part-ii-storage-45la</link>
      <guid>https://forem.com/bronto_io/the-log-management-cost-trap-part-ii-storage-45la</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Benoit Gaudin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://www.bronto.io/blog/cost-trap-ingestion" rel="noopener noreferrer"&gt;Part I&lt;/a&gt; of this series, I explored the challenges of designing, running, and managing a centralised log management solution, with a focus on data ingestion. In Part II, I focus on &lt;strong&gt;data storage&lt;/strong&gt;. Part III covers search.&lt;/p&gt;

&lt;p&gt;I'll discuss different storage types and how their characteristics can fulfil the requirements of log management solutions, how data is organised within these systems, and the role of file formats in enabling efficient ingestion, storage, and retrieval.&lt;/p&gt;




&lt;h2&gt;
  
  
  Storage Types
&lt;/h2&gt;

&lt;p&gt;When evaluating storage options, the type of storage medium is the first decision to make. File systems and blob storage each come with distinct characteristics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Disks and File Systems
&lt;/h3&gt;

&lt;p&gt;File systems operate at a lower level of abstraction and often require explicit management of storage capacity, throughput, and IOPS. Managed services like AWS EFS and FSx simplify some of this — EFS, for example, supports automatic scaling of storage and throughput capacity.&lt;/p&gt;

&lt;p&gt;One major advantage of file systems is the ability to &lt;strong&gt;append data to existing files&lt;/strong&gt;. This is especially relevant in log management, where data is immutable and continuously streamed.&lt;/p&gt;

&lt;p&gt;At Bronto, we leverage file systems for data aggregation — specifically their ability to append to files. Aggregation runs over a few hours before data is transferred to blob storage, so the storage footprint stays modest and cost-effective. This aggregation phase prevents small files from landing on blob storage, which is &lt;a href="https://www.researchgate.net/publication/362278375_The_Small_Files_Issue_in_Big_Data_Platforms_Problem_and_Solutions" rel="noopener noreferrer"&gt;known to cause performance issues&lt;/a&gt; at query time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F682c90048f12094fb1951144_Frame%2520626%2520%285%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F682c90048f12094fb1951144_Frame%2520626%2520%285%29.png" alt="File system aggregation architecture" width="800" height="251"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Blob Storage
&lt;/h3&gt;

&lt;p&gt;Blob storage is a popular choice for data analytics workloads due to its scalability and cost-effectiveness. Unlike file systems, blob storage doesn't support appending: objects must be rewritten entirely when modified.&lt;/p&gt;

&lt;p&gt;The pricing model differs significantly: costs include both storage and per-transaction API operations (writes, reads). Overall, blob storage is more cost-efficient than remote disks for large, infrequently modified datasets.&lt;/p&gt;

&lt;p&gt;Blob storage also supports extremely high throughput. AWS S3, for instance, enables massive parallel processing — making it ideal for data-intensive workloads like &lt;a href="https://aws.amazon.com/emr/" rel="noopener noreferrer"&gt;AWS EMR&lt;/a&gt; and &lt;a href="https://aws.amazon.com/athena/" rel="noopener noreferrer"&gt;AWS Athena&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The tradeoff: blob storage isn't well-suited for frequent appends or aggregations. Solutions like &lt;a href="https://youtu.be/mNneCaZewTg?t=1318" rel="noopener noreferrer"&gt;Datadog Husky&lt;/a&gt; and &lt;a href="https://clickhouse.com/docs/en/merges" rel="noopener noreferrer"&gt;ClickHouse&lt;/a&gt; use &lt;strong&gt;compaction&lt;/strong&gt; to address this — writing many small objects over time, then consolidating them into larger ones.&lt;/p&gt;

&lt;p&gt;Bronto combines both: blob storage for long-term, large immutable files; file storage for short-term data aggregation. This balance optimises both performance and cost at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  File Formats and Data Organisation
&lt;/h2&gt;

&lt;p&gt;File format alone doesn't determine query performance — how data is physically organised in storage matters just as much. Here are the key techniques.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compression
&lt;/h3&gt;

&lt;p&gt;Compression is essential at scale. The primary benefit is reduced storage footprint, translating directly into lower costs. At large volumes, the savings are substantial.&lt;/p&gt;

&lt;p&gt;That said, maximum compression isn't always ideal. Higher compression ratios demand more CPU, memory, and time — increasing compute cost. The right point on the curve depends on your access patterns.&lt;/p&gt;
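
&lt;p&gt;A quick way to see the trade-off is to compress the same data at several levels. The sketch below uses Python's standard &lt;code&gt;zlib&lt;/code&gt; module on made-up, log-like lines; the sample data is an assumption, and real ratios depend heavily on the actual logs and codec.&lt;/p&gt;

```python
import zlib

# Hypothetical sample: repetitive, log-like lines compress well.
lines = [
    f"2026-05-18T13:{i % 60:02d}:00Z INFO service=api status=200 path=/v1/items/{i}"
    for i in range(5000)
]
raw = "\n".join(lines).encode("utf-8")

sizes = {}
for level in (1, 6, 9):
    # Higher levels spend more CPU searching for matches to shrink the output.
    sizes[level] = len(zlib.compress(raw, level))

for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(raw):.1%} of original)")
```

&lt;p&gt;Timing each level on a representative sample of your own data (for example with &lt;code&gt;time.perf_counter&lt;/code&gt;) shows where extra CPU stops paying for itself.&lt;/p&gt;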

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F682c9858c12b27fb2c262daf_Compression%2520Trade%2520Off%2520%281%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F682c9858c12b27fb2c262daf_Compression%2520Trade%2520Off%2520%281%29.png" alt="Compression trade-off diagram" width="800" height="529"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Row-based vs. Column-based Formats
&lt;/h3&gt;

&lt;p&gt;In row-oriented storage, all fields for each record are stored together sequentially. In column-oriented storage, all values for each field are stored together.&lt;/p&gt;

&lt;p&gt;Row-oriented formats suit unstructured data with write-intensive workloads. But with the rise of structured logging and agents that annotate data with attributes, &lt;strong&gt;columnar formats have become increasingly relevant for log data&lt;/strong&gt; — enabling much more efficient scans when you only need specific fields.&lt;/p&gt;
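
&lt;p&gt;The difference between the two layouts can be shown with plain Python structures. This is only an in-memory illustration with made-up records; real columnar formats add encoding, compression, and metadata on top.&lt;/p&gt;

```python
# Row-oriented: each record keeps all of its fields together.
rows = [
    {"ts": 1, "level": "INFO", "status": 200, "msg": "ok"},
    {"ts": 2, "level": "ERROR", "status": 500, "msg": "upstream timeout"},
    {"ts": 3, "level": "INFO", "status": 200, "msg": "ok"},
]

# Column-oriented: one list per field, aligned by position.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# Row layout: every record is read in full, even though only "status" matters.
errors_row = sum(1 for row in rows if row["status"] >= 500)

# Column layout: only the "status" column is read.
errors_col = sum(1 for status in columns["status"] if status >= 500)
```

&lt;p&gt;Both computations give the same answer; the columnar one simply never touches &lt;code&gt;msg&lt;/code&gt;, which is usually the largest field in a log record.&lt;/p&gt;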

&lt;h3&gt;
  
  
  Partitioning
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F279rn6so8e2zafdd86ol.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F279rn6so8e2zafdd86ol.png" alt="Partitioning diagram" width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Partitioning divides large datasets into smaller segments so queries can skip irrelevant data entirely. The key is choosing a logical criterion for segmentation.&lt;/p&gt;

&lt;p&gt;For log data, time-based partitioning is the natural choice: queries almost always specify a time range, so only the partitions that overlap that range need to be scanned. This dramatically reduces both the volume of data read and the cost of doing so, especially when data is retained over months or years.&lt;/p&gt;
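
&lt;p&gt;A minimal sketch of time-based partition pruning, assuming hourly partitions held in a dictionary. The hourly granularity and the function names are illustrative choices, not a description of any particular product.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def partition_key(ts):
    """Truncate a timestamp to the hour it belongs to."""
    return ts.replace(minute=0, second=0, microsecond=0)


def build_partitions(events):
    partitions = defaultdict(list)
    for ts, message in events:
        partitions[partition_key(ts)].append((ts, message))
    return partitions


def query(partitions, start, end):
    """Return events in [start, end), touching only the partitions that overlap it."""
    hits, scanned = [], 0
    hour = partition_key(start)
    while end > hour:
        if hour in partitions:
            scanned += 1
            hits.extend(e for e in partitions[hour] if end > e[0] >= start)
        hour += timedelta(hours=1)
    return hits, scanned


# Hypothetical data: one event every 30 minutes for 48 hours (48 hourly partitions).
base = datetime(2026, 5, 18, tzinfo=timezone.utc)
events = [(base + timedelta(minutes=30 * i), f"event {i}") for i in range(96)]
partitions = build_partitions(events)

hits, scanned = query(
    partitions, base + timedelta(hours=5, minutes=15), base + timedelta(hours=7)
)
```

&lt;p&gt;Here the query reads 2 of the 48 partitions; the other 46 are never opened.&lt;/p&gt;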

&lt;h3&gt;
  
  
  Indexing
&lt;/h3&gt;

&lt;p&gt;Indexes work like a book index: rather than reading the entire dataset to find a value, you consult the index to jump directly to where it lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inverted indexes&lt;/strong&gt; are especially effective for searching uncommon values across large datasets. The tradeoff is size — inverted indexes can grow &lt;a href="https://www.elastic.co/blog/elasticsearch-storage-the-true-story" rel="noopener noreferrer"&gt;as large as the original dataset&lt;/a&gt; in some cases, significantly increasing storage cost.&lt;/p&gt;
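
&lt;p&gt;A toy inverted index makes both the benefit and the size cost visible. The whitespace tokenisation and the sample log lines are assumptions made for brevity.&lt;/p&gt;

```python
from collections import defaultdict

logs = [
    "user=alice action=login status=ok",
    "user=bob action=login status=failed",
    "user=alice action=purchase status=ok",
]


def build_inverted_index(documents):
    """Map each token to the set of document positions that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for token in text.split():
            index[token].add(doc_id)
    return index


def search(index, *tokens):
    """Return ids of documents containing every token, without scanning the documents."""
    postings = [index.get(token, set()) for token in tokens]
    return sorted(set.intersection(*postings)) if postings else []


index = build_inverted_index(logs)
```

&lt;p&gt;Every distinct token gets its own entry, so fields with many unique values (IDs, timestamps) inflate the index quickly, which is the size trade-off described above.&lt;/p&gt;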

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filbwa7ogckdg1iz5tlbf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filbwa7ogckdg1iz5tlbf.png" alt="Indexing diagram" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Predicate Pushdown
&lt;/h3&gt;

&lt;p&gt;Predicate pushdown evaluates filter conditions using file metadata or summary statistics — without downloading or inspecting full file contents. File formats like &lt;a href="https://parquet.apache.org/" rel="noopener noreferrer"&gt;Parquet&lt;/a&gt; support this by storing column statistics (min/max values) in each data block.&lt;/p&gt;

&lt;p&gt;If the statistics for a file guarantee that a filter condition can't match any record in it, the entire file can be skipped. At scale, with datasets spread across many files, this can dramatically reduce both data transfer and compute cost.&lt;/p&gt;
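
&lt;p&gt;The skipping logic can be sketched with per-file min/max statistics. The &lt;code&gt;FileStats&lt;/code&gt; structure below is a hypothetical stand-in for the metadata a format like Parquet keeps; it is not Parquet's actual API.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    name: str
    min_status: int
    max_status: int
    rows: list  # stand-in for the file body, which is only read when needed


def files_to_scan(files, wanted_status):
    """Keep only files whose min/max range could contain the wanted value."""
    return [f for f in files if f.max_status >= wanted_status >= f.min_status]


files = [
    FileStats("part-000", 200, 204, [200, 201, 204]),
    FileStats("part-001", 200, 404, [200, 404, 301]),
    FileStats("part-002", 500, 503, [500, 503]),
]

# Filter: status == 503. Only files whose range covers 503 are opened.
candidates = files_to_scan(files, 503)
matches = sum(row == 503 for f in candidates for row in f.rows)
```

&lt;p&gt;Min/max statistics work best when data is sorted or clustered on the filtered column; if every file spans the full value range, nothing can be skipped.&lt;/p&gt;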

&lt;h3&gt;
  
  
  Bloom Filters
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;Bloom filter&lt;/strong&gt; is a probabilistic data structure that answers one question: is a value &lt;em&gt;definitely not present&lt;/em&gt;, or &lt;em&gt;possibly present&lt;/em&gt;, in a dataset?&lt;/p&gt;

&lt;p&gt;When a file's Bloom filter returns "definitely not," the system skips that file entirely — no scan needed. Compared to inverted indexes, Bloom filters are smaller and more lightweight. They don't pinpoint exact data locations, but they're highly effective at eliminating irrelevant files before any data is transferred.&lt;/p&gt;
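
&lt;p&gt;A minimal Bloom filter fits in a few lines. The sizes, hash construction, and sample trace IDs below are illustrative assumptions; production implementations pack bits tightly and tune the parameters to a target false-positive rate.&lt;/p&gt;

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, value):
        # Derive several positions by salting one hash function with a counter.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # False means "definitely not present"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


# One filter per file, built at write time from the values the file contains.
file_filter = BloomFilter()
for trace_id in ("trace-001", "trace-002", "trace-003"):
    file_filter.add(trace_id)
```

&lt;p&gt;At query time, a file is opened only if &lt;code&gt;might_contain&lt;/code&gt; returns true for the searched value. A false positive costs one unnecessary scan; a false negative cannot happen.&lt;/p&gt;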

&lt;h3&gt;
  
  
  Dictionary Encoding
&lt;/h3&gt;

&lt;p&gt;Dictionary encoding optimises storage and search for key-value pairs where values have &lt;strong&gt;low cardinality&lt;/strong&gt;, such as country names, log levels, and environment tags. Instead of storing the full value in every row, a compact reference (an index into the dictionary) is stored, and the actual values live in a separate dictionary.&lt;/p&gt;

&lt;p&gt;This reduces storage size and enables a query optimisation: if a query filters on a value that doesn't appear in a file's dictionary at all, no row in that file can match, so that file's column doesn't need to be scanned.&lt;/p&gt;
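
&lt;p&gt;Both the encoding and the skip check can be sketched in a few lines. The function names and the sample column are hypothetical.&lt;/p&gt;

```python
def dictionary_encode(values):
    """Return (dictionary, codes): unique values plus one small integer per row."""
    dictionary, positions, codes = [], {}, []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        codes.append(positions[value])
    return dictionary, codes


def can_skip(dictionary, wanted):
    """If the wanted value is absent from the dictionary, no row can match."""
    return wanted not in dictionary


levels = ["INFO", "INFO", "WARN", "INFO", "ERROR", "INFO"]
dictionary, codes = dictionary_encode(levels)
decoded = [dictionary[code] for code in codes]
```

&lt;p&gt;Six strings become three dictionary entries plus six small integers, and a filter such as &lt;code&gt;level = "DEBUG"&lt;/code&gt; can be rejected by looking at the dictionary alone.&lt;/p&gt;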




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Developing a storage strategy for a large-scale log management system demands deep expertise and a clear understanding of data ingestion and access patterns. The choices made at the storage layer directly shape what's possible — and what it costs — at the ingestion and search layers.&lt;/p&gt;

&lt;p&gt;Bronto combines file storage for aggregation and blob storage for long-term retention, and borrows techniques from databases and analytics engines — partitioning, Bloom filtering, predicate pushdown, and dictionary encoding — to achieve high search performance at low cost.&lt;/p&gt;

&lt;p&gt;In Part III, I'll focus on the approaches and economics of search, and detail how Bronto uses AWS Lambda to provide a fast, cost-effective way to process large volumes of data stored in S3.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bronto.io/book-a-demo" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;See How Bronto Handles This&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>devops</category>
      <category>infrastructure</category>
      <category>observability</category>
    </item>
    <item>
      <title>The Log Management Cost Trap: Ingestion</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Mon, 18 May 2026 13:14:29 +0000</pubDate>
      <link>https://forem.com/bronto_io/the-log-management-cost-trap-ingestion-44mn</link>
      <guid>https://forem.com/bronto_io/the-log-management-cost-trap-ingestion-44mn</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Benoit Gaudin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For systems with low log data volumes, self-hosting open-source solutions or using SaaS free plans are often excellent starting points. But as data volume inevitably grows, the complexity and cost of these solutions often make them unviable.&lt;/p&gt;

&lt;p&gt;This post is for you if your logging costs have risen to a point where you're hesitant to send more data, or are excluding certain sources because of what they'd cost to ingest. At that point you're typically faced with two options: invest resources to reduce costs within your existing solution (reducing retention, archiving data, etc.), or build your own logging system for better cost control.&lt;/p&gt;

&lt;p&gt;For centralised log management systems, the sheer volume of data and its unstructured nature are typically the biggest factors driving cost and complexity. I break these challenges down into three key areas:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ingesting large volumes of data&lt;/li&gt;
&lt;li&gt;Storing large volumes of data&lt;/li&gt;
&lt;li&gt;Querying large volumes of data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These challenges are closely related — design decisions in one area directly impact the others. This post focuses on &lt;strong&gt;ingestion&lt;/strong&gt;. Storage and search will be tackled in follow-up posts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ingestion
&lt;/h2&gt;

&lt;p&gt;Ingestion is the part of the system that receives data and processes it to make it searchable. Because of the volumes involved, log management solutions share many similarities with data analytics engines like Hadoop or Spark — but with one critical difference: &lt;strong&gt;data must be searchable in real time, or with minimal delay.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This freshness requirement exists because log management supports urgent troubleshooting use cases. In a production incident, engineers need access to logs from the last few minutes immediately — they can't wait for data to be batched. At the same time, other use cases (like browser version analysis across months of traffic) don't require fresh data at all.&lt;/p&gt;

&lt;p&gt;Because log management must support both real-time troubleshooting &lt;em&gt;and&lt;/em&gt; analytical queries over large historical datasets, it can't rely solely on off-the-shelf analytics platforms. The ingestion pipeline has to be designed with both speed and scale in mind.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6820ade361e4643aade657b2_6814df9a709faf9547030fb2_Group%2525201171277535%252520%282%29.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.prod.website-files.com%2F67d2e1c8bd118640c72006cb%2F6820ade361e4643aade657b2_6814df9a709faf9547030fb2_Group%2525201171277535%252520%282%29.png" alt="Ingestion architecture diagram" width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Reliability
&lt;/h2&gt;

&lt;p&gt;Upon receiving data, the system must acknowledge receipt and ensure the data is handled safely, without loss. Mechanisms like &lt;strong&gt;data buffering&lt;/strong&gt; must be in place to gracefully handle temporary issues, such as a slow or unavailable downstream component.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kafka.apache.org/" rel="noopener noreferrer"&gt;Apache Kafka&lt;/a&gt; is an effective and commonly used solution for data buffering at scale, integrated into many log management solutions including ELK, Datadog, and Honeycomb. A Kafka layer in the ingestion pipeline allows the system to absorb temporary processing impediments without data loss.&lt;/p&gt;

&lt;p&gt;That said, efficient Kafka cluster management requires real expertise. Even with managed cloud offerings like AWS MSK, the overhead can be substantial and costly at large data volumes.&lt;/p&gt;
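
&lt;p&gt;The role of the buffer can be illustrated without Kafka at all. The sketch below uses an in-memory queue purely to show the control flow; unlike Kafka it is not durable or distributed, and the class and function names are hypothetical.&lt;/p&gt;

```python
from collections import deque


class BufferedPipeline:
    """Acknowledge on receipt, buffer, and deliver once the downstream is healthy."""

    def __init__(self, sink):
        self.sink = sink        # callable that may raise when downstream is unavailable
        self.buffer = deque()

    def receive(self, record):
        # Held in the buffer before acknowledging (a real system would persist it).
        self.buffer.append(record)
        return "ack"

    def flush(self):
        """Deliver buffered records in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self.buffer:
            try:
                self.sink(self.buffer[0])
            except ConnectionError:
                break
            self.buffer.popleft()
            delivered += 1
        return delivered


stored, downstream_up = [], {"value": False}


def sink(record):
    if not downstream_up["value"]:
        raise ConnectionError("indexer unavailable")
    stored.append(record)


pipeline = BufferedPipeline(sink)
for line in ("a", "b", "c"):
    pipeline.receive(line)

during_outage = pipeline.flush()    # downstream is down: nothing delivered, nothing lost
downstream_up["value"] = True
after_recovery = pipeline.flush()   # backlog drains in the original order
```

&lt;p&gt;Producers get their acknowledgement immediately, and the backlog drains once the downstream recovers. What a system like Kafka adds on top of this shape is durability, replication, and partitioned parallel consumption.&lt;/p&gt;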




&lt;h2&gt;
  
  
  Indexing and Partitioning
&lt;/h2&gt;

&lt;p&gt;When ingesting log data, how you organise it in the backend directly determines how it can be searched later. Two main approaches exist:&lt;/p&gt;

&lt;h3&gt;
  
  
  Index-based
&lt;/h3&gt;

&lt;p&gt;Systems like &lt;a href="https://github.com/elastic/elasticsearch" rel="noopener noreferrer"&gt;Elasticsearch&lt;/a&gt; and &lt;a href="https://github.com/opensearch-project/OpenSearch" rel="noopener noreferrer"&gt;OpenSearch&lt;/a&gt; build indexes that point to exact locations of relevant data. This offers good search performance but typically requires extracting key-value pairs from logs (e.g. via Logstash in the ELK stack) — and the index itself can grow to a significant size.&lt;/p&gt;

&lt;h3&gt;
  
  
  Partition-based
&lt;/h3&gt;

&lt;p&gt;No index is involved. Instead, data is organised so that large portions can be skipped entirely at query time. Most log management solutions partition by time range, since log data is timestamped and queries almost always specify a time window.&lt;/p&gt;

&lt;p&gt;Some solutions go further and partition on additional attributes beyond time — &lt;a href="https://grafana.com/docs/loki/latest/get-started/labels/" rel="noopener noreferrer"&gt;Grafana Loki&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/athena/latest/ug/partitions.html" rel="noopener noreferrer"&gt;AWS Athena&lt;/a&gt; are good examples. Athena stores data on S3 and uses separate prefixes per partition to avoid full-dataset scans.&lt;/p&gt;
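
&lt;p&gt;A sketch of how partition values can be encoded in object key prefixes, using the Hive-style &lt;code&gt;key=value&lt;/code&gt; layout. The &lt;code&gt;logs/&lt;/code&gt; root, the &lt;code&gt;service&lt;/code&gt; attribute, and the function names are assumptions for illustration.&lt;/p&gt;

```python
from datetime import date, datetime, timezone


def partition_prefix(service, ts):
    """Build a Hive-style key prefix: one path segment per partition column."""
    return f"logs/service={service}/dt={ts:%Y-%m-%d}/hour={ts:%H}/"


def prefixes_for_query(service, day, hours):
    """List only the prefixes a query needs; all other prefixes are never read."""
    return [
        partition_prefix(
            service, datetime(day.year, day.month, day.day, h, tzinfo=timezone.utc)
        )
        for h in hours
    ]


prefixes = prefixes_for_query("checkout", date(2026, 5, 18), range(13, 15))
```

&lt;p&gt;A query for one service over two hours resolves to two prefixes, regardless of how many other services or days are stored alongside them.&lt;/p&gt;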

&lt;h3&gt;
  
  
  The hybrid approach
&lt;/h3&gt;

&lt;p&gt;Relying on indexing alone is expensive, because building indexes is a heavy task. Partitioning alone may not narrow the dataset efficiently enough. &lt;a href="https://www.youtube.com/watch?v=mNneCaZewTg&amp;amp;t=2213s" rel="noopener noreferrer"&gt;Datadog Husky&lt;/a&gt; uses a hybrid approach, and at Bronto we believe this is the right pattern: it provides multiple levers for tuning performance and cost independently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Append-only and Compaction
&lt;/h2&gt;

&lt;p&gt;Two competing requirements shape how data gets written:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fresh data&lt;/strong&gt; must be available to search quickly — ideally within seconds — meaning it must be written in small increments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large historical datasets&lt;/strong&gt; must be searchable efficiently, which favours large files and batch-oriented access patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Writing lots of small files creates the classic &lt;strong&gt;"small files problem"&lt;/strong&gt; in analytics workloads: many parallel compute units each making small network requests, which kills throughput. Two techniques address this:&lt;/p&gt;

&lt;h3&gt;
  
  
  Compaction
&lt;/h3&gt;

&lt;p&gt;Used by &lt;a href="https://youtu.be/mNneCaZewTg?t=1318" rel="noopener noreferrer"&gt;Datadog Husky&lt;/a&gt; and &lt;a href="https://clickhouse.com/docs/en/merges" rel="noopener noreferrer"&gt;ClickHouse&lt;/a&gt;, among others. Data is first stored in small units, then consolidated into larger ones over time. Because only recent data sits in small units, historical queries still run against large, consolidated ones.&lt;/p&gt;
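
&lt;p&gt;In its simplest form, compaction is a merge of small, time-sorted batches into fewer, larger ones. The sketch below is a generic illustration, not how Husky or ClickHouse implement it; the batch contents and &lt;code&gt;target_size&lt;/code&gt; are made up.&lt;/p&gt;

```python
import heapq


def compact(small_batches, target_size):
    """Merge time-sorted small batches into larger batches of up to target_size records."""
    merged = heapq.merge(*small_batches)  # streams records in timestamp order
    large, current = [], []
    for record in merged:
        current.append(record)
        if len(current) == target_size:
            large.append(current)
            current = []
    if current:
        large.append(current)
    return large


# Four small batches of (timestamp, payload) records, each already sorted.
small = [[(1, "a"), (4, "d")], [(2, "b"), (5, "e")], [(3, "c"), (6, "f")], [(7, "g")]]
large = compact(small, target_size=4)
```

&lt;p&gt;The cost of this approach is that every record is written at least twice: once into a small unit and again into the compacted one.&lt;/p&gt;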

&lt;h3&gt;
  
  
  Append-only
&lt;/h3&gt;

&lt;p&gt;Data is incrementally added to a growing unit. Easy on a file system, but problematic with object stores like AWS S3 — where appending isn't possible and the entire object must be rewritten on every update. This impacts both performance and ingestion cost.&lt;/p&gt;

&lt;p&gt;Despite that limitation, object stores are cost-efficient for long-term storage and well-suited to high-parallelism search access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bronto's approach
&lt;/h3&gt;

&lt;p&gt;We implemented a &lt;strong&gt;two-tier storage solution&lt;/strong&gt;: data is first appended to local files, making it immediately available to the search engine; once a file reaches a suitable size, it's uploaded to an object store. This avoids compaction entirely while still keeping fresh data searchable.&lt;/p&gt;
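
&lt;p&gt;The append-then-upload flow can be sketched with two local directories, one playing the role of the appendable tier and the other standing in for the object store. This is a simplified illustration under those assumptions, not Bronto's actual code; &lt;code&gt;TwoTierWriter&lt;/code&gt; and the size threshold are hypothetical.&lt;/p&gt;

```python
import os
import shutil
import tempfile


class TwoTierWriter:
    def __init__(self, local_dir, object_dir, max_bytes):
        self.local_dir = local_dir      # fast tier: supports appends
        self.object_dir = object_dir    # stand-in for an object store bucket
        self.max_bytes = max_bytes
        self.sequence = 0

    def _active_path(self):
        return os.path.join(self.local_dir, f"segment-{self.sequence:06d}.log")

    def append(self, line):
        path = self._active_path()
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")   # readable locally as soon as it is written
        if os.path.getsize(path) >= self.max_bytes:
            # Large enough: ship the whole file once, then start a new segment.
            shutil.move(path, os.path.join(self.object_dir, os.path.basename(path)))
            self.sequence += 1


root = tempfile.mkdtemp()
local_dir = os.path.join(root, "local")
object_dir = os.path.join(root, "objects")
os.makedirs(local_dir)
os.makedirs(object_dir)

writer = TwoTierWriter(local_dir, object_dir, max_bytes=100)
for i in range(10):
    writer.append(f"event {i:02d} " + "x" * 20)
```

&lt;p&gt;Each object is written to the second tier exactly once and at its final size, so there are no small objects to merge later.&lt;/p&gt;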

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduhektmwxsz9n0ii0o63.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduhektmwxsz9n0ii0o63.png" alt="Two-tier storage architecture" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Log management solutions are designed to handle vast amounts of unstructured data — a task that introduces significant cost and complexity. They must serve conflicting use cases: real-time troubleshooting that demands fresh data immediately, and analytical queries that demand efficient access to large historical datasets.&lt;/p&gt;

&lt;p&gt;At scale, choosing how to ingest data requires careful attention to the trade-offs between reliability, performance, cost, and system complexity. The expertise required to design, implement, and maintain this pipeline is substantial — and that's before accounting for storage and search.&lt;/p&gt;

&lt;p&gt;Subsequent posts will cover those remaining challenges. In the meantime, if your logging costs are already a problem worth solving, it's worth understanding what's driving them at each layer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bronto.io/book-a-demo" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;See How Bronto Handles This&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>devops</category>
      <category>infrastructure</category>
      <category>observability</category>
    </item>
    <item>
      <title>Build Your Own Telemetry UI Using Lovable &amp; Bronto</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Fri, 15 May 2026 16:27:48 +0000</pubDate>
      <link>https://forem.com/bronto_io/bring-your-own-telemetry-ui-using-lovable-bronto-5hh8</link>
      <guid>https://forem.com/bronto_io/bring-your-own-telemetry-ui-using-lovable-bronto-5hh8</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Feargal Karney &amp;amp; Mati Remi&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Bronto REST API &lt;strong&gt;now&lt;/strong&gt; exposes everything our own UI is built on. That means you can build a custom interface tailored exactly to your team's workflow, rather than having to use a general-purpose interface someone else designed.&lt;/p&gt;

&lt;p&gt;To make it easy to get started, we've published a baseline project on Lovable you can remix into your own workspace. For those who haven't used it, Lovable is an AI-powered frontend builder — think Figma meets Claude Code, but for shipping real React apps. It fuelled its way to $100M ARR in just 8 months.&lt;/p&gt;

&lt;p&gt;Here's how to create your very own BrontoVibe project.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started in 3 Steps
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Create a free account or log into &lt;a href="https://lovable.dev/" rel="noopener noreferrer"&gt;Lovable&lt;/a&gt;, then open the template &lt;a href="https://lovable.dev/projects/af3702ca-a831-411d-9620-789e05cf8e20" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Get an API Key from your Bronto account under &lt;strong&gt;Settings → API Keys → Add API Key&lt;/strong&gt; (API Full Access Role).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Remix the project and get prompting!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnp4qwsibw2rruoz5612.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnp4qwsibw2rruoz5612.gif" alt="Remixing the BrontoVibe project in Lovable" width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's it. No backend to deploy, no auth to configure, no infrastructure to manage.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Baseline Project Covers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Search &amp;amp; Explore
&lt;/h3&gt;

&lt;p&gt;Query via SQL or LogQL, view raw events, or plot them on a timeseries. A solid starting point you can extend however you need.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fapuexi81xnhje0alfq3y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fapuexi81xnhje0alfq3y.gif" alt="Search and explore in BrontoVibe" width="760" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Tracing
&lt;/h3&gt;

&lt;p&gt;Find errors and performance issues, then drill into spans across services. One click from any trace takes you to the correlated raw log data, which is useful when you're mid-incident and need context fast.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F046gswf1wixnitxslld4.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F046gswf1wixnitxslld4.gif" alt="Tracing in BrontoVibe" width="760" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Dashboards
&lt;/h3&gt;

&lt;p&gt;View your existing dashboards or ask Lovable to generate new widgets. This is where the AI-builder angle gets interesting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55y2aka5ffnye7ic4x0o.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55y2aka5ffnye7ic4x0o.gif" alt="Dashboards in BrontoVibe" width="760" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Usage
&lt;/h3&gt;

&lt;p&gt;Ingestion and search usage are broken down so you always know where your volume is going.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn00y0ij5w3inijkmp5ci.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn00y0ij5w3inijkmp5ci.gif" alt="Usage breakdown in BrontoVibe" width="760" height="392"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Ingestion Methods
&lt;/h2&gt;

&lt;p&gt;You can send data via:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quick Ingest&lt;/strong&gt; — raw log paste for fast testing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agents&lt;/strong&gt; — Fluent Bit, Vector, Datadog Agent, OpenTelemetry Collector&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrations&lt;/strong&gt; — Akamai, PagerDuty, and more&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the full list in the &lt;a href="https://docs.bronto.io/agent-setup/agent-intro" rel="noopener noreferrer"&gt;Bronto integrations docs&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;We look forward to seeing what you build!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lovable.dev/projects/af3702ca-a831-411d-9620-789e05cf8e20" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Remix the BrontoVibe Template&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>observability</category>
      <category>devops</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>Logging &amp; Observability Best Practices from Bronto</title>
      <dc:creator>Patrick Londa</dc:creator>
      <pubDate>Fri, 15 May 2026 13:39:40 +0000</pubDate>
      <link>https://forem.com/bronto_io/logging-observability-best-practices-from-bronto-59hc</link>
      <guid>https://forem.com/bronto_io/logging-observability-best-practices-from-bronto-59hc</guid>
      <description>&lt;p&gt;&lt;em&gt;Authored by Conall Heffernan&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Centralized logging is a good first step toward better log management. It lets you collect, store, and analyze logs from multiple sources in a single repository, which makes logs easier to manage and access for dev, support, product, and SRE teams, and makes security and compliance requirements easier to meet.&lt;/p&gt;

&lt;p&gt;Once your logs are centralized, the practices below will take you further. High-quality logs are the foundation of effective observability. Consistent, structured, and well-tagged log data allows teams to quickly identify performance issues, troubleshoot errors, and optimize cost and performance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If AI is defined as the intersection of where intelligence meets data … data quality is key in an AI world.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As AI systems automate more and more work, clean, high-quality logs open the door to further automation and efficiencies, and to new AI use cases.&lt;/p&gt;

&lt;p&gt;This guide covers recommended best practices for &lt;strong&gt;log structure and context enrichment&lt;/strong&gt;, &lt;strong&gt;correlation&lt;/strong&gt;, &lt;strong&gt;agent configuration&lt;/strong&gt;, &lt;strong&gt;team ownership&lt;/strong&gt;, and &lt;strong&gt;log strategy&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Log Structure and Context
&lt;/h2&gt;

&lt;p&gt;Tags, log metadata, and message attributes are all key–value pairs (KVPs), but they serve different purposes and live at different levels of your event stream:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tags&lt;/strong&gt; – Properties that apply to an entire stream of events (a dataset)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log metadata&lt;/strong&gt; – Properties added to individual log records, typically by the logging agent or its plugins&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message attributes&lt;/strong&gt; – Properties embedded directly in the log message itself&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tags: Properties of the Dataset
&lt;/h3&gt;

&lt;p&gt;Tags apply to all entries in a stream of events and are not visible as part of the log event itself. They are ideal for separating environments at query time (e.g. to avoid mixing staging and prod).&lt;/p&gt;

&lt;p&gt;Examples of good tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;production&lt;/span&gt;
&lt;span class="py"&gt;account_id&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;12345678&lt;/span&gt;
&lt;span class="py"&gt;region&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set tags via agent configuration so they are applied automatically to all data processed by that agent. Infrastructure-as-code tools such as Terraform or CloudFormation can set these tags consistently across your infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Log Metadata: Properties of the Source
&lt;/h3&gt;

&lt;p&gt;Log metadata consists of key–value pairs associated with a specific log record, typically added by the agent (often via plugins) rather than by the application itself. It usually describes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The host or node — e.g. &lt;code&gt;host_name=web-01&lt;/code&gt;, &lt;code&gt;os=linux&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The pod or container — e.g. &lt;code&gt;pod_name=api-6c8d3f5c2f-wz2vt&lt;/code&gt;, &lt;code&gt;namespace=payments&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The service name and version — e.g. &lt;code&gt;service=checkout-api&lt;/code&gt;, &lt;code&gt;version=2.3.1&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A key point: a single agent can process data from multiple hosts, pods, services, or versions, and the metadata will reflect those differences on a per-record basis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Message Attributes: Properties Inside the Log Message
&lt;/h3&gt;

&lt;p&gt;Message attributes are key–value pairs present inside the log message body itself, authored by application developers and specific to a single log entry. They're ideal for capturing fine-grained, per-request context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"info"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"request processed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"duration_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;duration_ms&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;123&lt;/span&gt;
&lt;span class="py"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;abc-123&lt;/span&gt;
&lt;span class="py"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two formats are supported out of the box:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The entire message follows &lt;strong&gt;JSON format&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;key=value&lt;/code&gt;&lt;/strong&gt; format within the log message (values may be quoted; &lt;code&gt;:&lt;/code&gt; can be used instead of &lt;code&gt;=&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;
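
&lt;p&gt;As a rough illustration of the two formats above, here is a minimal Python sketch using only the standard library. The &lt;code&gt;JsonFormatter&lt;/code&gt; class, the &lt;code&gt;format_key_value&lt;/code&gt; helper, and the &lt;code&gt;attrs&lt;/code&gt; attribute are hypothetical names chosen for this example, not part of any Bronto SDK.&lt;/p&gt;

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object (the first format above)."""

    def format(self, record):
        payload = {
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        }
        # Hypothetical convention: per-event attributes live on record.attrs.
        payload.update(getattr(record, "attrs", {}))
        return json.dumps(payload)


def format_key_value(message, **attrs):
    """Render a message followed by key=value pairs (the second format above)."""
    pairs = " ".join(f"{key}={json.dumps(value)}" for key, value in attrs.items())
    return f"{message} {pairs}"


# Build a record by hand so the example runs without configuring handlers.
record = logging.LogRecord(
    "checkout-api", logging.INFO, "example.py", 0, "request processed", None, None
)
record.attrs = {"duration_ms": 123}

json_line = JsonFormatter().format(record)
kv_line = format_key_value("request processed", duration_ms=123, request_id="abc-123")
```

&lt;p&gt;Either line keeps the attributes machine-readable. String values are quoted in the &lt;code&gt;key=value&lt;/code&gt; form so that values containing spaces stay intact.&lt;/p&gt;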

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Indexing is automatic in modern logging platforms — manually managing and configuring indexes is a time-consuming and cumbersome task you shouldn't need to do.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exception and Stack Trace Handling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;agent-side multiline support&lt;/strong&gt; (e.g., &lt;a href="https://docs.fluentbit.io/manual/data-pipeline/filters/multiline-stacktrace" rel="noopener noreferrer"&gt;Fluent Bit multiline filter&lt;/a&gt;) to capture stack traces as &lt;strong&gt;single log events&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Report exception name and stack trace as structured attributes:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;exception.type
exception.stacktrace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes it easy to query and alert on recurring or unexpected exceptions.&lt;/p&gt;
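
&lt;p&gt;One way an application might populate those two attributes is sketched below in Python. The &lt;code&gt;exception_attributes&lt;/code&gt; helper and the event shape are assumptions made for illustration; only the attribute names come from the list above.&lt;/p&gt;

```python
import json
import traceback


def exception_attributes(exc):
    """Build structured attributes for a caught exception."""
    return {
        "exception.type": type(exc).__name__,
        "exception.stacktrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


try:
    int("not-a-number")
except ValueError as exc:
    # Hypothetical event shape; the message text is illustrative.
    event = {"level": "error", "message": "failed to parse quantity"}
    event.update(exception_attributes(exc))
    line = json.dumps(event)
```

&lt;p&gt;Because JSON encoding escapes the newlines inside the stack trace, the whole exception travels as one log line and needs no multiline rule downstream.&lt;/p&gt;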




&lt;h2&gt;
  
  
  2. Correlation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Trace and Correlation IDs
&lt;/h3&gt;

&lt;p&gt;Add fields like &lt;code&gt;trace_id&lt;/code&gt;, &lt;code&gt;span_id&lt;/code&gt;, and &lt;code&gt;request_id&lt;/code&gt; to your logs so you can tie them back to a single user request or workflow across multiple services. In a distributed system, a single call can pass through frontends, APIs, queues, and background workers — without a shared ID, the logs from each hop look like isolated events.&lt;/p&gt;

&lt;p&gt;With a common ID, you can filter on that value and reconstruct the full timeline of "what happened where and when," instead of guessing based on timestamps and hosts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to add them — it's usually a combination of code and tooling:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A tracing library or standard (such as &lt;a href="https://opentelemetry.io/docs/concepts/signals/traces/" rel="noopener noreferrer"&gt;OpenTelemetry&lt;/a&gt;) generates and propagates trace and span context across service boundaries. Most logging frameworks can be configured to automatically include those IDs on every log entry.&lt;/li&gt;
&lt;li&gt;At the same time, use an application-level &lt;code&gt;request_id&lt;/code&gt; or correlation ID (often taken from or added to an HTTP header at the edge) and pass it through your services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A robust setup does both: it propagates tracing context (&lt;code&gt;trace_id&lt;/code&gt;, &lt;code&gt;span_id&lt;/code&gt;) alongside an application-level &lt;code&gt;request_id&lt;/code&gt;, and ensures all of these IDs are consistently present in logs so any logging or observability system can correlate events end-to-end.&lt;/p&gt;
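
&lt;p&gt;For the application-level ID, a minimal Python sketch might look like the following. It assumes the ID arrives in an &lt;code&gt;X-Request-ID&lt;/code&gt; header (a common convention, not a requirement), and &lt;code&gt;RequestIdFilter&lt;/code&gt; and &lt;code&gt;handle_request&lt;/code&gt; are hypothetical names. A tracing library would typically supply &lt;code&gt;trace_id&lt;/code&gt; and &lt;code&gt;span_id&lt;/code&gt; in a similar way.&lt;/p&gt;

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def handle_request(logger, headers):
    """Reuse the caller's ID when present, otherwise mint a new one."""
    request_id = headers.get("X-Request-ID") or uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        logger.info("request processed")
    finally:
        request_id_var.reset(token)
    return request_id


# Capture output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s request_id=%(request_id)s %(message)s")
)
handler.addFilter(RequestIdFilter())

logger = logging.getLogger("checkout-api")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

handle_request(logger, {"X-Request-ID": "abc-123"})
output = stream.getvalue()
```

&lt;p&gt;Attaching the ID in a filter means individual log calls never have to pass it explicitly, so it cannot be forgotten on one code path.&lt;/p&gt;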




&lt;h2&gt;
  
  
  3. Agent Configuration &amp;amp; Processing
&lt;/h2&gt;

&lt;p&gt;The OpenTelemetry Collector and similar agents like Fluent Bit, Logstash, and Vector can enrich, sanitize, and optimize log data before it ever reaches storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommended Configurations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Redact PII&lt;/strong&gt; before logs leave your infrastructure. Mask or drop fields like emails, full names, IPs, IDs, and tokens at the agent or collector level — so even if logs are leaked or shared, sensitive data isn't exposed.&lt;/p&gt;
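
&lt;p&gt;In practice this masking runs inside the agent or collector, but the logic can be sketched in a few lines of Python. The patterns, the key list, and the &lt;code&gt;redact&lt;/code&gt; function below are illustrative assumptions and are far from exhaustive.&lt;/p&gt;

```python
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Hypothetical list of keys whose values are always dropped.
SENSITIVE_KEYS = {"password", "token", "authorization"}


def redact(event):
    """Return a copy of the event with sensitive values masked."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            value = EMAIL_PATTERN.sub("[EMAIL]", value)
            clean[key] = IPV4_PATTERN.sub("[IP]", value)
        else:
            clean[key] = value
    return clean


redacted = redact({
    "message": "login from 203.0.113.7 for jane@example.com",
    "token": "s3cr3t",
    "duration_ms": 42,
})
```

&lt;p&gt;Masking by key catches secrets in known fields, while the patterns catch values embedded in free-text messages. Real deployments usually need both.&lt;/p&gt;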

&lt;p&gt;&lt;strong&gt;Configure multiline stacktrace handling&lt;/strong&gt; so full exceptions are captured as a single log event instead of being split into many noisy lines. This typically means using a multiline rule that continues a record while lines match patterns like &lt;code&gt;^\s+at&lt;/code&gt;.&lt;/p&gt;
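
&lt;p&gt;The continuation rule can be pictured with a short Python sketch. This is a simplified model of what an agent does, using the pattern mentioned above; &lt;code&gt;group_multiline&lt;/code&gt; is a hypothetical helper, and the sample lines are made up.&lt;/p&gt;

```python
import re

# A continuation line: an indented "at ..." frame, as in Java-style stack traces.
CONTINUATION = re.compile(r"^\s+at")


def group_multiline(lines):
    """Merge continuation lines into the event that precedes them."""
    events = []
    for line in lines:
        if events and CONTINUATION.match(line):
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


raw_lines = [
    "ERROR java.lang.IllegalStateException: cart is empty",
    "    at com.example.Checkout.submit(Checkout.java:42)",
    "    at com.example.Api.handle(Api.java:17)",
    "INFO request processed",
]
events = group_multiline(raw_lines)
```

&lt;p&gt;Here four raw lines become two events. A production rule would also need to handle lines such as &lt;code&gt;Caused by:&lt;/code&gt;, which this single pattern does not match.&lt;/p&gt;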

&lt;p&gt;&lt;strong&gt;Normalize log levels&lt;/strong&gt; before shipping. If you don't, breakdowns by log level in dashboards will look fragmented: instead of a clean &lt;code&gt;INFO / WARN / ERROR&lt;/code&gt; split, you'll see many small buckets such as &lt;code&gt;info&lt;/code&gt;, &lt;code&gt;Info&lt;/code&gt;, and &lt;code&gt;INFO&lt;/code&gt;, or &lt;code&gt;error&lt;/code&gt; and &lt;code&gt;ERR&lt;/code&gt;, where each group means the same thing.&lt;/p&gt;
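
&lt;p&gt;A normalization step can be as small as a lookup table. The Python sketch below shows the idea; the mapping and the &lt;code&gt;normalize_level&lt;/code&gt; function are assumptions for this example, and your own canonical set may differ.&lt;/p&gt;

```python
# Hypothetical mapping from common spellings to one canonical level each.
CANONICAL_LEVELS = {
    "trace": "TRACE",
    "debug": "DEBUG",
    "info": "INFO",
    "information": "INFO",
    "warn": "WARN",
    "warning": "WARN",
    "err": "ERROR",
    "error": "ERROR",
    "fatal": "FATAL",
    "critical": "FATAL",
}


def normalize_level(raw):
    """Map a level spelling to its canonical form; keep unknown values visible."""
    key = str(raw).strip().lower()
    return CANONICAL_LEVELS.get(key, str(raw).strip().upper())


levels = [normalize_level(value) for value in ["info", "Info", "INFO", "error", "ERR"]]
```

&lt;p&gt;Unknown spellings are upper-cased rather than dropped, so an unexpected level still shows up in dashboards instead of silently disappearing.&lt;/p&gt;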

&lt;p&gt;&lt;strong&gt;Use batch and memory limiter processors&lt;/strong&gt; (for example with OTel):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Processor&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;batch&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Groups spans/logs/metrics into batches, improves throughput, reduces overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;memory_limiter&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Puts a hard cap on memory usage, drops data or throttles when usage exceeds thresholds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Strike a balance:&lt;/strong&gt; let agents fix inconsistencies from 3rd-party logs, but rely on developers to structure first-party logs correctly.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. Team Practices &amp;amp; Ownership
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why It Matters
&lt;/h3&gt;

&lt;p&gt;Logging is not just a technical setup; it's a shared responsibility across teams. Establishing clear ownership early ensures that logs are consistent, searchable, and actionable throughout your organization's lifecycle. It also makes it clear who is accountable for volume control (for example, when DEBUG logging is left on in production).&lt;/p&gt;

&lt;h3&gt;
  
  
  Best Practices
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Assign team ownership from day one.&lt;/strong&gt; Each dataset or service should have a defined owning team responsible for log quality, metadata, and alerting setup. This avoids confusion later when troubleshooting or optimizing costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tag logs by team.&lt;/strong&gt; Include a &lt;code&gt;team&lt;/code&gt; or &lt;code&gt;owner&lt;/code&gt; tag in metadata or agent configuration. This enables your logging platform to group logs, usage metrics, and cost by responsible team automatically — particularly useful when understanding volume spikes. Set up usage alerts so a given team is notified if their volumes suddenly go off the charts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5ggxp4jgwvv0krnbav6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5ggxp4jgwvv0krnbav6.png" alt=" " width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Encourage collaboration through shared queries.&lt;/strong&gt; Make it a habit for teams to share saved queries, dashboards, and monitors. Common examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Error spikes by environment"&lt;/li&gt;
&lt;li&gt;"Token usage per service"&lt;/li&gt;
&lt;li&gt;"Slowest response patterns over 24h"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Shared queries reduce duplication and foster best-practice discovery internally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use team-based datasets.&lt;/strong&gt; Group data logically — by service ownership rather than by underlying infrastructure — so each team can monitor the performance, health, and behavior of their own services without noise from unrelated systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make accountability visible.&lt;/strong&gt; Use tags and naming conventions that make ownership clear:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;team&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;payments&lt;/span&gt;
&lt;span class="py"&gt;service&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;checkout-api&lt;/span&gt;
&lt;span class="py"&gt;env&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;prod&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro Tip:&lt;/strong&gt; Building a strong observability culture that promotes best practices early creates long-term efficiency. Teams that own their data from the start rarely need a cleanup project later.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  5. Log Types and Strategy
&lt;/h2&gt;

&lt;p&gt;Define what types of logs your organization will collect and how they'll be categorized:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Application&lt;/td&gt;
&lt;td&gt;Custom app logs&lt;/td&gt;
&lt;td&gt;Owned by dev teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Third-party services&lt;/td&gt;
&lt;td&gt;Kafka, NGINX, Redis&lt;/td&gt;
&lt;td&gt;Semi-structured; normalize via agents or auto-parser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;syslog, journald&lt;/td&gt;
&lt;td&gt;Often managed by SREs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;AWS, GCP, Azure&lt;/td&gt;
&lt;td&gt;Forwarding integration needed; can be high volume (CloudTrail, Load Balancer logs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;CloudTrail, auditd&lt;/td&gt;
&lt;td&gt;Coordinate with SecOps/SIEM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD&lt;/td&gt;
&lt;td&gt;Pipeline events&lt;/td&gt;
&lt;td&gt;Great for trend correlation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro Tip:&lt;/strong&gt; Review overlap between application and infrastructure logs to avoid duplication and unnecessary ingestion usage. If your app logs &lt;code&gt;request_id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, and &lt;code&gt;latency&lt;/code&gt;, and NGINX/syslog already records &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;latency&lt;/code&gt;, keep those fields in one layer and use &lt;code&gt;request_id&lt;/code&gt; to correlate — instead of ingesting the same details twice.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Good logging is a discipline, not a one-time setup. The combination of structured data, consistent metadata, proper correlation IDs, well-configured agents, and clear team ownership is what separates logs that collect dust from logs that actively drive engineering decisions.&lt;/p&gt;

&lt;p&gt;Start with structure, assign ownership early, and build the habit of sharing queries and dashboards across teams. Your future self — debugging a production incident at 2am — will thank you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.eu.bronto.io/signup" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Give Bronto a Try&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>logging</category>
      <category>observability</category>
      <category>sre</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
