<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Kemal Akkoyun</title>
    <description>The latest articles on Forem by Kemal Akkoyun (@kakkoyun).</description>
    <link>https://forem.com/kakkoyun</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F154921%2Fc7f37497-7b15-47ad-adeb-7a893eeee5c9.jpeg</url>
      <title>Forem: Kemal Akkoyun</title>
      <link>https://forem.com/kakkoyun</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kakkoyun"/>
    <language>en</language>
    <item>
      <title>Measuring Software Performance: Why Your Benchmarks Are Probably Lying</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Fri, 06 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/measuring-software-performance-why-your-benchmarks-are-probably-lying-k9o</link>
      <guid>https://forem.com/kakkoyun/measuring-software-performance-why-your-benchmarks-are-probably-lying-k9o</guid>
      <description>&lt;h3&gt;
  
  
  A Loose Cable That Broke Physics
&lt;/h3&gt;

&lt;p&gt;In 2006, a team of physicists began building the &lt;a href="https://en.wikipedia.org/wiki/OPERA_experiment" rel="noopener noreferrer"&gt;OPERA experiment&lt;/a&gt; — a detector at the Gran Sasso laboratory in Italy, designed to measure the speed of neutrinos beamed 730 kilometers through the Earth’s crust from CERN in Switzerland. Five years of construction. Roughly 100 million euros. Some of the most rigorous experimental physics on the planet.&lt;/p&gt;

&lt;p&gt;In September 2011, the results came back. Neutrinos were traveling &lt;a href="https://profmattstrassler.com/articles-and-posts/particle-physics-basics/neutrinos/neutrinos-faster-than-light/opera-what-went-wrong/" rel="noopener noreferrer"&gt;faster than the speed of light&lt;/a&gt;. The team had just broken the laws of physics.&lt;/p&gt;

&lt;p&gt;Except they hadn’t. After months of rechecking the math, the sensors, and the calibration, they found the root cause: a single fiber-optic cable that wasn’t fully plugged in. A loose connector had introduced a 73-nanosecond timing error — enough to make neutrinos appear superluminal.&lt;/p&gt;

&lt;p&gt;Most of us aren’t building 730-kilometer tunnels. But we deal with “loose cables” every day when measuring software performance. A benchmark that shows a 5% speedup might be measuring thermal throttling, CPU frequency scaling, or a noisy neighbor on a shared cloud instance. The signal is real, but so is the noise — and telling them apart requires discipline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxmzu4i9p9heod3vefct.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxmzu4i9p9heod3vefct.jpeg" alt="Software Performance Devroom audience at FOSDEM 2026" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post expands on the talk &lt;a href="https://github.com/igoragoli" rel="noopener noreferrer"&gt;Augusto de Oliveira&lt;/a&gt; and I gave at the &lt;a href="https://dev.to/talks/how-to-reliably-measure-software-performance/"&gt;FOSDEM 2026 Software Performance Devroom&lt;/a&gt;. The &lt;a href="https://github.com/igoragoli/fosdem-2026-software-performance" rel="noopener noreferrer"&gt;slides and experiments&lt;/a&gt; are all open source.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why Benchmarking Is Hard
&lt;/h3&gt;

&lt;p&gt;Measuring software performance is a specialized version of a more general problem: finding a signal in a world full of noise.&lt;/p&gt;

&lt;p&gt;Modern systems have layers of non-determinism that conspire against repeatable measurements. The CPU dynamically adjusts its clock frequency based on load and temperature. The OS scheduler moves threads between cores. Caches warm and cool. Background processes steal cycles. VMs share physical resources with other tenants. Memory layout changes between runs due to address space layout randomization (ASLR).&lt;/p&gt;

&lt;p&gt;Any one of these factors can shift your numbers by a few percent. Stack them up, and a benchmark that reports a 5% improvement might just be measuring random variation. You run it again and the improvement vanishes — or reverses.&lt;/p&gt;

&lt;p&gt;The gap between “I ran a quick benchmark on my laptop” and “this measurement is reliable enough to make decisions on” is enormous. Closing that gap requires controlling the environment, designing the benchmark properly, interpreting results with statistical rigor, and integrating the whole process into your development workflow.&lt;/p&gt;




&lt;h3&gt;
  
  
  Environment Control
&lt;/h3&gt;

&lt;p&gt;This is the foundation. No amount of statistical sophistication will compensate for a noisy measurement environment. The sources of noise come from every layer of the stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Sources of Noise&lt;/th&gt;
&lt;th&gt;Mitigations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;External&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Network, temperature, vibration, virtualization&lt;/td&gt;
&lt;td&gt;Bare metal instances, dedicated hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Application&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Memory layout, compilation/linking&lt;/td&gt;
&lt;td&gt;Fixed builds, disable ASLR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kernel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Scheduling, caching&lt;/td&gt;
&lt;td&gt;CPU affinity, process priority, cache management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SMT contention, dynamic frequency scaling&lt;/td&gt;
&lt;td&gt;Disable SMT, disable DFS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Noisy Neighbors and Bare Metal
&lt;/h4&gt;

&lt;p&gt;If you’re running benchmarks on a shared cloud VM, you’re sharing physical CPU cores, memory bandwidth, and last-level cache with other tenants. Their workload affects your numbers. This is the classic noisy neighbor problem.&lt;/p&gt;

&lt;p&gt;The fix: use bare metal cloud instances (e.g., AWS &lt;code&gt;m5.metal&lt;/code&gt;). They cost more, but they give you exclusive access to the underlying hardware. Just as importantly, bare metal access lets you apply the kernel-level and CPU-level mitigations below — none of which are possible on shared VMs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/company/blog/engineering/reducing-variability-performance-tests-ec2-setup-key-results" rel="noopener noreferrer"&gt;MongoDB’s engineering team documented this well&lt;/a&gt; — their work on reducing variability in EC2 performance tests is an excellent reference for anyone setting up cloud-based benchmarking infrastructure.&lt;/p&gt;

&lt;h4&gt;
  
  
  CPU Affinity and Process Priority
&lt;/h4&gt;

&lt;p&gt;The OS scheduler moves processes between CPU cores to balance load. Each migration can evict warm cache lines and introduce jitter. Pinning your benchmark to specific cores with &lt;code&gt;taskset&lt;/code&gt; eliminates this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pin benchmark to CPU 0&lt;/span&gt;
taskset &lt;span class="nt"&gt;-c&lt;/span&gt; 0 ./benchmark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, raising process priority with &lt;code&gt;nice&lt;/code&gt; reduces scheduling interference from other processes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Higher priority (niceness -5, where -20 is highest)&lt;/span&gt;
&lt;span class="nb"&gt;nice&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;-5&lt;/span&gt; ./benchmark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Cache Management
&lt;/h4&gt;

&lt;p&gt;If your benchmark touches the filesystem, cold vs. warm page cache can dramatically change results. Either warm the cache deliberately before measurement, or drop it to start from a known state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Drop all caches (requires root)&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;3 &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /proc/sys/vm/drop_caches &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sync&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Simultaneous Multithreading (SMT)
&lt;/h4&gt;

&lt;p&gt;SMT (marketed as Hyper-Threading on Intel CPUs) allows two hardware threads to share a single physical core. They share execution resources — ALUs, caches, branch predictors — while maintaining separate architectural state.&lt;/p&gt;

&lt;p&gt;For I/O-bound workloads, this is fine: one thread executes while the other waits for I/O. But for CPU-bound benchmarks, SMT introduces severe contention. Two threads fight over the same execution units, and the resulting interference shows up as variance in your measurements.&lt;/p&gt;

&lt;p&gt;We ran a simple experiment on an AWS &lt;code&gt;m5.metal&lt;/code&gt; instance with DFS disabled, measuring two CPU-bound tasks running on the same core (SMT enabled) vs. separate cores (SMT disabled):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Mean&lt;/th&gt;
&lt;th&gt;Coeff. of Variation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SMT enabled, task 1&lt;/td&gt;
&lt;td&gt;1537.64 +/- 367.29 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23.887%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMT enabled, task 2&lt;/td&gt;
&lt;td&gt;1536.88 +/- 366.84 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23.869%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMT disabled, task 1&lt;/td&gt;
&lt;td&gt;737.37 +/- 0.32 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.044%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMT disabled, task 2&lt;/td&gt;
&lt;td&gt;737.93 +/- 1.74 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.235%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That’s &lt;strong&gt;100x less variance&lt;/strong&gt; with SMT disabled. The tasks also run twice as fast because they’re no longer contending for shared execution resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Disable SMT&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;off &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /sys/devices/system/cpu/smt/control
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Dynamic Frequency Scaling (DFS)
&lt;/h4&gt;

&lt;p&gt;Modern CPUs adjust their clock frequency dynamically based on workload, thermals, and power budgets. Intel calls the upward scaling “Turbo Boost.” This is great for general-purpose computing but terrible for benchmarking — the frequency varies based on how many cores are active, the ambient temperature, and the power headroom.&lt;/p&gt;

&lt;p&gt;A single-threaded benchmark might run at 3.5 GHz. Start another workload on a neighboring core and the frequency drops to 3.1 GHz. Your benchmark just got 11% slower, and the code didn’t change.&lt;/p&gt;

&lt;p&gt;We measured this on the same &lt;code&gt;m5.metal&lt;/code&gt; instance with SMT disabled, varying the number of concurrent CPU-bound tasks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Mean&lt;/th&gt;
&lt;th&gt;Coeff. of Variation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DFS on, 1 task&lt;/td&gt;
&lt;td&gt;533.97 +/- 2.046 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.383%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DFS on, 8 tasks&lt;/td&gt;
&lt;td&gt;578.67 +/- 0.287 ms&lt;/td&gt;
&lt;td&gt;0.050%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DFS off, 1 task&lt;/td&gt;
&lt;td&gt;738.18 +/- 0.306 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.041%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DFS off, 8 tasks&lt;/td&gt;
&lt;td&gt;739.18 +/- 0.351 ms&lt;/td&gt;
&lt;td&gt;0.047%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With DFS enabled, the single-task case shows ~10x more variance than with DFS disabled. The absolute runtime is higher with DFS off (the CPU runs at its base frequency rather than boosting), but the measurements are rock-solid. When benchmarking, consistency matters more than raw speed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pin clock rate to base frequency&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;2500000 &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /sys/devices/system/cpu/cpu&lt;span class="k"&gt;*&lt;/span&gt;/cpufreq/scaling_max_freq

&lt;span class="c"&gt;# Set scaling governor to "performance"&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;performance &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /sys/devices/system/cpu/cpu&lt;span class="k"&gt;*&lt;/span&gt;/cpufreq/scaling_governor

&lt;span class="c"&gt;# Disable Turbo Boost (Intel CPUs)&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;1 &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /sys/devices/system/cpu/intel_pstate/no_turbo

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Denis Bakhvalov’s &lt;a href="https://github.com/dendibakh/perf-book" rel="noopener noreferrer"&gt;Performance Analysis and Tuning on Modern CPUs&lt;/a&gt; covers CPU-level tuning in depth and is the definitive reference on this topic.&lt;/p&gt;




&lt;h3&gt;
  
  
  Benchmark Design
&lt;/h3&gt;

&lt;p&gt;Environment control reduces noise. Good benchmark design ensures the signal you’re measuring is actually meaningful.&lt;/p&gt;

&lt;h4&gt;
  
  
  Representative Workloads
&lt;/h4&gt;

&lt;p&gt;A benchmark is only useful if it measures something that matters. What does your application actually do?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Archetype&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Characteristics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Idle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Background workers, minimal load&lt;/td&gt;
&lt;td&gt;Low RPS, minimal CPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Microservices, APIs&lt;/td&gt;
&lt;td&gt;High RPS, low CPU per request&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Throughput&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Queue workers, batch processing&lt;/td&gt;
&lt;td&gt;Moderate RPS, high CPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enterprise&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Business apps with DB/API calls&lt;/td&gt;
&lt;td&gt;Moderate RPS, mixed CPU/IO&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your benchmark workload should match your production workload. A microbenchmark that measures a tight loop in isolation won’t tell you much about how your API server handles realistic traffic patterns.&lt;/p&gt;

&lt;p&gt;That said, microbenchmarks have their place. They’re invaluable for comparing algorithms, validating specific optimizations, and catching regressions in hot paths. The key is knowing which type fits your question:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Benchmark Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Comparing algorithms&lt;/td&gt;
&lt;td&gt;Micro&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Validating optimizations&lt;/td&gt;
&lt;td&gt;Micro&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regression detection&lt;/td&gt;
&lt;td&gt;Both&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Capacity planning&lt;/td&gt;
&lt;td&gt;Macro&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User experience&lt;/td&gt;
&lt;td&gt;Macro&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Best practice: use both in your pipeline.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Coordinated Omission Problem
&lt;/h4&gt;

&lt;p&gt;If your load generator waits for each response before sending the next request, it’s probably lying to you. When the system under test slows down, the generator slows down too — sending fewer requests per second, which artificially improves the measured latencies.&lt;/p&gt;

&lt;p&gt;Gil Tene’s talk &lt;a href="https://www.youtube.com/watch?v=lJ8ydIuPFeU" rel="noopener noreferrer"&gt;“How NOT to Measure Latency”&lt;/a&gt; is the definitive explanation of this problem. The short version: use load generators that maintain a constant request rate regardless of response time. Tools like &lt;a href="https://k6.io/" rel="noopener noreferrer"&gt;k6&lt;/a&gt; and &lt;a href="https://github.com/giltene/wrk2" rel="noopener noreferrer"&gt;wrk2&lt;/a&gt; handle this correctly.&lt;/p&gt;
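&lt;p&gt;To make the effect concrete, here is a small, self-contained simulation (a toy model for illustration, not part of the original talk): a service answers in 1 ms except for a single 1-second stall, measured once by a closed-loop generator that waits for each response and once by an open-loop generator that sends at a constant rate.&lt;/p&gt;

```python
import statistics

STALL_START, STALL_END = 5000.0, 6000.0  # the service stalls for 1 s

def closed_loop(duration_ms=10_000.0):
    """Closed loop: send the next request only after the previous reply."""
    t, latencies = 0.0, []
    while t < duration_ms:
        # During the stall, the single in-flight request takes 1000 ms.
        lat = 1000.0 if STALL_START <= t < STALL_END else 1.0
        latencies.append(lat)
        t += lat  # the generator itself blocks: the stall yields ONE sample
    return latencies

def open_loop(duration_ms=10_000.0, interval_ms=1.0):
    """Open loop: issue requests on a fixed schedule, regardless of replies."""
    latencies, t = [], 0.0
    while t < duration_ms:
        if STALL_START <= t < STALL_END:
            # A request arriving mid-stall queues until the stall clears.
            lat = (STALL_END - t) + 1.0
        else:
            lat = 1.0
        latencies.append(lat)
        t += interval_ms
    return latencies

closed = closed_loop()
opened = open_loop()
print(f"closed-loop mean latency: {statistics.mean(closed):6.1f} ms")  # ~1 ms
print(f"open-loop mean latency:   {statistics.mean(opened):6.1f} ms")  # ~51 ms
```

&lt;p&gt;The closed-loop generator blocks during the stall and records it as one slow sample, so its mean stays near 1 ms; the open-loop generator keeps sending and reports the queueing delay real users would have seen.&lt;/p&gt;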

&lt;h4&gt;
  
  
  Warm-Up and Steady State
&lt;/h4&gt;

&lt;p&gt;We learned this the hard way with a Java benchmark. The goal: measure instrumentation overhead on a Spring application. Initial setup: 20-second warmup, 15 seconds of measurements, collecting one sample per second.&lt;/p&gt;

&lt;p&gt;The coefficient of variation was &lt;strong&gt;11.80%&lt;/strong&gt; — far too noisy to detect real changes.&lt;/p&gt;

&lt;p&gt;The problem was warmup. The JVM compiles methods on the fly (JIT compilation). Each method needs to be called enough times to hit the compilation threshold, then you wait for the compiler to finish. Twenty seconds wasn’t nearly enough. By extending the warmup to 160 seconds and the measurement period to match, the picture changed completely.&lt;/p&gt;

&lt;p&gt;From the experiments:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip 1:&lt;/strong&gt; Run benchmarks long enough to uncover perturbations like warmup effects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip 2:&lt;/strong&gt; Collect enough samples to reduce intra-run variation. N &amp;gt;= 30 is a reasonable minimum.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip 3:&lt;/strong&gt; Rerun benchmarks multiple times to reduce inter-run variation. M &amp;gt;= 5 runs helps account for &lt;a href="https://link.springer.com/chapter/10.1007/11758525_26" rel="noopener noreferrer"&gt;random initial state effects&lt;/a&gt; (cache layout, memory placement).&lt;/p&gt;

&lt;p&gt;Applying all three tips reduced the coefficient of variation from &lt;strong&gt;11.80% to 2.94%&lt;/strong&gt; — a 4x improvement from benchmark design alone, before any environment control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip 4:&lt;/strong&gt; Use deterministic inputs. Non-deterministic data leads to non-deterministic measurements.&lt;/p&gt;
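&lt;p&gt;The coefficient of variation quoted throughout is straightforward to compute from raw samples. Here is a minimal sketch with made-up latency values:&lt;/p&gt;

```python
import statistics

def coefficient_of_variation(samples):
    """CV = sample standard deviation / mean, as a percentage."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100

# Hypothetical latency samples (ms): a short, noisy run vs. a properly
# warmed-up run of the same benchmark.
short_run = [118, 95, 132, 101, 88, 125, 97, 140, 93, 110]
warm_run = [100, 101, 99, 100, 102, 98, 101, 100, 99, 100]

print(f"short run CV: {coefficient_of_variation(short_run):.2f}%")
print(f"warm run CV:  {coefficient_of_variation(warm_run):.2f}%")
```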




&lt;h3&gt;
  
  
  Statistical Methods
&lt;/h3&gt;

&lt;p&gt;You’ve controlled the environment and designed a good benchmark. Now you have data. The question is: is the difference you’re seeing real, or noise?&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Averages Lie
&lt;/h4&gt;

&lt;p&gt;Consider a throughput benchmark run before and after a code change. The “before” mean is 102.7 req/s. The “after” mean is 105.0 req/s. That’s a 2.3% improvement. Ship it?&lt;/p&gt;

&lt;p&gt;Not so fast. Each of those means summarizes a distribution of individual measurements. If those distributions overlap significantly, the difference between the means might not be statistically significant — it could easily arise from random variation alone.&lt;/p&gt;

&lt;h4&gt;
  
  
  Hypothesis Testing
&lt;/h4&gt;

&lt;p&gt;The intuition is straightforward: compare the size of the difference to the size of the noise.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Welch%27s_t-test" rel="noopener noreferrer"&gt;Welch’s t-test&lt;/a&gt; formalizes this. It computes a test statistic &lt;em&gt;t&lt;/em&gt; that is essentially the ratio of the mean difference to the standard error. If &lt;em&gt;t&lt;/em&gt; exceeds a critical value (determined by your chosen false positive rate, alpha), you can conclude the difference is statistically significant.&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;a statistically significant result tells you the difference is unlikely to be zero, but not that the difference is large or practically meaningful.&lt;/strong&gt; Always pair hypothesis testing with effect size estimates. A 0.1% improvement might be statistically significant with enough samples — but not worth the code complexity.&lt;/p&gt;
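&lt;p&gt;In practice you would reach for &lt;code&gt;scipy.stats.ttest_ind(..., equal_var=False)&lt;/code&gt;; as a sketch of what it computes, here is the statistic from scratch, on made-up throughput samples that mirror the 102.7 vs. 105.0 req/s scenario:&lt;/p&gt;

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic: difference of means over its standard error."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical throughput samples (req/s) before and after a change.
before = [101.2, 103.5, 99.8, 104.1, 102.9, 101.0, 103.8, 105.3]
after = [104.0, 106.2, 103.1, 105.8, 104.9, 105.5, 103.9, 106.6]

t = welch_t(after, before)
print(f"t = {t:.2f}")  # roughly 2.9 for these samples
```

&lt;p&gt;With &lt;em&gt;t&lt;/em&gt; around 2.9 at roughly 12 degrees of freedom, the difference clears the usual alpha = 0.05 critical value (about 2.2), so it is unlikely to be noise alone. Whether a 2.3% win justifies the change is the separate effect-size question.&lt;/p&gt;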

&lt;h4&gt;
  
  
  Change Point Detection
&lt;/h4&gt;

&lt;p&gt;Hypothesis testing works well when you have a clear “before” and “after.” But what about continuous benchmarking, where you’re tracking performance across hundreds of commits?&lt;/p&gt;

&lt;p&gt;Change point detection algorithms scan a time series and identify where the underlying distribution shifts. The &lt;a href="https://aakinshin.net/posts/edpelt/" rel="noopener noreferrer"&gt;e-divisive method&lt;/a&gt; (ED-PELT) is particularly effective for benchmark data. It handles non-normal distributions, detects multiple change points, and works well with the kind of noisy data that benchmarks produce.&lt;/p&gt;
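&lt;p&gt;The core idea can be sketched in a few lines: try every split point and keep the one that divides the series into the two most internally consistent segments. This toy detector finds a single change point in made-up data; real implementations like ED-PELT handle multiple change points and non-normal noise.&lt;/p&gt;

```python
import statistics

def best_change_point(series):
    """Return the split index that minimizes total within-segment variance."""
    best_i, best_cost = None, float("inf")
    for i in range(2, len(series) - 1):
        left, right = series[:i], series[i:]
        cost = (statistics.pvariance(left) * len(left)
                + statistics.pvariance(right) * len(right))
        if cost < best_cost:
            best_i, best_cost = i, cost
    return best_i

# Hypothetical per-commit benchmark times (ms); a regression lands at commit 10.
times = [100, 101, 99, 100, 102, 98, 100, 101, 99, 100,
         110, 111, 109, 112, 110, 111, 108, 110, 112, 109]
print(best_change_point(times))  # → 10
```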

&lt;p&gt;Netflix’s engineering team wrote an excellent post on &lt;a href="https://netflixtechblog.com/fixing-performance-regressions-before-they-happen-eab2602b86fe" rel="noopener noreferrer"&gt;fixing performance regressions before they happen&lt;/a&gt;, which covers their use of change point detection in continuous benchmarking.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.nyrkio.com/2025/06/12/slides-from-presentation-to-spec-devops-performance-wg/" rel="noopener noreferrer"&gt;Henrik Ingo&lt;/a&gt; (who spoke in the same Software Performance Devroom at FOSDEM) has published extensively on applying these methods in practice.&lt;/p&gt;

&lt;h4&gt;
  
  
  Visualization: Strip Plots Over Boxplots
&lt;/h4&gt;

&lt;p&gt;Boxplots hide too much. They show quartiles and a median, but they obscure the actual distribution shape — bimodality, outlier clusters, and gaps all disappear into a box.&lt;/p&gt;

&lt;p&gt;Strip plots (dot plots of every individual measurement) are better for benchmark data. They make outliers obvious, reveal distribution shape at a glance, and scale well for the sample sizes typical in benchmarking (30-200 points).&lt;/p&gt;
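&lt;p&gt;As a quick illustration of what a boxplot hides, here is a throwaway text-mode strip plot (a sketch; in practice you would use matplotlib or seaborn’s &lt;code&gt;stripplot&lt;/code&gt;) rendering a bimodal sample:&lt;/p&gt;

```python
def strip_plot(samples, width=40):
    """Render one dot per measurement on a single text axis."""
    lo, hi = min(samples), max(samples)
    row = [" "] * (width + 1)
    for s in samples:
        row[round((s - lo) / (hi - lo) * width)] = "*"
    return f"{lo:6.1f} |{''.join(row)}| {hi:.1f}"

# Bimodal latencies (ms): two clusters a boxplot would merge into one box.
samples = [10.1, 10.3, 9.8, 10.0, 10.2, 9.9, 10.4,
           19.8, 20.1, 20.3, 19.9, 20.0]
print(strip_plot(samples))
```

&lt;p&gt;A boxplot of this sample would draw its box across the empty middle of the axis; the two dot clusters make the bimodality impossible to miss.&lt;/p&gt;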

&lt;p&gt;Brendan Gregg’s work on &lt;a href="https://www.brendangregg.com/FrequencyTrails/outliers.html#Causes" rel="noopener noreferrer"&gt;frequency trails&lt;/a&gt; is excellent on this topic — showing how visualization choices affect your ability to detect real patterns in performance data.&lt;/p&gt;




&lt;h3&gt;
  
  
  Integrating Into Development Workflows
&lt;/h3&gt;

&lt;p&gt;Reliable measurement is only half the problem. The other half is making performance a first-class part of the development process.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Feedback Loop
&lt;/h4&gt;

&lt;p&gt;The ideal: a developer opens a pull request, benchmarks run automatically, and within minutes they see whether their changes have performance implications. If there’s a regression, they know about it before the code merges — not weeks later when a customer notices.&lt;/p&gt;

&lt;p&gt;This requires:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Automated benchmark execution&lt;/strong&gt; triggered by code changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Statistical analysis&lt;/strong&gt; to distinguish real regressions from noise&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear reporting&lt;/strong&gt; that developers can act on — not a wall of numbers, but a concise “this got 3% slower, here’s the data”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local reproducibility&lt;/strong&gt; so developers can investigate and fix regressions on their own machines&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Performance Quality Gates
&lt;/h4&gt;

&lt;p&gt;Beyond PR-level feedback, performance quality gates can block releases that don’t meet defined SLOs. The philosophy is the same as any other quality gate — you wouldn’t ship without passing tests, so don’t ship without passing performance benchmarks.&lt;/p&gt;

&lt;h4&gt;
  
  
  When to Benchmark
&lt;/h4&gt;

&lt;p&gt;The answer depends on your resources and risk tolerance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Coverage&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Every PR&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Complete&lt;/td&gt;
&lt;td&gt;Critical paths, performance-sensitive libraries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Periodic (nightly/weekly)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Trend detection&lt;/td&gt;
&lt;td&gt;General regression catching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-demand&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Targeted&lt;/td&gt;
&lt;td&gt;Investigation, optimization validation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For most teams, a combination works best: lightweight benchmarks on every PR, comprehensive macrobenchmarks nightly, and on-demand deep dives when investigating specific issues.&lt;/p&gt;

&lt;h4&gt;
  
  
  Open Source Tools
&lt;/h4&gt;

&lt;p&gt;You don’t need to build a benchmarking platform from scratch. Several open source projects can get you started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://bencher.dev/" rel="noopener noreferrer"&gt;&lt;strong&gt;bencher.dev&lt;/strong&gt;&lt;/a&gt; — Continuous benchmarking as a service. Tracks benchmark results over time, detects regressions, and integrates with CI/CD.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/sharkdp/hyperfine" rel="noopener noreferrer"&gt;&lt;strong&gt;hyperfine&lt;/strong&gt;&lt;/a&gt; — A CLI benchmarking tool for comparing command execution times. Handles warmup, statistical analysis, and parameterized runs.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/benchmark-action/github-action-benchmark" rel="noopener noreferrer"&gt;&lt;strong&gt;github-action-benchmark&lt;/strong&gt;&lt;/a&gt; — GitHub Action for running benchmarks and tracking results over time, with support for Go, Python, Rust, and other language-specific benchmark formats.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dandavison/chronologer" rel="noopener noreferrer"&gt;&lt;strong&gt;chronologer&lt;/strong&gt;&lt;/a&gt; — Benchmark tracking focused on Go benchmarks with historical comparison.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.nyrkio.com/2025/05/08/welcome-apache-otava-incubating-project/" rel="noopener noreferrer"&gt;&lt;strong&gt;Apache Otava&lt;/strong&gt;&lt;/a&gt; (formerly Nyrkio, incubating) — Performance change point detection service, built on the e-divisive algorithm.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/aclements/perflock" rel="noopener noreferrer"&gt;&lt;strong&gt;perflock&lt;/strong&gt;&lt;/a&gt; — A tool for locking CPU frequency and other system settings during benchmarks. Useful for local development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The right tool depends on your language ecosystem, CI system, and how much you want to self-host vs. use a managed service.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;p&gt;Four things to remember:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Control your benchmarking environment.&lt;/strong&gt; Bare metal instances, CPU isolation, disable SMT, disable dynamic frequency scaling. Environment noise is the single largest source of unreliable measurements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Design your benchmarks to be representative and repeatable.&lt;/strong&gt; Match your production workload. Run long enough. Collect enough samples. Rerun multiple times.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Interpret results with statistical rigor.&lt;/strong&gt; Don’t trust averages. Use hypothesis testing or change point detection. Always ask: is this difference real, or noise?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Integrate benchmarks into your development workflow.&lt;/strong&gt; Run continuously. Catch regressions on PRs. Make performance feedback as fast as test feedback.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Performance Matters
&lt;/h3&gt;

&lt;p&gt;Performance is not always the first thing we think about when building software. We focus on features, correctness, and security, and rightly so. But in the end, performance is what users experience.&lt;/p&gt;

&lt;p&gt;Low latency means your users aren’t waiting. High throughput means your system handles the load. Cost-efficient performance means you’re not burning money (and energy) on infrastructure that could be halved with the right optimization. A &lt;a href="https://www.brendangregg.com/blog/2020-07-15/systems-performance-2nd-edition.html" rel="noopener noreferrer"&gt;500ms delay costs Google 20% of their traffic&lt;/a&gt;. A 400ms improvement gave Yahoo 5-9% more traffic. The numbers are real.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Not all fast software is world-class, but all world-class software is fast.” – Tobi Lutke, CEO of Shopify&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So write benchmarks. Run them continuously. Catch regressions before your users do.&lt;/p&gt;

&lt;p&gt;And don’t shout in the datacenter.&lt;/p&gt;




&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/igoragoli/fosdem-2026-software-performance" rel="noopener noreferrer"&gt;Slides and experiments (GitHub)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=8211fNI_nc4" rel="noopener noreferrer"&gt;Talk recording (YouTube)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/talks/how-to-reliably-measure-software-performance/"&gt;Talk page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/posts/fosdem-2026/"&gt;FOSDEM 2026 recap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/posts/otel-unplugged-eu-2026/"&gt;OTel Unplugged EU 2026: Field Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Bakhvalov, D. — &lt;a href="https://github.com/dendibakh/perf-book" rel="noopener noreferrer"&gt;Performance Analysis and Tuning on Modern CPUs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Gregg, B. — &lt;a href="https://www.brendangregg.com/blog/2020-07-15/systems-performance-2nd-edition.html" rel="noopener noreferrer"&gt;Systems Performance: Enterprise and the Cloud&lt;/a&gt;, 2nd ed.&lt;/li&gt;
&lt;li&gt;Tene, G. — &lt;a href="https://www.youtube.com/watch?v=lJ8ydIuPFeU" rel="noopener noreferrer"&gt;How NOT to Measure Latency&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Kalibera, T. et al. — &lt;a href="https://link.springer.com/chapter/10.1007/11758525_26" rel="noopener noreferrer"&gt;Benchmark Precision and Random Initial State&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Leiserson, C. et al. — &lt;a href="https://science.sciencemag.org/content/368/6495/eaam9744" rel="noopener noreferrer"&gt;There’s Plenty of Room at the Top&lt;/a&gt; (Science, 2020)&lt;/li&gt;
&lt;li&gt;Netflix Engineering — &lt;a href="https://netflixtechblog.com/fixing-performance-regressions-before-they-happen-eab2602b86fe" rel="noopener noreferrer"&gt;Fixing Performance Regressions Before They Happen&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Ingo, H. — &lt;a href="https://blog.nyrkio.com/2025/06/12/slides-from-presentation-to-spec-devops-performance-wg/" rel="noopener noreferrer"&gt;Change Point Detection for Performance&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Gregg, B. — &lt;a href="https://www.brendangregg.com/FrequencyTrails/outliers.html#Causes" rel="noopener noreferrer"&gt;Frequency Trails: Outliers&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.mongodb.com/company/blog/engineering/reducing-variability-performance-tests-ec2-setup-key-results" rel="noopener noreferrer"&gt;MongoDB: Reducing Variability in EC2 Performance Tests&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>computerscience</category>
      <category>performance</category>
      <category>softwareengineering</category>
      <category>testing</category>
    </item>
    <item>
      <title>Auto-Instrumenting Go: From eBPF to USDT Probes</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/auto-instrumenting-go-from-ebpf-to-usdt-probes-3hgd</link>
      <guid>https://forem.com/kakkoyun/auto-instrumenting-go-from-ebpf-to-usdt-probes-3hgd</guid>
      <description>&lt;p&gt;This post expands on the &lt;a href="https://dev.to/talks/how-to-instrument-go-without-changing-code/"&gt;FOSDEM 2026 Go Devroom talk&lt;/a&gt; I co-presented with &lt;a href="https://hannahkm.github.io" rel="noopener noreferrer"&gt;Hannah S. Kim&lt;/a&gt;. The talk, demo code, and all benchmark scenarios are available in the &lt;a href="https://github.com/kakkoyun/fosdem-2026" rel="noopener noreferrer"&gt;fosdem-2026 repository&lt;/a&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Go is one of the best languages for building production backend services. It compiles to native binaries, has excellent concurrency primitives, and produces predictable performance characteristics. But when it comes to auto-instrumentation — adding observability without modifying source code — Go is uniquely difficult.&lt;/p&gt;

&lt;p&gt;In the JVM world, bytecode manipulation gives you powerful hooks. Java agents can intercept method calls, inject tracing, and propagate context without the application developer knowing. Python and Node.js have similar dynamic capabilities. Go has none of this.&lt;/p&gt;

&lt;p&gt;The reasons are structural:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static compilation.&lt;/strong&gt; Go compiles to a single native binary. There is no intermediate bytecode to rewrite at load time, no classloader to intercept, no dynamic linking by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No &lt;code&gt;LD_PRELOAD&lt;/code&gt;.&lt;/strong&gt; Go’s default static linking means the &lt;code&gt;LD_PRELOAD&lt;/code&gt; trick that works for C/C++ applications (and that the &lt;a href="https://github.com/open-telemetry/opentelemetry-injector" rel="noopener noreferrer"&gt;OTel Injector&lt;/a&gt; uses for Java, .NET, and Node.js) doesn’t apply.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unique calling convention.&lt;/strong&gt; Go’s ABI passes arguments in registers with a convention different from the platform C ABI. This makes dynamic hooking with tools like Frida or ptrace significantly harder — you can’t just read standard frame pointers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goroutine stack management.&lt;/strong&gt; Goroutines start with small, growable stacks that the runtime resizes by copying, so stack addresses can change at any time. Traditional stack-walking assumptions break.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The gap between “Go is great for production” and “Go is hard to auto-instrument” is real. This is the gap we set out to map.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Comparison Framework
&lt;/h3&gt;

&lt;p&gt;We built a &lt;a href="https://github.com/kakkoyun/fosdem-2026" rel="noopener noreferrer"&gt;demo repository&lt;/a&gt; with the same Go HTTP server implemented across seven scenarios, each using a different instrumentation approach. The application is deliberately simple — an HTTP server with configurable CPU load, memory allocation, and off-CPU time — so that instrumentation overhead is isolated and measurable.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Seven Scenarios
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;default&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Baseline. No instrumentation of any kind.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;manual&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OTel SDK&lt;/td&gt;
&lt;td&gt;Manual OpenTelemetry SDK integration — explicit tracer initialization, span creation via &lt;code&gt;otelhttp&lt;/code&gt;, and context propagation. The “standard” way.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;obi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;eBPF (OBI)&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation" rel="noopener noreferrer"&gt;OpenTelemetry eBPF Instrumentation&lt;/a&gt;. Network-level eBPF hooks. Runs as a sidecar, attaches to the running process. No code changes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ebpf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;eBPF (Auto)&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://github.com/open-telemetry/opentelemetry-go-instrumentation" rel="noopener noreferrer"&gt;OpenTelemetry Go Auto-Instrumentation&lt;/a&gt;. Uprobe-based eBPF hooks targeting Go runtime functions. No code changes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;orchestrion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Compile-time&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://github.com/datadog/orchestrion" rel="noopener noreferrer"&gt;Datadog Orchestrion&lt;/a&gt; with OTel SDK. AST transformation via &lt;code&gt;-toolexec&lt;/code&gt; at compile time. Requires a rebuild but no source changes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;&lt;code&gt;libstabst&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;USDT (salp)&lt;/td&gt;
&lt;td&gt;USDT probes via &lt;a href="https://github.com/mmcshane/salp" rel="noopener noreferrer"&gt;salp&lt;/a&gt;/&lt;a href="https://github.com/sthima/libstapsdt" rel="noopener noreferrer"&gt;libstapsdt&lt;/a&gt;, consumed by a bpftrace sidecar that exports to OTLP. Proof of concept.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;&lt;code&gt;usdt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;USDT (native)&lt;/td&gt;
&lt;td&gt;Native USDT probes via a &lt;a href="https://github.com/kakkoyun/go/tree/poc_usdt" rel="noopener noreferrer"&gt;custom Go fork&lt;/a&gt; that adds probe points to &lt;code&gt;net/http&lt;/code&gt;, &lt;code&gt;database/sql&lt;/code&gt;, &lt;code&gt;crypto/tls&lt;/code&gt;, and &lt;code&gt;net&lt;/code&gt;. Proof of concept.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each scenario runs in Docker with an identical observability stack (OTel Collector, Jaeger, Prometheus) and is load-tested with identical parameters.&lt;/p&gt;

&lt;h4&gt;
  
  
  Evaluation Axes
&lt;/h4&gt;

&lt;p&gt;We compared the approaches across three dimensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance overhead&lt;/strong&gt; — latency, CPU, memory (RSS), throughput&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Robustness&lt;/strong&gt; — stability across Go versions, container environments, failure modes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational friction&lt;/strong&gt; — deployment complexity, privilege requirements, debugging&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Manual OTel SDK (Baseline for Comparison)
&lt;/h3&gt;

&lt;p&gt;The manual scenario is not auto-instrumentation — it is the standard way to instrument a Go service. You import the OTel SDK, initialize a tracer provider, wrap your HTTP handler with &lt;code&gt;otelhttp.NewHandler&lt;/code&gt;, and create spans explicitly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;setupHandlers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServeMux&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/health"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HealthHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/load"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LoadHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;otelhttp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;LoadHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;otel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"manual"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"manual.handler"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;End&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c"&gt;// ... business logic&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you full control — custom span attributes, context propagation, error recording. But it requires code changes in every service, and those changes accumulate. Multiply by a hundred microservices and you understand why auto-instrumentation matters.&lt;/p&gt;




&lt;h3&gt;
  
  
  Compile-Time: Orchestrion and OTel Compile-Time Instrumentation
&lt;/h3&gt;

&lt;p&gt;Orchestrion uses Go’s &lt;code&gt;-toolexec&lt;/code&gt; flag to intercept the compilation pipeline. During the AST transformation phase, it injects instrumentation code — adding OTel spans, wrapping handlers, propagating context — without the developer modifying source files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go build &lt;span class="nt"&gt;-toolexec&lt;/span&gt; &lt;span class="s1"&gt;'orchestrion toolexec'&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; myapp &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="c"&gt;# Or equivalently:&lt;/span&gt;
orchestrion go build &lt;span class="nt"&gt;-o&lt;/span&gt; myapp &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The mechanism is aspect-oriented: you declare join points (e.g., “any function in package &lt;code&gt;main&lt;/code&gt; named &lt;code&gt;LoadHandler&lt;/code&gt;”) and advice (e.g., “prepend a span creation statement”). The transformation happens at the AST level before the compiler emits machine code.&lt;/p&gt;

&lt;p&gt;Orchestrion supports OpenTelemetry natively — it is not Datadog-specific. In January 2025, Datadog and Alibaba began merging their compile-time instrumentation efforts into a unified solution under the &lt;a href="https://github.com/open-telemetry/opentelemetry-go-compile-instrumentation" rel="noopener noreferrer"&gt;OpenTelemetry Compile-Time Instrumentation SIG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade-offs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires a rebuild. You cannot instrument already-deployed binaries.&lt;/li&gt;
&lt;li&gt;Deepest instrumentation of all approaches — it can instrument stdlib and dependencies.&lt;/li&gt;
&lt;li&gt;Zero runtime overhead from the instrumentation mechanism itself (the injected OTel code has the same cost as manual instrumentation).&lt;/li&gt;
&lt;li&gt;Stable across Go versions (the toolexec interface is stable).&lt;/li&gt;
&lt;li&gt;No kernel privileges required.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a deeper dive into the &lt;code&gt;-toolexec&lt;/code&gt; mechanism, see my earlier &lt;a href="https://dev.to/talks/unleashing-the-go-toolchain/"&gt;Unleashing the Go Toolchain&lt;/a&gt; talk from GopherCon UK 2025.&lt;/p&gt;




&lt;h3&gt;
  
  
  eBPF Approaches
&lt;/h3&gt;

&lt;h4&gt;
  
  
  OBI (OpenTelemetry eBPF Instrumentation)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation" rel="noopener noreferrer"&gt;OBI&lt;/a&gt; takes a network-level approach. It uses eBPF programs to hook into kernel-level network operations, intercepting HTTP/S and gRPC traffic. It is multi-language — Go, Java, .NET, Python, Node.js, Ruby, Rust — because it operates at the protocol layer rather than the language runtime layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;docker run --privileged \
  --pid=container:myapp \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4318 \
  otel/ebpf-instrumentation:latest

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OBI runs as a sidecar container. It attaches to the target process’s PID namespace and loads eBPF programs that intercept network system calls. No source code modification, no recompilation, no restart.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade-offs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; or privileged containers. Security teams push back on this.&lt;/li&gt;
&lt;li&gt;Limited to what eBPF can observe at the network level. Application-internal spans are not visible.&lt;/li&gt;
&lt;li&gt;Protocol coverage is growing: HTTP/S, gRPC, TLS visibility.&lt;/li&gt;
&lt;li&gt;Excellent for topology mapping and network observability beyond just tracing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  OTel Go Auto-Instrumentation
&lt;/h4&gt;

&lt;p&gt;The &lt;a href="https://github.com/open-telemetry/opentelemetry-go-instrumentation" rel="noopener noreferrer"&gt;OpenTelemetry Go Auto-Instrumentation&lt;/a&gt; project uses uprobe-based eBPF hooks that target specific Go runtime functions. Unlike OBI’s network-level approach, this hooks directly into Go function prologues.&lt;/p&gt;

&lt;p&gt;This project is effectively in maintenance mode. Several of its contributors have moved to OBI. At &lt;a href="https://dev.to/posts/otel-unplugged-eu-2026/"&gt;OTel Unplugged EU 2026&lt;/a&gt;, the frank assessment was: the people moved to where the momentum is.&lt;/p&gt;




&lt;h3&gt;
  
  
  Runtime Injection: Frida and ptrace
&lt;/h3&gt;

&lt;p&gt;Beyond the seven benchmarked scenarios, the demo also includes an &lt;code&gt;injector&lt;/code&gt; scenario that explores dynamic instrumentation via &lt;a href="https://frida.re/" rel="noopener noreferrer"&gt;Frida&lt;/a&gt;, a ptrace-based toolkit for runtime function hooking. The idea is conceptually simple: attach to a running process, find the function you want to hook, and replace its prologue with a trampoline that calls your instrumentation code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// The application code uses //go:noinline to keep functions hookable.&lt;/span&gt;
&lt;span class="c"&gt;//go:noinline&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;LoadHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// ... business logic&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, this is extremely hard for Go binaries. &lt;a href="https://blog.quarkslab.com/lets-go-into-the-rabbit-hole-part-1-the-challenges-of-dynamically-hooking-golang-program.html" rel="noopener noreferrer"&gt;Quarkslab’s excellent three-part series&lt;/a&gt; documents the challenges in detail: Go’s register-based calling convention, goroutine stack relocation, and compiler optimizations (inlining, dead code elimination) all conspire against reliable dynamic hooking.&lt;/p&gt;

&lt;p&gt;The demo’s injector scenario includes a helper tool that uses &lt;code&gt;unsafe.Offsetof&lt;/code&gt; to find &lt;code&gt;http.Request&lt;/code&gt; struct field offsets — information you need just to read the HTTP method and path from a hooked function’s arguments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade-offs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works with existing binaries. No rebuild required.&lt;/li&gt;
&lt;li&gt;Requires &lt;code&gt;-gcflags="all=-N -l"&lt;/code&gt; to disable optimizations, which defeats the purpose for production.&lt;/li&gt;
&lt;li&gt;Fragile across Go versions — struct layouts and calling conventions change.&lt;/li&gt;
&lt;li&gt;Limited applicability for Go’s statically linked binaries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For Go, this approach is more useful as a debugging tool than a production instrumentation strategy.&lt;/p&gt;




&lt;h3&gt;
  
  
  USDT Probes: The Novel Part
&lt;/h3&gt;

&lt;p&gt;USDT (User Statically-Defined Tracing) probes are a mechanism from the DTrace/SystemTap ecosystem. They are marker points compiled into a binary that external tooling (bpftrace, perf, DTrace) can attach to at runtime. The key property: &lt;strong&gt;when no consumer is attached, the probe site is a NOP instruction with zero overhead.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We built two proof-of-concept implementations.&lt;/p&gt;

&lt;h4&gt;
  
  
  libstabst: USDT via salp and bpftrace
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;libstabst&lt;/code&gt; scenario uses &lt;a href="https://github.com/mmcshane/salp" rel="noopener noreferrer"&gt;salp&lt;/a&gt;, a Go binding to &lt;a href="https://github.com/sthima/libstapsdt" rel="noopener noreferrer"&gt;libstapsdt&lt;/a&gt;, to create USDT probes at runtime. The application defines probe points for request start and end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;probes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fosdem"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;reqStart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;probes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddProbe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"request_start"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;reqEnd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;probes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddProbe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"request_end"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Int64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;probes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// In the handler:&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reqStart&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;reqStart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Enabled&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;reqStart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reqID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A bpftrace sidecar attaches to these probes and exports events as OTLP traces via a custom exporter bridge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Known limitations:&lt;/strong&gt; The salp library has compatibility issues with Go 1.25+, pinning this scenario to Go 1.23.x. It also needs &lt;code&gt;/proc/self/fd/&lt;/code&gt; access for mmap, which fails in many container environments. On bare metal Linux or in a Lima VM, it works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4anzzwbubc0jesmzb57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4anzzwbubc0jesmzb57.png" alt="Presenting the USDT + eBPF proof of concept at the Go Devroom"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Native USDT: Custom Go Fork
&lt;/h4&gt;

&lt;p&gt;The more ambitious PoC is a &lt;a href="https://github.com/kakkoyun/go/tree/poc_usdt" rel="noopener noreferrer"&gt;custom Go fork&lt;/a&gt; that adds USDT probe points directly to the Go standard library — &lt;code&gt;net/http&lt;/code&gt;, &lt;code&gt;database/sql&lt;/code&gt;, &lt;code&gt;crypto/tls&lt;/code&gt;, and &lt;code&gt;net&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"runtime/trace/usdt"&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;usdt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Probe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"myapp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"request_start"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;usdt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Probe1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"myapp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"request_end"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c"&gt;// ... handle request&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fork includes a &lt;code&gt;go tool usdt&lt;/code&gt; command for listing probes in a binary and generating bpftrace scripts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go tool usdt list ./myserver
&lt;span class="go"&gt;PROVIDER NAME ADDRESS ARGUMENTS
net_http server_request_start 0x63296c 8@%rsi -8@%r8

&lt;/span&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go tool usdt bpftrace ./myserver &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; trace.bt
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bpftrace trace.bt
&lt;span class="go"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This PoC proves that native USDT support in Go is technically feasible. The standard library instrumentation is automatically available in any binary built with the fork — no application code changes, no SDK imports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Known limitations:&lt;/strong&gt; ARM64 argument parsing in bpftrace has issues with the probe argument notation emitted by the fork. The fork is strictly a proof of concept and not suitable for production.&lt;/p&gt;




&lt;h3&gt;
  
  
  Go Runtime PoCs: Flight Recording
&lt;/h3&gt;

&lt;p&gt;Beyond USDT, we explored a &lt;a href="https://github.com/kakkoyun/go/tree/poc_flight_recorder" rel="noopener noreferrer"&gt;flight recorder PoC&lt;/a&gt; based on &lt;a href="https://github.com/golang/go/issues/63185" rel="noopener noreferrer"&gt;golang/go#63185&lt;/a&gt;. The concept: always-on distributed tracing built into the Go runtime, with a bounded ring buffer and GODEBUG-based activation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"runtime/trace/flight"&lt;/span&gt;

&lt;span class="n"&gt;flight&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Enable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flight&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTP&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;flight&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SQL&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;flight&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Net&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;flight&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// Export on error or crash&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flight recorder PoC watches for trace files produced by the runtime, converts them to OTLP spans, and exports to a collector. If Go’s runtime trace facilities eventually gain W3C Trace Context propagation, this could become the lowest-friction instrumentation path for Go — no SDK, no eBPF, no compile-time tools. Just the runtime doing what runtimes should do.&lt;/p&gt;




&lt;h3&gt;
  
  
  Benchmark Results
&lt;/h3&gt;

&lt;p&gt;We ran each scenario under identical load conditions using a Docker-based observability stack with 5-minute sustained load tests.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Memory (RSS)&lt;/th&gt;
&lt;th&gt;Max Latency&lt;/th&gt;
&lt;th&gt;Max Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Baseline (no instrumentation)&lt;/td&gt;
&lt;td&gt;10.2%&lt;/td&gt;
&lt;td&gt;202 MiB&lt;/td&gt;
&lt;td&gt;4.50 ms&lt;/td&gt;
&lt;td&gt;3.1k req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual OTel SDK&lt;/td&gt;
&lt;td&gt;10.3% (+0.1%)&lt;/td&gt;
&lt;td&gt;210 MiB (+8 MiB)&lt;/td&gt;
&lt;td&gt;3.02 ms&lt;/td&gt;
&lt;td&gt;13.97k req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;eBPF Auto-Instrumentation&lt;/td&gt;
&lt;td&gt;10.0% (-0.3%)&lt;/td&gt;
&lt;td&gt;204 MiB (+2 MiB)&lt;/td&gt;
&lt;td&gt;3.07 ms&lt;/td&gt;
&lt;td&gt;4.57k req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compile-time (Orchestrion)&lt;/td&gt;
&lt;td&gt;9.8% (-0.4%)&lt;/td&gt;
&lt;td&gt;210 MiB (+8 MiB)&lt;/td&gt;
&lt;td&gt;2.59 ms&lt;/td&gt;
&lt;td&gt;27.8k req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A few things stand out. The CPU and memory overhead across all approaches is negligible for this workload. The throughput differences are more interesting — Orchestrion’s compile-time approach achieved the highest throughput, likely because the OTel code injected at compile time benefits from the same optimizations as the rest of the application. The eBPF approach showed lower throughput, consistent with the overhead of crossing the kernel boundary for each intercepted call.&lt;/p&gt;

&lt;p&gt;The USDT scenarios (&lt;code&gt;libstapsdt&lt;/code&gt; and &lt;code&gt;usdt&lt;/code&gt;) are not included in the table because they are proof-of-concept implementations with different exporter architectures. The core property of USDT — zero overhead when probes are not attached — was confirmed, but end-to-end benchmarking against the other approaches requires further work.&lt;/p&gt;

&lt;p&gt;Full benchmark data and reproduction instructions are in the &lt;a href="https://github.com/kakkoyun/fosdem-2026" rel="noopener noreferrer"&gt;demo repository&lt;/a&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Ecosystem Picture
&lt;/h3&gt;

&lt;p&gt;The most grounded take from the &lt;a href="https://dev.to/posts/otel-unplugged-eu-2026/"&gt;OTel Unplugged EU 2026&lt;/a&gt; OBI/eBPF session:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Document the trade-offs between OBI, compile-time, injector, and SDK. Let people choose. Make them aware of each other and let them work together.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These approaches are not competing. They serve different deployment scenarios and can coexist.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Limitation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Compile-time&lt;/strong&gt; (Orchestrion)&lt;/td&gt;
&lt;td&gt;AST transformation via &lt;code&gt;-toolexec&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Deepest instrumentation, security-sensitive environments&lt;/td&gt;
&lt;td&gt;Requires rebuild&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eBPF/OBI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kernel-level network hooks&lt;/td&gt;
&lt;td&gt;Runtime flexibility, multi-language, no restart&lt;/td&gt;
&lt;td&gt;Needs kernel privileges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eBPF Auto&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Uprobe hooks on Go functions&lt;/td&gt;
&lt;td&gt;Go-specific deep tracing without code changes&lt;/td&gt;
&lt;td&gt;Maintenance mode, fragile across Go versions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Injector/SSI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K8s operator + &lt;code&gt;LD_PRELOAD&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Lowest friction onboarding&lt;/td&gt;
&lt;td&gt;Does not work for Go’s static binaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;USDT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Compiled probe points + bpftrace&lt;/td&gt;
&lt;td&gt;Zero overhead when not tracing, future potential&lt;/td&gt;
&lt;td&gt;Proof of concept, ecosystem immaturity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The vision articulated at OTel Unplugged — &lt;code&gt;apt install opentelemetry&lt;/code&gt; and everything works — requires all these layers coordinating. OBI detecting the Injector and backing off. Compile-time instrumentation detecting existing SDK usage. USDT probes coexisting with eBPF hooks. We are not there yet, but the direction is clear.&lt;/p&gt;




&lt;h3&gt;
  
  
  Future Directions
&lt;/h3&gt;

&lt;p&gt;Several threads from the talk and surrounding conversations point forward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OTel Compile-Time SIG.&lt;/strong&gt; The merger between Datadog’s Orchestrion and Alibaba’s compile-time instrumentation under the OpenTelemetry umbrella is the most significant near-term development. A vendor-neutral, community-maintained compile-time instrumentation tool for Go would change the adoption curve.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;W3C context propagation in runtimes.&lt;/strong&gt; If language runtimes and compilers understand trace context natively, the instrumentation story simplifies fundamentally. This was a recurring theme at OTel Unplugged.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;eBPF Tokens.&lt;/strong&gt; &lt;a href="https://fosdem.org/2026/schedule/event/3LLHG9-bpf-tokens-safe-userspace-ebpf/" rel="noopener noreferrer"&gt;BPF Tokens&lt;/a&gt; could significantly reduce the privilege requirements for eBPF-based instrumentation. Instead of &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt;, a token-based trust model would lower the bar for security teams.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Native USDT in Go.&lt;/strong&gt; The PoC fork demonstrates feasibility. Whether the Go team would accept USDT probes into the standard library is an open question, but the pattern exists in other ecosystems — Postgres, MySQL, and the JVM all have static tracepoints behind flags.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flight recording.&lt;/strong&gt; The &lt;code&gt;golang/go#63185&lt;/code&gt; proposal for always-on flight recording in the Go runtime could eventually provide the foundation for zero-touch distributed tracing without any external tooling.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Closing
&lt;/h3&gt;

&lt;p&gt;The instrumentation tax is real and unavoidable. The question is not whether to pay it, but how to manage it. For Go, the answer is increasingly “you have options” — and those options are getting better.&lt;/p&gt;

&lt;p&gt;The slides are available as &lt;a href="https://github.com/kakkoyun/fosdem-2026/blob/main/presentation.md" rel="noopener noreferrer"&gt;Markdown&lt;/a&gt; in the repository. The demo code, Docker setup, and benchmark scripts are all in the &lt;a href="https://github.com/kakkoyun/fosdem-2026" rel="noopener noreferrer"&gt;fosdem-2026 repository&lt;/a&gt;. The &lt;a href="https://www.youtube.com/watch?v=0TvrSebuDPk" rel="noopener noreferrer"&gt;recording is on YouTube&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to get involved: the &lt;a href="https://github.com/open-telemetry/opentelemetry-go-compile-instrumentation" rel="noopener noreferrer"&gt;OTel Compile-Time Instrumentation SIG&lt;/a&gt;, &lt;a href="https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation" rel="noopener noreferrer"&gt;OBI&lt;/a&gt;, and &lt;a href="https://github.com/open-telemetry/opentelemetry-go" rel="noopener noreferrer"&gt;OTel Go&lt;/a&gt; repositories all accept contributions. The &lt;code&gt;#otel-go&lt;/code&gt; and &lt;code&gt;#otel-ebpf-sig&lt;/code&gt; channels on &lt;a href="https://slack.cncf.io/" rel="noopener noreferrer"&gt;CNCF Slack&lt;/a&gt; are where the discussions happen.&lt;/p&gt;

&lt;p&gt;See also: &lt;a href="https://dev.to/posts/otel-unplugged-eu-2026/"&gt;OTel Unplugged EU 2026 field notes&lt;/a&gt; for the broader ecosystem context.&lt;/p&gt;

</description>
      <category>go</category>
      <category>linux</category>
      <category>monitoring</category>
      <category>performance</category>
    </item>
    <item>
      <title>OTel Unplugged EU 2026: Field Notes from the Instrumentation Frontier</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/otel-unplugged-eu-2026-field-notes-from-the-instrumentation-frontier-3fn0</link>
      <guid>https://forem.com/kakkoyun/otel-unplugged-eu-2026-field-notes-from-the-instrumentation-frontier-3fn0</guid>
      <description>&lt;h3&gt;
  
  
  Brussels Again, But Make It Unplugged
&lt;/h3&gt;

&lt;p&gt;The day after FOSDEM, about a hundred of us gathered at &lt;strong&gt;Sparks Meeting&lt;/strong&gt; on Rue Ravenstein in Brussels for &lt;a href="https://opentelemetry.io/blog/2025/otel-unplugged-fosdem/" rel="noopener noreferrer"&gt;OTel Unplugged EU 2026&lt;/a&gt; — an unconference dedicated entirely to OpenTelemetry. Purple stage lights, a mid-century auditorium with wood paneling, and the familiar buzz of people who spend their days thinking about telemetry pipelines. If you know, you know.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0c8chrwfahl6dmb3a7g.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0c8chrwfahl6dmb3a7g.jpeg" alt="OTel Unplugged agenda projected on stage" width="800" height="1066"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The format is simple: no prepared talks, no slides. Morning session brainstorming, dot-voting on topics, then self-organizing into &lt;strong&gt;nine rooms across four breakout slots&lt;/strong&gt;. You vote with your feet. If a conversation isn’t working, you move. It’s chaotic, it’s honest, and it produces the kind of discussions that polished conference talks rarely achieve.&lt;/p&gt;

&lt;p&gt;I spent the day bouncing between sessions on &lt;strong&gt;Prometheus and OpenTelemetry convergence&lt;/strong&gt;, the &lt;strong&gt;Injector and Operator&lt;/strong&gt;, &lt;strong&gt;OBI/eBPF&lt;/strong&gt;, and &lt;strong&gt;auto-instrumentation for Go&lt;/strong&gt;. Four rooms, one thread connecting them all: &lt;em&gt;how do we make applications observable without asking developers to change their code?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here’s what I learned.&lt;/p&gt;




&lt;h3&gt;
  
  
  Prometheus Loves OpenTelemetry (It’s Complicated)
&lt;/h3&gt;

&lt;p&gt;Prometheus and OpenTelemetry had &lt;strong&gt;two sessions&lt;/strong&gt; — one in the morning with end users and contributors, and a follow-up in the afternoon specifically for maintainers. Both were packed. The relationship between these two projects is the kind you’d describe as “it’s complicated” on social media.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Resource Attributes Mess
&lt;/h4&gt;

&lt;p&gt;The biggest pain point? Getting OTLP data into Prometheus. &lt;strong&gt;Resource attributes&lt;/strong&gt; are the central headache. OTLP has a rich hierarchy — resource, scope, and metric attributes. Prometheus is flat. Bridging these two models means choosing between promoting all attributes, promoting some, or relying on &lt;code&gt;target_info&lt;/code&gt;. There are too many config options, no consistency across deployments, and the &lt;code&gt;info&lt;/code&gt; function (using &lt;code&gt;target_info&lt;/code&gt;) helps but adoption is uneven.&lt;/p&gt;
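&lt;p&gt;For readers who haven't hit this yet, the "what to promote" knob lives in Prometheus's own configuration. A minimal sketch using Prometheus's &lt;code&gt;promote_resource_attributes&lt;/code&gt; option (the attribute list here is illustrative, not a recommendation):&lt;/p&gt;

```yaml
# prometheus.yml — promote selected OTLP resource attributes to labels;
# anything not listed stays reachable only via target_info joins.
otlp:
  promote_resource_attributes:
    - service.name
    - service.namespace
    - k8s.namespace.name
```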

&lt;p&gt;One person described running an observability platform for &lt;strong&gt;over a thousand developers&lt;/strong&gt;, most using Prometheus &lt;code&gt;remote_write&lt;/code&gt;. Some teams want a single OTLP endpoint for logs and metrics, but that just shifts the same “what to promote, what to drop” problem. The frustration was palpable — someone put it bluntly: &lt;em&gt;“OTel is rewriting everything again.”&lt;/em&gt; Different conventions (&lt;code&gt;.&lt;/code&gt; vs &lt;code&gt;_&lt;/code&gt;), hard &lt;code&gt;target_info&lt;/code&gt; joins, and the sense that mature Prometheus semantics (&lt;code&gt;cluster&lt;/code&gt;, &lt;code&gt;namespace&lt;/code&gt;) are being duplicated under different names (&lt;code&gt;k8s.*&lt;/code&gt;).&lt;/p&gt;

&lt;h4&gt;
  
  
  Migration Resistance
&lt;/h4&gt;

&lt;p&gt;Teams recognize the value of OTel’s semantic conventions, but the migration path is painful. &lt;strong&gt;Naming inconsistencies&lt;/strong&gt; (&lt;code&gt;.&lt;/code&gt; vs &lt;code&gt;_&lt;/code&gt;), hard &lt;code&gt;target_info&lt;/code&gt; joins, and the cognitive overhead of moving from Prometheus’s world view to OTel’s. Several people mentioned that &lt;code&gt;PromQL IS AWESOME&lt;/code&gt; (their emphasis, not mine) and that transformation adds overhead that people who come from a Prometheus background don’t want to pay.&lt;/p&gt;

&lt;p&gt;On the SDK side, OTel measurements require a &lt;strong&gt;hashmap lookup&lt;/strong&gt; while Prometheus doesn’t. Too many concepts — meter, instrument, aggregation — versus Prometheus’s closer alignment to mechanical sympathy. The performance direction being pursued? &lt;strong&gt;Zero allocations, no lookups&lt;/strong&gt; — the bound instruments PoC is the concrete step toward closing that gap. Nobody in the room uses delta temporality.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“People care about observability, not query languages.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  The Afternoon: Maintainers Chart a Path
&lt;/h4&gt;

&lt;p&gt;The afternoon session brought Prometheus and OTel maintainers together. The mood was constructive. &lt;strong&gt;OTel SDK v2&lt;/strong&gt; was discussed as an opportunity for the kind of breaking changes that could simplify the metrics API — a simplified, more performant, but less flexible API. The &lt;strong&gt;Prometheus 3.0&lt;/strong&gt; experience was instructive: the maintainers planned for major breakage but ended up with almost none.&lt;/p&gt;

&lt;p&gt;Concrete progress: &lt;strong&gt;David Ashpole’s &lt;a href="https://github.com/open-telemetry/opentelemetry-go/pull/7790" rel="noopener noreferrer"&gt;bound instruments PoC in Go&lt;/a&gt;&lt;/strong&gt; — instruments pre-bound to specific attribute sets, eliminating the hashmap lookup. People in the room care about Go and C++ performance, and this could be a game changer.&lt;/p&gt;

&lt;p&gt;On the receiver/exporter convergence front: &lt;strong&gt;cAdvisor is considering archiving its Prometheus exporter&lt;/strong&gt; and moving all code into the OTel collector. OTel Kubernetes monitoring is broadly adopted, with near-parity to kube-state-metrics. The idea of Prometheus carrying an OTel Collector distribution was floated.&lt;/p&gt;

&lt;p&gt;The messaging problem came into sharp focus: as one Prometheus maintainer put it, &lt;em&gt;“Having joint statements helps towards the perception of working together.”&lt;/em&gt; The gap isn’t just technical — it’s about perception. End users see two projects that look like they’re competing, even when the maintainers are collaborating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Action items&lt;/strong&gt;: use the &lt;code&gt;#otel-prometheus&lt;/code&gt; Slack channel, meet again in &lt;strong&gt;Amsterdam&lt;/strong&gt;, and produce &lt;strong&gt;joint messaging&lt;/strong&gt; — “this is built together and is compatible.” Who owns that messaging? That’s the open question.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbsqjzl55mfdrxm668f4.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbsqjzl55mfdrxm668f4.jpeg" alt="Emerging topics sorted on sticky notes during morning brainstorming" width="800" height="1066"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Injector: From LD_PRELOAD to &lt;code&gt;apt install opentelemetry&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Two sessions covered the &lt;strong&gt;Injector and Operator&lt;/strong&gt; ecosystem — one focused on the general architecture, the other specifically on &lt;strong&gt;OBI and Injector coordination for Go&lt;/strong&gt;. The framing that stuck with me came early: &lt;em&gt;“OTel instrumentation feels more like a collection of tools than a product.”&lt;/em&gt; That’s why the Injector exists — to close the gap between what OTel offers and what users expect to just work.&lt;/p&gt;

&lt;h4&gt;
  
  
  Injector vs Operator
&lt;/h4&gt;

&lt;p&gt;The &lt;strong&gt;Injector&lt;/strong&gt; is opinionated and out-of-the-box. It aims for &lt;strong&gt;80% coverage with zero configuration&lt;/strong&gt;. The &lt;strong&gt;Operator&lt;/strong&gt; is for power users who want fine-grained control. Both end users and OTel maintainers were in the room, and more people knew about the Operator than the Injector.&lt;/p&gt;

&lt;p&gt;The Injector works via &lt;code&gt;LD_PRELOAD&lt;/code&gt; — it hooks into process loading to activate SDK instrumentation for Java, .NET, Node.js, and soon Python. It’s being used &lt;strong&gt;in production at scale&lt;/strong&gt; on Kubernetes. It can detect libc vs musl. Blocking system start during injection? Not perceived as a problem by anyone in the room.&lt;/p&gt;

&lt;p&gt;The inevitable question came up: &lt;strong&gt;“What about Go?”&lt;/strong&gt; For Go’s statically linked binaries, there’s no &lt;code&gt;LD_PRELOAD&lt;/code&gt; equivalent. The answer is either eBPF or compile-time instrumentation. Go remains the special case that requires different thinking.&lt;/p&gt;

&lt;h4&gt;
  
  
  Beyond Kubernetes
&lt;/h4&gt;

&lt;p&gt;There’s clear demand for the Injector &lt;strong&gt;outside of Kubernetes&lt;/strong&gt; — EC2, bare metal, traditional VMs. Users not on K8s or Docker &lt;em&gt;“end up using custom Ansibles”&lt;/em&gt; — the packaging gap is real and concrete. System packages (Debian, RPM) are needed, but hosting for them doesn’t exist yet. &lt;strong&gt;Red Hat is looking into packaging OTel components.&lt;/strong&gt; Multiple projects are independently solving the same packaging problems — signatures, distribution, hosting — which led to a proposal for a &lt;strong&gt;new SIG on OS packaging&lt;/strong&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Vision: One Package to Rule Them All
&lt;/h4&gt;

&lt;p&gt;The afternoon session on OBI and Injector for Go articulated a bold vision:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Run &lt;code&gt;apt install opentelemetry&lt;/code&gt; and get everything running — SDKs, Injector, OBI, all coordinated.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This would require massive coordination between instrumentation providers — and it would include the &lt;strong&gt;OTel profiler&lt;/strong&gt; alongside the SDKs and Injector. The group discussed how to avoid &lt;strong&gt;double instrumentation&lt;/strong&gt; when both OBI and the Injector are present — OBI should detect the Injector and back off (similar to how it already detects other SDKs). A creative proposal emerged: &lt;strong&gt;OBI injecting the Injector&lt;/strong&gt; instead of the Operator, since eBPF can intercept process loading natively.&lt;/p&gt;

&lt;p&gt;The reality is that &lt;strong&gt;OTel declarative configuration&lt;/strong&gt; doesn’t cleanly fit either project’s model yet. The Injector has its own config format. OBI instruments many applications from a single daemon, which doesn’t map neatly to per-application YAML. This is a design problem that needs solving before the &lt;code&gt;apt install&lt;/code&gt; dream becomes real.&lt;/p&gt;

&lt;p&gt;And the question that kept coming back in both sessions — &lt;em&gt;“What about Go?”&lt;/em&gt; — led naturally into the next room.&lt;/p&gt;




&lt;h3&gt;
  
  
  eBPF and the Instrumentation Tax
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;OBI/eBPF session&lt;/strong&gt; drew a crowd interested in the promise and the trade-offs of non-invasive auto-instrumentation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftow3ltmx7svy9u4snf9e.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftow3ltmx7svy9u4snf9e.jpeg" alt="Session brainstorming — Go instrumentation topics cluster together" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/open-telemetry/opentelemetry-go-instrumentation" rel="noopener noreferrer"&gt;OBI&lt;/a&gt; (eBPF-based auto-instrumentation) uses &lt;strong&gt;uprobes&lt;/strong&gt; to hook into application functions at the kernel level. No source code modification, no recompilation, no SDK integration. The trade-off? You need &lt;strong&gt;privileges&lt;/strong&gt;. &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; or root access is a hard sell for security teams, and the discussion around reducing privilege requirements was lively.&lt;/p&gt;

&lt;p&gt;The operational reality came through early: someone described a case where &lt;em&gt;“the instrumentation was bringing down the pod”&lt;/em&gt; — auto-injection and sidecars destabilizing the very workloads they’re supposed to observe. That anecdote set the tone for the rest of the session and led directly to the quote that stuck with me most.&lt;/p&gt;

&lt;p&gt;A bright spot: &lt;strong&gt;&lt;a href="https://fosdem.org/2026/schedule/event/3LLHG9-bpf-tokens-safe-userspace-ebpf/" rel="noopener noreferrer"&gt;eBPF Tokens&lt;/a&gt;&lt;/strong&gt;, a newer Linux facility for safer userspace eBPF, could significantly lower the trust bar. There was optimism in the room about this direction.&lt;/p&gt;

&lt;p&gt;OBI isn’t just about application tracing. It shines in &lt;strong&gt;network observability&lt;/strong&gt; — topology mapping, correlating network stack behavior with application layer events. Someone asked about lock observability — &lt;em&gt;“Maybe profiling”&lt;/em&gt; was the answer, hinting at the breadth of what people want from eBPF beyond just tracing. And there’s an underexplored opportunity around &lt;strong&gt;USDTs&lt;/strong&gt; (user-defined static tracepoints). Postgres and MySQL already have them behind flags. Rust makes them easy to add. But we need to convince &lt;strong&gt;popular libraries across more languages&lt;/strong&gt; to adopt them.&lt;/p&gt;

&lt;p&gt;A broader point was raised: &lt;strong&gt;W3C context propagation&lt;/strong&gt; should be pushed into language runtimes and compilers, not just libraries. If the runtime itself understands trace context, the instrumentation story becomes fundamentally simpler.&lt;/p&gt;

&lt;p&gt;The most grounded take came during the Go discussion:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Document the trade-offs between OBI, compile-time, injector, and SDK. Let people choose. Make them aware of each other and let them work together.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the reality check:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Instrumentation tax is inevitable. Manage it, don’t pretend it’s free.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The consensus across multiple sessions: &lt;strong&gt;stop treating these approaches as competing camps&lt;/strong&gt;. They’re complementary layers for different deployment scenarios:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Trade-off&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compile-time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AST transformation via &lt;code&gt;-toolexec&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Deepest instrumentation, zero runtime overhead, requires rebuild&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eBPF/OBI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kernel-level uprobe hooking&lt;/td&gt;
&lt;td&gt;No app modification, needs kernel privileges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Injector/SSI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K8s operator triggering instrumentation&lt;/td&gt;
&lt;td&gt;Lowest friction onboarding, abstracts complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On the Kubernetes operations side, there was a concrete proposal: a &lt;strong&gt;CRD for otel-operator&lt;/strong&gt; to deploy OBI daemonsets — with config validation and selective node deployment via workload labels. Not theoretical; the group was sketching the API surface.&lt;/p&gt;
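&lt;p&gt;No such CRD exists today, and every name below is invented for illustration, but the shape the room was sketching looks roughly like this:&lt;/p&gt;

```yaml
# Hypothetical only — a sketch of the API surface discussed in the room;
# the group, kind, and field names here do not exist yet.
apiVersion: opentelemetry.io/v1alpha1
kind: EBPFInstrumentation
metadata:
  name: obi
spec:
  nodeSelector:                 # selective node deployment via labels
    obi.example.io/enabled: "true"
  config:                       # validated OBI configuration
    discovery:
      services:
        - k8sNamespace: production
```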




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8vp7aib1wh3vn9lcdwx.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8vp7aib1wh3vn9lcdwx.jpeg" alt="Community and ecosystem topics — the other half of the brainstorming table" width="800" height="1066"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Patterns Across the Day
&lt;/h3&gt;

&lt;p&gt;Beyond the sessions I attended, three themes kept surfacing throughout the unconference.&lt;/p&gt;

&lt;h4&gt;
  
  
  Ship Faster vs Stable by Default
&lt;/h4&gt;

&lt;p&gt;Two rooms, opposite tensions. One group argued: &lt;em&gt;“We discourage people from trying. Processes feel rigid. We can only learn if we actually build something.”&lt;/em&gt; The Prometheus model — experiment first, let things mature, specify later — was held up as the better feedback loop. The other group was laser-focused on &lt;strong&gt;stability&lt;/strong&gt;: feature gates, opt-in experimental features, the pain of breaking changes in semantic conventions. The impatience was clear: &lt;em&gt;“Less bike-shedding, more doing.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Both are right. The community is threading a needle between moving fast enough to stay relevant and being stable enough that enterprises trust the project. The gap between these two positions is where a lot of energy gets spent.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Maintainer Crisis
&lt;/h4&gt;

&lt;p&gt;This came up in at least three rooms. &lt;strong&gt;Not enough maintainers, too many PRs, codeowners disappearing.&lt;/strong&gt; The JavaScript SIG has an automated script to move inactive maintainers to emeritus after three months. Other SIGs handle it manually. Some SIGs have tried a buddy/mentor system for onboarding new contributors — it helps, but it doesn’t scale across all SIGs when the existing maintainers barely have time to review PRs. The phrase that stuck with me:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Maintainership is privilege AND responsibility.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And a new problem: as one maintainer put it directly, &lt;em&gt;“AI slop creates a lot of work for maintainers.”&lt;/em&gt; Low-quality AI-generated PRs need review just like everything else, but they rarely lead to productive outcomes — creating a treadmill of review work that burns out the very people the project can’t afford to lose.&lt;/p&gt;

&lt;h4&gt;
  
  
  opentelemetry-go-auto: Quietly Fading
&lt;/h4&gt;

&lt;p&gt;During the Go-focused session, someone asked about &lt;code&gt;opentelemetry-go-auto&lt;/code&gt; — the eBPF-based Go auto-instrumentation project (originally from Alibaba). The answer was frank: the project &lt;strong&gt;“seems in maintenance mode, some of their maintainers are already contributing to OBI.”&lt;/strong&gt; The group decided to keep it out of the discussions unless those maintainers want to participate. No drama, just the natural evolution of open-source projects. The people moved to where the momentum is.&lt;/p&gt;




&lt;h3&gt;
  
  
  What Comes Next
&lt;/h3&gt;

&lt;p&gt;The unconference produced concrete next steps across every thread:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prometheus + OTel&lt;/strong&gt;: Convergence work continues. Joint messaging, Amsterdam meetup, bound instruments moving forward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Injector&lt;/strong&gt;: Merge functionality into the Operator starting with one language. System packages for non-K8s environments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OBI&lt;/strong&gt;: Gradual protocol expansion, packaging SIG proposal, exploration of an eBPF-based OTel Collector.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go auto-instrumentation&lt;/strong&gt;: Coordinate all three approaches, document trade-offs clearly for end users.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Unplugged
&lt;/h3&gt;

&lt;p&gt;The unconference format works because &lt;strong&gt;the hardest problems in observability right now are not technical&lt;/strong&gt; — they’re social. Governance, maintenance burden, convergence between projects that grew up independently, vendor-neutrality when vendors are the primary contributors. You can’t solve these with a slide deck. You need a room, a whiteboard, and honest conversation.&lt;/p&gt;

&lt;p&gt;As I always say — &lt;strong&gt;the hallway track is the real conference.&lt;/strong&gt; OTel Unplugged is an entire day of hallway track, and it’s exactly what the community needs.&lt;/p&gt;

&lt;p&gt;If you want to get involved: join the &lt;a href="https://slack.cncf.io/" rel="noopener noreferrer"&gt;CNCF Slack&lt;/a&gt; and find the &lt;code&gt;#otel-prometheus&lt;/code&gt;, &lt;code&gt;#otel-ebpf-sig&lt;/code&gt;, and &lt;code&gt;#otel-go&lt;/code&gt; channels. The SIG meetings are open and listed on the &lt;a href="https://github.com/open-telemetry/community" rel="noopener noreferrer"&gt;OTel community repo&lt;/a&gt;. Show up, contribute, and help shape the future of observability.&lt;/p&gt;

&lt;p&gt;Already looking forward to the next one.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>monitoring</category>
      <category>opensource</category>
      <category>tooling</category>
    </item>
    <item>
      <title>FOSDEM 2026: Even Bigger, Even Better</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Fri, 13 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/fosdem-2026-even-bigger-even-better-b67</link>
      <guid>https://forem.com/kakkoyun/fosdem-2026-even-bigger-even-better-b67</guid>
      <description>&lt;h3&gt;
  
  
  Another Year, Another FOSDEM
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;FOSDEM&lt;/strong&gt; — the annual Brussels pilgrimage. If you’ve been, you know the drill: too many talks, too little time, questionable coffee, and the kind of conversations that only happen when you pack thousands of open-source developers into a university campus in the dead of winter.&lt;/p&gt;

&lt;p&gt;This year was different for me, though. Two talks in two devrooms, three sessions at OTel Unplugged — and this time, I brought the whole family. My wife and our toddler (who has graduated from “can barely walk” to “can absolutely destroy a hotel room in under four minutes”) came along, and we turned it into a proper trip — FOSDEM, then a few days exploring &lt;strong&gt;Ghent&lt;/strong&gt; and &lt;strong&gt;Antwerp&lt;/strong&gt; before heading home.&lt;/p&gt;

&lt;p&gt;The conference part was incredible. The journey home… well, we’ll get to that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Saturday Morning: eBPF Devroom
&lt;/h3&gt;

&lt;p&gt;Last year the eBPF Devroom was impenetrable — nobody leaves, nobody gets in. This year I made it in early and spent the morning there.&lt;/p&gt;

&lt;p&gt;Three sessions stood out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"&lt;a href="https://fosdem.org/2026/schedule/event/8GVBN7-ebpf-hooks-gotchas/" rel="noopener noreferrer"&gt;eBPF Hookpoint Gotchas: Why Your Program Fires (or Fails) in Unexpected Ways&lt;/a&gt;"&lt;/strong&gt; — Donia Chaiehloudj and Chris Tarazi walked through the subtle behaviors of kprobes, fentry, tracepoints, and uprobes that catch everyone off guard. The kind of talk where half the room is nodding along because they’ve hit these exact edge cases in production. If you write eBPF programs, this is required viewing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"&lt;a href="https://fosdem.org/2026/schedule/event/H3LM7G-performance_and_reliability_pitfalls_of_ebpf/" rel="noopener noreferrer"&gt;Performance and Reliability Pitfalls of eBPF&lt;/a&gt;"&lt;/strong&gt; — Usama Saqib shared hard-won lessons from running eBPF at scale: kprobe performance varying across kernel versions, fentry stability issues, and the challenges of scaling uprobes. Directly relevant to anyone using eBPF-based auto-instrumentation — the kind of detail you don’t find in documentation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"&lt;a href="https://fosdem.org/2026/schedule/event/VTXQSK-oomprof/" rel="noopener noreferrer"&gt;OOMProf: Profiling Go Heap Memory at OOM Time&lt;/a&gt;"&lt;/strong&gt; — Tommy Reilly presented OOMProf, a Go library that uses eBPF to hook into Linux OOM tracepoints and capture heap profiles right before the kernel kills your process. Exports to pprof or Parca. The intersection of Go, eBPF, and profiling — three things I care deeply about.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The eBPF Devroom continues to be one of the most technically dense tracks at FOSDEM. Every talk assumes you already know the basics and goes straight to the edge cases and production realities.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sunday: Two Talks, Two Devrooms
&lt;/h3&gt;

&lt;p&gt;Sunday was a double-header. Two talks in two devrooms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Augusto de Oliveira&lt;/strong&gt; and I co-presented &lt;strong&gt;“How to Reliably Measure Software Performance”&lt;/strong&gt; in the &lt;strong&gt;Software Performance Devroom&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frruf1v23mv3sr66ofauv.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frruf1v23mv3sr66ofauv.jpeg" alt="Kemal and Augusto presenting at the Software Performance Devroom" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The talk opened with one of my favorite stories in science: the OPERA experiment that appeared to show neutrinos traveling faster than the speed of light, only for the root cause to be a single fiber-optic cable that wasn’t fully plugged in. That’s benchmarking in a nutshell — a world where loose cables are everywhere and your numbers are lying to you until you prove otherwise.&lt;/p&gt;

&lt;p&gt;We covered the full stack of what it takes to measure reliably. &lt;strong&gt;Environment control&lt;/strong&gt;: bare metal instances, disabling SMT, CPU affinity, cache management. &lt;strong&gt;Benchmark design&lt;/strong&gt;: making measurements representative and repeatable. &lt;strong&gt;Statistical rigor&lt;/strong&gt;: because if you’re not thinking about variance, you’re not thinking. And then the part I’m most excited about — &lt;strong&gt;integrating benchmarks into development workflows&lt;/strong&gt;. Performance quality gates on PRs, auto-generated regression comments, continuous benchmarking infrastructure. We showed what we’ve built at Datadog and pointed to the open-source alternatives available today.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Performance matters. It’s not always the first thing we think about when building software. But in the end, performance is what users experience.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Performance Devroom had a strong lineup all day. The audience was deeply technical — people who care about p99 latencies and can argue for an hour about whether your benchmark harness is introducing measurement bias. My kind of crowd.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://dev.to/posts/fosdem-2026-measuring-software-performance/"&gt;technical blog post&lt;/a&gt; goes deeper, and the &lt;a href="https://dev.to/talks/how-to-reliably-measure-software-performance/"&gt;talk page&lt;/a&gt; has the slides and recording.&lt;/p&gt;

&lt;p&gt;Then I crossed campus to the &lt;strong&gt;Go Devroom&lt;/strong&gt;. My kind of room.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hannah S. Kim&lt;/strong&gt; and I presented &lt;strong&gt;“How to Instrument Go Without Changing a Single Line of Code”&lt;/strong&gt; — a talk comparing every strategy available today for zero-touch Go observability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3gisy488wpsrfc756pm.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3gisy488wpsrfc756pm.jpeg" alt="Hannah and Kemal presenting at the Go Devroom" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We walked through eBPF-based auto-instrumentation with OBI, compile-time manipulation with tools like Orchestrion and the OTel Go compile-time instrumentation project, runtime injection via LD_PRELOAD, and the emerging world of USDTs for Go.&lt;/p&gt;

&lt;p&gt;The core of the talk was practical: benchmark results and small realistic services, compared along three axes — &lt;strong&gt;performance overhead&lt;/strong&gt;, &lt;strong&gt;robustness across Go versions&lt;/strong&gt;, and &lt;strong&gt;operational friction&lt;/strong&gt;. We showed the trade-offs honestly. eBPF gives you zero code changes but needs kernel privileges. Compile-time rewriting gives you the deepest instrumentation but requires a rebuild. The Injector abstracts complexity but is currently Kubernetes-only. There’s no silver bullet, just choices with different costs.&lt;/p&gt;

&lt;p&gt;We also looked forward at how upcoming work in the Go runtime — flight recording, improved diagnostics primitives, USDT probe generation — could unlock cleaner hooks for future instrumentation. The room was full. The questions were sharp. Hannah handled the eBPF deep-dives while I covered the compile-time and operational integration angles. It worked.&lt;/p&gt;

&lt;p&gt;If you want the full technical breakdown, I wrote a &lt;a href="https://dev.to/posts/fosdem-2026-auto-instrumenting-go/"&gt;companion blog post&lt;/a&gt; and the &lt;a href="https://dev.to/talks/how-to-instrument-go-without-changing-code/"&gt;talk page has the slides and recording&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monday: OTel Unplugged
&lt;/h3&gt;

&lt;p&gt;The day after FOSDEM, about a hundred of us gathered at &lt;strong&gt;Sparks Meeting&lt;/strong&gt; on Rue Ravenstein for &lt;a href="https://opentelemetry.io/blog/2025/otel-unplugged-fosdem/" rel="noopener noreferrer"&gt;OTel Unplugged EU 2026&lt;/a&gt; — an unconference dedicated entirely to OpenTelemetry. No slides, no prepared talks, just session brainstorming, dot-voting, and then splitting into nine rooms across four breakout slots.&lt;/p&gt;

&lt;p&gt;I led or co-led &lt;strong&gt;three sessions&lt;/strong&gt;: one on &lt;strong&gt;Prometheus and OTel convergence&lt;/strong&gt;, one on &lt;strong&gt;OBI/eBPF-based auto-instrumentation&lt;/strong&gt;, and one on &lt;strong&gt;the Injector and OBI coordination for Go&lt;/strong&gt;. The thread connecting all three was the same question that keeps me up at night: &lt;em&gt;how do we make applications observable without asking developers to change their code?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I wrote a &lt;a href="https://dev.to/posts/otel-unplugged-eu-2026/"&gt;dedicated post covering the full day&lt;/a&gt;, so I won’t repeat it here. The short version: the community is converging. Prometheus and OTel maintainers are charting a path together, the Injector vision is expanding beyond Kubernetes, and the various auto-instrumentation approaches for Go are finally being treated as complementary layers rather than competing camps. Read the post for the details.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Hallway Track
&lt;/h3&gt;

&lt;p&gt;As always — &lt;strong&gt;the hallway track is the real conference.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some of the best conversations happened between sessions, over coffee, during the frantic sprints between ULB buildings. Catching up with &lt;strong&gt;Prometheus maintainers&lt;/strong&gt; about v3 adoption and the road ahead. Talking auto-instrumentation strategy with OTel contributors who I’d only known through GitHub issues. Comparing notes on performance engineering practices with people running infrastructure at wildly different scales.&lt;/p&gt;

&lt;p&gt;The informal &lt;strong&gt;Prometheus maintainers gathering&lt;/strong&gt; was a highlight. Getting the people who build and maintain the project into the same room, away from structured agendas, just talking about what’s working and what isn’t — that’s where real alignment happens. No Zoom call will ever replicate that.&lt;/p&gt;

&lt;p&gt;I’m incredibly grateful for the people I managed to see this year. And as always, slightly heartbroken about the ones I missed. FOSDEM is four thousand developers in one place for a weekend, and no matter how fast you move, you can’t see everyone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Travel and Logistics: The Sequel Nobody Asked For
&lt;/h3&gt;

&lt;p&gt;A FOSDEM without travel drama is, apparently, not something the universe allows for me.&lt;/p&gt;

&lt;p&gt;We flew &lt;strong&gt;Berlin to Brussels&lt;/strong&gt; on Friday and had a great time — FOSDEM all weekend, OTel Unplugged on Monday, then a few days of family time in &lt;strong&gt;Ghent&lt;/strong&gt; and &lt;strong&gt;Antwerp&lt;/strong&gt;. Belgian frites, Belgian waffles, Belgian everything. The toddler approved.&lt;/p&gt;

&lt;p&gt;Then came Thursday. Our flight home from Brussels to Berlin: first delayed, then cancelled. &lt;strong&gt;Berlin airport shut down.&lt;/strong&gt; The coldest winter in twenty years had frozen the city solid. We ended up at a hotel near Brussels airport with a very tired toddler and no plan B.&lt;/p&gt;

&lt;p&gt;Friday morning we flew to &lt;strong&gt;Frankfurt&lt;/strong&gt; instead, only to learn that the onward Frankfurt-to-Berlin flight was also cancelled. Surely a train from Frankfurt to Berlin would be straightforward? Of course not. We rebooked on a train, but our checked luggage was… somewhere. The airline couldn’t tell us where. We waited two hours at the airport, watching the carousel go around empty, then gave up and headed to the train station.&lt;/p&gt;

&lt;p&gt;Four and a half hours of train ride later, we were finally home. &lt;strong&gt;Antwerp to Berlin: 29.5 hours, door to door.&lt;/strong&gt; With a toddler. In the coldest week Germany had seen in two decades.&lt;/p&gt;

&lt;p&gt;The luggage? It arrived ten days later. Intact, thankfully. But ten days.&lt;/p&gt;

&lt;p&gt;Last year’s transport chaos was cute by comparison.&lt;/p&gt;

&lt;h3&gt;
  
  
  Looking Forward
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;FOSDEM 2026&lt;/strong&gt; was my best one yet. Two talks across the weekend, three unconference sessions on Monday, and more hallway track conversations than I can count. The open-source observability community is in a remarkable place right now — Prometheus and OpenTelemetry converging, auto-instrumentation maturing across multiple approaches, and performance engineering finally getting the attention it deserves.&lt;/p&gt;

&lt;p&gt;Already thinking about next year. If you’re into open source and haven’t experienced FOSDEM, just go. You won’t regret it.&lt;/p&gt;

</description>
      <category>community</category>
      <category>devjournal</category>
      <category>monitoring</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Fix Go Module Downloads Behind a Corporate VPN</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Thu, 12 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/fix-go-module-downloads-behind-a-corporate-vpn-7ce</link>
      <guid>https://forem.com/kakkoyun/fix-go-module-downloads-behind-a-corporate-vpn-7ce</guid>
      <description>&lt;p&gt;If you work at a company that runs its own Go module proxy and you connect through a VPN, you’ve probably seen this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Get "https://binaries.example.com/google.golang.org/grpc/@v/v1.77.0.mod":
  dial tcp 172.27.5.36:443: i/o timeout

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The module has nothing to do with your company. It’s a public dependency. Yet Go refuses to fetch it from the public proxy and just dies with a timeout. The frustrating part: you know &lt;code&gt;proxy.golang.org&lt;/code&gt; has the module, and your config lists it as a fallback. So why doesn’t it fall through?&lt;/p&gt;

&lt;h2&gt;
  
  
  The comma trap
&lt;/h2&gt;

&lt;p&gt;A typical corporate Go setup looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GOPROXY=corp-proxy.internal,https://proxy.golang.org,direct

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The comma separator between proxies looks harmless, but it controls exactly when Go tries the next proxy in the chain. With commas, Go only falls through on &lt;strong&gt;HTTP 404 or 410&lt;/strong&gt; — meaning the proxy responded and said “I don’t have this module.” Any other error, including TCP timeouts, DNS failures, and 5xx server errors, is treated as a &lt;strong&gt;hard failure&lt;/strong&gt;. Go stops and reports the error.&lt;/p&gt;

&lt;p&gt;When your VPN is disconnected, the corporate proxy is unreachable. That’s a TCP timeout, not a 404. Go never tries &lt;code&gt;proxy.golang.org&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipe fix
&lt;/h2&gt;

&lt;p&gt;Go 1.15 introduced the pipe separator (&lt;code&gt;|&lt;/code&gt;) as an alternative to commas. With a pipe, Go falls through on &lt;strong&gt;any error&lt;/strong&gt;, including network failures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GOPROXY="corp-proxy.internal|https://proxy.golang.org,direct"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the mix of separators. The pipe between the corporate proxy and the public proxy means “if the corporate proxy is unreachable, try the public one.” The comma between the public proxy and &lt;code&gt;direct&lt;/code&gt; means “only go direct if the public proxy returns 404” — which is the safer default for the last hop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not use pipes everywhere?
&lt;/h2&gt;

&lt;p&gt;The comma separator exists for a reason: &lt;strong&gt;privacy&lt;/strong&gt;. When Go tries to fetch a module from a proxy, it reveals the module path in the request URL. If your corporate proxy is down and you use pipes everywhere, Go would send your private module paths (&lt;code&gt;github.com/your-company/secret-service&lt;/code&gt;) to &lt;code&gt;proxy.golang.org&lt;/code&gt; before finally trying to fetch them directly.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;GOPRIVATE&lt;/code&gt; and &lt;code&gt;GONOPROXY&lt;/code&gt; environment variables mitigate this. Modules matching those patterns bypass the proxy chain entirely and are fetched directly from source. If you set &lt;code&gt;GOPRIVATE&lt;/code&gt; correctly, the pipe separator is safe for your use case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GOPRIVATE=github.com/your-company
export GONOPROXY=github.com/your-company
export GONOSUMDB=github.com/your-company,go.internal.example.com
export GOPROXY="corp-proxy.internal|https://proxy.golang.org,direct"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this setup, private modules never touch any proxy. Public modules try the corporate proxy first (fast, cached, available on VPN), fall back to the public proxy on failure, and go direct as a last resort.&lt;/p&gt;
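
&lt;p&gt;If you’d rather not touch your shell config at all, the same values can be written once to Go’s own environment file with &lt;code&gt;go env -w&lt;/code&gt; (a sketch using the same placeholder names as above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Persisted to the file reported by `go env GOENV`, so it applies to
# every shell, editor, and CI step that uses this Go installation
go env -w GOPRIVATE=github.com/your-company
go env -w GONOPROXY=github.com/your-company
go env -w 'GOPROXY=corp-proxy.internal|https://proxy.golang.org,direct'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One caveat: real environment variables take precedence over values stored with &lt;code&gt;go env -w&lt;/code&gt;, so drop the old &lt;code&gt;export&lt;/code&gt; lines if you switch.&lt;/p&gt;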

&lt;h2&gt;
  
  
  The full picture
&lt;/h2&gt;

&lt;p&gt;Here’s how Go resolves a module with this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;go get google.golang.org/grpc@v1.77.0

1. Does "google.golang.org/grpc" match GOPRIVATE? No.
2. Try corp-proxy.internal -&amp;gt; TCP timeout (VPN off)
3. Separator is "|" -&amp;gt; fall through on any error
4. Try proxy.golang.org -&amp;gt; 200 OK, module found
5. Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And for a private module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;go get github.com/your-company/secret-service@latest

1. Does "github.com/your-company/secret-service" match GOPRIVATE? Yes.
2. Skip proxy chain entirely.
3. Fetch directly from github.com via git.
4. Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  One-line fix
&lt;/h2&gt;

&lt;p&gt;If you’re in this situation, the fix is a single character change in your shell config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- export GOPROXY=corp-proxy.internal,https://proxy.golang.org,direct
+ export GOPROXY="corp-proxy.internal|https://proxy.golang.org,direct"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload your shell (&lt;code&gt;source ~/.zshrc&lt;/code&gt;) and Go will gracefully fall back to the public proxy whenever your corporate proxy is unreachable. No more waiting for timeouts to tell you what you already know.&lt;/p&gt;

</description>
      <category>go</category>
      <category>networking</category>
      <category>tooling</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Stop Putting API Keys in Your Shell Config</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Thu, 12 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/stop-putting-api-keys-in-your-shell-config-119o</link>
      <guid>https://forem.com/kakkoyun/stop-putting-api-keys-in-your-shell-config-119o</guid>
      <description>&lt;p&gt;We all know better. Don’t hardcode secrets. Use a vault. Rotate your keys. We’ve been saying this for years.&lt;/p&gt;

&lt;p&gt;And then the &lt;strong&gt;agentic coding boom&lt;/strong&gt; happened.&lt;/p&gt;

&lt;p&gt;Suddenly every tool wants an API key. OpenAI, Anthropic, Gemini, Groq, Mistral, Replicate—the list grows weekly. And where do those keys end up? Right there in &lt;code&gt;.zshrc&lt;/code&gt;, in plain text, because you needed it working &lt;em&gt;right now&lt;/em&gt; and you were going to fix it later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The "I'll fix this later" hall of shame
export OPENAI_API_KEY=sk-proj-abc123...
export ANTHROPIC_API_KEY=sk-ant-xyz789...
export GEMINI_API_KEY=AIzaSy...

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I caught myself doing exactly this. Two API keys, sitting in my dotfiles, probably backed up to Time Machine, possibly in shell history, definitely in my terminal scrollback. Let’s fix this properly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Plain text API keys in shell configs are bad for reasons you already know:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Shell history&lt;/strong&gt; — &lt;code&gt;~/.zsh_history&lt;/code&gt; records commands, and sometimes you &lt;code&gt;echo $OPENAI_API_KEY&lt;/code&gt; to debug something&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backup snapshots&lt;/strong&gt; — Time Machine, cloud backups, dotfile repos all capture the file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shoulder surfing&lt;/strong&gt; — &lt;code&gt;cat ~/.zshrc&lt;/code&gt; during a screen share or a pairing session&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terminal scrollback&lt;/strong&gt; — the key is sitting in your terminal buffer right now&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And this isn’t just a theoretical risk. Attackers actively scan repos and backups for unprotected credentials — and when they find stolen API keys, they rack up thousands of dollars in charges. The platform bills the original owner.&lt;/p&gt;

&lt;p&gt;The “I’ll rotate it later” never comes. Meanwhile these keys have billing attached to them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: 1Password CLI
&lt;/h2&gt;

&lt;p&gt;If you use 1Password, you already have a secret manager with biometric unlock, audit logging, and team sharing. The &lt;code&gt;op&lt;/code&gt; CLI lets you pull secrets into your shell without ever writing them to disk.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Install the CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew install --cask 1password-cli

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable the CLI integration in 1Password desktop app: &lt;strong&gt;Settings &amp;gt; Developer &amp;gt; Connect with 1Password CLI&lt;/strong&gt;. This lets the CLI authenticate via the desktop app (Touch ID on Mac) instead of requiring a separate login.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Store Your Keys
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;op item create \
  --category="API Credential" \
  --title="OpenAI API Key" \
  --vault="Private" \
  "credential=sk-proj-your-key-here"

op item create \
  --category="API Credential" \
  --title="Gemini API Key" \
  --vault="Private" \
  "credential=AIzaSy-your-key-here"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Replace Hardcoded Values
&lt;/h3&gt;

&lt;p&gt;In your &lt;code&gt;.zshrc&lt;/code&gt; (or &lt;code&gt;.bashrc&lt;/code&gt;, &lt;code&gt;.profile&lt;/code&gt;, whatever you use):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- export OPENAI_API_KEY=sk-proj-abc123...
- export GEMINI_API_KEY=AIzaSy...
+ export OPENAI_API_KEY=$(op read "op://Private/OpenAI API Key/credential" --no-newline 2&amp;gt;/dev/null)
+ export GEMINI_API_KEY=$(op read "op://Private/Gemini API Key/credential" --no-newline 2&amp;gt;/dev/null)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it. Three steps. The keys now live in 1Password, protected by your master password and biometric auth.&lt;/p&gt;

&lt;p&gt;One catch: this triggers a 1Password biometric prompt every time you open a terminal. If that bothers you (it bothered me), see Shell Startup Speed for the lazy-loading version that only prompts when you actually run a command.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Rotate the Old Keys
&lt;/h3&gt;

&lt;p&gt;This is the step people skip. &lt;strong&gt;Do it now.&lt;/strong&gt; The old keys have been in plaintext. Assume they’re compromised.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;platform.openai.com/api-keys&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Google AI: &lt;a href="https://aistudio.google.com/apikey" rel="noopener noreferrer"&gt;aistudio.google.com/apikey&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic: &lt;a href="https://console.anthropic.com/settings/keys" rel="noopener noreferrer"&gt;console.anthropic.com/settings/keys&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generate new keys, update the 1Password items with &lt;code&gt;op item edit&lt;/code&gt;, and you’re done.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Details Worth Knowing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why &lt;code&gt;--no-newline&lt;/code&gt;?
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;op read&lt;/code&gt; appends a trailing newline by default. API keys with a stray newline cause cryptic authentication failures—the kind where the key “looks right” but every request returns 401. The &lt;code&gt;--no-newline&lt;/code&gt; flag strips it.&lt;/p&gt;
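
&lt;p&gt;You can see the stray byte without touching 1Password at all; in this sketch &lt;code&gt;printf&lt;/code&gt; stands in for &lt;code&gt;op read&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# printf simulates op read's output; wc -c counts what lands in the file
printf 'sk-test-123\n' &amp;gt; /tmp/key_default   # default: value plus trailing newline
printf 'sk-test-123'   &amp;gt; /tmp/key_trimmed   # --no-newline: value only
wc -c &amp;lt; /tmp/key_default   # 12 bytes
wc -c &amp;lt; /tmp/key_trimmed   # 11 bytes

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Command substitution with &lt;code&gt;$(...)&lt;/code&gt; strips trailing newlines on its own, but the flag keeps you safe when the value goes to a file, a pipe, or a config template.)&lt;/p&gt;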

&lt;h3&gt;
  
  
  Why &lt;code&gt;2&amp;gt;/dev/null&lt;/code&gt;?
&lt;/h3&gt;

&lt;p&gt;If 1Password is locked or the CLI isn’t authenticated, &lt;code&gt;op read&lt;/code&gt; writes an error to stderr. The redirect silences that so you don’t get a wall of errors every time you open a terminal without 1Password unlocked. The variable simply becomes empty.&lt;/p&gt;

&lt;p&gt;The tradeoff: a misconfigured vault path also fails silently. Test it once after setup, and you’re fine.&lt;/p&gt;
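
&lt;p&gt;A quick post-setup check along those lines (the variable names are the ones from this post; swap in your own):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Warn about any expected key that ended up empty (e.g. a typo'd op:// path)
for var in OPENAI_API_KEY GEMINI_API_KEY; do
  val=$(eval "printf '%s' \"\${$var:-}\"")
  [ -z "$val" ] &amp;amp;&amp;amp; echo "WARN: $var is empty; check its op:// reference" &amp;gt;&amp;amp;2
done

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;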

&lt;h3&gt;
  
  
  What About Shell Startup Speed?
&lt;/h3&gt;

&lt;p&gt;The eager approach above runs &lt;code&gt;op read&lt;/code&gt; at shell init, which means every new terminal triggers a 1Password biometric prompt. If you open terminals frequently, this gets old fast.&lt;/p&gt;

&lt;p&gt;The fix is lazy loading with command-specific triggers. In zsh, the &lt;code&gt;preexec&lt;/code&gt; hook fires right before a command executes and receives the command string — perfect for deciding &lt;em&gt;which&lt;/em&gt; secrets to load &lt;em&gt;when&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Map: env var → 1Password secret reference
typeset -A _op_refs=(
  OPENAI_API_KEY "op://Private/OpenAI API Key/credential"
  GEMINI_API_KEY "op://Private/Gemini API Key/credential"
)

# Map: command → which keys it needs
typeset -A _op_cmd_keys=(
  codex "OPENAI_API_KEY"
  aider "OPENAI_API_KEY GEMINI_API_KEY"
  gemini "GEMINI_API_KEY"
)

_maybe_load_op_secrets() {
  local cmd="${1%% *}" # extract first word
  cmd="${cmd##*/}" # strip path prefix
  local keys="${_op_cmd_keys[$cmd]}"
  [[ -z "$keys" ]] &amp;amp;&amp;amp; return
  for key in ${=keys}; do
    [[ -n "${(P)key}" ]] &amp;amp;&amp;amp; continue # already loaded
    export "$key=$(op read "${_op_refs[$key]}" --no-newline 2&amp;gt;/dev/null)"
  done
}
preexec_functions+=(_maybe_load_op_secrets)

# Manual fallback: load everything
load-secrets() {
  for key ref in "${(@kv)_op_refs}"; do
    export "$key=$(op read "$ref" --no-newline 2&amp;gt;/dev/null)"
  done
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you three properties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No startup cost&lt;/strong&gt; — terminal opens instantly, no biometric prompt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Least privilege&lt;/strong&gt; — &lt;code&gt;codex&lt;/code&gt; only loads &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, not every secret you have&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load once&lt;/strong&gt; — each key is fetched at most once per session (the &lt;code&gt;${(P)key}&lt;/code&gt; guard skips keys that are already set)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adding a new tool is one line in &lt;code&gt;_op_cmd_keys&lt;/code&gt;. Adding a new key is one line in &lt;code&gt;_op_refs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you have multiple 1Password accounts (personal + work), add &lt;code&gt;--account=my.1password.com&lt;/code&gt; to the &lt;code&gt;op read&lt;/code&gt; calls to avoid vault name collisions.&lt;/p&gt;

&lt;p&gt;For even more granularity:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;op run&lt;/code&gt;&lt;/strong&gt; — inject secrets into a specific command rather than the global environment:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Only injects the key for this one command
op run --env-file=.env.1password -- python train.py

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;op inject&lt;/code&gt;&lt;/strong&gt; — when you have a dozen keys, individual &lt;code&gt;op read&lt;/code&gt; calls add up. With &lt;code&gt;op inject&lt;/code&gt;, you define all your secrets in a single template and load them in one shot:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ~/.env.op (template — safe to commit, contains no secrets)
export OPENAI_API_KEY={{ op://Private/OpenAI API Key/credential }}
export GEMINI_API_KEY={{ op://Private/Gemini API Key/credential }}
export ANTHROPIC_API_KEY={{ op://Private/Anthropic API Key/credential }}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# In .zshrc — one CLI call loads everything
eval "$(op inject --in-file ~/.env.op)"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is substantially faster than N individual &lt;code&gt;op read&lt;/code&gt; calls — the CLI resolves all references in a single authentication round-trip.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Scoped injection&lt;/strong&gt; — skip the global environment entirely and inject a key for exactly one command’s lifetime:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=$(op read "op://Private/OpenAI API Key/credential" --no-newline) python train.py

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key exists only in that command’s process environment. Nothing touches your shell, nothing lingers after the process exits. This is the most paranoid option, and it’s great for CI scripts or one-off runs.&lt;/p&gt;
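
&lt;p&gt;The “nothing lingers” part is plain shell semantics, easy to verify with a dummy value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The VAR=value cmd form sets the variable only in cmd's environment
DEMO_KEY=sk-dummy env | grep '^DEMO_KEY='   # the child process sees it
echo "${DEMO_KEY:-unset}"                   # the shell itself never did: prints "unset"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;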

&lt;h3&gt;
  
  
  What About macOS Keychain?
&lt;/h3&gt;

&lt;p&gt;macOS Keychain (&lt;code&gt;security find-generic-password&lt;/code&gt;) works too and has zero startup overhead since it’s always unlocked when you’re logged in. I use it for some tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GITLAB_TOKEN=$(security find-generic-password -a ${USER} -s gitlab_token -w)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The advantage of 1Password over Keychain: cross-device sync, team sharing, audit logs, and a UI that doesn’t make you question your life choices. Use whichever fits your workflow. The point is to stop storing secrets in plain text.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Agentic Boom Made This Worse
&lt;/h2&gt;

&lt;p&gt;A year ago, most developers had maybe one or two API keys. Now? I know people with &lt;strong&gt;six or more&lt;/strong&gt; AI service keys in their shell config. Coding agents need them. MCP servers need them. Every new tool in the ecosystem asks you to “just export your API key” and the docs always show the hardcoded version because it’s simpler to explain.&lt;/p&gt;

&lt;p&gt;MCP servers are the newest vector here. Tools like Claude Code, Cursor, and Windsurf use configuration files (&lt;code&gt;claude_desktop_config.json&lt;/code&gt;, &lt;code&gt;mcp.json&lt;/code&gt;) that store API keys for tool servers. The LLM itself never sees the secret values — the MCP server process does — but only if you inject them properly. Hardcoding keys in MCP configs is the same mistake as hardcoding them in &lt;code&gt;.zshrc&lt;/code&gt;, just in a newer file. The &lt;code&gt;op&lt;/code&gt; CLI works here too: use &lt;code&gt;op run&lt;/code&gt; or environment variable references in your MCP server configs instead of raw keys.&lt;/p&gt;

&lt;p&gt;This is a tooling culture problem. The default getting-started experience for almost every AI API is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export MAGIC_AI_KEY=your-key-here # don't do this

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We should normalize showing the secure version in documentation. Until that happens, take five minutes and move your keys to a vault. Your future self (and your billing page) will thank you.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Before: plain text keys in .zshrc
export OPENAI_API_KEY=sk-proj-...

# After: lazy-loaded from 1Password, per-command per-key
typeset -A _op_refs=(
  OPENAI_API_KEY "op://Private/OpenAI API Key/credential"
  GEMINI_API_KEY "op://Private/Gemini API Key/credential"
)
typeset -A _op_cmd_keys=(
  codex "OPENAI_API_KEY"
  aider "OPENAI_API_KEY GEMINI_API_KEY"
)
_maybe_load_op_secrets() {
  local cmd="${1%% *}"; cmd="${cmd##*/}"
  local keys="${_op_cmd_keys[$cmd]}"
  [[ -z "$keys" ]] &amp;amp;&amp;amp; return
  for key in ${=keys}; do
    [[ -n "${(P)key}" ]] &amp;amp;&amp;amp; continue
    export "$key=$(op read "${_op_refs[$key]}" --no-newline 2&amp;gt;/dev/null)"
  done
}
preexec_functions+=(_maybe_load_op_secrets)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install &lt;code&gt;op&lt;/code&gt;, store your keys, replace the exports, rotate the old keys. Five minutes. Zero excuses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://1password.com/blog/securing-mcp-servers-with-1password-stop-credential-exposure-in-your-agent" rel="noopener noreferrer"&gt;Securing MCP Servers with 1Password&lt;/a&gt; — 1Password’s take on stopping credential exposure in agent configurations&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://williamcallahan.com/blog/secure-environment-variables-1password-doppler-llms-mcps-ai-tools" rel="noopener noreferrer"&gt;Secure Environment Variables for LLMs, MCPs, and AI Tools&lt;/a&gt; — William Callahan’s walkthrough of using 1Password CLI and Doppler for AI tool secrets&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://1password.com/blog/where-mcp-fits-and-where-it-doesnt" rel="noopener noreferrer"&gt;Where MCP Fits and Where It Doesn’t&lt;/a&gt; — 1Password on the security model of MCP and credential boundaries&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.1password.com/docs/cli/secret-references/" rel="noopener noreferrer"&gt;1Password CLI: Secret References&lt;/a&gt; — official docs on the &lt;code&gt;op://&lt;/code&gt; URI scheme&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.1password.com/docs/cli/reference/commands/inject/" rel="noopener noreferrer"&gt;1Password CLI: &lt;code&gt;op inject&lt;/code&gt;&lt;/a&gt; — batch-load secrets from template files&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.1password.com/docs/cli/shell-plugins/" rel="noopener noreferrer"&gt;1Password Shell Plugins&lt;/a&gt; — native integrations for CLI tools like &lt;code&gt;gh&lt;/code&gt;, &lt;code&gt;aws&lt;/code&gt;, and &lt;code&gt;stripe&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>cli</category>
      <category>security</category>
    </item>
    <item>
      <title>Vibe Coding with Cursor: My R&amp;D Week Adventure 🚀</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Wed, 12 Mar 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/vibe-coding-with-cursor-my-rd-week-adventure-52g</link>
      <guid>https://forem.com/kakkoyun/vibe-coding-with-cursor-my-rd-week-adventure-52g</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;TL;DR: Spent a week building cool stuff with &lt;a href="https://cursor.com" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, an AI-powered IDE. Found it surprisingly effective for both coding and managing my &lt;a href="https://www.buildingasecondbrain.com/" rel="noopener noreferrer"&gt;second brain&lt;/a&gt;. When your requirements are clear, it’s almost magical! ✨&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Setup: R&amp;amp;D Week Vibes
&lt;/h2&gt;

&lt;p&gt;You know that feeling when R&amp;amp;D week rolls around, and you’re caught between “I should learn something useful” and “I want to have fun”? Well, this time I decided to combine both by diving deep into &lt;a href="https://cursor.com" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, an AI-powered code editor that’s been making waves in the developer community.&lt;/p&gt;

&lt;p&gt;The mission was simple: Use Cursor for &lt;strong&gt;everything&lt;/strong&gt;, from managing my notes to building small task-specific projects. And by everything, I mean &lt;em&gt;everything&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes Cursor Different?
&lt;/h2&gt;

&lt;p&gt;Unlike traditional IDEs that just help you write code, Cursor feels more like having a pair programmer who actually gets your context. It’s built on top of VSCode (so you get all the good stuff you’re used to) but adds a layer of AI-powered features that make development feel more… vibey? 😎&lt;/p&gt;

&lt;h3&gt;
  
  
  The Good Parts
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context-Aware AI&lt;/strong&gt;: The AI understands your project structure and can help with everything from code completion to refactoring. For example, when working on a React component, it automatically suggested appropriate hooks and state management patterns based on my component’s purpose. When your requirements are clear, it’s almost magical how it can scaffold projects and implement patterns!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.cursor.com/context/rules-for-ai" rel="noopener noreferrer"&gt;Rules Feature&lt;/a&gt;&lt;/strong&gt;: This is where things get interesting. You can create custom rules and context for different types of work, both at the project and global level. Think project-specific coding standards, documentation patterns, and even architecture guidelines.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.cursor.com/beta/notepads" rel="noopener noreferrer"&gt;Notepads&lt;/a&gt;&lt;/strong&gt;: Quick thoughts? Code snippets? The notepad feature is like having a smart scratchpad that understands code and can share context between different parts of your development workflow.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Second Brain Management: A Pleasant Surprise
&lt;/h2&gt;

&lt;p&gt;One of my unexpected discoveries was how well Cursor handles note-taking and &lt;a href="https://www.buildingasecondbrain.com/" rel="noopener noreferrer"&gt;second brain&lt;/a&gt; management. Here’s what made it click for me:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Markdown support is top-notch
- AI understands context across files
- Easy to maintain structure with rules
- Quick navigation between related notes
- File attachments for enhanced documentation
- Dynamic references using @ mentions

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Rules Feature: A Game Changer
&lt;/h3&gt;

&lt;p&gt;The rules feature deserves its own spotlight. Cursor offers two powerful ways to customize AI behavior (note that the older &lt;code&gt;.cursorrules&lt;/code&gt; file is being deprecated in favor of this new system):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Project Rules&lt;/strong&gt; (&lt;code&gt;.cursor/rules&lt;/code&gt; directory)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global Rules&lt;/strong&gt; (Cursor Settings)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Pro tip: Use project rules whenever possible - they’re more flexible, can be version controlled, and provide better granular control over different parts of your project.&lt;/p&gt;
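For illustration, a project rule is just a small file under `.cursor/rules` that scopes instructions to part of the repository. The file name, globs, and guidelines below are invented, and the exact frontmatter fields may differ between Cursor versions:

```
---
description: Writing guidelines for technical blog posts
globs: ["posts/**/*.md"]
---
- Prefer short paragraphs and concrete examples
- Use sentence-case headings
- Link to primary sources rather than summaries
```

Because these live in the repository, they travel with the code and evolve through the same review process as everything else.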

&lt;p&gt;I’ve set up different contexts for various types of work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Technical blog posts (with specific writing guidelines)&lt;/li&gt;
&lt;li&gt;Project documentation (with architecture patterns)&lt;/li&gt;
&lt;li&gt;Personal notes (with custom templates)&lt;/li&gt;
&lt;li&gt;Code standards (with framework-specific rules)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each context comes with its own set of rules and AI behavior. It’s like having multiple specialized assistants at your disposal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Notepads: Beyond Simple Notes
&lt;/h3&gt;

&lt;p&gt;The Notepads feature (currently in beta) has been a revelation. Think of them as enhanced reference documents that go beyond regular &lt;code&gt;.cursorrules&lt;/code&gt;. I use them for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dynamic Boilerplate Generation&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Architecture Documentation&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Development Guidelines&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The ability to share context between composers and chat interactions makes them incredibly powerful. Plus, you can attach files and use @ mentions to create a web of connected knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Small Projects, Big Impact
&lt;/h2&gt;

&lt;p&gt;During the week, I worked on several small, task-specific projects. The workflow typically went like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new project with clear requirements&lt;/li&gt;
&lt;li&gt;Set up project-specific rules and templates&lt;/li&gt;
&lt;li&gt;Let the AI handle boilerplate and routine coding&lt;/li&gt;
&lt;li&gt;Focus on architecture and edge cases&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The AI handled a lot of the repetitive work, letting me focus on the creative aspects of each project. The clearer my requirements were, the more magical the results became. ✨&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI-Powered Doesn’t Mean AI-Dependent&lt;/strong&gt;: Cursor enhances your workflow without taking over.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rules Are Your Friend&lt;/strong&gt;: Taking time to set up proper rules pays off immensely.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context is King&lt;/strong&gt;: The more context you provide, the better the AI assistance becomes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Second Brain Benefits&lt;/strong&gt;: It’s not just for coding; it’s a genuine knowledge management tool.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Clear Requirements = Magic&lt;/strong&gt;: The more precise your task definition, the better the results.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What’s Next?
&lt;/h2&gt;

&lt;p&gt;I’m planning to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expand my rule sets for different types of work&lt;/li&gt;
&lt;li&gt;Create more structured templates for common architectural patterns&lt;/li&gt;
&lt;li&gt;Explore advanced AI features like multi-file refactoring&lt;/li&gt;
&lt;li&gt;Share my rules and templates with the community&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;R&amp;amp;D weeks are about trying new things and finding better ways to work. This experiment with Cursor turned out to be more than just playing with a new tool; it’s changed how I think about IDE capabilities and knowledge management.&lt;/p&gt;

&lt;p&gt;The combination of familiar VSCode features with AI assistance, especially the rules system, makes it a powerful tool for both coding and knowledge work. It’s not perfect (what is?), but it’s definitely earned its place in my daily toolkit.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember: The best tools are the ones that enhance your natural workflow rather than forcing you to adapt to them. Cursor does this surprisingly well. 👍&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>cursor</category>
      <category>vibecoding</category>
      <category>secondbrain</category>
    </item>
    <item>
      <title>FOSDEM 2025: Blimey, What a Weekend!</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Tue, 04 Feb 2025 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/fosdem-2025-blimey-what-a-weekend-3191</link>
      <guid>https://forem.com/kakkoyun/fosdem-2025-blimey-what-a-weekend-3191</guid>
      <description>&lt;h3&gt;
  
  
  Another Year, Another FOSDEM
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;FOSDEM&lt;/strong&gt;—the annual pilgrimage to &lt;strong&gt;Brussels&lt;/strong&gt; for a weekend of open-source brilliance, hallway track magic, and the inevitable sleep deprivation. This year’s &lt;strong&gt;Free and Open Source Software Developers’ European Meeting&lt;/strong&gt; was, as always, a whirlwind of ideas, people, and tech so bleeding-edge it practically needed bandages.&lt;/p&gt;

&lt;p&gt;But for me? It was all about &lt;strong&gt;seeing friends&lt;/strong&gt;. Catching up, syncing, and squeezing in as many conversations as humanly possible. As we always say—the &lt;strong&gt;hallway track is the real conference&lt;/strong&gt;. I’m beyond grateful for the people I managed to see, and equally bummed about those I missed. But with a toddler waiting at home, even carving out this limited time was a logistical miracle.&lt;/p&gt;

&lt;h3&gt;
  
  
  Saturday: Go, Go, Go… and the eBPF Black Hole
&lt;/h3&gt;

&lt;p&gt;Saturday kicked off with a deep dive into the &lt;strong&gt;Go DevRoom&lt;/strong&gt;, before a (failed) mission to infiltrate the &lt;strong&gt;eBPF&lt;/strong&gt; talks.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Go Goodness&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The &lt;strong&gt;Go DevRoom&lt;/strong&gt; delivered as expected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"&lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-5353-the-state-of-go/" rel="noopener noreferrer"&gt;The State of Go&lt;/a&gt;"&lt;/strong&gt; – Maartje Eyskens gave a solid rundown on where Go is headed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"&lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-6049-swiss-maps-in-go/" rel="noopener noreferrer"&gt;Swiss Maps in Go&lt;/a&gt;"&lt;/strong&gt; – Bryan Boreham took us through these lightning-fast maps. &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifoyhgwgvsd9vwn5kbl5.jpeg" alt="Swiss Maps in Go talk" width="800" height="600"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"&lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-5343-go-ing-easy-on-memory-writing-gc-friendly-code/" rel="noopener noreferrer"&gt;Go-ing Easy on Memory: Writing GC-Friendly Code&lt;/a&gt;"&lt;/strong&gt; – Sümer Cip’s talk was a timely reminder that, yes, your garbage collection problems are (probably) your fault.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;eBPF Fail&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The &lt;strong&gt;eBPF DevRoom&lt;/strong&gt;? Packed. Absolutely impenetrable. As someone put it on Twitter:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Nobody leaves #eBPF room at #FOSDEM, so nobody gets in. 🥲”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Next year, I’m bringing a tent and camping outside the door.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sunday: Monitoring, Metrics, and Maybe Too Many Frites
&lt;/h3&gt;

&lt;p&gt;Sunday was all about observability, performance, and squeezing every bit of insight from running systems.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Observability Overload&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The &lt;strong&gt;Monitoring and Observability DevRoom&lt;/strong&gt; had a strong lineup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Richard “RichiH” Hartmann &lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-6715-monitoring-and-observability-devroom-opening/" rel="noopener noreferrer"&gt;set the stage&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"&lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-5502-the-performance-impact-of-auto-instrumentation/" rel="noopener noreferrer"&gt;The Performance Impact of Auto-Instrumentation&lt;/a&gt;"&lt;/strong&gt; – James Belchamber gave a fantastic talk on the hidden costs of auto-instrumentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"&lt;a href="https://fosdem.org/2025/schedule/event/fosdem-2025-6571-prometheus-version-3/" rel="noopener noreferrer"&gt;Prometheus Version 3&lt;/a&gt;"&lt;/strong&gt; – Jan Fajerski and Bryan Boreham gave us the lowdown on what’s next for Prometheus. &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cdq5umh7ebk0yryrlxb.jpeg" alt="Prometheus 3 talk" width="800" height="600"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Community Vibes
&lt;/h3&gt;

&lt;p&gt;Like I said—&lt;strong&gt;FOSDEM&lt;/strong&gt; is really about the people. The talks are great, but the real magic happens in the hallway track. Some of the best conversations weren’t planned; they just happened over coffee, between sessions, or during a frantic sprint between buildings.&lt;/p&gt;

&lt;p&gt;I’m incredibly happy for the folks I got to see, and at the same time, I wish I had more time to catch up with everyone I missed. But life is about balance, and with a little one waiting at home, I had to make every moment count.&lt;/p&gt;

&lt;p&gt;Oh, and the &lt;strong&gt;frites&lt;/strong&gt;? Still undefeated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bonus: Trains, Chaos, and a Race Against Time
&lt;/h3&gt;

&lt;p&gt;Because no trip is complete without &lt;strong&gt;public transport drama&lt;/strong&gt;, my journey back home came with an extra dose of stress. Trains? Cancelled. Schedule? A mess. Plane? Hanging by a thread. &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmnsu832edfgkp3zltf4.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmnsu832edfgkp3zltf4.jpeg" alt="Train chaos while trying to catch my flight" width="800" height="1066"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Somehow, I made it. But FOSDEM weekend wouldn’t be complete without at least one unexpected adventure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;FOSDEM 2025&lt;/strong&gt; delivered. Again. Already looking forward to next year. If you’re into open source and haven’t experienced &lt;strong&gt;FOSDEM&lt;/strong&gt;, sort it out.&lt;/p&gt;

</description>
      <category>fosdem</category>
      <category>conference</category>
      <category>opensource</category>
    </item>
    <item>
      <title>When Hustle Culture and Personal Values Collide: Lessons from My Startup Journey</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Wed, 16 Oct 2024 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/when-hustle-culture-and-personal-values-collide-lessons-from-my-startup-journey-59o4</link>
      <guid>https://forem.com/kakkoyun/when-hustle-culture-and-personal-values-collide-lessons-from-my-startup-journey-59o4</guid>
      <description>&lt;p&gt;Startups can be exciting arenas of innovation, filled with ambitious goals, rapid development cycles, and the allure of shaping the future. But when the pace becomes unsustainable, and personal values clash with company culture, the dream can quickly lose its luster. My recent experience at a machine learning inference startup taught me invaluable lessons about overwork, alignment, and the balance between idealism and pragmatism.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Decided to Leave
&lt;/h2&gt;

&lt;p&gt;The decision to leave wasn’t easy, but it became necessary when I realized that the environment was not compatible with my personal and professional priorities.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Overwork as a Default&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The company embraced a hustle culture where working over 10 hours a day and being on-call 24/7 was normalized. This wasn’t limited to crunch times—it was the baseline expectation. For someone with a newborn at home, this level of overwork was unsustainable and detrimental to my family life.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lack of Empathy and Transparency&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Despite knowing about my life situation, the company struggled to adjust its expectations. With no parents on the team, there was little understanding of what it meant to balance work and family. Additionally, expectations around work hours and deliverables weren’t clearly communicated during onboarding.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Misaligned Values&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I joined with the goal of building resilient, scalable, high-performance systems while contributing to open-source projects—a passion of mine. However, the company prioritized rapid feature delivery and short-term metrics over reliability, sustainability, or open-source contributions. This fundamental misalignment created constant friction.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  My Mistakes
&lt;/h2&gt;

&lt;p&gt;While the cultural mismatch played a significant role, I also made mistakes that compounded the challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Over-Optimizing Instead of Delivering&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I leaned into finding ideal solutions rather than delivering quick, practical implementations. In a fast-paced startup, speed often outweighs perfection.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Focusing Too Much on Learning&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
My desire to deeply understand and control every detail of the platform slowed me down. While this mindset works well in some roles, it was counterproductive in a high-pressure, delivery-focused environment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prioritizing Reliability Over Features&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I instinctively gravitated toward improving system reliability and long-term sustainability, even when it was clear the company valued rapid feature delivery instead. This misalignment of priorities made my efforts less impactful in their eyes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Spending Time on Open Source&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I worked on improving and contributing to open-source tools, which I saw as valuable. However, the company didn’t share this enthusiasm, and my efforts were viewed as misaligned with their goals.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Reflecting on this experience, I’ve taken away several key lessons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cultural Fit is Critical&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No matter how exciting the technology or mission, if the company’s culture doesn’t align with your values, frustrations will inevitably arise. Startups that glorify overwork are not sustainable for someone who values balance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Clarify Expectations Early&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Misaligned expectations around priorities and success metrics can derail even the most skilled engineers. Asking detailed questions during interviews and onboarding is essential to ensure alignment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Balance Idealism with Pragmatism&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Striking a balance between delivering quick wins and building sustainable systems is key, especially in startups. Knowing when to prioritize speed over perfection is a crucial skill.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Stay True to Your Priorities&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
For me, being present for my family and maintaining a balanced life outweighs any professional ambition. Leaving the role wasn’t easy, but it was the right decision for my well-being and values.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This experience was a humbling reminder of the importance of alignment—between personal values, company culture, and role expectations. While I’ve always thrived at the intersection of systems engineering and challenging problems, this chapter underscored the need for environments that respect the individual, not just the output.&lt;/p&gt;

&lt;p&gt;Startups can be transformative experiences for those who thrive on rapid growth and ambiguity. But for those who prioritize balance and long-term thinking, it’s critical to choose an organization that values these traits. For me, this experience reaffirmed the importance of staying true to my values, even when the professional stakes are high.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Profiling Python with eBPF: A New Frontier in Performance Analysis</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Mon, 12 Feb 2024 00:00:00 +0000</pubDate>
      <link>https://forem.com/kakkoyun/profiling-python-with-ebpf-a-new-frontier-in-performance-analysis-2dmj</link>
      <guid>https://forem.com/kakkoyun/profiling-python-with-ebpf-a-new-frontier-in-performance-analysis-2dmj</guid>
      <description>&lt;h1&gt;
  
  
  Profiling Python with eBPF: A New Frontier in Performance Analysis
&lt;/h1&gt;

&lt;p&gt;Profiling Python applications can be challenging, especially in scenarios involving high-performance requirements or complex workloads. Existing tools often require code instrumentation, making them impractical for certain use cases. Enter &lt;a href="https://ebpf.io/" rel="noopener noreferrer"&gt;eBPF&lt;/a&gt; (Extended Berkeley Packet Filter)—a revolutionary Linux technology—and the open-source project &lt;a href="https://parca.dev" rel="noopener noreferrer"&gt;Parca&lt;/a&gt;, which together are reshaping the landscape of Python profiling.&lt;/p&gt;

&lt;p&gt;In this post, I’ll explore how eBPF enables continuous profiling, discuss challenges like stack unwinding in Python, and demonstrate the power of modern profiling tools.&lt;/p&gt;

&lt;p&gt;You can also watch my &lt;a href="https://youtu.be/nNbU26CoMWA?si=t3Mh1z6XfNwa5r7M" rel="noopener noreferrer"&gt;full talk here&lt;/a&gt; or refer to the &lt;a href="https://kakkoyun.me/notes/presentations/FOSDEM24+-+Profiling+Python+with+eBPF+-+A+New+Frontier+in+Performance+Analysis" rel="noopener noreferrer"&gt;slides from the presentation&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Do We Need Profiling?
&lt;/h2&gt;

&lt;p&gt;Profiling helps optimize performance and troubleshoot issues, such as CPU spikes, memory leaks, or out-of-memory (OOM) events. For instance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance optimization:&lt;/strong&gt; Identifying bottlenecks in code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incident resolution:&lt;/strong&gt; Determining which function or component caused a memory spike or CPU overload.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional Python profiling tools like &lt;a href="https://docs.python.org/3/library/profile.html" rel="noopener noreferrer"&gt;&lt;code&gt;cProfile&lt;/code&gt;&lt;/a&gt; require application instrumentation, which isn’t always feasible, especially in production environments where code access might be restricted; even external samplers like &lt;a href="https://github.com/benfred/py-spy" rel="noopener noreferrer"&gt;&lt;code&gt;py-spy&lt;/code&gt;&lt;/a&gt; aren’t designed for always-on, fleet-wide profiling. This is where eBPF shines, offering non-intrusive, external profiling.&lt;/p&gt;
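For contrast, here is a minimal sketch of the instrumented, in-process approach using the standard library's `cProfile` (the `busy_work` function is just a stand-in workload); this is exactly the code change that external eBPF profilers let you avoid:

```python
import cProfile
import io
import pstats


def busy_work(n):
    # Deliberately quadratic so it dominates the profile.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


profiler = cProfile.Profile()
profiler.enable()
busy_work(200)
profiler.disable()

# Render the hottest functions by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The profiler has to wrap the code under measurement, which is precisely why this style of tool is hard to deploy against services you don't own or can't redeploy.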




&lt;h2&gt;
  
  
  Existing Profiling Solutions in Python
&lt;/h2&gt;

&lt;p&gt;The Python ecosystem offers several profiling tools, each with unique strengths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.python.org/3/library/profile.html" rel="noopener noreferrer"&gt;&lt;code&gt;cProfile&lt;/code&gt;&lt;/a&gt;: A built-in module for deterministic profiling.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/joerick/pyinstrument" rel="noopener noreferrer"&gt;&lt;code&gt;pyinstrument&lt;/code&gt;&lt;/a&gt;: A call stack profiler for Python.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/benfred/py-spy" rel="noopener noreferrer"&gt;&lt;code&gt;py-spy&lt;/code&gt;&lt;/a&gt;: A sampling profiler for Python programs.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/sumerc/yappi" rel="noopener noreferrer"&gt;&lt;code&gt;yappi&lt;/code&gt;&lt;/a&gt;: Yet Another Python Profiler, supports multithreaded programs.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pyflame.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;&lt;code&gt;Pyflame&lt;/code&gt;&lt;/a&gt;: A ptracing profiler for Python.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/plasma-umass/scalene" rel="noopener noreferrer"&gt;&lt;code&gt;Scalene&lt;/code&gt;&lt;/a&gt;: A high-performance CPU and memory profiler.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While these tools are valuable, many require code instrumentation or introduce significant overhead, making them less suitable for continuous profiling in production environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is eBPF?
&lt;/h2&gt;

&lt;p&gt;Originally designed for network packet filtering, &lt;a href="https://ebpf.io/" rel="noopener noreferrer"&gt;eBPF&lt;/a&gt; has evolved into a versatile event-driven system. It enables safe execution of custom programs inside the Linux kernel, using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://en.wikipedia.org/wiki/Performance_monitoring_unit" rel="noopener noreferrer"&gt;Performance Monitoring Units (PMUs)&lt;/a&gt;:&lt;/strong&gt; Efficient hardware units that track CPU cycles and other metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://perf.wiki.kernel.org/index.php/Main_Page" rel="noopener noreferrer"&gt;Perf subsystem&lt;/a&gt;:&lt;/strong&gt; A Linux facility for hooking into kernel and user-space events, such as CPU activity, memory allocation, or I/O.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By leveraging eBPF with PMUs, profiling becomes faster and more efficient than traditional approaches.&lt;/p&gt;




&lt;h2&gt;
  
  
  Continuous Profiling with Parca
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://parca.dev" rel="noopener noreferrer"&gt;Parca&lt;/a&gt; is an open-source project enabling continuous profiling. Its eBPF agent hooks into &lt;a href="https://perf.wiki.kernel.org/index.php/Tutorial" rel="noopener noreferrer"&gt;perf events&lt;/a&gt;, collects stack traces, and aggregates data for visualization. The process involves:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Hooking into CPU events&lt;/strong&gt; to monitor active functions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stack unwinding&lt;/strong&gt; to trace function calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data aggregation and visualization&lt;/strong&gt; in a web-based UI.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Unlike traditional profilers, Parca introduces minimal runtime overhead, making it ideal for production workloads.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stack Unwinding: A Key Challenge
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Native Code
&lt;/h3&gt;

&lt;p&gt;Profiling native code is straightforward: we unwind the stack by reading memory addresses from the CPU and resolving them into human-readable symbols using debug information (e.g., &lt;a href="https://dwarfstd.org/" rel="noopener noreferrer"&gt;DWARF&lt;/a&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  Python Code
&lt;/h3&gt;

&lt;p&gt;For Python, stack unwinding is complex due to its interpreter-based execution. Python maintains execution state in custom data structures, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Interpreter state:&lt;/strong&gt; Tracks threads and their execution context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thread state:&lt;/strong&gt; A linked list of threads running in the interpreter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frame state:&lt;/strong&gt; Represents the current execution frame.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To unwind Python stacks, we must traverse these structures, extract relevant information, and map them to human-readable symbols.&lt;/p&gt;
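&lt;p&gt;To get a feel for what that traversal looks like, here is an in-process sketch using CPython's frame objects. A profiler like Parca performs the equivalent walk from outside the process via eBPF; the attributes used here (&lt;code&gt;f_back&lt;/code&gt;, &lt;code&gt;f_code&lt;/code&gt;) are the Python-level view of the frame state described above:&lt;/p&gt;

```python
import sys

def capture_stack():
    """Walk the chain of frame objects (the in-process view of the
    'frame state' described above) and return function names,
    innermost first. An eBPF profiler reads the same linked
    structures from outside the process."""
    frame = sys._getframe(1)  # skip capture_stack itself
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def inner():
    return capture_stack()

def outer():
    return inner()

stack = outer()
print(stack[:2])  # innermost frames first
```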




&lt;h2&gt;
  
  
  How Parca Profiles Python
&lt;/h2&gt;

&lt;p&gt;Here’s how Parca handles Python profiling:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reverse Engineering the Python Runtime:&lt;/strong&gt; Inspecting the interpreter's internal data structures (interpreter, thread, and frame state) to learn where execution information lives in each Python version.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unwinding Python Stacks:&lt;/strong&gt; Traversing those structures from eBPF to reconstruct the chain of active frames.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mapping Symbols:&lt;/strong&gt; Resolving the extracted frame information to human-readable function names and source locations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Efficient Data Handling:&lt;/strong&gt; Aggregating stack traces in kernel space so that only compact summaries reach user space, keeping overhead minimal.&lt;/p&gt;&lt;/li&gt;




&lt;h2&gt;
  
  
  Python 3.13: A Game-Changer for Profiling
&lt;/h2&gt;

&lt;p&gt;Python 3.13 introduces a debug offsets structure that simplifies stack unwinding. It provides precomputed offsets for key runtime fields, eliminating much of the manual reverse engineering required for earlier versions. This improvement marks a significant leap forward for tools like Parca.&lt;/p&gt;




&lt;h2&gt;
  
  
  Visualizing Profiles with Parca
&lt;/h2&gt;

&lt;p&gt;Parca’s UI provides a comprehensive view of application performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flame graphs:&lt;/strong&gt; Visualize stack traces over time, highlighting bottlenecks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filtering and Metadata:&lt;/strong&gt; Focus on specific languages (e.g., Python) or layers (e.g., C libraries).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continuous Insights:&lt;/strong&gt; Compare profiles across deployments to monitor performance regressions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, a flame graph might reveal inefficient recursion in a Python function, enabling developers to pinpoint and optimize the problematic code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Supported Python Versions
&lt;/h2&gt;

&lt;p&gt;Parca supports profiling for Python versions from 2.7 to 3.11, with ongoing work for 3.12 and full support anticipated for 3.13. The project’s modular design allows quick adaptation to new Python runtime changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Profiling Python applications with eBPF and Parca represents a new frontier in performance analysis. By leveraging eBPF and continuous profiling, we can gain invaluable insights into our applications, enabling effective performance optimization. I encourage you to explore Parca, provide feedback, and contribute to the project—it’s a collaborative effort that can benefit us all as we tackle the challenges of modern software development.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Started
&lt;/h3&gt;

&lt;p&gt;Watch my &lt;a href="https://youtu.be/nNbU26CoMWA?si=t3Mh1z6XfNwa5r7M" rel="noopener noreferrer"&gt;full talk&lt;/a&gt; or check out the &lt;a href="https://kakkoyun.me/notes/presentations/FOSDEM24+-+Profiling+Python+with+eBPF+-+A+New+Frontier+in+Performance+Analysis" rel="noopener noreferrer"&gt;presentation slides&lt;/a&gt;. Explore Parca on &lt;a href="https://github.com/parca-dev/parca" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and join the community. Your feedback helps improve the tooling and shape the future of observability.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ebpf</category>
      <category>profiling</category>
    </item>
    <item>
      <title>Fantastic Symbols and Where to Find Them - Part 2</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Thu, 27 Jan 2022 14:46:02 +0000</pubDate>
      <link>https://forem.com/kakkoyun/fantastic-symbols-and-where-to-find-them-part-2-1edk</link>
      <guid>https://forem.com/kakkoyun/fantastic-symbols-and-where-to-find-them-part-2-1edk</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on polarsignals.com/blog on 27.01.2022&lt;/p&gt;

&lt;p&gt;This is a blog post series. If you haven’t read &lt;a href="https://www.polarsignals.com/blog/posts/2022/01/13/fantastic-symbols-and-where-to-find-them" rel="noopener noreferrer"&gt;Part 1&lt;/a&gt; we recommend you to do so first!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In &lt;a href="https://www.polarsignals.com/blog/posts/2022/01/13/fantastic-symbols-and-where-to-find-them" rel="noopener noreferrer"&gt;the first blog post&lt;/a&gt;, we learned about the fantastic symbols (&lt;a href="https://en.wikipedia.org/wiki/Debug_symbol" rel="noopener noreferrer"&gt;debug symbols&lt;/a&gt;), how the symbolization process works and lastly, how to find the symbolic names of addresses in a compiled binary.&lt;/p&gt;

&lt;p&gt;The actual location of the symbolic information depends on the programming language implementation the program is written in.&lt;br&gt;
We can categorize the programming language implementations into three groups: compiled languages (with or without a runtime), interpreted languages, and &lt;a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="noopener noreferrer"&gt;JIT-compiled&lt;/a&gt; languages.&lt;/p&gt;

&lt;p&gt;In this post, we will continue our journey to find fantastic symbols. And we will look into where to find them for the other types of programming language implementations.&lt;/p&gt;
&lt;h2&gt;
  
  
  JIT-compiled language implementations
&lt;/h2&gt;

&lt;p&gt;Examples of JIT-compiled languages include Java, .NET, Erlang, JavaScript (Node.js) and many others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="noopener noreferrer"&gt;Just-In-Time&lt;/a&gt; compiled languages compile the source code into &lt;a href="https://en.wikipedia.org/wiki/Bytecode" rel="noopener noreferrer"&gt;bytecode&lt;/a&gt;, which is then compiled into &lt;a href="https://en.wikipedia.org/wiki/Machine_code" rel="noopener noreferrer"&gt;machine code&lt;/a&gt; at runtime,&lt;br&gt;
often using direct feedback from runtime to guide compiler optimizations on the fly.&lt;/p&gt;

&lt;p&gt;Because functions are compiled on the fly, there is no pre-built, discoverable symbol table in any object files. Instead, the symbol table is created on the fly.&lt;br&gt;
The symbol mappings (location to symbol) are usually stored in the &lt;em&gt;memory&lt;/em&gt; of the &lt;a href="https://en.wikipedia.org/wiki/Runtime_(program_lifecycle_phase)" rel="noopener noreferrer"&gt;runtime&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Virtual_machine" rel="noopener noreferrer"&gt;virtual machine&lt;/a&gt;&lt;br&gt;
and used to render human-readable stack traces when needed; for example, when an exception occurs, the runtime uses these mappings to produce a readable stack trace.&lt;/p&gt;

&lt;p&gt;The good thing is that most runtimes provide supplemental symbol mappings for the just-in-time compiled code, so that Linux &lt;code&gt;perf&lt;/code&gt; can use them.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;perf&lt;/code&gt; defines &lt;a href="https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/jit-interface.txt" rel="noopener noreferrer"&gt;an interface&lt;/a&gt; to resolve symbols for dynamically generated code by a JIT compiler.&lt;br&gt;
These files usually can be found in &lt;code&gt;/tmp/perf-$PID.map&lt;/code&gt;, where &lt;code&gt;$PID&lt;/code&gt; is the process ID of the process of the runtime that is running on the system.&lt;/p&gt;

&lt;p&gt;Runtimes usually don't emit these symbol mappings by default.&lt;br&gt;
You might need to change a configuration, run the virtual machine with a specific flag or environment variable, or run an additional program to obtain these mappings.&lt;br&gt;
For example, the JVM needs an agent, called &lt;a href="https://github.com/jvm-profiling-tools/perf-map-agent" rel="noopener noreferrer"&gt;perf-map-agent&lt;/a&gt;, to provide supplemental symbol mapping files.&lt;/p&gt;

&lt;p&gt;Let's see an example &lt;code&gt;perf map&lt;/code&gt; file for Node.js. The runtimes out there output this file with &lt;em&gt;more or less&lt;/em&gt; the same format, &lt;a href="https://github.com/parca-dev/parca-agent/issues/139" rel="noopener noreferrer"&gt;more or less!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To generate a similar file for &lt;a href="https://en.wikipedia.org/wiki/Node.js" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt;, we need to run &lt;code&gt;node&lt;/code&gt; with &lt;code&gt;--perf-basic-prof&lt;/code&gt; option.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# With Node.js &amp;gt;=v0.11.15 the following command will create a map file for NodeJS:&lt;/span&gt;
node &lt;span class="nt"&gt;--perf-basic-prof&lt;/span&gt; your-app.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a map file at &lt;code&gt;/tmp/perf-&amp;lt;pid&amp;gt;.map&lt;/code&gt; that looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;3ef414c0 398 RegExp:[{(]
3ef418a0 398 RegExp:[})]
59ed4102 26 LazyCompile:~REPLServer.self.writer repl.js:514
59ed44ea 146 LazyCompile:~inspect internal/util/inspect.js:152
59ed4e4a 148 LazyCompile:~formatValue internal/util/inspect.js:456
59ed558a 25f LazyCompile:~formatPrimitive internal/util/inspect.js:768
59ed5d62 35 LazyCompile:~formatNumber internal/util/inspect.js:761
59ed5fca 5d LazyCompile:~stylizeWithColor internal/util/inspect.js:267
4edd2e52 65 LazyCompile:~Domain.exit domain.js:284
4edd30ea 14b LazyCompile:~lastIndexOf native array.js:618
4edd3522 35 LazyCompile:~online internal/repl.js:157
4edd37f2 ec LazyCompile:~setTimeout timers.js:388
4edd3cca b0 LazyCompile:~Timeout internal/timers.js:55
4edd40ba 55 LazyCompile:~initAsyncResource internal/timers.js:45
4edd42da f LazyCompile:~exports.active timers.js:151
4edd457a cb LazyCompile:~insert timers.js:167
4edd4962 50 LazyCompile:~TimersList timers.js:195
4edd4cea 37 LazyCompile:~append internal/linkedlist.js:29
4edd4f12 35 LazyCompile:~remove internal/linkedlist.js:15
4edd5132 d LazyCompile:~isEmpty internal/linkedlist.js:44
4edd529a 21 LazyCompile:~ok assert.js:345
4edd555a 68 LazyCompile:~innerOk assert.js:317
4edd59a2 27 LazyCompile:~processTimers timers.js:220
4edd5d9a 197 LazyCompile:~listOnTimeout timers.js:226
4edd6352 15 LazyCompile:~peek internal/linkedlist.js:9
4edd66ca a1 LazyCompile:~tryOnTimeout timers.js:292
4edd6a02 86 LazyCompile:~ontimeout timers.js:429
4edd7132 d7 LazyCompile:~process.kill internal/process/per_thread.js:173
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Each line has &lt;code&gt;START&lt;/code&gt;, &lt;code&gt;SIZE&lt;/code&gt; and &lt;code&gt;symbolname&lt;/code&gt; fields, separated with spaces. &lt;code&gt;START&lt;/code&gt; and &lt;code&gt;SIZE&lt;/code&gt; are hex numbers without 0x.&lt;br&gt;
&lt;code&gt;symbolname&lt;/code&gt; is the rest of the line, so it could contain special characters.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With the help of this mapping file, we have everything we need to symbolize the addresses in the stack trace. Of course, as always, this is just an oversimplification.&lt;/p&gt;
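&lt;p&gt;A symbolizer's use of this file can be sketched in a few lines of Python. This is a simplified illustration (real tools must also handle map reloads and overlapping ranges), using two lines from the example map above:&lt;/p&gt;

```python
def parse_perf_map(text):
    """Parse perf map lines: START SIZE SYMBOLNAME, where START and
    SIZE are hex numbers without the 0x prefix and SYMBOLNAME is the
    rest of the line (it may contain spaces)."""
    entries = []
    for line in text.strip().splitlines():
        start, size, symbol = line.split(" ", 2)
        entries.append((int(start, 16), int(size, 16), symbol))
    return entries

def resolve(entries, address):
    """Return the symbol whose address range covers the given
    address, or None if no mapping matches."""
    for start, size, symbol in entries:
        if address in range(start, start + size):
            return symbol
    return None

# Two lines taken from the example map above.
sample = """59ed4102 26 LazyCompile:~REPLServer.self.writer repl.js:514
59ed44ea 146 LazyCompile:~inspect internal/util/inspect.js:152"""

entries = parse_perf_map(sample)
print(resolve(entries, 0x59ED4110))
```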

&lt;p&gt;For example, these mappings might change as the runtime decides to recompile the bytecode. So we need to keep an eye on these files and keep track of the changes to resolve the address correctly with their most recent mapping.&lt;/p&gt;

&lt;p&gt;Each runtime and virtual machine has its peculiarities that we need to adapt. But those are out of the scope of this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Interpreted language implementations
&lt;/h2&gt;

&lt;p&gt;Examples of interpreted languages include Python, Ruby, and again many others.&lt;br&gt;
There are also languages that commonly use interpretation as a stage before &lt;a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="noopener noreferrer"&gt;JIT compilation&lt;/a&gt;, e.g., Java.&lt;br&gt;
Symbolization for this stage of compilation is similar to interpreted languages.&lt;/p&gt;

&lt;p&gt;Interpreted language runtimes do not compile the program to machine code.&lt;br&gt;
Instead, interpreters and virtual machines parse and execute the source code using their own evaluation (&lt;a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop" rel="noopener noreferrer"&gt;REPL&lt;/a&gt;) routines, or execute bytecode on their own virtual processor.&lt;br&gt;
Either way, they have their own way of executing functions and managing stacks.&lt;/p&gt;

&lt;p&gt;If you observe (profile or debug) these runtimes using something like &lt;code&gt;perf&lt;/code&gt;,&lt;br&gt;
you will see symbols for the runtime. However, you won't see the language-level context you might be expecting.&lt;/p&gt;

&lt;p&gt;Moreover, the interpreter itself is probably written in a lower-level language like C or C++.&lt;br&gt;
When you inspect the object file of the runtime/interpreter, the symbol table you find shows the internals of the interpreter, not the symbols from your source code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding the symbols for our runtime
&lt;/h3&gt;

&lt;p&gt;The runtime symbols are useful because they allow you to see the internal routines of the interpreter, e.g., how much time your program spends on garbage collection.&lt;br&gt;
It is also very likely that the stack traces you see in a debugger or profiler will contain calls to the internals of the runtime.&lt;br&gt;
So these symbols are also helpful for debugging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiicety55keefh9bqvnh0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiicety55keefh9bqvnh0.png" alt="Node Stack Trace" width="794" height="1042"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Most of the runtimes are compiled with &lt;code&gt;production&lt;/code&gt; mode, and they most likely lack the debug symbols in their release binaries.&lt;br&gt;
You might need to manually compile your runtime in &lt;code&gt;debug mode&lt;/code&gt; to actually have them in the resulting binary.&lt;br&gt;
Some runtimes, such as Node.js, already have them in their &lt;code&gt;production&lt;/code&gt; distributions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Lastly, to completely resolve the stack traces of the runtime, we might need to obtain the debug information for the linked libraries.&lt;br&gt;
If you remember from &lt;a href="https://www.polarsignals.com/blog/posts/2022/01/13/fantastic-symbols-and-where-to-find-them" rel="noopener noreferrer"&gt;the first blog post&lt;/a&gt;, debuginfo files can help us.&lt;br&gt;
Debuginfo files for software packages are available through package managers in Linux distributions.&lt;br&gt;
Usually for an available package called &lt;code&gt;mypackage&lt;/code&gt; there exists a &lt;code&gt;mypackage-dbgsym&lt;/code&gt;, &lt;code&gt;mypackage-dbg&lt;/code&gt; or &lt;code&gt;mypackage-debuginfo&lt;/code&gt; package.&lt;br&gt;
There are also &lt;a href="https://sourceware.org/elfutils/Debuginfod.html" rel="noopener noreferrer"&gt;public servers&lt;/a&gt; that serve debug information.&lt;br&gt;
So we need to find the debuginfo files for the runtime we are using and all the linked libraries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding the symbols for our target program
&lt;/h3&gt;

&lt;p&gt;The symbols we look for in our own program are most likely stored in a memory table specific to the runtime.&lt;br&gt;
For example, in Python, the symbol mappings can be accessed using &lt;a href="https://docs.python.org/3/library/symtable.html" rel="noopener noreferrer"&gt;&lt;code&gt;symtable&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
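&lt;p&gt;For instance, the standard library's &lt;code&gt;symtable&lt;/code&gt; module lets you inspect the symbol table that CPython builds for a piece of source code:&lt;/p&gt;

```python
import symtable

# Build the symbol table for a small piece of source code, the same
# kind of mapping the interpreter keeps for the programs it runs.
source = "def add(a, b):\n    return a + b\n"
table = symtable.symtable(source, "example.py", "exec")

# The module-level table knows about the function we defined...
print([s.get_name() for s in table.get_symbols()])

# ...and nested tables describe its parameters and locals.
func = table.get_children()[0]
print(func.get_name(), [s.get_name() for s in func.get_symbols()])
```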

&lt;p&gt;As a result, you need to craft a specific routine for each interpreter runtime (in some cases, each version of that runtime) to obtain symbol information.&lt;br&gt;
Educated eyes might have already noticed that this is not an easy undertaking, considering the sheer number of interpreted languages out there.&lt;br&gt;
For example, a well-known Ruby profiler, &lt;a href="https://github.com/rbspy/rbspy/blob/master/ARCHITECTURE.md" rel="noopener noreferrer"&gt;rbspy&lt;/a&gt;, generates code for reading the internal structs of the Ruby runtime for each version.&lt;/p&gt;

&lt;p&gt;If you were to write a general-purpose profiler, &lt;em&gt;like us&lt;/em&gt;, you would need to write a special subroutine in your profiler for each runtime that you want to support.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;em&gt;Again&lt;/em&gt;, don't worry, we got you covered
&lt;/h2&gt;

&lt;p&gt;The good news is we've got you covered. If you are using the &lt;a href="https://github.com/parca-dev/parca-agent" rel="noopener noreferrer"&gt;Parca Agent&lt;/a&gt;, we already do &lt;a href="https://www.parca.dev/docs/symbolization" rel="noopener noreferrer"&gt;the heavy lifting&lt;/a&gt; for you to symbolize captured stack traces.&lt;br&gt;
And we keep extending our support for different languages and runtimes.&lt;br&gt;
For example, Parca already has support for parsing the &lt;code&gt;perf&lt;/code&gt; JIT interface to resolve symbols for collected stack traces.&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://www.parca.dev/" rel="noopener noreferrer"&gt;Parca&lt;/a&gt; and let us know what you think on our &lt;a href="https://discord.gg/ZgUpYgpzXy" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; channel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Further reading
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/jit-interface.txt" rel="noopener noreferrer"&gt;perf JIT Interface&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brendangregg.com/perf.html#JIT_Symbols" rel="noopener noreferrer"&gt;perf JIT Symbols&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://joyeecheung.github.io/blog/2018/12/31/tips-and-tricks-node-core/" rel="noopener noreferrer"&gt;Node.js profiling tips and tricks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html" rel="noopener noreferrer"&gt;Node.js Flamegraphs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>debugging</category>
      <category>profiling</category>
      <category>runtimes</category>
    </item>
    <item>
      <title>Fantastic Symbols and Where to Find Them - Part 1</title>
      <dc:creator>Kemal Akkoyun</dc:creator>
      <pubDate>Sat, 15 Jan 2022 08:18:33 +0000</pubDate>
      <link>https://forem.com/kakkoyun/fantastic-symbols-and-where-to-find-them-part-1-1epo</link>
      <guid>https://forem.com/kakkoyun/fantastic-symbols-and-where-to-find-them-part-1-1epo</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://www.polarsignals.com/blog/" rel="noopener noreferrer"&gt;polarsignals.com/blog&lt;/a&gt; on 13.01.2022&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Symbolization is a technique that allows you to translate machine memory addresses to human-readable symbol information (symbols).&lt;/p&gt;

&lt;p&gt;Why do we need to read what programs do anyway? We usually do not need to translate everything to a human-readable format when things run smoothly. But when things go south, we need to understand what is going on under the hood.&lt;br&gt;
Symbolization is needed by introspection tools like &lt;a href="https://en.wikipedia.org/wiki/Debugger" rel="noopener noreferrer"&gt;debuggers&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Profiling_(computer_programming)" rel="noopener noreferrer"&gt;profilers&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Core_dump" rel="noopener noreferrer"&gt;core dumps&lt;/a&gt; or any other program that needs to trace the execution of another program.&lt;br&gt;
While a target program is executing on a machine, these types of programs capture the stack traces of the program that is being executed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A &lt;a href="https://en.wikipedia.org/wiki/Stack_trace" rel="noopener noreferrer"&gt;stack trace&lt;/a&gt; (also called stack backtrace or stack traceback) is a report of the active stack frames at a certain point in time during the execution of a program.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.polarsignals.com%2Fblog%2Fposts%2F2022%2F01%2Fcall_stack_layout.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.polarsignals.com%2Fblog%2Fposts%2F2022%2F01%2Fcall_stack_layout.svg" alt="Call Stack Layout" width="684" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In raw stack traces, the addresses of the functions that are being called are recorded. The addresses are hexadecimal numbers representing the memory return addresses of the functions. Symbols are needed to translate memory addresses into function and variable names precisely as in the program’s source code to be read by us humans.&lt;br&gt;
Without symbols, all we see are hexadecimal numbers representing the memory addresses that we have captured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzs4pw98s97izt7nv2j2b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzs4pw98s97izt7nv2j2b.png" alt="Unsymbolized Stack" width="772" height="70"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It sounds simple enough, right? Well, it's not. As with everything else about computers, it's a bit of sorcery. It has its challenges, such as associating addresses with the correct symbols, transforming addresses, and, most importantly, actually finding the symbols!&lt;br&gt;
The strategies to get symbol information vary depending on the platform and the programming language implementation that the program is written in.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For the sake of simplicity, we will focus on Linux as the target platform and ignore Windows, macOS and many other platforms. Otherwise, I could end up writing a small book here :)&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Fantastic Symbols ...
&lt;/h2&gt;

&lt;p&gt;A symbol (or debug symbol, to be precise) is a special kind of &lt;a href="https://en.wikipedia.org/wiki/Symbol_(programming)" rel="noopener noreferrer"&gt;symbol&lt;/a&gt; that attaches additional information to the symbol table of a program.&lt;br&gt;
This symbol information allows a debugger or a profiler to gain access to information from the program's source code, such as the names of identifiers, including variables and functions.&lt;br&gt;
But where can we find these symbols?&lt;/p&gt;
&lt;h2&gt;
  
  
  ... and Where to Find Them
&lt;/h2&gt;

&lt;p&gt;The actual location of the symbolic information depends on the programming language implementation the program is written in.&lt;br&gt;
We can categorize the programming language implementations into three groups: compiled languages (with or without a runtime), interpreted languages, and &lt;a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="noopener noreferrer"&gt;JIT-compiled&lt;/a&gt; languages.&lt;/p&gt;

&lt;p&gt;If the program is a compiled one, these may be compiled together with the binary file, distributed in a separate file, or discarded during the compilation and/or linking.&lt;br&gt;
Or, if the program is interpreted, these may be stored in the program itself. Let's briefly look at where and how we can find these symbols depending on the programming language implementation.&lt;/p&gt;
&lt;h3&gt;
  
  
  Compiled language implementations
&lt;/h3&gt;

&lt;p&gt;Examples of compiled languages include C, C++, Go, Rust and many others.&lt;/p&gt;

&lt;p&gt;The compiled languages usually have a &lt;a href="https://en.wikipedia.org/wiki/Symbol_table" rel="noopener noreferrer"&gt;symbol table&lt;/a&gt; that contains all the symbols used in the program.&lt;br&gt;
The symbol table is usually compiled in the executable binary file. And the binary file is typically in the &lt;a href="https://en.wikipedia.org/wiki/Executable_and_Linkable_Format" rel="noopener noreferrer"&gt;ELF&lt;/a&gt; format (for Linux systems).&lt;br&gt;
Symbol tables are included in the ELF binary file, specifically for mapping the addresses to function names and object names.&lt;br&gt;
In rare cases, it is stored in a separate file, usually with the same name as the binary file, but with a different extension.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfiq9k0pawylyobtoyte.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfiq9k0pawylyobtoyte.png" alt="ELF" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ELF format is not an easy one to describe in a couple of sentences. For the purpose of this article, we will focus on what we need to know about the ELF format.&lt;br&gt;
Each ELF file is made up of one ELF header, followed by file data. The ELF header is a fixed size and contains information about the data sections.&lt;br&gt;
The relevant part for us is that symbols can live in two special sections, &lt;code&gt;.symtab&lt;/code&gt; and &lt;code&gt;.dynsym&lt;/code&gt;.&lt;br&gt;
&lt;code&gt;.dynsym&lt;/code&gt; is the “dynamic symbol table”, a smaller version of &lt;code&gt;.symtab&lt;/code&gt; that only contains global symbols.&lt;/p&gt;

&lt;p&gt;Contents of &lt;code&gt;.dynsym&lt;/code&gt; and &lt;code&gt;.symtab&lt;/code&gt; section using &lt;code&gt;readelf -s /bin/go&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Symbol table '.dynsym' contains 38 entries:
   Num: Value Size Type Bind Vis Ndx Name
     0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
     1: 00000000006355e0 99 FUNC GLOBAL DEFAULT 1 crosscall2
     2: 00000000006355a0 55 FUNC GLOBAL DEFAULT 1 _cgo_panic
     3: 0000000000465560 25 FUNC GLOBAL DEFAULT 1 _cgo_topofstack
     4: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (6)
     5: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
     6: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
     7: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
     8: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
     9: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
    10: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (4)
...
Symbol table '.symtab' contains 13199 entries:
   Num: Value Size Type Bind Vis Ndx Name
     0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
     1: 0000000000000000 0 FILE LOCAL DEFAULT ABS go.go
     2: 0000000000401000 0 FUNC LOCAL DEFAULT 1 runtime.text
     3: 0000000000401000 214 FUNC LOCAL DEFAULT 1 net(.text)
     4: 00000000004010e0 214 FUNC LOCAL DEFAULT 1 runtime/cgo(.text)
     5: 00000000004011c0 601 FUNC LOCAL DEFAULT 1 runtime/cgo(.text)
     6: 0000000000401420 480 FUNC LOCAL DEFAULT 1 runtime/cgo(.text)
     7: 0000000000401420 47 FUNC LOCAL HIDDEN 1 threadentry
     8: 0000000000401600 70 FUNC LOCAL DEFAULT 1 runtime/cgo(.text)
     9: 0000000000401646 5 FUNC LOCAL DEFAULT 1 runtime/cgo(.tex[...]
    10: 0000000000401646 5 FUNC LOCAL HIDDEN 1 x_cgo_munmap.cold
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Go has a unique table (of course). It stores its symbols in a section called &lt;a href="https://pkg.go.dev/debug/gosym#LineTable" rel="noopener noreferrer"&gt;&lt;code&gt;.gopclntab&lt;/code&gt;&lt;/a&gt;. This is a table of functions, line numbers and addresses.&lt;br&gt;
Go does this because it needs to be able to render human-readable stack traces when a panic occurs at runtime.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Note that addresses in the symbol table do not move during execution, so the table can be read at any time while the program runs.&lt;br&gt;
It can easily be loaded into memory independently of the running program, and an observer can easily read it.&lt;/p&gt;

&lt;p&gt;We assumed that the binary file is a statically linked executable until this point. However, this might not be the case. The binary file might be dynamically linked to other libraries.&lt;br&gt;
From now on, we will refer to these shared library files and executables (both in ELF format) as &lt;a href="https://en.wikipedia.org/wiki/Object_file" rel="noopener noreferrer"&gt;object files&lt;/a&gt;. Each object file can have its own symbol table.&lt;/p&gt;

&lt;p&gt;We need to note that when we take a snapshot of the stack (a.k.a stack trace), it could include addresses from linked shared libraries and Kernel functions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Kernel-level software differs as it has its own dynamic symbol table in &lt;code&gt;/proc/kallsyms&lt;/code&gt;, which is a file that contains all the symbols that are used in the kernel. And it can grow as the kernel modules are loaded.&lt;/p&gt;
&lt;/blockquote&gt;
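&lt;p&gt;The &lt;code&gt;/proc/kallsyms&lt;/code&gt; format is simple enough to parse by hand. Here is a small Python sketch over hypothetical sample lines; on a real Linux system you would read the file itself (addresses may be shown as zero without sufficient privileges):&lt;/p&gt;

```python
def parse_kallsyms(text):
    """Parse /proc/kallsyms-style lines: ADDRESS TYPE NAME [MODULE].
    Symbols from loadable modules carry a trailing [modulename]."""
    symbols = []
    for line in text.strip().splitlines():
        parts = line.split()
        address, kind, name = parts[0], parts[1], parts[2]
        module = parts[3].strip("[]") if len(parts) == 4 else None
        symbols.append((int(address, 16), kind, name, module))
    return symbols

# Hypothetical sample lines in the /proc/kallsyms format.
sample = """ffffffffb8400000 T startup_64
ffffffffb9000000 T _text
ffffffffc0a00000 t helper_fn [mymodule]"""

for addr, kind, name, module in parse_kallsyms(sample):
    print(hex(addr), kind, name, module)
```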

&lt;p&gt;We can read the object files by using binary utilities such as &lt;a href="https://en.wikipedia.org/wiki/Objdump" rel="noopener noreferrer"&gt;objdump&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Readelf" rel="noopener noreferrer"&gt;readelf&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Nm_(Unix)" rel="noopener noreferrer"&gt;nm&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To read the &lt;code&gt;.symtab&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nm &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
objdump &lt;span class="nt"&gt;--syms&lt;/span&gt; &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
readelf &lt;span class="nt"&gt;--syms&lt;/span&gt; &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To read the &lt;code&gt;.dynsym&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nm &lt;span class="nt"&gt;-D&lt;/span&gt; &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
objdump &lt;span class="nt"&gt;--dynamic-syms&lt;/span&gt; &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
readelf &lt;span class="nt"&gt;--dyn-syms&lt;/span&gt; &lt;span class="nv"&gt;$FILE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the compiled languages, the symbol table is not the only source of symbols. There are also DWARFs!&lt;/p&gt;

&lt;h4&gt;
  
  
  Debuginfo
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;ELFs and DWARFs, welcome to fairyland.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Another way to obtain the symbols from an object file is to use the debug information, or &lt;code&gt;debuginfo&lt;/code&gt; for short.&lt;br&gt;
As with the symbol table, this information can be compiled into the binary file, formatted as &lt;a href="https://en.wikipedia.org/wiki/DWARF" rel="noopener noreferrer"&gt;DWARF (Debugging With Attributed Record Formats)&lt;/a&gt;, or shipped in a separate file.&lt;/p&gt;

&lt;p&gt;DWARF is the debug information format most commonly used with ELF. It’s not necessarily tied to ELF, but the two were developed in tandem and work very well together.&lt;br&gt;
This information is split across different ELF sections (&lt;code&gt;.debug_*&lt;/code&gt; and &lt;code&gt;.zdebug_*&lt;/code&gt; for compressed ones), each with its own piece of information to relay.&lt;br&gt;
For our specific needs, we use the &lt;code&gt;.debug_info&lt;/code&gt; section to find the corresponding functions and the &lt;code&gt;.debug_line&lt;/code&gt; section to find the corresponding line numbers.&lt;/p&gt;

&lt;p&gt;Debuginfo files for software packages are available through package managers in Linux distributions.&lt;br&gt;
Usually, for a package called &lt;code&gt;mypackage&lt;/code&gt; there exists a &lt;code&gt;mypackage-dbgsym&lt;/code&gt;, &lt;code&gt;mypackage-dbg&lt;/code&gt; or &lt;code&gt;mypackage-debuginfo&lt;/code&gt; package.&lt;br&gt;
There are also &lt;a href="https://sourceware.org/elfutils/Debuginfod.html" rel="noopener noreferrer"&gt;public servers&lt;/a&gt; that serve debug information.&lt;/p&gt;
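
&lt;p&gt;These debuginfod servers speak a simple HTTP API: debug information is keyed by the GNU build ID embedded in the binary (readable with &lt;code&gt;readelf -n&lt;/code&gt;). A hedged Python sketch, assuming the public elfutils server; the build ID you pass in would come from a real binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# debuginfod serves debug information over HTTP, keyed by the GNU build ID
# stored in an ELF binary (readable with `readelf -n`).
DEBUGINFOD_URL = "https://debuginfod.elfutils.org"  # public elfutils server

def debuginfo_url(build_id, server=DEBUGINFOD_URL):
    # The API exposes debug info at /buildid/ID/debuginfo,
    # where ID is the lowercase hex build ID.
    return server + "/buildid/" + build_id + "/debuginfo"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Fetching that URL (with &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;urllib&lt;/code&gt;, or the &lt;code&gt;debuginfod-find&lt;/code&gt; client) downloads the separate debuginfo file, which the tools below can then consume.&lt;/p&gt;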
&lt;h4&gt;
  
  
  One Program to bring them all, and in the darkness bind them: &lt;code&gt;addr2line&lt;/code&gt;
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Wait, what?! Isn't that from another fantasy book?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now that we have the symbol table or debug information, we can use &lt;code&gt;addr2line&lt;/code&gt; (&lt;em&gt;address to line&lt;/em&gt;) to get the source code location of a given address.&lt;br&gt;
&lt;a href="https://linux.die.net/man/1/addr2line" rel="noopener noreferrer"&gt;&lt;code&gt;addr2line&lt;/code&gt;&lt;/a&gt; converts addresses back to function and line numbers.&lt;/p&gt;

&lt;p&gt;Let's see it in action &lt;code&gt;addr2line -a 0x0000000000001154 -e &amp;lt;objectFile&amp;gt;&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For addr2line, &lt;code&gt;&amp;lt;objectFile&amp;gt;&lt;/code&gt; can be any object file compiled with debug information or symbols: an executable, a shared library, or the separate debug file produced by a &lt;code&gt;strip --only-keep-debug&lt;/code&gt; operation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Voilà!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0x0000000000001154
main
/home/newt/Sandbox/hello-c/hello.c:14
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used a simple C executable for this example, and we have our symbol and the attached source information for the corresponding address 🎉&lt;/p&gt;
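
&lt;p&gt;Under the hood, the address-to-symbol half of this lookup is conceptually a search over symbols sorted by start address. A minimal Python sketch, with made-up symbol names and addresses rather than ones from a real binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import bisect

def resolve(symbols, addr):
    """symbols: a list of (start_address, name) pairs sorted by start address.
    Returns the name of the last symbol starting at or before addr."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i == -1:
        return None  # addr lies before the first known symbol
    return symbols[i][1]

symbols = [(0x1000, "_start"), (0x1130, "main"), (0x1200, "helper")]
resolve(symbols, 0x1154)  # inside main, which starts at 0x1130
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A real symbolizer also checks each symbol's size, so addresses that fall in the gaps between functions are not misattributed to the preceding one.&lt;/p&gt;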

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxlwwnxa3qyp4z69o2uxo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxlwwnxa3qyp4z69o2uxo.jpg" alt="Success" width="500" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If only every programming language implementation out there were a compiled one, our job here would be finished. But they are not, so we need to keep digging.&lt;br&gt;
For that, though, you will have to wait another week. As we hinted at in the title of this post, there will be a part 2! All the best franchises are sequels, right?!&lt;br&gt;
In part 2, we will see how interpreted languages and &lt;a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="noopener noreferrer"&gt;Just-In-Time&lt;/a&gt; compiled languages handle symbols.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Please stay tuned!&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Don't worry, we've got you covered
&lt;/h2&gt;

&lt;p&gt;Even though we have simplified things a bit here, writing a program that performs symbolization still involves a lot of work.&lt;br&gt;
Many open-source tools, such as &lt;a href="https://www.brendangregg.com/perf.html" rel="noopener noreferrer"&gt;&lt;code&gt;perf&lt;/code&gt;&lt;/a&gt;, already handle the nitty-gritty details of symbolization.&lt;/p&gt;

&lt;p&gt;The good news is that we have got you covered. If you are using &lt;a href="https://github.com/parca-dev/parca-agent" rel="noopener noreferrer"&gt;Parca Agent&lt;/a&gt;, we already do &lt;a href="https://www.parca.dev/docs/symbolization" rel="noopener noreferrer"&gt;the heavy lifting&lt;/a&gt; of symbolizing captured stack traces for you.&lt;br&gt;
And we keep extending our support for different languages and runtimes.&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://www.parca.dev/" rel="noopener noreferrer"&gt;Parca&lt;/a&gt; and let us know what you think on our &lt;a href="https://discord.gg/ZgUpYgpzXy" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; channel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Further reading
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Debug_symbol" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Debug_symbol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brendangregg.com/bpf-performance-tools-book.html" rel="noopener noreferrer"&gt;https://www.brendangregg.com/bpf-performance-tools-book.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/DataDog/go-profiler-notes/blob/main/stack-traces.md" rel="noopener noreferrer"&gt;https://github.com/DataDog/go-profiler-notes/blob/main/stack-traces.md&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brendangregg.com/perf.html" rel="noopener noreferrer"&gt;https://www.brendangregg.com/perf.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://jvns.ca/blog/2018/01/09/resolving-symbol-addresses/" rel="noopener noreferrer"&gt;https://jvns.ca/blog/2018/01/09/resolving-symbol-addresses/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/File:Call_stack_layout.svg" rel="noopener noreferrer"&gt;Call Stack Layout&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/corkami/pics/blob/28cb0226093ed57b348723bc473cea0162dad366/binary/elf101/elf101-64.svg" rel="noopener noreferrer"&gt;ELF Executable and Linkable Format diagram by Ange Albertini&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>profiling</category>
      <category>debugging</category>
      <category>observability</category>
    </item>
  </channel>
</rss>
