<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Prathamesh Deshmukh</title>
    <description>The latest articles on Forem by Prathamesh Deshmukh (@prathamudeshmukh).</description>
    <link>https://forem.com/prathamudeshmukh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F102643%2F411894c7-d573-4b0a-a5cf-9024e0ae8322.jpeg</url>
      <title>Forem: Prathamesh Deshmukh</title>
      <link>https://forem.com/prathamudeshmukh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/prathamudeshmukh"/>
    <language>en</language>
    <item>
      <title>How I Load Test a PDF Generation API with k6, Docker, and GitHub Actions</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Sat, 28 Mar 2026 05:13:48 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/how-i-load-test-a-pdf-generation-api-with-k6-docker-and-github-actions-4em4</link>
      <guid>https://forem.com/prathamudeshmukh/how-i-load-test-a-pdf-generation-api-with-k6-docker-and-github-actions-4em4</guid>
      <description>&lt;h2&gt;
  
  
  The Problem with "It Works on My Machine"
&lt;/h2&gt;

&lt;p&gt;PDF generation is one of those deceptively expensive operations. You fire off a request, Puppeteer spins up a headless Chromium, renders a full HTML page, and exports it to bytes. Works great in dev. Works great in staging with one user. Then someone puts it in production and a dozen concurrent requests land at once — and you discover your server is quietly crying.&lt;/p&gt;

&lt;p&gt;That was the situation with &lt;a href="https://templify.cloud" rel="noopener noreferrer"&gt;Templify&lt;/a&gt;, a PDF generation platform I built. The core API — &lt;code&gt;POST /convert/{templateId}&lt;/code&gt; — compiles a Handlebars template and delegates to a job-runner service (Express + Puppeteer) to render the PDF. Each request is CPU-bound and takes 1–4 seconds depending on template complexity.&lt;/p&gt;
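&lt;p&gt;As a mental model, the convert flow can be sketched in a few lines. This is an illustrative stub, not Templify's actual code: the Handlebars compile is mimicked with naive &lt;code&gt;{{key}}&lt;/code&gt; substitution, and the Puppeteer render is replaced by a stand-in function.&lt;/p&gt;

```javascript
// Illustrative sketch of the convert flow, not Templify's actual code:
// Handlebars is mimicked with simple {{key}} substitution and the
// Puppeteer render step is stubbed out.
function compileTemplate(source) {
  return (data) =>
    source.replace(/\{\{(\w+)\}\}/g, (_, key) => String(data[key] ?? ''));
}

// Stand-in for the job-runner call that launches headless Chromium.
function renderPdfStub(html) {
  return '%PDF-stub (' + html.length + ' chars of HTML)';
}

function convert(templateSource, templateData) {
  const html = compileTemplate(templateSource)(templateData); // step 1: compile
  return renderPdfStub(html);                                 // step 2: render
}

const pdf = convert('Invoice for {{customer}}, total {{total}}', {
  customer: 'Acme Corp',
  total: '$1,200',
});
console.log(pdf); // "%PDF-stub (35 chars of HTML)"
```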

&lt;p&gt;Before confidently telling users the API handles concurrent load, I needed proof. Enter &lt;strong&gt;k6&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why k6
&lt;/h2&gt;

&lt;p&gt;I've used Locust, JMeter, and Artillery. k6 wins on developer ergonomics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Test scripts are JavaScript&lt;/strong&gt; — no YAML configs, no XML, no DSL to learn&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in thresholds&lt;/strong&gt; — define pass/fail criteria in the script itself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker-first&lt;/strong&gt; — the official &lt;code&gt;grafana/k6&lt;/code&gt; image just works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI output is readable&lt;/strong&gt; — colored, structured, and tells you exactly what you need&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only gotcha: k6 uses a custom JS runtime (Goja, not Node.js), so you can't &lt;code&gt;import&lt;/code&gt; arbitrary npm packages; only k6's built-in modules and scripts loaded by URL are available. For plain API testing, that limitation rarely matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Infrastructure (Brief Context)
&lt;/h2&gt;

&lt;p&gt;The job-runner service runs in Docker on a &lt;strong&gt;Hetzner CX11&lt;/strong&gt; — 2 vCPU, 4GB RAM, ~$4.15/month. It hosts both production (port 3000) and staging (port 3001) environments as separate containers on the same box.&lt;/p&gt;

&lt;p&gt;The load test runs &lt;em&gt;on the Hetzner server itself&lt;/em&gt;, not from the GitHub Actions runner. This is intentional: it removes the network-latency variability of CI runners, so the results reflect what the server can actually sustain rather than the quality of the network path between GitHub and Hetzner.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Write the k6 Test Script
&lt;/h2&gt;

&lt;p&gt;The test script lives at &lt;code&gt;job-runner/k6/load-test.js&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Shape
&lt;/h3&gt;

&lt;p&gt;k6 calls the test configuration &lt;code&gt;options&lt;/code&gt;. The shape I chose is a classic &lt;strong&gt;ramp-up → steady state → ramp-down&lt;/strong&gt; pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// ramp up to 5 VUs&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1m&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// hold for 1 minute&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// ramp down&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;http_req_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(95)&amp;lt;3500&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;// 95th percentile under 3.5s&lt;/span&gt;
    &lt;span class="na"&gt;http_req_failed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rate&amp;lt;0.1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;// error rate under 10%&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why 5 virtual users?&lt;/strong&gt; This isn't an arbitrary number. Each VU issues one request, waits 1 second (&lt;code&gt;sleep(1)&lt;/code&gt;), then issues another. At 5 concurrent VUs with ~2–3s response times, the server is handling roughly 3 simultaneous Puppeteer renders at any given moment — right at the limit of what a 2-vCPU box can sustain without queuing.&lt;/p&gt;
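&lt;p&gt;That occupancy estimate is just Little's law applied to a closed loop: each VU is in-flight for &lt;code&gt;responseTime / (responseTime + sleep)&lt;/code&gt; of its iteration. A quick back-of-envelope check with the numbers above:&lt;/p&gt;

```javascript
// Closed-loop concurrency estimate: each VU loops
// "request (responseTime) + sleep(sleepTime)", so the expected number
// of simultaneous in-flight renders is vus * responseTime / iteration.
function inflight(vus, responseTime, sleepTime) {
  return (vus * responseTime) / (responseTime + sleepTime);
}

console.log(inflight(5, 2, 1).toFixed(1)); // "3.3" at 2s responses
console.log(inflight(5, 3, 1).toFixed(1)); // "3.8" at 3s responses
```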

&lt;p&gt;&lt;strong&gt;Why p95 instead of average?&lt;/strong&gt; Averages lie. A test where 94% of requests complete in 200ms and 6% time out at 30 seconds still averages under 2 seconds, which looks "fine" on paper. p95 tells you what the worst realistic experience looks like.&lt;/p&gt;
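&lt;p&gt;The arithmetic behind that example, spelled out (the distribution is the hypothetical one above, not measured data):&lt;/p&gt;

```javascript
// 94 requests at 200ms plus 6 timeouts at 30s: the mean looks fine,
// the 95th percentile does not.
const latencies = [
  ...Array(94).fill(200),   // fast requests (ms)
  ...Array(6).fill(30000),  // timeouts (ms)
];

const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;

// Nearest-rank percentile: value at the ceil(p% * n)-th position.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil((p / 100) * sorted.length) - 1];
}

console.log(mean);                      // 1988 ms, under 2s
console.log(percentile(latencies, 95)); // 30000 ms, i.e. the timeouts
```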

&lt;h3&gt;
  
  
  The Payload
&lt;/h3&gt;

&lt;p&gt;The test uses a realistic Handlebars template payload — a multi-section marketing brochure for a fictional company called TechInnovate. This matters: testing with &lt;code&gt;{"name": "test"}&lt;/code&gt; would render in 300ms; testing with the actual production payload shape catches real performance characteristics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PAYLOAD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;templateData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;hero&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transforming Business Through Technology&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;subtitle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Innovative solutions that drive growth, efficiency, and competitive advantage&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;ctaButton&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#products&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Discover Our Solutions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;about&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;About TechInnovate&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Founded in 2015, TechInnovate has been at the forefront...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;bulletPoints&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;10+ years of industry experience&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;200+ successful projects delivered&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;98% client satisfaction rate&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Global team of certified experts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;products&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="cm"&gt;/* 3 products with prices, descriptions */&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="cm"&gt;/* 4 feature cards */&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;testimonial&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;We have seen a 40% increase in efficiency...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jennifer Martinez&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;company&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CEO, Global Enterprises&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// ... contact, footer, social links&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Request Function
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TEMPLATE_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;c07deb00-bb22-4e5f-b48e-1b1c17f7c969&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;__ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;__ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.templify.cloud&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;client_secret&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;client_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pdfResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/convert/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;TEMPLATE_ID&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;PAYLOAD&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pdfResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PDF generation status is 200&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PDF generation response time &amp;lt; 5s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Credentials come from environment variables via k6's &lt;code&gt;__ENV&lt;/code&gt; — never hardcoded. The &lt;code&gt;check()&lt;/code&gt; function records pass/fail metrics per assertion without stopping the test.&lt;/p&gt;
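&lt;p&gt;Conceptually, &lt;code&gt;check()&lt;/code&gt; behaves something like the simplified model below. This is a sketch of the semantics, not k6's implementation:&lt;/p&gt;

```javascript
// Simplified model of how k6's check() behaves (not k6's source):
// every predicate is evaluated and tallied, and a failing check
// never throws, so the VU loop keeps going.
const tally = {};

function check(value, checks) {
  let allPassed = true;
  for (const [name, predicate] of Object.entries(checks)) {
    const passed = Boolean(predicate(value));
    tally[name] = tally[name] || { passes: 0, fails: 0 };
    passed ? tally[name].passes++ : tally[name].fails++;
    if (!passed) allPassed = false;
  }
  return allPassed; // informational; the iteration continues either way
}

// A response that passes the status check but fails the latency check:
const fakeResponse = { status: 200, timings: { duration: 6200 } };
const ok = check(fakeResponse, {
  'status is 200': (r) => r.status === 200,
  'duration under 5s': (r) => 5000 > r.timings.duration,
});
console.log(ok, tally['duration under 5s']); // false { passes: 0, fails: 1 }
```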




&lt;h2&gt;
  
  
  Step 2: Containerize k6
&lt;/h2&gt;

&lt;p&gt;The Dockerfile is deliberately minimal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; grafana/k6:latest&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; *.js .&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["run", "load-test.js"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The official &lt;code&gt;grafana/k6&lt;/code&gt; image ships k6 at a known version in a minimal Alpine-based image. No node_modules, no build step, no complexity. The &lt;code&gt;*.js&lt;/code&gt; glob future-proofs it — add more test files and they're automatically available.&lt;/p&gt;
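&lt;p&gt;One caveat with &lt;code&gt;latest&lt;/code&gt;: a silent k6 upgrade can shift metrics or output format between runs, which muddies month-over-month comparisons. If reproducibility matters more than freshness, pin the base image (the version shown is illustrative):&lt;/p&gt;

```dockerfile
# Pin k6 so load-test results stay comparable across runs.
# (0.50.0 is an illustrative version, not a recommendation.)
FROM grafana/k6:0.50.0

COPY *.js .

CMD ["run", "load-test.js"]
```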




&lt;h2&gt;
  
  
  Step 3: The Deploy Script — Running Tests Remotely
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;run-load-test.sh&lt;/code&gt; script orchestrates the full flow: sync the k6 files to the server, build the Docker image there, run it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nv"&gt;TARGET_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HETZNER_HOST&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running load test for PRODUCTION environment..."&lt;/span&gt;

&lt;span class="c"&gt;# Clean previous run&lt;/span&gt;
ssh &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt; &lt;span class="s2"&gt;"rm -rf ~/load-test/k6"&lt;/span&gt;

&lt;span class="c"&gt;# Sync k6 folder to server&lt;/span&gt;
scp &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-r&lt;/span&gt; k6 &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt;:~/load-test/

&lt;span class="c"&gt;# Build and run on the server&lt;/span&gt;
ssh &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
    cd ~/load-test/k6
    docker stop k6-load-test 2&amp;gt;/dev/null || true
    docker rm k6-load-test 2&amp;gt;/dev/null || true
    docker build --no-cache -f Dockerfile.k6 -t k6-load-test .
    docker run --rm &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      --name k6-load-test &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e CLIENT_ID=&lt;/span&gt;&lt;span class="nv"&gt;$CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e CLIENT_SECRET=&lt;/span&gt;&lt;span class="nv"&gt;$CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e K6_WEB_DASHBOARD=true &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e K6_WEB_DASHBOARD_PORT=-1 &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      k6-load-test
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Load test completed."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why build on the server instead of pulling from a registry?
&lt;/h3&gt;

&lt;p&gt;Because the test script changes frequently during development. Pushing to a registry on every iteration adds friction. &lt;code&gt;scp&lt;/code&gt; + &lt;code&gt;docker build --no-cache&lt;/code&gt; is fast (&amp;lt; 30 seconds) and guarantees you're running exactly the code you just edited — no cache surprises.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;K6_WEB_DASHBOARD=true&lt;/code&gt; flag
&lt;/h3&gt;

&lt;p&gt;k6 ships with a built-in real-time web dashboard. Setting &lt;code&gt;K6_WEB_DASHBOARD_PORT=-1&lt;/code&gt; disables the HTTP server (since we're in a non-interactive SSH session) but still enables the dashboard's internal metrics aggregation and summary report output at the end. If you're running interactively, set a port (the dashboard defaults to &lt;code&gt;5665&lt;/code&gt;) and open it in your browser during the test run.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: GitHub Actions Workflow — Manual Trigger
&lt;/h2&gt;

&lt;p&gt;Load tests are &lt;strong&gt;not&lt;/strong&gt; run on every push. Running a 2-minute load test on every PR would be slow, expensive (in credits), and noisy. Instead, it's a &lt;code&gt;workflow_dispatch&lt;/code&gt; — triggered manually, on demand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Load Test&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Environment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;test'&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;production'&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;choice&lt;/span&gt;
        &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;staging&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;load-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;

    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout code&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup SSH&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;mkdir -p ~/.ssh&lt;/span&gt;
          &lt;span class="s"&gt;echo "${{ secrets.HETZNER_SSH_KEY }}" &amp;gt; ~/.ssh/id_rsa&lt;/span&gt;
          &lt;span class="s"&gt;chmod 600 ~/.ssh/id_rsa&lt;/span&gt;
          &lt;span class="s"&gt;ssh-keyscan -H ${{ secrets.HETZNER_HOST }} &amp;gt;&amp;gt; ~/.ssh/known_hosts&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run load test&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.CLIENT_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.CLIENT_SECRET }}&lt;/span&gt;
          &lt;span class="na"&gt;HETZNER_HOST&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.HETZNER_HOST }}&lt;/span&gt;
          &lt;span class="na"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.HETZNER_USER }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;chmod +x scripts/run-load-test.sh&lt;/span&gt;
          &lt;span class="s"&gt;./scripts/run-load-test.sh ${{ github.event.inputs.environment }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Required GitHub secrets:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Secret&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_SSH_KEY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Contents of the ED25519 private key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_HOST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Server IP or hostname&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_USER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SSH user (e.g. &lt;code&gt;root&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLIENT_ID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Templify API client ID&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLIENT_SECRET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Templify API client secret&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The &lt;code&gt;ssh-keyscan&lt;/code&gt; step adds the server's host key to &lt;code&gt;known_hosts&lt;/code&gt;, preventing the interactive "are you sure?" prompt that would hang the CI runner.&lt;/p&gt;


&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;When k6 finishes, it prints a summary to stdout — which GitHub Actions captures and displays in the workflow logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test.js
     output: -

  scenarios: (100.00%) 1 scenario, 5 max VUs, 2m30s max duration (incl. graceful stop):
           * default: Up to 5 looping VUs for 2m0s over 3 stages (gracefulRampDown: 30s, ...)

✓ PDF generation status is 200
✓ PDF generation response time &amp;lt; 5s

     checks.........................: 100.00% ✓ 142  ✗ 0
     data_received..................: 14 MB   117 kB/s
     data_sent......................: 87 kB   727 B/s
     http_req_blocked...............: avg=18.4ms   min=2µs    med=5µs    max=1.04s    p(90)=10µs   p(95)=14µs
     http_req_duration..............: avg=1.91s    min=892ms  med=1.79s  max=4.12s    p(90)=2.94s  p(95)=3.21s
   ✓ { expected_response:true }....: avg=1.91s    min=892ms  med=1.79s  max=4.12s    p(90)=2.94s  p(95)=3.21s
     http_req_failed................: 0.00%   ✓ 0    ✗ 142
     http_req_receiving.............: avg=143.2ms  min=4.98ms med=79.5ms max=731ms    p(90)=381ms  p(95)=477ms
     http_req_sending...............: avg=258µs    min=97µs   med=213µs  max=1.45ms   p(90)=435µs  p(95)=509µs
     http_req_tls_handshaking.......: avg=18.3ms   min=0s     med=0s     max=1.04s    p(90)=0s     p(95)=0s
     http_req_waiting...............: avg=1.77s    min=858ms  med=1.66s  max=3.86s    p(90)=2.72s  p(95)=3.02s
     http_reqs......................: 142     1.183333/s
     iteration_duration.............: avg=2.91s    min=1.9s   med=2.79s  max=5.15s    p(90)=3.94s  p(95)=4.21s
     iterations.....................: 142     1.183333/s
     vus............................: 1       min=1  max=5
     vus_max........................: 5       min=5  max=5


running (2m00.0s), 0/5 VUs, 142 complete and 0 interrupted iterations
default ✓ [==============================] 0/5 VUs  2m0s

✓ http_req_duration............: p(95)=3.21s &amp;lt; 3.5s  ✓ PASS
✓ http_req_failed..............: rate=0.00% &amp;lt; 10%     ✓ PASS
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thresholds section at the bottom is the pass/fail verdict. k6 exits with a non-zero code if any threshold is breached — which means GitHub Actions marks the workflow run as failed. No manual inspection required.&lt;/p&gt;
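&lt;p&gt;The script itself isn't shown in this section, but in k6 that verdict comes from a &lt;code&gt;thresholds&lt;/code&gt; block in the exported options. A minimal sketch, with budgets matching the summary above (not the exact script):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;export const options = {
  thresholds: {
    // Performance budget: 95th percentile under 3.5s
    http_req_duration: ['p(95)&amp;lt;3500'],
    // Error budget: fewer than 10% failed requests
    http_req_failed: ['rate&amp;lt;0.10'],
  },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;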




&lt;h2&gt;
  
  
  The Full File Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;job-runner/
├── k6/
│   ├── load-test.js       # k6 test script
│   └── Dockerfile.k6      # Minimal k6 Docker image
├── scripts/
│   └── run-load-test.sh   # SSH + SCP + docker run orchestration
└── .github/
    └── workflows/
        └── load-test.yml  # Manual GitHub Actions workflow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four files. That's the entire setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Design Decisions, Explained
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Run k6 on the server, not from CI
&lt;/h3&gt;

&lt;p&gt;Running from GitHub's Ubuntu runners introduces network hops: CI runner → Cloudflare/CDN → Vercel (API gateway) → Hetzner. That's fine for integration testing but adds noise to performance benchmarking. Running on Hetzner itself tests the raw capacity of the PDF service without network jitter.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;workflow_dispatch&lt;/code&gt;, not &lt;code&gt;push&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Load tests are deliberately not automated on push. They consume API credits (each PDF generation deducts 1 credit), generate real load, and take 2+ minutes. The right time to run them is before a deploy to production or when investigating a performance regression — not on every feature branch commit.&lt;/p&gt;
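&lt;p&gt;For reference, the manual trigger is a single key in the workflow file. This sketch assumes a standard GitHub Actions layout rather than quoting the actual &lt;code&gt;load-test.yml&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;on:
  workflow_dispatch:   # runs only when triggered manually from the Actions tab

jobs:
  load-test:
    runs-on: ubuntu-latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;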

&lt;h3&gt;
  
  
  Credentials via &lt;code&gt;__ENV&lt;/code&gt;, never hardcoded
&lt;/h3&gt;

&lt;p&gt;k6's &lt;code&gt;__ENV&lt;/code&gt; object reads environment variables passed at runtime. This means the same script works in local dev (&lt;code&gt;k6 run --env CLIENT_ID=xxx load-test.js&lt;/code&gt;), in Docker (&lt;code&gt;-e CLIENT_ID=xxx&lt;/code&gt;), and in CI — without any code changes.&lt;/p&gt;
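&lt;p&gt;Inside the script, that looks something like this. A sketch only: &lt;code&gt;BASE_URL&lt;/code&gt; and &lt;code&gt;TEMPLATE_ID&lt;/code&gt; are illustrative variable names, while the &lt;code&gt;client_id&lt;/code&gt;/&lt;code&gt;client_secret&lt;/code&gt; headers follow the API described earlier:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;import http from 'k6/http';

const payload = JSON.stringify({ templateData: { name: 'Load Test' } });

export default function () {
  http.post(`${__ENV.BASE_URL}/convert/${__ENV.TEMPLATE_ID}`, payload, {
    headers: {
      client_id: __ENV.CLIENT_ID,         // injected at runtime, never committed
      client_secret: __ENV.CLIENT_SECRET,
      'Content-Type': 'application/json',
    },
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;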

&lt;h3&gt;
  
  
  &lt;code&gt;--no-cache&lt;/code&gt; on Docker build
&lt;/h3&gt;

&lt;p&gt;The test script changes often. Docker's layer cache would happily serve a stale &lt;code&gt;load-test.js&lt;/code&gt; if you forget to invalidate it. &lt;code&gt;--no-cache&lt;/code&gt; is a small penalty (~5 seconds) that guarantees correctness.&lt;/p&gt;
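&lt;p&gt;In the run script, that build step would be the standard command with the flag added; the paths here follow the file structure shown below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build --no-cache -f k6/Dockerfile.k6 -t k6-load-test k6/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;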




&lt;h2&gt;
  
  
  Running Locally
&lt;/h2&gt;

&lt;p&gt;If you want to run this without GitHub Actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install k6 (macOS)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;k6

&lt;span class="c"&gt;# Run directly&lt;/span&gt;
k6 run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_id &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_secret &lt;span class="se"&gt;\&lt;/span&gt;
  k6/load-test.js

&lt;span class="c"&gt;# Or via Docker&lt;/span&gt;
docker build &lt;span class="nt"&gt;-f&lt;/span&gt; k6/Dockerfile.k6 &lt;span class="nt"&gt;-t&lt;/span&gt; k6-load-test k6/
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_id &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_secret &lt;span class="se"&gt;\&lt;/span&gt;
  k6-load-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To open the live dashboard while running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--out&lt;/span&gt; web-dashboard&lt;span class="o"&gt;=&lt;/span&gt;open &lt;span class="se"&gt;\&lt;/span&gt;
  k6/load-test.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens a browser tab with real-time charts of VU count, request rate, response times, and threshold status.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. PDF generation doesn't scale linearly.&lt;/strong&gt; At 1 VU, p95 is ~1.2s. At 5 VUs, p95 climbs to ~3.2s. The bottleneck is Chromium — each instance is single-threaded and memory-hungry. Beyond 5–6 concurrent renders on a 2-vCPU box, response times spike and errors appear.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The ramp-up stage matters.&lt;/strong&gt; Starting at full concurrency immediately causes a thundering herd. Ramping over 30 seconds gives the server time to warm up connection pools and stabilize before the steady-state measurement begins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. &lt;code&gt;sleep(1)&lt;/code&gt; is realistic pacing.&lt;/strong&gt; Without a sleep, each VU would hammer the API as fast as possible — useful for finding the absolute breaking point, but not representative of real user behavior. A 1-second pause between requests models a user who just submitted a form and is waiting.&lt;/p&gt;
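&lt;p&gt;The pacing also explains the throughput in the summary: each iteration is one request (avg ~1.91s) plus the 1-second sleep. A quick back-of-envelope check (my arithmetic, not part of the test script):&lt;/p&gt;

```javascript
// Back-of-envelope check against the k6 summary above.
// Figures are averages from the summary output, not exact per-request timings.
const avgRequest = 1.91; // seconds: avg http_req_duration
const pause = 1;         // seconds: sleep(1) between iterations
const vus = 5;           // steady-state virtual users

const perVuRate = 1 / (avgRequest + pause); // iterations per second, per VU
const peakRate = vus * perVuRate;           // requests per second at full load

console.log(peakRate.toFixed(2)); // "1.72"
```

&lt;p&gt;The measured 1.18 req/s is lower than that peak because the 2-minute window also includes the ramp-up and ramp-down stages, when fewer than 5 VUs are active.&lt;/p&gt;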

&lt;p&gt;&lt;strong&gt;4. Thresholds are commitments.&lt;/strong&gt; Defining &lt;code&gt;p(95)&amp;lt;3500&lt;/code&gt; in the script makes the performance budget explicit and machine-enforceable. When it breaks, you know exactly why — and you can't ship until it passes.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The current setup is a solid baseline. Natural next steps would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Export metrics to InfluxDB + Grafana&lt;/strong&gt; for historical trend tracking across deploys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a spike test stage&lt;/strong&gt; — a sudden jump to 20 VUs for 10 seconds — to test recovery behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test the async endpoint separately&lt;/strong&gt; — async PDF generation has different characteristics (202 immediate response, webhook delivery latency)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameterize the template ID and payload&lt;/strong&gt; to test multiple templates in a single run using k6's &lt;code&gt;SharedArray&lt;/code&gt; for test data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But for a $4/month box generating PDFs for real customers, knowing it handles 5 concurrent requests within SLA — proved by automated tests triggered from GitHub — is exactly the confidence level needed to sleep well at night.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with &lt;a href="https://k6.io" rel="noopener noreferrer"&gt;k6&lt;/a&gt; by Grafana Labs, deployed on &lt;a href="https://www.hetzner.com/cloud" rel="noopener noreferrer"&gt;Hetzner Cloud&lt;/a&gt;, automated with &lt;a href="https://github.com/features/actions" rel="noopener noreferrer"&gt;GitHub Actions&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>performance</category>
      <category>k6</category>
    </item>
    <item>
      <title>How We Built An Operations Support AI Agent for a Global Auto Industry Leader's Post Sales Software Department</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Thu, 05 Mar 2026 06:11:55 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/how-we-built-an-operations-support-ai-agent-for-a-global-auto-industry-leaders-post-sales-software-357m</link>
      <guid>https://forem.com/prathamudeshmukh/how-we-built-an-operations-support-ai-agent-for-a-global-auto-industry-leaders-post-sales-software-357m</guid>
      <description>&lt;p&gt;I was working with a global auto industry leader on their post-sales software platform. The platform had recently launched with seven modules. Each module was built as a microservice. &lt;br&gt;
Communication across services happened primarily through a message streaming broker. &lt;/p&gt;

&lt;p&gt;The data flow between services was non-trivial — upstream and downstream dependencies, bidirectional communication patterns, and conditional routing based on context.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Operational Reality
&lt;/h3&gt;

&lt;p&gt;When an end user raised a complaint, the support team had to perform initial root cause analysis before escalating to engineering.&lt;/p&gt;

&lt;p&gt;The system had too many moving parts for quick, intuition-based debugging. More importantly, the mental model of "how everything connects" was concentrated in one person on the support team.&lt;/p&gt;

&lt;p&gt;This wasn't a tooling problem.&lt;br&gt;
It was a knowledge distribution problem.&lt;/p&gt;

&lt;p&gt;The question became:&lt;/p&gt;

&lt;p&gt;Can we codify the debugging intuition of the most experienced support engineer — and make it usable by anyone?&lt;/p&gt;

&lt;p&gt;That's where the idea of an operations support AI agent emerged.&lt;/p&gt;

&lt;p&gt;But we were careful about one thing:&lt;br&gt;
The goal wasn't to make an agent that "knows everything."&lt;br&gt;
The goal was to make an agent grounded in the actual architecture of the system.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designing the Agent Backwards from Reality
&lt;/h2&gt;

&lt;p&gt;The complexity wasn't just the number of services.&lt;/p&gt;

&lt;p&gt;It was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inter-service communication patterns&lt;/li&gt;
&lt;li&gt;Conditional flows&lt;/li&gt;
&lt;li&gt;Bidirectional dependencies&lt;/li&gt;
&lt;li&gt;And multiple layers of state verification (UI, logs, database)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of jumping straight into prompt engineering, we started with context engineering.&lt;/p&gt;

&lt;p&gt;We asked:&lt;br&gt;
What does a strong human support engineer actually do when debugging?&lt;/p&gt;

&lt;p&gt;The answer was structured, even if it wasn't documented.&lt;br&gt;
And that structure became the foundation of the agent.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7g8x3twx4raba8ndg2h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7g8x3twx4raba8ndg2h.png" alt="Build phases" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 1: Reconstruct the System's Big-Picture Flow
&lt;/h3&gt;

&lt;p&gt;The services were distributed across multiple repositories (polyrepo structure). To understand interactions, we first had to bring everything into one workspace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we did:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Checked out all service repositories together.&lt;/li&gt;
&lt;li&gt;For each service, we prompted AI to generate upstream and downstream dependency diagrams based on message broker configurations found in the codebase.&lt;/li&gt;
&lt;li&gt;We generated these per service to avoid overloading the model.&lt;/li&gt;
&lt;li&gt;Once individual service documents were created, we asked AI to compile them into a single system-wide data flow diagram using Mermaid (text-based diagram generation).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result was a consolidated "big-picture" document.&lt;/p&gt;

&lt;p&gt;This became foundational context for the agent - not a theoretical architecture diagram, but something derived from the actual codebase configurations.&lt;/p&gt;

&lt;p&gt;It allowed the agent to reason about interaction points instead of guessing.&lt;/p&gt;
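&lt;p&gt;The diagrams themselves were plain Mermaid text, which keeps them diffable and regenerable. A tiny illustrative fragment (the service and topic names here are invented, not the client's):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
  OrderService -- "order.created" --&gt; BillingService
  BillingService -- "invoice.ready" --&gt; NotificationService
  NotificationService -- "delivery.failed" --&gt; OrderService
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;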




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxylium26qe95fnzgwy9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxylium26qe95fnzgwy9.png" alt="Sample Big Picture" width="800" height="863"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 2: Model How Humans Debug via the UI
&lt;/h3&gt;

&lt;p&gt;One of the most interesting observations was this:&lt;/p&gt;

&lt;p&gt;The most effective early debugging didn't start with logs.&lt;br&gt;
It started with the application UI.&lt;/p&gt;

&lt;p&gt;Support engineers used UI screens to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search for domain entities&lt;/li&gt;
&lt;li&gt;Inspect state&lt;/li&gt;
&lt;li&gt;Check timestamps&lt;/li&gt;
&lt;li&gt;Identify where a transaction stopped progressing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we needed the agent to replicate that behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we did:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompted AI to extract the list of UI screens from the micro-frontend system.&lt;/li&gt;
&lt;li&gt;Prompted separately for each UI module to maintain output quality.&lt;/li&gt;
&lt;li&gt;Generated a structured document listing:
&lt;ul&gt;
&lt;li&gt;UI screens&lt;/li&gt;
&lt;li&gt;Available search filters&lt;/li&gt;
&lt;li&gt;Displayed columns/data points&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then we created an index/router document that allowed the agent to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify which screens correspond to a domain entity&lt;/li&gt;
&lt;li&gt;Suggest navigation paths&lt;/li&gt;
&lt;li&gt;Recommend filters to apply&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This transformed the agent from a generic reasoning engine into something application-aware.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g7wqe15mqb7tjepuhdn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g7wqe15mqb7tjepuhdn.png" alt="Sample System UI models" width="800" height="1527"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 3: Enable Database-Level Reasoning with ER Context
&lt;/h3&gt;

&lt;p&gt;When UI-level validation wasn't enough, the fallback was querying the database.&lt;/p&gt;

&lt;p&gt;But meaningful DB debugging requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding entity relationships&lt;/li&gt;
&lt;li&gt;Knowing which fields exist&lt;/li&gt;
&lt;li&gt;Writing contextually valid queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generated ER diagrams for each backend service&lt;/li&gt;
&lt;li&gt;Built a routing index so the agent could load the appropriate ER diagram based on the issue's domain context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Again, the pattern was the same:&lt;br&gt;
Keep context modular.&lt;br&gt;
Allow conditional loading.&lt;br&gt;
Avoid overwhelming the model.&lt;/p&gt;
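&lt;p&gt;The routing index itself can be as simple as a small mapping document. This shape is illustrative; the domain and file names are invented:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# er-diagrams-index.md
billing, invoices        -&gt; er/billing-service.md
work orders, scheduling  -&gt; er/workshop-service.md
notifications, delivery  -&gt; er/notification-service.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;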




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8abamhh4400qylek90pv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8abamhh4400qylek90pv.png" alt="Sample ER diagrams" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 4: Designing the Agent Interaction Model
&lt;/h3&gt;

&lt;p&gt;Only after building the context layer did we design the agent itself.&lt;/p&gt;

&lt;p&gt;We structured it intentionally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role:&lt;/strong&gt; The agent acts as an Expert Operations Support Engineer.&lt;br&gt;
&lt;strong&gt;Task:&lt;/strong&gt;&lt;br&gt;
For every issue:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Extract domain entity information from the problem description.&lt;/li&gt;
&lt;li&gt;Generate a checklist in a strict order:
&lt;ul&gt;
&lt;li&gt;Verify entity state via UI screens&lt;/li&gt;
&lt;li&gt;Verify entity flow in message broker logs (with topic names)&lt;/li&gt;
&lt;li&gt;Verify entity integrity via database queries&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This sequence mirrors how experienced support engineers approach triage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context References:&lt;/strong&gt; &lt;br&gt;
The agent explicitly refers to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;big-picture.md&lt;/li&gt;
&lt;li&gt;ui-screens-index.md&lt;/li&gt;
&lt;li&gt;er-diagrams-index.md&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Output Format&lt;/strong&gt;&lt;br&gt;
Every response includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem Understanding&lt;/li&gt;
&lt;li&gt;Overall Impact&lt;/li&gt;
&lt;li&gt;Checklist to Follow&lt;/li&gt;
&lt;li&gt;Interaction Points Identified&lt;/li&gt;
&lt;li&gt;Short Summary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output was designed to be executable.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskiprxeccr2ww1ol9qcl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskiprxeccr2ww1ol9qcl.png" alt="Sample output" width="800" height="820"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Achieved
&lt;/h2&gt;

&lt;p&gt;This resulted in a PoC custom agent capable of generating structured, domain-aware debugging checklists.&lt;/p&gt;

&lt;p&gt;More importantly:&lt;/p&gt;

&lt;p&gt;We converted implicit operational knowledge into explicit, structured artifacts.&lt;/p&gt;

&lt;p&gt;The support workflow was no longer dependent on a single individual's system intuition.&lt;/p&gt;




&lt;h2&gt;
  
  
  Human Evaluation: A Necessary Constraint
&lt;/h2&gt;

&lt;p&gt;We did not treat the agent as authoritative.&lt;/p&gt;

&lt;p&gt;Every generated checklist was reviewed by the existing support engineers who already performed these tasks manually.&lt;/p&gt;

&lt;p&gt;The next phase was clear:&lt;br&gt;
Test it with individuals who had minimal context of the system.&lt;/p&gt;

&lt;p&gt;Iteration was always part of the plan.&lt;/p&gt;

&lt;p&gt;An operations agent like this should be critiqued continuously.&lt;br&gt;
Its usefulness depends entirely on how rigorously it is refined.&lt;/p&gt;




&lt;h3&gt;
  
  
  Where This Can Go
&lt;/h3&gt;

&lt;p&gt;Once the checklist is grounded in real architecture, each phase becomes automatable.&lt;/p&gt;

&lt;p&gt;Future possibilities we identified:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatically updating UI and ER documents on PR merges&lt;/li&gt;
&lt;li&gt;Garbage collection of outdated context&lt;/li&gt;
&lt;li&gt;Triggering the agent via MCP when a P1 ticket is raised&lt;/li&gt;
&lt;li&gt;Attaching generated checklists directly to support tickets&lt;/li&gt;
&lt;li&gt;Providing read-only search capabilities for domain entities&lt;/li&gt;
&lt;li&gt;Integrating log keyword searches and adapting based on results&lt;/li&gt;
&lt;li&gt;Even raising bug tickets automatically if conditions are met&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The autonomy doesn't need to jump to full resolution.&lt;/p&gt;

&lt;p&gt;It can increase incrementally, phase by phase, based on trust and accuracy.&lt;/p&gt;




&lt;h3&gt;
  
  
  Reflection
&lt;/h3&gt;

&lt;p&gt;The hardest part of AI in operations isn't reasoning.&lt;br&gt;
It's grounding.&lt;/p&gt;

&lt;p&gt;Once the system's architecture, UI workflows, and data relationships were codified into structured context, the agent's job became deterministic.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
    </item>
    <item>
      <title>Stop Wasting Context</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Wed, 25 Feb 2026 12:16:58 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/stop-wasting-context-34b3</link>
      <guid>https://forem.com/prathamudeshmukh/stop-wasting-context-34b3</guid>
      <description>&lt;p&gt;&lt;strong&gt;OpenAI says "Context is a scarce resource."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Treat it like one.&lt;/p&gt;

&lt;p&gt;A giant instruction file feels safe. It feels thorough. But in reality, it crowds out the actual task, the code, and the relevant constraints.&lt;/p&gt;

&lt;p&gt;The agent doesn't get smarter with more text.&lt;br&gt;
It just gets distracted.&lt;/p&gt;

&lt;p&gt;It either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Misses the real constraint buried in noise&lt;/li&gt;
&lt;li&gt;Starts optimizing for the wrong objective&lt;/li&gt;
&lt;li&gt;Or worse, overfits to instructions that don't matter right now&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The right mental model is to think of context like RAM in a running system.&lt;br&gt;
RAM is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Finite&lt;/li&gt;
&lt;li&gt;Expensive&lt;/li&gt;
&lt;li&gt;Meant for what's actively being processed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't load your entire hard drive into memory just because it might be useful.&lt;/p&gt;

&lt;p&gt;Same with LLM context.&lt;/p&gt;

&lt;p&gt;So what would you do to optimize RAM?&lt;br&gt;
Do the same for context.&lt;/p&gt;
&lt;h3&gt;
  
  
  Garbage Collect Aggressively
&lt;/h3&gt;

&lt;p&gt;Remove:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Old decisions that no longer apply&lt;/li&gt;
&lt;li&gt;Duplicated instructions&lt;/li&gt;
&lt;li&gt;Outdated constraints&lt;/li&gt;
&lt;li&gt;"Nice-to-know" explanations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If it's not needed for this task, it shouldn't be in memory.&lt;/p&gt;
&lt;h3&gt;
  
  
  Load on Demand (Lazy Loading)
&lt;/h3&gt;

&lt;p&gt;Don't preload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All coding standards&lt;/li&gt;
&lt;li&gt;All architecture docs&lt;/li&gt;
&lt;li&gt;All squad rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inject only what's relevant to the current step&lt;/li&gt;
&lt;li&gt;Use smaller scoped agents&lt;/li&gt;
&lt;li&gt;Pull specific docs when needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Context should be dynamic, not monolithic.&lt;/p&gt;
&lt;h3&gt;
  
  
  Compress, Don't Copy
&lt;/h3&gt;

&lt;p&gt;Replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long paragraphs&lt;/li&gt;
&lt;li&gt;Repeated policy text&lt;/li&gt;
&lt;li&gt;Verbose explanations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bullet summaries&lt;/li&gt;
&lt;li&gt;Structured rules&lt;/li&gt;
&lt;li&gt;Canonical references&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't duplicate libraries in RAM — you reference them.&lt;/p&gt;
&lt;h3&gt;
  
  
  Modularize Instructions
&lt;/h3&gt;

&lt;p&gt;Instead of one giant instruction file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- core-standards.md
- frontend-guidelines.md
- backend-guidelines.md
- architecture-principles.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Load only what the current task touches.&lt;br&gt;
Context should be composable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Separate Long-Term vs Working Memory
&lt;/h3&gt;

&lt;p&gt;Some things are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable principles (coding philosophy, architectural values)&lt;/li&gt;
&lt;li&gt;Temporary task constraints (fix this bug, implement this endpoint)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't mix them.&lt;/p&gt;

&lt;p&gt;Keep:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable principles lean and abstract&lt;/li&gt;
&lt;li&gt;Task context precise and scoped&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Avoid Over-Specification
&lt;/h3&gt;

&lt;p&gt;The more constraints you add, the more the model optimizes for instruction compliance&lt;br&gt;
and the less it reasons about the problem. High signal beats high volume.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimize for Relevance, Not Completeness
&lt;/h3&gt;

&lt;p&gt;You don't win by giving the model everything.&lt;br&gt;
You win by giving it exactly what it needs to think clearly.&lt;/p&gt;

&lt;p&gt;The goal isn't:&lt;br&gt;
"Did I include all the instructions?"&lt;/p&gt;

&lt;p&gt;The goal is:&lt;br&gt;
"Did I include the right instructions?"&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Take
&lt;/h3&gt;

&lt;p&gt;Large context != better output.&lt;br&gt;
Relevant context = better reasoning.&lt;/p&gt;

&lt;p&gt;Treat context like RAM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep it lean&lt;/li&gt;
&lt;li&gt;Keep it current&lt;/li&gt;
&lt;li&gt;Load intentionally&lt;/li&gt;
&lt;li&gt;Evict aggressively&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Systems that manage memory well perform better.&lt;br&gt;
Agents are no different.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>memory</category>
    </item>
    <item>
      <title>Convert Any HTML to a Branded PDF Using Templify's API</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Tue, 11 Nov 2025 07:44:15 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/convert-any-html-to-a-branded-pdf-using-templifys-api-i03</link>
      <guid>https://forem.com/prathamudeshmukh/convert-any-html-to-a-branded-pdf-using-templifys-api-i03</guid>
      <description>&lt;h2&gt;
  
  
  Convert Any HTML to a Branded PDF Using Templify's API
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Stop wrestling with Puppeteer and CSS quirks. With &lt;strong&gt;&lt;a href="https://templify.cloud" rel="noopener noreferrer"&gt;Templify&lt;/a&gt;&lt;/strong&gt;, you can turn any HTML (with your brand styling) into a high-quality PDF in a few lines of code - all via a developer-friendly API.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why This Problem Exists
&lt;/h2&gt;

&lt;p&gt;If you've ever tried to generate PDFs programmatically, you've probably faced one of these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fonts or images breaking between pages
&lt;/li&gt;
&lt;li&gt;Headless browser setup nightmares
&lt;/li&gt;
&lt;li&gt;Manual branding updates that break existing layouts
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Templify was built to end that pain - combining a &lt;strong&gt;powerful rendering engine&lt;/strong&gt; with a &lt;strong&gt;no-code template editor&lt;/strong&gt; and &lt;strong&gt;clean REST API&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Is Templify?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Templify&lt;/strong&gt; is an API-first + visual platform to convert HTML into PDF.&lt;/p&gt;

&lt;p&gt;You can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload, edit or create new templates visually (using GrapesJS Studio)&lt;/li&gt;
&lt;li&gt;Inject dynamic data via variables and JSON&lt;/li&gt;
&lt;li&gt;Render high-fidelity PDFs through a simple API call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as &lt;em&gt;“Puppeteer meets Canva meets API.”&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;At its core, Templify uses a simple POST request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST https://api.templify.cloud/convert
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With your HTML and optional data payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--location&lt;/span&gt; https://api.templify.cloud/convert/YOUR_TEMPLATE_ID_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;client_id: USER_ID_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;client_secret: CLIENT_SECRET_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;Content-Type: application/json&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;Cookie: &lt;span class="nv"&gt;NEXT_LOCALE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;en&lt;span class="s1"&gt;' \
--data '&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"templateData"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
         &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"John Doe"&lt;/span&gt;,
         &lt;span class="s2"&gt;"invoice_number"&lt;/span&gt;: &lt;span class="s2"&gt;"INV-1001"&lt;/span&gt;,
         &lt;span class="s2"&gt;"items"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
           &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"description"&lt;/span&gt;: &lt;span class="s2"&gt;"Item 1"&lt;/span&gt;, &lt;span class="s2"&gt;"price"&lt;/span&gt;: 20 &lt;span class="o"&gt;}&lt;/span&gt;,
           &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"description"&lt;/span&gt;: &lt;span class="s2"&gt;"Item 2"&lt;/span&gt;, &lt;span class="s2"&gt;"price"&lt;/span&gt;: 30 &lt;span class="o"&gt;}&lt;/span&gt;
         &lt;span class="o"&gt;]&lt;/span&gt;
       &lt;span class="o"&gt;}&lt;/span&gt;
     &lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it - your template instantly becomes a branded PDF.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add Branding, Fonts, and Styles
&lt;/h3&gt;

&lt;p&gt;Since Templify allows you to create templates with raw HTML + CSS, you can easily embed your brand identity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;style&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;body&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;font-family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;'Inter'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sans-serif&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#222&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nc"&gt;.header&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#2b2bff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;white&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"header"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;{{company}} Invoice&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Customer: {{name}}&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When rendered, your output PDF retains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fonts and brand colors&lt;/li&gt;
&lt;li&gt;Page margins and layouts&lt;/li&gt;
&lt;li&gt;Images, logos, and even charts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No Chrome setup. No CSS hacks. Just clean output every time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bonus: Use Pre-Designed Templates
&lt;/h3&gt;

&lt;p&gt;If you don't want to write HTML manually, log into the Templify Dashboard, create a new template visually, and reference its ID in your API call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST https://api.templify.cloud/convert/TEMPLATE_ID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Templify will inject your dynamic data into that template and return the branded PDF instantly.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>api</category>
      <category>dynamic</category>
      <category>node</category>
    </item>
  </channel>
</rss>
