<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Prathamesh Deshmukh</title>
    <description>The latest articles on Forem by Prathamesh Deshmukh (@prathamudeshmukh).</description>
    <link>https://forem.com/prathamudeshmukh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F102643%2F411894c7-d573-4b0a-a5cf-9024e0ae8322.jpeg</url>
      <title>Forem: Prathamesh Deshmukh</title>
      <link>https://forem.com/prathamudeshmukh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/prathamudeshmukh"/>
    <language>en</language>
    <item>
      <title>How I Load Test a PDF Generation API with k6, Docker, and GitHub Actions</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Sat, 28 Mar 2026 05:13:48 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/how-i-load-test-a-pdf-generation-api-with-k6-docker-and-github-actions-4em4</link>
      <guid>https://forem.com/prathamudeshmukh/how-i-load-test-a-pdf-generation-api-with-k6-docker-and-github-actions-4em4</guid>
      <description>&lt;h2&gt;
  
  
  The Problem with "It Works on My Machine"
&lt;/h2&gt;

&lt;p&gt;PDF generation is one of those deceptively expensive operations. You fire off a request, Puppeteer spins up a headless Chromium, renders a full HTML page, and exports it to bytes. Works great in dev. Works great in staging with one user. Then someone puts it in production and a dozen concurrent requests land at once — and you discover your server is quietly crying.&lt;/p&gt;

&lt;p&gt;That was the situation with &lt;a href="https://templify.cloud" rel="noopener noreferrer"&gt;Templify&lt;/a&gt;, a PDF generation platform I built. The core API — &lt;code&gt;POST /convert/{templateId}&lt;/code&gt; — compiles a Handlebars template and delegates to a job-runner service (Express + Puppeteer) to render the PDF. Each request is CPU-bound and takes 1–4 seconds depending on template complexity.&lt;/p&gt;
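&lt;p&gt;As a mental model, the convert flow can be sketched in a few lines. This is an illustrative stub, not Templify's actual code: the Handlebars compile is mimicked with naive &lt;code&gt;{{key}}&lt;/code&gt; substitution, and the Puppeteer render is replaced by a stand-in function.&lt;/p&gt;

```javascript
// Illustrative sketch of the convert flow, not Templify's actual code:
// Handlebars is mimicked with simple {{key}} substitution and the
// Puppeteer render step is stubbed out.
function compileTemplate(source) {
  return (data) =>
    source.replace(/\{\{(\w+)\}\}/g, (_, key) => String(data[key] ?? ''));
}

// Stand-in for the job-runner call that launches headless Chromium.
function renderPdfStub(html) {
  return '%PDF-stub (' + html.length + ' chars of HTML)';
}

function convert(templateSource, templateData) {
  const html = compileTemplate(templateSource)(templateData); // step 1: compile
  return renderPdfStub(html);                                 // step 2: render
}

const pdf = convert('Invoice for {{customer}}, total {{total}}', {
  customer: 'Acme Corp',
  total: '$1,200',
});
console.log(pdf); // "%PDF-stub (35 chars of HTML)"
```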

&lt;p&gt;Before confidently telling users the API handles concurrent load, I needed proof. Enter &lt;strong&gt;k6&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why k6
&lt;/h2&gt;

&lt;p&gt;I've used Locust, JMeter, and Artillery. k6 wins on developer ergonomics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Test scripts are JavaScript&lt;/strong&gt; — no YAML configs, no XML, no DSL to learn&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in thresholds&lt;/strong&gt; — define pass/fail criteria in the script itself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker-first&lt;/strong&gt; — the official &lt;code&gt;grafana/k6&lt;/code&gt; image just works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI output is readable&lt;/strong&gt; — colored, structured, and tells you exactly what you need&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only gotcha: k6 uses a custom JS runtime (Goja, not Node.js), so you can't &lt;code&gt;import&lt;/code&gt; arbitrary npm packages; only k6's built-in modules and scripts loaded by URL are available. For plain API testing, that limitation rarely matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Infrastructure (Brief Context)
&lt;/h2&gt;

&lt;p&gt;The job-runner service runs in Docker on a &lt;strong&gt;Hetzner CX11&lt;/strong&gt; — 2 vCPU, 4GB RAM, ~$4.15/month. It hosts both production (port 3000) and staging (port 3001) environments as separate containers on the same box.&lt;/p&gt;

&lt;p&gt;The load test runs &lt;em&gt;on the Hetzner server itself&lt;/em&gt;, not from the GitHub Actions runner. This is intentional: it removes the network-latency variability of CI runners, so the results reflect what the server can actually sustain rather than the quality of the network path between GitHub and Hetzner.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Write the k6 Test Script
&lt;/h2&gt;

&lt;p&gt;The test script lives at &lt;code&gt;job-runner/k6/load-test.js&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Shape
&lt;/h3&gt;

&lt;p&gt;k6 calls the test configuration &lt;code&gt;options&lt;/code&gt;. The shape I chose is a classic &lt;strong&gt;ramp-up → steady state → ramp-down&lt;/strong&gt; pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// ramp up to 5 VUs&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1m&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// hold for 1 minute&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;// ramp down&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;http_req_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(95)&amp;lt;3500&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;// 95th percentile under 3.5s&lt;/span&gt;
    &lt;span class="na"&gt;http_req_failed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rate&amp;lt;0.1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;// error rate under 10%&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why 5 virtual users?&lt;/strong&gt; This isn't an arbitrary number. Each VU issues one request, waits 1 second (&lt;code&gt;sleep(1)&lt;/code&gt;), then issues another. At 5 concurrent VUs with ~2–3s response times, the server is handling roughly 3 simultaneous Puppeteer renders at any given moment — right at the limit of what a 2-vCPU box can sustain without queuing.&lt;/p&gt;
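&lt;p&gt;That occupancy estimate is just Little's law applied to a closed loop: each VU is in-flight for &lt;code&gt;responseTime / (responseTime + sleep)&lt;/code&gt; of its iteration. A quick back-of-envelope check with the numbers above:&lt;/p&gt;

```javascript
// Closed-loop concurrency estimate: each VU loops
// "request (responseTime) + sleep(sleepTime)", so the expected number
// of simultaneous in-flight renders is vus * responseTime / iteration.
function inflight(vus, responseTime, sleepTime) {
  return (vus * responseTime) / (responseTime + sleepTime);
}

console.log(inflight(5, 2, 1).toFixed(1)); // "3.3" at 2s responses
console.log(inflight(5, 3, 1).toFixed(1)); // "3.8" at 3s responses
```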

&lt;p&gt;&lt;strong&gt;Why p95 instead of average?&lt;/strong&gt; Averages lie. A test where 94% of requests complete in 200ms and 6% time out at 30 seconds still averages under 2 seconds, which looks "fine" on paper. p95 tells you what the worst realistic experience looks like.&lt;/p&gt;
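&lt;p&gt;The arithmetic behind that example, spelled out (the distribution is the hypothetical one above, not measured data):&lt;/p&gt;

```javascript
// 94 requests at 200ms plus 6 timeouts at 30s: the mean looks fine,
// the 95th percentile does not.
const latencies = [
  ...Array(94).fill(200),   // fast requests (ms)
  ...Array(6).fill(30000),  // timeouts (ms)
];

const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;

// Nearest-rank percentile: value at the ceil(p% * n)-th position.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil((p / 100) * sorted.length) - 1];
}

console.log(mean);                      // 1988 ms, under 2s
console.log(percentile(latencies, 95)); // 30000 ms, i.e. the timeouts
```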

&lt;h3&gt;
  
  
  The Payload
&lt;/h3&gt;

&lt;p&gt;The test uses a realistic Handlebars template payload — a multi-section marketing brochure for a fictional company called TechInnovate. This matters: testing with &lt;code&gt;{"name": "test"}&lt;/code&gt; would render in 300ms; testing with the actual production payload shape catches real performance characteristics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PAYLOAD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;templateData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;hero&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transforming Business Through Technology&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;subtitle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Innovative solutions that drive growth, efficiency, and competitive advantage&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;ctaButton&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#products&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Discover Our Solutions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;about&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;About TechInnovate&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Founded in 2015, TechInnovate has been at the forefront...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;bulletPoints&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;10+ years of industry experience&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;200+ successful projects delivered&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;98% client satisfaction rate&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Global team of certified experts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;products&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="cm"&gt;/* 3 products with prices, descriptions */&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="cm"&gt;/* 4 feature cards */&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;testimonial&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;We have seen a 40% increase in efficiency...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Jennifer Martinez&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;company&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CEO, Global Enterprises&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// ... contact, footer, social links&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Request Function
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TEMPLATE_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;c07deb00-bb22-4e5f-b48e-1b1c17f7c969&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;__ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;__ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.templify.cloud&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;client_secret&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;client_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pdfResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/convert/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;TEMPLATE_ID&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;PAYLOAD&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pdfResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PDF generation status is 200&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PDF generation response time &amp;lt; 5s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Credentials come from environment variables via k6's &lt;code&gt;__ENV&lt;/code&gt; — never hardcoded. The &lt;code&gt;check()&lt;/code&gt; function records pass/fail metrics per assertion without stopping the test.&lt;/p&gt;
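&lt;p&gt;Conceptually, &lt;code&gt;check()&lt;/code&gt; behaves something like the simplified model below. This is a sketch of the semantics, not k6's implementation:&lt;/p&gt;

```javascript
// Simplified model of how k6's check() behaves (not k6's source):
// every predicate is evaluated and tallied, and a failing check
// never throws, so the VU loop keeps going.
const tally = {};

function check(value, checks) {
  let allPassed = true;
  for (const [name, predicate] of Object.entries(checks)) {
    const passed = Boolean(predicate(value));
    tally[name] = tally[name] || { passes: 0, fails: 0 };
    passed ? tally[name].passes++ : tally[name].fails++;
    if (!passed) allPassed = false;
  }
  return allPassed; // informational; the iteration continues either way
}

// A response that passes the status check but fails the latency check:
const fakeResponse = { status: 200, timings: { duration: 6200 } };
const ok = check(fakeResponse, {
  'status is 200': (r) => r.status === 200,
  'duration under 5s': (r) => 5000 > r.timings.duration,
});
console.log(ok, tally['duration under 5s']); // false { passes: 0, fails: 1 }
```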




&lt;h2&gt;
  
  
  Step 2: Containerize k6
&lt;/h2&gt;

&lt;p&gt;The Dockerfile is deliberately minimal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; grafana/k6:latest&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; *.js .&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["run", "load-test.js"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The official &lt;code&gt;grafana/k6&lt;/code&gt; image ships k6 at a known version in a minimal Alpine-based image. No node_modules, no build step, no complexity. The &lt;code&gt;*.js&lt;/code&gt; glob future-proofs it — add more test files and they're automatically available.&lt;/p&gt;
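&lt;p&gt;One caveat with &lt;code&gt;latest&lt;/code&gt;: a silent k6 upgrade can shift metrics or output format between runs, which muddies month-over-month comparisons. If reproducibility matters more than freshness, pin the base image (the version shown is illustrative):&lt;/p&gt;

```dockerfile
# Pin k6 so load-test results stay comparable across runs.
# (0.50.0 is an illustrative version, not a recommendation.)
FROM grafana/k6:0.50.0

COPY *.js .

CMD ["run", "load-test.js"]
```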




&lt;h2&gt;
  
  
  Step 3: The Deploy Script — Running Tests Remotely
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;run-load-test.sh&lt;/code&gt; script orchestrates the full flow: sync the k6 files to the server, build the Docker image there, run it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nv"&gt;TARGET_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HETZNER_HOST&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running load test for PRODUCTION environment..."&lt;/span&gt;

&lt;span class="c"&gt;# Clean previous run&lt;/span&gt;
ssh &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt; &lt;span class="s2"&gt;"rm -rf ~/load-test/k6"&lt;/span&gt;

&lt;span class="c"&gt;# Sync k6 folder to server&lt;/span&gt;
scp &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-r&lt;/span&gt; k6 &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt;:~/load-test/

&lt;span class="c"&gt;# Build and run on the server&lt;/span&gt;
ssh &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;StrictHostKeyChecking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;no &lt;span class="nt"&gt;-i&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;$HETZNER_USER&lt;/span&gt;@&lt;span class="nv"&gt;$TARGET_HOST&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
    cd ~/load-test/k6
    docker stop k6-load-test 2&amp;gt;/dev/null || true
    docker rm k6-load-test 2&amp;gt;/dev/null || true
    docker build --no-cache -f Dockerfile.k6 -t k6-load-test .
    docker run --rm &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      --name k6-load-test &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e CLIENT_ID=&lt;/span&gt;&lt;span class="nv"&gt;$CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e CLIENT_SECRET=&lt;/span&gt;&lt;span class="nv"&gt;$CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e K6_WEB_DASHBOARD=true &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      -e K6_WEB_DASHBOARD_PORT=-1 &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
      k6-load-test
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Load test completed."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why build on the server instead of pulling from a registry?
&lt;/h3&gt;

&lt;p&gt;Because the test script changes frequently during development. Pushing to a registry on every iteration adds friction. &lt;code&gt;scp&lt;/code&gt; + &lt;code&gt;docker build --no-cache&lt;/code&gt; is fast (&amp;lt; 30 seconds) and guarantees you're running exactly the code you just edited — no cache surprises.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;K6_WEB_DASHBOARD=true&lt;/code&gt; flag
&lt;/h3&gt;

&lt;p&gt;k6 ships with a built-in real-time web dashboard. Setting &lt;code&gt;K6_WEB_DASHBOARD_PORT=-1&lt;/code&gt; disables the HTTP server (since we're in a non-interactive SSH session) but still enables the dashboard's internal metrics aggregation and summary report output at the end. If you're running interactively, set a port (the dashboard defaults to &lt;code&gt;5665&lt;/code&gt;) and open it in your browser during the test run.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: GitHub Actions Workflow — Manual Trigger
&lt;/h2&gt;

&lt;p&gt;Load tests are &lt;strong&gt;not&lt;/strong&gt; run on every push. Running a 2-minute load test on every PR would be slow, expensive (in credits), and noisy. Instead, it's a &lt;code&gt;workflow_dispatch&lt;/code&gt; — triggered manually, on demand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Load Test&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Environment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;test'&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;production'&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;choice&lt;/span&gt;
        &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;staging&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;load-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;

    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout code&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup SSH&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;mkdir -p ~/.ssh&lt;/span&gt;
          &lt;span class="s"&gt;echo "${{ secrets.HETZNER_SSH_KEY }}" &amp;gt; ~/.ssh/id_rsa&lt;/span&gt;
          &lt;span class="s"&gt;chmod 600 ~/.ssh/id_rsa&lt;/span&gt;
          &lt;span class="s"&gt;ssh-keyscan -H ${{ secrets.HETZNER_HOST }} &amp;gt;&amp;gt; ~/.ssh/known_hosts&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run load test&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.CLIENT_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.CLIENT_SECRET }}&lt;/span&gt;
          &lt;span class="na"&gt;HETZNER_HOST&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.HETZNER_HOST }}&lt;/span&gt;
          &lt;span class="na"&gt;HETZNER_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.HETZNER_USER }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;chmod +x scripts/run-load-test.sh&lt;/span&gt;
          &lt;span class="s"&gt;./scripts/run-load-test.sh ${{ github.event.inputs.environment }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Required GitHub secrets:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Secret&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_SSH_KEY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Contents of the ED25519 private key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_HOST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Server IP or hostname&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HETZNER_USER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SSH user (e.g. &lt;code&gt;root&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLIENT_ID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Templify API client ID&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLIENT_SECRET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Templify API client secret&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The &lt;code&gt;ssh-keyscan&lt;/code&gt; step adds the server's host key to &lt;code&gt;known_hosts&lt;/code&gt;, preventing the interactive "are you sure?" prompt that would hang the CI runner.&lt;/p&gt;


&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;When k6 finishes, it prints a summary to stdout — which GitHub Actions captures and displays in the workflow logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: load-test.js
     output: -

  scenarios: (100.00%) 1 scenario, 5 max VUs, 2m30s max duration (incl. graceful stop):
           * default: Up to 5 looping VUs for 2m0s over 3 stages (gracefulRampDown: 30s, ...)

✓ PDF generation status is 200
✓ PDF generation response time &amp;lt; 5s

     checks.........................: 100.00% ✓ 142  ✗ 0
     data_received..................: 14 MB   117 kB/s
     data_sent......................: 87 kB   727 B/s
     http_req_blocked...............: avg=18.4ms   min=2µs    med=5µs    max=1.04s    p(90)=10µs   p(95)=14µs
     http_req_duration..............: avg=1.91s    min=892ms  med=1.79s  max=4.12s    p(90)=2.94s  p(95)=3.21s
   ✓ { expected_response:true }....: avg=1.91s    min=892ms  med=1.79s  max=4.12s    p(90)=2.94s  p(95)=3.21s
     http_req_failed................: 0.00%   ✓ 0    ✗ 142
     http_req_receiving.............: avg=143.2ms  min=4.98ms med=79.5ms max=731ms    p(90)=381ms  p(95)=477ms
     http_req_sending...............: avg=258µs    min=97µs   med=213µs  max=1.45ms   p(90)=435µs  p(95)=509µs
     http_req_tls_handshaking.......: avg=18.3ms   min=0s     med=0s     max=1.04s    p(90)=0s     p(95)=0s
     http_req_waiting...............: avg=1.77s    min=858ms  med=1.66s  max=3.86s    p(90)=2.72s  p(95)=3.02s
     http_reqs......................: 142     1.183333/s
     iteration_duration.............: avg=2.91s    min=1.9s   med=2.79s  max=5.15s    p(90)=3.94s  p(95)=4.21s
     iterations.....................: 142     1.183333/s
     vus............................: 1       min=1  max=5
     vus_max........................: 5       min=5  max=5


running (2m00.0s), 0/5 VUs, 142 complete and 0 interrupted iterations
default ✓ [==============================] 0/5 VUs  2m0s

✓ http_req_duration............: p(95)=3.21s &amp;lt; 3.5s  ✓ PASS
✓ http_req_failed..............: rate=0.00% &amp;lt; 10%     ✓ PASS
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thresholds section at the bottom is the pass/fail verdict. k6 exits with a non-zero code if any threshold is breached — which means GitHub Actions marks the workflow run as failed. No manual inspection required.&lt;/p&gt;
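&lt;p&gt;The script itself isn't shown in this section, but in k6 that verdict comes from a &lt;code&gt;thresholds&lt;/code&gt; block in the exported options. A minimal sketch, with budgets matching the summary above (not the exact script):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;export const options = {
  thresholds: {
    // Performance budget: 95th percentile under 3.5s
    http_req_duration: ['p(95)&amp;lt;3500'],
    // Error budget: fewer than 10% failed requests
    http_req_failed: ['rate&amp;lt;0.10'],
  },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;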




&lt;h2&gt;
  
  
  The Full File Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;job-runner/
├── k6/
│   ├── load-test.js       # k6 test script
│   └── Dockerfile.k6      # Minimal k6 Docker image
├── scripts/
│   └── run-load-test.sh   # SSH + SCP + docker run orchestration
└── .github/
    └── workflows/
        └── load-test.yml  # Manual GitHub Actions workflow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four files. That's the entire setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Design Decisions, Explained
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Run k6 on the server, not from CI
&lt;/h3&gt;

&lt;p&gt;Running from GitHub's Ubuntu runners introduces network hops: CI runner → Cloudflare/CDN → Vercel (API gateway) → Hetzner. That's fine for integration testing but adds noise to performance benchmarking. Running on Hetzner itself tests the raw capacity of the PDF service without network jitter.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;workflow_dispatch&lt;/code&gt;, not &lt;code&gt;push&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Load tests are deliberately not automated on push. They consume API credits (each PDF generation deducts 1 credit), generate real load, and take 2+ minutes. The right time to run them is before a deploy to production or when investigating a performance regression — not on every feature branch commit.&lt;/p&gt;
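&lt;p&gt;For reference, the manual trigger is a single key in the workflow file. This sketch assumes a standard GitHub Actions layout rather than quoting the actual &lt;code&gt;load-test.yml&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;on:
  workflow_dispatch:   # runs only when triggered manually from the Actions tab

jobs:
  load-test:
    runs-on: ubuntu-latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;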

&lt;h3&gt;
  
  
  Credentials via &lt;code&gt;__ENV&lt;/code&gt;, never hardcoded
&lt;/h3&gt;

&lt;p&gt;k6's &lt;code&gt;__ENV&lt;/code&gt; object reads environment variables passed at runtime. This means the same script works in local dev (&lt;code&gt;k6 run --env CLIENT_ID=xxx load-test.js&lt;/code&gt;), in Docker (&lt;code&gt;-e CLIENT_ID=xxx&lt;/code&gt;), and in CI — without any code changes.&lt;/p&gt;
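&lt;p&gt;Inside the script, that looks something like this. A sketch only: &lt;code&gt;BASE_URL&lt;/code&gt; and &lt;code&gt;TEMPLATE_ID&lt;/code&gt; are illustrative variable names, while the &lt;code&gt;client_id&lt;/code&gt;/&lt;code&gt;client_secret&lt;/code&gt; headers follow the API described earlier:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;import http from 'k6/http';

const payload = JSON.stringify({ templateData: { name: 'Load Test' } });

export default function () {
  http.post(`${__ENV.BASE_URL}/convert/${__ENV.TEMPLATE_ID}`, payload, {
    headers: {
      client_id: __ENV.CLIENT_ID,         // injected at runtime, never committed
      client_secret: __ENV.CLIENT_SECRET,
      'Content-Type': 'application/json',
    },
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;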

&lt;h3&gt;
  
  
  &lt;code&gt;--no-cache&lt;/code&gt; on Docker build
&lt;/h3&gt;

&lt;p&gt;The test script changes often. Docker's layer cache would happily serve a stale &lt;code&gt;load-test.js&lt;/code&gt; if you forget to invalidate it. &lt;code&gt;--no-cache&lt;/code&gt; is a small penalty (~5 seconds) that guarantees correctness.&lt;/p&gt;
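&lt;p&gt;In the run script, that build step would be the standard command with the flag added; the paths here follow the file structure shown below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build --no-cache -f k6/Dockerfile.k6 -t k6-load-test k6/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;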




&lt;h2&gt;
  
  
  Running Locally
&lt;/h2&gt;

&lt;p&gt;If you want to run this without GitHub Actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install k6 (macOS)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;k6

&lt;span class="c"&gt;# Run directly&lt;/span&gt;
k6 run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_id &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_secret &lt;span class="se"&gt;\&lt;/span&gt;
  k6/load-test.js

&lt;span class="c"&gt;# Or via Docker&lt;/span&gt;
docker build &lt;span class="nt"&gt;-f&lt;/span&gt; k6/Dockerfile.k6 &lt;span class="nt"&gt;-t&lt;/span&gt; k6-load-test k6/
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_id &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_client_secret &lt;span class="se"&gt;\&lt;/span&gt;
  k6-load-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To open the live dashboard while running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--out&lt;/span&gt; web-dashboard&lt;span class="o"&gt;=&lt;/span&gt;open &lt;span class="se"&gt;\&lt;/span&gt;
  k6/load-test.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens a browser tab with real-time charts of VU count, request rate, response times, and threshold status.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. PDF generation doesn't scale linearly.&lt;/strong&gt; At 1 VU, p95 is ~1.2s. At 5 VUs, p95 climbs to ~3.2s. The bottleneck is Chromium — each instance is single-threaded and memory-hungry. Beyond 5–6 concurrent renders on a 2-vCPU box, response times spike and errors appear.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The ramp-up stage matters.&lt;/strong&gt; Starting at full concurrency immediately causes a thundering herd. Ramping over 30 seconds gives the server time to warm up connection pools and stabilize before the steady-state measurement begins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. &lt;code&gt;sleep(1)&lt;/code&gt; is realistic pacing.&lt;/strong&gt; Without a sleep, each VU would hammer the API as fast as possible — useful for finding the absolute breaking point, but not representative of real user behavior. A 1-second pause between requests models a user who just submitted a form and is waiting.&lt;/p&gt;
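&lt;p&gt;The pacing also explains the throughput in the summary: each iteration is one request (avg ~1.91s) plus the 1-second sleep. A quick back-of-envelope check (my arithmetic, not part of the test script):&lt;/p&gt;

```javascript
// Back-of-envelope check against the k6 summary above.
// Figures are averages from the summary output, not exact per-request timings.
const avgRequest = 1.91; // seconds: avg http_req_duration
const pause = 1;         // seconds: sleep(1) between iterations
const vus = 5;           // steady-state virtual users

const perVuRate = 1 / (avgRequest + pause); // iterations per second, per VU
const peakRate = vus * perVuRate;           // requests per second at full load

console.log(peakRate.toFixed(2)); // "1.72"
```

&lt;p&gt;The measured 1.18 req/s is lower than that peak because the 2-minute window also includes the ramp-up and ramp-down stages, when fewer than 5 VUs are active.&lt;/p&gt;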

&lt;p&gt;&lt;strong&gt;4. Thresholds are commitments.&lt;/strong&gt; Defining &lt;code&gt;p(95)&amp;lt;3500&lt;/code&gt; in the script makes the performance budget explicit and machine-enforceable. When it breaks, you know exactly why — and you can't ship until it passes.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The current setup is a solid baseline. Natural next steps would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Export metrics to InfluxDB + Grafana&lt;/strong&gt; for historical trend tracking across deploys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a spike test stage&lt;/strong&gt; — a sudden jump to 20 VUs for 10 seconds — to test recovery behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test the async endpoint separately&lt;/strong&gt; — async PDF generation has different characteristics (202 immediate response, webhook delivery latency)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameterize the template ID and payload&lt;/strong&gt; to test multiple templates in a single run using k6's &lt;code&gt;SharedArray&lt;/code&gt; for test data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But for a $4/month box generating PDFs for real customers, knowing it handles 5 concurrent requests within SLA — proved by automated tests triggered from GitHub — is exactly the confidence level needed to sleep well at night.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with &lt;a href="https://k6.io" rel="noopener noreferrer"&gt;k6&lt;/a&gt; by Grafana Labs, deployed on &lt;a href="https://www.hetzner.com/cloud" rel="noopener noreferrer"&gt;Hetzner Cloud&lt;/a&gt;, automated with &lt;a href="https://github.com/features/actions" rel="noopener noreferrer"&gt;GitHub Actions&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>performance</category>
      <category>k6</category>
    </item>
    <item>
      <title>How We Built An Operations Support AI Agent for a Global Auto Industry Leader's Post Sales Software Department</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Thu, 05 Mar 2026 06:11:55 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/how-we-built-an-operations-support-ai-agent-for-a-global-auto-industry-leaders-post-sales-software-357m</link>
      <guid>https://forem.com/prathamudeshmukh/how-we-built-an-operations-support-ai-agent-for-a-global-auto-industry-leaders-post-sales-software-357m</guid>
      <description>&lt;p&gt;I was working with a global auto industry leader on their post-sales software platform. The platform had recently launched with seven modules. Each module was built as a microservice. &lt;br&gt;
Communication across services happened primarily through a message streaming broker. &lt;/p&gt;

&lt;p&gt;The data flow between services was non-trivial — upstream and downstream dependencies, bidirectional communication patterns, and conditional routing based on context.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Operational Reality
&lt;/h3&gt;

&lt;p&gt;When an end user raised a complaint, the support team had to perform initial root cause analysis before escalating to engineering.&lt;/p&gt;

&lt;p&gt;The system had too many moving parts for quick, intuition-based debugging. More importantly, the mental model of "how everything connects" was concentrated in one person on the support team.&lt;/p&gt;

&lt;p&gt;This wasn't a tooling problem.&lt;br&gt;
It was a knowledge distribution problem.&lt;/p&gt;

&lt;p&gt;The question became:&lt;/p&gt;

&lt;p&gt;Can we codify the debugging intuition of the most experienced support engineer — and make it usable by anyone?&lt;/p&gt;

&lt;p&gt;That's where the idea of an operations support AI agent emerged.&lt;/p&gt;

&lt;p&gt;But we were careful about one thing:&lt;br&gt;
The goal wasn't to make an agent that "knows everything."&lt;br&gt;
The goal was to make an agent grounded in the actual architecture of the system.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designing the Agent Backwards from Reality
&lt;/h2&gt;

&lt;p&gt;The complexity wasn't just the number of services.&lt;/p&gt;

&lt;p&gt;It was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inter-service communication patterns&lt;/li&gt;
&lt;li&gt;Conditional flows&lt;/li&gt;
&lt;li&gt;Bidirectional dependencies&lt;/li&gt;
&lt;li&gt;And multiple layers of state verification (UI, logs, database)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of jumping straight into prompt engineering, we started with context engineering.&lt;/p&gt;

&lt;p&gt;We asked:&lt;br&gt;
What does a strong human support engineer actually do when debugging?&lt;/p&gt;

&lt;p&gt;The answer was structured, even if it wasn't documented.&lt;br&gt;
And that structure became the foundation of the agent.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7g8x3twx4raba8ndg2h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7g8x3twx4raba8ndg2h.png" alt="Build phases" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 1: Reconstruct the System's Big-Picture Flow
&lt;/h3&gt;

&lt;p&gt;The services were distributed across multiple repositories (polyrepo structure). To understand interactions, we first had to bring everything into one workspace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we did:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Checked out all service repositories together.&lt;/li&gt;
&lt;li&gt;For each service, we prompted AI to generate upstream and downstream dependency diagrams based on message broker configurations found in the codebase.&lt;/li&gt;
&lt;li&gt;We generated these per service to avoid overloading the model.&lt;/li&gt;
&lt;li&gt;Once individual service documents were created, we asked AI to compile them into a single system-wide data flow diagram using Mermaid (text-based diagram generation).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result was a consolidated "big-picture" document.&lt;/p&gt;

&lt;p&gt;This became foundational context for the agent - not a theoretical architecture diagram, but something derived from the actual codebase configurations.&lt;/p&gt;

&lt;p&gt;It allowed the agent to reason about interaction points instead of guessing.&lt;/p&gt;
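&lt;p&gt;The diagrams themselves were plain Mermaid text, which keeps them diffable and regenerable. A tiny illustrative fragment (the service and topic names here are invented, not the client's):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
  OrderService -- "order.created" --&gt; BillingService
  BillingService -- "invoice.ready" --&gt; NotificationService
  NotificationService -- "delivery.failed" --&gt; OrderService
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;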




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxylium26qe95fnzgwy9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxylium26qe95fnzgwy9.png" alt="Sample Big Picture" width="800" height="863"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 2: Model How Humans Debug via the UI
&lt;/h3&gt;

&lt;p&gt;One of the most interesting observations was this:&lt;/p&gt;

&lt;p&gt;The most effective early debugging didn't start with logs.&lt;br&gt;
It started with the application UI.&lt;/p&gt;

&lt;p&gt;Support engineers used UI screens to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search for domain entities&lt;/li&gt;
&lt;li&gt;Inspect state&lt;/li&gt;
&lt;li&gt;Check timestamps&lt;/li&gt;
&lt;li&gt;Identify where a transaction stopped progressing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we needed the agent to replicate that behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we did:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompted AI to extract the list of UI screens from the micro-frontend system.&lt;/li&gt;
&lt;li&gt;Prompted separately for each UI module to maintain output quality.&lt;/li&gt;
&lt;li&gt;Generated a structured document listing:
&lt;ul&gt;
&lt;li&gt;UI screens&lt;/li&gt;
&lt;li&gt;Available search filters&lt;/li&gt;
&lt;li&gt;Displayed columns/data points&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then we created an index/router document that allowed the agent to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify which screens correspond to a domain entity&lt;/li&gt;
&lt;li&gt;Suggest navigation paths&lt;/li&gt;
&lt;li&gt;Recommend filters to apply&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This transformed the agent from a generic reasoning engine into something application-aware.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g7wqe15mqb7tjepuhdn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g7wqe15mqb7tjepuhdn.png" alt="Sample System UI models" width="800" height="1527"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 3: Enable Database-Level Reasoning with ER Context
&lt;/h3&gt;

&lt;p&gt;When UI-level validation wasn't enough, the fallback was querying the database.&lt;/p&gt;

&lt;p&gt;But meaningful DB debugging requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding entity relationships&lt;/li&gt;
&lt;li&gt;Knowing which fields exist&lt;/li&gt;
&lt;li&gt;Writing contextually valid queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generated ER diagrams for each backend service&lt;/li&gt;
&lt;li&gt;Built a routing index so the agent could load the appropriate ER diagram based on the issue's domain context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Again, the pattern was the same:&lt;br&gt;
Keep context modular.&lt;br&gt;
Allow conditional loading.&lt;br&gt;
Avoid overwhelming the model.&lt;/p&gt;
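&lt;p&gt;The routing index itself can be as simple as a small mapping document. This shape is illustrative; the domain and file names are invented:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# er-diagrams-index.md
billing, invoices        -&gt; er/billing-service.md
work orders, scheduling  -&gt; er/workshop-service.md
notifications, delivery  -&gt; er/notification-service.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;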




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8abamhh4400qylek90pv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8abamhh4400qylek90pv.png" alt="Sample ER diagrams" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 4: Designing the Agent Interaction Model
&lt;/h3&gt;

&lt;p&gt;Only after building the context layer did we design the agent itself.&lt;/p&gt;

&lt;p&gt;We structured it intentionally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role:&lt;/strong&gt; The agent acts as an Expert Operations Support Engineer.&lt;br&gt;
&lt;strong&gt;Task:&lt;/strong&gt;&lt;br&gt;
For every issue:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Extract domain entity information from the problem description.&lt;/li&gt;
&lt;li&gt;Generate a checklist in a strict order:
&lt;ul&gt;
&lt;li&gt;Verify entity state via UI screens&lt;/li&gt;
&lt;li&gt;Verify entity flow in message broker logs (with topic names)&lt;/li&gt;
&lt;li&gt;Verify entity integrity via database queries&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This sequence mirrors how experienced support engineers approach triage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context References:&lt;/strong&gt; &lt;br&gt;
The agent explicitly refers to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;big-picture.md&lt;/li&gt;
&lt;li&gt;ui-screens-index.md&lt;/li&gt;
&lt;li&gt;er-diagrams-index.md&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Output Format&lt;/strong&gt;&lt;br&gt;
Every response includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem Understanding&lt;/li&gt;
&lt;li&gt;Overall Impact&lt;/li&gt;
&lt;li&gt;Checklist to Follow&lt;/li&gt;
&lt;li&gt;Interaction Points Identified&lt;/li&gt;
&lt;li&gt;Short Summary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output was designed to be executable.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskiprxeccr2ww1ol9qcl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskiprxeccr2ww1ol9qcl.png" alt="Sample output" width="800" height="820"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Achieved
&lt;/h2&gt;

&lt;p&gt;This resulted in a PoC custom agent capable of generating structured, domain-aware debugging checklists.&lt;/p&gt;

&lt;p&gt;More importantly:&lt;/p&gt;

&lt;p&gt;We converted implicit operational knowledge into explicit, structured artifacts.&lt;/p&gt;

&lt;p&gt;The support workflow was no longer dependent on a single individual's system intuition.&lt;/p&gt;




&lt;h2&gt;
  
  
  Human Evaluation: A Necessary Constraint
&lt;/h2&gt;

&lt;p&gt;We did not treat the agent as authoritative.&lt;/p&gt;

&lt;p&gt;Every generated checklist was reviewed by the existing support engineers who already performed these tasks manually.&lt;/p&gt;

&lt;p&gt;The next phase was clear:&lt;br&gt;
Test it with individuals who had minimal context of the system.&lt;/p&gt;

&lt;p&gt;Iteration was always part of the plan.&lt;/p&gt;

&lt;p&gt;An operations agent like this should be critiqued continuously.&lt;br&gt;
Its usefulness depends entirely on how rigorously it is refined.&lt;/p&gt;




&lt;h3&gt;
  
  
  Where This Can Go
&lt;/h3&gt;

&lt;p&gt;Once the checklist is grounded in real architecture, each phase becomes automatable.&lt;/p&gt;

&lt;p&gt;Future possibilities we identified:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatically updating UI and ER documents on PR merges&lt;/li&gt;
&lt;li&gt;Garbage collection of outdated context&lt;/li&gt;
&lt;li&gt;Triggering the agent via MCP when a P1 ticket is raised&lt;/li&gt;
&lt;li&gt;Attaching generated checklists directly to support tickets&lt;/li&gt;
&lt;li&gt;Providing read-only search capabilities for domain entities&lt;/li&gt;
&lt;li&gt;Integrating log keyword searches and adapting based on results&lt;/li&gt;
&lt;li&gt;Even raising bug tickets automatically if conditions are met&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The autonomy doesn't need to jump to full resolution.&lt;/p&gt;

&lt;p&gt;It can increase incrementally, phase by phase, based on trust and accuracy.&lt;/p&gt;




&lt;h3&gt;
  
  
  Reflection
&lt;/h3&gt;

&lt;p&gt;The hardest part of AI in operations isn't reasoning.&lt;br&gt;
It's grounding.&lt;/p&gt;

&lt;p&gt;Once the system's architecture, UI workflows, and data relationships were codified into structured context, the agent's job became deterministic.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
    </item>
    <item>
      <title>Stop Wasting Context</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Wed, 25 Feb 2026 12:16:58 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/stop-wasting-context-34b3</link>
      <guid>https://forem.com/prathamudeshmukh/stop-wasting-context-34b3</guid>
      <description>&lt;p&gt;&lt;strong&gt;OpenAI says "Context is a scarce resource."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Treat it like one.&lt;/p&gt;

&lt;p&gt;A giant instruction file feels safe. It feels thorough. But in reality, it crowds out the actual task, the code, and the relevant constraints.&lt;/p&gt;

&lt;p&gt;The agent doesn't get smarter with more text.&lt;br&gt;
It just gets distracted.&lt;/p&gt;

&lt;p&gt;It either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Misses the real constraint buried in noise&lt;/li&gt;
&lt;li&gt;Starts optimizing for the wrong objective&lt;/li&gt;
&lt;li&gt;Or worse, overfits to instructions that don't matter right now&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The right mental model is to think of context like RAM in a running system.&lt;br&gt;
RAM is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Finite&lt;/li&gt;
&lt;li&gt;Expensive&lt;/li&gt;
&lt;li&gt;Meant for what's actively being processed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't load your entire hard drive into memory just because it might be useful.&lt;/p&gt;

&lt;p&gt;Same with LLM context.&lt;/p&gt;

&lt;p&gt;So what would you do to optimize RAM?&lt;br&gt;
Do the same for context.&lt;/p&gt;
&lt;h3&gt;
  
  
  Garbage Collect Aggressively
&lt;/h3&gt;

&lt;p&gt;Remove:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Old decisions that no longer apply&lt;/li&gt;
&lt;li&gt;Duplicated instructions&lt;/li&gt;
&lt;li&gt;Outdated constraints&lt;/li&gt;
&lt;li&gt;"Nice-to-know" explanations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If it's not needed for this task, it shouldn't be in memory.&lt;/p&gt;
&lt;h3&gt;
  
  
  Load on Demand (Lazy Loading)
&lt;/h3&gt;

&lt;p&gt;Don't preload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All coding standards&lt;/li&gt;
&lt;li&gt;All architecture docs&lt;/li&gt;
&lt;li&gt;All squad rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inject only what's relevant to the current step&lt;/li&gt;
&lt;li&gt;Use smaller scoped agents&lt;/li&gt;
&lt;li&gt;Pull specific docs when needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Context should be dynamic, not monolithic.&lt;/p&gt;
&lt;h3&gt;
  
  
  Compress, Don't Copy
&lt;/h3&gt;

&lt;p&gt;Replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long paragraphs&lt;/li&gt;
&lt;li&gt;Repeated policy text&lt;/li&gt;
&lt;li&gt;Verbose explanations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bullet summaries&lt;/li&gt;
&lt;li&gt;Structured rules&lt;/li&gt;
&lt;li&gt;Canonical references&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't duplicate libraries in RAM — you reference them.&lt;/p&gt;
&lt;h3&gt;
  
  
  Modularize Instructions
&lt;/h3&gt;

&lt;p&gt;Instead of one giant instruction file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- core-standards.md
- frontend-guidelines.md
- backend-guidelines.md
- architecture-principles.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Load only what the current task touches.&lt;br&gt;
Context should be composable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Separate Long-Term vs Working Memory
&lt;/h3&gt;

&lt;p&gt;Some things are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable principles (coding philosophy, architectural values)&lt;/li&gt;
&lt;li&gt;Temporary task constraints (fix this bug, implement this endpoint)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't mix them.&lt;/p&gt;

&lt;p&gt;Keep:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable principles lean and abstract&lt;/li&gt;
&lt;li&gt;Task context precise and scoped&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Avoid Over-Specification
&lt;/h3&gt;

&lt;p&gt;The more constraints you add, the more the model optimizes for instruction compliance&lt;br&gt;
and the less it reasons about the problem. High signal beats high volume.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimize for Relevance, Not Completeness
&lt;/h3&gt;

&lt;p&gt;You don't win by giving the model everything.&lt;br&gt;
You win by giving it exactly what it needs to think clearly.&lt;/p&gt;

&lt;p&gt;The goal isn't:&lt;br&gt;
"Did I include all the instructions?"&lt;/p&gt;

&lt;p&gt;The goal is:&lt;br&gt;
"Did I include the right instructions?"&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Take
&lt;/h3&gt;

&lt;p&gt;Large context != better output.&lt;br&gt;
Relevant context = better reasoning.&lt;/p&gt;

&lt;p&gt;Treat context like RAM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep it lean&lt;/li&gt;
&lt;li&gt;Keep it current&lt;/li&gt;
&lt;li&gt;Load intentionally&lt;/li&gt;
&lt;li&gt;Evict aggressively&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Systems that manage memory well perform better.&lt;br&gt;
Agents are no different.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>memory</category>
    </item>
    <item>
      <title>Convert Any HTML to a Branded PDF Using Templify's API</title>
      <dc:creator>Prathamesh Deshmukh</dc:creator>
      <pubDate>Tue, 11 Nov 2025 07:44:15 +0000</pubDate>
      <link>https://forem.com/prathamudeshmukh/convert-any-html-to-a-branded-pdf-using-templifys-api-i03</link>
      <guid>https://forem.com/prathamudeshmukh/convert-any-html-to-a-branded-pdf-using-templifys-api-i03</guid>
      <description>&lt;h2&gt;
  
  
  Convert Any HTML to a Branded PDF Using Templify's API
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Stop wrestling with Puppeteer and CSS quirks. With &lt;strong&gt;&lt;a href="https://templify.cloud" rel="noopener noreferrer"&gt;Templify&lt;/a&gt;&lt;/strong&gt;, you can turn any HTML (with your brand styling) into a high-quality PDF in a few lines of code - all via a developer-friendly API.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why This Problem Exists
&lt;/h2&gt;

&lt;p&gt;If you've ever tried to generate PDFs programmatically, you've probably faced one of these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fonts or images breaking between pages
&lt;/li&gt;
&lt;li&gt;Headless browser setup nightmares
&lt;/li&gt;
&lt;li&gt;Manual branding updates that break existing layouts
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Templify was built to end that pain - combining a &lt;strong&gt;powerful rendering engine&lt;/strong&gt; with a &lt;strong&gt;no-code template editor&lt;/strong&gt; and &lt;strong&gt;clean REST API&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Is Templify?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Templify&lt;/strong&gt; is an API-first + visual platform to convert HTML into PDF.&lt;/p&gt;

&lt;p&gt;You can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload, edit or create new templates visually (using GrapesJS Studio)&lt;/li&gt;
&lt;li&gt;Inject dynamic data via variables and JSON&lt;/li&gt;
&lt;li&gt;Render high-fidelity PDFs through a simple API call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as &lt;em&gt;“Puppeteer meets Canva meets API.”&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;At its core, Templify uses a simple POST request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST https://api.templify.cloud/convert
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With your HTML and optional data payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--location&lt;/span&gt; https://api.templify.cloud/convert/YOUR_TEMPLATE_ID_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;client_id: USER_ID_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;client_secret: CLIENT_SECRET_HERE&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;Content-Type: application/json&lt;span class="s1"&gt;' \
--header '&lt;/span&gt;Cookie: &lt;span class="nv"&gt;NEXT_LOCALE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;en&lt;span class="s1"&gt;' \
--data '&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"templateData"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
         &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"John Doe"&lt;/span&gt;,
         &lt;span class="s2"&gt;"invoice_number"&lt;/span&gt;: &lt;span class="s2"&gt;"INV-1001"&lt;/span&gt;,
         &lt;span class="s2"&gt;"items"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
           &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"description"&lt;/span&gt;: &lt;span class="s2"&gt;"Item 1"&lt;/span&gt;, &lt;span class="s2"&gt;"price"&lt;/span&gt;: 20 &lt;span class="o"&gt;}&lt;/span&gt;,
           &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"description"&lt;/span&gt;: &lt;span class="s2"&gt;"Item 2"&lt;/span&gt;, &lt;span class="s2"&gt;"price"&lt;/span&gt;: 30 &lt;span class="o"&gt;}&lt;/span&gt;
         &lt;span class="o"&gt;]&lt;/span&gt;
       &lt;span class="o"&gt;}&lt;/span&gt;
     &lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it - your template instantly becomes a branded PDF.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add Branding, Fonts, and Styles
&lt;/h3&gt;

&lt;p&gt;Since Templify allows you to create templates with raw HTML + CSS, you can easily embed your brand identity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;style&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;body&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;font-family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;'Inter'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sans-serif&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#222&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nc"&gt;.header&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#2b2bff&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;white&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"header"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;{{company}} Invoice&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Customer: {{name}}&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When rendered, your output PDF retains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fonts and brand colors&lt;/li&gt;
&lt;li&gt;Page margins and layouts&lt;/li&gt;
&lt;li&gt;Images, logos, and even charts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No Chrome setup. No CSS hacks. Just clean output every time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bonus: Use Pre-Designed Templates
&lt;/h3&gt;

&lt;p&gt;If you don't want to write HTML manually, log into the Templify Dashboard, create a new template visually, and reference its ID in your API call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST https://api.templify.cloud/convert/TEMPLATE_ID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Templify will inject your dynamic data into that template and return the branded PDF instantly.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>api</category>
      <category>dynamic</category>
      <category>node</category>
    </item>
  </channel>
</rss>
