<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Rosemary Wang</title>
    <description>The latest articles on Forem by Rosemary Wang (@joatmon08).</description>
    <link>https://forem.com/joatmon08</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F448000%2Fa492540f-e7cd-4d45-81f4-1db01474677b.jpg</url>
      <title>Forem: Rosemary Wang</title>
      <link>https://forem.com/joatmon08</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/joatmon08"/>
    <language>en</language>
    <item>
      <title>Agent2Agent Protocol, IBM Vault, &amp; OAuth 2.0 On-Behalf-Of</title>
      <dc:creator>Rosemary Wang</dc:creator>
      <pubDate>Thu, 23 Apr 2026 19:54:13 +0000</pubDate>
      <link>https://forem.com/joatmon08/agent2agent-protocol-ibm-vault-oauth-20-on-behalf-of-1hba</link>
      <guid>https://forem.com/joatmon08/agent2agent-protocol-ibm-vault-oauth-20-on-behalf-of-1hba</guid>
      <description>&lt;p&gt;I wrote a blog on using &lt;a href="https://hashicorpengineering.substack.com/p/a2a-vault-oidc" rel="noopener noreferrer"&gt;AI agent authorization with Agent2Agent protocol and IBM Vault&lt;/a&gt; that focused on setting up Vault as an OIDC provider to authenticate and authorize requests from an Agent2Agent client to a server. While it works, the post missed something rather critical: identity delegation. Basically, if I am an end user, I want to delegate my Agent2Agent (A2A) client to act on my behalf to access the A2A server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08qpno3nwdacx6vum227.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08qpno3nwdacx6vum227.png" alt="OAuth 2.0 Token Exchange with Vault" width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It turns out a number of folks in the agent identity space (&lt;a href="https://learn.microsoft.com/en-us/entra/agent-id/agent-identities" rel="noopener noreferrer"&gt;Microsoft Entra Agent ID&lt;/a&gt;, &lt;a href="https://blog.christianposta.com/explaining-on-behalf-of-for-ai-agents/" rel="noopener noreferrer"&gt;Christian Posta&lt;/a&gt;) have been exploring and implementing &lt;a href="https://www.rfc-editor.org/rfc/rfc8693.html" rel="noopener noreferrer"&gt;RFC 8693: OAuth 2.0 Token Exchange&lt;/a&gt; as a way of facilitating and tracking identity delegation. At the time of this post, Vault did not have a secrets engine that implemented this specification, so I built one as a proof of concept for my own education. I ended up creating a Security Token Service (STS) with a custom Vault secrets engine that implements RFC 8693.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: get a subject token from Vault as OIDC provider
&lt;/h2&gt;

&lt;p&gt;The general workflow for delegating identity required me to read the specification a few times. First, the end user authenticates to the client agent using an OIDC provider to get a subject token. &lt;/p&gt;

&lt;p&gt;I set up Vault as an OIDC provider to support a &lt;code&gt;may-act&lt;/code&gt; OIDC scope. This scope attaches a &lt;code&gt;may_act&lt;/code&gt; claim to the &lt;code&gt;id_token&lt;/code&gt; with a list of client agents allowed to act on the user's behalf.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;may_act_scope_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"may-act"&lt;/span&gt;
  &lt;span class="nx"&gt;may_act_claim&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;info&lt;/span&gt; &lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;client_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sub&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vault_identity_entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;}])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"vault_identity_oidc_scope"&lt;/span&gt; &lt;span class="s2"&gt;"may_act"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;may_act_scope_name&lt;/span&gt;
  &lt;span class="nx"&gt;template&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOT&lt;/span&gt;&lt;span class="sh"&gt;
{
  "client_id": "${vault_identity_oidc_client.agent.client_id}",
  "may_act": ${local.may_act_claim}
}
&lt;/span&gt;&lt;span class="no"&gt;EOT
&lt;/span&gt;  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"May act claim that includes what agents can act on behalf of user"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;client_id&lt;/code&gt; and the &lt;code&gt;sub&lt;/code&gt; in the &lt;code&gt;may_act&lt;/code&gt; claim refer to the client agent that requests delegated access, not the end user. The combination of &lt;code&gt;client_id&lt;/code&gt; and &lt;code&gt;sub&lt;/code&gt; enables the custom Vault secrets engine to check that the client agent's Vault role (&lt;code&gt;client_id&lt;/code&gt;) and entity ID (&lt;code&gt;sub&lt;/code&gt;) may act on behalf of the end user. I decided both needed to be checked because Vault assigns a new entity to each role of every authentication method.&lt;/p&gt;
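&lt;p&gt;As an illustrative sketch (not the plugin's actual code), the check described above amounts to requiring that both fields of the actor's identity appear together in a single &lt;code&gt;may_act&lt;/code&gt; entry:&lt;/p&gt;

```python
# Sketch of the may_act check: the actor's client_id AND sub must both
# match the same entry in the subject token's may_act claim.

def actor_may_act(subject_claims, actor_claims):
    """Return True if the actor identity appears in one may_act entry."""
    for entry in subject_claims.get("may_act", []):
        if (entry.get("client_id") == actor_claims.get("client_id")
                and entry.get("sub") == actor_claims.get("sub")):
            return True
    return False

# Values taken from the example tokens in this post.
subject = {
    "sub": "50099deb-d0cf-911b-4310-64a173c542a6",
    "may_act": [{"client_id": "test-client",
                 "sub": "83b1d088-c7d5-b8a4-dd7b-99baca521f8d"}],
}
actor = {"client_id": "test-client",
         "sub": "83b1d088-c7d5-b8a4-dd7b-99baca521f8d"}
print(actor_may_act(subject, actor))  # True
```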

&lt;p&gt;With the correct scope in the OIDC request, Vault returns a subject token with a set of claims allowing certain entities to act on behalf of the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"at_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gBJNAqZ6z7Yz7UG-z69Leg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sy0uWliApQrPApxpLp7gYVD0wAjvQNse"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"c_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PLTNZjVHMxhDWOLIeZ_sQA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sy0uWliApQrPApxpLp7gYVD0wAjvQNse"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776796482&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776792882&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$VAULT_ADDR/v1/identity/oidc/provider/agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"may_act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"83b1d088-c7d5-b8a4-dd7b-99baca521f8d"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"namespace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"root"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"50099deb-d0cf-911b-4310-64a173c542a6"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that if you have multiple entities that can act on behalf of a user, you'd need to create different scopes for each. As long as the OIDC provider supports the various scopes with different &lt;code&gt;may_act&lt;/code&gt; claims, your end user can adjust which entities may act on their behalf.&lt;/p&gt;

&lt;p&gt;Beyond defining the scope, I set a few other configurations for Vault as an OIDC provider. The full code example is located on &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/terraform/vault/oidc.tf" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. You can use another identity provider as well, as long as it issues a subject token with the &lt;code&gt;may_act&lt;/code&gt; claim.&lt;/p&gt;
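&lt;p&gt;To make the scope concrete, here is a sketch of the authorization request that would yield the subject token. The &lt;code&gt;client_id&lt;/code&gt; comes from the example token above; the &lt;code&gt;redirect_uri&lt;/code&gt;, &lt;code&gt;state&lt;/code&gt;, and &lt;code&gt;nonce&lt;/code&gt; values are placeholders:&lt;/p&gt;

```python
# Sketch: build the authorization URL for Vault's OIDC provider,
# requesting the custom may-act scope alongside the standard openid scope.
from urllib.parse import urlencode

VAULT_ADDR = "https://vault.example.com:8200"  # placeholder address

params = {
    "response_type": "code",
    "client_id": "Sy0uWliApQrPApxpLp7gYVD0wAjvQNse",
    "redirect_uri": "https://localhost:8250/oidc/callback",  # placeholder
    "scope": "openid may-act",  # may-act attaches the may_act claim
    "state": "af0ifjsldkj",     # placeholder
    "nonce": "n-0S6_WzA2Mj",    # placeholder
}
authorize_url = (VAULT_ADDR
                 + "/v1/identity/oidc/provider/agent/authorize?"
                 + urlencode(params))
print(authorize_url)
```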

&lt;h2&gt;
  
  
  Step 2: get an actor token from Vault's identity secrets engine
&lt;/h2&gt;

&lt;p&gt;Next, the client agent needs to request an actor token with a &lt;code&gt;client_id&lt;/code&gt; and &lt;code&gt;sub&lt;/code&gt; identifying the agent. I set up the Vault &lt;a href="https://developer.hashicorp.com/vault/docs/secrets/identity" rel="noopener noreferrer"&gt;identity secrets engine&lt;/a&gt; to generate a JWT with the required claims.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776881849&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776795449&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$VAULT_ADDR/v1/identity/oidc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"namespace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"root"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"helloworld:read"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"83b1d088-c7d5-b8a4-dd7b-99baca521f8d"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The identity secrets engine requires the token request to be tied to an entity, which makes it ideal for generating the actor token. The entity ID identifies the authentication method and role making the request. For example, the &lt;code&gt;sub&lt;/code&gt; claim contains an entity ID tied to several authentication methods and roles, including the Kubernetes and AppRole auth methods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vault &lt;span class="nb"&gt;read &lt;/span&gt;identity/entity/id/83b1d088-c7d5-b8a4-dd7b-99baca521f8d

Key                    Value
&lt;span class="nt"&gt;---&lt;/span&gt;                    &lt;span class="nt"&gt;-----&lt;/span&gt;
aliases                &lt;span class="o"&gt;[&lt;/span&gt;map[canonical_id:83b1d088-c7d5-b8a4-dd7b-99baca521f8d creation_time:2026-04-20T16:50:33.077473683Z custom_metadata:&amp;lt;nil&amp;gt; &lt;span class="nb"&gt;id&lt;/span&gt;:87db0c7d-032b-ad5c-c3fb-d9faee1686f7 last_update_time:2026-04-20T17:54:26.863503728Z &lt;span class="nb"&gt;local&lt;/span&gt;:false merged_from_canonical_ids:&amp;lt;nil&amp;gt; metadata:map[service_account_name:test-client service_account_namespace:default service_account_secret_name: service_account_uid:2505bc80-5765-4f18-9f60-b4877d860350] mount_accessor:auth_kubernetes_6cb5b3d7 mount_path:auth/kubernetes/ mount_type:kubernetes name:2505bc80-5765-4f18-9f60-b4877d860350] map[canonical_id:83b1d088-c7d5-b8a4-dd7b-99baca521f8d creation_time:2026-04-21T14:21:23.194441388Z custom_metadata:map[] &lt;span class="nb"&gt;id&lt;/span&gt;:8a3cd010-1a3e-1918-1459-f873767c8a46 last_update_time:2026-04-21T14:21:23.194441388Z &lt;span class="nb"&gt;local&lt;/span&gt;:false merged_from_canonical_ids:&amp;lt;nil&amp;gt; metadata:&amp;lt;nil&amp;gt; mount_accessor:auth_approle_7135b542 mount_path:auth/approle/ mount_type:approle name:test-client]]
name                   test-client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After some research and testing, it seems that the &lt;code&gt;scope&lt;/code&gt; claim does not matter so much for the actor token. For the full configuration to set up the identity secrets engine, review the &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/terraform/vault/identity-actor-token.tf" rel="noopener noreferrer"&gt;example code&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: request a delegated access token from Vault
&lt;/h2&gt;

&lt;p&gt;At this point, I realized I needed to create a &lt;a href="https://developer.hashicorp.com/vault/tutorials/custom-secrets-engine" rel="noopener noreferrer"&gt;custom secrets engine&lt;/a&gt; in Vault to support token exchange. I won't go into the specifics of developing the secrets engine, since most of it involved reading the spec and making sure it conformed to the right claims. The code for the plugin is on &lt;a href="https://github.com/joatmon08/vault-plugin-secrets-oauth-token-exchange" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This plugin is not officially supported; it is intended as a proof of concept, so use it with caution. Some important points:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The subject token's signature gets verified against a subject token's JWKS endpoint (OIDC provider).&lt;/li&gt;
&lt;li&gt;The actor token's signature gets verified against the actor token's JWKS endpoint (identity secrets engine).&lt;/li&gt;
&lt;li&gt;The request for the delegated access token from Vault includes parameters for &lt;code&gt;scope&lt;/code&gt; and &lt;code&gt;aud&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
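&lt;p&gt;For reference, here is a minimal sketch of the RFC 8693 exchange request the client agent sends to the secrets engine. The token values are placeholders:&lt;/p&gt;

```python
# Sketch: form-encode the RFC 8693 token-exchange parameters.
from urllib.parse import urlencode

def build_token_exchange_request(subject_token, actor_token, audience, scope):
    """Build the form body for an RFC 8693 token-exchange request."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
        "actor_token": actor_token,
        "actor_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "audience": audience,
        "scope": scope,
    })

# Placeholder JWTs; audience and scope match this post's example.
body = build_token_exchange_request("eyJ-subject-token", "eyJ-actor-token",
                                    "helloworld-server", "helloworld:read")
```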

&lt;p&gt;This ensures the authenticity and integrity of the claims while keeping the implementation Vault-agnostic. While I could introspect the actor token directly against the identity secrets engine, I decided that a public JWKS endpoint was a better approach so I didn't have to pass a Vault token to the secrets engine.&lt;/p&gt;

&lt;p&gt;After validating and verifying the subject and actor tokens, the custom secrets engine generates an access token with an &lt;code&gt;act&lt;/code&gt; claim. The &lt;code&gt;act&lt;/code&gt; claim identifies the actor who requested access on behalf of the end user. The custom secrets engine appends &lt;code&gt;scope&lt;/code&gt; to audit the scope requested by each actor.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"helloworld:read"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"83b1d088-c7d5-b8a4-dd7b-99baca521f8d"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"helloworld-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776796510&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1776792910&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$VAULT_ADDR/v1/sts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"helloworld:read"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"50099deb-d0cf-911b-4310-64a173c542a6"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a second client agent requests access on behalf of the first client agent, the secrets engine generates an access token with a nested &lt;code&gt;act&lt;/code&gt; claim to denote the delegation chain. Use the delegated access token issued to the first client agent as the subject token for the second exchange. You also need an actor token for the second agent, since the second agent acts on behalf of the first client agent, which acts on behalf of the end user (confusing, I know). Based on RFC 8693, the custom secrets engine only evaluates the top-level actor against the &lt;code&gt;may_act&lt;/code&gt; claim. Nested &lt;code&gt;act&lt;/code&gt; claims are for audit purposes.&lt;/p&gt;
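&lt;p&gt;A small sketch of how a consumer might walk that nested &lt;code&gt;act&lt;/code&gt; claim to recover the delegation chain, outermost (current) actor first. This is illustrative and not part of the plugin:&lt;/p&gt;

```python
# Sketch: traverse nested act claims; each nested act identifies the
# prior actor in the delegation chain, per RFC 8693.

def delegation_chain(claims):
    """Return actor subs from a (possibly nested) act claim."""
    chain = []
    act = claims.get("act")
    while act:
        chain.append(act.get("sub"))
        act = act.get("act")  # nested act = prior actor
    return chain

# Hypothetical token: a second agent acting for test-client,
# which acts for the end user.
token = {"sub": "end-user",
         "act": {"sub": "second-agent", "act": {"sub": "test-client"}}}
print(delegation_chain(token))  # ['second-agent', 'test-client']
```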

&lt;p&gt;The custom secrets engine has to be registered with the Vault server. I won't dive too deeply into the registration workflow in this post. If you want to learn more, check out the &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/terraform/kubernetes/vault-plugin-loader.tf" rel="noopener noreferrer"&gt;Terraform configuration&lt;/a&gt; that downloads the plugin binaries to a &lt;code&gt;PersistentVolume&lt;/code&gt; on Kubernetes and the &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/scripts/vault-init.sh" rel="noopener noreferrer"&gt;script&lt;/a&gt; to register the binaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Update A2A agents
&lt;/h2&gt;

&lt;p&gt;The access token is what the client agent passes to the server agent. The server agent verifies the access token's signature against the custom secrets engine's JWKS endpoint, checks that the &lt;code&gt;aud&lt;/code&gt; claim matches the name of the server agent, and verifies that the issuer is the custom secrets engine.&lt;/p&gt;

&lt;p&gt;If the access token does not contain the correct &lt;code&gt;aud&lt;/code&gt; or the correct &lt;code&gt;scope&lt;/code&gt; claim, the server agent does not allow the client agent to access its skills. The &lt;a href="https://github.com/joatmon08/infrastructure-agent/tree/main/agents/helloworld" rel="noopener noreferrer"&gt;server agent&lt;/a&gt; does not have any direct dependencies on Vault. It uses the custom secrets engine's OpenID Connect configuration endpoint to get the JWKS endpoint for token verification.&lt;/p&gt;
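&lt;p&gt;The claim checks can be sketched as follows (signature verification against the JWKS endpoint is omitted here). The expected values come from this post's example tokens; the function name is illustrative:&lt;/p&gt;

```python
# Sketch: server-side claim checks after signature verification.
EXPECTED_AUD = "helloworld-server"
EXPECTED_SCOPE = "helloworld:read"
EXPECTED_ISSUER_SUFFIX = "/v1/sts"  # the custom secrets engine's mount

def authorize_request(claims):
    """Reject the request unless aud, scope, and issuer all match."""
    if claims.get("aud") != EXPECTED_AUD:
        return False
    if EXPECTED_SCOPE not in claims.get("scope", "").split():
        return False
    if not claims.get("iss", "").endswith(EXPECTED_ISSUER_SUFFIX):
        return False
    return True

claims = {"aud": "helloworld-server", "scope": "helloworld:read",
          "iss": "https://vault.example.com:8200/v1/sts"}
print(authorize_request(claims))  # True
```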

&lt;p&gt;The &lt;a href="https://github.com/joatmon08/infrastructure-agent/tree/main/agents/test-client" rel="noopener noreferrer"&gt;client agent&lt;/a&gt; does need access to Vault in order to get the subject and actor tokens. Rather than have the client agent access the Vault API directly, I used &lt;a href="https://developer.hashicorp.com/vault/docs/agent-and-proxy/agent" rel="noopener noreferrer"&gt;Vault Agent&lt;/a&gt; to retrieve the credentials needed to generate subject and actor tokens and write them to files for the client agent to use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;## omitted for clarity&lt;/span&gt;

    &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;test_client_name&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;annotations&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject"&lt;/span&gt;                              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"true"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/role"&lt;/span&gt;                                      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"test-client"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-token"&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"true"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-run-as-same-user"&lt;/span&gt;                    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"true"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/tls-skip-verify"&lt;/span&gt;                           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"true"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-secret-client_secrets.json"&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"identity/oidc/client/agent"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-template-client_secrets.json"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class="no"&gt;EOT&lt;/span&gt;&lt;span class="sh"&gt;
            {
            {{- with secret "identity/oidc/client/agent" }}
                "client_id": "{{ .Data.client_id }}",
                "client_secret": "{{ .Data.client_secret }}",
                "redirect_uris": {{ .Data.redirect_uris | toJSON }}
            {{- end }}
            }
&lt;/span&gt;&lt;span class="no"&gt;          EOT
&lt;/span&gt;          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-secret-oidc_provider.json"&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"identity/oidc/provider/agent/.well-known/openid-configuration"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-template-oidc_provider.json"&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class="no"&gt;EOT&lt;/span&gt;&lt;span class="sh"&gt;
            {
            {{- with secret "identity/oidc/provider/agent/.well-known/openid-configuration" }}
                "authorization_endpoint": "{{ .Data.authorization_endpoint }}",
                "issuer": "{{ .Data.issuer }}",
                "token_endpoint": "{{ .Data.token_endpoint }}",
                "userinfo_endpoint": "{{ .Data.userinfo_endpoint }}"
            {{- end }}
            }
&lt;/span&gt;&lt;span class="no"&gt;          EOT
&lt;/span&gt;          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-secret-actor_token"&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"identity/oidc/token/test-client"&lt;/span&gt;
          &lt;span class="s2"&gt;"vault.hashicorp.com/agent-inject-template-actor_token"&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;-&lt;/span&gt;&lt;span class="no"&gt;EOT&lt;/span&gt;&lt;span class="sh"&gt;
            {{- with secret "identity/oidc/token/test-client" -}}
            {{ .Data.token }}
            {{- end }}
&lt;/span&gt;&lt;span class="no"&gt;          EOT
&lt;/span&gt;        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that &lt;code&gt;test-client&lt;/code&gt; runs with a Kubernetes service account. I configured a Vault role for the Kubernetes auth method and an alias for the &lt;code&gt;test-client&lt;/code&gt; entity tied to the &lt;code&gt;test-client&lt;/code&gt; service account. This ensures that when the &lt;code&gt;test-client&lt;/code&gt; requests an actor token, it has an entity ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"vault_identity_entity_alias"&lt;/span&gt; &lt;span class="s2"&gt;"client_agents"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;kubernetes_service_account_v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;uid&lt;/span&gt;
  &lt;span class="nx"&gt;mount_accessor&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vault_auth_backend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;accessor&lt;/span&gt;
  &lt;span class="nx"&gt;canonical_id&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vault_identity_entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"vault_kubernetes_auth_backend_role"&lt;/span&gt; &lt;span class="s2"&gt;"client_agents"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;                         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client_agents&lt;/span&gt;
  &lt;span class="nx"&gt;backend&lt;/span&gt;                          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vault_auth_backend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;
  &lt;span class="nx"&gt;role_name&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;
  &lt;span class="nx"&gt;bound_service_account_names&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;bound_service_account_namespaces&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;k8s_namespace&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;token_ttl&lt;/span&gt;                        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;
  &lt;span class="nx"&gt;token_policies&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;vault_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actor_token&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;vault_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent_oidc_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;vault_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;oauth_exchange_token&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While you can write code in your A2A client agent to authenticate to Vault and get the credentials, I found it easier to use Vault Agent to write them to a file for the client agent to consume. When the credentials expire, Vault Agent will write new credentials to the file.&lt;/p&gt;
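
&lt;p&gt;As a rough sketch of what that looks like, a minimal Vault Agent configuration could use the Kubernetes auth method and a template stanza to render the actor token to a file. The role name, token endpoint, and file paths below are assumptions for illustration and would need to match your own deployment.&lt;/p&gt;

```hcl
# Hypothetical Vault Agent sketch. Vault Agent authenticates with the
# Kubernetes auth method, then renders the actor token to a file and
# re-renders it when the credential expires.
auto_auth {
  method "kubernetes" {
    mount_path = "auth/kubernetes"
    config = {
      role = "test-client" # assumed role name for the client agent
    }
  }
}

template {
  # assumed identity token role for the actor token
  contents    = "{{ with secret \"identity/oidc/token/test-client\" }}{{ .Data.token }}{{ end }}"
  destination = "/vault/secrets/actor-token"
}
```

The client agent then only needs to read the file, rather than implement a Vault authentication flow itself.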

&lt;h2&gt;
  
  
  End-to-end workflow
&lt;/h2&gt;

&lt;p&gt;To demonstrate the workflow, the &lt;code&gt;test-client&lt;/code&gt; includes a UI that has the end user log in and obtain the subject token from Vault as an OIDC provider.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn85w711k8nn56f2ih81t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn85w711k8nn56f2ih81t.png" alt="UI getting subject token from Vault as OIDC provider" width="800" height="544"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, the end user requests a delegated access token with a specific scope and subject to access the A2A server agent. The &lt;code&gt;test-client&lt;/code&gt; receives an access token from Vault's custom secrets engine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic4lne85lxtm5pdtncnx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic4lne85lxtm5pdtncnx.png" alt="UI getting delegated access token from Vault custom secrets engine" width="800" height="743"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;test-client&lt;/code&gt; agent uses the access token to successfully request a message from &lt;code&gt;helloworld-server&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fir3dvemzbern2rzv9nib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fir3dvemzbern2rzv9nib.png" alt=" " width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are two cases in which the client agent does not have sufficient permission to act on behalf of the end user when calling the server agent.&lt;/p&gt;

&lt;p&gt;First, if the client agent's actor token identity does not match the end user's subject token &lt;code&gt;may_act&lt;/code&gt; claim, the Vault custom secrets engine does not issue a delegated access token.&lt;/p&gt;
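
&lt;p&gt;For illustration, RFC 8693 defines the &lt;code&gt;may_act&lt;/code&gt; claim in the subject token to name the parties authorized to act on the subject's behalf. A subject token that permits the &lt;code&gt;test-client&lt;/code&gt; agent to act for the end user might carry a payload like the following sketch (the claim values are hypothetical):&lt;/p&gt;

```json
{
  "iss": "https://vault.example.com/v1/identity/oidc",
  "sub": "end-user",
  "aud": "helloworld-server",
  "may_act": {
    "sub": "test-client"
  }
}
```

If the actor token's identity does not match the &lt;code&gt;may_act.sub&lt;/code&gt; value, the exchange fails.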

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpmon3aob2omu8mxvywxp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpmon3aob2omu8mxvywxp.png" alt="Actor does not have permission to act on behalf of" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Second, if the &lt;code&gt;test-client&lt;/code&gt; uses an access token with insufficient scopes or with the wrong server agent as the subject, the server agent denies access.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9tf7yb0veekedish0pc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9tf7yb0veekedish0pc.png" alt="Client agent has incorrect scopes to access server agent" width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As a check, I reviewed the Vault audit logs to verify that it logged the end user's requests to the OIDC provider, the actor token requests from the client agent, and the delegated access token request from the client agent. The good news: it does! However, you have to tune the secrets engine to output the claims as non-HMAC keys. For example, I used the &lt;code&gt;vault secrets tune&lt;/code&gt; subcommand to make the audit entries easier to read.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vault secrets tune &lt;span class="nt"&gt;-audit-non-hmac-request-keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;scope &lt;span class="nt"&gt;-audit-non-hmac-request-keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;subject &lt;span class="nt"&gt;-audit-non-hmac-request-keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;audience sts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By configuring Vault as an OIDC provider, the identity secrets engine for the actor token, and a custom token exchange secrets engine for delegation, you can audit and enforce at least some agent-to-agent communication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Overall, this turned out to be far more challenging to implement than expected. It took quite a bit of reverse engineering the specification, reviewing the idea with other folks, arguing with my coding agent, and deploying Vault repeatedly.&lt;/p&gt;

&lt;p&gt;The custom secrets engine I created for token exchange arguably belongs in the identity secrets engine, since the two share the same general structure. For my purposes, I ended up developing it as a separate secrets engine so I wouldn't have to maintain a fork of the identity secrets engine plugin. I learned quite a bit about entity IDs and OAuth 2.0 in the process.&lt;/p&gt;

&lt;p&gt;I do see a few problems with the approach. An administrator has to configure &lt;code&gt;may_act&lt;/code&gt; claims for Vault entities and clients and assign (effectively) a role to every client agent. While this is something you can automate, I imagine it can get fairly complicated and challenging to maintain. It's also deterministic, which doesn't quite address the fact that agents are autonomous and might choose to act on others' behalf. As I am not comfortable letting an agent run amok with minimal supervision, I am fine with the administrative overhead.&lt;/p&gt;

&lt;p&gt;Another problem is where to enforce the scope of what the client agent can do with the server agent. This is probably where an AI gateway would help, especially as it can review the access tokens and identify what a client agent can do with an MCP server or server agent. At the very least, this workflow does enable some kind of authentication request tracking so you can audit if and when a client agent requested access to a server agent or MCP server. I'll try working on this another day, probably with &lt;a href="https://github.com/IBM/mcp-context-forge" rel="noopener noreferrer"&gt;ContextForge&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the meantime, if you're interested in how this works, check out the &lt;a href="https://github.com/joatmon08/infrastructure-agent" rel="noopener noreferrer"&gt;demo repository&lt;/a&gt;, which deploys a Kubernetes cluster and all of the components and configuration for Vault.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vault</category>
      <category>security</category>
      <category>identity</category>
    </item>
    <item>
      <title>Using AI for Terraform: flows, prompts, and agents with LangFlow &amp; Docling</title>
      <dc:creator>Rosemary Wang</dc:creator>
      <pubDate>Mon, 02 Feb 2026 19:38:35 +0000</pubDate>
      <link>https://forem.com/joatmon08/using-ai-for-terraform-flows-prompts-and-agents-with-langflow-docling-3e53</link>
      <guid>https://forem.com/joatmon08/using-ai-for-terraform-flows-prompts-and-agents-with-langflow-docling-3e53</guid>
      <description>&lt;p&gt;In &lt;a href="https://dev.to/joatmon08/using-ai-for-terraform-running-a-locally-with-langflow-opensearch-ollama-5co6"&gt;part 1&lt;/a&gt;, I learned to deploy a local stack for running an AI agent. Originally, I thought I could generate "good" Terraform configuration based on all my content for secure and scalable infrastructure as code. &lt;/p&gt;

&lt;p&gt;Shortly after, I received a message encouraging me to think about the second edition of my book. I have procrastinated on this for a while now since it takes time to revise material and generate new examples. One of the biggest challenges with my book was writing examples. Initial feedback suggested I write everything in Python for greater accessibility and Google Cloud for lower cost, which I did. In retrospect, I should have just written everything in Terraform to run on AWS. &lt;/p&gt;

&lt;p&gt;As I reflected on this further, I realized something important. Isn't book writing the perfect use case for an AI agent? If I had an agent that knew my writing style to help me write new examples in Terraform for my book, maybe I could expedite the process of creating a second edition.&lt;/p&gt;

&lt;p&gt;With my regrets in mind, I decided to try to create a "book writing" agent that helps me generate examples to match my text. After all, I had the chapters of the book written. I wanted new examples to reframe some of the principles and practices. This sent me on a major exploration of prompts, agent instructions, and flows in LangFlow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Process PDFs with Docling
&lt;/h2&gt;

&lt;p&gt;I had editable drafts of the chapters, but only the final PDF version of the book had the proper code annotations and figures. I needed to process the PDF book chapters into text and images before chunking and storing them in my OpenSearch vector database. Enter &lt;a href="https://www.docling.ai/" rel="noopener noreferrer"&gt;Docling&lt;/a&gt;, a document processing tool for unstructured data.&lt;/p&gt;

&lt;p&gt;Fortunately, LangFlow has a &lt;a href="https://docs.langflow.org/bundles-docling" rel="noopener noreferrer"&gt;Docling component&lt;/a&gt; for processing a set of files and chunking them. You do have to install it before you run LangFlow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv pip &lt;span class="nb"&gt;install &lt;/span&gt;langflow[docling]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you start creating a flow in LangFlow, drag-and-drop the Docling component.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faiovmnzprdwp8uw4auj1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faiovmnzprdwp8uw4auj1.png" alt="Docling component in LangFlow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are a few attributes you need to consider with Docling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pipeline - I opted for &lt;code&gt;standard&lt;/code&gt; just to process the text. If I wanted to also process the figures, I could select &lt;code&gt;vlm&lt;/code&gt; (&lt;a href="https://www.nvidia.com/en-us/glossary/vision-language-models/" rel="noopener noreferrer"&gt;Visual Language Model&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;OCR Engine - &lt;code&gt;None&lt;/code&gt; for now. I wanted to test if I had sufficient resources to run Docling.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My laptop constrains the amount of memory Docling can use to process the documents, which is why I did not upload the entire book or use VLM or OCR.&lt;/p&gt;

&lt;p&gt;Next, I needed to chunk the text before storing it in my vector database. Rather than do fixed-size chunking, I decided to try &lt;a href="https://www.ibm.com/think/architectures/rag-cookbook/chunking" rel="noopener noreferrer"&gt;hybrid chunking&lt;/a&gt;, which combines fixed-size chunking with semantic chunking. This ensures that the various chapters of my book have chunks with proper context. After chunking three chapters of my book, I stored the chunks in the OpenSearch vector database using Granite embeddings hosted by Ollama.&lt;/p&gt;

&lt;p&gt;Besides the PDF chapters of the book, I had a few blogs on best practices for writing Terraform. These had some text and examples that I wanted to include as part of the agent's response. Using the URL component, I added the set of URLs for the blog posts and passed it to the vector database.&lt;/p&gt;

&lt;p&gt;Now that I had my expert-level content on infrastructure as code and Terraform practices in my vector database, I could use an agent to reference that context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create the Terraform coding agent
&lt;/h2&gt;

&lt;p&gt;I started with what I thought was the easier agent to build - a coding agent that generates "good" Terraform. This agent needed to generate Terraform configuration with the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Proper resource and module declarations - no hallucinations please&lt;/li&gt;
&lt;li&gt;Correct formatting&lt;/li&gt;
&lt;li&gt;All variables and outputs defined&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition to writing proper working Terraform, I wanted the agent to include a good example. I had a &lt;a href="https://github.com/joatmon08/hashicorp-stack-demoapp" rel="noopener noreferrer"&gt;demo repository&lt;/a&gt; that I constantly copied and pasted into other repositories, so I wanted the agent to reference that configuration when the prompt matched.&lt;/p&gt;

&lt;p&gt;With all these requirements, I realized I needed to use two MCP servers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/hashicorp/terraform-mcp-server" rel="noopener noreferrer"&gt;Terraform MCP server&lt;/a&gt; for the latest resource and module documentation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/github/github-mcp-server" rel="noopener noreferrer"&gt;GitHub MCP server&lt;/a&gt; for getting files in my reference repository&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did not need all the tools available on these MCP servers. From an access control perspective, I used the following for each MCP server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terraform MCP server - &lt;code&gt;get_latest_module_version&lt;/code&gt;, &lt;code&gt;get_latest_provider_version&lt;/code&gt;, &lt;code&gt;get_module_details&lt;/code&gt;, &lt;code&gt;get_provider_details&lt;/code&gt;, &lt;code&gt;get_provider_capabilities&lt;/code&gt;, &lt;code&gt;search_modules&lt;/code&gt;, and &lt;code&gt;search_providers&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;GitHub MCP server - &lt;code&gt;get_file_contents&lt;/code&gt; and &lt;code&gt;search_code&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My agent could retrieve the latest Terraform modules and providers or search GitHub for reference code. I connected the GitHub and Terraform MCP servers with the URL component to my agent using LangFlow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4qgaaqu2dodylv6qsnr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4qgaaqu2dodylv6qsnr.png" alt="Terraform coding agent with MCP servers in LangFlow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, the coding agent needed instructions. I started with the official &lt;a href="https://github.com/hashicorp/agent-skills" rel="noopener noreferrer"&gt;agent skills for Terraform&lt;/a&gt; and refined the instructions to better suit Granite and my use case. It took quite a bit of trial and error. The full set of instructions is in a &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/langflow/EXAMPLE_AGENT.md" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. I had two main observations.&lt;/p&gt;

&lt;p&gt;First, I had to be &lt;em&gt;very&lt;/em&gt; specific about which repository and commit the agent should reference for a "good" Terraform example. It turns out the &lt;a href="https://github.com/langflow-ai/langflow/issues/8059" rel="noopener noreferrer"&gt;MCP component in LangFlow cannot handle optional parameters at the time of this post&lt;/a&gt;, so I had to put the exact commit hash and branch the agent should reference in the agent instructions.&lt;/p&gt;

&lt;p&gt;Second, the prompt had to include the specific module and resource I wanted the example to include (e.g., Create an example with the &lt;code&gt;aws_opensearchserverless_collection&lt;/code&gt; resource.) If I did not include the exact module or resource, Granite would search for the wrong module or resource with the Terraform MCP server. After adjusting my expectations on the prompts, I moved onto the book writing agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create the book writing agent
&lt;/h2&gt;

&lt;p&gt;Why did I separate the book writing task into its own agent? I discovered that combining both Terraform generation and an explanation based on context from OpenSearch led to very poor results. The agent was tasked with doing too much ("generate the explanation AND the example"), which led to some garbled responses that made little sense.&lt;/p&gt;

&lt;p&gt;I decided to split the book writing into its own agent so it could properly draft a response that sounds like I wrote the paragraph, rather than just reiterating the Terraform configuration. This worked much better overall. I also moved the book writing agent first, so it could generate the explanation and the Terraform coding agent could adjust the example.&lt;/p&gt;

&lt;p&gt;I connected the output of the writer agent to the input of the coding agent. This ensures that the explanation includes references to the expected resources and examples from user input.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat8807xhaqvdywh7dwz8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat8807xhaqvdywh7dwz8.png" alt="End-to-end agent workflow in LangFlow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The full set of instructions is in the demo &lt;a href="https://github.com/joatmon08/infrastructure-agent/blob/main/langflow/WRITER_AGENT.md" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. I wanted it to include the sources so I could check exactly where it found the information.&lt;/p&gt;

&lt;p&gt;With both agents, I am going to do more work to refine the prompts. They do a decent job of producing a semi-coherent explanation and example but I think I can improve them with more review.&lt;/p&gt;

&lt;h2&gt;
  
  
  The result
&lt;/h2&gt;

&lt;p&gt;I passed in a few prompts to test out my agents. While the results needed some edits, they turned out more usable than I expected. For example, I asked the following:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Explain the singleton pattern using the &lt;code&gt;aws_opensearchserverless_collection&lt;/code&gt; resource. Include ideas on when to refactor from a singleton to a composite module.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The agent returned a very lengthy explanation; I include a few excerpts below. The first paragraph wasn't completely incorrect, but I needed to remove the mention of Google project resources since I wanted to use AWS for the examples.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The &lt;strong&gt;singleton pattern&lt;/strong&gt; is commonly used in Terraform configurations to manage resources that should exist only once within an environment, such as Google project configurations. In the context of AWS OpenSearch Serverless collections, this pattern ensures that there is a single instance of the collection resource, which typically does not change frequently.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As for the example, it did use the resource I requested, but it did not use the right arguments. For example, &lt;code&gt;domain_id&lt;/code&gt; doesn't exist for the &lt;code&gt;aws_opensearchserverless_collection&lt;/code&gt; resource.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_opensearchserverless_collection"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;collection_name&lt;/span&gt;
  &lt;span class="nx"&gt;domain_id&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_opensearch_serverless_domain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Primary collection for ${var.environment}"&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;common_tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
      &lt;span class="nx"&gt;Project&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_name&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
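
&lt;p&gt;For comparison, a corrected version of the example might look like the following sketch. I'm assuming the &lt;code&gt;type&lt;/code&gt; argument here; check the AWS provider documentation for the current argument list.&lt;/p&gt;

```hcl
# Corrected sketch: aws_opensearchserverless_collection has no domain_id
# argument. The arguments below are a best-effort correction, not
# verified against the latest AWS provider release.
resource "aws_opensearchserverless_collection" "main" {
  name        = var.collection_name
  description = "Primary collection for ${var.environment}"
  type        = "VECTORSEARCH"

  tags = merge(
    local.common_tags,
    {
      Environment = var.environment
      Project     = var.project_name
    }
  )
}
```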



&lt;p&gt;When I asked the agent to explain when to refactor from a singleton to composite module, it provided a correct but very generic explanation. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Refactoring from a singleton pattern to a composite module becomes necessary when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Resource Evolution&lt;/strong&gt;: Resources evolve beyond single instances, requiring multiple configurations or variations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared Configurations&lt;/strong&gt;: Multiple resources need shared configurations that are not strictly unique but require common parameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex Workflows&lt;/strong&gt;: The infrastructure management involves complex workflows where different components interact and share state.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;I asked the book writing agent to apply the other principles when possible, which it did try in later paragraphs. In general, the agent provided a good start for me to edit and iterate on the explanation. I would not use the response as-is, since it has some incorrect points and the explanation is far too generic, but it does write in the style and tone of voice that I use in my book.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I learned quite a bit about prompt engineering while trying to build a book writing agent for myself. In some situations, I had to be very specific about how and where an agent should refer to certain tools or data sources. In some of the first iterations, the agent kept bringing up other principles like simplicity, which I do not mention in my book. I had to ask the agent for the source of the principle, which was another book entirely.&lt;/p&gt;

&lt;p&gt;In general, the agent did improve over time. The more I asked of it and provided feedback, the better the responses it generated. However, I still wouldn't use the responses in the book without some editing. I could use the examples, for the most part, but I had to check them for correctness and clarity.&lt;/p&gt;

&lt;p&gt;Next, I plan on moving these components off my local machine into some cloud infrastructure.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>terraform</category>
      <category>langflow</category>
    </item>
    <item>
      <title>Using AI for Terraform: running locally with Langflow, OpenSearch, &amp; Ollama</title>
      <dc:creator>Rosemary Wang</dc:creator>
      <pubDate>Tue, 13 Jan 2026 14:53:29 +0000</pubDate>
      <link>https://forem.com/joatmon08/using-ai-for-terraform-running-a-locally-with-langflow-opensearch-ollama-5co6</link>
      <guid>https://forem.com/joatmon08/using-ai-for-terraform-running-a-locally-with-langflow-opensearch-ollama-5co6</guid>
      <description>&lt;p&gt;I'm a pragmatist at heart. While I don't fully believe in using AI for everything, I did find myself getting very frustrated with my copy and paste process for "good" Terraform configuration. I already wrote Terraform configuration that ran with many resources and was mostly secure by default anyway. Why did I have to go back two or three years to an example and then update it? Could I really use AI to write some new demo code?&lt;/p&gt;

&lt;p&gt;I realized I had a lot of content I could reference and get myself out of the copy-paste whirlpool. Most of the time, I looked up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slides of old &lt;a href="https://joatmon08.github.io/03_speaking.html" rel="noopener noreferrer"&gt;talks&lt;/a&gt; with accurate diagrams&lt;/li&gt;
&lt;li&gt;Some old code from two or three specific repositories on &lt;a href="https://github.com/joatmon08" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;My &lt;a href="https://www.manning.com/books/infrastructure-as-code-patterns-and-practices" rel="noopener noreferrer"&gt;book&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Terraform &lt;a href="https://registry.terraform.io/browse/modules" rel="noopener noreferrer"&gt;modules&lt;/a&gt; in the registry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem? I know how to build infrastructure with Terraform but I know nothing about AI. So I decided to learn.&lt;/p&gt;

&lt;p&gt;When I started blogging and trying to learn technology for myself, I ran everything locally and avoided paying for resources. That meant using the free credits for most cloud or managed offerings and working within a resource-constrained system. For this series, I decided on the following tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; for models&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; for no-code/low-code agentic development&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opensearch.org/" rel="noopener noreferrer"&gt;OpenSearch&lt;/a&gt; for vector search&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.docling.ai/" rel="noopener noreferrer"&gt;Docling&lt;/a&gt; to process my PDF documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As for the model, I was willing to try some of the "open" models like &lt;a href="https://www.ibm.com/granite" rel="noopener noreferrer"&gt;Granite&lt;/a&gt; through Ollama. If they didn't work, I would try others.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building something to run models
&lt;/h2&gt;

&lt;p&gt;As a starting point, I ran everything in containers. If I needed more resources, I could move everything to a cloud deployment later. With a Docker Compose file, I deployed Ollama, Langflow, and OpenSearch.&lt;/p&gt;
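
&lt;p&gt;As a rough sketch, such a Compose file might look like the following. The image tags and ports are assumptions based on each project's defaults, not the exact file I used.&lt;/p&gt;

```yaml
# Hypothetical docker-compose.yml sketch for the local stack.
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"   # default Ollama API port

  langflow:
    image: langflowai/langflow:latest
    ports:
      - "7860:7860"     # default Langflow UI port
    depends_on:
      - ollama

  opensearch:
    image: opensearchproject/opensearch:latest
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"     # default OpenSearch API port
```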

&lt;p&gt;Ollama runs models on your local machine. Since I get impatient waiting for Ollama to start and pull the models, I built a Docker container with Ollama and pre-pulled models and embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; ollama/ollama:latest&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; ./init-ollama.sh /tmp/init-ollama.sh&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /tmp&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x init-ollama.sh &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; ./init-ollama.sh

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 11434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, I used &lt;code&gt;granite4:tiny-h&lt;/code&gt; since I am running it locally on my laptop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

ollama serve &amp;amp;
# wait for the Ollama server to accept connections before pulling models
until ollama list &gt; /dev/null 2&gt;&amp;amp;1; do sleep 1; done
ollama pull granite4:tiny-h
ollama pull granite-embedding:30m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deploying an agent toolchain
&lt;/h2&gt;

&lt;p&gt;I do not know how to write an AI agent. I also didn't feel like coding a whole agent toolchain just to write infrastructure for my purposes. Luckily, I found Langflow, which offers a no-code/low-code way to deploy AI agents and MCP servers. I created a Dockerfile for Langflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; langflowai/langflow:1.7.2&lt;/span&gt;

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; root&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; libgl1 libglib2.0-0 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv pip &lt;span class="nb"&gt;install &lt;/span&gt;langflow[docling]

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["python", "-m", "langflow", "run", "--host", "0.0.0.0", "--port", "7860"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initially, I used the stock Langflow image without a custom Dockerfile. Unfortunately, the Docling component I wanted to use for processing PDF chapters of my book needed additional dependencies installed. I built them into my own Langflow image so I didn't have to run the install separately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a RAG stack for context
&lt;/h2&gt;

&lt;p&gt;It turns out that retrieval augmented generation (RAG) is an important part of getting my use case working. I have context that I want my agent to use, so that information needs to be processed and stored in a vector database.&lt;/p&gt;

&lt;p&gt;I chose OpenSearch because I had deployed it before and could run it locally. Unfortunately, it turns out that using OpenSearch as a vector database for Docling requires some additional configuration. OpenSearch supposedly creates an index automatically if it doesn't already exist, but it's unclear whether auto-creation covers a k-NN vector index as well as a simple one. I kept getting errors from Langflow that the index did not exist.&lt;/p&gt;

&lt;p&gt;As a workaround, I reverse engineered the index and manually called the OpenSearch API to create an empty index. At this point, I was tired of writing scripts and resorted to asking &lt;a href="https://www.ibm.com/products/bob" rel="noopener noreferrer"&gt;Project Bob&lt;/a&gt;, an AI software agent, for help. I think I asked it to generate a Dockerfile for OpenSearch with a step to create a vector index named "langflow" with an "ef_search" of 512 and a property named "chunk_embedding" of "knn_vector" type with 384 dimensions. It gave me a pretty good script in response, complete with the proper API call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Wait for OpenSearch to be ready&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Waiting for OpenSearch to start..."&lt;/span&gt;
&lt;span class="k"&gt;until &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://localhost:9200/_cluster/health &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
   &lt;/span&gt;&lt;span class="nb"&gt;sleep &lt;/span&gt;2
&lt;span class="k"&gt;done

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"OpenSearch is ready. Creating 'langflow' index..."&lt;/span&gt;

&lt;span class="c"&gt;# Create the langflow index with vector search configuration&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; PUT &lt;span class="s2"&gt;"http://localhost:9200/langflow"&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="s1"&gt;'
{
 "settings": {
   "index": {
     "knn": true,
     "knn.algo_param.ef_search": 512
   }
 },
 "mappings": {
   "properties": {
     "chunk_embedding": {
       "type": "knn_vector",
       "dimension": 384
     }
   }
 }
}
'&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Index 'langflow' created successfully!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
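&lt;p&gt;With the index in place, a k-NN search against it looks roughly like this. The index name &lt;code&gt;langflow&lt;/code&gt; and field &lt;code&gt;chunk_embedding&lt;/code&gt; come from the script above; the all-zeros query vector is only a placeholder, since a real query would pass a 384-dimension embedding produced by the embedding model.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch of a k-NN search against the "langflow" index created above.
# The query vector must have 384 values to match the index dimension;
# here it is a placeholder of zeros rather than a real embedding.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Build a placeholder 384-dimension vector of zeros.
VECTOR=$(printf '0,%.0s' {1..383})0
QUERY='{"size": 3, "query": {"knn": {"chunk_embedding": {"vector": ['"$VECTOR"'], "k": 3}}}}'

if curl -s --max-time 2 "$OPENSEARCH_URL" > /dev/null; then
  curl -s -X POST "$OPENSEARCH_URL/langflow/_search" \
    -H 'Content-Type: application/json' -d "$QUERY"
else
  echo "OpenSearch is not reachable at $OPENSEARCH_URL"
fi
```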



&lt;p&gt;Next, Bob created a Dockerfile out of the script. Bob was a bit verbose, but the result did work with some modifications.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; opensearchproject/opensearch:3&lt;/span&gt;

&lt;span class="c"&gt;# Copy the initialization script&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; ./init-opensearch.sh /usr/share/opensearch/init-opensearch.sh&lt;/span&gt;

&lt;span class="c"&gt;# Create a wrapper script to run both OpenSearch and the init script&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'#!/bin/bash'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /usr/share/opensearch/entrypoint-wrapper.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'/usr/share/opensearch/opensearch-docker-entrypoint.sh opensearch &amp;amp;'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /usr/share/opensearch/entrypoint-wrapper.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'sleep 5'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /usr/share/opensearch/entrypoint-wrapper.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'/usr/share/opensearch/init-opensearch.sh'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /usr/share/opensearch/entrypoint-wrapper.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'wait'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /usr/share/opensearch/entrypoint-wrapper.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;   &lt;span class="nb"&gt;chmod&lt;/span&gt; +x /usr/share/opensearch/entrypoint-wrapper.sh

&lt;span class="c"&gt;# Use the wrapper as the entrypoint&lt;/span&gt;
&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["/usr/share/opensearch/entrypoint-wrapper.sh"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As I said before, I am pragmatic about my use of AI for coding. I didn't want to use something like Bob just to speed things up, but I got tired and thought, "Why not?" I think the Dockerfile and script were generated in about two minutes, compared to the hour it took me to write and test the Ollama one. The result was functional, but I wouldn't use AI to generate anything I didn't have the confidence to verify or test myself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;I created the set of containers using Docker Compose on my local machine, including the &lt;a href="https://developer.hashicorp.com/terraform/mcp-server" rel="noopener noreferrer"&gt;Terraform MCP server&lt;/a&gt;. By using the MCP server for the Terraform registry, I could search the public modules and providers available to expedite new examples and versions of modules I used before.&lt;/p&gt;
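&lt;p&gt;As a quick sanity check, the MCP handshake can also be exercised by hand. This sketch assumes the 8080 port mapping from the Compose file and the server's default &lt;code&gt;/mcp&lt;/code&gt; endpoint path for the streamable HTTP transport; in practice, an MCP client (such as a Langflow component) performs this initialize request for you.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: a JSON-RPC "initialize" request against the Terraform MCP server.
# Assumes the 8080 port mapping and the default /mcp endpoint path.
MCP_URL="${MCP_URL:-http://localhost:8080/mcp}"

INIT='{"jsonrpc": "2.0", "id": 1, "method": "initialize",
  "params": {"protocolVersion": "2025-03-26", "capabilities": {},
             "clientInfo": {"name": "smoke-test", "version": "0.0.1"}}}'

if curl -s --max-time 2 -o /dev/null "$MCP_URL"; then
  curl -s -X POST "$MCP_URL" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    -d "$INIT"
else
  echo "MCP server is not reachable at $MCP_URL"
fi
```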

&lt;p&gt;Each of the containers includes a set of environment variables to enable it to run locally. Some variables, like those for OpenSearch, disable the security plugin and skip the demo configuration for ease of use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

 &lt;span class="na"&gt;terraform-mcp-server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
   &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hashicorp/terraform-mcp-server:0.3.3&lt;/span&gt;
   &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform-mcp-server&lt;/span&gt;
   &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
   &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TRANSPORT_MODE=streamable-http'&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TRANSPORT_HOST=0.0.0.0'&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TFE_TOKEN=${TFE_TOKEN}'&lt;/span&gt;

 &lt;span class="na"&gt;ollama&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
   &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfiles&lt;/span&gt;
     &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile.ollama&lt;/span&gt;
   &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
   &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;11434:11434"&lt;/span&gt;
   &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ollama_data:/root/.ollama&lt;/span&gt;
   &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
   &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_CONTEXT_LENGTH=131072'&lt;/span&gt;

 &lt;span class="na"&gt;langflow&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
   &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfiles&lt;/span&gt;
     &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile.langflow&lt;/span&gt;
   &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;langflow&lt;/span&gt;
   &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;7860:7860"&lt;/span&gt;
   &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LANGFLOW_HOST=0.0.0.0'&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LANGFLOW_OPEN_BROWSER=false'&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LANGFLOW_WORKER_TIMEOUT=1800'&lt;/span&gt;
   &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;langflow_data:/app/langflow&lt;/span&gt;
   &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
   &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

 &lt;span class="na"&gt;opensearch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
   &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfiles&lt;/span&gt;
     &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile.opensearch&lt;/span&gt;
   &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;opensearch&lt;/span&gt;
   &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cluster.name=opensearch-cluster&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;node.name=opensearch-node1&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;discovery.type=single-node&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;bootstrap.memory_lock=true&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENSEARCH_JAVA_OPTS=-Xms512m&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-Xmx512m"&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DISABLE_INSTALL_DEMO_CONFIG=true"&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DISABLE_SECURITY_PLUGIN=true"&lt;/span&gt;
   &lt;span class="na"&gt;ulimits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;memlock&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
       &lt;span class="na"&gt;soft&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;-1&lt;/span&gt;
       &lt;span class="na"&gt;hard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;-1&lt;/span&gt;
     &lt;span class="na"&gt;nofile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
       &lt;span class="na"&gt;soft&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;65536&lt;/span&gt;
       &lt;span class="na"&gt;hard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;65536&lt;/span&gt;
   &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;opensearch_data:/usr/share/opensearch/data&lt;/span&gt;
   &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9200:9200"&lt;/span&gt;
     &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9600:9600"&lt;/span&gt;
   &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
 &lt;span class="na"&gt;ollama_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
 &lt;span class="na"&gt;langflow_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
 &lt;span class="na"&gt;opensearch_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
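&lt;p&gt;To bring everything up, the standard Docker Compose workflow applies. The health checks below are illustrative: they assume the port mappings from the file above, and the &lt;code&gt;/health&lt;/code&gt; path for Langflow is an assumption you may need to adjust.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative workflow for this Compose file; assumes the port mappings above.
#   docker compose up -d --build   # build the custom images and start everything
#   docker compose ps              # confirm all four containers are running

check() {
  # Report whether a service answers on its mapped port.
  if curl -s --max-time 2 "$2" > /dev/null; then
    echo "$1: up"
  else
    echo "$1: not reachable"
  fi
}

check ollama     http://localhost:11434/api/tags
check langflow   http://localhost:7860/health
check opensearch http://localhost:9200/_cluster/health
```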



&lt;p&gt;Each component represents an important part of the AI stack, such as prompts, agents, context, and models. After the containers came up, I could access Langflow at &lt;a href="http://localhost:7860" rel="noopener noreferrer"&gt;http://localhost:7860&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm99e8sir7dcga2zxsj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm99e8sir7dcga2zxsj3.png" alt="Langflow start page to create a first flow" width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The remaining components I could access via their APIs or connect to a flow in Langflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;After much trial and error, I managed to figure out how to create a local stack to build out an AI agent to help me update my examples based on knowledge from my book, talks, and other code examples. As I explore more, I will add more tools and context to improve the agent (and maybe even build multiple agents).&lt;/p&gt;

&lt;p&gt;I realized I could probably use the AI agent built into my coding IDE to do most of this. Project Bob did end up helping me build this stack, and it made the process faster. The downside to using any AI agent built into my coding IDE was the overall cost. I quickly realized that I had to check my usage to ensure I didn't make too many requests.&lt;/p&gt;

&lt;p&gt;I was glad that I could run this locally. The small Granite model really helped: I only had to give Ollama a little more CPU and memory to run it. Running locally let me mitigate the cost of hosted LLMs and maybe achieve a similar result. I found the process of deploying each component valuable as a learning experience.&lt;/p&gt;

&lt;p&gt;Next, I plan on building a flow in Langflow to process all of my book chapters, slides, and code examples before passing them to an agent.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>terraform</category>
      <category>infrastructureascode</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
