<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Josh Blair</title>
    <description>The latest articles on Forem by Josh Blair (@josh_blair).</description>
    <link>https://forem.com/josh_blair</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F100411%2F963e53f6-678b-4041-9d0f-ce8331890bed.png</url>
      <title>Forem: Josh Blair</title>
      <link>https://forem.com/josh_blair</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/josh_blair"/>
    <language>en</language>
    <item>
      <title>Zero-Secret CI/CD: GitHub Actions + OIDC on AWS (Part 6)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:04:06 +0000</pubDate>
      <link>https://forem.com/josh_blair/zero-secret-cicd-github-actions-oidc-on-aws-part-6-22e7</link>
      <guid>https://forem.com/josh_blair/zero-secret-cicd-github-actions-oidc-on-aws-part-6-22e7</guid>
      <description>&lt;h1&gt;
  
  
  Zero-Secret CI/CD: GitHub Actions + OIDC on AWS (Part 6)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;No &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt; in your GitHub secrets. Ever. Here's how OIDC trust works and why it's strictly better.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The most common GitHub Actions setup I see in portfolios stores &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt; and &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt; as repository secrets. Those are long-lived credentials tied to an IAM user. One breach of your GitHub account (a compromised OAuth token, a compromised third-party Action, a secret accidentally logged in workflow output) and an attacker has AWS access that lasts until someone notices and rotates the keys.&lt;/p&gt;

&lt;p&gt;OIDC federation eliminates the stored credentials entirely. GitHub Actions assumes an IAM role by presenting a short-lived signed token, and the resulting session credentials expire on their own after a fixed duration. There are no keys to rotate because there are no keys.&lt;/p&gt;

&lt;p&gt;This post covers how the trust relationship works, how the CI and deploy workflows are structured, and how the frontend gets deployed to CloudFront with correct cache headers.&lt;/p&gt;




&lt;h2&gt;
  
  
  How GitHub Actions OIDC Works
&lt;/h2&gt;

&lt;p&gt;GitHub operates as an OpenID Connect identity provider. When a workflow job runs with the &lt;code&gt;id-token: write&lt;/code&gt; permission, GitHub can mint a signed JWT asserting the identity of the running job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"repo:joshblair/sift:ref:refs/heads/main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://token.actions.githubusercontent.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"joshblair/sift"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ref"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"refs/heads/main"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AWS STS accepts this JWT via &lt;code&gt;AssumeRoleWithWebIdentity&lt;/code&gt; and issues a short-lived role session: temporary credentials that expire after a fixed session duration, typically one hour, independent of when the job finishes. The exchange only works if AWS has been told to trust GitHub's OIDC provider and the role's trust policy permits the specific repository making the request.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Trust (Once Per Account)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;scripts/bootstrap.sh&lt;/code&gt; runs once to wire this up. It does three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Creates the GitHub OIDC provider in IAM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam create-open-id-connect-provider &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; &lt;span class="s2"&gt;"https://token.actions.githubusercontent.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--client-id-list&lt;/span&gt; &lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--thumbprint-list&lt;/span&gt; &lt;span class="s2"&gt;"6938fd4d98bab03faadb97b34396831e3780aea1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells AWS to trust JWTs signed by GitHub's OIDC endpoint. It's a one-time setup for the AWS account — not per-repo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Creates the IAM deploy role with a scoped trust policy:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Federated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::ACCOUNT_ID:oidc-provider/token.actions.githubusercontent.com"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRoleWithWebIdentity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringLike"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"repo:joshblair/sift:*"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;StringLike&lt;/code&gt; condition on &lt;code&gt;sub&lt;/code&gt; limits trust to jobs running from the &lt;code&gt;joshblair/sift&lt;/code&gt; repository. The trailing wildcard &lt;code&gt;*&lt;/code&gt; matches every &lt;code&gt;sub&lt;/code&gt; value the repository can produce, so jobs triggered by branch pushes and by pull request checks can both assume the role. For a production setup, you'd tighten this to &lt;code&gt;repo:org/repo:ref:refs/heads/main&lt;/code&gt; so that only jobs running on &lt;code&gt;main&lt;/code&gt; can assume the deploy role.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Prints the values to add as GitHub Actions variables:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bootstrap complete. Add these as GitHub Actions variables in your repo settings:

  AWS_REGION       = us-west-2
  SAM_BUCKET       = sift-sam-123456789-us-west-2
  DEPLOY_ROLE_ARN  = arn:aws:iam::123456789:role/sift-github-actions-deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These go in the repository's Variables (not Secrets) — they're not sensitive values. Cognito configuration (&lt;code&gt;VITE_USER_POOL_ID&lt;/code&gt;, &lt;code&gt;VITE_USER_POOL_CLIENT_ID&lt;/code&gt;, &lt;code&gt;VITE_COGNITO_DOMAIN&lt;/code&gt;) is also stored as variables.&lt;/p&gt;

&lt;h3&gt;
  
  
  In the Workflow
&lt;/h3&gt;

&lt;p&gt;With the provider and role in place, a single step handles authentication in every job that needs AWS access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;   &lt;span class="c1"&gt;# allows the job to request an OIDC token&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.DEPLOY_ROLE_ARN }}&lt;/span&gt;
    &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;${{ vars.AWS_REGION }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this step, the job has temporary AWS credentials in its environment — the same &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;, &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;, and &lt;code&gt;AWS_SESSION_TOKEN&lt;/code&gt; that the AWS CLI and SDKs look for, but populated automatically for the duration of the job.&lt;/p&gt;
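
&lt;p&gt;To show where those two fragments sit, here is a minimal sketch of a complete job. The job name &lt;code&gt;verify-oidc&lt;/code&gt;, the checkout step, and the &lt;code&gt;aws sts get-caller-identity&lt;/code&gt; check are illustrative assumptions rather than part of the project's workflow. &lt;code&gt;permissions&lt;/code&gt; is set at job level here, though it can also be set once for the whole workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  verify-oidc:                      # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      id-token: write               # lets the job request an OIDC token
      contents: read                # needed by actions/checkout
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.DEPLOY_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}
      # Assumed sanity check: prints the ARN of the assumed-role session
      - run: aws sts get-caller-identity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the trust policy's &lt;code&gt;sub&lt;/code&gt; or &lt;code&gt;aud&lt;/code&gt; condition doesn't match the job's token, the credentials step fails and no later step receives AWS credentials.&lt;/p&gt;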




&lt;h2&gt;
  
  
  CI: Three Parallel Jobs on Every Pull Request
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ci.yml&lt;/code&gt; triggers on pull requests to main. Three jobs run in parallel — each covers a different part of the stack.&lt;/p&gt;
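
&lt;p&gt;A skeleton of that layout might look like the following. The workflow and job names are placeholders chosen for illustration, and each job is cut down to a checkout plus one representative command with setup steps omitted. Jobs that have no &lt;code&gt;needs:&lt;/code&gt; key between them run in parallel by default:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: CI                  # hypothetical workflow name
on:
  pull_request:
    branches: [main]

jobs:                     # no "needs:" between jobs, so all three start together
  dotnet:                 # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... restore, build, test
  pipeline:               # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... PYTHONPATH setup, pytest per handler
  frontend:               # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... tsc, eslint, build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;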

&lt;h3&gt;
  
  
  .NET Build and Test
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Restore&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dotnet restore&lt;/span&gt;
  &lt;span class="na"&gt;working-directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dotnet build --no-restore --configuration Release&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Test&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dotnet test --no-build --configuration Release --logger "console;verbosity=normal"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Standard &lt;code&gt;dotnet&lt;/code&gt; pipeline. The &lt;code&gt;--no-restore&lt;/code&gt; and &lt;code&gt;--no-build&lt;/code&gt; flags avoid redundant work between steps. Tests run against unit test doubles — no database, no Bedrock — so they complete in a few seconds with no external dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python Pipeline Tests
&lt;/h3&gt;

&lt;p&gt;The six pipeline Lambda functions live in separate directories under &lt;code&gt;backend/pipeline/&lt;/code&gt;. Each has its own &lt;code&gt;tests/&lt;/code&gt; subdirectory. The shared utilities (database connection, Bedrock client) live in a Lambda layer at &lt;code&gt;backend/pipeline/layers/shared/python/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In the Lambda execution environment, that layer is mounted at a path Python automatically searches. In CI there's no layer mounting — so the path is added to &lt;code&gt;PYTHONPATH&lt;/code&gt; instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Add shared layer to PYTHONPATH&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;echo "PYTHONPATH=$PYTHONPATH:$GITHUB_WORKSPACE/backend/pipeline/layers/shared/python" &amp;gt;&amp;gt; $GITHUB_ENV&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Writing to &lt;code&gt;$GITHUB_ENV&lt;/code&gt; makes the variable available to all subsequent steps in the job — not just the current shell. This is the correct approach; &lt;code&gt;export&lt;/code&gt; would only persist for the current &lt;code&gt;run&lt;/code&gt; block.&lt;/p&gt;
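
&lt;p&gt;The difference is easy to see with two throwaway variables. This is an illustrative sketch, not part of &lt;code&gt;ci.yml&lt;/code&gt;; &lt;code&gt;DEMO_EXPORTED&lt;/code&gt; and &lt;code&gt;DEMO_PERSISTED&lt;/code&gt; are made-up names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;steps:
  - name: Set variables two ways
    run: |
      export DEMO_EXPORTED=1                         # lives only in this step's shell
      echo "DEMO_PERSISTED=1" &gt;&gt; "$GITHUB_ENV"       # written to the job's env file
  - name: Read them back in a later step
    run: |
      echo "exported:  ${DEMO_EXPORTED:-unset}"      # prints "unset"
      echo "persisted: ${DEMO_PERSISTED:-unset}"     # prints "1"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One caveat: a value written to &lt;code&gt;$GITHUB_ENV&lt;/code&gt; becomes visible in later steps, not in the step that writes it.&lt;/p&gt;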

&lt;p&gt;The test suites then run one per handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest — chunk (no AWS deps)&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest backend/pipeline/chunk/tests/ -v&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest — extract&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest backend/pipeline/extract/tests/ -v&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest — embed&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest backend/pipeline/embed/tests/ -v&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest — metadata&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest backend/pipeline/metadata/tests/ -v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The chunk step is labeled as having no AWS dependencies: that handler uses only the Python standard library, so no mocking is needed. The others mock &lt;code&gt;boto3&lt;/code&gt; calls to Bedrock and S3.&lt;/p&gt;

&lt;h3&gt;
  
  
  TypeScript Check, Lint, and Build
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TypeScript check&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx tsc --noEmit&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ESLint&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx eslint src --max-warnings &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build (smoke test)&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run build&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;VITE_API_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://placeholder.execute-api.us-west-2.amazonaws.com/dev&lt;/span&gt;
    &lt;span class="na"&gt;VITE_USER_POOL_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-west-2_PLACEHOLDER&lt;/span&gt;
    &lt;span class="na"&gt;VITE_USER_POOL_CLIENT_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder&lt;/span&gt;
    &lt;span class="na"&gt;VITE_COGNITO_DOMAIN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder.auth.us-west-2.amazoncognito.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;TypeScript checking and ESLint catch type errors and style issues. The &lt;code&gt;vite build&lt;/code&gt; step is a smoke test: TypeScript's &lt;code&gt;tsc --noEmit&lt;/code&gt; checks types but doesn't bundle. Vite's bundler can still fail on import cycles, missing environment variable references, or tree-shaking edge cases that &lt;code&gt;tsc&lt;/code&gt; doesn't see. Placeholder values satisfy Vite's env var requirements without needing real infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--max-warnings 0&lt;/code&gt; on ESLint means warnings are treated as errors — a warning that gets committed and ignored accumulates into noise. Zero tolerance keeps the lint output meaningful.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deploy: Two Sequential Jobs on Every Push to Main
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;deploy.yml&lt;/code&gt; triggers on pushes to &lt;code&gt;main&lt;/code&gt;. It has two jobs: &lt;code&gt;deploy-backend&lt;/code&gt; builds and deploys the infrastructure stacks, then &lt;code&gt;deploy-frontend&lt;/code&gt; builds the React app with the real API URL and syncs it to S3.&lt;/p&gt;
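
&lt;p&gt;In outline, the workflow might be shaped like this. It's a sketch of the structure only: the workflow name and the workflow-level &lt;code&gt;permissions&lt;/code&gt; block are assumptions, and the steps are elided. &lt;code&gt;needs:&lt;/code&gt; is what turns two otherwise parallel jobs into a sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: Deploy               # hypothetical workflow name
on:
  push:
    branches: [main]

permissions:               # assumed to be set once for both jobs
  id-token: write          # both jobs authenticate to AWS through OIDC
  contents: read

jobs:
  deploy-backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... OIDC auth, sam build, sam deploy, capture outputs
  deploy-frontend:
    needs: deploy-backend  # waits for the backend job; skipped if it fails
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... OIDC auth, npm run build, S3 sync, CloudFront invalidation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;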

&lt;h3&gt;
  
  
  Job 1: SAM Build and Deploy
&lt;/h3&gt;

&lt;p&gt;After authenticating with OIDC, the job builds the Lambda functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SAM build&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sam build --template-file infrastructure/template.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The build step compiles or packages each Lambda function according to the &lt;code&gt;Metadata.BuildMethod&lt;/code&gt; in the template. The .NET functions use a &lt;code&gt;Makefile&lt;/code&gt; target that runs &lt;code&gt;dotnet publish&lt;/code&gt;. The Python pipeline functions are packaged with their dependencies. Because all functions target &lt;code&gt;x86_64&lt;/code&gt;, matching the &lt;code&gt;ubuntu-latest&lt;/code&gt; runner, no cross-compilation or Docker containerization is needed.&lt;/p&gt;

&lt;p&gt;The deploy step reads from the build output, not the source template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SAM deploy — main stack&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;sam deploy \&lt;/span&gt;
      &lt;span class="s"&gt;--no-confirm-changeset \&lt;/span&gt;
      &lt;span class="s"&gt;--no-fail-on-empty-changeset \&lt;/span&gt;
      &lt;span class="s"&gt;--s3-bucket ${{ vars.SAM_BUCKET }} \&lt;/span&gt;
      &lt;span class="s"&gt;--s3-prefix sift-main \&lt;/span&gt;
      &lt;span class="s"&gt;--stack-name sift-dev \&lt;/span&gt;
      &lt;span class="s"&gt;--parameter-overrides Env=dev SamBucket=${{ vars.SAM_BUCKET }} \&lt;/span&gt;
      &lt;span class="s"&gt;--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice there's no &lt;code&gt;--template-file&lt;/code&gt; flag here. When &lt;code&gt;sam deploy&lt;/code&gt; runs without specifying a template, it reads from &lt;code&gt;.aws-sam/build/template.yaml&lt;/code&gt; — the processed template that references the compiled Lambda artifacts from the build step. Passing &lt;code&gt;--template-file infrastructure/template.yaml&lt;/code&gt; would bypass the build and re-package raw source, which would break the .NET functions.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--no-fail-on-empty-changeset&lt;/code&gt; means the deploy step succeeds even if nothing changed. Without it, a push that only modifies the frontend would fail the backend deploy job with "No changes to deploy."&lt;/p&gt;

&lt;p&gt;The frontend hosting stack is a plain CloudFormation template (no SAM transforms), deployed separately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy frontend hosting stack&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;aws cloudformation deploy \&lt;/span&gt;
      &lt;span class="s"&gt;--template-file infrastructure/template-frontend.yaml \&lt;/span&gt;
      &lt;span class="s"&gt;--stack-name sift-frontend-dev \&lt;/span&gt;
      &lt;span class="s"&gt;--parameter-overrides Env=dev \&lt;/span&gt;
      &lt;span class="s"&gt;--no-fail-on-empty-changeset&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The job then captures the stack outputs — API URL, S3 bucket name, CloudFront distribution ID — and writes them to &lt;code&gt;$GITHUB_OUTPUT&lt;/code&gt; so the next job can read them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Capture stack outputs&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;outputs&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;API_URL=$(aws cloudformation describe-stacks \&lt;/span&gt;
      &lt;span class="s"&gt;--stack-name sift-dev \&lt;/span&gt;
      &lt;span class="s"&gt;--query "Stacks[0].Outputs[?OutputKey=='ApiUrl'].OutputValue" \&lt;/span&gt;
      &lt;span class="s"&gt;--output text)&lt;/span&gt;
    &lt;span class="s"&gt;echo "api-url=$API_URL" &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;
    &lt;span class="s"&gt;# ... bucket name and CF distribution ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
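
&lt;p&gt;Writing to &lt;code&gt;$GITHUB_OUTPUT&lt;/code&gt; only creates step outputs. For another job to see them, the job also has to re-export them under its own &lt;code&gt;outputs:&lt;/code&gt; key. A sketch of that mapping, assuming the step outputs are named &lt;code&gt;api-url&lt;/code&gt;, &lt;code&gt;frontend-bucket&lt;/code&gt;, and &lt;code&gt;cloudfront-id&lt;/code&gt; (only &lt;code&gt;api-url&lt;/code&gt; appears in the step itself; the other two names are inferred from how the frontend job refers to them):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  deploy-backend:
    runs-on: ubuntu-latest
    outputs:
      # job output name: value read from the step whose id is "outputs"
      api-url: ${{ steps.outputs.outputs.api-url }}
      frontend-bucket: ${{ steps.outputs.outputs.frontend-bucket }}  # assumed step output name
      cloudfront-id: ${{ steps.outputs.outputs.cloudfront-id }}      # assumed step output name
    steps:
      # ... OIDC auth, build, and deploy steps
      - name: Capture stack outputs
        id: outputs
        # API_URL is resolved with describe-stacks, as in the step above
        run: echo "api-url=$API_URL" &gt;&gt; "$GITHUB_OUTPUT"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;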



&lt;h3&gt;
  
  
  Job 2: Frontend Build and Sync
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;deploy-frontend&lt;/code&gt; declares &lt;code&gt;needs: deploy-backend&lt;/code&gt;, which both enforces ordering and makes the first job's outputs available as &lt;code&gt;needs.deploy-backend.outputs.*&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run build&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;VITE_API_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;             &lt;span class="s"&gt;${{ needs.deploy-backend.outputs.api-url }}&lt;/span&gt;
    &lt;span class="na"&gt;VITE_USER_POOL_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;        &lt;span class="s"&gt;${{ vars.VITE_USER_POOL_ID }}&lt;/span&gt;
    &lt;span class="na"&gt;VITE_USER_POOL_CLIENT_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.VITE_USER_POOL_CLIENT_ID }}&lt;/span&gt;
    &lt;span class="na"&gt;VITE_COGNITO_DOMAIN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;${{ vars.VITE_COGNITO_DOMAIN }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;VITE_API_URL&lt;/code&gt; is the only value that comes from the previous job's runtime output — it's only known after the SAM stack deploys. Everything else is static configuration stored as repository variables.&lt;/p&gt;

&lt;p&gt;The S3 sync uses split cache headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sync to S3&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;aws s3 sync dist/ s3://${{ needs.deploy-backend.outputs.frontend-bucket }} \&lt;/span&gt;
      &lt;span class="s"&gt;--delete \&lt;/span&gt;
      &lt;span class="s"&gt;--cache-control "public,max-age=31536000,immutable" \&lt;/span&gt;
      &lt;span class="s"&gt;--exclude "index.html"&lt;/span&gt;
    &lt;span class="s"&gt;aws s3 cp dist/index.html s3://${{ needs.deploy-backend.outputs.frontend-bucket }}/index.html \&lt;/span&gt;
      &lt;span class="s"&gt;--cache-control "no-cache,no-store,must-revalidate"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the standard SPA cache strategy. Vite includes a content hash in every asset filename — &lt;code&gt;main-Dz9a8bK2.js&lt;/code&gt;, &lt;code&gt;vendor-x4jKLmY8.css&lt;/code&gt;. These filenames change when the content changes. The browser can safely cache them for a year (&lt;code&gt;max-age=31536000,immutable&lt;/code&gt;); if the file changes, it gets a new URL.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;index.html&lt;/code&gt; doesn't have a hash in its name; it's always &lt;code&gt;index.html&lt;/code&gt;. It contains &lt;code&gt;&amp;lt;script src="./assets/main-Dz9a8bK2.js"&amp;gt;&lt;/code&gt;, pointing to the current hashed filenames. If &lt;code&gt;index.html&lt;/code&gt; is cached, the browser keeps requesting the old asset filenames after a deploy until that cached copy expires. &lt;code&gt;no-cache,no-store,must-revalidate&lt;/code&gt; stops the browser from reusing a stored copy of &lt;code&gt;index.html&lt;/code&gt;, so every navigation fetches it fresh, reads the new asset filenames, and loads the rest from cache.&lt;/p&gt;

&lt;p&gt;Finally, the CloudFront edge caches are invalidated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Invalidate CloudFront&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;aws cloudfront create-invalidation \&lt;/span&gt;
      &lt;span class="s"&gt;--distribution-id ${{ needs.deploy-backend.outputs.cloudfront-id }} \&lt;/span&gt;
      &lt;span class="s"&gt;--paths "/*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



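&lt;p&gt;Without this, edge locations could keep serving a previously cached copy of &lt;code&gt;index.html&lt;/code&gt;. CloudFront's default TTL is 24 hours for objects that arrive without a &lt;code&gt;Cache-Control&lt;/code&gt; header, and a cache policy with a nonzero minimum TTL caches objects even when the origin sends &lt;code&gt;no-cache&lt;/code&gt;. The invalidation makes sure neither case undermines the header set on the origin.&lt;/p&gt;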




&lt;h2&gt;
  
  
  CloudFront + S3 Hosting
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;template-frontend.yaml&lt;/code&gt; configures CloudFront to serve a private S3 bucket using Origin Access Control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;FrontendBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;PublicAccessBlockConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;       &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;IgnorePublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;RestrictPublicBuckets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bucket has all public access blocked. The only way to read objects is through CloudFront.&lt;/p&gt;

&lt;p&gt;Origin Access Control (OAC) replaces the older Origin Access Identity (OAI) pattern. OAC uses SigV4 request signing rather than a special IAM principal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;OriginAccessControl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CloudFront::OriginAccessControl&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;OriginAccessControlConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;OriginAccessControlOriginType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3&lt;/span&gt;
      &lt;span class="na"&gt;SigningBehavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;               &lt;span class="s"&gt;always&lt;/span&gt;
      &lt;span class="na"&gt;SigningProtocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;               &lt;span class="s"&gt;sigv4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bucket policy grants &lt;code&gt;s3:GetObject&lt;/code&gt; to CloudFront, scoped to this specific distribution's ARN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;StringEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;AWS:SourceArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:cloudfront::${AWS::AccountId}:distribution/${CloudFrontDistribution}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;AWS:SourceArn&lt;/code&gt; condition matters because the principal in this policy is the CloudFront service itself, which every distribution in every AWS account shares. Without the condition, any distribution pointed at this bucket could read from it. With it, the permission is tied to this specific CloudFront distribution, not just to the service.&lt;/p&gt;
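
&lt;p&gt;For context, here is a minimal sketch of the full policy document that a condition like this usually sits in, written as a TypeScript object builder. The helper name, the &lt;code&gt;Sid&lt;/code&gt;, and the example bucket, account, and distribution IDs are assumptions for illustration. The actual Sift template may differ.&lt;/p&gt;

```typescript
// Hypothetical helper: builds an S3 bucket policy for a bucket fronted by CloudFront OAC.
// The Sid and all example identifiers below are illustrative, not taken from the Sift repo.
interface OacPolicyInput {
  bucketName: string
  accountId: string
  distributionId: string
}

function buildOacBucketPolicy(input: OacPolicyInput) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'AllowCloudFrontRead',
        Effect: 'Allow',
        // The principal is the CloudFront service, not an IAM user or role
        Principal: { Service: 'cloudfront.amazonaws.com' },
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${input.bucketName}/*`,
        Condition: {
          StringEquals: {
            // Only requests made on behalf of this one distribution match
            'AWS:SourceArn': `arn:aws:cloudfront::${input.accountId}:distribution/${input.distributionId}`,
          },
        },
      },
    ],
  }
}

const examplePolicy = buildOacBucketPolicy({
  bucketName: 'example-frontend-bucket',
  accountId: '123456789012',
  distributionId: 'E2EXAMPLE12345',
})
console.log(JSON.stringify(examplePolicy, null, 2))
```

&lt;p&gt;Printing the object makes it easy to compare against the policy that CloudFormation actually deploys.&lt;/p&gt;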

&lt;p&gt;The distribution handles the SPA routing requirement with custom error responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;CustomErrorResponses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;403&lt;/span&gt;
    &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
    &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;404&lt;/span&gt;
    &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
    &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a user bookmarks &lt;code&gt;app.example.com/documents&lt;/code&gt; and navigates directly to it, S3 has no object at that key. Because the bucket policy grants CloudFront only &lt;code&gt;s3:GetObject&lt;/code&gt; and not &lt;code&gt;s3:ListBucket&lt;/code&gt;, S3 typically answers with a 403 rather than a 404, which is why both codes get a rule. Without this configuration, the user sees an XML error response. With it, CloudFront intercepts that error and serves &lt;code&gt;index.html&lt;/code&gt; instead, and React Router then handles the &lt;code&gt;/documents&lt;/code&gt; path client-side. The &lt;code&gt;ErrorCachingMinTTL: 0&lt;/code&gt; on each rule prevents CloudFront from caching the error responses themselves.&lt;/p&gt;
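
&lt;p&gt;The rule lookup described above can be modeled in a few lines of TypeScript. This is a simplified illustration, not CloudFront's implementation: the type and function names are made up, and it ignores caching entirely.&lt;/p&gt;

```typescript
// A simplified model of custom error response matching (illustrative names only).
interface ErrorRule {
  errorCode: number
  responseCode: number
  responsePagePath: string
}

interface ViewerResponse {
  status: number
  servedPath: string
}

// Mirrors the two rules from the template: 403 and 404 both fall back to index.html
const spaRules: ErrorRule[] = [
  { errorCode: 403, responseCode: 200, responsePagePath: '/index.html' },
  { errorCode: 404, responseCode: 200, responsePagePath: '/index.html' },
]

function resolveViewerResponse(
  requestPath: string,
  originStatus: number,
  rules: ErrorRule[],
): ViewerResponse {
  for (const rule of rules) {
    if (rule.errorCode === originStatus) {
      // The origin returned a mapped error: swap in the configured page and status
      return { status: rule.responseCode, servedPath: rule.responsePagePath }
    }
  }
  // No rule matched: pass the origin's status through unchanged
  return { status: originStatus, servedPath: requestPath }
}

console.log(resolveViewerResponse('/documents', 403, spaRules))
console.log(resolveViewerResponse('/assets/app.js', 200, spaRules))
```

&lt;p&gt;One consequence worth knowing: with these rules a genuinely missing asset also comes back as &lt;code&gt;index.html&lt;/code&gt; with a 200, so a broken image or script URL will not surface as a 404 in the browser.&lt;/p&gt;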




&lt;h2&gt;
  
  
  The Result
&lt;/h2&gt;

&lt;p&gt;The full pipeline, end to end:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Developer opens a pull request → CI runs three parallel jobs (45–90 seconds)&lt;/li&gt;
&lt;li&gt;Merge to main → Deploy job builds and deploys all infrastructure stacks&lt;/li&gt;
&lt;li&gt;Frontend build runs with the real API URL from stack outputs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;aws s3 sync&lt;/code&gt; uploads the build with the correct cache headers, followed by a CloudFront invalidation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The only AWS credentials that exist are the temporary role session credentials inside the running job. There's no &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt; in repository secrets, no IAM user to audit, and no credential rotation to schedule. The IAM role trust policy limits which repos can assume it, and the role itself is scoped to exactly the permissions needed to deploy Sift — no more.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Complete Series
&lt;/h2&gt;

&lt;p&gt;That's all six parts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Part&lt;/th&gt;
&lt;th&gt;Topic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Architecture overview — service choices and cost breakdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Multi-tenant auth — Cognito JWT, API Gateway validation, Postgres RLS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Step Functions pipeline — state machine, Map state, Express Workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;RAG and vector search — pgvector, Titan Embed v2, citations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;React frontend — Amplify auth, presigned upload, React Query polling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;CI/CD — OIDC federation, SAM build/deploy, CloudFront cache strategy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The live demo is at &lt;a href="https://sift.bonefishsoftware.com" rel="noopener noreferrer"&gt;sift.bonefishsoftware.com&lt;/a&gt;. The code is at &lt;a href="https://github.com/joshblair/sift" rel="noopener noreferrer"&gt;github.com/joshblair/sift&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>github</category>
      <category>aws</category>
      <category>devops</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Building the React Frontend: Document Library and Chat UI (Part 5)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:03:48 +0000</pubDate>
      <link>https://forem.com/josh_blair/building-the-react-frontend-document-library-and-chat-ui-part-5-22li</link>
      <guid>https://forem.com/josh_blair/building-the-react-frontend-document-library-and-chat-ui-part-5-22li</guid>
      <description>&lt;h1&gt;
  
  
  Building the React Frontend: Document Library and Chat UI (Part 5)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;React Query's &lt;code&gt;refetchInterval&lt;/code&gt; turns a polling requirement into a one-liner. Here's the whole frontend, explained.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The frontend is where the demo either lands or doesn't. Interviewers will click around. Uploads need to feel instant, the pipeline status needs to update without a refresh, and the chat responses need citations that prove the AI actually read the documents — not just hallucinated plausible-sounding text.&lt;/p&gt;

&lt;p&gt;This post covers how each of those things works, starting from auth and working through upload, status polling, and the chat UI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Choices
&lt;/h2&gt;

&lt;p&gt;A quick inventory before diving in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vite&lt;/strong&gt; — faster dev server than Create React App, native ES module HMR, straightforward to configure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React 18 + TypeScript&lt;/strong&gt; — strict mode, no &lt;code&gt;any&lt;/code&gt;, every API response typed at the boundary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tailwind v4&lt;/strong&gt; — utility-first, no separate CSS files to maintain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React Query (&lt;code&gt;@tanstack/react-query&lt;/code&gt;)&lt;/strong&gt; — server state management, automatic cache invalidation, and the polling behavior discussed below&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Amplify UI&lt;/strong&gt; — the &lt;code&gt;Authenticator&lt;/code&gt; component handles the full Cognito sign-in/sign-up flow without custom UI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Axios&lt;/strong&gt; — request interceptor injects the Bearer token on every outbound request&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Auth: Amplify and the Token Interceptor
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;configureAmplify()&lt;/code&gt; runs once at module load before anything else renders:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;configureAmplify&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;Amplify&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;Auth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;Cognito&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;userPoolId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VITE_USER_POOL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;userPoolClientId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VITE_USER_POOL_CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;loginWith&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;oauth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VITE_COGNITO_DOMAIN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;scopes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;email&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;profile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="na"&gt;redirectSignIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="na"&gt;redirectSignOut&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="na"&gt;responseType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;code&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All values come from &lt;code&gt;VITE_&lt;/code&gt; environment variables, injected at build time from &lt;code&gt;.env.local&lt;/code&gt;. Setting &lt;code&gt;responseType: 'code'&lt;/code&gt; selects the OAuth authorization code flow, which Amplify pairs with PKCE. That is the correct flow for single-page apps, which can't keep a client secret.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;getAccessToken&lt;/code&gt; function returns the token that every API request carries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getAccessToken&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetchAuthSession&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;idToken&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No ID token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that, despite its name, the function returns the &lt;strong&gt;ID token&lt;/strong&gt;, not the access token. This is intentional and slightly non-obvious. Cognito's Pre-Token Generation V1 trigger, which injects the &lt;code&gt;tenantId&lt;/code&gt; custom claim (covered in &lt;a href="https://dev.to/josh_blair/multi-tenant-auth-with-cognito-and-postgresql-row-level-security-part-2-5d30"&gt;Part 2&lt;/a&gt;), can only customize the ID token. The access token doesn't get custom claims from V1 triggers. Since API Gateway validates the &lt;code&gt;tenantId&lt;/code&gt; claim from the token, the ID token is what needs to be sent.&lt;/p&gt;

&lt;p&gt;Amplify handles token refresh transparently: &lt;code&gt;fetchAuthSession()&lt;/code&gt; refreshes the tokens when they have expired and a valid refresh token is available, with no code required on the caller's side.&lt;/p&gt;

&lt;p&gt;The Axios client wires this up with a request interceptor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VITE_API_URL&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;interceptors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getAccessToken&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Authorization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every API call goes through this interceptor, so no individual function or component ever has to think about auth headers.&lt;/p&gt;

&lt;h3&gt;
  
  
  App Initialization
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;App.tsx&lt;/code&gt; wraps everything in the Amplify &lt;code&gt;Authenticator&lt;/code&gt; component:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;App&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;QueryClientProvider&lt;/span&gt; &lt;span class="na"&gt;client&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;queryClient&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Authenticator&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AppRoutes&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Authenticator&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;QueryClientProvider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Authenticator&lt;/code&gt; renders a prebuilt, Cognito-backed sign-in UI for unauthenticated users (sign-in, sign-up, email verification) and calls its render prop only after the user is authenticated. The entire application sits inside that render prop. No route guards, no redirect logic, no token-checking in individual components.&lt;/p&gt;

&lt;p&gt;The first thing &lt;code&gt;AppRoutes&lt;/code&gt; does on mount is sync the user record:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;AppRoutes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;synced&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setSynced&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;tenantApi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setSynced&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;synced&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Spinner&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="c1"&gt;// ... routes&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;POST /tenants/me/sync&lt;/code&gt; upserts the user row in the database using the JWT's &lt;code&gt;sub&lt;/code&gt; and &lt;code&gt;email&lt;/code&gt; claims. This handles first-time logins (where no user row exists yet) and keeps the email current if it changes in Cognito. The app doesn't render routes until this completes — a brief spinner rather than a flash of potentially stale state.&lt;/p&gt;




&lt;h2&gt;
  
  
  Document Upload: The Two-Step Flow
&lt;/h2&gt;

&lt;p&gt;The upload flow has two distinct steps, and understanding why matters for the architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not upload through API Gateway?&lt;/strong&gt; API Gateway HTTP APIs have a 10 MB payload limit, and a synchronous Lambda invocation accepts at most 6 MB. A single PDF can easily exceed either. The solution is to bypass API Gateway entirely for the file content and upload directly to S3.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; The client calls &lt;code&gt;POST /documents/upload-url&lt;/code&gt; with the filename and file type. The Lambda creates the database record (status &lt;code&gt;pending&lt;/code&gt;) and returns a presigned S3 PUT URL along with the new document ID.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; The client PUTs the file directly to the presigned URL — which goes straight to S3, bypassing API Gateway.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;useUploadDocument&lt;/code&gt; hook handles both steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useUploadDocument&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useQueryClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;useMutation&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mutationFn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;File&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()?.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;txt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;uploadUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;documentId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;documentsApi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUploadUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;// PUT directly to S3 presigned URL — bypasses API Gateway&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uploadUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;documentId&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;onSuccess&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;queryClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invalidateQueries&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;queryKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;documents&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;axios.put&lt;/code&gt; call in step 2 uses the base &lt;code&gt;axios&lt;/code&gt; instance, not the &lt;code&gt;api&lt;/code&gt; client with the auth interceptor. The presigned URL already carries its signature in the query string, so adding an &lt;code&gt;Authorization&lt;/code&gt; header would cause S3 to reject the request.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;onSuccess&lt;/code&gt; invalidates the &lt;code&gt;documents&lt;/code&gt; query, which triggers an immediate refetch. The new document appears in the list as &lt;code&gt;pending&lt;/code&gt; before the pipeline has even started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why create the DB record before the upload?&lt;/strong&gt; The S3 key is &lt;code&gt;{tenantId}/{documentId}/{filename}&lt;/code&gt;. Creating the database record first gives us the document ID to construct the key. When the upload completes and EventBridge fires, the Step Functions pipeline starts immediately — and the document row it needs to update already exists.&lt;/p&gt;
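
&lt;p&gt;As a rough sketch of that key layout, the two helpers below build and parse a key of the form &lt;code&gt;{tenantId}/{documentId}/{filename}&lt;/code&gt;. The function names and example IDs are hypothetical. The real Lambda code is not shown in this post.&lt;/p&gt;

```typescript
// Hypothetical helpers mirroring the {tenantId}/{documentId}/{filename} key layout.
function buildObjectKey(tenantId: string, documentId: string, filename: string): string {
  return [tenantId, documentId, filename].join('/')
}

// Recover the IDs from a key, for example in code that reacts to the upload event.
function parseObjectKey(key: string) {
  const [tenantId, documentId, ...rest] = key.split('/')
  if (!tenantId || !documentId || rest.length === 0) {
    throw new Error(`Unexpected object key: ${key}`)
  }
  // Re-join the remainder in case the filename itself contains a slash
  return { tenantId, documentId, filename: rest.join('/') }
}

const objectKey = buildObjectKey('tenant-123', 'doc-456', 'report.pdf')
console.log(objectKey)
console.log(parseObjectKey(objectKey))
```

&lt;p&gt;Because the document ID is part of the key, whatever handles the upload event can find the matching database row without a lookup by filename.&lt;/p&gt;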

&lt;p&gt;The dropzone handles drag-and-drop and file input with consistent behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;File&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()?.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;ACCEPTED_EXT&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unsupported file type. Accepted: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;ACCEPTED_EXT&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nf"&gt;setUploading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;onUpload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Upload failed. Please try again.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setUploading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;onUpload&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;File type validation happens client-side before any network request is made. During upload the dropzone is visually disabled (&lt;code&gt;pointer-events-none&lt;/code&gt;) and shows a spinner — the user can't double-submit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status Polling: React Query's &lt;code&gt;refetchInterval&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Once a document is uploaded, the pipeline runs asynchronously. The frontend needs to show status updates — &lt;code&gt;pending&lt;/code&gt; → &lt;code&gt;processing&lt;/code&gt; → &lt;code&gt;ready&lt;/code&gt; — without requiring a manual refresh.&lt;/p&gt;

&lt;p&gt;React Query's &lt;code&gt;refetchInterval&lt;/code&gt; option accepts a callback that can return a number (milliseconds) or &lt;code&gt;false&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useDocuments&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;useQuery&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;queryKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;documents&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;queryFn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="nx"&gt;documentsApi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;refetchInterval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pending&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;processing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While any document in the list has status &lt;code&gt;pending&lt;/code&gt; or &lt;code&gt;processing&lt;/code&gt;, the hook polls every 3 seconds. When all documents are &lt;code&gt;ready&lt;/code&gt; or &lt;code&gt;failed&lt;/code&gt;, it stops. No WebSocket infrastructure, no long-polling, no &lt;code&gt;useEffect&lt;/code&gt; with &lt;code&gt;setInterval&lt;/code&gt;.&lt;/p&gt;
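&lt;p&gt;The stop-or-continue decision is easier to unit test when it lives in a plain function. A minimal sketch, assuming a hypothetical &lt;code&gt;nextPollInterval&lt;/code&gt; helper and a simplified &lt;code&gt;DocSummary&lt;/code&gt; type (neither name comes from the Sift codebase):&lt;/p&gt;

```typescript
// Hypothetical, simplified types for illustration (not the Sift codebase).
type DocStatus = 'pending' | 'processing' | 'ready' | 'failed'

interface DocSummary {
  id: string
  status: DocStatus
}

// Returns the polling delay in milliseconds while any document is still
// in flight, or false to stop polling once every document has settled.
function nextPollInterval(
  docs: DocSummary[] | undefined,
  intervalMs: number = 3000,
): number | false {
  const list = docs ?? []
  const inFlight = list.some(
    d => d.status === 'pending' || d.status === 'processing',
  )
  return inFlight ? intervalMs : false
}
```

&lt;p&gt;With a helper like this, the &lt;code&gt;refetchInterval&lt;/code&gt; callback could shrink to &lt;code&gt;(query) =&amp;gt; nextPollInterval(query.state.data)&lt;/code&gt;, and the polling rule can be tested without rendering a component.&lt;/p&gt;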

&lt;p&gt;The &lt;code&gt;DocumentCard&lt;/code&gt; component drives the status display:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;statusConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Pending&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="na"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;secondary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="na"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Clock&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;processing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Processing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;warning&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="na"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Loader2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Ready&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="na"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="na"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CheckCircle2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="na"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;destructive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="na"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;XCircle&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The processing badge uses &lt;code&gt;animate-spin&lt;/code&gt; on its icon — a CSS animation that costs nothing and makes the in-progress state visually obvious. Once processing completes, the card fills in the summary paragraph and topic chips from the metadata extraction step. On failure, the error message from the &lt;code&gt;mark_failed&lt;/code&gt; Lambda appears inline in a red callout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorMessage&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"mt-2 text-xs text-red-600 bg-red-50 rounded p-2"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorMessage&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Chat: Local State in &lt;code&gt;useChat&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The chat interface doesn't use React Query — there's no server cache to manage, just a thread of messages that grows as the user asks questions.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;useChat&lt;/code&gt; is a custom hook that manages the message array, loading state, and error state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useChat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sendMessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;question&lt;/span&gt; &lt;span class="p"&gt;}])&lt;/span&gt;
    &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chatApi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;citations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;citations&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Failed to get a response. Please try again.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;// remove the user message on failure&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

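  &lt;span class="c1"&gt;// assumed implementation; the excerpt returns clearMessages without showing it&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;clearMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCallback&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
    &lt;span class="nf"&gt;setError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;clearMessages&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;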
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user's message is added to the thread immediately, before the API call completes, so the UI feels responsive. If the call fails, the hook sets an error and &lt;code&gt;prev.slice(0, -1)&lt;/code&gt; removes the message, leaving the thread in the state it was in before the failed send. Dropping the last element is safe here because the input is disabled while a request is in flight, so the user message is still the final entry. This avoids the awkward state of showing a user message with no corresponding response.&lt;/p&gt;
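&lt;p&gt;Removing the last element works only while nothing else can be appended during a request. If that ever changes, one alternative is to roll back by identity instead of position. A rough sketch, assuming a hypothetical &lt;code&gt;id&lt;/code&gt; field on each message and hypothetical &lt;code&gt;addOptimistic&lt;/code&gt; and &lt;code&gt;rollback&lt;/code&gt; helpers (the real &lt;code&gt;Message&lt;/code&gt; type may look different):&lt;/p&gt;

```typescript
// Hypothetical message shape: the `id` field is an assumption added for this sketch.
interface ThreadMessage {
  id: number
  role: 'user' | 'assistant'
  content: string
}

// Append the user's question before the API call resolves.
function addOptimistic(
  thread: ThreadMessage[],
  id: number,
  question: string,
): ThreadMessage[] {
  return [...thread, { id, role: 'user', content: question }]
}

// Remove one specific message, wherever it sits in the thread.
function rollback(thread: ThreadMessage[], id: number): ThreadMessage[] {
  return thread.filter(m => m.id !== id)
}
```

&lt;p&gt;Both helpers return new arrays, so they fit the functional &lt;code&gt;setMessages(prev =&amp;gt; ...)&lt;/code&gt; update style used in the hook.&lt;/p&gt;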

&lt;p&gt;The &lt;code&gt;ChatPage&lt;/code&gt; scrolls to the bottom on every new message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bottomRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useRef&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;HTMLDivElement&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;bottomRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;scrollIntoView&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;behavior&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;smooth&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ref sits on a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; after the last message. The effect runs whenever &lt;code&gt;messages&lt;/code&gt; or &lt;code&gt;isLoading&lt;/code&gt; changes, so &lt;code&gt;scrollIntoView&lt;/code&gt; fires both when a new message arrives and when &lt;code&gt;isLoading&lt;/code&gt; becomes true (so the "Thinking…" indicator is visible). The input is disabled during loading, so a second question can't be queued while the first is in flight.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rendering Citations
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;ChatMessage&lt;/code&gt; handles both user and assistant messages. User messages are right-aligned in a blue bubble; assistant messages are left-aligned in grey. Citations appear as cards below the assistant's response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;citations&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;citations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"space-y-1"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"text-xs text-slate-400 px-1"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Sources:&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;citations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"rounded-md border border-slate-200 bg-white px-3 py-2 text-xs text-slate-600"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"font-medium text-blue-600"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;[&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;]&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"font-medium"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"text-slate-400"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; · chunk &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunkIndex&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"mt-1 text-slate-500 line-clamp-2 italic"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;"&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;excerpt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;"&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;[1]&lt;/code&gt;, &lt;code&gt;[2]&lt;/code&gt; numbers here correspond to the same numbers Claude used inline in the answer text. The user can see "Revenue grew 18% in Q3 [1]" and then immediately read the exact sentence from the PDF that grounded that claim. &lt;code&gt;line-clamp-2&lt;/code&gt; keeps the excerpt compact — two lines of truncated italic text, enough to confirm the source without overflowing the card.&lt;/p&gt;
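&lt;p&gt;Because the card numbers come from array position, an answer that mentions &lt;code&gt;[3]&lt;/code&gt; when only two citations came back would point at nothing. A small sketch of a consistency check, assuming inline markers always take the form &lt;code&gt;[n]&lt;/code&gt;; the helper names &lt;code&gt;findCitationMarkers&lt;/code&gt; and &lt;code&gt;unmatchedMarkers&lt;/code&gt; are hypothetical:&lt;/p&gt;

```typescript
// Collect the numbers used in inline markers such as [1] or [2].
function findCitationMarkers(answer: string): number[] {
  const pattern = /\[(\d+)\]/g
  const found: number[] = []
  let match: RegExpExecArray | null = pattern.exec(answer)
  while (match !== null) {
    found.push(Number(match[1]))
    match = pattern.exec(answer)
  }
  return found
}

// Markers with no matching card: 0, or anything past the end of the citations list.
function unmatchedMarkers(answer: string, citationCount: number): number[] {
  return findCitationMarkers(answer).filter(n => n === 0 || n > citationCount)
}
```

&lt;p&gt;A non-empty result from &lt;code&gt;unmatchedMarkers&lt;/code&gt; could be logged or used to hide the dangling marker; what to do with it is a product decision this sketch leaves open.&lt;/p&gt;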




&lt;h2&gt;
  
  
  What's Missing for Production
&lt;/h2&gt;

&lt;p&gt;The demo is complete enough to show in an interview, but a production deployment would need a few more things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error boundaries.&lt;/strong&gt; A thrown exception anywhere in the component tree currently crashes the whole app. React error boundaries would catch rendering errors and show a recovery UI instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pagination.&lt;/strong&gt; The document list fetches all documents in a single request. At hundreds of documents, this becomes a performance problem — both for the API query and for rendering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File size validation.&lt;/strong&gt; The dropzone validates file type but not file size. A 500 MB PDF will pass client-side validation and fail at the S3 put with an unhelpful error. A size check before requesting the upload URL is an easy addition.&lt;/p&gt;
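&lt;p&gt;As a rough sketch of that size check, the type and size rules could sit together in one pure function that runs before any network request. The extension list, the 25 MiB cap, and the &lt;code&gt;validateFile&lt;/code&gt; name are all assumptions for illustration; the post does not show the real &lt;code&gt;ACCEPTED_EXT&lt;/code&gt; values or any size limit:&lt;/p&gt;

```typescript
// Assumed values for illustration; the real ACCEPTED_EXT list and any size
// limit are not shown in the post.
const ACCEPTED_EXT = ['.pdf', '.txt', '.docx']
const MAX_FILE_BYTES = 25 * 1024 * 1024 // hypothetical 25 MiB cap

// Returns an error message for the dropzone, or null when the file is acceptable.
function validateFile(name: string, sizeBytes: number): string | null {
  const dot = name.lastIndexOf('.')
  const ext = dot === -1 ? '' : name.slice(dot).toLowerCase()
  if (ACCEPTED_EXT.indexOf(ext) === -1) {
    return `Unsupported file type. Accepted: ${ACCEPTED_EXT.join(', ')}`
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    const limitMb = MAX_FILE_BYTES / (1024 * 1024)
    return `File is too large. Maximum size: ${limitMb} MB`
  }
  return null
}
```

&lt;p&gt;In the dropzone handler this would be called as &lt;code&gt;validateFile(file.name, file.size)&lt;/code&gt;, with any non-null result passed to &lt;code&gt;setError&lt;/code&gt;.&lt;/p&gt;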

&lt;p&gt;&lt;strong&gt;Streaming chat responses.&lt;/strong&gt; The chat waits for the full response before displaying anything. For longer answers this is a noticeable pause. Lambda Function URLs support HTTP response streaming, which would allow tokens to appear as they're generated. API Gateway doesn't support streaming, so this would require routing the &lt;code&gt;/chat&lt;/code&gt; endpoint through a Function URL instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;E2E tests.&lt;/strong&gt; There are no Playwright or Cypress tests. For a portfolio project that's a reasonable trade-off; for a production app, automated browser tests on the upload and chat flows would be essential.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/zero-secret-cicd-github-actions-oidc-on-aws-part-6-22e7"&gt;Part 6&lt;/a&gt;&lt;/strong&gt; covers the CI/CD pipeline: GitHub Actions with OIDC federation, how the workflow deploys five nested SAM stacks in dependency order, and why there are zero stored AWS credentials anywhere in the repository.&lt;/p&gt;

&lt;p&gt;The code for this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/auth/cognito.ts&lt;/code&gt; — Amplify config, ID token vs access token&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/api/client.ts&lt;/code&gt; — Axios client, auth interceptor, typed API surface&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/hooks/useDocuments.ts&lt;/code&gt; — React Query with adaptive &lt;code&gt;refetchInterval&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/hooks/useUploadDocument.ts&lt;/code&gt; — two-step upload mutation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/components/UploadDropzone.tsx&lt;/code&gt; — drag-and-drop with validation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/components/DocumentCard.tsx&lt;/code&gt; — status badges, summary, error display&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/hooks/useChat.ts&lt;/code&gt; — local chat state, optimistic message add&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/pages/Chat.tsx&lt;/code&gt; — scroll behavior, loading indicator&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;frontend/src/components/ChatMessage.tsx&lt;/code&gt; — citation cards&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>react</category>
      <category>typescript</category>
      <category>aws</category>
      <category>webdev</category>
    </item>
    <item>
      <title>RAG and Vector Search with pgvector and Amazon Bedrock (Part 4)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:02:41 +0000</pubDate>
      <link>https://forem.com/josh_blair/rag-and-vector-search-with-pgvector-and-amazon-bedrock-part-4-5294</link>
      <guid>https://forem.com/josh_blair/rag-and-vector-search-with-pgvector-and-amazon-bedrock-part-4-5294</guid>
      <description>&lt;h1&gt;
  
  
  RAG and Vector Search with pgvector and Amazon Bedrock (Part 4)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;How to build retrieval-augmented generation that actually cites its sources — without a vector database subscription.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Most RAG tutorials reach for Pinecone, Chroma, or Weaviate as the vector store. Those are all fine services, but they add another cost line, another auth boundary, and a dependency you don't control. If you're already running Postgres — and for multi-tenant SaaS, you should be — the pgvector extension gives you vector similarity search inside your existing database, protected by the same Row-Level Security policies you already have.&lt;/p&gt;

&lt;p&gt;This post covers the full query path in Sift: how a user's question becomes a vector, how pgvector finds the closest document chunks, and how Claude turns those chunks into a cited answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  What RAG Actually Does
&lt;/h2&gt;

&lt;p&gt;The core idea is simple. At query time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Embed the user's question with the same model used to embed the documents&lt;/li&gt;
&lt;li&gt;Find the document chunks whose embeddings are closest to the question embedding&lt;/li&gt;
&lt;li&gt;Send those chunks to an LLM and tell it to answer the question using only that context&lt;/li&gt;
&lt;li&gt;Return the answer with numbered citations linking back to the source text&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. The sophistication is in the details of each step.&lt;/p&gt;
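&lt;p&gt;To make step 2 concrete, here is a small in-memory sketch of "find the closest chunks" using cosine similarity. It is an illustration only: in Sift the ranking happens inside Postgres through pgvector, and the &lt;code&gt;Chunk&lt;/code&gt; shape and &lt;code&gt;topK&lt;/code&gt; helper below are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical in-memory chunk shape, for illustration only.
interface Chunk {
  id: string
  text: string
  embedding: number[]
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0)
}

// Cosine similarity; assumes equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Rank chunks by similarity to the question embedding and keep the best k.
function topK(question: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map(chunk => ({ chunk, score: cosineSimilarity(question, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(scored => scored.chunk)
}
```

&lt;p&gt;A linear scan like this touches every chunk on every query, which is fine for a toy example but is the work a database-side vector search is meant to take over.&lt;/p&gt;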




&lt;h2&gt;
  
  
  Embeddings with Bedrock Titan Embed v2
&lt;/h2&gt;

&lt;p&gt;Both the pipeline (at ingest time) and the chat handler (at query time) use the same embedding model: &lt;code&gt;amazon.titan-embed-text-v2:0&lt;/code&gt;. Using the same model for both sides of the search is a hard requirement — embeddings from different models live in incompatible vector spaces.&lt;/p&gt;

&lt;p&gt;The Python implementation in the pipeline's shared module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;EMBED_MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amazon.titan-embed-text-v2:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dimensions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;normalize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_client&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EMBED_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contentType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;accept&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two parameters worth noting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;dimensions: 1024&lt;/code&gt;&lt;/strong&gt; — Titan Embed v2 supports multiple output sizes (256, 512, or 1024 dimensions). Fewer dimensions mean smaller storage and faster search at the cost of some precision. 1024 is the maximum and gives the best retrieval quality; for a demo at this scale, there's no reason to trade it away.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;normalize: True&lt;/code&gt;&lt;/strong&gt; — this asks Bedrock to return a unit-length vector. For unit-length vectors, cosine similarity and dot product are the same number, so an inner-product search ranks results identically to a cosine search. pgvector can compute inner products slightly faster than cosine distance, and unit-length vectors make scores easier to reason about. It also means you don't have to normalize manually. Cosine distance itself ignores magnitude, but if you skip normalization and later switch to inner product or Euclidean distance, your similarity scores will be skewed by vector length rather than semantic meaning.&lt;/p&gt;
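&lt;p&gt;To make the normalization point concrete, here is a minimal sketch in plain Python. The toy three-dimensional vectors and the helper names (&lt;code&gt;l2_normalize&lt;/code&gt;, &lt;code&gt;dot&lt;/code&gt;, &lt;code&gt;cosine_similarity&lt;/code&gt;) are illustrative assumptions, not code from Sift. It shows that the raw dot product is dominated by magnitude, while the dot product of the normalized vectors matches cosine similarity.&lt;/p&gt;

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Two toy "embeddings" with very different magnitudes (hypothetical data).
a = [3.0, 4.0, 0.0]
b = [30.0, 0.0, 40.0]

raw_dot = dot(a, b)                               # grows with vector length
unit_dot = dot(l2_normalize(a), l2_normalize(b))  # magnitude removed
cos_sim = cosine_similarity(a, b)                 # same value as unit_dot
```

&lt;p&gt;With &lt;code&gt;normalize: True&lt;/code&gt;, Bedrock does the equivalent of &lt;code&gt;l2_normalize&lt;/code&gt; before the vector ever reaches the pipeline.&lt;/p&gt;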

&lt;p&gt;Authentication is IAM. The Lambda execution role has &lt;code&gt;bedrock:InvokeModel&lt;/code&gt; permission via its attached policy — no API keys, no secrets to rotate.&lt;/p&gt;




&lt;h2&gt;
  
  
  Schema: Storing Vectors in Postgres
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;document_chunks&lt;/code&gt; table has a &lt;code&gt;vector(1024)&lt;/code&gt; column — the native pgvector type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;            &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;document_id&lt;/span&gt;   &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tenant_id&lt;/span&gt;     &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;chunk_index&lt;/span&gt;   &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;       &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt;     &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;    &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;(1024)&lt;/code&gt; in the column type is a hard constraint — Postgres will reject inserts with a vector of any other dimension. That's a useful guardrail: if the embedding model changes and the dimension changes with it, the insert fails loudly rather than silently storing mismatched vectors.&lt;/p&gt;

&lt;h3&gt;
  
  
  The IVFFlat Index
&lt;/h3&gt;

&lt;p&gt;An exact nearest-neighbor search scans every vector in the table and computes distance to the query vector. For a small dataset that's fine. At tens of millions of chunks it becomes expensive.&lt;/p&gt;

&lt;p&gt;IVFFlat (Inverted File Flat) is an approximate nearest-neighbor index. It clusters the vectors into groups (called "lists") at index build time. At query time, it only searches the most promising lists rather than the entire table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;vector_cosine_ops&lt;/code&gt; tells the index to use cosine distance as its metric, which matches the &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator in the query. The &lt;code&gt;lists = 100&lt;/code&gt; parameter controls how many clusters to build. As a starting point, the pgvector docs suggest &lt;code&gt;rows / 1000&lt;/code&gt; for tables up to about 1M rows and &lt;code&gt;sqrt(rows)&lt;/code&gt; above that.&lt;/p&gt;
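&lt;p&gt;The pgvector README's starting points for IVFFlat (&lt;code&gt;rows / 1000&lt;/code&gt; lists up to 1M rows, &lt;code&gt;sqrt(rows)&lt;/code&gt; beyond that, and roughly &lt;code&gt;sqrt(lists)&lt;/code&gt; probes at query time) can be sketched as a small helper. The function name and return shape are assumptions for illustration, and the numbers are only where tuning begins, not a guarantee of good recall.&lt;/p&gt;

```python
import math


def suggested_ivfflat_params(row_count):
    """Starting points from the pgvector README; tune against measured recall."""
    if row_count > 1_000_000:
        lists = round(math.sqrt(row_count))
    else:
        # Never return zero lists for very small tables.
        lists = max(1, row_count // 1000)
    # Probes trade recall for speed; sqrt(lists) is the suggested first guess.
    probes = max(1, round(math.sqrt(lists)))
    return {"lists": lists, "probes": probes}
```

&lt;p&gt;By this heuristic, &lt;code&gt;lists = 100&lt;/code&gt; fits a table of roughly 100,000 chunks, and the matching query-time setting would be &lt;code&gt;SET ivfflat.probes = 10;&lt;/code&gt;.&lt;/p&gt;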

&lt;p&gt;&lt;strong&gt;The IVFFlat gotcha:&lt;/strong&gt; the index needs data to exist when it's built. The cluster centroids are computed from the rows present at build time, so an IVFFlat index built on an empty table has no meaningful clusters and gives poor recall. In Sift, the initial migration creates the index after the schema is established, and the seed data runs in the same migration. For a production system where the table grows continuously, HNSW is a better choice: it maintains good search quality as data is inserted without needing a rebuild.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inserting Vectors from Python
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;psycopg2&lt;/code&gt; driver doesn't natively understand the pgvector type. Rather than adding the &lt;code&gt;pgvector&lt;/code&gt; Python package (which requires a compiled extension and adds deploy complexity), the pipeline constructs a Postgres vector literal as a plain string and casts it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vector_literal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    INSERT INTO document_chunks
        (document_id, tenant_id, chunk_index, content, embedding)
    VALUES (%s, %s, %s, %s, %s::vector)
    ON CONFLICT DO NOTHING
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_literal&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



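&lt;p&gt;The &lt;code&gt;::vector&lt;/code&gt; cast in the SQL converts the string to the native vector type at insert time. This works on any Postgres driver, any Lambda architecture (x86 or ARM), without native extensions. The &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; handles at-least-once delivery from the Step Functions Map state: if an EmbedChunk Lambda retries, it won't create duplicate chunks. That only holds if a unique constraint covers the chunk's natural key, for example &lt;code&gt;(document_id, chunk_index)&lt;/code&gt;. The table definition above shows only the primary key on &lt;code&gt;id&lt;/code&gt;, which is freshly generated for every insert and so never conflicts on its own.&lt;/p&gt;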
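&lt;p&gt;Because the literal is built by hand, it can be worth validating the embedding before it reaches the database. The sketch below is one way to do that; &lt;code&gt;to_vector_literal&lt;/code&gt; and &lt;code&gt;EMBED_DIMENSIONS&lt;/code&gt; are hypothetical names, not part of the Sift pipeline. It checks the dimension against the &lt;code&gt;vector(1024)&lt;/code&gt; column and refuses NaN or infinite values, so a bad embedding fails in Python with a clear message instead of as a database error.&lt;/p&gt;

```python
import math

EMBED_DIMENSIONS = 1024  # assumed to match the vector(1024) column


def to_vector_literal(embedding, dimensions=EMBED_DIMENSIONS):
    """Format floats as a pgvector text literal such as "[0.1,0.2]"."""
    if len(embedding) != dimensions:
        raise ValueError(f"expected {dimensions} values, got {len(embedding)}")
    values = [float(v) for v in embedding]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinity")
    # repr() round-trips Python floats without losing precision.
    return "[" + ",".join(repr(v) for v in values) + "]"
```

&lt;p&gt;The result is passed as the last query parameter exactly as &lt;code&gt;vector_literal&lt;/code&gt; is in the insert above, with the &lt;code&gt;::vector&lt;/code&gt; cast left in the SQL.&lt;/p&gt;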




&lt;h2&gt;
  
  
  Similarity Search
&lt;/h2&gt;

&lt;p&gt;At query time, the C# &lt;code&gt;ChatService&lt;/code&gt; embeds the user's question and runs the search. The same vector literal approach works from the .NET side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChunkResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;SearchChunksAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Guid&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;vectorLiteral&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;$"[&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;","&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s"&gt;]"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;TenantContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateCommand&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CommandText&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;        &lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunk_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="n"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
        &lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;
        &lt;span class="n"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;
        &lt;span class="n"&gt;ORDER&lt;/span&gt; &lt;span class="n"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
        &lt;span class="n"&gt;LIMIT&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
        &lt;span class="s"&gt;""";
&lt;/span&gt;    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddWithValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NpgsqlTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NpgsqlDbType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vectorLiteral&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddWithValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TopK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator is pgvector's cosine distance operator. It returns values between 0 and 2 — 0 means identical vectors, 2 means pointing in opposite directions. Ordering by ascending distance gives the most semantically similar chunks first.&lt;/p&gt;
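&lt;p&gt;The ordering logic is easy to reproduce outside the database. This is a brute-force sketch in plain Python of what the query's &lt;code&gt;ORDER BY distance LIMIT k&lt;/code&gt; computes when no approximate index is involved. The two-dimensional vectors, chunk names, and function names are made up for illustration.&lt;/p&gt;

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity: 0 = same direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, chunks, k):
    """Exact search: score every chunk, sort ascending, keep the first k."""
    scored = [(cosine_distance(query, vec), name) for name, vec in chunks]
    return sorted(scored)[:k]


# Hypothetical chunks as (name, embedding) pairs.
chunks = [
    ("opposite", [-1.0, 0.0]),
    ("orthogonal", [0.0, 3.0]),
    ("same direction", [2.0, 0.0]),
]
results = top_k([1.0, 0.0], chunks, k=2)
```

&lt;p&gt;The chunk pointing the same way as the query comes first with distance 0, the orthogonal one follows at distance 1, and the opposite one (distance 2) is cut by the limit.&lt;/p&gt;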

&lt;p&gt;Notice that &lt;code&gt;TenantContext.SetAsync&lt;/code&gt; runs before the query. This sets the Postgres session variable that the RLS policy reads. The similarity search is automatically tenant-scoped: there's no &lt;code&gt;WHERE tenant_id = $3&lt;/code&gt; in this query, but Postgres applies the policy transparently. A user from Acme Corp can only find chunks from their own documents, even though the index spans all tenants' data. One consequence: with an approximate index the tenant filter is applied after the index scan, so a tenant whose chunks are a small share of the table can get back fewer than &lt;code&gt;TopK&lt;/code&gt; rows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why 8 chunks?&lt;/strong&gt; &lt;code&gt;TopK = 8&lt;/code&gt; is a constant in &lt;code&gt;ChatService.cs&lt;/code&gt;. Eight chunks at ~512 tokens each is roughly 4,000 tokens of context — enough to answer most questions without overwhelming the model or the latency budget. The tradeoff is real: more chunks means higher recall (better chance the right information is included) at the cost of slower generation and more noise in the prompt. Eight is a practical default, not a theoretically derived optimum.&lt;/p&gt;




&lt;h2&gt;
  
  
  The RAG Prompt
&lt;/h2&gt;

&lt;p&gt;With the top 8 chunks retrieved, the service builds the prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n\n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="s"&gt;$"[&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] From \"&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Filename&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;\" (chunk &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChunkIndex&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;):\n&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;systemPrompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;    &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;helpful&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Answer&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;only&lt;/span&gt;
    &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;provided&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="n"&gt;excerpts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Cite&lt;/span&gt; &lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="n"&gt;sources&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="err"&gt;[1],&lt;/span&gt; &lt;span class="err"&gt;[2],&lt;/span&gt; &lt;span class="nn"&gt;etc.&lt;/span&gt;
    &lt;span class="n"&gt;If&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="n"&gt;found&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;excerpts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;say&lt;/span&gt; &lt;span class="n"&gt;so&lt;/span&gt; &lt;span class="n"&gt;clearly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
    &lt;span class="s"&gt;""";
&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;userMessage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;$"Document excerpts:\n&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;\n\nQuestion: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each chunk gets a numbered label &lt;code&gt;[1]&lt;/code&gt;, &lt;code&gt;[2]&lt;/code&gt;, etc., with the filename and chunk index. The system prompt instructs Claude to use those same numbers as inline citations. The model sees something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[1] From "Q3_Report.pdf" (chunk 4):
Revenue for Q3 was $4.2M, up 18% year-over-year driven by enterprise contracts...

[2] From "Q3_Report.pdf" (chunk 5):
The increase was concentrated in the healthcare vertical, which grew 31%...

Question: What drove the Q3 revenue increase?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And responds with an answer that cites &lt;code&gt;[1]&lt;/code&gt; and &lt;code&gt;[2]&lt;/code&gt; inline, so the reader knows exactly which passage each claim came from.&lt;/p&gt;

&lt;p&gt;The model for this step is Claude Haiku 4.5 — fast and cheap for a task that's mostly about summarizing and organizing provided context rather than knowledge retrieval or reasoning. The &lt;code&gt;max_tokens: 1024&lt;/code&gt; cap keeps response times predictable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Citations as First-Class Data
&lt;/h3&gt;

&lt;p&gt;The response doesn't just return the answer string. The &lt;code&gt;ChatResponse&lt;/code&gt; model carries a parallel citations array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;             &lt;span class="n"&gt;Answer&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ChatCitation&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Citations&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatCitation&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Guid&lt;/span&gt;   &lt;span class="n"&gt;DocumentId&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Filename&lt;/span&gt;   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Excerpt&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="n"&gt;ChunkIndex&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each citation includes the first 200 characters of the chunk's content. The React frontend renders them as expandable cards below the answer — the user can click &lt;code&gt;[1]&lt;/code&gt; to see the exact excerpt that grounded that part of the response, with the source document and chunk position shown.&lt;/p&gt;
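&lt;p&gt;A rough Python sketch of how citations could be assembled from the answer text follows. The real logic lives in the C# &lt;code&gt;ChatService&lt;/code&gt; and isn't shown in this post, so everything here is an assumption: the &lt;code&gt;Chunk&lt;/code&gt; type, the &lt;code&gt;build_citations&lt;/code&gt; name, and the choice to return only the labels the model actually used. The 200-character excerpt and the 1-based labels do come from the description above.&lt;/p&gt;

```python
import re
from dataclasses import dataclass

EXCERPT_CHARS = 200  # excerpt length described in the post


@dataclass
class Chunk:
    document_id: str
    filename: str
    chunk_index: int
    content: str


def build_citations(answer, chunks):
    """Return one citation per [n] marker that maps to a retrieved chunk."""
    cited = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    citations = []
    for number in cited:
        if number not in range(1, len(chunks) + 1):
            continue  # the model cited a label that was never in the prompt
        chunk = chunks[number - 1]  # labels are 1-based, lists are 0-based
        citations.append({
            "label": number,
            "document_id": chunk.document_id,
            "filename": chunk.filename,
            "chunk_index": chunk.chunk_index,
            "excerpt": chunk.content[:EXCERPT_CHARS],
        })
    return citations
```

&lt;p&gt;Dropping out-of-range labels is a deliberate choice in this sketch: a citation the frontend can't resolve to a real chunk would undermine the verification the cards exist to provide.&lt;/p&gt;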

&lt;p&gt;This matters for trust. A RAG system that returns confident-sounding answers with no way to verify them is worse than one that shows its work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Limitations and What Production Would Change
&lt;/h2&gt;

&lt;p&gt;The implementation above works well at demo scale. A few things I'd change for a real production deployment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chunking strategy.&lt;/strong&gt; The sliding-window chunker in &lt;a href="https://dev.to/josh_blair/serverless-document-pipelines-with-aws-step-functions-part-3-2111"&gt;Part 3&lt;/a&gt; splits on character count, not semantic boundaries. A 512-token window can cut off mid-sentence, mid-table, or mid-list. Better approaches: a recursive sentence splitter that tries to preserve paragraph boundaries, or a semantic chunker that uses an embedding model to detect topic shifts. The trade-off is complexity and ingest latency.&lt;/p&gt;
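&lt;p&gt;As a rough idea of what a sentence-aware chunker could look like, here is a minimal sketch. It is not the Part 3 chunker: the function name, the character-based &lt;code&gt;max_chars&lt;/code&gt; budget, and the regex-based sentence split are all assumptions. The split treats &lt;code&gt;.&lt;/code&gt;, &lt;code&gt;!&lt;/code&gt; or &lt;code&gt;?&lt;/code&gt; followed by whitespace as a sentence end, so it will still stumble on abbreviations like "e.g. this", and it has no overlap between chunks.&lt;/p&gt;

```python
import re


def chunk_by_sentence(text, max_chars=2000):
    """Pack whole sentences into chunks of at most max_chars where possible."""
    # A sentence ends at ., ! or ? followed by whitespace, or at end of text.
    pieces = re.findall(r".+?(?:[.!?](?=\s)|$)", text, flags=re.S)
    sentences = [p.strip() for p in pieces if p.strip()]

    chunks, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

&lt;p&gt;A single sentence longer than &lt;code&gt;max_chars&lt;/code&gt; is kept whole rather than cut, which trades a strict size limit for never splitting mid-sentence.&lt;/p&gt;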

&lt;p&gt;&lt;strong&gt;Index type.&lt;/strong&gt; IVFFlat is good for static or slowly-growing datasets, but it degrades as data is inserted after the index is built — you need periodic reindexing. HNSW (Hierarchical Navigable Small World) maintains search quality dynamically as data grows, at the cost of higher memory usage. For a production system with continuous ingestion, HNSW is the right default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reranking.&lt;/strong&gt; Vector similarity is a good first filter but not a perfect one. A cross-encoder reranker — a small model that takes (question, chunk) pairs and scores their relevance directly — can significantly improve the precision of the final context window. The typical pattern is: retrieve top 20–50 chunks with vector search, rerank with a cross-encoder, pass the top 8 to the LLM.&lt;/p&gt;
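&lt;p&gt;The retrieve-then-rerank pattern can be sketched with a pluggable scoring function. In the code below, &lt;code&gt;rerank&lt;/code&gt; and &lt;code&gt;word_overlap&lt;/code&gt; are hypothetical names, and &lt;code&gt;word_overlap&lt;/code&gt; is only a toy stand-in so the example runs without a model. A real deployment would pass a cross-encoder's scoring call as &lt;code&gt;score_fn&lt;/code&gt; instead.&lt;/p&gt;

```python
def rerank(question, candidates, score_fn, top_n=8):
    """Re-order vector-search candidates by relevance score, best first."""
    scored = [(score_fn(question, text), text) for text in candidates]
    # Stable sort: candidates with equal scores keep their retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


def word_overlap(question, text):
    """Toy scorer: number of lowercase words shared with the question."""
    q_words = set(question.lower().split())
    return len(q_words.intersection(text.lower().split()))


# Hypothetical candidates, in the order vector search returned them.
candidates = [
    "Headcount grew in the third quarter.",
    "Q3 revenue increase was driven by enterprise contracts.",
    "The office moved to a new building.",
]
best = rerank("what drove the q3 revenue increase", candidates, word_overlap, top_n=2)
```

&lt;p&gt;Here the second candidate moves to the front because it shares the most words with the question. Keeping the scorer as a parameter means the retrieval code doesn't change when the toy function is swapped for a real model.&lt;/p&gt;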

&lt;p&gt;&lt;strong&gt;Streaming.&lt;/strong&gt; The current API waits for Claude to finish generating the full answer before returning it. For longer answers that can take 3–5 seconds, that's a noticeable pause. Lambda Function URLs support response streaming, which would let the frontend display tokens as they arrive. API Gateway HTTP APIs don't support streaming, so switching to Function URLs for the chat endpoint would be the path there.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/building-the-react-frontend-document-library-and-chat-ui-part-5-22li"&gt;Part 5&lt;/a&gt;&lt;/strong&gt; covers the React frontend: how the upload flow works, the polling pattern that drives the document status cards, and how Amplify's auth integration wires up the Cognito token flow.&lt;/p&gt;

&lt;p&gt;The code for this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;backend/shared/bedrock.py&lt;/code&gt; — embedding call, normalize flag&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;migrations/001_initial_schema.sql&lt;/code&gt; — &lt;code&gt;vector(1024)&lt;/code&gt; column, IVFFlat index&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/pipeline/embed/embed_handler.py&lt;/code&gt; — vector literal insert, &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/src/Sift.Api/Services/ChatService.cs&lt;/code&gt; — full query path: embed → search → generate&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/src/Sift.Api/Models/Chat.cs&lt;/code&gt; — response shape with citations&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>postgres</category>
      <category>python</category>
    </item>
    <item>
      <title>Serverless Document Pipelines with AWS Step Functions (Part 3)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:02:23 +0000</pubDate>
      <link>https://forem.com/josh_blair/serverless-document-pipelines-with-aws-step-functions-part-3-2111</link>
      <guid>https://forem.com/josh_blair/serverless-document-pipelines-with-aws-step-functions-part-3-2111</guid>
      <description>&lt;h1&gt;
  
  
  Serverless Document Pipelines with AWS Step Functions (Part 3)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;Why I chose Step Functions over SQS + Lambda — and what the execution history is actually worth.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Every async processing pipeline starts the same way: a file lands somewhere, something needs to happen to it in multiple stages, and you need it to be reliable. The obvious architecture is SQS queues chained between Lambda functions. It's battle-tested, it scales, and you've probably built it before.&lt;/p&gt;

&lt;p&gt;I deliberately chose not to use it here.&lt;/p&gt;

&lt;p&gt;Sift's document processing pipeline has six states: extract text, chunk it, generate embeddings in parallel, extract metadata with an LLM, and then mark the document either ready or failed. I implemented all of it as a Step Functions Express Workflow. This post covers why, how the state machine is structured, and what the Map state for parallel embedding actually does.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Trigger: S3 → EventBridge → Step Functions
&lt;/h2&gt;

&lt;p&gt;When a user uploads a document, the browser sends it directly to S3 via a presigned URL. The API never sees the file content — it just issues the URL and records the pending document in the database. From there, the pipeline starts automatically.&lt;/p&gt;

&lt;p&gt;The trigger chain has two hops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hop 1: S3 to EventBridge.&lt;/strong&gt; The uploads bucket has EventBridge notifications enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;UploadsBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;NotificationConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;EventBridgeConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;EventBridgeEnabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one flag makes the bucket publish its events to the default EventBridge bus automatically, including an &lt;code&gt;Object Created&lt;/code&gt; event for every upload. No SNS topic, no S3 notification configuration specifying ARNs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hop 2: EventBridge to Step Functions.&lt;/strong&gt; An EventBridge rule matches those events and triggers the state machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;S3UploadRule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Events::Rule&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;EventPattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;aws.s3&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;detail-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Object Created&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;detail&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;UploadsBucket&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;Targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TriggerPipeline&lt;/span&gt;
        &lt;span class="na"&gt;Arn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;PipelineStateMachine&lt;/span&gt;
        &lt;span class="na"&gt;RoleArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;EventBridgeToSfnRole.Arn&lt;/span&gt;
        &lt;span class="na"&gt;InputTransformer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;InputPathsMap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$.detail.object.key"&lt;/span&gt;
            &lt;span class="na"&gt;bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$.detail.bucket.name"&lt;/span&gt;
          &lt;span class="na"&gt;InputTemplate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{"s3Key":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"[key]",&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"bucketName":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"[bucket]"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(EventBridge InputTransformer uses &lt;code&gt;&amp;lt;placeholder&amp;gt;&lt;/code&gt; angle-bracket syntax; shown here with brackets to avoid rendering issues.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;InputTransformer&lt;/code&gt; is doing something important: it reshapes the raw S3 event (which carries fields the pipeline doesn't need, such as the ETag, object size, and requester) into a clean minimal payload before Step Functions even sees it. The state machine starts with just &lt;code&gt;{ s3Key, bucketName }&lt;/code&gt;.&lt;/p&gt;
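
&lt;p&gt;To make the reshaping concrete, here is a small Python simulation of what the transformer does to an event. The &lt;code&gt;transform&lt;/code&gt; helper is hypothetical, and the sample event is abridged with made-up values; EventBridge does this work itself, so none of this code runs in the pipeline.&lt;/p&gt;

```python
import json


def transform(event: dict) -> dict:
    """Mimic the rule's InputTransformer: read two paths, fill the template."""
    paths = {
        "key": event["detail"]["object"]["key"],      # $.detail.object.key
        "bucket": event["detail"]["bucket"]["name"],  # $.detail.bucket.name
    }
    return {"s3Key": paths["key"], "bucketName": paths["bucket"]}


# Abridged, illustrative Object Created event; every value here is invented.
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "example-uploads"},
        "object": {"key": "tenant-a/doc-123/report.pdf", "size": 1024, "etag": "abc123"},
        "reason": "PutObject",
    },
}

print(json.dumps(transform(sample_event)))
```

&lt;p&gt;Everything outside the two mapped paths is dropped, which is why the first Lambda only ever sees the key and the bucket name.&lt;/p&gt;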

&lt;p&gt;&lt;strong&gt;Why EventBridge instead of S3 → Lambda directly?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;S3 supports direct Lambda triggers. The reason to go through EventBridge anyway is decoupling: the Step Functions ARN isn't embedded in the S3 bucket configuration. If I wanted to add a second consumer — say, a Lambda that indexes the filename for search — I'd add another EventBridge rule target, not modify the S3 bucket. The bucket doesn't know what listens to its events.&lt;/p&gt;




&lt;h2&gt;
  
  
  The State Machine
&lt;/h2&gt;

&lt;p&gt;The entire pipeline is defined as YAML inside the SAM template. Here's the full structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;PipelineStateMachine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::StateMachine&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;sift-pipeline-${Env}&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;EXPRESS&lt;/span&gt;
    &lt;span class="na"&gt;Definition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Comment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sift document ingestion pipeline&lt;/span&gt;
      &lt;span class="na"&gt;StartAt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ExtractText&lt;/span&gt;
      &lt;span class="na"&gt;States&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ExtractText&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;ExtractTextFunction.Arn&lt;/span&gt;
          &lt;span class="na"&gt;Retry&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.TaskFailed&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;IntervalSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
              &lt;span class="na"&gt;MaxAttempts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
          &lt;span class="na"&gt;Catch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.error&lt;/span&gt;
              &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkFailed&lt;/span&gt;
          &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ChunkText&lt;/span&gt;

        &lt;span class="na"&gt;ChunkText&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;ChunkTextFunction.Arn&lt;/span&gt;
          &lt;span class="na"&gt;Catch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.error&lt;/span&gt;
              &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkFailed&lt;/span&gt;
          &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GenerateEmbeddings&lt;/span&gt;

        &lt;span class="na"&gt;GenerateEmbeddings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Map&lt;/span&gt;
          &lt;span class="na"&gt;ItemsPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.chunks&lt;/span&gt;
          &lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;documentId.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.documentId&lt;/span&gt;
            &lt;span class="na"&gt;tenantId.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;$.tenantId&lt;/span&gt;
            &lt;span class="na"&gt;index.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;$$.Map.Item.Value.index&lt;/span&gt;
            &lt;span class="na"&gt;content.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;$$.Map.Item.Value.content&lt;/span&gt;
          &lt;span class="na"&gt;MaxConcurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
          &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.embeddingResults&lt;/span&gt;
          &lt;span class="na"&gt;Iterator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;StartAt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;EmbedChunk&lt;/span&gt;
            &lt;span class="na"&gt;States&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;EmbedChunk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;EmbedChunkFunction.Arn&lt;/span&gt;
                &lt;span class="na"&gt;Retry&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.TaskFailed&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
                    &lt;span class="na"&gt;IntervalSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
                    &lt;span class="na"&gt;MaxAttempts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
                &lt;span class="na"&gt;End&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;Catch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.error&lt;/span&gt;
              &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkFailed&lt;/span&gt;
          &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ExtractMetadata&lt;/span&gt;

        &lt;span class="na"&gt;ExtractMetadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;ExtractMetadataFunction.Arn&lt;/span&gt;
          &lt;span class="na"&gt;Catch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.error&lt;/span&gt;
              &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkFailed&lt;/span&gt;
          &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkReady&lt;/span&gt;

        &lt;span class="na"&gt;MarkReady&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;MarkReadyFunction.Arn&lt;/span&gt;
          &lt;span class="na"&gt;End&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

        &lt;span class="na"&gt;MarkFailed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;MarkFailedFunction.Arn&lt;/span&gt;
          &lt;span class="na"&gt;End&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9w9arbdu94vcpog6ud83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9w9arbdu94vcpog6ud83.png" alt="Sift document pipeline diagram"&gt;&lt;/a&gt;&lt;/p&gt;
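
&lt;p&gt;The &lt;code&gt;Parameters&lt;/code&gt; block on the Map state is the least obvious part of the definition, so here is a rough Python simulation of the input each &lt;code&gt;EmbedChunk&lt;/code&gt; iteration receives. It assumes, as the paths in the definition imply, that the state input carries a &lt;code&gt;chunks&lt;/code&gt; array of objects with &lt;code&gt;index&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; fields. &lt;code&gt;map_item_inputs&lt;/code&gt; is a hypothetical helper, not part of Sift or Step Functions.&lt;/p&gt;

```python
def map_item_inputs(state_input: dict) -> list[dict]:
    """Build the per-iteration inputs described by the Map state's Parameters."""
    return [
        {
            "documentId": state_input["documentId"],  # $.documentId
            "tenantId": state_input["tenantId"],      # $.tenantId
            "index": item["index"],                   # $$.Map.Item.Value.index
            "content": item["content"],               # $$.Map.Item.Value.content
        }
        for item in state_input["chunks"]             # ItemsPath: $.chunks
    ]


example_input = {
    "documentId": "doc-123",
    "tenantId": "tenant-a",
    "chunks": [
        {"index": 0, "content": "first chunk"},
        {"index": 1, "content": "second chunk"},
    ],
}

for item_input in map_item_inputs(example_input):
    print(item_input)
```

&lt;p&gt;Step Functions runs up to &lt;code&gt;MaxConcurrency&lt;/code&gt; (5 here) of these iterations at a time, so each Lambda invocation gets one chunk plus the two IDs it needs to write the row.&lt;/p&gt;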

&lt;p&gt;Let's walk through each stage.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage 1: ExtractText
&lt;/h2&gt;

&lt;p&gt;The first Lambda gets &lt;code&gt;{ s3Key, bucketName }&lt;/code&gt; and has two jobs: parse the tenant and document IDs from the key, and extract plain text from whatever file type was uploaded.&lt;/p&gt;

&lt;p&gt;The S3 key format is &lt;code&gt;{tenantId}/{documentId}/{filename}&lt;/code&gt; — the same prefix structure used for tenant isolation in S3 (covered in &lt;a href="https://dev.to/josh_blair/multi-tenant-auth-with-cognito-and-postgresql-row-level-security-part-2-5d30"&gt;Part 2&lt;/a&gt;). Parsing it is a single split:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;parts&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tenant_id&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;document_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;filename&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
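
&lt;p&gt;The split above assumes the key always has three segments. A slightly more defensive version, sketched here with a hypothetical &lt;code&gt;parse_key&lt;/code&gt; helper that is not part of the original handler, rejects malformed keys up front so the failure carries a clear message:&lt;/p&gt;

```python
def parse_key(s3_key: str) -> tuple[str, str, str]:
    """Split '{tenantId}/{documentId}/{filename}' and reject keys that don't match."""
    # maxsplit=2 keeps any further slashes inside the filename segment.
    parts = s3_key.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Unexpected S3 key format: {s3_key!r}")
    tenant_id, document_id, filename = parts
    return tenant_id, document_id, filename


print(parse_key("tenant-a/doc-123/reports/q1.pdf"))
```

&lt;p&gt;Because the split stops after two separators, a filename that itself contains slashes stays in one piece.&lt;/p&gt;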



&lt;p&gt;Text extraction is dispatched on file extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_extract_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_extract_docx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_extract_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsupported file extension: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ext&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For PDFs, &lt;code&gt;pdfplumber&lt;/code&gt; handles multi-page extraction and tracks the page count. For DOCX, &lt;code&gt;python-docx&lt;/code&gt; walks the paragraph list. For CSV, &lt;code&gt;pandas&lt;/code&gt; converts the dataframe to a string representation with column names in the header — not ideal for prose, but searchable and embeddable. The page count flows downstream to the &lt;code&gt;documents&lt;/code&gt; table and surfaces in the UI.&lt;/p&gt;

&lt;p&gt;The Lambda also sets the document status to &lt;code&gt;processing&lt;/code&gt; before returning. This tells the frontend's polling logic that the pipeline is running and to keep checking.&lt;/p&gt;

&lt;p&gt;The return value passes everything forward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenantId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documentId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filename&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pageCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;page_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Stage 2: ChunkText
&lt;/h2&gt;

&lt;p&gt;This stage splits the extracted text into overlapping windows. The constants:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;CHUNK_SIZE&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;   &lt;span class="c1"&gt;# tokens, approximated as characters / 4
&lt;/span&gt;&lt;span class="n"&gt;CHUNK_OVERLAP&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;    &lt;span class="c1"&gt;# tokens of overlap between adjacent chunks
&lt;/span&gt;&lt;span class="n"&gt;CHARS_PER_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The overlap is the important detail. If a chunk boundary lands in the middle of a sentence that contains the answer to a user's question, a chunker with no overlap produces two fragments, each with half the context, and neither is likely to score well in similarity search. With a 64-token overlap, adjacent chunks share roughly 256 characters (a sentence or two), so the answer has a better chance of appearing intact in at least one chunk.&lt;/p&gt;
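
&lt;p&gt;The arithmetic behind those constants is worth spelling out. This sketch assumes the handler converts the token budgets to characters by multiplying by &lt;code&gt;CHARS_PER_TOKEN&lt;/code&gt;; &lt;code&gt;estimate_chunk_count&lt;/code&gt; is a hypothetical helper, and its result is only an estimate because the real chunker breaks on word boundaries.&lt;/p&gt;

```python
import math

CHUNK_SIZE = 512
CHUNK_OVERLAP = 64
CHARS_PER_TOKEN = 4

size_chars = CHUNK_SIZE * CHARS_PER_TOKEN        # 2048 characters per chunk
overlap_chars = CHUNK_OVERLAP * CHARS_PER_TOKEN  # 256 characters shared with the next chunk
stride_chars = size_chars - overlap_chars        # 1792 new characters per chunk


def estimate_chunk_count(text_len: int) -> int:
    """Approximate number of chunks for a text of text_len characters."""
    if size_chars >= text_len:
        return 1
    # The first chunk covers size_chars; each later chunk adds stride_chars.
    return 1 + math.ceil((text_len - size_chars) / stride_chars)


print(estimate_chunk_count(10_000))
```

&lt;p&gt;At these settings the overlap is 12.5% of each chunk, which is the extra embedding work paid for the better boundary behavior.&lt;/p&gt;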

&lt;p&gt;The chunker is a sliding-window algorithm that splits on word boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;words&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;buf&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;buf_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;word_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;buf_len&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;word_len&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="c1"&gt;# Roll back by overlap characters
&lt;/span&gt;            &lt;span class="n"&gt;rolled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rolled_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;reversed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rolled_len&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
                &lt;span class="n"&gt;rolled&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;rolled_len&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;buf&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rolled&lt;/span&gt;
            &lt;span class="n"&gt;buf_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rolled_len&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;buf_len&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;word_len&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No external dependencies — pure Python standard library. That means cold starts for this Lambda are essentially free. The output is a list of &lt;code&gt;{ index, content }&lt;/code&gt; objects that becomes the input to the Map state.&lt;/p&gt;
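&lt;p&gt;As a rough illustration of that output shape, the chunk strings can be wrapped like this. The helper name &lt;code&gt;to_map_items&lt;/code&gt; is hypothetical and not taken from the repo; only the &lt;code&gt;index&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; keys come from the pipeline described above.&lt;/p&gt;

```python
def to_map_items(chunks: list[str]) -> list[dict]:
    """Wrap each chunk string as an {index, content} object for the Map state.

    Hypothetical helper: the name is an assumption, the key names are not.
    """
    return [{"index": i, "content": c} for i, c in enumerate(chunks)]


items = to_map_items(["first chunk", "second chunk"])
```

&lt;p&gt;Carrying the index with each item lets every downstream invocation know its position in the document without seeing the rest of the array.&lt;/p&gt;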




&lt;h2&gt;
  
  
  Stage 3: GenerateEmbeddings (The Map State)
&lt;/h2&gt;

&lt;p&gt;This is where Step Functions earns its keep.&lt;/p&gt;

&lt;p&gt;Embedding generation is the most time-consuming part of the pipeline. A 20-page PDF might produce 80–100 chunks, each requiring a separate Bedrock API call. Running them sequentially would be slow and wasteful. The Map state fans them out in parallel.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;GenerateEmbeddings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Map&lt;/span&gt;
  &lt;span class="na"&gt;ItemsPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.chunks&lt;/span&gt;
  &lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;documentId.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.documentId&lt;/span&gt;
    &lt;span class="na"&gt;tenantId.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;$.tenantId&lt;/span&gt;
    &lt;span class="na"&gt;index.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;$$.Map.Item.Value.index&lt;/span&gt;
    &lt;span class="na"&gt;content.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;$$.Map.Item.Value.content&lt;/span&gt;
  &lt;span class="na"&gt;MaxConcurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.embeddingResults&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things are happening here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;ItemsPath: $.chunks&lt;/code&gt;&lt;/strong&gt; tells Step Functions to iterate over the &lt;code&gt;chunks&lt;/code&gt; array from the previous state's output. Each item becomes one Lambda invocation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The &lt;code&gt;Parameters&lt;/code&gt; block&lt;/strong&gt; reshapes each iteration's input. Without it, each EmbedChunk Lambda invocation would receive only its own array item, &lt;code&gt;{ index, content }&lt;/code&gt;, with no way to tell which document or tenant the chunk belongs to. With it, &lt;code&gt;$$.Map.Item.Value.index&lt;/code&gt; and &lt;code&gt;$$.Map.Item.Value.content&lt;/code&gt; pull the current chunk's fields, and &lt;code&gt;documentId.$&lt;/code&gt; and &lt;code&gt;tenantId.$&lt;/code&gt; carry the parent context from the state input. The &lt;code&gt;$$&lt;/code&gt; prefix reads from the Step Functions context object rather than the state input.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;MaxConcurrency: 5&lt;/code&gt;&lt;/strong&gt; caps the parallelism. Bedrock has per-account request rate limits. With 100 chunks and no concurrency cap, all 100 invocations would fire simultaneously and most would get throttled, producing retries, latency, and noise. Five concurrent invocations keep throughput high while leaving headroom under the throttle threshold.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;ResultPath: $.embeddingResults&lt;/code&gt;&lt;/strong&gt; is subtle. Normally, a Map state replaces the entire state input with its result array. Setting a &lt;code&gt;ResultPath&lt;/code&gt; instead merges the results into the existing input under a new key. This is important: ExtractMetadata needs the &lt;code&gt;text&lt;/code&gt;, &lt;code&gt;tenantId&lt;/code&gt;, &lt;code&gt;documentId&lt;/code&gt;, and &lt;code&gt;chunks&lt;/code&gt; fields from earlier stages. Without &lt;code&gt;ResultPath&lt;/code&gt;, they'd be overwritten.&lt;/p&gt;
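&lt;p&gt;The data flow described above can be mimicked locally in a few lines. This is only a sketch of the semantics, not how Step Functions is implemented: &lt;code&gt;iteration_input&lt;/code&gt; and &lt;code&gt;run_map&lt;/code&gt; are hypothetical names, the worker is a stand-in for the EmbedChunk Lambda, and the loop runs sequentially, so it ignores &lt;code&gt;MaxConcurrency&lt;/code&gt;.&lt;/p&gt;

```python
from typing import Callable


def iteration_input(state_input: dict, item: dict) -> dict:
    """Mimic the Parameters block: parent context plus one chunk's fields."""
    return {
        "documentId": state_input["documentId"],
        "tenantId": state_input["tenantId"],
        "index": item["index"],
        "content": item["content"],
    }


def run_map(state_input: dict, worker: Callable[[dict], dict]) -> dict:
    """Mimic ItemsPath + ResultPath: iterate $.chunks, merge results under a new key."""
    results = [worker(iteration_input(state_input, item)) for item in state_input["chunks"]]
    merged = dict(state_input)            # the original fields survive
    merged["embeddingResults"] = results  # ResultPath: $.embeddingResults
    return merged


state = {
    "documentId": "doc-1",
    "tenantId": "t-1",
    "chunks": [{"index": 0, "content": "alpha"}, {"index": 1, "content": "beta"}],
}
# Stand-in worker; the real one would embed the chunk and insert it.
out = run_map(state, lambda ev: {"index": ev["index"], "status": "ok"})
```

&lt;p&gt;After the call, &lt;code&gt;out&lt;/code&gt; still holds &lt;code&gt;documentId&lt;/code&gt;, &lt;code&gt;tenantId&lt;/code&gt;, and &lt;code&gt;chunks&lt;/code&gt;, with the per-chunk results added alongside them.&lt;/p&gt;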

&lt;p&gt;Each EmbedChunk Lambda invocation does two things:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;# Bedrock Titan Embed v2
&lt;/span&gt;    &lt;span class="nf"&gt;_insert_chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documentId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenantId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;embed()&lt;/code&gt; call hits Titan Embed v2 (1024 dimensions). The insert uses &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; — if the Lambda retries after a partial failure, it won't create duplicate chunks.&lt;/p&gt;

&lt;p&gt;The vector gets written as a Postgres vector literal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vector_literal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO document_chunks (document_id, tenant_id, chunk_index, content, embedding) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VALUES (%s, %s, %s, %s, %s::vector)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_literal&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
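&lt;p&gt;The literal-building step can be pulled into a small helper with a dimension check. This is a sketch under assumptions: the function name &lt;code&gt;to_vector_literal&lt;/code&gt; and the check itself are not from the repo; the 1024 default mirrors the Titan Embed v2 dimension mentioned above.&lt;/p&gt;

```python
def to_vector_literal(embedding: list[float], dims: int = 1024) -> str:
    """Format an embedding as a pgvector text literal such as "[0.1,0.25]".

    Hypothetical helper; raising early on a wrong dimension count is an
    added safeguard, not something the original handler is known to do.
    """
    if len(embedding) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(embedding)}")
    return "[" + ",".join(str(v) for v in embedding) + "]"


literal = to_vector_literal([0.1, 0.25, -1.0], dims=3)
```

&lt;p&gt;The string is still passed as a bound parameter and cast with &lt;code&gt;::vector&lt;/code&gt; in SQL, so the embedding never gets spliced into the query text.&lt;/p&gt;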






&lt;h2&gt;
  
  
  Stage 4: ExtractMetadata
&lt;/h2&gt;

&lt;p&gt;Once all chunks are embedded, a final Bedrock call generates a summary and topic list for the document. This surfaces in the UI as the document card's description.&lt;/p&gt;

&lt;p&gt;The Lambda sends only the first 6,000 characters of the document text. That is a small fraction of Claude Haiku's context window, and the cap keeps the summary call fast and cheap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;excerpt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;6000&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The prompt asks for structured JSON output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;system&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a document analyst. Given document text, return a JSON object &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;with exactly two keys: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; (one paragraph, max 200 words) and &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="s"&gt;topics&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; (array of 3-7 short topic strings). &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Return only the JSON object, no other text.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LLMs sometimes wrap their JSON output in markdown code fences even when told not to. The handler strips them before parsing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;FENCE&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;  &lt;span class="c1"&gt;# markdown code fence marker
&lt;/span&gt;&lt;span class="n"&gt;cleaned&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;removeprefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FENCE&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;removeprefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FENCE&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;removesuffix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FENCE&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cleaned&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
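&lt;p&gt;Stripping the fences only guarantees parseable text, not the right shape. A minimal sketch of a stricter parser follows, assuming the two keys requested in the prompt above; the function name &lt;code&gt;parse_metadata&lt;/code&gt; and the validation rules are illustrative, not the handler's actual code.&lt;/p&gt;

```python
import json

FENCE = "`" * 3  # markdown code fence marker


def parse_metadata(response: str) -> dict:
    """Strip optional markdown fences, parse the JSON, and check the expected keys."""
    cleaned = (
        response.strip()
        .removeprefix(FENCE + "json")
        .removeprefix(FENCE)
        .removesuffix(FENCE)
        .strip()
    )
    data = json.loads(cleaned)  # raises json.JSONDecodeError on malformed output
    if (
        not isinstance(data, dict)
        or not isinstance(data.get("summary"), str)
        or not isinstance(data.get("topics"), list)
    ):
        raise ValueError("model output is missing 'summary' or 'topics'")
    return {"summary": data["summary"], "topics": [str(t) for t in data["topics"]]}


# Sample response wrapped in a fence, as a model might return it.
sample = FENCE + "json\n" + '{"summary": "A short note.", "topics": ["aws", "rag"]}' + "\n" + FENCE
parsed = parse_metadata(sample)
```

&lt;p&gt;Either exception would surface as a task failure, which the state machine can route like any other error. Note that &lt;code&gt;removeprefix&lt;/code&gt; and &lt;code&gt;removesuffix&lt;/code&gt; need Python 3.9 or later.&lt;/p&gt;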



&lt;p&gt;The results get written to the &lt;code&gt;documents&lt;/code&gt; table. The page count and chunk count from earlier stages are also persisted here. Both arrive in the state machine data, so no extra database reads are needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stages 5 and 6: MarkReady and MarkFailed
&lt;/h2&gt;

&lt;p&gt;These terminal states are simple status updates. MarkReady stamps &lt;code&gt;status = 'ready'&lt;/code&gt; and &lt;code&gt;processed_at = NOW()&lt;/code&gt;. MarkFailed records the error message (truncated to 1,000 characters) and sets &lt;code&gt;status = 'failed'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Every non-terminal state has a &lt;code&gt;Catch&lt;/code&gt; block that routes all errors to MarkFailed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Catch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;ResultPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.error&lt;/span&gt;
    &lt;span class="na"&gt;Next&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MarkFailed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ResultPath: $.error&lt;/code&gt; merges the error details into the state data under &lt;code&gt;$.error&lt;/code&gt; rather than replacing the entire input. That means MarkFailed still receives &lt;code&gt;documentId&lt;/code&gt; and &lt;code&gt;tenantId&lt;/code&gt; — it can always look up which document to update, even when the failure happens deep in an unexpected state.&lt;/p&gt;
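&lt;p&gt;To make that concrete, here is a sketch of how a MarkFailed handler might read its input. A caught error arrives as an object with &lt;code&gt;Error&lt;/code&gt; and &lt;code&gt;Cause&lt;/code&gt; fields; the function name, the returned field names, and the fallback message are assumptions for illustration, and the real handler would write these values to the &lt;code&gt;documents&lt;/code&gt; table.&lt;/p&gt;

```python
def mark_failed_fields(state: dict, limit: int = 1000) -> dict:
    """Pick out what MarkFailed needs from the merged state data.

    Hypothetical helper: returns the values instead of updating the database.
    """
    error = state.get("error", {})
    message = error.get("Cause") or error.get("Error") or "unknown error"
    return {
        "documentId": state["documentId"],  # still present thanks to ResultPath
        "tenantId": state["tenantId"],
        "status": "failed",
        "errorMessage": message[:limit],    # truncated to 1,000 characters by default
    }


fields = mark_failed_fields({
    "documentId": "doc-1",
    "tenantId": "t-1",
    "error": {"Error": "ThrottlingException", "Cause": "Rate exceeded"},
})
```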

&lt;p&gt;The pipeline status flows back to the React frontend through the &lt;code&gt;documents&lt;/code&gt; table. The UI polls the document status endpoint every few seconds and updates the card from &lt;code&gt;uploading&lt;/code&gt; → &lt;code&gt;processing&lt;/code&gt; → &lt;code&gt;ready&lt;/code&gt; (or &lt;code&gt;failed&lt;/code&gt; with the error message shown inline).&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Express Workflows, Not Standard
&lt;/h2&gt;

&lt;p&gt;Step Functions has two execution types. The choice matters for cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard Workflows&lt;/strong&gt; charge per state transition — $0.025 per 1,000 transitions. A pipeline with 100 chunks runs the Map state, which means 100 EmbedChunk transitions plus the surrounding states. At scale, that adds up fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Express Workflows&lt;/strong&gt; charge per execution and duration: $1.00 per million executions plus roughly $0.00001667 per GB-second of billed memory at the first pricing tier. For a pipeline that completes in 2–4 minutes, the cost per document is a fraction of a cent.&lt;/p&gt;

&lt;p&gt;The tradeoffs with Express: a maximum 5-minute duration, at-least-once (not exactly-once) execution semantics, and no support for the &lt;code&gt;.sync&lt;/code&gt; (Run a Job) or &lt;code&gt;.waitForTaskToken&lt;/code&gt; callback service integration patterns. None of those matter here: the pipeline completes in under 5 minutes for any realistic document size, no stage waits on a job or callback, and &lt;code&gt;ON CONFLICT DO NOTHING&lt;/code&gt; in the embed insert makes at-least-once execution safe.&lt;/p&gt;
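&lt;p&gt;A back-of-the-envelope comparison makes the pricing difference easier to see. Everything here is a rough sketch: the per-GB-second rate, the 64 MB billed memory, and the six fixed transitions around the Map state are assumptions, and current numbers should be checked against the Step Functions pricing page.&lt;/p&gt;

```python
STANDARD_PER_TRANSITION = 0.025 / 1000   # USD, the Standard figure quoted above
EXPRESS_PER_REQUEST = 1.00 / 1_000_000   # USD per execution
EXPRESS_PER_GB_SECOND = 0.00001667       # USD, assumed first-tier rate


def standard_cost(chunks: int, fixed_transitions: int = 6) -> float:
    """Rough Standard cost: one transition per chunk plus the surrounding states."""
    return (chunks + fixed_transitions) * STANDARD_PER_TRANSITION


def express_cost(duration_s: float, memory_gb: float = 0.0625) -> float:
    """Rough Express cost: one request plus billed memory times duration."""
    return EXPRESS_PER_REQUEST + duration_s * memory_gb * EXPRESS_PER_GB_SECOND


# 100 chunks on Standard versus a 3-minute Express execution.
print(f"standard: ${standard_cost(100):.6f}  express: ${express_cost(180):.6f}")
```

&lt;p&gt;Under these assumptions both figures are small per document; the gap only becomes noticeable at volume, since Standard cost grows with chunk count and Express cost grows with duration.&lt;/p&gt;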




&lt;h2&gt;
  
  
  The Real Argument for Step Functions
&lt;/h2&gt;

&lt;p&gt;None of the above required Step Functions specifically. You could build the same pipeline with SQS queues between Lambda functions. The chunked output goes on a queue; workers pick up items and embed them; another queue signals the metadata stage.&lt;/p&gt;

&lt;p&gt;The practical difference shows up when something breaks.&lt;/p&gt;

&lt;p&gt;When a document gets stuck in an SQS pipeline, diagnosing it means correlating CloudWatch log groups across multiple Lambda functions, checking DLQ message counts, and reconstructing the sequence of events from timestamps. The document is somewhere in the pipeline, but you're inferring state from indirect evidence.&lt;/p&gt;

&lt;p&gt;In Step Functions, you open the console, click the execution, and see this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ExtractText        → SUCCEEDED  (2.3s)
ChunkText          → SUCCEEDED  (0.1s)
GenerateEmbeddings → FAILED
  └─ EmbedChunk[47] → FAILED (attempt 3/3)
       Error: ThrottlingException
       Cause: Rate exceeded for model amazon.titan-embed-text-v2:0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The failure is pinpointed: chunk 47, third retry, Bedrock throttle. Every invocation's input and output is recorded in the execution history (for Express Workflows, that history lives in CloudWatch Logs, so logging has to be enabled on the state machine). For a portfolio project where the goal is demonstrating architectural thinking clearly, including to interviewers who might pull up the AWS console during a technical screen, that visibility is genuinely worth something.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/rag-and-vector-search-with-pgvector-and-amazon-bedrock-part-4-5294"&gt;Part 4&lt;/a&gt;&lt;/strong&gt; covers the RAG query path: how a user's question gets embedded, how pgvector finds the closest chunks across potentially thousands of document segments, and how the citation system links each paragraph of Claude's response back to the source text.&lt;/p&gt;

&lt;p&gt;The code for this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;infrastructure/template.yaml&lt;/code&gt; — state machine definition, EventBridge rule, InputTransformer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/pipeline/extract/extract_handler.py&lt;/code&gt; — file type dispatch, S3 key parsing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/pipeline/chunk/chunk_handler.py&lt;/code&gt; — sliding window chunker&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/pipeline/embed/embed_handler.py&lt;/code&gt; — Bedrock Titan Embed v2, pgvector insert&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/pipeline/metadata/metadata_handler.py&lt;/code&gt; — structured Haiku output, markdown fence stripping&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>python</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Multi-Tenant Auth with Cognito and PostgreSQL Row-Level Security (Part 2)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:01:31 +0000</pubDate>
      <link>https://forem.com/josh_blair/multi-tenant-auth-with-cognito-and-postgresql-row-level-security-part-2-5d30</link>
      <guid>https://forem.com/josh_blair/multi-tenant-auth-with-cognito-and-postgresql-row-level-security-part-2-5d30</guid>
      <description>&lt;h1&gt;
  
  
  Multi-Tenant Auth with Cognito and PostgreSQL Row-Level Security (Part 2)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;How a single Postgres session variable — &lt;code&gt;app.current_tenant_id&lt;/code&gt; — eliminates an entire class of data-leak bugs at the database level.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The hardest bug to find in a multi-tenant SaaS app is the one that silently returns the wrong data. It doesn't throw an exception. It doesn't return a 403. It just quietly hands tenant A's documents to tenant B, and you don't find out until a customer does.&lt;/p&gt;

&lt;p&gt;Most demo apps guard against this with &lt;code&gt;WHERE tenant_id = @tenantId&lt;/code&gt; clauses scattered through their query code. That works until one gets missed — a new endpoint, a refactor that drops the filter, a copy-paste that forgets to update the variable name. One missed clause is a data leak.&lt;/p&gt;

&lt;p&gt;Sift uses a different approach: the database enforces tenant isolation automatically, regardless of what the application code does. This post covers how the full chain works — from Cognito JWT to Postgres policy — and why each piece is necessary.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full Trust Chain
&lt;/h2&gt;

&lt;p&gt;Here's the sequence for every authenticated API request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser → Cognito (login)
       ← JWT signed with tenantId claim

Browser → API Gateway (request + JWT in Authorization header)
        → API Gateway validates JWT signature, rejects if invalid
        → Lambda receives validated claims in request context

Lambda  → extracts tenantId from request.RequestContext.Authorizer.Jwt.Claims
        → opens DB connection
        → calls set_config('app.current_tenant_id', tenantId, false)
        → runs query — Postgres RLS policy auto-filters every row
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step is independently enforced. A user can't forge the &lt;code&gt;tenantId&lt;/code&gt; in the JWT because they don't have Cognito's signing key. A request can't skip the JWT check because API Gateway rejects it before Lambda runs. A query can't bypass the RLS filter because the application user doesn't have &lt;code&gt;BYPASSRLS&lt;/code&gt;.&lt;/p&gt;
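&lt;p&gt;The last step of the chain, binding the tenant to the database session, might look roughly like this in Python. It is a sketch only: the function name &lt;code&gt;set_tenant&lt;/code&gt; is hypothetical, the UUID check assumes tenant IDs are UUIDs, and &lt;code&gt;FakeCursor&lt;/code&gt; stands in for a real DB-API cursor so the example runs without a database.&lt;/p&gt;

```python
import uuid


def set_tenant(cur, tenant_id: str) -> None:
    """Bind the tenant to the current session before any query runs."""
    canonical = str(uuid.UUID(tenant_id))  # raises ValueError on malformed input
    # The value is passed as a bound parameter, never formatted into the SQL.
    cur.execute("SELECT set_config('app.current_tenant_id', %s, false)", (canonical,))


class FakeCursor:
    """Stand-in for a DB-API cursor; records calls instead of hitting Postgres."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=()):
        self.calls.append((sql, params))


cur = FakeCursor()
set_tenant(cur, "aaaaaaaa-0000-0000-0000-000000000001")
```

&lt;p&gt;Because the third argument to &lt;code&gt;set_config&lt;/code&gt; is &lt;code&gt;false&lt;/code&gt;, the setting lasts for the whole session rather than one transaction, so the call has to run before any query on that connection.&lt;/p&gt;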

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Few02e80icoab32r483qx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Few02e80icoab32r483qx.png" alt="Sift multi-tenancy isolation diagram" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's walk through each layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: Cognito — Injecting tenantId into the JWT
&lt;/h2&gt;

&lt;p&gt;Cognito User Pools support custom attributes on user objects. In &lt;code&gt;template-cognito.yaml&lt;/code&gt;, the User Pool schema includes a &lt;code&gt;custom:tenantId&lt;/code&gt; attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;UserPool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Cognito::UserPool&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tenantId&lt;/span&gt;
        &lt;span class="na"&gt;AttributeDataType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
        &lt;span class="na"&gt;Mutable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;LambdaConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;PreTokenGeneration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;PreTokenLambda.Arn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;custom:tenantId&lt;/code&gt; attribute is stored on the Cognito user record, set at invite/signup time by an admin or provisioning flow. On its own, Cognito exposes a custom attribute in the ID token only under its prefixed name (&lt;code&gt;custom:tenantId&lt;/code&gt;), not as a plain &lt;code&gt;tenantId&lt;/code&gt; claim. That's where the Pre-Token Generation Lambda comes in.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pre-Token Generation Lambda
&lt;/h3&gt;

&lt;p&gt;Every time Cognito issues a token, it calls this Lambda before signing it. The Lambda reads the user's &lt;code&gt;custom:tenantId&lt;/code&gt; attribute and injects it as a top-level claim:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;DEFAULT_TENANT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;aaaaaaaa-0000-0000-0000-000000000001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# acme demo tenant
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;userAttributes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;custom:tenantId&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;DEFAULT_TENANT&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;claimsOverrideDetails&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;claimsToAddOrOverride&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tenantId&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this Lambda runs, the resulting JWT contains &lt;code&gt;"tenantId": "&amp;lt;uuid&amp;gt;"&lt;/code&gt; as a standard claim alongside &lt;code&gt;sub&lt;/code&gt;, &lt;code&gt;email&lt;/code&gt;, and the rest. Cognito then signs the token with its RSA private key.&lt;/p&gt;

&lt;p&gt;The security guarantee here is important: the &lt;code&gt;tenantId&lt;/code&gt; in the JWT is now as trustworthy as the user's identity itself. The browser receives a signed token and passes it on requests — it cannot alter the &lt;code&gt;tenantId&lt;/code&gt; without invalidating the signature.&lt;/p&gt;
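&lt;p&gt;One way to see the claim for yourself is to decode a token's payload segment. The sketch below does that for a locally built, hypothetical token with a dummy signature. It deliberately does not verify anything: the payload is only base64url-encoded, readable by anyone, and it is the signature check at API Gateway that makes the claim trustworthy.&lt;/p&gt;

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def unverified_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT checking the signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical token built locally for illustration; not a real Cognito token.
header = _b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
body = _b64url(json.dumps({"sub": "user-1", "tenantId": "aaaaaaaa-0000-0000-0000-000000000001"}).encode())
token = f"{header}.{body}.dummy-signature"
claims = unverified_claims(token)
```

&lt;p&gt;This kind of decoding is fine for debugging in a browser console or a test, but application code should only rely on claims that have already passed signature validation.&lt;/p&gt;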




&lt;h2&gt;
  
  
  Layer 2: API Gateway — JWT Validation for Free
&lt;/h2&gt;

&lt;p&gt;API Gateway HTTP APIs support a native JWT authorizer. No custom Lambda authorizer needed — API Gateway handles validation itself before the request reaches the Lambda function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;HttpApi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::HttpApi&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Auth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;DefaultAuthorizer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CognitoJwtAuthorizer&lt;/span&gt;
      &lt;span class="na"&gt;Authorizers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;CognitoJwtAuthorizer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;IdentitySource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$request.header.Authorization&lt;/span&gt;
          &lt;span class="na"&gt;JwtConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;issuer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;https://cognito-idp.${AWS::Region}.amazonaws.com/${UserPoolId}&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;UserPoolId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!ImportValue&lt;/span&gt;
                  &lt;span class="na"&gt;Fn::Sub&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sift-${Env}-UserPoolId&lt;/span&gt;
            &lt;span class="na"&gt;audience&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!ImportValue&lt;/span&gt;
                  &lt;span class="na"&gt;Fn::Sub&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sift-${Env}-UserPoolClientId&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration tells API Gateway to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Extract the Bearer token from the &lt;code&gt;Authorization&lt;/code&gt; header&lt;/li&gt;
&lt;li&gt;Verify the signature against Cognito's public keys (fetched from the issuer URL's JWKS endpoint)&lt;/li&gt;
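&lt;li&gt;Verify that the &lt;code&gt;iss&lt;/code&gt; claim matches the configured issuer and that the token has not expired&lt;/li&gt;
&lt;li&gt;Verify the &lt;code&gt;aud&lt;/code&gt; claim matches the expected client ID&lt;/li&gt;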
&lt;li&gt;Reject the request with a 401 if any check fails&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the JWT is invalid, expired, or has the wrong audience, Lambda never runs. The &lt;code&gt;tenantId&lt;/code&gt; claim that reaches Lambda has already been cryptographically validated.&lt;/p&gt;
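
&lt;p&gt;As a rough mental model of the claim checks (not API Gateway's actual code), the logic looks something like the Python sketch below. It assumes the signature has already been verified, covers only issuer, audience, and expiry, and uses placeholder values for the issuer URL and client ID. The real authorizer performs further checks that are omitted here.&lt;/p&gt;

```python
import time

# Hypothetical values standing in for the issuer and audience in the template.
EXPECTED_ISSUER = "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE"
EXPECTED_AUDIENCE = "example-client-id"


def check_claims(claims, now=None):
    """Return an HTTP status: 200 if the claims pass, 401 otherwise.

    Signature verification is assumed to have happened already.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != EXPECTED_ISSUER:
        return 401
    # Cognito ID tokens carry "aud"; access tokens carry "client_id".
    audience = claims.get("aud", claims.get("client_id"))
    if audience != EXPECTED_AUDIENCE:
        return 401
    # Reject tokens whose expiry is missing or not in the future.
    exp = claims.get("exp")
    if exp is None or not exp > now:
        return 401
    return 200


valid = {"iss": EXPECTED_ISSUER, "aud": EXPECTED_AUDIENCE, "exp": 2000}
print(check_claims(valid, now=1000))                      # 200
print(check_claims({**valid, "aud": "other"}, now=1000))  # 401
print(check_claims(valid, now=3000))                      # 401
```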




&lt;h2&gt;
  
  
  Layer 3: Lambda — Extracting the Claim
&lt;/h2&gt;

&lt;p&gt;Once API Gateway passes the request through, the validated JWT claims are available in the Lambda event's request context. Extracting &lt;code&gt;tenantId&lt;/code&gt; takes a two-line helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;Guid&lt;/span&gt; &lt;span class="nf"&gt;GetTenantId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;APIGatewayHttpApiV2ProxyRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authorizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"tenantId"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Guid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs at the top of every Lambda handler before any service code executes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;APIGatewayHttpApiV2ProxyResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;FunctionHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;APIGatewayHttpApiV2ProxyRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ILambdaContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;GetTenantId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cognitoSub&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authorizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="c1"&gt;// ... route to service&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;tenantId&lt;/code&gt; extracted here gets passed down to the service layer, which uses it to set the database session variable before running any query.&lt;/p&gt;
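
&lt;p&gt;For readers more at home in Python (the language of the Pre-Token Lambda), here is a sketch of the same extraction against the HTTP API event shape, where the claims sit under &lt;code&gt;requestContext.authorizer.jwt.claims&lt;/code&gt;. Unlike the C# helper, which throws if the claim is missing, this version fails closed with a 403. The function names and the response body are assumptions for illustration, not code from Sift.&lt;/p&gt;

```python
import uuid


def get_tenant_id(event):
    """Return the tenantId claim as a UUID, or None if absent or malformed."""
    claims = (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("jwt", {})
        .get("claims", {})
    )
    raw = claims.get("tenantId")
    if raw is None:
        return None
    try:
        return uuid.UUID(raw)
    except (ValueError, AttributeError, TypeError):
        return None


def handler(event, context):
    tenant_id = get_tenant_id(event)
    if tenant_id is None:
        # Fail closed: no usable tenant claim means no data access.
        return {"statusCode": 403, "body": "missing or invalid tenantId claim"}
    # A real handler would pass tenant_id to the service layer here.
    return {"statusCode": 200, "body": str(tenant_id)}
```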




&lt;h2&gt;
  
  
  Layer 4: PostgreSQL Row-Level Security
&lt;/h2&gt;

&lt;p&gt;This is where the real defense happens.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem with WHERE Clauses
&lt;/h3&gt;

&lt;p&gt;The naive approach is filtering every query manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;tenantId&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;docId&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works until it doesn't. Add a new endpoint in a hurry, forget the &lt;code&gt;WHERE tenant_id&lt;/code&gt; clause, and you've got a data leak — with no runtime error to alert you. The query succeeds; it just returns data it shouldn't.&lt;/p&gt;

&lt;p&gt;Row-Level Security moves the filter into the database engine. The policy is applied automatically to every query on that table, whether the application code includes a filter or not.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up RLS
&lt;/h3&gt;

&lt;p&gt;The schema enables RLS on every tenant-scoped table and defines a single policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;       &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;           &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;current_setting('app.current_tenant_id')&lt;/code&gt; reads a Postgres session-level variable that the application sets before running any query. If the variable was never set, &lt;code&gt;current_setting&lt;/code&gt; raises an error, and a malformed value fails the &lt;code&gt;::UUID&lt;/code&gt; cast, so the query fails loudly rather than silently returning empty results.&lt;/p&gt;
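
&lt;p&gt;A toy Python model can make the policy's behavior easier to reason about. This is not how Postgres is implemented; it only mimics the three outcomes of the &lt;code&gt;tenant_isolation&lt;/code&gt; predicate (matching rows, an unset variable, a malformed value). The &lt;code&gt;session&lt;/code&gt; dictionary and the &lt;code&gt;UnsetSettingError&lt;/code&gt; class are stand-ins invented for this sketch.&lt;/p&gt;

```python
import uuid


class UnsetSettingError(Exception):
    """Stands in for the error Postgres raises for an unknown setting."""


def current_setting(session, name):
    # Model of current_setting(name) without the missing_ok argument.
    if name not in session:
        raise UnsetSettingError(name)
    return session[name]


def visible_rows(rows, session):
    """Apply the tenant_isolation predicate to every row."""
    # uuid.UUID raises ValueError on malformed input, like the ::UUID cast.
    tenant = uuid.UUID(current_setting(session, "app.current_tenant_id"))
    return [row for row in rows if row["tenant_id"] == tenant]


acme = uuid.UUID("11111111-1111-1111-1111-111111111111")
globex = uuid.UUID("22222222-2222-2222-2222-222222222222")
rows = [
    {"title": "acme doc", "tenant_id": acme},
    {"title": "globex doc", "tenant_id": globex},
]

session = {"app.current_tenant_id": str(acme)}
print([row["title"] for row in visible_rows(rows, session)])  # ['acme doc']
```

&lt;p&gt;Calling &lt;code&gt;visible_rows(rows, {})&lt;/code&gt; raises &lt;code&gt;UnsetSettingError&lt;/code&gt;, and a session value such as &lt;code&gt;"oops"&lt;/code&gt; raises &lt;code&gt;ValueError&lt;/code&gt;, mirroring the fail-loud behavior described above.&lt;/p&gt;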

&lt;h3&gt;
  
  
  Setting the Session Variable
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;TenantContext.cs&lt;/code&gt;, every service method that opens a database connection immediately calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;SetAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NpgsqlConnection&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Guid&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateCommand&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// is_local=false → session scope. Safe because Lambda connections are&lt;/span&gt;
    &lt;span class="c1"&gt;// never pooled across requests (Pooling=false in DbConnectionFactory).&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CommandText&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"SELECT set_config('app.current_tenant_id', $1, false)"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AddWithValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ExecuteNonQueryAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;false&lt;/code&gt; parameter in &lt;code&gt;set_config&lt;/code&gt; means session scope rather than transaction scope. This is intentional: the Lambda's database connections are not pooled across requests (&lt;code&gt;Pooling=false&lt;/code&gt; in the connection factory), so a session-scoped variable is both safe and slightly more efficient than resetting it per-transaction.&lt;/p&gt;

&lt;p&gt;Every service method calls this before touching data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IEnumerable&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ListAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Guid&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;var&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;TenantContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SetAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// From this point on, every query automatically filters to tenantId's rows&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
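
&lt;p&gt;The reason the pooling setting matters for a session-scoped variable can be shown with a toy model. The classes below are invented for this sketch and do not talk to a database. They only show what could go wrong if a physical connection were reused across requests and one request forgot to call &lt;code&gt;set_config&lt;/code&gt;. Some drivers and poolers reset session state when a connection is returned, so treat this as the worst case rather than a description of any specific library.&lt;/p&gt;

```python
class FakeConnection:
    """Toy stand-in for a database session with its own settings."""

    def __init__(self):
        self.settings = {}

    def set_config(self, name, value):
        self.settings[name] = value

    def current_tenant(self):
        return self.settings.get("app.current_tenant_id")


class FakePool:
    """Hands back the same physical connection, as a pool would."""

    def __init__(self):
        self._conn = FakeConnection()

    def acquire(self):
        return self._conn


def open_unpooled():
    # Pooling=false: every request gets a fresh session.
    return FakeConnection()


pool = FakePool()

# Request 1 (tenant A) sets the session variable on a pooled connection.
conn = pool.acquire()
conn.set_config("app.current_tenant_id", "tenant-a")

# Request 2 reuses the connection and forgets to set the variable.
reused = pool.acquire()
print(reused.current_tenant())           # tenant-a (stale value from request 1)

# Without pooling, a forgotten set_config leaves the variable unset.
print(open_unpooled().current_tenant())  # None
```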



&lt;h3&gt;
  
  
  The BYPASSRLS Gotcha
&lt;/h3&gt;

&lt;p&gt;There's one critical detail that's easy to miss: &lt;strong&gt;Postgres superusers bypass RLS by default.&lt;/strong&gt; If your application database user has superuser privileges, &lt;code&gt;ENABLE ROW LEVEL SECURITY&lt;/code&gt; does nothing — the policies are silently skipped.&lt;/p&gt;

&lt;p&gt;The application user in Sift is the standard credentials from Secrets Manager — a normal role with &lt;code&gt;SELECT&lt;/code&gt;, &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt; on the application tables, and nothing more. No &lt;code&gt;SUPERUSER&lt;/code&gt;. No &lt;code&gt;BYPASSRLS&lt;/code&gt;. This isn't incidental; it's a deliberate requirement for RLS to actually work.&lt;/p&gt;

&lt;p&gt;If you're debugging why your RLS policies seem to have no effect, check the role: &lt;code&gt;SELECT rolsuper, rolbypassrls FROM pg_roles WHERE rolname = 'your_app_user';&lt;/code&gt;&lt;/p&gt;
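
&lt;p&gt;The result of that query can be turned into a small decision helper. The sketch below is a hypothetical function, not part of Sift. It also includes one rule that comes from the Postgres documentation rather than this article: a table's owner normally bypasses its policies too, unless the table is altered with &lt;code&gt;FORCE ROW LEVEL SECURITY&lt;/code&gt;.&lt;/p&gt;

```python
def rls_is_enforced(rolsuper, rolbypassrls, owns_table=False, force_rls=False):
    """Return True if row security policies apply to this role on a table."""
    # Superusers and BYPASSRLS roles always skip row security.
    if rolsuper or rolbypassrls:
        return False
    # Table owners skip policies unless FORCE ROW LEVEL SECURITY is set.
    if owns_table and not force_rls:
        return False
    return True


print(rls_is_enforced(rolsuper=False, rolbypassrls=False))                  # True
print(rls_is_enforced(rolsuper=True, rolbypassrls=False))                   # False
print(rls_is_enforced(rolsuper=False, rolbypassrls=False, owns_table=True)) # False
```

&lt;p&gt;In practice this means the role that runs migrations and owns the tables should not be the role the application connects as.&lt;/p&gt;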




&lt;h2&gt;
  
  
  Layer 5: S3 Key Prefix
&lt;/h2&gt;

&lt;p&gt;RLS handles the database. S3 needs a complementary approach.&lt;/p&gt;

&lt;p&gt;Every document is stored at a key prefixed with the tenant's ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{tenantId}/{documentId}/filename.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a user requests a presigned upload URL, the Lambda constructs the key using the &lt;code&gt;tenantId&lt;/code&gt; from their validated JWT. The browser then uploads directly to S3 using that presigned URL — it has no say in what key gets used.&lt;/p&gt;

&lt;p&gt;The result: even if a presigned URL leaked, it would grant access only to that one object under the tenant's prefix, and only until the URL expires. Tenant A's presigned URL cannot be used to list or access tenant B's documents, because the URL is signed for a single key and the key prefixes are different UUIDs.&lt;/p&gt;

&lt;p&gt;S3 doesn't have "row-level security" — but a properly namespaced key structure combined with IAM policies that don't allow &lt;code&gt;s3:ListBucket&lt;/code&gt; on the uploads bucket achieves the same practical outcome.&lt;/p&gt;
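
&lt;p&gt;Here is a minimal sketch of the key construction, assuming the tenant and document IDs come from server-side sources (the validated JWT and a freshly generated UUID) and only the filename comes from the client. The function names are hypothetical and the presigning call itself is left out.&lt;/p&gt;

```python
import posixpath
import uuid


def build_object_key(tenant_id, document_id, client_filename):
    """Build {tenantId}/{documentId}/{filename} from server-trusted IDs."""
    # Keep only the final path component so the client cannot add
    # extra path segments of its own.
    safe_name = posixpath.basename(client_filename.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("invalid filename")
    return "/".join([str(tenant_id), str(document_id), safe_name])


def key_belongs_to_tenant(key, tenant_id):
    """Check that an object key sits under the tenant's prefix."""
    return key.startswith(str(tenant_id) + "/")


tenant_a = uuid.UUID("11111111-1111-1111-1111-111111111111")
tenant_b = uuid.UUID("22222222-2222-2222-2222-222222222222")
document = uuid.UUID("33333333-3333-3333-3333-333333333333")

key = build_object_key(tenant_a, document, "../../other/report.pdf")
print(key)
# 11111111-1111-1111-1111-111111111111/33333333-3333-3333-3333-333333333333/report.pdf
print(key_belongs_to_tenant(key, tenant_a))  # True
print(key_belongs_to_tenant(key, tenant_b))  # False
```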




&lt;h2&gt;
  
  
  Why Multiple Layers?
&lt;/h2&gt;

&lt;p&gt;Each layer protects a different attack surface:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Protects Against&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cognito JWT&lt;/td&gt;
&lt;td&gt;User forging a different tenantId on the client&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway validation&lt;/td&gt;
&lt;td&gt;Bypassing auth entirely (no token, expired token)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Postgres RLS&lt;/td&gt;
&lt;td&gt;Application bugs — a missing WHERE clause, a new query that forgets the filter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 key prefix&lt;/td&gt;
&lt;td&gt;Cross-tenant object access, presigned URL misuse&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The layers are independent. A bug that breaks one doesn't compromise the others. If a query in application code forgot its tenant filter or filtered on the wrong &lt;code&gt;tenantId&lt;/code&gt;, RLS would still limit the results to the session tenant's rows, so the worst case is an empty result rather than another tenant's data. Defense in depth means the blast radius of any single failure is contained.&lt;/p&gt;




&lt;h2&gt;
  
  
  Seeing It in Action
&lt;/h2&gt;

&lt;p&gt;The live demo has two tenants: &lt;strong&gt;Acme Corp&lt;/strong&gt; and &lt;strong&gt;Globex Inc&lt;/strong&gt;, both seeded in the initial migration. You can log in as an Acme Corp user, upload a document, then log in as a Globex user — the document list is empty. Both tenants run against the same Aurora cluster, the same Lambda functions, the same S3 bucket. The query is identical. Postgres silently returns the right rows for each.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/serverless-document-pipelines-with-aws-step-functions-part-3-2111"&gt;Part 3&lt;/a&gt;&lt;/strong&gt; covers the Step Functions Express pipeline — how the six-stage document processing workflow is orchestrated, why the Map state handles embedding generation, and how the state machine's declarative retry config replaces dozens of lines of error-handling code.&lt;/p&gt;

&lt;p&gt;The code for everything in this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;infrastructure/template-cognito.yaml&lt;/code&gt; — Pre-Token Lambda and User Pool definition&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;migrations/001_initial_schema.sql&lt;/code&gt; — full &lt;code&gt;CREATE POLICY&lt;/code&gt; statements&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/src/Sift.Api/Infrastructure/TenantContext.cs&lt;/code&gt; — the &lt;code&gt;set_config&lt;/code&gt; call&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/src/Sift.Api/Infrastructure/DbConnectionFactory.cs&lt;/code&gt; — why &lt;code&gt;Pooling=false&lt;/code&gt; matters&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;backend/src/Sift.Api/Functions/DocumentsFunction.cs&lt;/code&gt; — JWT claim extraction&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>security</category>
      <category>postgres</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>Building a Multi-Tenant AI Document Platform on AWS (Part 1: Architecture)</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 02:01:28 +0000</pubDate>
      <link>https://forem.com/josh_blair/building-a-multi-tenant-ai-document-platform-on-aws-part-1-architecture-16fi</link>
      <guid>https://forem.com/josh_blair/building-a-multi-tenant-ai-document-platform-on-aws-part-1-architecture-16fi</guid>
      <description>&lt;h1&gt;
  
  
  Building a Multi-Tenant AI Document Platform on AWS (Part 1: Architecture)
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;How I designed a production-ready RAG system from scratch using AWS-native services — and kept the monthly bill under $20.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Every senior full-stack role I see right now lists "AI experience" somewhere in the requirements. The problem is that most AI portfolio projects look the same: a Next.js frontend, a call to the OpenAI API, maybe a LangChain wrapper. They're fine demos, but they don't show anything about how you'd actually build and operate a system at scale.&lt;/p&gt;

&lt;p&gt;I wanted to build something that would hold up in a technical interview with an AWS architect — not just "I built a chatbot." So I built &lt;strong&gt;Sift&lt;/strong&gt;: a multi-tenant RAG (Retrieval-Augmented Generation) document platform, fully serverless, built on AWS-native services, with real data isolation between tenants, an async document processing pipeline, and a total monthly cost of about $10–15 in a live demo environment.&lt;/p&gt;

&lt;p&gt;This is the first post in a six-part series. Here I'll walk through the overall architecture and explain why I made each service choice. Future posts will dig into specific areas: auth and multi-tenancy, the Step Functions pipeline, the RAG implementation, the React frontend, and the CI/CD pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Sift Does
&lt;/h2&gt;

&lt;p&gt;Users upload documents — PDF, DOCX, CSV, or plain text. Sift processes them asynchronously through a six-stage pipeline, generating text embeddings and storing them in a vector database. Users then ask natural language questions, and Sift retrieves the most relevant chunks from their documents and feeds them to a Claude model to generate a grounded answer with numbered citations linking back to the source text.&lt;/p&gt;

&lt;p&gt;It's a legitimate enterprise use case: internal knowledge bases, contract review, research assistants. The architecture reflects that — it's not a demo that would fall apart the moment a second organization tried to use it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full Architecture
&lt;/h2&gt;

&lt;p&gt;Here's how data flows through the system:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwkgslm6da7xb9p1j8ep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwkgslm6da7xb9p1j8ep.png" alt="Sift architecture diagram" width="799" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's go through each choice.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Each Service?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Aurora Serverless v2 with pgvector
&lt;/h3&gt;

&lt;p&gt;The go-to recommendation for vector storage right now is a managed vector database — Pinecone, Weaviate, OpenSearch with k-NN. I deliberately chose not to use any of them.&lt;/p&gt;

&lt;p&gt;Sift already needs a relational database for its operational data: tenants, users, documents, processing status. Aurora PostgreSQL Serverless v2 handles all of that &lt;em&gt;and&lt;/em&gt; the vector search with the pgvector extension. That's one less service to operate, one less connection to secure, one less cost line item.&lt;/p&gt;

&lt;p&gt;pgvector supports cosine similarity search and IVFFlat indexes for approximate nearest-neighbor search at scale, and because it runs inside Postgres I can join vector search results with relational data in a single query. When a user asks a question, I can filter to only that tenant's documents &lt;em&gt;before&lt;/em&gt; the similarity search, not after. That's a meaningful security and efficiency win.&lt;/p&gt;
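
&lt;p&gt;The filter-then-rank idea can be sketched in plain Python with an in-memory list standing in for the chunks table. This is only an illustration of the ordering of the two steps, not the pgvector query Sift runs. The two-dimensional embeddings and the &lt;code&gt;search&lt;/code&gt; function are made up for the example.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(chunks, tenant_id, query_vector, top_k=2):
    """Filter to one tenant first, then rank only those chunks by similarity."""
    candidates = [c for c in chunks if c["tenant_id"] == tenant_id]
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(c["embedding"], query_vector),
        reverse=True,
    )
    return [c["text"] for c in ranked[:top_k]]


chunks = [
    {"tenant_id": "acme", "text": "acme pricing", "embedding": [1.0, 0.0]},
    {"tenant_id": "acme", "text": "acme onboarding", "embedding": [0.0, 1.0]},
    {"tenant_id": "globex", "text": "globex pricing", "embedding": [1.0, 0.0]},
]

print(search(chunks, "acme", [0.9, 0.1]))
# ['acme pricing', 'acme onboarding']
```

&lt;p&gt;The Globex chunk is as close to the query as the best Acme chunk, but it is never scored because it is removed before ranking.&lt;/p&gt;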

&lt;p&gt;The Serverless v2 auto-pause feature means the cluster scales to zero ACUs (Aurora Capacity Units) during idle periods. For a portfolio demo that doesn't have 24/7 traffic, that alone cuts the database cost from ~$50–70/month for a minimum-size provisioned Aurora instance to roughly $7/month in actual usage.&lt;/p&gt;

&lt;p&gt;The tradeoff: the first query after an idle period has a cold-start delay while the cluster resumes — typically 5–15 seconds. That's acceptable for a demo, and for production you'd set a non-zero minimum ACU to keep it warm.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step Functions Express Workflows
&lt;/h3&gt;

&lt;p&gt;When I first sketched the pipeline, the obvious choice was SQS queues between Lambda stages — it's the standard event-driven pattern and I've used it plenty of times. I chose Step Functions Express instead, and it was the right call for this project.&lt;/p&gt;

&lt;p&gt;The visibility argument is simple: when I open the AWS console and click into a Step Functions execution, I can see exactly which stage a document is in, what the input and output were at each step, and precisely where it failed if something went wrong. With SQS, you're inferring pipeline state from DLQ message counts and CloudWatch metrics. That's fine in production where you've built dashboards for it — it's friction in a portfolio project where the goal is demonstrating the architecture clearly.&lt;/p&gt;

&lt;p&gt;Step Functions also handles retries and error catching declaratively in the state machine definition. Instead of writing retry logic in each Lambda, I configure it once in the YAML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Retry&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;States.ALL&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;IntervalSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;MaxAttempts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
    &lt;span class="na"&gt;BackoffRate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Transient errors — throttling, network blips — get retried automatically. Unrecoverable failures route to the &lt;code&gt;MarkFailed&lt;/code&gt; state, which updates the document status and preserves the error for display in the UI.&lt;/p&gt;
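
&lt;p&gt;To see what that retry block works out to, here is a short Python calculation of the wait before each retry: the first retry waits &lt;code&gt;IntervalSeconds&lt;/code&gt;, and each later wait is multiplied by &lt;code&gt;BackoffRate&lt;/code&gt;. The helper is my own and ignores optional settings such as jitter or a maximum delay, which the config above doesn't use.&lt;/p&gt;

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry attempt."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


delays = retry_delays(interval_seconds=2, max_attempts=3, backoff_rate=2)
print(delays)       # [2, 4, 8]
print(sum(delays))  # 14
```

&lt;p&gt;So a persistently failing step adds at most 14 seconds of waiting before the failure is handed to the error path, which fits comfortably inside the Express Workflow time limit.&lt;/p&gt;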

&lt;p&gt;Express Workflows (vs. Standard Workflows) are priced per execution and duration rather than per state transition. For a document pipeline that completes in under 5 minutes, this is significantly cheaper. The tradeoff is that Express Workflows have a maximum duration of 5 minutes and don't support activities or the &lt;code&gt;.sync&lt;/code&gt; (run a job) and &lt;code&gt;.waitForTaskToken&lt;/code&gt; (callback) service integration patterns that Standard Workflows do. Neither limitation matters here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amazon Bedrock
&lt;/h3&gt;

&lt;p&gt;I used Bedrock for both embedding generation (Titan Embed v2, 1024 dimensions) and the chat completion (Claude Haiku 4.5).&lt;/p&gt;

&lt;p&gt;The straightforward reason is that it keeps everything inside the AWS trust boundary. No data leaves my VPC to a third-party API. Authentication is IAM, not an API key stored in a secret. There's no separate vendor account to manage, no risk of data being used for model training, and the latency is lower because it's in-region.&lt;/p&gt;

&lt;p&gt;The more practical reason: on an AWS-focused résumé, "I used Bedrock" is worth more than "I used OpenAI." It demonstrates familiarity with Bedrock's APIs, model catalog, and IAM integration patterns that you'd actually use in enterprise AWS environments.&lt;/p&gt;

&lt;p&gt;Titan Embed v2 produces 1024-dimensional vectors with normalized outputs — which means cosine similarity and dot product give equivalent results, simplifying the pgvector query. Claude Haiku 4.5 handles the metadata extraction (generating a document summary and extracting topics from text chunks) and the final chat response generation. Haiku is fast and cheap for the metadata step; the quality is more than adequate for structured extraction tasks.&lt;/p&gt;
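
&lt;p&gt;The equivalence for normalized vectors is easy to check numerically. The sketch below uses tiny two-dimensional vectors in place of real 1024-dimensional embeddings, and the helper names are my own.&lt;/p&gt;

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = normalize([3.0, 4.0])
b = normalize([1.0, 2.0])

# With unit-length vectors the denominator of cosine similarity is 1,
# so it reduces to the plain dot product.
print(math.isclose(dot(a, b), cosine_similarity(a, b)))  # True
```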

&lt;h3&gt;
  
  
  Amazon Cognito
&lt;/h3&gt;

&lt;p&gt;Auth is one of those areas where building your own is almost always the wrong call. Cognito gives you a managed user pool with email/password auth, JWT issuance, and a hosted UI for free at demo scale.&lt;/p&gt;

&lt;p&gt;The interesting piece is the Pre-Token Generation Lambda trigger. When Cognito issues a JWT, it calls a Lambda function that can add custom claims to the token before it's signed. I use this to inject a &lt;code&gt;tenantId&lt;/code&gt; claim — the tenant the user belongs to — directly into the token.&lt;/p&gt;

&lt;p&gt;That &lt;code&gt;tenantId&lt;/code&gt; flows downstream: API Gateway validates the JWT and rejects unauthenticated requests; my Lambda functions extract the claim from the validated token context; and the database uses it to enforce Row-Level Security. The tenant identity is cryptographically bound to the token — a user can't forge a different &lt;code&gt;tenantId&lt;/code&gt; without Cognito's private signing key.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS SAM over CDK
&lt;/h3&gt;

&lt;p&gt;CDK is a better abstraction for large teams and complex infrastructure. SAM is CloudFormation with a thinner layer of syntactic sugar, which means the output is standard CloudFormation YAML that any AWS engineer can read without knowing TypeScript or Python CDK constructs.&lt;/p&gt;

&lt;p&gt;For portfolio purposes, that's actually the right choice. When I'm walking through this in an interview, I want to point directly at &lt;code&gt;template.yaml&lt;/code&gt; and explain the &lt;code&gt;AWS::Serverless::StateMachine&lt;/code&gt; resource, the IAM policy, the event source mapping — without the interviewer needing to know what &lt;code&gt;aws_cdk.aws_stepfunctions&lt;/code&gt; compiles to under the hood.&lt;/p&gt;




&lt;h2&gt;
  
  
  Multi-Tenancy: Three Layers of Isolation
&lt;/h2&gt;

&lt;p&gt;This is where a lot of "multi-tenant" demos fall short — they add a &lt;code&gt;tenantId&lt;/code&gt; column to their tables and filter on it in application code. That works until a bug in one query path leaks cross-tenant data.&lt;/p&gt;

&lt;p&gt;Sift uses three independent isolation layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1 — Cognito custom claim.&lt;/strong&gt; The &lt;code&gt;tenantId&lt;/code&gt; is in the signed JWT. Once Cognito has signed the token, nothing downstream can add or change the claim without invalidating the signature.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2 — S3 key prefix.&lt;/strong&gt; Every uploaded document is stored at &lt;code&gt;{tenantId}/{documentId}/{filename}&lt;/code&gt;. Tenant A can't construct a presigned URL for tenant B's key without knowing the exact key path — and they'd also need to be an IAM principal with S3 access, which they're not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3 — PostgreSQL Row-Level Security.&lt;/strong&gt; Before any database query executes, the Lambda sets a session variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;set_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The RLS policies on every table enforce this automatically at the database level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even if application code contained a bug that omitted a &lt;code&gt;WHERE tenant_id = ?&lt;/code&gt; clause, the database would still return only the current tenant's rows. The isolation is enforced below the application layer.&lt;/p&gt;

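&lt;p&gt;One important gotcha: the database application user must not have &lt;code&gt;BYPASSRLS&lt;/code&gt; privileges. Postgres superusers always bypass RLS, and a table's owner bypasses it too unless the table is set to &lt;code&gt;FORCE ROW LEVEL SECURITY&lt;/code&gt;. If your Lambda connects as a superuser or as the table owner, your RLS policies are decoration.&lt;/p&gt;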
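&lt;p&gt;The session-variable step from Layer 3 could be wrapped in a small helper. This is a sketch only: it assumes a Python DB-API cursor with &lt;code&gt;%s&lt;/code&gt; placeholders (psycopg style), whereas the statement above uses &lt;code&gt;$1&lt;/code&gt;. &lt;code&gt;RecordingCursor&lt;/code&gt; and &lt;code&gt;set_tenant_context&lt;/code&gt; are hypothetical names used so the example runs without a database.&lt;/p&gt;

```python
import uuid


class RecordingCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))


def set_tenant_context(cursor, tenant_id):
    """Bind the tenant to the database session before any query runs."""
    # Parsing as a UUID rejects malformed values up front; otherwise the
    # policy's ::UUID cast would fail later, in the middle of a query.
    tenant = str(uuid.UUID(tenant_id))
    # Parameterised call; false means session scope, as in the SQL above.
    cursor.execute(
        "SELECT set_config('app.current_tenant_id', %s, false)",
        (tenant,),
    )
    return tenant


cursor = RecordingCursor()
set_tenant_context(cursor, "6f1c2d3e-4b5a-4c6d-8e7f-0123456789ab")
```

&lt;p&gt;Because &lt;code&gt;false&lt;/code&gt; makes the setting session-scoped, a reused connection keeps the previous tenant until it is overwritten, so the helper should run at the start of every invocation. Passing &lt;code&gt;true&lt;/code&gt; instead limits the setting to the current transaction.&lt;/p&gt;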




&lt;h2&gt;
  
  
  Cost Breakdown
&lt;/h2&gt;

&lt;p&gt;One thing I wanted to demonstrate with this project is that you can run a legitimate, production-architected system for almost nothing at low volume:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Aurora Serverless v2 (auto-pause, demo traffic)&lt;/td&gt;
&lt;td&gt;~$7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Amazon Bedrock (embeddings + chat, light usage)&lt;/td&gt;
&lt;td&gt;~$2–5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda, API Gateway, S3, CloudFront&lt;/td&gt;
&lt;td&gt;Free tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cognito (under 50k MAU)&lt;/td&gt;
&lt;td&gt;Free tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$10–15&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The variable cost scales with Bedrock usage: every embedding call and every chat completion is billed per token. At production volume, Bedrock would become a meaningful line item, and you'd want to add CloudFront caching for static assets and tune Lambda memory settings. But the architectural pattern holds: Aurora Serverless v2 auto-pause means you're not paying for compute when nobody is using the system.&lt;/p&gt;




&lt;h2&gt;
  
  
  AWS Well-Architected Alignment
&lt;/h2&gt;

&lt;p&gt;Since this is a portfolio project for AWS architecture roles, I mapped the design against the six pillars explicitly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Operational Excellence&lt;/strong&gt; — SAM + GitHub Actions with OIDC: the entire infrastructure is code, deployments are automated, there are no manual steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt; — OIDC federation (zero stored credentials), Row-Level Security, Cognito JWT, no Lambda with public S3 access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt; — Step Functions retry logic, EventBridge decoupling between S3 upload and pipeline execution, Aurora Multi-AZ in production config&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance Efficiency&lt;/strong&gt; — Aurora Serverless v2 scales with demand, Map state in Step Functions parallelizes embedding generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Optimization&lt;/strong&gt; — Aurora auto-pause, Bedrock pay-per-token, Lambda and CloudFront at free tier scale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sustainability&lt;/strong&gt; — Serverless by default: compute resources allocated only when actively processing&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This series covers each major subsystem in depth:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/multi-tenant-auth-with-cognito-and-postgresql-row-level-security-part-2-5d30"&gt;Part 2&lt;/a&gt;:&lt;/strong&gt; Multi-tenant auth — how the Cognito Pre-Token Lambda works, the RLS policy setup, and the C# tenant context middleware&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/serverless-document-pipelines-with-aws-step-functions-part-3-2111"&gt;Part 3&lt;/a&gt;:&lt;/strong&gt; The Step Functions pipeline — state machine design, the Map state for parallel embedding, error handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/rag-and-vector-search-with-pgvector-and-amazon-bedrock-part-4-5294"&gt;Part 4&lt;/a&gt;:&lt;/strong&gt; RAG and vector search — chunking strategy, pgvector queries, and how citations are generated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/building-the-react-frontend-document-library-and-chat-ui-part-5-22li"&gt;Part 5&lt;/a&gt;:&lt;/strong&gt; The React frontend — polling patterns, Amplify auth integration, the upload flow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/josh_blair/zero-secret-cicd-github-actions-oidc-on-aws-part-6-22e7"&gt;Part 6&lt;/a&gt;:&lt;/strong&gt; CI/CD with GitHub Actions and OIDC — zero-secret deployments to AWS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The live demo is running at &lt;a href="https://sift.bonefishsoftware.com" rel="noopener noreferrer"&gt;sift.bonefishsoftware.com&lt;/a&gt; — log in with the shared Acme Corp credentials from the README and try uploading a PDF.&lt;/p&gt;

&lt;p&gt;The code is at &lt;a href="https://github.com/joshblair/sift" rel="noopener noreferrer"&gt;github.com/joshblair/sift&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the Sift series: building a production-ready multi-tenant RAG platform on AWS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>architecture</category>
      <category>serverless</category>
      <category>postgres</category>
    </item>
    <item>
      <title>Serverless Contact Form — Lambda, API Gateway, DynamoDB, and SES</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 00:07:07 +0000</pubDate>
      <link>https://forem.com/josh_blair/serverless-contact-form-lambda-api-gateway-dynamodb-and-ses-21ap</link>
      <guid>https://forem.com/josh_blair/serverless-contact-form-lambda-api-gateway-dynamodb-and-ses-21ap</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;The contact form on bonefishsoftware.com is fully serverless — no EC2, no always-on server. A visitor submits the form, the request hits an API Gateway HTTP API, a Python Lambda function validates and stores the submission in DynamoDB, then sends an email notification via SES. Cost at low volume: effectively zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1h4p7jcsdxa8x95c3082.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1h4p7jcsdxa8x95c3082.png" alt="Contact form flow" width="799" height="364"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Infrastructure — AWS SAM
&lt;/h2&gt;

&lt;p&gt;The contact API is deployed using &lt;strong&gt;AWS SAM&lt;/strong&gt; (Serverless Application Model), a CloudFormation extension that simplifies Lambda + API Gateway resource definitions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why SAM over plain CloudFormation?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
SAM's &lt;code&gt;AWS::Serverless::Function&lt;/code&gt; with &lt;code&gt;Events&lt;/code&gt; automatically creates the API Gateway routes, integrations, Lambda permissions, and stage. Doing this in plain CloudFormation requires 6–8 separate resource definitions. SAM condenses it to one function resource with an &lt;code&gt;Events&lt;/code&gt; section.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Key lesson: &lt;code&gt;AWS::Serverless::HttpApi&lt;/code&gt; vs &lt;code&gt;AWS::ApiGatewayV2::Api&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The most important thing to get right: when using SAM's &lt;code&gt;HttpApi&lt;/code&gt; event type, the &lt;code&gt;ApiId&lt;/code&gt; &lt;strong&gt;must reference an &lt;code&gt;AWS::Serverless::HttpApi&lt;/code&gt; resource&lt;/strong&gt; — not a native &lt;code&gt;AWS::ApiGatewayV2::Api&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Using the wrong resource type results in the SAM transform silently skipping route and integration creation, leaving you with a deployed API that returns 404 on every request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ✅ Correct — SAM manages routes/integrations automatically&lt;/span&gt;
&lt;span class="na"&gt;ContactApi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::HttpApi&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;CorsConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AllowOrigins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;https&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;//bonefishsoftware.com&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;AllowMethods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;POST&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;OPTIONS&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;AllowHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Content-Type&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;ContactFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Function&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Handler, Runtime, and CodeUri omitted for brevity&lt;/span&gt;
    &lt;span class="na"&gt;Events&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;PostContact&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HttpApi&lt;/span&gt;
        &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;ApiId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;ContactApi&lt;/span&gt;   &lt;span class="c1"&gt;# ← references Serverless::HttpApi&lt;/span&gt;
          &lt;span class="na"&gt;Path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/contact&lt;/span&gt;
          &lt;span class="na"&gt;Method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ❌ Wrong — routes NOT created by SAM&lt;/span&gt;
&lt;span class="na"&gt;ContactApi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::ApiGatewayV2::Api&lt;/span&gt;  &lt;span class="c1"&gt;# ← native resource, SAM ignores events&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Lambda Function (Python)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# lambda/contact/handler.py
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;botocore.exceptions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientError&lt;/span&gt;

&lt;span class="n"&gt;ALLOWED_ORIGIN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALLOWED_ORIGIN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://bonefishsoftware.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TABLE_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;FROM_ADDRESS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FROM_ADDRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;TO_ADDRESS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TO_ADDRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;dynamodb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dynamodb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ses&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Handle CORS preflight
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requestContext&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPTIONS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

    &lt;span class="c1"&gt;# Parse and validate
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;TypeError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
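        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid request body.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# json.loads also accepts arrays and scalars; only an object has .get()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid request body.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;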

    &lt;span class="n"&gt;name&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name, email, and message are required.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Please provide a valid email address.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;submission_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;_save_to_dynamo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submission_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;_send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submission_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ClientError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to process your message. Please try again.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message received! We&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ll be in touch soon.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why no external dependencies?
&lt;/h3&gt;

&lt;p&gt;Python's &lt;code&gt;boto3&lt;/code&gt; SDK is built into the Lambda runtime — no requirements.txt needed. The deployment package is just a single &lt;code&gt;handler.py&lt;/code&gt; file, keeping Lambda cold-start time minimal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Validation approach
&lt;/h3&gt;

&lt;p&gt;Validation is intentionally lightweight:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Required field presence check&lt;/li&gt;
&lt;li&gt;Naive email format check (contains &lt;code&gt;@&lt;/code&gt; and a &lt;code&gt;.&lt;/code&gt; after it)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This check isn't meant to be the primary spam defense: there's little financial incentive to spam this contact form, and rate limiting can be added at the API Gateway level later if needed.&lt;/p&gt;
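&lt;p&gt;To make the limits of the naive email check concrete, the sketch below isolates the handler's condition in a hypothetical &lt;code&gt;passes_naive_check&lt;/code&gt; function and runs it over a few sample strings chosen for illustration.&lt;/p&gt;

```python
def passes_naive_check(email):
    """Same test as the handler: an "@" plus a "." after the last "@"."""
    return "@" in email and "." in email.split("@")[-1]


samples = [
    "jane@example.com",  # ordinary address
    "jane@localhost",    # no dot after the "@"
    "no-at-sign.com",    # no "@" at all
    "@.",                # degenerate, but satisfies both conditions
    "a@b@c.d",           # only the part after the last "@" is inspected
]
results = {s: passes_naive_check(s) for s in samples}
```

&lt;p&gt;The first, fourth, and fifth samples pass, so the check filters obvious typos without proving that an address is deliverable. That trade-off fits a form whose only consequence is a notification email.&lt;/p&gt;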




&lt;h2&gt;
  
  
  DynamoDB Table
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;SubmissionsTable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::DynamoDB::Table&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;TableName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefish-contact-submissions&lt;/span&gt;
    &lt;span class="na"&gt;BillingMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PAY_PER_REQUEST&lt;/span&gt;
    &lt;span class="na"&gt;AttributeDefinitions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;AttributeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;submissionId&lt;/span&gt;
        &lt;span class="na"&gt;AttributeType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;S&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;AttributeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;timestamp&lt;/span&gt;
        &lt;span class="na"&gt;AttributeType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;S&lt;/span&gt;
    &lt;span class="na"&gt;KeySchema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;AttributeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;submissionId&lt;/span&gt;
        &lt;span class="na"&gt;KeyType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HASH&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;AttributeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;timestamp&lt;/span&gt;
        &lt;span class="na"&gt;KeyType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RANGE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;PAY_PER_REQUEST billing&lt;/strong&gt; — no provisioned capacity to manage. At the volume of a contact form (single-digit submissions per day at most), this costs essentially nothing.&lt;/p&gt;

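&lt;p&gt;&lt;strong&gt;Composite key&lt;/strong&gt; (submissionId + timestamp): the UUID partition key ensures uniqueness, and the timestamp sort key records when each submission arrived. Because every item has its own partition key, the sort key alone does not order results across submissions; listing them chronologically in future tooling would take a scan sorted client-side or a secondary index.&lt;/p&gt;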
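
&lt;p&gt;As an illustration of how an item matching this key schema might be assembled, here is a small sketch. Only &lt;code&gt;submissionId&lt;/code&gt; and &lt;code&gt;timestamp&lt;/code&gt; come from the template; the helper name &lt;code&gt;build_submission_item&lt;/code&gt;, the other attributes, and the ISO 8601 timestamp format are assumptions.&lt;/p&gt;

```python
import uuid
from datetime import datetime, timezone


def build_submission_item(name, email, message, now=None):
    """Build a DynamoDB item for the submissions table (hypothetical helper)."""
    # Allow the caller to inject a clock value so the function is testable.
    now = now or datetime.now(timezone.utc)
    return {
        # Partition key: a random UUID keeps every submission unique.
        "submissionId": str(uuid.uuid4()),
        # Sort key: ISO 8601 strings in UTC sort lexicographically by time.
        "timestamp": now.isoformat(timespec="seconds"),
        # Non-key attributes need no declaration in AttributeDefinitions.
        "name": name,
        "email": email,
        "message": message,
    }
```

&lt;p&gt;With a boto3 &lt;code&gt;Table&lt;/code&gt; resource, the result could be written with &lt;code&gt;table.put_item(Item=item)&lt;/code&gt;.&lt;/p&gt;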




&lt;h2&gt;
  
  
  SES Email Delivery
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Domain verification (DKIM)
&lt;/h3&gt;

&lt;p&gt;Emails sent from &lt;code&gt;noreply@bonefishsoftware.com&lt;/code&gt; require the domain to be verified in SES. This involves adding three DKIM CNAME records to Route 53:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sesv2 create-email-identity &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--email-identity&lt;/span&gt; bonefishsoftware.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; us-west-2
&lt;span class="c"&gt;# → returns 3 DKIM tokens&lt;/span&gt;

&lt;span class="c"&gt;# Add CNAME records: &amp;lt;token&amp;gt;._domainkey.bonefishsoftware.com → &amp;lt;token&amp;gt;.dkim.amazonses.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DKIM signing lets receiving mail servers verify that a message was authorized by our domain and was not altered in transit, which improves deliverability and makes spoofing harder.&lt;/p&gt;
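
&lt;p&gt;The token-to-record mapping described in the comment above can be sketched as a small function. The name &lt;code&gt;dkim_cname_records&lt;/code&gt; and the dictionary layout are assumptions for illustration; the tokens themselves would be the three values returned by the &lt;code&gt;create-email-identity&lt;/code&gt; call.&lt;/p&gt;

```python
def dkim_cname_records(domain, tokens):
    """Map SES DKIM tokens to the CNAME records the domain needs.

    Hypothetical helper: it only builds the name/value pairs and does not
    call Route 53 or SES.
    """
    return [
        {
            "Name": f"{token}._domainkey.{domain}",
            "Type": "CNAME",
            "Value": f"{token}.dkim.amazonses.com",
        }
        for token in tokens
    ]
```

&lt;p&gt;Each dictionary corresponds to one record to create in the hosted zone for the domain.&lt;/p&gt;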

&lt;h3&gt;
  
  
  Sandbox mode
&lt;/h3&gt;

&lt;p&gt;By default, SES operates in &lt;strong&gt;sandbox mode&lt;/strong&gt; — you can only send to verified email addresses. This is sufficient for the contact form (we only send TO &lt;code&gt;josh.blair@gmail.com&lt;/code&gt;, which is verified). The submitter's email address appears only in the &lt;code&gt;Reply-To&lt;/code&gt; header and the email body — never as a direct recipient.&lt;/p&gt;

&lt;p&gt;SES production access (removing sandbox restrictions) was requested via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws sesv2 put-account-details &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mail-type&lt;/span&gt; TRANSACTIONAL &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--website-url&lt;/span&gt; https://bonefishsoftware.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--use-case-description&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--production-access-enabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reply-To header
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FROM_ADDRESS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                  &lt;span class="c1"&gt;# noreply@bonefishsoftware.com
&lt;/span&gt;    &lt;span class="n"&gt;Destination&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ToAddresses&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TO_ADDRESS&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;  &lt;span class="c1"&gt;# josh.blair@gmail.com
&lt;/span&gt;    &lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{...},&lt;/span&gt;
    &lt;span class="n"&gt;ReplyToAddresses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;             &lt;span class="c1"&gt;# submitter's email
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting &lt;code&gt;ReplyToAddresses&lt;/code&gt; to the submitter's email means hitting "Reply" in Gmail automatically addresses the response to the person who filled out the form, so no copy-pasting is required.&lt;/p&gt;
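
&lt;p&gt;To make the shape of the call concrete, here is a sketch that builds the keyword arguments separately from sending them. The helper name &lt;code&gt;build_send_email_kwargs&lt;/code&gt;, the subject line, and the body layout are assumptions; the article elides the real &lt;code&gt;Message&lt;/code&gt; contents, so treat this one as a placeholder.&lt;/p&gt;

```python
def build_send_email_kwargs(from_address, to_address, submission):
    """Build keyword arguments for the SES send_email call (hypothetical helper).

    The subject and body format below are illustrative only.
    """
    body = (
        f"From: {submission['name']} ({submission['email']})\n\n"
        f"{submission['message']}"
    )
    return {
        "Source": from_address,
        # The only recipient is the verified site owner address.
        "Destination": {"ToAddresses": [to_address]},
        "Message": {
            "Subject": {"Data": f"Contact form: {submission['name']}", "Charset": "UTF-8"},
            "Body": {"Text": {"Data": body, "Charset": "UTF-8"}},
        },
        # The submitter appears here and in the body, never as a recipient.
        "ReplyToAddresses": [submission["email"]],
    }
```

&lt;p&gt;The result can be passed along as &lt;code&gt;ses.send_email(**kwargs)&lt;/code&gt;. Keeping the construction in a pure function makes it easy to unit test without AWS credentials.&lt;/p&gt;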




&lt;h2&gt;
  
  
  CORS Configuration
&lt;/h2&gt;

&lt;p&gt;CORS is configured at the API Gateway level (not in the Lambda response), via the &lt;code&gt;AWS::Serverless::HttpApi&lt;/code&gt; resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;ContactApi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::HttpApi&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;CorsConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AllowOrigins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;https://bonefishsoftware.com&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://localhost:5173&lt;/span&gt;    &lt;span class="c1"&gt;# local dev&lt;/span&gt;
      &lt;span class="na"&gt;AllowHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Content-Type&lt;/span&gt;
      &lt;span class="na"&gt;AllowMethods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OPTIONS&lt;/span&gt;
      &lt;span class="na"&gt;MaxAge&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



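&lt;p&gt;The Lambda still returns &lt;code&gt;Access-Control-Allow-Origin&lt;/code&gt; headers in its response (for safety), but API Gateway handles the OPTIONS preflight response automatically. Note that once CORS is configured on an HTTP API, API Gateway ignores CORS headers returned by the backend integration, so the Lambda's headers only matter if the API-level configuration is ever removed.&lt;/p&gt;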
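
&lt;p&gt;For the Lambda side, a response helper along these lines would echo the origin only when it is on the allow list. This is a sketch under assumptions: the helper name &lt;code&gt;cors_response&lt;/code&gt; is hypothetical, and it expects the HTTP API payload format 2.0 event, where header names arrive lowercased.&lt;/p&gt;

```python
import json

# Mirrors AllowOrigins in the template above.
ALLOWED_ORIGINS = ("https://bonefishsoftware.com", "http://localhost:5173")


def cors_response(event, status_code, body):
    """Build a Lambda proxy response with a CORS header (hypothetical helper)."""
    # Assumes payload format 2.0, where header names are lowercased.
    headers = event.get("headers") or {}
    origin = headers.get("origin", "")

    response_headers = {"Content-Type": "application/json"}
    # Echo the origin back only when it is explicitly allowed.
    if origin in ALLOWED_ORIGINS:
        response_headers["Access-Control-Allow-Origin"] = origin

    return {
        "statusCode": status_code,
        "headers": response_headers,
        "body": json.dumps(body),
    }
```

&lt;p&gt;Echoing a single allowed origin, instead of returning a wildcard, keeps the Lambda's behavior consistent with the API Gateway configuration.&lt;/p&gt;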

&lt;p&gt;&lt;strong&gt;Lesson learned:&lt;/strong&gt; An early version used &lt;code&gt;AWS::ApiGatewayV2::Api&lt;/code&gt; instead of &lt;code&gt;AWS::Serverless::HttpApi&lt;/code&gt;. The CORS configuration was present but the routes were never created — resulting in 404 responses and CORS errors in the browser. Switching to &lt;code&gt;AWS::Serverless::HttpApi&lt;/code&gt; resolved both issues simultaneously.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frontend Integration
&lt;/h2&gt;

&lt;p&gt;The React contact form reads the API URL from a Vite environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/pages/Contact.tsx&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VITE_CONTACT_API_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleSubmit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FormEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;preventDefault&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nf"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;submitting&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;form&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Request failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;VITE_CONTACT_API_URL&lt;/code&gt; is injected at build time by CodeBuild as an environment variable. Vite replaces &lt;code&gt;import.meta.env.VITE_*&lt;/code&gt; references with literal string values during the build — there's no runtime environment lookup.&lt;/p&gt;




&lt;h2&gt;
  
  
  SAM Deployment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; infra/stacks/contact-api.yml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; bonefish-contact-api &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; us-west-2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-bucket&lt;/span&gt; bonefish-pipeline-artifacts-709085484102 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-prefix&lt;/span&gt; sam-contact &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-confirm-changeset&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CAPABILITY_AUTO_EXPAND&lt;/code&gt; is passed because the template uses &lt;code&gt;Transform: AWS::Serverless-2016-10-31&lt;/code&gt;. It acknowledges that the template contains a macro that CloudFormation expands into additional resources before creating the stack.&lt;/p&gt;

&lt;p&gt;SAM packages the Lambda code (zips &lt;code&gt;lambda/contact/&lt;/code&gt;), uploads to the artifacts S3 bucket, and replaces the local &lt;code&gt;CodeUri&lt;/code&gt; path with the S3 URL in the transformed template — all automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost Analysis
&lt;/h2&gt;

&lt;p&gt;At typical consulting site traffic:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Usage&lt;/th&gt;
&lt;th&gt;Estimated cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway HTTP API&lt;/td&gt;
&lt;td&gt;100 requests/month&lt;/td&gt;
&lt;td&gt;$0.001&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda&lt;/td&gt;
&lt;td&gt;100 invocations × 128MB × 500ms&lt;/td&gt;
&lt;td&gt;&amp;lt; $0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB&lt;/td&gt;
&lt;td&gt;100 writes, PAY_PER_REQUEST&lt;/td&gt;
&lt;td&gt;&amp;lt; $0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SES&lt;/td&gt;
&lt;td&gt;100 emails&lt;/td&gt;
&lt;td&gt;$0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt; $0.05/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The entire contact form backend costs pennies per month.&lt;/p&gt;
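
&lt;p&gt;The arithmetic behind the table can be checked in a few lines. This sketch uses only the figures from the table, treats each cost entry as an upper bound, and does not look up current AWS prices; the variable names are illustrative.&lt;/p&gt;

```python
from decimal import Decimal

# Lambda usage from the table: 100 invocations at 128 MB for 500 ms each.
invocations = 100
memory_gb = Decimal(128) / Decimal(1024)
duration_s = Decimal("0.5")
gb_seconds = invocations * memory_gb * duration_s

# Per-resource cost entries from the table, each treated as an upper bound.
upper_bounds = {
    "API Gateway HTTP API": Decimal("0.001"),
    "Lambda": Decimal("0.01"),
    "DynamoDB": Decimal("0.01"),
    "SES": Decimal("0.01"),
}
total_upper_bound = sum(upper_bounds.values(), Decimal("0"))

print(f"Lambda compute: {gb_seconds} GB-seconds per month")
print(f"Total upper bound: ${total_upper_bound} per month")
```

&lt;p&gt;The Lambda usage works out to 6.25 GB-seconds per month, and the per-resource bounds sum to $0.031, which is consistent with the total in the table.&lt;/p&gt;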

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>CI/CD with AWS CodePipeline and CodeBuild</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 00:06:17 +0000</pubDate>
      <link>https://forem.com/josh_blair/cicd-with-aws-codepipeline-and-codebuild-1h40</link>
      <guid>https://forem.com/josh_blair/cicd-with-aws-codepipeline-and-codebuild-1h40</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Every push to the &lt;code&gt;main&lt;/code&gt; branch on GitHub automatically builds the React app and deploys it to S3 + CloudFront — zero manual steps. This article covers how that pipeline is wired together using AWS CodePipeline, CodeBuild, and GitHub via CodeStar Connections.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pipeline Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hz9ido25k6apybbj4nu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hz9ido25k6apybbj4nu.png" alt="CI/CD pipeline" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  CloudFormation Stack
&lt;/h2&gt;

&lt;p&gt;The pipeline infrastructure is defined in &lt;code&gt;infra/stacks/pipeline.yml&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  CodeStar Connection (GitHub App)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;GitHubConnection&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CodeStarConnections::Connection&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ConnectionName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefish-github&lt;/span&gt;
    &lt;span class="na"&gt;ProviderType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GitHub&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Manual step required:&lt;/strong&gt; After deploying the stack, you must go to &lt;strong&gt;AWS Console → CodePipeline → Settings → Connections&lt;/strong&gt; and click "Update pending connection" to authorize the GitHub App. This cannot be automated — AWS requires explicit human approval to grant access to your GitHub account.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When authorizing, install the &lt;strong&gt;AWS Connector for GitHub&lt;/strong&gt; app on your GitHub account and grant it access to the specific repository. "Connect as a GitHub user" only works for CodeBuild — CodePipeline requires the GitHub App.&lt;/p&gt;
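
&lt;p&gt;Although the approval itself is manual, a script can at least check whether it has happened. The sketch below assumes a boto3 &lt;code&gt;codestar-connections&lt;/code&gt; client is passed in; the helper name &lt;code&gt;connection_is_available&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
def connection_is_available(client, connection_arn):
    """Return True once the connection has been authorized (hypothetical helper).

    `client` is assumed to be a boto3 "codestar-connections" client, or any
    object exposing a compatible get_connection method.
    """
    response = client.get_connection(ConnectionArn=connection_arn)
    # A freshly deployed connection reports PENDING until it is approved.
    return response["Connection"]["ConnectionStatus"] == "AVAILABLE"
```

&lt;p&gt;Passing the client in as an argument keeps the function easy to test with a stub, and it could gate the first pipeline run in a deploy script.&lt;/p&gt;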

&lt;h3&gt;
  
  
  Artifact Bucket
&lt;/h3&gt;

&lt;p&gt;Intermediate pipeline artifacts (source zip, build output) are stored in a private S3 bucket:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;ArtifactBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;BucketName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bonefish-pipeline-artifacts-${AWS::AccountId}'&lt;/span&gt;
    &lt;span class="na"&gt;VersioningConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;
    &lt;span class="na"&gt;PublicAccessBlockConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;IgnorePublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;RestrictPublicBuckets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CodeBuild Project
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;BuildProject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CodeBuild::Project&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefish-build&lt;/span&gt;
    &lt;span class="na"&gt;ServiceRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;CodeBuildRole.Arn&lt;/span&gt;
    &lt;span class="na"&gt;Artifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CODEPIPELINE&lt;/span&gt;
    &lt;span class="na"&gt;Environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LINUX_CONTAINER&lt;/span&gt;
      &lt;span class="na"&gt;ComputeType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BUILD_GENERAL1_SMALL&lt;/span&gt;
      &lt;span class="na"&gt;Image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws/codebuild/standard:7.0&lt;/span&gt;
      &lt;span class="na"&gt;EnvironmentVariables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;S3_BUCKET&lt;/span&gt;
          &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;S3BucketName&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DISTRIBUTION_ID&lt;/span&gt;
          &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;DistributionId&lt;/span&gt;
    &lt;span class="na"&gt;Source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CODEPIPELINE&lt;/span&gt;
      &lt;span class="na"&gt;BuildSpec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;buildspec.yml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CodePipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Pipeline&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CodePipeline::Pipeline&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefish-website-pipeline&lt;/span&gt;
    &lt;span class="na"&gt;RoleArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;PipelineRole.Arn&lt;/span&gt;
    &lt;span class="na"&gt;PipelineType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;V2&lt;/span&gt;
    &lt;span class="na"&gt;Stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Source&lt;/span&gt;
        &lt;span class="na"&gt;Actions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GitHub&lt;/span&gt;
            &lt;span class="na"&gt;ActionTypeId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;Category&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Source&lt;/span&gt;
              &lt;span class="na"&gt;Owner&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS&lt;/span&gt;
              &lt;span class="na"&gt;Provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CodeStarSourceConnection&lt;/span&gt;
              &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1'&lt;/span&gt;
            &lt;span class="na"&gt;Configuration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;ConnectionArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;GitHubConnection&lt;/span&gt;
              &lt;span class="na"&gt;FullRepositoryId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${GitHubOwner}/${GitHubRepo}'&lt;/span&gt;
              &lt;span class="na"&gt;BranchName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;GitHubBranch&lt;/span&gt;
              &lt;span class="na"&gt;DetectChanges&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="na"&gt;OutputArtifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SourceArtifact&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
        &lt;span class="na"&gt;Actions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BuildAndDeploy&lt;/span&gt;
            &lt;span class="na"&gt;ActionTypeId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;Category&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build&lt;/span&gt;
              &lt;span class="na"&gt;Owner&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS&lt;/span&gt;
              &lt;span class="na"&gt;Provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CodeBuild&lt;/span&gt;
              &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1'&lt;/span&gt;
            &lt;span class="na"&gt;Configuration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;ProjectName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;BuildProject&lt;/span&gt;
            &lt;span class="na"&gt;InputArtifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SourceArtifact&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DetectChanges: true&lt;/code&gt; means CodePipeline automatically triggers on every push to the configured branch — no webhooks to configure manually.&lt;/p&gt;




&lt;h2&gt;
  
  
  IAM Roles
&lt;/h2&gt;

&lt;p&gt;Two IAM roles are needed: one for CodePipeline, one for CodeBuild.&lt;/p&gt;

&lt;h3&gt;
  
  
  CodePipeline Role
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;PipelineRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
          &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;codepipeline.amazonaws.com&lt;/span&gt;
          &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts:AssumeRole&lt;/span&gt;
    &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PipelinePolicy&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ArtifactBucket&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PutObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetObjectVersion&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetBucketVersioning&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${ArtifactBucket.Arn}'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${ArtifactBucket.Arn}/*'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CodeBuild&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;codebuild&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;BatchGetBuilds&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;codebuild&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;StartBuild&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;BuildProject.Arn&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CodeStarConnection&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;codestar-connections&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;UseConnection&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;GitHubConnection&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CodeBuild Role
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;CodeBuildRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
          &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;codebuild.amazonaws.com&lt;/span&gt;
          &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts:AssumeRole&lt;/span&gt;
    &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CodeBuildPolicy&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Logs&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;CreateLogGroup&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;CreateLogStream&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PutLogEvents&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;*'&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ArtifactBucket&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PutObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetObjectVersion&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${ArtifactBucket.Arn}/*'&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;WebsiteSync&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PutObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;DeleteObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;GetObject&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;s3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;ListBucket&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:::${S3BucketName}'&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:::${S3BucketName}/*'&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Sid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CloudFrontInvalidation&lt;/span&gt;
              &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;cloudfront&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;CreateInvalidation&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:cloudfront::${AWS::AccountId}:distribution/${DistributionId}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is least privilege in practice: CodeBuild can write only to the artifact bucket and the website bucket, and it can invalidate only the one CloudFront distribution.&lt;/p&gt;
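
&lt;p&gt;The &lt;code&gt;Logs&lt;/code&gt; statement is the one grant in this role that still uses &lt;code&gt;Resource: '*'&lt;/code&gt;. A possible tightening, sketched under the assumption that the project writes to CodeBuild's default log group prefix &lt;code&gt;/aws/codebuild/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- Sid: Logs
  Effect: Allow
  Action: [logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents]
  # Assumption: the project logs under CodeBuild's default /aws/codebuild/ prefix.
  # A prefix wildcard (rather than !Ref BuildProject) avoids a circular
  # reference between this role and the project that uses it.
  Resource: !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/codebuild/*'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the project is configured with a custom log group, the ARN would need to match that name instead.&lt;/p&gt;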




&lt;h2&gt;
  
  
  buildspec.yml
&lt;/h2&gt;

&lt;p&gt;The build specification lives in the repo root and defines the install, build, and deploy phases that CodeBuild runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.2&lt;/span&gt;

&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;VITE_CONTACT_API_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# overridden by CodeBuild project env var&lt;/span&gt;

&lt;span class="na"&gt;phases&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;install&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runtime-versions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;nodejs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
    &lt;span class="na"&gt;commands&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;

  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;commands&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm run build&lt;/span&gt;

  &lt;span class="na"&gt;post_build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;commands&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;aws s3 sync dist/ s3://$S3_BUCKET --delete&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;aws cloudfront create-invalidation --distribution-id $DISTRIBUTION_ID --paths "/*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key points
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;npm ci&lt;/code&gt; not &lt;code&gt;npm install&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
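&lt;code&gt;npm ci&lt;/code&gt; installs exactly what's in &lt;code&gt;package-lock.json&lt;/code&gt; and fails if &lt;code&gt;package.json&lt;/code&gt; and the lockfile disagree, instead of updating the lockfile the way &lt;code&gt;npm install&lt;/code&gt; can. That keeps builds deterministic: the same package versions every time, in every environment.&lt;/p&gt;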

&lt;p&gt;&lt;strong&gt;&lt;code&gt;--delete&lt;/code&gt; flag on s3 sync&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Removes files from S3 that no longer exist in the build output. Without it, deleted pages and renamed assets would remain in the bucket and stay reachable through CloudFront.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CloudFront invalidation&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Vite includes content hashes in asset filenames (&lt;code&gt;index-BX7FeaXh.js&lt;/code&gt;), so JS/CSS files are automatically cache-busted. However, &lt;code&gt;index.html&lt;/code&gt; itself has no hash in its name, so it must be explicitly invalidated. Otherwise CloudFront keeps serving the cached copy until it expires instead of fetching the new version on the next request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;VITE_CONTACT_API_URL&lt;/code&gt; env var&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Vite's &lt;code&gt;import.meta.env.VITE_*&lt;/code&gt; variables are replaced at build time (not runtime). The API Gateway URL is injected by CodeBuild as an environment variable and baked into the built JS bundle. The frontend therefore ships with the correct endpoint URL and needs no runtime configuration; the trade-off is that changing the URL means running the build again.&lt;/p&gt;
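
&lt;p&gt;One way the project-level variables might be declared, as a sketch: the &lt;code&gt;Environment&lt;/code&gt; block below would sit under the &lt;code&gt;BuildProject&lt;/code&gt; resource's &lt;code&gt;Properties&lt;/code&gt;. &lt;code&gt;S3BucketName&lt;/code&gt; and &lt;code&gt;DistributionId&lt;/code&gt; are the names already referenced in the CodeBuild role policy and are assumed here to be stack parameters. &lt;code&gt;ContactApiUrl&lt;/code&gt; is a hypothetical parameter, and the image and compute type are assumptions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;Environment:
  Type: LINUX_CONTAINER
  ComputeType: BUILD_GENERAL1_SMALL      # assumption: smallest general-purpose size
  Image: aws/codebuild/standard:7.0      # assumption: an image that offers the Node.js 20 runtime
  EnvironmentVariables:
    - Name: S3_BUCKET                    # read by "aws s3 sync" in post_build
      Value: !Ref S3BucketName
    - Name: DISTRIBUTION_ID              # read by "aws cloudfront create-invalidation"
      Value: !Ref DistributionId
    - Name: VITE_CONTACT_API_URL         # replaces the empty default in buildspec.yml
      Value: !Ref ContactApiUrl          # hypothetical parameter holding the API Gateway URL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Values set on the project take precedence over the &lt;code&gt;env.variables&lt;/code&gt; defaults in the buildspec, which is why the empty string there works as a placeholder.&lt;/p&gt;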




&lt;h2&gt;
  
  
  Troubleshooting the GitHub Connection
&lt;/h2&gt;

&lt;p&gt;The CodeStar Connection requires careful setup. Common issues encountered:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Status: PENDING after stack deploy&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
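This is expected. A connection created through CloudFormation or the CLI starts in &lt;code&gt;PENDING&lt;/code&gt;, and the authorization handshake with GitHub can only be completed in the AWS Console.&lt;/p&gt;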

&lt;p&gt;&lt;strong&gt;"No Branch found" error&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The GitHub App was authorized but the private repository wasn't explicitly granted access. Fix: GitHub → Settings → Applications → AWS Connector for GitHub → Configure → add the repo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Role does not have sufficient permissions" error&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
After replacing a broken connection with a new one, the CodePipeline IAM role policy still referenced the old connection ARN. Fix: update the &lt;code&gt;codestar-connections:UseConnection&lt;/code&gt; resource ARN in the IAM policy to match the new connection ARN.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deployment Timeline
&lt;/h2&gt;

&lt;p&gt;From &lt;code&gt;git push&lt;/code&gt; to live site:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CodePipeline detects change&lt;/td&gt;
&lt;td&gt;~10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source download from GitHub&lt;/td&gt;
&lt;td&gt;~15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;npm ci&lt;/code&gt; (cache warm)&lt;/td&gt;
&lt;td&gt;~30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;npm run build&lt;/code&gt; (tsc + vite)&lt;/td&gt;
&lt;td&gt;~15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws s3 sync&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudFront invalidation&lt;/td&gt;
&lt;td&gt;~10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~90 seconds&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Full deployment in under two minutes on every push to main.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>tutorial</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Static Site Hosting on AWS — S3, CloudFront, ACM, and Route 53</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 00:06:08 +0000</pubDate>
      <link>https://forem.com/josh_blair/static-site-hosting-on-aws-s3-cloudfront-acm-and-route-53-20b2</link>
      <guid>https://forem.com/josh_blair/static-site-hosting-on-aws-s3-cloudfront-acm-and-route-53-20b2</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;This article covers deploying a static React site to AWS using S3 as the origin, CloudFront as the CDN, ACM for TLS certificates, and Route 53 for DNS. Everything is defined as CloudFormation infrastructure-as-code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v63mlwmlkxu22f9tejt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v63mlwmlkxu22f9tejt.png" alt="Static hosting architecture" width="800" height="575"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  CloudFormation Infrastructure
&lt;/h2&gt;

&lt;p&gt;The hosting infrastructure is split across &lt;strong&gt;three stacks&lt;/strong&gt; deployed in sequence:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Region&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bonefish-acm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;infra/acm/certificate.yml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;us-east-1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ACM TLS certificate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bonefish-website&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;infra/stacks/website.yml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;us-west-2&lt;/td&gt;
&lt;td&gt;S3 + CloudFront&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bonefish-pipeline&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;infra/stacks/pipeline.yml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;us-west-2&lt;/td&gt;
&lt;td&gt;CI/CD (covered in article 4)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  ACM Certificate (us-east-1)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why us-east-1?&lt;/strong&gt; CloudFront is a global service that only accepts ACM certificates provisioned in &lt;code&gt;us-east-1&lt;/code&gt;. This is an AWS requirement regardless of where your other resources live.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# infra/acm/certificate.yml&lt;/span&gt;
&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Certificate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CertificateManager::Certificate&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;DomainName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefishsoftware.com&lt;/span&gt;
      &lt;span class="na"&gt;SubjectAlternativeNames&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;www.bonefishsoftware.com&lt;/span&gt;
      &lt;span class="na"&gt;ValidationMethod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DNS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



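&lt;p&gt;&lt;strong&gt;DNS validation:&lt;/strong&gt; ACM generates one validation CNAME record for each name on the certificate (two here, for the apex and &lt;code&gt;www&lt;/code&gt;), and each must be added to your DNS zone. Because the domain is managed by Route 53, we added them via CLI (one record shown):&lt;br&gt;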
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws route53 change-resource-record-sets &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hosted-zone-id&lt;/span&gt; &lt;span class="nv"&gt;$ZONE_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--change-batch&lt;/span&gt; &lt;span class="s1"&gt;'{ "Changes": [
    { "Action": "UPSERT", "ResourceRecordSet": {
      "Name": "_bf48ff...bonefishsoftware.com.",
      "Type": "CNAME", "TTL": 300,
      "ResourceRecords": [{"Value": "_ea68....acm-validations.aws."}]
    }}
  ]}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the domain's nameservers resolve correctly, ACM validates automatically (typically 2–5 minutes).&lt;/p&gt;
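
&lt;p&gt;If the hosted zone lives in the same account as the certificate stack, CloudFormation can create the validation records itself through &lt;code&gt;DomainValidationOptions&lt;/code&gt;. The following is a sketch of that alternative, not what this project did; &lt;code&gt;HostedZoneId&lt;/code&gt; is a hypothetical stack parameter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;Parameters:
  HostedZoneId:
    Type: AWS::Route53::HostedZone::Id   # hypothetical parameter: the bonefishsoftware.com zone

Resources:
  Certificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: bonefishsoftware.com
      SubjectAlternativeNames:
        - www.bonefishsoftware.com
      ValidationMethod: DNS
      # One entry per name on the certificate; CloudFormation writes
      # the validation CNAMEs into the referenced hosted zone.
      DomainValidationOptions:
        - DomainName: bonefishsoftware.com
          HostedZoneId: !Ref HostedZoneId
        - DomainName: www.bonefishsoftware.com
          HostedZoneId: !Ref HostedZoneId
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this in place the stack would stay in progress until the certificate is issued, so the manual CNAME step is no longer needed.&lt;/p&gt;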




&lt;h2&gt;
  
  
  S3 Bucket
&lt;/h2&gt;

&lt;p&gt;The S3 bucket is &lt;strong&gt;private&lt;/strong&gt; — no public access whatsoever. CloudFront accesses it via OAC (Origin Access Control).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;WebsiteBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;BucketName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefishsoftware-com-website&lt;/span&gt;
    &lt;span class="na"&gt;VersioningConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;
    &lt;span class="na"&gt;PublicAccessBlockConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;BlockPublicPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;IgnorePublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;RestrictPublicBuckets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bucket Policy — allow CloudFront OAC only
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;WebsiteBucketPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::BucketPolicy&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;WebsiteBucket&lt;/span&gt;
    &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
          &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cloudfront.amazonaws.com&lt;/span&gt;
          &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3:GetObject&lt;/span&gt;
          &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${WebsiteBucket.Arn}/*'&lt;/span&gt;
          &lt;span class="na"&gt;Condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;StringEquals&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;AWS:SourceArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
                &lt;span class="s"&gt;arn:aws:cloudfront::${AWS::AccountId}:distribution/${Distribution}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;AWS:SourceArn&lt;/code&gt; condition restricts reads to our specific CloudFront distribution. Without it, the &lt;code&gt;cloudfront.amazonaws.com&lt;/code&gt; service principal would match any CloudFront distribution in any AWS account.&lt;/p&gt;




&lt;h2&gt;
  
  
  Origin Access Control (OAC)
&lt;/h2&gt;

&lt;p&gt;OAC is the modern replacement for Origin Access Identity (OAI). Key advantages:&lt;/p&gt;

&lt;ul&gt;
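&lt;li&gt;Supports dynamic requests (&lt;code&gt;PUT&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt;) to S3, not just reads&lt;/li&gt;
&lt;li&gt;Works with SSE-KMS encrypted buckets&lt;/li&gt;
&lt;li&gt;Signs origin requests with AWS SigV4 using short-lived credentials
&lt;/li&gt;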
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;OriginAccessControl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CloudFront::OriginAccessControl&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;OriginAccessControlConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bonefishsoftware-com-oac&lt;/span&gt;
      &lt;span class="na"&gt;OriginAccessControlOriginType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3&lt;/span&gt;
      &lt;span class="na"&gt;SigningBehavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
      &lt;span class="na"&gt;SigningProtocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sigv4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  CloudFront Distribution
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Distribution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CloudFront::Distribution&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;DistributionConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;DefaultRootObject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;index.html&lt;/span&gt;
      &lt;span class="na"&gt;Aliases&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;bonefishsoftware.com&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;www.bonefishsoftware.com&lt;/span&gt;
      &lt;span class="na"&gt;ViewerCertificate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;AcmCertificateArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;CertificateArn&lt;/span&gt;
        &lt;span class="na"&gt;SslSupportMethod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sni-only&lt;/span&gt;
        &lt;span class="na"&gt;MinimumProtocolVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TLSv1.2_2021&lt;/span&gt;
      &lt;span class="na"&gt;HttpVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http2and3&lt;/span&gt;
      &lt;span class="na"&gt;Origins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;S3Origin&lt;/span&gt;
          &lt;span class="na"&gt;DomainName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;WebsiteBucket.RegionalDomainName&lt;/span&gt;
          &lt;span class="na"&gt;S3OriginConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;OriginAccessIdentity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;
          &lt;span class="na"&gt;OriginAccessControlId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;OriginAccessControl.Id&lt;/span&gt;
      &lt;span class="na"&gt;DefaultCacheBehavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;TargetOriginId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;S3Origin&lt;/span&gt;
        &lt;span class="na"&gt;ViewerProtocolPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redirect-to-https&lt;/span&gt;
        &lt;span class="na"&gt;CachePolicyId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;658327ea-f89d-4fab-a63d-7e88639e58f6&lt;/span&gt;  &lt;span class="c1"&gt;# CachingOptimized&lt;/span&gt;
        &lt;span class="na"&gt;Compress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;CustomErrorResponses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;403&lt;/span&gt;
          &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
          &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
          &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;404&lt;/span&gt;
          &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
          &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
          &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
      &lt;span class="na"&gt;PriceClass&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriceClass_100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key configuration points
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;RegionalDomainName&lt;/code&gt; not &lt;code&gt;DomainName&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
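Always use &lt;code&gt;WebsiteBucket.RegionalDomainName&lt;/code&gt; (e.g. &lt;code&gt;bucket.s3.us-west-2.amazonaws.com&lt;/code&gt;) when configuring an S3 origin with OAC. For a recently created bucket outside &lt;code&gt;us-east-1&lt;/code&gt;, the global &lt;code&gt;DomainName&lt;/code&gt; can answer with temporary redirects to the regional endpoint, and the origin then fails until DNS catches up.&lt;/p&gt;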

&lt;p&gt;&lt;strong&gt;&lt;code&gt;PriceClass_100&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
CloudFront price classes control which edge locations serve your content. &lt;code&gt;PriceClass_100&lt;/code&gt; covers North America and Europe, a sensible default when most visitors are in the US. &lt;code&gt;PriceClass_All&lt;/code&gt; adds the remaining edge locations, including Asia Pacific and South America, at higher cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;http2and3&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Enables HTTP/2 and HTTP/3 (QUIC). Browsers that support them get better performance; others fall back to HTTP/1.1.&lt;/p&gt;


&lt;h2&gt;
  
  
  Route 53 DNS Setup
&lt;/h2&gt;

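&lt;p&gt;Since the domain is registered in Route 53, the nameservers were updated via CLI. The &lt;code&gt;route53domains&lt;/code&gt; API is only available in &lt;code&gt;us-east-1&lt;/code&gt;, hence the explicit &lt;code&gt;--region&lt;/code&gt;:&lt;br&gt;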
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws route53domains update-domain-nameservers &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--domain-name&lt;/span&gt; bonefishsoftware.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--nameservers&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ns-1325.awsdns-37.org &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ns-759.awsdns-30.net &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ns-1601.awsdns-08.co.uk &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ns-72.awsdns-09.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



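&lt;p&gt;Two &lt;strong&gt;Alias records&lt;/strong&gt; point the apex domain and &lt;code&gt;www&lt;/code&gt; to CloudFront. The apex record is shown; the &lt;code&gt;www&lt;/code&gt; record differs only in its &lt;code&gt;Name&lt;/code&gt;:&lt;br&gt;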
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws route53 change-resource-record-sets &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hosted-zone-id&lt;/span&gt; &lt;span class="nv"&gt;$ZONE_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--change-batch&lt;/span&gt; &lt;span class="s1"&gt;'{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "bonefishsoftware.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z2FDTNDATAQYW2",
          "DNSName": "d39qxh6q0wxdkd.cloudfront.net",
          "EvaluateTargetHealth": false
        }
      }
    }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;Z2FDTNDATAQYW2&lt;/code&gt;&lt;/strong&gt; is the fixed hosted zone ID that Route 53 uses for every CloudFront distribution; it is not specific to yours. Use this value in any alias record that targets CloudFront.&lt;/p&gt;
&lt;/blockquote&gt;
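
&lt;p&gt;The command above upserts only the apex record. As a rough sketch, the change batch for both names could be generated instead of hand-written. The helper name &lt;code&gt;buildAliasChangeBatch&lt;/code&gt; and the &lt;code&gt;www&lt;/code&gt; record name are illustrative assumptions, not taken from the project's scripts:&lt;/p&gt;

```typescript
// Hypothetical helper (not from the original project): builds the JSON
// change batch for Route 53 alias records that point at CloudFront.

// Fixed hosted zone ID that Route 53 uses for every CloudFront alias target.
const CLOUDFRONT_HOSTED_ZONE_ID = "Z2FDTNDATAQYW2";

interface AliasChange {
  Action: "UPSERT";
  ResourceRecordSet: {
    Name: string;
    Type: "A";
    AliasTarget: {
      HostedZoneId: string;
      DNSName: string;
      EvaluateTargetHealth: boolean;
    };
  };
}

function buildAliasChangeBatch(
  recordNames: string[],
  distributionDomain: string
): { Changes: AliasChange[] } {
  return {
    Changes: recordNames.map((recordName): AliasChange => ({
      Action: "UPSERT",
      ResourceRecordSet: {
        Name: recordName,
        Type: "A",
        AliasTarget: {
          HostedZoneId: CLOUDFRONT_HOSTED_ZONE_ID,
          DNSName: distributionDomain,
          EvaluateTargetHealth: false,
        },
      },
    })),
  };
}

// Apex and www share the same CloudFront target.
const aliasBatch = buildAliasChangeBatch(
  ["bonefishsoftware.com", "www.bonefishsoftware.com"],
  "d39qxh6q0wxdkd.cloudfront.net"
);

// Pass this string to `aws route53 change-resource-record-sets --change-batch`.
const aliasBatchJson = JSON.stringify(aliasBatch, null, 2);
```

&lt;p&gt;Generating the batch keeps the two records identical apart from their names, which is the usual source of copy-paste mistakes here.&lt;/p&gt;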




&lt;h2&gt;
  
  
  Deployment Order
&lt;/h2&gt;

&lt;p&gt;The order matters because each stack depends on outputs from the previous one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Create Route 53 hosted zone
   → Get nameservers, update domain registrar

2. Deploy ACM certificate stack (us-east-1)
   → Add DNS validation CNAMEs to Route 53
   → Wait for ISSUED status (~2–5 min once DNS propagates)

3. Deploy website stack (us-west-2)
   → Pass CertificateArn as parameter
   → Outputs: BucketName, DistributionId, DistributionDomain

4. Deploy pipeline stack (us-west-2)
   → Pass BucketName + DistributionId from step 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
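
&lt;p&gt;The ordering above is a dependency chain, so it can be derived rather than memorized. The sketch below resolves a deploy order from a dependency map; the stack labels (&lt;code&gt;hosted-zone&lt;/code&gt;, &lt;code&gt;certificate&lt;/code&gt;, and so on) and the function name are made up for illustration and are not the project's real stack names:&lt;/p&gt;

```typescript
// Sketch: derive a valid deploy order from the dependencies listed above.
// Stack keys are illustrative labels, not the project's real stack names.
const stackDependencies: { [stack: string]: string[] } = {
  "pipeline": ["website"],
  "website": ["certificate"],
  "certificate": ["hosted-zone"],
  "hosted-zone": [],
};

function deploymentOrder(dependencies: { [stack: string]: string[] }): string[] {
  const ordered: string[] = [];
  const visiting: { [stack: string]: boolean } = {};

  function visit(stack: string): void {
    if (ordered.indexOf(stack) !== -1) return; // already scheduled
    if (visiting[stack]) throw new Error(`Circular dependency at ${stack}`);
    visiting[stack] = true;
    (dependencies[stack] || []).forEach((prerequisite) => visit(prerequisite));
    visiting[stack] = false;
    ordered.push(stack); // every prerequisite is already in the list
  }

  Object.keys(dependencies).forEach((stack) => visit(stack));
  return ordered;
}

// Prerequisites always come out first, whatever order the map is written in.
const stackOrder = deploymentOrder(stackDependencies);
```

&lt;p&gt;With only four stacks a hand-written list is fine; the point of the sketch is that each step exists solely because the next one consumes its outputs.&lt;/p&gt;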






&lt;h2&gt;
  
  
  Gradual Cutover Strategy
&lt;/h2&gt;

&lt;p&gt;Rather than blocking on the ACM cert, the &lt;code&gt;website.yml&lt;/code&gt; template supports deploying &lt;strong&gt;without a cert&lt;/strong&gt; first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;CertificateArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
    &lt;span class="na"&gt;Default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;   &lt;span class="c1"&gt;# ← Leave blank for initial deploy&lt;/span&gt;

&lt;span class="na"&gt;Conditions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;HasCustomDomain&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!And&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Not&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Equals&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;CertificateArn&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="pi"&gt;]]&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Not&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Equals&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;DomainName&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="pi"&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets you deploy the full CloudFront + S3 stack immediately, get the CloudFront URL (&lt;code&gt;*.cloudfront.net&lt;/code&gt;), test it, and then &lt;strong&gt;update the stack&lt;/strong&gt; with the cert ARN once it's issued — no downtime, no waiting.&lt;/p&gt;
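
&lt;p&gt;The &lt;code&gt;HasCustomDomain&lt;/code&gt; condition reads as "a certificate ARN and a domain name were both supplied". A minimal TypeScript mirror of that logic can help when reasoning about which parameter combinations attach the custom domain. The function name is hypothetical, and the sketch assumes the &lt;code&gt;DomainName&lt;/code&gt; parameter (not shown in the excerpt) can also be an empty string:&lt;/p&gt;

```typescript
// Illustrative mirror of the HasCustomDomain condition. hasCustomDomain is a
// hypothetical name; CloudFormation evaluates the real condition itself.
interface WebsiteStackParams {
  certificateArn: string; // '' on the initial deploy
  domainName: string; // assumption: may also be ''
}

function hasCustomDomain(params: WebsiteStackParams): boolean {
  // !And of two "!Not [!Equals [value, '']]" checks: true only when
  // every value is non-empty.
  return [params.certificateArn, params.domainName].every((value) => value !== "");
}

// First deploy: no certificate yet, so the condition is false.
const initialDeploy = hasCustomDomain({
  certificateArn: "",
  domainName: "bonefishsoftware.com",
});

// Later stack update: certificate issued, so the condition becomes true.
// The ARN below is a placeholder, not a real certificate.
const afterCertIssued = hasCustomDomain({
  certificateArn: "arn:aws:acm:us-east-1:123456789012:certificate/example",
  domainName: "bonefishsoftware.com",
});
```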




&lt;h2&gt;
  
  
  SES DKIM Records
&lt;/h2&gt;

&lt;p&gt;For outbound email from &lt;code&gt;noreply@bonefishsoftware.com&lt;/code&gt;, three DKIM CNAME records were added to Route 53:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jbly6mlsiqbi6h7rnnfthdfkiaxzmvjn._domainkey.bonefishsoftware.com
  → jbly6mlsiqbi6h7rnnfthdfkiaxzmvjn.dkim.amazonses.com

xj7cx3tujwzexkmsxmfruyannxq3sevj._domainkey.bonefishsoftware.com
  → xj7cx3tujwzexkmsxmfruyannxq3sevj.dkim.amazonses.com

fyee6qf6o4fzqiolzxl2dyd4rjboqw4l._domainkey.bonefishsoftware.com
  → fyee6qf6o4fzqiolzxl2dyd4rjboqw4l.dkim.amazonses.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These records publish the public keys that receiving mail servers use to verify the DKIM signatures SES adds to outbound mail. Valid signatures improve deliverability and help receivers detect spoofed messages.&lt;/p&gt;
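
&lt;p&gt;All three records follow one pattern: the token SES issues becomes both the &lt;code&gt;_domainkey&lt;/code&gt; label and the &lt;code&gt;dkim.amazonses.com&lt;/code&gt; target. As a small sketch (the helper &lt;code&gt;dkimCnameRecords&lt;/code&gt; is hypothetical and not part of any AWS SDK), the records can be derived from the tokens:&lt;/p&gt;

```typescript
// Sketch only: derives the CNAME records from the DKIM tokens SES issues.
// dkimCnameRecords is a hypothetical helper, not part of any AWS SDK.
interface CnameRecord {
  recordName: string;
  recordValue: string;
}

function dkimCnameRecords(domain: string, tokens: string[]): CnameRecord[] {
  return tokens.map((token) => ({
    recordName: `${token}._domainkey.${domain}`,
    recordValue: `${token}.dkim.amazonses.com`,
  }));
}

// Tokens copied from the records listed above.
const dkimRecords = dkimCnameRecords("bonefishsoftware.com", [
  "jbly6mlsiqbi6h7rnnfthdfkiaxzmvjn",
  "xj7cx3tujwzexkmsxmfruyannxq3sevj",
  "fyee6qf6o4fzqiolzxl2dyd4rjboqw4l",
]);
```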

</description>
      <category>aws</category>
      <category>cloudfront</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>React + Vite + TypeScript + Tailwind CSS v4 — Project Setup</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 00:05:15 +0000</pubDate>
      <link>https://forem.com/josh_blair/react-vite-typescript-tailwind-css-v4-project-setup-4c34</link>
      <guid>https://forem.com/josh_blair/react-vite-typescript-tailwind-css-v4-project-setup-4c34</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;This article covers scaffolding a production-ready React single-page application using Vite, TypeScript, and Tailwind CSS v4. This is the frontend that gets built into static assets and served from S3 + CloudFront.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Choices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Vite over Create React App
&lt;/h3&gt;

&lt;p&gt;CRA is deprecated, and Vite is the common replacement. Its dev server serves source files as native ES modules for near-instant startup, and the production build is bundled and optimized with Rollup.&lt;/p&gt;

&lt;h3&gt;
  
  
  TypeScript
&lt;/h3&gt;

&lt;p&gt;Type safety catches many bugs at compile time instead of at runtime. The infrastructure is already defined as code in CloudFormation templates; TypeScript brings the same discipline to the frontend.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tailwind CSS v4
&lt;/h3&gt;

&lt;p&gt;Tailwind v4 (stable release in January 2025) is a ground-up rewrite. Key changes from v3:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No &lt;code&gt;tailwind.config.js&lt;/code&gt; required&lt;/strong&gt;: theme customization moves into CSS via &lt;code&gt;@theme&lt;/code&gt; in your stylesheet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No PostCSS config file&lt;/strong&gt;: Vite projects use the &lt;code&gt;@tailwindcss/vite&lt;/code&gt; plugin instead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSS-first configuration&lt;/strong&gt; — design tokens are CSS custom properties&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Scaffolding
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create project with react-ts template&lt;/span&gt;
npm create vite@latest &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nt"&gt;--template&lt;/span&gt; react-ts

&lt;span class="c"&gt;# Install routing and Tailwind&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;react-router-dom
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; tailwindcss @tailwindcss/vite
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  vite.config.ts
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;react&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@vitejs/plugin-react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;tailwindcss&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@tailwindcss/vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;react&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nf"&gt;tailwindcss&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;@tailwindcss/vite&lt;/code&gt; plugin replaces the old PostCSS setup. No &lt;code&gt;postcss.config.js&lt;/code&gt; needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  src/index.css
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="c"&gt;/* Google Fonts import MUST come before @import "tailwindcss" */&lt;/span&gt;
&lt;span class="k"&gt;@import&lt;/span&gt; &lt;span class="sx"&gt;url('https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@300;400;500;600;700&amp;amp;display=swap')&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;@import&lt;/span&gt; &lt;span class="s1"&gt;"tailwindcss"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c"&gt;/* Custom design tokens via @theme (Tailwind v4 CSS-first config) */&lt;/span&gt;
&lt;span class="k"&gt;@theme&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="py"&gt;--color-bg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#111318&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--color-surface&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#1C2028&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--color-accent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#00D4FF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--color-text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#F0F4F8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--color-text-muted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#8B95A3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--color-border&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#2A3040&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="py"&gt;--font-sans&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;'Space Grotesk'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;system-ui&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sans-serif&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nt"&gt;body&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;background-color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#111318&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#F0F4F8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;font-family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;'Space Grotesk'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;system-ui&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sans-serif&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;#root&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;min-height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="n"&gt;dvh&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;flex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;flex-direction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;column&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; In Tailwind v4, &lt;code&gt;@import "tailwindcss"&lt;/code&gt; expands to real CSS rules. CSS requires &lt;code&gt;@import&lt;/code&gt; rules to precede all other rules (apart from &lt;code&gt;@charset&lt;/code&gt; and &lt;code&gt;@layer&lt;/code&gt; statements), so any &lt;code&gt;@import url(...)&lt;/code&gt;, such as Google Fonts, &lt;strong&gt;must appear before it&lt;/strong&gt; or the browser ignores the font import. The build emitted a CSS warning until we fixed the order.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/
├── components/
│   ├── Navbar.tsx        # Sticky nav, mobile hamburger, active link highlighting
│   ├── Footer.tsx        # Logo, nav links, LinkedIn + GitHub icons
│   └── SectionHeader.tsx # Reusable section heading (title + subtitle)
├── pages/
│   ├── Home.tsx          # Hero, featured services, cert strip, CTA banner
│   ├── Services.tsx      # Full 6-card service grid
│   ├── Technologies.tsx  # Grouped tech badges + certifications
│   ├── Portfolio.tsx     # Placeholder project cards
│   ├── Team.tsx          # Bio card with photo, certs, social links
│   └── Contact.tsx       # Form → API Gateway fetch
├── data/
│   ├── services.ts       # Service card data (title, description, icon)
│   ├── technologies.ts   # Tech groups + certifications
│   └── team.ts           # Team member data
├── App.tsx               # BrowserRouter + route config
├── main.tsx              # React DOM root
└── index.css             # Global styles + Tailwind v4 config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Data-driven content
&lt;/h3&gt;

&lt;p&gt;Rather than hardcoding content in page components, all repeated content lives in typed data files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/data/services.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;services&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Service&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;eda-serverless&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Event-Driven Architecture &amp;amp; Serverless&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Architect decoupled, resilient systems using SQS, SNS, EventBridge, and Lambda...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;icon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;⚡&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes it trivial to add, update, or reorder content without touching component markup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Routing
&lt;/h2&gt;

&lt;p&gt;React Router v7 handles client-side routing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/App.tsx&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;BrowserRouter&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Routes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Navigate&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-router-dom&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;App&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Navbar&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Routes&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Home&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/services"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Services&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/technologies"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Technologies&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/portfolio"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Portfolio&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/team"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Team&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/contact"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Contact&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Route&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"*"&lt;/span&gt; &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Navigate&lt;/span&gt; &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt; &lt;span class="na"&gt;replace&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Routes&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Footer&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  SPA Routing on CloudFront
&lt;/h3&gt;

&lt;p&gt;Because React Router handles navigation client-side, every route (&lt;code&gt;/services&lt;/code&gt;, &lt;code&gt;/team&lt;/code&gt;, etc.) is served by the same &lt;code&gt;index.html&lt;/code&gt;. When a user navigates directly to &lt;code&gt;https://bonefishsoftware.com/team&lt;/code&gt;, CloudFront asks S3 for a &lt;code&gt;team&lt;/code&gt; object that doesn't exist. S3 answers with a &lt;strong&gt;403&lt;/strong&gt; rather than a 404 when the caller lacks &lt;code&gt;s3:ListBucket&lt;/code&gt; permission, which is typical for a private bucket behind CloudFront. CloudFront must therefore map both 403 and 404 errors back to &lt;code&gt;index.html&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In website.yml CloudFormation&lt;/span&gt;
&lt;span class="na"&gt;CustomErrorResponses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;403&lt;/span&gt;
    &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
    &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ErrorCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;404&lt;/span&gt;
    &lt;span class="na"&gt;ResponseCode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;ResponsePagePath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/index.html&lt;/span&gt;
    &lt;span class="na"&gt;ErrorCachingMinTTL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without this, loading or refreshing any non-root URL directly returns an error response instead of the app.&lt;/p&gt;
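
&lt;p&gt;To make the effect of the two &lt;code&gt;CustomErrorResponses&lt;/code&gt; entries concrete, here is a deliberately simplified model of the mapping. It is not how CloudFront is implemented, and the type and function names are invented for the example:&lt;/p&gt;

```typescript
// Simplified model of the CustomErrorResponses config above. This is not how
// CloudFront works internally, only the mapping it applies to origin errors.
interface OriginResult {
  statusCode: number;
  servedPath: string;
}

// The two ErrorCode values configured in website.yml.
const REWRITTEN_ERROR_CODES = [403, 404];

function applyCustomErrorResponses(originResult: OriginResult): OriginResult {
  if (REWRITTEN_ERROR_CODES.indexOf(originResult.statusCode) !== -1) {
    // Serve the SPA shell with a 200 so React Router can resolve the
    // requested path in the browser.
    return { statusCode: 200, servedPath: "/index.html" };
  }
  return originResult;
}

// Direct hit on /team: S3 has no such key and answers 403.
const deepLink = applyCustomErrorResponses({ statusCode: 403, servedPath: "/team" });

// A real asset passes through untouched.
const assetHit = applyCustomErrorResponses({ statusCode: 200, servedPath: "/logo.svg" });
```

&lt;p&gt;One trade-off worth knowing: a genuinely missing asset also comes back as &lt;code&gt;index.html&lt;/code&gt; with a 200, so broken asset URLs fail less visibly.&lt;/p&gt;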




&lt;h2&gt;
  
  
  Design System
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Color palette&lt;/strong&gt; (dark charcoal + electric cyan):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Token&lt;/th&gt;
&lt;th&gt;Hex&lt;/th&gt;
&lt;th&gt;Usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bg&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#111318&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Page background&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;surface&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#1C2028&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cards, panels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;surface-2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#232936&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Nested surfaces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;accent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#00D4FF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cyan highlights, links, active states&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;text&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#F0F4F8&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Primary text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;text-muted&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#8B95A3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Secondary text, descriptions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;border&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;#2A3040&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Card borders, dividers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Font:&lt;/strong&gt; &lt;a href="https://fonts.google.com/specimen/Space+Grotesk" rel="noopener noreferrer"&gt;Space Grotesk&lt;/a&gt; — a modern geometric sans-serif that feels technical but approachable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Logo Integration
&lt;/h2&gt;

&lt;p&gt;The company logo is an SVG with a &lt;strong&gt;transparent background&lt;/strong&gt;. The white stripe visible in the logo is part of the Colorado flag design — it's not a background fill.&lt;/p&gt;

&lt;p&gt;Using the SVG directly in the Navbar (instead of PNG) means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero white-box artifact on the dark background&lt;/li&gt;
&lt;li&gt;Infinitely scalable — looks sharp at any size&lt;/li&gt;
&lt;li&gt;Small file size
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Navbar.tsx&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;NavLink&lt;/span&gt; &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;img&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/logo.svg"&lt;/span&gt; &lt;span class="na"&gt;alt&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Bonefish Software"&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"h-10 w-auto"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;NavLink&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The no-text variant (&lt;code&gt;/logo-icon.svg&lt;/code&gt;) is used in the footer where horizontal space is tighter.&lt;/p&gt;




&lt;h2&gt;
  
  
  Production Build
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run build
&lt;span class="c"&gt;# → tsc -b &amp;amp;&amp;amp; vite build&lt;/span&gt;
&lt;span class="c"&gt;# → dist/ (index.html + hashed JS/CSS bundles)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build output for this site:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;dist/index.html                   0.64 kB │ gzip:  0.39 kB
dist/assets/index-[hash].css     22.21 kB │ gzip:  4.91 kB
dist/assets/index-[hash].js     257.69 kB │ gzip: 81.09 kB
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire site gzips to roughly &lt;strong&gt;86 kB&lt;/strong&gt; (0.39 + 4.91 + 81.09 = 86.39 kB), which loads quickly even on slow connections.&lt;/p&gt;

</description>
      <category>react</category>
      <category>webdev</category>
      <category>vite</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Building a Production Company Website on AWS — Project Overview</title>
      <dc:creator>Josh Blair</dc:creator>
      <pubDate>Thu, 21 May 2026 00:05:06 +0000</pubDate>
      <link>https://forem.com/josh_blair/building-a-production-company-website-on-aws-project-overview-2dhm</link>
      <guid>https://forem.com/josh_blair/building-a-production-company-website-on-aws-project-overview-2dhm</guid>
      <description>&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;p&gt;This series documents the end-to-end process of designing, building, and deploying a production company website for a software and cloud consulting business. The result is a modern, fully serverless stack with automated CI/CD deployments — built intentionally to demonstrate and practice the AWS services I use professionally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech stack at a glance:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;React 18 + Vite + TypeScript + Tailwind CSS v4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hosting&lt;/td&gt;
&lt;td&gt;Amazon S3 + CloudFront (CDN)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DNS &amp;amp; TLS&lt;/td&gt;
&lt;td&gt;Route 53 + ACM (SSL/TLS certificate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD&lt;/td&gt;
&lt;td&gt;GitHub + AWS CodePipeline + CodeBuild&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contact API&lt;/td&gt;
&lt;td&gt;API Gateway (HTTP) + AWS Lambda (Python)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data storage&lt;/td&gt;
&lt;td&gt;Amazon DynamoDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email delivery&lt;/td&gt;
&lt;td&gt;Amazon SES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IaC&lt;/td&gt;
&lt;td&gt;AWS CloudFormation + AWS SAM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v63mlwmlkxu22f9tejt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v63mlwmlkxu22f9tejt.png" alt="Architecture overview" width="800" height="575"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  CI/CD Pipeline
&lt;/h2&gt;

&lt;p&gt;Every push to the &lt;code&gt;main&lt;/code&gt; branch on GitHub automatically triggers a full build and deploy:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hz9ido25k6apybbj4nu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hz9ido25k6apybbj4nu.png" alt="CI/CD pipeline" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;
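
&lt;p&gt;As a rough sketch of what the build stage runs, a &lt;code&gt;buildspec.yml&lt;/code&gt; for a Vite project could look something like the following. The Node.js version and the npm scripts are assumptions for illustration, not the project's actual file; part 4 of the series covers the real pipeline.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# buildspec.yml (illustrative sketch, not the project's actual file)
version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 20          # assumed runtime; match it to your CodeBuild image
    commands:
      - npm ci            # reproducible install from package-lock.json
  build:
    commands:
      - npm run build     # assumed script; Vite writes the static site to dist/

artifacts:
  base-directory: dist    # hand only the built site to the deploy stage
  files:
    - "**/*"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;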




&lt;h2&gt;
  
  
  Articles in This Series
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Article&lt;/th&gt;
&lt;th&gt;What it covers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Project Overview&lt;/strong&gt; &lt;em&gt;(this article)&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;Full architecture, stack decisions, what we're building&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/josh_blair/react-vite-typescript-tailwind-css-v4-project-setup-4c34"&gt;React + Vite + Tailwind Setup&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Scaffolding the SPA, routing, design system, component structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/josh_blair/static-site-hosting-on-aws-s3-cloudfront-acm-and-route-53-20b2"&gt;Static Site Hosting on AWS&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;S3, CloudFront with OAC, ACM, Route 53 DNS, CloudFormation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/josh_blair/cicd-with-aws-codepipeline-and-codebuild-1h40"&gt;CI/CD with CodePipeline &amp;amp; CodeBuild&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;GitHub connection, pipeline stages, buildspec.yml, IAM roles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/josh_blair/serverless-contact-form-lambda-api-gateway-dynamodb-and-ses-21ap"&gt;Serverless Contact Form&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Lambda, API Gateway, DynamoDB, SES, CORS, SAM deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why React + Vite (not Next.js)?
&lt;/h3&gt;

&lt;p&gt;Next.js is a great framework, but a static marketing site with no server-side rendering requirements doesn't need it. Vite produces a clean static build (&lt;code&gt;dist/&lt;/code&gt;) that S3 + CloudFront can serve directly, and its dev server with hot module replacement keeps local iteration fast.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why S3 + CloudFront (not Amplify or Vercel)?
&lt;/h3&gt;

&lt;p&gt;This project is intentionally built on "raw" AWS primitives — CodePipeline, CloudFormation, S3, CloudFront — rather than abstracted platforms. The goal is to learn and demonstrate AWS services used in real enterprise projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why CloudFormation (not CDK or Terraform)?
&lt;/h3&gt;

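&lt;p&gt;CloudFormation is AWS's native infrastructure-as-code service, and most AWS practitioners run into it sooner or later. CDK itself synthesizes CloudFormation templates, so understanding CloudFormation directly, before moving up to that abstraction, builds a stronger mental model of what's actually being deployed.&lt;/p&gt;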

&lt;h3&gt;
  
  
  Why a separate ACM cert in us-east-1?
&lt;/h3&gt;

&lt;p&gt;CloudFront is a global service and requires ACM certificates to be provisioned specifically in &lt;code&gt;us-east-1&lt;/code&gt;, regardless of where your other resources live. This is an AWS constraint, not a design choice.&lt;/p&gt;
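
&lt;p&gt;A minimal sketch of what a certificate stack like &lt;code&gt;infra/acm/certificate.yml&lt;/code&gt; might contain is shown below. The parameter and resource names are hypothetical, and the &lt;code&gt;www&lt;/code&gt; alternative name is an assumption; the one hard requirement is that the stack is deployed in &lt;code&gt;us-east-1&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;AWSTemplateFormatVersion: "2010-09-09"
Description: ACM certificate for CloudFront (deploy this stack in us-east-1)

Parameters:
  DomainName:              # hypothetical parameter name
    Type: String
  HostedZoneId:            # hypothetical parameter name
    Type: AWS::Route53::HostedZone::Id

Resources:
  SiteCertificate:         # hypothetical logical ID
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Ref DomainName
      SubjectAlternativeNames:
        - !Sub "www.${DomainName}"   # assumed alias, drop if unused
      ValidationMethod: DNS
      # Supplying HostedZoneId lets CloudFormation create the
      # DNS validation records in Route 53 for you.
      DomainValidationOptions:
        - DomainName: !Ref DomainName
          HostedZoneId: !Ref HostedZoneId
        - DomainName: !Sub "www.${DomainName}"
          HostedZoneId: !Ref HostedZoneId

Outputs:
  CertificateArn:
    Description: ARN to pass to the CloudFront stack in the other region
    Value: !Ref SiteCertificate      # Ref returns the certificate ARN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;CloudFormation exports are scoped to a single region, so the certificate ARN would typically be handed to the website stack as a plain parameter rather than imported.&lt;/p&gt;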

&lt;h3&gt;
  
  
  Why CloudFront OAC (not OAI)?
&lt;/h3&gt;

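&lt;p&gt;Origin Access Control (OAC) is the modern replacement for Origin Access Identity (OAI), and AWS now recommends it over the legacy OAI. It supports S3 buckets in all AWS Regions, works with SSE-KMS encrypted buckets, handles dynamic requests such as PUT and DELETE, and signs origin requests with AWS SigV4.&lt;/p&gt;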
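
&lt;p&gt;In CloudFormation, an OAC setup comes down to two pieces: the OAC resource itself, and a bucket policy that lets only your distribution read objects. The fragment below is a sketch under assumptions; the logical IDs, the OAC name, and the two parameters are hypothetical and not taken from the project's &lt;code&gt;website.yml&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;Parameters:
  BucketName:              # hypothetical: name of the existing site bucket
    Type: String
  DistributionId:          # hypothetical: ID of the CloudFront distribution
    Type: String

Resources:
  SiteOAC:
    Type: AWS::CloudFront::OriginAccessControl
    Properties:
      OriginAccessControlConfig:
        Name: site-oac                      # assumed name
        OriginAccessControlOriginType: s3
        SigningBehavior: always             # sign every origin request
        SigningProtocol: sigv4

  # Allow reads only from the CloudFront service acting for this distribution.
  SiteBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref BucketName
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: cloudfront.amazonaws.com
            Action: s3:GetObject
            Resource: !Sub "arn:aws:s3:::${BucketName}/*"
            Condition:
              StringEquals:
                AWS:SourceArn: !Sub "arn:aws:cloudfront::${AWS::AccountId}:distribution/${DistributionId}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The distribution's S3 origin then references the OAC with &lt;code&gt;OriginAccessControlId: !GetAtt SiteOAC.Id&lt;/code&gt; and leaves &lt;code&gt;OriginAccessIdentity&lt;/code&gt; in &lt;code&gt;S3OriginConfig&lt;/code&gt; as an empty string.&lt;/p&gt;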




&lt;h2&gt;
  
  
  Repository Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bonefishsoftware.com/
├── src/                    # React source
│   ├── components/         # Navbar, Footer, SectionHeader
│   ├── pages/              # Home, Services, Technologies, Portfolio, Team, Contact
│   └── data/               # services.ts, technologies.ts, team.ts
├── public/                 # Static assets (logo, sitemap, robots.txt)
├── lambda/
│   └── contact/            # Python Lambda — contact form handler
├── infra/
│   ├── acm/                # certificate.yml (deploy to us-east-1)
│   └── stacks/
│       ├── website.yml     # S3 + CloudFront (deploy to us-west-2)
│       ├── pipeline.yml    # CodePipeline + CodeBuild (deploy to us-west-2)
│       └── contact-api.yml # SAM — API Gateway + Lambda + DynamoDB (deploy to us-west-2)
├── docs/                   # This documentation
├── buildspec.yml           # CodeBuild build spec
└── index.html              # Vite entry point with SEO meta tags
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>aws</category>
      <category>webdev</category>
      <category>tutorial</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
