<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mursal Furqan Kumbhar</title>
    <description>The latest articles on Forem by Mursal Furqan Kumbhar (@mursalfk).</description>
    <link>https://forem.com/mursalfk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F202358%2F2fbf2caa-6490-4caa-a0aa-f128a9ade35a.gif</url>
      <title>Forem: Mursal Furqan Kumbhar</title>
      <link>https://forem.com/mursalfk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mursalfk"/>
    <language>en</language>
    <item>
      <title>VS Code Extensions That Code For You</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Tue, 14 Apr 2026 10:35:13 +0000</pubDate>
      <link>https://forem.com/mursalfk/vs-code-extensions-that-code-for-you-13d6</link>
      <guid>https://forem.com/mursalfk/vs-code-extensions-that-code-for-you-13d6</guid>
      <description>&lt;h2&gt;
  
  
  Hello Devs 👋
&lt;/h2&gt;

&lt;p&gt;If your VSCode setup still looks like it did before the AI boom, you may be riding a bicycle in a Formula 1 race. 🏎️ Modern AI extensions are no longer novelty gadgets; they’re force multipliers. From ghost-writing boilerplate to understanding sprawling codebases and debugging at 2 AM when your brain has blue-screened, these tools can turn your editor into a caffeinated co-pilot. Here are 6 of the best AI extensions helping developers ship faster, smarter, and with fewer “why is this null?” moments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why does this matter?
&lt;/h2&gt;

&lt;p&gt;Your editor should work as hard as you do. Most devs use VSCode, but only 10% use AI extensions. In 2026, that gap is the difference between a slow dev and a 10x dev.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. &lt;u&gt;Continue&lt;/u&gt; by Continue.dev
&lt;/h2&gt;

&lt;p&gt;Open-source AI assistant. Plug any model, like GPT-4, Claude, Llama or Mistral, directly into VSCode. It:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses your own API key (GPT/Claude/etc.)&lt;/li&gt;
&lt;li&gt;Runs local models via Ollama&lt;/li&gt;
&lt;li&gt;Supports custom slash commands &amp;amp; context&lt;/li&gt;
&lt;li&gt;Offers full chat + inline edit support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating:&lt;/strong&gt; 5/5&lt;br&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Free | Open-Source&lt;/p&gt;

&lt;h2&gt;
  
  
  2. &lt;u&gt;Supermaven&lt;/u&gt; by Supermaven Inc.
&lt;/h2&gt;

&lt;p&gt;Blazing-fast AI autocomplete with a 1-million-token context window. It feels almost instant: no lag, no waiting.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,000,000-token context&lt;/li&gt;
&lt;li&gt;Sees your entire repository and codebase&lt;/li&gt;
&lt;li&gt;Sub-100ms suggestions. Truly feels instant&lt;/li&gt;
&lt;li&gt;Understands cross-file dependencies&lt;/li&gt;
&lt;li&gt;Free tier available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating:&lt;/strong&gt; 5/5&lt;br&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Free Plan | Pro is $10/Month&lt;/p&gt;

&lt;h2&gt;
  
  
  3. &lt;u&gt;Pieces for Devs&lt;/u&gt; by Pieces.app
&lt;/h2&gt;

&lt;p&gt;Your AI-Powered developer memory. It saves code snippets, context and links, and retrieves them as and when needed. It also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-saves snippets with AI Tags&lt;/li&gt;
&lt;li&gt;Long-term memory across projects&lt;/li&gt;
&lt;li&gt;AI Explains saved snippets on demand&lt;/li&gt;
&lt;li&gt;Works offline with on-device AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating&lt;/strong&gt;: 5/5&lt;br&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: Free for Personal Use&lt;/p&gt;

&lt;h2&gt;
  
  
  4. &lt;u&gt;Codeium&lt;/u&gt; by Exafunction
&lt;/h2&gt;

&lt;p&gt;The best free Copilot alternative. It's fast, accurate and supports 70+ languages. Codeium helps in several ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Completely free for individual devs&lt;/li&gt;
&lt;li&gt;In-editor AI chat for debugging&lt;/li&gt;
&lt;li&gt;Autocomplete across 70+ programming languages&lt;/li&gt;
&lt;li&gt;Codeium Search and codebase Q&amp;amp;A&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating&lt;/strong&gt;: 5/5&lt;br&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: 100% FREE | No Limits&lt;/p&gt;

&lt;h2&gt;
  
  
  5. &lt;u&gt;Tabnine&lt;/u&gt; by Tabnine
&lt;/h2&gt;

&lt;p&gt;AI code completion, trained on your own codebase. This is best for teams that can't share code with external servers (mainly teams working on proprietary code). Its main features include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs locally, your code stays private&lt;/li&gt;
&lt;li&gt;Learns your codebase patterns&lt;/li&gt;
&lt;li&gt;Works offline, no internet needed&lt;/li&gt;
&lt;li&gt;Team model training on your repository&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating:&lt;/strong&gt; 5/5&lt;br&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Pro is $12/Month&lt;/p&gt;

&lt;h2&gt;
  
  
  6. &lt;u&gt;GitHub Copilot&lt;/u&gt; by GitHub (Microsoft)
&lt;/h2&gt;

&lt;p&gt;The OG AI Pair Programmer. This extension auto-completes entire functions, writes tests, and explains code, all inside VSCode. Its main features include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inline code suggestions as you type&lt;/li&gt;
&lt;li&gt;Copilot Chat, ask anything about your code&lt;/li&gt;
&lt;li&gt;Multi-file context awareness&lt;/li&gt;
&lt;li&gt;Generate unit tests automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rating&lt;/strong&gt;: 5/5&lt;br&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: $10/Month | &lt;em&gt;Free for Students&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The best developers in 2026 are not just writing code; they’re orchestrating tools. Whether you want lightning-fast autocomplete, local privacy-first AI, or a full-blown coding sidekick living in your editor, there’s an extension here for every workflow. Try a few, mix and match, and build the AI-powered setup that makes your keyboard feel slightly unfair. ⚡&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Giving AI Agents “Live Memory” with Aurora Zero-ETL and Redshift Vector Search</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Wed, 25 Mar 2026 05:38:18 +0000</pubDate>
      <link>https://forem.com/mursalfk/giving-ai-agents-live-memory-with-aurora-zero-etl-and-redshift-vector-search-35gg</link>
      <guid>https://forem.com/mursalfk/giving-ai-agents-live-memory-with-aurora-zero-etl-and-redshift-vector-search-35gg</guid>
      <description>&lt;p&gt;Hey everyone! Honestly, it feels so good to be sitting down and writing to you all again. I want to start with a massive apology for the radio silence and the long gap between my last few posts. Life has a funny way of getting busy, but I’ve missed this community and our deep dives into the future of tech. &lt;/p&gt;

&lt;p&gt;We’re talking about a problem that has been bugging me all through 2025 and into 2026: &lt;strong&gt;AI situational awareness.&lt;/strong&gt; Let’s dive in.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "Goldfish" Problem: Why Your AI is Gaslighting Your Users 🐠🌀
&lt;/h2&gt;

&lt;p&gt;Look, we need to have a genuine heart-to-heart about your AI agents. We’re well into 2026, and yet, most AI “agents” still have the situational awareness of a goldfish in a blender. It’s a bit heartbreaking, isn't it?&lt;/p&gt;

&lt;p&gt;You know the feeling: You’ve spent weeks perfecting your system prompts, the UI is looking like a work of art, and your LLM is sharp. But the second a user actually &lt;em&gt;does&lt;/em&gt; something—like upgrading a subscription, changing a delivery address, or updating their profile—the AI is the absolute last one to know. It’s like the “memory” of your agent is stuck in a permanent time lag, trapped in a 2024 batch-processing nightmare.&lt;/p&gt;

&lt;p&gt;Imagine you’re building the ultimate AI concierge for a luxury travel app. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;2:00:00 PM:&lt;/strong&gt; Your user, Sarah, realizes she’s over the cold and cancels her flight to Paris because she’d rather sip espresso in Rome. This change hits your Amazon Aurora database instantly. Success!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2:00:10 PM:&lt;/strong&gt; Feeling productive, Sarah asks the AI Agent, “Hey, what time does my flight leave?”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Disaster:&lt;/strong&gt; Because your old-school data sync (ETL) only runs every 30 minutes, the AI looks at the stale warehouse data and says, “Your flight to Paris departs at 6 PM! Don't forget your beret!” 🥖&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sarah is now officially being gaslit by an algorithm. 🤦‍♂️ The immersion? Totally gone. The trust? Shattered. This happens because the "Transactional Brain" (Aurora) isn't talking to the "Analytical Memory" (Redshift) fast enough. We need a way to move that data so quickly it feels like magic. Today, we are fixing that with &lt;strong&gt;Live Memory&lt;/strong&gt; using &lt;strong&gt;Amazon Aurora Zero-ETL&lt;/strong&gt; and &lt;strong&gt;Redshift Vector Search&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: The “Wormhole” (Zero-ETL) 🚀
&lt;/h3&gt;

&lt;p&gt;In the "old days" (way back in 2024), moving data from Aurora to Redshift meant building a "Glue" pipeline. It was like trying to build a bridge between two islands using nothing but popsicle sticks, duct tape, and a whole lot of hope. It broke constantly, it cost a fortune, and it was painfully slow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero-ETL&lt;/strong&gt; is the modern way to handle this. Think of it as a native “wormhole” between Aurora and Redshift. When you enable it, AWS handles all the heavy lifting and plumbing behind the scenes. No Python scripts to debug, no Lambda triggers to monitor, and absolutely zero drama.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzjaw4kq4oho3zsmkayn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzjaw4kq4oho3zsmkayn.png" alt="img1" width="744" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this is the “Secret Sauce” for your Agent:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Near-Zero Latency:&lt;/strong&gt; We’re talking seconds, not minutes. By the time Sarah finishes typing her question, the data is already there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Scaling:&lt;/strong&gt; As your app goes viral, the wormhole grows with it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total Peace of Mind:&lt;/strong&gt; You can finally stop monitoring failed Glue jobs at 3 AM. If it’s in Aurora, it’s in Redshift.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2: From “Rows” to “Recall” (Vector Search) 🧠
&lt;/h3&gt;

&lt;p&gt;Okay, so the data has successfully teleported into Redshift. But here’s the catch: Redshift is a warehouse full of structured tables, and AI Agents don’t really "think" in tables—they think in &lt;strong&gt;Vectors&lt;/strong&gt; (mathematical representations of meaning).&lt;/p&gt;

&lt;p&gt;In the past, you’d have to ship this data &lt;em&gt;again&lt;/em&gt; to a dedicated vector database. But that just adds more lag! Instead, we use Redshift’s &lt;strong&gt;Native Vector Search&lt;/strong&gt;. This keeps your “Live Memory” in one single, high-performance place. To make this work, we set up a &lt;strong&gt;Materialized View&lt;/strong&gt; in Redshift that automatically calls &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; to turn those new rows into vectors on the fly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- The "I don't have a goldfish memory anymore" Query&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;live_user_context&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;event_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;-- This calls Bedrock to create a vector on the fly!&lt;/span&gt;
    &lt;span class="n"&gt;amazon_bedrock_embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;semantic_vector&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;aurora_synced_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_logs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: The “Aha!” Moment 🔍
&lt;/h3&gt;

&lt;p&gt;Now, let's replay the Sarah scenario. When she asks about her flight, the AI Agent sends a vector query to Redshift. Redshift looks at the most recent events—including the one that landed just 10 seconds ago—and finds the “Canceled Paris / Booked Rome” entry.&lt;/p&gt;

&lt;p&gt;The Agent responds: &lt;strong&gt;“I see you switched to the Rome trip—great choice! That flight leaves at 8 PM. Should I book a car to the airport?”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Boom. You just saved the user experience. 🥳 You've moved from a "chatbot" to a genuine "agent" that understands the present moment.&lt;/p&gt;

&lt;h3&gt;
  
  
  By The Numbers: Is it worth your time? 📈
&lt;/h3&gt;

&lt;p&gt;We ran some benchmarks to see just how much of a difference this "Live Memory" architecture actually makes. When you compare the old "Batch" world to the new "Zero-ETL" world, the latency gap is massive.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84ptdmz2iuo8kmphylnv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84ptdmz2iuo8kmphylnv.png" alt="img2" width="750" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Average “Brain Lag” (Latency):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Standard S3/Glue Sync: ~480 seconds&lt;/li&gt;
&lt;li&gt;Zero-ETL + Redshift: &lt;strong&gt;~12 seconds&lt;/strong&gt; (A &lt;strong&gt;97% improvement&lt;/strong&gt;!)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Developer Happiness:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Measured in “How many times I had to fix a pipeline this week”: Dropped from an average of 4 down to &lt;strong&gt;0&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pro-Tips for the Modern ML Engineer 🛠️
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Keep it Serverless:&lt;/strong&gt; Use &lt;strong&gt;Redshift Serverless&lt;/strong&gt;. It scales down to zero when your agents are sleeping. &lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Hybrid Search is King:&lt;/strong&gt; Don’t rely &lt;em&gt;only&lt;/em&gt; on vectors. Combine them with regular SQL filters (like &lt;code&gt;WHERE user_id = '123'&lt;/code&gt;) to make your search lightning fast.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Capture the Vibe:&lt;/strong&gt; Remember, this isn’t just for logs. If a user tells the AI, “I love my better half a lot,” ❤️ that sentiment gets synced and embedded instantly. The next time they ask for anniversary ideas, the agent immediately knows the priority is “High Romance.”&lt;/li&gt;
&lt;/ol&gt;
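
&lt;p&gt;To make the hybrid-search tip a bit more concrete, here is a minimal pure-Python sketch. It is not Redshift code: it only illustrates the idea of applying the cheap exact-match filter first and ranking the surviving rows by cosine similarity afterwards. The row layout, the toy three-dimensional vectors, and the &lt;code&gt;hybrid_search&lt;/code&gt; helper are all made up for illustration; in a real setup the vectors would come from your embedding model and the filter would be a SQL &lt;code&gt;WHERE&lt;/code&gt; clause.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(rows, user_id, query_vector, top_k=1):
    """Filter by user first (cheap), then rank by similarity (expensive)."""
    candidates = [r for r in rows if r["user_id"] == user_id]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["vector"], query_vector),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: hypothetical events with hand-written 3-d "embeddings".
rows = [
    {"user_id": "123", "event": "Canceled Paris / Booked Rome", "vector": [0.9, 0.1, 0.0]},
    {"user_id": "123", "event": "Updated profile photo", "vector": [0.0, 0.2, 0.9]},
    {"user_id": "456", "event": "Booked Tokyo flight", "vector": [0.9, 0.0, 0.1]},
]

# Pretend this vector is the embedding of "what time does my flight leave?"
best = hybrid_search(rows, user_id="123", query_vector=[1.0, 0.0, 0.0])
print(best[0]["event"])
```

&lt;p&gt;Because the user filter runs first, the Tokyo booking from another user never even enters the ranking, no matter how similar its vector is to the query.&lt;/p&gt;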

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;Anyway, it feels amazing to be back in the flow and sharing these breakthroughs with you all again. The leap from “static” AI to “live” agents is one of the most exciting shifts we’re seeing this year, and I can’t wait to see how you all implement these “wormholes” in your own stacks. If you run into any snags or just want to geek out over vector search optimizations, drop a comment below or find me over on Dev.to! I promise the next article won’t take nearly as long to reach your screens. Until next time — stay curious, keep building, and let’s make those agents a whole lot smarter. 🚀✨&lt;/p&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>etl</category>
      <category>ai</category>
    </item>
    <item>
      <title>RIP 12-Digit IDs: The AWS Matrix Has Finally Been Decoded</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Mon, 09 Feb 2026 09:41:12 +0000</pubDate>
      <link>https://forem.com/aws-builders/rip-12-digit-ids-the-aws-matrix-has-finally-been-decoded-2om9</link>
      <guid>https://forem.com/aws-builders/rip-12-digit-ids-the-aws-matrix-has-finally-been-decoded-2om9</guid>
      <description>&lt;p&gt;Y'all... pull up a chair. We need to talk about a literal miracle. 🥂&lt;/p&gt;

&lt;p&gt;It was 2:41 AM. PagerDuty was screaming like a banshee in my ear. My eyes were blurry, my coffee was cold, and Slack was silent in that heavy, "something-is-very-wrong" way. I logged into the AWS Console to “quickly restart a service” and get back to sleep.&lt;/p&gt;

&lt;p&gt;Five clicks later—just as I was about to hit "Delete"—my stomach dropped. My heart hit the floor.&lt;/p&gt;

&lt;p&gt;Wrong account. I was in Production. Nothing blew up that night. I caught myself at the last millisecond. I got lucky. But for 21+ years, millions of engineers weren’t so lucky. We’ve been squinting at 12-digit account IDs like we’re trying to decode the Matrix while the world burns around us.&lt;/p&gt;

&lt;p&gt;But the wait is over. AWS finally added the Account Name to the top bar. It’s a "high-tech" feature that only took two decades to arrive, and it is GLORIOUS.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏆 Welcome to the "Almost Nuked Prod" Hall of Fame
&lt;/h2&gt;

&lt;p&gt;For over two decades, AWS UX followed the "Extreme Stealth" philosophy. The top bar gave you the region and your username, but the actual account context? Hidden behind a click. Pure chaos. It was like driving a car where the speedometer is inside the glove box.&lt;/p&gt;

&lt;p&gt;This led to the Three Stages of AWS Grief:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Over-Confidence&lt;/strong&gt;: “I definitely logged into Sandbox. I'm a pro, I don't need to check.”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Sudden Realization&lt;/strong&gt;: “Wait... why does this S3 bucket have 4PB of data and a 'Do Not Delete' tag? Why is this instance type an x1e.32xlarge?”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Panic&lt;/strong&gt;: Aggressively smashing Ctrl+W to close 42 Chrome tabs before your shaking finger accidentally clicks 'Terminate Instance'.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why This is Low-Key Life-Changing
&lt;/h2&gt;

&lt;p&gt;This isn’t just cosmetic; it’s cognitive safety. In a multi-account world, you’re juggling dozens of environments. This update assumes that engineers are human—that we get tired, stressed, and caffeine-deprived. It’s a sanity check for your career. Good UX doesn’t assume perfection; it assumes you're human.&lt;/p&gt;

&lt;h2&gt;
  
  
  🛠️ Hands-On: Build Your Safety Net Today
&lt;/h2&gt;

&lt;p&gt;Since you're likely rocking .ipynb files in VSCode (the ultimate dev setup, IMO), you’re used to seeing your environment in the kernel picker or the status bar. Let's get that same "don't-get-fired" energy in your browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Make it Pretty with the AWS CLI
&lt;/h3&gt;

&lt;p&gt;If your account name is company-billing-prod-final-v2, it’s going to get truncated. Don't click through menus like it's 2004—we're builders! Pop this into your terminal to set a punchy, unmistakable Alias:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Give your account a name that SCREAMS at you in the top bar&lt;/span&gt;
aws iam create-account-alias &lt;span class="nt"&gt;--account-alias&lt;/span&gt; PROD-SENSITIVE-ZONE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Boom&lt;/em&gt;. Now your top bar says PROD-SENSITIVE-ZONE in clear text instead of a random string of numbers. 💅&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The "Visual Fire Alarm" (Layered Security)
&lt;/h3&gt;

&lt;p&gt;Text is great, but color-coding saves careers. Since AWS hasn't given us a "Red Header" mode for Prod yet (maybe in the year 2047?), we have to be clever:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unified Settings&lt;/strong&gt;: Click your Name (top right) → Settings → Visual Mode. Choose a distinct theme for your main accounts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Browser Profiles&lt;/strong&gt;: This is the real pro-tip. Use different Browser Profiles (Chrome/Brave/Firefox) for different environments. I keep my Prod profile themed in "Emergency Red" with a massive warning icon. If the window isn't red, it isn't Prod.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  3. The "Don't Fire Me" Python Snippet
&lt;/h3&gt;

&lt;p&gt;Before running any destructive boto3 code in your Jupyter Notebook, run this check. It’s the programmatic version of "looking both ways before crossing the street."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safety_check&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Grab the current identity
&lt;/span&gt;    &lt;span class="n"&gt;sts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;identity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_caller_identity&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;account_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Account&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Put your real Prod ID here
&lt;/span&gt;    &lt;span class="n"&gt;PROD_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;123456789012&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; 

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;account_id&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;PROD_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🚨 ALERT: YOU ARE IN PRODUCTION! 🚨&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step away from the Shift+Enter key unless you&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re sure!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Connected to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;account_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Environment: Non-Prod (Safe to play, fam)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;safety_check&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
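
&lt;p&gt;A printed warning is easy to scroll past at 2 AM, so you may prefer a guard that actually stops the cell. The sketch below is one possible variant, not an official pattern: &lt;code&gt;require_non_prod&lt;/code&gt;, &lt;code&gt;ProductionGuardError&lt;/code&gt;, and the account IDs are hypothetical names and placeholders. It takes the account ID as an argument so it can be tried without AWS credentials; in a notebook you would pass in the value returned by &lt;code&gt;get_caller_identity()&lt;/code&gt;.&lt;/p&gt;

```python
class ProductionGuardError(RuntimeError):
    """Raised when destructive code is about to run against a protected account."""


def require_non_prod(account_id, prod_ids):
    """Return account_id if it is safe; otherwise raise and stop the notebook cell."""
    if account_id in prod_ids:
        raise ProductionGuardError(
            f"Account {account_id} is marked as production; refusing to continue."
        )
    return account_id


# Placeholder IDs for illustration only; replace them with your own.
PROD_IDS = {"123456789012"}

# In a notebook, pass boto3.client("sts").get_caller_identity()["Account"]
# instead of a literal string.
print(require_non_prod("210987654321", PROD_IDS))

try:
    require_non_prod("123456789012", PROD_IDS)
except ProductionGuardError as err:
    print(f"Blocked: {err}")
```

&lt;p&gt;Raising instead of printing means every line after the check in the same cell is skipped, which is the whole point.&lt;/p&gt;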



&lt;h2&gt;
  
  
  🚀 The Future of AWS UX
&lt;/h2&gt;

&lt;p&gt;While we celebrate this "giant leap for engineer-kind," we’re still looking toward the future. Maybe one day the "Delete Database" button won't be the exact same shade of blue as the "Save" button? Maybe we'll get a "High Stakes" mode that requires a physical key turn?&lt;/p&gt;

&lt;p&gt;Until then, appreciate the label. It might just be the thing that keeps you from becoming a "Why We Had an Outage" post-mortem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Thought&lt;/strong&gt;: The most dangerous bugs don’t live in your code; they live in interfaces that assume humans never slip. AWS finally acknowledged that we’re human. Sometimes, the biggest cloud innovation... is simply knowing exactly where you are before you hit 'Enter'. ☁️&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloudcomputing</category>
      <category>devops</category>
      <category>programming</category>
    </item>
    <item>
      <title>From Monoliths to Multitaskers: Building Your AWS AI Dream Team! 🚀🤖</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Sat, 07 Feb 2026 07:47:02 +0000</pubDate>
      <link>https://forem.com/aws-builders/from-monoliths-to-multitaskers-building-your-aws-ai-dream-team-5dnf</link>
      <guid>https://forem.com/aws-builders/from-monoliths-to-multitaskers-building-your-aws-ai-dream-team-5dnf</guid>
      <description>&lt;p&gt;After being away from active tech ecosystem for about a month, I have decided to shed some light on the current landscape of AI agents on AWS. And let me tell you, things have gotten wild while I was gone. We aren't just talking to single chatbots anymore; we're building entire autonomous squads.&lt;/p&gt;

&lt;p&gt;If you’ve been building with AI lately, you’ve probably hit the "Monolith Wall." You know the one: you try to get one Large Language Model (LLM) to handle your database, write your emails, troubleshoot your network, and maybe make a decent cup of coffee. It gets confused, starts hallucinating, and suddenly your "smart" app is telling users that the server is down because it's "feeling sleepy."&lt;/p&gt;

&lt;p&gt;The 2025 vibe is different. We are moving away from "User → LLM → Response" and toward "User → &lt;strong&gt;Agent Network&lt;/strong&gt; → Coordinated Action." Think of it like a D&amp;amp;D party: you wouldn't ask the Barbarian to pick a lock, and you wouldn't ask the Rogue to tank a dragon. You need specialists.    &lt;/p&gt;

&lt;p&gt;Today, we’re going to learn how to orchestrate a Multi-Agent System (MAS) on Amazon Bedrock. Grab your coffee (or mana potion), and let’s dive in!&lt;/p&gt;

&lt;h2&gt;
  
  
  🛠 The Secret Sauce: What is a "Bedrock Agent"?
&lt;/h2&gt;

&lt;p&gt;Before we build the team, we need to understand the individual "hero." A Bedrock Agent isn't just an LLM with a fancy name. It's a system with three core parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Brain (Model Provider)&lt;/strong&gt;: Usually something like Claude 3.5 Sonnet or the zippy new Amazon Nova series.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Manual (System Prompt)&lt;/strong&gt;: The instructions that tell the agent, "You are a world-class SRE," and give it boundaries (like "Don't delete production, please").&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Toolbelt (Action Groups)&lt;/strong&gt;: These are the superpowers! This is where the agent calls AWS Lambda functions to actually do things in the real world.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🎭 The Multi-Agent Party: Meet the Supervisor
&lt;/h2&gt;

&lt;p&gt;Why have one agent when you can have five? In a Supervisor-Agent Pattern, you have one "Dungeon Master" (the Supervisor) who listens to the user and decides which specialist to call.&lt;/p&gt;

&lt;p&gt;Imagine we're building a Telco Network Operations Assistant. Our party looks like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Specialty&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;The Supervisor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Orchestration &amp;amp; Planning&lt;/td&gt;
&lt;td&gt;Routes the query to the right specialist.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;The Alchemist (KPI Agent)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Performance Metrics&lt;/td&gt;
&lt;td&gt;Checks throughput and latency.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;The Scout (Alarm Agent)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time Status&lt;/td&gt;
&lt;td&gt;Looks for active site failures.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;The Chronicler (Log Agent)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;History &amp;amp; Patterns&lt;/td&gt;
&lt;td&gt;Scans CloudWatch logs for anomalies.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  🏗 Hands-On Walkthrough: Building the "Site Ops Hero"
&lt;/h2&gt;

&lt;p&gt;Let's get our hands dirty. We want to ask: "What's the status of site_dallas_001?" and get a correlated answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: The Magic Lambda (Your Action Group)&lt;/strong&gt;&lt;br&gt;
Your agent needs a bridge to the real world. We’ll use a Lambda function. When the agent calls it, it sends a JSON event that looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"actionGroup"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"NetworkTools"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"apiPath"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/get-site-alarms"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"site_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"site_dallas_001"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tip: Bedrock events aren't the same as API Gateway events! You’ll need to parse the &lt;code&gt;apiPath&lt;/code&gt; and &lt;code&gt;parameters&lt;/code&gt; directly.&lt;/p&gt;
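&lt;p&gt;A minimal sketch of a handler for the event above. It assumes the response envelope that Bedrock Agents expect from API-schema action groups (&lt;code&gt;messageVersion&lt;/code&gt;, &lt;code&gt;response&lt;/code&gt;, &lt;code&gt;responseBody&lt;/code&gt;). The &lt;code&gt;lookup_alarms&lt;/code&gt; helper and its return shape are hypothetical stand-ins for a real alarm store:&lt;/p&gt;

```python
import json


def params_to_dict(event):
    """Flatten Bedrock's list of {name, value} parameters into a plain dict."""
    return {p["name"]: p["value"] for p in event.get("parameters", [])}


def lookup_alarms(site_id):
    # Hypothetical stand-in for a real lookup (DynamoDB, CloudWatch, ...).
    return [{"site_id": site_id, "alarm": "LINK_DOWN", "severity": "critical"}]


def lambda_handler(event, context):
    params = params_to_dict(event)
    api_path = event.get("apiPath", "")

    # Route on apiPath ourselves; there is no API Gateway in front of this.
    if api_path == "/get-site-alarms":
        status, body = 200, {"alarms": lookup_alarms(params.get("site_id", ""))}
    else:
        status, body = 404, {"error": "unknown apiPath " + api_path}

    # Echo actionGroup/apiPath/httpMethod back so the agent can match the call.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "apiPath": api_path,
            "httpMethod": event.get("httpMethod", "GET"),
            "httpStatusCode": status,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }
```

&lt;p&gt;The body is returned as a JSON string rather than a nested object, which is what lets the agent read it back as text.&lt;/p&gt;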

&lt;p&gt;&lt;strong&gt;Step 2: Set up the Specialist Agents&lt;/strong&gt;&lt;br&gt;
In the Bedrock Console (or via Boto3), create your specialist agents. Give them Knowledge Bases so they can read your documentation stored in S3.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pro-tip: Use Titan Text Embeddings v2 to turn your PDFs into "vector gold" that the agent can search in milliseconds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Summon the Supervisor&lt;/strong&gt;&lt;br&gt;
Create a new agent and enable Collaboration Mode. Associate your specialist agents as "collaborators." Now, when you talk to the Supervisor, it creates a plan:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ask the Scout for active alarms.&lt;/li&gt;
&lt;li&gt;Ask the Alchemist if performance is dropping.&lt;/li&gt;
&lt;li&gt;Combine the answers and tell the human exactly what’s broken.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  💰 Budgeting Your "Mana" (Cost Optimization)
&lt;/h2&gt;

&lt;p&gt;Running a whole team of AI agents can get pricey if you're not careful. Here’s how to keep your AWS bill from going "Critical Hit" on your wallet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tiered Intelligence&lt;/strong&gt;: Use the "big brain" (Claude 3.5 Sonnet or Nova Premier) for the Supervisor who needs to plan. Use the "fast brain" (Nova Lite or Micro) for simple sub-tasks. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt Caching&lt;/strong&gt;: If you’re sending the same 50-page manual to your agent every time, Prompt Caching can save you up to 90% on input tokens. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Inference&lt;/strong&gt;: If you’re doing something that doesn’t need an answer right now (like summarizing yesterday's logs), use Batch Mode for a 50% discount.&lt;/li&gt;
&lt;/ul&gt;
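&lt;p&gt;To get a feel for the numbers, here is a rough back-of-the-envelope sketch. The price of 3 dollars per million input tokens and the 80% cached share are made-up assumptions for illustration. Only the "up to 90%" caching and "50%" batch discounts come from the list above, and real savings depend on the model and on how much of each prompt is actually reusable:&lt;/p&gt;

```python
def input_token_cost(tokens, price_per_million, cached_fraction=0.0,
                     cache_discount=0.9, batch_discount=0.0):
    """Estimate the input-token bill in dollars under the discounts above."""
    cached = tokens * cached_fraction
    fresh = tokens - cached
    # Cached tokens are billed at (1 - cache_discount) of the normal rate.
    billable = fresh + cached * (1.0 - cache_discount)
    return billable / 1_000_000 * price_per_million * (1.0 - batch_discount)


PRICE = 3.0  # hypothetical dollars per million input tokens

baseline = input_token_cost(10_000_000, PRICE)
with_cache = input_token_cost(10_000_000, PRICE, cached_fraction=0.8)
with_batch = input_token_cost(10_000_000, PRICE, batch_discount=0.5)

print(f"baseline: ${baseline:.2f}")
print(f"80% of tokens served from cache: ${with_cache:.2f}")
print(f"batch mode: ${with_batch:.2f}")
```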

&lt;h2&gt;
  
  
  🛡 Staying Safe (Guardrails)
&lt;/h2&gt;

&lt;p&gt;You don't want your agent going rogue. Use &lt;strong&gt;Amazon Bedrock Guardrails&lt;/strong&gt; to set "Denied Topics." For example, you can explicitly forbid the agent from giving out employee home addresses or venting about its boss. You can even set up Contextual Grounding Checks to make sure the agent only answers based on your data, not its own imagination (aka hallucinations).&lt;/p&gt;
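&lt;p&gt;A hedged sketch of what such a guardrail could look like as a request for boto3's &lt;code&gt;create_guardrail&lt;/code&gt; call on the &lt;code&gt;bedrock&lt;/code&gt; client. The topic name, definition, example phrase, blocked messages, and the 0.75 grounding threshold are all illustrative assumptions, not recommended values:&lt;/p&gt;

```python
def guardrail_request(name):
    """Build keyword arguments for the Bedrock create_guardrail call."""
    return {
        "name": name,
        "blockedInputMessaging": "Sorry, I can't help with that request.",
        "blockedOutputsMessaging": "Sorry, I can't share that.",
        # Denied topic: the agent refuses anything matching this definition.
        "topicPolicyConfig": {
            "topicsConfig": [{
                "name": "EmployeeHomeAddresses",
                "definition": "Requests for the home address of any employee.",
                "examples": ["Where does the on-call engineer live?"],
                "type": "DENY",
            }]
        },
        # Contextual grounding: filter answers not supported by the source data.
        "contextualGroundingPolicyConfig": {
            "filtersConfig": [{"type": "GROUNDING", "threshold": 0.75}]
        },
    }


request = guardrail_request("site-ops-guardrail")
# With credentials configured, this would be sent as:
#   boto3.client("bedrock").create_guardrail(**request)
```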

&lt;h2&gt;
  
  
  🏁 Conclusion: Your Adventure Awaits!
&lt;/h2&gt;

&lt;p&gt;Building multi-agent systems on AWS is like moving from playing a single-instrument solo to conducting a full philharmonic orchestra. It’s more complex, sure, but the results are magical.&lt;/p&gt;

&lt;p&gt;Whether you're automating SRE tasks or building a Pokémon battle advisor, the tools are all there in Bedrock.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;What are you waiting for? Go build something awesome! 🚀✨&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;P.S. When you're done, remember to delete your S3 buckets and Lambda functions so you don't wake up to a "Surprise Invoice" Boss Fight.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>agents</category>
      <category>programming</category>
    </item>
    <item>
      <title>The AWS Lambda Setting I Wish I’d Changed Earlier</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Tue, 30 Dec 2025 09:10:50 +0000</pubDate>
      <link>https://forem.com/aws-builders/the-aws-lambda-setting-i-wish-id-changed-earlier-45i0</link>
      <guid>https://forem.com/aws-builders/the-aws-lambda-setting-i-wish-id-changed-earlier-45i0</guid>
      <description>&lt;p&gt;I've been using AWS Lambda for quite a few years, and like many people, I left the architecture on its default settings, i.e. x86 &lt;em&gt;(Without thinking 2x)&lt;/em&gt;. It worked really well, it scaled as needed, but the truth unfolded when invoices showed up. End of Story. &lt;strong&gt;&lt;em&gt;Fine Della Storia&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Looking closer, I started paying more attention to &lt;strong&gt;execution time &amp;amp; cost&lt;/strong&gt;, and decided to experiment with &lt;strong&gt;Arm64&lt;/strong&gt;, i.e. &lt;em&gt;AWS Graviton&lt;/em&gt;. What followed was one of those rare cloud optimisations that actually delivered. 🎇 &lt;strong&gt;A VISIBLE WIN&lt;/strong&gt; 🎇&lt;br&gt;
.&lt;br&gt;
.&lt;br&gt;
.&lt;br&gt;
&lt;em&gt;Now the main Question&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr9zwmbgyideqw8z486yp.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr9zwmbgyideqw8z486yp.gif" alt="thinking" width="498" height="329"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  What led me to try &lt;em&gt;Arm64&lt;/em&gt;?
&lt;/h2&gt;

&lt;p&gt;The answer's as simple as it gets: a few Lambdas handling API requests &amp;amp; background jobs were being invoked millions of times per month and, frankly, each one looked quite cheap individually. But bunched together, they burned a steamy hole in my pocket every month.&lt;/p&gt;

&lt;p&gt;Now, being a &lt;strong&gt;lazy_developer&lt;/strong&gt; who uses snake_case while programming, I said &lt;strong&gt;NO&lt;/strong&gt; to refactoring the logic or the code. I just wanted a clean performance win. And that's when... &lt;strong&gt;Arm64&lt;/strong&gt; looked like a low-risk bet to me.&lt;/p&gt;
&lt;h2&gt;
  
  
  Time taken towards the CHANGE!!!
&lt;/h2&gt;

&lt;p&gt;Now, when I say I moved from x86 to Arm64, the first thing to pop in your head would be, "Nah! It's gonna take forever to change." Right? WRONG. I changed the lambda architecture from x86 to Arm64, redeployed and tested it within a matter of hours (And that's because I chose NOT to read the documentation first 🥲. It could have been sooner). That was it. No code was changed, no dependencies were broken, and no surprises. This alone was... unexpected.&lt;/p&gt;
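&lt;p&gt;For reference, the switch itself can be sketched with boto3. This is an illustrative helper, not the exact steps used here: the function name, bucket, and key are placeholders, and it assumes the deployment package already works on Arm (pure Python/Node.js code, or native dependencies rebuilt for arm64):&lt;/p&gt;

```python
def switch_to_arm64(lambda_client, function_name, bucket, key):
    """Redeploy an existing function's code package on the arm64 architecture."""
    response = lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
        Architectures=["arm64"],  # the default is ["x86_64"]
    )
    # The response echoes the function configuration, including Architectures.
    return response.get("Architectures", [])


# With credentials configured (all names below are placeholders):
#   import boto3
#   switch_to_arm64(boto3.client("lambda"), "my-api-handler",
#                   "my-deploy-bucket", "builds/handler.zip")
```

&lt;p&gt;In infrastructure-as-code tools the same change is usually a single architecture property on the function resource.&lt;/p&gt;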

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faumcar68onbwgi0worvr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faumcar68onbwgi0worvr.gif" alt="magic" width="220" height="144"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  What were my ACTUAL Observations
&lt;/h2&gt;

&lt;p&gt;The moment I started moving just a few Lambda functions to Arm64, execution time dropped noticeably, especially for CPU-bound logic. And guess what... cold starts felt snappier on lightweight Node.js &amp;amp; Python functions. In the end, the dip in monthly Lambda cost, without touching memory allocation, was the cherry on top. &lt;/p&gt;

&lt;p&gt;So, nothing dramatic per invocation, but at scale the difference was truly impossible to ignore.&lt;/p&gt;
&lt;h2&gt;
  
  
  Reality Check: The Cost
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb2hx20li4905atzf8lz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb2hx20li4905atzf8lz.gif" alt="cost" width="498" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Lambda billing isn't rocket science. It's simple maths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Execution time * Memory * Invocations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your function runs just 20-30 milliseconds faster and is executed millions of times, that tiny change in execution speed turns into a new &lt;strong&gt;Apple Watch&lt;/strong&gt; (yes, no kidneys needed 😜). This was the moment Arm64 stopped feeling like an optimisation experiment &amp;amp; started feeling like a worthy replacement for x86 as the &lt;em&gt;default choice&lt;/em&gt;.&lt;/p&gt;
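&lt;p&gt;Here is that maths as a small sketch. The workload (20 million invocations a month, 512 MB, 120 ms dropping to 100 ms) is hypothetical, and the per-GB-second prices are assumed us-east-1 list prices that may have changed, so check the current Lambda pricing page. The flat per-request fee is left out because it is the same on both architectures:&lt;/p&gt;

```python
# Assumed us-east-1 list prices per GB-second; verify against current pricing.
X86_PRICE = 0.0000166667
ARM_PRICE = 0.0000133334


def monthly_compute_cost(avg_ms, memory_mb, invocations, price_per_gb_second):
    """Duration charge only: seconds x GB x invocations x price."""
    gb_seconds = (avg_ms / 1000) * (memory_mb / 1024) * invocations
    return gb_seconds * price_per_gb_second


# Hypothetical workload: same memory, 20 ms faster per call on arm64.
x86_cost = monthly_compute_cost(120, 512, 20_000_000, X86_PRICE)
arm_cost = monthly_compute_cost(100, 512, 20_000_000, ARM_PRICE)

print(f"x86: ${x86_cost:.2f}  arm64: ${arm_cost:.2f}  saved: ${x86_cost - arm_cost:.2f}")
```

&lt;p&gt;Under these assumptions the saving comes from two places at once: fewer billed milliseconds and a lower price per GB-second.&lt;/p&gt;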

&lt;h2&gt;
  
  
  Why did Arm64 work the best for ME?
&lt;/h2&gt;

&lt;p&gt;In my small but meaningful experience, Arm64 shone brighter than &lt;strong&gt;Sirius&lt;/strong&gt; for API Gateway-backed Lambdas, event-driven pipelines, background jobs, schedulers, data processing &amp;amp; automation, and even lightweight ML inference. In short, I opted for Arm64 wherever I was running modern serverless workloads on up-to-date runtimes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where would I NOT use Arm64?
&lt;/h2&gt;

&lt;p&gt;It's not like I am completely against x86 now. I still use it in my projects for some scenarios. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I kept x86 for older functions with legacy native binaries&lt;/li&gt;
&lt;li&gt;Or for rare dependencies that weren't Arm-ready, yet&lt;/li&gt;
&lt;li&gt;Or for code that I didn't want to touch that close to a deadline (programmers, IYKYK). For those who don't: we don't want the project to go...&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fwo17d7r9jq1me9vlb8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fwo17d7r9jq1me9vlb8.gif" alt="boom" width="498" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What changes in the future?
&lt;/h2&gt;

&lt;p&gt;In the future, if I am starting a new serverless project (Or today, who knows):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I will definitely go for Arm64, by default&lt;/li&gt;
&lt;li&gt;I'm only gonna move back to x86 if something explicitly breaks (or my client doesn't love their 💵)&lt;/li&gt;
&lt;li&gt;Will benchmark early, instead of assuming&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Arm64 on AWS Lambda isn't a future feature; it's a teeny-tiny, quiet &lt;strong&gt;upgrade&lt;/strong&gt; that's already paying for itself in real-life systems. It's not theoretical. IT'S PRACTICAL. If you care about performance, cost &amp;amp; efficiency and you're running modern, super-heavy serverless workloads, listen to me (or read this carefully): TRY IT ONCE. Nine times out of ten, like me, you won't turn back 😉.&lt;/p&gt;

&lt;p&gt;Once again, this was my personal experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Buon Anno
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x9t7k5qjzfe244w9nf0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x9t7k5qjzfe244w9nf0.gif" alt="happy new year" width="498" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  HappyCoding
&lt;/h2&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Automated Document Insights Using AWS Textract, Bedrock, and QuickSight</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Fri, 12 Sep 2025 17:35:28 +0000</pubDate>
      <link>https://forem.com/aws-builders/how-i-automated-document-insights-using-aws-textract-bedrock-and-quicksight-h7p</link>
      <guid>https://forem.com/aws-builders/how-i-automated-document-insights-using-aws-textract-bedrock-and-quicksight-h7p</guid>
      <description>&lt;h2&gt;
  
  
  Hello Everyone 👋
&lt;/h2&gt;

&lt;p&gt;So I had a bunch of scanned invoices and PDFs lying around from an old freelance gig. Manually reading them and extracting data felt like a punishment. Instead of going through them one by one, I decided to try something smarter (I know, common &lt;em&gt;Developer Syndrome&lt;/em&gt; 😜), to build a small AWS workflow that could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read scanned docs,&lt;/li&gt;
&lt;li&gt;Summarise key info,&lt;/li&gt;
&lt;li&gt;Convert it to structured data, and&lt;/li&gt;
&lt;li&gt;Visualise everything inside a dashboard.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s how I did it, step by step. You can follow the same process and tweak it for contracts, reports, receipts, whatever you’ve got.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ Step 1: Set Up Your AWS Environment (Production-ready)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Pick a single region that supports everything
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Goal:&lt;/strong&gt; Keep &lt;em&gt;S3, Textract, Bedrock,&lt;/em&gt; and &lt;em&gt;QuickSight&lt;/em&gt; in the &lt;strong&gt;same region&lt;/strong&gt; to avoid cross-region latency/fees.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How to choose:&lt;/strong&gt; Check that your chosen region offers &lt;strong&gt;Bedrock&lt;/strong&gt; (model availability varies) and &lt;strong&gt;Textract&lt;/strong&gt;. If QuickSight isn’t in that region, use &lt;strong&gt;Athena + Glue&lt;/strong&gt; as a bridge (S3 anywhere → Athena in QuickSight’s region).&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Bedrock not in your favorite region:&lt;/em&gt; run &lt;strong&gt;Textract + S3&lt;/strong&gt; where your data lives, then send &lt;strong&gt;plain text only&lt;/strong&gt; to Bedrock in a supported region (privacy + cost trade-off).&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Corporate constraint on regions:&lt;/em&gt; document the exception and add an S3 Replication rule or use Athena.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  2) Create an S3 bucket with the right defaults
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; &lt;code&gt;doc-insights-&amp;lt;env&amp;gt;-&amp;lt;region&amp;gt;-&amp;lt;account-id&amp;gt;&lt;/code&gt; (e.g., &lt;code&gt;doc-insights-prod-eu-central-1-123456789012&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Settings to enable right away&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Block Public Access&lt;/strong&gt;: ON (all four toggles).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Versioning&lt;/strong&gt;: ON (lets you recover bad parses).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default encryption&lt;/strong&gt;: &lt;strong&gt;SSE-KMS&lt;/strong&gt; (customer-managed key) for auditability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Object Ownership&lt;/strong&gt;: Bucket owner enforced (disables ACLs, simplifies perms).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle rules&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Move &lt;code&gt;raw/&lt;/code&gt; to infrequent access after N days.&lt;/li&gt;
&lt;li&gt;Expire &lt;code&gt;tmp/&lt;/code&gt; after M days.&lt;/li&gt;
&lt;li&gt;Keep &lt;code&gt;processed/&lt;/code&gt; longer for audits.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Folder layout (simple + scalable)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  s3://doc-insights-bucket/
    raw/               # uploads: PDFs, images
    textract-json/     # raw JSON from Textract (per doc)
    extracted/         # normalized tables/kv (CSV/Parquet)
    summaries/         # Bedrock text summaries (JSON/MD)
    processed/         # final CSVs for BI
    logs/              # app / pipeline logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
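&lt;p&gt;The bucket settings listed earlier can be applied in code as well as in the console. A minimal boto3 sketch, assuming the bucket and the KMS key already exist. The 30-day and 7-day values stand in for the "N" and "M" above and are placeholders, not recommendations:&lt;/p&gt;

```python
def apply_bucket_defaults(s3, bucket, kms_key_arn, raw_ia_days=30, tmp_expire_days=7):
    """Apply the baseline settings: no public access, versioning, SSE-KMS, lifecycle."""
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    s3.put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
    )
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={"Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": kms_key_arn,
            }
        }]},
    )
    s3.put_bucket_ownership_controls(
        Bucket=bucket,
        OwnershipControls={"Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]},
    )
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [
            {   # raw uploads move to infrequent access after N days
                "ID": "raw-to-ia",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [{"Days": raw_ia_days, "StorageClass": "STANDARD_IA"}],
            },
            {   # scratch files expire after M days
                "ID": "expire-tmp",
                "Filter": {"Prefix": "tmp/"},
                "Status": "Enabled",
                "Expiration": {"Days": tmp_expire_days},
            },
        ]},
    )


# With credentials configured (bucket name and key ARN are placeholders):
#   import boto3
#   apply_bucket_defaults(boto3.client("s3"), "doc-insights-bucket",
#                         "arn:aws:kms:REGION:ACCOUNT:key/KEY_ID")
```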



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Big uploads / flaky networks:&lt;/em&gt; use &lt;strong&gt;multi-part upload&lt;/strong&gt; (SDK/CLI handles this).&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;PII/PHI compliance:&lt;/em&gt; restrict bucket to VPC endpoints (below), rotate KMS keys, and enable &lt;strong&gt;CloudTrail data events&lt;/strong&gt; for S3.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cross-account access:&lt;/em&gt; prefer &lt;strong&gt;bucket policy&lt;/strong&gt; + &lt;strong&gt;role assumption&lt;/strong&gt; over sharing long-lived keys.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  3) IAM roles &amp;amp; least-privilege policies
&lt;/h3&gt;

&lt;p&gt;Create three roles (one per workload) and scope them tightly:&lt;/p&gt;

&lt;h3&gt;
  
  
  a) &lt;strong&gt;Textract role&lt;/strong&gt; (used by your Lambda/ETL or notebook)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trusted entity:&lt;/strong&gt; &lt;code&gt;lambda.amazonaws.com&lt;/code&gt; (or your compute service)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inline policy (example)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"textract:AnalyzeDocument"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"textract:StartDocumentAnalysis"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"textract:GetDocumentAnalysis"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"s3:PutObject"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::doc-insights-bucket/*"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"s3:ListBucket"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::doc-insights-bucket"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"kms:Encrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:Decrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:GenerateDataKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:DescribeKey"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:kms:REGION:ACCOUNT:key/KEY_ID"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Use &lt;strong&gt;Start/Get&lt;/strong&gt; APIs for async jobs (multi-page PDFs); &lt;strong&gt;AnalyzeDocument&lt;/strong&gt; is fine for small/sync runs.&lt;/p&gt;
&lt;/blockquote&gt;
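&lt;p&gt;The sync/async split can be sketched as one helper. This is a simplified illustration: it routes every PDF to the async APIs (even single-page ones), polls instead of using SNS notifications, and treats anything other than &lt;code&gt;SUCCEEDED&lt;/code&gt; as a failure. The function name and defaults are assumptions:&lt;/p&gt;

```python
import time

FEATURES = ["TABLES", "FORMS"]


def analyze(textract, bucket, key, poll_seconds=2.0, max_polls=150):
    """Return Textract blocks: async APIs for PDFs, the sync API otherwise."""
    location = {"S3Object": {"Bucket": bucket, "Name": key}}

    if not key.lower().endswith(".pdf"):
        # Single images are fine for the synchronous call.
        result = textract.analyze_document(Document=location, FeatureTypes=FEATURES)
        return result["Blocks"]

    job_id = textract.start_document_analysis(
        DocumentLocation=location, FeatureTypes=FEATURES
    )["JobId"]

    # Poll until the job leaves IN_PROGRESS, or give up after max_polls tries.
    for _ in range(max_polls):
        page = textract.get_document_analysis(JobId=job_id)
        if page["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("Textract job " + job_id + " did not finish in time")

    if page["JobStatus"] != "SUCCEEDED":
        raise RuntimeError("Textract job " + job_id + " ended as " + page["JobStatus"])

    blocks = list(page["Blocks"])
    # Large results are paginated; follow NextToken until it disappears.
    while "NextToken" in page:
        page = textract.get_document_analysis(JobId=job_id, NextToken=page["NextToken"])
        blocks.extend(page["Blocks"])
    return blocks
```

&lt;p&gt;In a Lambda-based pipeline, polling burns billed time, so a completion notification is usually the better fit once this moves past a prototype.&lt;/p&gt;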

&lt;h3&gt;
  
  
  b) &lt;strong&gt;Bedrock role&lt;/strong&gt; (for summarization)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"bedrock:InvokeModel"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"bedrock:InvokeModelWithResponseStream"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"s3:PutObject"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::doc-insights-bucket/*"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"kms:Encrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:Decrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:GenerateDataKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:DescribeKey"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:kms:REGION:ACCOUNT:key/KEY_ID"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Limit &lt;code&gt;Resource&lt;/code&gt; to the specific model ARNs once you pick them (Claude, Mistral, etc.).&lt;/p&gt;
&lt;/blockquote&gt;
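&lt;p&gt;A small sketch of what that tightening could look like, assuming the usual foundation-model ARN shape (region set, account field empty). The region and model ID in the example are placeholders for whatever you actually enable:&lt;/p&gt;

```python
import json


def bedrock_invoke_statement(region, model_ids):
    """Build an IAM statement that allows invoking only the listed models."""
    arns = [
        "arn:aws:bedrock:" + region + "::foundation-model/" + model_id
        for model_id in model_ids
    ]
    return {
        "Effect": "Allow",
        "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
        "Resource": arns,
    }


# Example region and model ID; swap in the ones you picked.
statement = bedrock_invoke_statement(
    "eu-central-1", ["anthropic.claude-3-5-sonnet-20240620-v1:0"]
)
print(json.dumps(statement, indent=2))
```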

&lt;h3&gt;
  
  
  c) &lt;strong&gt;QuickSight service role access to S3&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Inside &lt;strong&gt;QuickSight → Admin → Security &amp;amp; permissions&lt;/strong&gt;, grant access to your bucket. Backing IAM should allow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::doc-insights-bucket/processed/*"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"s3:ListBucket"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::doc-insights-bucket"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;SSE-KMS blocks reads:&lt;/em&gt; add the QuickSight role principal to your &lt;strong&gt;KMS key policy&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cross-account S3:&lt;/em&gt; add a &lt;strong&gt;bucket policy&lt;/strong&gt; that allows the QuickSight account role + use a &lt;strong&gt;manifest file&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  4) KMS key policy that won’t bite you later
&lt;/h3&gt;

&lt;p&gt;Add all compute roles and the QuickSight role to the &lt;strong&gt;key policy&lt;/strong&gt; (not just IAM perms), e.g.:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"EnableRoot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"AWS"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::ACCOUNT:root"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"kms:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"AllowUseOfKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"AWS"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::ACCOUNT:role/TextractRole"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::ACCOUNT:role/BedrockRole"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::ACCOUNT:role/service-role/QuickSight-ROLE"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="s2"&gt;"kms:Encrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:Decrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:GenerateDataKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"kms:DescribeKey"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;AccessDenied (kms:Decrypt) in QuickSight/Athena:&lt;/em&gt; 99% of the time it’s the key policy missing the service role.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  5) Networking (private, compliant, and fast)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create VPC endpoints&lt;/strong&gt; for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt; (Gateway endpoint), plus interface endpoints (PrivateLink) for &lt;strong&gt;Textract&lt;/strong&gt;, &lt;strong&gt;Bedrock&lt;/strong&gt;, &lt;strong&gt;STS&lt;/strong&gt;, &lt;strong&gt;Logs&lt;/strong&gt;, etc.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Lock your compute (Lambda/ECS/EC2) to &lt;strong&gt;no-internet&lt;/strong&gt; and route AWS traffic via endpoints.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;QuickSight can’t sit in your VPC:&lt;/em&gt; it’s a managed service, so grant the &lt;strong&gt;QuickSight role access in the S3 bucket policy&lt;/strong&gt; (never public access) and allow KMS for that role.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Corporate egress proxy:&lt;/em&gt; verify that the public Bedrock/Textract endpoints are reachable through the proxy, or use VPC endpoints instead.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
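
&lt;p&gt;As a rough sketch of the endpoint list above, the helper below builds the service names in the shape that the EC2 &lt;code&gt;create_vpc_endpoint&lt;/code&gt; call expects. The function name &lt;code&gt;endpoint_plan&lt;/code&gt; is hypothetical, and the service suffixes (notably &lt;code&gt;bedrock-runtime&lt;/code&gt;) are assumptions to confirm against what your region actually offers.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def endpoint_plan(region):
    """List the VPC endpoints this pipeline needs, as create_vpc_endpoint arguments."""
    # Assumed service suffixes; confirm each one is offered in your region
    interface_services = ["textract", "bedrock-runtime", "sts", "logs"]
    plan = [{"ServiceName": f"com.amazonaws.{region}.s3", "VpcEndpointType": "Gateway"}]
    for service in interface_services:
        plan.append({
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "VpcEndpointType": "Interface",
        })
    return plan


for endpoint in endpoint_plan("eu-central-1"):
    print(endpoint["VpcEndpointType"], endpoint["ServiceName"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each entry still needs a &lt;code&gt;VpcId&lt;/code&gt; plus route tables (Gateway) or subnets and security groups (Interface) before it can be created.&lt;/p&gt;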




&lt;h3&gt;
  
  
  6) Turn on Amazon Bedrock access (once per account/region)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In the &lt;strong&gt;Bedrock console&lt;/strong&gt;, enable model access (Claude/Mistral etc.). Some models require an &lt;strong&gt;opt-in&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Model not visible:&lt;/em&gt; you’re in a region that doesn’t offer that model, or access hasn’t been granted yet.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Throughput errors:&lt;/em&gt; request &lt;strong&gt;service quota&lt;/strong&gt; increases for &lt;code&gt;InvokeModel&lt;/code&gt; if you batch summaries.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
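
&lt;p&gt;While a quota increase is pending, retrying throttled calls with exponential backoff usually keeps batch jobs alive. The sketch below is library-agnostic: &lt;code&gt;call_with_backoff&lt;/code&gt; and &lt;code&gt;fake_invoke&lt;/code&gt; are hypothetical names, and the fake stands in for a real &lt;code&gt;InvokeModel&lt;/code&gt; call. With boto3 the predicate would typically inspect the error code on the raised client error instead of the message string.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import random
import time


def call_with_backoff(fn, is_throttle, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on throttling errors wait (exponential backoff plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_throttle(exc):
                raise
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))


# Stand-in for a Bedrock call: fails twice with a throttling-style error, then succeeds
calls = {"count": 0}


def fake_invoke():
    calls["count"] += 1
    if calls["count"] != 3:
        raise RuntimeError("ThrottlingException")
    return "summary text"


waits = []  # record the delays instead of really sleeping
result = call_with_backoff(fake_invoke, lambda exc: "Throttling" in str(exc), sleep=waits.append)
print(result, "after", calls["count"], "calls")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing &lt;code&gt;sleep&lt;/code&gt; as a parameter keeps the helper easy to test; in production, leave the default so it really waits.&lt;/p&gt;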




&lt;h3&gt;
  
  
  7) Enable Amazon QuickSight
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Choose the same &lt;strong&gt;region&lt;/strong&gt; as your S3 if possible.&lt;/li&gt;
&lt;li&gt;In &lt;strong&gt;Security &amp;amp; permissions&lt;/strong&gt;, tick your bucket and (if needed) &lt;strong&gt;Athena&lt;/strong&gt; access.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Decide &lt;strong&gt;SPICE&lt;/strong&gt; vs &lt;strong&gt;Direct Query&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SPICE&lt;/strong&gt; (in-memory) is faster; watch &lt;strong&gt;capacity limits&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct&lt;/strong&gt; is fine for small CSVs; for large data, prefer &lt;strong&gt;Parquet in S3 via Athena&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;CSV &amp;gt; size limit / too many rows:&lt;/em&gt; convert to &lt;strong&gt;Parquet&lt;/strong&gt; and query via &lt;strong&gt;Athena&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Timezone mismatches:&lt;/em&gt; set dataset time zone explicitly; normalize date formats (ISO-8601).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
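
&lt;p&gt;For the date-format edge case, a small normalizer run before the CSV is written avoids most parsing surprises in QuickSight. This is a sketch under stated assumptions: the list of accepted formats is illustrative, slash-separated dates are assumed to be day-first, and &lt;code&gt;to_iso_date&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import datetime

# Assumed input formats; extend to match what your vendors actually send
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%b %d, %Y")


def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or None when no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


for raw in ["11/09/2025", "Sep 11, 2025", "11.09.2025", "not a date"]:
    print(repr(raw), "becomes", to_iso_date(raw))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; rather than guessing lets you route unparseable rows to a review queue instead of silently loading wrong dates.&lt;/p&gt;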




&lt;h3&gt;
  
  
  8) Budget, cost alerts, and audit
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS Budgets:&lt;/strong&gt; create a monthly budget with email alerts (S3 + Textract + Bedrock + QuickSight).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudTrail data events:&lt;/strong&gt; ON for S3 (per-object audit).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch Logs:&lt;/strong&gt; centralize Lambda/app logs to &lt;code&gt;logs/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Sudden spikes:&lt;/em&gt; enable &lt;strong&gt;S3 request metrics in CloudWatch&lt;/strong&gt; (they can be filtered by prefix) and add alarms on upload counts/bytes for &lt;code&gt;raw/&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  9) File formats &amp;amp; naming that make your life easy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Uploads (raw):&lt;/strong&gt; accept PDF/JPEG/PNG/TIFF; normalize on ingest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Outputs:&lt;/strong&gt; prefer &lt;strong&gt;CSV (QuickSight-friendly)&lt;/strong&gt; and/or &lt;strong&gt;Parquet&lt;/strong&gt; (Athena-friendly).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Names:&lt;/strong&gt; &lt;code&gt;raw/{yyyy}/{mm}/{dd}/{vendor}/{fileid}.pdf&lt;/code&gt; for uploads and
&lt;code&gt;processed/{yyyy}/{mm}/{dd}/invoices.csv&lt;/code&gt; for outputs.
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Edge cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Weird encodings:&lt;/em&gt; force &lt;strong&gt;UTF-8&lt;/strong&gt;, escape delimiters, include header row.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Duplicate filenames:&lt;/em&gt; use a UUID suffix or checksum in the key.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
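
&lt;p&gt;One way to get collision-free keys in the naming scheme above is to derive the file id from a checksum of the content. The sketch below assumes SHA-256 truncated to 16 hex characters is unique enough for your volume; &lt;code&gt;raw_key&lt;/code&gt; is a hypothetical helper, not an AWS API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import hashlib
from datetime import date


def raw_key(vendor, content, day, ext="pdf"):
    """Build raw/{yyyy}/{mm}/{dd}/{vendor}/{fileid}.{ext} with a checksum as the file id."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    # Keep only letters and digits in the vendor segment so the key stays URL-friendly
    safe_vendor = "".join(c if c.isalnum() else "-" for c in vendor.lower()).strip("-")
    return f"raw/{day:%Y}/{day:%m}/{day:%d}/{safe_vendor}/{digest}.{ext}"


key = raw_key("Vendor X", b"example invoice bytes", date(2025, 9, 11))
print(key)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the id depends only on the bytes, re-uploading the same file maps to the same key, which doubles as cheap de-duplication.&lt;/p&gt;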




&lt;h3&gt;
  
  
  10) Optional: IaC and CLI quick start
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Create bucket (CLI)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-bucket &lt;span class="nt"&gt;--bucket&lt;/span&gt; doc-insights-bucket &lt;span class="nt"&gt;--create-bucket-configuration&lt;/span&gt; &lt;span class="nv"&gt;LocationConstraint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$AWS_REGION&lt;/span&gt;
aws s3api put-bucket-encryption &lt;span class="nt"&gt;--bucket&lt;/span&gt; doc-insights-bucket &lt;span class="nt"&gt;--server-side-encryption-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
  "Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms","KMSMasterKeyID":"arn:aws:kms:REGION:ACCOUNT:key/KEY_ID"}}]}'&lt;/span&gt;
aws s3api put-bucket-versioning &lt;span class="nt"&gt;--bucket&lt;/span&gt; doc-insights-bucket &lt;span class="nt"&gt;--versioning-configuration&lt;/span&gt; &lt;span class="nv"&gt;Status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Enabled
aws s3api put-public-access-block &lt;span class="nt"&gt;--bucket&lt;/span&gt; doc-insights-bucket &lt;span class="nt"&gt;--public-access-block-configuration&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="s1"&gt;'{"BlockPublicAcls":true,"IgnorePublicAcls":true,"BlockPublicPolicy":true,"RestrictPublicBuckets":true}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Example lifecycle (CLI)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api put-bucket-lifecycle-configuration &lt;span class="nt"&gt;--bucket&lt;/span&gt; doc-insights-bucket &lt;span class="nt"&gt;--lifecycle-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
  "Rules":[
    {"ID":"raw-to-ia","Filter":{"Prefix":"raw/"},"Status":"Enabled","Transitions":[{"Days":30,"StorageClass":"STANDARD_IA"}]},
    {"ID":"expire-tmp","Filter":{"Prefix":"tmp/"},"Status":"Enabled","Expiration":{"Days":7}}
  ]}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📄 Step 2: Use Amazon Textract to Extract Text from Documents (Production-grade)
&lt;/h2&gt;

&lt;p&gt;Textract can do three things you’ll care about:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OCR (lines/words)&lt;/strong&gt; → &lt;code&gt;DetectDocumentText&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic forms &amp;amp; tables&lt;/strong&gt; → &lt;code&gt;AnalyzeDocument&lt;/code&gt; with &lt;code&gt;FORMS&lt;/code&gt;, &lt;code&gt;TABLES&lt;/code&gt;, optionally &lt;code&gt;QUERIES&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invoice/receipt specialization&lt;/strong&gt; → &lt;code&gt;AnalyzeExpense&lt;/code&gt; (often best for invoices)&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Rule of thumb&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If it’s an &lt;strong&gt;invoice/receipt&lt;/strong&gt;, try &lt;strong&gt;&lt;code&gt;AnalyzeExpense&lt;/code&gt;&lt;/strong&gt; first (it returns normalized fields like &lt;code&gt;VENDOR_NAME&lt;/code&gt;, &lt;code&gt;TOTAL&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;If it’s a &lt;strong&gt;contract/form/report&lt;/strong&gt;, use &lt;strong&gt;&lt;code&gt;AnalyzeDocument&lt;/code&gt;&lt;/strong&gt; with &lt;code&gt;FORMS&lt;/code&gt; and &lt;code&gt;TABLES&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;&lt;code&gt;QUERIES&lt;/code&gt;&lt;/strong&gt; to target specific fields (e.g., “What is the due date?”) when layout varies wildly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  1) Sync vs Async: choose the right API
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Synchronous&lt;/strong&gt; (single call; immediate result)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AnalyzeDocument&lt;/code&gt; (for &lt;strong&gt;single-page&lt;/strong&gt; documents and images; great for tests)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DetectDocumentText&lt;/code&gt; (basic OCR)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Asynchronous&lt;/strong&gt; (submit job → poll or page through results)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;StartDocumentAnalysis&lt;/code&gt; + &lt;code&gt;GetDocumentAnalysis&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Required for &lt;strong&gt;multi-page&lt;/strong&gt; PDFs/TIFFs and the sensible choice for &lt;strong&gt;larger&lt;/strong&gt; files&lt;/li&gt;
&lt;li&gt;Supports pagination via &lt;code&gt;NextToken&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Guidance:&lt;/strong&gt; Use &lt;strong&gt;sync&lt;/strong&gt; for quick console/dev tests or one-pagers; use &lt;strong&gt;async&lt;/strong&gt; for anything real (multi-page, batch).&lt;/p&gt;
&lt;/blockquote&gt;
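
&lt;p&gt;The rule of thumb and the sync/async guidance can be folded into one small routing function. This is only a sketch: &lt;code&gt;pick_textract_call&lt;/code&gt; and its document-type labels are hypothetical, while the returned strings are the boto3 Textract client method names.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def pick_textract_call(doc_type, page_count):
    """Return the boto3 Textract method name to use for a document."""
    is_expense = doc_type in ("invoice", "receipt")
    if page_count == 1:
        # Single page: a synchronous call returns the result immediately
        return "analyze_expense" if is_expense else "analyze_document"
    # Multi-page: submit an asynchronous job and fetch results later
    return "start_expense_analysis" if is_expense else "start_document_analysis"


print(pick_textract_call("invoice", 1))
print(pick_textract_call("contract", 12))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In a pipeline you would call &lt;code&gt;getattr(textract, name)&lt;/code&gt; with the matching arguments, since the sync and async calls take different parameters (&lt;code&gt;Document&lt;/code&gt; vs &lt;code&gt;DocumentLocation&lt;/code&gt;).&lt;/p&gt;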




&lt;h3&gt;
  
  
  2) Pre-processing (massively improves accuracy)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input formats:&lt;/strong&gt; PDF, PNG, JPEG, TIFF work well; keep scans clean.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Quality:&lt;/strong&gt; Aim for readable text (good contrast, minimal blur). If possible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deskew, denoise, increase contrast, convert to grayscale/PNG.&lt;/li&gt;
&lt;li&gt;Fix rotation (many scanners rotate by 90° or 180°).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encrypted or password-protected PDFs:&lt;/strong&gt; decrypt before upload.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Color:&lt;/strong&gt; Grayscale often beats noisy full-color scans.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-page splits:&lt;/strong&gt; If you hit page/file limits, split before upload.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;File names:&lt;/strong&gt; Include vendor/date hints in the key to simplify downstream routing (e.g., &lt;code&gt;raw/2025/09/11/vendorX/invoice_123.pdf&lt;/code&gt;).&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  3) Minimal but robust code paths (Python/boto3)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  A. &lt;strong&gt;Synchronous&lt;/strong&gt;: quick test for forms + tables
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;textract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc-insights-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;invoice1.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
    &lt;span class="n"&gt;FeatureTypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FORMS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TABLES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Blocks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="c1"&gt;# → Save raw JSON, parse KEY_VALUE_SET/TABLE blocks later
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
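
&lt;p&gt;For the “parse later” step, here is a minimal sketch of turning &lt;code&gt;KEY_VALUE_SET&lt;/code&gt; blocks into a plain dict. It assumes the usual block shape (a KEY block points at its VALUE block through a &lt;code&gt;VALUE&lt;/code&gt; relationship, and both point at &lt;code&gt;WORD&lt;/code&gt; children); the helper names and the hand-made sample are illustrative, and checkboxes (&lt;code&gt;SELECTION_ELEMENT&lt;/code&gt;) are ignored.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def text_of(block, by_id):
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = by_id[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def key_values(blocks):
    """Map each form key to its value text."""
    by_id = {b["Id"]: b for b in blocks}
    result = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET" or "KEY" not in block.get("EntityTypes", []):
            continue
        value_ids = [
            vid
            for rel in block.get("Relationships", [])
            if rel["Type"] == "VALUE"
            for vid in rel["Ids"]
        ]
        result[text_of(block, by_id)] = " ".join(text_of(by_id[v], by_id) for v in value_ids)
    return result


# Hand-made sample in the assumed shape; real input would be resp["Blocks"]
sample_blocks = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]}, {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
    {"Id": "w2", "BlockType": "WORD", "Text": "No:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "INV-42"},
]
print(key_values(sample_blocks))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Building the &lt;code&gt;by_id&lt;/code&gt; index once keeps the lookup cheap even when a document produces thousands of blocks.&lt;/p&gt;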



&lt;h4&gt;
  
  
  B. &lt;strong&gt;Asynchronous&lt;/strong&gt;: production for multi-page docs (handles polling + pagination)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;itertools&lt;/span&gt;

&lt;span class="n"&gt;textract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_document_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;DocumentLocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc-insights-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;long-invoice.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
    &lt;span class="n"&gt;FeatureTypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FORMS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TABLES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;job_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JobId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_all_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_wait_s&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;next_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="c1"&gt;# Wait for completion (simple poll)
&lt;/span&gt;    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IN_PROGRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IN_PROGRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_document_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NextToken&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;next_token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next_token&lt;/span&gt; \
              &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_document_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JobStatus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IN_PROGRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# If first page says in progress, wait; otherwise collect &amp;amp; keep paging
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Blocks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;next_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NextToken&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;next_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_wait_s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Textract job took too long.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUCCEEDED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NextToken&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_document_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;JobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NextToken&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NextToken&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Textract job failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;StatusMessage&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;

&lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_all_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Combine blocks across pages for parsing
&lt;/span&gt;&lt;span class="n"&gt;all_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;itertools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_iterable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Blocks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  C. &lt;strong&gt;Invoices/Receipts&lt;/strong&gt;: specialized &lt;code&gt;AnalyzeExpense&lt;/code&gt; (often higher accuracy on totals/dates)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;textract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc-insights-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;invoice1.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# resp["ExpenseDocuments"][0]["SummaryFields"] includes types like:
# VENDOR_NAME, INVOICE_RECEIPT_DATE, INVOICE_RECEIPT_ID, TOTAL, DUE_DATE
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
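
&lt;p&gt;To get from that response to one flat row per invoice, a sketch like the following may help. It assumes each summary field carries its type under &lt;code&gt;Type.Text&lt;/code&gt; and its value under &lt;code&gt;ValueDetection.Text&lt;/code&gt;; the helper name &lt;code&gt;summary_fields&lt;/code&gt; and the sample response are made up for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def summary_fields(resp):
    """Flatten SummaryFields into {field type: value text}, keeping the first hit per type."""
    row = {}
    for doc in resp.get("ExpenseDocuments", []):
        for field in doc.get("SummaryFields", []):
            field_type = field.get("Type", {}).get("Text")
            value = field.get("ValueDetection", {}).get("Text", "")
            if field_type:
                row.setdefault(field_type, value.strip())
    return row


# Hand-made sample in the assumed response shape
sample_resp = {
    "ExpenseDocuments": [
        {
            "SummaryFields": [
                {"Type": {"Text": "VENDOR_NAME"}, "ValueDetection": {"Text": "ACME GmbH"}},
                {"Type": {"Text": "TOTAL"}, "ValueDetection": {"Text": " 119.00 "}},
                {"Type": {"Text": "TOTAL"}, "ValueDetection": {"Text": "100.00"}},
            ]
        }
    ]
}
print(summary_fields(sample_resp))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping only the first value per type is a simplification; when a type repeats, comparing the detection confidences is usually a better tie-breaker.&lt;/p&gt;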



&lt;h4&gt;
  
  
  D. &lt;strong&gt;Targeted fields&lt;/strong&gt; with &lt;code&gt;QUERIES&lt;/code&gt; (when layouts vary)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="n"&gt;textract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eu-central-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc-insights-bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;invoice1.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
    &lt;span class="n"&gt;FeatureTypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;QUERIES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TABLES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# you can mix with FORMS too
&lt;/span&gt;    &lt;span class="n"&gt;QueriesConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Queries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the invoice number?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alias&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;invoice_no&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the total amount?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alias&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the due date?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alias&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;due_date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;You can run &lt;strong&gt;QUERIES + TABLES&lt;/strong&gt; together to pull exact fields and still capture line items.&lt;/p&gt;
&lt;/blockquote&gt;
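&lt;p&gt;Reading the answers back out of the response can be sketched roughly as below. It assumes the response shape where each &lt;code&gt;QUERY&lt;/code&gt; block points at &lt;code&gt;QUERY_RESULT&lt;/code&gt; blocks through an &lt;code&gt;ANSWER&lt;/code&gt; relationship; the helper name &lt;code&gt;query_answers&lt;/code&gt; is made up for illustration, and you would call it as &lt;code&gt;query_answers(resp["Blocks"])&lt;/code&gt;.&lt;/p&gt;

```python
def query_answers(blocks):
    """Map each query alias to (answer_text, confidence).

    Queries without an answer map to (None, 0.0).
    """
    by_id = {b["Id"]: b for b in blocks}
    answers = {}
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        query = block.get("Query", {})
        alias = query.get("Alias") or query.get("Text", "")
        best = (None, 0.0)
        for rel in block.get("Relationships", []):
            if rel.get("Type") != "ANSWER":
                continue
            for result_id in rel.get("Ids", []):
                result = by_id.get(result_id, {})
                conf = result.get("Confidence", 0.0)
                # Keep the highest-confidence answer if several are returned.
                if best[0] is None or conf > best[1]:
                    best = (result.get("Text"), conf)
        answers[alias] = best
    return answers
```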




&lt;h3&gt;
  
  
  4) How to parse Textract blocks safely (just enough for Step 2)
&lt;/h3&gt;

&lt;p&gt;Textract represents a &lt;strong&gt;graph&lt;/strong&gt; of &lt;code&gt;Blocks&lt;/code&gt; with &lt;code&gt;Relationships&lt;/code&gt;. The key ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PAGE&lt;/code&gt;, &lt;code&gt;LINE&lt;/code&gt;, &lt;code&gt;WORD&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;KEY_VALUE_SET&lt;/code&gt; (with &lt;code&gt;EntityTypes&lt;/code&gt;: &lt;code&gt;KEY&lt;/code&gt; or &lt;code&gt;VALUE&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TABLE&lt;/code&gt; → &lt;code&gt;CELL&lt;/code&gt; (cell has row/col indices)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;QUERY&lt;/code&gt; → &lt;code&gt;ANSWER&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Parsing outline (KV):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build a dict of blocks by &lt;code&gt;Id&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Collect all &lt;code&gt;KEY_VALUE_SET&lt;/code&gt; blocks.&lt;/li&gt;
&lt;li&gt;For each &lt;code&gt;KEY&lt;/code&gt; block, follow &lt;code&gt;Relationships&lt;/code&gt; to its &lt;code&gt;VALUE&lt;/code&gt; block.&lt;/li&gt;
&lt;li&gt;Extract text by concatenating child &lt;code&gt;WORD&lt;/code&gt; blocks; track &lt;code&gt;Confidence&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Normalize keys (&lt;code&gt;vendor&lt;/code&gt;, &lt;code&gt;invoice no&lt;/code&gt;, etc.) and keep the &lt;code&gt;BoundingBox&lt;/code&gt; for traceability.&lt;/li&gt;
&lt;/ol&gt;
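&lt;p&gt;The KV outline above can be sketched like this. It is a minimal version, assuming &lt;code&gt;blocks&lt;/code&gt; is the &lt;code&gt;Blocks&lt;/code&gt; list from a saved response; the function names and the output dict keys are illustrative, and checkbox-style &lt;code&gt;SELECTION_ELEMENT&lt;/code&gt; children are ignored here.&lt;/p&gt;

```python
def block_text(block, by_id):
    """Join the text of a block's CHILD WORD blocks with single spaces."""
    words = []
    for rel in block.get("Relationships", []):
        if rel.get("Type") != "CHILD":
            continue
        for child_id in rel.get("Ids", []):
            child = by_id.get(child_id, {})
            if child.get("BlockType") == "WORD":
                words.append(child.get("Text", ""))
    return " ".join(words)


def extract_key_values(blocks):
    """Return one dict per KEY block with key/value text, confidences, page and bbox."""
    by_id = {b["Id"]: b for b in blocks}  # step 1: index blocks by Id
    pairs = []
    for block in blocks:
        # step 2: only KEY_VALUE_SET blocks tagged as KEY
        if block.get("BlockType") != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        # step 3: follow the VALUE relationship
        value_block = None
        for rel in block.get("Relationships", []):
            if rel.get("Type") == "VALUE" and rel.get("Ids"):
                value_block = by_id.get(rel["Ids"][0])
        # steps 4-5: text, confidence, and geometry for traceability
        pairs.append({
            "key_raw": block_text(block, by_id),
            "value": block_text(value_block, by_id) if value_block else "",
            "conf_key": block.get("Confidence", 0.0),
            "conf_val": value_block.get("Confidence", 0.0) if value_block else 0.0,
            "page": block.get("Page", 1),
            "bbox": block.get("Geometry", {}).get("BoundingBox"),
        })
    return pairs
```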

&lt;p&gt;&lt;strong&gt;Parsing outline (TABLE):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find &lt;code&gt;TABLE&lt;/code&gt; blocks, get their child &lt;code&gt;CELL&lt;/code&gt;s.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;(RowIndex, ColumnIndex)&lt;/code&gt; to reconstruct rows.&lt;/li&gt;
&lt;li&gt;Heuristically detect &lt;strong&gt;headers&lt;/strong&gt; (first row or bold-ish/upper-case hints if present in text).&lt;/li&gt;
&lt;li&gt;Output as CSV; preserve page number.&lt;/li&gt;
&lt;/ol&gt;
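&lt;p&gt;For tables, a rough sketch of steps 1 and 2 follows. It assumes 1-based &lt;code&gt;RowIndex&lt;/code&gt;/&lt;code&gt;ColumnIndex&lt;/code&gt; on each &lt;code&gt;CELL&lt;/code&gt; and fills gaps with empty strings; &lt;code&gt;tables_to_rows&lt;/code&gt; and its return shape are hypothetical names, and header detection is left to your own heuristic (row 1 is the usual default).&lt;/p&gt;

```python
def cell_text(cell, by_id):
    """Join the WORD children of a CELL block."""
    words = []
    for rel in cell.get("Relationships", []):
        if rel.get("Type") == "CHILD":
            for child_id in rel.get("Ids", []):
                child = by_id.get(child_id, {})
                if child.get("BlockType") == "WORD":
                    words.append(child.get("Text", ""))
    return " ".join(words)


def tables_to_rows(blocks):
    """Return one entry per TABLE block: {"page": int, "rows": [[str, ...], ...]}."""
    by_id = {b["Id"]: b for b in blocks}
    tables = []
    for table in blocks:
        if table.get("BlockType") != "TABLE":
            continue
        grid = {}
        for rel in table.get("Relationships", []):
            if rel.get("Type") != "CHILD":
                continue
            for cell_id in rel.get("Ids", []):
                cell = by_id.get(cell_id, {})
                if cell.get("BlockType") == "CELL":
                    grid[(cell["RowIndex"], cell["ColumnIndex"])] = cell_text(cell, by_id)
        if not grid:
            continue
        n_rows = max(r for r, _ in grid)
        n_cols = max(c for _, c in grid)
        rows = [[grid.get((r, c), "") for c in range(1, n_cols + 1)]
                for r in range(1, n_rows + 1)]
        tables.append({"page": table.get("Page", 1), "rows": rows})
    return tables
```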

&lt;blockquote&gt;
&lt;p&gt;We’ll do the full CSV/JSON conversion in &lt;strong&gt;Step 3&lt;/strong&gt;. Here we only ensure you’ve got clean raw JSON saved and a clear parser plan.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  5) Storage &amp;amp; idempotency
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Always&lt;/strong&gt; save raw responses (one file per source doc):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;textract-json/&amp;lt;doc-id&amp;gt;.json&lt;/code&gt; (for &lt;code&gt;AnalyzeDocument&lt;/code&gt;/&lt;code&gt;DetectDocumentText&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;textract-json/&amp;lt;doc-id&amp;gt;.expense.json&lt;/code&gt; (for &lt;code&gt;AnalyzeExpense&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

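&lt;li&gt;&lt;p&gt;Mark input S3 objects as processed with object tags or metadata (e.g., &lt;code&gt;processed=true&lt;/code&gt;, &lt;code&gt;textract_job_id=...&lt;/code&gt;) to avoid reprocessing. Tags can be updated in place; changing user metadata means copying the object over itself.&lt;/p&gt;&lt;/li&gt;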

&lt;li&gt;&lt;p&gt;Use S3 object ETag/size as a cheap duplicate detector.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
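&lt;p&gt;A small sketch of the naming and duplicate check described above. It works on a &lt;code&gt;HeadObject&lt;/code&gt;-style dict (&lt;code&gt;ETag&lt;/code&gt;, &lt;code&gt;ContentLength&lt;/code&gt;, &lt;code&gt;Metadata&lt;/code&gt;) so it stays testable without AWS; the helper names and the in-memory &lt;code&gt;seen&lt;/code&gt; set are assumptions, and in a real pipeline the set would live somewhere durable.&lt;/p&gt;

```python
def raw_output_key(doc_id, api):
    """S3 key for the raw Textract response of one source document."""
    suffix = ".expense.json" if api == "AnalyzeExpense" else ".json"
    return f"textract-json/{doc_id}{suffix}"


def fingerprint(head):
    """Cheap duplicate fingerprint: (ETag without quotes, size in bytes)."""
    return (head["ETag"].strip('"'), head["ContentLength"])


def should_process(head, seen):
    """Skip objects already marked processed or already seen in this run."""
    if head.get("Metadata", {}).get("processed") == "true":
        return False
    fp = fingerprint(head)
    if fp in seen:
        return False
    seen.add(fp)
    return True
```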




&lt;h3&gt;
  
  
  6) Confidence, validation &amp;amp; fallbacks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Confidence thresholds:&lt;/strong&gt; Textract reports confidence on a 0–100 scale; treat fields under a threshold (e.g., &lt;code&gt;&amp;lt; 80&lt;/code&gt;) as &lt;strong&gt;“needs review.”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Numeric sanity checks:&lt;/strong&gt; parse amounts with a currency regex, verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sum(line_items) ≈ total (± a few cents, taxes/discounts considered)&lt;/li&gt;
&lt;li&gt;date is valid &amp;amp; plausible (e.g., due date ≥ invoice date)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Disambiguation:&lt;/strong&gt; fight common OCR confusions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;O&lt;/code&gt;↔&lt;code&gt;0&lt;/code&gt;, &lt;code&gt;l&lt;/code&gt;/&lt;code&gt;I&lt;/code&gt;↔&lt;code&gt;1&lt;/code&gt;, &lt;code&gt;S&lt;/code&gt;↔&lt;code&gt;5&lt;/code&gt;, &lt;code&gt;B&lt;/code&gt;↔&lt;code&gt;8&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fallback order for invoices:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;AnalyzeExpense&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AnalyzeDocument&lt;/code&gt; (FORMS + TABLES)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DetectDocumentText&lt;/code&gt; + regex heuristics (as a last resort)&lt;/li&gt;
&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Human-in-the-loop:&lt;/strong&gt; flag low-confidence docs to a review queue (CSV row with a &lt;code&gt;needs_review=1&lt;/code&gt; flag).&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
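&lt;p&gt;The threshold and character-swap ideas can be sketched in a few lines. This assumes confidence on Textract's 0–100 scale and that you only call &lt;code&gt;fix_numeric_ocr&lt;/code&gt; on fields you already expect to be numeric; both function names are illustrative.&lt;/p&gt;

```python
# Swaps applied only where a numeric value is expected (amounts, quantities).
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "l": "1", "I": "1", "S": "5", "B": "8"})


def fix_numeric_ocr(text):
    """Replace letters that OCR commonly confuses with digits."""
    return text.translate(OCR_DIGIT_FIXES)


def needs_review(confidence, threshold=80.0):
    """Return 1 when confidence (0-100) falls below the threshold, else 0."""
    return 1 if threshold > confidence else 0
```

&lt;p&gt;Keep the original string next to the corrected one so a reviewer can see what was changed.&lt;/p&gt;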




&lt;h3&gt;
  
  
  7) Permissions &amp;amp; security edge cases (most common causes of failure)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt;: role needs &lt;code&gt;s3:GetObject&lt;/code&gt;, &lt;code&gt;s3:ListBucket&lt;/code&gt;, and if encrypted, &lt;strong&gt;&lt;code&gt;kms:Decrypt&lt;/code&gt;&lt;/strong&gt; on the CMK.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KMS key policy&lt;/strong&gt;: include the &lt;strong&gt;calling role principal&lt;/strong&gt; (not just IAM permissions).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-account&lt;/strong&gt;: use bucket policy + role assumption; ensure KMS allows the assumable role’s principal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Private networking&lt;/strong&gt;: if using VPC-only compute, add &lt;strong&gt;VPC endpoints&lt;/strong&gt; for Textract/S3/STS/Logs.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  8) Scaling, throughput, and cost control
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batching&lt;/strong&gt;: drive Textract from a queue (SQS) or Step Functions; cap concurrency to avoid throttling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retries&lt;/strong&gt;: exponential backoff on &lt;code&gt;ThrottlingException&lt;/code&gt; and transient network errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;De-dup&lt;/strong&gt;: check if a doc was already processed (S3 metadata) before calling Textract again.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specialization saves money&lt;/strong&gt;: &lt;code&gt;AnalyzeExpense&lt;/code&gt; or &lt;code&gt;QUERIES&lt;/code&gt; can reduce post-processing and re-runs.&lt;/li&gt;
&lt;/ul&gt;
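&lt;p&gt;One way to sketch the retry bullet: a generic wrapper with exponential backoff and full jitter. Nothing here is Textract-specific; &lt;code&gt;call_with_backoff&lt;/code&gt; and &lt;code&gt;is_retryable&lt;/code&gt; are hypothetical names, and with boto3 the predicate would typically inspect the error code on the raised &lt;code&gt;ClientError&lt;/code&gt;.&lt;/p&gt;

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5,
                      base_delay=0.5, max_delay=20.0, sleep=time.sleep):
    """Call fn(); on retryable errors wait base_delay * 2^(attempt-1) with jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors that retrying cannot fix.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))  # full jitter spreads out retries
```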




&lt;h3&gt;
  
  
  9) Troubleshooting cheat-sheet
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Symptom&lt;/th&gt;
&lt;th&gt;Likely cause&lt;/th&gt;
&lt;th&gt;Fast fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;AccessDenied&lt;/code&gt; on S3 or KMS&lt;/td&gt;
&lt;td&gt;Missing &lt;code&gt;s3:GetObject&lt;/code&gt; / &lt;code&gt;kms:Decrypt&lt;/code&gt; or KMS key policy excludes role&lt;/td&gt;
&lt;td&gt;Add actions to role &lt;strong&gt;and&lt;/strong&gt; role principal to &lt;strong&gt;KMS key policy&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Job stuck “IN_PROGRESS” forever&lt;/td&gt;
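&lt;td&gt;Never re-checking &lt;code&gt;JobStatus&lt;/code&gt;, or reading only the first results page and ignoring &lt;code&gt;NextToken&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Poll until &lt;code&gt;JobStatus&lt;/code&gt; is &lt;code&gt;SUCCEEDED&lt;/code&gt; or &lt;code&gt;FAILED&lt;/code&gt;, then page through results until &lt;code&gt;NextToken&lt;/code&gt; is absent&lt;/td&gt;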
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Empty or low-quality text&lt;/td&gt;
&lt;td&gt;Bad scans (blur, skew), wrong rotation, low contrast&lt;/td&gt;
&lt;td&gt;Preprocess image: deskew, increase contrast, convert to PNG, rotate correctly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fields missing on invoices&lt;/td&gt;
&lt;td&gt;Using generic forms instead of expense API&lt;/td&gt;
&lt;td&gt;Switch to &lt;strong&gt;&lt;code&gt;AnalyzeExpense&lt;/code&gt;&lt;/strong&gt; or add &lt;strong&gt;&lt;code&gt;QUERIES&lt;/code&gt;&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dates/amounts parsed wrong&lt;/td&gt;
&lt;td&gt;Locale/regex issues or OCR confusions&lt;/td&gt;
&lt;td&gt;Normalize formats (ISO-8601), add currency/number regex, apply character-swap heuristics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;QuickSight can’t see outputs&lt;/td&gt;
&lt;td&gt;Using KMS + missing QS access&lt;/td&gt;
&lt;td&gt;Add QS role to &lt;strong&gt;bucket policy&lt;/strong&gt; and &lt;strong&gt;KMS key policy&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
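&lt;p&gt;For the “stuck job” row, a polling-plus-pagination loop might look like the sketch below. It assumes an async job started with &lt;code&gt;start_document_analysis&lt;/code&gt; and a boto3 Textract client passed in as &lt;code&gt;textract&lt;/code&gt;; the function name, &lt;code&gt;max_polls&lt;/code&gt;, and the injectable &lt;code&gt;sleep&lt;/code&gt; are illustrative choices.&lt;/p&gt;

```python
import time


def fetch_all_blocks(textract, job_id, poll_seconds=5, max_polls=120, sleep=time.sleep):
    """Wait for an async analysis job, then collect Blocks from every results page."""
    for _ in range(max_polls):
        first = textract.get_document_analysis(JobId=job_id)
        status = first["JobStatus"]
        if status in ("SUCCEEDED", "PARTIAL_SUCCESS"):
            break
        if status == "FAILED":
            raise RuntimeError(first.get("StatusMessage", "Textract job failed"))
        sleep(poll_seconds)  # still IN_PROGRESS
    else:
        raise TimeoutError(f"job {job_id} still IN_PROGRESS after {max_polls} polls")

    blocks = list(first.get("Blocks", []))
    token = first.get("NextToken")
    while token:  # keep paging until no NextToken is returned
        page = textract.get_document_analysis(JobId=job_id, NextToken=token)
        blocks.extend(page.get("Blocks", []))
        token = page.get("NextToken")
    return blocks
```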




&lt;h3&gt;
  
  
  10) Console workflow (great for validation)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Upload a sample to S3 → &lt;strong&gt;Textract Console&lt;/strong&gt; → &lt;strong&gt;Analyze&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Toggle &lt;strong&gt;Forms&lt;/strong&gt; + &lt;strong&gt;Tables&lt;/strong&gt;, confirm highlights match expectations.&lt;/li&gt;
&lt;li&gt;For invoices, test &lt;strong&gt;Expense analysis&lt;/strong&gt; and compare field names (often cleaner).&lt;/li&gt;
&lt;li&gt;Save the result JSON and move on to parsing (Step 3).&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  ✏️ Step 3: Convert Textract Output to Structured Data (CSV or JSON)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Clean &amp;amp; normalize the extracted data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep originals:&lt;/strong&gt; store the raw Textract JSON (read-only) so you can re-parse later without re-running Textract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whitespace &amp;amp; casing:&lt;/strong&gt; trim, collapse double spaces; keep a &lt;code&gt;_raw&lt;/code&gt; copy for each value if you’ll audit later.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Key normalization (FORMS/QUERIES):&lt;/strong&gt; map variants to canonical keys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;invoice no&lt;/code&gt;, &lt;code&gt;inv #&lt;/code&gt;, &lt;code&gt;invoice number&lt;/code&gt; → &lt;code&gt;invoice_number&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;total&lt;/code&gt;, &lt;code&gt;grand total&lt;/code&gt;, &lt;code&gt;amount due&lt;/code&gt; → &lt;code&gt;total&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;supplier&lt;/code&gt;, &lt;code&gt;vendor&lt;/code&gt;, &lt;code&gt;billed by&lt;/code&gt; → &lt;code&gt;vendor_name&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;invoice date&lt;/code&gt;, &lt;code&gt;date&lt;/code&gt; → &lt;code&gt;invoice_date&lt;/code&gt;; &lt;code&gt;due date&lt;/code&gt; → &lt;code&gt;due_date&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dates:&lt;/strong&gt; convert to ISO-8601 (&lt;code&gt;YYYY-MM-DD&lt;/code&gt;). If ambiguous/parsing fails, leave blank and flag &lt;code&gt;needs_review&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Numbers &amp;amp; currency:&lt;/strong&gt; strip thousands separators, unify decimals (&lt;code&gt;.&lt;/code&gt;), extract currency symbol/code (&lt;code&gt;$&lt;/code&gt;→USD, &lt;code&gt;€&lt;/code&gt;→EUR, &lt;code&gt;£&lt;/code&gt;→GBP, &lt;code&gt;₨&lt;/code&gt;→PKR, &lt;code&gt;₹&lt;/code&gt;→INR). If none, set a default and flag for review.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Confidence tracking:&lt;/strong&gt; carry Textract confidence (0–100). Anything below your threshold (e.g., &lt;strong&gt;80&lt;/strong&gt;) sets &lt;code&gt;needs_review = 1&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;OCR confusions:&lt;/strong&gt; apply gentle corrections when context demands numeric values (&lt;code&gt;O↔0&lt;/code&gt;, &lt;code&gt;I/l↔1&lt;/code&gt;, &lt;code&gt;S↔5&lt;/code&gt;, &lt;code&gt;B↔8&lt;/code&gt;) but still keep &lt;code&gt;needs_review&lt;/code&gt; if confidence was low.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
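&lt;p&gt;A minimal sketch of the key, date, and amount normalization above. The alias map mirrors the examples in the list; the date formats, the “comma is a thousands separator” rule, and the &lt;code&gt;USD&lt;/code&gt; default are assumptions you should adapt to your locale. Ambiguous dates such as &lt;code&gt;03/08/2025&lt;/code&gt; are deliberately left unparsed so they get flagged.&lt;/p&gt;

```python
import re
from datetime import datetime

KEY_ALIASES = {
    "invoice no": "invoice_number", "inv #": "invoice_number", "invoice number": "invoice_number",
    "total": "total", "grand total": "total", "amount due": "total",
    "supplier": "vendor_name", "vendor": "vendor_name", "billed by": "vendor_name",
    "invoice date": "invoice_date", "date": "invoice_date", "due date": "due_date",
}
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP", "₨": "PKR", "₹": "INR"}
DATE_FORMATS = ("%Y-%m-%d", "%d %b %Y", "%d %B %Y", "%b %d, %Y", "%B %d, %Y")


def normalize_key(raw):
    """Map a raw label to its canonical key, or None if unknown."""
    cleaned = re.sub(r"\s+", " ", raw).strip().strip(":.").strip().lower()
    return KEY_ALIASES.get(cleaned)


def to_iso_date(raw):
    """Return YYYY-MM-DD, or None when no unambiguous format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def parse_amount(raw, default_currency="USD"):
    """Return (amount, currency, needs_review)."""
    currency = next((code for sym, code in CURRENCY_SYMBOLS.items() if sym in raw), None)
    digits = re.sub(r"[^\d.,-]", "", raw).replace(",", "")  # "," assumed to be thousands
    try:
        amount = float(digits)
    except ValueError:
        return None, currency, 1
    if currency is None:
        return amount, default_currency, 1  # defaulted currency: flag for review
    return amount, currency, 0
```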




&lt;h3&gt;
  
  
  2) Choose your outputs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;CSV (analytics-ready):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;KV (long format):&lt;/strong&gt; &lt;code&gt;doc_id, page, key_norm, value, confidence, needs_review&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tables (per page/table):&lt;/strong&gt; &lt;code&gt;doc_id, page, row, col, text, confidence, is_header&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invoice summary (if using Expense API):&lt;/strong&gt; one row per document with normalized fields&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;JSON (traceability):&lt;/strong&gt; keep structured JSON when you need bounding boxes/page indices for audits.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;(Optional) Parquet:&lt;/strong&gt; for Athena/QuickSight at scale.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  3) Dump key–value pairs to CSV
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What to capture:&lt;/strong&gt; each detected pair as one row in a &lt;strong&gt;long&lt;/strong&gt; table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;doc_id, page, key_raw, key_norm, value, conf_key, conf_val, needs_review&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Duplicates:&lt;/strong&gt; keep all candidates. Later, pick the one with the &lt;strong&gt;highest confidence&lt;/strong&gt;. Break ties by preferring the candidate closest to the label “Total”, the one latest on the page, or the one inside a summary box.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Missing values:&lt;/strong&gt; if a &lt;code&gt;KEY&lt;/code&gt; has no &lt;code&gt;VALUE&lt;/code&gt;, record &lt;code&gt;value = ""&lt;/code&gt; and set &lt;code&gt;needs_review = 1&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-line values:&lt;/strong&gt; join lines with a single space; preserve original in &lt;code&gt;value_raw&lt;/code&gt; if you keep audits.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
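&lt;p&gt;Writing the long table and picking a winner per key could look like this sketch. Column names follow the list above; &lt;code&gt;kv_rows_to_csv&lt;/code&gt; and &lt;code&gt;best_candidates&lt;/code&gt; are hypothetical helpers, and the tie-break rules are not implemented here (the first candidate seen wins a tie).&lt;/p&gt;

```python
import csv
import io

KV_COLUMNS = ["doc_id", "page", "key_raw", "key_norm", "value",
              "conf_key", "conf_val", "needs_review"]


def kv_rows_to_csv(rows):
    """Serialize all candidate pairs; empty values are forced to needs_review = 1."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=KV_COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        out = dict(row)
        if not out.get("value"):
            out["value"] = ""
            out["needs_review"] = 1
        writer.writerow(out)
    return buf.getvalue()


def best_candidates(rows):
    """Keep the highest-confidence value per (doc_id, key_norm)."""
    best = {}
    for row in rows:
        key = (row["doc_id"], row["key_norm"])
        if key not in best or row["conf_val"] > best[key]["conf_val"]:
            best[key] = row
    return list(best.values())
```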




&lt;h3&gt;
  
  
  4) Convert tables from JSON to CSV
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reconstruct using cells:&lt;/strong&gt; one row per cell with indexes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;doc_id, page, row, col, text, confidence, span_rows, span_cols, is_header&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Headers:&lt;/strong&gt; default to &lt;strong&gt;row 1 = header&lt;/strong&gt;; mark &lt;code&gt;is_header = 1&lt;/code&gt;. If row 1 has poor confidence but contains many alphabetic tokens vs numeric rows below, still treat as header.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Merged cells:&lt;/strong&gt; repeat the merged text in the spanned area and record &lt;code&gt;span_rows/span_cols&lt;/code&gt; for fidelity.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-page tables:&lt;/strong&gt; keep separate by page; your business logic decides whether to stitch them later.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  5) Build the simple “wide” table for your dashboard
&lt;/h3&gt;

&lt;p&gt;Target columns from your outline:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Invoice No (&lt;code&gt;invoice_number&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Date (&lt;code&gt;invoice_date&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Vendor (&lt;code&gt;vendor_name&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Amount (&lt;code&gt;total&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Due Date (&lt;code&gt;due_date&lt;/code&gt;)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INV-123&lt;/td&gt;
&lt;td&gt;2025-08-03&lt;/td&gt;
&lt;td&gt;Alpha Supplies&lt;/td&gt;
&lt;td&gt;423.90&lt;/td&gt;
&lt;td&gt;2025-09-02&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Population priority:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Expense API (if used):&lt;/strong&gt; &lt;code&gt;INVOICE_RECEIPT_ID&lt;/code&gt;, &lt;code&gt;INVOICE_RECEIPT_DATE&lt;/code&gt;, &lt;code&gt;VENDOR_NAME&lt;/code&gt;, &lt;code&gt;TOTAL&lt;/code&gt;, &lt;code&gt;DUE_DATE&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FORMS/QUERIES:&lt;/strong&gt; after key normalization and numeric/date parsing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TABLES/OCR fallback:&lt;/strong&gt; totals row or summary box heuristics (e.g., last column named &lt;code&gt;Amount&lt;/code&gt;, &lt;code&gt;Total&lt;/code&gt;, &lt;code&gt;Balance&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Also add helpful derived fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;currency&lt;/code&gt;, &lt;code&gt;source&lt;/code&gt; (&lt;code&gt;expense&lt;/code&gt;/&lt;code&gt;forms&lt;/code&gt;/&lt;code&gt;ocr&lt;/code&gt;), &lt;code&gt;needs_review&lt;/code&gt;, &lt;code&gt;month&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
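&lt;p&gt;The population priority can be sketched as a simple “first source that has it wins” merge. This assumes each source is a plain dict, that the expense dict is keyed by the Expense API field types listed above, and that dates are already ISO strings; &lt;code&gt;build_wide_row&lt;/code&gt; is an illustrative name.&lt;/p&gt;

```python
WIDE_FIELDS = ["invoice_number", "invoice_date", "vendor_name", "total", "due_date"]
EXPENSE_TYPES = {
    "INVOICE_RECEIPT_ID": "invoice_number",
    "INVOICE_RECEIPT_DATE": "invoice_date",
    "VENDOR_NAME": "vendor_name",
    "TOTAL": "total",
    "DUE_DATE": "due_date",
}


def build_wide_row(doc_id, expense=None, forms=None, ocr=None):
    """Fill each field from expense, then forms, then ocr; flag rows with gaps."""
    expense_norm = {EXPENSE_TYPES[k]: v for k, v in (expense or {}).items()
                    if k in EXPENSE_TYPES}
    sources = [("expense", expense_norm), ("forms", forms or {}), ("ocr", ocr or {})]
    row = {"doc_id": doc_id, "needs_review": 0}
    used = set()
    for field in WIDE_FIELDS:
        row[field] = None
        for name, values in sources:
            if values.get(field) not in (None, ""):
                row[field] = values[field]
                used.add(name)
                break
        if row[field] is None:
            row["needs_review"] = 1
    # Highest-priority source that contributed at least one field.
    row["source"] = next((name for name, _ in sources if name in used), None)
    inv = row["invoice_date"]
    row["year"] = int(inv[:4]) if inv else None
    row["month"] = int(inv[5:7]) if inv else None
    return row
```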




&lt;h3&gt;
  
  
  6) Quality &amp;amp; integrity checks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Arithmetic sanity:&lt;/strong&gt; if you have &lt;code&gt;subtotal&lt;/code&gt; and &lt;code&gt;tax&lt;/code&gt;, verify &lt;code&gt;subtotal + tax ≈ total&lt;/code&gt; (small tolerance for rounding). If not, &lt;code&gt;needs_review = 1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Date logic:&lt;/strong&gt; &lt;code&gt;due_date ≥ invoice_date&lt;/code&gt;; otherwise &lt;code&gt;needs_review = 1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Negatives &amp;amp; zeros:&lt;/strong&gt; &lt;code&gt;total&lt;/code&gt; must be &lt;code&gt;&amp;gt; 0&lt;/code&gt; for invoices; credit notes may be negative, so tag them with &lt;code&gt;doc_type = credit_note&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Duplicates:&lt;/strong&gt; same &lt;code&gt;invoice_number&lt;/code&gt; appearing across different &lt;code&gt;doc_id&lt;/code&gt;s → keep the one with higher composite score (weighted confidence of critical fields) and mark others as &lt;code&gt;superseded&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two documents in one file:&lt;/strong&gt; if you detect two distinct vendors or invoice numbers in one &lt;code&gt;doc_id&lt;/code&gt;, set &lt;code&gt;multi_doc_suspected = 1&lt;/code&gt; and route to review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Locale drift:&lt;/strong&gt; month names/currency symbols outside your default locale → still parse but set &lt;code&gt;locale_detected&lt;/code&gt; and &lt;code&gt;needs_review&lt;/code&gt; if parsing relied on heuristics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handwritten/low DPI scans:&lt;/strong&gt; values may be fragmented; prefer QUERIES for targeted fields and flag low confidence.&lt;/li&gt;
&lt;/ul&gt;
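&lt;p&gt;The first three checks translate into a small flagging function. This sketch assumes amounts are already floats and dates are either &lt;code&gt;date&lt;/code&gt; objects or ISO strings (both compare correctly); the flag names and the 0.05 tolerance are made up for illustration.&lt;/p&gt;

```python
def integrity_flags(row, tolerance=0.05):
    """Return a list of problems found; any flag should set needs_review = 1."""
    flags = []
    subtotal, tax, total = row.get("subtotal"), row.get("tax"), row.get("total")
    if None not in (subtotal, tax, total) and abs(subtotal + tax - total) > tolerance:
        flags.append("arithmetic_mismatch")
    inv, due = row.get("invoice_date"), row.get("due_date")
    if inv and due and inv > due:
        flags.append("due_before_invoice")
    if total is not None and 0 >= total and row.get("doc_type") != "credit_note":
        flags.append("non_positive_total")
    return flags
```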




&lt;h3&gt;
  
  
  7) Foldering
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;textract-json/            # raw API outputs (never edited)
extracted/
  kv/                     # &amp;lt;doc_id&amp;gt;_kv.csv
  tables/                 # &amp;lt;doc_id&amp;gt;_table_&amp;lt;page&amp;gt;.csv
  expense/                # &amp;lt;doc_id&amp;gt;_summary.csv, &amp;lt;doc_id&amp;gt;_line_items.csv (if used)
processed/
  invoices_wide.csv       # your dashboard input (or Parquet partitioned by year/month)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧠 Step 4: Summarise the Document Using Bedrock
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) What you’ll summarise
&lt;/h3&gt;

&lt;p&gt;Feed Bedrock &lt;strong&gt;only what it needs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Facts&lt;/strong&gt; you already extracted in Step 3 (invoice_number, vendor_name, total, due_date, currency, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Top signal text&lt;/strong&gt;: page headers, totals area, vendor/address block, and (for invoices) the line-items table flattened to a few lines (don’t paste thousands of rows).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional full text&lt;/strong&gt;: chunked by page if you need narrative context.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Redact or truncate PII that isn’t needed (addresses beyond city/country, phone, full account numbers). Keep the &lt;strong&gt;doc_id&lt;/strong&gt; so you can trace back.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  2) Output format
&lt;/h3&gt;

&lt;p&gt;Ask Bedrock for &lt;strong&gt;structured JSON + a short narrative&lt;/strong&gt;. Example JSON schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"doc_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1–3 sentence human summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"key_fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vendor_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string|null"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string|null"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YYYY-MM-DD|null"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"due_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YYYY-MM-DD|null"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string|null"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"number|null"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"observations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;anomalies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;discounts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;late&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;fees&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;notes&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data_quality"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"missing_fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suspected_errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence_note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This slots straight into your &lt;code&gt;processed/&lt;/code&gt; layer or QuickSight as a secondary dataset.&lt;/p&gt;




&lt;h3&gt;
  
  
  3) Prompting
&lt;/h3&gt;

&lt;p&gt;Use a &lt;strong&gt;system + user&lt;/strong&gt; style prompt; instruct the model not to invent data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You are a careful financial document assistant. Use ONLY the supplied text and fields.&lt;br&gt;
If a field is missing or unclear, return &lt;code&gt;null&lt;/code&gt; and add a helpful note in &lt;code&gt;data_quality&lt;/code&gt;.&lt;br&gt;
Output must be valid JSON matching the provided schema. Do not include any other text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;User&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DOC_ID: {{doc_id}}

FACTS (from extraction):
- vendor_name: {{vendor_name}}
- invoice_number: {{invoice_number}}
- invoice_date: {{invoice_date}}
- due_date: {{due_date}}
- currency: {{currency}}
- total_amount: {{total}}

EXCERPTS (trimmed; page-tagged):
[PAGE 1]
{{top_of_page_text}}
[PAGE 2 - totals box]
{{totals_area_text}}
[OPTIONAL line items]
{{few_key_lines}}

TASK:
1) Produce a JSON object exactly matching this schema:
{{paste schema}}
2) Base fields on FACTS when available; only use EXCERPTS to confirm or fill blanks.
3) If numbers disagree, prefer FACTS but note the discrepancy in `data_quality.suspected_errors`.
4) Do not guess. Use `null` for unknowns.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this works&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It &lt;strong&gt;anchors&lt;/strong&gt; the model to your extracted fields (reduces hallucinations).&lt;/li&gt;
&lt;li&gt;It &lt;strong&gt;forces&lt;/strong&gt; a predictable JSON payload for downstream use.&lt;/li&gt;
&lt;li&gt;It adds a &lt;strong&gt;quality channel&lt;/strong&gt; for “known unknowns”.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4) Model choice &amp;amp; params
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude (Anthropic)&lt;/strong&gt;: strong on instruction following &amp;amp; JSON fidelity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mistral&lt;/strong&gt;: concise, fast; good when payloads are small and schema is simple.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameters&lt;/strong&gt;: keep &lt;strong&gt;temperature low (≈0.0–0.2)&lt;/strong&gt; for factual stability; set &lt;strong&gt;max tokens&lt;/strong&gt; tight (e.g., 600–900) for cost control.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Tip: If JSON validation occasionally fails, add a &lt;strong&gt;“fix to valid JSON”&lt;/strong&gt; pass (or enable JSON mode where available).&lt;/p&gt;
&lt;/blockquote&gt;
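&lt;p&gt;Before reaching for a second model pass, a cheap local check often catches the common failure (extra prose around the JSON). A minimal sketch, assuming the top-level keys from the schema in section 2; &lt;code&gt;parse_model_json&lt;/code&gt; is a hypothetical helper and does not validate nested types.&lt;/p&gt;

```python
import json

REQUIRED_KEYS = {"doc_id", "summary", "key_fields", "observations", "data_quality"}


def parse_model_json(text):
    """Return (payload, errors). Tolerates prose or code fences around the object."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or start > end:
        return None, ["no JSON object found"]
    try:
        payload = json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    missing = sorted(REQUIRED_KEYS - set(payload))
    return payload, [f"missing key: {name}" for name in missing]
```

&lt;p&gt;Only when &lt;code&gt;errors&lt;/code&gt; is non-empty would you retry or route the document to review.&lt;/p&gt;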




&lt;h3&gt;
  
  
  5) Long documents: Map/Reduce
&lt;/h3&gt;

&lt;p&gt;For multi-page or very verbose docs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Map step&lt;/strong&gt;: summarize &lt;strong&gt;each page&lt;/strong&gt; (fixed schema, plus page-level anomalies).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduce step&lt;/strong&gt;: feed all page summaries (+ your Step-3 facts) into a second prompt that &lt;strong&gt;merges&lt;/strong&gt; into a single final JSON, breaking ties in favor of Step-3 facts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This avoids token limits and preserves page context.&lt;/p&gt;
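&lt;p&gt;The control flow can be sketched independently of any model call. Here &lt;code&gt;summarize_page&lt;/code&gt; and &lt;code&gt;merge&lt;/code&gt; are placeholders for your two Bedrock prompts (both names are assumptions), and the last loop is what “breaking ties in favor of Step-3 facts” means in practice: any non-null extracted fact overwrites the merged value.&lt;/p&gt;

```python
def summarize_document(pages, facts, summarize_page, merge):
    """Map: summarize each page. Reduce: merge, then let extracted facts win."""
    page_summaries = [summarize_page(number, text)
                      for number, text in enumerate(pages, start=1)]
    final = merge(page_summaries, facts)
    key_fields = final.setdefault("key_fields", {})
    for field, value in facts.items():
        if value is not None:
            key_fields[field] = value  # Step-3 fact overrides the model's value
    return final
```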




&lt;h3&gt;
  
  
  6) Robust invocation
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;bedrock-runtime&lt;/code&gt; with the model you enabled (model IDs vary by region and provider). Two common patterns:&lt;/p&gt;

&lt;h3&gt;
  
  
  Non-streaming (simple)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="n"&gt;brt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;invoke_bedrock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-2023-05-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# Claude's Messages API takes the system prompt as a top-level field, not a message role
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;brt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="c1"&gt;# Extract the text depending on provider; for Claude, join the text of each content block:
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Streaming
&lt;/h4&gt;

&lt;p&gt;Stream when a UI needs low-latency, token-by-token output; otherwise skip it, because streaming adds complexity.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Exact request/response shapes differ by model family. Keep the &lt;strong&gt;modelId&lt;/strong&gt; and payload format configurable and test in the Bedrock console first.&lt;/p&gt;
&lt;/blockquote&gt;
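&lt;p&gt;For reference, here is a small streaming sketch for Claude-family models. It assumes the stream emits JSON chunks whose &lt;code&gt;type&lt;/code&gt; is &lt;code&gt;content_block_delta&lt;/code&gt;; other model families use different shapes, so treat the parsing as an assumption to verify. The helper names &lt;code&gt;iter_text_deltas&lt;/code&gt; and &lt;code&gt;stream_bedrock&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(events: Iterable[dict]) -> Iterator[str]:
    """Yield text fragments from a Claude-style Bedrock response stream."""
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip events that carry no payload chunk
        payload = json.loads(chunk["bytes"])
        # Assumption: text arrives in content_block_delta events (Claude family).
        if payload.get("type") == "content_block_delta":
            yield payload.get("delta", {}).get("text", "")


def stream_bedrock(brt, model_id: str, body: dict) -> Iterator[str]:
    """`brt` is a boto3 "bedrock-runtime" client, as in the non-streaming example."""
    resp = brt.invoke_model_with_response_stream(modelId=model_id, body=json.dumps(body))
    return iter_text_deltas(resp["body"])
```

&lt;p&gt;In a UI you would forward each fragment as it arrives, for example &lt;code&gt;for piece in stream_bedrock(brt, model_id, body): print(piece, end="")&lt;/code&gt;.&lt;/p&gt;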




&lt;h3&gt;
  
  
  7) Validation &amp;amp; safety rails
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JSON validation&lt;/strong&gt;: run a quick schema check (pydantic/jsonschema). If invalid, re-prompt the model with: “Return only valid JSON conforming to this schema. Do not add commentary.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No-invention rule&lt;/strong&gt;: instruct “If unsure, use &lt;code&gt;null&lt;/code&gt; and add a &lt;code&gt;data_quality&lt;/code&gt; note.” Penalize hallucinations by rejecting outputs that add fields you didn’t request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discrepancy handling&lt;/strong&gt;: when EXCERPTS disagree with FACTS, keep FACTS and add a note like:
&lt;em&gt;“Totals area shows 432.90 but extracted total is 423.90 (difference 9.00).”&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII minimization&lt;/strong&gt;: don’t pass line-item text unless needed; mask account numbers (e.g., &lt;code&gt;****1234&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost control&lt;/strong&gt;: truncate excerpts, avoid full dumps; cap &lt;code&gt;max_tokens&lt;/code&gt;; batch summarize.&lt;/li&gt;
&lt;/ul&gt;
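&lt;p&gt;A minimal sketch of the JSON validation and re-prompt loop, using only the standard library. The allowed key set mirrors the example output in this post; &lt;code&gt;parse_summary&lt;/code&gt;, &lt;code&gt;summarize_with_retry&lt;/code&gt;, and the &lt;code&gt;invoke&lt;/code&gt; callable are hypothetical names, and a real pipeline would likely use pydantic or jsonschema for deeper checks.&lt;/p&gt;

```python
import json
from typing import Callable, Dict, Optional

# Top-level keys taken from the example output; anything else is rejected.
ALLOWED_KEYS = {"doc_id", "summary", "key_fields", "observations", "data_quality"}
RETRY_PROMPT = "Return only valid JSON conforming to this schema. Do not add commentary."


def parse_summary(raw: str) -> Optional[Dict]:
    """Return the parsed summary, or None if it is invalid or has missing/extra fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != ALLOWED_KEYS:
        return None
    return data


def summarize_with_retry(invoke: Callable[[str], str], prompt: str, retries: int = 2) -> Optional[Dict]:
    """Call the model, re-prompting up to `retries` times when the output fails validation."""
    raw = invoke(prompt)
    for _ in range(retries):
        parsed = parse_summary(raw)
        if parsed is not None:
            return parsed
        raw = invoke(RETRY_PROMPT + "\n\n" + prompt)
    return parse_summary(raw)
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; after the last attempt lets the caller route the document to manual review instead of storing a malformed summary.&lt;/p&gt;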




&lt;h3&gt;
  
  
  8) Edge cases
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Edge case&lt;/th&gt;
&lt;th&gt;What to do in prompt / output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Missing total or due date&lt;/td&gt;
&lt;td&gt;Return &lt;code&gt;"total_amount": null&lt;/code&gt; / &lt;code&gt;"due_date": null&lt;/code&gt;; add a &lt;code&gt;data_quality.missing_fields&lt;/code&gt; entry.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multiple totals (invoice vs balance)&lt;/td&gt;
&lt;td&gt;Prefer FACTS; if absent, pick the one labeled “Total” near summary; note the alternative.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conflicting currency symbols vs codes&lt;/td&gt;
&lt;td&gt;Prefer the currency from FACTS; if excerpt differs, note discrepancy.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Handwritten / low DPI&lt;/td&gt;
&lt;td&gt;Expect nulls and notes; don’t interpolate from context.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-currency line items&lt;/td&gt;
&lt;td&gt;Set currency to document currency; add observation like “Line items include mixed currency not used for total.”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-doc PDF&lt;/td&gt;
&lt;td&gt;If two vendors or two invoice numbers appear, say so in &lt;code&gt;observations&lt;/code&gt; and add &lt;code&gt;suspected_errors&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-English docs&lt;/td&gt;
&lt;td&gt;Add “Output in English.” Keep dates numeric (YYYY-MM-DD).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extremely long inputs&lt;/td&gt;
&lt;td&gt;Use Map/Reduce; only pass per-page top areas and final consolidation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
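&lt;p&gt;Several of these rows boil down to the same rule: prefer FACTS, fall back to the excerpt, and never invent a value. A small sketch of that rule for the total is below; &lt;code&gt;reconcile_total&lt;/code&gt; is a hypothetical helper and the 0.005 threshold is an assumed rounding tolerance.&lt;/p&gt;

```python
from typing import Dict, Optional


def reconcile_total(fact_total: Optional[float], excerpt_total: Optional[float]) -> Dict:
    """Prefer the FACTS total; fall back to the excerpt; record gaps and conflicts."""
    result = {"total_amount": None, "missing_fields": [], "suspected_errors": []}
    if fact_total is None and excerpt_total is None:
        # Nothing reliable to report: leave null and flag the field.
        result["missing_fields"].append("total_amount")
    elif fact_total is None:
        result["total_amount"] = excerpt_total
    else:
        result["total_amount"] = fact_total
        if excerpt_total is not None and abs(fact_total - excerpt_total) > 0.005:
            diff = abs(fact_total - excerpt_total)
            result["suspected_errors"].append(
                f"Totals area shows {excerpt_total:.2f} but extracted total is "
                f"{fact_total:.2f} (difference {diff:.2f})."
            )
    return result
```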




&lt;h3&gt;
  
  
  9) Example “good” output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"doc_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"INV-2025-0045"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Invoice 0045 from Beta Traders for 89.50 EUR, due on 2025-08-28."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"key_fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vendor_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Beta Traders"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0045"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-08-14"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"due_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-08-28"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EUR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;89.50&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"observations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Early-payment discount mentioned (2%)."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data_quality"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"missing_fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suspected_errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence_note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"All fields aligned with extracted facts."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  10) Where to store results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Write JSON summaries to &lt;code&gt;summaries/&lt;/code&gt; and (optionally) a compact CSV view &lt;code&gt;processed/invoice_summaries.csv&lt;/code&gt; with &lt;code&gt;doc_id&lt;/code&gt;, &lt;code&gt;vendor_name&lt;/code&gt;, &lt;code&gt;total_amount&lt;/code&gt;, &lt;code&gt;due_date&lt;/code&gt;, &lt;code&gt;needs_review&lt;/code&gt;, &lt;code&gt;source='bedrock'&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Join this dataset with &lt;code&gt;invoices_wide.csv&lt;/code&gt; for dashboard overlays (e.g., a tooltip with the narrative summary).&lt;/li&gt;
&lt;/ul&gt;
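&lt;p&gt;Flattening the JSON summaries into that compact CSV view could look like the sketch below. The rule that sets &lt;code&gt;needs_review&lt;/code&gt; (any missing field or suspected error) is an assumption, and the function names are hypothetical.&lt;/p&gt;

```python
import csv
import io
from typing import Dict, List

COLUMNS = ["doc_id", "vendor_name", "total_amount", "due_date", "needs_review", "source"]


def summary_to_row(summary: Dict) -> Dict:
    """Flatten one Bedrock JSON summary into a CSV row."""
    fields = summary.get("key_fields", {})
    quality = summary.get("data_quality", {})
    # Assumption: a document needs review if anything is missing or suspicious.
    needs_review = bool(quality.get("missing_fields") or quality.get("suspected_errors"))
    return {
        "doc_id": summary.get("doc_id"),
        "vendor_name": fields.get("vendor_name"),
        "total_amount": fields.get("total_amount"),
        "due_date": fields.get("due_date"),
        "needs_review": int(needs_review),
        "source": "bedrock",
    }


def summaries_to_csv(summaries: List[Dict]) -> str:
    """Render all summaries as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(summary_to_row(s) for s in summaries)
    return buf.getvalue()
```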




&lt;h2&gt;
  
  
  🗂️ Step 5: Store Final Data Back in S3
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Folder layout
&lt;/h3&gt;

&lt;p&gt;Use a layout that separates &lt;strong&gt;raw vs extracted vs consumable&lt;/strong&gt; and supports BI tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;s3://doc-insights-bucket/
  processed/
    wide/                 # final, dashboard-ready “wide” tables (CSV/Parquet)
    kv/                   # merged KV outputs across docs
    tables/               # flattened tables across docs
    expense/              # invoice summaries &amp;amp; line items (if using AnalyzeExpense)
    summaries/            # Bedrock JSON summaries
    parquet/              # same datasets in Parquet (for Athena/QuickSight at scale)
    _staging/             # temp writes (see atomic pattern)
    manifests/            # optional manifest/index files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Partitioning (for Athena/scale):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Prefer hive-style partitions for Parquet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;processed/parquet/invoices_wide/year=2025/month=09/day=12/part-0001.parquet&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Keep CSVs unpartitioned or lightly partitioned by month to avoid too many small files.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
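&lt;p&gt;Building the hive-style key is easy to get subtly wrong (unpadded months, for instance), so a tiny helper helps. &lt;code&gt;parquet_key&lt;/code&gt; is a hypothetical name; the path layout follows the example above.&lt;/p&gt;

```python
from datetime import date


def parquet_key(dataset: str, day: date, part: int) -> str:
    """Return a hive-style partitioned S3 key with zero-padded month, day, and part."""
    return (
        f"processed/parquet/{dataset}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"part-{part:04d}.parquet"
    )
```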




&lt;h3&gt;
  
  
  2) Naming strategy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document-scoped files:&lt;/strong&gt;
&lt;code&gt;processed/expense/summary/&amp;lt;doc_id&amp;gt;.csv&lt;/code&gt;
&lt;code&gt;processed/summaries/&amp;lt;doc_id&amp;gt;.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch/append files:&lt;/strong&gt;
&lt;code&gt;processed/wide/invoices_wide_YYYYMMDD.csv&lt;/code&gt; (or append into a single table and de-dupe by &lt;code&gt;doc_id&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Timestamped artifacts:&lt;/strong&gt; add an ISO timestamp or a monotonically increasing batch id:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;.../2025-09-12T18-40-12Z_invoices_wide.csv&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Avoid collisions:&lt;/strong&gt; include &lt;code&gt;doc_id&lt;/code&gt; (or a UUID) in per-doc artifacts; never rely only on original filenames.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  3) Atomic write pattern
&lt;/h3&gt;

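&lt;p&gt;S3 has strong read-after-write consistency, and an object only becomes visible once its upload completes, so readers never see a byte-level partial object. What they can see is a &lt;strong&gt;half-written batch&lt;/strong&gt;: a truncated or unvalidated file, or only some of a run's files, if your writer crashes mid-job while writing straight to the final keys.&lt;/p&gt;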

&lt;p&gt;&lt;strong&gt;Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write to &lt;code&gt;processed/_staging/&amp;lt;uuid&amp;gt;/&amp;lt;file.tmp&amp;gt;&lt;/code&gt; (or &lt;code&gt;status=writing&lt;/code&gt; tag).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify&lt;/strong&gt; checksum/row count.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copy&lt;/strong&gt; to final key (e.g., &lt;code&gt;processed/wide/invoices_wide_20250912.csv&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Optionally delete the staging file, or keep with tag &lt;code&gt;status=archived&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Edge cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
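&lt;strong&gt;Large files:&lt;/strong&gt; multipart upload is required above 5 GB (and AWS recommends it from roughly 100 MB); a single &lt;code&gt;CopyObject&lt;/code&gt; call is also capped at 5 GB, so the copy step needs a multipart copy beyond that. Send &lt;code&gt;Content-MD5&lt;/code&gt; for integrity when feasible.&lt;/li&gt;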
&lt;li&gt;
&lt;strong&gt;Concurrent writers:&lt;/strong&gt; add a lightweight &lt;strong&gt;lock&lt;/strong&gt; (an S3 object &lt;code&gt;processed/_locks/invoices_wide.lock&lt;/code&gt; that stores a short expiry timestamp, since S3 has no per-object TTL of its own) or write per-writer partitions and compact later.&lt;/li&gt;
&lt;/ul&gt;
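&lt;p&gt;The four-step pattern above might be sketched like this. It is a minimal version for small files that fit in memory, with no multipart handling; &lt;code&gt;publish_via_staging&lt;/code&gt; is a hypothetical helper, and &lt;code&gt;s3&lt;/code&gt; is assumed to be a boto3 S3 client.&lt;/p&gt;

```python
import hashlib
import uuid


def publish_via_staging(s3, bucket: str, final_key: str, body: bytes) -> str:
    """Stage, verify, copy to the final key, then clean up. Returns the SHA-256 hex digest."""
    filename = final_key.rsplit("/", 1)[-1]
    staging_key = f"processed/_staging/{uuid.uuid4()}/{filename}.tmp"
    expected = hashlib.sha256(body).hexdigest()

    # 1) Write to staging so readers of the final key never see unverified data.
    s3.put_object(Bucket=bucket, Key=staging_key, Body=body)

    # 2) Verify what actually landed in S3 against the local checksum.
    staged = s3.get_object(Bucket=bucket, Key=staging_key)["Body"].read()
    if hashlib.sha256(staged).hexdigest() != expected:
        raise ValueError(f"checksum mismatch for {staging_key}")

    # 3) Copy to the final key (server-side copy, no re-upload).
    s3.copy_object(
        Bucket=bucket,
        Key=final_key,
        CopySource={"Bucket": bucket, "Key": staging_key},
    )

    # 4) Clean up the staging object.
    s3.delete_object(Bucket=bucket, Key=staging_key)
    return expected
```

&lt;p&gt;Taking the client as a parameter makes it easy to exercise the flow against a stub before pointing it at a real bucket.&lt;/p&gt;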




&lt;h3&gt;
  
  
  4) Metadata, tags, and headers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Server-side encryption:&lt;/strong&gt; SSE-KMS with your CMK (consistent with earlier steps).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content-Type:&lt;/strong&gt; set correctly (&lt;code&gt;text/csv&lt;/code&gt;, &lt;code&gt;application/json&lt;/code&gt;, &lt;code&gt;application/parquet&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache-Control:&lt;/strong&gt; not crucial for data lakes; you can omit or set &lt;code&gt;no-cache&lt;/code&gt; for rapidly changing CSVs.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Object tags&lt;/strong&gt; (helpful for governance/search):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
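&lt;code&gt;dataset=invoices&lt;/code&gt;, &lt;code&gt;stage=processed&lt;/code&gt;, &lt;code&gt;doc_id=&amp;lt;id&amp;gt;&lt;/code&gt;, &lt;code&gt;status=ready&lt;/code&gt;, and &lt;code&gt;source&lt;/code&gt; set to one of &lt;code&gt;expense&lt;/code&gt;, &lt;code&gt;forms&lt;/code&gt;, &lt;code&gt;ocr&lt;/code&gt;, or &lt;code&gt;bedrock&lt;/code&gt;.&lt;/li&gt;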
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Custom metadata&lt;/strong&gt; (e.g., &lt;code&gt;x-amz-meta-rowcount&lt;/code&gt;, &lt;code&gt;x-amz-meta-hash&lt;/code&gt;): useful for quick validation.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
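&lt;p&gt;Put together, the encryption, content type, tags, and custom metadata map onto &lt;code&gt;put_object&lt;/code&gt; arguments roughly as follows. &lt;code&gt;put_object_kwargs&lt;/code&gt; is a hypothetical helper, the KMS key id is a placeholder you supply, and the JSON-or-CSV content type switch is a simplification.&lt;/p&gt;

```python
import hashlib
from urllib.parse import urlencode


def put_object_kwargs(bucket: str, key: str, body: bytes, doc_id: str,
                      kms_key_id: str, row_count: int) -> dict:
    """Build keyword arguments for a boto3 S3 client's put_object call."""
    tags = {
        "dataset": "invoices",
        "stage": "processed",
        "doc_id": doc_id,
        "status": "ready",
        "source": "bedrock",
    }
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ContentType": "application/json" if key.endswith(".json") else "text/csv",
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
        # S3 expects object tags as a URL-encoded query string.
        "Tagging": urlencode(tags),
        # boto3 adds the x-amz-meta- prefix to each metadata key.
        "Metadata": {
            "rowcount": str(row_count),
            "hash": hashlib.sha256(body).hexdigest(),
        },
    }
```

&lt;p&gt;Usage would then be &lt;code&gt;s3.put_object(**put_object_kwargs(...))&lt;/code&gt; with a boto3 S3 client.&lt;/p&gt;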




&lt;h3&gt;
  
  
  5) Security &amp;amp; access
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bucket policy:&lt;/strong&gt; already denies insecure transport; also allow the &lt;strong&gt;QuickSight service role&lt;/strong&gt; to &lt;code&gt;ListBucket&lt;/code&gt; and &lt;code&gt;GetObject&lt;/code&gt; on &lt;code&gt;processed/*&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KMS key policy:&lt;/strong&gt; include the QuickSight role principal (not just IAM permissions) so it can decrypt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-account reads (optional):&lt;/strong&gt; prefer role assumption + bucket policy + KMS grants.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPC endpoints:&lt;/strong&gt; if compute is private, ensure endpoints for S3/KMS are present.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Edge cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;QuickSight can list but &lt;strong&gt;can’t read&lt;/strong&gt; → key policy is missing the role principal.&lt;/li&gt;
&lt;li&gt;Using &lt;strong&gt;SPICE&lt;/strong&gt;: watch dataset size limits; switch to &lt;strong&gt;Athena over Parquet&lt;/strong&gt; when CSVs grow.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  6) Data integrity &amp;amp; quality gates
&lt;/h3&gt;

&lt;p&gt;Before moving from &lt;code&gt;_staging&lt;/code&gt; → &lt;code&gt;processed&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schema check:&lt;/strong&gt; headers exist and match expected column set.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row count &amp;gt; 0&lt;/strong&gt; (or allow zero with a &lt;code&gt;note=empty_batch&lt;/code&gt; tag).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type validation:&lt;/strong&gt; dates parse (ISO-8601), amounts numeric, currency present.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sanity checks:&lt;/strong&gt; if files include &lt;code&gt;subtotal&lt;/code&gt;, &lt;code&gt;tax&lt;/code&gt;, &lt;code&gt;total&lt;/code&gt;, verify &lt;code&gt;subtotal + tax ≈ total&lt;/code&gt; (tolerance).&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Checksum:&lt;/strong&gt; compute SHA-256 of the file and store as sidecar:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;processed/wide/invoices_wide_20250912.csv.sha256&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
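&lt;p&gt;These gates are simple enough to run in plain Python before the copy step. The sketch below assumes a hypothetical column set (&lt;code&gt;EXPECTED&lt;/code&gt;) and a 0.01 tolerance; adjust both to your real schema.&lt;/p&gt;

```python
import csv
import hashlib
import io
from datetime import date
from typing import List

# Hypothetical column set for illustration; replace with your real schema.
EXPECTED = ["doc_id", "invoice_date", "subtotal", "tax", "total", "currency"]


def quality_gate(csv_text: str, tolerance: float = 0.01) -> List[str]:
    """Return a list of problems; an empty list means the file may be promoted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED:
        return [f"schema mismatch: {reader.fieldnames}"]
    problems = []
    rows = list(reader)
    if not rows:
        problems.append("row count is 0")
    for n, row in enumerate(rows, start=1):
        try:
            date.fromisoformat(row["invoice_date"])  # dates must be ISO-8601
            subtotal, tax, total = (float(row[c]) for c in ("subtotal", "tax", "total"))
        except ValueError as exc:
            problems.append(f"row {n}: {exc}")
            continue
        if not row["currency"]:
            problems.append(f"row {n}: currency missing")
        if abs(subtotal + tax - total) > tolerance:
            problems.append(f"row {n}: subtotal + tax != total")
    return problems


def sha256_sidecar(csv_text: str) -> str:
    """Hex digest to store next to the file as a .sha256 sidecar."""
    return hashlib.sha256(csv_text.encode("utf-8")).hexdigest()
```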




&lt;h3&gt;
  
  
  7) Lifecycle &amp;amp; retention
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Versioning:&lt;/strong&gt; keep enabled to roll back erroneous overwrites.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lifecycle rules:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expire objects under &lt;code&gt;processed/_staging/&lt;/code&gt; &lt;strong&gt;after 7 days&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Transition CSV prefixes (e.g., &lt;code&gt;processed/wide/&lt;/code&gt;) to &lt;strong&gt;STANDARD_IA&lt;/strong&gt; after 30 days; lifecycle filters match prefixes and tags, not wildcards like &lt;code&gt;*.csv&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Keep Parquet in &lt;strong&gt;STANDARD&lt;/strong&gt; (or &lt;strong&gt;GLACIER&lt;/strong&gt; if archived) based on access.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;S3 Inventory&lt;/strong&gt; (optional): enable for daily listings &amp;amp; encryption audit.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
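&lt;p&gt;As a rough sketch, the first two lifecycle rules could be expressed as a boto3 lifecycle configuration. Lifecycle filters work on prefixes (or tags), so this assumes the CSVs live under &lt;code&gt;processed/wide/&lt;/code&gt;; the rule IDs and &lt;code&gt;apply_lifecycle&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
LIFECYCLE = {
    "Rules": [
        {
            # Staging objects are temporary: delete them after a week.
            "ID": "expire-staging",
            "Filter": {"Prefix": "processed/_staging/"},
            "Status": "Enabled",
            "Expiration": {"Days": 7},
        },
        {
            # Assumption: dashboard CSVs live under processed/wide/.
            "ID": "csv-to-ia",
            "Filter": {"Prefix": "processed/wide/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
        },
    ]
}


def apply_lifecycle(s3, bucket: str) -> None:
    """`s3` is a boto3 S3 client. This call replaces the bucket's existing lifecycle rules."""
    s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=LIFECYCLE)
```

&lt;p&gt;Because the call replaces the whole configuration, include any rules you already rely on in the same &lt;code&gt;Rules&lt;/code&gt; list.&lt;/p&gt;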




&lt;h3&gt;
  
  
  8) Manifests &amp;amp; discovery (helps BI tools)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manifest file&lt;/strong&gt; lists the latest batch outputs:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  {
    "dataset": "invoices_wide",
    "updated_at": "2025-09-12T18:40:12Z",
    "files": [
      "s3://.../processed/wide/invoices_wide_20250912.csv"
    ],
    "row_count": 1243,
    "schema": ["doc_id","vendor_name","invoice_number","invoice_date","due_date","total","currency","needs_review","source","month","year"]
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save as &lt;code&gt;processed/manifests/invoices_wide.latest.json&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Glue Catalog&lt;/strong&gt; (for Athena): create/update a table pointing to your Parquet path with partitions &lt;code&gt;year, month, day&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
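&lt;p&gt;Generating the pipeline manifest in code keeps &lt;code&gt;row_count&lt;/code&gt; and &lt;code&gt;schema&lt;/code&gt; in sync with what was actually written. A minimal sketch, with &lt;code&gt;build_manifest&lt;/code&gt; as a hypothetical helper that mirrors the fields shown above:&lt;/p&gt;

```python
import json
from datetime import datetime, timezone
from typing import List


def build_manifest(dataset: str, files: List[str], row_count: int, schema: List[str]) -> str:
    """Return the manifest JSON text for the latest batch of a dataset."""
    manifest = {
        "dataset": dataset,
        # UTC timestamp in the same ISO-8601 form as the example manifest.
        "updated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "files": files,
        "row_count": row_count,
        "schema": schema,
    }
    return json.dumps(manifest, indent=2)
```

&lt;p&gt;Writing the manifest last, after the data files have been promoted, means consumers that read it never point at files that are not there yet.&lt;/p&gt;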




&lt;h3&gt;
  
  
  9) Eventing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;S3 → EventBridge&lt;/strong&gt; rule on &lt;code&gt;processed/&lt;/code&gt; prefix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trigger a Lambda to &lt;strong&gt;refresh QuickSight&lt;/strong&gt; (for SPICE datasets, start a new ingestion via the &lt;code&gt;CreateIngestion&lt;/code&gt; API).&lt;/li&gt;
&lt;li&gt;Or publish a Slack/Email notification with the manifest summary.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Idempotency:&lt;/strong&gt; include &lt;code&gt;run_id&lt;/code&gt; or &lt;code&gt;batch_id&lt;/code&gt; in event detail; ignore duplicates in the consumer.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
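&lt;p&gt;One way to get idempotency almost for free is to derive the SPICE ingestion id from the event itself, so a redelivered event maps to the same id. The sketch below assumes an S3 "Object Created" event delivered through EventBridge and uses the object key plus ETag as the batch identity (an assumption; a &lt;code&gt;run_id&lt;/code&gt; of your own works the same way). &lt;code&gt;make_handler&lt;/code&gt; is a hypothetical helper, and &lt;code&gt;quicksight&lt;/code&gt; is a boto3 QuickSight client.&lt;/p&gt;

```python
import hashlib


def make_handler(quicksight, account_id: str, dataset_id: str):
    """Build a Lambda-style handler that starts at most one ingestion per object version."""

    def handler(event, context=None):
        obj = event["detail"]["object"]
        # Deterministic id: the same key + ETag always yields the same ingestion id.
        raw = f"{obj['key']}:{obj.get('etag', '')}"
        ingestion_id = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]
        try:
            quicksight.create_ingestion(
                AwsAccountId=account_id,
                DataSetId=dataset_id,
                IngestionId=ingestion_id,
            )
        except quicksight.exceptions.ResourceExistsException:
            # A duplicate delivery of the same event: nothing to do.
            return {"status": "duplicate", "ingestion_id": ingestion_id}
        return {"status": "started", "ingestion_id": ingestion_id}

    return handler
```

&lt;p&gt;In a real Lambda you would create the client once at module level and expose &lt;code&gt;handler = make_handler(...)&lt;/code&gt;.&lt;/p&gt;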




&lt;h2&gt;
  
  
  📊 Step 6: Create a Dashboard in QuickSight
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Pick your connection path
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Path A | S3 CSV (fast start)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Best for small/medium datasets (≤ a few million rows per file).&lt;/li&gt;
&lt;li&gt;In QuickSight: &lt;strong&gt;Datasets → New dataset → S3&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Point to &lt;strong&gt;a single CSV&lt;/strong&gt; &lt;em&gt;or&lt;/em&gt; a &lt;strong&gt;manifest JSON&lt;/strong&gt; (recommended if you have many files).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Path B | Athena + Parquet (scales)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Best for large data, partitions, or frequent refresh.&lt;/li&gt;
&lt;li&gt;Store Parquet at &lt;code&gt;processed/parquet/invoices_wide/year=YYYY/month=MM/day=DD/...&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;In QuickSight: &lt;strong&gt;Datasets → New dataset → Athena → select database &amp;amp; table&lt;/strong&gt; (from Glue Catalog).&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Edge cases&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If QuickSight is in a different region than your S3 bucket, prefer &lt;strong&gt;Athena&lt;/strong&gt; to avoid cross-region pain.&lt;/li&gt;
&lt;li&gt;Using SSE-KMS? Ensure the &lt;strong&gt;QuickSight service role&lt;/strong&gt; is in your &lt;strong&gt;KMS key policy&lt;/strong&gt; and bucket policy.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  2) Connect to your data
&lt;/h3&gt;

&lt;h4&gt;
  
  
  A) S3 with a manifest (many files)
&lt;/h4&gt;

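&lt;p&gt;Create a QuickSight-format manifest, for example at &lt;code&gt;processed/manifests/invoices_wide.quicksight.json&lt;/code&gt; (keep it separate from the pipeline manifest &lt;code&gt;invoices_wide.latest.json&lt;/code&gt; from Step 5, which uses a different schema):&lt;br&gt;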
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fileLocations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"URIPrefixes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"s3://doc-insights-bucket/processed/wide/"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"globalUploadSettings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CSV"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"delimiter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;","&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"textqualifier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"containsHeader"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in QuickSight S3 connector, paste this &lt;strong&gt;manifest file’s S3 URL&lt;/strong&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  B) Athena (Parquet partitions)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Make sure your Glue table points to &lt;code&gt;processed/parquet/invoices_wide/&lt;/code&gt; and has partitions &lt;code&gt;year, month, day&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;In QuickSight, choose &lt;strong&gt;Direct Query&lt;/strong&gt; (live) or &lt;strong&gt;Import to SPICE&lt;/strong&gt; (in-memory; faster but size-limited).&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Edge cases&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;“AccessDenied”&lt;/strong&gt; when previewing = missing KMS permission or bucket grant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No new data&lt;/strong&gt; after partition upload = run crawler/add partition OR enable partition projection.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
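&lt;p&gt;If you register partitions yourself rather than running a crawler, the DDL can be generated from the date. A small sketch, assuming string-typed &lt;code&gt;year&lt;/code&gt;, &lt;code&gt;month&lt;/code&gt;, and &lt;code&gt;day&lt;/code&gt; partition columns; &lt;code&gt;add_partition_sql&lt;/code&gt; is a hypothetical helper, and the statement would be submitted through Athena like any other query.&lt;/p&gt;

```python
from datetime import date


def add_partition_sql(table: str, base_uri: str, day: date) -> str:
    """Return an Athena DDL statement that registers one daily partition."""
    y, m, d = f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year='{y}', month='{m}', day='{d}') "
        f"LOCATION '{base_uri}/year={y}/month={m}/day={d}/'"
    )
```

&lt;p&gt;&lt;code&gt;IF NOT EXISTS&lt;/code&gt; makes the statement safe to re-run after a retried batch.&lt;/p&gt;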




&lt;h3&gt;
  
  
  3) Prepare the dataset
&lt;/h3&gt;

&lt;p&gt;In the &lt;strong&gt;data prep&lt;/strong&gt; screen:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Map columns&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;invoice_number&lt;/code&gt; → Text&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;invoice_date&lt;/code&gt;, &lt;code&gt;due_date&lt;/code&gt; → Date (format &lt;code&gt;YYYY-MM-DD&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;total&lt;/code&gt; → Decimal&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vendor_name&lt;/code&gt; → Text&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;currency&lt;/code&gt; → Text&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;needs_review&lt;/code&gt; → Integer/Boolean&lt;/li&gt;
&lt;li&gt;(Optional) &lt;code&gt;ingest_date&lt;/code&gt; or &lt;code&gt;processed_at&lt;/code&gt; → Date/Time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Create calculated fields (QuickSight expressions)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Month (bucket)&lt;/strong&gt;
&lt;code&gt;truncDate('MM', {invoice_date})&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Year&lt;/strong&gt;
&lt;code&gt;extract('YYYY', {invoice_date})&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overdue (flag)&lt;/strong&gt;
&lt;code&gt;ifelse({due_date} &amp;lt; now(), 1, 0)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Processing time (days)&lt;/strong&gt; – pick the source you actually have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you kept an ingest timestamp:
&lt;code&gt;dateDiff({invoice_date}, {ingest_date}, 'DD')&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Or from S3 key date (if you saved it):
&lt;code&gt;dateDiff({invoice_date}, {s3_ingest_date}, 'DD')&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Clean total (fallback)&lt;/strong&gt; – if some totals are text:&lt;br&gt;&lt;br&gt;
&lt;code&gt;parseDecimal({total})&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Edge cases&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mixed date formats? Normalize upstream; otherwise use a &lt;strong&gt;calculated field&lt;/strong&gt; with &lt;code&gt;parseDate()&lt;/code&gt; per pattern and coalesce.&lt;/li&gt;
&lt;li&gt;Multiple currencies? Keep a &lt;strong&gt;currency&lt;/strong&gt; dimension; optional FX table join later.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
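&lt;p&gt;Normalizing dates upstream can be as small as the sketch below. The pattern list is an assumption: order matters, and ambiguous inputs such as &lt;code&gt;03/04/2025&lt;/code&gt; are read day-first here, so adjust it to the formats your vendors actually use. &lt;code&gt;to_iso_date&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from datetime import datetime
from typing import Optional

# Assumed input patterns, tried in order; the first match wins.
PATTERNS = ["%Y-%m-%d", "%d/%m/%Y", "%d %b %Y"]


def to_iso_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known pattern matches."""
    for pattern in PATTERNS:
        try:
            return datetime.strptime(raw.strip(), pattern).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; instead of guessing keeps bad dates visible, which pairs well with the &lt;code&gt;needs_review&lt;/code&gt; flag.&lt;/p&gt;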




&lt;h3&gt;
  
  
  4) Build the visuals
&lt;/h3&gt;

&lt;h4&gt;
  
  
  A) Total amount by vendor (Top-N bar)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visual&lt;/strong&gt;: Horizontal bar&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X&lt;/strong&gt;: &lt;code&gt;sum({total})&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Y&lt;/strong&gt;: &lt;code&gt;vendor_name&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sort&lt;/strong&gt;: by &lt;code&gt;sum({total})&lt;/code&gt; desc&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter&lt;/strong&gt;: Top 10 vendors (add a Top N filter)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional&lt;/strong&gt;: color by &lt;code&gt;needs_review&lt;/code&gt; to spot messy vendors&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  B) Monthly invoice totals (trend)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visual&lt;/strong&gt;: Line chart&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X&lt;/strong&gt;: &lt;code&gt;truncDate('MM', {invoice_date})&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Y&lt;/strong&gt;: &lt;code&gt;sum({total})&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forecast&lt;/strong&gt; (optional): enable forecast to project next months&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tooltip&lt;/strong&gt;: add &lt;code&gt;countDistinct({invoice_number})&lt;/code&gt; for volume context&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  C) Average processing time (days)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visual&lt;/strong&gt;: KPI or bar by vendor&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field&lt;/strong&gt;: &lt;code&gt;avg(dateDiff({invoice_date}, {ingest_date}, 'DD'))&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Group by&lt;/strong&gt;: &lt;code&gt;vendor_name&lt;/code&gt; (or monthly bucket)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter&lt;/strong&gt;: exclude &lt;code&gt;needs_review = 1&lt;/code&gt; if you want only clean rows&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  D) Bonus quick wins
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Overdue heatmap&lt;/strong&gt;: &lt;code&gt;vendor_name&lt;/code&gt; × &lt;code&gt;month&lt;/code&gt; with color &lt;code&gt;sum(ifelse({due_date} &amp;lt; now(), {total}, 0))&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality gate&lt;/strong&gt;: table filtered to &lt;code&gt;needs_review = 1&lt;/code&gt; (your manual review queue)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5) Filters, controls, and parameters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Date range control&lt;/strong&gt;: Filter on &lt;code&gt;{invoice_date}&lt;/code&gt; with “Relative dates” (e.g., Last 12 months).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor control&lt;/strong&gt;: Dropdown on &lt;code&gt;vendor_name&lt;/code&gt; (multi-select).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence/quality toggle&lt;/strong&gt;: &lt;code&gt;needs_review&lt;/code&gt; = 0 by default (checkbox control).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Currency control&lt;/strong&gt; (if applicable): filter or parameter to select a single currency.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Edge cases&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If your dataset mixes currencies, don’t sum across different codes in one visual unless you convert first.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  6) SPICE vs Direct Query
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SPICE (Import)&lt;/strong&gt;: fastest visuals, schedule refreshes (e.g., hourly). Watch capacity limits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct Query&lt;/strong&gt;: always live; best with Athena/Parquet; slightly slower per query.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommended&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with &lt;strong&gt;SPICE&lt;/strong&gt; for S3 CSVs; switch to &lt;strong&gt;Athena + SPICE&lt;/strong&gt; as you scale.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  7) Refresh strategy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;S3 CSV path&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you &lt;strong&gt;overwrite one key&lt;/strong&gt; (e.g., &lt;code&gt;invoices_wide_latest.csv&lt;/code&gt;), set a refresh schedule (hourly/daily).&lt;/li&gt;
&lt;li&gt;If you &lt;strong&gt;append new files&lt;/strong&gt;, use a &lt;strong&gt;manifest&lt;/strong&gt; and schedule refresh.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Athena path&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After adding partitions, either run a crawler or use &lt;strong&gt;partition projection&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In QuickSight, schedule SPICE refresh or stay on Direct Query.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
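&lt;p&gt;For the append-new-files case, the manifest is just a small JSON document that tells QuickSight which S3 locations to read. Below is a minimal sketch that builds one with a prefix, so new files under it get picked up on the next refresh. The bucket name and prefix are placeholders, not values from this project.&lt;/p&gt;

```python
import json


def build_manifest(uri_prefixes):
    """Return a QuickSight S3 manifest (as a dict) covering the given prefixes."""
    return {
        "fileLocations": [{"URIPrefixes": list(uri_prefixes)}],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "containsHeader": "true",
        },
    }


# Hypothetical bucket and prefix, for illustration only.
manifest = build_manifest(["s3://my-invoice-bucket/curated/invoices/"])
print(json.dumps(manifest, indent=2))
```

&lt;p&gt;Save the output as a &lt;code&gt;.json&lt;/code&gt; file, upload it (or point QuickSight at its S3 URL) when creating the dataset, and keep the prefix stable so scheduled refreshes keep finding new files.&lt;/p&gt;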

&lt;blockquote&gt;
&lt;p&gt;Edge cases&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refresh fails with KMS errors → add QuickSight role to CMK key policy.&lt;/li&gt;
&lt;li&gt;Stale data → you’re writing to new paths; either point QS at a &lt;strong&gt;stable key&lt;/strong&gt; or use manifests.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  8) Sharing, RLS, and governance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Share analysis → Publish dashboard&lt;/strong&gt;; grant users/groups access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row-Level Security (RLS)&lt;/strong&gt;: upload an RLS mapping dataset (&lt;code&gt;user&lt;/code&gt;, &lt;code&gt;vendor_name&lt;/code&gt;) to restrict rows per viewer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data dictionary tab&lt;/strong&gt;: add field descriptions so the team knows what “processing time” means.&lt;/li&gt;
&lt;/ul&gt;
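&lt;p&gt;The RLS mapping dataset can be as simple as a two-column CSV. Here is a small sketch that generates one row per user/vendor pair. It assumes the user column is named &lt;code&gt;UserName&lt;/code&gt;, which is what QuickSight expects for user-based rules as far as I know; the user names and vendors are invented.&lt;/p&gt;

```python
import csv
import io


def build_rls_csv(assignments):
    """Build RLS rules as CSV text: one row per (user, vendor) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header: the user column plus the dataset field to restrict on.
    writer.writerow(["UserName", "vendor_name"])
    for user_name, vendors in assignments.items():
        for vendor in vendors:
            writer.writerow([user_name, vendor])
    return buffer.getvalue()


# Invented users and vendors, for illustration only.
print(build_rls_csv({"alice": ["Acme Corp"], "bob": ["Globex", "Initech"]}))
```

&lt;p&gt;Upload the result as its own dataset, then attach it to the invoices dataset as row-level security rules. Test with a non-admin viewer before sharing widely.&lt;/p&gt;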




&lt;h3&gt;
  
  
  9) Minimal recipe
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Connect&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;S3: dataset → S3 → manifest → validate preview.&lt;/li&gt;
&lt;li&gt;Athena: dataset → Athena → select table.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;Model&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Types: map date/number fields.&lt;/li&gt;
&lt;li&gt;Calcs: &lt;code&gt;month = truncDate('MM', {invoice_date})&lt;/code&gt;, &lt;code&gt;processing_days = dateDiff({invoice_date}, {ingest_date}, 'DD')&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;&lt;strong&gt;Visualize&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Bar: &lt;code&gt;sum(total)&lt;/code&gt; by &lt;code&gt;vendor_name&lt;/code&gt; (Top 10).&lt;/li&gt;
&lt;li&gt;Line: &lt;code&gt;sum(total)&lt;/code&gt; by &lt;code&gt;month&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;KPI/Bar: &lt;code&gt;avg(processing_days)&lt;/code&gt; by vendor.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;strong&gt;Control&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Date range (last 12 months).&lt;/li&gt;
&lt;li&gt;Vendor multi-select.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;needs_review&lt;/code&gt; toggle (default 0).&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="5"&gt;
&lt;li&gt;&lt;strong&gt;Operationalize&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;SPICE refresh schedule (hourly/daily).&lt;/li&gt;
&lt;li&gt;If Athena: ensure partitions &amp;amp; permissions.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💭 Conclusion
&lt;/h2&gt;

&lt;p&gt;This was honestly a fun little fortnight project.&lt;br&gt;
It saved me hours of manual work and gave me clean, visual summaries of everything I needed. By combining S3, Textract, Bedrock, and QuickSight, you can turn unstructured, scanned documents into analysis-ready data and clear dashboards. This pipeline handles extraction, normalisation, summarisation, governed storage, and visualisation, and it is built with security (SSE-KMS), least-privilege IAM, and scalable patterns (async Textract, Parquet/Athena). It replaces manual review with consistent, auditable automation and surfaces vendor spend, monthly trends, and processing KPIs in minutes. From here, you can harden orchestration with Step Functions, add Row-Level Security in QuickSight, and extend the same approach to receipts, contracts, or reports. Small lift, big leverage.&lt;/p&gt;

&lt;p&gt;You could easily apply this to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Legal contracts&lt;/li&gt;
&lt;li&gt;Academic papers&lt;/li&gt;
&lt;li&gt;Medical prescriptions&lt;/li&gt;
&lt;li&gt;Tax docs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you try it out and get stuck, drop a comment, and I'd be happy to help.&lt;/p&gt;

&lt;h1&gt;
  
  
  Happy Hacking 👋
&lt;/h1&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>devops</category>
      <category>automation</category>
    </item>
    <item>
      <title>🚀 Scaling React Like Big Tech: Folder Structures, Clean Code &amp; Beyond</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Sat, 30 Aug 2025 18:55:22 +0000</pubDate>
      <link>https://forem.com/mursalfk/scaling-react-like-big-tech-folder-structures-clean-code-beyond-51bj</link>
      <guid>https://forem.com/mursalfk/scaling-react-like-big-tech-folder-structures-clean-code-beyond-51bj</guid>
      <description>&lt;h2&gt;
  
  
  Hey Everyone 👋
&lt;/h2&gt;

&lt;p&gt;I know it's been quite a long gap since my last article. But we are finally back with a bang.&lt;/p&gt;

&lt;p&gt;So if you’ve ever wondered how &lt;strong&gt;Airbnb, Shopify, or Meta&lt;/strong&gt; keep their codebases sane while thousands of engineers ship features daily — the answer is simple: &lt;strong&gt;structure + discipline + automation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This isn’t about writing code that just &lt;em&gt;works&lt;/em&gt;. It’s about writing code that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scales with your team.&lt;/li&gt;
&lt;li&gt;Survives onboarding of 10 new devs.&lt;/li&gt;
&lt;li&gt;Passes reviews without a hundred nitpicks.&lt;/li&gt;
&lt;li&gt;And doesn’t make you cry when you open it six months later.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s break down what the big companies do differently — and how you can bring those practices into your own projects.&lt;/p&gt;




&lt;h2&gt;
  
  
  📂 Folder Structure that Scales
&lt;/h2&gt;

&lt;p&gt;Top companies don’t just throw files into a random &lt;strong&gt;&lt;code&gt;components/&lt;/code&gt;&lt;/strong&gt; folder and hope for the best. They use &lt;strong&gt;feature-driven architecture&lt;/strong&gt;, which grows naturally with the product.&lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ Traditional (Bad) Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;src/
├── components/
│   ├── Button.jsx
│   ├── Card.jsx
│   ├── Form.jsx
├── services/
├── hooks/
└── App.jsx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This looks fine at first… until your app has &lt;strong&gt;300+ components&lt;/strong&gt; and no one knows which belongs to what.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ Scalable (Feature-Based) Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;src/
├── features/
│   └── auth/
│       ├── components/
│       ├── hooks/
│       └── services/
├── shared/
│   ├── ui/
│   └── utils/
└── App.jsx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;📌 &lt;strong&gt;Key Difference&lt;/strong&gt;: Everything related to a single feature (Auth, Products, Cart, etc.) is grouped. Shared logic/components live in &lt;code&gt;shared/&lt;/code&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Traditional (by type)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple to start&lt;/td&gt;
&lt;td&gt;Becomes chaos at scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feature-Based&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Easy ownership, scalable, testable&lt;/td&gt;
&lt;td&gt;Slightly more setup upfront&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🎨 Design Systems &amp;amp; UI Libraries
&lt;/h2&gt;

&lt;p&gt;Big companies don’t reinvent the button every Tuesday. They use &lt;strong&gt;design systems&lt;/strong&gt; and &lt;strong&gt;UI libraries&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚀 &lt;strong&gt;Airbnb&lt;/strong&gt; → Lona&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;Shopify&lt;/strong&gt; → Polaris&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;Meta&lt;/strong&gt; → Reusable UI + strict guidelines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Tools you can use: &lt;strong&gt;Chakra UI&lt;/strong&gt;, &lt;strong&gt;Tailwind + Headless UI&lt;/strong&gt;, &lt;strong&gt;Radix UI&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Visual Example
&lt;/h3&gt;

&lt;p&gt;❌ Without Design System:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Button&lt;/span&gt; &lt;span class="na"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"danger"&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"red"&lt;/span&gt; &lt;span class="na"&gt;bg&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"red"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ With Design System:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;DeleteButton&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Why Use Design Systems?&lt;/th&gt;
&lt;th&gt;Benefits&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Consistency&lt;/td&gt;
&lt;td&gt;UI looks the same across app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Devs move faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accessibility&lt;/td&gt;
&lt;td&gt;Built-in a11y support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brand Identity&lt;/td&gt;
&lt;td&gt;Product feels cohesive&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🧑‍💻 Hooks = Where Logic Lives
&lt;/h2&gt;

&lt;p&gt;Business logic doesn’t belong inside components. &lt;strong&gt;Custom hooks&lt;/strong&gt; make code &lt;strong&gt;reusable, testable, and clean&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// useProduct.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useProducts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setData&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
   &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;fetchProducts&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;setData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ UI stays declarative.&lt;br&gt;
✅ Logic stays modular.&lt;/p&gt;

&lt;p&gt;📊 &lt;strong&gt;Diagram: Components vs Hooks&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Component] ---&amp;gt; UI (render)
       |
       v
 [Custom Hook] ---&amp;gt; Data Fetching, State, Business Logic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧼 Clean Code = Scalable Code
&lt;/h2&gt;

&lt;p&gt;Big tech engineers follow &lt;strong&gt;clean code principles&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SRP&lt;/strong&gt; → Single Responsibility Principle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KISS&lt;/strong&gt; → Keep It Simple, Stupid&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DRY&lt;/strong&gt; → Don’t Repeat Yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;❌ Bad&lt;/th&gt;
&lt;th&gt;✅ Good&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;Button variant="danger" color="red" bg="red" /&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;DeleteButton /&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;✅ Abstraction = Fewer Bugs + Cleaner Reviews&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Real-Time Code Reviews + Linting
&lt;/h2&gt;

&lt;p&gt;At big companies, &lt;strong&gt;automation + reviews are mandatory&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Code Reviews&lt;/strong&gt; → shared knowledge&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Pre-Commit Hooks (Husky)&lt;/strong&gt; → catch errors early&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;ESLint + Prettier&lt;/strong&gt; → consistent style&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  CI/CD Pipeline Visual
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Developer Pushes Code] 
       |
       v
[Pre-Commit Checks: Lint + Tests]
       |
       v
[Pull Request Review]
       |
       v
[CI/CD: Build + Deploy + Monitor]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔐 Security &amp;amp; ♿ Accessibility First
&lt;/h2&gt;

&lt;p&gt;Security isn’t optional. Accessibility isn’t either.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security Checklist
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prevent &lt;strong&gt;XSS&lt;/strong&gt; and &lt;strong&gt;CSRF&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Proper &lt;strong&gt;token handling&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Input validation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Accessibility Checklist
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keyboard navigation works&lt;/li&gt;
&lt;li&gt;Screen reader support&lt;/li&gt;
&lt;li&gt;High color contrast&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Use tools like &lt;code&gt;axe-core&lt;/code&gt;, &lt;code&gt;eslint-plugin-jsx-a11y&lt;/code&gt;, &lt;code&gt;lighthouse&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ Automation + Monitoring
&lt;/h2&gt;

&lt;p&gt;Big teams don’t just ship code — they ship &lt;strong&gt;pipelines + monitoring&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;CI/CD Pipelines&lt;/strong&gt; (GitHub Actions, Vercel, CircleCI)&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Bundle Size Alerts&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Error Tracking&lt;/strong&gt; (Sentry, Datadog)&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Performance Monitoring&lt;/strong&gt; (Lighthouse CI)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📊 Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Commit → Build → Test → Deploy → Monitor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📊 Documentation &amp;amp; Knowledge Sharing
&lt;/h2&gt;

&lt;p&gt;What separates side projects from &lt;strong&gt;real companies&lt;/strong&gt;?&lt;br&gt;
👉 &lt;strong&gt;Documentation.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Good README.md = contributor-friendly.&lt;/li&gt;
&lt;li&gt;Internal wikis = no lost knowledge.&lt;/li&gt;
&lt;li&gt;ADRs (Architecture Decision Records) = explain &lt;strong&gt;why&lt;/strong&gt; choices were made.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  🧠 Culture = The Invisible Architecture
&lt;/h2&gt;

&lt;p&gt;Finally — even the best code structure won’t save a bad culture. Big companies win because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PRs = conversations, not fights.&lt;/li&gt;
&lt;li&gt;Automation is embraced, not skipped.&lt;/li&gt;
&lt;li&gt;Docs + testing = part of “done”.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the end of the day, &lt;strong&gt;how your team codes together&lt;/strong&gt; matters more than folder names.&lt;/p&gt;


&lt;h2&gt;
  
  
  🚦 Startup vs Big Tech Codebases
&lt;/h2&gt;

&lt;p&gt;Here’s how things usually look in a scrappy startup vs how big tech organizes for scale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────┐           ┌─────────────────────────┐
│   Startup Style    │           │     Big Tech Style      │
├────────────────────┤           ├─────────────────────────┤
│ 📂 components/     │           │ 📂 features/auth/       │
│ 📂 hooks/          │           │   ├── components/       │
│ 📂 services/       │           │   ├── hooks/            │
│ App.jsx            │           │   └── services/         │
│                    │           │ 📂 shared/ui/           │
│ Easy to start ✅   │           │ 📂 shared/utils/        │
│ Chaos later ❌      │           │ Scales across teams ✅  │
└────────────────────┘           └─────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📊 Side-by-Side Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Startup-Style&lt;/th&gt;
&lt;th&gt;Big Tech-Style&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Folder Structure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flat, all components in one folder&lt;/td&gt;
&lt;td&gt;Feature-based, modular&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom each time&lt;/td&gt;
&lt;td&gt;Design system + UI library&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Logic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Inside components&lt;/td&gt;
&lt;td&gt;Extracted into hooks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code Quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Quick fixes, less abstraction&lt;/td&gt;
&lt;td&gt;Clean code, reusable patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reviews&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ad-hoc, sometimes skipped&lt;/td&gt;
&lt;td&gt;Mandatory PRs + linting + tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Later concern&lt;/td&gt;
&lt;td&gt;Built-in from day one&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Docs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;README only&lt;/td&gt;
&lt;td&gt;Wikis + ADRs + onboarding guides&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Culture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;“Move fast, break things”&lt;/td&gt;
&lt;td&gt;“Move fast, don’t break prod”&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🎯 Final Takeaway
&lt;/h2&gt;

&lt;p&gt;Big companies scale not because their engineers are magical unicorns 🦄, but because they follow &lt;strong&gt;battle-tested practices&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Organize code by &lt;strong&gt;features&lt;/strong&gt;, not file types.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;design systems&lt;/strong&gt; for UI.&lt;/li&gt;
&lt;li&gt;Extract logic into &lt;strong&gt;hooks&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Follow &lt;strong&gt;clean code principles&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Automate reviews + pipelines.&lt;/li&gt;
&lt;li&gt;Prioritize &lt;strong&gt;security &amp;amp; accessibility&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Write &lt;strong&gt;documentation&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Build a &lt;strong&gt;team culture&lt;/strong&gt; around consistency.&lt;/li&gt;
&lt;li&gt;Startups often optimize for &lt;strong&gt;speed&lt;/strong&gt;, but end up with technical debt.&lt;/li&gt;
&lt;li&gt;Big Tech optimizes for &lt;strong&gt;scalability&lt;/strong&gt;, making onboarding, reviews, and growth smooth.&lt;/li&gt;
&lt;li&gt;The sweet spot? Start with startup speed, but gradually &lt;strong&gt;adopt Big Tech practices&lt;/strong&gt; before chaos hits.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Happy Hacking 👋
&lt;/h2&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
    <item>
      <title>WSL2 TensorFlow GPU Setup – RTX 4060 + Ubuntu 22.04 + CUDA 12.2 + cuDNN</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Wed, 16 Apr 2025 16:36:12 +0000</pubDate>
      <link>https://forem.com/mursalfk/wsl2-tensorflow-gpu-setup-rtx-4060-ubuntu-2204-cuda-122-cudnn-361h</link>
      <guid>https://forem.com/mursalfk/wsl2-tensorflow-gpu-setup-rtx-4060-ubuntu-2204-cuda-122-cudnn-361h</guid>
      <description>&lt;h3&gt;
  
  
  🛠 Prereqs:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;WSL2 enabled on Windows&lt;/li&gt;
&lt;li&gt;NVIDIA GPU driver (≥ v535) installed on Windows&lt;/li&gt;
&lt;li&gt;Ubuntu 22.04 installed via WSL&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.nvidia.com/cudnn-downloads" rel="noopener noreferrer"&gt;Download cuDNN manually&lt;/a&gt; while logged in to NVIDIA Developer&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🧱 1. Setup Python environment in Ubuntu (WSL2)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;python3.11 python3.11-venv gcc-11 g++-11 &lt;span class="nt"&gt;-y&lt;/span&gt;
python3.11 &lt;span class="nt"&gt;-m&lt;/span&gt; venv ~/tf-gpu-311
&lt;span class="nb"&gt;source&lt;/span&gt; ~/tf-gpu-311/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; pip
pip &lt;span class="nb"&gt;install &lt;/span&gt;&lt;span class="nv"&gt;tensorflow&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.15.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  ⚙️ 2. Install CUDA 12.2 (runfile method)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x cuda_12.2.0_535.54.03_linux.run
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./cuda_12.2.0_535.54.03_linux.run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;❌ Uncheck the driver install (you're using the Windows one)&lt;/li&gt;
&lt;li&gt;✅ Install toolkit only&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then set up the environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'export PATH=/usr/local/cuda-12.2/bin:$PATH'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
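&lt;p&gt;A quick way to confirm the exports took effect is to check the environment from Python. This is only a sketch: the helper name is made up, and it assumes the toolkit landed in &lt;code&gt;/usr/local/cuda-12.2&lt;/code&gt; as in the commands above.&lt;/p&gt;

```python
import os

# Install location used in the export lines above.
CUDA_HOME = "/usr/local/cuda-12.2"


def missing_cuda_paths(env):
    """Return the variable names that do not include the CUDA 12.2 directories."""
    expected = {
        "PATH": CUDA_HOME + "/bin",
        "LD_LIBRARY_PATH": CUDA_HOME + "/lib64",
    }
    missing = []
    for name, directory in expected.items():
        entries = env.get(name, "").split(":")
        if directory not in entries:
            missing.append(name)
    return missing


# An empty list means both variables point at the CUDA 12.2 toolkit.
print(missing_cuda_paths(os.environ))
```

&lt;p&gt;If a name shows up in the list, open a new shell (or re-run &lt;code&gt;source ~/.bashrc&lt;/code&gt;) and check again before moving on to cuDNN.&lt;/p&gt;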






&lt;h3&gt;
  
  
  📦 3. Install cuDNN 8.9.4 (for CUDA 12.x)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://developer.nvidia.com/cudnn-downloads" rel="noopener noreferrer"&gt;cuDNN Downloads&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Log in → Choose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Version: &lt;strong&gt;cuDNN 8.9.4&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;OS: &lt;strong&gt;Linux&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Installer: &lt;strong&gt;Local Installer for Ubuntu22.04 x86_64 (.deb)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Download the &lt;code&gt;.deb&lt;/code&gt; file and copy it into Ubuntu:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; /mnt/c/Users/&amp;lt;yourname&amp;gt;/Downloads/cudnn-local-repo-ubuntu2204-8.9.4.25_1.0-1_amd64.deb ~/
&lt;span class="nb"&gt;cd&lt;/span&gt; ~
&lt;span class="nb"&gt;sudo &lt;/span&gt;dpkg &lt;span class="nt"&gt;-i&lt;/span&gt; cudnn-local-repo-ubuntu2204-8.9.4.25_1.0-1_amd64.deb
&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /var/cudnn-local-repo-ubuntu2204-8.9.4.25/cudnn-&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="nt"&gt;-keyring&lt;/span&gt;.gpg /usr/share/keyrings/
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;libcudnn8 libcudnn8-dev &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🔄 4. Restart WSL2
&lt;/h3&gt;

&lt;p&gt;In &lt;strong&gt;PowerShell (Windows)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;wsl&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--shutdown&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open Ubuntu again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/tf-gpu-311/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  ✅ 5. Test TensorFlow GPU
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TensorFlow version:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__version__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GPUs available:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_physical_devices&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;GPU&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUDA enabled:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_built_with_cuda&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GPU name:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gpu_device_name&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  💚 Expected Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;GPUs&lt;/span&gt; &lt;span class="n"&gt;available&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;PhysicalDevice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/physical_device:GPU:0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;GPU&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;CUDA&lt;/span&gt; &lt;span class="n"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="n"&gt;GPU&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;GPU&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Happy TensorFlowing, GPU wizard 🧙‍♂️⚡&lt;/p&gt;

</description>
      <category>linux</category>
      <category>machinelearning</category>
      <category>cuda</category>
      <category>gpu</category>
    </item>
    <item>
      <title>Smart Image Tagging on AWS - The Finale</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Fri, 04 Apr 2025 12:10:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/smart-image-tagging-on-aws-the-finale-303</link>
      <guid>https://forem.com/aws-builders/smart-image-tagging-on-aws-the-finale-303</guid>
      <description>&lt;h2&gt;
  
  
  Hello and Greetings Everyone 👋
&lt;/h2&gt;

&lt;p&gt;I hope you are enjoying our journey of building an image auto-tagging application using &lt;strong&gt;Amazon Rekognition&lt;/strong&gt;, &lt;strong&gt;AWS Lambda&lt;/strong&gt;, and &lt;strong&gt;S3&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recap
&lt;/h2&gt;

&lt;p&gt;Let's do a quick recap of what we did in the last two parts. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;a href="https://dev.to/aws-builders/auto-tag-images-on-aws-using-amazon-rekognition-lambda-s3-1d1h"&gt;Auto-Tag Images on AWS using Amazon Rekognition + Lambda + S3&lt;/a&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Created an S3 bucket to store uploaded images&lt;/li&gt;
&lt;li&gt;Created a Lambda function to handle image upload events&lt;/li&gt;
&lt;li&gt;Set up IAM roles and permissions for Rekognition and S3 access&lt;/li&gt;
&lt;li&gt;Used Amazon Rekognition to detect objects/scenes in images&lt;/li&gt;
&lt;li&gt;Logged detected labels to CloudWatch Logs&lt;/li&gt;
&lt;li&gt;Connected S3 and Lambda using event notifications&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. &lt;a href="https://dev.to/aws-builders/level-up-your-auto-tagging-pipeline-on-aws-4abj"&gt;Level Up Your Auto-Tagging Pipeline on AWS&lt;/a&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Uploaded test images and successfully triggered Lambda&lt;/li&gt;
&lt;li&gt;Viewed detection results in CloudWatch logs&lt;/li&gt;
&lt;li&gt;Improved the Lambda logic to store Rekognition labels as .json files inside a tags/ folder in S3&lt;/li&gt;
&lt;li&gt;Validated that the full pipeline was working end-to-end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That should have jogged your memory. In this article, we are going to cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sending real-time email notifications using Amazon SNS&lt;/li&gt;
&lt;li&gt;Wrapping up the entire series with final thoughts and next steps&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Let's Go 🚀
&lt;/h2&gt;




&lt;h2&gt;
  
  
  Send Real-Time Notifications via Amazon SNS
&lt;/h2&gt;

&lt;p&gt;As discussed, we have already built the pipeline that auto-tags an image and saves the labels in a &lt;code&gt;JSON&lt;/code&gt; file. What could be cooler than getting an email whenever that happens? In this step, we will enhance our pipeline by integrating &lt;strong&gt;Amazon SNS&lt;/strong&gt; to send a notification every time an image is tagged.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create an SNS Topic
&lt;/h3&gt;

&lt;p&gt;Like every other AWS service, we first need to go to the &lt;strong&gt;AWS Console&lt;/strong&gt; and find &lt;strong&gt;SNS &lt;em&gt;(Simple Notification Service)&lt;/em&gt;&lt;/strong&gt;. On the SNS homepage, type your topic's name, &lt;code&gt;ImageTaggingNotifications&lt;/code&gt;, in the &lt;strong&gt;Create Topic&lt;/strong&gt; textbox (as shown in the figure below), then click the &lt;strong&gt;Next Step&lt;/strong&gt; button. On the next screen, select &lt;strong&gt;Standard&lt;/strong&gt; as the type, leave everything else at its default, and click &lt;strong&gt;Create Topic&lt;/strong&gt; at the bottom of the screen.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5kujd4m70yx06cihnc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5kujd4m70yx06cihnc9.png" alt="create topic" width="800" height="852"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our friend, the green notification box, tells us that the topic has been created. Next, click the orange &lt;strong&gt;Create Subscription&lt;/strong&gt; button, select &lt;strong&gt;Email&lt;/strong&gt; under &lt;em&gt;Protocol&lt;/em&gt;, and type your email address under &lt;em&gt;Endpoint&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrtfa9qe646vjuq6l04z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrtfa9qe646vjuq6l04z.png" alt="topiccreated" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All done? Click on &lt;strong&gt;Create Subscription.&lt;/strong&gt; As soon as the subscription is created, check your inbox for the confirmation email from AWS and click the link in it to confirm the subscription.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuudsc2aetcdcuvuwcf5i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuudsc2aetcdcuvuwcf5i.png" alt="subscription created" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Remember:&lt;/strong&gt; Don't skip/miss this step or you won't get any emails.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The confirmation email looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomxcitctm3t2rdo8jjdn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomxcitctm3t2rdo8jjdn.png" alt="confirmation email" width="790" height="449"&gt;&lt;/a&gt;&lt;/p&gt;
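
&lt;p&gt;If you prefer code over clicking, the same two console steps can be sketched with boto3. This is only a minimal sketch: the helper name &lt;code&gt;create_topic_with_email&lt;/code&gt; and the example address are my own placeholders, and it assumes you pass in a boto3 SNS client that has valid credentials. The confirmation email still has to be clicked either way.&lt;/p&gt;

```python
def create_topic_with_email(sns_client, topic_name, email_address):
    """Create (or look up) an SNS topic and subscribe an email address to it.

    sns_client is expected to be a boto3 SNS client, e.g. boto3.client("sns").
    """
    # create_topic is idempotent: if you already own a topic with this name,
    # its existing ARN is returned instead of creating a duplicate.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]

    # The subscription stays pending until the link in the
    # confirmation email is clicked.
    sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email_address,
    )
    return topic_arn


# Example usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   arn = create_topic_with_email(
#       boto3.client("sns"), "ImageTaggingNotifications", "you@example.com"
#   )
#   print(arn)
```

&lt;p&gt;The returned ARN is the value you will need later for the IAM policy and for &lt;code&gt;TOPIC_ARN&lt;/code&gt; in the Lambda function.&lt;/p&gt;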

&lt;h3&gt;
  
  
  Give Lambda Permission to Publish to SNS
&lt;/h3&gt;

&lt;p&gt;After successfully creating the topic and subscribing to it with your email, we need to go to &lt;strong&gt;IAM&lt;/strong&gt; &amp;gt; &lt;strong&gt;Roles&lt;/strong&gt; &amp;gt; &lt;strong&gt;RekognitionLambdaRole&lt;/strong&gt; &amp;gt; &lt;strong&gt;Add Permissions&lt;/strong&gt; &amp;gt; &lt;strong&gt;Create Inline Policy&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On the next screen, click &lt;code&gt;JSON&lt;/code&gt; and paste the following policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sns:Publish"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:sns:your-region:your-account-id:ImageTaggingNotifications"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Don't forget to replace &lt;code&gt;your-region&lt;/code&gt; (e.g. us-east-1) and &lt;code&gt;your-account-id&lt;/code&gt; (e.g. 123456789012) with your own values.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Click &lt;strong&gt;Next&lt;/strong&gt; and, on the next screen, give the policy a descriptive name such as &lt;code&gt;AllowSnsPublishPolicy&lt;/code&gt;. Then click &lt;strong&gt;Create Policy&lt;/strong&gt; to finish the policy creation process.&lt;/p&gt;
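
&lt;p&gt;Typos in the ARN are the most common reason this policy silently fails, so it can help to generate the document instead of editing it by hand. The snippet below is a small sketch of that idea; &lt;code&gt;build_sns_publish_policy&lt;/code&gt; is a helper name I made up, and the region and account ID are the same placeholder values used above.&lt;/p&gt;

```python
import json


def build_sns_publish_policy(region, account_id, topic_name):
    """Return an IAM policy document allowing sns:Publish on a single topic."""
    topic_arn = f"arn:aws:sns:{region}:{account_id}:{topic_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sns:Publish",
                "Resource": topic_arn,
            }
        ],
    }


# Placeholder values: swap in your own region and 12-digit account ID.
policy = build_sns_publish_policy(
    "us-east-1", "123456789012", "ImageTaggingNotifications"
)
print(json.dumps(policy, indent=2))

# To attach it from code instead of the console, boto3's IAM client offers
# put_role_policy(RoleName=..., PolicyName=..., PolicyDocument=json.dumps(policy)).
```

&lt;p&gt;Scoping &lt;code&gt;Resource&lt;/code&gt; to one topic ARN, rather than &lt;code&gt;*&lt;/code&gt;, keeps the Lambda role limited to publishing on this topic only.&lt;/p&gt;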

&lt;h3&gt;
  
  
  Update the Lambda Code to Send Notifications
&lt;/h3&gt;

&lt;p&gt;After creating the policy, we need to tell the Lambda function when and how to send the notification. To do so, add the code given below right &lt;strong&gt;after&lt;/strong&gt; the step that saves the &lt;code&gt;.json&lt;/code&gt; file to &lt;strong&gt;S3&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sns&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;TOPIC_ARN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arn:aws:sns:us-east-1:896415180455:ImageTaggingNotifications&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# Replace with your actual ARN
&lt;/span&gt;
&lt;span class="c1"&gt;# Get top 3 labels
&lt;/span&gt;&lt;span class="n"&gt;top_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;labels_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;top_labels&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Image &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; was tagged with:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;labels_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Send SNS notification
&lt;/span&gt;&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;publish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;TopicArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOPIC_ARN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[Image Tagging Alert] New Image Tagged on S3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
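
&lt;p&gt;One case the snippet above does not cover is an image for which Rekognition returns no labels above the confidence threshold, which would produce an email with an empty label list. A possible way to handle that is to move the formatting into a small helper, sketched below. The name &lt;code&gt;build_notification&lt;/code&gt; and the fallback wording are my own assumptions, not part of the original pipeline.&lt;/p&gt;

```python
def build_notification(key, label_data, top_n=3):
    """Format the email body from the first top_n entries of label_data.

    label_data is a list of dicts with 'Name' and 'Confidence' keys,
    the same shape the Lambda function stores in S3.
    """
    top_labels = label_data[:top_n]
    if not top_labels:
        # Fallback text for images where nothing passed MinConfidence.
        return f"Image '{key}' was processed, but no labels were detected."

    labels_text = ", ".join(
        f"{label['Name']} ({label['Confidence']:.2f}%)" for label in top_labels
    )
    return f"Image '{key}' was tagged with:\n\n{labels_text}"


# Example with made-up labels:
sample_labels = [
    {"Name": "Dog", "Confidence": 98.5},
    {"Name": "Pet", "Confidence": 97.123},
]
print(build_notification("uploads/dog.jpg", sample_labels))
```

&lt;p&gt;The result can be passed straight to &lt;code&gt;sns.publish&lt;/code&gt; as the &lt;code&gt;Message&lt;/code&gt; argument.&lt;/p&gt;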



&lt;p&gt;If you would rather start from a complete version, replace the entire Lambda function code with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="n"&gt;rekognition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rekognition&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sns&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Replace with your actual SNS topic ARN
&lt;/span&gt;&lt;span class="n"&gt;TOPIC_ARN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;TOPIC-ARN&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Lambda Triggered at:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received event:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unquote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="c1"&gt;# ✅ Skip non-image files
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.jpg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.jpeg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skipping file &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — not a supported image format.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Skipped non-image file.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# 🔍 Step 1: Detect labels using Rekognition
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rekognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_labels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;MaxLabels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;MinConfidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 📋 Step 2: Format label data
&lt;/span&gt;        &lt;span class="n"&gt;label_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 💾 Step 3: Save labels as JSON to /tags/
&lt;/span&gt;        &lt;span class="n"&gt;image_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;tags_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;image_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tags_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;ContentType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved tags JSON to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tags_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 📬 Step 4: Send SNS notification with top labels
&lt;/span&gt;        &lt;span class="n"&gt;top_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;labels_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;top_labels&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Image &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; was tagged with:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;labels_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;publish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;TopicArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOPIC_ARN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[Image Tagging Alert] New Image Tagged on S3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Notification sent via SNS!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels detected, saved to S3, and notification sent!&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Replace TOPIC_ARN with your actual SNS Topic ARN.&lt;/p&gt;
&lt;/blockquote&gt;
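
&lt;p&gt;If you want to preview the email body without publishing anything to SNS, the two formatting lines can be tried locally. This is only a minimal sketch: &lt;code&gt;build_message&lt;/code&gt; and &lt;code&gt;sample_labels&lt;/code&gt; are hypothetical names, and it assumes &lt;code&gt;top_labels&lt;/code&gt; is simply the three highest-confidence labels.&lt;/p&gt;

```python
def build_message(key, labels, top_n=3):
    """Format the notification body for one image (hypothetical helper)."""
    # Assumption: the "top" labels are the top_n entries with the highest confidence.
    top_labels = sorted(labels, key=lambda item: item['Confidence'], reverse=True)[:top_n]
    labels_text = ', '.join(f"{item['Name']} ({item['Confidence']}%)" for item in top_labels)
    return f"Image '{key}' was tagged with:\n\n{labels_text}"


# Hypothetical sample data in the same shape the Lambda function works with.
sample_labels = [
    {'Name': 'Bag', 'Confidence': 95.62},
    {'Name': 'Document', 'Confidence': 85.53},
    {'Name': 'Receipt', 'Confidence': 85.53},
    {'Name': 'Text', 'Confidence': 85.53},
]

print(build_message('sample.jpg', sample_labels))
```

&lt;p&gt;Python's sort is stable, so labels with the same confidence keep the order they arrived in.&lt;/p&gt;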

&lt;h2&gt;
  
  
  And it's done!
&lt;/h2&gt;

&lt;p&gt;With this, our email notification pipeline is complete, so let's test it. Go to your S3 bucket and upload an image, just as we did in the previous article. If everything is wired up correctly, an email with the top 3 tags of the image should arrive shortly at the address you gave while creating the SNS Topic. The email will look something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c2zn9rob3g4h3fgpg7m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c2zn9rob3g4h3fgpg7m.png" alt="final email" width="794" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

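&lt;p&gt;If you don't get the email, go back through the steps and check what you might have missed (for example, an SNS email subscription that was never confirmed will not receive messages). And if you still can't find what went wrong, write to me in the comments below.&lt;/p&gt;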

&lt;h2&gt;
  
  
  🎉 Conclusion
&lt;/h2&gt;

&lt;p&gt;And that’s a wrap, folks! 🎬&lt;br&gt;
We’ve come a long way — from uploading images to S3 and detecting labels with Rekognition, to storing results, sending real-time notifications, and polishing our code like true builders 💪&lt;/p&gt;

&lt;p&gt;This serverless, scalable, and smart image tagging workflow is just one example of how you can combine AWS Lambda, Amazon S3, Rekognition, and SNS to automate powerful tasks with ease.&lt;/p&gt;

&lt;p&gt;Whether you’re building this for fun, a portfolio project, or the foundation of a real-world app — I hope this series inspired you to explore what’s possible on AWS.&lt;/p&gt;

&lt;p&gt;Thanks for sticking around till the end — and as always, if you enjoyed this, drop a ❤️, share it, or leave a comment.&lt;/p&gt;

&lt;p&gt;Until next time... Ciao! 👋&lt;/p&gt;

&lt;h2&gt;
  
  
  Happy Coding 🧑‍💻
&lt;/h2&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>computervision</category>
    </item>
    <item>
      <title>Level Up Your Auto-Tagging Pipeline on AWS</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Thu, 03 Apr 2025 12:05:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/level-up-your-auto-tagging-pipeline-on-aws-4abj</link>
      <guid>https://forem.com/aws-builders/level-up-your-auto-tagging-pipeline-on-aws-4abj</guid>
      <description>&lt;h2&gt;
  
  
  Hello and Greetings Everyone 👋
&lt;/h2&gt;

&lt;p&gt;I hope you enjoyed reading about and experimenting with your Auto-Tag application, built with Amazon Rekognition, AWS Lambda and S3, in the &lt;a href="https://dev.to/mursalfk/auto-tag-images-on-aws-using-amazon-rekognition-lambda-s3-1d1h"&gt;previous article&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It's a new day, and we don't want to leave everything hanging. Right? 😀 As promised, in this article we are going to discuss:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uploading a test image to S3&lt;/li&gt;
&lt;li&gt;Viewing the results in CloudWatch Logs&lt;/li&gt;
&lt;li&gt;Saving the tags as a JSON file in S3&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without wasting much of our time, &lt;strong&gt;Let's go 🚀&lt;/strong&gt;!&lt;/p&gt;

&lt;h2&gt;
  
  
  Uploading a Test Image to S3
&lt;/h2&gt;

&lt;p&gt;Since we have already built our basic pipeline with an S3 bucket, a Lambda function and Amazon Rekognition, it's time to put everything to the test. We are now going to upload a sample image and pray that our system detects and logs the labels automatically 😁&lt;/p&gt;

&lt;h3&gt;
  
  
  Choose a Test Image
&lt;/h3&gt;

&lt;p&gt;If you remember, we added file-extension suffixes to our setup, so we now have to be careful about the extension of our images. We can pick any &lt;code&gt;.jpg&lt;/code&gt;, &lt;code&gt;.jpeg&lt;/code&gt; or &lt;code&gt;.png&lt;/code&gt; image. You can use a photo from your phone/computer or a sample image from the internet, but &lt;strong&gt;remember&lt;/strong&gt; to keep it small and simple for quick processing, since we are still in the testing phase.&lt;/p&gt;

&lt;h3&gt;
  
  
  Upload via AWS Console
&lt;/h3&gt;

&lt;p&gt;Now that we have selected the image, we are going to upload it to our S3 bucket using the AWS Console. I hope you haven't logged out of the AWS Console (if you have, sign in again...)&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to your &lt;strong&gt;S3 Bucket&lt;/strong&gt; (for me, &lt;code&gt;image-tagger-mursal-april-five&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;Upload&lt;/strong&gt; button and click &lt;strong&gt;Add files&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73le75ag4qs9cml52a4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73le75ag4qs9cml52a4b.png" alt="clicking the upload button" width="800" height="793"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Browse to the image on your phone or computer and select it. The selected image will then appear in the file list.&lt;/li&gt;
&lt;li&gt;Leave all the other settings as default and directly click &lt;strong&gt;Upload&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1j4faguxm7p5b9hhj3u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1j4faguxm7p5b9hhj3u.png" alt="image uploaded" width="800" height="592"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Once it's done, click &lt;strong&gt;Close&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Doing this will immediately trigger our Lambda function behind the scenes. Once the image is uploaded, Lambda will point Amazon Rekognition at the image in S3 to detect objects and scenes. It will also log everything to CloudWatch in the end. Let's see what's been logged for our image (Excited? 😜)&lt;/p&gt;

&lt;h2&gt;
  
  
  View Detected Labels in CloudWatch Logs
&lt;/h2&gt;

&lt;p&gt;Now that our image is uploaded, we need to see what Rekognition detected. That's where &lt;strong&gt;Amazon CloudWatch Logs&lt;/strong&gt; comes in. Every time Lambda runs, everything our function prints, including the labels, is written to CloudWatch. Let's take a look.&lt;/p&gt;

&lt;h3&gt;
  
  
  Navigate to CloudWatch Logs
&lt;/h3&gt;

&lt;p&gt;In the AWS Console, search for &lt;strong&gt;CloudWatch&lt;/strong&gt; in the services menu and open it. Click &lt;strong&gt;Logs&lt;/strong&gt; in the left sidebar, followed by &lt;strong&gt;Log groups&lt;/strong&gt;. We are looking for a log group named something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/aws/lambda/AutoTagImageFunction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4ubvnnl1yyh9fb4qbld.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4ubvnnl1yyh9fb4qbld.png" alt="cloudwatch logs view" width="800" height="864"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click that log group, then click the most recent &lt;strong&gt;log stream&lt;/strong&gt; (check that it matches your upload timestamp). Inside the log stream, we will find logs like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;Received event: {
  "Records": [
    {
      ...
      "s3": {
        "bucket": {
          "name": "image-tagger-mursal-april-five"
        },
        "object": {
          "key": "sample.jpg"
        }
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
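
&lt;p&gt;The handler reads the bucket name and object key out of this event. Here is a small sketch you can run locally to see how that lookup behaves; &lt;code&gt;extract_bucket_and_key&lt;/code&gt; and &lt;code&gt;sample_event&lt;/code&gt; are hypothetical names, and the event is trimmed down to just the fields we use. S3 URL-encodes object keys in event notifications, so a file name with spaces arrives with &lt;code&gt;+&lt;/code&gt; signs and has to be decoded.&lt;/p&gt;

```python
import urllib.parse


def extract_bucket_and_key(event):
    """Pull the bucket name and decoded object key from an S3 event (hypothetical helper)."""
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    # Keys are URL-encoded in the event, so decode '+' and '%xx' sequences.
    key = urllib.parse.unquote_plus(record['s3']['object']['key'])
    return bucket, key


# Hypothetical, trimmed-down event with only the fields read above.
sample_event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'image-tagger-mursal-april-five'},
            'object': {'key': 'WhatsApp+Image+2025-03-23+at+16.28.43_45962656.jpg'},
        }
    }]
}

print(extract_bucket_and_key(sample_event))
```

&lt;p&gt;Without the decoding step, Rekognition would be asked for a key containing &lt;code&gt;+&lt;/code&gt; signs, which does not match the real object name.&lt;/p&gt;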



&lt;p&gt;Followed by the Rekognition output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;
2025-04-03T05:30:46.444Z Labels detected for WhatsApp Image 2025-03-23 at 16.28.43_45962656.jpg:
2025-04-03T05:30:46.444Z - Adult (99.23%)
2025-04-03T05:30:46.444Z - Male (99.23%)
2025-04-03T05:30:46.444Z - Man (99.23%)
2025-04-03T05:30:46.444Z - Person (99.23%)
2025-04-03T05:30:46.444Z - Standing (98.34%)
2025-04-03T05:30:46.444Z - Head (98.21%)
2025-04-03T05:30:46.444Z - Face (97.73%)
2025-04-03T05:30:46.444Z - Smile (97.26%)
2025-04-03T05:30:46.444Z - Jacket (96.45%)
2025-04-03T05:30:46.444Z - Portrait (95.66%)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Save Detected Labels as a JSON File in S3
&lt;/h2&gt;

&lt;p&gt;Right now, we only log the labels to CloudWatch. To keep a permanent record for each image, we are also going to store those labels as a JSON file in a &lt;code&gt;tags/&lt;/code&gt; folder in our existing S3 bucket. This is going to be our intended folder structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;image-tagger-mursal-april-five/&lt;/span&gt;
&lt;span class="s"&gt;├── WhatsApp Image 2025-03-23 at 16.28.43_45962656.jpg&lt;/span&gt;
&lt;span class="s"&gt;└── tags/&lt;/span&gt;
    &lt;span class="s"&gt;└── WhatsApp Image 2025-03-23 at 16.28.43_45962656.json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



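&lt;p&gt;&lt;strong&gt;Remember:&lt;/strong&gt; You can create your &lt;code&gt;tags&lt;/code&gt; folder in the console before moving forward, but it is optional. S3 treats &lt;code&gt;tags/&lt;/code&gt; as a key prefix, so the folder appears on its own as soon as Lambda writes the first JSON file.&lt;/p&gt;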

&lt;p&gt;For that, we need to update our Lambda Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt;

&lt;span class="n"&gt;rekognition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rekognition&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received event:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unquote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Step 1: Detect labels
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rekognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_labels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;MaxLabels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;MinConfidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Labels detected for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;label_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Step 2: Save labels as JSON to /tags/
&lt;/span&gt;        &lt;span class="n"&gt;tags_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tags_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;ContentType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved tags JSON to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tags_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels detected and saved to S3!&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
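
&lt;p&gt;The one-liner that builds &lt;code&gt;tags_key&lt;/code&gt; is a bit dense, so here is the same logic unpacked into a tiny function you can try locally. &lt;code&gt;tags_key_for&lt;/code&gt; is a hypothetical name used only for illustration, and &lt;code&gt;uploads/cat.png&lt;/code&gt; is a made-up key.&lt;/p&gt;

```python
def tags_key_for(key):
    """Map an image key to the key of its tags file (hypothetical helper)."""
    filename = key.rsplit('/', 1)[-1]  # drop any folder prefix
    stem = filename.rsplit('.', 1)[0]  # drop only the last extension
    return f"tags/{stem}.json"


print(tags_key_for('WhatsApp Image 2025-03-23 at 16.28.43_45962656.jpg'))
print(tags_key_for('uploads/cat.png'))
```

&lt;p&gt;Only the last dot is treated as the extension separator, so dots inside the file name survive. One thing to keep in mind: two images that share a name but differ in folder or extension would map to the same tags file, and the later one would overwrite the earlier one.&lt;/p&gt;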



&lt;p&gt;Once done, &lt;strong&gt;Deploy&lt;/strong&gt; the new code in our &lt;strong&gt;Lambda Function&lt;/strong&gt;. Then let's test it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload a new Image to S3&lt;/li&gt;
&lt;li&gt;Make sure the image is either &lt;code&gt;.jpg&lt;/code&gt;, &lt;code&gt;.jpeg&lt;/code&gt; or &lt;code&gt;.png&lt;/code&gt; format.&lt;/li&gt;
&lt;li&gt;This will trigger the Lambda Function&lt;/li&gt;
&lt;li&gt;Lambda Function will have Amazon Rekognition detect the labels&lt;/li&gt;
&lt;li&gt;All the data will be saved in a &lt;code&gt;JSON&lt;/code&gt; file inside the &lt;code&gt;tags&lt;/code&gt; folder that you created earlier&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your &lt;code&gt;tags&lt;/code&gt; folder should now look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzvf2s5hq89i1gx0mplv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzvf2s5hq89i1gx0mplv.png" alt="Tags folder" width="800" height="709"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click the relevant &lt;code&gt;.json&lt;/code&gt; file to see the results. Here you will find the details of the tags file created for your uploaded image.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7jtu9hdfx2gp58qy1y1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7jtu9hdfx2gp58qy1y1.png" alt="Image description" width="800" height="828"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;strong&gt;Open&lt;/strong&gt; or &lt;strong&gt;Download&lt;/strong&gt; to access the file. Your &lt;code&gt;JSON&lt;/code&gt; file will have contents similar to the one below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bag"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;95.62&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Document"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;85.53&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Receipt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;85.53&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;85.53&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Plastic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;81.17&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Plastic Bag"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;81.17&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;BONUS:&lt;/strong&gt; You can also try to see your JSON file from the terminal with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://&amp;lt;bucket-name&amp;gt;/tags/&amp;lt;file-name&amp;gt;.json &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &amp;lt;file-name&amp;gt;.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Remember:&lt;/strong&gt; Don't forget to replace &lt;code&gt;&amp;lt;bucket-name&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;file-name&amp;gt;&lt;/code&gt; with the relevant names.&lt;/p&gt;
&lt;/blockquote&gt;
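
&lt;p&gt;Once the file is on your machine, a few lines of Python can summarize it. This is only a minimal sketch: it assumes the file holds a JSON list of objects with &lt;code&gt;Name&lt;/code&gt; and &lt;code&gt;Confidence&lt;/code&gt; keys, like the output above, and the function name &lt;code&gt;labels_above&lt;/code&gt; and the threshold are made up for illustration.&lt;/p&gt;

```python
import json


def labels_above(labels_json, min_confidence):
    """Return label names whose confidence meets the threshold, highest first."""
    labels = json.loads(labels_json)
    kept = [item for item in labels if item["Confidence"] >= min_confidence]
    kept.sort(key=lambda item: item["Confidence"], reverse=True)
    return [item["Name"] for item in kept]


# Two sample entries copied from the tail of the output shown above.
sample = json.dumps([
    {"Name": "Text", "Confidence": 85.53},
    {"Name": "Plastic", "Confidence": 81.17},
])

print(labels_above(sample, 85))
```

&lt;p&gt;To run it against your own results, read the downloaded file's contents and pass them in place of &lt;code&gt;sample&lt;/code&gt;.&lt;/p&gt;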

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uezmjzlrlidinm1wqah.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uezmjzlrlidinm1wqah.gif" alt="ending git" width="498" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  You did it!
&lt;/h2&gt;

&lt;p&gt;Still hanging in there? You’re a legend! 😎&lt;br&gt;
That wraps up Part 2 of our Smart Serverless Image Tagging on AWS series. This time, we went beyond the basics — tested the full automation flow, explored CloudWatch logs, and even saved our Rekognition results as structured .json files directly in S3. Clean. Serverless. Smart. 💡&lt;/p&gt;

&lt;p&gt;In the next part, we’ll level it up with some bonus magic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Email notifications via SNS 📬&lt;/li&gt;
&lt;li&gt;Code modularization 🧩&lt;/li&gt;
&lt;li&gt;Maybe even a simple front-end to display your tagged images 👀&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Until then, keep building cool stuff, and as always, hit me up in the comments if anything is unclear! 😜&lt;/p&gt;

&lt;h2&gt;
  
  
  Ciao 👋
&lt;/h2&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>computervision</category>
    </item>
    <item>
      <title>Auto-Tag Images on AWS using Amazon Rekognition + Lambda + S3</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Wed, 02 Apr 2025 18:24:15 +0000</pubDate>
      <link>https://forem.com/aws-builders/auto-tag-images-on-aws-using-amazon-rekognition-lambda-s3-1d1h</link>
      <guid>https://forem.com/aws-builders/auto-tag-images-on-aws-using-amazon-rekognition-lambda-s3-1d1h</guid>
      <description>&lt;h2&gt;
  
  
  Hello and Greetings 👋 Everyone!
&lt;/h2&gt;

&lt;p&gt;I know it's been quite a while since my last article, and I sincerely apologize for the gap. But I am back with a bang 😉&lt;/p&gt;

&lt;p&gt;We have all been in a situation where we want to create something that makes our daily routine easier, right? Have you ever wanted to build a smart system that automatically tags images the moment they're uploaded (like Google Photos or Amazon Photos)? Or an application where you upload your vacay pictures and it instantly tells you what you are looking at: trees, mountains, people, emotions? And all that without writing any Machine Learning model yourself?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxcycv873k324bk6qlqwj.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxcycv873k324bk6qlqwj.gif" alt="GIF 1" width="498" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, we are going to build a fully serverless pipeline that'll:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Let the users upload an image to an &lt;strong&gt;S3 Bucket&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Trigger a &lt;em&gt;Lambda Function&lt;/em&gt; automatically&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Amazon Rekognition&lt;/strong&gt; to detect objects/scenes&lt;/li&gt;
&lt;li&gt;Store the generated labels or tags into &lt;strong&gt;DynamoDB&lt;/strong&gt; or, if you want, write them to an S3 JSON file&lt;/li&gt;
&lt;li&gt;Send a notification via &lt;strong&gt;SNS or Email&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I know it's quite a mouthful and therefore we will be doing a series of articles on this process. We'll need:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Tool&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Usage&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Amazon S3&lt;/td&gt;
&lt;td&gt;Store Images&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Amazon Rekognition&lt;/td&gt;
&lt;td&gt;Object and Scene Detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Lambda&lt;/td&gt;
&lt;td&gt;Tying everything together&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB or S3 JSON&lt;/td&gt;
&lt;td&gt;Store the Results&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Let's Go...! 🚀
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Step 1: Create an S3 Bucket and Enable Event Notifications
&lt;/h2&gt;

&lt;p&gt;In this step, all you need to do is grab your laptop or start your workstation, log in to your OS, open your browser and go to our favorite portal, i.e. &lt;a href="https://aws.amazon.com/" rel="noopener noreferrer"&gt;AWS Portal&lt;/a&gt;. Sign in and begin by creating an S3 Bucket. Users will upload their images to this bucket, and it will trigger a Lambda function every time a new image upload is detected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create S3 Bucket
&lt;/h3&gt;

&lt;p&gt;Simply open your AWS Console/CLI, and wait until the environment is ready to listen to you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftga3idy73w6m9fospsxo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftga3idy73w6m9fospsxo.png" alt="Step 1.1" width="800" height="865"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-bucket &lt;span class="nt"&gt;--bucket&lt;/span&gt; unique-bucket-name &lt;span class="nt"&gt;--region&lt;/span&gt; eu-west-3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
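&lt;p&gt;Remember to use a unique &lt;strong&gt;bucket name&lt;/strong&gt; in place of 'unique-bucket-name' &amp;amp; use your preferred region. For any region other than &lt;code&gt;us-east-1&lt;/code&gt;, the CLI also needs &lt;code&gt;--create-bucket-configuration LocationConstraint=&amp;lt;region&amp;gt;&lt;/code&gt; set to that same region.&lt;/p&gt;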
&lt;/blockquote&gt;
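
&lt;p&gt;If you would rather script this step, here is a minimal boto3 sketch. Outside &lt;code&gt;us-east-1&lt;/code&gt;, S3 expects the region to be passed as a &lt;code&gt;LocationConstraint&lt;/code&gt;, so the helper adds it only when needed. The function names &lt;code&gt;create_bucket_kwargs&lt;/code&gt; and &lt;code&gt;create_bucket&lt;/code&gt; are my own, and the actual call assumes boto3 is installed and your AWS credentials are configured.&lt;/p&gt;

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for S3 create_bucket in the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the helper above runs without boto3

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


# Only prints the arguments; call create_bucket(...) to really create it.
print(create_bucket_kwargs("unique-bucket-name", "eu-west-3"))
```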

&lt;h3&gt;
  
  
  Create &lt;em&gt;RekognitionLambdaRole&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Having created our S3 bucket, we need to create the IAM Role before moving forward. Simply follow these steps to do so:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to IAM &amp;gt; Roles &amp;gt; Create Role&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41w3tgpl8rqtrrjy85gk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41w3tgpl8rqtrrjy85gk.png" alt="create role" width="800" height="862"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Trusted Entity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose &lt;strong&gt;AWS Service&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use Case: &lt;strong&gt;Lambda&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Next&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the next screen, attach the following policies and click &lt;strong&gt;Next&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AmazonRekognitionFullAccess&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmazonS3ReadOnlyAccess&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CloudWatchLogsFullAccess&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the next screen, name your role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Role Name: &lt;code&gt;RekognitionLambdaRole&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Click Create Role&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our new Role is ready!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3cvbojgcrxwu12f9xcjk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3cvbojgcrxwu12f9xcjk.png" alt="role ready" width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;
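
&lt;p&gt;For the curious, the console steps above roughly correspond to the boto3 sketch below. Choosing the Lambda use case amounts to a trust policy that lets &lt;code&gt;lambda.amazonaws.com&lt;/code&gt; assume the role, and the three policies are attached by their AWS-managed ARNs. The helper names are hypothetical, and &lt;code&gt;create_role&lt;/code&gt; assumes boto3 plus credentials that are allowed to manage IAM.&lt;/p&gt;

```python
import json

# Managed policies named in the steps above.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonRekognitionFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess",
]


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_role(role_name="RekognitionLambdaRole"):
    """Create the role and attach the policies; needs boto3 and credentials."""
    import boto3  # imported here so the policy helper runs without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    for arn in MANAGED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)


# Only prints the trust policy; call create_role() to really create the role.
print(json.dumps(lambda_trust_policy(), indent=2))
```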

&lt;h3&gt;
  
  
  Create Lambda Function
&lt;/h3&gt;

&lt;p&gt;Before moving forward, we need to create our Lambda Function, attach the correct IAM role, write its Python code and deploy it.&lt;/p&gt;

&lt;p&gt;Go to the AWS Lambda Console, and click &lt;strong&gt;Create function&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Fill in the following details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Author From Scratch&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Function Name: AutoTagImageFunction&lt;/li&gt;
&lt;li&gt;Runtime: &lt;code&gt;Python 3.12&lt;/code&gt;, or a newer version if available&lt;/li&gt;
&lt;li&gt;Permissions:

&lt;ul&gt;
&lt;li&gt;Select &lt;strong&gt;"Use an Existing Role"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Choose the role we created earlier, i.e. &lt;code&gt;RekognitionLambdaRole&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Click &lt;em&gt;Create Function&lt;/em&gt; to complete the function creation process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0gy0ov6lzn4jxzbtauf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0gy0ov6lzn4jxzbtauf.png" alt="function created" width="800" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congratulations 🎉 Our Lambda Function is created. But wait, we are not done with it yet.&lt;/p&gt;

&lt;p&gt;In the next screen, replace the default code in the code editor with the following Python code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt;

&lt;span class="n"&gt;rekognition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rekognition&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received event:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Get the bucket and object key
&lt;/span&gt;    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unquote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Call Rekognition
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rekognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_labels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;MaxLabels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;MinConfidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Labels detected for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Labels detected and logged!&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next step is to &lt;em&gt;Deploy&lt;/em&gt; and &lt;em&gt;Test&lt;/em&gt;.&lt;br&gt;
Simply click &lt;strong&gt;Deploy&lt;/strong&gt; to save the function. We will test this function later.&lt;/p&gt;
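
&lt;p&gt;Before wiring up the trigger, it helps to see what the handler pulls out of the incoming event. The sketch below repeats only the bucket and key extraction so you can run it locally without AWS. The &lt;code&gt;s3_location&lt;/code&gt; helper and the trimmed-down &lt;code&gt;sample_event&lt;/code&gt; are made up for illustration; a real S3 event carries many more fields.&lt;/p&gt;

```python
import urllib.parse


def s3_location(event):
    """Return (bucket, key) from the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Keys arrive URL-encoded in the event, e.g. a space shows up as "+".
    key = urllib.parse.unquote_plus(record["object"]["key"])
    return bucket, key


# Trimmed-down test event with placeholder names.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "unique-bucket-name"},
                "object": {"key": "vacay+pics/beach+day.jpg"},
            }
        }
    ]
}

print(s3_location(sample_event))
```

&lt;p&gt;Without &lt;code&gt;unquote_plus&lt;/code&gt;, Rekognition would be asked for an object named &lt;code&gt;vacay+pics/beach+day.jpg&lt;/code&gt;, which does not exist in the bucket.&lt;/p&gt;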
&lt;h3&gt;
  
  
  Enable Event Notifications for Uploads
&lt;/h3&gt;

&lt;p&gt;Our S3 bucket and Lambda Function are now in place. The next step is to make the bucket trigger the Lambda Function automatically whenever a new image (for example a &lt;em&gt;.jpeg&lt;/em&gt;) is uploaded. For that, we need two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lambda Function ARN (Will get this after this step is done)&lt;/li&gt;
&lt;li&gt;Permission for S3 to trigger our Lambda Function&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;First, go to the S3 bucket that we just created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0tqe0hxy7z8pigfiiu5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0tqe0hxy7z8pigfiiu5.png" alt="1.2.1" width="800" height="860"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then navigate to the &lt;strong&gt;Properties&lt;/strong&gt; tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej8hdw0a1oc4kgjga08h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej8hdw0a1oc4kgjga08h.png" alt="1.2.2" width="800" height="839"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Scroll to 'Event Notifications' and Click 'Create Event Notification'&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrnd4l786l77bc5x1ub8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrnd4l786l77bc5x1ub8.png" alt="notifications pic" width="800" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fill in the following Details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name: &lt;code&gt;onImageUpload&lt;/code&gt;
&lt;/li&gt;
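&lt;li&gt;Suffix Filter: e.g. &lt;code&gt;.jpg&lt;/code&gt;. A notification accepts a single suffix, so create one notification each for &lt;code&gt;.jpg&lt;/code&gt;, &lt;code&gt;.jpeg&lt;/code&gt; and &lt;code&gt;.png&lt;/code&gt;, or leave the field empty to trigger on every upload&lt;/li&gt;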
&lt;li&gt;Event Type: &lt;code&gt;PUT&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Destination: Select &lt;strong&gt;Lambda Function&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Choose your lambda function that we created earlier.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
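
&lt;p&gt;The same settings can be expressed as a notification configuration and applied with boto3. The sketch below builds one rule per suffix, since each rule takes a single suffix filter. The helper names and the example ARN are placeholders of mine, and &lt;code&gt;apply_notification&lt;/code&gt; assumes boto3, valid credentials, and that S3 is already allowed to invoke the function.&lt;/p&gt;

```python
def notification_config(function_arn, suffixes):
    """Build an S3 notification configuration with one rule per suffix."""
    rules = []
    for suffix in suffixes:
        rules.append(
            {
                "Id": "onImageUpload" + suffix.replace(".", "-"),
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:Put"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        )
    return {"LambdaFunctionConfigurations": rules}


def apply_notification(bucket_name, function_arn, suffixes):
    """Apply the configuration; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above runs without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=notification_config(function_arn, suffixes),
    )


# Placeholder ARN for illustration only.
arn = "arn:aws:lambda:eu-west-3:123456789012:function:AutoTagImageFunction"
config = notification_config(arn, [".jpg", ".jpeg", ".png"])
print([rule["Id"] for rule in config["LambdaFunctionConfigurations"]])
```

&lt;p&gt;Keep in mind that this API call replaces the bucket's whole notification configuration, so include any rules you want to keep.&lt;/p&gt;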

&lt;p&gt;Having done all that, save the changes by clicking &lt;strong&gt;Save Changes.&lt;/strong&gt; Once that is done, we need to add permissions as well &lt;strong&gt;[REEEAALLLYY IMPORTANT ⚠️]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After the Lambda function is created, run the following command in the CLI to allow S3 to invoke it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda add-permission &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--function-name&lt;/span&gt; AutoTagImageFunction &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--principal&lt;/span&gt; s3.amazonaws.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--statement-id&lt;/span&gt; S3InvokePermission &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--action&lt;/span&gt; &lt;span class="s2"&gt;"lambda:InvokeFunction"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--source-arn&lt;/span&gt; arn:aws:s3:::image-tagger-mursal-april-five
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Make sure you replace &lt;code&gt;AutoTagImageFunction&lt;/code&gt; with your actual Lambda function name (if it's different 😉)&lt;br&gt;
Don't forget to replace &lt;code&gt;image-tagger-mursal-april-five&lt;/code&gt; with your actual S3 bucket name.&lt;/p&gt;
&lt;/blockquote&gt;
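
&lt;p&gt;If you prefer Python over the CLI, the command above maps onto the Lambda client's &lt;code&gt;add_permission&lt;/code&gt; call. This is a hedged sketch with helper names of my own; the actual call assumes boto3 and configured credentials.&lt;/p&gt;

```python
def invoke_permission_kwargs(function_name, bucket_name):
    """Keyword arguments for Lambda add_permission, mirroring the CLI call."""
    return {
        "FunctionName": function_name,
        "StatementId": "S3InvokePermission",
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        # Restrict the permission to events coming from this one bucket.
        "SourceArn": "arn:aws:s3:::" + bucket_name,
    }


def allow_s3_invoke(function_name, bucket_name):
    """Grant the permission; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the helper above runs without boto3

    client = boto3.client("lambda")
    return client.add_permission(
        **invoke_permission_kwargs(function_name, bucket_name)
    )


# Only prints the arguments; call allow_s3_invoke(...) to really grant it.
print(invoke_permission_kwargs("AutoTagImageFunction", "unique-bucket-name"))
```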

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk8shs5cl8p65blsvq88u.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk8shs5cl8p65blsvq88u.gif" alt="Image description" width="498" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Tired Already?
&lt;/h2&gt;

&lt;p&gt;Well, that's it for Part 1 of our &lt;strong&gt;Smart Serverless Image Tagging on AWS&lt;/strong&gt; series. In this article, we have set up the basic pipeline, i.e. S3, Lambda, and Rekognition, and enabled it to auto-detect labels from the uploaded images. In the next article, we are going to cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uploading test images to S3&lt;/li&gt;
&lt;li&gt;Viewing Label results in CloudWatch&lt;/li&gt;
&lt;li&gt;Saving tags to a JSON file in S3&lt;/li&gt;
&lt;li&gt;Sending Notifications via SNS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I guess we are going great. Let me know in the comments section if you guys want more of the AWS Stuff 😜&lt;/p&gt;

&lt;h2&gt;
  
  
  Ciao 👋
&lt;/h2&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>computervision</category>
    </item>
    <item>
      <title>Introduction | Graph Neural Networks (GNNs) &amp; Knowledge Graphs on AWS</title>
      <dc:creator>Mursal Furqan Kumbhar</dc:creator>
      <pubDate>Sat, 15 Mar 2025 08:18:16 +0000</pubDate>
      <link>https://forem.com/mursalfk/introduction-graph-neural-networks-gnns-knowledge-graphs-on-aws-f9m</link>
      <guid>https://forem.com/mursalfk/introduction-graph-neural-networks-gnns-knowledge-graphs-on-aws-f9m</guid>
      <description>&lt;p&gt;The rapid advancements in artificial intelligence (AI) have unlocked new ways to process and learn from &lt;strong&gt;complex, interconnected data&lt;/strong&gt;. While traditional deep learning models excel at structured and unstructured data, they struggle to capture relationships between entities. This is where &lt;strong&gt;Graph Neural Networks (GNNs)&lt;/strong&gt; and &lt;strong&gt;Knowledge Graphs&lt;/strong&gt; come in—offering a powerful way to model dependencies, enhance reasoning, and improve AI predictions.  &lt;/p&gt;

&lt;p&gt;In this article series, we’ll explore how &lt;strong&gt;AWS provides scalable infrastructure&lt;/strong&gt; for building, training, and deploying GNN-based AI models. From &lt;strong&gt;Amazon Neptune for knowledge graph storage&lt;/strong&gt; to &lt;strong&gt;SageMaker for graph-based ML&lt;/strong&gt;, we’ll dive into practical implementations and real-world use cases such as &lt;strong&gt;fraud detection, recommendation systems, and AI-powered search engines&lt;/strong&gt;.  &lt;/p&gt;

&lt;p&gt;This first article will introduce the core concepts of GNNs, why they matter, and how AWS enables scalable Graph AI. Let’s get started! 🚀&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What are Graph Neural Networks (GNNs)?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Graph Neural Networks (GNNs) are a class of deep learning models designed to process &lt;strong&gt;graph-structured data&lt;/strong&gt;. Unlike traditional neural networks that work on structured tabular data or unstructured images and text, GNNs can &lt;strong&gt;learn representations from nodes, edges, and their relationships&lt;/strong&gt; in a graph.  &lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;graph&lt;/strong&gt; consists of:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nodes (Vertices):&lt;/strong&gt; Entities in the dataset (e.g., users, products, molecules).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edges:&lt;/strong&gt; Relationships between entities (e.g., social connections, molecular bonds).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Features:&lt;/strong&gt; Attributes associated with nodes and edges (e.g., user preferences, product categories).
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GNNs use &lt;strong&gt;message passing&lt;/strong&gt; mechanisms to propagate information across the graph, enabling learning at the &lt;strong&gt;node level&lt;/strong&gt; (e.g., predicting node properties), the &lt;strong&gt;edge level&lt;/strong&gt; (e.g., link prediction), and the &lt;strong&gt;graph level&lt;/strong&gt; (e.g., graph classification).  &lt;/p&gt;
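
&lt;p&gt;To make message passing concrete, here is a toy sketch in plain Python. Each node replaces its feature vector with the average of its own vector and its neighbors' vectors. This is a deliberate simplification: real GNN layers also apply learned weights and a non-linearity, and the node names and the function &lt;code&gt;message_passing_step&lt;/code&gt; are invented for this example.&lt;/p&gt;

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats
    edges: list of (node, node) pairs
    """
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for node, own in features.items():
        # Each node averages its own vector with its neighbors' vectors.
        vectors = [own] + [features[n] for n in neighbors[node]]
        updated[node] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return updated


# Three users in a chain: alice - bob - carol.
features = {"alice": [1.0, 0.0], "bob": [0.0, 1.0], "carol": [0.0, 0.0]}
edges = [("alice", "bob"), ("bob", "carol")]
print(message_passing_step(features, edges))
```

&lt;p&gt;After one step, carol already carries part of bob's signal. A second step would also bring in information from alice, two hops away, which is how stacking layers widens each node's view of the graph.&lt;/p&gt;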




&lt;h3&gt;
  
  
  &lt;strong&gt;Importance of Graph-Based Learning in AI&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Graph-based learning has seen &lt;strong&gt;widespread adoption&lt;/strong&gt; in various AI-driven domains:  &lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;1. Social Network Analysis&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Predicting social connections (e.g., Facebook friend recommendations).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GNN Role:&lt;/strong&gt; Learning user embeddings based on mutual friends and shared interactions.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;2. Fraud Detection in Financial Transactions&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Detecting fraudulent transactions in banking and e-commerce.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GNN Role:&lt;/strong&gt; Identifying unusual patterns in transaction networks.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;3. Recommender Systems (Graph-Based Search &amp;amp; Recommendation)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Enhancing movie, product, and content recommendations (e.g., Netflix, Amazon).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GNN Role:&lt;/strong&gt; Capturing relationships between users and items to improve recommendations.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;4. Drug Discovery &amp;amp; Bioinformatics&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Predicting molecular interactions for drug design.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GNN Role:&lt;/strong&gt; Learning molecular structures and identifying potential drug candidates.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;5. Knowledge Graphs for NLP &amp;amp; AI Search&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Enhancing Large Language Models (LLMs) with knowledge graphs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GNN Role:&lt;/strong&gt; Learning entity and relation embeddings over the structured facts in the graph to support AI reasoning and retrieval.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;How AWS Provides Scalable Infrastructure for Graph ML&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AWS offers a &lt;strong&gt;robust suite of services&lt;/strong&gt; that support GNN-based AI applications, providing storage, training, and deployment capabilities.  &lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;1. Amazon Neptune (Managed Graph Database for Knowledge Graphs)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;A fully managed graph database optimized for high-performance querying.
&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;Gremlin, SPARQL, and openCypher&lt;/strong&gt; for flexible graph analytics.
&lt;/li&gt;
&lt;li&gt;Enables &lt;strong&gt;real-time graph-based search&lt;/strong&gt; for fraud detection, recommendations, and knowledge graphs.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;2. Amazon SageMaker (For Training &amp;amp; Deploying GNN Models)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Provides &lt;strong&gt;pre-built containers&lt;/strong&gt; for deep learning frameworks such as PyTorch, which can host graph libraries like &lt;strong&gt;PyTorch Geometric (PyG)&lt;/strong&gt; and &lt;strong&gt;Deep Graph Library (DGL)&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;distributed training&lt;/strong&gt; for large-scale graph processing.
&lt;/li&gt;
&lt;li&gt;Enables &lt;strong&gt;MLOps pipelines&lt;/strong&gt; for graph-based AI models.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;3. AWS Glue (Data Preprocessing for Graph ML)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Used to &lt;strong&gt;clean, transform, and link data&lt;/strong&gt; before ingesting into &lt;strong&gt;Neptune&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;automated ETL (Extract, Transform, Load)&lt;/strong&gt; for complex graph datasets.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;4. Amazon OpenSearch Service (Vector Search + Graph Embeddings for AI)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Stores &lt;strong&gt;graph-based embeddings&lt;/strong&gt; generated by GNN models.
&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;hybrid retrieval&lt;/strong&gt; for AI-powered search by pairing k-NN vector search over those embeddings with relationship queries served from a graph store such as Neptune.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;5. AWS Lambda &amp;amp; Step Functions (Serverless Graph AI Pipelines)&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Automates &lt;strong&gt;graph-based workflows&lt;/strong&gt; for real-time fraud detection and recommendation systems.
&lt;/li&gt;
&lt;li&gt;Enables &lt;strong&gt;event-driven graph learning&lt;/strong&gt; when combined with &lt;strong&gt;Kinesis&lt;/strong&gt; and &lt;strong&gt;Neptune Streams&lt;/strong&gt;.
&lt;/li&gt;
&lt;/ul&gt;
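
&lt;p&gt;To make the event-driven pattern more concrete, here is a minimal sketch of a Lambda handler that decodes Kinesis records into graph edges. Lambda delivers Kinesis payloads base64-encoded under &lt;code&gt;Records[].kinesis.data&lt;/code&gt;; the payload fields &lt;code&gt;sender&lt;/code&gt;, &lt;code&gt;receiver&lt;/code&gt;, and &lt;code&gt;amount&lt;/code&gt; are assumptions for illustration, and the write to the graph store is deliberately left out.&lt;/p&gt;

```python
import base64
import json


def handler(event, context):
    """Decode Kinesis records into (source, target, amount) edges.

    The payload fields "sender", "receiver", and "amount" are hypothetical;
    adapt them to the actual schema of the stream.
    """
    edges = []
    for record in event.get("Records", []):
        # Lambda delivers Kinesis payloads base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        txn = json.loads(payload)
        edges.append((txn["sender"], txn["receiver"], txn["amount"]))
    # A real pipeline would upsert these edges into the graph store here.
    return {"edge_count": len(edges), "edges": edges}


# Local smoke test with a hand-built event (hypothetical data).
sample = {"sender": "acct-1", "receiver": "acct-2", "amount": 42.5}
encoded = base64.b64encode(json.dumps(sample).encode()).decode()
fake_event = {"Records": [{"kinesis": {"data": encoded}}]}
result = handler(fake_event, None)
print(result)
```

&lt;p&gt;Keeping the handler a pure function of the event makes it easy to test locally before wiring it to a stream.&lt;/p&gt;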




&lt;h3&gt;
  
  
  &lt;strong&gt;Graph Neural Networks Implementation on AWS (Code Example)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let’s train a &lt;strong&gt;basic GNN&lt;/strong&gt; with &lt;strong&gt;Deep Graph Library (DGL)&lt;/strong&gt; in a notebook, then deploy it to an &lt;strong&gt;Amazon SageMaker&lt;/strong&gt; endpoint.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Step 1: Install Dependencies&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="err"&gt;!&lt;/span&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt; &lt;span class="n"&gt;dgl&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt; &lt;span class="n"&gt;sagemaker&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;strong&gt;Step 2: Load a Sample Graph Dataset&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dgl&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.optim&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;optim&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dgl.data&lt;/span&gt;

&lt;span class="c1"&gt;# Load a sample graph (Cora dataset)
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dgl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoraGraphDataset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Check graph structure
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Number of nodes: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;num_nodes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Number of edges: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;num_edges&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;strong&gt;Step 3: Define a GNN Model&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dgl.nn&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GraphConv&lt;/span&gt;  &lt;span class="c1"&gt;# import explicitly; `import dgl` alone may not expose dgl.nn
&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GCN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;in_feats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hidden_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_classes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GCN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GraphConv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;in_feats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hidden_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GraphConv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hidden_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_classes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;conv1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;conv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
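
&lt;p&gt;To give a feel for what each graph convolution does before its learned weights are applied, the sketch below reproduces only the neighbor aggregation with symmetric normalization, where each neighbor's features are scaled by 1 / sqrt(deg(i) * deg(j)). This corresponds to the default &lt;code&gt;norm='both'&lt;/code&gt; setting of DGL's &lt;code&gt;GraphConv&lt;/code&gt;; the toy graph, feature values, and the helper name &lt;code&gt;normalized_aggregate&lt;/code&gt; are assumptions, and the weight matrix and bias are omitted.&lt;/p&gt;

```python
import math

# Toy undirected graph (hypothetical): a path 0 - 1 - 2.
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Two features per node (hypothetical values).
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 4.0]}


def normalized_aggregate(neighbors, features):
    """Sum neighbor features scaled by 1 / sqrt(deg(i) * deg(j))."""
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    out = {}
    for i, nbrs in neighbors.items():
        agg = [0.0] * len(features[i])
        for j in nbrs:
            scale = 1.0 / math.sqrt(degree[i] * degree[j])
            for k, value in enumerate(features[j]):
                agg[k] += scale * value
        out[i] = agg
    return out


aggregated = normalized_aggregate(neighbors, features)
print(aggregated)
```

&lt;p&gt;The normalization keeps high-degree nodes from dominating the sum, which tends to stabilize training when layers are stacked.&lt;/p&gt;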



&lt;h4&gt;
  
  
  &lt;strong&gt;Step 4: Train the Model&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get feature and label data, plus the mask that marks training nodes
&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;feat&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;train_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train_mask&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize model
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GCN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_classes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Adam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;loss_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CrossEntropyLoss&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Training loop
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Compute the loss on training nodes only, so held-out labels stay unseen
&lt;/span&gt;    &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;loss_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logits&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;train_mask&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;train_mask&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zero_grad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Epoch &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, Loss: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
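
&lt;p&gt;Loss alone says little about how well the model generalizes, so it helps to measure accuracy on held-out nodes. The framework-free sketch below shows the idea of masked accuracy; the function name &lt;code&gt;masked_accuracy&lt;/code&gt; and the toy scores are assumptions for illustration. With the tensors from the steps above, the same computation would typically use &lt;code&gt;argmax&lt;/code&gt; together with the &lt;code&gt;test_mask&lt;/code&gt; stored in &lt;code&gt;graph.ndata&lt;/code&gt;.&lt;/p&gt;

```python
def masked_accuracy(logits, labels, mask):
    """Fraction of masked nodes whose highest-scoring class matches the label."""
    correct = 0
    total = 0
    for row, label, keep in zip(logits, labels, mask):
        if not keep:
            continue
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
        total += 1
    return correct / total if total else 0.0


# Hypothetical scores for 4 nodes and 3 classes.
logits = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.9, 0.8, 0.1],
    [0.1, 0.2, 3.0],
]
labels = [0, 1, 1, 2]
test_mask = [False, True, True, True]

accuracy = masked_accuracy(logits, labels, test_mask)
print(f"Test accuracy: {accuracy:.2f}")
```

&lt;p&gt;Only nodes selected by the mask count toward the score, so training nodes cannot inflate it.&lt;/p&gt;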



&lt;h4&gt;
  
  
  &lt;strong&gt;Step 5: Deploy Model on SageMaker&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sagemaker&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sagemaker.pytorch&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PyTorchModel&lt;/span&gt;

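&lt;span class="c1"&gt;# Define the model location (after training, package the weights as a .tar.gz and upload to S3)
&lt;/span&gt;&lt;span class="n"&gt;model_artifact&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://my-bucket/gnn-model.tar.gz&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Create a SageMaker PyTorch Model
&lt;/span&gt;&lt;span class="c1"&gt;# entry_point and source_dir are placeholder names: the script should define model_fn
&lt;/span&gt;&lt;span class="c1"&gt;# to rebuild the GCN, and source_dir can hold a requirements.txt that lists dgl
&lt;/span&gt;&lt;span class="n"&gt;gnn_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PyTorchModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_artifact&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_IAM_ROLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# replace with the ARN of a SageMaker execution role
&lt;/span&gt;    &lt;span class="n"&gt;entry_point&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inference.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;framework_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.8.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;py_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;py3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;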

&lt;span class="c1"&gt;# Deploy the model to a SageMaker Endpoint
&lt;/span&gt;&lt;span class="n"&gt;predictor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gnn_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deploy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ml.m5.large&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;initial_instance_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
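
&lt;p&gt;The deployment step expects the model artifact in S3 to be a gzipped tar archive. A minimal sketch of that packaging step, using only the standard library, might look like the following. The file name &lt;code&gt;model.pth&lt;/code&gt; and the helper &lt;code&gt;package_model&lt;/code&gt; are assumptions, and placeholder bytes stand in for real weights so the example runs on its own.&lt;/p&gt;

```python
import os
import tarfile
import tempfile


def package_model(weights_path, archive_path):
    """Bundle a saved weights file into a gzipped tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store the file at the archive root under its base name.
        tar.add(weights_path, arcname=os.path.basename(weights_path))
    return archive_path


workdir = tempfile.mkdtemp()
weights_path = os.path.join(workdir, "model.pth")  # hypothetical file name

# Placeholder bytes stand in for real weights, which would normally be
# written with torch.save(model.state_dict(), weights_path).
with open(weights_path, "wb") as f:
    f.write(b"placeholder weights")

archive_path = package_model(weights_path, os.path.join(workdir, "gnn-model.tar.gz"))

# List the archive contents to confirm the layout.
with tarfile.open(archive_path, "r:gz") as tar:
    members = tar.getnames()
print(members)
```

&lt;p&gt;The resulting archive can then be uploaded to the bucket referenced by the model artifact path, for example with the boto3 S3 client's &lt;code&gt;upload_file&lt;/code&gt; method.&lt;/p&gt;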






&lt;h3&gt;
  
  
  &lt;strong&gt;Key Takeaways&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;GNNs&lt;/strong&gt; are essential for graph-structured data in AI applications.&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;AWS provides a scalable ecosystem&lt;/strong&gt; (Neptune, SageMaker, OpenSearch) for Graph ML.&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;DGL + SageMaker enables powerful GNN training &amp;amp; deployment&lt;/strong&gt; for real-world applications.  &lt;/p&gt;

&lt;p&gt;This is just the &lt;strong&gt;beginning&lt;/strong&gt;—in later sections, we can explore &lt;strong&gt;advanced GNN architectures, real-world use cases, and cost optimization strategies on AWS!&lt;/strong&gt; 🚀  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>gnn</category>
      <category>basic</category>
    </item>
  </channel>
</rss>
