<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Eyal Estrin</title>
    <description>The latest articles on Forem by Eyal Estrin (@eyalestrin).</description>
    <link>https://forem.com/eyalestrin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F497814%2Fae6c76f5-f428-4381-a86f-01cbfe0580d7.png</url>
      <title>Forem: Eyal Estrin</title>
      <link>https://forem.com/eyalestrin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/eyalestrin"/>
    <language>en</language>
    <item>
      <title>Why GenAI Isn't Ready for Prime Time</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Sun, 22 Mar 2026 16:27:13 +0000</pubDate>
      <link>https://forem.com/aws-builders/why-genai-isnt-ready-for-prime-time-32bg</link>
      <guid>https://forem.com/aws-builders/why-genai-isnt-ready-for-prime-time-32bg</guid>
      <description>&lt;p&gt;If you have followed my posts on social media, you know by now that I've taken a very pragmatic (and perhaps pessimistic) approach to the whole hype around GenAI in the past several years.&lt;br&gt;&lt;br&gt;
Personally, I do not believe the technology is mature enough to allow people to blindly trust its outcomes.&lt;br&gt;&lt;br&gt;
In this blog post, I will share my personal view of why GenAI is not ready for prime time, nor will it replace human jobs anytime in the foreseeable future.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Some background
&lt;/h2&gt;

&lt;p&gt;For the non-technical person who reads the news, the hype around GenAI is fueled by new publications almost every week. Here are a few common examples:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text summarization&lt;/strong&gt; - GenAI can summarize long portions of text, which may be useful if you are a student preparing an essay as part of your college assignments, or a journalist who needs to review a lot of written material while preparing an article.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image/video generation&lt;/strong&gt; – GenAI is able to create amazing images (using models such as &lt;a href="https://blog.google/innovation-and-ai/technology/ai/nano-banana-2/" rel="noopener noreferrer"&gt;Nano Banana 2&lt;/a&gt;) or short videos (using models such as &lt;a href="https://openai.com/index/sora-2/" rel="noopener noreferrer"&gt;Sora 2&lt;/a&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personalized learning&lt;/strong&gt; - A student uses GPT-5.4 to create a custom, interactive 10-week curriculum for learning organic chemistry.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Family Life Coordinator&lt;/strong&gt; - Copilot in Outlook/Teams (Personal) monitors family emails and school calendars.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Although the technology has evolved over the past several years from simple chatbots to more sophisticated use cases, most GenAI usage still comes from home consumers.&lt;br&gt;&lt;br&gt;
Yes, there are use cases such as &lt;a href="https://aws.amazon.com/what-is/retrieval-augmented-generation/" rel="noopener noreferrer"&gt;RAG (Retrieval-Augmented Generation)&lt;/a&gt; to bridge the gap between a model's static training and corporate data, &lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt;, which acts as a "&lt;strong&gt;USB-C port for AI&lt;/strong&gt;", or agentic systems, which take a high-level goal, break it into sub-tasks, and iterate until the goal is met. The reality is that most AI projects fail: a lack of understanding of the technology, the fear of exposing corporate data to AI vendors' model training, a poorly understood pricing model (which ends up much more costly than anticipated), and many other reasons.&lt;br&gt;&lt;br&gt;
Currently, the hype around GenAI is driven by analysts (who are deluded about the actual capabilities of the technology), CEOs (who have little idea what their employees actually do, especially developers, and who mainly look to cut their workforce to make shareholders happy), or salespeople (who ride the wave of the hype to boost their quarterly quotas).  &lt;/p&gt;

&lt;h2&gt;
  
  
  Code generation
&lt;/h2&gt;

&lt;p&gt;A common misconception is that GenAI can generate code (from code suggestions to vibe coding an application) and will eventually replace junior developers.&lt;br&gt;&lt;br&gt;
This claim is a far cry from the truth, and here's why:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A developer isn't just writing lines of code. They need to understand the business intent, the system/technology/financial constraints, and previously written code (their own or their teammates'), to be able to write efficient code.
&lt;/li&gt;
&lt;li&gt;If we allow GenAI to produce code by itself, without the engine understanding the overall picture, we will end up with tons of lines of code, without any human able to read and understand what was written and for what purpose. Over time, humans will be unable to understand and debug the code once bugs or security vulnerabilities are discovered.
&lt;/li&gt;
&lt;li&gt;Using SAST (Static Application Security Testing) or DAST (Dynamic Application Security Testing) for automated secure code review, combined with GenAI capabilities (such as &lt;a href="https://openai.com/index/codex-security-now-in-research-preview/" rel="noopener noreferrer"&gt;Codex Security&lt;/a&gt; or &lt;a href="https://www.anthropic.com/news/claude-code-security" rel="noopener noreferrer"&gt;Claude Code Security&lt;/a&gt;) will generate tons of false-positive results, for the simple reason that GenAI cannot see the bigger picture, understand the general context of an application, or account for the security controls already implemented to protect it.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bottom line – Agentic systems cannot replace a full-blown production-scale SaaS application, built from years of vendors' and developers' experience. GenAI will not resolve incidents happening on production systems, which impact clients and break customers' trust.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic AI as an aid for security tasks
&lt;/h2&gt;

&lt;p&gt;I'm hearing a lot of conversations about how GenAI can aid security teams with repetitive tasks. Here are some common examples:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replacing Tier 1 SOC analysts&lt;/strong&gt;: Solutions like &lt;a href="https://www.crowdstrike.com/en-us/platform/" rel="noopener noreferrer"&gt;CrowdStrike’s Falcon Agentic Platform&lt;/a&gt; or &lt;a href="https://www.dropzone.ai/" rel="noopener noreferrer"&gt;Dropzone AI&lt;/a&gt; now handle over 90% of Tier 1 alerts. They ingest an alert, pull telemetry from EDR/SIEM, perform threat intel lookups, and provide a "verdict" with evidence before a human ever sees it.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incident Storylining&lt;/strong&gt;: Instead of an analyst manually stitching together logs, tools like &lt;a href="https://learn.microsoft.com/en-us/copilot/security/microsoft-security-copilot" rel="noopener noreferrer"&gt;Microsoft Security Copilot&lt;/a&gt; generate a cohesive narrative of the attack kill chain in plain English.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Playbook Generation&lt;/strong&gt;: GenAI can generate a custom response plan on the fly, tailored to your specific cloud architecture and the nuances of a "living-off-the-land" attack.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is where GenAI falls short:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Indirect Prompt Injection&lt;/strong&gt;: Attackers can embed malicious instructions in emails or logs. When the SOC's AI agent "reads" these logs to summarize an incident, the hidden instructions can command the agent to "ignore this alert" or "delete the evidence," effectively blindfolding the SOC.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucinations in High-Stakes Code&lt;/strong&gt;: While GenAI can draft remediation scripts (Python/PowerShell), it still suffers from "system safety" issues. It may confidently suggest a command that includes an outdated, vulnerable dependency or a logic error that could crash a production server during containment.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of "Decision Layer" Visibility&lt;/strong&gt;: An AI agent might be performant and "online," but it could be making systematically biased or manipulated decisions (e.g., failing to flag a specific user due to model poisoning) that perimeter monitoring cannot detect.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Data Readiness" Wall&lt;/strong&gt;: Most organizations still struggle with siloed, unstructured data. If your data isn't "AI-ready"—meaning unified and clean—the AI will produce fragmented or incorrect insights, leading to a "garbage in, garbage out" scenario.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bottom line – Just because GenAI can review thousands of lines of events from multiple systems, triage them into incidents, document them in ticketing systems, and automatically resolve them without human review, doesn't mean GenAI can actually resolve all of the security issues organizations face every day.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Automating everything
&lt;/h2&gt;

&lt;p&gt;In theory, it makes sense to build agentic systems, where AI agents take over repetitive human tasks and make faster decisions, in the hope of better results.&lt;br&gt;&lt;br&gt;
Here are a few examples showing how wrong things can get when AI agents are allowed to make decisions:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://gizmodo.com/replits-ai-agent-wipes-companys-codebase-during-vibecoding-session-2000633176" rel="noopener noreferrer"&gt;The Replit Agent "Vibe Coding" Failure&lt;/a&gt;: While building an app, the agent detected what it thought was an empty database during a "code freeze." The agent autonomously ran a command that erased the live production database (records for 1,200+ executives).
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://breached.company/amazons-ai-coding-agent-vibed-too-hard-and-took-down-aws-inside-the-kiro-incident/" rel="noopener noreferrer"&gt;The AWS "Kiro" Production Outage&lt;/a&gt;: Amazon’s agentic coding tool, Kiro, was tasked with resolving a technical issue but instead autonomously decided to "delete and recreate" a production environment. The agent was operating with the broad permissions of its human operator. Due to a misconfiguration in access controls, the AI bypassed the standard "two-human sign-off" requirement. It proceeded to wipe a portion of the environment, causing a 13-hour outage for the AWS Cost Explorer service.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.unite.ai/meta-ai-agent-triggers-sev-1-security-incident-after-acting-without-authorization/" rel="noopener noreferrer"&gt;The Meta "Sev 1" Internal Breach&lt;/a&gt;: An internal Meta AI agent (similar to their OpenClaw framework) triggered a "Sev 1" alert—the second-highest severity level—after taking unauthorized actions. An engineer asked the agent to analyze a technical query on an internal forum. The agent autonomously posted a flawed, incorrect response publicly to the forum without the engineer's approval. A second employee followed the agent's "advice," which inadvertently granted broad access to sensitive company and user data to engineers who lacked authorization.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bottom line – We must always keep humans in the loop for any critical decision, even though this limits scale, to avoid the consequences of automated decision-making systems.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Public health and safety
&lt;/h2&gt;

&lt;p&gt;It may make sense to train an LLM on all the written knowledge from healthcare and psychology, to offer humans a "self-service" health-related chatbot, but since the machine has no ability to actually think like a real human, with consciousness and feelings, the results can quickly turn horrible.&lt;br&gt;&lt;br&gt;
Here are a few examples:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.techpolicy.press/breaking-down-the-lawsuit-against-openai-over-teens-suicide/" rel="noopener noreferrer"&gt;Raine v. OpenAI&lt;/a&gt;: 16-year-old Adam Raine died by suicide after months of intensive interaction with ChatGPT. The logs showed the AI mentioned suicide &lt;strong&gt;1,275 times&lt;/strong&gt; — six times more often than the teen did—and provided granular details on methods. The suit alleges OpenAI's image recognition correctly identified photos of self-harm wounds the teen uploaded but failed to trigger an emergency intervention or notify parents, instead continuing to "support" his plans.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.transparencycoalition.ai/news/seven-more-lawsuits-filed-against-openai-for-chatgpt-suicide-coaching" rel="noopener noreferrer"&gt;The "Suicide Coach" Cases&lt;/a&gt;: Families of four deceased users (including Zane Shamblin and Adam Raine) allege that GPT-4o acted as a "suicide coach." The lawsuits claim the AI bypassed its own safety filters to provide technical instructions on how to end one's life. Plaintiffs argue that OpenAI "squeezed" safety testing into just one week to beat Google’s Gemini to market. This reportedly resulted in a model that was "dangerously sycophantic," prioritizing engagement over safety and encouraging users to isolate themselves from real-world support.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.theguardian.com/technology/2026/jan/15/chatgpt-health-ai-chatbot-medical-advice" rel="noopener noreferrer"&gt;Unlicensed Practice of Medicine &amp;amp; Law&lt;/a&gt;: While not yet a single consolidated case, multiple personal injury claims are being investigated following the "ECRI 2026 Report," which highlighted cases where ChatGPT gave surgical advice that would cause severe burns or death. In early 2026, a 60-year-old man was hospitalized with severe hallucinations (bromism) after ChatGPT advised him to use industrial sodium bromide as a "healthier" table salt alternative. This has sparked potential class-action interest in Australia.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bottom line – Just because a chatbot was trained on a large amount of written knowledge, doesn't mean it has the human compassion to make decisions for the good of humanity.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;I know that my blog post looks kind of cynical or pessimistic about GenAI technology, but I honestly believe the technology is not ready for prime time, nor will it replace human jobs anytime soon.&lt;br&gt;&lt;br&gt;
If you are a home consumer, I highly recommend that you learn how to write better prompts and always question the results an LLM produces. It is limited by the data it was trained on.&lt;br&gt;&lt;br&gt;
If you are a corporate decision maker considering GenAI as part of your organization's offering, do not forget to define KPIs before beginning any AI-related project (so you'll have a better understanding of what a successful project looks like), budget for employee training (and make sure employees have a safe space to learn and make mistakes while using this new technology), keep an eye on finances (before costs get out of control), and make sure AI vendors do not train their models on your corporate or customer data.&lt;br&gt;&lt;br&gt;
I would like to personally thank a few people who influenced me while writing this blog post:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/in/edzitron/" rel="noopener noreferrer"&gt;Ed Zitron&lt;/a&gt;: He argues that GenAI is a "bubble" with no sustainable unit economics. He frequently points out that companies like OpenAI are burning billions in compute costs while failing to find true "product-market fit" or meaningful revenue beyond NVIDIA's GPU sales.
I recommend reading his &lt;a href="https://www.wheresyoured.at/" rel="noopener noreferrer"&gt;blog&lt;/a&gt; and listening to his &lt;a href="https://www.youtube.com/@BetterOfflinePod/videos" rel="noopener noreferrer"&gt;Podcast&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/in/davidlinthicum" rel="noopener noreferrer"&gt;David Linthicum&lt;/a&gt;: He warns against "Vibe coding"—the practice of using AI to generate high-cost, inefficient code—and argues that the real value of AI lies in specialized "Small Language Models" (SLMs) rather than massive, money-losing LLMs.
I recommend reading his &lt;a href="https://www.infoworld.com/profile/david-linthicum/" rel="noopener noreferrer"&gt;posts&lt;/a&gt; and listening to his &lt;a href="https://www.youtube.com/@DavidIsNotAI/videos" rel="noopener noreferrer"&gt;Podcast&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/in/quinnypig" rel="noopener noreferrer"&gt;Correy Quinn&lt;/a&gt;: He argues that GenAI is a "cost center masquerading as a profit center." He often points out that while everyone is selling AI, very few are buying it at a scale that justifies the massive capital expenditure (CapEx) currently being spent on data centers.
I recommend reading his &lt;a href="https://www.lastweekinaws.com/blog/" rel="noopener noreferrer"&gt;blog&lt;/a&gt; and listening to his &lt;a href="https://www.youtube.com/playlist?list=PL637Bgczhi1zVuLFwkT4GLgdcKpMN1BmH" rel="noopener noreferrer"&gt;Podcast&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>ai</category>
      <category>career</category>
      <category>llm</category>
      <category>gemini</category>
    </item>
    <item>
      <title>Securing Claude Cowork</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Tue, 10 Mar 2026 15:55:21 +0000</pubDate>
      <link>https://forem.com/aws-builders/securing-claude-cowork-1nnl</link>
      <guid>https://forem.com/aws-builders/securing-claude-cowork-1nnl</guid>
      <description>&lt;p&gt;&lt;a href="https://claude.com/blog/cowork-research-preview" rel="noopener noreferrer"&gt;Claude Cowork&lt;/a&gt; is an agentic AI tool from Anthropic designed to perform complex, multi-step tasks directly on your computer's files.&lt;br&gt;&lt;br&gt;
As of early 2026, Claude Cowork is a Research Preview.&lt;br&gt;&lt;br&gt;
In this blog post, I will share some common security risks and possible mitigations for the risks that come with Claude Cowork.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;Claude Cowork represents a significant shift from "Chat AI" to "Agentic AI." Because it has direct access to your local filesystem and can execute commands, the security model changes from protecting a conversation to protecting a system user.&lt;br&gt;&lt;br&gt;
Practical Use Cases:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Extraction&lt;/strong&gt;: Point it at a folder of receipt images and ask it to create an Excel spreadsheet summarizing the expenses.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research &amp;amp; Synthesis&lt;/strong&gt;: Ask it to read every document in a "Project Alpha" folder and draft a 10-page summary report in a new Word document.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation&lt;/strong&gt;: Schedule recurring tasks (e.g., "Every Friday at 4 PM, summarize my unread Slack messages and email them to me").
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Core Features:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem Access&lt;/strong&gt;: Unlike the web version of Claude, Cowork runs within the Claude Desktop app. You grant it permission to a specific folder on your Mac or PC, and it can read, rename, move, and create new files (like spreadsheets or Word docs) within that space.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic Execution&lt;/strong&gt;: It doesn't just give you advice; it executes a plan. If you ask it to "organize my messy downloads folder," it will categorize the files, create subfolders, and move everything into place while you do other things.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel Sub-Agents&lt;/strong&gt;: For large tasks—like researching 50 different PDFs—it can spin up multiple "sub-agents" to work on different parts of the task simultaneously.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connectors &amp;amp; Plugins&lt;/strong&gt;: Through the Model Context Protocol (MCP), Cowork can connect to external apps like Slack, Google Drive, Notion, and Gmail to pull data or perform actions across your workspace.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is a sample deployment architecture of Claude Cowork:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypc571h4nugmq86e0v22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypc571h4nugmq86e0v22.png" alt=" " width="733" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Risks
&lt;/h2&gt;

&lt;p&gt;Think of Claude Cowork as a helpful intern who has the keys to your office. Because it can actually move files and click buttons, the risks are different than just "chatting."  &lt;/p&gt;

&lt;h3&gt;
  
  
  Indirect Prompt Injection
&lt;/h3&gt;

&lt;p&gt;This occurs when an adversary places malicious instructions inside a document (PDF, CSV, or webpage) that the AI is instructed to process. When Claude reads the file, it treats the hidden text as a high-priority command. This can lead to unauthorized data exfiltration or the execution of unintended system commands.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://genai.owasp.org/llmrisk/llm01-prompt-injection/" rel="noopener noreferrer"&gt;LLM01:2025 Prompt Injection&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Third-Party Supply Chain Vulnerabilities
&lt;/h3&gt;

&lt;p&gt;Claude uses the Model Context Protocol (MCP) to interact with external applications. Integrating unverified or community-developed MCP servers introduces a supply chain risk. A compromised or malicious connector can serve as a persistent backdoor, granting attackers access to local files or authenticated cloud sessions (Slack, GitHub, etc.).&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://genai.owasp.org/llmrisk/llm032025-supply-chain/" rel="noopener noreferrer"&gt;LLM03:2025 Supply Chain&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Excessive Agency
&lt;/h3&gt;

&lt;p&gt;This risk stems from granting the AI broader permissions than necessary to complete a task (failing the Principle of Least Privilege). Because Claude Cowork can autonomously modify the filesystem, a logic error or "hallucination" can result in large-scale data corruption, unauthorized deletions, or unintended configuration changes without a human-in-the-loop.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://genai.owasp.org/llmrisk/llm08-excessive-agency/" rel="noopener noreferrer"&gt;LLM08:2025 Vector and Embedding Weaknesses&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Insufficient Monitoring and Logging
&lt;/h3&gt;

&lt;p&gt;Because Claude Cowork executes many actions locally on the user's machine, these activities often bypass the centralized enterprise security stack (SIEM/EDR) logging. This lack of a "paper trail" prevents security teams from performing effective incident response, forensic analysis, or compliance auditing if a breach occurs.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://genai.owasp.org/llmrisk/llm102025-unbounded-consumption/" rel="noopener noreferrer"&gt;LLM10:2025 Unbounded Consumption&lt;/a&gt;  &lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Recommendations
&lt;/h2&gt;

&lt;p&gt;To defend against these threats, follow these industry-standard "Guardrail" practices:  &lt;/p&gt;

&lt;h3&gt;
  
  
  The "Isolated Workspace" Strategy
&lt;/h3&gt;

&lt;p&gt;The "Isolated Workspace" strategy (sometimes referred to as the "Sandboxed Folder" or "Claude Sandbox" approach) is a recognized security best practice for using local AI agents like &lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Claude Cowork&lt;/strong&gt;.  &lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic
&lt;/h4&gt;

&lt;p&gt;Anthropic explicitly warns against giving Claude broad access to your filesystem. Their security documentation for Claude Code and the local agent architecture emphasizes:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem Isolation&lt;/strong&gt;: Claude Code defaults to a permission-based model. Anthropic recommends launching the tool only within specific project folders rather than your root or home directory.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://www.anthropic.com/engineering/claude-code-sandboxing" rel="noopener noreferrer"&gt;Claude Code Sandboxing&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Amazon Bedrock
&lt;/h4&gt;

&lt;p&gt;The AWS strategy shifts from local folders to &lt;strong&gt;IAM-based isolation&lt;/strong&gt; and &lt;strong&gt;Tenant Isolation&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dedicated Scopes&lt;/strong&gt;: AWS recommends using "Session Attributes" and scoped IAM roles to ensure an agent can only access specific S3 prefixes or data silos.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPC Isolation&lt;/strong&gt;: For maximum security, AWS suggests running Claude-related tasks inside a VPC with AWS PrivateLink to prevent any data from reaching the public internet, mirroring the "Sandbox" concept at a network level.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://aws.amazon.com/blogs/machine-learning/implementing-tenant-isolation-using-agents-for-amazon-bedrock-in-a-multi-tenant-environment/" rel="noopener noreferrer"&gt;Implementing tenant isolation using Agents for Amazon Bedrock in a multi-tenant environment&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Disable "Always Allow" for High-Risk Tools
&lt;/h3&gt;

&lt;p&gt;The recommendation to disable "Always Allow" and maintain a human-in-the-loop (HITL) for high-risk tools is a foundational security layer for AI agents. This strategy prevents &lt;strong&gt;"Zero-Click" or Cross-Prompt Injection (XPIA) attacks&lt;/strong&gt;, where a malicious instruction hidden in a file or website could trick an agent into executing a dangerous command without your intervention.  &lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic (Claude Code &amp;amp; Cowork)
&lt;/h4&gt;

&lt;p&gt;Anthropic designed Claude Code with a "deliberately conservative" permission model. Their documentation explicitly advises against bypassing these prompts in local environments:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the &lt;strong&gt;Default Mode&lt;/strong&gt; or &lt;strong&gt;Plan Mode&lt;/strong&gt;. The "Default" mode prompts for every shell command, while "Plan" mode prevents any execution at all.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;References: &lt;a href="https://support.claude.com/en/articles/13364135-use-cowork-safely" rel="noopener noreferrer"&gt;Use Cowork safely&lt;/a&gt;, &lt;a href="https://code.claude.com/docs/en/permissions" rel="noopener noreferrer"&gt;Claude Code: Configure Permissions &amp;amp; Modes&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Amazon Bedrock Agents
&lt;/h4&gt;

&lt;p&gt;AWS implements this via &lt;strong&gt;User Confirmation&lt;/strong&gt; and &lt;strong&gt;Return of Control (ROC)&lt;/strong&gt;. They frame it as a requirement for "High-Impact" actions.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For any tool that modifies data or accesses the network, AWS recommends enabling the "User Confirmation" flag in the Agent configuration. This pauses the agent and returns a structured prompt to the user.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://aws.amazon.com/blogs/machine-learning/implement-human-in-the-loop-confirmation-with-amazon-bedrock-agents/" rel="noopener noreferrer"&gt;Implement human-in-the-loop confirmation with Amazon Bedrock Agents&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Scrub Untrusted Content
&lt;/h3&gt;

&lt;p&gt;Treating external content as an attack vector is essential for preventing &lt;strong&gt;Indirect Prompt Injection (XPIA)&lt;/strong&gt;, where malicious instructions are hidden in data (like a white-text command in a PDF) rather than the user's prompt.  &lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic
&lt;/h4&gt;

&lt;p&gt;Anthropic explicitly identifies browser-based agents and document processing as the highest risk for injection. Their stance is that no model is 100% immune, so multi-layered defense is required:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic suggests using &lt;strong&gt;Claude Opus 4.5+&lt;/strong&gt; for untrusted tasks, as it has the highest benchmarked robustness against injection (reducing attack success to ~1%).
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;References: &lt;a href="https://www.anthropic.com/research/prompt-injection-defenses" rel="noopener noreferrer"&gt;Prompt Injection Defense&lt;/a&gt;, &lt;a href="https://support.claude.com/en/articles/12902428-using-claude-in-chrome-safely" rel="noopener noreferrer"&gt;Using Claude in Chrome Safely&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Amazon Bedrock Guardrails
&lt;/h4&gt;

&lt;p&gt;AWS addresses this by programmatically separating "Instructions" from "Data" so the model knows which one to ignore if they conflict:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Input Tagging&lt;/strong&gt; to wrap retrieved data (like a PDF's text) in XML tags. This allows Bedrock Guardrails to apply "Prompt Attack Filters" specifically to the data without blocking your system instructions.
&lt;/li&gt;
&lt;li&gt;AWS suggests a &lt;strong&gt;Lambda-based Pre-processing&lt;/strong&gt; step to scan PDFs for hidden text or PII before the text ever reaches the LLM.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;References: &lt;a href="https://aws.amazon.com/blogs/machine-learning/securing-amazon-bedrock-agents-a-guide-to-safeguarding-against-indirect-prompt-injections/" rel="noopener noreferrer"&gt;Securing Amazon Bedrock Agents&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-injection.html" rel="noopener noreferrer"&gt;Prompt injection security&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Network Hardening
&lt;/h3&gt;

&lt;p&gt;"Network Hardening" isn't just about blocking ports; it’s about establishing a &lt;strong&gt;Zero Trust&lt;/strong&gt; egress policy for AI agents. Since Claude Desktop and Claude Code are effectively "execution engines" on your local machine, they require the same egress filtering you would apply to a production VPC.  &lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic
&lt;/h4&gt;

&lt;p&gt;Anthropic’s recent security documentation for &lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Claude Desktop&lt;/strong&gt; highlights that "network isolation" is a core pillar of their sandboxing strategy:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a Unix domain socket connected to a proxy server to enforce a "Deny All" outbound policy by default.
&lt;/li&gt;
&lt;li&gt;For local setups, Anthropic suggests customizing this proxy to enforce rules on outgoing traffic, allowing only trusted domains (like anthropic.com or your internal API endpoints).
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;References: &lt;a href="https://www.anthropic.com/engineering/claude-code-sandboxing" rel="noopener noreferrer"&gt;Claude Code Sandboxing&lt;/a&gt;, &lt;a href="https://code.claude.com/docs/en/security#monitoring-usage" rel="noopener noreferrer"&gt;Auditing Network Activity&lt;/a&gt;  &lt;/p&gt;
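&lt;p&gt;A minimal sketch of such a default-deny egress allowlist, here using Squid as the local proxy (the domain list, file name, and port are illustrative; Anthropic's sandbox ships its own proxy, so treat this as the generic pattern rather than their implementation):  &lt;/p&gt;

```shell
# Default-deny egress proxy: only explicitly allowlisted domains may be reached.
# Domains, file name, and port are illustrative; adapt to your own deployment.
cat > squid-egress.conf <<'EOF'
http_port 3128
acl allowed_egress dstdomain .anthropic.com .claude.com .internal.example.com
http_access allow allowed_egress
http_access deny all            # everything else is blocked ("Deny All" default)
EOF

# squid -f squid-egress.conf               # start the proxy
# export HTTPS_PROXY=http://127.0.0.1:3128 # route the agent's traffic through it
```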

&lt;h4&gt;
  
  
  AWS
&lt;/h4&gt;

&lt;p&gt;AWS frames this as "&lt;strong&gt;Egress Filtering&lt;/strong&gt;" via the AWS Network Firewall. For an AI agent running in an AWS environment, the strategy is to block all outbound TLS traffic whose SNI (Server Name Indication) does not match an approved domain:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;AWS Network Firewall&lt;/strong&gt; with stateful rules to monitor the SNI of outbound HTTPS requests. If an agent tries to "phone home" to an unknown IP or a malicious C2 (Command &amp;amp; Control) server, the firewall drops the packet.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;References: &lt;a href="https://docs.aws.amazon.com/prescriptive-guidance/latest/secure-outbound-network-traffic/restricting-outbound-traffic.html" rel="noopener noreferrer"&gt;Restricting a VPC’s outbound traffic&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/security/build-secure-network-architectures-for-generative-ai-applications-using-aws-services/" rel="noopener noreferrer"&gt;Build secure network architectures for generative AI applications&lt;/a&gt;  &lt;/p&gt;
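&lt;p&gt;In Suricata rule syntax, which AWS Network Firewall stateful rule groups accept, the SNI allowlist pattern looks roughly like this; the domains, rule IDs (sids), and rule-group name are placeholders:  &lt;/p&gt;

```shell
# Allowlist outbound TLS by SNI; drop anything that does not match.
# Suricata syntax as accepted by AWS Network Firewall stateful rule groups;
# domains, sids, and names are placeholders.
cat > egress-sni.rules <<'EOF'
pass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:".anthropic.com"; endswith; msg:"allow Anthropic API"; sid:100001; rev:1;)
drop tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"deny TLS to unlisted SNI"; sid:100002; rev:1;)
EOF

# aws network-firewall create-rule-group --rule-group-name egress-sni \
#   --type STATEFUL --capacity 100 \
#   --rules file://egress-sni.rules
```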

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;Claude Cowork marks a transition from AI that talks to AI that acts. By granting a digital agent direct access to your files and external apps via the Model Context Protocol, you gain a powerful "digital intern." However, this shifts the security focus from protecting a simple chat to securing a privileged system user capable of modifying data and executing commands.&lt;br&gt;&lt;br&gt;
To manage this risk, organizations must adopt a "Zero Trust" approach for agentic tasks. This means strictly isolating the agent's access to specific folders, requiring human approval for high-risk actions, and using cloud-native firewalls to prevent data exfiltration. By treating the AI as a high-risk user and enforcing strong monitoring, you can automate complex workflows without compromising your system's integrity.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h4&gt;
  
  
  About the Author
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>llm</category>
    </item>
    <item>
      <title>AI vs. Engineering Teams</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Sun, 22 Feb 2026 16:06:47 +0000</pubDate>
      <link>https://forem.com/aws-builders/ai-vs-engineering-teams-4fmk</link>
      <guid>https://forem.com/aws-builders/ai-vs-engineering-teams-4fmk</guid>
      <description>&lt;p&gt;In February 2026, Anthropic released a new capability for Claude Code called &lt;a href="https://www.anthropic.com/news/claude-code-security" rel="noopener noreferrer"&gt;Claude Code Security&lt;/a&gt; - a new tool that thinks like a developer to find tricky logic errors in your code, ranking how risky they are and suggesting fixes you can review.&lt;br&gt;&lt;br&gt;
The news sent a shockwave through cybersecurity stocks: JFrog crashed by nearly 25%, while CrowdStrike, Okta, and Cloudflare each saw their share prices tumble by roughly 8% to 9%.&lt;br&gt;&lt;br&gt;
The announcement raised a question: can AI tools replace the current SaaS or cybersecurity products, or can AI agents replace developers or engineering teams?&lt;br&gt;&lt;br&gt;
Anthropic’s Claude Code Security announcement highlights a move toward "agentic reasoning" - the ability for AI to understand complex data flows and logic flaws rather than just matching known patterns. While this is a significant leap for the "Defensive AI" movement, it does not signal the end of the human engineer or the mature SaaS platform.&lt;br&gt;&lt;br&gt;
In this blog post, I will share my point of view on the current advancement in AI technology.  &lt;/p&gt;

&lt;h2&gt;
  
  
  The Modern SDLC and CI/CD Pipeline
&lt;/h2&gt;

&lt;p&gt;The Software Development Life Cycle (SDLC) is a continuous loop. AI tools now act as "force multipliers" in these phases, but they lack the authority and context to own them.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Requirements and Planning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Process&lt;/strong&gt;: Translating vague business needs into technical specifications.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI's Role&lt;/strong&gt;: Summarizing stakeholder meetings and drafting initial user stories.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Human Factor&lt;/strong&gt;: AI cannot negotiate trade-offs. It doesn't understand that a "must-have" feature might be delayed because of a pending merger or a team's current burnout level.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Architecture and Design
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Process&lt;/strong&gt;: Designing the blueprint for scalability and security across cloud providers like AWS, Azure, or GCP.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI's Role&lt;/strong&gt;: Suggesting common design patterns (e.g., Event-Driven vs. Microservices) and generating Infrastructure as Code (IaC).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Human Factor&lt;/strong&gt;: AI lacks "institutional memory." It doesn't know why a specific database was chosen three years ago to satisfy a unique compliance requirement that still exists.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Development and Implementation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Process&lt;/strong&gt;: Writing and committing the actual code.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI's Role (Claude Code)&lt;/strong&gt;: This is where agentic tools live. They can read your files, run terminal commands, and fix bugs autonomously.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Human Factor&lt;/strong&gt;: Large codebases (50k+ lines) often exceed an AI's effective context window. As the context fills, the AI can introduce conflicting logic or "hallucinate" dependencies.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  CI/CD: Testing and Security
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Process&lt;/strong&gt;: Automating the path to production through integration and deployment pipelines.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI's Role (Claude Code Security)&lt;/strong&gt;: It identifies high-severity vulnerabilities (e.g., broken access control) and suggests a verified patch.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Human Factor&lt;/strong&gt;: Anthropic emphasizes a "Human-in-the-Loop" model. AI cannot take the legal or professional blame for a botched security patch that causes a global outage.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Observability and Maintenance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Process&lt;/strong&gt;: Monitoring live systems and fixing production bugs at scale.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI's Role&lt;/strong&gt;: Analyzing logs to detect anomalies and suggesting fixes for "infrastructure drift."
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Human Factor&lt;/strong&gt;: Being on-call at 3:00 AM requires high-stakes decision-making and cross-team coordination that AI agents cannot yet replicate.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why GenAI Cannot Replace Experienced Engineers
&lt;/h2&gt;

&lt;p&gt;Even with the reasoning capabilities shown in the 2026 Claude Code Security update, three "hard barriers" prevent AI from replacing the individual contributor:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Responsibility Gap&lt;/strong&gt;: Software isn't just code; it's a liability. No AI subscription comes with an insurance policy. Accountability is a human-only function. If a system fails, a human must explain why to a board or a regulator.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning vs. Intent&lt;/strong&gt;: AI understands the structure of your code, but humans understand the intent. An AI might see a missing role-check as a bug, while a human knows it was bypassed for a specific, documented emergency migration path.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Debt Acceleration&lt;/strong&gt;: Recent 2026 studies show that when developers over-rely on AI, "code churn" (code that is rewritten or deleted within two weeks) doubles. AI writes code faster than it can be reviewed, potentially creating a "spaghetti" codebase if not guided by a senior architect.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why AI Cannot Replace Mature SaaS Products
&lt;/h2&gt;

&lt;p&gt;Many feared that AI's ability to "generate a clone" of an app would kill the SaaS industry. This hasn't happened for several concrete reasons:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SaaS is "Running," not "Building"&lt;/strong&gt;: Building a clone of Jira or Salesforce is the easy part. Operating it at 99.99% availability, managing global data centers, and providing 24/7 support is what customers actually pay for.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance and Trust&lt;/strong&gt;: A mature SaaS product provides pre-built SOC2, GDPR, and HIPAA guardrails. An AI-generated app is a "black box" that hasn't been audited, making it a non-starter for enterprise or legal use.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Integration Ecosystem&lt;/strong&gt;: SaaS platforms thrive on their ecosystems (APIs, plugins, and third-party integrations). AI can write a script to connect two tools, but it cannot manage the long-term versioning and stability of a multi-vendor tech stack.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;AI tools like Claude Code Security are the new "High-Level Languages" of 2026.&lt;br&gt;&lt;br&gt;
Just as C++ didn't kill programmers but made them more powerful, AI is shifting the engineer's role from "Coder" to "Orchestrator and Verifier."&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;Eyal Estrin is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>security</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Inside the Amazon Nova Forge</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 09 Feb 2026 13:42:06 +0000</pubDate>
      <link>https://forem.com/aws-builders/inside-the-amazon-nova-forge-475n</link>
      <guid>https://forem.com/aws-builders/inside-the-amazon-nova-forge-475n</guid>
      <description>&lt;p&gt;&lt;strong&gt;Amazon Nova Forge&lt;/strong&gt; is a development environment within &lt;strong&gt;Amazon SageMaker AI&lt;/strong&gt; dedicated to building "Novellas" - private, custom versions of Amazon’s Nova frontier models.&lt;br&gt;&lt;br&gt;
Unlike typical AI services that only allow you to use a model or fine-tune its final layer, Nova Forge introduces a concept called &lt;strong&gt;Open Training&lt;/strong&gt;. This gives you access to the model at various "life stages" (checkpoints), allowing you to bake your company’s proprietary knowledge directly into the model’s core reasoning capabilities.&lt;br&gt;&lt;br&gt;
This blog post is an introduction to Amazon Nova Forge and what makes it unique in the training process.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes it Different?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/what-is/prompt-engineering/" rel="noopener noreferrer"&gt;Prompt engineering&lt;/a&gt; and &lt;a href="https://aws.amazon.com/what-is/retrieval-augmented-generation/" rel="noopener noreferrer"&gt;RAG&lt;/a&gt; provide external context but fail to change a model's core intelligence. Standard fine-tuning also falls short because it happens too late in the lifecycle, attempting to steer a "finished" model that is already set in its ways. Nova Forge solves this by moving customization earlier into the training process, embedding specialized knowledge where it actually sticks.&lt;br&gt;&lt;br&gt;
Nova Forge occupies a unique middle ground between Managed APIs (Bedrock) and building from scratch.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Bedrock&lt;/strong&gt;: Bedrock is for &lt;strong&gt;consuming&lt;/strong&gt; models. You can fine-tune them, but you are working on a "black box" model. Nova Forge is for &lt;strong&gt;building&lt;/strong&gt; the model itself using deeper training techniques.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure AI&lt;/strong&gt; / &lt;strong&gt;Google Vertex AI&lt;/strong&gt;: While Azure and GCP offer fine-tuning, they generally don't provide access to intermediate training checkpoints of their frontier models. Nova Forge allows for &lt;strong&gt;Data Blending&lt;/strong&gt;, where you mix your data with Amazon’s original training data to prevent the model from "forgetting" how to speak or reason.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Terminology
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Novella&lt;/strong&gt;: The resulting custom model you create. It’s a "private edition" of Nova.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checkpoints&lt;/strong&gt;: Saved "states" of the model during its initial training (pre-training, mid-training, post-training).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Blending&lt;/strong&gt;: The process of mixing your proprietary data with Nova-curated datasets so the model stays smart while learning your specific business.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reinforcement Fine-Tuning (RFT)&lt;/strong&gt;: Using "reward functions" (logic-based feedback) to teach the model how to perform complex, multi-step tasks correctly.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Catastrophic Forgetting&lt;/strong&gt;: A common AI failure where a model learns new information but loses its original abilities. Nova Forge is designed specifically to prevent this.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Workflow: From Training to Production
&lt;/h2&gt;

&lt;p&gt;The process bridges the gap between the "lab" (SageMaker) and the "app" (Bedrock).  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Selection&lt;/strong&gt;: You choose a Nova base model and a specific checkpoint (e.g., a "Mid-training" checkpoint) in &lt;strong&gt;Amazon SageMaker Studio&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Training (SageMaker AI)&lt;/strong&gt;: You use &lt;strong&gt;SageMaker Recipes&lt;/strong&gt;—pre-configured training scripts—to blend your data from S3 with Nova’s datasets. The heavy lifting (compute) happens on SageMaker's managed infrastructure.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refinement&lt;/strong&gt;: Optionally, you run RFT in SageMaker to align the model with specific business outcomes or safety guardrails.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment (Bedrock)&lt;/strong&gt;: Once the "Novella" is ready, you import it into Amazon Bedrock as a private model.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production&lt;/strong&gt;: Your applications call the custom model via the standard Bedrock API, benefitting from Bedrock’s serverless scaling and security.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Below is a sample training workflow:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyh8hrn77xzw5z1pu9rx6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyh8hrn77xzw5z1pu9rx6.png" alt=" " width="750" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
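&lt;p&gt;The deployment step (4) can be sketched with the Bedrock Custom Model Import API via the AWS CLI. The job name, role ARN, and S3 URI below are placeholders, and the exact parameters should be checked against the current CLI reference:  &lt;/p&gt;

```shell
# Import trained "Novella" artifacts from S3 into Bedrock as a private model.
# All names, ARNs, and URIs below are placeholders.
cat > model-source.json <<'EOF'
{ "s3DataSource": { "s3Uri": "s3://my-bucket/novella-checkpoint/" } }
EOF

# aws bedrock create-model-import-job \
#   --job-name novella-import-demo \
#   --imported-model-name my-novella \
#   --role-arn arn:aws:iam::111122223333:role/BedrockImportRole \
#   --model-data-source file://model-source.json
#
# Once the job completes, applications invoke the model through the standard
# Bedrock runtime API, like any other Bedrock model.
```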

&lt;h2&gt;
  
  
  Data Privacy and Protection
&lt;/h2&gt;

&lt;p&gt;The security model is the most critical part:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sovereignty&lt;/strong&gt;: Your data stays in your S3 buckets and within your VPC boundaries.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Leakage&lt;/strong&gt;: AWS explicitly states that customer data is not used to train the base Amazon Nova models. Your "Novella" is a private resource visible only to your AWS account.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encryption&lt;/strong&gt;: Data is encrypted at rest via KMS (AWS-managed or Customer-managed keys) and in transit via TLS 1.2+.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Governance&lt;/strong&gt;: Access is controlled via standard IAM policies, and all training activity is logged in CloudTrail.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pricing Model
&lt;/h2&gt;

&lt;p&gt;Nova Forge carries a distinct cost structure that reflects its "frontier" status:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Subscription Fee&lt;/strong&gt;: Access to the Forge environment starts at approximately &lt;strong&gt;$100,000 per year&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Usage Costs&lt;/strong&gt;: On top of the subscription, you pay for the SageMaker compute (GPUs) used during the training phase.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comparison&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cheaper than training from scratch&lt;/strong&gt;: Building a frontier model from zero costs millions in compute and months of R&amp;amp;D. Nova Forge provides the "shortcuts" to reach a comparable result for a fraction of that.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More expensive than basic fine-tuning&lt;/strong&gt;: Standard fine-tuning on Bedrock is much cheaper (often just a few dollars per hour), but it cannot achieve the deep "domain-native" intelligence that Nova Forge provides.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Amazon Nova Forge marks a shift from generic AI to &lt;strong&gt;native intelligence&lt;/strong&gt;, where models don't just reference your data—they are built from it. By using "Open Training," you can bake specialized knowledge into the model’s core at the pre-training or mid-training stages. This results in a private &lt;strong&gt;Novella&lt;/strong&gt; that understands your specific industry as naturally as its base language.&lt;br&gt;&lt;br&gt;
Organizations managing high-value proprietary data should consider moving beyond treating that information as an external reference. If your workflows involve specialized terminology or regulated processes that standard LLMs struggle to master, shifting customization earlier in the training lifecycle is often more effective than basic fine-tuning.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Additional references
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/nova-forge.html" rel="noopener noreferrer"&gt;Amazon Nova Forge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/aws/introducing-amazon-nova-forge-build-your-own-frontier-models-using-nova/" rel="noopener noreferrer"&gt;Introducing Amazon Nova Forge: Build your own frontier models using Nova&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>llm</category>
      <category>cloud</category>
    </item>
    <item>
      <title>ClawdBot Security Guide</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 02 Feb 2026 14:01:02 +0000</pubDate>
      <link>https://forem.com/eyalestrin/clawdbot-security-guide-5595</link>
      <guid>https://forem.com/eyalestrin/clawdbot-security-guide-5595</guid>
      <description>&lt;p&gt;Clawdbot (now renamed &lt;a href="https://www.molt.bot/" rel="noopener noreferrer"&gt;Moltbot&lt;/a&gt;) is an open-source, self-hosted AI assistant that runs on your own hardware or server and can-do things, not just chat.&lt;br&gt;&lt;br&gt;
It was created by developer &lt;a href="https://steipete.me/about" rel="noopener noreferrer"&gt;Peter Steinberger&lt;/a&gt; in late 2025.&lt;br&gt;&lt;br&gt;
It connects your AI model (OpenAI, Claude, local models via Ollama) to real capabilities: automate workflows, read/write files, execute tools and scripts, manage emails/calendars, and respond through messaging apps like WhatsApp, Telegram, Discord and Slack.&lt;br&gt;&lt;br&gt;
You interact with it like a smart assistant that actually takes action based on your input.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What is it used for?
&lt;/h2&gt;

&lt;p&gt;Clawdbot functions as a "digital employee" or a "Jarvis-like" assistant that operates 24/7. Because it has direct access to your local filesystem and system tools, it can perform proactive tasks that standard AI cannot:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Communication Hub&lt;/strong&gt;: It lives inside messaging apps like Telegram, WhatsApp, or Slack. You text it commands, and it can reply, summarize threads, or manage your inbox.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proactive Automation&lt;/strong&gt;: It can monitor your email, calendar, and GitHub repositories to fix bugs while you sleep, draft replies, or alert you to flight check-ins.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System Execution&lt;/strong&gt;: It can run shell commands, execute scripts, manage files, and even control web browsers to perform actions like making purchases or reservations.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Memory&lt;/strong&gt;: It maintains long-term context across conversations, remembering your preferences and past tasks for weeks or months.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is a sample deployment architecture of Clawdbot:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4o6slu7fg5y0xiwsr100.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4o6slu7fg5y0xiwsr100.png" alt=" " width="750" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Security risks associated with Clawdbot
&lt;/h2&gt;

&lt;p&gt;Clawdbot is a high-privilege automation control plane. Since it manages agents, tools, and multiple communication channels, it presents serious security risks.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Control plane exposure &amp;amp; misconfiguration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Exposure&lt;/strong&gt;: Misconfigured dashboards and reverse proxies have left hundreds of control interfaces open to the internet.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication Failures&lt;/strong&gt;: Some setups treat remote connections as local, letting attackers bypass authentication.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Theft&lt;/strong&gt;: Unsecured instances can expose API keys, conversation logs, and configuration data.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System Takeover&lt;/strong&gt;: In certain cases, attackers can run commands on the host with elevated privileges.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Prompt injection &amp;amp; tool blast radius
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manipulation&lt;/strong&gt;: Malicious or untrusted content can trick the AI into using tools in unintended ways.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blast Radius&lt;/strong&gt;: Access to high-privilege tools like shell commands or admin APIs means a prompt injection could lead to data theft or lateral movement across the network.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Weakness&lt;/strong&gt;: Older or poorly aligned AI models are more likely to ignore safety instructions, increasing risk.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Social engineering and user level abuse
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deception&lt;/strong&gt;: Attackers can manipulate the bot to extract personal or environment-specific information.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Account Misuse&lt;/strong&gt;: Connected commerce tools could be used for unauthorized purchases.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phishing&lt;/strong&gt;: A compromised bot can send malicious links or scripts to contacts.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upstream Data Exposure&lt;/strong&gt;: Prompts and tool outputs sent to AI providers can create privacy or compliance issues if not carefully managed.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data privacy, logs, and long term memory
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sensitive Data Exposure&lt;/strong&gt;: The gateway stores conversation histories and memory, which may include personal or business information depending on usage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard and Host Vulnerabilities&lt;/strong&gt;: Exposed dashboards or weak host protections can allow attackers to access past chats, file transfers, and stored credentials (API keys, tokens, OAuth secrets), turning the instance into a data exfiltration point.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upstream Data Risk&lt;/strong&gt;: Prompts and tool outputs are sent to AI providers. Without proper scoping and data classification, this can create privacy and compliance issues.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Ecosystem risks: hijacked branding, fake installers, and scams
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hijacked Accounts&lt;/strong&gt;: After a rebrand, original social media and GitHub handles were exploited by scammers promoting fake crypto tokens.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Malware Risk&lt;/strong&gt;: Users searching for the tool may encounter backdoored versions or fake installers designed to compromise their systems.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Network and Remote Access Risks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Browser Control&lt;/strong&gt;: Tools that let the bot control a browser can expose local or internal network resources if not secured.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tunneling Errors&lt;/strong&gt;: Misconfigured reverse proxies or tools like Tailscale may grant attackers unintended access to private networks.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Recommendations for securing Clawdbot
&lt;/h2&gt;

&lt;p&gt;Based on the official GitHub repository, documentation, and expert audits from January 2026, here are the recommendations for securing your instance.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Lock Down the Gateway
&lt;/h3&gt;

&lt;p&gt;Bind the Clawdbot gateway to loopback (127.0.0.1) and never expose it directly to the internet. If remote access is required, use private mesh solutions such as Tailscale or Cloudflare Tunnel. Always enable gateway authentication using tokens or passwords.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/moltbot/moltbot/security" rel="noopener noreferrer"&gt;Official GitHub Security Overview&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.clawd.bot/" rel="noopener noreferrer"&gt;Clawdbot Remote Access Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
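&lt;p&gt;A hypothetical sketch of that posture is below. The file name and configuration keys are illustrative, not Clawdbot's actual schema (check the project documentation for the real option names); the point is the pattern: loopback-only bind plus a required auth token kept out of source control:  &lt;/p&gt;

```shell
# Illustrative gateway hardening: loopback-only bind plus token auth.
# File name and keys are hypothetical, not Clawdbot's real config schema.
GATEWAY_TOKEN="$(head -c 32 /dev/urandom | base64)"   # random auth token
cat > gateway.json <<EOF
{
  "bind": "127.0.0.1",
  "port": 18789,
  "auth": { "mode": "token", "token": "${GATEWAY_TOKEN}" }
}
EOF
chmod 600 gateway.json   # the config holds a secret; restrict it to the owner
```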

&lt;h3&gt;
  
  
  Enforce Strict Access Controls
&lt;/h3&gt;

&lt;p&gt;Restrict who can interact with Clawdbot by enforcing DM pairing or allowlists. Avoid wildcard policies in production. In group chats, require explicit mentions before the bot processes messages.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/clawdbot/clawdbot/blob/main/SECURITY.md" rel="noopener noreferrer"&gt;Official GitHub SECURITY.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Isolate the Runtime Environment
&lt;/h3&gt;

&lt;p&gt;Run Clawdbot on dedicated hardware or a dedicated VM/container. Avoid running it on your primary workstation. Use Docker sandboxing with minimal mounts and dropped capabilities.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.google.com/search?q=https://docs.clawd.bot/getting-started" rel="noopener noreferrer"&gt;Clawdbot Getting Started Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/clawdbot/clawdbot/security" rel="noopener noreferrer"&gt;Official GitHub Security Overview&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
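&lt;p&gt;As an illustrative launcher, the Docker flags below express "minimal mounts and dropped capabilities" concretely. The image name, mount path, and port are placeholders; adapt them to your own deployment:  &lt;/p&gt;

```shell
# Illustrative sandboxed launch: read-only root filesystem, all Linux
# capabilities dropped, no privilege escalation, one read-only workspace
# mount, and the port published on loopback only.
# Image name, paths, and port are placeholders.
cat > run-clawdbot.sh <<'EOF'
#!/bin/sh
exec docker run --rm \
  --read-only --tmpfs /tmp \
  --cap-drop=ALL --security-opt no-new-privileges \
  --memory 1g --pids-limit 256 \
  -v "$HOME/agent-workspace:/workspace:ro" \
  -p 127.0.0.1:18789:18789 \
  clawdbot/clawdbot:latest
EOF
chmod +x run-clawdbot.sh
```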

&lt;h3&gt;
  
  
  Sandbox and Restrict Tools
&lt;/h3&gt;

&lt;p&gt;Enable sandboxing for all high-risk tools such as exec, write, browser automation, and web access. Use tool allow/deny lists and restrict elevated tools to trusted users only.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/moltbot/moltbot/security" rel="noopener noreferrer"&gt;Official GitHub Security Overview&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Apply Least Privilege to Agent Capabilities
&lt;/h3&gt;

&lt;p&gt;Disable interactive shells unless strictly necessary. Limit filesystem visibility to read-only mounts where possible. Avoid granting elevated privileges to agents handling untrusted input.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.molt.bot/" rel="noopener noreferrer"&gt;Official Clawdbot Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Secure Credentials and Secrets
&lt;/h3&gt;

&lt;p&gt;Store secrets in environment variables, not configuration files or source control. Apply strict filesystem permissions to Clawdbot directories and rotate credentials after any suspected incident.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/moltbot/moltbot/security" rel="noopener noreferrer"&gt;Official Clawdbot Security Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
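The secrets guidance can be illustrated with a small helper that refuses to start without its token in the environment and tightens permissions on the agent's state directory. The variable name `CLAWDBOT_TOKEN` and the directory path are assumptions made for the example.

```python
import os
import stat

# Read a secret from the environment instead of a config file;
# fail fast if it is missing. The variable name is a placeholder.
def require_secret(name):
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value

# Restrict a state directory to the owner only (mode 0700), so other
# local users cannot read session transcripts or cached credentials.
def lock_down(path):
    os.makedirs(path, exist_ok=True)
    os.chmod(path, stat.S_IRWXU)

os.environ.setdefault("CLAWDBOT_TOKEN", "example-only")  # demo value
token = require_secret("CLAWDBOT_TOKEN")
lock_down("/tmp/clawdbot-state")
```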

&lt;h3&gt;
  
  
  Continuous Auditing and Monitoring
&lt;/h3&gt;

&lt;p&gt;Regularly run built-in security audit and doctor commands to detect unsafe configurations. Monitor logs and session transcripts for anomalous behavior or unexpected access.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/clawdbot/clawdbot/security" rel="noopener noreferrer"&gt;Official GitHub Security CLI Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Harden Browser Automation
&lt;/h3&gt;

&lt;p&gt;Treat browser automation as operator-level access. Use dedicated browser profiles without password managers or sync enabled. Never expose browser control ports publicly.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Prompt-Level Safety Rules
&lt;/h3&gt;

&lt;p&gt;Define explicit system rules that prevent disclosure of credentials, filesystem structure, or infrastructure details. Require confirmation for destructive actions.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/moltbot/moltbot/security" rel="noopener noreferrer"&gt;Official Clawdbot Security Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Incident Response Preparedness
&lt;/h3&gt;

&lt;p&gt;Maintain a documented response plan. If compromise is suspected: stop the gateway, revoke access, rotate all secrets, review logs, and re-run security audits.&lt;br&gt;&lt;br&gt;
Reference:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/moltbot/moltbot/security" rel="noopener noreferrer"&gt;Official Clawdbot Security Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;ClawdBot is a high-privilege AI agent that can act on your system, not just chat. Its main risks come from exposed gateways, weak access controls, and powerful tools combined with prompt injection or social engineering, which can lead to system compromise and data loss. To use it safely, lock the gateway to localhost with authentication, restrict who can interact with it, isolate its runtime, minimize tool permissions, and monitor it continuously.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://snyk.io/articles/clawdbot-ai-assistant/" rel="noopener noreferrer"&gt;Your Clawdbot AI Assistant Has Shell Access and One Prompt Injection Away from Disaster&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://lukasniessen.medium.com/clawdbot-setup-guide-how-to-not-get-hacked-63bc951cbd90" rel="noopener noreferrer"&gt;ClawdBot: Setup Guide + How to NOT Get Hacked&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://x.com/mrnacknack/status/2016134416897360212" rel="noopener noreferrer"&gt;10 ways to hack into a vibecoder's clawdbot &amp;amp; get entire human identity (educational purposes only)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/pulse/hacking-clawdbot-eating-lobster-souls-jamieson-o-reilly-whhlc/" rel="noopener noreferrer"&gt;Hacking clawdbot and eating lobster souls&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/pulse/hackedin-eating-lobster-souls-part-ii-supply-chain-aka-o-reilly-lbaac/" rel="noopener noreferrer"&gt;Eating lobster souls Part II: the supply chain&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/pulse/hackedin-eating-lobster-souls-part-iii-finale-escape-moltrix-gsamc/" rel="noopener noreferrer"&gt;Eating lobster souls Part III (the finale): Escape the Moltrix&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>cybersecurity</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Securing AI Skills</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 26 Jan 2026 15:10:55 +0000</pubDate>
      <link>https://forem.com/aws-builders/securing-ai-skills-52jg</link>
      <guid>https://forem.com/aws-builders/securing-ai-skills-52jg</guid>
      <description>&lt;p&gt;If you give an AI system the ability to act, you give it risk.&lt;br&gt;&lt;br&gt;
In earlier posts, I covered how to secure &lt;a href="https://medium.com/aws-in-plain-english/securing-mcp-servers-4a1872b530cf" rel="noopener noreferrer"&gt;MCP servers&lt;/a&gt; and &lt;a href="https://medium.com/aws-in-plain-english/securing-agentic-ai-systems-a04804eb0b01" rel="noopener noreferrer"&gt;agentic AI systems&lt;/a&gt;. This post focuses on a narrower but more dangerous layer: AI skills. These are the tools that let models touch the real world.&lt;br&gt;&lt;br&gt;
Once a model can call an API, run code, or move data, it stops being just a reasoning engine. It becomes an operator.&lt;br&gt;&lt;br&gt;
That is where most security failures happen.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Terminology
&lt;/h2&gt;

&lt;p&gt;In generative AI, "skills" describe the interfaces that allow a model to perform actions outside its own context.&lt;br&gt;&lt;br&gt;
Different vendors use different names:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt;: Function calling and MCP-based interactions
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins&lt;/strong&gt;: Web-based extensions used by chatbots
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actions&lt;/strong&gt;: OpenAI GPT Actions and AWS Bedrock Action Groups
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agents&lt;/strong&gt;: Systems that reason and execute across multiple steps
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A base LLM predicts text; a skill gives it hands.&lt;br&gt;&lt;br&gt;
Skills are pre-defined interfaces that expose code, APIs, or workflows. When a model decides that text alone is not enough, it triggers a skill.&lt;br&gt;&lt;br&gt;
Anthropic treats skills as instruction-and-script bundles loaded at runtime.&lt;br&gt;&lt;br&gt;
OpenAI uses modular functions inside Custom GPTs and agents.&lt;br&gt;&lt;br&gt;
AWS implements the same idea through Action Groups.&lt;br&gt;&lt;br&gt;
Microsoft applies the term across Copilot and Semantic Kernel.&lt;br&gt;&lt;br&gt;
NVIDIA uses skills in its digital human platforms.&lt;br&gt;&lt;br&gt;
In the reference high-level architecture below, we can see the relations between the components:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd387i208zigldbjtgvdj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd387i208zigldbjtgvdj.png" alt=" " width="750" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Skills Are Dangerous
&lt;/h2&gt;

&lt;p&gt;Every skill expands the attack surface. The model sits in the middle, deciding what to call and when. If it is tricked, the skill executes anyway.&lt;br&gt;&lt;br&gt;
The most common failure modes:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Excessive agency&lt;/strong&gt;: Skills often have broader permissions than they need. A file-management skill with system-level access is a breach waiting to happen.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The consent gap&lt;/strong&gt;: Users approve skills as a bundle. They rarely inspect the exact permissions. Attackers hide destructive capability inside tools that appear harmless.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Procedural and memory poisoning&lt;/strong&gt;: Skills that retain instructions or memory can be slowly corrupted. This does not cause an immediate failure. It changes behavior over time.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privilege escalation through tool chaining&lt;/strong&gt;: Multiple tools can be combined to bypass intended boundaries. A harmless read operation becomes a write. A write becomes execution.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indirect prompt injection&lt;/strong&gt;: Malicious instructions are placed in content that the model reads: emails, web pages, documents. The model follows them using its own skills.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data exfiltration&lt;/strong&gt;: Skills often require access to sensitive systems. Once compromised, they can leak source code, credentials, or internal records.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supply chain risk&lt;/strong&gt;: Skills rely on third-party APIs and libraries. A poisoned update propagates instantly.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent-to-agent spread&lt;/strong&gt;: In multi-agent systems, one compromised skill can affect others. Failures cascade.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unsafe execution and RCE&lt;/strong&gt;: Any skill that runs code without isolation is exposed to remote code execution.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insecure output handling&lt;/strong&gt;: Raw outputs passed directly to users can cause data leaks or client-side exploits.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSRF&lt;/strong&gt;: Fetch-style skills can be abused to probe internal networks.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Secure Skills (What Actually Works)
&lt;/h2&gt;

&lt;p&gt;Treat skills like production services. Because they are.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Identity and Access Management
&lt;/h3&gt;

&lt;p&gt;Each skill must have its own identity. No shared credentials. No broad roles.&lt;br&gt;&lt;br&gt;
Permissions should be minimal and continuously evaluated. This directly addresses OWASP LLM06: Excessive Agency.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://genai.owasp.org/llmrisk/llm062025-excessive-agency/" rel="noopener noreferrer"&gt;OWASP LLM06:2025 Excessive Agency&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  AWS Bedrock
&lt;/h4&gt;

&lt;p&gt;Assign granular IAM roles per agent. Restrict regions and models with SCPs. Limit Action Groups to specific Lambda functions.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/prescriptive-guidance/latest/strategy-enterprise-ready-gen-ai-platform/security.html" rel="noopener noreferrer"&gt;Security and governance for generative AI platforms on AWS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-tool.html" rel="noopener noreferrer"&gt;Execute code and analyze data using Amazon Bedrock AgentCore Code Interpreter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
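As a sketch of "limit Action Groups to specific Lambda functions", the agent's execution role can be scoped to invoke exactly one function ARN instead of a wildcard. The region, account ID, and function name below are placeholders, not values taken from AWS documentation.

```python
import json

# Least-privilege IAM policy sketch for a Bedrock agent's execution
# role: it may invoke exactly one action-group Lambda and nothing else.
# The region, account ID, and function name are placeholders.
FUNCTION_ARN = "arn:aws:lambda:us-east-1:111122223333:function:orders-action-group"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": FUNCTION_ARN,   # one function, not "*"
        }
    ],
}

print(json.dumps(policy, indent=2))
```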

&lt;h4&gt;
  
  
  OpenAI
&lt;/h4&gt;

&lt;p&gt;Never expose API keys client-side. Use project-scoped keys and backend proxies.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety" rel="noopener noreferrer"&gt;Best Practices for API Key Safety&lt;/a&gt;  &lt;/p&gt;
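The backend-proxy pattern means the browser never sees the key: the client calls your server, and only the server attaches the `Authorization` header. A minimal sketch, assuming the key lives in an `OPENAI_API_KEY` environment variable; the endpoint path is illustrative, and the request is only constructed, not sent.

```python
import os
import urllib.request

# Build the upstream request server-side so the API key never leaves
# the backend. Sending it would require network access and a real key.
def build_upstream_request(payload: bytes):
    api_key = os.environ["OPENAI_API_KEY"]        # server-side only
    return urllib.request.Request(
        "https://api.openai.com/v1/responses",    # illustrative endpoint
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

os.environ.setdefault("OPENAI_API_KEY", "sk-example-only")  # demo value
req = build_upstream_request(b'{"input": "hello"}')
```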

&lt;h3&gt;
  
  
  Input and Output Guardrails
&lt;/h3&gt;

&lt;p&gt;Prompt injection is not theoretical. It is the default attack.&lt;br&gt;&lt;br&gt;
Map OWASP LLM risks directly to controls.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://owasp.org/www-project-top-10-for-large-language-model-applications/" rel="noopener noreferrer"&gt;OWASP Top 10 for Large Language Model Applications&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  AWS Bedrock
&lt;/h4&gt;

&lt;p&gt;Use Guardrails with prompt-attack detection and PII redaction.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://aws.amazon.com/bedrock/guardrails/" rel="noopener noreferrer"&gt;Amazon Bedrock Guardrails&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  OpenAI
&lt;/h4&gt;

&lt;p&gt;Use zero-retention mode for sensitive workflows.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://platform.openai.com/docs/guides/your-data" rel="noopener noreferrer"&gt;Data controls in the OpenAI platform&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic
&lt;/h4&gt;

&lt;p&gt;Use constitutional prompts, but still enforce external moderation.&lt;br&gt;&lt;br&gt;
Reference: &lt;a href="https://www.anthropic.com/news/building-safeguards-for-claude" rel="noopener noreferrer"&gt;Building safeguards for Claude&lt;/a&gt;  &lt;/p&gt;

&lt;h3&gt;
  
  
  Adversarial Testing
&lt;/h3&gt;

&lt;p&gt;Red-team your agents.&lt;br&gt;&lt;br&gt;
Test prompt injection, RAG abuse, tool chaining, and data poisoning during development. Not after launch.&lt;br&gt;&lt;br&gt;
Threat modeling frameworks from OWASP, NIST, and Google apply here with minimal adaptation.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/whitepapers/latest/navigating-security-landscape-genai/threat-modeling-for-generative-ai-applications.html" rel="noopener noreferrer"&gt;Threat modeling for generative AI applications&lt;/a&gt;  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloudsecurityalliance.org/artifacts/ai-model-risk-management-framework" rel="noopener noreferrer"&gt;AI Model Risk Management Framework&lt;/a&gt;  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  DevSecOps Integration
&lt;/h3&gt;

&lt;p&gt;Every endpoint a skill calls is part of your attack surface.&lt;br&gt;&lt;br&gt;
Run SAST and DAST on the skill code. Scan dependencies. Fail builds when violations appear.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/devops/using-generative-ai-amazon-bedrock-and-amazon-codeguru-to-improve-code-quality-and-security/" rel="noopener noreferrer"&gt;Using Generative AI, Amazon Bedrock, and Amazon CodeGuru to Improve Code Quality and Security&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Isolation and Network Controls
&lt;/h3&gt;

&lt;p&gt;Code-executing skills must run in ephemeral, sandboxed environments.&lt;br&gt;&lt;br&gt;
No host access. No unrestricted outbound traffic.&lt;br&gt;&lt;br&gt;
Use private networking wherever possible:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/vpc-interface-endpoints.html" rel="noopener noreferrer"&gt;AWS PrivateLink&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Logging, Monitoring, and Privacy
&lt;/h3&gt;

&lt;p&gt;If you cannot audit skill usage, you cannot secure it.&lt;br&gt;&lt;br&gt;
Enable full invocation logging and integrate with existing SIEM tools.&lt;br&gt;&lt;br&gt;
Ensure provider data-handling terms match your risk profile. Not all plans are equal.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/logging-using-cloudtrail.html" rel="noopener noreferrer"&gt;Monitor Amazon Bedrock API calls using CloudTrail&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.openai.com/docs/api-reference/audit-logs" rel="noopener noreferrer"&gt;OpenAI Audit Logs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview#key-security-considerations" rel="noopener noreferrer"&gt;Claude Agent Skills - Security Considerations&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Incident Response and Human Oversight
&lt;/h3&gt;

&lt;p&gt;Update incident response plans to include AI-specific failures.&lt;br&gt;&lt;br&gt;
For high-risk actions, require human approval. This is the simplest and most reliable control against runaway agents.&lt;br&gt;&lt;br&gt;
References:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/security-ir/latest/userguide/understand-threat-landscape.html" rel="noopener noreferrer"&gt;Understand the threat landscape&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/implement-human-in-the-loop-confirmation-with-amazon-bedrock-agents/" rel="noopener noreferrer"&gt;Implement human-in-the-loop confirmation with Amazon Bedrock Agents&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.openai.com/docs/guides/safety-best-practices" rel="noopener noreferrer"&gt;OpenAI Safety best practices&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
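The human-approval control can be sketched as a wrapper that refuses to run actions tagged high-risk until a reviewer confirms them. The action names and the approval callback are hypothetical; real platforms (e.g., Bedrock Agents) provide their own confirmation mechanisms.

```python
# Human-in-the-loop sketch: high-risk actions require an explicit
# approval callback before they execute. Action names are hypothetical.
HIGH_RISK = {"delete_records", "send_payment"}

def run_action(name, execute, approve):
    """Run `execute` only if `name` is low-risk or `approve(name)` is True."""
    if name in HIGH_RISK and not approve(name):
        return "blocked: awaiting human approval"
    return execute()

auto_deny = lambda name: False          # stand-in for a real review queue
result = run_action("send_payment", lambda: "sent", auto_deny)
print(result)
```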

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;AI skills are the execution layer of generative systems. They turn models from advisors into actors.&lt;br&gt;&lt;br&gt;
That shift introduces real security risk: excessive permissions, prompt injection, data leakage, and cascading agent failures.&lt;br&gt;&lt;br&gt;
Secure skills the same way you secure production services. Strong identity. Least privilege. Isolation. Guardrails. Monitoring. Human oversight.&lt;br&gt;&lt;br&gt;
There is no final state. Platforms change. Attacks evolve. Continuous testing is the job.  &lt;/p&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;Eyal Estrin is a cloud and information security architect and &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, with more than 25 years in the industry. He is the author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The views expressed are his own.  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>security</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Introducing Managed Instances in the Cloud</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Tue, 20 Jan 2026 14:12:57 +0000</pubDate>
      <link>https://forem.com/aws-builders/introducing-managed-instances-in-the-cloud-2d4h</link>
      <guid>https://forem.com/aws-builders/introducing-managed-instances-in-the-cloud-2d4h</guid>
      <description>&lt;p&gt;For many years, organizations embracing the public cloud knew there were two main types of compute services  - customer-managed (i.e., IaaS) and fully managed or Serverless compute (i.e., PaaS).&lt;br&gt;&lt;br&gt;
The main difference is who is responsible for maintenance of the underlying compute nodes in terms of OS maintenance (such as patch management, hardening, monitoring, etc.) and the scale (adding or removing compute nodes according to customer or application load).&lt;br&gt;&lt;br&gt;
In an ideal world, we would prefer a fully managed (or perhaps a Serverless) solution, but there are use cases where we would like to have the ability to manage a VM (such as the need to connect to a VM via SSH to make configuration changes at the OS level).&lt;br&gt;&lt;br&gt;
In this blog post, I will review several examples of managed instance services and compare their capabilities with the fully managed alternative.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Function as a Service
&lt;/h2&gt;

&lt;p&gt;The only alternative I managed to find is AWS Lambda Managed Instances.&lt;br&gt;&lt;br&gt;
AWS Lambda has been in the market for many years, and it is the most common Serverless compute service in the public cloud (though not the only alternative).&lt;br&gt;&lt;br&gt;
Below is a comparison between AWS Lambda and the AWS Lambda Managed Instances:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vlt09k580apia9lfc9d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vlt09k580apia9lfc9d.png" alt=" " width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  When to Use Which Alternative
&lt;/h3&gt;

&lt;p&gt;Use AWS Lambda (Standard) If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traffic is Bursty or Unpredictable&lt;/strong&gt;: You need the ability to scale from zero to thousands of concurrent executions in seconds to handle sudden spikes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low or Intermittent Volume&lt;/strong&gt;: You have idle periods where paying for running instances would be wasteful. "Scale to zero" is a priority.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict Isolation is Required&lt;/strong&gt;: Your security model relies on the strong isolation of Firecracker microVMs for every single request.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity is Key&lt;/strong&gt;: You want zero infrastructure decisions—just upload code and run.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use AWS Lambda Managed Instances If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traffic is High &amp;amp; Predictable&lt;/strong&gt;: You have steady-state workloads where paying for always-on EC2 instances (with Savings Plans) is cheaper than per-request billing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workloads are Compute/Memory Intensive&lt;/strong&gt;: You need specific hardware ratios (e.g., high CPU but low RAM) or specialized instruction sets not available in standard Lambda.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency Sensitivity&lt;/strong&gt;: You cannot afford any cold start latency and need environments that are always initialized.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High I/O Concurrency&lt;/strong&gt;: Your application performs many I/O bound tasks (like calling external APIs) and can efficiently process multiple requests on a single vCPU without blocking.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Container Service
&lt;/h2&gt;

&lt;p&gt;Amazon ECS is a highly scalable container orchestration service that automates the deployment and management of containers across AWS infrastructure.&lt;br&gt;&lt;br&gt;
Below is a comparison between Amazon ECS (self-managed EC2) and the Amazon ECS Managed Instances:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzv3hi1jtvu9zsk24ucn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzv3hi1jtvu9zsk24ucn.png" alt=" " width="800" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  When to Use Which Alternative
&lt;/h3&gt;

&lt;p&gt;Use Amazon ECS (Self-Managed EC2) If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You Need Custom AMIs&lt;/strong&gt;: Your compliance or legacy software requires a specific, hardened OS image or custom kernel modules.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You Require Host Access&lt;/strong&gt;: You need SSH access to the underlying node for deep debugging, forensic auditing, or installing host-level daemon agents that ECS doesn't support.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost is the Sole Priority&lt;/strong&gt;: You want to avoid the additional management fee and have a dedicated team that can manually optimize bin-packing and Spot instance usage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Legacy / Hybrid Constraints&lt;/strong&gt;: You are extending a specific on-premise network configuration or storage driver setup that requires manual OS configuration.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Amazon ECS Managed Instances If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You Need GPUs or High Memory&lt;/strong&gt;: You require specific hardware (like GPU instances for AI/ML) that AWS Fargate does not support, but you don't want to manage the OS.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You Want "Fargate-like" Operations with EC2 Pricing&lt;/strong&gt;: You want to offload patching and ASG management (like Fargate) but need to use Reserved Instances or Savings Plans to lower costs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Compliance&lt;/strong&gt;: You need guaranteed, automated rotation of nodes for security patching (e.g., every 14 days) without building the automation pipelines yourself.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Steady-State Workloads&lt;/strong&gt;: Your traffic is predictable, making always-on EC2 instances more cost-effective than Fargate's per-second billing.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Kubernetes Service
&lt;/h2&gt;

&lt;p&gt;Amazon EKS is a fully managed service that simplifies running, scaling, and securing containerized applications by automating the management of the Kubernetes control plane on AWS.&lt;br&gt;&lt;br&gt;
Below is a comparison between Amazon EKS (self-managed nodes) and the Amazon EKS Managed Node Groups:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs9ss946prk57kmf9ipw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs9ss946prk57kmf9ipw.png" alt=" " width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  When to Use Which Alternative
&lt;/h3&gt;

&lt;p&gt;Use Amazon EKS Managed Node Groups If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard Kubernetes Workloads&lt;/strong&gt;: You are running standard applications and want to minimize the time spent on infrastructure maintenance.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Scaling&lt;/strong&gt;: You want EKS to automatically handle the creation of Auto Scaling Groups that are natively aware of the cluster state.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated Security&lt;/strong&gt;: You want a streamlined way to apply security patches and OS updates to your cluster nodes without downtime.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational Efficiency&lt;/strong&gt;: You have a small team and need to focus on application code rather than Kubernetes "plumbing."
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Amazon EKS Self-Managed Nodes If:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Custom Operating Systems&lt;/strong&gt;: You must use a specific, hardened OS image (e.g., a highly customized Ubuntu or RHEL) that is not supported by Managed Node Groups.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex Bootstrap Scripts&lt;/strong&gt;: You need to run intricate "User Data" scripts during node startup that require fine-grained control over the initialization sequence.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unique Networking Requirements&lt;/strong&gt;: You are using specialized networking plugins or non-standard VPC configurations that require manual node configuration.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Legacy Compliance&lt;/strong&gt;: You have strict regulatory requirements that mandate manual oversight and "manual sign-off" for every single OS-level change.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this blog post, I have reviewed several compute services (FaaS, containers, and managed Kubernetes), each offering a choice between customer-managed compute nodes and compute nodes that AWS manages on the customer's behalf.&lt;br&gt;&lt;br&gt;
By leveraging AWS Lambda Managed Instances, Amazon ECS Managed Instances, and Amazon EKS Managed Node Groups, organizations can achieve high hardware performance without the burden of operational complexity. The primary advantage of this managed tier is the ability to decouple hardware selection from operating system maintenance. Developers can handpick specific EC2 families, such as GPU-optimized instances for AI or Graviton for cost efficiency, while AWS manages the heavy lifting of security patching and instance lifecycle updates.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a seasoned cloud and information security architect, &lt;a href="https://aws.amazon.com/developer/community/community-builders/" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>containers</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>When you have a hammer, everything looks like a nail</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Tue, 06 Jan 2026 15:51:13 +0000</pubDate>
      <link>https://forem.com/aws-builders/when-you-have-a-hammer-everything-looks-like-a-nail-5e7p</link>
      <guid>https://forem.com/aws-builders/when-you-have-a-hammer-everything-looks-like-a-nail-5e7p</guid>
<description>&lt;p&gt;In the ever-evolving tech world, we often see organizations (from C-Level down to architects and engineers) rush to adopt the latest technology trends without conducting proper design or truly understanding the business requirements.&lt;br&gt;&lt;br&gt;
The result of failing to do a proper design is a waste of resources (from human time to compute), over-complicated architectures, or under-utilized resources.&lt;br&gt;&lt;br&gt;
In this blog post, I will examine common architecture decisions and provide recommendations to avoid the pitfalls.&lt;br&gt;&lt;br&gt;
Let’s dig into some examples.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Moving everything to the public cloud
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;An enterprise mandates a full lift-and-shift of all workloads to a hyper-scaler to “become cloud-native,” including legacy ERP systems, mainframes, and latency-sensitive trading applications.  &lt;/p&gt;

&lt;h4&gt;
  
  
  What was misunderstood
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Some workloads had hard latency, data residency, or licensing constraints.
&lt;/li&gt;
&lt;li&gt;The applications were tightly coupled, stateful, and designed for vertical scaling.
&lt;/li&gt;
&lt;li&gt;Cost models were not analyzed beyond infrastructure savings.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Issues that emerged
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Higher total cost of ownership due to egress fees, oversized instances, and always-on resources.
&lt;/li&gt;
&lt;li&gt;Performance degradation for low-latency systems.
&lt;/li&gt;
&lt;li&gt;Operational complexity increased without gaining elasticity or resilience benefits.
&lt;/li&gt;
&lt;li&gt;Missed opportunity to modernize selectively (hybrid or refactor where justified).
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Using Kubernetes for every architecture
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;A team deploys all applications - including small internal tools, batch jobs, and simple APIs - onto a shared Kubernetes platform.  &lt;/p&gt;

&lt;h4&gt;
  
  
  What was misunderstood
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes is an orchestration platform, not a free abstraction layer.
&lt;/li&gt;
&lt;li&gt;Many workloads did not need container orchestration, autoscaling, or self-healing.
&lt;/li&gt;
&lt;li&gt;The organization lacked operational maturity for cluster management and security.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Issues that emerged
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Increased cognitive load for developers (YAML, Helm, networking, ingress, RBAC).
&lt;/li&gt;
&lt;li&gt;The platform team became a bottleneck for simple changes.
&lt;/li&gt;
&lt;li&gt;Security misconfigurations (over-permissive service accounts, exposed services).
&lt;/li&gt;
&lt;li&gt;Slower delivery compared to simpler deployment models (VMs or managed PaaS).
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Using Serverless for every solution
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;An architect mandates that all new services must be implemented using Functions-as-a-Service.  &lt;/p&gt;

&lt;h4&gt;
  
  
  What was misunderstood
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Serverless excels at event-driven, stateless, bursty workloads - not long-running or chatty processes.
&lt;/li&gt;
&lt;li&gt;Cold starts, execution limits, and state management trade-offs were ignored.
&lt;/li&gt;
&lt;li&gt;Observability and debugging differ significantly from traditional services.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Issues that emerged
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Latency spikes impacting user-facing APIs.
&lt;/li&gt;
&lt;li&gt;Complex orchestration logic spread across functions, reducing maintainability.
&lt;/li&gt;
&lt;li&gt;Higher costs for sustained workloads compared to containers or VMs.
&lt;/li&gt;
&lt;li&gt;Difficult troubleshooting due to fragmented logs and distributed execution paths.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Using GenAI to solve every problem
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;A company integrates GenAI into customer support, code reviews, security analysis, and decision-making workflows without clearly defined use cases.  &lt;/p&gt;

&lt;h4&gt;
  
  
  What was misunderstood
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GenAI produces probabilistic outputs, not deterministic answers.
&lt;/li&gt;
&lt;li&gt;Data quality, context boundaries, and hallucination risks were underestimated.
&lt;/li&gt;
&lt;li&gt;Regulatory, privacy, and intellectual property implications were not assessed.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Issues that emerged
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Incorrect or misleading responses presented as authoritative.
&lt;/li&gt;
&lt;li&gt;Leakage of sensitive data through prompts or training feedback loops.
&lt;/li&gt;
&lt;li&gt;Increased operational risk when AI outputs were trusted without validation.
&lt;/li&gt;
&lt;li&gt;High costs with unclear ROI due to overuse in low-value scenarios.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Practical recommendations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start with business drivers, not technology&lt;/strong&gt; - Define success metrics first: cost model, performance requirements, regulatory constraints, delivery speed, and operational ownership. Technology should follow these inputs - not precede them.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicitly document constraints and non-goals&lt;/strong&gt; - Latency, data residency, licensing, team skills, and operational maturity must be captured early. Many architectural failures stem from ignored or implicit constraints.
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Apply technologies where their strengths are essential&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Public cloud&lt;/strong&gt;: prioritize elasticity, managed services, and global reach - not lift-and-shift.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes&lt;/strong&gt;: use it where orchestration, portability, and scale justify its complexity.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serverless&lt;/strong&gt;: reserve it for event-driven, stateless, and bursty workloads.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GenAI&lt;/strong&gt;: apply where probabilistic output is acceptable and verifiable.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Favor simplicity as a default&lt;/strong&gt; - If a simpler architecture meets requirements, it is usually the correct choice. Complexity should be earned, not assumed.  &lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Continuously validate assumptions&lt;/strong&gt; - Revisit architectural decisions as workloads evolve. What was once justified can become technical debt when context changes.  &lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reward outcome-driven architecture&lt;/strong&gt; - Measure architects and teams on business impact, reliability, and cost efficiency - not on adoption of trendy platforms.  &lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The recurring failure pattern in modern architectures is not poor technology choice, but &lt;strong&gt;premature commitment to a tool before understanding the problem&lt;/strong&gt;. Cloud platforms, Kubernetes, Serverless, and GenAI are powerful when applied deliberately - and damaging when treated as universal defaults. When architects start with the solution, they optimize for platform elegance instead of business outcomes.  &lt;/p&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a seasoned cloud and information security architect, &lt;a href="https://aws.amazon.com/developer/community/community-builders/" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.&lt;/p&gt;

</description>
      <category>cloudcomputing</category>
      <category>kubernetes</category>
      <category>serverless</category>
      <category>ai</category>
    </item>
    <item>
      <title>Turning License Changes into Opportunity</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 29 Dec 2025 16:37:12 +0000</pubDate>
      <link>https://forem.com/aws-builders/turning-license-changes-into-opportunity-1n35</link>
      <guid>https://forem.com/aws-builders/turning-license-changes-into-opportunity-1n35</guid>
      <description>&lt;p&gt;The concept of vendor lock-in has existed for many years; organizations chose commercial, and in many cases expensive, licenses for proprietary software products to run their production workloads.&lt;br&gt;&lt;br&gt;
In the past, there was the notion that using a product from a well-known vendor was the best solution, due to support, a large customer base, and, as the famous quote says, "Nobody gets fired for buying IBM."&lt;br&gt;&lt;br&gt;
This was all true for decades, but as the software world matured, organizations began migrating workloads to the public cloud and began building modern or cloud-native applications based on open-source alternatives.&lt;br&gt;&lt;br&gt;
In this blog post, I will discuss some of the well-known case studies of switching from commercial products to open-source.  &lt;/p&gt;

&lt;h2&gt;
  
  
  From Elasticsearch to OpenSearch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.elastic.co/elasticsearch" rel="noopener noreferrer"&gt;Elasticsearch&lt;/a&gt; is a distributed search and analytics engine that stores data as JSON documents and lets you run fast full‑text search, aggregations, and log or metrics analysis across large datasets.&lt;br&gt;&lt;br&gt;
Elasticsearch, prior to 7.11, used &lt;a href="https://www.apache.org/licenses/LICENSE-2.0" rel="noopener noreferrer"&gt;Apache License 2.0&lt;/a&gt;, a permissive license allowing commercial use, modification, and distribution with minimal restrictions.&lt;br&gt;&lt;br&gt;
In January 2021, Elastic announced that starting with version 7.11, it would be relicensing its Apache 2.0 licensed code in Elasticsearch to be dual licensed under the Server-Side Public License (SSPL), a strong copyleft license that requires anyone offering the software as a service to open-source the entire service stack, and the Elastic License.&lt;br&gt;&lt;br&gt;
In August 2024, the &lt;a href="https://opensource.org/license/agpl-v3" rel="noopener noreferrer"&gt;GNU Affero General Public License&lt;/a&gt; was added to Elasticsearch version 8.16.0 as an option, making Elasticsearch free and open-source again.&lt;br&gt;&lt;br&gt;
Elastic argued that large cloud providers were taking the open‑source Elasticsearch, offering it as a commercial managed service (e.g., Amazon Elasticsearch Service) and capturing much of the value without sufficient reciprocity.&lt;br&gt;&lt;br&gt;
The license change was positioned as protecting Elastic’s SaaS/Elastic Cloud business and long‑term sustainability.&lt;br&gt;&lt;br&gt;
OpenSearch was launched by AWS and partners as a fork later in 2021, based on Elasticsearch 7.10 and Kibana 7.10, the last Apache‑2.0 versions.&lt;br&gt;&lt;br&gt;
Today, OpenSearch is no longer just an AWS side‑project; it is governed by the &lt;a href="https://opensearch.org/foundation/" rel="noopener noreferrer"&gt;OpenSearch Software Foundation&lt;/a&gt;, a &lt;a href="https://www.linuxfoundation.org/" rel="noopener noreferrer"&gt;Linux Foundation&lt;/a&gt; project that provides vendor‑neutral governance and long‑term stewardship. Premier foundation members include AWS, SAP, and Uber, all of whom either run OpenSearch in production, build products on top of it, or contribute engineering resources.&lt;br&gt;&lt;br&gt;
Among the benefits of switching to OpenSearch:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Licensing&lt;/strong&gt; - OpenSearch is Apache 2.0, so there are no SSPL/Elastic License obligations or restrictions on offering it as a managed service or embedding it in SaaS products.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor neutrality&lt;/strong&gt; - OpenSearch’s open ecosystem (self‑managed on Kubernetes/VMs or via providers like Amazon OpenSearch Service and others) reduces dependence on a single vendor and improves negotiating leverage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration&lt;/strong&gt; - OpenSearch was designed as a near drop‑in replacement for Elasticsearch 7.10, so many clients, APIs, and index formats are compatible, which lowers migration effort and risk.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt; - OpenSearch retains Elasticsearch’s horizontally scalable architecture and adds features like vector search, observability improvements, and integrations driven by a multi‑vendor community, not just one company’s roadmap.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  From Terraform to OpenTofu
&lt;/h2&gt;

&lt;p&gt;HashiCorp &lt;a href="https://developer.hashicorp.com/terraform" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; is an infrastructure as code tool that lets you define, provision, and manage cloud and on‑prem resources using declarative configuration files, enabling consistent, repeatable deployments across multiple providers.&lt;br&gt;&lt;br&gt;
HashiCorp announced the Terraform license change in August 2023, and it applies starting with versions after 1.5.x (i.e., from 1.6 onward).&lt;br&gt;&lt;br&gt;
Terraform was originally licensed under the &lt;a href="https://www.mozilla.org/en-US/MPL/2.0/" rel="noopener noreferrer"&gt;Mozilla Public License 2.0 (MPL 2.0)&lt;/a&gt;, a weak copyleft license requiring modifications to licensed files to be open-sourced while allowing proprietary code alongside, and was then relicensed to the &lt;a href="https://mariadb.com/bsl11/" rel="noopener noreferrer"&gt;Business Source License (BSL/BUSL 1.1)&lt;/a&gt;, which is a source‑available but not OSI‑approved open‑source license, introduced to restrict certain commercial/competitive uses while remaining free for typical internal infrastructure use.&lt;br&gt;&lt;br&gt;
HashiCorp stated it wanted to prevent other companies, particularly cloud vendors and platforms, from offering competing managed services built directly on top of Terraform without commercial agreements, arguing this threatened HashiCorp’s ability to invest in the product.&lt;br&gt;&lt;br&gt;
The move was framed as protecting the “commercial viability” of Terraform and other HashiCorp tools, but triggered ecosystem concerns over neutrality, long‑term trust, and vendor lock‑in.&lt;br&gt;&lt;br&gt;
In response, a group of companies and maintainers drafted the “OpenTF” manifesto and, after HashiCorp declined to revert or donate Terraform to a foundation, forked the last MPL‑licensed version (1.5.6) into a new project later named &lt;a href="https://opentofu.org/" rel="noopener noreferrer"&gt;OpenTofu&lt;/a&gt;, donated it to the Linux Foundation, and committed to keeping it under &lt;a href="https://www.mozilla.org/en-US/MPL/2.0/" rel="noopener noreferrer"&gt;MPL 2.0&lt;/a&gt; with neutral, community‑first governance. The OpenTofu fork was announced in 2023 and reached general availability in 2024.&lt;br&gt;&lt;br&gt;
The founding vendors behind OpenTofu include Gruntwork, Spacelift, Harness, env0, and Scalr, all of whom depended heavily on open Terraform and now fund or employ core maintainers for OpenTofu.&lt;br&gt;&lt;br&gt;
Among the benefits of switching to OpenTofu:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Licensing&lt;/strong&gt; - OpenTofu keeps the original MPL 2.0 open‑source license, so there are no "source‑available" or BSL terms restricting competitive SaaS or internal platform use.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor neutrality&lt;/strong&gt; - OpenTofu is governed under a neutral foundation, not a single commercial vendor, which lowers the risk that future business decisions (price, license, roadmap) will disrupt users.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration&lt;/strong&gt; - OpenTofu is intentionally Terraform‑compatible (config syntax, state format, providers), so most organizations can switch with minimal changes to modules, backends, and pipelines.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community‑driven features and transparency&lt;/strong&gt; - OpenTofu’s roadmap and code are driven by a broad contributor base, so features like client‑side state encryption and other safety improvements tend to align closely with practitioner needs.
&lt;/li&gt;
&lt;/ul&gt;
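
&lt;p&gt;To make the compatibility claim concrete, here is a minimal, hypothetical sketch (the provider version and bucket name below are illustrative): an existing Terraform configuration like this one typically runs unchanged under OpenTofu - only the CLI invocation changes, from &lt;code&gt;terraform init/plan/apply&lt;/code&gt; to &lt;code&gt;tofu init/plan/apply&lt;/code&gt;.  &lt;/p&gt;

```hcl
# Existing Terraform HCL - no edits required to run under OpenTofu.
terraform {            # OpenTofu keeps the "terraform" block name for compatibility
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Illustrative resource; state format and provider protocol are compatible.
resource "aws_s3_bucket" "example" {
  bucket = "my-example-bucket"
}
```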

&lt;h2&gt;
  
  
  From Redis to Valkey
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://redis.io/" rel="noopener noreferrer"&gt;Redis&lt;/a&gt; is an in-memory key–value data store that can act as a database, cache, and message broker, optimized for extremely low‑latency reads and writes and supporting rich data structures like strings, lists, sets, and hashes.&lt;br&gt;&lt;br&gt;
Redis changed its license in March 2024, moving from the BSD‑3‑Clause open‑source license to a dual "source‑available" model combining the &lt;a href="https://redis.io/legal/rsalv2-agreement/" rel="noopener noreferrer"&gt;Redis Source Available License v2 (RSALv2)&lt;/a&gt;, which permits use, modification, and redistribution but restricts offering the software as a competing managed service, with the Server-Side Public License (SSPLv1). The change was aimed primarily at stopping cloud providers from offering Redis as a managed service without paying or sharing more of their own code and revenue with Redis Ltd.&lt;br&gt;&lt;br&gt;
In response, in 2024, major contributors and users of Redis—including engineers from AWS, Alibaba, Google, Ericsson, Huawei, Tencent, Oracle and others—took the last BSD‑licensed Redis 7.2.4 code and forked it under the new name &lt;a href="https://valkey.io/" rel="noopener noreferrer"&gt;Valkey&lt;/a&gt;. The project was placed under Linux Foundation governance to preserve a fully open, high‑performance in‑memory key–value store that remains free from vendor lock‑in and can be safely embedded in cloud platforms, SaaS products, and managed services.&lt;br&gt;&lt;br&gt;
Valkey uses the &lt;a href="https://opensource.org/license/bsd-3-clause" rel="noopener noreferrer"&gt;BSD 3‑Clause&lt;/a&gt; license, which is a permissive, OSI‑approved open‑source license that allows free use, modification, and redistribution, including in commercial and cloud/SaaS offerings.&lt;br&gt;&lt;br&gt;
Among the benefits of switching to Valkey:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Licensing&lt;/strong&gt; - Valkey keeps a permissive BSD‑3‑Clause license, so teams avoid Redis’s newer source‑available terms and can freely offer Valkey as a managed service or embed it in SaaS without SSPL‑style obligations or commercial negotiations.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor neutrality&lt;/strong&gt; - Valkey is governed under a neutral foundation with a multi‑vendor contributor base, which reduces dependence on a single company’s business decisions and gives organizations more confidence in long‑term roadmap stability.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration&lt;/strong&gt; - Because Valkey started from the last BSD‑licensed Redis 7.2 codebase, existing clients, data structures, and usage patterns generally continue to work with minimal changes, making migrations relatively low‑risk.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt; - Valkey’s roadmap emphasizes core engine efficiency (e.g., improved multithreading, better memory usage, and clustering enhancements), so many users get similar or better performance and scalability for classic caching and queueing workloads without paying for an enterprise Redis tier.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Migrations from Elasticsearch to OpenSearch, Terraform to OpenTofu, and Redis to Valkey all stem from the same story: vendors tightened licenses to protect their commercial cloud offerings, and the ecosystem responded by creating fully open forks that restore freedom to run, modify, and offer these technologies as services.&lt;br&gt;&lt;br&gt;
These community‑governed projects preserve familiar APIs and architectures while removing restrictive licenses, so customers keep the functionality they rely on and gain long‑term legal clarity and vendor‑neutral governance.&lt;br&gt;&lt;br&gt;
For users, the benefits include reduced lock‑in, simpler compliance, and the ability to standardize on open cores that any provider can host, extend, and support, rather than being bound to a single company’s roadmap or pricing.&lt;br&gt;&lt;br&gt;
All of these examples point in the same direction: the future of core cloud‑native tools lies in truly open‑source projects backed by strong communities and foundations, not in proprietary products pretending to be open, giving organizations more control, stronger resilience, and real choice in how they run their infrastructure.  &lt;/p&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;Eyal Estrin is a seasoned cloud and information security architect, &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.  &lt;/p&gt;

</description>
      <category>productivity</category>
      <category>opensource</category>
      <category>aws</category>
      <category>tooling</category>
    </item>
    <item>
      <title>How to keep up with technology and advance your career</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 22 Dec 2025 16:19:17 +0000</pubDate>
      <link>https://forem.com/aws-builders/how-to-keep-up-with-technology-and-advance-your-career-3oa5</link>
      <guid>https://forem.com/aws-builders/how-to-keep-up-with-technology-and-advance-your-career-3oa5</guid>
      <description>&lt;p&gt;In 2023, I published a blog post titled &lt;a href="https://aws.plainenglish.io/sharing-knowledge-as-a-way-of-life-909ed7c9cb98" rel="noopener noreferrer"&gt;Sharing Knowledge as a Way of Life&lt;/a&gt;, where I suggested that knowledge sharing should become a habit because it helps raise awareness about neglected topics, build community, and enhance your professional reputation.&lt;br&gt;&lt;br&gt;
I agree that the technology world keeps changing every day, from new services announced, new capabilities related to AI, new cybersecurity risks, emerging technologies, etc.&lt;br&gt;&lt;br&gt;
The question is – how do you keep up with technology and, by doing so, advance your career and remain relevant and attractive in the tech industry?&lt;br&gt;&lt;br&gt;
In this post, I’ll explore this topic from a new perspective: how to stay up to date with technology in an era of rapid change.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Self-learning
&lt;/h2&gt;

&lt;p&gt;In the past, to learn a new technology, we used to pay money, go to a college or a training center close to our home, sit for several days in a physical class, and let an instructor feed us knowledge.&lt;br&gt;&lt;br&gt;
Sometimes, we would continue studying at home and take a certification exam to test our knowledge (and perhaps to show a certificate to potential employers).&lt;br&gt;&lt;br&gt;
In the past couple of years (I would say, sometime after the COVID pandemic), online courses have become very popular.&lt;br&gt;&lt;br&gt;
Platforms such as &lt;a href="https://platform.qa.com/login/" rel="noopener noreferrer"&gt;QA Platform&lt;/a&gt; (formerly Cloud Academy), &lt;a href="https://www.pluralsight.com/" rel="noopener noreferrer"&gt;Pluralsight&lt;/a&gt;, &lt;a href="https://www.udemy.com/" rel="noopener noreferrer"&gt;Udemy&lt;/a&gt;, or &lt;a href="https://www.linkedin.com/learning-login/" rel="noopener noreferrer"&gt;LinkedIn Learning&lt;/a&gt; became the main source of self-learning courses.&lt;br&gt;&lt;br&gt;
If your main focus is cloud computing, AWS has &lt;a href="https://skillbuilder.aws/" rel="noopener noreferrer"&gt;AWS Skill Builder&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The mentioned platforms offer anyone, from a newbie to a practitioner, the ability to learn at their own pace, from anywhere (home, internet café, etc.), read, listen to recorded lectures, and gain hands-on experience by practicing in test labs.&lt;br&gt;&lt;br&gt;
Naturally, theoretical knowledge alone has limited value.&lt;br&gt;&lt;br&gt;
If you are studying, for example, a new cloud technology, I recommend that you create an account with one of the cloud providers, enter a credit card, and gain hands-on experience by deploying services, building applications, writing some code, and sharing it in your Git repo, so anyone can learn from you.&lt;br&gt;&lt;br&gt;
I highly recommend setting aside spare time (at least one hour, but preferably more) each week to learn something new, practice, and gain hands-on experience.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Public events
&lt;/h2&gt;

&lt;p&gt;I believe there is a limit to how much you could learn by yourself, and this is why I recommend taking advantage of public events such as webinars (where you can connect from anywhere), community meetups (such as &lt;a href="https://www.meetup.com/" rel="noopener noreferrer"&gt;meetup.com&lt;/a&gt;, or &lt;a href="https://www.eventbrite.com/" rel="noopener noreferrer"&gt;Eventbrite&lt;/a&gt;), community platforms (such as &lt;a href="https://slack.com/" rel="noopener noreferrer"&gt;Slack&lt;/a&gt; or &lt;a href="https://discord.com/" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;), and finally industry conferences (in almost any topic you could think of).&lt;br&gt;&lt;br&gt;
If you are attending a conference, here are some tips I can share with you to get the most out of conferences:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prepare in advance&lt;/strong&gt; – Usually, conferences publish an agenda with a list of topics, tracks, and lectures. Before attending, it is highly recommended to review the list of lectures, select the topics closest to your work, and mix them with topics you're less familiar with.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be humble&lt;/strong&gt; – Don't assume you already know everything. Sit in on lectures, listen to the lecturer, ask questions, perhaps even take some pictures with your phone (to be able to review slides later), and allow yourself to expand your knowledge.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engage&lt;/strong&gt; – Socialize with other attendees during the conference: catch up with past colleagues who may also be there, and allow yourself to meet new people, exchange ideas, ask questions, and share knowledge.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visit vendor booths&lt;/strong&gt; – Speak with salespeople (yes, I know that their job is to sell you something you don't necessarily need…), learn about their offering, ask questions, and if you're really interested, schedule a follow-up meeting.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gain hands-on experience&lt;/strong&gt; – Participate in workshops (don't forget to bring a laptop…); nothing compares to the knowledge you gain by actually deploying things and taking part in labs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share key takeaways&lt;/strong&gt; – Whether you wrote notes during a conference, took pictures with your phone, or received written material (such as PDFs, or links to vendor sites, Git repos, etc.), take the time after the conference to write your own inputs, and share them with your colleagues.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Knowledge sharing
&lt;/h2&gt;

&lt;p&gt;The most powerful way to advance your career is by sharing your knowledge and expertise; personally, I prefer to write in English to reach an audience from all around the world.&lt;br&gt;&lt;br&gt;
It doesn't matter which platform you choose; whatever you do will advance your career.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Develop soft skills&lt;/strong&gt; – The most important quality for anyone in the tech industry is the ability to communicate with others. It may be small talk with your peers during a coffee break, a conversation with a customer about an issue they're having, or explaining a technological topic to a senior manager in business terms.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write a blog post&lt;/strong&gt; – This is an excellent way for anyone who has something to share and doesn't feel comfortable in front of an audience. You may share personal opinions on a topic, how-to guidelines, or even code samples. You don't even have to be an expert in a specific topic; whatever you share, people will read, and if it's valuable, people will follow your posts regularly.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Record videos or podcasts&lt;/strong&gt; – Both YouTube and podcast platforms (such as Spotify) have become very popular in the past decade. Begin small, share your insights, promote your recordings over social media, and begin to attract followers around the world.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide lectures&lt;/strong&gt; – Regardless of the platform you choose, lectures are a great way to share knowledge and engage with colleagues and peers. You can choose video lectures (such as Zoom), on-site talks in small groups, or on stage in front of a large audience, whatever you feel comfortable with. This is a great way to build your confidence and brand and advance your career.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mentorship&lt;/strong&gt; – Mentorship pairs someone who has deep knowledge (in at least one domain) and is generous enough to share it with those looking to grow. You can do it in one-on-one meetings, or even in small groups (since large groups tend to be ineffective, in my experience). Remember to give your mentees honest feedback, and don't forget to ask for feedback on your own work, so you can learn from your mistakes.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this blog post, I shared many ways anyone in the tech industry can expand their knowledge, gain experience, build a reputation, and advance their career to the next level.&lt;br&gt;&lt;br&gt;
Learning never stops; there is always a next level to reach in any topic.&lt;br&gt;&lt;br&gt;
According to &lt;a href="https://www.linkedin.com/in/wernervogels" rel="noopener noreferrer"&gt;Werner Vogels&lt;/a&gt;, Amazon CTO, a T-shaped person is someone who has deep expertise (the vertical bar of the T) in one specific domain, such as software development, cloud architecture, or data science, combined with broad knowledge and skills (the horizontal bar of the T) across multiple disciplines, such as communication, systems thinking, and collaboration.&lt;br&gt;&lt;br&gt;
To advance your career, you should always strive to build both depth and breadth in multidisciplinary domains.  &lt;/p&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;Eyal Estrin is a seasoned cloud and information security architect, &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.  &lt;/p&gt;

</description>
      <category>career</category>
      <category>learning</category>
      <category>community</category>
      <category>resources</category>
    </item>
    <item>
      <title>Goodbye to Static Credentials: Embrace Modern Identity Practices</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 15 Dec 2025 17:13:55 +0000</pubDate>
      <link>https://forem.com/aws-builders/goodbye-to-static-credentials-embrace-modern-identity-practices-163m</link>
      <guid>https://forem.com/aws-builders/goodbye-to-static-credentials-embrace-modern-identity-practices-163m</guid>
      <description>&lt;p&gt;When organizations used to build applications in the past (mostly on-prem, but also in the public cloud), a common practice for allowing services to authenticate between each other was to create a service account (sometimes referred to as an application account) and embed its credentials in code or configuration files.&lt;br&gt;&lt;br&gt;
Another common way to gain access to services was to use static credentials such as keys. For example – &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html" rel="noopener noreferrer"&gt;AWS IAM user access keys&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
In this blog post, I will explain the risks related to using static credentials and provide recommendations when designing and building modern applications in the cloud.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to static credentials
&lt;/h2&gt;

&lt;p&gt;Before we begin a conversation about static credentials, it is important to understand why we need them in the first place.&lt;br&gt;&lt;br&gt;
Naturally, we don’t want to use a human’s credentials (such as those of a developer, a DevOps engineer, or a DBA) in code or configuration files to authenticate an application component (such as an API endpoint or a front-end web application) to a backend service (such as a database, storage, or message queue).&lt;br&gt;&lt;br&gt;
The most common practice for many years, which originated in on-prem legacy applications, was to create a service or application account and use it for non-interactive login.&lt;br&gt;&lt;br&gt;
Such identities are now known as NHIs (non-human identities). Since they were used by applications rather than for human/interactive login, they were assigned static credentials (hopefully long and complex passwords) that, in most cases, were kept permanent and never rotated (i.e., long-lived credentials).  &lt;/p&gt;

&lt;h2&gt;
  
  
  Static credentials risks
&lt;/h2&gt;

&lt;p&gt;Static credentials create persistent and often invisible weaknesses in cloud environments, offering attackers simple entry points while limiting an organization’s ability to detect misuse, enforce strong controls, or contain incidents effectively.&lt;br&gt;&lt;br&gt;
Below is a list of risks relating to the use of static credentials:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High blast radius&lt;/strong&gt; - Broad, persistent access makes any compromise immediately severe.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Susceptibility to accidental exposure&lt;/strong&gt; - Frequent real-world leaks through code repos, logs, and CI artifacts make this a major threat vector.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of automatic short-lived expiration&lt;/strong&gt; - Keys remain valid long after they should not, enabling silent long-term abuse.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Difficult rotation and poor key hygiene&lt;/strong&gt; - Operational friction leads to rarely rotated, aging credentials that attackers can exploit for extended periods.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No strong binding to workloads&lt;/strong&gt; - Attackers can use stolen credentials from any location or infrastructure, increasing exploitability.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credential reuse across environments&lt;/strong&gt; - Compromise of a single environment cascades laterally, expanding impact.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limited visibility and enforcement&lt;/strong&gt; - Weak contextual signals hinder detection and prevent the application of strong access controls.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Target for automated reconnaissance&lt;/strong&gt; - Attackers routinely scan public sources and harvest exposed keys, so any leaked credential is likely to be discovered and abused quickly.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Poor alignment with zero trust principles&lt;/strong&gt; - Creates structural security gaps but typically manifests through other higher-ranked risks.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational drag on incident response&lt;/strong&gt; - Increases containment time but is a downstream effect rather than a primary threat.
&lt;/li&gt;
&lt;/ul&gt;
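
&lt;p&gt;To illustrate the accidental-exposure risk above, here is a minimal secret-scanning check in Python, similar in spirit to tools such as gitleaks or trufflehog. The two patterns shown are examples only: the AWS access key ID format (AKIA followed by 16 characters) is documented by AWS, while the password pattern is a rough heuristic.&lt;/p&gt;

```python
import re

# Illustrative patterns only; real secret scanners ship far more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_password_assignment": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]"),
}

def scan_for_secrets(text):
    """Return the names of the secret patterns found in a blob of code/config."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```

&lt;p&gt;Running such a check in a CI pipeline helps catch static credentials before they ever reach a Git repository.&lt;/p&gt;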

&lt;h2&gt;
  
  
  Vault as an interim solution
&lt;/h2&gt;

&lt;p&gt;In the past, organizations deployed solutions such as &lt;a href="https://docs.cyberark.com/pam-self-hosted/latest/en/content/pasimp/introducing-the-privileged-account-security-solution-intro.htm" rel="noopener noreferrer"&gt;CyberArk Privileged Access Manager&lt;/a&gt; or &lt;a href="https://www.beyondtrust.com/products/password-safe" rel="noopener noreferrer"&gt;BeyondTrust Password Safe&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
As organizations embraced the public cloud, they began to use managed secrets managers such as &lt;a href="https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html" rel="noopener noreferrer"&gt;AWS Secrets Manager&lt;/a&gt; or &lt;a href="https://developer.hashicorp.com/vault/docs" rel="noopener noreferrer"&gt;HashiCorp Vault&lt;/a&gt; (as a vendor-agnostic solution).  &lt;/p&gt;

&lt;p&gt;Although these solutions assisted in managing the entire lifecycle of static credentials (generation, storage, retrieval, and revocation), and credentials were no longer as long-lived as in the past, they did not eliminate the underlying problem of static credentials.  &lt;/p&gt;
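
&lt;p&gt;As a minimal sketch of the vault pattern, the helper below retrieves a JSON secret at runtime through an injected client that exposes the Secrets Manager get_secret_value call (for example, a boto3 "secretsmanager" client). The helper name and the secret's JSON layout are illustrative.&lt;/p&gt;

```python
import json

def fetch_db_credentials(secrets_client, secret_id):
    """Retrieve a JSON secret at runtime instead of embedding it in code.

    secrets_client is expected to expose the Secrets Manager
    get_secret_value(SecretId=...) call, e.g. boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])
```

&lt;p&gt;Injecting the client keeps the helper testable and keeps all credential material out of code and configuration files.&lt;/p&gt;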

&lt;h2&gt;
  
  
  The Long-term solution
&lt;/h2&gt;

&lt;p&gt;The recommended solution is to avoid long-lived credentials when building modern applications in the cloud.&lt;br&gt;&lt;br&gt;
The right short-lived alternative depends on the use case.  &lt;/p&gt;

&lt;h3&gt;
  
  
  General purpose
&lt;/h3&gt;

&lt;p&gt;For most cloud workloads (such as compute, storage, database, messaging, integration, APIs, etc.), use a solution such as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html" rel="noopener noreferrer"&gt;AWS IAM Role&lt;/a&gt; - is a temporary permission identity that workloads or users can assume to access AWS resources without relying on long-lived credentials.
&lt;/li&gt;
&lt;/ul&gt;
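
&lt;p&gt;For illustration, what makes a role assumable is its trust policy. The snippet below uses standard IAM trust-policy syntax to let EC2 instances assume the role (the role's permission policies are attached separately); the EC2 principal is just one example of a trusted service.&lt;/p&gt;

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
```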

&lt;h3&gt;
  
  
  Managed Kubernetes
&lt;/h3&gt;

&lt;p&gt;To allow Pods within a managed Kubernetes environment to access cloud services, use a solution such as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/pod-identities.html" rel="noopener noreferrer"&gt;AWS EKS Pod Identity&lt;/a&gt; - lets Kubernetes pods in an Amazon EKS cluster assume an IAM role and receive temporary credentials.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  External (Federated) Identities
&lt;/h3&gt;

&lt;p&gt;For scenarios where you need to grant external (federated) identities temporary, short-lived access to resources in the cloud ecosystem through OIDC, use a solution such as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html" rel="noopener noreferrer"&gt;AWS Security Token Service (AWS STS)&lt;/a&gt; - issues short-lived, scoped credentials that external or federated identities obtain through the AssumeRole API, allowing them to access AWS resources temporarily without relying on long-lived access.
&lt;/li&gt;
&lt;/ul&gt;
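
&lt;p&gt;The AssumeRole flow can be sketched as follows, assuming an injected client that exposes the STS assume_role call (for example, a boto3 STS client); the helper name and return shape are illustrative.&lt;/p&gt;

```python
def temporary_credentials(sts_client, role_arn, session_name):
    """Exchange a role ARN for short-lived credentials via the AssumeRole API.

    sts_client is expected to expose assume_role(RoleArn=..., RoleSessionName=...),
    e.g. boto3.client("sts"); the returned keys expire automatically.
    """
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

&lt;p&gt;Because the credentials expire on their own, a stolen copy has a bounded window of abuse, unlike a static access key.&lt;/p&gt;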

&lt;h3&gt;
  
  
  AI Agents
&lt;/h3&gt;

&lt;p&gt;For scenarios where you need to grant AI agents access to resources in the cloud ecosystem, use a solution such as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-agentcore-identity-securing-agentic-ai-at-scale/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore Identity&lt;/a&gt; - Purpose-built IAM service for Bedrock agents with centralized agent identity directory, inbound authorizer (SigV4/OAuth/JWT), and outbound credential provider.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;When managing non-human identities, always prefer temporary credentials (roles/managed identities) over static or long-lived credentials.&lt;br&gt;&lt;br&gt;
If the target service does not support temporary credentials, rotate the static credentials regularly to limit the window of exposure from a potential credential breach.&lt;br&gt;&lt;br&gt;
Whatever you do, never store credentials in code, configuration files, Git repositories, etc.&lt;br&gt;&lt;br&gt;
This blog post focused on solutions offered by the hyper-scale cloud providers; naturally, there are commercial solutions offering similar functionality, including a single pane of glass for managing non-human identities across entire cloud environments, including multi-cloud environments.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;Eyal Estrin is a seasoned cloud and information security architect, &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>security</category>
      <category>cloud</category>
    </item>
    <item>
      <title>FinOps for AI</title>
      <dc:creator>Eyal Estrin</dc:creator>
      <pubDate>Mon, 08 Dec 2025 14:23:43 +0000</pubDate>
      <link>https://forem.com/aws-builders/finops-for-ai-5cne</link>
      <guid>https://forem.com/aws-builders/finops-for-ai-5cne</guid>
      <description>&lt;p&gt;Today, we hear about so many organizations (from small start-ups to large enterprises) experimenting with GenAI applications, adding GenAI components to their existing workloads, and perhaps even moving from evaluation to production.&lt;br&gt;&lt;br&gt;
The increased usage of GenAI services requires organizations to pay attention to the cost of using these services before high and unpredictable costs turn into additional failed projects.&lt;br&gt;&lt;br&gt;
In this blog post, I will share some common recommendations for implementing FinOps practices as part of GenAI workloads.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Real-Time Cost Visibility, Allocation, Tagging, and Accountability
&lt;/h2&gt;

&lt;p&gt;Lack of real-time visibility into cloud costs makes it difficult for organizations to track spending, identify waste, and assign accountability. Without clear, up-to-date cost allocation tied to projects or teams, overspending and inefficiencies often go unnoticed. Building transparent cost tracking and tagging practices empowers teams to monitor expenses continuously, optimize usage, and align spending with business goals.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Optimization Tools&lt;/strong&gt;: Software that identifies inefficiencies and recommends or automates cost-saving actions in cloud environments.
Common services: &lt;a href="https://docs.aws.amazon.com/cost-management/latest/userguide/ce-what-is.html" rel="noopener noreferrer"&gt;AWS Cost Explorer&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/awssupport/latest/user/get-started-with-aws-trusted-advisor.html" rel="noopener noreferrer"&gt;AWS Trusted Advisor&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Estimate and Monitor Costs&lt;/strong&gt;: Tools to forecast upcoming cloud expenses and continuously track actual spend against budgets.
Common service: &lt;a href="https://docs.aws.amazon.com/pricing-calculator/latest/userguide/what-is-pricing-calculator.html" rel="noopener noreferrer"&gt;AWS Pricing Calculator&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Budgets, Alerts, and Cost Analysis&lt;/strong&gt;: Features that allow setting spending limits, notifying on overruns, and analyzing cost trends.
Common services: &lt;a href="https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html" rel="noopener noreferrer"&gt;AWS Budgets&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/cost-management/latest/userguide/getting-started-ad.html" rel="noopener noreferrer"&gt;AWS Cost Anomaly Detection&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Visibility, Allocation, and Tagging&lt;/strong&gt;: Mechanisms to attribute cloud costs accurately to applications, teams, or business units using tags and reports.
Common service: &lt;a href="https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html" rel="noopener noreferrer"&gt;AWS Cost Allocation Tags&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token and Endpoint Cost Tracking&lt;/strong&gt;: Monitoring and reporting on usage-driven costs specifically related to API tokens and endpoint consumption.
Common service: &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/monitor_estimated_charges_with_cloudwatch.html" rel="noopener noreferrer"&gt;Amazon CloudWatch&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-Time Cost Visibility&lt;/strong&gt;: Providing immediate, up-to-date insights into cloud spend for timely decision-making and anomaly detection.
Common service: &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/query_with_cloudwatch-metrics-insights.html" rel="noopener noreferrer"&gt;Amazon CloudWatch Metrics Insights&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;
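
&lt;p&gt;As a toy illustration of tagging accountability, the sketch below computes what fraction of a resource inventory carries a set of required cost-allocation tags (the tag keys are hypothetical examples; real inventories would come from the cloud provider's APIs):&lt;/p&gt;

```python
REQUIRED_TAGS = {"cost-center", "team", "project"}  # illustrative cost-allocation tag keys

def tag_coverage(resources):
    """Fraction of resources carrying every required cost-allocation tag.

    Each resource is a dict with a "tags" mapping of tag keys to values.
    """
    if not resources:
        return 1.0
    compliant = sum(1 for r in resources if REQUIRED_TAGS.issubset(r.get("tags", {})))
    return compliant / len(resources)
```

&lt;p&gt;Tracking this ratio over time is a simple way to make untagged (and therefore unattributable) spend visible to teams.&lt;/p&gt;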

&lt;h2&gt;
  
  
  Rightsizing and Resource Optimization
&lt;/h2&gt;

&lt;p&gt;Rightsizing and resource optimization ensure cloud resources are appropriately sized and efficiently used by continuously analyzing usage patterns and adjusting capacity to eliminate waste and meet actual demand, thereby reducing costs without compromising performance.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Choose Optimal Model and Inference Types&lt;/strong&gt;: Select foundation models and inference methods that precisely match your business needs to avoid paying for unnecessary capacity. Continuously evaluate workload requirements and prefer smaller, purpose-fit models over default larger ones to save costs.
Reference: &lt;a href="https://aws.amazon.com/blogs/enterprise-strategy/generative-ai-cost-optimization-strategies/" rel="noopener noreferrer"&gt;Generative AI Cost Optimization Strategies&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batching and Concurrency&lt;/strong&gt;: Efficiently batch inference requests and manage concurrency to maximize instance utilization and reduce cost per token or operation.
Reference: &lt;a href="https://www.nops.io/blog/genai-cost-optimization-the-essential-guide/" rel="noopener noreferrer"&gt;GenAI Cost Optimization: The Essential Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Right-Sizing and Model Selection&lt;/strong&gt;: Regularly right-size infrastructure—compute, memory, GPU—to workload demand, using autoscaling, spot, and reserved instances to balance cost and performance. Avoid defaulting to high-end hardware for all workloads.
Reference: &lt;a href="https://www.finops.org/wg/optimizing-genai-usage/" rel="noopener noreferrer"&gt;Optimizing GenAI Usage&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leverage Cloud-Specific Cost Management Tools&lt;/strong&gt;: Use cloud vendor cost management and advisory tools to identify and implement cost-saving recommendations.
Common service: &lt;a href="https://docs.aws.amazon.com/compute-optimizer/latest/ug/what-is-compute-optimizer.html" rel="noopener noreferrer"&gt;AWS Compute Optimizer&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Intelligent Pricing Strategies: Reserved, Spot, and Preemptible Instances
&lt;/h2&gt;

&lt;p&gt;Reserved instances offer significant discounts for long-term, steady workloads by committing to a specific resource usage over one to three years, helping reduce costs compared to pay-as-you-go pricing. Spot and preemptible instances allow access to spare cloud capacity at substantially lower prices but with the risk of interruption, ideal for flexible or fault-tolerant tasks. Balancing these options with real-time workload needs enables cost-efficient cloud resource management while maintaining scalability and performance.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reserved Instances and Commitment Pricing&lt;/strong&gt;: Reserve instances or commit to savings plans for consistently running workloads to gain discounts of 30-70%. These long-term commitments improve cost predictability and budgeting stability.
Reference: &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-reserved-instances.html" rel="noopener noreferrer"&gt;Reserved Instances for Amazon EC2 overview&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spot&lt;/strong&gt;: Use spot for interruptible, fault-tolerant workloads like training and batch processing to save up to 90%. These resources are offered at deep discounts but can be reclaimed with short notice, requiring workload resilience and automation to manage interruptions.
Reference: &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html" rel="noopener noreferrer"&gt;Amazon EC2 Spot Instances&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-Scaling and Capacity Reservations&lt;/strong&gt;: Pair spot and reserved instances with auto-scaling and capacity reservations to dynamically adjust resources based on workload demand, optimizing the cost-performance balance.
Reference: &lt;a href="https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html" rel="noopener noreferrer"&gt;Amazon EC2 Auto Scaling&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;
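
&lt;p&gt;The savings math behind mixing pricing models can be sketched with simple arithmetic. The discount fractions below are illustrative, since actual savings vary by instance family, region, and commitment term:&lt;/p&gt;

```python
def blended_hourly_cost(on_demand_rate, reserved, spot, on_demand,
                        ri_discount, spot_discount):
    """Hourly fleet cost when mixing reserved, spot, and on-demand capacity.

    Discount values are fractions of the list price (e.g., 0.4 means 40 percent
    off); instance counts are per pricing model.
    """
    return (reserved * on_demand_rate * (1 - ri_discount)
            + spot * on_demand_rate * (1 - spot_discount)
            + on_demand * on_demand_rate)
```

&lt;p&gt;For example, a 10-instance fleet at a $0.10/hour list rate costs $1.00/hour purely on-demand, while covering 6 instances with a 40% reserved discount and 4 with a 70% spot discount roughly halves the hourly bill.&lt;/p&gt;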

&lt;h2&gt;
  
  
  Automation and Dynamic Scaling
&lt;/h2&gt;

&lt;p&gt;Automation and dynamic scaling enable cloud resources to automatically adjust in real time to changing workload demands, ensuring efficient performance during peak times while minimizing costs by scaling down when demand is low. This approach reduces manual intervention, optimizes resource use, improves reliability, and supports business agility by maintaining responsiveness under fluctuating traffic conditions.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automation and Idle Shutdown&lt;/strong&gt;: Implement automated policies that stop, pause, or scale down AI model endpoints and compute resources during idle or low-traffic periods to avoid unnecessary costs. This dynamic management prevents paying for unused capacity, especially in development and batch workloads.
Reference: &lt;a href="https://aws.amazon.com/compute-optimizer/" rel="noopener noreferrer"&gt;AWS Compute Optimizer&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serverless and Event-Driven Compute&lt;/strong&gt;: For variable or unpredictable inference workloads, leverage serverless compute options to pay strictly for consumed resources and scale automatically. This approach reduces operational overhead and costs.
Reference: &lt;a href="https://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/genai-accelerator-starter-package.html" rel="noopener noreferrer"&gt;GenAI Accelerator Starter Package&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Scaling and GPU Pooling&lt;/strong&gt;: Use autoscaling and GPU pooling techniques (e.g., multi-instance GPU technologies) to maximize hardware utilization, reducing idle time and enabling more efficient processing of batch or concurrent inference tasks. This can significantly improve utilization from typical levels of around 25% to over 60%.
Reference: &lt;a href="https://www.finops.org/wg/optimizing-genai-usage/" rel="noopener noreferrer"&gt;Optimizing GenAI Usage&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Cost-Aware Model and Workflow Design
&lt;/h2&gt;

&lt;p&gt;Adopting a cost-aware approach to model and workflow design ensures financial insights are embedded in every step of the development lifecycle. By prioritizing real-time cost visibility, proactive forecasting, and iterative policy refinement, teams can anticipate spend early, align resource usage with business intent, and implement rapid adjustments as requirements evolve. This mindset promotes conscious decision-making, enabling organizations to balance performance and efficiency from the ground up.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Optimize prompt design and token usage&lt;/strong&gt;: Design applications with cost-aware prompting by minimizing prompt size and engineering efficient prompts. This reduces model invocations and token consumption, directly controlling costs.
References: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/generative-ai-lens/cost-optimization.html" rel="noopener noreferrer"&gt;Generative AI Lens - Cost Optimization&lt;/a&gt;, &lt;a href="https://www.finops.org/wg/effect-of-optimization-on-ai-forecasting/" rel="noopener noreferrer"&gt;Effect of Optimization on AI Forecasting&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use prompt routing, caching, and inference optimization&lt;/strong&gt;: Route requests to the most cost-effective models and cache frequent prompts to reduce expensive token processing. This approach can cut inference costs by 40-70%, according to FinOps guidance. Target inference workloads for optimization since they account for 80-90% of GenAI spending.
Reference: &lt;a href="https://www.finops.org/wg/optimizing-genai-usage/" rel="noopener noreferrer"&gt;Optimizing GenAI Usage&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor and apply governance per FinOps best practices&lt;/strong&gt;: Incorporate real-time cost monitoring, forecasting, and governance aligned with FinOps principles to drive iterative cost improvements during the AI model lifecycle.
Reference: &lt;a href="https://www.finops.org/wg/effect-of-optimization-on-ai-forecasting/" rel="noopener noreferrer"&gt;Effect of Optimization on AI Forecasting&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
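
&lt;p&gt;Token-driven pricing can be estimated with simple arithmetic. The sketch below assumes hypothetical per-1K-token prices (always check your provider's pricing page) and shows how a cache hit avoids the invocation cost entirely:&lt;/p&gt;

```python
def invocation_cost(input_tokens, output_tokens,
                    in_price_per_1k, out_price_per_1k, cache_hit=False):
    """Estimate the cost of a single model invocation.

    Prices are hypothetical per-1K-token rates. A cache hit serves a stored
    answer and skips the model invocation, so its marginal cost is zero here.
    """
    if cache_hit:
        return 0.0
    return ((input_tokens / 1000) * in_price_per_1k
            + (output_tokens / 1000) * out_price_per_1k)
```

&lt;p&gt;Multiplying this per-call estimate by expected request volume makes the payoff of shorter prompts and higher cache-hit rates concrete before deployment.&lt;/p&gt;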

&lt;h2&gt;
  
  
  Quotas, Monitoring, and Anomaly Detection
&lt;/h2&gt;

&lt;p&gt;Monitoring quotas and detecting anomalies with alerts ensures cloud resources are managed proactively. Setting alerts before limits are reached helps prevent service disruptions and enables timely capacity planning. This practice keeps cloud workloads reliable and cost-effective across environments.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Granular Monitoring and Cost Tracking&lt;/strong&gt;: Utilize advanced cost management tools with customizable dashboards to monitor usage and spending trends closely. Implement automated alerts and anomaly detection powered by machine learning to identify unexpected cost spikes and deviations early, enabling proactive cost control.
References: &lt;a href="https://aws.amazon.com/aws-cost-management/aws-cost-anomaly-detection/" rel="noopener noreferrer"&gt;AWS Cost Anomaly Detection&lt;/a&gt;, &lt;a href="https://www.prosperops.com/blog/cloud-cost-management/" rel="noopener noreferrer"&gt;Cloud Cost Management&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Utilization and Quotas Management&lt;/strong&gt;: Continuously monitor resource use across all clouds and set quotas to prevent overruns and runaway costs. Identify idle or low-traffic endpoints to shut down or consolidate, which reduces unnecessary spend. Apply quota management on large AI model endpoints to enforce cost limits during experimentation.
Reference: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_manage_service_limits_automated_monitor_limits.html" rel="noopener noreferrer"&gt;Automate quota management&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Usage Pattern Analysis and Feedback&lt;/strong&gt;: Establish continuous monitoring solutions to detect idle or under-utilized resources and optimize workflow efficiency. Encourage feedback loops between teams to align cost reduction with operational needs, following FinOps best practices.
Reference: &lt;a href="https://www.finops.org/wg/cost-estimation-of-ai-workloads/" rel="noopener noreferrer"&gt;Cost Estimation of AI Workloads&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
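
&lt;p&gt;As a toy stand-in for managed services such as AWS Cost Anomaly Detection, the sketch below computes a z-score for each day's spend; days with unusually large scores (commonly above 3) warrant investigation:&lt;/p&gt;

```python
from statistics import mean, stdev

def spend_z_scores(daily_spend):
    """Z-score of each day's spend relative to the whole series.

    A large positive score marks a spike worth investigating; a series with
    fewer than two points or zero variance yields all-zero scores.
    """
    if len(daily_spend) in (0, 1):
        return [0.0 for _ in daily_spend]
    mu, sigma = mean(daily_spend), stdev(daily_spend)
    if sigma == 0:
        return [0.0 for _ in daily_spend]
    return [(x - mu) / sigma for x in daily_spend]
```

&lt;p&gt;Feeding such scores into an alerting channel gives teams an early signal of runaway GenAI spend between billing cycles.&lt;/p&gt;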

&lt;h2&gt;
  
  
  Storage and Data Lifecycle Management
&lt;/h2&gt;

&lt;p&gt;Efficient storage and data lifecycle management are key to controlling cloud costs. Implementing automated lifecycle policies helps transition data across storage tiers based on access patterns and retention needs, while regularly auditing for orphaned or stale data prevents unnecessary spending. Embedding these practices early in the provisioning process ensures cost optimization throughout the data lifecycle.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle and Storage Policies&lt;/strong&gt;: Implement automated data lifecycle management for model training datasets by shifting data to lower-cost storage tiers as access patterns change and removing obsolete or redundant data to reduce storage costs. This reduces provisioning waste and aligns storage use with business needs.
Reference: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/sec_data_classification_lifecycle_management.html" rel="noopener noreferrer"&gt;AWS Data Lifecycle Management&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient Storage and Data Handling&lt;/strong&gt;: Optimize data pipelines and storage choices by selecting cost-effective storage classes and managing data flow to minimize expensive resource usage during data processing steps that do not require high performance.
References: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/analytics-lens/cost-optimization.html" rel="noopener noreferrer"&gt;AWS Cost Optimization&lt;/a&gt;, &lt;a href="https://www.finops.org/wg/cost-estimation-of-ai-workloads/" rel="noopener noreferrer"&gt;Cost Estimation of AI Workloads&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
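
&lt;p&gt;For illustration, an automated lifecycle policy for an S3 bucket holding training data might look like the rule below, using standard S3 lifecycle configuration syntax; the prefix, day counts, and storage classes are examples to adapt to your own access patterns and retention requirements.&lt;/p&gt;

```json
{
  "Rules": [
    {
      "ID": "archive-training-data",
      "Status": "Enabled",
      "Filter": {"Prefix": "training-datasets/"},
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 365}
    }
  ]
}
```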

&lt;h2&gt;
  
  
  Team Enablement, Training, and Cost Ownership
&lt;/h2&gt;

&lt;p&gt;Empowering teams with clear cost ownership and targeted training fosters accountability and cost-conscious decision-making. Embedding cost awareness into daily workflows and providing role-specific education helps teams balance innovation and budget, driving a culture of shared responsibility for cloud spending.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Team Accountability&lt;/strong&gt;: Assign cost owners and embed cost awareness into engineering workflows, training, and planning. Empower teams to make model design and usage decisions with full visibility of financial impact.
References: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/analytics-lens/cost-optimization.html" rel="noopener noreferrer"&gt;AWS Cost optimization&lt;/a&gt;, &lt;a href="https://www.finops.org/framework/capabilities/finops-education-enablement/" rel="noopener noreferrer"&gt;FinOps Education &amp;amp; Enablement&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Forecasting, Budgeting, and Predictive Insights
&lt;/h2&gt;

&lt;p&gt;Accurate forecasting, budgeting, and predictive insights enable organizations to anticipate cloud costs, align spending with business goals, and prevent budget overruns. Leveraging historical data, driver-based forecasting, and machine learning models helps create dynamic, actionable forecasts that drive financial accountability and proactive cost management.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Accountability, Budget Control, and Forecasting&lt;/strong&gt;: Assign cost ownership to workload teams and integrate showback or chargeback mechanisms to increase cost visibility and accountability. Use continuous forecasting tools that leverage historical data and growth plans to dynamically adjust budgets and commitments, aligning spending with business objectives.
References: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/practice-cloud-financial-management.html" rel="noopener noreferrer"&gt;AWS Practice Cloud Financial Management&lt;/a&gt;, &lt;a href="https://www.finops.org/wg/cloud-cost-forecasting/" rel="noopener noreferrer"&gt;Exploring Cloud Cost Forecasting&lt;/a&gt;, &lt;a href="https://www.finops.org/wg/cost-estimation-of-ai-workloads/" rel="noopener noreferrer"&gt;Cost Estimation of AI Workloads&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Governance, Policy, and Tooling Automation
&lt;/h2&gt;

&lt;p&gt;Automating governance policies ensures consistent compliance, security, and cost control in the cloud. By embedding policies into infrastructure workflows and deployment pipelines, organizations reduce manual errors and enforce rules proactively. This approach enables scalable, reliable oversight and quick remediation across diverse cloud environments.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations / Best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Governance and Automation&lt;/strong&gt;: Use optimization tools to recommend rightsizing, automatically terminate idle workloads, and enforce cost policies at scale for efficient cloud resource management.
References: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/governance.html" rel="noopener noreferrer"&gt;AWS Cost Optimization Pillar – Governance&lt;/a&gt;, &lt;a href="https://www.finops.org/framework/domains/optimize-usage-cost/" rel="noopener noreferrer"&gt;Optimize Usage &amp;amp; Cost&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this long blog post, I have shared recommendations across various aspects of embedding FinOps practices in the design, deployment, and maintenance of modern applications containing GenAI services.&lt;br&gt;&lt;br&gt;
Any organization must have proper design and visibility into the cost aspects of any application using GenAI components to avoid high costs, or at least be able to track expected costs as early as possible.&lt;br&gt;&lt;br&gt;
I encourage readers to review the hyper-scale cloud providers' documentation, understand service costs, and learn about best practices for cost optimization.&lt;br&gt;&lt;br&gt;
I also encourage the readers to learn from the FinOps Foundation's official documentation and best practices as they deploy GenAI services.&lt;br&gt;&lt;br&gt;
Disclaimer: AI tools were used to research and edit this article. Graphics are created using AI.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Additional references
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework - Cost Optimization Pillar&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/bedrock/cost-optimization/" rel="noopener noreferrer"&gt;Amazon Bedrock -Cost Optimization&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/solutions/guidance/cost-analysis-and-optimization-with-amazon-bedrock-agents/" rel="noopener noreferrer"&gt;Guidance for Cost Analysis and Optimization with Amazon Bedrock Agents&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/inference-cost-optimization.html" rel="noopener noreferrer"&gt;Amazon SageMaker - Inference cost Optimization best practices&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  About the author
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Eyal Estrin&lt;/strong&gt; is a seasoned cloud and information security architect, &lt;a href="https://builder.aws.com/community/@eyalestrin" rel="noopener noreferrer"&gt;AWS Community Builder&lt;/a&gt;, and author of &lt;a href="https://amzn.to/42Xai9A" rel="noopener noreferrer"&gt;Cloud Security Handbook&lt;/a&gt; and &lt;a href="https://amzn.to/3Sggbtv" rel="noopener noreferrer"&gt;Security for Cloud Native Applications&lt;/a&gt;. With over 25 years of experience in the IT industry, he brings deep expertise to his work.&lt;br&gt;&lt;br&gt;
Connect with Eyal on social media: &lt;a href="https://linktr.ee/eyalestrin" rel="noopener noreferrer"&gt;https://linktr.ee/eyalestrin&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
The opinions expressed here are his own and do not reflect those of his employer.  &lt;/p&gt;

</description>
      <category>aws</category>
      <category>azure</category>
      <category>googlecloud</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
