<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</title>
    <description>The latest articles on Forem by Yoshiki Fujiwara(藤原 善基)@AWS Community Builder (@yoshikifujiwara).</description>
    <link>https://forem.com/yoshikifujiwara</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1143688%2F2e0886ff-292c-4e8f-a588-bc7629c2321b.jpeg</url>
      <title>Forem: Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</title>
      <link>https://forem.com/yoshikifujiwara</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yoshikifujiwara"/>
    <language>en</language>
    <item>
      <title>Near-Real-Time Processing, ML Inference, and Observability for FSx for ONTAP S3 Access Points — Phase 3 Architecture Patterns</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Wed, 06 May 2026 11:02:17 +0000</pubDate>
      <link>https://forem.com/yoshikifujiwara/near-real-time-processing-ml-inference-and-observability-for-fsx-for-ontap-s3-access-points--bkd</link>
      <guid>https://forem.com/yoshikifujiwara/near-real-time-processing-ml-inference-and-observability-for-fsx-for-ontap-s3-access-points--bkd</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;This is &lt;strong&gt;Phase 3&lt;/strong&gt; of the FSx for ONTAP S3 Access Points serverless patterns collection. Building on the &lt;a href="https://dev.to/yoshikifujiwara/fsx-for-ontap-s3-access-points-as-a-serverless-automation-boundary-ai-data-pipelines-ili"&gt;Phase 1 foundation&lt;/a&gt; and the &lt;a href="https://dev.to/yoshikifujiwara/9-more-industry-serverless-patterns-with-fsx-for-ontap-s3-access-points-semiconductor-genomics-15e4"&gt;14 industry patterns from Phase 2&lt;/a&gt;, Phase 3 adds three cross-cutting capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Near-real-time processing&lt;/strong&gt;: Kinesis Data Streams integration for minute-level change detection with seconds-level downstream processing after events are emitted (UC11)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ML inference pipeline&lt;/strong&gt;: SageMaker Batch Transform with Step Functions Callback Pattern for point cloud segmentation (UC9)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability stack&lt;/strong&gt;: X-Ray distributed tracing + CloudWatch EMF metrics across all 14 use cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Streaming and SageMaker features are &lt;strong&gt;opt-in via CloudFormation Conditions&lt;/strong&gt; (disabled by default, so they add no cost until you enable them). Observability features (X-Ray, EMF, Dashboard, Alarms) are enabled by default in the reference deployment and can be turned off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In Phase 1 and 2, we established a polling-based architecture using EventBridge Scheduler + Step Functions to process files stored on FSx for ONTAP via S3 Access Points. This approach works well for batch workloads with hourly or daily processing cycles, but some use cases demand faster response times.&lt;/p&gt;

&lt;p&gt;Phase 3 addresses three gaps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Kinesis does not remove the need to detect changes — FSx for ONTAP S3 AP does not provide native event notifications. Instead, it decouples the discovery cadence (minute-level polling) from downstream processing and enables low-latency fan-out once change events are emitted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ML Integration&lt;/strong&gt;: Large-scale inference jobs (like LiDAR point cloud segmentation) need asynchronous execution without blocking the Step Functions workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visibility&lt;/strong&gt;: As the pattern collection grows to 14 use cases, operators need centralized metrics, distributed tracing, and automated alerting.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Summary Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;AWS Services&lt;/th&gt;
&lt;th&gt;Verification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Near-Real-Time Streaming&lt;/td&gt;
&lt;td&gt;Stream Producer + Consumer (UC11)&lt;/td&gt;
&lt;td&gt;Kinesis Data Streams, DynamoDB, Lambda&lt;/td&gt;
&lt;td&gt;✅ E2E (PutRecord → GetRecords → DynamoDB state transition)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML Inference&lt;/td&gt;
&lt;td&gt;SageMaker Batch Transform (UC9)&lt;/td&gt;
&lt;td&gt;SageMaker, Step Functions Callback&lt;/td&gt;
&lt;td&gt;✅ Mock mode (Task_Token round-trip verified)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distributed Tracing&lt;/td&gt;
&lt;td&gt;X-Ray instrumentation (all 14 UCs)&lt;/td&gt;
&lt;td&gt;AWS X-Ray&lt;/td&gt;
&lt;td&gt;✅ X-Ray support added across all Lambda templates (Active by default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom Metrics&lt;/td&gt;
&lt;td&gt;EMF output (all 14 UCs)&lt;/td&gt;
&lt;td&gt;CloudWatch EMF&lt;/td&gt;
&lt;td&gt;✅ 573 tests pass (EMF JSON round-trip property)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Centralized Dashboard&lt;/td&gt;
&lt;td&gt;CloudWatch Dashboard&lt;/td&gt;
&lt;td&gt;CloudWatch&lt;/td&gt;
&lt;td&gt;✅ Deployed (FSxN-S3AP-Patterns-Dashboard)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alert Automation&lt;/td&gt;
&lt;td&gt;CloudWatch Alarms + SNS&lt;/td&gt;
&lt;td&gt;CloudWatch, SNS, KMS&lt;/td&gt;
&lt;td&gt;✅ Deployed (composite + KMS-encrypted SNS; 15 alarms in reference deployment)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Change Detection&lt;/td&gt;
&lt;td&gt;DynamoDB state table&lt;/td&gt;
&lt;td&gt;DynamoDB&lt;/td&gt;
&lt;td&gt;✅ E2E (pending → completed state transition)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Cost Impact
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Monthly Cost (when enabled)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kinesis Streaming&lt;/td&gt;
&lt;td&gt;Disabled&lt;/td&gt;
&lt;td&gt;~$14/month (1 shard, ap-northeast-1; approximate, varies by region and retention settings)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SageMaker Batch Transform&lt;/td&gt;
&lt;td&gt;Disabled&lt;/td&gt;
&lt;td&gt;Pay-per-job (no persistent endpoint)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X-Ray Tracing&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;td&gt;Depends on trace volume; free tier includes the first 100K traces recorded per month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch EMF&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;td&gt;Billed as CloudWatch Logs ingestion, plus custom metric charges for the metrics extracted from the logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard + Alarms&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;td&gt;Varies by alarm count; reference deployment: 15 alarms + 1 dashboard&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;All features can be individually disabled via CloudFormation parameters (&lt;code&gt;EnableStreamingMode&lt;/code&gt;, &lt;code&gt;EnableSageMakerTransform&lt;/code&gt;, &lt;code&gt;EnableXRayTracing&lt;/code&gt;, &lt;code&gt;EnableCloudWatchAlarms&lt;/code&gt;). In cost-sensitive PoC environments, disable X-Ray and alarms if they are not needed.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Architecture Decision: Streaming vs Polling
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Both Patterns?
&lt;/h3&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points don't support &lt;code&gt;GetBucketNotificationConfiguration&lt;/code&gt; — there's no native event notification when files change. This means we must actively detect changes. The question is: how frequently, and how do we decouple detection from processing?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Polling&lt;/strong&gt; (Phase 1/2 approach):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge Scheduler (rate(1 hour)) → Step Functions → Discovery Lambda → Processing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Streaming&lt;/strong&gt; (Phase 3 addition):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge (rate(1 min)) → Stream Producer (detect changes) → Kinesis → Stream Consumer (process) → Pipeline
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;Kinesis doesn't detect changes faster&lt;/strong&gt; — the Stream Producer still polls at 1-minute intervals. What Kinesis provides is &lt;strong&gt;decoupled, low-latency fan-out&lt;/strong&gt; once change events are emitted. The consumer processes events within seconds of them appearing on the stream.&lt;/p&gt;

&lt;h3&gt;
  
  
  When to Use Each
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criteria&lt;/th&gt;
&lt;th&gt;Polling&lt;/th&gt;
&lt;th&gt;Streaming&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Detection latency&lt;/td&gt;
&lt;td&gt;Configurable (min 1 min)&lt;/td&gt;
&lt;td&gt;1 minute (producer polling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Processing latency after detection&lt;/td&gt;
&lt;td&gt;Seconds to minutes (Step Functions)&lt;/td&gt;
&lt;td&gt;Seconds (Kinesis consumer)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File change rate&lt;/td&gt;
&lt;td&gt;&amp;lt; 1,000/hour&lt;/td&gt;
&lt;td&gt;&amp;gt; 1,000/hour&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost priority&lt;/td&gt;
&lt;td&gt;✅ Lower at low volume&lt;/td&gt;
&lt;td&gt;✅ Lower at high volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operational simplicity&lt;/td&gt;
&lt;td&gt;✅ Simpler&lt;/td&gt;
&lt;td&gt;More components&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure handling&lt;/td&gt;
&lt;td&gt;Step Functions Retry/Catch&lt;/td&gt;
&lt;td&gt;bisect-on-error + DLQ&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Hybrid Approach (Recommended)
&lt;/h3&gt;

&lt;p&gt;For production deployments, we recommend running both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streaming&lt;/strong&gt; handles near-real-time processing (seconds-level latency after detection)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Polling&lt;/strong&gt; runs hourly as a consistency reconciliation pass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives you the best of both worlds: fast downstream processing, with the hourly poll as a backstop for eventual consistency. If the streaming path fails or drops an event, the next polling pass picks up any file whose state was not updated.&lt;/p&gt;




&lt;h2&gt;
  
  
  Kinesis Integration Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Stream Producer Design
&lt;/h3&gt;

&lt;p&gt;The Stream Producer Lambda runs every minute via EventBridge Scheduler. Its job is change detection:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Call &lt;code&gt;ListObjectsV2&lt;/code&gt; on the S3 Access Point to get the current file listing&lt;/li&gt;
&lt;li&gt;Compare against a DynamoDB state table (partition key: &lt;code&gt;file_key&lt;/code&gt;, attributes: &lt;code&gt;last_modified&lt;/code&gt;, &lt;code&gt;etag&lt;/code&gt;, &lt;code&gt;processing_status&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;For each new or modified file, write a change event to Kinesis Data Streams
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Change detection logic (simplified for illustration)
# Production implementation should avoid full table scans for large namespaces.
# Use paginated scans, BatchGetItem for listed keys, or prefix-partitioned state tracking.
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;

&lt;span class="c1"&gt;# list_objects_v2 returns a dict; objects are under 'Contents' (absent when the listing is empty)
&lt;/span&gt;&lt;span class="n"&gt;current_objects&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_objects_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s3_ap_alias&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Contents&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
&lt;span class="c1"&gt;# scan() also returns a dict; index its 'Items' by file_key so lookups below work
&lt;/span&gt;&lt;span class="n"&gt;stored_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scan&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Items&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;current_objects&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;stored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stored_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;stored&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;stored&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;etag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ETag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# New or modified file detected
&lt;/span&gt;        &lt;span class="n"&gt;streaming_helper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_records&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="nf"&gt;create_event_record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CREATED&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;stored&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MODIFIED&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;size&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Size&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;etag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ETag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Scaling note&lt;/strong&gt;: For clarity, the snippet uses &lt;code&gt;scan()&lt;/code&gt;. In production with large namespaces (10K+ objects), use paginated scans, &lt;code&gt;BatchGetItem&lt;/code&gt; for the keys returned by &lt;code&gt;ListObjectsV2&lt;/code&gt;, or prefix-partitioned state tracking to avoid full-table scans on every producer run.&lt;/p&gt;
&lt;/blockquote&gt;
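
&lt;p&gt;As a rough sketch of the &lt;code&gt;BatchGetItem&lt;/code&gt; option, the helper below fetches state only for the keys returned by the current listing, in chunks of 100 (the per-request key limit for &lt;code&gt;BatchGetItem&lt;/code&gt;), and re-requests anything reported in &lt;code&gt;UnprocessedKeys&lt;/code&gt;. The names &lt;code&gt;load_stored_state&lt;/code&gt; and &lt;code&gt;chunked&lt;/code&gt;, and the injected &lt;code&gt;dynamodb&lt;/code&gt; object (assumed to behave like &lt;code&gt;boto3.resource("dynamodb")&lt;/code&gt;), are illustrative assumptions, not code from the repository.&lt;/p&gt;

```python
from typing import Any, Dict, Iterable, List

BATCH_GET_LIMIT = 100  # BatchGetItem accepts at most 100 keys per request


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_stored_state(dynamodb: Any, table_name: str, keys: List[str]) -> Dict[str, dict]:
    """Return {file_key: item} for the listed keys that already have state.

    `dynamodb` is assumed to behave like boto3.resource("dynamodb").
    """
    state: Dict[str, dict] = {}
    for chunk in chunked(keys, BATCH_GET_LIMIT):
        request = {table_name: {"Keys": [{"file_key": key} for key in chunk]}}
        while request:
            response = dynamodb.batch_get_item(RequestItems=request)
            for item in response.get("Responses", {}).get(table_name, []):
                state[item["file_key"]] = item
            # Throttled keys come back in UnprocessedKeys; retry until empty.
            request = response.get("UnprocessedKeys") or {}
    return state
```

&lt;p&gt;Keys with no stored item are simply missing from the result, which the producer can treat as newly created files. A production version would likely add backoff between retries of unprocessed keys.&lt;/p&gt;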

&lt;h3&gt;
  
  
  Partition Key Strategy
&lt;/h3&gt;

&lt;p&gt;The partition key is derived from the file path prefix (first directory level). This ensures files in the same directory land on the same shard, enabling ordered processing within a directory while distributing load across shards.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hot shard risk&lt;/strong&gt;: If most files land under the same prefix (e.g., all in &lt;code&gt;products/&lt;/code&gt;), this strategy can create a hot shard. For high-throughput workloads, consider adding a hash suffix or implementing a configurable partitioning strategy.&lt;/p&gt;
&lt;/blockquote&gt;
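
&lt;p&gt;To make the trade-off concrete, here is a minimal sketch of a prefix-based partition key with an optional hash suffix. The function name, the &lt;code&gt;hash_buckets&lt;/code&gt; parameter, and the &lt;code&gt;_root&lt;/code&gt; fallback for files without a directory are hypothetical choices for illustration; the repository may derive keys differently.&lt;/p&gt;

```python
import hashlib


def partition_key(file_key: str, hash_buckets: int = 0) -> str:
    """Derive a Kinesis partition key from the first directory level.

    With hash_buckets == 0, every file under one prefix shares a key (and
    therefore a shard), which preserves per-directory ordering. With
    hash_buckets > 0, a stable suffix spreads one prefix over several keys,
    trading directory-wide ordering for per-file ordering.
    """
    prefix = file_key.split("/", 1)[0] if "/" in file_key else "_root"
    if hash_buckets > 0:
        digest = hashlib.sha256(file_key.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % hash_buckets
        return f"{prefix}#{bucket}"
    return prefix
```

&lt;p&gt;Because the suffix is computed from the full file path, repeated events for the same file still share a partition key and stay ordered relative to each other.&lt;/p&gt;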

&lt;h3&gt;
  
  
  Stream Consumer and Dead-Letter Handling
&lt;/h3&gt;

&lt;p&gt;The Stream Consumer Lambda is triggered by Kinesis Event Source Mapping with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batch size&lt;/strong&gt;: 10 records&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bisect-on-error&lt;/strong&gt;: Enabled (splits failed batches to isolate problematic records)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum retry attempts&lt;/strong&gt;: 3&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Failed records are captured by the consumer logic and written to a DynamoDB dead-letter table for investigation. This is a custom implementation within the consumer Lambda, not the event source mapping's built-in on-failure destination (which, for Kinesis sources, targets SQS, SNS, or S3). The DynamoDB DLQ stores:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Original record data&lt;/li&gt;
&lt;li&gt;Error message and stack trace&lt;/li&gt;
&lt;li&gt;Timestamp of failure&lt;/li&gt;
&lt;li&gt;Retry count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This avoids blocking the entire shard while preserving failed records for manual reprocessing.&lt;/p&gt;
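
&lt;p&gt;The dead-letter handling described above could look roughly like the sketch below. The handler name, the injected &lt;code&gt;process&lt;/code&gt; callable, the &lt;code&gt;dlq_table&lt;/code&gt; object (assumed to behave like a boto3 DynamoDB &lt;code&gt;Table&lt;/code&gt;), and the DLQ attribute names are all assumptions made for illustration; only the Kinesis event fields (&lt;code&gt;Records&lt;/code&gt;, &lt;code&gt;kinesis.data&lt;/code&gt;, &lt;code&gt;kinesis.sequenceNumber&lt;/code&gt;) follow the Lambda event format.&lt;/p&gt;

```python
import base64
import json
import traceback
from datetime import datetime, timezone
from typing import Any, Callable


def handle_batch(event: dict, process: Callable[[dict], None], dlq_table: Any) -> dict:
    """Process each Kinesis record; park failures in a dead-letter table.

    `process` and `dlq_table` are injected so the sketch runs without AWS access.
    """
    failed = 0
    records = event.get("Records", [])
    for record in records:
        kinesis = record["kinesis"]
        raw = base64.b64decode(kinesis["data"]).decode("utf-8")
        try:
            process(json.loads(raw))
        except Exception as exc:  # keep the shard moving; investigate later
            failed += 1
            dlq_table.put_item(Item={
                "sequence_number": kinesis["sequenceNumber"],
                "record_data": raw,
                "error_message": str(exc),
                "stack_trace": traceback.format_exc(),
                "failed_at": datetime.now(timezone.utc).isoformat(),
                "retry_count": 0,  # assumed starting value for manual reprocessing
            })
    return {"processed": len(records) - failed, "failed": failed}
```

&lt;p&gt;Because the handler swallows per-record errors after recording them, the batch as a whole succeeds and the shard iterator advances. Bisect-on-error and the retry limit then only come into play when the function itself fails, for example on a timeout.&lt;/p&gt;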

&lt;h3&gt;
  
  
  Idempotent Processing
&lt;/h3&gt;

&lt;p&gt;Since both the streaming and polling paths may process the same file, idempotency is critical. We use DynamoDB conditional writes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;file_key&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;UpdateExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SET processing_status = :status, processed_at = :ts, etag = :etag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ConditionExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attribute_not_exists(etag) OR etag &amp;lt;&amp;gt; :etag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;COMPLETED&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;current_timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:etag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;current_etag&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
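&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The condition above ties idempotency to the file version through the ETag, so a duplicate event for an already-recorded version fails the conditional write with &lt;code&gt;ConditionalCheckFailedException&lt;/code&gt;, which the caller should treat as "already processed" rather than as an error. It does not order versions, however: a stale event carrying an older ETag still passes the check. In production, also compare the source object's &lt;code&gt;last_modified&lt;/code&gt; timestamp so that stale events arriving out of order cannot overwrite newer processing results.&lt;/p&gt;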
&lt;/blockquote&gt;
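
&lt;p&gt;One way to keep stale events from overwriting newer results is to make the conditional write compare version timestamps, sketched below. The helper name &lt;code&gt;mark_completed&lt;/code&gt; and the &lt;code&gt;processed_version&lt;/code&gt; attribute are hypothetical, the table object is assumed to behave like a boto3 DynamoDB &lt;code&gt;Table&lt;/code&gt;, and &lt;code&gt;last_modified&lt;/code&gt; is assumed to be an ISO 8601 UTC string so that string comparison matches chronological order.&lt;/p&gt;

```python
from typing import Any


def mark_completed(table: Any, file_key: str, etag: str, last_modified: str, now: str) -> bool:
    """Record a completed run unless an equal or newer version is already stored.

    Returns True if the write happened, False if the event was a duplicate
    or stale and the write was skipped.
    """
    try:
        table.update_item(
            Key={"file_key": file_key},
            UpdateExpression=(
                "SET processing_status = :status, processed_at = :ts, "
                "etag = :etag, processed_version = :lm"
            ),
            # Only a strictly newer source version may overwrite the stored one.
            ConditionExpression="attribute_not_exists(processed_version) OR :lm > processed_version",
            ExpressionAttributeValues={
                ":status": "COMPLETED",
                ":ts": now,
                ":etag": etag,
                ":lm": last_modified,
            },
        )
        return True
    except Exception as exc:
        # botocore's ClientError exposes the error code via exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise
```

&lt;p&gt;Returning &lt;code&gt;False&lt;/code&gt; instead of raising lets both the streaming and polling paths treat "already processed" as a normal outcome.&lt;/p&gt;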




&lt;h2&gt;
  
  
  SageMaker Callback Pattern
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;SageMaker Batch Transform jobs can run for minutes to hours depending on data volume. A Lambda task cannot wait that long: invocations are capped at 15 minutes, and a function that polls for completion pays for idle compute the whole time.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: .waitForTaskToken
&lt;/h3&gt;

&lt;p&gt;Step Functions' Callback Pattern (&lt;code&gt;.waitForTaskToken&lt;/code&gt;) is perfect for this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The workflow reaches the SageMaker step and pauses, generating a unique Task Token&lt;/li&gt;
&lt;li&gt;A Lambda function starts the Batch Transform job, storing a correlation ID&lt;/li&gt;
&lt;li&gt;The workflow waits without holding Lambda compute. With Standard Workflows, you pay for state transitions rather than Lambda runtime during the wait.&lt;/li&gt;
&lt;li&gt;When the job completes, an EventBridge rule triggers a callback Lambda&lt;/li&gt;
&lt;li&gt;The callback Lambda calls &lt;code&gt;SendTaskSuccess&lt;/code&gt; or &lt;code&gt;SendTaskFailure&lt;/code&gt; with the Task Token
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Step Functions state definition (simplified)&lt;/span&gt;
&lt;span class="na"&gt;SageMakerTransformStep&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Task&lt;/span&gt;
  &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:states:::lambda:invoke.waitForTaskToken&lt;/span&gt;
  &lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SageMakerInvokeLambda&lt;/span&gt;
    &lt;span class="na"&gt;Payload&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;task_token.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$$.Task.Token&lt;/span&gt;
      &lt;span class="na"&gt;input_path.$&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$.point_cloud_s3_path&lt;/span&gt;
      &lt;span class="na"&gt;model_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SageMakerModelName&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task Token Propagation
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: AWS resource tags have value length limits (typically 256 characters). Step Functions Task Tokens can be significantly longer. In the production path, the Task Token should be stored in DynamoDB and correlated with the &lt;code&gt;TransformJobName&lt;/code&gt; or a short correlation ID. The SageMaker job tag should store only the correlation ID, avoiding tag value length limits.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Recommended production flow:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step Functions (.waitForTaskToken)
  → SageMaker Invoke Lambda (receives token in payload)
    → Store TaskToken in DynamoDB keyed by TransformJobName
    → CreateTransformJob (tag: "CorrelationId": "&amp;lt;short-id&amp;gt;")
      → Job completes → EventBridge rule fires
        → SageMaker Callback Lambda
          → Read CorrelationId from job tags / TransformJobName
          → Fetch TaskToken from DynamoDB
          → SendTaskSuccess/SendTaskFailure (returns token to Step Functions)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
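
&lt;p&gt;The callback half of this flow might be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the function name, the token table keyed by &lt;code&gt;transform_job_name&lt;/code&gt;, and the &lt;code&gt;task_token&lt;/code&gt; attribute are hypothetical, and the event is assumed to carry &lt;code&gt;TransformJobName&lt;/code&gt;, &lt;code&gt;TransformJobStatus&lt;/code&gt;, and optionally &lt;code&gt;FailureReason&lt;/code&gt; under &lt;code&gt;detail&lt;/code&gt;. The two clients are injected and assumed to behave like a boto3 DynamoDB &lt;code&gt;Table&lt;/code&gt; and &lt;code&gt;boto3.client("stepfunctions")&lt;/code&gt;.&lt;/p&gt;

```python
import json
from typing import Any


def handle_job_state_change(event: dict, token_table: Any, sfn_client: Any) -> str:
    """Resume the paused workflow when a Batch Transform job finishes."""
    detail = event["detail"]
    job_name = detail["TransformJobName"]
    status = detail["TransformJobStatus"]

    item = token_table.get_item(Key={"transform_job_name": job_name}).get("Item")
    if item is None:
        return "unknown-job"  # not started by this workflow; nothing to resume

    if status == "Completed":
        sfn_client.send_task_success(
            taskToken=item["task_token"],
            output=json.dumps({"transform_job_name": job_name, "status": status}),
        )
        return "success"
    if status in ("Failed", "Stopped"):
        sfn_client.send_task_failure(
            taskToken=item["task_token"],
            error="TransformJob" + status,
            cause=str(detail.get("FailureReason", "no failure reason reported")),
        )
        return "failure"
    return "ignored"  # the job has not reached a terminal state yet
```

&lt;p&gt;Keying the lookup by job name keeps the long Task Token out of SageMaker tags entirely; the tag or job name only has to carry a short identifier.&lt;/p&gt;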



&lt;p&gt;&lt;strong&gt;Mock mode flow&lt;/strong&gt; (used in this reference implementation):&lt;/p&gt;

&lt;p&gt;In mock mode, the Task Token is passed directly since no actual SageMaker job is created and the token doesn't need to survive across service boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MOCK_MODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Generate mock segmentation output
&lt;/span&gt;    &lt;span class="n"&gt;mock_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_point_count&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;s3_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;output_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;output_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_labels&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Directly call SendTaskSuccess (no tag length concern in mock mode)
&lt;/span&gt;    &lt;span class="n"&gt;sfn_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_task_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;taskToken&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;task_token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output_path&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets you verify the entire workflow data flow without a trained model or tag length concerns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Observability Stack
&lt;/h2&gt;

&lt;h3&gt;
  
  
  X-Ray Tracing
&lt;/h3&gt;

&lt;p&gt;Every Lambda function and Step Functions state machine in all 14 use cases now supports X-Ray active tracing (enabled by default via &lt;code&gt;EnableXRayTracing=true&lt;/code&gt;). This provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;End-to-end execution visualization&lt;/strong&gt;: See the complete path from EventBridge trigger through Discovery, Processing, and Report stages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency breakdown&lt;/strong&gt;: Identify which service calls (S3 AP, ONTAP API, Bedrock, Textract) contribute most to execution time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error correlation&lt;/strong&gt;: Trace errors back to their source across distributed components&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Graceful Degradation
&lt;/h4&gt;

&lt;p&gt;X-Ray SDK is an optional dependency. If not installed or if &lt;code&gt;ENABLE_XRAY&lt;/code&gt; is set to &lt;code&gt;false&lt;/code&gt;, the tracing decorators become no-ops:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@xray_subsegment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3ap_list_objects&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;use_case&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retail-catalog&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_objects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3_ap_alias&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# If X-Ray SDK is unavailable, this decorator does nothing
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s3_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_objects_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s3_ap_alias&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means existing deployments continue working without modification — X-Ray is purely additive.&lt;/p&gt;
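
&lt;p&gt;A decorator with this fallback behavior could be structured roughly as follows. This is a minimal sketch, not the repository's actual code: the import guard, the way &lt;code&gt;ENABLE_XRAY&lt;/code&gt; is parsed, and the internals of &lt;code&gt;xray_subsegment&lt;/code&gt; shown here are assumptions for illustration.&lt;/p&gt;

```python
import functools
import os

try:
    # aws_xray_sdk is optional; fall back to None when it is not installed.
    from aws_xray_sdk.core import xray_recorder
except ImportError:
    xray_recorder = None


def _tracing_enabled():
    """True only when the SDK imported and ENABLE_XRAY is not 'false'."""
    flag = os.environ.get("ENABLE_XRAY", "true").strip().lower()
    return xray_recorder is not None and flag != "false"


def xray_subsegment(name, annotations=None):
    """Hypothetical decorator: wrap the call in a subsegment, or do nothing."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not _tracing_enabled():
                return func(*args, **kwargs)  # no-op path
            with xray_recorder.in_subsegment(name) as subsegment:
                # The recorder may yield None when no segment is active.
                if subsegment is not None:
                    for key, value in (annotations or {}).items():
                        subsegment.put_annotation(key, value)
                return func(*args, **kwargs)
        return wrapper
    return decorator
```

&lt;p&gt;Checking the flag at call time rather than at import time keeps the decorated function importable everywhere, which is what lets the same handler code run with or without the SDK in its deployment package.&lt;/p&gt;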

&lt;h3&gt;
  
  
  CloudWatch Embedded Metrics Format (EMF)
&lt;/h3&gt;

&lt;p&gt;Every Lambda function emits structured metrics via EMF:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FilesProcessed&lt;/strong&gt; (Count): Number of files processed per invocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ProcessingDuration&lt;/strong&gt; (Milliseconds): End-to-end processing time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ProcessingErrors&lt;/strong&gt; (Count): Number of errors encountered&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BytesProcessed&lt;/strong&gt; (Bytes): Total data volume processed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;EMF writes metrics as structured JSON log lines — no additional &lt;code&gt;PutMetricData&lt;/code&gt; API calls needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_aws"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1700000000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"CloudWatchMetrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Namespace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FSxN-S3AP-Patterns"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Dimensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"UseCase"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FunctionName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Environment"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FilesProcessed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Unit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Count"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ProcessingDuration"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Unit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Milliseconds"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UseCase"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"retail-catalog"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FunctionName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ImageTaggingLambda"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FilesProcessed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ProcessingDuration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2340&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
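
&lt;p&gt;A log line of this shape can be produced with the standard library alone. The sketch below reuses the namespace and dimension names from the example above; the helper names &lt;code&gt;build_emf_record&lt;/code&gt; and &lt;code&gt;emit_emf&lt;/code&gt; are hypothetical and are not taken from the repository.&lt;/p&gt;

```python
import json
import time


def build_emf_record(use_case, function_name, environment, metrics):
    """Build an EMF log record; `metrics` maps name to (value, unit)."""
    dimensions = {
        "UseCase": use_case,
        "FunctionName": function_name,
        "Environment": environment,
    }
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": "FSxN-S3AP-Patterns",
                "Dimensions": [list(dimensions)],
                "Metrics": [
                    {"Name": name, "Unit": unit}
                    for name, (_value, unit) in metrics.items()
                ],
            }],
        },
    }
    # Dimension values and metric values live at the top level of the record.
    record.update(dimensions)
    record.update({name: value for name, (value, _unit) in metrics.items()})
    return record


def emit_emf(**kwargs):
    """Print one JSON line; in Lambda, stdout is forwarded to CloudWatch Logs."""
    print(json.dumps(build_emf_record(**kwargs)))
```

&lt;p&gt;For example, &lt;code&gt;emit_emf(use_case="retail-catalog", function_name="ImageTaggingLambda", environment="prod", metrics={"FilesProcessed": (15, "Count")})&lt;/code&gt; would print a single line that CloudWatch can turn into a metric.&lt;/p&gt;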



&lt;h3&gt;
  
  
  Dashboard and Alerts
&lt;/h3&gt;

&lt;p&gt;A shared CloudFormation template (&lt;code&gt;shared/cfn/observability-dashboard.yaml&lt;/code&gt;) creates a CloudWatch dashboard with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Per-UC widgets: Step Functions success/failure, Lambda error rates, execution duration&lt;/li&gt;
&lt;li&gt;Cross-UC aggregation: Total files processed, overall error rate, P50/P90/P99 latency&lt;/li&gt;
&lt;li&gt;Kinesis widgets (when streaming enabled): Iterator age, incoming records, consumer lag&lt;/li&gt;
&lt;li&gt;SageMaker widgets (when enabled): Job duration, success/failure count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Alert automation (&lt;code&gt;shared/cfn/alert-automation.yaml&lt;/code&gt;) provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step Functions failure rate alarms (default: 3 failures in 5 minutes)&lt;/li&gt;
&lt;li&gt;Lambda error rate alarms (default: 5% in 5 minutes)&lt;/li&gt;
&lt;li&gt;Kinesis iterator age alarms (default: 60 seconds)&lt;/li&gt;
&lt;li&gt;Composite alarms for correlated failures&lt;/li&gt;
&lt;li&gt;SNS notifications with structured messages (email, optional Slack/PagerDuty)&lt;/li&gt;
&lt;/ul&gt;
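
&lt;p&gt;The Lambda error rate alarm is the least obvious of these, because an error rate has to be derived from two metrics. One way to express it is CloudWatch metric math. The sketch below builds keyword arguments for the CloudWatch &lt;code&gt;PutMetricAlarm&lt;/code&gt; API (boto3 &lt;code&gt;put_metric_alarm&lt;/code&gt;); the helper name, alarm name, and missing-data handling are assumptions, and the repository defines its alarms in CloudFormation rather than in Python.&lt;/p&gt;

```python
def lambda_error_rate_alarm(function_name, sns_topic_arn,
                            threshold_percent=5.0, period_seconds=300):
    """Return kwargs for cloudwatch.put_metric_alarm (hypothetical helper)."""
    def lambda_metric(metric_id, metric_name):
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "FunctionName", "Value": function_name},
                    ],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            "ReturnData": False,  # inputs to the expression only
        }

    return {
        "AlarmName": f"{function_name}-error-rate",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": 1,
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
        "Metrics": [
            lambda_metric("errors", "Errors"),
            lambda_metric("invocations", "Invocations"),
            {
                "Id": "error_rate",
                "Expression": "100 * errors / invocations",
                "Label": "Error rate (%)",
                "ReturnData": True,  # the alarm evaluates this series
            },
        ],
    }
```

&lt;p&gt;The returned dictionary can be passed as &lt;code&gt;cloudwatch.put_metric_alarm(**kwargs)&lt;/code&gt;. Exactly one entry in &lt;code&gt;Metrics&lt;/code&gt; has &lt;code&gt;ReturnData&lt;/code&gt; set to true, which is the series the alarm compares against the threshold.&lt;/p&gt;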




&lt;h2&gt;
  
  
  Multi-Region Deployment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Design Principles
&lt;/h3&gt;

&lt;p&gt;All CloudFormation templates use &lt;code&gt;${AWS::Region}&lt;/code&gt; for dynamic resource construction — no hardcoded region references. This means you can deploy to any region where the required services are available.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3 Service Availability
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Availability&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kinesis Data Streams&lt;/td&gt;
&lt;td&gt;Nearly all commercial regions&lt;/td&gt;
&lt;td&gt;Shard pricing varies by region&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SageMaker Batch Transform&lt;/td&gt;
&lt;td&gt;Nearly all regions&lt;/td&gt;
&lt;td&gt;Instance type availability varies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X-Ray&lt;/td&gt;
&lt;td&gt;All commercial regions&lt;/td&gt;
&lt;td&gt;No constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch EMF&lt;/td&gt;
&lt;td&gt;All commercial regions&lt;/td&gt;
&lt;td&gt;No constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Pre-Deployment Checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Check the &lt;a href="https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/" rel="noopener noreferrer"&gt;AWS Regional Services List&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;For Kinesis: Verify shard pricing in your target region&lt;/li&gt;
&lt;li&gt;For SageMaker: Confirm your desired instance type is available&lt;/li&gt;
&lt;li&gt;For Cross-Region UCs (Textract, Comprehend Medical): Confirm target region connectivity&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Verification Results
&lt;/h2&gt;

&lt;p&gt;All Phase 3 features were verified in ap-northeast-1 (Tokyo) against a live FSx for ONTAP environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Environment Verification
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Access Point ListObjectsV2&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;Via fsxn-eda-s3ap alias&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kinesis CreateStream + PutRecord + GetRecords&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;1 shard, SSE-KMS, partition key routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB State Table CRUD&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;PAY_PER_REQUEST, conditional writes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB Dead-Letter Table&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;record_id partition key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E2E Streaming Pipeline&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;Producer → Kinesis → Consumer → DynamoDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudFormation validate-template&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;All 14 UC templates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cfn-lint&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;0 errors across all templates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Dashboard deploy&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;CREATE_COMPLETE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alert Automation deploy&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;CREATE_COMPLETE (KMS + SNS + 3 alarms)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC11 Full Stack deploy&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;36 resources (EnableStreamingMode=true)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC9 Full Stack deploy&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;33 resources (EnableSageMakerTransform=true, MockMode=true)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UC11 Step Functions E2E&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;SUCCEEDED&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Discovery → ImageTagging → CatalogMetadata → QualityCheck (8.974s)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X-Ray Tracing&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;TraceId generated, Stream Producer traces visible in X-Ray console&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Alarms&lt;/td&gt;
&lt;td&gt;✅ PASS&lt;/td&gt;
&lt;td&gt;15 alarms active (12 OK, 1 ALARM from duration baseline, 2 INSUFFICIENT_DATA). The ALARM state was expected due to a deliberately low duration baseline (2x multiplier) used for validation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Local Test Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Suite&lt;/th&gt;
&lt;th&gt;Tests&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;shared/streaming/&lt;/td&gt;
&lt;td&gt;18 (16 unit + 2 property)&lt;/td&gt;
&lt;td&gt;✅ All pass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;shared/observability.py&lt;/td&gt;
&lt;td&gt;23 (19 unit + 4 property)&lt;/td&gt;
&lt;td&gt;✅ All pass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;retail-catalog (Phase 3)&lt;/td&gt;
&lt;td&gt;17 (producer + consumer)&lt;/td&gt;
&lt;td&gt;✅ All pass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;autonomous-driving (Phase 3)&lt;/td&gt;
&lt;td&gt;22 (invoke + callback + properties)&lt;/td&gt;
&lt;td&gt;✅ All pass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total (all UCs)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;573&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;All pass&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Property-Based Tests (Hypothesis)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;StreamingConfig round-trip serialization&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Record batching preserves count and content&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;EMF JSON round-trip validity&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;xray_subsegment no-op when disabled&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Task_Token propagation round-trip&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Point count invariant (input == output)&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Error state propagation (failed → SendTaskFailure)&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Networking and Access
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. S3 Access Points Don't Appear in Bucket Lists
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;aws s3api list-buckets&lt;/code&gt; and &lt;code&gt;aws s3 ls&lt;/code&gt; don't show S3 Access Point aliases. You must access them directly via &lt;code&gt;aws s3 ls s3://&amp;lt;alias&amp;gt;/&lt;/code&gt; or check the FSx console's volume S3 tab. This caught us out during initial verification, when the missing entries led us to think the access points had been deleted.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. S3 Access Point IAM Policies Require Two ARN Formats
&lt;/h4&gt;

&lt;p&gt;In this implementation, IAM policies for S3 Access Points needed both the alias format (&lt;code&gt;arn:aws:s3:::${alias}&lt;/code&gt;) and the ARN format (&lt;code&gt;arn:aws:s3:${region}:${account}:accesspoint/*&lt;/code&gt;). As we understand it, the alias format covers S3 API routing, while the ARN format satisfies IAM policy evaluation. Omitting either format resulted in &lt;code&gt;AccessDenied&lt;/code&gt; errors.&lt;/p&gt;

&lt;p&gt;At a high level, include both forms in your IAM policy Resource block:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Alias format: &lt;code&gt;arn:aws:s3:::${S3AccessPointAlias}&lt;/code&gt; and &lt;code&gt;.../*&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Access point ARN format: &lt;code&gt;arn:aws:s3:${Region}:${AccountId}:accesspoint/*&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the &lt;a href="https://dev.to/yoshikifujiwara/9-more-industry-serverless-patterns-with-fsx-for-ontap-s3-access-points-semiconductor-genomics-15e4"&gt;Phase 2 article's Design Decisions section&lt;/a&gt; or the CloudFormation templates in the repository for the full IAM policy pattern.&lt;/p&gt;
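
&lt;p&gt;When policies are generated in code rather than in a template, the same four resources can be assembled with a small helper. This is a sketch only: the function names are hypothetical, the two actions are assumed for a read-only example, and the alias and access point name are whatever your deployment uses.&lt;/p&gt;

```python
def s3ap_policy_resources(alias, region, account_id, access_point_name):
    """Return the four Resource ARNs: alias form and access point ARN form."""
    alias_arn = f"arn:aws:s3:::{alias}"
    ap_arn = f"arn:aws:s3:{region}:{account_id}:accesspoint/{access_point_name}"
    return [alias_arn, f"{alias_arn}/*", ap_arn, f"{ap_arn}/*"]


def s3ap_read_statement(alias, region, account_id, access_point_name):
    """Build one Allow statement for listing and reading via the access point."""
    return {
        "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetObject"],  # assumed read-only set
        "Resource": s3ap_policy_resources(
            alias, region, account_id, access_point_name
        ),
    }
```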

&lt;h4&gt;
  
  
  3. Verify the Actual DNS and VPC Endpoint Path for S3 Access Points
&lt;/h4&gt;

&lt;p&gt;During verification, S3 AP access from VPC-attached Lambda required careful validation of the DNS resolution path, route table associations, VPC endpoint policies, and access point network origin. Do not assume that creating an S3 Gateway Endpoint alone guarantees successful S3 AP access in every topology.&lt;/p&gt;

&lt;p&gt;S3 Access Points can work with both S3 Gateway Endpoints and S3 Interface Endpoints (&lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/configuring-network-access-for-s3-access-points.html" rel="noopener noreferrer"&gt;AWS docs&lt;/a&gt;). However, the VPC endpoint policy must allow the required S3 Access Point resources and actions, and the IAM policy must include the ARN formats used by the implementation (see Lesson #2 above). Additionally, if the access point uses VPC origin, ensure &lt;code&gt;enableDnsHostnames&lt;/code&gt; and &lt;code&gt;enableDnsSupport&lt;/code&gt; are enabled on the VPC.&lt;/p&gt;

&lt;p&gt;In our case, the root cause was the S3 Gateway Endpoint not being associated with the Lambda subnet's route table — a simple but easy-to-miss configuration issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verified&lt;/strong&gt;: After associating the S3 Gateway Endpoint with the correct route table and fixing the IAM policy (two ARN formats), S3 AP access via Gateway Endpoint worked successfully. No S3 Interface Endpoint was needed.&lt;/p&gt;
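
&lt;p&gt;The route table check that would have caught this can be automated. The sketch below works on plain dictionaries shaped like the &lt;code&gt;RouteTables&lt;/code&gt; and &lt;code&gt;VpcEndpoints&lt;/code&gt; lists returned by the EC2 &lt;code&gt;DescribeRouteTables&lt;/code&gt; and &lt;code&gt;DescribeVpcEndpoints&lt;/code&gt; APIs. The function names are hypothetical, and it assumes both lists were already filtered to the VPC in question.&lt;/p&gt;

```python
def route_table_for_subnet(route_tables, subnet_id):
    """Return the route table ID a subnet uses: explicit association, else main."""
    main_table_id = None
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId") == subnet_id:
                return table["RouteTableId"]
            if assoc.get("Main"):
                main_table_id = table["RouteTableId"]
    return main_table_id


def gateway_endpoint_covers_subnet(vpc_endpoints, route_tables, subnet_id):
    """True if an S3 Gateway endpoint is associated with the subnet's route table."""
    table_id = route_table_for_subnet(route_tables, subnet_id)
    return any(
        endpoint.get("VpcEndpointType") == "Gateway"
        and endpoint.get("ServiceName", "").endswith(".s3")
        and table_id in endpoint.get("RouteTableIds", [])
        for endpoint in vpc_endpoints
    )
```

&lt;p&gt;Running this for every Lambda subnet before deployment turns the missing association into a failed pre-flight check instead of a timeout at runtime.&lt;/p&gt;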

&lt;h4&gt;
  
  
  4. Interface VPC Endpoint Security Group Design
&lt;/h4&gt;

&lt;p&gt;Interface VPC Endpoints should use a &lt;strong&gt;dedicated security group&lt;/strong&gt; (separate from the Lambda SG) with an ingress rule allowing HTTPS (443) from the Lambda security group. Using the same SG for both Lambda and Interface VPC Endpoints creates confusion with self-referencing rules and can lead to connectivity issues. Note that Gateway VPC Endpoints do not use security groups — they rely on route table associations and endpoint policies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment and Packaging
&lt;/h3&gt;

&lt;h4&gt;
  
  
  5. DynamoDB Table Creation Timing
&lt;/h4&gt;

&lt;p&gt;DynamoDB tables in PAY_PER_REQUEST mode take 5-10 seconds to become ACTIVE after CreateTable. PutItem calls issued before then fail with &lt;code&gt;ResourceNotFoundException&lt;/code&gt;. In CloudFormation, dependency ordering handles this; in scripts, always run &lt;code&gt;aws dynamodb wait table-exists --table-name &amp;lt;table-name&amp;gt;&lt;/code&gt; before the first write.&lt;/p&gt;
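
&lt;p&gt;In Python scripts, the boto3 &lt;code&gt;table_exists&lt;/code&gt; waiter plays the same role as the CLI wait command. The sketch below is illustrative: the helper name is hypothetical, the key name is borrowed from the dead-letter table mentioned earlier, and the polling interval is an assumption.&lt;/p&gt;

```python
def create_table_and_wait(dynamodb, table_name, key_name="record_id"):
    """Create an on-demand table, then block until it is ACTIVE.

    `dynamodb` is expected to be a boto3 DynamoDB client, for example
    boto3.client("dynamodb"); it is passed in so the sketch has no imports.
    """
    dynamodb.create_table(
        TableName=table_name,
        AttributeDefinitions=[{"AttributeName": key_name, "AttributeType": "S"}],
        KeySchema=[{"AttributeName": key_name, "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",
    )
    # The waiter polls DescribeTable until the table status is ACTIVE.
    waiter = dynamodb.get_waiter("table_exists")
    waiter.wait(TableName=table_name, WaiterConfig={"Delay": 2, "MaxAttempts": 15})
```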

&lt;h4&gt;
  
  
  6. VPC Lambda ENI Cleanup Takes 10-20 Minutes
&lt;/h4&gt;

&lt;p&gt;When deleting CloudFormation stacks with VPC-attached Lambda functions, the ENI (Elastic Network Interface) cleanup can take 10-20 minutes. This is a known AWS behavior. For stacks stuck in DELETE_FAILED, rerun &lt;code&gt;aws cloudformation delete-stack&lt;/code&gt; with &lt;code&gt;--deletion-mode FORCE_DELETE_STACK&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  7. Handler Path Flattening with &lt;code&gt;aws cloudformation package&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;When &lt;code&gt;aws cloudformation package&lt;/code&gt; uploads Lambda code to S3, it flattens the directory structure. If your template uses &lt;code&gt;Handler: retail-catalog/functions/discovery/handler.handler&lt;/code&gt;, the packaged template must be updated to &lt;code&gt;Handler: handler.handler&lt;/code&gt;. We automated this with a &lt;code&gt;sed&lt;/code&gt; post-processing step in the deploy script.&lt;/p&gt;
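
&lt;p&gt;The same rewrite can be done in Python if you prefer not to depend on &lt;code&gt;sed&lt;/code&gt;. This is a sketch of an equivalent step, not the deploy script itself: the function name is hypothetical, and the pattern only handles unquoted &lt;code&gt;Handler:&lt;/code&gt; values made of letters, digits, dots, underscores, and hyphens.&lt;/p&gt;

```python
import re

# Matches "Handler: some/dir/path/module.function" on one line and captures
# the "Handler: " prefix and the final "module.function" part separately.
_HANDLER_LINE = re.compile(
    r"^([ \t]*Handler:[ \t]*)(?:[\w.\-]+/)+([\w.\-]+)[ \t]*$",
    re.MULTILINE,
)


def flatten_handlers(template_text):
    """Rewrite nested Handler paths to their last path component."""
    return _HANDLER_LINE.sub(r"\1\2", template_text)
```

&lt;p&gt;Handlers that are already flat contain no slash, so they do not match and pass through unchanged, which makes the step safe to run repeatedly.&lt;/p&gt;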

&lt;h4&gt;
  
  
  8. Lambda Packaging: Individual ZIPs Required for Shared Modules
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;aws cloudformation package&lt;/code&gt; zips the template's directory, but shared modules in parent directories are excluded. For this project, each Lambda function requires an individual ZIP containing both &lt;code&gt;handler.py&lt;/code&gt; and the &lt;code&gt;shared/&lt;/code&gt; module at the root level. The deploy script handles this automatically via a &lt;code&gt;package_lambda()&lt;/code&gt; helper function.&lt;/p&gt;
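
&lt;p&gt;The resulting archive layout is easiest to see in code. The sketch below is a Python illustration of that layout using only the standard library; it borrows the &lt;code&gt;package_lambda&lt;/code&gt; name for readability but is not the deploy script's implementation, and the directory arguments are assumptions. The output ZIP should be written outside the function directory so it does not include itself.&lt;/p&gt;

```python
import zipfile
from pathlib import Path


def package_lambda(function_dir, shared_dir, output_zip):
    """Zip a function's files at the archive root plus shared/ alongside them."""
    function_dir, shared_dir = Path(function_dir), Path(shared_dir)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        # handler.py and its siblings land at the root of the archive.
        for path in sorted(function_dir.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(function_dir).as_posix())
        # Shared code keeps its package name so "import shared..." resolves.
        for path in sorted(shared_dir.rglob("*")):
            if path.is_file() and "__pycache__" not in path.parts:
                arcname = Path("shared") / path.relative_to(shared_dir)
                archive.write(path, arcname.as_posix())
    return Path(output_zip)
```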

&lt;h3&gt;
  
  
  Workflow and ML Integration
&lt;/h3&gt;

&lt;h4&gt;
  
  
  9. Task Token Length and SageMaker Job Tags
&lt;/h4&gt;

&lt;p&gt;AWS resource tags typically have a 256-character value limit. Step Functions Task Tokens can exceed this. For production SageMaker integrations, store the Task Token in DynamoDB and pass only a short correlation ID as a job tag. The mock mode in this reference implementation passes the token directly since no actual SageMaker job is created.&lt;/p&gt;
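
&lt;p&gt;The production approach could look roughly like this. It is a sketch under stated assumptions: the table schema (&lt;code&gt;correlation_id&lt;/code&gt; as partition key, &lt;code&gt;task_token&lt;/code&gt; as attribute) and both function names are hypothetical, and the clients are expected to be boto3 &lt;code&gt;dynamodb&lt;/code&gt; and &lt;code&gt;stepfunctions&lt;/code&gt; clients passed in by the caller.&lt;/p&gt;

```python
import json
import uuid


def store_task_token(dynamodb, table_name, task_token):
    """Save the token and return a short ID that fits in a tag value."""
    correlation_id = uuid.uuid4().hex  # 32 characters
    dynamodb.put_item(
        TableName=table_name,
        Item={
            "correlation_id": {"S": correlation_id},
            "task_token": {"S": task_token},
        },
    )
    return correlation_id


def complete_task(dynamodb, stepfunctions, table_name, correlation_id, result):
    """Look the token up by correlation ID and resume the state machine."""
    item = dynamodb.get_item(
        TableName=table_name,
        Key={"correlation_id": {"S": correlation_id}},
    )["Item"]
    stepfunctions.send_task_success(
        taskToken=item["task_token"]["S"],
        output=json.dumps(result),
    )
```

&lt;p&gt;The invoking Lambda would call &lt;code&gt;store_task_token&lt;/code&gt; and attach the returned ID as the job tag; the callback Lambda reads the tag and calls &lt;code&gt;complete_task&lt;/code&gt;. A failure path would use &lt;code&gt;send_task_failure&lt;/code&gt; in the same way.&lt;/p&gt;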

&lt;h4&gt;
  
  
  10. Opt-in Design Validates Backward Compatibility
&lt;/h4&gt;

&lt;p&gt;By defaulting streaming and SageMaker features to disabled (CloudFormation Conditions), we confirmed zero impact on existing Phase 1/2 deployments. The same template works for both "Phase 2 mode" (features disabled) and "Phase 3 mode" (features enabled).&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Phase 3 transforms the FSx for ONTAP S3 Access Points pattern collection from a batch-oriented toolkit into a near-real-time capable platform with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Faster downstream processing&lt;/strong&gt;: Kinesis streaming enables seconds-level processing after minute-level change detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ML integration&lt;/strong&gt;: SageMaker Callback Pattern provides scalable, cost-effective inference without persistent endpoints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production visibility&lt;/strong&gt;: X-Ray + EMF + Dashboard + Alerts give operators full observability across all 14 use cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Streaming and SageMaker features are opt-in with zero cost when disabled. Observability is enabled by default but can be individually toggled, maintaining backward compatibility with Phase 1/2 deployments.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's Next
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Event-driven architecture exploration (when FSx ONTAP S3 AP supports native notifications — eliminating the polling requirement entirely)&lt;/li&gt;
&lt;li&gt;DynamoDB-based Task Token storage for production SageMaker integrations&lt;/li&gt;
&lt;li&gt;Additional ML patterns (real-time inference endpoints, A/B testing)&lt;/li&gt;
&lt;li&gt;Multi-account deployment patterns with AWS Organizations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article is part of the "FSx for ONTAP S3 Access Points" series. See &lt;a href="https://dev.to/yoshikifujiwara/fsx-for-ontap-s3-access-points-as-a-serverless-automation-boundary-ai-data-pipelines-ili"&gt;Phase 1&lt;/a&gt; and &lt;a href="https://dev.to/yoshikifujiwara/9-more-industry-serverless-patterns-with-fsx-for-ontap-s3-access-points-semiconductor-genomics-15e4"&gt;Phase 2&lt;/a&gt; for the foundation.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>9 More Industry Serverless Patterns with FSx for ONTAP S3 Access Points — Semiconductor, Genomics, Energy, and Beyond</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Tue, 05 May 2026 14:35:03 +0000</pubDate>
      <link>https://forem.com/yoshikifujiwara/9-more-industry-serverless-patterns-with-fsx-for-ontap-s3-access-points-semiconductor-genomics-15e4</link>
      <guid>https://forem.com/yoshikifujiwara/9-more-industry-serverless-patterns-with-fsx-for-ontap-s3-access-points-semiconductor-genomics-15e4</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;This is &lt;strong&gt;Phase 2&lt;/strong&gt; of the FSx for ONTAP S3 Access Points serverless patterns collection. Building on the &lt;a href="https://dev.to/yoshikifujiwara/industry-specific-serverless-automation-patterns-with-fsx-for-ontap-s3-access-points-3e0a"&gt;5 patterns from Phase 1&lt;/a&gt;, we add &lt;strong&gt;9 new industry-specific patterns&lt;/strong&gt; covering semiconductor, genomics, energy, autonomous driving, construction, retail, logistics, education, and insurance.&lt;/p&gt;

&lt;p&gt;Key additions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cross-region AI/ML&lt;/strong&gt;: Textract and Comprehend Medical routed from ap-northeast-1 to us-east-1&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large-file / high-object-count building blocks&lt;/strong&gt;: Streaming download, multipart upload, 10K+ object pagination&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core AI/ML integrations E2E verified via Lambda&lt;/strong&gt;: Rekognition (15 labels), Textract (text extraction), Comprehend Medical (entity detection), Bedrock (report generation), Athena (SQL queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;9 CloudFormation stacks deployed&lt;/strong&gt;: 205 resources, all Step Functions SUCCEEDED&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;UC&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Main Data Types&lt;/th&gt;
&lt;th&gt;AWS Services&lt;/th&gt;
&lt;th&gt;Verification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UC6&lt;/td&gt;
&lt;td&gt;Semiconductor / EDA&lt;/td&gt;
&lt;td&gt;GDS, OASIS&lt;/td&gt;
&lt;td&gt;Athena, Bedrock&lt;/td&gt;
&lt;td&gt;✅ E2E (Athena 4 queries + Bedrock report)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC7&lt;/td&gt;
&lt;td&gt;Genomics&lt;/td&gt;
&lt;td&gt;FASTQ, VCF&lt;/td&gt;
&lt;td&gt;Athena, Bedrock, Comprehend Medical&lt;/td&gt;
&lt;td&gt;✅ E2E (entity detection via cross-region)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC8&lt;/td&gt;
&lt;td&gt;Energy / Oil &amp;amp; Gas&lt;/td&gt;
&lt;td&gt;SEG-Y, Well Logs&lt;/td&gt;
&lt;td&gt;Athena, Bedrock, Rekognition (optional)&lt;/td&gt;
&lt;td&gt;✅ E2E (SEG-Y header + anomaly detection)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC9&lt;/td&gt;
&lt;td&gt;Autonomous Driving&lt;/td&gt;
&lt;td&gt;Video, LiDAR&lt;/td&gt;
&lt;td&gt;Rekognition, Bedrock&lt;/td&gt;
&lt;td&gt;✅ Step Functions SUCCEEDED&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC10&lt;/td&gt;
&lt;td&gt;Construction / AEC&lt;/td&gt;
&lt;td&gt;IFC, PDF&lt;/td&gt;
&lt;td&gt;Textract, Bedrock, Rekognition&lt;/td&gt;
&lt;td&gt;✅ Textract cross-region + workflow succeeded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC11&lt;/td&gt;
&lt;td&gt;Retail / E-Commerce&lt;/td&gt;
&lt;td&gt;Product Images&lt;/td&gt;
&lt;td&gt;Rekognition, Bedrock&lt;/td&gt;
&lt;td&gt;✅ E2E (15 labels detected)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC12&lt;/td&gt;
&lt;td&gt;Logistics&lt;/td&gt;
&lt;td&gt;Delivery Slips, Images&lt;/td&gt;
&lt;td&gt;Textract, Rekognition, Bedrock&lt;/td&gt;
&lt;td&gt;✅ E2E (text extraction cross-region)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC13&lt;/td&gt;
&lt;td&gt;Education / Research&lt;/td&gt;
&lt;td&gt;PDF Papers&lt;/td&gt;
&lt;td&gt;Textract, Comprehend, Bedrock&lt;/td&gt;
&lt;td&gt;✅ Step Functions SUCCEEDED&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC14&lt;/td&gt;
&lt;td&gt;Insurance / Claims&lt;/td&gt;
&lt;td&gt;Photos, Estimates&lt;/td&gt;
&lt;td&gt;Rekognition, Textract, Bedrock&lt;/td&gt;
&lt;td&gt;✅ E2E (labels + OCR cross-region)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  IAM Policy for S3 Access Points
&lt;/h3&gt;

&lt;p&gt;FSx ONTAP S3 Access Points require &lt;strong&gt;two ARN formats&lt;/strong&gt; in IAM policies. In this implementation, both formats were required to satisfy S3 API access and IAM evaluation paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:::${S3AccessPointAlias}"&lt;/span&gt;        &lt;span class="c1"&gt;# Alias format (S3 API)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:::${S3AccessPointAlias}/*"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:${AWS::Region}:${AWS::AccountId}:accesspoint/${S3AccessPointName}"&lt;/span&gt;  &lt;span class="c1"&gt;# ARN format (IAM evaluation)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:${AWS::Region}:${AWS::AccountId}:accesspoint/${S3AccessPointName}/*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VPC Endpoints for Lambda
&lt;/h3&gt;

&lt;p&gt;In the private-subnet / no-NAT deployment model, the Lambda functions need the following endpoints:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Costs are approximate (single-AZ in ap-northeast-1) and vary by region and AZ count.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Secrets Manager&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;~$7.20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSx&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;~$7.20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Monitoring&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;~$7.20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Logs&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;~$7.20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SNS&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;~$7.20/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3&lt;/td&gt;
&lt;td&gt;Gateway&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key lesson&lt;/strong&gt;: The &lt;code&gt;monitoring&lt;/code&gt; endpoint is for CloudWatch Metrics, not Logs. You need a separate &lt;code&gt;logs&lt;/code&gt; endpoint for Lambda to write CloudWatch Logs from inside a VPC. The SNS endpoint is required for notification publishing from Report Lambda in private subnets.&lt;/p&gt;

&lt;h3&gt;
  
  
  boto3 Service Name Gotcha
&lt;/h3&gt;

&lt;p&gt;The correct boto3 service name for Comprehend Medical is &lt;code&gt;comprehendmedical&lt;/code&gt; (no hyphen), not &lt;code&gt;comprehend-medical&lt;/code&gt;. This caused silent failures in early testing where the service was skipped with a WARNING rather than crashing the workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's New in Phase 2
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cross-Region Client
&lt;/h3&gt;

&lt;p&gt;Textract and Comprehend Medical are unavailable in ap-northeast-1 (Tokyo). Phase 2 introduces a &lt;code&gt;CrossRegionClient&lt;/code&gt; that transparently routes API calls to us-east-1:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;shared.cross_region_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CrossRegionClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CrossRegionConfig&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CrossRegionConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;target_region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;services&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;comprehendmedical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CrossRegionClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Textract in us-east-1
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_bytes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pdf_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Comprehend Medical in us-east-1
&lt;/span&gt;&lt;span class="n"&gt;entities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_entities_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;medical_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client includes an allow-list to prevent accidental cross-region calls to unintended services, and raises &lt;code&gt;CrossRegionClientError&lt;/code&gt; with region and service context for debugging.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Data residency note&lt;/strong&gt;: For regulated workloads, cross-region invocation should be explicitly reviewed for data residency, audit logging, and compliance requirements. The allow-list in &lt;code&gt;CrossRegionClient&lt;/code&gt; is intended to make cross-region behavior explicit rather than implicit.&lt;/p&gt;
&lt;/blockquote&gt;
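&lt;p&gt;The allow-list idea can be sketched roughly as follows. This is an illustration only and not the actual &lt;code&gt;shared.cross_region_client&lt;/code&gt; module: the class name &lt;code&gt;AllowListedRouter&lt;/code&gt;, its &lt;code&gt;client_for&lt;/code&gt; method, and the injectable &lt;code&gt;factory&lt;/code&gt; argument are assumptions made for the example, and the error class here only mirrors the name used in the article.&lt;/p&gt;

```python
from typing import Any, Callable, Dict, Iterable


class CrossRegionClientError(Exception):
    """Error carrying service and region context for debugging."""

    def __init__(self, service: str, region: str, reason: str) -> None:
        super().__init__(f"[{service} @ {region}] {reason}")
        self.service = service
        self.region = region


class AllowListedRouter:
    """Hypothetical stand-in that only routes allow-listed services."""

    def __init__(
        self,
        target_region: str,
        services: Iterable[str],
        factory: Callable[..., Any],
    ) -> None:
        self.target_region = target_region
        self.allowed = frozenset(services)
        self._factory = factory
        self._clients: Dict[str, Any] = {}

    def client_for(self, service: str) -> Any:
        """Return a cached client for the target region, or refuse."""
        if service not in self.allowed:
            raise CrossRegionClientError(
                service, self.target_region, "service is not on the allow-list"
            )
        if service not in self._clients:
            self._clients[service] = self._factory(
                service, region_name=self.target_region
            )
        return self._clients[service]
```

&lt;p&gt;In a real deployment the factory would typically be &lt;code&gt;boto3.client&lt;/code&gt;; passing it in keeps the routing rule testable on its own.&lt;/p&gt;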

&lt;h3&gt;
  
  
  Streaming Download &amp;amp; Multipart Upload
&lt;/h3&gt;

&lt;p&gt;Phase 2 use cases are designed for large-file and high-object-count workloads such as SEG-Y, FASTQ/VCF, BIM, and media assets. The &lt;code&gt;S3ApHelper&lt;/code&gt; now supports:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Streaming download — never loads entire file into memory
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;s3ap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;streaming_download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;large-file.segy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Range download — read only SEG-Y header (first 3600 bytes)
&lt;/span&gt;&lt;span class="n"&gt;header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3ap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;streaming_download_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;survey.segy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3599&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Multipart upload — automatic abort on failure
&lt;/span&gt;&lt;span class="n"&gt;s3ap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;multipart_upload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output.parquet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_chunks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;part_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Discovery Lambda Pagination
&lt;/h3&gt;

&lt;p&gt;For volumes with 10,000+ objects, Discovery Lambda automatically paginates manifests into chunks for Step Functions Map processing.&lt;/p&gt;
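&lt;p&gt;The chunking step can be sketched as below. The function name &lt;code&gt;chunk_manifest&lt;/code&gt; and the default of 1,000 keys per chunk are assumptions for illustration; the article does not state the chunk size the repository uses.&lt;/p&gt;

```python
from typing import Iterator, List, Sequence


def chunk_manifest(keys: Sequence[str], chunk_size: int = 1000) -> Iterator[List[str]]:
    """Split a list of object keys into consecutive fixed-size chunks."""
    if 1 > chunk_size:
        raise ValueError("chunk_size must be at least 1")
    for offset in range(0, len(keys), chunk_size):
        # The final chunk may be shorter than chunk_size.
        yield list(keys[offset:offset + chunk_size])
```

&lt;p&gt;Each chunk can then be handed to one Map iteration, which keeps the payload passed between states bounded.&lt;/p&gt;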




&lt;h2&gt;
  
  
  The 9 New Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  UC6: Semiconductor / EDA — Design File Validation
&lt;/h3&gt;

&lt;p&gt;Detects GDS/OASIS design files, extracts metadata (library name, cell count, bounding box, creation date), aggregates DRC statistics with Athena SQL, and generates design review reports with Bedrock.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Map(MetadataExtraction) → DrcAggregation(Athena) → ReportGeneration(Bedrock + SNS)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Athena, Glue Data Catalog, Bedrock (Nova Lite)&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ GDS metadata extracted, 4 Athena queries succeeded, Bedrock report generated&lt;/p&gt;
&lt;h3&gt;
  
  
  UC7: Genomics / Bioinformatics — Quality Check &amp;amp; Variant Aggregation
&lt;/h3&gt;

&lt;p&gt;Processes FASTQ files for quality metrics (total reads, average quality score, GC content), aggregates VCF variant statistics (SNP count, indel count, Ti/Tv ratio), and generates research summaries with biomedical entity extraction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[QcMap(FASTQ), VariantMap(VCF)] → AthenaAnalysis → Summary(Bedrock + Comprehend Medical)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Athena, Bedrock, Comprehend Medical (cross-region us-east-1)&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ QC metrics extracted, variants aggregated, Comprehend Medical detected entities from generated biomedical summary text&lt;/p&gt;
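&lt;p&gt;As a rough illustration of the FASTQ quality metrics named above, the sketch below computes total reads, average quality score, and GC content. It is not the UC7 Lambda code: &lt;code&gt;fastq_metrics&lt;/code&gt; is a hypothetical name, and the sketch assumes strict four-line records with no blank lines and Phred+33 quality encoding.&lt;/p&gt;

```python
from typing import Dict, Iterable


def fastq_metrics(lines: Iterable[str], phred_offset: int = 33) -> Dict[str, float]:
    """Compute read count, mean per-base quality, and GC fraction."""
    total_reads = 0
    base_count = 0
    gc_count = 0
    quality_sum = 0
    for index, raw in enumerate(lines):
        line = raw.strip()
        position = index % 4  # 0: header, 1: sequence, 2: separator, 3: quality
        if position == 1:
            total_reads += 1
            base_count += len(line)
            gc_count += sum(1 for base in line.upper() if base in "GC")
        elif position == 3:
            quality_sum += sum(ord(char) - phred_offset for char in line)
    if base_count == 0:
        return {"total_reads": 0, "avg_quality": 0.0, "gc_content": 0.0}
    return {
        "total_reads": total_reads,
        "avg_quality": quality_sum / base_count,
        "gc_content": gc_count / base_count,
    }
```

&lt;p&gt;Because the function consumes any iterable of lines, it can be fed from a streamed download without holding the whole file in memory.&lt;/p&gt;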
&lt;h3&gt;
  
  
  UC8: Energy / Oil &amp;amp; Gas — Seismic Data Processing
&lt;/h3&gt;

&lt;p&gt;Reads the SEG-Y file headers (the 3200-byte textual header plus the 400-byte binary header, fetched as the first 3600 bytes via range download) for survey metadata, detects anomalies in well-log sensor readings using statistical thresholds, and generates compliance reports. Rekognition is optionally used for image-based inspection of well-log visualization artifacts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[SeismicMetadata(Range DL), AnomalyDetection(Well Logs)] → AthenaAnalysis → ComplianceReport(Bedrock + Rekognition)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Athena, Bedrock, Rekognition (well-log image pattern recognition)&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ SEG-Y header parsed, anomaly detection executed, compliance report generated&lt;/p&gt;
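&lt;p&gt;To make the header step concrete, here is a minimal sketch of reading three fields from the first 3600 bytes of a SEG-Y file. It is an illustration under stated assumptions rather than the UC8 code: &lt;code&gt;parse_segy_binary_header&lt;/code&gt; is a hypothetical name, and the offsets follow the SEG-Y rev 1 layout (big-endian two-byte integers at file bytes 3217, 3221, and 3225 for sample interval, samples per trace, and data format code).&lt;/p&gt;

```python
import struct
from typing import Dict

TEXT_HEADER_BYTES = 3200    # textual (EBCDIC or ASCII) file header
BINARY_HEADER_BYTES = 400   # binary file header that follows it


def parse_segy_binary_header(header: bytes) -> Dict[str, int]:
    """Extract a few survey fields from the leading 3600 bytes."""
    needed = TEXT_HEADER_BYTES + BINARY_HEADER_BYTES
    if needed > len(header):
        raise ValueError(f"need at least {needed} bytes, got {len(header)}")
    binary = header[TEXT_HEADER_BYTES:needed]
    # Offsets below are relative to the start of the binary header.
    (sample_interval_us,) = struct.unpack(">h", binary[16:18])
    (samples_per_trace,) = struct.unpack(">h", binary[20:22])
    (format_code,) = struct.unpack(">h", binary[24:26])
    return {
        "sample_interval_us": sample_interval_us,
        "samples_per_trace": samples_per_trace,
        "format_code": format_code,
    }
```

&lt;p&gt;Since only the header bytes are needed, a single range request is enough even for multi-gigabyte surveys.&lt;/p&gt;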
&lt;h3&gt;
  
  
  UC9: Autonomous Driving / ADAS — Labeling Preprocessing
&lt;/h3&gt;

&lt;p&gt;Extracts keyframes from dashcam video, performs Rekognition object detection (vehicles, pedestrians, and other road-scene labels), validates LiDAR point cloud data integrity, and generates COCO-compatible annotation suggestions with Bedrock.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[FrameExtraction(Rekognition), PointCloudQC] → AnnotationManager(Bedrock)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Rekognition, Bedrock&lt;br&gt;
&lt;strong&gt;Extension&lt;/strong&gt;: SageMaker Batch Transform for point cloud segmentation (planned)&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ Step Functions SUCCEEDED&lt;/p&gt;
&lt;h3&gt;
  
  
  UC10: Construction / AEC — BIM Model Management
&lt;/h3&gt;

&lt;p&gt;Parses IFC files for building metadata, performs version diff detection, OCRs blueprint PDFs with Textract (cross-region), and checks safety compliance rules with Bedrock + Rekognition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[BimParse(IFC), OcrMap(Textract)] → SafetyCheck(Bedrock + Rekognition)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Textract (cross-region), Bedrock, Rekognition&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ Textract text extraction confirmed, Step Functions workflow succeeded&lt;/p&gt;
&lt;h3&gt;
  
  
  UC11: Retail / E-Commerce — Product Image Tagging
&lt;/h3&gt;

&lt;p&gt;Detects product images, performs Rekognition label detection with confidence scoring, generates structured catalog metadata with Bedrock, and flags low-quality images for manual review.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → ImageTagging(Rekognition) → CatalogMetadata(Bedrock) → QualityCheck
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Rekognition, Bedrock&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ &lt;strong&gt;15 labels detected&lt;/strong&gt; (Lighting 98.5%, Light 96.0%, Purple 92.0%)&lt;/p&gt;
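&lt;p&gt;The confidence-based review flag could look something like the sketch below. It assumes the standard &lt;code&gt;DetectLabels&lt;/code&gt; response shape (a &lt;code&gt;Labels&lt;/code&gt; list whose items carry &lt;code&gt;Name&lt;/code&gt; and &lt;code&gt;Confidence&lt;/code&gt;); the function name &lt;code&gt;triage_labels&lt;/code&gt;, the 90% default threshold, and the rule "flag the image when no label qualifies" are assumptions for illustration, not the rule the UC11 Lambda necessarily applies.&lt;/p&gt;

```python
from typing import Any, Dict, List


def triage_labels(response: Dict[str, Any], min_confidence: float = 90.0) -> Dict[str, Any]:
    """Keep confident labels and flag the image when none qualify."""
    confident: List[str] = [
        label["Name"]
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return {"labels": confident, "needs_manual_review": not confident}
```
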
&lt;h3&gt;
  
  
  UC12: Logistics / Supply Chain — Delivery Slip OCR
&lt;/h3&gt;

&lt;p&gt;OCRs delivery slips with Textract (cross-region), normalizes extracted fields with Bedrock, analyzes warehouse inventory images with Rekognition, and generates delivery and routing summary reports.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[OcrMap(Textract), InventoryMap(Rekognition)] → DataStructuring(Bedrock) → Report(Bedrock + SNS)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Textract (cross-region), Rekognition, Bedrock&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ Textract extraction confirmed on generated test PDF, inventory analysis completed&lt;/p&gt;
&lt;h3&gt;
  
  
  UC13: Education / Research — Paper Classification
&lt;/h3&gt;

&lt;p&gt;OCRs research PDFs with Textract (cross-region), classifies topics with Comprehend, builds citation networks from reference sections, and generates structured metadata.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → OcrMap(Textract) → Classification(Comprehend + Bedrock) → CitationAnalysis → Metadata
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Textract (cross-region), Comprehend, Bedrock&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ Step Functions SUCCEEDED&lt;/p&gt;
&lt;h3&gt;
  
  
  UC14: Insurance / Claims — Damage Assessment
&lt;/h3&gt;

&lt;p&gt;Detects accident photos and estimate documents, uses Rekognition labels as inputs for preliminary damage triage, OCRs estimates with Textract (cross-region), and generates comprehensive claims reports correlating photo evidence with estimate data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discovery → Parallel[DamageAssessment(Rekognition), EstimateOcr(Textract)] → ClaimsReport(Bedrock + SNS)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Rekognition, Textract (cross-region), Bedrock&lt;br&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: ✅ &lt;strong&gt;Rekognition labels detected + Textract extracted tracking/estimate text from generated test document&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  AI/ML Service Verification Results
&lt;/h2&gt;

&lt;p&gt;Core services were verified via &lt;strong&gt;Lambda E2E execution&lt;/strong&gt; (not just direct API calls):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;UC&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rekognition DetectLabels&lt;/td&gt;
&lt;td&gt;UC11&lt;/td&gt;
&lt;td&gt;✅ 15 labels (Lighting 98.5%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rekognition DetectLabels&lt;/td&gt;
&lt;td&gt;UC14&lt;/td&gt;
&lt;td&gt;✅ damage_assessment with labels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Textract DetectDocumentText&lt;/td&gt;
&lt;td&gt;UC12&lt;/td&gt;
&lt;td&gt;✅ Text extracted from generated test PDF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Textract DetectDocumentText&lt;/td&gt;
&lt;td&gt;UC14&lt;/td&gt;
&lt;td&gt;✅ Tracking/estimate text extracted from generated test document&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Comprehend Medical DetectEntitiesV2&lt;/td&gt;
&lt;td&gt;UC7&lt;/td&gt;
&lt;td&gt;✅ Entity detection executed on biomedical summary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock InvokeModel (Nova Lite)&lt;/td&gt;
&lt;td&gt;UC6&lt;/td&gt;
&lt;td&gt;✅ Design review report generated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena StartQueryExecution&lt;/td&gt;
&lt;td&gt;UC6&lt;/td&gt;
&lt;td&gt;✅ 4 queries (cell_count, bbox, naming, invalid)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h2&gt;
  
  
  Issues Discovered During Phase 2 Verification
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Root Cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Discovery Lambda timeout (300s)&lt;/td&gt;
&lt;td&gt;Lambda in a public subnet (its ENI gets no public IP) + no VPC Endpoints&lt;/td&gt;
&lt;td&gt;Private subnet + VPC Endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;S3 AP AccessDenied&lt;/td&gt;
&lt;td&gt;IAM policy missing ARN format&lt;/td&gt;
&lt;td&gt;Both Alias + ARN formats&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Athena RLIKE syntax error&lt;/td&gt;
&lt;td&gt;Athena (Trino) doesn't support RLIKE&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;REGEXP_LIKE()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Missing CloudWatch Logs endpoint&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;monitoring&lt;/code&gt; ≠ &lt;code&gt;logs&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Added separate Logs endpoint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Step Functions ItemsPath mismatch&lt;/td&gt;
&lt;td&gt;Discovery returns &lt;code&gt;objects&lt;/code&gt; but SFN expects &lt;code&gt;fastq_objects&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Added file-type classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Comprehend Medical service name&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;comprehend-medical&lt;/code&gt; is invalid&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;comprehendmedical&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Rekognition InvalidImageFormat&lt;/td&gt;
&lt;td&gt;284-byte invalid JPEG&lt;/td&gt;
&lt;td&gt;Replaced with a valid 200×200 PNG (56 KB)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Processing Lambda S3 AP AccessDenied&lt;/td&gt;
&lt;td&gt;Only Discovery role had S3 AP permissions&lt;/td&gt;
&lt;td&gt;Added to all Processing roles&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h2&gt;
  
  
  File-Type Classification in Discovery Lambda
&lt;/h2&gt;

&lt;p&gt;Each UC's Discovery Lambda classifies detected files by type and returns UC-specific keys matching the Step Functions Map &lt;code&gt;ItemsPath&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# UC7 Genomics Discovery returns:
&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;objects&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;all_objects&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# All detected files
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fastq_objects&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fastq_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# → QcMap ItemsPath
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vcf_objects&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vcf_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# → VariantMap ItemsPath
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ontap_metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows Step Functions to route different file types to different processing branches without additional Lambda invocations.&lt;/p&gt;
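&lt;p&gt;A classification step of this kind might be sketched as follows. The function name &lt;code&gt;classify_objects&lt;/code&gt; and the exact extension lists are assumptions for illustration; only the returned key names (&lt;code&gt;objects&lt;/code&gt;, &lt;code&gt;fastq_objects&lt;/code&gt;, &lt;code&gt;vcf_objects&lt;/code&gt;) come from the article.&lt;/p&gt;

```python
from typing import Dict, Iterable, List

# Assumed extension sets; adjust to the naming used on the volume.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")
VCF_SUFFIXES = (".vcf", ".vcf.gz")


def classify_objects(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group object keys by file type under the keys the Map states read."""
    all_keys = list(keys)
    fastq: List[str] = []
    vcf: List[str] = []
    for key in all_keys:
        lowered = key.lower()
        if lowered.endswith(FASTQ_SUFFIXES):
            fastq.append(key)
        elif lowered.endswith(VCF_SUFFIXES):
            vcf.append(key)
    return {"objects": all_keys, "fastq_objects": fastq, "vcf_objects": vcf}
```

&lt;p&gt;Always returning every key, even as an empty list, means a Map state whose &lt;code&gt;ItemsPath&lt;/code&gt; points at it never fails on a missing field.&lt;/p&gt;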




&lt;h2&gt;
  
  
  Deployment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Quick Start (Batch Deploy)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns

&lt;span class="c"&gt;# Generate deployment templates&lt;/span&gt;
./scripts/regenerate_deploy_templates.sh

&lt;span class="c"&gt;# Package all Lambda functions&lt;/span&gt;
./scripts/deploy_phase2_batch.sh package

&lt;span class="c"&gt;# Deploy all 9 stacks&lt;/span&gt;
./scripts/deploy_phase2_batch.sh deploy

&lt;span class="c"&gt;# Check status&lt;/span&gt;
./scripts/deploy_phase2_batch.sh status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Test Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate and upload test data (GDS, FASTQ, VCF, SEG-Y, IFC, PNG, PDF)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;S3_AP_ALIAS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;your-s3-ap-alias&amp;gt;"&lt;/span&gt;
python3 scripts/generate_test_data.py all &lt;span class="nt"&gt;--upload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verify shared/ modules
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 docs/verification-scripts/verify_phase2_shared.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--s3-ap-alias&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;your-s3-ap-alias&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-bucket&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;your-output-bucket&amp;gt;"&lt;/span&gt;
&lt;span class="c"&gt;# Result: 8/8 PASSED&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;p&gt;Phase 2 uses the same cost-optimized architecture as Phase 1:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment&lt;/th&gt;
&lt;th&gt;Fixed/mo&lt;/th&gt;
&lt;th&gt;Variable/mo&lt;/th&gt;
&lt;th&gt;Total/mo&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Demo/PoC&lt;/td&gt;
&lt;td&gt;~$0&lt;/td&gt;
&lt;td&gt;~$1–$3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1–$3&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production (1 UC)&lt;/td&gt;
&lt;td&gt;~$36&lt;/td&gt;
&lt;td&gt;~$1–$3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$37–$39&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production (all 14 UCs)&lt;/td&gt;
&lt;td&gt;~$36&lt;/td&gt;
&lt;td&gt;~$14–$42&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$50–$78&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;VPC Endpoints are shared across all UCs in the same VPC — deploy the first UC with &lt;code&gt;EnableVpcEndpoints=true&lt;/code&gt;, subsequent UCs with &lt;code&gt;false&lt;/code&gt;. Variable costs depend on object count, document/image size, and AI/ML service usage.&lt;/p&gt;
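&lt;p&gt;The table rows are consistent with a simple model: one shared fixed cost plus a per-UC variable cost. The sketch below reproduces that arithmetic. The function name &lt;code&gt;monthly_cost_range&lt;/code&gt; is hypothetical, and the linear scaling of variable cost with UC count is an assumption inferred from the table, not a pricing guarantee.&lt;/p&gt;

```python
from typing import Tuple


def monthly_cost_range(
    uc_count: int,
    fixed: float = 36.0,
    variable_low: float = 1.0,
    variable_high: float = 3.0,
) -> Tuple[float, float]:
    """Return (low, high) monthly USD estimates for a shared-VPC deployment."""
    # The fixed part is paid once because VPC Endpoints are shared across UCs.
    return (fixed + uc_count * variable_low, fixed + uc_count * variable_high)
```

&lt;p&gt;For example, &lt;code&gt;monthly_cost_range(14)&lt;/code&gt; gives the ~$50 to ~$78 range shown in the last row, and &lt;code&gt;monthly_cost_range(1)&lt;/code&gt; gives ~$37 to ~$39.&lt;/p&gt;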




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;SageMaker Batch Transform integration for UC9 (autonomous driving point cloud segmentation)&lt;/li&gt;
&lt;li&gt;Real-time streaming with Kinesis for high-frequency sensor data&lt;/li&gt;
&lt;li&gt;Multi-account deployment patterns with AWS Organizations&lt;/li&gt;
&lt;li&gt;Cost/latency trade-off evaluation of Lambda Provisioned Concurrency for latency-sensitive UCs&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 Article&lt;/strong&gt;: &lt;a href="https://dev.to/yoshikifujiwara/industry-specific-serverless-automation-patterns-with-fsx-for-ontap-s3-access-points-3e0a"&gt;Industry-Specific Serverless Automation Patterns with FSx for ONTAP S3 Access Points&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>Industry-Specific Serverless Automation Patterns with FSx for ONTAP S3 Access Points</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 03 May 2026 20:29:26 +0000</pubDate>
      <link>https://forem.com/yoshikifujiwara/industry-specific-serverless-automation-patterns-with-fsx-for-ontap-s3-access-points-3e0a</link>
      <guid>https://forem.com/yoshikifujiwara/industry-specific-serverless-automation-patterns-with-fsx-for-ontap-s3-access-points-3e0a</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points let you build &lt;strong&gt;industry-specific serverless data pipelines&lt;/strong&gt; against NAS data — without moving files — using EventBridge Scheduler, Step Functions, and AWS AI/ML services. This article introduces 5 use-case patterns and 3 extension patterns, all backed by a &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;reference implementation repository&lt;/a&gt; with CloudFormation templates, shared Python modules, and property-based tests.&lt;/p&gt;

&lt;p&gt;This is a continuation of &lt;a href="https://dev.to/yoshikifujiwara/fsx-for-ontap-s3-access-points-as-a-serverless-automation-boundary-ai-data-pipelines-ili"&gt;FSx for ONTAP S3 Access Points as a Serverless Automation Boundary&lt;/a&gt;. While the previous article covered the operational automation layer, this one focuses on &lt;strong&gt;concrete, reusable industry patterns&lt;/strong&gt; with full deployment instructions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Polling-Based? The S3 AP Constraint
&lt;/h2&gt;

&lt;p&gt;S3 Access Points (hereafter &lt;strong&gt;S3 AP&lt;/strong&gt;) expose ONTAP volume data through S3 APIs — &lt;code&gt;ListObjectsV2&lt;/code&gt;, &lt;code&gt;GetObject&lt;/code&gt;, &lt;code&gt;PutObject&lt;/code&gt;, and others. However, &lt;code&gt;GetBucketNotificationConfiguration&lt;/code&gt; is not supported, which means S3 event notifications (EventBridge / Lambda triggers) cannot be used.&lt;/p&gt;

&lt;p&gt;This is why all patterns in this collection use &lt;strong&gt;EventBridge Scheduler + Step Functions&lt;/strong&gt; for periodic polling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge Scheduler (cron/rate)
  └─→ Step Functions State Machine
       ├─→ Discovery Lambda: List objects via S3 AP → Generate Manifest
       ├─→ Map State: Process each object with AI/ML services
       └─→ Report Lambda: Generate results → SNS notification
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
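&lt;p&gt;The Discovery step in this flow can be sketched as a paginated listing that produces a manifest. This is a simplified illustration, not the repository's Discovery Lambda: &lt;code&gt;build_manifest&lt;/code&gt; is a hypothetical name and the manifest fields are assumptions. The pagination itself follows the documented &lt;code&gt;ListObjectsV2&lt;/code&gt; contract (&lt;code&gt;IsTruncated&lt;/code&gt; and &lt;code&gt;NextContinuationToken&lt;/code&gt;), and the client is passed in so that a boto3 S3 client or a test double can be used.&lt;/p&gt;

```python
from typing import Any, Dict, List


def build_manifest(s3_client: Any, access_point: str, prefix: str = "") -> List[Dict[str, Any]]:
    """List every object behind an access point and return a manifest."""
    manifest: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {"Bucket": access_point, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        for item in page.get("Contents", []):
            manifest.append({"key": item["Key"], "size": item["Size"]})
        if not page.get("IsTruncated"):
            return manifest
        # Ask for the next page, continuing where this one stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

&lt;p&gt;The access point alias or ARN goes where a bucket name normally would, which is why the rest of the pipeline can treat it like ordinary S3.&lt;/p&gt;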



&lt;h2&gt;
  
  
  Architecture: VPC Placement Optimization
&lt;/h2&gt;

&lt;p&gt;A key design decision from verification: &lt;strong&gt;only Lambda functions that need ONTAP REST API access are placed inside the VPC&lt;/strong&gt;. Lambda functions that only use S3 AP (with &lt;code&gt;internet&lt;/code&gt; network origin) run outside the VPC.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────┐
│ Inside VPC                                           │
│  - Discovery Lambda (ONTAP REST API + S3 AP)        │
│  - ACL Collection Lambda (ONTAP REST API)           │
│  → Requires VPC Endpoints for Secrets Manager / FSx │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│ Outside VPC                                          │
│  - Processing Lambda (S3 AP + AI/ML services)       │
│  - Report Lambda (S3 + SNS + Bedrock)               │
│  → Direct access to S3 AP (internet origin)         │
│  → No VPC Endpoints needed — cost savings           │
└─────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Interface VPC Endpoints (~$28.80/month) become unnecessary for most Lambda functions, cold start times improve (no ENI creation), and AI/ML services are reached directly without a NAT Gateway.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda Placement Guide
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Recommended&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Demo / PoC&lt;/td&gt;
&lt;td&gt;Outside VPC&lt;/td&gt;
&lt;td&gt;No VPC Endpoints needed, low cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production / private network&lt;/td&gt;
&lt;td&gt;Inside VPC&lt;/td&gt;
&lt;td&gt;Secrets Manager / FSx / SNS via PrivateLink&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena / Glue use cases&lt;/td&gt;
&lt;td&gt;S3 AP network origin: &lt;code&gt;internet&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;AWS managed services need access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Network Origin Constraints
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Network Origin&lt;/th&gt;
&lt;th&gt;Lambda (outside VPC)&lt;/th&gt;
&lt;th&gt;Lambda (inside VPC)&lt;/th&gt;
&lt;th&gt;Athena / Glue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;internet&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (via S3 Gateway EP)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VPC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (S3 Gateway EP required)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Athena and Glue access data from AWS-managed infrastructure rather than from your VPC, so they cannot reach VPC-origin S3 APs. Use cases that require Athena (UC1, UC3) must use the &lt;code&gt;internet&lt;/code&gt; network origin.&lt;/p&gt;
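&lt;p&gt;The constraint table can be encoded as a small pre-deployment check. The sketch below is only a restatement of that table in code; the caller labels (&lt;code&gt;lambda_outside_vpc&lt;/code&gt;, &lt;code&gt;lambda_inside_vpc&lt;/code&gt;, &lt;code&gt;athena_glue&lt;/code&gt;) and the function name &lt;code&gt;can_reach&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from typing import Dict, FrozenSet

# Which callers can reach an S3 AP, keyed by its network origin.
REACHABLE: Dict[str, FrozenSet[str]] = {
    "internet": frozenset({"lambda_outside_vpc", "lambda_inside_vpc", "athena_glue"}),
    "vpc": frozenset({"lambda_inside_vpc"}),
}


def can_reach(network_origin: str, caller: str) -> bool:
    """Return True when the caller type can reach an AP of this origin."""
    origin = network_origin.lower()
    if origin not in REACHABLE:
        raise ValueError(f"unknown network origin: {network_origin!r}")
    return caller in REACHABLE[origin]
```

&lt;p&gt;A check like &lt;code&gt;can_reach("VPC", "athena_glue")&lt;/code&gt; returning &lt;code&gt;False&lt;/code&gt; would catch an Athena-based use case pointed at a VPC-origin access point before it is deployed.&lt;/p&gt;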

&lt;h2&gt;
  
  
  Security and Authorization Model
&lt;/h2&gt;

&lt;p&gt;The solution uses four authorization layers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Controls access to AWS services and S3 Access Points&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Access Point&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Defines access boundaries through the associated file system user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ONTAP File System&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Enforces file-level permissions (UNIX / NTFS ACL)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ONTAP REST API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exposes only metadata and control-plane operations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 APIs do not expose file-level ACLs. File permissions are retrieved &lt;strong&gt;exclusively via the ONTAP REST API&lt;/strong&gt; (UC1's ACL Collection uses this pattern)&lt;/li&gt;
&lt;li&gt;S3 AP access is authorized on the ONTAP side as the associated UNIX / Windows file system user, after IAM / S3 AP policy checks pass&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The 5 Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  UC1: Legal &amp;amp; Compliance — File Server Audit
&lt;/h3&gt;

&lt;p&gt;Collects NTFS ACL information via ONTAP REST API, detects excessive permissions with Athena SQL, and generates natural-language compliance reports with Bedrock.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Athena, Glue Data Catalog, Bedrock | &lt;strong&gt;Verification&lt;/strong&gt;: ✅ E2E success (67/67 Lambda executions)&lt;/p&gt;

&lt;h3&gt;
  
  
  UC2: Financial Services — Contract &amp;amp; Invoice Processing (IDP)
&lt;/h3&gt;

&lt;p&gt;OCR processing of PDF/TIFF/JPEG documents with Textract, entity extraction with Comprehend, and structured summary generation with Bedrock.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Textract, Comprehend, Bedrock | &lt;strong&gt;Verification&lt;/strong&gt;: ✅ E2E success (Textract via cross-region invocation)&lt;/p&gt;

&lt;h3&gt;
  
  
  UC3: Manufacturing — IoT Sensor Log &amp;amp; Quality Inspection
&lt;/h3&gt;

&lt;p&gt;CSV sensor logs are converted to Parquet for anomaly detection in Athena. Inspection images are analyzed with Rekognition for defects, and results are flagged for manual review based on detection confidence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Athena, Glue Data Catalog, Rekognition | &lt;strong&gt;Verification&lt;/strong&gt;: ✅ E2E success&lt;/p&gt;

&lt;h3&gt;
  
  
  UC4: Media — VFX Rendering Pipeline
&lt;/h3&gt;

&lt;p&gt;Detects rendering assets, submits jobs to AWS Deadline Cloud, performs Rekognition quality checks, and writes approved output back to FSx ONTAP via S3 AP PutObject.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Deadline Cloud, Rekognition | &lt;strong&gt;Verification&lt;/strong&gt;: ✅ E2E success&lt;/p&gt;

&lt;h3&gt;
  
  
  UC5: Healthcare — DICOM Image Classification &amp;amp; Anonymization
&lt;/h3&gt;

&lt;p&gt;Parses DICOM metadata for classification, detects burned-in PII with Rekognition DetectText, and removes PHI detected by Comprehend Medical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Services&lt;/strong&gt;: Rekognition, Comprehend Medical | &lt;strong&gt;Verification&lt;/strong&gt;: ✅ E2E success (Comprehend Medical via cross-region)&lt;/p&gt;

&lt;h2&gt;
  
  
  Extension Patterns (Verified)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Bedrock Knowledge Bases — RAG
&lt;/h3&gt;

&lt;p&gt;S3 AP as a data source for Bedrock Knowledge Bases. Verified with OpenSearch Serverless + Titan Embed Text v2 (81 documents indexed, Retrieve and RetrieveAndGenerate APIs confirmed).&lt;/p&gt;

&lt;h3&gt;
  
  
  Transfer Family SFTP — Partner File Exchange
&lt;/h3&gt;

&lt;p&gt;SFTP server connected to S3 AP for external partner file exchange. Verified with SSH public key auth, upload/download operations.&lt;/p&gt;

&lt;h3&gt;
  
  
  EMR Serverless Spark — Large-Scale Processing
&lt;/h3&gt;

&lt;p&gt;PySpark jobs reading/writing via S3 AP. Verified CSV → Parquet transformation with script and data I/O entirely through S3 AP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Shared Modules
&lt;/h3&gt;

&lt;p&gt;All use cases share &lt;code&gt;OntapClient&lt;/code&gt; (Secrets Manager auth, urllib3, TLS, retry), &lt;code&gt;FsxHelper&lt;/code&gt; (AWS FSx API + CloudWatch metrics), &lt;code&gt;S3ApHelper&lt;/code&gt; (pagination, suffix filter), and &lt;code&gt;lambda_error_handler&lt;/code&gt; decorator.&lt;/p&gt;
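
&lt;p&gt;As a minimal sketch of the pagination and suffix filtering that &lt;code&gt;S3ApHelper&lt;/code&gt; is described as providing, ListObjectsV2 pages can be walked with the boto3 paginator and filtered by extension. The function names &lt;code&gt;iter_matching_keys&lt;/code&gt; and &lt;code&gt;list_keys&lt;/code&gt; are hypothetical and are not taken from the repository.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def iter_matching_keys(pages, suffixes):
    """Yield keys from ListObjectsV2 pages that end with one of the suffixes."""
    wanted = tuple(suffix.lower() for suffix in suffixes)
    for page in pages:
        # A page without any objects has no "Contents" key at all.
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(wanted):
                yield obj["Key"]


def list_keys(s3_client, access_point_arn, prefix="", suffixes=(".pdf",)):
    """Walk every page; the access point ARN is passed as the Bucket argument."""
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=access_point_arn, Prefix=prefix)
    return list(iter_matching_keys(pages, suffixes))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the filter separate from the boto3 call means it can be unit-tested with plain dictionaries, without AWS credentials.&lt;/p&gt;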

&lt;h3&gt;
  
  
  Cost Optimization
&lt;/h3&gt;

&lt;p&gt;High-cost always-on resources are opt-in via CloudFormation parameters:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Interface VPC Endpoints (4)&lt;/td&gt;
&lt;td&gt;~$28.80&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Disabled&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Alarms&lt;/td&gt;
&lt;td&gt;~$0.10/alarm&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Disabled&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Gateway VPC Endpoint&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Enabled&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Demo/PoC cost: &lt;strong&gt;~$1–$3/month&lt;/strong&gt;. Actual verification cost for all 8 patterns: &lt;strong&gt;under $2&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three-Layer Error Handling
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Shared modules&lt;/strong&gt;: Custom exceptions + urllib3/boto3 retry&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step Functions&lt;/strong&gt;: Retry/Catch blocks with exponential backoff&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow&lt;/strong&gt;: in the Map state, a failure on one item does not affect the other items&lt;/li&gt;
&lt;/ol&gt;
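
&lt;p&gt;The second layer can be sketched as an Amazon States Language task definition built in Python. The retry values (2-second interval, 3 attempts, backoff rate 2.0) and the state names are illustrative assumptions, not the repository's actual settings.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json


def task_state(function_arn, next_state, failure_state):
    """Build a Task state with exponential-backoff Retry and a catch-all Catch."""
    return {
        "Type": "Task",
        "Resource": function_arn,
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,   # assumed value
            "MaxAttempts": 3,       # assumed value
            "BackoffRate": 2.0,     # each retry waits twice as long
        }],
        "Catch": [{
            "ErrorEquals": ["States.ALL"],
            "ResultPath": "$.error",  # keep the original input next to the error
            "Next": failure_state,
        }],
        "Next": next_state,
    }


def retry_delays(retrier):
    """Seconds waited before each retry attempt under the retrier's backoff."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


state = task_state(
    "arn:aws:lambda:us-east-1:123456789012:function:demo",  # placeholder ARN
    "NextStep",
    "RecordFailure",
)
print(json.dumps(state, indent=2))
print(retry_delays(state["Retry"][0]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;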

&lt;h3&gt;
  
  
  Cross-Region Invocation
&lt;/h3&gt;

&lt;p&gt;Textract and Comprehend Medical are unavailable in some regions (e.g., ap-northeast-1). UC2 and UC5 use &lt;code&gt;TextractRegion&lt;/code&gt; and &lt;code&gt;ComprehendMedicalRegion&lt;/code&gt; CloudFormation parameters for cross-region API calls.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Cross-region invocation transfers data to another region. Verify data residency and compliance requirements.&lt;/p&gt;
&lt;/blockquote&gt;
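
&lt;p&gt;A hedged sketch of how such a parameter might reach the SDK: the CloudFormation parameter is assumed to be passed to the Lambda function as an environment variable (the name &lt;code&gt;TEXTRACT_REGION&lt;/code&gt; is hypothetical), and the client falls back to the function's own region when no override is set.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os


def service_region(env_var, env=None):
    """Return the override region if set, else the Lambda function's own region."""
    env = os.environ if env is None else env
    # AWS_REGION is set by the Lambda runtime; env_var is an assumed override name.
    return env.get(env_var) or env.get("AWS_REGION")


def textract_client(env=None):
    """Create a Textract client in the resolved region."""
    import boto3  # imported lazily so service_region stays testable without boto3
    return boto3.client("textract", region_name=service_region("TEXTRACT_REGION", env))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;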

&lt;h2&gt;
  
  
  Issues Discovered During Verification
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;datetime&lt;/code&gt; JSON serialization&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;default=str&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Bedrock Messages API format&lt;/td&gt;
&lt;td&gt;Updated to Messages API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Athena SQL quoting&lt;/td&gt;
&lt;td&gt;Added backtick quoting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Lambda package name collision&lt;/td&gt;
&lt;td&gt;Added UC prefix to ZIP names&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;S3 Gateway Endpoint duplication&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;EnableS3GatewayEndpoint&lt;/code&gt; parameter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;VPC Lambda S3 AP timeout&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;PrivateRouteTableIds&lt;/code&gt; parameter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Textract region unavailability&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;TextractRegion&lt;/code&gt; cross-region parameter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;ONTAP self-signed certificate&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;VERIFY_SSL&lt;/code&gt; environment variable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Single route table limitation&lt;/td&gt;
&lt;td&gt;Changed to &lt;code&gt;CommaDelimitedList&lt;/code&gt; type&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Unnecessary VpcConfig&lt;/td&gt;
&lt;td&gt;Removed VpcConfig from S3 AP-only Lambda&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Comprehend Medical region&lt;/td&gt;
&lt;td&gt;Added &lt;code&gt;ComprehendMedicalRegion&lt;/code&gt; parameter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;UC4 QualityCheck KeyError&lt;/td&gt;
&lt;td&gt;Safe key access pattern&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;pyarrow Lambda layer size&lt;/td&gt;
&lt;td&gt;Replaced with stdlib &lt;code&gt;csv&lt;/code&gt; module&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
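
&lt;p&gt;Issues 1 and 12 are easy to reproduce in isolation. The following is a small sketch using only the standard library; the record layout and the &lt;code&gt;top_confidence&lt;/code&gt; helper are illustrative assumptions rather than the repository's code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
from datetime import datetime, timezone

# Issue 1: datetime values are not JSON serializable by default.
record = {
    "key": "reports/a.pdf",
    "last_modified": datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc),
}
# json.dumps(record) raises TypeError; default=str converts unknown types via str().
payload = json.dumps(record, default=str)


# Issue 12: avoid KeyError when an expected key is missing from a response.
def top_confidence(result):
    """Highest label confidence, or 0.0 when labels are missing or empty."""
    labels = result.get("Labels") or []
    return max((label.get("Confidence", 0.0) for label in labels), default=0.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;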

&lt;h2&gt;
  
  
  When to Use / When Not to Use
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use this when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;You want to serverlessly process existing NAS data on FSx for ONTAP without moving it&lt;/li&gt;
&lt;li&gt;You need file listing and preprocessing from Lambda without NFS/SMB mounts&lt;/li&gt;
&lt;li&gt;You want to learn the separation of responsibilities between S3 AP and ONTAP REST API&lt;/li&gt;
&lt;li&gt;You want to quickly validate industry-specific AI/ML patterns as a PoC&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Don't use this when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Real-time file change event processing is required (S3 Event Notification not supported)&lt;/li&gt;
&lt;li&gt;Full S3 bucket compatibility (Presigned URLs, etc.) is needed&lt;/li&gt;
&lt;li&gt;You already have EC2/ECS batch infrastructure with NFS mount operations&lt;/li&gt;
&lt;li&gt;File data already exists in standard S3 buckets&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Production Readiness Considerations
&lt;/h2&gt;

&lt;p&gt;This repository includes production-oriented design decisions, but actual production environments should additionally consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Organizational IAM / SCP / Permission Boundary alignment&lt;/li&gt;
&lt;li&gt;S3 AP policy and ONTAP-side user permission review&lt;/li&gt;
&lt;li&gt;Audit and execution logs (CloudTrail / CloudWatch Logs)&lt;/li&gt;
&lt;li&gt;CloudWatch Alarms / SNS / Incident Management integration&lt;/li&gt;
&lt;li&gt;Industry-specific compliance (data classification, PII, PHI)&lt;/li&gt;
&lt;li&gt;Data residency for cross-region invocations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns

pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements-dev.txt
pytest shared/tests/ &lt;span class="nt"&gt;-v&lt;/span&gt;

&lt;span class="c"&gt;# Package and deploy (example: UC1)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
./scripts/deploy_uc.sh legal-compliance package
&lt;span class="c"&gt;# Then deploy via CloudFormation — see README for full parameter list&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The repository includes READMEs in 8 languages (ja, en, ko, zh-CN, zh-TW, fr, de, es), deployment guides, operations guides, troubleshooting guides, a cost analysis, and a region compatibility matrix.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>FSx for ONTAP S3 Access Points as a Serverless Automation Boundary — AI Data Pipelines, Volume-Level SnapMirror DR, and Capacity Guardrails</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Fri, 01 May 2026 23:44:20 +0000</pubDate>
      <link>https://forem.com/yoshikifujiwara/fsx-for-ontap-s3-access-points-as-a-serverless-automation-boundary-ai-data-pipelines-ili</link>
      <guid>https://forem.com/yoshikifujiwara/fsx-for-ontap-s3-access-points-as-a-serverless-automation-boundary-ai-data-pipelines-ili</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; FSx for ONTAP S3 Access Points let you treat NAS data as an S3-facing automation boundary — without moving data — making serverless AI pipelines and ops workflows practical.&lt;/p&gt;

&lt;p&gt;This is a continuation of &lt;a href="https://dev.to/aws-builders/building-an-agentic-access-aware-rag-system-with-amazon-fsx-for-netapp-ontap-s3-vectors-and-s3-2b86"&gt;Building an Agentic Access-Aware RAG System with Amazon FSx for NetApp ONTAP&lt;/a&gt;. While the previous article focused on the RAG application itself, this one covers the &lt;strong&gt;operational automation layer&lt;/strong&gt; built around FSx for ONTAP S3 Access Points.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Shift: S3 Access Points Make Serverless NAS Automation Practical
&lt;/h2&gt;

&lt;p&gt;Enterprise file data lives on FSx for NetApp ONTAP — accessed via SMB/NFS by users and applications every day. Automating operations around that data has traditionally meant mounting NFS from compute instances, managing file system connections, and dealing with the cold-start and connection-limit penalties that come with VPC-mounted Lambda functions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FSx for ONTAP S3 Access Points change this equation.&lt;/strong&gt; They &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/accessing-data-via-s3-access-points.html" rel="noopener noreferrer"&gt;expose ONTAP file data through supported S3 object APIs&lt;/a&gt; — &lt;code&gt;GetObject&lt;/code&gt;, &lt;code&gt;PutObject&lt;/code&gt;, &lt;code&gt;ListObjectsV2&lt;/code&gt;, and others — while keeping the data in FSx and preserving concurrent SMB/NFS access. The practical shift is that &lt;strong&gt;Lambda no longer needs mounted NAS access for the data path&lt;/strong&gt;. S3 Access Points provide the supported object-facing operations, and ONTAP REST API supplies the storage-system metadata that S3 cannot expose.&lt;/p&gt;

&lt;p&gt;This is the architectural pivot that makes the automation suite in this article possible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Serverless file inventory&lt;/strong&gt; without NFS/SMB mounts from Lambda&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI / RAG preprocessing&lt;/strong&gt; directly against ONTAP file data through supported S3 object APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metadata sidecar generation&lt;/strong&gt; by combining S3-listed objects with ONTAP ACL, export policy, and security-style metadata via REST API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled governance scans&lt;/strong&gt; over NAS data using IAM + file-system authorization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reuse of S3-speaking application components&lt;/strong&gt; (Bedrock KB, analytics tools) without moving data out of ONTAP&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ONTAP REST API complements this by providing the control-plane and storage-system context — volume management, SnapMirror orchestration, capacity monitoring, snapshot operations — that S3 Access Points are not designed to handle.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Code&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG/tree/main/automation/fsxn-ops" rel="noopener noreferrer"&gt;&lt;code&gt;automation/fsxn-ops/&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  S3 Access Point Compatibility Model
&lt;/h2&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points support a &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/access-points-for-fsxn-object-api-support.html" rel="noopener noreferrer"&gt;subset of S3 APIs&lt;/a&gt;, not full S3 bucket semantics. Understanding this boundary is essential for building reliable automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supported operations used in this automation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ListObjectsV2&lt;/code&gt; — file inventory and scanning&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GetObject&lt;/code&gt; — file content access for preprocessing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PutObject&lt;/code&gt; — writing metadata sidecars and manifests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Notable unsupported operations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GetBucketNotificationConfiguration&lt;/code&gt; — &lt;strong&gt;this is why the design uses scheduled polling instead of S3 notification-driven triggers&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GetObjectAcl&lt;/code&gt; / &lt;code&gt;PutObjectAcl&lt;/code&gt; — object ACL management is not available&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Presign&lt;/code&gt; — presigned URL generation is not supported&lt;/li&gt;
&lt;li&gt;Bucket-style management APIs and bucket-notification semantics do not apply in the same way here&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Authorization operates at two layers: &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/accessing-data-via-s3-access-points.html" rel="noopener noreferrer"&gt;IAM permissions control AWS-level access, while file-system-level authorization uses the mapped user identity&lt;/a&gt; (UNIX or Windows) configured on the S3 Access Point. File-level ACLs are retrieved via the ONTAP REST API; S3 APIs do not expose file-system ACLs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The architecture has four layers, with S3 Access Points as the data-path boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│  Orchestration: EventBridge Scheduler / Step Functions   │
├─────────────────────────────────────────────────────────┤
│  Compute: Lambda (Python 3.12, VPC-deployed)            │
├──────────────────────┬──────────────────────────────────┤
│  Data Path:          │  Control Plane:                  │
│  FSx ONTAP S3 AP     │  ONTAP REST API                  │
│  (ListObjectsV2,     │  (volumes, SnapMirror,           │
│   GetObject,         │   snapshots, exports,            │
│   PutObject)         │   security_style, ACLs)          │
├──────────────────────┴──────────────────────────────────┤
│  Storage: FSx for NetApp ONTAP (SMB/NFS + S3 AP)       │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S3 Access Points&lt;/strong&gt; = data-path (file listing, content access, sidecar writes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ONTAP REST API&lt;/strong&gt; = storage metadata + control-plane (resize, SnapMirror, snapshots, ACLs, export policies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step Functions / Lambda&lt;/strong&gt; = orchestration and compute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EventBridge&lt;/strong&gt; = scheduled execution (polling, not file-change triggers)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Data Preprocessor: The S3 Access Point Workflow
&lt;/h3&gt;

&lt;p&gt;This is the core use case that S3 Access Points enable. The preprocessor scans FSx ONTAP volumes through S3 Access Points and enriches the results with ONTAP-specific storage metadata.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DataPreprocessor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_source_objects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffix_filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;List files on FSx ONTAP via S3 Access Point (ListObjectsV2)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_objects_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;s3_access_point_arn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# FSx ONTAP S3 AP ARN
&lt;/span&gt;            &lt;span class="n"&gt;Prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Filter by extension, collect basic object metadata
&lt;/span&gt;        &lt;span class="c1"&gt;# (key, size, last_modified, etag)
&lt;/span&gt;        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collect_ontap_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;volume_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get storage-system metadata via ONTAP REST API&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;vol_detail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ontap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_volume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vol_uuid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security_style&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vol_detail&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nas&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security_style&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;export_policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vol_detail&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nas&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;export_policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snapshot_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ontap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_snapshots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vol_uuid&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;S3 Access Points give you data-path operations and basic object-facing metadata&lt;/strong&gt; (key, size, last modified, ETag). &lt;strong&gt;ONTAP REST API provides storage-system metadata and control-plane attributes&lt;/strong&gt; — security style, export policies, snapshot information, and NAS/storage context that S3 APIs cannot expose.&lt;/p&gt;

&lt;p&gt;The preprocessor combines both to generate task manifests for downstream AI/analytics pipelines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;S3 AP: ListObjectsV2 → file inventory (.md, .pdf, .docx)
ONTAP REST: GET /storage/volumes → security_style, export_policy, snapshots
  → Generate preprocessing tasks (batch_size=10)
  → Write manifest to S3 (PutObject)
  → Downstream: Bedrock KB Ingestion, analytics, governance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
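
&lt;p&gt;The batching step in the flow above can be sketched as follows. The manifest layout (&lt;code&gt;task_id&lt;/code&gt;, &lt;code&gt;keys&lt;/code&gt;, &lt;code&gt;task_count&lt;/code&gt;) and the function name are assumptions made for illustration; only the batch size of 10 comes from the flow description.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def build_manifest(keys, volume_metadata, batch_size=10):
    """Split the file inventory into fixed-size preprocessing tasks."""
    tasks = [
        {"task_id": start // batch_size, "keys": keys[start:start + batch_size]}
        for start in range(0, len(keys), batch_size)
    ]
    # volume_metadata carries the ONTAP-side context (security style, export policy).
    return {"volume": volume_metadata, "task_count": len(tasks), "tasks": tasks}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The resulting dictionary could then be serialized with &lt;code&gt;json.dumps&lt;/code&gt; and written back through the access point with PutObject.&lt;/p&gt;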



&lt;h3&gt;
  
  
  2. ONTAP REST API Client
&lt;/h3&gt;

&lt;p&gt;The shared Python client handles control-plane operations. Credentials come from Secrets Manager.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OntapClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;management_lif&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;secret_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;verify_ssl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Default: TLS verification enabled
&lt;/span&gt;        &lt;span class="n"&gt;ca_cert_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# CA bundle for production
&lt;/span&gt;    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;verify_ssl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TLS verification disabled — lab/PoC only&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;TLS verification is enabled by default.&lt;/strong&gt; FSx for ONTAP management LIFs use self-signed certificates, so production deployments should provide a CA bundle via &lt;code&gt;ca_cert_path&lt;/code&gt;. For lab/PoC environments, &lt;code&gt;verify_ssl=False&lt;/code&gt; can be set explicitly — the client logs a warning. The &lt;code&gt;ONTAP_VERIFY_SSL&lt;/code&gt; and &lt;code&gt;ONTAP_CA_CERT_PATH&lt;/code&gt; environment variables control this at the Lambda level.&lt;/p&gt;
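
&lt;p&gt;One way the two environment variables could map to urllib3 settings is sketched below. How the actual client parses &lt;code&gt;ONTAP_VERIFY_SSL&lt;/code&gt; is not shown in this article, so the accepted values and the &lt;code&gt;tls_kwargs&lt;/code&gt; helper are assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os


def tls_kwargs(env=None):
    """Translate the ONTAP_* environment variables into urllib3 TLS keyword arguments."""
    env = os.environ if env is None else env
    # Assumed parsing: verification stays on unless explicitly disabled.
    raw = env.get("ONTAP_VERIFY_SSL", "true").strip().lower()
    if raw in ("false", "0", "no"):
        return {"cert_reqs": "CERT_NONE"}  # lab/PoC only
    kwargs = {"cert_reqs": "CERT_REQUIRED"}
    ca_path = env.get("ONTAP_CA_CERT_PATH")
    if ca_path:
        kwargs["ca_certs"] = ca_path  # CA bundle that signed the management LIF cert
    return kwargs


# Usage (requires urllib3): urllib3.PoolManager(**tls_kwargs())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;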

&lt;h3&gt;
  
  
  3. Capacity Monitor with Guardrails
&lt;/h3&gt;

&lt;p&gt;Runs every 5 minutes via EventBridge Scheduler. Checks filesystem-level capacity (FSx API + CloudWatch) and volume-level usage (ONTAP REST API).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Guardrail&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DRY_RUN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;true&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Safe default&lt;/strong&gt; — logs actions without executing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MAX_GROW_PER_ACTION_PCT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;50%&lt;/td&gt;
&lt;td&gt;Caps growth in a single run at 50% of the current size, so one action can never double a volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MAX_GROW_PER_DAY_GIB&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;500 GiB&lt;/td&gt;
&lt;td&gt;Caps total daily expansion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;VOL_THRESHOLD_PCT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;Aligned with &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/managing-storage-capacity.html" rel="noopener noreferrer"&gt;AWS recommendation&lt;/a&gt; to keep SSD utilization below 80%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
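
&lt;p&gt;To illustrate how these guardrails interact, here is a minimal sketch of a growth calculation clamped by the per-action and per-day caps. The sizing policy (grow just enough to return to the threshold) and the function name &lt;code&gt;plan_growth&lt;/code&gt; are assumptions; the monitor's actual policy may differ.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def plan_growth(size_gib, used_pct, grown_today_gib=0.0, dry_run=True,
                threshold_pct=80.0, max_grow_pct=50.0, max_grow_per_day_gib=500.0):
    """Return how many GiB to add, clamped by the per-action and per-day caps."""
    # Assumed policy: grow just enough to bring utilization back to the threshold.
    # Below the threshold this is negative, so max() turns it into 0.0.
    wanted = max(size_gib * used_pct / threshold_pct - size_gib, 0.0)
    per_action_cap = size_gib * max_grow_pct / 100.0
    daily_remaining = max(max_grow_per_day_gib - grown_today_gib, 0.0)
    grow_gib = min(wanted, per_action_cap, daily_remaining)
    # With dry_run=True the plan is only reported, never executed.
    return {"grow_gib": grow_gib, "execute": grow_gib != 0.0 and not dry_run}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a 1,000 GiB volume at 90% utilization, this sketch plans 125 GiB of growth, and with the default &lt;code&gt;dry_run=True&lt;/code&gt; it only reports the plan.&lt;/p&gt;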

&lt;p&gt;&lt;strong&gt;Observed behavior&lt;/strong&gt;: CloudWatch &lt;code&gt;StorageCapacityUtilization&lt;/code&gt; metrics are not always available for new filesystems or those with minimal data. The monitor falls back to ONTAP REST API for volume-level monitoring when CloudWatch data is unavailable.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Volume-Level SnapMirror Failover Orchestration
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Scope&lt;/strong&gt;: This automation handles &lt;strong&gt;planned failover for volume-level SnapMirror relationships&lt;/strong&gt; — breaking replication, recreating selected CIFS shares and NFS exports on the DR side, and reversing the process for failback. It is &lt;strong&gt;not a complete SVM-DR solution&lt;/strong&gt;. ONTAP's SVM-DR includes additional considerations such as identity preservation and replicated configuration scope. See &lt;a href="https://docs.netapp.com/us-en/ontap/data-protection/snapmirror-svm-replication-concept.html" rel="noopener noreferrer"&gt;NetApp's SVM-DR documentation&lt;/a&gt; for the full picture.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Step Functions state machine orchestrates 10 actions through a single Lambda function with action routing.&lt;/p&gt;

&lt;h4&gt;
  
  
  SnapMirror Initialization: Two Paths
&lt;/h4&gt;

&lt;p&gt;ONTAP can &lt;a href="https://docs.netapp.com/us-en/ontap-restapi/ontap/post-snapmirror-relationships.html" rel="noopener noreferrer"&gt;initialize a SnapMirror relationship as part of relationship creation&lt;/a&gt; in supported create flows. Separately, the &lt;a href="https://docs.netapp.com/us-en/ontap-restapi/post-snapmirror-relationships-transfers.html" rel="noopener noreferrer"&gt;transfers API&lt;/a&gt; starts an initialize or update transfer, depending on the current relationship state.&lt;/p&gt;

&lt;p&gt;The automation implements both:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;initialize&lt;/code&gt; action&lt;/strong&gt;: Creates the relationship with &lt;code&gt;"state": "snapmirrored"&lt;/code&gt; in supported create flows. In workflows using a pre-existing destination volume, explicit transfer may be more reliable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;final_transfer&lt;/code&gt; action&lt;/strong&gt;: Explicit &lt;code&gt;POST /snapmirror/relationships/{uuid}/transfers&lt;/code&gt; — starts an initialize or update transfer depending on the current state.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Observed in my tested environment (ONTAP 9.17.1P4D3, FSx SINGLE_AZ_1 with pre-existing destination volume)&lt;/strong&gt;: The create-with-state path resulted in a job failure, so the automation fell back to explicit transfer. Both paths are implemented and tested.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enabling the S3 Access Point Pattern: Private-Subnet Networking
&lt;/h2&gt;

&lt;p&gt;If you want S3 Access Point-driven serverless automation for ONTAP data in a private-subnet design (no NAT Gateway), this is the endpoint footprint you need:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;secretsmanager&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;ONTAP credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fsx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/fsx-vpc-endpoints.html" rel="noopener noreferrer"&gt;Interface&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;FSx API (describe, update)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;monitoring&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;CloudWatch metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sns&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;SNS notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;s3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html" rel="noopener noreferrer"&gt;Gateway&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;S3 Access Point data-path — must be associated with Lambda subnet's route table (no hourly endpoint charge)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;This applies to private-subnet / no-NAT deployments.&lt;/strong&gt; If your Lambda functions have another egress path, the endpoint requirements differ. The S3 Gateway endpoint specifically needs to be associated with the route table used by the Lambda subnet.&lt;/p&gt;
&lt;/blockquote&gt;
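&lt;p&gt;For reference, the footprint above can be expanded into full endpoint service names using the usual &lt;code&gt;com.amazonaws.{region}.{service}&lt;/code&gt; naming. The sketch below is illustrative only: the helper name and the region are assumptions, and it creates nothing in AWS.&lt;/p&gt;

```python
# Endpoint footprint from the table above: (service suffix, endpoint type).
ENDPOINTS = [
    ("secretsmanager", "Interface"),
    ("fsx", "Interface"),
    ("monitoring", "Interface"),
    ("sns", "Interface"),
    ("s3", "Gateway"),
]


def endpoint_service_names(region: str) -> dict:
    """Map each full VPC endpoint service name to its endpoint type."""
    return {f"com.amazonaws.{region}.{svc}": kind for svc, kind in ENDPOINTS}


# The region is an assumption for illustration only.
for name, kind in endpoint_service_names("ap-northeast-1").items():
    print(f"{kind:9} {name}")
```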




&lt;h2&gt;
  
  
  What I Learned from AWS Verification
&lt;/h2&gt;

&lt;p&gt;I deployed and tested this against FSx for ONTAP (ONTAP 9.17.1P4D3). The following are &lt;strong&gt;empirical observations from my tested deployment pattern&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SNS VPC endpoint is required for alerts&lt;/strong&gt; — without it, SNS Publish silently times out in private-subnet Lambda. This is documented AWS behavior for VPC-deployed Lambda, but easy to overlook.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;fsxadmin password sync is not automatic&lt;/strong&gt; — Secrets Manager and FSx ONTAP store the password independently. If someone changes it via the console without updating the secret, Lambda's ONTAP REST calls fail with 401 errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3 Gateway endpoint route table association matters&lt;/strong&gt; — it must be the specific route table used by the Lambda subnet, not just any route table in the VPC.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Same-SVM SnapMirror on SINGLE_AZ_1: Test Harness Only
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;This is a low-cost automation test harness, not a real DR architecture.&lt;/strong&gt; Source and destination volumes remain in the same failure domain (same filesystem, same AZ). Use this only for validating automation logic.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I tested SnapMirror within the same SVM on a SINGLE_AZ_1 deployment. It works for validating the automation without the cost of a second filesystem (~$200/month minimum).&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (4 functions)&lt;/td&gt;
&lt;td&gt;~$1.65&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Step Functions&lt;/td&gt;
&lt;td&gt;~$0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Scheduler&lt;/td&gt;
&lt;td&gt;~$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets Manager&lt;/td&gt;
&lt;td&gt;~$0.40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Logs&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Serverless subtotal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$2.60&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC Interface Endpoints (4 × ~$7.30 per AZ, 1-2 AZs)&lt;/td&gt;
&lt;td&gt;~$29-58&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Gateway Endpoint&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total (with dedicated endpoints)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$32-61&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;If your VPC already has these endpoints, the incremental cost is the serverless subtotal only. The S3 Gateway endpoint has no hourly endpoint charge, so the dominant networking cost comes from the four interface endpoints.&lt;/p&gt;
&lt;/blockquote&gt;
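&lt;p&gt;The totals can be checked with a few lines of arithmetic. The sketch below assumes the endpoint range in the table corresponds to deploying the four interface endpoints in one or two Availability Zones; the figures are the approximate monthly values from the table.&lt;/p&gt;

```python
from decimal import Decimal

# Monthly figures copied from the cost table (USD, approximate).
SERVERLESS = {
    "Lambda (4 functions)": Decimal("1.65"),
    "Step Functions": Decimal("0.05"),
    "EventBridge Scheduler": Decimal("0.00"),
    "Secrets Manager": Decimal("0.40"),
    "CloudWatch Logs": Decimal("0.50"),
}
INTERFACE_ENDPOINTS = 4
PER_ENDPOINT_PER_AZ = Decimal("7.30")


def monthly_total(az_count: int) -> Decimal:
    """Serverless subtotal plus interface endpoints spread over az_count AZs."""
    subtotal = sum(SERVERLESS.values(), Decimal("0"))
    endpoints = INTERFACE_ENDPOINTS * PER_ENDPOINT_PER_AZ * az_count
    return subtotal + endpoints


print(sum(SERVERLESS.values(), Decimal("0")))  # serverless subtotal
print(monthly_total(1), monthly_total(2))      # one AZ, two AZs
```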




&lt;h2&gt;
  
  
  Extending the S3 Access Point Pattern
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Automated Permission Metadata Pipeline
&lt;/h3&gt;

&lt;p&gt;The strongest extension connects S3 Access Points to the RAG system's permission pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge (daily)
  → S3 AP: ListObjectsV2 → file inventory
  → ONTAP REST: GET ACL metadata per file
  → Generate .metadata.json with allowed_group_sids
  → S3 AP: PutObject → write sidecars alongside source files
  → Trigger Bedrock KB Ingestion Job
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This eliminates manual &lt;code&gt;.metadata.json&lt;/code&gt; management — the automation reads NTFS ACLs from ONTAP REST API and generates permission metadata automatically, writing it back through the S3 Access Point.&lt;/p&gt;
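&lt;p&gt;One detail of this pipeline is deciding which listed objects are source files and which are sidecars from an earlier run. A minimal sketch of that step, assuming the inventory is a plain list of object keys; the helper names and sample keys are hypothetical:&lt;/p&gt;

```python
SIDECAR_SUFFIX = ".metadata.json"


def sidecar_key(source_key: str) -> str:
    """Sidecars sit next to the source object and share its full name."""
    return source_key + SIDECAR_SUFFIX


def keys_needing_sidecars(inventory: list) -> list:
    """Return source keys from a listing, skipping sidecars and folder markers."""
    return [
        key
        for key in inventory
        if not key.endswith(SIDECAR_SUFFIX) and not key.endswith("/")
    ]


# Hypothetical inventory, shaped like the Key values from a ListObjectsV2 listing.
inventory = [
    "public/",
    "public/company-overview.md",
    "public/company-overview.md.metadata.json",
    "confidential/financial-report.md",
]
for key in keys_needing_sidecars(inventory):
    print(key, "->", sidecar_key(key))
```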

&lt;h3&gt;
  
  
  Multi-Volume Ingestion Orchestration
&lt;/h3&gt;

&lt;p&gt;For environments with multiple FSx ONTAP volumes, each with its own S3 Access Point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step Functions Map:
  → Per volume: S3 AP scan → ONTAP metadata → generate sidecars
  → Per volume: Trigger Bedrock KB Ingestion Job
  → Wait for all → validate vector counts → notify
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ONTAP Operations Chatbot
&lt;/h3&gt;

&lt;p&gt;Combine &lt;code&gt;ontap_api_executor&lt;/code&gt; with Bedrock Agent for natural language ONTAP management. The security controls (method restrictions, blocked paths) are what make this suitable for read-only chatbots.&lt;/p&gt;
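&lt;p&gt;As an illustration of method restrictions and blocked paths, a guard might look like the sketch below. The allowlist and the blocked prefixes here are assumptions chosen for the example, not the actual &lt;code&gt;ontap_api_executor&lt;/code&gt; configuration.&lt;/p&gt;

```python
# Hypothetical policy: the allowlist and blocked prefixes are illustrative,
# not the project's actual configuration.
ALLOWED_METHODS = {"GET"}
BLOCKED_PREFIXES = ("/api/security", "/api/private")


def is_request_allowed(method: str, path: str) -> bool:
    """Allow only read-only methods on paths outside the blocked prefixes."""
    if method.upper() not in ALLOWED_METHODS:
        return False
    normalized = "/" + path.strip().lstrip("/")
    return not normalized.startswith(BLOCKED_PREFIXES)


print(is_request_allowed("GET", "/api/storage/volumes"))
print(is_request_allowed("DELETE", "/api/storage/volumes/123"))
print(is_request_allowed("GET", "/api/security/accounts"))
```

&lt;p&gt;A real guard would also need to normalize encoded or relative path segments before matching; this sketch only shows the shape of the check.&lt;/p&gt;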




&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;The suite has 38 unit tests covering TLS verification modes, SnapMirror dual initialization paths, capacity guardrails, and S3 AP operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; automation/fsxn-ops/requirements.txt
pytest automation/fsxn-ops/tests/ &lt;span class="nt"&gt;-v&lt;/span&gt;

&lt;span class="c"&gt;# AWS integration tests (auto-deploys, tests, cleans up)&lt;/span&gt;
bash automation/fsxn-ops/tests/integration/run_aws_verification.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points are the architectural enabler that makes this automation suite practical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI / data pipelines&lt;/strong&gt; — ONTAP file data becomes accessible to serverless workflows through S3 Access Points (supported S3 object APIs), enriched with ONTAP REST API storage-system metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Management&lt;/strong&gt; — ONTAP REST API handles control-plane automation (volumes, snapshots, exports)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capacity&lt;/strong&gt; — monitored with guardrails and safe expansion defaults&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volume-level DR&lt;/strong&gt; — planned failover / failback for volume-level SnapMirror relationships (not full SVM-DR)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The serverless subtotal is ~$2.60/month, excluding VPC interface endpoints. The code is open source and deploys with a single CloudFormation command.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG/tree/main/automation/fsxn-ops" rel="noopener noreferrer"&gt;automation/fsxn-ops/&lt;/a&gt;&lt;br&gt;
📖 &lt;strong&gt;Full project&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG" rel="noopener noreferrer"&gt;Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazonfsxfornetapponpta</category>
      <category>s3accesspoints</category>
      <category>automation</category>
    </item>
    <item>
      <title>Building an Agentic Access-Aware RAG System with Amazon FSx for NetApp ONTAP, S3 Vectors, and S3 Access Points — Where AI Respects File Permissions</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 05 Apr 2026 17:15:03 +0000</pubDate>
      <link>https://forem.com/aws-builders/building-an-agentic-access-aware-rag-system-with-amazon-fsx-for-netapp-ontap-s3-vectors-and-s3-2b86</link>
      <guid>https://forem.com/aws-builders/building-an-agentic-access-aware-rag-system-with-amazon-fsx-for-netapp-ontap-s3-vectors-and-s3-2b86</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🆕 Updated April 2026&lt;/strong&gt;: v4.0.0 released with 6 new features — Agent Registry, Multimodal RAG, Guardrails, Episodic Memory, Voice Chat, and AgentCore Policy. See what's new.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Enterprise data lives on file servers. And on those file servers, not everyone can see everything — NTFS ACLs, UNIX permissions, and group policies control who accesses what. But when you plug that data into a Retrieval-Augmented Generation (RAG) system, those permission boundaries tend to disappear. Suddenly, anyone can ask the AI about another team's, division's, or board member's confidential information.&lt;/p&gt;

&lt;p&gt;But there's a flip side to this problem that's equally important: &lt;strong&gt;without permission awareness, the AI can't fully help the people it should be helping.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think about it. An engineer has years of design docs, project specs, and team-internal notes in their department's shared folder. A sales lead has pipeline data, customer contracts, and regional forecasts in theirs. When you strip away permissions and dump everything into one vector store, the AI doesn't just leak confidential data — it also drowns each user's results in irrelevant noise from every other team. The engineer gets sales forecasts mixed into their search results. The sales lead gets CI/CD pipeline docs they'll never need.&lt;/p&gt;

&lt;p&gt;Permission-aware RAG flips this around. Because the system knows exactly which files each user can access, it delivers &lt;strong&gt;personalized, noise-free AI assistance&lt;/strong&gt; grounded in the data each person actually works with day to day. Your personal folder, your team's shared drive, the cross-functional project space you're part of — the AI sees what you see, nothing more, nothing less.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;Agentic Access-Aware RAG&lt;/strong&gt; to make this real. It's an open-source system that lets AI agents autonomously search, analyze, and respond to enterprise data stored on Amazon FSx for NetApp ONTAP — &lt;strong&gt;while respecting per-user file-level access permissions&lt;/strong&gt;. The same question yields different answers depending on who's asking: an admin gets the full financial report, a project member gets their project's restricted docs, and a general user gets public information only. Each user gets an AI assistant that's effectively customized to their role and responsibilities — without any manual configuration.&lt;/p&gt;

&lt;p&gt;The entire stack deploys with a single &lt;code&gt;npx cdk deploy --all&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG" rel="noopener noreferrer"&gt;Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG&lt;/a&gt;&lt;br&gt;
📦 &lt;strong&gt;Latest Release&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG/releases/tag/v4.0.0" rel="noopener noreferrer"&gt;v4.0.0&lt;/a&gt; — 6 new features added&lt;/p&gt;


&lt;h2&gt;
  
  
  Architecture at a Glance
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser → AWS WAF → CloudFront (OAC+Geo) → Lambda Web Adapter (Next.js 15)
                                                    │
              ┌─────────────┬───────────────────────┼──────────────────┐
              ▼             ▼                       ▼                  ▼
        Cognito       Bedrock KB              DynamoDB            DynamoDB
       User Pool    + S3 Vectors /          user-access          perm-cache
                    OpenSearch SL           (SID Data)         (Perm Cache)
                         │
                         ▼
                  FSx for ONTAP
                  (SVM + Volume)
                + S3 Access Point
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The system is organized into 7 CDK stacks: WAF, Networking, Security (Cognito), Storage (FSx ONTAP + DynamoDB), AI (Bedrock KB + vector store), WebApp (Lambda + CloudFront), and an optional Embedding stack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76e5olacvd5fvb45ctv3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76e5olacvd5fvb45ctv3.png" alt="Architecture — KB Mode Card Grid" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Core Idea: Permission-Aware RAG
&lt;/h2&gt;

&lt;p&gt;Traditional RAG retrieves documents based on semantic similarity alone. This system adds a second dimension: &lt;strong&gt;SID-based permission filtering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here's the flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User sends a question via the chat UI&lt;/li&gt;
&lt;li&gt;The app retrieves the user's SID list (personal SID + group SIDs) from DynamoDB&lt;/li&gt;
&lt;li&gt;Bedrock KB Retrieve API performs vector search — each result carries &lt;code&gt;allowed_group_sids&lt;/code&gt; metadata&lt;/li&gt;
&lt;li&gt;The app matches each document's SIDs against the user's SIDs&lt;/li&gt;
&lt;li&gt;Only permitted documents are passed to the Converse API for answer generation&lt;/li&gt;
&lt;li&gt;The user sees a filtered response with citation badges showing access levels
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;■ Admin user: SIDs = [...-512 (Domain Admins), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → ...-512 match  → ✅ Permitted
  engineering/     → No match       → ❌ Filtered out (no noise from other teams)

■ Engineer (Engineering group member): SIDs = [...-1100 (Engineering), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → No match       → ❌ Denied
  engineering/     → ...-1100 match → ✅ Their team's docs, front and center

■ Sales user: SIDs = [...-1200 (Sales), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → No match       → ❌ Denied
  engineering/     → No match       → ❌ No engineering noise in their results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The engineer asking "What's the status of Project X?" gets answers from their team's internal docs — not from sales forecasts or HR policies. The sales lead asking "What are our Q3 targets?" gets their regional data without wading through engineering specs. Each user's AI experience is naturally scoped to the data they work with every day.&lt;/p&gt;
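&lt;p&gt;The application-side match in steps 4 and 5 can be sketched as a set intersection. This is a minimal illustration under assumptions: the result shape (&lt;code&gt;uri&lt;/code&gt; plus &lt;code&gt;metadata&lt;/code&gt;) and the shortened SIDs are hypothetical, and unreadable metadata is treated as "deny".&lt;/p&gt;

```python
import json


def filter_permitted(results: list, user_sids: set) -> list:
    """Keep results whose allowed_group_sids overlap the user's SIDs."""
    permitted = []
    for result in results:
        raw = result.get("metadata", {}).get("allowed_group_sids", "[]")
        try:
            allowed = set(json.loads(raw))
        except (TypeError, ValueError):
            allowed = set()  # Unreadable metadata: deny by default.
        if allowed.intersection(user_sids):
            permitted.append(result)
    return permitted


# Hypothetical retrieval results; the SIDs are shortened placeholders.
results = [
    {"uri": "public/company-overview.md",
     "metadata": {"allowed_group_sids": '["S-1-1-0"]'}},
    {"uri": "confidential/financial-report.md",
     "metadata": {"allowed_group_sids": '["S-1-5-21-0-0-0-512"]'}},
]
engineer_sids = {"S-1-5-21-0-0-0-1100", "S-1-1-0"}
print([r["uri"] for r in filter_permitted(results, engineer_sids)])
```

&lt;p&gt;Denying on malformed metadata keeps a broken sidecar from accidentally widening access.&lt;/p&gt;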

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg65qp205at7ss1di0vit.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg65qp205at7ss1di0vit.png" alt="Chat Response with Citation + Access Level Badge" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  S3 Access Points: The Bridge Between FSx ONTAP and Bedrock KB
&lt;/h2&gt;

&lt;p&gt;One of the most impactful recent additions is &lt;strong&gt;S3 Access Point integration&lt;/strong&gt; with FSx for ONTAP. This creates a clean, single-path data ingestion architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx ONTAP Volume (/data)
  ├── public/company-overview.md
  ├── public/company-overview.md.metadata.json
  ├── confidential/financial-report.md
  ├── confidential/financial-report.md.metadata.json
      │
      │  S3 Access Point
      ▼
  Bedrock KB Data Source (S3 AP alias)
      │  Ingestion Job (chunking + Titan Embed v2)
      ▼
  Vector Store (S3 Vectors or OpenSearch Serverless)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before S3 Access Points, getting data from FSx ONTAP into Bedrock KB required either a custom Embedding server with CIFS mounts or manual S3 uploads. Now, Bedrock KB reads documents directly from the FSx ONTAP volume through the S3 Access Point — no intermediate copies, no sync scripts.&lt;/p&gt;

&lt;p&gt;The S3 AP user type is automatically selected based on your AD configuration:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;AD Configuration&lt;/th&gt;
&lt;th&gt;Volume Style&lt;/th&gt;
&lt;th&gt;S3 AP User Type&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AD configured&lt;/td&gt;
&lt;td&gt;NTFS&lt;/td&gt;
&lt;td&gt;WINDOWS (&lt;code&gt;Admin&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;NTFS ACLs automatically applied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No AD&lt;/td&gt;
&lt;td&gt;NTFS/UNIX&lt;/td&gt;
&lt;td&gt;UNIX (&lt;code&gt;root&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;All files accessible; permission control via &lt;code&gt;.metadata.json&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One gotcha I discovered: the S3 AP &lt;code&gt;WindowsUser&lt;/code&gt; must &lt;strong&gt;not&lt;/strong&gt; include the domain prefix. &lt;code&gt;DEMO\Admin&lt;/code&gt; works for CLI operations but causes &lt;code&gt;AccessDenied&lt;/code&gt; on data plane APIs (&lt;code&gt;ListObjects&lt;/code&gt;, &lt;code&gt;GetObject&lt;/code&gt;). Always specify just &lt;code&gt;Admin&lt;/code&gt;.&lt;/p&gt;
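&lt;p&gt;A small guard in deployment tooling can normalize the value before it reaches the access point configuration. The helper below is a hypothetical sketch, not part of the project:&lt;/p&gt;

```python
def bare_windows_user(name: str) -> str:
    """Drop a leading DOMAIN\\ prefix so only the account name remains."""
    return name.rsplit("\\", 1)[-1]


print(bare_windows_user("DEMO\\Admin"))
print(bare_windows_user("Admin"))
```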




&lt;h2&gt;
  
  
  S3 Vectors: Low-Cost Vector Storage
&lt;/h2&gt;

&lt;p&gt;The default vector store is &lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; — a relatively new service that brings vector search costs down to a few dollars per month, compared to ~$700/month for OpenSearch Serverless.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Vectors (default)&lt;/td&gt;
&lt;td&gt;~$2-5/month&lt;/td&gt;
&lt;td&gt;~100 ms to sub-second&lt;/td&gt;
&lt;td&gt;Demo, dev, cost optimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch Serverless&lt;/td&gt;
&lt;td&gt;~$700/month&lt;/td&gt;
&lt;td&gt;~10ms&lt;/td&gt;
&lt;td&gt;High-performance production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;S3 Vectors does have a &lt;strong&gt;2KB filterable metadata limit&lt;/strong&gt; per vector. Since Bedrock KB's internal metadata already consumes ~1KB, custom metadata is effectively limited to ~1KB. The system handles this by setting all metadata keys (including &lt;code&gt;allowed_group_sids&lt;/code&gt;) as non-filterable and performing SID matching on the application side after retrieval.&lt;/p&gt;
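&lt;p&gt;To get a feel for how quickly a SID list uses up that budget, a rough size estimate helps. The sketch below assumes about 1024 bytes remain for custom metadata and approximates size as compact UTF-8 JSON; how the service measures metadata size may differ, so treat the numbers as indicative only.&lt;/p&gt;

```python
import json

# Assumption: roughly 1 KB (1024 bytes) of the 2 KB filterable budget remains.
CUSTOM_BUDGET_BYTES = 1024


def metadata_size(attributes: dict) -> int:
    """Size in bytes of the attributes when serialized as compact UTF-8 JSON."""
    return len(json.dumps(attributes, separators=(",", ":")).encode("utf-8"))


def fits_filterable_budget(attributes: dict) -> bool:
    """True when the estimated size stays within the assumed custom budget."""
    return not metadata_size(attributes) > CUSTOM_BUDGET_BYTES


small = {"allowed_group_sids": '["S-1-1-0"]', "access_level": "public"}
# Placeholder SIDs: one hundred group entries on a single document.
large = {"allowed_group_sids": json.dumps(["S-1-5-21-0-0-0-%d" % n for n in range(100)])}
print(metadata_size(small), fits_filterable_budget(small))
print(fits_filterable_budget(large))
```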

&lt;p&gt;If you start with S3 Vectors and later need higher performance, you can export on-demand to OpenSearch Serverless using the included &lt;code&gt;export-to-opensearch.sh&lt;/code&gt; script.&lt;/p&gt;




&lt;h2&gt;
  
  
  Embedding Design: &lt;code&gt;.metadata.json&lt;/code&gt; and the Ingestion Pipeline
&lt;/h2&gt;

&lt;p&gt;Permission metadata follows the standard &lt;strong&gt;Bedrock KB metadata file specification&lt;/strong&gt;. Each document has a companion &lt;code&gt;.metadata.json&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;product-catalog.md                    ← Document body
product-catalog.md.metadata.json      ← Permission metadata
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The metadata format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadataAttributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allowed_group_sids"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;S-1-1-0&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"access_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"doc_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"catalog"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;allowed_group_sids&lt;/code&gt; field is a JSON array string of Windows SIDs that are allowed to access the document. &lt;code&gt;S-1-1-0&lt;/code&gt; is the well-known "Everyone" SID.&lt;/p&gt;

&lt;p&gt;Bedrock KB Ingestion Jobs automatically read these &lt;code&gt;.metadata.json&lt;/code&gt; files alongside documents, chunk the content, vectorize with Amazon Titan Text Embeddings v2 (1024 dimensions), and store everything in the vector store. No custom ETL pipeline needed.&lt;/p&gt;
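&lt;p&gt;Because &lt;code&gt;allowed_group_sids&lt;/code&gt; is a JSON array stored inside a string, writing and reading a sidecar each involve two encoding steps. A minimal sketch of both directions, where &lt;code&gt;build_sidecar&lt;/code&gt; is a hypothetical helper that follows the format shown above:&lt;/p&gt;

```python
import json


def build_sidecar(sids: list, access_level: str, doc_type: str) -> str:
    """Return the text of a .metadata.json sidecar for one document."""
    document = {
        "metadataAttributes": {
            # The SID list is stored as a JSON array encoded inside a string.
            "allowed_group_sids": json.dumps(sids),
            "access_level": access_level,
            "doc_type": doc_type,
        }
    }
    return json.dumps(document, indent=2)


sidecar = build_sidecar(["S-1-1-0"], "public", "catalog")
print(sidecar)

# Reading it back takes two decoding steps: the file, then the SID string.
attributes = json.loads(sidecar)["metadataAttributes"]
print(json.loads(attributes["allowed_group_sids"]))
```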

&lt;h3&gt;
  
  
  Design Decisions and Trade-offs
&lt;/h3&gt;

&lt;p&gt;At scale (thousands of documents), managing individual &lt;code&gt;.metadata.json&lt;/code&gt; files becomes a maintenance burden. The system supports three approaches:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;.metadata.json&lt;/code&gt; (current default)&lt;/td&gt;
&lt;td&gt;✅ Production&lt;/td&gt;
&lt;td&gt;Bedrock KB native, no extra infra&lt;/td&gt;
&lt;td&gt;Doubles file count, manual management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ONTAP REST API auto-generation&lt;/td&gt;
&lt;td&gt;✅ Partially implemented&lt;/td&gt;
&lt;td&gt;File server ACLs as source of truth&lt;/td&gt;
&lt;td&gt;Requires Embedding server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB permission master&lt;/td&gt;
&lt;td&gt;🔜 Recommended for scale&lt;/td&gt;
&lt;td&gt;DB-driven, easy auditing&lt;/td&gt;
&lt;td&gt;Requires pre-Ingestion generation pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The recommended direction for large-scale environments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ONTAP REST API (ACL retrieval)
  → DynamoDB document-permissions table
  → Auto-generate .metadata.json before Ingestion Job
  → Ingest via S3 AP into Bedrock KB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Multiple Authentication Modes
&lt;/h2&gt;

&lt;p&gt;The system supports 5 authentication configurations, all driven by &lt;code&gt;cdk.context.json&lt;/code&gt; parameters:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Authentication&lt;/th&gt;
&lt;th&gt;Permission Source&lt;/th&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A: Email/Password&lt;/td&gt;
&lt;td&gt;Cognito native&lt;/td&gt;
&lt;td&gt;Manual DynamoDB SID registration&lt;/td&gt;
&lt;td&gt;Default (no extra config)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B: SAML AD Federation&lt;/td&gt;
&lt;td&gt;Cognito + SAML IdP&lt;/td&gt;
&lt;td&gt;AD Sync Lambda → auto SID retrieval&lt;/td&gt;
&lt;td&gt;&lt;code&gt;enableAdFederation=true&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C: OIDC + LDAP&lt;/td&gt;
&lt;td&gt;Cognito + OIDC IdP&lt;/td&gt;
&lt;td&gt;LDAP query → auto UID/GID retrieval&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;oidcProviderConfig&lt;/code&gt; + &lt;code&gt;ldapConfig&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;D: OIDC Claims Only&lt;/td&gt;
&lt;td&gt;Cognito + OIDC IdP&lt;/td&gt;
&lt;td&gt;OIDC token claims → group mapping&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;oidcProviderConfig&lt;/code&gt; + &lt;code&gt;groupClaimName&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E: SAML + OIDC Hybrid&lt;/td&gt;
&lt;td&gt;Both IdPs simultaneously&lt;/td&gt;
&lt;td&gt;Combined SID + UID/GID&lt;/td&gt;
&lt;td&gt;Both configs + &lt;code&gt;permissionMappingStrategy=hybrid&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkydhomo74tksg7cb6no.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkydhomo74tksg7cb6no.png" alt="Sign-in Page — SAML + OIDC Hybrid" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The OIDC/LDAP federation enables &lt;strong&gt;zero-touch user provisioning&lt;/strong&gt;: when a user signs in via the OIDC IdP for the first time, the Identity Sync Lambda automatically queries LDAP for their UID/GID/groups and stores them in DynamoDB. No admin intervention required.&lt;/p&gt;

&lt;p&gt;For environments with FSx ONTAP UNIX volumes, the system also supports &lt;strong&gt;ONTAP name-mapping&lt;/strong&gt; — automatically resolving UNIX usernames to Windows users via the ONTAP REST API.&lt;/p&gt;




&lt;h2&gt;
  
  
  Agentic AI: Beyond Document Search
&lt;/h2&gt;

&lt;p&gt;The system isn't just a search engine. Toggle between three modes with one click:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;KB Mode&lt;/strong&gt;: Permission-aware document search and Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single Agent Mode&lt;/strong&gt;: Permission-aware autonomous multi-step reasoning via a single Bedrock Agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi Agent Mode&lt;/strong&gt;: Supervisor + Collaborator pattern for complex multi-agent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frf2fsvqqhk330rql0zzu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frf2fsvqqhk330rql0zzu.png" alt="Agent Mode Card Grid" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agent mode includes an &lt;strong&gt;Agent Directory&lt;/strong&gt; — a catalog-style management screen where you can create, edit, share, and schedule Bedrock Agents from templates. The directory now includes a &lt;strong&gt;Registry tab&lt;/strong&gt; for importing agents from AWS Agent Registry, and a &lt;strong&gt;Teams tab&lt;/strong&gt; for creating multi-agent teams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17ux4xzcu9b263gdnnpf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17ux4xzcu9b263gdnnpf.png" alt="Agent Directory with Registry Tab" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Permission filtering works in all modes. Even when agents autonomously search and reason across multiple documents, only documents the user is authorized to see are included.&lt;/p&gt;

&lt;h3&gt;
  
  
  AgentCore Memory (v3.3.0)
&lt;/h3&gt;

&lt;p&gt;With &lt;code&gt;enableAgentCoreMemory=true&lt;/code&gt;, the system integrates Amazon Bedrock AgentCore Memory for conversation context maintenance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Short-term memory&lt;/strong&gt;: In-session conversation history (TTL: 3 days)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term memory&lt;/strong&gt;: Cross-session user preferences and summaries (semantic + summary strategies)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F05z5s193raoeesr3xc56.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F05z5s193raoeesr3xc56.png" alt="AgentCore Memory Sidebar" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Episodic Memory (v4.0.0)
&lt;/h3&gt;

&lt;p&gt;Building on AgentCore Memory, &lt;code&gt;enableEpisodicMemory=true&lt;/code&gt; adds a new dimension: the agent remembers &lt;em&gt;how&lt;/em&gt; it solved problems, not just &lt;em&gt;what&lt;/em&gt; it knows.&lt;/p&gt;

&lt;p&gt;While semantic memory stores facts and summaries, episodic memory records complete task episodes — the goal, reasoning steps, actions taken, outcomes, and reflections. When a similar task comes up later, the agent automatically retrieves the top 3 most relevant past episodes and injects them into its reasoning context.&lt;/p&gt;

&lt;p&gt;Think of it as giving the agent a "lessons learned" database that grows with every interaction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Episode recording&lt;/strong&gt;: After each conversation, a Background Reflection process automatically extracts episodes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Similar episode injection&lt;/strong&gt;: Before executing a task, the agent searches for similar past episodes and uses them to inform its approach&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Episode management UI&lt;/strong&gt;: Browse, search (semantic, 300ms debounce), and delete episodes from the sidebar&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful degradation&lt;/strong&gt;: If episodic memory fails, core agent functionality continues uninterrupted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The UI shows an "📚 Referenced past experience (N)" badge on responses that leveraged episodic memory.&lt;/p&gt;
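&lt;p&gt;Conceptually, "top 3 most relevant past episodes" is a nearest-neighbor lookup. The sketch below illustrates that idea with cosine similarity over toy vectors; it is not how AgentCore Memory is implemented, and the episode shape and vectors are made up for the example.&lt;/p&gt;

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_episodes(query_vec: list, episodes: list, k: int = 3) -> list:
    """Return the k episodes most similar to the query, best first."""
    ranked = sorted(episodes, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
    return ranked[:k]


# Toy episodes with hand-written vectors, for illustration only.
episodes = [
    {"goal": "expand volume", "vec": [1.0, 0.0, 0.0]},
    {"goal": "rotate password", "vec": [0.0, 1.0, 0.0]},
    {"goal": "resize volume", "vec": [0.9, 0.1, 0.0]},
    {"goal": "check snapshots", "vec": [0.2, 0.2, 0.9]},
]
print([e["goal"] for e in top_episodes([1.0, 0.1, 0.0], episodes)])
```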




&lt;h2&gt;
  
  
  Additional Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Smart Routing (v3.1.0)
&lt;/h3&gt;

&lt;p&gt;Automatic model selection based on query complexity. Short factual queries route to Claude Haiku (fast, cheap); complex analytical queries route to Claude Sonnet (powerful). Toggle ON/OFF in the sidebar.&lt;/p&gt;
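&lt;p&gt;A complexity-based router could be as simple as the sketch below. The word-count threshold and the keyword list are invented for illustration and are not the project's actual routing rules; the return values are labels, not Bedrock model IDs.&lt;/p&gt;

```python
# Hypothetical heuristic: the threshold and keyword list are illustrative only.
ANALYTICAL_HINTS = ("compare", "analyze", "why", "trade-off", "summarize")
MAX_SIMPLE_WORDS = 12


def choose_model(query: str) -> str:
    """Return "haiku" for short factual queries, "sonnet" otherwise."""
    lowered = query.lower()
    is_long = len(lowered.split()) > MAX_SIMPLE_WORDS
    is_analytical = any(hint in lowered for hint in ANALYTICAL_HINTS)
    return "sonnet" if is_long or is_analytical else "haiku"


print(choose_model("What is the volume size?"))
print(choose_model("Compare Q3 targets across regions and analyze the gap"))
```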

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzckrh5o3nhgnx8tljua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzckrh5o3nhgnx8tljua.png" alt="Smart Routing" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;
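
&lt;p&gt;The article does not say how query complexity is measured, so the following Python sketch is only one plausible heuristic. The word-count threshold, the keyword list, and the fallback model used when the toggle is OFF are all assumptions made for illustration.&lt;/p&gt;

```python
# Hypothetical signals for "complex analytical" queries.
ANALYTICAL_HINTS = ("compare", "analyze", "summarize", "why", "trade-off")
MAX_SIMPLE_WORDS = 12  # assumed cutoff for a "short factual" query


def pick_model(query, smart_routing=True, default="sonnet"):
    """Return "haiku" for short factual queries and "sonnet" otherwise.

    With smart routing toggled OFF, every query goes to `default`.
    """
    if not smart_routing:
        return default
    lowered = query.lower()
    is_long = len(lowered.split()) > MAX_SIMPLE_WORDS
    is_analytical = any(hint in lowered for hint in ANALYTICAL_HINTS)
    return "sonnet" if (is_long or is_analytical) else "haiku"


print(pick_model("What is the VPN port?"))
print(pick_model("Compare Q3 and Q4 storage costs"))
```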

&lt;h3&gt;
  
  
  Image Analysis RAG (v3.1.0)
&lt;/h3&gt;

&lt;p&gt;Drag-and-drop image upload in the chat input. Images are analyzed with Bedrock Vision API (Claude Haiku 4.5) and the analysis is integrated into KB search context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjt24e93osmytgd9s2ecr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjt24e93osmytgd9s2ecr.png" alt="Image Upload" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6-Layer Security
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L1&lt;/td&gt;
&lt;td&gt;CloudFront Geo Restriction&lt;/td&gt;
&lt;td&gt;Geographic access control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L2&lt;/td&gt;
&lt;td&gt;AWS WAF (6 rules)&lt;/td&gt;
&lt;td&gt;Attack pattern detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L3&lt;/td&gt;
&lt;td&gt;CloudFront OAC (SigV4)&lt;/td&gt;
&lt;td&gt;Origin authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L4&lt;/td&gt;
&lt;td&gt;Lambda Function URL IAM Auth&lt;/td&gt;
&lt;td&gt;API-level access control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L5&lt;/td&gt;
&lt;td&gt;Cognito JWT / SAML / OIDC&lt;/td&gt;
&lt;td&gt;User authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L6&lt;/td&gt;
&lt;td&gt;SID / UID+GID / OIDC Group Filtering&lt;/td&gt;
&lt;td&gt;Document-level authorization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
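
&lt;p&gt;Layer L6 is the one that decides which documents a user may see. A minimal sketch of that idea in Python follows, assuming each document carries a list of allowed principals and each user resolves to a set of SIDs and groups. The field names and identifier values are hypothetical.&lt;/p&gt;

```python
def visible_documents(user, documents):
    """Keep only documents that share at least one principal with the user."""
    principals = set(user.get("sids", [])) | set(user.get("groups", []))
    return [d for d in documents if principals.intersection(d["allowed"])]


# Hypothetical user and document metadata.
user = {"sids": ["S-1-5-21-1-2-3-1001"], "groups": ["engineering"]}
documents = [
    {"name": "design.pdf", "allowed": ["engineering"]},
    {"name": "payroll.xlsx", "allowed": ["S-1-5-21-1-2-3-2002", "hr"]},
    {"name": "notes.txt", "allowed": ["S-1-5-21-1-2-3-1001"]},
]
print([d["name"] for d in visible_documents(user, documents)])
```

&lt;p&gt;A user with no resolved principals sees nothing, which is the safe default for document-level authorization.&lt;/p&gt;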

&lt;h3&gt;
  
  
  8-Language i18n — Why It Matters
&lt;/h3&gt;

&lt;p&gt;The UI and all documentation (README, guides, setup instructions) are available in 8 languages: Japanese, English, Korean, Simplified Chinese, Traditional Chinese, French, German, and Spanish.&lt;/p&gt;

&lt;p&gt;This isn't just a nice-to-have. Enterprise file servers are inherently multi-regional — a global company's FSx ONTAP volumes serve teams across Tokyo, Seoul, Shanghai, Frankfurt, and New York. If the RAG interface only speaks English, you've created a barrier for the very users who need it most.&lt;/p&gt;

&lt;p&gt;The implementation uses Next.js &lt;code&gt;next-intl&lt;/code&gt; with per-locale message files. Every UI string goes through &lt;code&gt;useTranslations()&lt;/code&gt;. The AI's chat responses also match the user's language — a Korean user asking in Korean gets a Korean answer with Korean citation labels.&lt;/p&gt;

&lt;p&gt;Here's what the card grid looks like across all 8 languages:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🇯🇵 日本語&lt;/th&gt;
&lt;th&gt;🇺🇸 English&lt;/th&gt;
&lt;th&gt;🇰🇷 한국어&lt;/th&gt;
&lt;th&gt;🇨🇳 简体中文&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fynwul0kqu3a3ao4v2prm.png" alt="ja" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrxdmjuvdcljfmwos4yc.png" alt="en" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34up0z4lvjh0l7uxbgnz.png" alt="ko" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpn9jjap3q10l57auei15.png" alt="zh-CN" width="800" height="405"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🇹🇼 繁體中文&lt;/th&gt;
&lt;th&gt;🇫🇷 Français&lt;/th&gt;
&lt;th&gt;🇩🇪 Deutsch&lt;/th&gt;
&lt;th&gt;🇪🇸 Español&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef3kievslhl41n329njt.png" alt="zh-TW" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmwe3evtyuex1ds4z3t0v.png" alt="fr" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vso4cgubr8g5l2esipn.png" alt="de" width="800" height="405"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq0ysudmy9uoq9a2fmw4p.png" alt="es" width="800" height="405"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  v4.0.0: Six New Features (April 2026)
&lt;/h2&gt;

&lt;p&gt;v4.0.0 adds six capabilities that extend the system from document search into a more complete enterprise AI platform. All are opt-in via CDK parameters — zero additional cost when disabled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent Registry Integration
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;enableAgentRegistry=true&lt;/code&gt; adds a "Registry" tab to the Agent Directory, connecting to AWS Agent Registry (Amazon Bedrock AgentCore). Your organization's shared Agents, Tools, and MCP Servers become searchable and importable directly from the UI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Semantic search across registry records&lt;/li&gt;
&lt;li&gt;One-click import from registry to local Bedrock Agent (name collision handling with &lt;code&gt;_imported_YYYYMMDD&lt;/code&gt; suffix)&lt;/li&gt;
&lt;li&gt;Publish local agents to the registry (with approval workflow)&lt;/li&gt;
&lt;li&gt;Resource type filters (Agent / Tool / MCP Server)&lt;/li&gt;
&lt;li&gt;Cross-region access via &lt;code&gt;agentRegistryRegion&lt;/code&gt; parameter&lt;/li&gt;
&lt;li&gt;Fault isolation: registry errors don't affect other Agent Directory tabs&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Agent Registry is a Preview API as of April 2026. The implementation uses SigV4-signed HTTP with REST path mapping. When the Node.js SDK adds native commands, the client can be swapped with minimal changes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Multimodal RAG Search
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;embeddingModel: "nova-multimodal"&lt;/code&gt; switches the Knowledge Base from text-only (Titan Text Embeddings v2) to cross-modal search across text, images, video, and audio using Amazon Nova Multimodal Embeddings.&lt;/p&gt;

&lt;p&gt;The architecture uses two patterns that make model changes painless:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Embedding Model Registry&lt;/strong&gt;: Model definitions are configuration objects in a catalog. Adding a new model = adding one entry&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KB Config Strategy&lt;/strong&gt;: Dynamically generates KB configuration, IAM policies, and Lambda environment variables from the registry entry&lt;/li&gt;
&lt;/ul&gt;
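
&lt;p&gt;The two patterns above can be sketched together: a catalog of model definitions, plus one function that derives everything else from a catalog entry. This Python illustration is an assumption about the shape, not the project's code. The Nova model ID is left as a placeholder, its dimension count is a guess, and the output keys are hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingModel:
    model_id: str
    dimensions: int
    modalities: tuple


# Adding a new model means adding one entry here.
REGISTRY = {
    "titan-text-v2": EmbeddingModel("amazon.titan-embed-text-v2:0", 1024, ("text",)),
    # Placeholder ID and assumed dimensions: check the Bedrock model catalog.
    "nova-multimodal": EmbeddingModel(
        "NOVA_MULTIMODAL_MODEL_ID", 1024, ("text", "image", "video", "audio")
    ),
}


def kb_config(key, region="us-east-1"):
    """Derive KB settings, IAM resources, and Lambda env vars from one entry."""
    model = REGISTRY[key]
    arn = f"arn:aws:bedrock:{region}::foundation-model/{model.model_id}"
    return {
        "embedding_model_arn": arn,
        "iam_resources": [arn],
        "lambda_env": {
            "EMBEDDING_MODEL": key,
            "EMBEDDING_DIMENSIONS": str(model.dimensions),
        },
    }


print(kb_config("titan-text-v2")["embedding_model_arn"])
```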

&lt;p&gt;For gradual migration, &lt;code&gt;multimodalKbMode: "dual"&lt;/code&gt; runs two KBs in parallel — text-only (Titan) + multimodal (Nova) — with a query router that directs text queries to the text KB and image-attached queries to the multimodal KB. Users can toggle between them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Caveat&lt;/strong&gt;: Nova Multimodal Embeddings is currently available in us-east-1 and us-west-2 only. Changing the embedding model requires KB recreation and full data re-ingestion.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Guardrails Organizational Safeguards
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;enableGuardrails=true&lt;/code&gt; with optional &lt;code&gt;guardrailsConfig&lt;/code&gt; gives fine-grained control over Bedrock Guardrails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Content filter strength&lt;/strong&gt;: Per-category (sexual, violence, hate, insults, misconduct, prompt attack) input/output filter levels (NONE/LOW/MEDIUM/HIGH)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Topic policies&lt;/strong&gt;: Block specific topics (e.g., competitor information)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII detection&lt;/strong&gt;: Per-entity-type actions (BLOCK or ANONYMIZE for email, phone, credit card, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contextual grounding&lt;/strong&gt;: Hallucination prevention with configurable thresholds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The UI adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GuardrailsStatusBadge&lt;/strong&gt; on every chat response: ✅ safe / ⚠️ filtered / ⚠️ check unavailable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GuardrailsAdminPanel&lt;/strong&gt; in the sidebar (admin-only, read-only): shows account guardrails config and detects AWS Organizations Organizational Safeguards&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EMF metrics&lt;/strong&gt;: &lt;code&gt;GuardrailsInputBlocked&lt;/code&gt;, &lt;code&gt;GuardrailsOutputFiltered&lt;/code&gt;, &lt;code&gt;GuardrailsPassthrough&lt;/code&gt; → CloudWatch dashboard + SNS alerts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Error handling follows a &lt;strong&gt;Fail-Open&lt;/strong&gt; strategy: if the Guardrails API times out (5s) or returns 5xx, chat continues normally with an error log. The AI never stops working because of a guardrails hiccup.&lt;/p&gt;
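
&lt;p&gt;The fail-open behavior can be sketched as a wrapper that converts a timeout or error into an "unavailable" status instead of an exception. This is a minimal Python illustration: the &lt;code&gt;check&lt;/code&gt; callable stands in for the real Guardrails call, and the status strings are assumptions that mirror the badge states listed above.&lt;/p&gt;

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

log = logging.getLogger("guardrails")
_pool = ThreadPoolExecutor(max_workers=4)


def check_fail_open(check, text, timeout_s=5.0):
    """Run check(text) and return its verdict.

    On timeout or any error, log it and return "unavailable" so the
    chat response is still delivered (fail-open).
    """
    future = _pool.submit(check, text)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        log.error("guardrails check timed out after %ss", timeout_s)
    except Exception as exc:
        log.error("guardrails check failed: %s", exc)
    return "unavailable"


def slow_check(text):
    """Hypothetical stand-in for a Guardrails call that responds too slowly."""
    time.sleep(0.3)
    return "safe"


print(check_fail_open(lambda text: "safe", "hello"))
print(check_fail_open(slow_check, "hello", timeout_s=0.05))
```

&lt;p&gt;Fail-open trades strictness for availability. A deployment with stricter compliance needs could return a blocking status from the same wrapper instead.&lt;/p&gt;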

&lt;h3&gt;
  
  
  Voice Chat (Amazon Nova Sonic)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;enableVoiceChat=true&lt;/code&gt; adds voice interaction. Click the 🎤 microphone button (or Ctrl+Shift+V), speak your question, and get a text + audio response — all through the same permission-aware RAG pipeline.&lt;/p&gt;

&lt;p&gt;Phase 1 (current) uses REST + Bedrock Converse API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser (mic) → POST /api/voice/stream → Converse API (speech→text)
                                        → KB/Agent RAG pipeline
                                        → text + audio response → Browser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Waveform animation (Canvas-based, input=blue, output=green, respects &lt;code&gt;prefers-reduced-motion&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;30-second silence timeout with auto-stop&lt;/li&gt;
&lt;li&gt;Auto-reconnect (max 3 attempts), then text fallback&lt;/li&gt;
&lt;li&gt;Works in KB mode, Single Agent mode, and Multi Agent mode&lt;/li&gt;
&lt;li&gt;Permission filtering is input-method-agnostic — voice queries get the same SID/UID/GID filtering as text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Phase 2 (planned) will use API Gateway WebSocket + Nova Sonic &lt;code&gt;InvokeModelWithBidirectionalStream&lt;/code&gt; for real-time bidirectional streaming.&lt;/p&gt;

&lt;p&gt;Estimated monthly cost: $70–$100 (input ~$0.0019/min, output ~$0.0076/min).&lt;/p&gt;

&lt;h3&gt;
  
  
  AgentCore Policy
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;enableAgentPolicy=true&lt;/code&gt; adds agent behavior control. Define boundaries in natural language — what tools the agent can use, what APIs it can call, what data it can access — and the system enforces them in real-time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;3 policy templates&lt;/strong&gt;: Security-focused, Cost-focused, Flexibility-focused&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PolicyEvaluationMiddleware&lt;/strong&gt;: Evaluates every agent action against the policy (3s timeout)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fail-open / Fail-closed&lt;/strong&gt;: &lt;code&gt;policyFailureMode&lt;/code&gt; controls behavior when policy evaluation fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Violation logging&lt;/strong&gt;: EMF-format metrics (&lt;code&gt;PolicyViolationCount&lt;/code&gt;, &lt;code&gt;PolicyEvaluationLatency&lt;/code&gt;) → CloudWatch dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PolicySection&lt;/strong&gt; in Agent create/edit forms: optional natural language policy input (max 2000 chars)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PolicyBadge&lt;/strong&gt; (🛡️) on agents with active policies&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: AgentCore Policy reached GA in March 2026 with a Policy Engine + Gateway architecture. Policies are written in Cedar language (with natural language auto-conversion). The implementation uses SigV4-signed HTTP.&lt;/p&gt;
&lt;/blockquote&gt;
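
&lt;p&gt;The effect of &lt;code&gt;policyFailureMode&lt;/code&gt; can be sketched as follows. The mode strings, the &lt;code&gt;evaluate&lt;/code&gt; callable, and the action shape are assumptions for illustration in Python. The real middleware calls the AgentCore Policy service, and a 3-second timeout would surface here as an exception.&lt;/p&gt;

```python
def authorize(action, evaluate, failure_mode="fail-closed"):
    """Return True if the action may proceed.

    `evaluate(action)` returns a truthy value when the policy allows the
    action. If evaluation itself fails, the outcome depends on
    `failure_mode`: "fail-open" allows the action, anything else denies it.
    """
    try:
        return bool(evaluate(action))
    except Exception:
        return failure_mode == "fail-open"


def broken_evaluator(action):
    """Hypothetical evaluator that simulates a policy service outage."""
    raise TimeoutError("policy evaluation timed out")


action = {"tool": "delete_file"}
print(authorize(action, lambda a: a["tool"] != "delete_file"))
print(authorize(action, broken_evaluator, failure_mode="fail-open"))
print(authorize(action, broken_evaluator, failure_mode="fail-closed"))
```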

&lt;h3&gt;
  
  
  Feature Flags Runtime API
&lt;/h3&gt;

&lt;p&gt;A cross-cutting change that affects all v4 features: the UI no longer relies on &lt;code&gt;NEXT_PUBLIC_*&lt;/code&gt; build-time environment variables. Instead, a &lt;code&gt;/api/config/features&lt;/code&gt; endpoint reads Lambda environment variables at runtime and returns feature flags. The &lt;code&gt;useFeatureFlags&lt;/code&gt; hook caches flags in localStorage for instant page loads.&lt;/p&gt;

&lt;p&gt;This means you can enable/disable features by changing CDK parameters and redeploying — without rebuilding the Docker image.&lt;/p&gt;
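
&lt;p&gt;The runtime lookup behind such an endpoint can be sketched like this. The project itself is a Next.js application, so this Python version only illustrates the logic, and the flag names and environment variable names are hypothetical.&lt;/p&gt;

```python
import os

# Hypothetical mapping from UI flag name to Lambda environment variable.
FLAG_ENV_VARS = {
    "voiceChat": "ENABLE_VOICE_CHAT",
    "guardrails": "ENABLE_GUARDRAILS",
    "agentRegistry": "ENABLE_AGENT_REGISTRY",
}


def feature_flags(env=None):
    """Read flags at request time; a missing variable means disabled."""
    env = os.environ if env is None else env
    return {
        name: env.get(var, "false").strip().lower() == "true"
        for name, var in FLAG_ENV_VARS.items()
    }


print(feature_flags({"ENABLE_VOICE_CHAT": "true", "ENABLE_GUARDRAILS": "False"}))
```

&lt;p&gt;Because the values are read per request rather than baked in at build time, the same image serves any flag combination.&lt;/p&gt;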

&lt;h3&gt;
  
  
  Multi-Agent Collaboration: Now Default-On
&lt;/h3&gt;

&lt;p&gt;When &lt;code&gt;enableAgent=true&lt;/code&gt;, multi-agent collaboration (&lt;code&gt;enableMultiAgent&lt;/code&gt;) is now enabled by default. Bedrock Agents have zero standby cost, so this adds no running cost. Token consumption increases (3-6x) only when users actually chat in Multi Agent mode. Set &lt;code&gt;enableMultiAgent: false&lt;/code&gt; explicitly to disable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0clamplfubh2y9afohz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0clamplfubh2y9afohz.png" alt="Multi-Agent Mode" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Multi-Agent Collaboration: Permission-Aware Agent Teams
&lt;/h2&gt;

&lt;p&gt;The system uses Amazon Bedrock Agents' &lt;strong&gt;Supervisor + Collaborator pattern&lt;/strong&gt;. Instead of a single agent handling everything, specialized agents work together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supervisor Agent&lt;/strong&gt;: Detects user intent, routes tasks to the right collaborator&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission Resolver&lt;/strong&gt;: Resolves SID/UID/GID from the User Access Table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval Agent&lt;/strong&gt;: Executes KB search with permission metadata filters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis Agent&lt;/strong&gt;: Summarizes and reasons over filtered context (no direct KB access)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output Agent&lt;/strong&gt;: Generates reports and documents (no direct KB access)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key design principle: &lt;strong&gt;KB access is restricted to Permission Resolver and Retrieval Agent only.&lt;/strong&gt; Analysis and Output agents receive "filtered context" — they never touch the knowledge base directly. This preserves the same SID/UID/GID permission boundaries that exist in single-agent mode.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hjc62op6rx4vzv8or8t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hjc62op6rx4vzv8or8t.png" alt="Teams Gallery" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost Structure
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Agent Calls&lt;/th&gt;
&lt;th&gt;Est. Cost/Request&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single Agent (existing)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;~$0.02&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-Agent (simple query)&lt;/td&gt;
&lt;td&gt;2–3&lt;/td&gt;
&lt;td&gt;~$0.06&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-Agent (complex query)&lt;/td&gt;
&lt;td&gt;4–6&lt;/td&gt;
&lt;td&gt;~$0.17&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
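
&lt;p&gt;To turn the per-request estimates into a monthly figure, multiply each by its request count. The traffic mix in this Python sketch is hypothetical. Only the per-request costs come from the table above.&lt;/p&gt;

```python
# Per-request estimates from the table, in US cents to avoid float drift.
COST_CENTS = {"single": 2, "multi_simple": 6, "multi_complex": 17}


def monthly_cost_usd(requests_by_scenario):
    """Sum cost over scenarios; input maps scenario name to request count."""
    total_cents = sum(
        COST_CENTS[name] * count for name, count in requests_by_scenario.items()
    )
    return total_cents / 100


# Hypothetical mix: 10,000 requests per month.
mix = {"single": 7000, "multi_simple": 2000, "multi_complex": 1000}
print(monthly_cost_usd(mix))
```

&lt;p&gt;In this example the 10% of requests that are complex multi-agent queries account for the largest share of the bill.&lt;/p&gt;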

&lt;h3&gt;
  
  
  Deployment Lessons Learned
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CloudFormation &lt;code&gt;AgentCollaboration&lt;/code&gt; values&lt;/strong&gt;: Only &lt;code&gt;DISABLED&lt;/code&gt;, &lt;code&gt;SUPERVISOR&lt;/code&gt;, and &lt;code&gt;SUPERVISOR_ROUTER&lt;/code&gt; are valid. &lt;code&gt;COLLABORATOR&lt;/code&gt; is NOT a valid value. Collaborator Agents should not set this property at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2-stage deploy is mandatory&lt;/strong&gt;: You cannot create a Supervisor Agent with &lt;code&gt;SUPERVISOR_ROUTER&lt;/code&gt; and collaborators in a single CloudFormation operation. The solution: create with &lt;code&gt;DISABLED&lt;/code&gt; first, then a Custom Resource Lambda changes to &lt;code&gt;SUPERVISOR_ROUTER&lt;/code&gt;, associates collaborators, and runs &lt;code&gt;PrepareAgent&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IAM permissions&lt;/strong&gt;: The Supervisor Agent's IAM role needs &lt;code&gt;bedrock:GetAgentAlias&lt;/code&gt; + &lt;code&gt;bedrock:InvokeAgent&lt;/code&gt; on &lt;code&gt;agent-alias/*/*&lt;/code&gt;. The Custom Resource Lambda needs &lt;code&gt;iam:PassRole&lt;/code&gt; for the Supervisor role.&lt;/p&gt;
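
&lt;p&gt;The second deploy stage could look roughly like the Python sketch below. It assumes &lt;code&gt;client&lt;/code&gt; is a boto3 &lt;code&gt;bedrock-agent&lt;/code&gt; client and that the agent was already created with &lt;code&gt;DISABLED&lt;/code&gt;. The dictionary shapes are hypothetical, the project's Custom Resource Lambda may differ, and a real update would also pass the agent's other existing settings so they are not reset.&lt;/p&gt;

```python
def promote_to_supervisor(client, agent, collaborators):
    """Stage 2: switch an existing agent to SUPERVISOR_ROUTER.

    `agent` and `collaborators` are plain dicts with hypothetical keys;
    `client` is assumed to be boto3.client("bedrock-agent").
    """
    # 1. Change the collaboration mode on the already-created agent.
    client.update_agent(
        agentId=agent["agentId"],
        agentName=agent["agentName"],
        agentResourceRoleArn=agent["roleArn"],
        foundationModel=agent["foundationModel"],
        agentCollaboration="SUPERVISOR_ROUTER",
    )
    # 2. Attach each collaborator by its alias ARN.
    for collaborator in collaborators:
        client.associate_agent_collaborator(
            agentId=agent["agentId"],
            agentVersion="DRAFT",
            agentDescriptor={"aliasArn": collaborator["aliasArn"]},
            collaboratorName=collaborator["name"],
            collaborationInstruction=collaborator["instruction"],
        )
    # 3. Prepare the draft so the changes take effect.
    client.prepare_agent(agentId=agent["agentId"])
```

&lt;p&gt;Passing the client in as an argument keeps the function testable with a stub, without AWS credentials.&lt;/p&gt;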




&lt;h2&gt;
  
  
  Tips for Builders
&lt;/h2&gt;

&lt;h3&gt;
  
  
  OpenLDAP &lt;code&gt;memberOf&lt;/code&gt; Overlay
&lt;/h3&gt;

&lt;p&gt;If you're testing with OpenLDAP, the LDAP Connector reads the &lt;code&gt;memberOf&lt;/code&gt; attribute from user entries. Basic OpenLDAP doesn't populate this automatically — you need to add &lt;code&gt;moduleload memberof&lt;/code&gt; and &lt;code&gt;overlay memberof&lt;/code&gt; to &lt;code&gt;slapd.conf&lt;/code&gt;, and create &lt;code&gt;groupOfNames&lt;/code&gt; entries (not just &lt;code&gt;posixGroup&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The repo includes &lt;code&gt;setup-openldap.sh&lt;/code&gt; that handles all of this automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Geo Restriction Default
&lt;/h3&gt;

&lt;p&gt;The WAF configuration defaults to Japan-only access (&lt;code&gt;allowedCountries: ["JP"]&lt;/code&gt;). If you're deploying outside Japan, update this before deploying:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"allowedCountries"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"JP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SG"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set to &lt;code&gt;[]&lt;/code&gt; for worldwide access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Existing FSx ONTAP Reuse
&lt;/h3&gt;

&lt;p&gt;If you already have an FSx for ONTAP file system, specify &lt;code&gt;existingFileSystemId&lt;/code&gt;, &lt;code&gt;existingSvmId&lt;/code&gt;, and &lt;code&gt;existingVolumeId&lt;/code&gt; in &lt;code&gt;cdk.context.json&lt;/code&gt; to skip FSx creation entirely. This cuts deployment time from 30-40 minutes to under 10 minutes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Built with Kiro
&lt;/h2&gt;

&lt;p&gt;I used &lt;a href="https://kiro.dev" rel="noopener noreferrer"&gt;Kiro&lt;/a&gt; throughout the entire development lifecycle — specs for requirements-to-code traceability, hooks for automated validation on file saves, and steering files for project-specific rules that persist across sessions. The v4.0.0 release involved 195 files changed, 8-language documentation updates, property-based tests with fast-check, and live AWS environment verification across multiple accounts — all developed with Kiro's assistance. As a solo developer, this level of tooling makes enterprise-quality projects feasible.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-Agentic-Access-Aware-RAG &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install

&lt;/span&gt;npx cdk bootstrap aws://&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="nt"&gt;--query&lt;/span&gt; Account &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;/ap-northeast-1
npx cdk bootstrap aws://&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="nt"&gt;--query&lt;/span&gt; Account &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;/us-east-1

bash demo-data/scripts/pre-deploy-setup.sh
npx cdk deploy &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="nt"&gt;--require-approval&lt;/span&gt; never
bash demo-data/scripts/post-deploy-setup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Prerequisites: Node.js 22+, Docker, AWS CLI configured with AdministratorAccess. Total deployment time is about 30-40 minutes (FSx ONTAP creation takes 20-30 minutes). Use &lt;code&gt;existingFileSystemId&lt;/code&gt; to skip FSx creation if you already have one.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The project is at v4.0.0 with 19 implementation aspects and actively evolving. Some directions I'm exploring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Voice Chat Phase 2&lt;/strong&gt;: WebSocket via API Gateway + Nova Sonic &lt;code&gt;InvokeModelWithBidirectionalStream&lt;/code&gt; for real-time bidirectional streaming (replacing the current REST-based Phase 1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DynamoDB-driven permission master&lt;/strong&gt;: Eliminating per-file &lt;code&gt;.metadata.json&lt;/code&gt; management for large-scale environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-volume embedding&lt;/strong&gt;: Independent S3 Access Points per FSx for ONTAP volume with cross-volume search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Registry GA SDK migration&lt;/strong&gt;: When the Node.js SDK adds native Agent Registry commands, swap from SigV4 HTTP to SDK calls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm looking for feedback on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Permission models&lt;/strong&gt;: Are SID/UID-GID/OIDC-group/hybrid strategies sufficient for your use cases?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice interaction patterns&lt;/strong&gt;: What voice-specific workflows would be valuable in enterprise RAG?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy templates&lt;/strong&gt;: What agent behavior boundaries matter most in your organization?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails configurations&lt;/strong&gt;: What content filtering rules does your compliance team require?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you try it out, I'd love to hear about your experience — especially edge cases I haven't considered. PRs and issues are welcome.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/strong&gt; — README available in 8 languages, same as the application UI&lt;/p&gt;




&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazonfsxfornetappontap</category>
      <category>agenticai</category>
      <category>rag</category>
    </item>
    <item>
      <title>Taking a look at Tiering of AWS ever-evolving File Storage services!</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Tue, 31 Dec 2024 13:42:48 +0000</pubDate>
      <link>https://forem.com/yoshikifujiwara/taking-a-look-at-tiering-of-aws-ever-evolving-file-storage-services-24l6</link>
      <guid>https://forem.com/yoshikifujiwara/taking-a-look-at-tiering-of-aws-ever-evolving-file-storage-services-24l6</guid>
      <description>&lt;h1&gt;
  
  
  Disclaimer
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Opinions are my own.&lt;/li&gt;
&lt;li&gt;The cover image shows the &lt;a href="https://aws.amazon.com/s3/storage-classes/intelligent-tiering/" rel="noopener noreferrer"&gt;Amazon S3 Intelligent-Tiering storage class Automatic Access tiers&lt;/a&gt;, but the main topic is AWS file storage tiering.&lt;/li&gt;
&lt;li&gt;If you have any questions or concerns after reading this article, please let me know.&lt;/li&gt;
&lt;li&gt;This article is based on features and content as of December 31, 2024. If you find any discrepancies, please check the latest official AWS information at the time you read it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Why is now a good time to take a look at AWS File Storage Tiering?&lt;/li&gt;
&lt;li&gt;File storage services on AWS&lt;/li&gt;
&lt;li&gt;What is Tiering?&lt;/li&gt;
&lt;li&gt;Tierings of AWS file storage services&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Why is now a good time to take a look at AWS File Storage Tiering?
&lt;/h3&gt;

&lt;p&gt;In short, AWS file storage services and features have become so diverse that it is hard to understand them as a whole. Tiering is an effective way to optimize a file storage environment, but it is hard to grasp because no material explains it across services.&lt;br&gt;
I wanted a bird's-eye view rather than a service-by-service comparison. Since I could not find such material, I decided to write it myself.&lt;br&gt;&lt;/p&gt;

&lt;p&gt;The direct trigger for this blog post was an update at AWS re:Invent 2024 that introduced a new FSx storage class, Amazon FSx Intelligent-Tiering. See the AWS release note titled "&lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/12/amazon-fsx-intelligent-tiering-storage-class-fsx/?nc1=h_ls" rel="noopener noreferrer"&gt;Announcing Amazon FSx Intelligent-Tiering, a new storage class for FSx&lt;/a&gt;".&lt;br&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;This storage-related announcement from AWS re:Invent 2024 is an exciting update that is attracting attention.&lt;br&gt;&lt;br&gt;
It was also covered at &lt;a href="https://www.youtube.com/watch?v=uCpDw1aFZJY" rel="noopener noreferrer"&gt;Storage-JAWS#6&lt;/a&gt;, the community re:Cap webinar for AWS re:Invent 2024 held by Storage-JAWS, the storage-focused branch of the Japan AWS User Group (JAWS-UG).&lt;br&gt;&lt;br&gt;
See the YouTube video linked above for details. (Because this was a local event in Japan, the sessions are in Japanese.)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you haven't seen the Storage-JAWS video above, please check it out: the speakers summarize the updates clearly, and there are great sessions and lightning talks with live demos and management console screenshots. Please feel free to fill out the survey after watching (I am asking as a member of the Storage-JAWS organizing team).
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However,&lt;br&gt;
&lt;strong&gt;contrary to what the title and wording of the release note above ("FSx Intelligent-Tiering") suggest, the feature is available only for Amazon FSx for OpenZFS, not for the entire Amazon FSx family. At the time of writing, it is not available for the other FSx services.&lt;/strong&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Personally, I found this naming confusing, so I decided to take this opportunity to clarify what file storage tiering on AWS is and how it works.&lt;/p&gt;

&lt;p&gt;Below is an excerpt from the AWS release note. Some of its expressions could lead to misunderstanding about whether it describes the whole FSx family or a feature of FSx for OpenZFS. I will try to write this post in a way that avoids such misunderstandings.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Today, AWS announces the general availability of Amazon FSx Intelligent-Tiering, a new storage class for Amazon FSx that costs up to 85% less than the FSx SSD storage class and up to 20% less than traditional HDD-based NAS storage on premises, and that brings full elasticity and intelligent tiering to network-attached storage (NAS). The new storage class is available today on Amazon FSx for OpenZFS.&lt;/p&gt;

&lt;p&gt;Using Amazon FSx, customers can launch and run fully managed cloud file systems that have familiar NAS capabilities such as point-in-time snapshots, data clones, and user quotas. Before today, customers have been moving NAS data sets for mission-critical and performance-intensive workloads to FSx for OpenZFS, using the existing SSD storage class for predictable high performance. With the new FSx Intelligent-Tiering storage class, customers can now bring to FSx for OpenZFS a broad range of general-purpose data sets, including those with a large proportion of infrequently accessed data stored on low-cost HDD on premises. FSx Intelligent-Tiering delivers low-cost storage and costs up to 85% less than the FSx SSD storage class and up to 20% less than traditional HDD-based NAS storage on premises...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, let's get into the details.&lt;/p&gt;




&lt;h3&gt;
  
  
  File storage services on AWS
&lt;/h3&gt;

&lt;p&gt;This chapter covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Storage services on AWS&lt;/li&gt;
&lt;li&gt;File Storage services on AWS&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Storage services on AWS
&lt;/h4&gt;

&lt;p&gt;First, let's take a look at storage services on AWS.&lt;br&gt;
As far as we can see from the AWS web page "Cloud Storage on AWS", there are 11 "categories", as shown in the figure below, and each category is further divided into services and features.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8vw8lut2pq3n6fj4177.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8vw8lut2pq3n6fj4177.png" alt=" " width="800" height="818"&gt;&lt;/a&gt;&lt;br&gt;
Source: &lt;a href="https://aws.amazon.com/products/storage/?nc1=h_ls" rel="noopener noreferrer"&gt;Cloud Storage on AWS&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  File Storage services on AWS
&lt;/h4&gt;

&lt;p&gt;There are three types of file storage in the diagram above: Amazon Elastic File System (EFS), Amazon FSx, and Amazon File Cache. In this article, we will look at EFS and the FSx family, leaving out Amazon File Cache because it is a cache service that does not have a tiering feature.&lt;br&gt;
The characteristics of each file storage service and how to choose one are summarized clearly in the AWS blog "&lt;a href="https://aws.amazon.com/jp/blogs/news/choose-filestorageservice/" rel="noopener noreferrer"&gt;How to choose an AWS file storage service&lt;/a&gt;". &lt;br&gt;
 It is written in Japanese, but its content is easy to follow. Alternatively, you can watch AWS re:Invent sessions by the AWS file storage team, such as &lt;a href="https://www.youtube.com/watch?v=IQR3zxdxjZA" rel="noopener noreferrer"&gt;AWS re:Invent 2024 - Network-attached storage in the cloud with Amazon FSx (STG202)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For example, the "Storage Types" section of the AWS blog mentioned above reviews the three storage types, "block storage", "object storage", and "file storage", as shown below.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81l3zrf8dgwfmj3di7hn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81l3zrf8dgwfmj3di7hn.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The "File Storage Protocols" section discusses two protocols, NFS (Network File System) and SMB (Server Message Block), and the "AWS File Storage Service" section explains the protocol specific to Amazon FSx for Lustre. Let's check all three protocols.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2dafaq9y2j4qpxl1ndq1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2dafaq9y2j4qpxl1ndq1.png" alt=" " width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6muxb9y3j4riytzqp7jn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6muxb9y3j4riytzqp7jn.png" alt=" " width="800" height="277"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition, the figure below from the "Comparison of AWS File Storage Services" section lists the services, excluding FSx for Lustre, at the time of writing, together with points to consider when making a selection. Under "Protocols supported by each service," Amazon FSx for NetApp ONTAP stands out as multi-protocol: it supports not only NFS and SMB but also iSCSI. The section also touches on some unique aspects of each service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6z4nrtr0swyhwkugmyq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6z4nrtr0swyhwkugmyq.png" alt=" " width="800" height="465"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn13ugh6yj1ifc58hj841.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn13ugh6yj1ifc58hj841.png" alt=" " width="469" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, in the section "How to choose an AWS file storage service", the part "Consider what you are looking for" touches on Tiering, the main theme of this blog, and the EFS lifecycle policy. In this blog, I will update and supplement the above table based on the AWS re:Invent 2024 Update.&lt;/p&gt;




&lt;h3&gt;
  
  
  What is Tiering?
&lt;/h3&gt;

&lt;p&gt;What do you think of when you hear the word "Tiering"?&lt;br&gt;
Let's take a look at Amazon S3, AWS's best-known storage service. The feature description of the "&lt;a href="https://aws.amazon.com/s3/storage-classes/intelligent-tiering/?nc1=h_ls" rel="noopener noreferrer"&gt;Amazon S3 Intelligent-Tiering storage class&lt;/a&gt;", whose name includes "Tiering", introduces the feature from the perspective of cost optimization as follows.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Amazon S3 Intelligent-Tiering storage class is designed to optimize storage costs by automatically moving data to the most cost-effective access tier when access patterns change.&lt;/p&gt;
&lt;/blockquote&gt;
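
&lt;p&gt;As a rough illustration of "moving data to the most cost-effective access tier", the sketch below maps the number of days since an object was last accessed to an S3 Intelligent-Tiering access tier. It assumes only the automatic tiers with their 30-day and 90-day thresholds and ignores the opt-in archive tiers; the function and variable names are my own and are not part of any AWS API.&lt;/p&gt;

```python
from bisect import bisect_right

# Automatic access tiers of S3 Intelligent-Tiering and the number of
# consecutive days without access after which an object moves into them.
# Assumption: only the automatic tiers are modeled (30 and 90 days);
# the opt-in archive tiers are left out of this sketch.
THRESHOLD_DAYS = [30, 90]
TIERS = ["Frequent Access", "Infrequent Access", "Archive Instant Access"]


def access_tier(days_since_last_access):
    """Return the access tier for an object idle for the given number of days."""
    return TIERS[bisect_right(THRESHOLD_DAYS, days_since_last_access)]


for idle_days in (0, 29, 30, 89, 90, 365):
    print(idle_days, access_tier(idle_days))
```

&lt;p&gt;Note that the decision is made per object: the only input is when the object as a whole was last accessed.&lt;/p&gt;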

&lt;p&gt;However, I think that the scope of implementation and the range of use cases are gradually expanding.&lt;br&gt;
For example, the section "Analyze data access patterns and use storage tiers" of the AWS blog "&lt;a href="https://aws.amazon.com/jp/blogs/architecture/optimizing-your-aws-infrastructure-for-sustainability-part-ii-storage/" rel="noopener noreferrer"&gt;Optimizing your AWS Infrastructure for Sustainability, Part II: Storage&lt;/a&gt;" gives the following two explanations, suggesting that automated tiering makes S3 lifecycle management more sustainable.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Choosing the right storage tier after analyzing data access patterns gives you more sustainable storage options in the cloud.&lt;/li&gt;
&lt;li&gt;For data with unknown or changing access patterns, use Amazon S3 Intelligent-Tiering to monitor access patterns and move objects among tiers automatically. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5tcxor9xhp92pfv8k20.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5tcxor9xhp92pfv8k20.jpg" alt=" " width="800" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this way, we can see that storage tiering is important not only from a cost optimization perspective but also from a sustainability perspective.&lt;br&gt;&lt;br&gt;
Besides S3, I often use the EFS and FSx file storage services to optimize cost and performance, and I provide design support for them. From here, I will take a deep dive into file storage, which comes in a wide variety of types and features.&lt;/p&gt;




&lt;h3&gt;
  
  
  Tiering in AWS file storage services
&lt;/h3&gt;

&lt;p&gt;Items in this chapter&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tiering in AWS file storage services&lt;/li&gt;
&lt;li&gt;Relevant information&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Tiering in AWS file storage services
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Description/Service&lt;/th&gt;
&lt;th&gt;EFS&lt;/th&gt;
&lt;th&gt;FSx for OpenZFS&lt;/th&gt;
&lt;th&gt;FSx for ONTAP&lt;/th&gt;
&lt;th&gt;FSx for Lustre&lt;/th&gt;
&lt;th&gt;FSx for Windows&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tiering&lt;/td&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tiering granularity&lt;/td&gt;
&lt;td&gt;File Level&lt;/td&gt;
&lt;td&gt;Data Block Level&lt;/td&gt;
&lt;td&gt;Data Block Level&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tiering configuration unit&lt;/td&gt;
&lt;td&gt;File System&lt;/td&gt;
&lt;td&gt;File System&lt;/td&gt;
&lt;td&gt;Volume&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tiering Class/Pool&lt;/td&gt;
&lt;td&gt;1. Standard&lt;br&gt;2. Infrequent Access (IA)&lt;br&gt;3. Archive&lt;/td&gt;
&lt;td&gt;1. Frequent Access&lt;br&gt;2. Infrequent Access&lt;br&gt;3. Archive&lt;/td&gt;
&lt;td&gt;1. Primary Storage (SSD)&lt;br&gt;2. Capacity Pool (HDD)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I purposely included "Tiering granularity" because it is one of the pitfalls of tiering. S3 and EFS implement tiering at the file/object level: when any part of a file/object is read, the whole file/object is treated as accessed. As a result, tiering is not triggered even when most of the data blocks in the file were hardly read, including blocks that have never been accessed.&lt;/p&gt;

&lt;p&gt;On the other hand, FSx for OpenZFS and FSx for ONTAP tier at the data block level, so a wider range of data can potentially be optimized for cost/performance through tiering.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;As an example, suppose this blog were a file and you opened it and left after seeing only the title. EFS and S3 would not tier this file/object at all, whereas FSx for OpenZFS and FSx for ONTAP would still tier the unread data blocks other than the title. As described in the ONTAP Knowledge Base and Technical Report, FSx for ONTAP evaluates data in 4KB blocks and tiers them in 4MB units. Regarding FSx for OpenZFS, I have not yet been able to find any documentation that describes the specific behavior of tiering at the data block level, so if anyone knows about it, I would appreciate it if you could let me know.&lt;/li&gt;
&lt;/ul&gt;
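
&lt;p&gt;The difference in granularity can be sketched with a toy model. The class and function names below are hypothetical, and the model ignores cooling periods, minimum sizes, and everything else the real services take into account. It only counts how many blocks of a file remain eligible for tiering after a partial read.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    """A toy file made of fixed-size data blocks (hypothetical model)."""
    name: str
    block_count: int
    read_blocks: set = field(default_factory=set)

    def read(self, block_index):
        """Record that one block of the file has been read."""
        self.read_blocks.add(block_index)


def cold_blocks_file_level(f):
    """File-level tiering: a single read keeps the whole file hot."""
    return 0 if f.read_blocks else f.block_count


def cold_blocks_block_level(f):
    """Block-level tiering: only the blocks that were read stay hot."""
    return f.block_count - len(f.read_blocks)


blog = StoredFile(name="blog.html", block_count=100)
blog.read(0)  # the reader only looks at the title, stored in block 0

print("file-level :", cold_blocks_file_level(blog), "blocks eligible for tiering")
print("block-level:", cold_blocks_block_level(blog), "blocks eligible for tiering")
```

&lt;p&gt;In this model, one read of the title leaves nothing to tier at the file level, while at the block level the other 99 blocks can still move to a colder tier.&lt;/p&gt;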

&lt;p&gt;Cautions and tips:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The FSx Intelligent-Tiering storage class of FSx for OpenZFS is selected per file system and can only be used in Multi-AZ configurations.

&lt;ul&gt;
&lt;li&gt;Screenshot of the AWS Management Console for creating an FSx for OpenZFS file system. When you choose the Intelligent-Tiering (elastic) storage class, you can choose neither Single-AZ 2 (HA) nor Single-AZ 2 (non-HA).
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1igm66f9mxq4tcrz83th.png" alt=" " width="" height=""&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Tiering in FSx for ONTAP allows you to change and tune the Tiering Policy even after the file system and volume are created. Both single-AZ and multi-AZ configurations are possible.

&lt;ul&gt;
&lt;li&gt;Screenshots of the AWS Management Console for creating an FSx for ONTAP file system and updating a volume.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foct1njx9lo6phkk0lwtl.png" alt=" " width="800" height="274"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fheo3dlff8z79faeoly15.png" alt=" " width="800" height="1025"&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;If you want to enable EFS tiering when creating a file system, select "Customize" and configure it under "Lifecycle Management".
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpsgym7idy1ppxhgr44yg.png" alt=" " width="800" height="532"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfp4ctqi88iaeydcz101.png" alt=" " width="800" height="343"&gt;
&lt;/li&gt;

&lt;/ul&gt;
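
&lt;p&gt;The same settings can also be applied through the AWS API instead of the console. The sketch below only builds the request parameters for the EFS &lt;code&gt;PutLifecycleConfiguration&lt;/code&gt; and FSx &lt;code&gt;UpdateVolume&lt;/code&gt; operations and prints them, so it runs without AWS credentials. The helper names, the resource IDs, and the chosen transition periods are illustrative assumptions, not recommendations.&lt;/p&gt;

```python
import json


def efs_lifecycle_request(file_system_id, to_ia="AFTER_30_DAYS", to_archive="AFTER_90_DAYS"):
    """Build keyword arguments for the EFS PutLifecycleConfiguration API."""
    return {
        "FileSystemId": file_system_id,
        # Each policy entry carries exactly one transition.
        "LifecyclePolicies": [
            {"TransitionToIA": to_ia},
            {"TransitionToArchive": to_archive},
            {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
        ],
    }


def ontap_tiering_request(volume_id, policy="AUTO", cooling_period_days=31):
    """Build keyword arguments for the FSx UpdateVolume API (ONTAP volumes)."""
    tiering_policy = {"Name": policy}
    # A cooling period only applies to the AUTO and SNAPSHOT_ONLY policies.
    if policy in ("AUTO", "SNAPSHOT_ONLY"):
        tiering_policy["CoolingPeriod"] = cooling_period_days
    return {
        "VolumeId": volume_id,
        "OntapConfiguration": {"TieringPolicy": tiering_policy},
    }


# Placeholder IDs, not real resources.
efs_request = efs_lifecycle_request("fs-0123456789abcdef0")
ontap_request = ontap_tiering_request("fsvol-0123456789abcdef0")
print(json.dumps(efs_request, indent=2))
print(json.dumps(ontap_request, indent=2))

# With AWS credentials configured, the dictionaries could be passed on as:
#   boto3.client("efs").put_lifecycle_configuration(**efs_request)
#   boto3.client("fsx").update_volume(**ontap_request)
```

&lt;p&gt;This also reflects the configuration units in the table above: the EFS lifecycle policy is keyed by file system ID, while the ONTAP tiering policy is keyed by volume ID.&lt;/p&gt;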

&lt;h4&gt;
  
  
  Relevant information
&lt;/h4&gt;

&lt;p&gt;EFS Tiering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/efs/pricing/?nc1=h_ls" rel="noopener noreferrer"&gt;Amazon EFS Pricing&lt;/a&gt;: 
"Amazon EFS offers three storage classes: EFS Standard, SSD-based storage which delivers sub-millisecond latencies for actively-used data; EFS Infrequent Access (EFS IA), cost-optimized storage which delivers milliseconds latencies for data accessed only a few times a quarter; and EFS Archive, cost-optimized storage which delivers milliseconds latencies for long-lived data accessed a few times a year or less."&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/efs/features/infrequent-access/?nc1=h_ls" rel="noopener noreferrer"&gt;Amazon EFS Infrequent Access&lt;/a&gt;: "Amazon EFS will automatically and transparently move your files to the lower cost regional EFS IA storage class based on the last time they were accessed."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/jp/efs/storage-classes/archive/" rel="noopener noreferrer"&gt;Amazon EFS Archive&lt;/a&gt;: "Amazon EFS will automatically and transparently move your files to the lower cost EFS IA and Archive storage classes based on the last time they were accessed."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FSx for OpenZFS Tiering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/fsx/openzfs/features/?nc1=h_ls" rel="noopener noreferrer"&gt;Amazon FSx for OpenZFS Features &amp;gt; Cost optimization&amp;gt;Intelligent-Tiering(Need to open to see following description)&lt;/a&gt;: &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Intelligent-Tiering delivers automatic storage cost savings when data access patterns change, without performance impact or operational overhead. The Amazon FSx for OpenZFS Intelligent-Tiering storage class is designed to optimize storage costs using elasticity to automatically move data to the most cost-effective access tier when access patterns change. Amazon FSx Intelligent-Tiering is up to 85% lower cost than the FSx SSD storage class, and up to 20% lower cost compared to traditional on-premises HDD deployments.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;AWS News blog: &lt;a href="https://aws.amazon.com/jp/blogs/aws/announcing-amazon-fsx-intelligent-tiering-a-new-storage-class-for-fsx-for-openzfs/" rel="noopener noreferrer"&gt;Announcing Amazon FSx Intelligent-Tiering, a new storage class for FSx for OpenZFS&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FSx for ONTAP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/fsx/netapp-ontap/features/?nc1=h_ls" rel="noopener noreferrer"&gt;Amazon FSx for NetApp ONTAP Features &amp;gt; Cost optimization &amp;gt; Elastic capacity pool tiering(Need to open to see following description)&lt;/a&gt;：&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Each Amazon FSx for NetApp ONTAP file system has two storage tiers: primary storage and capacity pool storage. Primary storage is provisioned, scalable, high-performance SSD storage that’s purpose-built for the active portion of your data set. Capacity pool storage is a fully elastic storage tier that can scale to petabytes in size and is cost-optimized for infrequently-accessed data. Amazon FSx for NetApp ONTAP automatically tiers data from SSD storage to capacity pool storage based on your access patterns, allowing you to achieve SSD levels of performance for your workload while only paying for SSD storage for a small fraction of your data. Capacity pool storage automatically grows and shrinks as you tier data to it, providing elastic storage for the portion of your data set that grows over time without the need to plan or provision capacity for this data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ONTAP Tiering logic: &lt;a href="https://www.netapp.com/media/17239-tr-4598.pdf" rel="noopener noreferrer"&gt;Technical Report FabricPool best practices ONTAP 9.14.1&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data movement &amp;gt; Tiering data to an object store&lt;br&gt;
After a block has been identified as cold, it is marked for tiering. During this time, a background tiering scan looks for cold blocks. When enough 4KB blocks from the same volume have been collected, they are concatenated into a 4MB object and moved to the cloud tier based on the volume tiering policy.&lt;/p&gt;
&lt;/blockquote&gt;
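
&lt;p&gt;To get a feel for the numbers in the excerpt above, the following sketch works out how many 4KB blocks fit into one 4MB object and how much cold data ends up in complete objects. It assumes binary units (4 KiB and 4 MiB) and is plain arithmetic, not ONTAP code; the function name is my own.&lt;/p&gt;

```python
KIB = 1024
BLOCK_SIZE = 4 * KIB            # one 4KB block, as in the report
OBJECT_SIZE = 4 * KIB * KIB     # one 4MB object, as in the report

# Number of cold 4KB blocks that have to be collected to fill one object.
BLOCKS_PER_OBJECT = OBJECT_SIZE // BLOCK_SIZE


def tiering_objects(cold_bytes):
    """Return (complete 4MB objects, cold 4KB blocks left over) for cold data."""
    cold_blocks = cold_bytes // BLOCK_SIZE
    return divmod(cold_blocks, BLOCKS_PER_OBJECT)


print(BLOCKS_PER_OBJECT)                  # 1024
print(tiering_objects(10 * KIB * KIB))    # (2, 512)
```

&lt;p&gt;In other words, under these assumptions 10MB of cold data in one volume fills two complete objects, and the remaining 512 blocks wait until enough further cold blocks have been collected.&lt;/p&gt;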




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;What do you think of this summary of AWS file storage Tiering?&lt;br&gt;
When selecting a file system, you will most likely choose one that you are familiar with. However, from the perspective of cost/performance optimization, data integration, application configuration, security, and so on, why not give equal consideration to the features and benefits of other services?&lt;/p&gt;

&lt;p&gt;Among the options discussed this time, EFS, FSx for OpenZFS, and FSx for ONTAP differ in how they tier data, so you will need to choose based on your usage and the user experience you are aiming for.&lt;br&gt;
As for pricing, the actual cost and whether it is reasonable vary greatly with how and why a service is used, and prices often change when new features are released. I therefore intentionally left pricing out of the table so that a price comparison would not be taken out of context.&lt;/p&gt;

&lt;p&gt;I am sure that Amazon FSx Intelligent-Tiering will become even more powerful in the future. Let's keep an eye on AWS file storage and Tiering, which will continue to evolve!! &lt;br&gt;
I hope this blog will be helpful to someone.&lt;/p&gt;

&lt;p&gt;Bye now!!&lt;/p&gt;




&lt;p&gt;Socials:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://twitter.com/antiberial" rel="noopener noreferrer"&gt;Yoshiki Fujiwara on X&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.facebook.com/yoshiki.fujiwara1/" rel="noopener noreferrer"&gt;Yoshiki Fujiwara on Facebook&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/yoshiki-fujiwara/" rel="noopener noreferrer"&gt;Yoshiki Fujiwara on LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>filestorage</category>
      <category>tiering</category>
    </item>
  </channel>
</rss>
