<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Lam Bùi</title>
    <description>The latest articles on Forem by Lam Bùi (@lamkhac).</description>
    <link>https://forem.com/lamkhac</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3267673%2Fa00f228c-d06c-467d-b129-6b4abd3e7872.jpg</url>
      <title>Forem: Lam Bùi</title>
      <link>https://forem.com/lamkhac</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lamkhac"/>
    <language>en</language>
    <item>
      <title>Building an Intelligent Document Processing Pipeline with AWS Lambda and Bedrock Data Automation</title>
      <dc:creator>Lam Bùi</dc:creator>
      <pubDate>Sun, 12 Oct 2025 19:22:34 +0000</pubDate>
      <link>https://forem.com/lamkhac/build-intelligent-document-processing-with-aws-lambda-and-bedrock-data-automation-45b0</link>
      <guid>https://forem.com/lamkhac/build-intelligent-document-processing-with-aws-lambda-and-bedrock-data-automation-45b0</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This guide provides detailed instructions on building a serverless document processing pipeline using AWS Lambda and Amazon Bedrock Data Automation (BDA). We will create an automated document processing system where Lambda functions are triggered when documents are uploaded to S3, using BDA to extract structured information from documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  What You Will Learn
&lt;/h3&gt;

&lt;p&gt;After completing this guide, you will be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create and configure Lambda functions through AWS Console&lt;/li&gt;
&lt;li&gt;Integrate Lambda with Amazon Bedrock Data Automation&lt;/li&gt;
&lt;li&gt;Set up S3 Event triggers for Lambda&lt;/li&gt;
&lt;li&gt;Automatically process documents and save results&lt;/li&gt;
&lt;li&gt;Monitor and troubleshoot Lambda functions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Solution Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4vme2yz9bm1lhjk8jd1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4vme2yz9bm1lhjk8jd1.png" alt=" " width="800" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Knowledge Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Basic understanding of AWS Console&lt;/li&gt;
&lt;li&gt;Basic Python knowledge&lt;/li&gt;
&lt;li&gt;Understanding of S3 and Lambda (basic level)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Article Structure
&lt;/h3&gt;

&lt;p&gt;This article is divided into 5 main parts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Part&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Part 1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Environment Setup (Enable Models, S3, IAM, BDA Project)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Part 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Create and Configure Lambda Function&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Part 3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Set up S3 Trigger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Part 4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Testing and Deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Part 5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-world Examples and Best Practices&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before starting, ensure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS Account with IAM permissions for Bedrock, Lambda, S3, IAM&lt;/li&gt;
&lt;li&gt;Access to AWS Console&lt;/li&gt;
&lt;li&gt;Selected a region supporting Amazon Bedrock Data Automation (recommended: &lt;code&gt;us-east-1&lt;/code&gt; or &lt;code&gt;us-west-2&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Modern browser (Chrome, Firefox, Safari, or Edge)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note about Region: Amazon Bedrock Data Automation is not available in all regions. Check &lt;a href="https://docs.aws.amazon.com/general/latest/gr/bedrock.html" rel="noopener noreferrer"&gt;Bedrock endpoints and quotas&lt;/a&gt; to see supported regions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lambda Function Code Reference
&lt;/h2&gt;

&lt;p&gt;Before setting up the environment, here is the complete Lambda function code that you will use later in Part 2. Having this code available upfront allows you to understand what we're building toward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File&lt;/strong&gt;: &lt;code&gt;lambda_function.py&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;unquote_plus&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize AWS clients
&lt;/span&gt;&lt;span class="n"&gt;s3_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;bda_runtime_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bedrock-data-automation-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sts_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get configuration from environment variables
&lt;/span&gt;&lt;span class="n"&gt;BDA_PROJECT_ARN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BDA_PROJECT_ARN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;OUTPUT_BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OUTPUT_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Get region and account ID dynamically (Lambda provides these automatically)
&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Lambda provides this automatically
&lt;/span&gt;&lt;span class="n"&gt;account_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sts_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_caller_identity&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Account&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Construct BDA Profile ARN dynamically
&lt;/span&gt;&lt;span class="n"&gt;BDA_PROFILE_ARN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:bedrock:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;AWS_REGION&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;account_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:data-automation-profile/us.data-automation-v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Lambda handler function to process documents with BDA.
    Triggered by S3 upload events.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Extract S3 bucket and key from event
&lt;/span&gt;        &lt;span class="n"&gt;s3_event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;input_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3_event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Ensure lowercase
&lt;/span&gt;        &lt;span class="c1"&gt;# URL decode the key (S3 event has URL-encoded keys)
&lt;/span&gt;        &lt;span class="n"&gt;document_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;unquote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3_event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing document: s3://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;input_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;document_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Validate bucket name format (BDA requirement)
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;input_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isalnum&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid bucket name format: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;input_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Must contain only lowercase letters, numbers, dots, and hyphens.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Construct S3 URIs
&lt;/span&gt;        &lt;span class="n"&gt;input_s3_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;input_bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;document_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="c1"&gt;# Generate output path with timestamp
&lt;/span&gt;        &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%Y%m%d_%H%M%S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;file_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;document_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;output_prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processed/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;output_s3_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OUTPUT_BUCKET&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_prefix&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="c1"&gt;# Invoke BDA to process the document
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invoking Bedrock Data Automation...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bda_runtime_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_data_automation_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;inputConfiguration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3Uri&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;input_s3_uri&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;outputConfiguration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3Uri&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;output_s3_uri&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;dataAutomationConfiguration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;dataAutomationProjectArn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BDA_PROJECT_ARN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stage&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LIVE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;dataAutomationProfileArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BDA_PROFILE_ARN&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;invocation_arn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;invocationArn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BDA Invocation ARN: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;invocation_arn&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Wait for processing to complete (with timeout)
&lt;/span&gt;        &lt;span class="n"&gt;max_wait_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;240&lt;/span&gt;  &lt;span class="c1"&gt;# 4 minutes (leave 1 min for cleanup)
&lt;/span&gt;        &lt;span class="n"&gt;wait_interval&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;max_wait_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;status_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bda_runtime_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_data_automation_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;invocationArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;invocation_arn&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;status_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Current status: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Success&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;result_s3_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;status_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;outputConfiguration&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3Uri&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing completed successfully!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Results saved to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result_s3_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Document processed successfully&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;input_s3_uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result_s3_uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;invocationArn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;invocation_arn&lt;/span&gt;
                    &lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ClientError&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ServiceError&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;error_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing failed with status: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;error_msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;invocationArn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;invocation_arn&lt;/span&gt;
                    &lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="c1"&gt;# Wait before checking again
&lt;/span&gt;            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wait_interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;elapsed_time&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;wait_interval&lt;/span&gt;

        &lt;span class="c1"&gt;# Timeout reached
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing still in progress after &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_wait_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;202&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Processing initiated but not completed within Lambda timeout&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;invocationArn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;invocation_arn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;note&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Check BDA console for final status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;error_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing document: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;error_msg&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Features of This Code&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatically converts bucket names to lowercase (BDA requirement)&lt;/li&gt;
&lt;li&gt;URL decodes file names from S3 events&lt;/li&gt;
&lt;li&gt;Validates bucket name format&lt;/li&gt;
&lt;li&gt;Dynamically constructs BDA Profile ARN&lt;/li&gt;
&lt;li&gt;Polls for completion status with timeout&lt;/li&gt;
&lt;li&gt;Comprehensive error handling and logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you reach Step 2.3 in Part 2, simply copy this code into your Lambda function.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to Amazon Bedrock Data Automation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Intelligent Document Processing with Generative AI
&lt;/h3&gt;

&lt;p&gt;Generative AI not only drives innovation through ideation and content creation but also optimizes operational processes and increases productivity across various domains. Amazon Bedrock Data Automation (BDA) is a fully managed service that provides Intelligent Document Processing (IDP) capabilities enhanced by generative AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Choose BDA?
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Complete Automation
&lt;/h4&gt;

&lt;p&gt;Enterprises can extract significant value from IDP enhanced with generative AI. By integrating generative AI capabilities into IDP solutions, organizations can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Advanced Document Understanding: Deep analysis of structure and semantics&lt;/li&gt;
&lt;li&gt;Structured Data Extraction: Automatic transformation from unstructured to structured data&lt;/li&gt;
&lt;li&gt;Automatic Classification: Document type recognition and appropriate routing&lt;/li&gt;
&lt;li&gt;Information Retrieval: Search and retrieve information from unstructured text&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. BDA vs Direct Bedrock API
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Direct Bedrock API&lt;/th&gt;
&lt;th&gt;BDA (Managed Service)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Need to build pipeline&lt;/td&gt;
&lt;td&gt;Fully managed, integrated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Document Understanding&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires prompt engineering&lt;/td&gt;
&lt;td&gt;Built-in document intelligence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multimodal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only processes images&lt;/td&gt;
&lt;td&gt;Supports various formats: PDF, images, audio, video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output Format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom JSON&lt;/td&gt;
&lt;td&gt;Standardized JSON + Markdown/HTML/CSV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bounding Boxes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not available&lt;/td&gt;
&lt;td&gt;Automatically provides element positions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;td&gt;Auto-scaling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom IDP solutions&lt;/td&gt;
&lt;td&gt;Enterprise document processing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Business Value and Use Cases
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Government &amp;amp; Public Sector
&lt;/h4&gt;

&lt;p&gt;Process and extract data from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Birth certificate applications, ID/Passport applications&lt;/li&gt;
&lt;li&gt;Immigration and visa records&lt;/li&gt;
&lt;li&gt;Legal contracts and government forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Reduce processing time from days to minutes, improve citizen service quality.&lt;/p&gt;

&lt;h4&gt;
  
  
  Healthcare
&lt;/h4&gt;

&lt;p&gt;Extract and organize information from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Electronic medical records&lt;/li&gt;
&lt;li&gt;Insurance claim requests&lt;/li&gt;
&lt;li&gt;Prescriptions and test results&lt;/li&gt;
&lt;li&gt;Clinical trial records&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Improve data accuracy, increase information accessibility for better patient care.&lt;/p&gt;

&lt;h4&gt;
  
  
  Finance &amp;amp; Banking
&lt;/h4&gt;

&lt;p&gt;Automate processing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loan and credit applications&lt;/li&gt;
&lt;li&gt;Financial reports and tax documents&lt;/li&gt;
&lt;li&gt;Contracts and agreements&lt;/li&gt;
&lt;li&gt;Compliance documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Reduce manual work, increase operational efficiency, reduce compliance risks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Logistics &amp;amp; Supply Chain
&lt;/h4&gt;

&lt;p&gt;Process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shipping documents and invoices&lt;/li&gt;
&lt;li&gt;Purchase orders and warehouse receipts&lt;/li&gt;
&lt;li&gt;Supplier contracts&lt;/li&gt;
&lt;li&gt;Customs certificates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Optimize processes, increase supply chain visibility.&lt;/p&gt;

&lt;h4&gt;
  
  
  Retail &amp;amp; E-commerce
&lt;/h4&gt;

&lt;p&gt;Automate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer orders&lt;/li&gt;
&lt;li&gt;Product catalogs and descriptions&lt;/li&gt;
&lt;li&gt;Invoices and receipts&lt;/li&gt;
&lt;li&gt;Marketing documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;: Personalize customer experience, process orders quickly and efficiently.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Combine BDA with Lambda?
&lt;/h3&gt;

&lt;p&gt;Combining BDA with AWS Lambda creates a powerful &lt;strong&gt;serverless IDP pipeline&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Event-Driven&lt;/strong&gt;: Automatically processes when new documents arrive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable&lt;/strong&gt;: Automatically scales with document volume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost-Effective&lt;/strong&gt;: Pay only when processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low Maintenance&lt;/strong&gt;: No server management needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration Ready&lt;/strong&gt;: Easy to integrate with existing systems&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Statistics and Impact
&lt;/h3&gt;

&lt;p&gt;According to AWS, organizations deploying IDP with generative AI have achieved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;90%+ reduction in document processing time&lt;/li&gt;
&lt;li&gt;60-70% cost savings in operations&lt;/li&gt;
&lt;li&gt;95%+ accuracy in data extraction&lt;/li&gt;
&lt;li&gt;80% reduction in manual intervention needs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AWS Services in This Guide
&lt;/h3&gt;

&lt;p&gt;This guide uses the following AWS services to build a serverless IDP pipeline:&lt;/p&gt;

&lt;h4&gt;
  
  
  Amazon Bedrock Data Automation
&lt;/h4&gt;

&lt;p&gt;Fully managed service providing intelligent document processing capabilities with generative AI. BDA automatically processes documents and extracts structured information without needing to orchestrate complex tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Document classification and extraction&lt;/li&gt;
&lt;li&gt;Multi-granularity analysis (document, page, element level)&lt;/li&gt;
&lt;li&gt;Generative summaries and descriptions&lt;/li&gt;
&lt;li&gt;Support for diverse formats: PDF, images, audio, video&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  AWS Lambda
&lt;/h4&gt;

&lt;p&gt;Serverless computing service that allows running code without managing servers. Lambda automatically scales and you only pay for compute time used.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role in solution&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Triggered when new document uploaded&lt;/li&gt;
&lt;li&gt;Calls BDA API to process document&lt;/li&gt;
&lt;li&gt;Handles and saves results&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Amazon S3 (Simple Storage Service)
&lt;/h4&gt;

&lt;p&gt;Highly scalable, durable, and secure object storage service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role in solution&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store input documents&lt;/li&gt;
&lt;li&gt;Store output results from BDA&lt;/li&gt;
&lt;li&gt;Trigger S3 events for Lambda&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Amazon CloudWatch
&lt;/h4&gt;

&lt;p&gt;Monitoring and observability service for AWS resources and applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role in solution&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collect logs from Lambda execution&lt;/li&gt;
&lt;li&gt;Monitor metrics (invocations, errors, duration)&lt;/li&gt;
&lt;li&gt;Troubleshooting and debugging&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  AWS IAM (Identity and Access Management)
&lt;/h4&gt;

&lt;p&gt;Securely manage access and permissions for AWS resources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role in solution&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create execution role for Lambda&lt;/li&gt;
&lt;li&gt;Grant permissions to S3, Bedrock, CloudWatch&lt;/li&gt;
&lt;li&gt;Security and access control&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Detailed Workflow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. USER uploads document
   │
   ▼
2. S3 Event Notification
   │
   ▼
3. Lambda Function triggered
   │
   ├─▶ Read document from S3
   │
   ├─▶ Call BDA InvokeDataAutomationAsync API
   │   │
   │   ▼
   │   BDA Processing:
   │   ├─ Document ingestion
   │   ├─ Structure analysis
   │   ├─ Content extraction (text, tables, figures)
   │   ├─ Semantic enrichment (AI summaries)
   │   └─ Result formation (JSON, Markdown, CSV)
   │
   ├─▶ Poll for completion status
   │
   └─▶ Save results to S3 output bucket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Difference from Traditional IDP
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Traditional IDP&lt;/th&gt;
&lt;th&gt;BDA-Powered IDP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rule-based extraction&lt;/td&gt;
&lt;td&gt;AI-powered understanding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Template dependency&lt;/td&gt;
&lt;td&gt;Template-free processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual training needed&lt;/td&gt;
&lt;td&gt;Pre-trained models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Limited format support&lt;/td&gt;
&lt;td&gt;Multi-format support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No semantic understanding&lt;/td&gt;
&lt;td&gt;Deep semantic analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fixed output structure&lt;/td&gt;
&lt;td&gt;Flexible, rich output&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 1: Environment Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1.1: Enable Model Access in Amazon Bedrock
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Enable access to foundation models required for BDA.&lt;/p&gt;

&lt;p&gt;Important: This is a mandatory step before using Amazon Bedrock Data Automation. Without enabling model access, BDA cannot process documents.&lt;/p&gt;

&lt;h4&gt;
  
  
  Models to Enable:
&lt;/h4&gt;

&lt;p&gt;We need to enable the following models for this guide:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Models&lt;/strong&gt;: All models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Models&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Claude 3.5 Haiku&lt;/li&gt;
&lt;li&gt;Claude 3 Sonnet&lt;/li&gt;
&lt;li&gt;Claude 3.5 Sonnet&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cohere Models&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Cohere Rerank 3.5&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Implementation Steps:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Open Amazon Bedrock Console&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Search for &lt;strong&gt;Bedrock&lt;/strong&gt; in the top search bar&lt;/li&gt;
&lt;li&gt;Click on &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3yufft1dxl7338dp6gx.png" alt=" " width="800" height="436"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Access Model Access&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the left menu (or navigation menu), click on &lt;strong&gt;Model access&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Modify model access&lt;/strong&gt; button (or &lt;strong&gt;Request model access&lt;/strong&gt; if first time)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv5ztvr7qrbbzkuetldem.png" alt=" " width="800" height="391"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Select Models&lt;/strong&gt;&lt;br&gt;
In the models list, select the following:&lt;br&gt;
&lt;strong&gt;Amazon Models&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check all Amazon models
&lt;strong&gt;Claude Models&lt;/strong&gt; (by Anthropic):&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;Claude 3.5 Haiku&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;Claude 3 Sonnet&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;Claude 3.5 Sonnet&lt;/strong&gt;
&lt;strong&gt;Cohere Models&lt;/strong&gt;:&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;Cohere Rerank 3.5&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4erqiazusdzwtapai438.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Submit Request&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click &lt;strong&gt;Next&lt;/strong&gt; button at bottom right
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F23mynau4gjsqr41thtlg.png" alt=" " width="800" height="365"&gt;
&lt;/li&gt;
&lt;li&gt;Review selected models&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Submit&lt;/strong&gt; to request access
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3zir5i7voj53q9f1huro.png" alt=" " width="800" height="383"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wait for Approval&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most models will be approved instantly (status: &lt;strong&gt;Access granted&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Some models may take a few minutes for provisioning&lt;/li&gt;
&lt;li&gt;Refresh page to see status updates&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When all models show status "Access granted", you are ready to continue.&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 1.2: Create S3 Buckets
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Create S3 buckets to store input documents and output results.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Log in to &lt;strong&gt;AWS Management Console&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Search for and select &lt;strong&gt;S3&lt;/strong&gt; service&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create bucket&lt;/strong&gt; button
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F427xau19sj6y5oyxb4qr.png" alt=" " width="800" height="387"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Create Input Bucket:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Enter information:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bucket name&lt;/strong&gt;: &lt;code&gt;bda-workshop-input-demo-xyz789&lt;/code&gt; (replace &lt;code&gt;xyz789&lt;/code&gt; with your random string to ensure bucket name is unique)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Block Public Access settings&lt;/strong&gt;: Keep default (Block all public access)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g7rh3xcowmpcfydo7vt.png" alt=" " width="800" height="393"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Scroll down and click &lt;strong&gt;Create bucket&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4y3owuqv8uqvkv66vlu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4y3owuqv8uqvkv66vlu.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Create Output Bucket:
&lt;/h4&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Repeat steps 3-5 with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bucket name&lt;/strong&gt;: &lt;code&gt;bda-workshop-output-demo-xyz789&lt;/code&gt; (use same suffix as input bucket)&lt;/li&gt;
&lt;li&gt;Other configurations keep default
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1u9dqutbfpz2ccc5acq4.png" alt=" " width="800" height="365"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Step 1.3: Create IAM Role for Lambda
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Create IAM role with sufficient permissions for Lambda to access S3 and Bedrock.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In AWS Console, search for and select &lt;strong&gt;IAM&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;In left menu, select &lt;strong&gt;Roles&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Create role&lt;/strong&gt; button&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxms9ujtoedj4zx8zqr3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxms9ujtoedj4zx8zqr3.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Select Trusted Entity:
&lt;/h4&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trusted entity type&lt;/strong&gt;: Select &lt;strong&gt;AWS service&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use case&lt;/strong&gt;: Select &lt;strong&gt;Lambda&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Next&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi13d3fkfu5dzu4duagl8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi13d3fkfu5dzu4duagl8.png" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Assign Permissions:
&lt;/h4&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In policy search box, find and select the following policies (check the checkbox):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AWSLambdaBasicExecutionRole&lt;/code&gt; (for CloudWatch Logs)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AmazonS3FullAccess&lt;/code&gt; (to read/write S3)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AmazonBedrockFullAccess&lt;/code&gt; (to use Bedrock BDA)
Security Note: In production environments, create custom policies with minimum necessary permissions instead of using FullAccess policies.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Next&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzfi0ijfewurk0v3wkbs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzfi0ijfewurk0v3wkbs.png" alt=" " width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Name and Create Role:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Role name&lt;/strong&gt;: &lt;code&gt;BDA-Lambda-ExecutionRole&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: `Execution role for BDA document processing Lambda function
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa42yhbfifnccc5bouoif.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;li&gt;Scroll down and click &lt;strong&gt;Create role&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgzrq6wtrjhtb4yvt1zw.png" alt=" " width="800" height="387"&gt;
---&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 1.4: Create BDA Project
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Create BDA project to configure document processing.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In AWS Console, search for and select &lt;strong&gt;Amazon Bedrock&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;In left menu, select &lt;strong&gt;Data Automation&lt;/strong&gt; &amp;gt; &lt;strong&gt;Set-up Project&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4sc6rz1uo1deccbr382l.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create project&lt;/strong&gt; button
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7byh7i6x3coy4w3qekh.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Configure Project:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Enter information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Project name&lt;/strong&gt;: &lt;code&gt;document-processing-project&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select ** Create Project**
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvd6e2tegw7sqz42mrv3d.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In &lt;strong&gt;Standard output configuration&lt;/strong&gt; section, configure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click &lt;strong&gt;Edit&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Granularity types&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Check &lt;strong&gt;DOCUMENT&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;PAGE&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;ELEMENT&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bounding box&lt;/strong&gt;: Select &lt;strong&gt;ENABLED&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generative field&lt;/strong&gt;: Select &lt;strong&gt;ENABLED&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text format types&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Check &lt;strong&gt;MARKDOWN&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Additional file format&lt;/strong&gt;: Select &lt;strong&gt;ENABLED&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkf541g5f4yqkty2t87n.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Save changes&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiopa86vo68dmtyda4knw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiopa86vo68dmtyda4knw.png" alt=" " width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wait for project status to change to &lt;strong&gt;Changes saved successfully.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Copy and save Project ARN&lt;/strong&gt; (will be used in Lambda configuration step)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click on project name to view details&lt;/li&gt;
&lt;li&gt;Copy ARN from project details page&lt;/li&gt;
&lt;li&gt;Example ARN: &lt;code&gt;arn:aws:bedrock:us-east-1:111111111111:data-automation-project/abc123xyz&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq9c1yb9camduroetrwli.png" alt=" " width="800" height="386"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Part 2: Create Lambda Function
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 2.1: Create Lambda Function
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In AWS Console, search for and select &lt;strong&gt;Lambda&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Create function&lt;/strong&gt; button&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujuu6vd2c8jqwwuok4hx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujuu6vd2c8jqwwuok4hx.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Select &lt;strong&gt;Author from scratch&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Enter information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Function name&lt;/strong&gt;: &lt;code&gt;BDA-Document-Processor&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime&lt;/strong&gt;: Select &lt;strong&gt;Python 3.12&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt;: Select &lt;strong&gt;x86_64&lt;/strong&gt;
#### Permissions:&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In &lt;strong&gt;Permissions&lt;/strong&gt; section, select &lt;strong&gt;Use an existing role&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Existing role&lt;/strong&gt;: Select &lt;code&gt;BDA-Lambda-ExecutionRole&lt;/code&gt; (role created in step 1.3)&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhesgg8gandjsk639tszl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhesgg8gandjsk639tszl.png" alt=" " width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Create function&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2j6kkdiintt7s1qobno.png" alt=" " width="800" height="387"&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Step 2.2: Configure Lambda Function
&lt;/h3&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Increase Timeout and Memory:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;In Lambda function page, select &lt;strong&gt;Configuration&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;General configuration&lt;/strong&gt; &amp;gt; Click &lt;strong&gt;Edit&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcyktb71ap3785wbnp4vu.png" alt=" " width="800" height="387"&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: &lt;code&gt;512 MB&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout&lt;/strong&gt;: &lt;code&gt;5 min&lt;/code&gt; (5 minutes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: &lt;code&gt;Processes documents using Bedrock Data Automation&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Save&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjsu1cdxhkm966lq4jl5.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;li&gt;Still in &lt;strong&gt;Configuration&lt;/strong&gt; tab, select &lt;strong&gt;Environment variables&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhe6ai64658g4prw5xgk7.png" alt=" " width="800" height="364"&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Edit&lt;/strong&gt; &amp;gt; Click &lt;strong&gt;Add environment variable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Add the following variables:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BDA_PROJECT_ARN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Paste ARN saved in step 1.4&lt;/td&gt;
&lt;td&gt;ARN of created BDA project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OUTPUT_BUCKET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;bda-workshop-output-demo-xyz789&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Output bucket name (replace with your bucket name)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;8.Click &lt;strong&gt;Save&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxtqe4m4k2nopqefpf82.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxtqe4m4k2nopqefpf82.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2.3: Write Lambda Code
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Return to &lt;strong&gt;Code&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;In editor, delete sample code and replace with the complete Lambda function code provided in the Lambda Function Code Reference section above (before Part 1).&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Deploy&lt;/strong&gt; to save code
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzdli2ynage7axx980fh.png" alt=" " width="800" height="406"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The code handles S3 events, validates bucket names, invokes BDA, and polls for completion status with comprehensive error handling.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 3: Configure S3 Trigger
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 3.1: Add S3 Trigger for Lambda
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Configure Lambda to automatically run when new files are uploaded to S3.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In Lambda function page, select &lt;strong&gt;Configuration&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Triggers&lt;/strong&gt; in left menu&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Add trigger&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wos3vgh0aruwz0i45e7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wos3vgh0aruwz0i45e7.png" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Select a source&lt;/strong&gt;: Select &lt;strong&gt;S3&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bucket&lt;/strong&gt;: Select &lt;code&gt;bda-workshop-input-demo-xyz789&lt;/code&gt; (input bucket created in step 1.2)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Event type&lt;/strong&gt;: Select &lt;strong&gt;All object create events&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prefix&lt;/strong&gt; (optional): Leave blank (process all files) or enter &lt;code&gt;documents/&lt;/code&gt; if only want to process files in documents folder&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Suffix&lt;/strong&gt; (optional): Enter &lt;code&gt;.pdf&lt;/code&gt; if only want to process PDF files&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0fjx7vroiznnubr5xpxu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0fjx7vroiznnubr5xpxu.png" alt=" " width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Check the checkbox &lt;strong&gt;I acknowledge that using the same S3 bucket...&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Add&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qnsftoga7ttzlfu5wu1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qnsftoga7ttzlfu5wu1.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Part 4: Testing and Deployment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 4.1: Prepare Test Document
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Download a PDF sample file to your computer (or use existing file)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Important - File Naming: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;File names should only contain: letters (a-z, A-Z), numbers (0-9), hyphens (-), underscores (_), dots (.)&lt;/li&gt;
&lt;li&gt;Should NOT use: spaces, special characters, Vietnamese characters&lt;/li&gt;
&lt;li&gt;Good file name examples: &lt;code&gt;document-test.pdf&lt;/code&gt;, &lt;code&gt;report_2024.pdf&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;File names to avoid: &lt;code&gt;report document.pdf&lt;/code&gt;, &lt;code&gt;document (1).pdf&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4.2: Upload and Test
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;S3 Console&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Go to bucket &lt;strong&gt;bda-workshop-input-demo-xyz789&lt;/strong&gt; (your input bucket)&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Upload&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add files&lt;/strong&gt; and select test PDF file&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Upload&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9d0ccz9005athyes4uvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9d0ccz9005athyes4uvk.png" alt=" " width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4.3: Check Lambda Execution
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Return to &lt;strong&gt;Lambda Console&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select function &lt;strong&gt;BDA-Document-Processor&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Monitor&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;View CloudWatch logs&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftpavb6ww8n0zyt921sh2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftpavb6ww8n0zyt921sh2.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click on latest log stream (most recent time)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;View logs to verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lambda was triggered&lt;/li&gt;
&lt;li&gt;Document being processed&lt;/li&gt;
&lt;li&gt;BDA invocation successful&lt;/li&gt;
&lt;li&gt;Processing completed&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xwevfsfxsb9oxxv1mo5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xwevfsfxsb9oxxv1mo5.png" alt=" " width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4.4: Check Output
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;S3 Console&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Go to bucket &lt;strong&gt;bda-workshop-output-demo-xyz789&lt;/strong&gt; (your output bucket)&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;processed/&lt;/strong&gt; folder&lt;/li&gt;
&lt;li&gt;You will see folder with timestamp and file name&lt;/li&gt;
&lt;li&gt;Inside that folder will have:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;job_metadata.json&lt;/code&gt; - job metadata&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;standard-output.json&lt;/code&gt; - main processing results&lt;/li&gt;
&lt;li&gt;Other files like markdown, CSV (if any)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi6hez750j7cfyeulgrh.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Download &lt;code&gt;standard-output.json&lt;/code&gt; and view results:

&lt;ul&gt;
&lt;li&gt;Document summary&lt;/li&gt;
&lt;li&gt;Extracted tables&lt;/li&gt;
&lt;li&gt;Figures&lt;/li&gt;
&lt;li&gt;Page-level information&lt;/li&gt;
&lt;li&gt;Element details&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Real-World Example: Government Birth Certificate Processing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use Case from AWS Blog
&lt;/h3&gt;

&lt;p&gt;A real-world example from &lt;a href="https://aws.amazon.com/blogs/machine-learning/intelligent-document-processing-using-amazon-bedrock-and-anthropic-claude/" rel="noopener noreferrer"&gt;AWS Machine Learning Blog&lt;/a&gt; illustrates how IDP with generative AI solves real problems:&lt;/p&gt;

&lt;h4&gt;
  
  
  The Problem
&lt;/h4&gt;

&lt;p&gt;Government agency issuing birth certificates receives applications through multiple channels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Online applications&lt;/li&gt;
&lt;li&gt;Forms completed at physical locations
&lt;/li&gt;
&lt;li&gt;Mailed paper applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Current Manual Process&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Scan paper applications&lt;/li&gt;
&lt;li&gt;Staff manually read and enter into system&lt;/li&gt;
&lt;li&gt;Check and validation&lt;/li&gt;
&lt;li&gt;Save to database&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Issues&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very time-consuming&lt;/li&gt;
&lt;li&gt;High labor costs&lt;/li&gt;
&lt;li&gt;Prone to manual data entry errors&lt;/li&gt;
&lt;li&gt;Not scalable when volume increases&lt;/li&gt;
&lt;li&gt;Complex if forms are in multiple languages (English, Spanish, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Solution with BDA
&lt;/h4&gt;

&lt;p&gt;With architecture similar to this guide, but with additions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SQS Queue&lt;/strong&gt;: Buffer to process messages reliably&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DynamoDB&lt;/strong&gt;: Store extracted data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language Support&lt;/strong&gt;: Automatically translate and extract&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;&lt;/code&gt;&lt;code&gt;&lt;br&gt;
Upload Form → S3 → Lambda → BDA → SQS → Lambda → DynamoDB&lt;br&gt;
                     ↓&lt;br&gt;
                 Auto-detect language&lt;br&gt;
                 Extract all fields&lt;br&gt;
                 Translate if needed&lt;br&gt;
&lt;/code&gt;&lt;code&gt;&lt;/code&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Results Achieved
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Before&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;15-20 minutes/application (manual)&lt;/li&gt;
&lt;li&gt;High labor costs&lt;/li&gt;
&lt;li&gt;85-90% accuracy (human error)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;After&lt;/strong&gt; (with BDA):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&amp;lt; 1 minute/application (automated)&lt;/li&gt;
&lt;li&gt;60-70% cost savings&lt;/li&gt;
&lt;li&gt;95%+ accuracy&lt;/li&gt;
&lt;li&gt;Process multiple languages&lt;/li&gt;
&lt;li&gt;Scale to thousands of applications/day&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Fields Extracted Automatically
&lt;/h4&gt;

&lt;p&gt;BDA can extract complex information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Applicant information (name, address, contact)&lt;/li&gt;
&lt;li&gt;Birth certificate recipient information&lt;/li&gt;
&lt;li&gt;Parent information&lt;/li&gt;
&lt;li&gt;Fee payment information&lt;/li&gt;
&lt;li&gt;Signatures and dates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bonus&lt;/strong&gt;: Automatically translate from Spanish to English&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Extending the Solution
&lt;/h3&gt;

&lt;p&gt;You can extend this solution in the following directions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add SQS Queue&lt;/strong&gt;: Buffer processing and retry logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add DynamoDB&lt;/strong&gt;: Store structured data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Extraction&lt;/strong&gt;: Define fields to extract for specific domain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language&lt;/strong&gt;: Process documents in multiple languages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-loop&lt;/strong&gt;: Validation for critical data&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Cleanup Resources (Optional)
&lt;/h2&gt;

&lt;p&gt;If you do not want to continue using and avoid incurring costs:&lt;/p&gt;

&lt;h3&gt;
  
  
  Delete Lambda Function:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Lambda Console &amp;gt; Select function &amp;gt; &lt;strong&gt;Actions&lt;/strong&gt; &amp;gt; &lt;strong&gt;Delete&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Delete S3 Buckets:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;S3 Console &amp;gt; Select bucket &amp;gt; &lt;strong&gt;Empty&lt;/strong&gt; (delete all objects)&lt;/li&gt;
&lt;li&gt;Then select bucket &amp;gt; &lt;strong&gt;Delete&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Delete IAM Role:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;IAM Console &amp;gt; Roles &amp;gt; Select role &amp;gt; &lt;strong&gt;Delete&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Delete BDA Project:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Bedrock Console &amp;gt; Data Automation &amp;gt; Projects &amp;gt; Select project &amp;gt; &lt;strong&gt;Delete&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Reference Documentation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AWS Documentation:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda Developer Guide&lt;/a&gt; - Complete Lambda guide&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/bda.html" rel="noopener noreferrer"&gt;Amazon Bedrock Data Automation&lt;/a&gt; - Official BDA documentation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html" rel="noopener noreferrer"&gt;Amazon Bedrock User Guide&lt;/a&gt; - Bedrock overview&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html" rel="noopener noreferrer"&gt;S3 Event Notifications&lt;/a&gt; - S3 trigger configuration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/bda-standard-output.html" rel="noopener noreferrer"&gt;Standard Output in BDA&lt;/a&gt; - Output configuration details&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  API References:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateDataAutomationProject.html" rel="noopener noreferrer"&gt;BDA CreateDataAutomationProject API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_InvokeDataAutomationAsync.html" rel="noopener noreferrer"&gt;BDA InvokeDataAutomationAsync API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetDataAutomationStatus.html" rel="noopener noreferrer"&gt;BDA GetDataAutomationStatus API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html" rel="noopener noreferrer"&gt;Lambda Invoke API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AWS Blogs and Case Studies:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  BDA-Specific:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/simplify-multimodal-generative-ai-with-amazon-bedrock-data-automation/" rel="noopener noreferrer"&gt;Simplify multimodal generative AI with Amazon Bedrock Data Automation&lt;/a&gt; - BDA introduction&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/unleashing-the-multimodal-power-of-amazon-bedrock-data-automation-to-transform-unstructured-data-into-actionable-insights/" rel="noopener noreferrer"&gt;Unleashing the multimodal power of Amazon Bedrock Data Automation&lt;/a&gt; - Advanced BDA use cases&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/aws/get-insights-from-multimodal-content-with-amazon-bedrock-data-automation-now-generally-available/" rel="noopener noreferrer"&gt;Get insights from multimodal content with Amazon Bedrock Data Automation&lt;/a&gt; - GA announcement&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  IDP with Generative AI:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/intelligent-document-processing-using-amazon-bedrock-and-anthropic-claude/" rel="noopener noreferrer"&gt;Intelligent document processing using Amazon Bedrock and Anthropic Claude&lt;/a&gt; - Real-world IDP example (direct Bedrock API approach)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/scalable-intelligent-document-processing-using-amazon-bedrock/" rel="noopener noreferrer"&gt;Scalable intelligent document processing using Amazon Bedrock&lt;/a&gt; - Scalability patterns&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/compute/building-serverless-document-processing-workflows/" rel="noopener noreferrer"&gt;Building serverless document processing workflows&lt;/a&gt; - Serverless architecture patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Advanced Topics:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/building-a-multimodal-rag-based-application-using-amazon-bedrock-data-automation-and-amazon-bedrock-knowledge-bases/" rel="noopener noreferrer"&gt;Building a multimodal RAG application with BDA&lt;/a&gt; - RAG with BDA&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/blogs/aws/new-amazon-bedrock-capabilities-enhance-data-processing-and-retrieval/" rel="noopener noreferrer"&gt;New Amazon Bedrock capabilities enhance data processing&lt;/a&gt; - Latest features&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Solution Guidance:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/solutions/guidance/multimodal-data-processing-using-amazon-bedrock-data-automation/" rel="noopener noreferrer"&gt;Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation&lt;/a&gt; - AWS Solutions Library&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Product Pages:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/bedrock/bda/" rel="noopener noreferrer"&gt;Amazon Bedrock Data Automation Product Page&lt;/a&gt; - Features and pricing&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;Amazon Bedrock&lt;/a&gt; - Main Bedrock page&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt; - Lambda features&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Video Resources:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/results?search_query=aws+reinvent+bedrock+data+automation" rel="noopener noreferrer"&gt;AWS re:Invent Sessions on Bedrock&lt;/a&gt; - Conference talks&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/events/online-tech-talks/" rel="noopener noreferrer"&gt;AWS Online Tech Talks&lt;/a&gt; - Technical webinars&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Congratulations! You have successfully completed and created a serverless document processing pipeline with AWS Lambda and Amazon Bedrock Data Automation!&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary of What You Learned
&lt;/h3&gt;

&lt;p&gt;In this article, we covered:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding BDA and IDP&lt;/strong&gt;: Learned about Intelligent Document Processing with generative AI and benefits of Amazon Bedrock Data Automation&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Building Infrastructure&lt;/strong&gt;: Created S3 buckets, IAM roles, and BDA project through AWS Console&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Developing Lambda Function&lt;/strong&gt;: Wrote Python code to integrate with BDA API&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Event-Driven Architecture&lt;/strong&gt;: Set up S3 event triggers for automated processing&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Testing and Monitoring&lt;/strong&gt;: Deployed, tested, and monitored solution with CloudWatch&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Troubleshooting&lt;/strong&gt;: Handled common issues and best practices  &lt;/p&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;p&gt;Now that you have a solid foundation, continue exploring:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Expand to multimodal&lt;/strong&gt;: Try processing images, audio, and video with BDA&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate downstream systems&lt;/strong&gt;: Connect with DynamoDB, API Gateway, or business applications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced patterns&lt;/strong&gt;: Implement human-in-the-loop workflows, model evaluation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production readiness&lt;/strong&gt;: Error handling, DLQ, cost optimization, security hardening&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explore RAG&lt;/strong&gt;: Build multimodal RAG applications with BDA and Knowledge Bases&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Acknowledgments
&lt;/h3&gt;

&lt;p&gt;Thank you for reading this article! Hope this guide helps you start your journey with Amazon Bedrock Data Automation. Wish you success in building intelligent document processing solutions!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>awslambda</category>
      <category>awsbedrockdataautomation</category>
    </item>
    <item>
      <title>Building an Intelligent Document Processing Pipeline on AWS with S3, Textract, Comprehend, and DynamoDB</title>
      <dc:creator>Lam Bùi</dc:creator>
      <pubDate>Sat, 04 Oct 2025 15:48:43 +0000</pubDate>
      <link>https://forem.com/lamkhac/building-an-intelligent-document-processing-pipeline-with-aws-s3-textract-comprehend--276g</link>
      <guid>https://forem.com/lamkhac/building-an-intelligent-document-processing-pipeline-with-aws-s3-textract-comprehend--276g</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In today's digital world, organizations are drowning in unstructured data. Documents, images, and PDFs contain valuable information that often remains untapped due to the manual effort required to extract and analyze it. What if we could automatically process these documents, extract meaningful insights, and store structured data for further analysis?&lt;/p&gt;

&lt;p&gt;This blog post will guide you through building a complete &lt;strong&gt;intelligent document processing pipeline&lt;/strong&gt; using AWS services. Our pipeline will automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract text&lt;/strong&gt; from images and PDFs using Amazon Textract&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze content&lt;/strong&gt; for entities, key phrases, and sentiment using Amazon Comprehend
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store structured results&lt;/strong&gt; in DynamoDB for easy querying and analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process documents&lt;/strong&gt; automatically when uploaded to S3&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;Our pipeline creates a seamless flow where:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Documents are uploaded&lt;/strong&gt; to an S3 bucket (images, PDFs, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda function triggers&lt;/strong&gt; automatically when new files arrive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Textract extracts text&lt;/strong&gt; and identifies document layout&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comprehend analyzes&lt;/strong&gt; the extracted text for insights&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Results are stored&lt;/strong&gt; in DynamoDB with structured metadata&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsocstf61w20geoxbx9ul.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsocstf61w20geoxbx9ul.png" alt=" " width="701" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  System Requirements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lambda Runtime&lt;/strong&gt;: Python 3.10&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: 1024 MB (recommended)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout&lt;/strong&gt;: 120 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment Variables&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DDB_TABLE&lt;/code&gt;: SmartDocResults (default)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LANG&lt;/code&gt;: en (default)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step-by-Step Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create DynamoDB Table
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to AWS Console → DynamoDB&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create table"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5pepup9jvd0ll859bx71.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Table name&lt;/strong&gt;: &lt;code&gt;SmartDocResults&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partition key&lt;/strong&gt;: &lt;code&gt;doc_id&lt;/code&gt; (String)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sort key&lt;/strong&gt;: &lt;code&gt;paragraph_id&lt;/code&gt; (String)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9598o67g4zc9rcj559xx.png" alt=" " width="800" height="387"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create table"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Wait for table status = &lt;strong&gt;"Active"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38ju6uj0rhgz461mx72r.png" alt=" " width="800" height="387"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 2: Create S3 Bucket
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;AWS Console → S3&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create bucket"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfsph0imiyktacdus8kt.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bucket name&lt;/strong&gt;: &lt;code&gt;your-smart-doc-bucket&lt;/code&gt; (change to unique name)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Region&lt;/strong&gt;: Choose your preferred region
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhnh3ur1pkee81yfqqjv3.png" alt=" " width="800" height="391"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create bucket"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Remember the bucket name for IAM policy&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 3: Create IAM Policy
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;AWS Console → IAM → Policies&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create policy"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0d3yvonyvbao2jpooxr.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;li&gt;Switch to &lt;strong&gt;"JSON"&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;Copy content from &lt;code&gt;iam_policy.json&lt;/code&gt; and replace placeholders:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ACCOUNT_ID&lt;/code&gt;: Your AWS account ID&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REGION&lt;/code&gt;: Your region (e.g., us-east-1)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BUCKET_NAME&lt;/code&gt;: S3 bucket name from step 2
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvoag6qo40brxzr0rewsk.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Next: Tags"&lt;/strong&gt; → &lt;strong&gt;"Next: Review"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Name the policy: &lt;code&gt;SmartDocLambdaPolicy&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1j1myv6x13sos5hdsjd8.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create policy"&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Least-privilege IAM Policy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"S3Access"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3:::BUCKET_NAME/*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TextractAccess"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"textract:AnalyzeDocument"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ComprehendAccess"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"comprehend:DetectEntities"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"comprehend:DetectKeyPhrases"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"comprehend:DetectSentiment"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DynamoDBAccess"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:PutItem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:GetItem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:Query"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:Scan"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:dynamodb:REGION:ACCOUNT_ID:table/SmartDocResults"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CloudWatchLogs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"logs:CreateLogGroup"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"logs:CreateLogStream"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"logs:PutLogEvents"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:logs:REGION:ACCOUNT_ID:*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Create IAM Role for Lambda
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;AWS Console → IAM → Roles&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create role"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F586flk2q1szy0r5jkx70.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;"AWS service"&lt;/strong&gt; → &lt;strong&gt;"Lambda"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Next"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb32msslkuuayjo4plfg.png" alt=" " width="800" height="386"&gt;
&lt;/li&gt;
&lt;li&gt;In &lt;strong&gt;"Permissions"&lt;/strong&gt; tab:

&lt;ul&gt;
&lt;li&gt;Find and select the &lt;code&gt;SmartDocLambdaPolicy&lt;/code&gt; you just created&lt;/li&gt;
&lt;li&gt;Check the policy
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs4bwm5e54f682192zsit.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Next: Tags"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Name the role: &lt;code&gt;SmartDocLambdaRole&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faaw6p8akqdnftrxau6ib.png" alt=" " width="800" height="390"&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create role"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqsqy6ek49s0lt17dh10.png" alt=" " width="800" height="385"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 5: Create Lambda Function
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;AWS Console → Lambda → Functions&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create function"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;"Author from scratch"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Function name&lt;/strong&gt;: &lt;code&gt;SmartDocProcessor&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime&lt;/strong&gt;: Python 3.10&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture&lt;/strong&gt;: x86_64
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbj0l6n8l21c48valz5al.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Change default execution role&lt;/strong&gt;: Select &lt;strong&gt;"Use an existing role"&lt;/strong&gt; → &lt;code&gt;SmartDocLambdaRole&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Create function"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwt407860ikdik4e671l2.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 6: Configure Lambda Function
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In Lambda function, scroll to &lt;strong&gt;"Code"&lt;/strong&gt; section&lt;/li&gt;
&lt;li&gt;Delete default code and paste content from &lt;code&gt;lambda_function.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Deploy"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiih7of6idd9z3ihp2sx7.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;li&gt;Configure &lt;strong&gt;"Configuration"&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;General&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: 1024 MB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout&lt;/strong&gt;: 2 minutes
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoy2iuqurt5gyl34evhp.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variables&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DDB_TABLE&lt;/code&gt;: &lt;code&gt;SmartDocResults&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LANG&lt;/code&gt;: &lt;code&gt;en&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F67y3ne4mmofns9u32xh5.png" alt=" " width="800" height="336"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;unquote_plus&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;decimal&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Decimal&lt;/span&gt;

&lt;span class="c1"&gt;# Configure logging
&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize AWS clients
&lt;/span&gt;&lt;span class="n"&gt;textract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textract&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;comprehend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;comprehend&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dynamodb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;dynamodb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Environment variables
&lt;/span&gt;&lt;span class="n"&gt;DDB_TABLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;DDB_TABLE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SmartDocResults&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;LANG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LANG&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Main Lambda handler for S3 → Textract → Comprehend → DynamoDB pipeline
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing event: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Get DynamoDB table
&lt;/span&gt;    &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dynamodb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DDB_TABLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Process each S3 record
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Extract S3 information
&lt;/span&gt;            &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;unquote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;doc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing document: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; from bucket: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Step 1: Extract text using Textract
&lt;/span&gt;            &lt;span class="n"&gt;text_lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_text_from_s3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;text_lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No text extracted from &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="c1"&gt;# Step 2: Split into paragraphs
&lt;/span&gt;            &lt;span class="n"&gt;paragraphs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;split_paragraphs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Found &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; paragraphs in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Step 3: Process each paragraph with Comprehend
&lt;/span&gt;            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;paragraph_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;paragraph&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# Only process paragraphs with &amp;gt;= 20 characters
&lt;/span&gt;                    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing paragraph &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;paragraph_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="c1"&gt;# Analyze with Comprehend
&lt;/span&gt;                    &lt;span class="n"&gt;entities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_entities_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;key_phrases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_key_phrases_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;sentiment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;safe_detect_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="c1"&gt;# Convert float values to Decimal for DynamoDB
&lt;/span&gt;                    &lt;span class="n"&gt;entities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;convert_floats_to_decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;key_phrases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;convert_floats_to_decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key_phrases&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="c1"&gt;# Save to DynamoDB
&lt;/span&gt;                    &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;doc_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;paragraph_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# Convert to string
&lt;/span&gt;                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;entities&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key_phrases&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key_phrases&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sentiment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;created_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Z&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;

                    &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved paragraph &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;paragraph_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; to DynamoDB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skipping paragraph &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;paragraph_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (too short: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paragraph&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; chars)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed document: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing record &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# Continue processing other records
&lt;/span&gt;            &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Processing completed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_text_from_s3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Extract text from S3 object using Textract synchronous API
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;textract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S3Object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;FeatureTypes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LAYOUT&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Extract LINE blocks and sort by reading order
&lt;/span&gt;        &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;line_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Blocks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BlockType&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LINE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Sort by reading order (top to bottom, left to right)
&lt;/span&gt;        &lt;span class="n"&gt;line_blocks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Geometry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BoundingBox&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Top&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; 
                                      &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Geometry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;BoundingBox&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;line_blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extracted &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; lines from document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error extracting text from &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;split_paragraphs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Split lines into paragraphs based on spacing and punctuation rules
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="n"&gt;paragraphs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;current_paragraph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Empty line - end current paragraph
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="n"&gt;current_paragraph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Line ends with period - add to current paragraph and end it
&lt;/span&gt;            &lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;current_paragraph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Regular line - add to current paragraph
&lt;/span&gt;            &lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add final paragraph if exists
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_paragraph&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;paragraphs&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_entities_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language_code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Safely detect entities using Comprehend with error handling
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comprehend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_entities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;LanguageCode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;language_code&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Entities&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error detecting entities: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_key_phrases_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language_code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Safely detect key phrases using Comprehend with error handling
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comprehend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_key_phrases&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;LanguageCode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;language_code&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;KeyPhrases&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error detecting key phrases: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_detect_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language_code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Safely detect sentiment with chunking for large texts
    Comprehend has a 5000 byte limit for DetectSentiment
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Check if text is small enough for single call
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;4500&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comprehend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;LanguageCode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;language_code&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sentiment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# For large texts, chunk and aggregate
&lt;/span&gt;        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Text too large for single sentiment call, chunking...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chunk_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Leave some buffer
&lt;/span&gt;        &lt;span class="n"&gt;sentiments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comprehend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detect_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;LanguageCode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;language_code&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sentiment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error detecting sentiment for chunk: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;UNKNOWN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Aggregate sentiments (simple majority vote)
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;aggregate_sentiments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error detecting sentiment: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;UNKNOWN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chunk_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Split text into chunks that don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t exceed max_bytes
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;current_chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;test_chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_chunk&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_chunk&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;max_bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;current_chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_chunk&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;current_chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;aggregate_sentiments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Aggregate multiple sentiment results into a single sentiment
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;UNKNOWN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

    &lt;span class="c1"&gt;# Count sentiment occurrences
&lt;/span&gt;    &lt;span class="n"&gt;sentiment_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sentiment&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sentiments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;sentiment_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sentiment_counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="c1"&gt;# Return the most common sentiment
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentiment_counts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sentiment_counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;convert_floats_to_decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Recursively convert float values to Decimal for DynamoDB compatibility
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;convert_floats_to_decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;convert_floats_to_decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 7: Add S3 Trigger
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In Lambda function → &lt;strong&gt;"Add trigger"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnzuhztabw7dsz4xbz7iw.png" alt=" " width="800" height="364"&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;"S3"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bucket&lt;/strong&gt;: Select bucket created in step 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event type&lt;/strong&gt;: &lt;code&gt;All object create events&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefix&lt;/strong&gt;: (leave empty or add folder path)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suffix&lt;/strong&gt;: (leave empty)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnoh42e5l3vrn4ql61im.png" alt=" " width="800" height="389"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Check &lt;strong&gt;"Recursive invocation"&lt;/strong&gt; if you want to process files in subfolders&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Add"&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwdivhebylzokneasaql.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 8: Test with Sample File
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Upload a test file to your S3 bucket:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supported file types&lt;/strong&gt;: JPG, PNG, PDF (1 page)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Size&lt;/strong&gt;: &amp;lt; 10MB (Textract sync limit)
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8yuksl1o6swkeczf13y.png" alt=" " width="800" height="388"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Check CloudWatch Logs:

&lt;ul&gt;
&lt;li&gt;Lambda function → &lt;strong&gt;"Monitor"&lt;/strong&gt; → &lt;strong&gt;"View CloudWatch logs"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Look for log group: &lt;code&gt;/aws/lambda/SmartDocProcessor&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgtf38z5ol8799ljghwv.png" alt=" " width="800" height="367"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Check DynamoDB:

&lt;ul&gt;
&lt;li&gt;DynamoDB → Tables → &lt;code&gt;SmartDocResults&lt;/code&gt; → &lt;strong&gt;"Items"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;View created items
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv9ua7tersjy3yev5joe.png" alt=" " width="800" height="389"&gt;
### Step 9: Verify Results&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Sample DynamoDB Item
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"doc_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sample-document.pdf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"paragraph_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This is the first paragraph of the document with important information."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"entities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sample-document.pdf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OTHER"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"key_phrases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"important information"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.99&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sentiment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"POSITIVE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"created_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-01-15T10:30:00.000Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  How to Verify
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch Logs&lt;/strong&gt;: View logs for debugging and monitoring processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DynamoDB Console&lt;/strong&gt;: View items with complete information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3 Console&lt;/strong&gt;: Confirm file was uploaded&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common Issues:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Permission denied&lt;/strong&gt;: Check IAM role has sufficient permissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Textract error&lt;/strong&gt;: File too large or unsupported format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DynamoDB error&lt;/strong&gt;: Check table name and permissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout&lt;/strong&gt;: Increase memory or timeout for Lambda&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comprehend error&lt;/strong&gt;: Check text language and size limits&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Debug Tips:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;View CloudWatch Logs for specific errors&lt;/li&gt;
&lt;li&gt;Verify IAM policy has correct placeholders&lt;/li&gt;
&lt;li&gt;Confirm S3 bucket and DynamoDB table exist&lt;/li&gt;
&lt;li&gt;Test Textract and Comprehend services separately&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next Steps &amp;amp; Advanced Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Multi-page PDF Processing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use Textract async API (&lt;code&gt;StartDocumentAnalysis&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Combine with SNS for completion notifications&lt;/li&gt;
&lt;li&gt;Modify Lambda to handle SNS events&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Custom Comprehend Models
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create custom entity recognizers for specific domains&lt;/li&gt;
&lt;li&gt;Use custom classification models&lt;/li&gt;
&lt;li&gt;Improve analysis accuracy&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Security Enhancements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use KMS for DynamoDB data encryption&lt;/li&gt;
&lt;li&gt;Enable S3 server-side encryption&lt;/li&gt;
&lt;li&gt;Deploy Lambda in VPC for high security&lt;/li&gt;
&lt;li&gt;Use IAM roles instead of access keys&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Monitoring &amp;amp; Alerting
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Set up CloudWatch alarms&lt;/li&gt;
&lt;li&gt;Create dashboards to monitor pipeline&lt;/li&gt;
&lt;li&gt;Use X-Ray for request tracing&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You've successfully built an intelligent document processing pipeline that can automatically extract and analyze content from various document types. This solution provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automated processing&lt;/strong&gt; of uploaded documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligent text extraction&lt;/strong&gt; using AWS Textract&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content analysis&lt;/strong&gt; with entity recognition, key phrase extraction, and sentiment analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured data storage&lt;/strong&gt; in DynamoDB for easy querying&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable architecture&lt;/strong&gt; that can handle varying workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This pipeline can be extended for various use cases such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Document classification and routing&lt;/li&gt;
&lt;li&gt;Content moderation&lt;/li&gt;
&lt;li&gt;Data extraction for business intelligence&lt;/li&gt;
&lt;li&gt;Automated compliance checking&lt;/li&gt;
&lt;li&gt;Customer support ticket analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The combination of AWS services provides a robust, scalable solution that can grow with your needs while maintaining security and cost-effectiveness.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>lambda</category>
      <category>dynamodb</category>
    </item>
    <item>
      <title>Modernize Mainframe Applications with AWS Transform</title>
      <dc:creator>Lam Bùi</dc:creator>
      <pubDate>Tue, 23 Sep 2025 02:51:25 +0000</pubDate>
      <link>https://forem.com/lamkhac/modernize-mainframe-applications-with-aws-transform-22ni</link>
      <guid>https://forem.com/lamkhac/modernize-mainframe-applications-with-aws-transform-22ni</guid>
      <description>&lt;h1&gt;
  
  
  I. Concepts
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. What is AWS Transform?
&lt;/h2&gt;

&lt;p&gt;AWS Transform is a new AWS service that uses agentic AI to automate modernization and migration of infrastructure, applications, and source code to AWS. It reduces the time, effort, and risk of handling legacy or complex systems such as mainframes, VMware, and .NET workloads by analyzing, refactoring, and redeploying on cloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Primary Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mainframe Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Transform COBOL, JCL, CICS, Db2, VSAM into modern Java applications.&lt;/li&gt;
&lt;li&gt;Automatically analyze code, extract business logic, generate technical documents, and refactor.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  VMware Migration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Migrate virtual machines from on‑premises VMware to Amazon EC2.&lt;/li&gt;
&lt;li&gt;Auto-generate migration plans, configure VPC networking, and convert VMs to EC2.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  .NET Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Upgrade from legacy .NET Framework to cross‑platform .NET.&lt;/li&gt;
&lt;li&gt;Integrates with Visual Studio; automatically resolves dependencies, builds, and tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Migration Assessment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pre-migration cost and feasibility assessment.&lt;/li&gt;
&lt;li&gt;AI-driven recommendations for EC2 sizing, BYOL optimization, and cost analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Key Capabilities
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Agentic AI with human-in-the-loop (HITL) for complex tasks and decisions.&lt;/li&gt;
&lt;li&gt;End‑to‑end automation: discovery → planning → code refactor → deployment.&lt;/li&gt;
&lt;li&gt;Flexible collaboration: role-based access (Admin, Approver, Contributor, Reader).&lt;/li&gt;
&lt;li&gt;Workspaces to isolate projects, with dedicated jobs and connectors.&lt;/li&gt;
&lt;li&gt;Worklog &amp;amp; dashboards to track progress and decisions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Typical Workflow
&lt;/h2&gt;

&lt;p&gt;1) Environment setup&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an AWS account with administrative permissions.&lt;/li&gt;
&lt;li&gt;Enable the AWS Transform service.&lt;/li&gt;
&lt;li&gt;Configure AWS IAM Identity Center for users and groups.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2) Create a Workspace&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated per transformation project.&lt;/li&gt;
&lt;li&gt;Contains connectors, jobs, and artifacts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3) Create a Connector&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect to target environments (VMware account, GitHub repository, other AWS accounts).&lt;/li&gt;
&lt;li&gt;Configure IAM policies and encryption with KMS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;4) Create a Job&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose job type: VMware Migration, Mainframe Modernization, .NET Transform, etc.&lt;/li&gt;
&lt;li&gt;Upload inputs (RVTools export, source code, JCL, spreadsheets, Git repo, etc.).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;5) Execute Workflow steps&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VMware: Discovery → Grouping → Generate VPC → Replicate &amp;amp; Migrate&lt;/li&gt;
&lt;li&gt;Mainframe: Analyze COBOL → Generate Docs → Extract Logic → Refactor → Build Java&lt;/li&gt;
&lt;li&gt;.NET: Connect Git → Resolve Dependencies → Transform Code → Test &amp;amp; Build&lt;/li&gt;
&lt;li&gt;Assessment: Upload file → Cost estimation → Migration feasibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;6) Track Progress&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Worklog: detailed action log.&lt;/li&gt;
&lt;li&gt;Dashboard: overall job status and KPIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Security &amp;amp; Governance
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Data encrypted by default using AWS KMS.&lt;/li&gt;
&lt;li&gt;Option to use customer‑managed KMS keys for tighter control.&lt;/li&gt;
&lt;li&gt;Comprehensive IAM policies for access control, connector creation, and deployments.&lt;/li&gt;
&lt;li&gt;Supports cross‑region processing and compliance validation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Roles
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Core Permissions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Admin&lt;/td&gt;
&lt;td&gt;Full control: create workspace, job, connector; approve HITL steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approver&lt;/td&gt;
&lt;td&gt;Near‑admin; cannot modify users, connectors, or workspaces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contributor&lt;/td&gt;
&lt;td&gt;Execute jobs, upload files, respond to HITL; cannot approve critical steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reader&lt;/td&gt;
&lt;td&gt;Read‑only access to progress and artifacts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  7. Highlights by Use Case
&lt;/h2&gt;

&lt;h3&gt;
  
  
  VMware Migration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AI‑generated AWS VPC topology mirroring on‑prem networks.&lt;/li&gt;
&lt;li&gt;Automatic wave planning from discovery data.&lt;/li&gt;
&lt;li&gt;Import from NSX or RVTools via collector or direct file upload.&lt;/li&gt;
&lt;li&gt;Supports Hub‑and‑Spoke and Isolated VPC topologies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mainframe Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Full support for COBOL, JCL, Db2, and VSAM.&lt;/li&gt;
&lt;li&gt;Clear stages: analyze → extract logic → refactor to Java.&lt;/li&gt;
&lt;li&gt;Re‑run per stage for iterative validation.&lt;/li&gt;
&lt;li&gt;Outputs modern Java applications deployable on AWS.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  .NET Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Connect GitHub repository to retrieve code.&lt;/li&gt;
&lt;li&gt;Automatically resolve missing libraries and dependencies; refactor to .NET 6/7/8.&lt;/li&gt;
&lt;li&gt;Integrate with Visual Studio for review and edits.&lt;/li&gt;
&lt;li&gt;Build and deploy to test environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. Core Benefits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Accelerate migration/modernization by 50–80%.&lt;/li&gt;
&lt;li&gt;Reduce manual errors on complex workloads.&lt;/li&gt;
&lt;li&gt;Preserve critical business logic via intelligent extraction.&lt;/li&gt;
&lt;li&gt;Scale for large organizations: multi‑account, multi‑region, IAM governance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  II. Step‑by‑Step Guides
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1) Modernize Mainframe Applications with AWS Transform
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Objectives
&lt;/h3&gt;

&lt;p&gt;Use AWS Transform to modernize mainframe COBOL applications (commonly on IBM z/OS) by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Analyzing COBOL/JCL/CICS code.&lt;/li&gt;
&lt;li&gt;Generating technical documentation and extracting business logic.&lt;/li&gt;
&lt;li&gt;Decomposing a monolith into functional domains.&lt;/li&gt;
&lt;li&gt;Refactoring and transforming code into modern Java.&lt;/li&gt;
&lt;li&gt;Planning migration waves.&lt;/li&gt;
&lt;li&gt;Creating deployment pipelines on AWS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The journey follows four phases: Assess → Mobilize → Migrate &amp;amp; Modernize → Operate &amp;amp; Optimize.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 Procedure
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Step 1: Prepare Source Code
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Retrieve the demo repository CardDemo: &lt;a href="https://github.com/aws-samples/aws-mainframe-modernization-carddemo" rel="noopener noreferrer"&gt;https://github.com/aws-samples/aws-mainframe-modernization-carddemo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Select Code → Download ZIP.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzbdbncxzadqx871sa887.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzbdbncxzadqx871sa887.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After extracting, locate the app/ folder with multiple subfolders.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2mg335q8byaela8vvff.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2mg335q8byaela8vvff.png" alt=" " width="800" height="601"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 2: Keep Only Required Folders
&lt;/h4&gt;

&lt;p&gt;Keep these under app/:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;✅ Keep&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;app/cbl/&lt;/td&gt;
&lt;td&gt;COBOL programs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;app/jcl/&lt;/td&gt;
&lt;td&gt;Job Control Language scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;app/cpy/&lt;/td&gt;
&lt;td&gt;Copybooks (shared data definitions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;app/bms/&lt;/td&gt;
&lt;td&gt;CICS screen maps (if present)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;app/data/&lt;/td&gt;
&lt;td&gt;EBCDIC data (test or demo)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;app/proc/&lt;/td&gt;
&lt;td&gt;JCL procedures and helper programs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Remove:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;❌ Remove&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;catlg/, ctl/, csd/, cpy-bms/&lt;/td&gt;
&lt;td&gt;Not essential for core analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs9ez43eqnt9x6dc6noi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs9ez43eqnt9x6dc6noi.png" alt=" " width="800" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 3: Zip the Source
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Rename folder app to mainframe-src, then zip it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsnmohvhe9oufik7ojn5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsnmohvhe9oufik7ojn5z.png" alt=" " width="800" height="605"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 4: Upload to S3
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Open AWS Console → S3: &lt;a href="https://s3.console.aws.amazon.com/s3/" rel="noopener noreferrer"&gt;https://s3.console.aws.amazon.com/s3/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Create a bucket (e.g., lam-mainframe-transform).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgadn3r17r33uaxowkuyo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgadn3r17r33uaxowkuyo.png" alt=" " width="800" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload mainframe-src.zip to the bucket.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnodkdkfwjzxspinamgof.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnodkdkfwjzxspinamgof.png" alt=" " width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 5: Create Workspace and Job
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Open the AWS Transform portal and sign in via IAM Identity Center.&lt;/li&gt;
&lt;li&gt;Open your workspace or create a new one.&lt;/li&gt;
&lt;li&gt;Select Create a job.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5u72yp3e7cinnikq34p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5u72yp3e7cinnikq34p.png" alt=" " width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose job type: Mainframe Modernization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvykgvujkp1fch69xx36h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvykgvujkp1fch69xx36h.png" alt=" " width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;To experience the full pipeline, select: Analyze code, Generate technical documentation, Decompose code, Plan migration sequence, and Transform code to Java.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Name the job and create it.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq94mg72fil3u96bt5at.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq94mg72fil3u96bt5at.png" alt=" " width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ooftqdg9xxg8ytgrivt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ooftqdg9xxg8ytgrivt.png" alt=" " width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 6: Start Modernization
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Skip Add collaborators and proceed to Connect to AWS account.&lt;/li&gt;
&lt;li&gt;Enter your AWS Account ID and select Next.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetlc2zrgyyr26qlrhti1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetlc2zrgyyr26qlrhti1.png" alt=" " width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In S3 console, copy the Bucket ARN of lam-mainframe-transform and paste it in AWS Transform to create a connector. Select Create connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u5yga3p6tx89mmrv619.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u5yga3p6tx89mmrv619.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Copy the verification link, open it in a new tab, and approve access to S3 for AWS Transform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqerpvw0x2yunqog7x8tj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqerpvw0x2yunqog7x8tj.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After admin approval, select Send to AWS Transform to continue.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrn0pzg0ex3le3b49qtr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrn0pzg0ex3le3b49qtr.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enter the S3 URI of mainframe-src.zip and select Send to AWS Transform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fescs3h3os4j7395dod4i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fescs3h3os4j7395dod4i.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 7: Analyze COBOL Code
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;When Analyze code completes, review results and select Send to AWS Transform to proceed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxy6ajy10vz1is826mc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxy6ajy10vz1is826mc1.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 8: Generate Technical Documentation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Open Summary and select all folders in Selected Files.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq4edvle6mks45fxwow7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq4edvle6mks45fxwow7.png" alt=" " width="800" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Send to AWS Transform to begin Decompose code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6z6vblat4h1u9r7a8au.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6z6vblat4h1u9r7a8au.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When complete, view the generated documentation in: s3://lam-mainframe-transform/transform-output/&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Step 9: Decompose into Domains
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Create a domain classification. Each domain represents a separate functional area.&lt;/li&gt;
&lt;li&gt;Recommended initial domains and seed files:&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;Seed Files&lt;/th&gt;
&lt;th&gt;Cyclomatic Complexity&lt;/th&gt;
&lt;th&gt;Description (EN)&lt;/th&gt;
&lt;th&gt;Reason for Selection&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Customer Domain&lt;/td&gt;
&lt;td&gt;CBCUS01C.cbl, COUSR00C.cbl&lt;/td&gt;
&lt;td&gt;20 / 102&lt;/td&gt;
&lt;td&gt;Customer and user information handling&lt;/td&gt;
&lt;td&gt;Key customer/user management, moderate complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transaction Domain&lt;/td&gt;
&lt;td&gt;CBTRN01C.cbl, COTRN00C.cbl&lt;/td&gt;
&lt;td&gt;59 / 100&lt;/td&gt;
&lt;td&gt;Processing and managing financial transactions&lt;/td&gt;
&lt;td&gt;Core transaction logic, high complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Card Domain&lt;/td&gt;
&lt;td&gt;COCRDLIC.cbl, COCRDUPC.cbl&lt;/td&gt;
&lt;td&gt;144 / 159&lt;/td&gt;
&lt;td&gt;Card data listing and updates&lt;/td&gt;
&lt;td&gt;Primary card ops, high complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Account Domain&lt;/td&gt;
&lt;td&gt;CBACT01C.cbl, CBACT04C.cbl&lt;/td&gt;
&lt;td&gt;1 / 98&lt;/td&gt;
&lt;td&gt;Account-level data and operations&lt;/td&gt;
&lt;td&gt;Foundational account logic, moderate complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Admin Domain&lt;/td&gt;
&lt;td&gt;COADM01C.cbl&lt;/td&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;Administrative features and controls&lt;/td&gt;
&lt;td&gt;Focused module, clearly separable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Common/Utility&lt;/td&gt;
&lt;td&gt;CBSTM03A.cbl, CSUTLDTC.cbl&lt;/td&gt;
&lt;td&gt;1 / 11&lt;/td&gt;
&lt;td&gt;Shared utilities and helper functions&lt;/td&gt;
&lt;td&gt;Commonly reused lightweight utilities&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;In AWS Transform, choose Actions → Create domain.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5llniwq1bs6tn617o51b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5llniwq1bs6tn617o51b.png" alt=" " width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create Customer Domain with description: Handles customer and user information.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rlic0bckdrwa3jq5ieu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rlic0bckdrwa3jq5ieu.png" alt=" " width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select CBTRN01C.cbl and choose Mark as seed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkz9mnujejzcraqe1k8h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkz9mnujejzcraqe1k8h.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add COUSR00C.cbl as a seed file as well. Select Create.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9k5us25yviy9gcpotlf7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9k5us25yviy9gcpotlf7.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Save to finalize the domain.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flq78o7iasxm2atwxns0q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flq78o7iasxm2atwxns0q.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repeat for other domains listed above. For Common/Utility Domain, enable Set as a common domain.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7fakugwg376krchqd1nf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7fakugwg376krchqd1nf.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When domains are ready, select Decompose and then Send to AWS Transform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f0nlm9v8d67501pzzmq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f0nlm9v8d67501pzzmq.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcak1uvzz26m2llbgh76f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcak1uvzz26m2llbgh76f.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 10: Plan Migration Waves
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In Plan migration wave, review the suggested references.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fown1ef7hafkg7t4zz6ez.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fown1ef7hafkg7t4zz6ez.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Add reference, provide referenced waves, and choose Add and regenerate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxath4vi8as05upxmim0u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxath4vi8as05upxmim0u.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When complete, select Send to AWS Transform to proceed to Refactor code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxxtli0bhrwy72a7zzwf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxxtli0bhrwy72a7zzwf.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 11: Refactor Code
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In Transform, select all domains and choose Send to AWS Transform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhnd9ktpu0iq97twi63q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhnd9ktpu0iq97twi63q.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When refactoring completes, select View result to find generated.zip in S3.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr86apglbwomaaxob1943.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr86apglbwomaaxob1943.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>modernize</category>
    </item>
    <item>
      <title>Modernize .NET Applications with AWS Transform</title>
      <dc:creator>Lam Bùi</dc:creator>
      <pubDate>Mon, 22 Sep 2025 14:28:09 +0000</pubDate>
      <link>https://forem.com/lamkhac/modernize-net-applications-with-aws-transform-4j9f</link>
      <guid>https://forem.com/lamkhac/modernize-net-applications-with-aws-transform-4j9f</guid>
      <description>&lt;h1&gt;
  
  
  I. Concepts
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. What is AWS Transform?
&lt;/h2&gt;

&lt;p&gt;AWS Transform is a new AWS service that uses agentic AI to automate modernization and migration of infrastructure, applications, and source code to AWS. It reduces the time, effort, and risk of handling legacy or complex systems such as mainframes, VMware, and .NET workloads by analyzing, refactoring, and redeploying on cloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Primary Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mainframe Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Transform COBOL, JCL, CICS, Db2, VSAM into modern Java applications.&lt;/li&gt;
&lt;li&gt;Automatically analyze code, extract business logic, generate technical documents, and refactor.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  VMware Migration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Migrate virtual machines from on‑premises VMware to Amazon EC2.&lt;/li&gt;
&lt;li&gt;Auto-generate migration plans, configure VPC networking, and convert VMs to EC2.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  .NET Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Upgrade from legacy .NET Framework to cross‑platform .NET.&lt;/li&gt;
&lt;li&gt;Integrates with Visual Studio; automatically resolves dependencies, builds, and tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Migration Assessment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pre-migration cost and feasibility assessment.&lt;/li&gt;
&lt;li&gt;AI-driven recommendations for EC2 sizing, BYOL optimization, and cost analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Key Capabilities
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Agentic AI with human-in-the-loop (HITL) for complex tasks and decisions.&lt;/li&gt;
&lt;li&gt;End‑to‑end automation: discovery → planning → code refactor → deployment.&lt;/li&gt;
&lt;li&gt;Flexible collaboration: role-based access (Admin, Approver, Contributor, Reader).&lt;/li&gt;
&lt;li&gt;Workspaces to isolate projects, with dedicated jobs and connectors.&lt;/li&gt;
&lt;li&gt;Worklog &amp;amp; dashboards to track progress and decisions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Typical Workflow
&lt;/h2&gt;

&lt;p&gt;1) Environment setup&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an AWS account with administrative permissions.&lt;/li&gt;
&lt;li&gt;Enable the AWS Transform service.&lt;/li&gt;
&lt;li&gt;Configure AWS IAM Identity Center for users and groups.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2) Create a Workspace&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated per transformation project.&lt;/li&gt;
&lt;li&gt;Contains connectors, jobs, and artifacts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3) Create a Connector&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect to target environments (VMware account, GitHub repository, other AWS accounts).&lt;/li&gt;
&lt;li&gt;Configure IAM policies and encryption with KMS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;4) Create a Job&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose job type: VMware Migration, Mainframe Modernization, .NET Transform, etc.&lt;/li&gt;
&lt;li&gt;Upload inputs (RVTools export, source code, JCL, spreadsheets, Git repo, etc.).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;5) Execute Workflow steps&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VMware: Discovery → Grouping → Generate VPC → Replicate &amp;amp; Migrate&lt;/li&gt;
&lt;li&gt;Mainframe: Analyze COBOL → Generate Docs → Extract Logic → Refactor → Build Java&lt;/li&gt;
&lt;li&gt;.NET: Connect Git → Resolve Dependencies → Transform Code → Test &amp;amp; Build&lt;/li&gt;
&lt;li&gt;Assessment: Upload file → Cost estimation → Migration feasibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;6) Track Progress&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Worklog: detailed action log.&lt;/li&gt;
&lt;li&gt;Dashboard: overall job status and KPIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Security &amp;amp; Governance
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Data encrypted by default using AWS KMS.&lt;/li&gt;
&lt;li&gt;Option to use customer‑managed KMS keys for tighter control.&lt;/li&gt;
&lt;li&gt;Comprehensive IAM policies for access control, connector creation, and deployments.&lt;/li&gt;
&lt;li&gt;Supports cross‑region processing and compliance validation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Roles
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Core Permissions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Admin&lt;/td&gt;
&lt;td&gt;Full control: create workspace, job, connector; approve HITL steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approver&lt;/td&gt;
&lt;td&gt;Near‑admin; cannot modify users, connectors, or workspaces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contributor&lt;/td&gt;
&lt;td&gt;Execute jobs, upload files, respond to HITL; cannot approve critical steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reader&lt;/td&gt;
&lt;td&gt;Read‑only access to progress and artifacts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  7. Highlights by Use Case
&lt;/h2&gt;

&lt;h3&gt;
  
  
  VMware Migration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AI‑generated AWS VPC topology mirroring on‑prem networks.&lt;/li&gt;
&lt;li&gt;Automatic wave planning from discovery data.&lt;/li&gt;
&lt;li&gt;Import from NSX or RVTools via collector or direct file upload.&lt;/li&gt;
&lt;li&gt;Supports Hub‑and‑Spoke and Isolated VPC topologies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mainframe Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Full support for COBOL, JCL, Db2, and VSAM.&lt;/li&gt;
&lt;li&gt;Clear stages: analyze → extract logic → refactor to Java.&lt;/li&gt;
&lt;li&gt;Re‑run per stage for iterative validation.&lt;/li&gt;
&lt;li&gt;Outputs modern Java applications deployable on AWS.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  .NET Modernization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Connect GitHub repository to retrieve code.&lt;/li&gt;
&lt;li&gt;Automatically resolve missing libraries and dependencies; refactor to .NET 6/7/8.&lt;/li&gt;
&lt;li&gt;Integrate with Visual Studio for review and edits.&lt;/li&gt;
&lt;li&gt;Build and deploy to test environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. Core Benefits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Accelerate migration/modernization by 50–80%.&lt;/li&gt;
&lt;li&gt;Reduce manual errors on complex workloads.&lt;/li&gt;
&lt;li&gt;Preserve critical business logic via intelligent extraction.&lt;/li&gt;
&lt;li&gt;Scale for large organizations: multi‑account, multi‑region, IAM governance.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  II. Step‑by‑Step Guides
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1) Modernize .NET Applications with AWS Transform
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Objectives
&lt;/h3&gt;

&lt;p&gt;Transform legacy .NET Framework applications (often Windows‑based) into cross‑platform .NET (Linux‑ready .NET 6/7/8) for deployment on AWS services such as ECS, Lambda, and EC2.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 Prerequisites
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Required Item&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;✅ Source repo for .NET code&lt;/td&gt;
&lt;td&gt;GitHub, CodeCommit, GitLab, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;✅ Project files identified&lt;/td&gt;
&lt;td&gt;Clear structure with .csproj and/or .sln&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;✅ Dependencies resolved&lt;/td&gt;
&lt;td&gt;Can be handled during transform if needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;✅ AWS account with Transform enabled&lt;/td&gt;
&lt;td&gt;Use IAM Identity Center for user access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  1.3 Procedure
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Step 1: Create a Workspace
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Open the AWS Transform portal (service URL).&lt;/li&gt;
&lt;li&gt;Sign in with IAM Identity Center credentials.&lt;/li&gt;
&lt;li&gt;Open your existing workspace or create a new one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28yoppwi65diq2576rx7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28yoppwi65diq2576rx7.png" alt=" " width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 2: Create a Job
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In the workspace, select Create a job.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx6kcs6uzwjvto3hsiwj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx6kcs6uzwjvto3hsiwj.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose job type: .NET Modernization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14n2st2md1c3bs4cvvdb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14n2st2md1c3bs4cvvdb.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name the job, optionally add a description, and select Create job.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwk5r4w1pz86am1oowyzh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwk5r4w1pz86am1oowyzh.png" alt=" " width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 3: Create a Connector (GitHub)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Choose the AWS account ID used to connect to the source repository and select Next.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c2328g6by6jrkou4ohz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c2328g6by6jrkou4ohz.png" alt=" " width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fork the sample .NET project to your own GitHub account: &lt;a href="https://github.com/aws-samples/legacy-cycle-store-mvc-app" rel="noopener noreferrer"&gt;https://github.com/aws-samples/legacy-cycle-store-mvc-app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb2yz3oe62ai7p1fcsdr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb2yz3oe62ai7p1fcsdr.png" alt=" " width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Navigate to Developer Tools → Connections and select Create connection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5tjdusrfqkgyvv3pzsc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5tjdusrfqkgyvv3pzsc.png" alt=" " width="800" height="355"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In Create a connection, choose GitHub and provide a Connection Name. Select Connect to GitHub.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w9battrdheikr1qmih6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w9battrdheikr1qmih6.png" alt=" " width="800" height="339"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In Connect to GitHub, choose Install a new app.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi294u406gqaqzm7p2673.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi294u406gqaqzm7p2673.png" alt=" " width="800" height="335"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Only select repositories, choose the forked .NET repository, then Install &amp;amp; Authorize.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k4y0dih90aercdmkl03.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k4y0dih90aercdmkl03.png" alt=" " width="800" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Back in Connect to GitHub, select Connect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3c2r2p09hf47mk3shx7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3c2r2p09hf47mk3shx7.png" alt=" " width="800" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Copy the ARN of the new connection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83q1yr2vdkf6qb82u9dd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83q1yr2vdkf6qb82u9dd.png" alt=" " width="800" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Return to AWS Transform and paste the Connection ARN and Connection Name. Select Initiate connector creation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1isuj72bsn4bcb4xvhpr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1isuj72bsn4bcb4xvhpr.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose Copy connector link, open it in a new tab, and approve.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak8j0qw6twnwhcu11mnx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak8j0qw6twnwhcu11mnx.png" alt=" " width="800" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwv907sfumipv30dst52k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwv907sfumipv30dst52k.png" alt=" " width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select Finalize connector to continue.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlrc9a8fbu12oqkh9d7x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlrc9a8fbu12oqkh9d7x.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 4: Discover Resources for Transformation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AWS Transform scans the repository to prepare for modernization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw5ml0wuqkne4nhj50l0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw5ml0wuqkne4nhj50l0f.png" alt=" " width="800" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 5: Prepare and Start Transformation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In Prepare resources for transformation, review resources and select Confirm repository.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyh36mh0vujt8if4eesoc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyh36mh0vujt8if4eesoc.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review settings (branch to create, .NET version, etc.) and select Approve and start transformation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo01vzb6qhkngbhptsgvf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo01vzb6qhkngbhptsgvf.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When finished, verify the generated branch in your GitHub repository.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6m32fns47tbk0ygs2i4g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6m32fns47tbk0ygs2i4g.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faczg6yikyln4tz59fno6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faczg6yikyln4tz59fno6.png" alt=" " width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
