<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: bikash119</title>
    <description>The latest articles on Forem by bikash119 (@bikash119).</description>
    <link>https://forem.com/bikash119</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1883358%2F6fb9548a-1005-42fa-8f9e-27a7243ef205.png</url>
      <title>Forem: bikash119</title>
      <link>https://forem.com/bikash119</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/bikash119"/>
    <language>en</language>
    <item>
      <title>Deploying Docling Application on ECS with Application Load Balancer</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Mon, 08 Sep 2025 11:02:41 +0000</pubDate>
      <link>https://forem.com/bikash119/deploying-docling-application-on-ecs-with-application-load-balancer-59k3</link>
      <guid>https://forem.com/bikash119/deploying-docling-application-on-ecs-with-application-load-balancer-59k3</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 3 (the final part) of our &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;3-part series&lt;/a&gt; on deploying Docling to a complete AWS ECS infrastructure. In &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;, we set up the foundational networking and IAM, and in &lt;a href="https://dev.to/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l"&gt;Part 2&lt;/a&gt;, we created the ECS cluster with Auto Scaling Groups and Launch Templates. Now we'll deploy the actual application and make it accessible through an Application Load Balancer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Welcome to the final part of our journey to &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;deploy docling to AWS ECS&lt;/a&gt;! In this guide, we'll deploy the &lt;a href="https://github.com/docling-project/docling" rel="noopener noreferrer"&gt;docling&lt;/a&gt; application (a GPU-accelerated document processing service) on our ECS infrastructure and expose it to the internet using an Application Load Balancer (ALB). We'll also explore the core concepts of load balancing through a restaurant analogy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Application Load Balancer Components
&lt;/h2&gt;

&lt;p&gt;Before diving into the implementation, let's understand how Application Load Balancers work using a restaurant analogy:&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Concepts 🍽️
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Load Balancer
&lt;/h4&gt;

&lt;p&gt;Think of the &lt;strong&gt;Load Balancer&lt;/strong&gt; as the &lt;strong&gt;restaurant owner&lt;/strong&gt; whose primary responsibility is to serve customers efficiently. The restaurant owner ensures that customers have a great dining experience and that the restaurant runs smoothly.&lt;/p&gt;

&lt;h4&gt;
  
  
  Listener
&lt;/h4&gt;

&lt;p&gt;The &lt;strong&gt;Listener&lt;/strong&gt; is like a &lt;strong&gt;host/hostess hired by the restaurant owner&lt;/strong&gt; with specific instructions on which customer requests should be served where. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If a customer requests ice cream → direct them to the ice cream corner&lt;/li&gt;
&lt;li&gt;If a customer wants drinks → direct them to the bar area&lt;/li&gt;
&lt;li&gt;If a family arrives → guide them to the family seating section&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Target Group
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Target Groups&lt;/strong&gt; are like &lt;strong&gt;groups of waiters for each section&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Waiters at the ice cream corner&lt;/li&gt;
&lt;li&gt;Bartenders at the bar&lt;/li&gt;
&lt;li&gt;Waiters in the family seating area&lt;/li&gt;
&lt;li&gt;Each specialized group of staff forms a "Target Group"&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Register Targets
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Registering Targets&lt;/strong&gt; is the process where &lt;strong&gt;waiters register themselves&lt;/strong&gt; with their respective target groups, letting the system know they're available to serve customers in their designated area.&lt;/p&gt;

&lt;p&gt;This analogy helps us understand how ALB distributes incoming traffic (customers) to the right backend services (waiters) based on configured rules (host instructions).&lt;/p&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;In this final part, we'll:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create and register ECS Task Definitions for the Docling application&lt;/li&gt;
&lt;li&gt;Set up ECS Services to manage our containers&lt;/li&gt;
&lt;li&gt;Configure an Application Load Balancer for external access&lt;/li&gt;
&lt;li&gt;Establish proper networking and security group rules&lt;/li&gt;
&lt;li&gt;Test our complete GPU-enabled document processing service&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Make sure you've completed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1: Foundation - Networking &amp;amp; IAM&lt;/a&gt;&lt;/strong&gt; - VPC, subnets, and IAM roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l"&gt;Part 2: ECS EC2 with Auto Scaling&lt;/a&gt;&lt;/strong&gt; - ECS cluster, Launch Templates, and Auto Scaling Groups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You should have the following from previous parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;$VPC_ID&lt;/code&gt; - VPC ID from Part 1&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$PUBLIC_SUBNET&lt;/code&gt; and &lt;code&gt;$PRIVATE_SUBNET&lt;/code&gt; - Subnet IDs from Part 1
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$ECS_SG_ID&lt;/code&gt; - Security Group ID from Part 2&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docling-ecs-cluster&lt;/code&gt; - ECS cluster from Part 2&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ECS_Asg&lt;/code&gt; - Auto Scaling Group from Part 2&lt;/li&gt;
&lt;/ul&gt;
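
&lt;p&gt;If you are working in a fresh shell, the variables from the earlier parts may no longer be set. Here is a minimal sketch for checking them before continuing; the helper name &lt;code&gt;require_vars&lt;/code&gt; is hypothetical and not part of the series:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: list which of the named shell variables are unset or empty.
require_vars() {
    local missing=0 name value
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name}"
        if [ -z "$value" ]; then
            echo "Missing: $name"
            missing=1
        fi
    done
    # Exit status is 0 when everything is set, 1 otherwise.
    return "$missing"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;require_vars VPC_ID PUBLIC_SUBNET PRIVATE_SUBNET ECS_SG_ID&lt;/code&gt; prints one line per missing variable and returns a non-zero status if any are missing.&lt;/p&gt;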

&lt;h2&gt;
  
  
  Step 1: Create ECS Task Definition
&lt;/h2&gt;

&lt;p&gt;The Task Definition is like a blueprint that tells ECS how to run our Docling container. Create a file called &lt;code&gt;docling-task-definition.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"family"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-nvidia"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"networkMode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"requiresCompatibilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"EC2"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"executionRoleArn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::&amp;lt;YOUR_ACCOUNT_ID&amp;gt;:role/ecs_task_exec_role"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskRoleArn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::&amp;lt;YOUR_ACCOUNT_ID&amp;gt;:role/ecs_task_role"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"containerDefinitions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ghcr.io/docling-project/docling-serve-cu126:main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"essential"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"portMappings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"containerPort"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"hostPort"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"protocol"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tcp"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DOCLING_SERVE_ENABLE_UI"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"resourceRequirements"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GPU"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"logConfiguration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"logDriver"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"awslogs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-group"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/ecs/docling-serve-nvidia"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-stream-prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ecs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-create-group"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"linuxParameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"add"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"SYS_ADMIN"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cpu"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2048"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"memory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8192"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important&lt;/strong&gt;: Replace &lt;code&gt;&amp;lt;YOUR_ACCOUNT_ID&amp;gt;&lt;/code&gt; with your actual AWS account ID and ensure the role names match those created in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Key Configuration Highlights
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPU Support&lt;/strong&gt;: &lt;code&gt;resourceRequirements&lt;/code&gt; specifies 1 GPU allocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network Mode&lt;/strong&gt;: in &lt;code&gt;host&lt;/code&gt; mode the container shares the EC2 instance's network stack, so port 5001 is exposed directly on the instance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch Logs&lt;/strong&gt;: Automatic log group creation for monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource Allocation&lt;/strong&gt;: 2 vCPUs and 8GB RAM for GPU-intensive processing&lt;/li&gt;
&lt;/ul&gt;
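
&lt;p&gt;Before registering, it can help to confirm that the file parses as JSON and that the account ID placeholder has been replaced. Here is a minimal sketch, assuming &lt;code&gt;python3&lt;/code&gt; is available locally; the helper name &lt;code&gt;check_task_definition&lt;/code&gt; is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: sanity-check a task definition file before registering it.
check_task_definition() {
    local file="$1"
    # Fail if the file is not valid JSON or has no "family" key
    # (json.tool ships with Python 3 and prints nothing to stdout on a parse error).
    python3 -m json.tool "$file" | grep -q '"family"' || return 1
    # Fail if the account ID placeholder from the template is still present.
    if grep -q 'YOUR_ACCOUNT_ID' "$file"; then
        echo "Replace YOUR_ACCOUNT_ID in $file first"
        return 1
    fi
    echo "$file looks ready to register"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running &lt;code&gt;check_task_definition docling-task-definition.json&lt;/code&gt; only checks the file locally; ECS still validates the full schema when the definition is registered.&lt;/p&gt;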

&lt;h3&gt;
  
  
  Register the Task Definition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;TASK_ARN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ecs register-task-definition &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://docling-task-definition.json &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"taskDefinition.taskDefinitionArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Task Definition ARN: &lt;/span&gt;&lt;span class="nv"&gt;$TASK_ARN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Add Tags to Task Definition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs tag-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$TASK_ARN&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,value&lt;span class="o"&gt;=&lt;/span&gt;ECS_Docling_Task
aws ecs tag-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$TASK_ARN&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; file://cluster-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verify Tags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs list-tags-for-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$TASK_ARN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Create ECS Service
&lt;/h2&gt;

&lt;p&gt;ECS Services ensure that your desired number of tasks are running and healthy. Create &lt;code&gt;docling-service-definition.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"serviceName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cluster"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-ecs-cluster"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"taskDefinition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-nvidia"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"desiredCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"launchType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EC2"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;📝 &lt;strong&gt;Note&lt;/strong&gt;: We start with &lt;code&gt;desiredCount: 0&lt;/code&gt; to prevent tasks from starting before we have EC2 instances available.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Create the Service
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SERVICE_ARN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ecs create-service &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://docling-service-definition.json &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"service.serviceArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Service ARN: &lt;/span&gt;&lt;span class="nv"&gt;$SERVICE_ARN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Add Tags to Service
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs tag-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$SERVICE_ARN&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; file://cluster-tags.json
aws ecs tag-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$SERVICE_ARN&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,value&lt;span class="o"&gt;=&lt;/span&gt;ECS_Docling_Service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verify Service Tags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs list-tags-for-resource &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$SERVICE_ARN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Test Basic Service Deployment
&lt;/h2&gt;

&lt;p&gt;Before setting up the load balancer, let's verify our service works correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Launch EC2 Instance
&lt;/h3&gt;

&lt;p&gt;Scale up the Auto Scaling Group to launch an instance (this uses the configuration from Part 2):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws autoscaling update-auto-scaling-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--auto-scaling-group-name&lt;/span&gt; ECS_Asg &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--min-size&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--max-size&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--desired-capacity&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
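
&lt;p&gt;The instance needs some time to boot and register with the cluster. One way to confirm that it has joined before starting the service is to count the registered container instances. This is a sketch wrapped in a hypothetical helper named &lt;code&gt;count_container_instances&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: print how many container instances are registered in a cluster.
count_container_instances() {
    aws ecs list-container-instances \
        --cluster "$1" \
        --query "length(containerInstanceArns)" \
        --output text
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When &lt;code&gt;count_container_instances docling-ecs-cluster&lt;/code&gt; prints &lt;code&gt;1&lt;/code&gt;, the instance launched by the Auto Scaling Group is available for task placement.&lt;/p&gt;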



&lt;h3&gt;
  
  
  Start the Service
&lt;/h3&gt;

&lt;p&gt;Update the service to run one task:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs update-service &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster&lt;/span&gt; docling-ecs-cluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--service&lt;/span&gt; docling-serve &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--desired-count&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
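
&lt;p&gt;Pulling the GPU image can take several minutes, so the task will not be running immediately. The AWS CLI has a built-in waiter for this. The sketch below wraps it in a hypothetical helper named &lt;code&gt;wait_for_service&lt;/code&gt; that also prints the running task count afterwards:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: block until the service is stable, then print its running task count.
wait_for_service() {
    local cluster="$1" service="$2"
    # "aws ecs wait services-stable" polls until the deployment settles or the waiter times out.
    aws ecs wait services-stable --cluster "$cluster" --services "$service" || return 1
    aws ecs describe-services \
        --cluster "$cluster" \
        --services "$service" \
        --query "services[0].runningCount" \
        --output text
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;wait_for_service docling-ecs-cluster docling-serve&lt;/code&gt; should print &lt;code&gt;1&lt;/code&gt; once the task is up. If the waiter times out, the service events in the ECS console usually explain why the task could not be placed.&lt;/p&gt;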



&lt;h3&gt;
  
  
  Verify Container is Running
&lt;/h3&gt;

&lt;p&gt;SSH into the instance and check the container status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get the instance IP (from Part 2's command)&lt;/span&gt;
aws ec2 describe-instances &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="s2"&gt;"Name=instance-state-name,Values=running"&lt;/span&gt; &lt;span class="s2"&gt;"Name=tag-key,Values=Purpose"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Reservations[*].Instances[*].[InstanceId,InstanceType,PrivateIpAddress,PublicIpAddress]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table

&lt;span class="c"&gt;# SSH into the instance&lt;/span&gt;
ssh &lt;span class="nt"&gt;-i&lt;/span&gt; ECSInstanceKey.pem ec2-user@&amp;lt;INSTANCE_PUBLIC_IP&amp;gt;

&lt;span class="c"&gt;# Check running containers&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;docker ps

&lt;span class="c"&gt;# View container logs&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;docker logs &amp;lt;container_id&amp;gt; &lt;span class="nt"&gt;-f&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the Docling application starting up and utilizing the GPU.&lt;/p&gt;
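
&lt;p&gt;Because the task uses host networking, the application should also answer on port 5001 of the instance itself. Here is a quick check to run from the same SSH session, assuming &lt;code&gt;curl&lt;/code&gt; is installed on the instance; &lt;code&gt;http_status&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: print only the HTTP status code returned by a URL.
http_status() {
    # -s silences progress output, -o discards the body, -w prints the status code.
    curl -s -o /dev/null -w "%{http_code}" "$1"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;http_status http://localhost:5001/docs&lt;/code&gt; requests the same path that the load balancer health check uses later in this guide, so a successful status code here is a good sign before adding the ALB.&lt;/p&gt;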

&lt;h2&gt;
  
  
  Step 4: Set Up Application Load Balancer
&lt;/h2&gt;

&lt;p&gt;Now let's create the ALB infrastructure to make our application accessible from the internet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create ALB Security Group
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ALB_SG_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 create-security-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=security-group,Tags=[{Key=Name,Value=ALB_SG}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--group-name&lt;/span&gt; ALB_SG &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--description&lt;/span&gt; &lt;span class="s2"&gt;"SG for ALB"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"GroupId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"ALB Security Group ID: &lt;/span&gt;&lt;span class="nv"&gt;$ALB_SG_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Allow Internet Traffic to ALB
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 authorize-security-group-ingress &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--group-id&lt;/span&gt; &lt;span class="nv"&gt;$ALB_SG_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; tcp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 5001 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cidr&lt;/span&gt; 0.0.0.0/0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
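
&lt;p&gt;To confirm the rule was added, you can list the ports the security group currently allows. This is a small sketch using a hypothetical helper named &lt;code&gt;list_ingress_ports&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: print the starting port of every ingress rule on a security group.
list_ingress_ports() {
    aws ec2 describe-security-groups \
        --group-ids "$1" \
        --query "SecurityGroups[0].IpPermissions[].FromPort" \
        --output text
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;list_ingress_ports $ALB_SG_ID&lt;/code&gt; should include &lt;code&gt;5001&lt;/code&gt; in its output.&lt;/p&gt;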



&lt;h3&gt;
  
  
  Create the Load Balancer
&lt;/h3&gt;

&lt;p&gt;Remember our restaurant analogy - this creates the "restaurant owner":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;DOCLING_ALB&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws elbv2 create-load-balancer &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; docling-alb &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--subnets&lt;/span&gt; &lt;span class="nv"&gt;$PRIVATE_SUBNET&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_SUBNET&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--security-groups&lt;/span&gt; &lt;span class="nv"&gt;$ALB_SG_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--scheme&lt;/span&gt; internet-facing &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--type&lt;/span&gt; application &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;DoclingALB &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"LoadBalancers[].LoadBalancerArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Load Balancer ARN: &lt;/span&gt;&lt;span class="nv"&gt;$DOCLING_ALB&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
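&lt;p&gt;📋 &lt;strong&gt;Note&lt;/strong&gt;: An ALB must be attached to subnets in at least two Availability Zones, which is why we pass both subnets from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;. For an internet-facing load balancer, AWS expects these to be public subnets; a load balancer node placed in the private subnet cannot receive traffic from the internet.&lt;/p&gt;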
&lt;/blockquote&gt;
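
&lt;p&gt;A new load balancer starts in the provisioning state. The sketch below waits until it is active and then prints its public DNS name, which is the address you will use to reach the application. The helper name &lt;code&gt;alb_dns_name&lt;/code&gt; is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: wait for a load balancer to become active, then print its DNS name.
alb_dns_name() {
    aws elbv2 wait load-balancer-available --load-balancer-arns "$1" || return 1
    aws elbv2 describe-load-balancers \
        --load-balancer-arns "$1" \
        --query "LoadBalancers[0].DNSName" \
        --output text
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;alb_dns_name $DOCLING_ALB&lt;/code&gt; prints the hostname assigned to &lt;code&gt;docling-alb&lt;/code&gt;.&lt;/p&gt;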

&lt;h3&gt;
  
  
  Add Tags to Load Balancer
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws elbv2 add-tags &lt;span class="nt"&gt;--resource-arns&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_ALB&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; file://alb-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Create Target Group
&lt;/h2&gt;

&lt;p&gt;In our restaurant analogy, this creates the "group of waiters" that will serve our customers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;DOCLING_TARGET_GRP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws elbv2 create-target-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; docling-targets &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; HTTP &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 5001 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-check-path&lt;/span&gt; /docs &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--target-type&lt;/span&gt; instance &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;doclingTargetGroup &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"TargetGroups[].TargetGroupArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Target Group ARN: &lt;/span&gt;&lt;span class="nv"&gt;$DOCLING_TARGET_GRP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Target Group Configuration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;target-type instance&lt;/strong&gt;: Registers targets by EC2 instance ID, which is what ECS requires for tasks that use &lt;code&gt;host&lt;/code&gt; network mode (the &lt;code&gt;ip&lt;/code&gt; target type is meant for tasks that use &lt;code&gt;awsvpc&lt;/code&gt; mode)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;health-check-path /docs&lt;/strong&gt;: Docling's health check endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;port 5001&lt;/strong&gt;: The port our application listens on&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Add Tags to Target Group
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws elbv2 add-tags &lt;span class="nt"&gt;--resource-arns&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_TARGET_GRP&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; file://alb-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Update Service to use Target Group
&lt;/h3&gt;

&lt;p&gt;Now we need to update our ECS service to integrate with the Application Load Balancer. Create a file called &lt;code&gt;service-alb.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"targetGroupArn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;TARGET_GROUP_ARN&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"containerName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docling-serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"containerPort"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5001&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important&lt;/strong&gt;: Replace &lt;code&gt;&amp;lt;TARGET_GROUP_ARN&amp;gt;&lt;/code&gt; with the actual Target Group ARN we created above (stored in &lt;code&gt;$DOCLING_TARGET_GRP&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
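
&lt;p&gt;Rather than editing the file by hand, the ARN can be written into it from the shell. The following is a minimal sketch: &lt;code&gt;write_service_alb_json&lt;/code&gt; is a hypothetical helper name, and it assumes &lt;code&gt;$DOCLING_TARGET_GRP&lt;/code&gt; is still set in the current shell session.&lt;/p&gt;

```shell
# Hypothetical helper: writes the ECS load balancer mapping for a target group ARN.
# $1 = target group ARN, $2 = output file (defaults to service-alb.json)
write_service_alb_json() {
    target_group_arn="$1"
    output_file="${2:-service-alb.json}"
    printf '[\n  {\n    "targetGroupArn": "%s",\n    "containerName": "docling-serve",\n    "containerPort": 5001\n  }\n]\n' \
        "$target_group_arn" > "$output_file"
}

# Only write the file when the variable from the previous step is set.
if [ -n "${DOCLING_TARGET_GRP:-}" ]; then
    write_service_alb_json "$DOCLING_TARGET_GRP"
    cat service-alb.json
fi
```

&lt;p&gt;Generating the file this way avoids copy-paste mistakes in the ARN.&lt;/p&gt;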

&lt;p&gt;Update the service to use the load balancer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs update-service &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster&lt;/span&gt; docling-ecs-cluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--service&lt;/span&gt; docling-serve &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancers&lt;/span&gt; file://service-alb.json

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Service updated to use ALB integration"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This step is crucial as it tells ECS to automatically register and deregister tasks with the target group as they start and stop, eliminating the need for manual target registration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Create Listener
&lt;/h2&gt;

&lt;p&gt;The listener is our "host/hostess" that directs traffic to the right place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;DOCLING_LISTENER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws elbv2 create-listener &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-arn&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_ALB&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; HTTP &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 5001 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--default-actions&lt;/span&gt; &lt;span class="nv"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;forward,TargetGroupArn&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$DOCLING_TARGET_GRP&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;DoclingListener &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Listeners[].ListenerArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Listener ARN: &lt;/span&gt;&lt;span class="nv"&gt;$DOCLING_LISTENER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Add Tags to Listener
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws elbv2 add-tags &lt;span class="nt"&gt;--resource-arns&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_LISTENER&lt;/span&gt; &lt;span class="nt"&gt;--tags&lt;/span&gt; file://alb-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 7: Configure Security Rules
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Allow ALB to Reach EC2 Instances
&lt;/h3&gt;

&lt;p&gt;Update the EC2 security group (from Part 2) to allow traffic from the ALB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 authorize-security-group-ingress &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--group-id&lt;/span&gt; &lt;span class="nv"&gt;$ECS_SG_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; tcp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 5001 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--source-group&lt;/span&gt; &lt;span class="nv"&gt;$ALB_SG_ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a security rule allowing our "restaurant owner" (ALB) to communicate with our "kitchen" (EC2 instances).&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 8: Final Testing and Verification
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Check Target Health
&lt;/h3&gt;

&lt;p&gt;Verify that our target is healthy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws elbv2 describe-target-health &lt;span class="nt"&gt;--target-group-arn&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_TARGET_GRP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the target status as &lt;code&gt;healthy&lt;/code&gt; once the health checks pass.&lt;/p&gt;
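
&lt;p&gt;Health checks need a little time to pass, so a single call may still report a state such as &lt;code&gt;initial&lt;/code&gt;. The sketch below polls until every target is healthy. The function names, the 30 attempts, and the 10-second interval are assumptions chosen for illustration, not values from this series.&lt;/p&gt;

```shell
# Succeeds only when every whitespace-separated state in $1 is "healthy".
# An empty list (no registered targets) counts as not healthy.
all_targets_healthy() {
    [ -n "$1" ] || return 1
    for state in $1; do
        [ "$state" = "healthy" ] || return 1
    done
    return 0
}

# Polls the target group in $1 for up to about 5 minutes (30 attempts, 10 s apart).
wait_for_healthy_targets() {
    attempts=0
    while [ "$attempts" -lt 30 ]; do
        states=$(aws elbv2 describe-target-health \
            --target-group-arn "$1" \
            --query "TargetHealthDescriptions[].TargetHealth.State" \
            --output text)
        if all_targets_healthy "$states"; then
            echo "All targets healthy"
            return 0
        fi
        echo "Current states: ${states:-none registered}"
        attempts=$((attempts + 1))
        sleep 10
    done
    echo "Timed out waiting for healthy targets"
    return 1
}
```

&lt;p&gt;Call it with &lt;code&gt;wait_for_healthy_targets "$DOCLING_TARGET_GRP"&lt;/code&gt;. The AWS CLI also ships a built-in waiter, &lt;code&gt;aws elbv2 wait target-in-service --target-group-arn $DOCLING_TARGET_GRP&lt;/code&gt;, which may be enough if you do not need the intermediate states printed.&lt;/p&gt;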

&lt;h3&gt;
  
  
  Get Load Balancer DNS Name
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ALB_DNS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws elbv2 describe-load-balancers &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-arns&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_ALB&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"LoadBalancers[].DNSName"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Access your application at: http://&lt;/span&gt;&lt;span class="nv"&gt;$ALB_DNS&lt;/span&gt;&lt;span class="s2"&gt;:5001"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Test the Application
&lt;/h3&gt;

&lt;p&gt;Open your browser and navigate to &lt;code&gt;http://&amp;lt;ALB_DNS&amp;gt;:5001/ui&lt;/code&gt;. You should see the Docling web interface!&lt;/p&gt;

&lt;p&gt;You can also request the API documentation page from the command line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://&lt;span class="nv"&gt;$ALB_DNS&lt;/span&gt;:5001/docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
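
&lt;p&gt;For a scripted check, it can help to look only at the HTTP status code instead of the page body. This is a minimal sketch; &lt;code&gt;http_status&lt;/code&gt; and &lt;code&gt;endpoint_is_up&lt;/code&gt; are hypothetical helper names, and the 10-second timeout is an arbitrary choice.&lt;/p&gt;

```shell
# Hypothetical helper: prints the HTTP status code returned for the URL in $1.
# curl prints 000 when no connection could be made.
http_status() {
    curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$1"
}

# Succeeds only when the URL in $1 answers with 200.
endpoint_is_up() {
    [ "$(http_status "$1")" = "200" ]
}

# Run the check only when the DNS name from the previous step is set.
if [ -n "${ALB_DNS:-}" ]; then
    if endpoint_is_up "http://$ALB_DNS:5001/docs"; then
        echo "Docling is reachable through the ALB"
    else
        echo "Docling is not reachable yet"
    fi
fi
```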



&lt;h3&gt;
  
  
  Monitor Application Logs
&lt;/h3&gt;

&lt;p&gt;Check the application logs through CloudWatch or directly on the instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the EC2 instance&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;docker logs &amp;lt;container_id&amp;gt; &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;span class="c"&gt;# Or check CloudWatch Logs&lt;/span&gt;
aws logs describe-log-groups &lt;span class="nt"&gt;--log-group-name-prefix&lt;/span&gt; &lt;span class="s2"&gt;"/ecs/docling-serve-nvidia"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Troubleshooting Common Issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Service Won't Start
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Check that the EC2 instances are running and registered with the ECS cluster&lt;/li&gt;
&lt;li&gt;Verify that the task definition has the correct IAM roles from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Check the CloudWatch logs for container startup errors&lt;/li&gt;
&lt;/ul&gt;
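
&lt;p&gt;The first of these checks can be run from the CLI. The sketch below uses the cluster and service names from this series; the function names are hypothetical, and showing the five most recent events is an arbitrary choice.&lt;/p&gt;

```shell
# Hypothetical helper: counts container instances registered with the cluster in $1.
registered_instance_count() {
    aws ecs list-container-instances \
        --cluster "$1" \
        --query "length(containerInstanceArns)" \
        --output text
}

# Hypothetical helper: prints the five most recent events for service $2 in cluster $1.
recent_service_events() {
    aws ecs describe-services \
        --cluster "$1" \
        --services "$2" \
        --query "services[0].events[:5].message" \
        --output text
}
```

&lt;p&gt;For example, &lt;code&gt;registered_instance_count docling-ecs-cluster&lt;/code&gt; printing &lt;code&gt;0&lt;/code&gt; means no instance has joined the cluster, and &lt;code&gt;recent_service_events docling-ecs-cluster docling-serve&lt;/code&gt; usually explains why a task could not be placed.&lt;/p&gt;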

&lt;h3&gt;
  
  
  Can't Access Application
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Verify that the security group rules allow traffic on port 5001&lt;/li&gt;
&lt;li&gt;Check the target group health status&lt;/li&gt;
&lt;li&gt;Ensure the load balancer is in the correct subnets from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  GPU Not Available
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Confirm that the EC2 instance type has a GPU (this series uses g4dn.xlarge)&lt;/li&gt;
&lt;li&gt;Check that the ECS agent configuration includes GPU support&lt;/li&gt;
&lt;li&gt;Verify that the NVIDIA drivers are installed (this should be automatic with the ECS GPU AMI)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Congratulations! 🎉 You've successfully deployed docling to AWS ECS infrastructure with GPU support. Throughout this 3-part series, we've covered:&lt;/p&gt;

&lt;h3&gt;
  
  
  What We've Accomplished
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Part 1 Foundation&lt;/strong&gt;: VPC networking, security groups, and IAM roles&lt;br&gt;
&lt;strong&gt;Part 2 Infrastructure&lt;/strong&gt;: ECS cluster, Launch Templates, and Auto Scaling Groups&lt;br&gt;
&lt;strong&gt;Part 3 Application&lt;/strong&gt;: Task definitions, services, and Application Load Balancer&lt;/p&gt;

&lt;h3&gt;
  
  
  The Complete Architecture
&lt;/h3&gt;

&lt;p&gt;Your infrastructure now includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scalable GPU Computing&lt;/strong&gt;: Auto Scaling Groups with GPU-enabled instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Container Orchestration&lt;/strong&gt;: ECS managing Docling containers with resource requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Availability&lt;/strong&gt;: Multi-subnet deployment with health checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internet Accessibility&lt;/strong&gt;: Application Load Balancer with proper security&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring&lt;/strong&gt;: CloudWatch integration for logs and metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt;: Layered security groups and IAM roles&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real-World Applications
&lt;/h3&gt;

&lt;p&gt;This architecture is perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI/ML Workloads&lt;/strong&gt;: GPU-accelerated machine learning inference&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Processing&lt;/strong&gt;: Like our Docling example for document conversion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Video Processing&lt;/strong&gt;: GPU-accelerated video transcoding and analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scientific Computing&lt;/strong&gt;: High-performance computing workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Series Navigation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1: Foundation - Networking &amp;amp; IAM&lt;/a&gt;&lt;/strong&gt; ✅ - VPC setup, subnets, security groups, and IAM roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l"&gt;Part 2: ECS EC2 with Auto Scaling&lt;/a&gt;&lt;/strong&gt; ✅ - Launch templates, Auto Scaling Groups, and ECS cluster setup
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3: Application Deployment (Current)&lt;/strong&gt; ✅ - Task definitions, services, and load balancers&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Consider exploring these advanced topics in future posts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High Availability &amp;amp; Scaling&lt;/strong&gt;: Multi-AZ deployments, auto scaling policies, and health monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring &amp;amp; Observability&lt;/strong&gt;: CloudWatch Container Insights, custom metrics, and distributed tracing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Optimization&lt;/strong&gt;: Spot instances, reserved capacity, and right-sizing strategies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thank you for following along with this comprehensive AWS ECS series. You should now be able to set up the &lt;a href="https://dev.to/bikash119/equity-fundamental-octo-researcher-next-gen-stock-research-3a4n"&gt;n8n workflow&lt;/a&gt;, which uses &lt;a href="https://github.com/docling-project/docling" rel="noopener noreferrer"&gt;docling&lt;/a&gt; deployed to a GPU-enabled AWS ECS infrastructure 🚀&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ecs</category>
      <category>tutorial</category>
      <category>docling</category>
    </item>
    <item>
      <title>Deploying GPU-Enabled ECS EC2 Instances with Auto Scaling Groups and Launch Templates</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Mon, 08 Sep 2025 04:34:54 +0000</pubDate>
      <link>https://forem.com/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l</link>
      <guid>https://forem.com/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 2 of a 3-part series on &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;deploying Docling to AWS ECS infrastructure&lt;/a&gt;. In &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;, we covered the foundational networking and IAM setup required for this deployment.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Setting up Amazon ECS (Elastic Container Service) with EC2 instances can be complex, especially when you need GPU support for compute-intensive workloads. In this comprehensive guide, we'll walk through creating a robust, scalable ECS infrastructure using Auto Scaling Groups (ASG) and Launch Templates, specifically configured for GPU workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use Auto Scaling Groups with ECS?
&lt;/h2&gt;

&lt;p&gt;Auto Scaling Groups provide several key benefits for ECS deployments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic scaling&lt;/strong&gt; based on demand and health checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High availability&lt;/strong&gt; across multiple availability zones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost optimization&lt;/strong&gt; by scaling down during low usage periods&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistent tagging&lt;/strong&gt; and configuration through Launch Templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration with ECS Capacity Providers&lt;/strong&gt; for seamless container orchestration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before we begin, ensure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS CLI configured with appropriate permissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A VPC with public subnets already created&lt;/strong&gt; - If you haven't set this up yet, please refer to &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1 of this series&lt;/a&gt; where we cover the complete VPC setup including subnets, internet gateways, and route tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM roles properly configured&lt;/strong&gt; - The &lt;code&gt;ec2_instance_role-profile&lt;/code&gt; referenced in our Launch Template was created in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;. If you skipped Part 1, you'll need to set up the necessary IAM roles and instance profiles&lt;/li&gt;
&lt;li&gt;Basic understanding of AWS ECS, EC2, and Auto Scaling concepts&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Note&lt;/strong&gt;: This guide assumes you have the VPC ID (&lt;code&gt;$VPC_ID&lt;/code&gt;) and public subnet ID (&lt;code&gt;$PUBLIC_SUBNET&lt;/code&gt;) from Part 1. If you need to retrieve these values, refer to the VPC setup section in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step 1: Create Security Infrastructure
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Generate SSH Key Pair
&lt;/h3&gt;

&lt;p&gt;First, let's create a key pair for secure access to our EC2 instances:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-key-pair &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--key-name&lt;/span&gt; ECS_Instance_Key &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=key-pair,Tags=[{Key=Name,Value=ECS_Instance_Key}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'KeyMaterial'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; ECSInstanceKey.pem

&lt;span class="c"&gt;# Secure the key file&lt;/span&gt;
&lt;span class="nb"&gt;chmod &lt;/span&gt;400 ECSInstanceKey.pem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create Security Group
&lt;/h3&gt;

&lt;p&gt;Next, we'll create a security group to control network access. This security group will be associated with the VPC we created in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ECS_SG_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 create-security-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=security-group,Tags=[{Key=Name,Value=ECS_Instance_SG}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--group-name&lt;/span&gt; ECS_Instance_SG &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--description&lt;/span&gt; &lt;span class="s2"&gt;"SG for ECS Instance"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"GroupId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Add tags to the security group&lt;/span&gt;
aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$ECS_SG_ID&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;📝 &lt;strong&gt;Reminder&lt;/strong&gt;: The &lt;code&gt;$VPC_ID&lt;/code&gt; variable should contain the VPC ID from Part 1. If you need to find your VPC ID, you can use the command provided in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt; or run: &lt;code&gt;aws ec2 describe-vpcs --filters "Name=tag:Name,Values=your-vpc-name"&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Configure Security Group Rules
&lt;/h3&gt;

&lt;p&gt;⚠️ &lt;strong&gt;Security Notice&lt;/strong&gt;: The following rule allows SSH access from anywhere on the internet. For production environments, restrict this to your specific IP range or use AWS Systems Manager Session Manager for more secure access.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 authorize-security-group-ingress &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--group-id&lt;/span&gt; &lt;span class="nv"&gt;$ECS_SG_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; tcp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 22 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cidr&lt;/span&gt; 0.0.0.0/0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
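
&lt;p&gt;To follow the advice in the security notice, the rule can be limited to your current public IP address instead. This is a minimal sketch: &lt;code&gt;my_ip_cidr&lt;/code&gt; is a hypothetical helper name, and it assumes &lt;code&gt;https://checkip.amazonaws.com&lt;/code&gt; is reachable from your machine.&lt;/p&gt;

```shell
# Hypothetical helper: prints the caller's public IPv4 address as a /32 CIDR block.
# Fails without output when the address could not be determined.
my_ip_cidr() {
    ip=$(curl -s https://checkip.amazonaws.com | tr -d '[:space:]')
    [ -n "$ip" ] || return 1
    printf '%s/32\n' "$ip"
}
```

&lt;p&gt;The result can then replace the open range in the command above: &lt;code&gt;--cidr "$(my_ip_cidr)"&lt;/code&gt; in place of &lt;code&gt;--cidr 0.0.0.0/0&lt;/code&gt;. If your IP address changes, the rule has to be updated.&lt;/p&gt;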



&lt;h2&gt;
  
  
  Step 2: Prepare User Data Script
&lt;/h2&gt;

&lt;p&gt;Create a user data script that configures the ECS agent with GPU support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;&lt;span class="nv"&gt;ECS_CLUSTER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docling-ecs-cluster &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /etc/ecs/ecs.config
&lt;span class="nb"&gt;echo &lt;/span&gt;&lt;span class="nv"&gt;ECS_BACKEND_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://ecs.us-east-1.amazonaws.com &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /etc/ecs/ecs.config
&lt;span class="nb"&gt;echo &lt;/span&gt;&lt;span class="nv"&gt;ECS_ENABLE_GPU_SUPPORT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /etc/ecs/ecs.config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save this as &lt;code&gt;user-data.sh&lt;/code&gt; and encode it in base64:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
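&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Strip line breaks so the output can be pasted into the JSON as a single line&lt;/span&gt;
&lt;span class="nb"&gt;base64 &lt;/span&gt;user-data.sh | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt;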
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Get the Optimal GPU AMI
&lt;/h2&gt;

&lt;p&gt;AWS provides optimized AMIs for ECS with GPU support. Let's fetch the latest recommended AMI ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm get-parameters &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--names&lt;/span&gt; /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;aws configure get region&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
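
&lt;p&gt;The command above returns a JSON document that describes the AMI. If only the AMI ID is needed for the launch template, the &lt;code&gt;image_id&lt;/code&gt; sub-parameter can be queried directly. This is a minimal sketch; &lt;code&gt;get_gpu_ami_id&lt;/code&gt; is a hypothetical helper name, and the call uses whichever region your CLI is configured for.&lt;/p&gt;

```shell
# Hypothetical helper: prints the recommended ECS GPU-optimized AMI ID
# for the region the AWS CLI is currently configured to use.
get_gpu_ami_id() {
    aws ssm get-parameters \
        --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id \
        --query "Parameters[0].Value" \
        --output text
}
```

&lt;p&gt;For example, &lt;code&gt;GPU_AMI_ID=$(get_gpu_ami_id)&lt;/code&gt; stores the ID for use as the &lt;code&gt;ImageId&lt;/code&gt; in the next step. AMI IDs differ by region, so an ID copied from another region will not work.&lt;/p&gt;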



&lt;h2&gt;
  
  
  Step 4: Create Launch Template
&lt;/h2&gt;

&lt;p&gt;Launch Templates provide a way to store launch parameters so that you don't have to specify them every time you launch an instance. Create a JSON file called &lt;code&gt;ec2-launch-template.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ImageId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ami-0372b2cc554a36da2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"InstanceType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"g4dn.xlarge"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"KeyName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECS_Instance_Key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"IamInstanceProfile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ec2_instance_role-profile"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"NetworkInterfaces"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"AssociatePublicIpAddress"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"DeleteOnTermination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"DeviceIndex"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"SubnetId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;subnet id of public subnet&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Groups"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"value of $ECS_SG_ID"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UserData"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;replace_with_base64_encoded_user-data.sh&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"BlockDeviceMappings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"DeviceName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/dev/xvda"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Ebs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"VolumeSize"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"VolumeType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gp3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"DeleteOnTermination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Encrypted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"TagSpecifications"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ResourceType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"instance"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECS-Instance"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ResourceType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"volume"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECS-Instance-Volume"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Monitoring"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"MetadataOptions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"HttpTokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"HttpPutResponseHopLimit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"HttpEndpoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"enabled"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important Configuration Notes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Replace the &lt;code&gt;SubnetId&lt;/code&gt; with your actual public subnet ID from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Replace the &lt;code&gt;Groups&lt;/code&gt; array with your actual security group ID (the &lt;code&gt;$ECS_SG_ID&lt;/code&gt; we just created)&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;IamInstanceProfile&lt;/code&gt; name (&lt;code&gt;ec2_instance_role-profile&lt;/code&gt;) was created in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt; - make sure this matches your actual IAM instance profile name&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Key Configuration Highlights
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPU Instance Type&lt;/strong&gt;: &lt;code&gt;g4dn.xlarge&lt;/code&gt; provides NVIDIA T4 GPU support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EBS Encryption&lt;/strong&gt;: Enabled for data security&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detailed Monitoring&lt;/strong&gt;: Enabled (&lt;code&gt;Monitoring.Enabled: true&lt;/code&gt;), which publishes EC2 metrics to CloudWatch at 1-minute intervals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IMDSv2&lt;/strong&gt;: Enforced for improved instance metadata security&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GP3 Storage&lt;/strong&gt;: Latest generation EBS for better price/performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now create the launch template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;EC2_LAUNCH_TEMPLATE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 create-launch-template &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--launch-template-name&lt;/span&gt; DoclingLaunchTemplate &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=launch-template,Tags=[{Key=Name,Value=ECS_EC2_Launch_Template}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--launch-template-data&lt;/span&gt; file://ec2-launch-template.json &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"LaunchTemplate.LaunchTemplateId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Add additional tags&lt;/span&gt;
aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$EC2_LAUNCH_TEMPLATE_ID&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
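&lt;p&gt;To confirm the template was actually stored, a small check along these lines may help. This is a minimal sketch: the function name &lt;code&gt;show_launch_template&lt;/code&gt; is hypothetical, and it assumes &lt;code&gt;$EC2_LAUNCH_TEMPLATE_ID&lt;/code&gt; was populated by the command above.&lt;/p&gt;

```shell
# Hypothetical helper (not part of the original walkthrough): print the name
# and version numbers of a launch template, given its ID.
show_launch_template() {
    aws ec2 describe-launch-templates \
        --launch-template-ids "$1" \
        --query "LaunchTemplates[].[LaunchTemplateName,DefaultVersionNumber,LatestVersionNumber]" \
        --output text
}

# Usage: show_launch_template "$EC2_LAUNCH_TEMPLATE_ID"
```

&lt;p&gt;An empty result or an error here usually means the ID variable is empty, which is worth catching before the Auto Scaling Group refers to it.&lt;/p&gt;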



&lt;h2&gt;
  
  
  Step 5: Create ECS Cluster
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Create the ECS cluster before launching EC2 instances. The ECS agent on the instances needs an existing cluster to register with.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs create-cluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; docling-ecs-cluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,value&lt;span class="o"&gt;=&lt;/span&gt;doclingECSCluster

&lt;span class="c"&gt;# Get cluster ARN for tagging&lt;/span&gt;
&lt;span class="nv"&gt;DOCLING_CLUSTER_ARN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ecs describe-clusters &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--clusters&lt;/span&gt; docling-ecs-cluster &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"clusters[].clusterArn"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Add additional tags&lt;/span&gt;
aws ecs tag-resource &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="nv"&gt;$DOCLING_CLUSTER_ARN&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; file://cluster-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
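&lt;p&gt;The contents of &lt;code&gt;cluster-tags.json&lt;/code&gt; are not shown here, so as a hedged alternative the tags can be passed inline. The sketch below assumes the same &lt;code&gt;Purpose&lt;/code&gt; and &lt;code&gt;Environment&lt;/code&gt; values used elsewhere in this series; both function names are hypothetical. Note that ECS expects lowercase &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt;, unlike EC2.&lt;/p&gt;

```shell
# Sketch only: tag values below are assumptions that mirror the tagging
# strategy used in the rest of the series.
tag_cluster() {
    # ECS shorthand syntax uses lowercase key/value.
    aws ecs tag-resource \
        --resource-arn "$1" \
        --tags key=Purpose,value=DoclingSetup key=Environment,value=Dev
}

# Print the cluster status; a usable cluster reports ACTIVE.
cluster_status() {
    aws ecs describe-clusters \
        --clusters "$1" \
        --query "clusters[0].status" \
        --output text
}

# Usage:
#   tag_cluster "$DOCLING_CLUSTER_ARN"
#   cluster_status docling-ecs-cluster
```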



&lt;h2&gt;
  
  
  Step 6: Create Auto Scaling Group
&lt;/h2&gt;

&lt;p&gt;Create the Auto Scaling Group with minimum, maximum, and desired capacity all set to zero, so that no instances launch until we scale up in Step 7. This ASG uses the public subnet we configured in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws autoscaling create-auto-scaling-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--auto-scaling-group-name&lt;/span&gt; ECS_Asg &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--launch-template&lt;/span&gt; &lt;span class="nv"&gt;LaunchTemplateId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$EC2_LAUNCH_TEMPLATE_ID&lt;/span&gt;,Version&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'$Latest'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--min-size&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--max-size&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--desired-capacity&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-zone-identifier&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_SUBNET&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;ECS_AutoScaling
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;📋 &lt;strong&gt;Note&lt;/strong&gt;: The &lt;code&gt;$PUBLIC_SUBNET&lt;/code&gt; variable should contain the subnet ID from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;. If you need to retrieve your subnet ID, refer to the VPC setup section in Part 1.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Configure Auto Scaling Group Tags
&lt;/h3&gt;

&lt;p&gt;Create an &lt;code&gt;asg-tags.json&lt;/code&gt; file for propagating tags to launched instances:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ResourceId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECS_Asg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ResourceType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"auto-scaling-group"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Purpose"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DoclingSetup"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"PropagateAtLaunch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ResourceId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECS_Asg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ResourceType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"auto-scaling-group"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Environment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Dev"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"PropagateAtLaunch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws autoscaling create-or-update-tags &lt;span class="nt"&gt;--tags&lt;/span&gt; file://asg-tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
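&lt;p&gt;To check that the tags landed on the group, something like the following could be used. It is a minimal sketch; &lt;code&gt;list_asg_tags&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```shell
# Hypothetical helper: list the tags attached to one Auto Scaling Group,
# including whether each tag propagates to launched instances.
list_asg_tags() {
    aws autoscaling describe-tags \
        --filters "Name=auto-scaling-group,Values=$1" \
        --query "Tags[].[Key,Value,PropagateAtLaunch]" \
        --output table
}

# Usage: list_asg_tags ECS_Asg
```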



&lt;h2&gt;
  
  
  Step 7: Testing and Verification
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Launch an Instance
&lt;/h3&gt;

&lt;p&gt;Scale up the Auto Scaling Group to launch an instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws autoscaling update-auto-scaling-group &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--auto-scaling-group-name&lt;/span&gt; ECS_Asg &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--min-size&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--max-size&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--desired-capacity&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
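&lt;p&gt;GPU instances can take a few minutes to come up, so it may be handy to poll until the group reports an &lt;code&gt;InService&lt;/code&gt; instance. The sketch below is one way to do that; the function name, the 30-attempt limit, and the 10-second delay are all assumptions rather than values from this guide.&lt;/p&gt;

```shell
# Hypothetical helper: poll an Auto Scaling Group until at least one instance
# is InService, or give up after a fixed number of attempts.
wait_for_in_service() {
    asg_name="$1"
    attempts="${2:-30}"   # assumed default: 30 polls
    delay="${3:-10}"      # assumed default: 10 seconds between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        count=$(aws autoscaling describe-auto-scaling-groups \
            --auto-scaling-group-names "$asg_name" \
            --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
            --output text)
        if [ "$count" -ge 1 ]; then
            echo "InService instances: $count"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Timed out waiting for $asg_name"
    return 1
}

# Usage: wait_for_in_service ECS_Asg
```

&lt;p&gt;If the function times out, the scaling activities shown in the next section are the first place to look, since a failed launch is reported there.&lt;/p&gt;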



&lt;h3&gt;
  
  
  Monitor Scaling Activities
&lt;/h3&gt;

&lt;p&gt;Check the status of the launch activity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws autoscaling describe-scaling-activities &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--auto-scaling-group-name&lt;/span&gt; ECS_Asg &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'Activities[?StatusCode==`InProgress`]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verify Tag Propagation
&lt;/h3&gt;

&lt;p&gt;Confirm that tags from the ASG were propagated to the EC2 instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-instances &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="s2"&gt;"Name=tag-key,Values=Purpose"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
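&lt;p&gt;The command above returns the full instance description. For a narrower view, a sketch like this filters on the tag value and prints only the propagated tags. The helper name is hypothetical, and it assumes the &lt;code&gt;Purpose&lt;/code&gt; and &lt;code&gt;Environment&lt;/code&gt; keys from &lt;code&gt;asg-tags.json&lt;/code&gt;.&lt;/p&gt;

```shell
# Hypothetical helper: show instance IDs together with their Purpose and
# Environment tag values, for instances whose Purpose tag matches $1.
list_tagged_instances() {
    aws ec2 describe-instances \
        --filters "Name=tag:Purpose,Values=$1" \
        --query "Reservations[].Instances[].[InstanceId,Tags[?Key=='Purpose']|[0].Value,Tags[?Key=='Environment']|[0].Value]" \
        --output table
}

# Usage: list_tagged_instances DoclingSetup
```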



&lt;h3&gt;
  
  
  Get Instance IP Address
&lt;/h3&gt;

&lt;p&gt;Find the IP address of the launched EC2 instance for SSH access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-instances &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="s2"&gt;"Name=instance-state-name,Values=running"&lt;/span&gt; &lt;span class="s2"&gt;"Name=tag-key,Values=Purpose"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Reservations[*].Instances[*].[InstanceId,InstanceType,PrivateIpAddress,PublicIpAddress]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will display a table showing the Instance ID, Instance Type, Private IP Address, and Public IP Address of all running instances tagged with the "Purpose" key. Make note of the Public IP Address as you'll need it for SSH access in the next step.&lt;/p&gt;
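&lt;p&gt;If you would rather capture the address in a variable than copy it from the table, a sketch along these lines could work. It assumes a single running instance carries the &lt;code&gt;Purpose&lt;/code&gt; tag; &lt;code&gt;get_public_ip&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```shell
# Hypothetical helper: print the public IP of the first running instance
# that has a Purpose tag. With --output text, a missing address prints "None".
get_public_ip() {
    aws ec2 describe-instances \
        --filters "Name=instance-state-name,Values=running" "Name=tag-key,Values=Purpose" \
        --query "Reservations[0].Instances[0].PublicIpAddress" \
        --output text
}

# Usage: INSTANCE_PUBLIC_IP=$(get_public_ip)
```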

&lt;h3&gt;
  
  
  Verify ECS Agent
&lt;/h3&gt;

&lt;p&gt;SSH into the launched instance and check the ECS agent status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# SSH into the instance using the generated key&lt;/span&gt;
ssh &lt;span class="nt"&gt;-i&lt;/span&gt; ECSInstanceKey.pem ec2-user@&amp;lt;INSTANCE_PUBLIC_IP&amp;gt;

&lt;span class="c"&gt;# Check ECS agent status&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status ecs

&lt;span class="c"&gt;# Verify ECS agent container is running&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;docker ps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
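&lt;p&gt;While still on the instance, the ECS agent's local introspection endpoint can also report which cluster the agent believes it belongs to. The sketch below assumes the ECS-optimized AMI layout, where the cluster name lives in &lt;code&gt;/etc/ecs/ecs.config&lt;/code&gt;; both helper names are hypothetical.&lt;/p&gt;

```shell
# Run these on the EC2 instance itself, not on your workstation.

# Query the agent introspection API; the JSON reply includes the Cluster
# name, the ContainerInstanceArn, and the agent Version.
agent_metadata() {
    curl -s http://localhost:51678/v1/metadata
}

# Show the cluster name the agent was configured with. The default path is
# an assumption based on the ECS-optimized AMI.
configured_cluster() {
    grep '^ECS_CLUSTER=' "${1:-/etc/ecs/ecs.config}"
}

# Usage:
#   agent_metadata
#   configured_cluster
```

&lt;p&gt;If the configured cluster name does not match &lt;code&gt;docling-ecs-cluster&lt;/code&gt;, the agent registers elsewhere and the verification in the next section comes back empty.&lt;/p&gt;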



&lt;h3&gt;
  
  
  Verify Cluster Registration
&lt;/h3&gt;

&lt;p&gt;Check if the instance successfully registered with the ECS cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs list-container-instances &lt;span class="nt"&gt;--cluster&lt;/span&gt; docling-ecs-cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
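&lt;p&gt;&lt;code&gt;list-container-instances&lt;/code&gt; only returns ARNs. To see whether the agent is connected, those ARNs can be fed into &lt;code&gt;describe-container-instances&lt;/code&gt;, roughly as sketched here. The helper name is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: show instance ID, status, and agent connectivity for
# every container instance registered in a cluster.
describe_registered_instances() {
    cluster="$1"
    arns=$(aws ecs list-container-instances \
        --cluster "$cluster" \
        --query "containerInstanceArns[]" \
        --output text)
    if [ -z "$arns" ]; then
        echo "No container instances registered in $cluster"
        return 1
    fi
    # $arns is intentionally unquoted so each ARN becomes its own argument.
    aws ecs describe-container-instances \
        --cluster "$cluster" \
        --container-instances $arns \
        --query "containerInstances[].[ec2InstanceId,status,agentConnected]" \
        --output table
}

# Usage: describe_registered_instances docling-ecs-cluster
```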



&lt;h2&gt;
  
  
  Best Practices and Production Considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Security Enhancements
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Restrict SSH Access&lt;/strong&gt;: Replace &lt;code&gt;0.0.0.0/0&lt;/code&gt; with your specific IP range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use AWS Systems Manager&lt;/strong&gt;: Consider Session Manager for secure shell access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable VPC Flow Logs&lt;/strong&gt;: Monitor network traffic for security analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Secrets Manager&lt;/strong&gt;: Store sensitive configuration data securely&lt;/li&gt;
&lt;/ol&gt;
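&lt;p&gt;As an illustration of the first item, the SSH rule could be narrowed roughly like this. The sketch assumes the security group currently allows TCP port 22 from &lt;code&gt;0.0.0.0/0&lt;/code&gt; and that &lt;code&gt;$ECS_SG_ID&lt;/code&gt; is still set; &lt;code&gt;restrict_ssh&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```shell
# Hypothetical helper: allow SSH from one CIDR range, then remove the
# open-to-the-world rule. The narrow rule is added first so that access is
# never lost halfway through.
restrict_ssh() {
    sg_id="$1"
    cidr="$2"
    aws ec2 authorize-security-group-ingress \
        --group-id "$sg_id" --protocol tcp --port 22 --cidr "$cidr"
    aws ec2 revoke-security-group-ingress \
        --group-id "$sg_id" --protocol tcp --port 22 --cidr 0.0.0.0/0
}

# Usage (checkip.amazonaws.com returns your current public IP):
#   restrict_ssh "$ECS_SG_ID" "$(curl -s https://checkip.amazonaws.com)/32"
```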

&lt;h3&gt;
  
  
  Monitoring and Logging
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch Container Insights&lt;/strong&gt;: Enable for comprehensive ECS monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Metrics&lt;/strong&gt;: Set up custom CloudWatch metrics for your applications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log Aggregation&lt;/strong&gt;: Use CloudWatch Logs or a centralized logging solution&lt;/li&gt;
&lt;/ol&gt;
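&lt;p&gt;For the first item, Container Insights can be switched on for an existing cluster from the CLI. A minimal sketch, with a hypothetical helper name:&lt;/p&gt;

```shell
# Hypothetical helper: enable CloudWatch Container Insights on a cluster.
enable_container_insights() {
    aws ecs update-cluster-settings \
        --cluster "$1" \
        --settings name=containerInsights,value=enabled
}

# Usage: enable_container_insights docling-ecs-cluster
```

&lt;p&gt;Container Insights metrics are billed as CloudWatch custom metrics, so it is worth weighing against the cost goals in the next section.&lt;/p&gt;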

&lt;h3&gt;
  
  
  Cost Optimization
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Spot Instances&lt;/strong&gt;: Consider using Spot Instances for cost savings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed Instance Types&lt;/strong&gt;: Use multiple instance types in your ASG&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled Scaling&lt;/strong&gt;: Implement time-based scaling policies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ECS Capacity Providers&lt;/strong&gt;: Use for automatic scaling based on resource utilization&lt;/li&gt;
&lt;/ol&gt;
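&lt;p&gt;Scheduled scaling is often the quickest win for a GPU instance that only needs to run during working hours. The sketch below scales the group to zero every day at 20:00 UTC; the schedule, the action name, and the helper name are assumptions chosen for illustration.&lt;/p&gt;

```shell
# Hypothetical helper: create a recurring scheduled action that scales an
# Auto Scaling Group down to zero. The cron expression is evaluated in UTC
# unless a time zone is specified.
schedule_nightly_scale_down() {
    aws autoscaling put-scheduled-update-group-action \
        --auto-scaling-group-name "$1" \
        --scheduled-action-name "${2:-NightlyScaleToZero}" \
        --recurrence "0 20 * * *" \
        --min-size 0 --max-size 0 --desired-capacity 0
}

# Usage: schedule_nightly_scale_down ECS_Asg
```

&lt;p&gt;A matching morning action with capacity 1 would bring the instance back; without it, the group stays at zero until you scale it up by hand.&lt;/p&gt;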

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You've successfully created a scalable, GPU-enabled ECS infrastructure using Auto Scaling Groups and Launch Templates. This setup builds upon the foundational networking and IAM infrastructure we established in &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt; and provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instance capacity managed by an Auto Scaling Group, ready for scaling policies&lt;/li&gt;
&lt;li&gt;Consistent instance configuration through Launch Templates&lt;/li&gt;
&lt;li&gt;Proper tagging for resource management and cost allocation&lt;/li&gt;
&lt;li&gt;GPU support for compute-intensive workloads&lt;/li&gt;
&lt;li&gt;Automatic replacement of unhealthy instances by the Auto Scaling Group&lt;/li&gt;
&lt;li&gt;Integration with the VPC and security architecture from Part 1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The infrastructure is now ready to host containerized applications that require GPU processing power, with the flexibility to scale automatically based on your workload demands.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;In Part 3 of this series, we'll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating and deploying ECS Task Definitions&lt;/li&gt;
&lt;li&gt;Setting up ECS Services for your applications&lt;/li&gt;
&lt;li&gt;Implementing Application Load Balancer for traffic distribution&lt;/li&gt;
&lt;li&gt;Advanced ECS configurations and monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay tuned for the final part where we'll bring everything together with actual application deployment!&lt;/p&gt;

&lt;h2&gt;
  
  
  Series Navigation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1: Foundation - Networking &amp;amp; IAM&lt;/a&gt;&lt;/strong&gt; - VPC setup, subnets, security groups, and IAM roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2: ECS EC2 with Auto Scaling (Current)&lt;/strong&gt; - Launch templates, Auto Scaling Groups, and ECS cluster setup&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3: &lt;a href="https://dev.to/bikash119/deploying-docling-application-on-ecs-with-application-load-balancer-59k3"&gt;Application Deployment&lt;/a&gt;&lt;/strong&gt; - Task definitions, services, and load balancers&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Consider implementing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ECS Services and Task Definitions for your applications&lt;/li&gt;
&lt;li&gt;Application Load Balancer for distributing traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This foundation, combined with the networking setup from &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Part 1&lt;/a&gt;, provides a solid starting point for &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;deploying docling to AWS ECS&lt;/a&gt; as a containerized application with GPU support.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ecs</category>
      <category>ec2</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Deploy docling to AWS ECS: A Complete Guide</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Sun, 07 Sep 2025 12:37:31 +0000</pubDate>
      <link>https://forem.com/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69</link>
      <guid>https://forem.com/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69</guid>
      <description>&lt;p&gt;I want to share key insights from my journey deploying Docling to AWS ECS (Elastic Container Service). Rather than overwhelming you with one massive tutorial, I've structured this as a three-part series that mirrors the logical flow of AWS infrastructure setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Series Overview
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Part 1&lt;/strong&gt;: &lt;strong&gt;&lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Foundation – Networking &amp;amp; IAM&lt;/a&gt;&lt;/strong&gt; &lt;br&gt;
Setting up the essential building blocks: VPC configuration, security groups, and IAM roles. Since networking and permissions form the backbone of any AWS deployment, we'll establish these fundamentals first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 2&lt;/strong&gt;: &lt;strong&gt;&lt;a href="https://dev.to/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l"&gt;Setting up the Workhorse – EC2&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
Configuring the compute infrastructure that will run our containers: Auto Scaling Groups (ASG), EC2 launch templates, and instance profiles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 3&lt;/strong&gt;: &lt;strong&gt;&lt;a href="https://dev.to/bikash119/deploying-docling-application-on-ecs-with-application-load-balancer-59k3"&gt;ECS &amp;amp; Application Load Balancer&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
Creating ECS tasks and services, then configuring an Application Load Balancer (ALB) to provide external access to our Docling deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Structure?
&lt;/h2&gt;

&lt;p&gt;Cloud deployments require significant upfront investment in networking and security configuration. By tackling these foundational elements first, we create a solid platform for our application infrastructure. The EC2 layer serves as our container runtime environment, while ECS orchestrates our application containers on top of this foundation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;This tutorial assumes you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic familiarity with AWS concepts and services&lt;/li&gt;
&lt;li&gt;An active AWS account with appropriate permissions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html" rel="noopener noreferrer"&gt;AWS CLI&lt;/a&gt; installed and configured on your local machine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you need to set up the AWS CLI, follow the installation and configuration steps in the AWS documentation. If you are comfortable taking a calculated risk and want to get started quickly, you can use your root user access key and secret for initial setup and learning purposes, although a dedicated IAM user with limited permissions is the safer choice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You'll Build
&lt;/h2&gt;

&lt;p&gt;By the end of this series, you'll have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A production-ready VPC with proper security groups&lt;/li&gt;
&lt;li&gt;Auto-scaled EC2 instances configured for container workloads&lt;/li&gt;
&lt;li&gt;ECS services running Docling containers&lt;/li&gt;
&lt;li&gt;Load-balanced external access to your deployment&lt;/li&gt;
&lt;li&gt;A deep understanding of how these AWS services work together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's begin building our &lt;a href="https://github.com/docling-project/docling" rel="noopener noreferrer"&gt;docling&lt;/a&gt; deployment infrastructure starting with &lt;a href="https://dev.to/bikash119/foundation-networking-iam-1pc9"&gt;Networking and IAM roles&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>aidev</category>
      <category>docling</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Foundation – Networking &amp; IAM</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Sun, 07 Sep 2025 12:35:13 +0000</pubDate>
      <link>https://forem.com/bikash119/foundation-networking-iam-1pc9</link>
      <guid>https://forem.com/bikash119/foundation-networking-iam-1pc9</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 1 of a 3-part series on &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;Deploy Docling to AWS ECS infrastructure&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS VPC Infrastructure Setup Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;This guide provides step-by-step instructions for creating a secure Virtual Private Cloud (VPC) infrastructure on AWS using the AWS CLI. A VPC enables you to provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html" rel="noopener noreferrer"&gt;AWS CLI&lt;/a&gt; installed and configured with appropriate credentials&lt;/li&gt;
&lt;li&gt;Basic understanding of networking concepts (CIDR blocks, subnets)&lt;/li&gt;
&lt;li&gt;Appropriate IAM permissions for VPC and EC2 operations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Resource Tagging Strategy
&lt;/h3&gt;

&lt;p&gt;Throughout this tutorial, we implement a consistent tagging strategy for all resources. This approach ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Easy resource identification and tracking&lt;/li&gt;
&lt;li&gt;Simplified cost allocation&lt;/li&gt;
&lt;li&gt;Better resource organization and management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Standard tags used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Name&lt;/code&gt;: Descriptive identifier for the resource&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Environment&lt;/code&gt;: Development stage (e.g., Dev, Staging, Production)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Purpose&lt;/code&gt;: Project or application identifier&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1. Virtual Private Cloud (VPC) Setup
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Create the VPC
&lt;/h4&gt;

&lt;p&gt;Create a VPC with a /16 CIDR block, which provides 65,536 IP addresses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-vpc &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cidr-block&lt;/span&gt; 10.0.0.0/16 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=vpc,Tags=[{Key=Name,Value=DoclingVPC},{Key=Environment,Value=Dev}]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The CIDR block &lt;code&gt;10.0.0.0/16&lt;/code&gt; means all resources within this VPC will have IP addresses starting with &lt;code&gt;10.0.x.x&lt;/code&gt;, where the first two octets remain constant.&lt;/p&gt;
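&lt;p&gt;The "first two octets stay constant" rule gives a quick sanity check for any subnet range you plan to carve out of this VPC. The sketch below is pure shell string handling with hypothetical function names; it only covers the /16 case described above and is not a general CIDR parser.&lt;/p&gt;

```shell
# Print the first two octets of an IPv4 CIDR string such as 10.0.3.0/24.
first_two_octets() {
    addr="${1%/*}"        # drop the /prefix part
    echo "${addr%.*.*}"   # drop the last two octets
}

# Succeed when two CIDR strings share their first two octets, which is what
# membership in a /16 block requires.
same_slash16() {
    [ "$(first_two_octets "$1")" = "$(first_two_octets "$2")" ]
}

# Usage:
#   same_slash16 10.0.0.0/16 10.0.3.0/24    # succeeds: inside the VPC range
#   same_slash16 10.0.0.0/16 10.1.3.0/24    # fails: outside the VPC range
```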

&lt;h4&gt;
  
  
  Store VPC ID for Future Reference
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;VPC_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 describe-vpcs &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag-key,Values&lt;span class="o"&gt;=&lt;/span&gt;Name &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Vpcs[?Tags[?Key=='Name' &amp;amp;&amp;amp; Value=='DoclingVPC']].VpcId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Apply Additional Tags
&lt;/h4&gt;

&lt;p&gt;Create a &lt;code&gt;tags.json&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Purpose"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DoclingSetup"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the tags to the VPC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify VPC Creation
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-vpcs &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag-key,Values&lt;span class="o"&gt;=&lt;/span&gt;Purpose &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Vpcs[].[VpcId,CidrBlock,State]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Subnet Configuration
&lt;/h3&gt;

&lt;p&gt;Subnets segment your VPC into smaller networks. We'll create both public and private subnets following AWS best practices.&lt;/p&gt;

&lt;h4&gt;
  
  
  Create Private Subnet
&lt;/h4&gt;

&lt;p&gt;Deploy a private subnet in availability zone &lt;code&gt;us-east-1b&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-subnet &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cidr-block&lt;/span&gt; 10.0.4.0/24 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--availability-zone&lt;/span&gt; us-east-1b &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=subnet,Tags=[{Key=Name,Value=PrivateSubnet-1b}]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Create Public Subnet
&lt;/h4&gt;

&lt;p&gt;Deploy a public subnet in availability zone &lt;code&gt;us-east-1a&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-subnet &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--cidr-block&lt;/span&gt; 10.0.3.0/24 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--availability-zone&lt;/span&gt; us-east-1a &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=subnet,Tags=[{Key=Name,Value=PublicSubnet-1a}]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Each /24 subnet contains 256 IP addresses. AWS reserves 5 of them in every subnet for internal use, which leaves 251 usable addresses.&lt;/p&gt;
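
&lt;p&gt;The same arithmetic applies to any prefix length. The snippet below is a small sketch of it in plain shell; &lt;code&gt;PREFIX&lt;/code&gt;, &lt;code&gt;HOST_BITS&lt;/code&gt;, &lt;code&gt;TOTAL_IPS&lt;/code&gt;, and &lt;code&gt;USABLE_IPS&lt;/code&gt; are illustrative names, not variables used elsewhere in this guide.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Illustrative only: the variable names below are made up for this example
PREFIX=24                      # prefix length of the subnet CIDR, e.g. 10.0.3.0/24
HOST_BITS=$((32 - PREFIX))     # bits left for host addresses

# Compute 2^HOST_BITS with a loop so the snippet also runs in a plain POSIX shell
TOTAL_IPS=1
i=0
while [ "$i" -lt "$HOST_BITS" ]; do
    TOTAL_IPS=$((TOTAL_IPS * 2))
    i=$((i + 1))
done

USABLE_IPS=$((TOTAL_IPS - 5))  # AWS reserves 5 addresses in every subnet
echo "/$PREFIX subnet: $TOTAL_IPS addresses, $USABLE_IPS usable"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting &lt;code&gt;PREFIX=28&lt;/code&gt;, the smallest subnet size AWS allows, gives 16 addresses of which 11 are usable.&lt;/p&gt;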

&lt;h4&gt;
  
  
  Retrieve Subnet IDs
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get Private Subnet ID&lt;/span&gt;
&lt;span class="nv"&gt;PRIVATE_SUBNET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 describe-subnets &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag:Name,Values&lt;span class="o"&gt;=&lt;/span&gt;PrivateSubnet-1b &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Subnets[0].SubnetId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Get Public Subnet ID&lt;/span&gt;
&lt;span class="nv"&gt;PUBLIC_SUBNET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 describe-subnets &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag:Name,Values&lt;span class="o"&gt;=&lt;/span&gt;PublicSubnet-1a &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Subnets[0].SubnetId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Apply Tags to Subnets
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_SUBNET&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$PRIVATE_SUBNET&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify Subnet Configuration
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-subnets &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag-key,Values&lt;span class="o"&gt;=&lt;/span&gt;Purpose &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Subnets[].[SubnetId,CidrBlock,AvailabilityZone]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Internet Gateway Configuration
&lt;/h3&gt;

&lt;p&gt;An Internet Gateway (IGW) enables communication between your VPC and the internet.&lt;/p&gt;

&lt;h4&gt;
  
  
  Create Internet Gateway
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;IGW_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 create-internet-gateway &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tag-specifications&lt;/span&gt; &lt;span class="s1"&gt;'ResourceType=internet-gateway,Tags=[{Key=Name,Value=DoclingIGW}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"InternetGateway.InternetGatewayId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Apply Tags
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-tags &lt;span class="nt"&gt;--resources&lt;/span&gt; &lt;span class="nv"&gt;$IGW_ID&lt;/span&gt; &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Attach Internet Gateway to VPC
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 attach-internet-gateway &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--internet-gateway-id&lt;/span&gt; &lt;span class="nv"&gt;$IGW_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify Internet Gateway
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-internet-gateways &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--filters&lt;/span&gt; &lt;span class="nv"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tag-key,Values&lt;span class="o"&gt;=&lt;/span&gt;Purpose &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"InternetGateways[].[InternetGatewayId,Attachments[0].VpcId,Attachments[0].State]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Route Table Configuration
&lt;/h3&gt;

&lt;p&gt;Route tables determine where network traffic is directed. AWS creates a main route table automatically with each VPC, but it is better practice to create custom route tables so that internet routes exist only where you intend them.&lt;/p&gt;

&lt;h4&gt;
  
  
  Create Public Route Table
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;PUBLIC_ROUTE_TABLE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 create-route-table &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--vpc-id&lt;/span&gt; &lt;span class="nv"&gt;$VPC_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"RouteTable.RouteTableId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Add Internet Route
&lt;/h4&gt;

&lt;p&gt;Configure the route table to direct internet-bound traffic through the Internet Gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 create-route &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--route-table-id&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_ROUTE_TABLE_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--destination-cidr-block&lt;/span&gt; 0.0.0.0/0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--gateway-id&lt;/span&gt; &lt;span class="nv"&gt;$IGW_ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Associate Public Subnet with Route Table
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;PUBLIC_SUBNET_ASSOCIATION_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws ec2 associate-route-table &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--route-table-id&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_ROUTE_TABLE_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--subnet-id&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_SUBNET&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"AssociationId"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify Route Table Configuration
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ec2 describe-route-tables &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--route-table-ids&lt;/span&gt; &lt;span class="nv"&gt;$PUBLIC_ROUTE_TABLE_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"RouteTables[].[RouteTableId,Routes[].DestinationCidrBlock,Associations[].SubnetId]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Security Best Practices
&lt;/h3&gt;

&lt;ol&gt;
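&lt;li&gt;&lt;p&gt;&lt;strong&gt;Main Route Table&lt;/strong&gt;: Never add internet routes to the main route table. Subnets without an explicit route table association fall back to the main table, so keeping it free of internet routes ensures new subnets don't accidentally become public.&lt;/p&gt;&lt;/li&gt;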
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Subnet Isolation&lt;/strong&gt;: Keep private subnets truly private by not associating them with route tables containing internet gateway routes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-AZ Deployment&lt;/strong&gt;: For production environments, deploy subnets across multiple availability zones for high availability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Network ACLs&lt;/strong&gt;: Consider implementing Network Access Control Lists for additional subnet-level security.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
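
&lt;p&gt;The subnet isolation guideline can be made explicit by giving the private subnet its own route table with no internet route. The commands below are a minimal sketch and not part of the original walkthrough: &lt;code&gt;PRIVATE_ROUTE_TABLE_ID&lt;/code&gt; is a name introduced here for illustration, while &lt;code&gt;$VPC_ID&lt;/code&gt; and &lt;code&gt;$PRIVATE_SUBNET&lt;/code&gt; are assumed to still be set from the earlier steps.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Sketch only: PRIVATE_ROUTE_TABLE_ID is a hypothetical variable name
PRIVATE_ROUTE_TABLE_ID=$(aws ec2 create-route-table \
    --vpc-id $VPC_ID \
    --query "RouteTable.RouteTableId" \
    --output text)

# Associate the private subnet; no 0.0.0.0/0 route is added, so only the
# default local route for traffic inside the VPC applies
aws ec2 associate-route-table \
    --route-table-id $PRIVATE_ROUTE_TABLE_ID \
    --subnet-id $PRIVATE_SUBNET

# Confirm the routes: the only target listed should be "local"
aws ec2 describe-route-tables \
    --route-table-ids $PRIVATE_ROUTE_TABLE_ID \
    --query "RouteTables[].Routes[].[DestinationCidrBlock,GatewayId]" \
    --output table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;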

&lt;h2&gt;
  
  
  AWS IAM Roles and Instance Profiles
&lt;/h2&gt;

&lt;p&gt;IAM roles are built on three fundamental components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Trust Relationship&lt;/strong&gt; - Defines which entities can assume a role&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permissions&lt;/strong&gt; - Specifies what actions the role can perform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporary Credentials&lt;/strong&gt; - Provides time-limited credentials to the entity that assumes the role&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  IAM Implementation Process
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create Trust Policy&lt;/strong&gt; - Define which AWS services can assume the role&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create Role&lt;/strong&gt; - Establish the role with the trust policy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attach Permissions&lt;/strong&gt; - Grant specific permissions to the role&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create Instance Profile&lt;/strong&gt; - Package the role for EC2 instances (when needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Associate Resources&lt;/strong&gt; - Link the role or instance profile to AWS resources&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;For a deeper look at these concepts, see &lt;a href="https://dev.to/bikash119/iam-roles-instance-profiles-4fkd"&gt;IAM Roles &amp;amp; Instance Profiles&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  EC2 IAM Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Trust Policy for EC2
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ec2.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
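
&lt;p&gt;The role creation command in the next step reads this policy from &lt;code&gt;ec2-trust-policy.json&lt;/code&gt;. One way to create that file is sketched below; using &lt;code&gt;printf&lt;/code&gt; with &lt;code&gt;tee&lt;/code&gt; is just a convenience chosen here, and the JSON check assumes &lt;code&gt;python3&lt;/code&gt; is available on your machine.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Write the trust policy shown above to the file name that create-role expects
printf '%s\n' \
    '{' \
    '  "Version": "2012-10-17",' \
    '  "Statement": [' \
    '    {' \
    '      "Effect": "Allow",' \
    '      "Principal": { "Service": "ec2.amazonaws.com" },' \
    '      "Action": "sts:AssumeRole"' \
    '    }' \
    '  ]' \
    '}' | tee ec2-trust-policy.json

# Optional sanity check (assumes python3 is installed): fails on invalid JSON
python3 -m json.tool ec2-trust-policy.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The ECS trust policy later in this guide can be saved the same way as &lt;code&gt;ecs-trust-policy.json&lt;/code&gt;, with &lt;code&gt;ecs-tasks.amazonaws.com&lt;/code&gt; as the service principal.&lt;/p&gt;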



&lt;h3&gt;
  
  
  Create EC2 Instance Role
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create the role&lt;/span&gt;
aws iam create-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ec2_instance_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--assume-role-policy-document&lt;/span&gt; file://ec2-trust-policy.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;EC2InstanceRole

&lt;span class="c"&gt;# Add additional tags&lt;/span&gt;
aws iam tag-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ec2_instance_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json

&lt;span class="c"&gt;# Attach managed policy&lt;/span&gt;
aws iam attach-role-policy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ec2_instance_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create Instance Profile
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create instance profile&lt;/span&gt;
aws iam create-instance-profile &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-profile-name&lt;/span&gt; ec2_instance_role-profile &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;EC2InstanceRoleProfile

&lt;span class="c"&gt;# Add tags to instance profile&lt;/span&gt;
aws iam tag-instance-profile &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-profile-name&lt;/span&gt; ec2_instance_role-profile &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json

&lt;span class="c"&gt;# Associate role with instance profile&lt;/span&gt;
aws iam add-role-to-instance-profile &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ec2_instance_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--instance-profile-name&lt;/span&gt; ec2_instance_role-profile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
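
&lt;p&gt;The networking steps each end with a verification command, and the same habit is useful here. As a hedged sketch (these checks are an addition, not part of the original steps), the following read-only calls should list the role inside the profile and the managed policy attached to the role:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Show which role the instance profile contains (expected: ec2_instance_role)
aws iam get-instance-profile \
    --instance-profile-name ec2_instance_role-profile \
    --query "InstanceProfile.Roles[].RoleName" \
    --output text

# Show the managed policies attached to the role
aws iam list-attached-role-policies \
    --role-name ec2_instance_role \
    --query "AttachedPolicies[].PolicyName" \
    --output text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;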






&lt;h2&gt;
  
  
  ECS IAM Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Trust Policy for ECS Tasks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ecs-tasks.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECS Task Execution Role
&lt;/h3&gt;

&lt;p&gt;The task execution role is required for ECS to pull container images and write logs on your behalf.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create ECS task execution role&lt;/span&gt;
&lt;span class="nv"&gt;ECS_TASK_EXEC_ROLE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws iam create-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ecs_task_exec_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--assume-role-policy-document&lt;/span&gt; file://ecs-trust-policy.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;ECSTaskExecutionRole &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Role.RoleName"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Add tags&lt;/span&gt;
aws iam tag-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="nv"&gt;$ECS_TASK_EXEC_ROLE&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json

&lt;span class="c"&gt;# Attach AWS managed policy for task execution&lt;/span&gt;
aws iam attach-role-policy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="nv"&gt;$ECS_TASK_EXEC_ROLE&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECS Task Role (Optional)
&lt;/h3&gt;

&lt;p&gt;The task role provides permissions for your application code running inside the container.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: This role can be omitted if your application doesn't need AWS API access. Test without it first to determine if it's necessary.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create ECS task role (if needed)&lt;/span&gt;
&lt;span class="nv"&gt;ECS_ROLE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws iam create-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; ecs_task_role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--assume-role-policy-document&lt;/span&gt; file://ecs-trust-policy.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Name,Value&lt;span class="o"&gt;=&lt;/span&gt;ECSTaskRole &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Role.RoleName"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Add tags&lt;/span&gt;
aws iam tag-role &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-name&lt;/span&gt; &lt;span class="nv"&gt;$ECS_ROLE&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cli-input-json&lt;/span&gt; file://tags.json

&lt;span class="c"&gt;# Attach application-specific policies as needed&lt;/span&gt;
&lt;span class="c"&gt;# aws iam attach-role-policy --role-name $ECS_ROLE --policy-arn &amp;lt;policy-arn&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;You now have the networking and IAM foundation for the AWS ECS deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next Steps:&lt;/strong&gt; In the &lt;a href="https://dev.to/bikash119/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates-569l"&gt;next part&lt;/a&gt; of this series, we'll configure the EC2 infrastructure that will use these roles to run your containerized applications.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>iam</category>
      <category>vpc</category>
      <category>ecs</category>
    </item>
    <item>
      <title>IAM Roles &amp; Instance Profiles</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Sun, 07 Sep 2025 12:25:17 +0000</pubDate>
      <link>https://forem.com/bikash119/iam-roles-instance-profiles-4fkd</link>
      <guid>https://forem.com/bikash119/iam-roles-instance-profiles-4fkd</guid>
      <description>&lt;p&gt;💡 &lt;strong&gt;Understanding the Concepts&lt;/strong&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding IAM Through a Real-World Analogy
&lt;/h3&gt;

&lt;p&gt;Imagine a FedEx delivery to a house inside a gated community. This scenario perfectly illustrates how AWS IAM works:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Trust Policy - The Community Protocol
&lt;/h4&gt;

&lt;p&gt;The gated community management creates a protocol stating: "Only verified FedEx employees can be assigned the role of package delivery within our community." This is your &lt;strong&gt;Trust Policy&lt;/strong&gt; - it defines which entities (FedEx employees = AWS services like EC2, ECS) are trusted to assume a specific role.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Role - The Delivery Role
&lt;/h4&gt;

&lt;p&gt;Using this protocol, the security team at the gate creates a "Package Delivery Role." This role exists as a defined position with specific responsibilities, but no one holds it permanently. This is your &lt;strong&gt;IAM Role&lt;/strong&gt; - a set of permissions that can be temporarily assumed by trusted entities.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Permissions - What the Role Can Do
&lt;/h4&gt;

&lt;p&gt;The role comes with specific permissions: "Can access all residential areas, cannot access the car charging stations, must use designated parking spots." These rules define what anyone with this role can or cannot do within the community. This represents your &lt;strong&gt;Role Policies&lt;/strong&gt; - the specific AWS permissions attached to the role.&lt;/p&gt;

&lt;h4&gt;
  
  
  4. Instance Profile - The Badge Vending Machine
&lt;/h4&gt;

&lt;p&gt;When a FedEx driver arrives, security verifies their identity and issues a temporary access badge with all the delivery permissions encoded into it. Once the delivery is complete, the badge expires automatically, and the driver loses all community access privileges upon exit. This is your &lt;strong&gt;Instance Profile&lt;/strong&gt; - the container that attaches a role to an EC2 instance so the instance can obtain temporary credentials for it, ensuring access is time-limited and expires automatically.&lt;/p&gt;

&lt;p&gt;This system ensures that even legitimate delivery personnel only have access for exactly as long as needed, with precisely the permissions required for their task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Instance Profiles? EC2 vs ECS Role Assignment
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;EC2 Instances Have Their Own Job in ECS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When an EC2 instance joins an ECS cluster, it becomes a worker node with specific responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Register itself with the ECS service&lt;/li&gt;
&lt;li&gt;Report available resources (CPU, memory, storage)&lt;/li&gt;
&lt;li&gt;Download and start containers as instructed by ECS&lt;/li&gt;
&lt;li&gt;Monitor container health and report status&lt;/li&gt;
&lt;li&gt;Manage container lifecycle events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To perform these cluster management tasks, the EC2 instance needs the &lt;code&gt;AmazonEC2ContainerServiceforEC2Role&lt;/code&gt; policy. Since EC2 is a general-purpose compute service, it uses &lt;strong&gt;instance profiles&lt;/strong&gt; to securely deliver these credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ECS Tasks Assume Roles Directly&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ECS tasks, however, don't need instance profiles because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They're purpose-built containerized workloads with specific, known requirements&lt;/li&gt;
&lt;li&gt;ECS can directly inject appropriate credentials into containers&lt;/li&gt;
&lt;li&gt;The ECS service handles credential management natively&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Two Separate Layers of Permissions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think of it as two distinct jobs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure Layer (EC2 + Instance Profile)&lt;/strong&gt;: "I'm a worker node that needs to communicate with the ECS control plane"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Layer (ECS Tasks + Task Roles)&lt;/strong&gt;: "I'm a specific application that needs to access AWS services"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This separation allows the same EC2 instance to host multiple different ECS tasks, each with its own specific permissions, while the instance itself keeps its cluster management permissions.&lt;/p&gt;
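
&lt;p&gt;The two layers also differ in where credentials are picked up at runtime. The sketch below is illustrative and only works from inside the respective environment: the first part assumes a shell on an EC2 instance with an instance profile and IMDSv2 available, the second a shell inside a container of an ECS task that has a task role. &lt;code&gt;TOKEN&lt;/code&gt; is a variable name chosen here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# On the EC2 instance: request an IMDSv2 session token, then list the role
# whose temporary credentials the instance profile exposes
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

# Inside an ECS task container: ECS sets AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
# and serves the task role credentials from its own endpoint
curl -s "http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In practice the AWS SDKs and CLI query these endpoints automatically, so application code rarely calls them directly.&lt;/p&gt;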

</description>
    </item>
    <item>
      <title>Equity Fundamental Octo Researcher: Next-Gen Stock Research</title>
      <dc:creator>bikash119</dc:creator>
      <pubDate>Sat, 30 Aug 2025 05:26:40 +0000</pubDate>
      <link>https://forem.com/bikash119/equity-fundamental-octo-researcher-next-gen-stock-research-3a4n</link>
      <guid>https://forem.com/bikash119/equity-fundamental-octo-researcher-next-gen-stock-research-3a4n</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/brightdata-n8n-2025-08-13"&gt;AI Agents Challenge powered by n8n and Bright Data&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I created an AI-powered assistant to streamline fundamental analysis for investing in the Indian stock market. Investors typically spend hours reviewing annual reports, credit ratings, quarterly results, and transcripts to make investment decisions, a process that is slow and overwhelming.&lt;/p&gt;

&lt;p&gt;My solution uses Large Language Models (LLMs) to analyze and summarize these documents, enabling users to ask questions and instantly receive targeted, contextual answers. This tool dramatically accelerates research, reduces manual effort, and helps analysts focus on insights, not information overload.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/PerCVbjeXU0"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  n8n Workflow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/bikash119/ef67e092130b7383702e5b51df867514" rel="noopener noreferrer"&gt;Equity Fundamental Octo Researcher n8n Workflow&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Equity Fundamental Octo Researcher&lt;/strong&gt; workflow automates the extraction, processing, and retrieval of fundamental stock data and reports from Screener.in. It enables interactive fundamental analysis by combining web scraping, automated document parsing (using &lt;a href="https://github.com/docling-project/docling" rel="noopener noreferrer"&gt;docling-project&lt;/a&gt; by IBM), vector storage (Pinecone), and leading LLMs for Q&amp;amp;A.&lt;/p&gt;




&lt;h3&gt;
  
  
  Architecture/Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Form Trigger&lt;/strong&gt;: Accepts Screener.in company URLs from the user.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BrightData HTTP Nodes&lt;/strong&gt;: Automate scraping of key disclosure documents (annual reports, transcripts, presentations, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermediary Storage &amp;amp; Extraction&lt;/strong&gt;: Handles job tracking and discovery of report/document URLs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docling OSS&lt;/strong&gt;: All document parsing, text extraction, and conversion (from PDF, HTML, image, etc.) are performed through the open-source &lt;strong&gt;docling-project&lt;/strong&gt; library, orchestrated via a custom Gradio API deployed to &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;AWS ECS&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Async/Task Management&lt;/strong&gt;: Manages and monitors asynchronous parsing with job/event IDs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding and Vector Database&lt;/strong&gt;: Converts parsed content into OpenAI embeddings and loads them into Pinecone for fast semantic search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM-Powered Q&amp;amp;A Agent&lt;/strong&gt;: Uses Anthropic Claude for question/answer generation, with context retrieved from Pinecone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;n8n Chat Trigger&lt;/strong&gt;: Interface for handling user questions about the company or its documents.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Execution Flow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;User Input&lt;/strong&gt;
User submits a company URL from Screener.in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Scraping with BrightData&lt;/strong&gt;
The workflow triggers a BrightData scraper run to find and collect links to all relevant fundamental documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extract and Track Results&lt;/strong&gt;
Extraction nodes gather the discovered URLs and manage job progress.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Processing (Docling OSS)&lt;/strong&gt;
Every collected URL is processed by docling (invoked via a Gradio-based API deployed to AWS). &lt;a href="https://github.com/docling-project/docling" rel="noopener noreferrer"&gt;docling-project&lt;/a&gt; extracts, parses, and converts source documents (PDFs, scanned images, and web pages) into clean markdown/text suitable for downstream NLP.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Async Task Management&lt;/strong&gt;
The workflow creates, tracks, and retrieves outputs from asynchronous Docling processing jobs using event/task IDs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Embeddings and Storage&lt;/strong&gt;
Extracted text is transformed into embeddings using OpenAI and loaded into Pinecone for retrieval-augmented analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Chat/Q&amp;amp;A&lt;/strong&gt;
The user interacts via chat; queries are answered by the Anthropic Claude LLM using context retrieved via semantic search from the Pinecone vector store.&lt;/li&gt;
&lt;/ol&gt;
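
&lt;p&gt;The async task management step above comes down to a polling loop: submit a job, keep its ID, and check back until the result is ready. The sketch below shows that pattern in Python rather than as n8n nodes. The &lt;code&gt;get_status&lt;/code&gt; callable and the &lt;code&gt;state&lt;/code&gt; / &lt;code&gt;result&lt;/code&gt; field names are hypothetical stand-ins for whatever the deployed API actually returns.&lt;/p&gt;

```python
import time


def wait_for_job(get_status, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job finishes or the timeout passes.

    get_status is a hypothetical callable (an assumption, not the real API)
    that returns a dict such as:
        {"state": "pending"} or {"state": "success", "result": "..."}
    """
    waited = 0.0
    while True:
        status = get_status(job_id)
        state = status.get("state")
        if state == "success":
            return status.get("result")
        if state == "failure":
            raise RuntimeError(f"job {job_id} failed")
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {state} after {timeout}s")
        # Wait before asking again so the parsing service is not flooded.
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Simulated job that becomes ready on the third status check.
    responses = iter([
        {"state": "pending"},
        {"state": "pending"},
        {"state": "success", "result": "# Annual Report"},
    ])
    print(wait_for_job(lambda job_id: next(responses), "job-1", interval=0.1))
```

&lt;p&gt;Passing the status function in as a parameter keeps the loop independent of the transport, so the same logic would apply whether the status comes from an HTTP request or from a Gradio client call.&lt;/p&gt;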




&lt;h3&gt;
  
  
  Integrations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;BrightData API&lt;/strong&gt;: Web scraping for financial documents.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;n8n&lt;/strong&gt;: Workflow automation and orchestration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docling OSS (deployed to AWS ECS)&lt;/strong&gt;: Core document conversion and extraction.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenAI API&lt;/strong&gt;: Embedding generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pinecone&lt;/strong&gt;: Vector database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Anthropic Claude API&lt;/strong&gt;: Question-answering language model.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Usage Instructions
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deploy the workflow&lt;/strong&gt; with all service credentials and dependencies configured.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Expose the webform&lt;/strong&gt; to users so they can submit Screener.in URLs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scraping, parsing (via Docling OSS)&lt;/strong&gt;, and embedding are automated in the pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Users chat with the agent&lt;/strong&gt; to ask analysis or document-grounded questions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Extend as needed&lt;/strong&gt;: add new endpoints, sources, or Docling capabilities for more coverage.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bright Data Verified Node
&lt;/h2&gt;

&lt;p&gt;There was no scraper available for &lt;a href="https://screener.in" rel="noopener noreferrer"&gt;https://screener.in&lt;/a&gt; that could reliably capture all relevant document links (annual reports, conference call transcripts, quarterly reports, investor presentations, etc.). To overcome this, I built a custom collector from scratch using the BrightData IDE. You can find the scraper &lt;a href="https://brightdata.com/cp/data_collector?id=hl_223bf3d8&amp;amp;collector_id=c_mekdzh5iwilsejjtt" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfwv6lwjev9930f2kjil.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfwv6lwjev9930f2kjil.png" alt="BrightData Collector Invocation through API Img 1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkkseu9r87o1jroko5ie.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkkseu9r87o1jroko5ie.png" alt="BrightData Collector Invocation through API Img 2"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ed6edrwiigai8iexi75.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ed6edrwiigai8iexi75.png" alt="BrightData Collector Invocation through API Img 3"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5mg8moi6g7txinh4y7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5mg8moi6g7txinh4y7u.png" alt="BrightData Extract Response Img 1"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0btl88hbkbrwuwia6xlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0btl88hbkbrwuwia6xlw.png" alt="BrightData Extract Response Img 2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Challenges&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Building a Custom Scraper for Screener.in&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
There was no ready-made scraper available for Screener.in that could reliably capture all relevant document links (annual reports, conference call transcripts, quarterly reports, investor presentations, etc.). To overcome this, I built a custom collector from scratch using the BrightData IDE. You can find the scraper &lt;a href="https://brightdata.com/cp/data_collector?id=hl_223bf3d8&amp;amp;collector_id=c_mekdzh5iwilsejjtt" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Self-Hosting &lt;code&gt;docling-project&lt;/code&gt; OSS as a Cloud Service&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;code&gt;docling&lt;/code&gt;, which handles document parsing and conversion, is not available as a hosted SaaS. To make it production-ready, I converted Docling’s &lt;a href="https://github.com/docling-project/docling-serve/blob/main/docs/deploy-examples/compose-nvidia.yaml" rel="noopener noreferrer"&gt;docker-compose&lt;/a&gt; deployment into an AWS ECS task, set up networking with a load balancer, and ensured stable API access from n8n. This required container orchestration, secure environment setup, persistent storage, and custom integration so that document parsing at scale could be achieved in a robust and maintainable way.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
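
&lt;p&gt;As a rough illustration of the second challenge, the core of moving from docker-compose to ECS is translating each compose service into an ECS container definition. The sketch below is not the conversion used for this project. It assumes a simplified compose service given as a Python dict, and the image name, port, and environment variable in the usage example are placeholders. The output keys (&lt;code&gt;portMappings&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt;, &lt;code&gt;resourceRequirements&lt;/code&gt;) follow the ECS task definition format.&lt;/p&gt;

```python
def compose_service_to_container_def(name, service, memory_mib, gpu_count=0):
    """Translate a simplified docker-compose service dict into an
    ECS container definition dict.

    Only image, ports ("HOST:CONTAINER" strings) and environment (a dict)
    are handled; this is an illustrative subset, not a full converter.
    """
    container = {
        "name": name,
        "image": service["image"],
        "essential": True,
        "memory": memory_mib,
        "portMappings": [],
        "environment": [],
    }
    for mapping in service.get("ports", []):
        # Compose writes "HOST:CONTAINER"; ECS expects integer ports.
        host_port, container_port = str(mapping).split(":")
        container["portMappings"].append(
            {"containerPort": int(container_port), "hostPort": int(host_port), "protocol": "tcp"}
        )
    for key, value in service.get("environment", {}).items():
        # ECS takes environment variables as a list of name/value pairs.
        container["environment"].append({"name": key, "value": str(value)})
    if gpu_count > 0:
        # GPU reservations are expressed through resourceRequirements.
        container["resourceRequirements"] = [{"type": "GPU", "value": str(gpu_count)}]
    return container


if __name__ == "__main__":
    # Placeholder values, not the real docling-serve configuration.
    service = {
        "image": "example.com/docling-serve:latest",
        "ports": ["5001:5001"],
        "environment": {"EXAMPLE_FLAG": "true"},
    }
    print(compose_service_to_container_def("docling", service, memory_mib=8192, gpu_count=1))
```

&lt;p&gt;The resulting dict would sit inside the &lt;code&gt;containerDefinitions&lt;/code&gt; list of a task definition. Volumes, health checks, and logging configuration need their own translation and are left out here.&lt;/p&gt;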

&lt;h3&gt;
  
  
  &lt;strong&gt;Learnings&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;BrightData can be used to scrape and collect whatever you are interested in.&lt;/li&gt;
&lt;li&gt;I learned a ton about AWS networking, IAM, ECS, and load balancers. &lt;strong&gt;My learnings&lt;/strong&gt; are &lt;a href="https://dev.to/bikash119/deploying-docling-to-aws-ecs-a-complete-guide-o69"&gt;here&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;I have just scratched the surface of &lt;strong&gt;docling-project&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devchallenge</category>
      <category>n8nbrightdatachallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
