<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ehi Enabs</title>
    <description>The latest articles on Forem by Ehi Enabs (@ehienabs).</description>
    <link>https://forem.com/ehienabs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F902380%2F2acbda5d-d4c2-4733-b674-50ecac7b3dc3.jpeg</url>
      <title>Forem: Ehi Enabs</title>
      <link>https://forem.com/ehienabs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ehienabs"/>
    <language>en</language>
    <item>
      <title>Image Processing with AWS Textract (Extracting Text from Newspaper Images I)</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Thu, 27 Mar 2025 07:49:54 +0000</pubDate>
      <link>https://forem.com/ehienabs/image-process-with-aws-textract-extracting-text-from-newspaper-images-i-3cd8</link>
      <guid>https://forem.com/ehienabs/image-process-with-aws-textract-extracting-text-from-newspaper-images-i-3cd8</guid>
      <description>&lt;p&gt;Amazon Textract is a powerful AWS service that allows users to extract text, handwriting, and structured data from scanned documents, including newspaper images. This guide will walk you through setting up batch processing for extracting text from newspaper images stored in an Amazon S3 bucket using Textract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, ensure you have the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An AWS account with appropriate permissions.&lt;/li&gt;
&lt;li&gt;An S3 bucket containing newspaper images.&lt;/li&gt;
&lt;li&gt;An IAM role with permissions for Amazon Textract, S3, and AWS Lambda (optional, for automation).&lt;/li&gt;
&lt;li&gt;The AWS CLI or the AWS SDK for Python (Boto3) installed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Upload Newspaper Images to S3
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Navigate to the AWS S3 Console.&lt;/li&gt;
&lt;li&gt;Create a new bucket or select an existing one.&lt;/li&gt;
&lt;li&gt;Upload the newspaper images you want to process.&lt;/li&gt;
&lt;/ul&gt;
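&lt;p&gt;If you have many scans, uploading from a script is easier than the console. A minimal Boto3 sketch (the folder layout, bucket name, and &lt;code&gt;newspapers/&lt;/code&gt; prefix are placeholders for your own):&lt;/p&gt;

```python
from pathlib import Path

IMAGE_EXTS = {'.jpg', '.jpeg', '.png'}

def image_keys(folder, prefix='newspapers/'):
    """Map local newspaper scans to the S3 keys they will be uploaded under."""
    return {p: prefix + p.name
            for p in sorted(Path(folder).iterdir())
            if p.suffix.lower() in IMAGE_EXTS}

def upload_images(folder, bucket):
    import boto3  # deferred so image_keys() stays testable without AWS
    s3 = boto3.client('s3')
    for path, key in image_keys(folder).items():
        s3.upload_file(str(path), bucket, key)
```

&lt;p&gt;Calling &lt;code&gt;upload_images("./scans", "your-bucket")&lt;/code&gt; pushes every JPEG/PNG in the folder to S3.&lt;/p&gt;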

&lt;h2&gt;
  
  
  Step 2: Create an IAM Role for Textract
&lt;/h2&gt;

&lt;p&gt;Go to the AWS IAM Console.&lt;/p&gt;

&lt;p&gt;Create a new role with the following permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "Effect": "Allow",
  "Action": [
    "textract:StartDocumentTextDetection",
    "textract:GetDocumentTextDetection",
    "s3:GetObject",
    "s3:PutObject"
  ],
  "Resource": "*"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Attach this policy to your IAM role and note the ARN.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Start a Textract Batch Processing Job
&lt;/h2&gt;

&lt;p&gt;Using the AWS CLI, start the text extraction job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws textract start-document-text-detection \
    --document-location "S3Object={Bucket=&amp;lt;your-bucket&amp;gt;,Name=&amp;lt;image-file&amp;gt;}" \
    --notification-channel "RoleArn=&amp;lt;your-iam-role-arn&amp;gt;,SNSTopicArn=&amp;lt;sns-topic-arn&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, using Boto3 in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

def start_textract_job(bucket, document):
    textract = boto3.client('textract')
    response = textract.start_document_text_detection(
        DocumentLocation={
            'S3Object': {
                'Bucket': bucket,
                'Name': document
            }
        }
    )
    return response['JobId']

job_id = start_textract_job("your-bucket", "your-image.jpg")
print(f"Job started with ID: {job_id}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Retrieve the Extracted Text
&lt;/h2&gt;

&lt;p&gt;Once the job is completed, retrieve the results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws textract get-document-text-detection --job-id &amp;lt;your-job-id&amp;gt;aws textract get-document-text-detection --job-id &amp;lt;your-job-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or using Boto3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

def get_textract_results(job_id):
    textract = boto3.client('textract')
    while True:
        response = textract.get_document_text_detection(JobId=job_id)
        if response['JobStatus'] == 'SUCCEEDED':
            break
        time.sleep(5)
    return response['Blocks']

blocks = get_textract_results(job_id)
for block in blocks:
    if block['BlockType'] == 'LINE':
        print(block['Text'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Store and Process Extracted Text
&lt;/h2&gt;

&lt;p&gt;Once you extract the text, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store it in an S3 bucket.&lt;/li&gt;
&lt;li&gt;Process it with AWS Lambda and DynamoDB.&lt;/li&gt;
&lt;li&gt;Perform text analysis using Amazon Comprehend.&lt;/li&gt;
&lt;/ul&gt;
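&lt;p&gt;For example, the Bedrock processing step in Part II of this series reads a JSON file with &lt;code&gt;extracted_text&lt;/code&gt; and &lt;code&gt;source_image&lt;/code&gt; fields. A minimal sketch of assembling that record before handing it to &lt;code&gt;s3.put_object&lt;/code&gt; (the key layout is an assumption):&lt;/p&gt;

```python
import json

def build_output_record(source_image, lines):
    """Assemble the JSON document to write back to S3; the field names
    match what the Bedrock processing step in Part II reads."""
    return json.dumps({
        'source_image': source_image,
        'extracted_text': lines,
    }, indent=2)

# The record can then be stored with:
# s3.put_object(Bucket='your-bucket', Key='text/page1.json',
#               Body=build_output_record('page1.jpg', lines),
#               ContentType='application/json')
```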

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Using Amazon Textract, you can efficiently extract text from newspaper images stored in S3 via batch processing. This enables large-scale document processing, automation, and text analytics in AWS.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>ai</category>
      <category>aws</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Text Processing with AWS Bedrock (Extracting Text from Newspaper Images II)</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Thu, 27 Mar 2025 07:33:28 +0000</pubDate>
      <link>https://forem.com/ehienabs/image-processing-with-aws-bedrock-and-textract-extracting-text-from-newspaper-images--19h4</link>
      <guid>https://forem.com/ehienabs/image-processing-with-aws-bedrock-and-textract-extracting-text-from-newspaper-images--19h4</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;So, you've got a massive archive of newspaper images sitting in an S3 bucket, and you need to extract text from them, structure it into articles, and store the results for further analysis. Sounds like a headache, right? Well, AWS Bedrock, together with Textract, makes it easy.&lt;br&gt;
Here's how:&lt;/p&gt;

&lt;h2&gt;
  
  
  The Big Picture
&lt;/h2&gt;

&lt;p&gt;You want to automate the processing of the text you extracted from newspaper images. Here are the AWS services you are going to need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 (to store raw text and processed text)&lt;/li&gt;
&lt;li&gt;DynamoDB (to keep track of which files have been processed)&lt;/li&gt;
&lt;li&gt;AWS Bedrock (to clean up and format extracted text into structured news articles)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as a conveyor belt: raw text goes in, processed text comes out, and the system remembers what’s been processed so it doesn’t do the same work twice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Setting Up Our Tools
&lt;/h2&gt;

&lt;p&gt;Before we get into processing the files, we initialize some AWS clients and set up logging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
bedrock = boto3.client('bedrock-runtime')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;S3 is used to read and write files in our S3 bucket.&lt;/p&gt;

&lt;p&gt;DynamoDB helps us store and retrieve checkpoints (so we don’t reprocess the same files).&lt;/p&gt;

&lt;p&gt;Bedrock is our AI workhorse, helping turn raw text into structured articles.&lt;/p&gt;

&lt;p&gt;Logging is also set up to capture what’s happening:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;logger = logging.getLogger()
logger.setLevel(logging.INFO)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Logging helps debug when things go wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Keeping Track of Processed Files
&lt;/h2&gt;

&lt;p&gt;To avoid processing the same file multiple times, we store a “checkpoint” in DynamoDB. This is done through a CheckpointManager class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class CheckpointManager:
    def __init__(self, table_name=CHECKPOINT_TABLE_NAME):
        self.table = dynamodb.Table(table_name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This class has two key methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;get_last_processed_key() - Retrieves the last processed file so we can continue from there.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;update_checkpoint() - Updates the checkpoint once a file is processed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If something crashes mid-run, the script will resume from where it left off, instead of reprocessing everything from scratch. Neat, right?&lt;/p&gt;
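&lt;p&gt;Only the constructor is shown above, so here is a sketch of what the two methods might look like. The key and attribute names (&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;last_key&lt;/code&gt;) are assumptions; &lt;code&gt;table&lt;/code&gt; is a Boto3 DynamoDB Table in production, but anything with &lt;code&gt;get_item&lt;/code&gt;/&lt;code&gt;put_item&lt;/code&gt; works, which also makes the class easy to test without AWS:&lt;/p&gt;

```python
class CheckpointManager:
    """Sketch of the checkpoint logic; `table` is a boto3 DynamoDB Table
    in production, or any object with get_item/put_item for testing."""

    def __init__(self, table):
        self.table = table

    def get_last_processed_key(self):
        # Returns None on a fresh run, so processing starts from the beginning
        item = self.table.get_item(Key={'id': 'checkpoint'}).get('Item')
        return item['last_key'] if item else None

    def update_checkpoint(self, key):
        self.table.put_item(Item={'id': 'checkpoint', 'last_key': key})
```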

&lt;h2&gt;
  
  
  Step 3: Processing a Single File
&lt;/h2&gt;

&lt;p&gt;The NewsProcessor class is where the magic happens. It does three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads the extracted text from an S3 JSON file.&lt;/li&gt;
&lt;li&gt;Sends the text to AWS Bedrock for structuring.&lt;/li&gt;
&lt;li&gt;Saves the structured output back to S3.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s how we fetch the text from S3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;response = s3.get_object(Bucket=INPUT_BUCKET, Key=key)
input_data = json.loads(response['Body'].read().decode('utf-8'))
extracted_text = input_data.get('extracted_text', [])
source_image = input_data.get('source_image', '')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We then send the text to AWS Bedrock, using Claude (a large language model from Anthropic):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;\body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 4096,
    "messages": [
        {"role": "user", "content": prompt}
    ],
    "temperature": 0,
    "system": "You are a professional newspaper editor who excels at identifying and structuring news articles."
})
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=body
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AWS Bedrock takes the raw text and turns it into well-structured news articles. If it messes up (which AI sometimes does), we try to handle that gracefully.&lt;/p&gt;
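&lt;p&gt;“Handling it gracefully” mostly means being defensive when reading the model’s reply. A sketch, assuming the Claude Messages response shape (a &lt;code&gt;content&lt;/code&gt; list of text parts) and assuming our prompt asked for a JSON list of articles:&lt;/p&gt;

```python
import json

def parse_bedrock_articles(raw_body):
    """Pull the model's text out of an invoke_model response body and
    parse it as a JSON list of articles; fall back to an empty list
    if the model returned something that isn't valid JSON."""
    payload = json.loads(raw_body)
    text = payload['content'][0]['text']
    try:
        articles = json.loads(text)
    except json.JSONDecodeError:
        return []
    return articles if isinstance(articles, list) else []
```

&lt;p&gt;In the Lambda itself, &lt;code&gt;raw_body&lt;/code&gt; would be &lt;code&gt;response['body'].read()&lt;/code&gt; from the &lt;code&gt;invoke_model&lt;/code&gt; call above.&lt;/p&gt;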

&lt;h2&gt;
  
  
  Step 4: Saving the Output
&lt;/h2&gt;

&lt;p&gt;Once Bedrock does its thing, we save the structured articles back to S3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;s3.put_object(
    Bucket=INPUT_BUCKET,
    Key=output_key,
    Body=json.dumps({
        'metadata': {
            'source_file': key,
            'source_image': source_image,
            'articles_processed': len(processed_articles)
        },
        'articles': processed_articles
    }, indent=2),
    ContentType='application/json'
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also update the checkpoint in DynamoDB so we don’t process this file again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: The Lambda Function
&lt;/h2&gt;

&lt;p&gt;To save on cost, we run this inside an AWS Lambda function, which means it needs to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find new files to process – it lists JSON files in S3 and filters out ones that have already been processed.&lt;/li&gt;
&lt;li&gt;Process them one by one – it loops through and calls process_single_file().&lt;/li&gt;
&lt;li&gt;Handle timeouts – if Lambda is running out of time, it stops before getting killed.&lt;/li&gt;
&lt;/ul&gt;
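&lt;p&gt;The “find new files” step can be sketched as a pure filter over the listed S3 keys. This relies on the keys sorting in processing order (an assumption about how this pipeline names its files, since S3 lists keys lexicographically):&lt;/p&gt;

```python
def files_to_process(all_keys, last_processed_key):
    """Keep only .json outputs that sort after the checkpoint; keys are
    assumed to be named so that lexicographic order is processing order."""
    keys = sorted(k for k in all_keys if k.endswith('.json'))
    if last_processed_key:
        keys = [k for k in keys if k > last_processed_key]
    return keys
```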

&lt;p&gt;The full loop looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for key in files_to_process:
    try:
        result = processor.process_single_file(key)
        results.append(result)

        if context.get_remaining_time_in_millis() &amp;lt; 60000:
            logger.info("Approaching Lambda timeout, stopping processing")
            break
    except Exception as e:
        logger.error(f"Error processing file {key}: {e}")
        results.append({
            'success': False,
            'input_key': key,
            'error': str(e)
        })

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When it finishes processing, it saves a summary file in S3 with details of which files were processed successfully and which ones failed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping It Up
&lt;/h2&gt;

&lt;p&gt;In short, we can automate the processing of thousands of newspaper images stored in an S3 bucket: extract the text, organize it into articles using AWS Bedrock, and store the structured output back in S3.&lt;/p&gt;

&lt;p&gt;If you’re dealing with large volumes of scanned newspapers, this kind of automation can save you weeks of manual work. &lt;/p&gt;

</description>
      <category>genai</category>
      <category>aws</category>
      <category>s3</category>
      <category>aiops</category>
    </item>
    <item>
      <title>Deploying Amazon Managed Service for Apache Kafka (Amazon MSK) with CloudFormation</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Thu, 21 Mar 2024 03:16:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/deploying-amazon-managed-service-for-apache-kafka-amazon-msk-with-cloudformation-463o</link>
      <guid>https://forem.com/aws-builders/deploying-amazon-managed-service-for-apache-kafka-amazon-msk-with-cloudformation-463o</guid>
      <description>&lt;p&gt;Amazon Managed Service for Apache Kafka (Amazon MSK) helps developers easily build scalable and resilient streaming applications by abstracting the complexities of Kafka infrastructure management.&lt;/p&gt;

&lt;p&gt;Developed initially by LinkedIn, Apache Kafka has become a significant technology for real-time data processing and event streaming. Kafka is a distributed streaming platform which uses a publish-subscribe messaging model. Known for its scalability, reliability, fault tolerance, and durable storage capabilities, Kafka facilitates the seamless ingestion, processing, and delivery of massive volumes of data in real-time.&lt;/p&gt;

&lt;p&gt;Whether it's processing website clickstreams, tracking sensor data from IoT devices, or powering real-time analytics, Kafka's versatility makes it a go-to solution for building robust streaming architectures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying Amazon Managed Service for &lt;strong&gt;Apache Kafka (Amazon MSK)&lt;/strong&gt; with CloudFormation
&lt;/h2&gt;

&lt;p&gt;AWS CloudFormation is a great tool designed to streamline infrastructure provisioning through code. With CloudFormation, developers can describe their AWS infrastructure using a simple, declarative template format, encompassing everything from EC2 instances to S3 buckets and IAM roles. This template is a blueprint for resource creation and configuration, ensuring consistency and reproducibility across environments. CloudFormation templates, typically written in JSON or YAML, articulate the desired state of the infrastructure, abstracting away the procedural steps needed to achieve that state.&lt;/p&gt;

&lt;p&gt;The following is a CloudFormation template for deploying Amazon MSK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-09-09"&lt;/span&gt;
&lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Amazon&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;MSK&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Deployment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Example"&lt;/span&gt;

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;KafkaCluster&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::MSK::Cluster&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;ClusterName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MyKafkaCluster&lt;/span&gt;
      &lt;span class="na"&gt;KafkaVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.8.0"&lt;/span&gt;
      &lt;span class="na"&gt;NumberOfBrokerNodes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;BrokerNodeGroupInfo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;InstanceType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kafka.m5.large&lt;/span&gt;
        &lt;span class="na"&gt;ClientSubnets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SubnetIds&lt;/span&gt;
        &lt;span class="na"&gt;SecurityGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SecurityGroupId&lt;/span&gt;
      &lt;span class="na"&gt;EncryptionInfo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;EncryptionInTransit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;InCluster&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;EncryptionAtRest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DataVolumeKMSKeyId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;KmsKeyId&lt;/span&gt;
      &lt;span class="na"&gt;LoggingInfo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;BrokerLogs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;CloudWatchLogs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;LogGroup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;LogGroupName&lt;/span&gt;

&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;SubnetIds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Subnet&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;IDs&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Kafka&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cluster"&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;List&amp;lt;AWS::EC2::Subnet::Id&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;SecurityGroupId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Security&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Group&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ID&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Kafka&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cluster"&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::EC2::SecurityGroup::Id&lt;/span&gt;
  &lt;span class="na"&gt;KmsKeyId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;KMS&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ID&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;encryption&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;at&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rest"&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::KMS::Key::Id&lt;/span&gt;
  &lt;span class="na"&gt;LogGroupName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Log&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Group&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Kafka&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cluster&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;logs"&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the template, you can deploy Amazon MSK with the AWS CLI using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudformation create-stack &lt;span class="nt"&gt;--stack-name&lt;/span&gt; MyKafkaStack &lt;span class="nt"&gt;--template-body&lt;/span&gt; file://msk-deployment.yaml &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--parameters&lt;/span&gt; &lt;span class="nv"&gt;ParameterKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;SubnetIds,ParameterValue&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"subnet-12345678,subnet-23456789"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;ParameterKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;SecurityGroupId,ParameterValue&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sg-12345678"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;ParameterKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;KmsKeyId,ParameterValue&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:kms:us-east-1:123456789012:key/abcd1234-abcd-1234-abcd-123456789012"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;ParameterKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;LogGroupName,ParameterValue&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/aws/kafka/mykafkacluster"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
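&lt;p&gt;MSK cluster creation takes a while, so you can block until the stack finishes and then confirm the cluster’s state (the stack and cluster names match the example above):&lt;/p&gt;

```shell
# Wait for CloudFormation to finish, then check the cluster's state
aws cloudformation wait stack-create-complete --stack-name MyKafkaStack
aws kafka list-clusters --query "ClusterInfoList[?ClusterName=='MyKafkaCluster'].State"
```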



&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Deploying Amazon Managed Service for Apache Kafka (Amazon MSK) with AWS CloudFormation streamlines the process of setting up a Kafka cluster for real-time data streaming. By abstracting the complexities of infrastructure management, Amazon MSK enables teams to focus on building innovative streaming applications with ease.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>kafka</category>
      <category>infrastructureascode</category>
      <category>devops</category>
    </item>
    <item>
      <title>Amazon Managed Service for Prometheus (AMP) with CloudFormation</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Mon, 18 Mar 2024 04:22:44 +0000</pubDate>
      <link>https://forem.com/aws-builders/amazon-managed-service-for-prometheus-amp-with-cloudformation-d7h</link>
      <guid>https://forem.com/aws-builders/amazon-managed-service-for-prometheus-amp-with-cloudformation-d7h</guid>
      <description>&lt;h2&gt;
  
  
  Deploying Amazon Managed Service for Prometheus (AMP) with CloudFormation
&lt;/h2&gt;

&lt;p&gt;Amazon Managed Service for Prometheus (AMP) simplifies Prometheus's deployment, management, and scaling.&lt;/p&gt;

&lt;p&gt;Prometheus is an open-source monitoring and alerting toolkit. It has a robust querying language, a powerful data model, and extensive integrations. Prometheus helps engineers gain insights into the health and performance of their applications and systems by offering a thorough solution for monitoring infrastructure, applications, and services. From collecting metrics and generating alerts based on predefined thresholds, to its scalability, reliability, and community support, Prometheus is a significant pillar of observability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Amazon Managed Service for Prometheus (AMP)?
&lt;/h3&gt;

&lt;p&gt;AMP helps reduce the hassle of managing Prometheus infrastructure by handling the heavy lifting. Tasks such as provisioning network, storage, and computing resources for the deployment of Prometheus, scaling, upgrades and patches, high availability, etc., are all managed by AWS.&lt;/p&gt;

&lt;p&gt;That means you and your team can use those valuable insights to improve your applications. Plus, AMP plays super nicely with other AWS services and offers a pay-as-you-go setup, making it the go-to for organizations wanting top-notch Prometheus monitoring on AWS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying Amazon Managed Service for Prometheus (AMP) with CloudFormation
&lt;/h2&gt;

&lt;p&gt;AWS CloudFormation is a tool that enables developers to define their infrastructure as code (IaC). With CloudFormation, you can describe your AWS resources in a simple, declarative template, specifying everything you need, from EC2 instances to S3 buckets and IAM roles. Once defined, CloudFormation provisions and configures the resources, ensuring consistency and repeatability across environments.&lt;/p&gt;

&lt;p&gt;A CloudFormation template is a JSON or YAML file that defines the AWS resources and their configurations needed to deploy an application or infrastructure stack. CloudFormation templates are written in a declarative format, specifying the desired state of the infrastructure rather than the steps required to achieve that state.&lt;/p&gt;

&lt;p&gt;The following is a CloudFormation template for deploying AMP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-09-09"&lt;/span&gt;
&lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Amazon&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Managed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Service&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Prometheus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(AMP)&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Deployment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Example"&lt;/span&gt;

&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;WorkspaceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Prometheus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workspace"&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
    &lt;span class="na"&gt;Default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MyPrometheusWorkspace"&lt;/span&gt;

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;PrometheusWorkspace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Prometheus::Workspace&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;WorkspaceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;WorkspaceName&lt;/span&gt;
      &lt;span class="na"&gt;Retention&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt; &lt;span class="c1"&gt;# Retention period for metrics (in days)&lt;/span&gt;
      &lt;span class="na"&gt;DataSources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CloudWatch"&lt;/span&gt;
          &lt;span class="na"&gt;Region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;AWS::Region&lt;/span&gt;
          &lt;span class="na"&gt;AssumeRoleArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;IAMRole.Arn&lt;/span&gt;
      &lt;span class="na"&gt;WorkspaceDescription&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Managed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Prometheus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workspace&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;monitoring&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;applications"&lt;/span&gt;

  &lt;span class="na"&gt;IAMRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-10-17"&lt;/span&gt;
        &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
            &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;prometheus.amazonaws.com&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sts:AssumeRole"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PrometheusDataAccessPolicy"&lt;/span&gt;
          &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-10-17"&lt;/span&gt;
            &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
                &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloudwatch:GetMetricData"&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloudwatch:GetMetricStatistics"&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloudwatch:ListMetrics"&lt;/span&gt;
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;

&lt;span class="na"&gt;Outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;PrometheusWorkspaceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;created&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Prometheus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workspace"&lt;/span&gt;
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;PrometheusWorkspace&lt;/span&gt;
  &lt;span class="na"&gt;IAMRoleArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ARN&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;IAM&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;used&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;by&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Prometheus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workspace"&lt;/span&gt;
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;IAMRole.Arn&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the template, you can deploy the AMP stack with the AWS CLI using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudformation create-stack &lt;span class="nt"&gt;--stack-name&lt;/span&gt; MyAMPStack &lt;span class="nt"&gt;--template-body&lt;/span&gt; file://amp-deployment.yaml &lt;span class="nt"&gt;--parameters&lt;/span&gt; &lt;span class="nv"&gt;ParameterKey&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;WorkspaceName,ParameterValue&lt;span class="o"&gt;=&lt;/span&gt;MyPrometheusWorkspace

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
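&lt;p&gt;If you prefer the SDK to the CLI, the same deployment can be sketched in Python with boto3. This is a minimal sketch, not part of the original template; the helper and the commented-out call assume the same stack name, template file, and parameter names used in the CLI command above.&lt;/p&gt;

```python
def to_cfn_parameters(params: dict) -> list:
    """Convert a plain dict into the Parameters list shape expected by create_stack."""
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in params.items()]

parameters = to_cfn_parameters({"WorkspaceName": "MyPrometheusWorkspace"})
print(parameters)

# With boto3 installed and AWS credentials configured, the stack could then be
# created programmatically (hypothetical stack and template names, mirroring
# the CLI command above):
#
# import boto3
# cfn = boto3.client("cloudformation")
# cfn.create_stack(
#     StackName="MyAMPStack",
#     TemplateBody=open("amp-deployment.yaml").read(),
#     Parameters=parameters,
#     Capabilities=["CAPABILITY_IAM"],  # required because the template creates an IAM role
# )
```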



&lt;h2&gt;
  
  
  Benefits of Deploying AMP with CloudFormation
&lt;/h2&gt;

&lt;p&gt;Deploying AMP with CloudFormation simplifies the process and reduces the manual labour required. Other benefits include the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure as Code (IaC)&lt;/strong&gt;: CloudFormation allows you to define your infrastructure in a declarative template, enabling you to treat infrastructure as code. This approach enhances consistency, repeatability, and version control, as infrastructure changes can be tracked along with application code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated Provisioning&lt;/strong&gt;: With CloudFormation, you can automate the provisioning of AMP resources, including Prometheus workspaces and IAM roles. This eliminates manual intervention and reduces the risk of errors during deployment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Deployment&lt;/strong&gt;: CloudFormation abstracts the complexity of infrastructure management, providing a simple and standardized way to deploy AMP. You can define all the necessary resources and configurations in a single template, making deployment straightforward and repeatable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration with AWS Ecosystem&lt;/strong&gt;: CloudFormation seamlessly integrates with other AWS services, allowing you to incorporate AMP into your existing AWS environment effortlessly. You can leverage CloudFormation features like parameterization and resource dependencies to create highly customizable and scalable deployments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource Management&lt;/strong&gt;: CloudFormation provides centralized management and tracking of AMP resources, making it easy to monitor, update, and delete resources as needed. You can track changes, view stack history, and roll back to previous configurations if necessary, providing greater control and visibility over your infrastructure.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Deploying Amazon Managed Service for Prometheus (AMP) with AWS CloudFormation streamlines the process of setting up Prometheus monitoring on AWS. By abstracting the complexities of infrastructure management, AMP enables teams to leverage monitoring insights to optimize application performance and reliability while reducing manual labour and time.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>aws</category>
      <category>tutorial</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Bootstrapping Kubernetes Cluster with CloudFormation</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Tue, 14 Feb 2023 17:06:27 +0000</pubDate>
      <link>https://forem.com/aws-builders/bootstrapping-kubernetes-cluster-with-cloudformation-19ho</link>
      <guid>https://forem.com/aws-builders/bootstrapping-kubernetes-cluster-with-cloudformation-19ho</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Kubernetes is an open-source platform for managing containerized applications. It is one of the most widely used tools for managing large-scale deployments of applications and provides an impressive array of features, such as service discovery, auto-scaling, and container orchestration.&lt;br&gt;
Kubernetes also enables users to configure and manage their applications, scaling them quickly and easily. This includes creating deployments, services, and ingresses, as well as configuring the networking between services. Additionally, users can add monitoring and logging to keep track of their applications and ensure they are running optimally. Finally, users can use CloudFormation to automate the deployment of their applications on the Kubernetes cluster.&lt;br&gt;
CloudFormation is an Amazon Web Services (AWS) service that enables users to create and manage configuration files for their cloud resources. With CloudFormation, users can easily set up, configure, and manage Kubernetes clusters with minimal effort and without having to manually configure every aspect. In this blog post, we'll explore how to bootstrap a Kubernetes cluster with CloudFormation, and how to configure and manage it once it is up and running.&lt;/p&gt;
&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;CloudFormation makes it easy to bootstrap a Kubernetes cluster with minimal effort. It enables users to create a template that defines the desired configuration of the cluster, such as the number of nodes, the instance types, and the networking settings. Once the template is created, users can deploy the cluster by running the CloudFormation stack. When the stack is complete, users can begin configuring and managing the cluster: installing the necessary software components, configuring the networking between the nodes, and scaling their applications quickly and easily. Users can also use CloudFormation to automate the deployment of their applications on the Kubernetes cluster.&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating a Kubernetes Cluster
&lt;/h2&gt;

&lt;p&gt;The first step in bootstrapping a Kubernetes cluster with CloudFormation is to create a template that defines the desired configuration of the cluster. The template should include the desired number of nodes, their instance types, and their associated roles. Additionally, the template should include all the necessary software components and configurations, such as the Kubernetes Dashboard and kubectl, as well as the desired networking settings. Once the template is created, users can deploy the cluster by running the CloudFormation stack. Once the stack is complete, the cluster is ready to be used.&lt;/p&gt;

&lt;p&gt;The following is a sample template for creating a Kubernetes Cluster with CloudFormation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
    Description: ID of the VPC in which to create the Kubernetes cluster
  SubnetIds:
    Type: List&amp;lt;AWS::EC2::Subnet::Id&amp;gt;
    Description: List of Subnet IDs in which to create the Kubernetes cluster
  KeyPairName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: Name of the EC2 Key Pair to use for SSH access to worker nodes
  ClusterName:
    Type: String
    Description: Name of the Kubernetes cluster to create
Resources:
  ControlPlaneSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      VpcId: !Ref VpcId
      GroupDescription: Allow inbound traffic to the Kubernetes control plane
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  WorkerNodeSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      VpcId: !Ref VpcId
      GroupDescription: Allow inbound traffic to Kubernetes worker nodes
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  ControlPlaneInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles:
        - !Ref ControlPlaneRole
  ControlPlaneRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ec2.amazonaws.com
            Action:
              - sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
        - arn:aws:iam::aws:policy/AmazonEKSServicePolicy
  ControlPlaneInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0b69ea66ff7391e80
      InstanceType: t2.micro
      KeyName: !Ref KeyPairName
      NetworkInterfaces:
        - DeviceIndex: 0
          AssociatePublicIpAddress: true
          GroupSet:
            - !Ref ControlPlaneSecurityGroup
          SubnetId: !Select [0, !Ref SubnetIds]
      IamInstanceProfile: !Ref ControlPlaneInstanceProfile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          echo 'net.bridge.bridge-nf-call-iptables=1' | tee -a /etc/sysctl.conf
          sysctl -p
          yum update -y
          amazon-linux-extras install docker -y
          service docker start
          usermod -a -G docker ec2-user
          curl -o /usr/local/bin/kubectl https://amazon-eks.s3.us-west-2.amazonaws.com/1.21.2/2021-07-05/bin/linux/amd64/kubectl
          chmod +x /usr/local/bin/kubectl
          echo 'export PATH=$PATH:/usr/local/bin' &amp;gt;&amp;gt; /etc/bashrc
          curl --silent --location "https://github.com/weaveworks/eksctl/releases

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring the Kubernetes Cluster
&lt;/h2&gt;

&lt;p&gt;Once the Kubernetes cluster has been created, users can begin configuring it. This includes installing the necessary software components, such as the Kubernetes Dashboard and kubectl, as well as configuring the networking between the nodes. Additionally, users can configure the cluster to run specific workloads, such as batch jobs or web applications, and to use specific types of storage, such as persistent storage or local storage.&lt;/p&gt;

&lt;p&gt;Once the cluster is configured, users can deploy applications and services to it in a number of ways: through the Kubernetes APIs, with tools such as Helm and Operators, or via the Kubernetes dashboard, which also helps monitor the health of the cluster and identify potential issues. The AWS Command Line Interface (CLI) can be used to manage the nodes, services, and applications running on the cluster, and to create and manage backups of the cluster's data. This is especially important for larger clusters, which often require more rigorous maintenance and monitoring to keep them running optimally. The AWS CLI can also scale the cluster up or down depending on the current demand for resources, giving users the flexibility to adjust their deployment strategy quickly and respond efficiently to changes in customer demand.&lt;/p&gt;

</description>
      <category>promptengineering</category>
      <category>ai</category>
      <category>discuss</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Introduction to AWS App Config</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Fri, 10 Feb 2023 09:16:01 +0000</pubDate>
      <link>https://forem.com/aws-builders/introduction-to-aws-app-config-304j</link>
      <guid>https://forem.com/aws-builders/introduction-to-aws-app-config-304j</guid>
      <description>&lt;p&gt;AWS App Config is an AWS service that helps you make sure your applications are configured correctly, so they can run reliably and securely. With AWS App Config, you can easily monitor and manage application configurations, ensuring that your applications remain up-to-date and compliant with security standards. This article will provide a comprehensive overview of what AWS App Config is, how it can be used, and how it can help you ensure that your applications remain up-to-date and secure.&lt;/p&gt;

&lt;p&gt;AWS App Config is a fully managed service that helps you automate the deployment and configuration of applications. It allows you to create configuration profiles that define the desired state of an application, such as the size of the instances, security settings, and other parameters. The service then monitors the configuration of the application and notifies you if any changes are detected. This helps you identify any configuration drift and ensure that your application remains compliant with security policies, as well as enabling you to quickly and easily deploy new versions of your applications.&lt;/p&gt;

&lt;p&gt;AWS App Config also integrates with other AWS services, such as AWS Lambda, AWS CloudFormation, and AWS CodeDeploy, allowing you to use the service to automate the deployment and configuration of applications. You can use the service to deploy applications and perform automated tests to ensure that they are working correctly. This will ensure that your applications are up-to-date and compliant with security standards.&lt;/p&gt;

&lt;p&gt;AWS App Config provides a range of features to help you manage your application configurations. It enables you to create configuration profiles that specify the desired state of an application, and it monitors changes to ensure they remain compliant with security policies. This makes it a powerful tool for managing and monitoring application configurations and ensuring that your applications remain up-to-date and secure.&lt;/p&gt;

&lt;p&gt;AWS App Config is also highly scalable, allowing you to manage applications across multiple regions and accounts. It provides detailed metrics and reporting that can help you identify and address potential issues before they become critical. Furthermore, it supports integration with other AWS services and third-party tools, enabling you to easily monitor and manage your application configurations from a single platform. With AWS App Config, you can ensure that your applications remain up-to-date and secure, while making the deployment and configuration process faster and more efficient.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

app_config = boto3.client('appconfig')

# Retrieve the configuration for an application
response = app_config.get_configuration(
    Application='MyApp',
    Environment='MyEnvironment',
    Configuration='MyConfig',
    ClientId='my-client-id'  # a unique ID for the calling client, required by the API
)

# Content is returned as a stream of bytes
config_data = response['Content'].read().decode('utf-8')

# Use the retrieved configuration data
print(config_data)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above example, the boto3 library is used to interact with the AWS App Config service. The &lt;code&gt;app_config&lt;/code&gt; variable is a client object that is used to call the &lt;code&gt;get_configuration&lt;/code&gt; method. This method retrieves the configuration for an application with the specified name, environment, and configuration.&lt;/p&gt;

&lt;p&gt;The response variable contains the response from the &lt;code&gt;get_configuration&lt;/code&gt; method, and the &lt;code&gt;config_data&lt;/code&gt; variable is set to the Content field of the response, which contains the configuration data for the specified application, environment, and configuration.&lt;/p&gt;

&lt;p&gt;Finally, the &lt;code&gt;config_data&lt;/code&gt; is printed to the console, allowing you to use it within your application.&lt;/p&gt;

&lt;p&gt;With this code, you can retrieve the configuration for your application from AWS App Config, which can help you manage configuration changes in a controlled and predictable manner.&lt;/p&gt;
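&lt;p&gt;Since the retrieved content arrives as raw bytes, you will typically decode and parse it before use. Below is a minimal sketch, assuming the configuration profile stores JSON; the payload shown is hypothetical and stands in for the bytes read from the response.&lt;/p&gt;

```python
import json

def parse_config(raw: bytes) -> dict:
    """Decode raw configuration bytes (assumed to be JSON) into a dict."""
    return json.loads(raw.decode("utf-8"))

# Simulated payload standing in for the bytes read from response['Content']
raw_content = b'{"feature_flags": {"new_checkout": true}, "max_retries": 3}'

config = parse_config(raw_content)
print(config["max_retries"])  # -> 3
```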

&lt;p&gt;In conclusion, AWS App Config is a powerful and useful tool for managing and monitoring application configurations. It helps you ensure that your applications remain up-to-date and compliant with security standards, while also allowing you to quickly and easily deploy new versions of your applications. AWS App Config is a valuable asset, enabling you to monitor and manage application configurations, ensuring that your applications remain secure and reliable.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>python</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Guide to Setting Service Level Objectives and Service Level Indicators</title>
      <dc:creator>Ehi Enabs</dc:creator>
      <pubDate>Wed, 18 Jan 2023 18:50:13 +0000</pubDate>
      <link>https://forem.com/ehienabs/guide-to-setting-service-level-objectives-and-service-level-indicators-591h</link>
      <guid>https://forem.com/ehienabs/guide-to-setting-service-level-objectives-and-service-level-indicators-591h</guid>
      <description>&lt;p&gt;Businesses often want to balance feature development and service reliability while prioritizing customer happiness. With service level objectives (SLOs), stakeholders can set definite target levels for the reliability of their services, which are measured using service level indicators (SLIs).&lt;/p&gt;

&lt;p&gt;One of the core duties of an SRE team is making the performance of your services measurable; if you can measure something, you can improve it.&lt;/p&gt;

&lt;p&gt;SLOs are target values or ranges of values set for the reliability of your services, measured by service level indicators and expressed as a percentage. SLIs are metrics that indicate the level of performance your users experience with your services. They help approximate the degree of happiness and satisfaction of your users. SLIs are vital signals an organization uses to measure how well certain aspects of its services are meeting its service level objectives.&lt;/p&gt;

&lt;p&gt;This guide will walk you through setting practical service level objectives and service level indicators for your site reliability engineering practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Terms for Service Level Objectives and Service Level Indicators
&lt;/h2&gt;

&lt;p&gt;There are some key terms you should know before getting started with service level objectives and service level indicators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service reliability:&lt;/strong&gt; The probability that a service, product, or system will adequately do what it is supposed to for a specific period is its reliability. Your service reliability measures how well your system performs given a set of conditions over a particular time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Site reliability engineering:&lt;/strong&gt; Ben Treynor Sloss, credited with spearheading SRE, once &lt;a href="https://sre.google/sre-book/introduction/" rel="noopener noreferrer"&gt;said&lt;/a&gt; that SRE is what happens when you ask a software engineer to design an operations team.&lt;/p&gt;

&lt;p&gt;It's a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems.&lt;/p&gt;

&lt;p&gt;Site reliability engineering leverages scripting, automation, and other software development techniques for IT operations to improve the reliability of your software products and the infrastructure that powers them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Site reliability engineer:&lt;/strong&gt; Site reliability engineers sit at the intersection of traditional IT/ops and software development. An SRE team would typically be responsible for the reliability of the entire stack of your services, from the frontend client-side applications to the backend, databases, and general infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service level agreement:&lt;/strong&gt; A service level agreement defines the level of service expected by users. They also include the penalties in case of agreement violation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Does Your Organization Need SLOs and SLIs?
&lt;/h2&gt;

&lt;p&gt;Collecting metrics from your applications and infrastructure alone can be inadequate for getting a clear picture of how your users are experiencing your service. Service level indicators are quantifiable measures of an aspect of your service performance from your users' perspective. SLI metrics correlate with your users' journey when they use your application.&lt;/p&gt;

&lt;p&gt;The following are some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency: How long does it take for your service to respond to a request?&lt;/li&gt;
&lt;li&gt;Errors: What percentage of your service responses are errors?&lt;/li&gt;
&lt;li&gt;Traffic: How many requests are your services receiving?&lt;/li&gt;
&lt;li&gt;Availability: What percentage of the time are your services available?&lt;/li&gt;
&lt;/ul&gt;
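&lt;p&gt;Most of these indicators reduce to a ratio of good events to total events over a measurement window. The following is a minimal illustrative sketch, not from the article, using made-up request counts.&lt;/p&gt;

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Availability as the percentage of requests served successfully."""
    if total_requests == 0:
        return 100.0  # no traffic means no observed failures
    return 100.0 * good_requests / total_requests

def error_rate_sli(error_requests: int, total_requests: int) -> float:
    """Errors as the percentage of responses that were errors."""
    if total_requests == 0:
        return 0.0
    return 100.0 * error_requests / total_requests

# Example window: 999,500 successful responses out of 1,000,000 requests
print(availability_sli(999_500, 1_000_000))  # approx 99.95
print(error_rate_sli(500, 1_000_000))        # approx 0.05
```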

&lt;p&gt;Service level indicators form the basis of service level objectives by providing data that helps you set appropriate reliability targets.&lt;/p&gt;

&lt;p&gt;Prioritizing service reliability is one way organizations ensure user happiness. Measurable and realistic targets can help your organization better manage your services.&lt;/p&gt;

&lt;p&gt;As mentioned, businesses often want to balance feature development and service stability; integrating service level objectives into your site reliability practices is an effective way to make these decisions more data-driven.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Can SLOs and SLIs Help Your Organization?
&lt;/h2&gt;

&lt;p&gt;When appropriately implemented, SLIs and SLOs can be an asset to your organization. Some of their benefits include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Measuring the performance of your system:&lt;/strong&gt; With SLOs and SLIs, you can answer important questions about how your system is performing—for example, how fast your service returns data, how often the data is inaccurate—and set realistic targets for your system's performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measuring the reliability of your system:&lt;/strong&gt; SLIs can help you quantitatively measure the reliability of your system. Things like how much downtime your users have experienced in a given time and which parts of your system they are having the most trouble with can be easily quantified.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measuring customer happiness:&lt;/strong&gt; Because you design SLOs with user happiness in mind, it is easy to measure the level of satisfaction your users get from using your service. While black box monitoring can give you an insight into how your application and infrastructure are performing, SLOs and SLIs are better suited for reflecting your users' experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting Service Level Objectives and Service Level Indicators
&lt;/h2&gt;

&lt;p&gt;Getting all the stakeholders of the various services in your organization to agree on your reliability targets is the first step to setting SLOs. When setting reliability targets, it is advisable to leave room for failure and for violations of your SLOs before any alerts are triggered.&lt;/p&gt;

&lt;p&gt;Stakeholders negotiate reliability compromises by operating on the universal truth that 100 percent reliability is an unrealistic target. &lt;a href="https://cloud.google.com/blog/products/management-tools/sre-error-budgets-and-maintenance-windows" rel="noopener noreferrer"&gt;Error budgets&lt;/a&gt; are one way to quantify these compromises. Once you have your error budget, you can begin setting SLI and SLOs in your organization.&lt;/p&gt;
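&lt;p&gt;The arithmetic behind an error budget is straightforward: the budget is 100 percent minus the SLO target, spread over the measurement window. A small illustrative sketch follows; the targets shown are examples, not recommendations.&lt;/p&gt;

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime permitted by the error budget over the window."""
    error_budget = (100.0 - slo_percent) / 100.0
    return error_budget * window_days * 24 * 60

# A 99.9% availability SLO leaves roughly 43.2 minutes of downtime per 30-day month
for target in (99.0, 99.9, 99.99):
    print(f"{target}% SLO -> {allowed_downtime_minutes(target):.1f} minutes/month")
```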

&lt;h3&gt;
  
  
  Identify System Boundaries
&lt;/h3&gt;

&lt;p&gt;When your users are experiencing a slow service or your service is returning incorrect data, they will not know whether a database is lagging or a microservice is failing. The internal workings of your service architecture are irrelevant to your users. They will usually direct most of their concerns to the usability of your service.&lt;/p&gt;

&lt;p&gt;Correctly identifying your system boundaries can help your organization pinpoint the parts of your services your users interact with directly. Collecting metrics that reflect your users' experience makes setting reliability targets easier. A system boundary is where your users interface with your services via one or more components.&lt;/p&gt;

&lt;p&gt;To begin implementing your SLIs, you must start thinking about how your users interact with your service. For example, a streaming service will have users who are concerned with things such as the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Their video taking too long to start playing&lt;/li&gt;
&lt;li&gt;Their streams getting interrupted by buffering&lt;/li&gt;
&lt;li&gt;The accuracy of the search results when they look for a video&lt;/li&gt;
&lt;li&gt;The quickness with which your service returns data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When implementing SLIs, identify how your users interact with your services and collect metrics from the components that comprise the service as a group. So even though your video search service may contain components such as load balancers, databases, and various microservices, it is advisable to measure their performance from the perspective of your users.&lt;/p&gt;

&lt;h3&gt;
  
  
  Differentiate between Service Types
&lt;/h3&gt;

&lt;p&gt;Understanding your system's capabilities and how they achieve their goals can help you set better SLIs and SLOs.&lt;/p&gt;

&lt;p&gt;You should group your services into types. In most organizations, service types often coincide with team boundaries. The following are service type examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Synchronous Services:&lt;/strong&gt; These are services for which immediate response to a query is expected. When a client sends a query to this service, all other events, services, or queries dependent on this service must wait on its response to perform their tasks. Regarding the reliability of your synchronous services, you'll want to keep the latency and availability of your services in mind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asynchronous Services:&lt;/strong&gt; These are services for which an immediate response is not expected. Other services are not reliant on the response from async services and may continue processing other tasks in the meantime. For async services, latency, service degradation, and task queues are some of the primary concerns when setting your reliability targets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stateful Services:&lt;/strong&gt; These are services that keep track of sessions or transactions made by clients or other services. With stateful services, such as databases, processing of transactions will often rely on knowledge of previous transactions. This can put some restrictions on your infrastructure, as the same server must be used to process all related transactions. Saturation, availability, and data correctness are some primary reliability concerns for stateful services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stateless Services:&lt;/strong&gt; Stateless services do not need to keep track of their client sessions, nor do they require knowledge of previous transactions to perform their tasks. For stateless services, like web servers, your organization can set reliability targets for availability, resilience, and latency.
&lt;/li&gt;
&lt;/ul&gt;
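&lt;p&gt;The synchronous/asynchronous distinction above can be sketched in a few lines of Python. The handler and queue names here are hypothetical; the point is only that the synchronous caller waits for a response, while the asynchronous caller enqueues work and moves on, which is why queue depth matters for async reliability targets:&lt;/p&gt;

```python
# Illustrative sketch of the sync/async distinction. Names are
# hypothetical; this is not a specific framework's API.
import queue

def handle_sync(request):
    """Caller blocks on this: latency and availability dominate."""
    return {"status": 200, "body": f"processed {request}"}

task_queue = queue.Queue()

def submit_async(request):
    """Caller does not wait: queue depth and degradation dominate."""
    task_queue.put(request)
    return {"status": 202}  # accepted now, processed later

print(handle_sync("q1")["status"])   # 200: response returned immediately
print(submit_async("q2")["status"])  # 202: work queued for later
print(task_queue.qsize())            # 1 task awaiting processing
```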

&lt;h3&gt;
  
  
  Define Your Services' Scope of Availability
&lt;/h3&gt;

&lt;p&gt;Next, you want to define in plain terms what it means for your service to meet user happiness. Defining your service's expected performance in plain terms can help standardize your reliability targets and get everyone on board with the organization's goals.&lt;/p&gt;

&lt;p&gt;For example, saying that you want your service to have &lt;em&gt;low latency&lt;/em&gt;, although vague, might mean something to your engineers and software developers. However, you might be excluding business and product managers with this vocabulary. But when you say that you want your service to respond to requests quickly, your reliability goals are clearer across the entire organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choose the Right SLI Based on Service Type
&lt;/h3&gt;

&lt;p&gt;Now that you have carefully defined what it means for your services to be available in plain English, you can start the technical implementation of SLIs. When choosing your SLIs, always prioritize your users and how they interact with your services' different aspects. For this, user journeys are instrumental in illuminating how your users use your service.&lt;/p&gt;

&lt;p&gt;Here are some recommended SLIs based on service types that your organization should consider:&lt;/p&gt;

&lt;p&gt;For a user-facing system, availability (Are your services responding to requests?); latency (How long is it taking for your service to send responses?); and throughput (How many requests can your service handle in a given time frame?) are valuable metrics for measuring how happy your users are with your service.&lt;/p&gt;

&lt;p&gt;For data pipelines, metrics such as correctness (Is the correct data being collected and returned?) and latency (How long is it taking for your pipelines to complete?) are practical metrics to consider.&lt;/p&gt;
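&lt;p&gt;For a user-facing system, those SLIs can be computed directly from request logs. The sketch below assumes an illustrative 300 ms latency threshold and a fixed 60-second measurement window; both are assumptions you would tune to your own service and user expectations:&lt;/p&gt;

```python
# Hypothetical latency and throughput SLIs computed from request logs
# for a user-facing service. Threshold and window are illustrative.

def latency_sli(latencies_ms, threshold_ms=300):
    """Fraction of requests answered within the latency threshold."""
    if not latencies_ms:
        return 1.0
    fast = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    return fast / len(latencies_ms)

def throughput(timestamps, window_s=60):
    """Requests handled per second over a fixed window."""
    return len(timestamps) / window_s

samples = [120, 250, 310, 90, 800, 180]
print(f"latency SLI: {latency_sli(samples):.2f}")  # 4/6 within 300 ms -> 0.67
print(throughput(list(range(150))))                # 150 reqs in 60 s -> 2.5
```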

&lt;h3&gt;
  
  
  Define Realistic SLOs for Each Metric Based on the SLIs Provided
&lt;/h3&gt;

&lt;p&gt;Now that you have implemented metrics that best reflect your users' happiness, use the data gathered to set your service level objectives. SLOs are target values, or ranges of values, for a service level as measured by your SLIs; they define how reliable your service should be.&lt;/p&gt;

&lt;p&gt;To set realistic SLOs, you should consider the baseline of what reliability looks like for your service. For example, there's no point setting a 99% threshold if your service availability is 85%. The data from your SLIs as well as feedback from your users should inform the baselines of your SLOs.&lt;/p&gt;
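&lt;p&gt;One simple way to ground an SLO in observed data is to let the worst recent SLI period set the floor, then track error-budget consumption against the chosen target. The history values, the min-based baseline, and the 99% target below are all illustrative assumptions:&lt;/p&gt;

```python
# Sketch: derive a baseline SLO from historical SLI measurements and
# check error-budget consumption. History and targets are illustrative.

def baseline(sli_history):
    """Anchor a realistic starting SLO in observed performance."""
    return min(sli_history)  # the worst observed period sets the floor

def error_budget_remaining(slo, observed_sli):
    """Share of the error budget still unspent (can go negative)."""
    budget = 1.0 - slo          # allowed unreliability, e.g. 0.01 for 99%
    spent = 1.0 - observed_sli  # unreliability actually incurred
    return 1.0 - spent / budget

history = [0.994, 0.991, 0.989, 0.995]
print(baseline(history))                    # 0.989: achievable, not aspirational
print(error_budget_remaining(0.99, 0.996))  # roughly 0.6 of the budget left
```

This is also why a 99% target on an 85% service fails immediately: the budget would already be overspent many times over before the period even ends.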

&lt;p&gt;Once you have successfully put these targets in place, it becomes easier for you to gauge how satisfied your users are with your services, thus balancing innovation and service stability.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Iterate the process to fine-tune SLOs over time.&lt;/em&gt; Bear in mind that your organization keeps evolving and your users' needs change over time; your reliability target should be dynamic.&lt;/p&gt;

&lt;p&gt;Lastly, feedback is vital when fine-tuning your SLOs over time. Always consult stakeholders, users, and historical data to inform your SLOs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Setting adequate SLIs and SLOs can improve your services and systems. When deciding on your reliability targets, it is essential to pay attention to your user journeys and system boundaries. You should always set SLIs based on how users interact with your system, and don't be shy about setting different targets for different aspects of your system.&lt;/p&gt;

&lt;p&gt;Begin your process by defining an error budget to make room for failure, thus avoiding unrealistic expectations for your service reliability. Remember that your organization will evolve, and so will your users' needs; revisiting your SLOs and fine-tuning them over time is best practice.&lt;/p&gt;

&lt;p&gt;To help manage your products' reliability at any scale, &lt;a href="https://last9.io/" rel="noopener noreferrer"&gt;Last9&lt;/a&gt; is a reliability platform that helps DevOps engineers and SRE teams set SLOs faster by automatically measuring baselines and providing SLI and SLO suggestions. You can also catalog your services, map the relationship between them, and get change intelligence. Last9 offers a variety of integrations to meet your needs.&lt;/p&gt;

</description>
      <category>sre</category>
      <category>slo</category>
      <category>sli</category>
      <category>cloudnative</category>
    </item>
  </channel>
</rss>
