<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Juv Chan</title>
    <description>The latest articles on Forem by Juv Chan (@juvchan).</description>
    <link>https://forem.com/juvchan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467137%2F040f4614-900a-462a-aa5a-0c623ecfb310.png</url>
      <title>Forem: Juv Chan</title>
      <link>https://forem.com/juvchan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/juvchan"/>
    <language>en</language>
    <item>
      <title>Building Natural Flower Classifier using Amazon Rekognition Custom Labels: The Complete Guide with AWS Best Practices</title>
      <dc:creator>Juv Chan</dc:creator>
      <pubDate>Tue, 17 Nov 2020 07:11:13 +0000</pubDate>
      <link>https://forem.com/aws-heroes/building-natural-flower-classifier-using-amazon-rekognition-custom-labels-the-complete-guide-with-aws-best-practices-4nbk</link>
      <guid>https://forem.com/aws-heroes/building-natural-flower-classifier-using-amazon-rekognition-custom-labels-the-complete-guide-with-aws-best-practices-4nbk</guid>
      <description>&lt;h1&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Building your own computer vision model from scratch can be fun and fulfilling. You get to decide your preferred choice of machine learning framework and platform for training and deployment, design your data pipeline and neural network architecture, write custom training and inference scripts, and fine-tune your model algorithm’s hyperparameters to get the optimal model performance.&lt;/p&gt;

&lt;p&gt;On the other hand, this can also be a daunting task for someone who has no or little computer vision and machine learning expertise. This post shows a step-by-step guide on how to build a natural flower classifier using &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/what-is.html" rel="noopener noreferrer"&gt;&lt;strong&gt;Amazon Rekognition Custom Labels&lt;/strong&gt;&lt;/a&gt; with AWS best practices.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Amazon Rekognition Custom Labels Overview&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Amazon Rekognition Custom Labels&lt;/strong&gt; is a feature of &lt;a href="https://aws.amazon.com/rekognition/" rel="noopener noreferrer"&gt;Amazon Rekognition&lt;/a&gt;, one of the &lt;a href="https://aws.amazon.com/machine-learning/ai-services/" rel="noopener noreferrer"&gt;AWS AI services&lt;/a&gt; for automated image and video analysis with machine learning. It provides &lt;strong&gt;Automated Machine Learning (AutoML)&lt;/strong&gt; capability for custom computer vision end-to-end machine learning workflows.&lt;/p&gt;

&lt;p&gt;It is suitable for anyone who wants to quickly build a custom computer vision model to classify images and to detect objects and scenes unique to their use case. No machine learning expertise is required.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Prerequisites&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;For this walkthrough, you should have the following prerequisites:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;An AWS account&lt;/strong&gt; — You can &lt;a href="https://portal.aws.amazon.com/billing/signup#/start" rel="noopener noreferrer"&gt;create a new account&lt;/a&gt; if you don’t have one yet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CLI&lt;/strong&gt; — You should install or upgrade to the latest &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html" rel="noopener noreferrer"&gt;AWS Command Line Interface (AWS CLI) version 2&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Creating Least Privilege Access IAM User &amp;amp; Policies&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;As a &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#lock-away-credentials" rel="noopener noreferrer"&gt;security best practice&lt;/a&gt;, it is strongly recommended not to use the AWS account root user for any task where it is not required. Instead, create a new IAM (Identity and Access Management) user and grant the required permissions for the IAM user based on the &lt;strong&gt;principle of least privilege&lt;/strong&gt; using &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html" rel="noopener noreferrer"&gt;identity-based policy&lt;/a&gt;. This adheres to the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/identity-and-access-management.html" rel="noopener noreferrer"&gt;IAM best practices&lt;/a&gt; under the Security Pillar in the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/welcome.html" rel="noopener noreferrer"&gt;Machine Learning Lens&lt;/a&gt; for the &lt;a href="https://aws.amazon.com/architecture/well-architected/" rel="noopener noreferrer"&gt;AWS Well-Architected Framework&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this walkthrough, the new IAM user requires both &lt;strong&gt;Programmatic access&lt;/strong&gt; and &lt;strong&gt;AWS Management Console access&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AkEpAwhqbXUTKcktOK6inDg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AkEpAwhqbXUTKcktOK6inDg.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A new &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_managed-vs-inline.html#customer-managed-policies" rel="noopener noreferrer"&gt;customer-managed policy&lt;/a&gt; is created to define the set of permissions required for the IAM user. In addition, a &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/using-iam-policies.html" rel="noopener noreferrer"&gt;bucket policy&lt;/a&gt; is needed on the existing S3 bucket (in this case, my-rekognition-custom-labels-bucket) that stores the natural flower dataset, to control access to it. This bucket can be created by any user other than the new IAM user.&lt;/p&gt;

&lt;p&gt;The policy definition in JSON format is shown below.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2A5m2s8CCPiPAbJ7E1jjJ-eQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2A5m2s8CCPiPAbJ7E1jjJ-eQ.png"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": "*"
        },
        {
            "Sid": "s3Policies",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:ListBucketVersions",
                "s3:CreateBucket",
                "s3:GetBucketAcl",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectTagging",
                "s3:GetBucketVersioning",
                "s3:GetObjectVersionTagging",
                "s3:PutBucketCORS",
                "s3:PutLifecycleConfiguration",
                "s3:PutBucketPolicy",
                "s3:PutObject",
                "s3:PutObjectTagging",
                "s3:PutBucketVersioning",
                "s3:PutObjectVersionTagging"
            ],
            "Resource": "arn:aws:s3:::custom-labels-console*"
        },
        {
            "Sid": "rekognitionPolicies",
            "Effect": "Allow",
            "Action": [
                "rekognition:CreateProject",
                "rekognition:CreateProjectVersion",
                "rekognition:StartProjectVersion",
                "rekognition:StopProjectVersion",
                "rekognition:DescribeProjects",
                "rekognition:DescribeProjectVersions",
                "rekognition:DetectCustomLabels",
                "rekognition:DeleteProject",
                "rekognition:DeleteProjectVersion"
            ],
            "Resource": "*"
        },
        {
            "Sid": "groundTruthPolicies",
            "Effect": "Allow",
            "Action": [
                "groundtruthlabeling:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "s3ExternalBucketPolicies",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketAcl",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectTagging",
                "s3:ListBucket",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-rekognition-custom-labels-bucket/*"
            ]
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
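As a quick sanity check on least privilege, you can lint a policy document programmatically before attaching it. The sketch below is illustrative only (it is not an official AWS validator); it flags statements that pair a wildcard action with a wildcard resource, which the policy above deliberately avoids. The two-statement sample document is hypothetical.

```python
import json

# Flag statements that combine a wildcard action with a wildcard resource,
# which would violate the principle of least privilege.
def overly_broad_statements(policy_json):
    policy = json.loads(policy_json)
    flagged = []
    for stmt in policy["Statement"]:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        if "*" in actions and "*" in resources:
            flagged.append(stmt.get("Sid", "(no Sid)"))
    return flagged

# Hypothetical sample: one scoped statement, one overly broad statement.
policy_doc = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "s3Policies", "Effect": "Allow",
         "Action": ["s3:ListBucket"],
         "Resource": "arn:aws:s3:::custom-labels-console*"},
        {"Sid": "tooBroad", "Effect": "Allow",
         "Action": "*", "Resource": "*"},
    ],
})
print(overly_broad_statements(policy_doc))  # -> ['tooBroad']
```

Note that a statement like s3:ListAllMyBuckets on Resource "*" is not flagged, since only the resource (not the action) is a wildcard.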



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AvxJEoYBcQrxpCW6iD_MyZA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AvxJEoYBcQrxpCW6iD_MyZA.png"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AWSRekognitionS3AclBucketRead20191011",
            "Effect": "Allow",
            "Principal": {
                "Service": "rekognition.amazonaws.com"
            },
            "Action": [
                "s3:GetBucketAcl",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::my-rekognition-custom-labels-bucket"
        },
        {
            "Sid": "AWSRekognitionS3GetBucket20191011",
            "Effect": "Allow",
            "Principal": {
                "Service": "rekognition.amazonaws.com"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectTagging"
            ],
            "Resource": "arn:aws:s3:::my-rekognition-custom-labels-bucket/*"
        },
        {
            "Sid": "AWSRekognitionS3ACLBucketWrite20191011",
            "Effect": "Allow",
            "Principal": {
                "Service": "rekognition.amazonaws.com"
            },
            "Action": "s3:GetBucketAcl",
            "Resource": "arn:aws:s3:::my-rekognition-custom-labels-bucket"
        },
        {
            "Sid": "AWSRekognitionS3PutObject20191011",
            "Effect": "Allow",
            "Principal": {
                "Service": "rekognition.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-rekognition-custom-labels-bucket/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-acl": "bucket-owner-full-control"
                }
            }
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  &lt;strong&gt;Flower Dataset&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We use the &lt;a href="https://www.kaggle.com/c/oxford-102-flower-pytorch/data?select=flower_data.zip" rel="noopener noreferrer"&gt;&lt;strong&gt;Oxford Flower 102 dataset&lt;/strong&gt;&lt;/a&gt; from the &lt;a href="https://www.kaggle.com/c/oxford-102-flower-pytorch/" rel="noopener noreferrer"&gt;Oxford 102 Flower PyTorch Kaggle competition&lt;/a&gt; to build the natural flower classifier with Amazon Rekognition Custom Labels. We use it instead of the &lt;a href="http://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz" rel="noopener noreferrer"&gt;original dataset&lt;/a&gt; from the &lt;a href="https://www.robots.ox.ac.uk/~vgg/index.html" rel="noopener noreferrer"&gt;Visual Geometry Group, University of Oxford&lt;/a&gt; because it has already been split into &lt;em&gt;train&lt;/em&gt;, &lt;em&gt;valid&lt;/em&gt;, and &lt;em&gt;test&lt;/em&gt; datasets, and, more importantly, the &lt;em&gt;train&lt;/em&gt; and &lt;em&gt;valid&lt;/em&gt; images have already been labeled with their respective flower category numbers.&lt;/p&gt;

&lt;p&gt;This dataset has a total of &lt;strong&gt;8,189&lt;/strong&gt; flower images, where the &lt;em&gt;train&lt;/em&gt; split has &lt;em&gt;6,552&lt;/em&gt; images (&lt;strong&gt;80%&lt;/strong&gt;), the &lt;em&gt;valid&lt;/em&gt; split has &lt;em&gt;818&lt;/em&gt; images (&lt;strong&gt;10%&lt;/strong&gt;), and the &lt;em&gt;test&lt;/em&gt; split has &lt;em&gt;819&lt;/em&gt; images (&lt;strong&gt;10%&lt;/strong&gt;). The code snippet below helps to convert each of the 102 flower category numbers to their respective flower category name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import json
with open('cat_to_name.json', 'r') as flower_cat:
    data = flower_cat.read()
flower_types = json.loads(data)
for cur_dir_name, new_dir_name in flower_types.items():
    os.rename(cur_dir_name, new_dir_name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
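For reference, cat_to_name.json maps category numbers (used as folder names in the Kaggle dataset) to flower names. The two-entry fragment below is a hypothetical sample in the same shape as the commonly distributed mapping file (verify the names against your own copy; the real file has 102 entries), and the loop prints the rename plan without touching the filesystem, as a dry run of the os.rename loop above.

```python
import json

# Hypothetical two-entry fragment of cat_to_name.json.
sample = '{"1": "pink primrose", "2": "hard-leaved pocket orchid"}'
flower_types = json.loads(sample)

# Build and print the rename plan without renaming anything.
rename_plan = sorted(flower_types.items())
for number, name in rename_plan:
    print(number, "->", name)
```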



&lt;p&gt;The dataset bucket should have the folder structure shown below, with both train and valid folders. Each should contain 102 subfolders, where each folder name corresponds to a specific flower category name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2Ak5wiB9Bkbpeb538it-gEOA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2Ak5wiB9Bkbpeb538it-gEOA.png" alt="Flower Dataset Bucket Folder Structure"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Creating a New Flower Classifier Project&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;After the necessary setup has been completed, you can sign in to the AWS management console as the IAM user. Follow the steps in this &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/gs-step-create-bucket.html" rel="noopener noreferrer"&gt;guide&lt;/a&gt; to create your new project for Amazon Rekognition Custom Labels.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Creating New Training and Test Datasets&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We create new training and test datasets for the flower classifier project in Amazon Rekognition Custom Labels by &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-s3.html#cd-procedure" rel="noopener noreferrer"&gt;importing images from the S3 bucket&lt;/a&gt;. It is important to give each dataset a clear and distinctive name that distinguishes it from other datasets and identifies whether it is for training or test.&lt;/p&gt;

&lt;p&gt;For the training dataset, the S3 folder location is set to the S3 &lt;em&gt;train&lt;/em&gt; folder path as below. Similarly, for the test dataset, the S3 folder location is set to the S3 &lt;em&gt;valid&lt;/em&gt; folder path.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;s3://my-rekognition-custom-labels-bucket/datasets/oxford_flowers_102/train/&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;s3://my-rekognition-custom-labels-bucket/datasets/oxford_flowers_102/valid/&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;train
|- alpine sea holly
|  |- image_06969.jpg
|  |- image_06970.jpg
|  |- ...
|- anthurium
|  |- image_01964.jpg
|  |- image_01965.jpg
|  |- ...
|...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;valid
|- alpine sea holly
|  |- image_06977.jpg
|  |- image_06978.jpg
|  |- ...
|- anthurium
|  |- image_01972.jpg
|  |- image_01975.jpg
|  |- ...
|...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
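Before importing, it can help to verify that both splits use the same set of label folders, matching the layout shown above. This is a minimal self-contained sketch: it builds a tiny two-class mock of the structure in a temporary directory, so swap in your real dataset root (and expect 102 labels) in practice.

```python
import os
import tempfile

# Return the sorted label folder names directly under a split directory.
def list_labels(split_dir):
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )

# Build a tiny mock of the layout shown above and check both splits agree.
root = tempfile.mkdtemp()
for split in ("train", "valid"):
    for label in ("alpine sea holly", "anthurium"):
        os.makedirs(os.path.join(root, split, label))

train_labels = list_labels(os.path.join(root, "train"))
valid_labels = list_labels(os.path.join(root, "valid"))
print(train_labels == valid_labels)  # True
```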



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AFJnLlNqMy2p5IzyhcA1pHQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AFJnLlNqMy2p5IzyhcA1pHQ.png" alt="Create Training/Test Dataset from Importing Images from S3 Bucket"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All the images in both the training and test datasets are organized into folders whose names represent their respective flower category labels. Please make sure to enable &lt;strong&gt;Automatic Labeling&lt;/strong&gt; by checking the box as shown above, since Amazon Rekognition Custom Labels supports automatic labeling of images organized in such structures. This can save a lot of time and effort compared with manually labeling large image datasets.&lt;/p&gt;

&lt;p&gt;You can safely disregard the “&lt;em&gt;Make sure that your S3 bucket is correctly configured&lt;/em&gt;” message as you should have applied the bucket policy earlier. Please make sure that your bucket name is correct if you use a different name than the one in this example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AEt-0BgVorjXdwVtgEGVKgQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2AEt-0BgVorjXdwVtgEGVKgQ.png" alt="Make sure the S3 Bucket Policy is Configured Correctly"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After you create the training and test datasets, you should see them listed as shown.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AkYH2IvxEwCKUZwTm6CA72g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AkYH2IvxEwCKUZwTm6CA72g.png" alt="Training and Test Datasets"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you click into either of the datasets, you should find that all the images are labeled accordingly. You can click on any of the labels to inspect the images of that label. You can also search for a label in the search text box on the left.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AL_72a9MaScJI3PV-UOPmgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AL_72a9MaScJI3PV-UOPmgw.png" alt="Labeled Images for Moon Orchid"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Training New Flower Classifier Model&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;You can train a new model in the Amazon Rekognition Custom Labels console by following this &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tm-console.html" rel="noopener noreferrer"&gt;guide&lt;/a&gt;. For the test dataset, use the “Choose an existing test dataset” option, as shown below, since the test dataset was already created in the previous section.&lt;/p&gt;

&lt;p&gt;The training based on this flower dataset could take more than an hour (approximately 1 hour and 20 minutes in this case) to complete.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2A1zW9ZF81CKdi9dFfdOQjyg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2A1zW9ZF81CKdi9dFfdOQjyg.png" alt="Train Model"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Evaluating the Trained Model Performance&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;After the flower classifier model is trained, you can review the model performance by accessing the &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tr-console.html" rel="noopener noreferrer"&gt;Evaluation Results&lt;/a&gt; in the console, as shown. You can better understand the metrics for evaluating the model performance from this &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tr-metrics-use.html" rel="noopener noreferrer"&gt;guide&lt;/a&gt;. You should be able to achieve similar model performance evaluation results with the same datasets in Amazon Rekognition Custom Labels.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Per Label Performance&lt;/strong&gt; is a great feature that lets you analyze performance metrics at the per-label level, making it faster and easier to find out which labels perform better or worse than average.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AIeyXmj52TVu4oWabv5iH1A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AIeyXmj52TVu4oWabv5iH1A.png" alt="Flower Classifier Model Evaluation Results Summary and Per Label Performance"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also review and filter the results (&lt;strong&gt;True Positive&lt;/strong&gt;, &lt;strong&gt;False Positive&lt;/strong&gt;, &lt;strong&gt;False Negative&lt;/strong&gt;) of the test images to understand where the model is making incorrect predictions. This information helps you improve your model’s performance by indicating how to change or add images to your training or test dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2A4Sr10Niva_FmlR9DMZP1Kg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2A4Sr10Niva_FmlR9DMZP1Kg.png" alt="Test Results Evaluation Gallery"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2ApLBuc2O5p1uUQhv0op31Ig.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2ApLBuc2O5p1uUQhv0op31Ig.png" alt="Test Results filtered by False Positive"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Starting the Flower Classifier Model&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;When you are happy with the performance of your trained flower classifier model, you can use it to predict flowers of your choice. Before you can use it, however, you need to start the model. At the bottom of the model evaluation results page, there are sample AWS CLI commands to start and stop the model and to analyze flower images with it. You can refer to this &lt;a href="https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/gs-step-start-model.html" rel="noopener noreferrer"&gt;guide&lt;/a&gt; for the detailed steps to start the model and to set up the AWS CLI for the IAM user.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AGQ98iPMTepdbdKuTiiRu0w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AGQ98iPMTepdbdKuTiiRu0w.png" alt="Use Model with AWS CLI Commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To start the model, use the AWS CLI command, as shown below. Note that you should change the command line arguments based on your setup or preference. The named profile is specific to the IAM user created for Amazon Rekognition Custom Labels.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws rekognition start-project-version \
  --project-version-arn "MODEL_ARN" \
  --min-inference-units 1 \
  --region us-east-1 \
  --profile customlabels-iam
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
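Starting is asynchronous, so you can poll aws rekognition describe-project-versions and read the Status field of the matching version until it reports RUNNING. Below is a minimal parser for that response (the field names follow the DescribeProjectVersions API; the trimmed-down sample response and the MODEL_ARN placeholder are illustrative, and a real response carries many more fields).

```python
import json

# Extract the status of a given model version from a
# describe-project-versions JSON response.
def model_status(response_json, version_arn):
    response = json.loads(response_json)
    for desc in response["ProjectVersionDescriptions"]:
        if desc["ProjectVersionArn"] == version_arn:
            return desc["Status"]
    return None

# Trimmed-down illustrative response.
response = json.dumps({
    "ProjectVersionDescriptions": [
        {"ProjectVersionArn": "MODEL_ARN", "Status": "RUNNING"}
    ]
})
print(model_status(response, "MODEL_ARN"))  # RUNNING
```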



&lt;p&gt;Starting the model takes a while (approximately 15 minutes in this case) to complete. You should see the model status displayed as RUNNING in the console, as shown.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AKJKmNU3UrqxtzALCQ57vMA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AKJKmNU3UrqxtzALCQ57vMA.png" alt="Model Started"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Classifying with Unseen Flower Images&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;After the model is running, you can use it to predict the flower types of images that appear in neither the training nor the test dataset, to determine how well your model performs on supported flower types it has not seen before. You can use the AWS CLI command below to determine the predicted label of your image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws rekognition detect-custom-labels \
  --project-version-arn "MODEL_ARN" \
  --image '{"S3Object": {"Bucket": "BUCKET_NAME", "Name": "IMAGE_PATH"}}' \
  --region us-east-1 \
  --profile customlabels-iam
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
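The command returns a JSON document with a CustomLabels array of Name/Confidence entries, as in the results that follow. Picking the top prediction above a minimum confidence is worth wrapping in a small helper; this sketch assumes that response shape, and the 90.0 threshold is an arbitrary illustrative choice.

```python
import json

# Return the highest-confidence label at or above min_confidence, else None.
def top_label(response_json, min_confidence=90.0):
    labels = json.loads(response_json)["CustomLabels"]
    candidates = [l for l in labels if l["Confidence"] >= min_confidence]
    if not candidates:
        return None
    return max(candidates, key=lambda l: l["Confidence"])["Name"]

response = json.dumps({
    "CustomLabels": [{"Name": "rose", "Confidence": 99.94}]
})
print(top_label(response))                  # rose
print(top_label('{"CustomLabels": []}'))    # None
```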



&lt;p&gt;Here are some prediction results on images that are self-taken or independent of both the training and test datasets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F836%2F1%2AQPtkfVv4IHtze1e4daxWnQ.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F836%2F1%2AQPtkfVv4IHtze1e4daxWnQ.jpeg" alt="Roses"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": [
        {
            "Name": "rose",
            "Confidence": 99.93900299072266
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F668%2F1%2AJK8XGlRMt3smxNwQdQEOyw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F668%2F1%2AJK8XGlRMt3smxNwQdQEOyw.jpeg" alt="Lotus"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": [
        {
            "Name": "lotus",
            "Confidence": 99.7560043334961
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F834%2F1%2AH_JLiio0U2izEV4Hjm5H7g.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F834%2F1%2AH_JLiio0U2izEV4Hjm5H7g.jpeg" alt="Moon Orchid"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": [
        {
            "Name": "moon orchid",
            "Confidence": 98.02899932861328
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F861%2F1%2AZJ_Pv1EIuQbAcEwkMLC-Fw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F861%2F1%2AZJ_Pv1EIuQbAcEwkMLC-Fw.jpeg" alt="Hibiscus"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": [
        {
            "Name": "hibiscus",
            "Confidence": 98.11100006103516
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F669%2F1%2AhebIOScyvLDZ7R5lNjTx2g.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F669%2F1%2AhebIOScyvLDZ7R5lNjTx2g.jpeg" alt="Sunflower"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": [
        {
            "Name": "sunflower",
            "Confidence": 99.86699676513672
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2A_GIMAmdSUiNJTEOH1qmJ8w.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2A_GIMAmdSUiNJTEOH1qmJ8w.jpeg" alt="Artificial Flower 1"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": []
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2Az5PzvSwIBmws4WE0nra4lA.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F875%2F1%2Az5PzvSwIBmws4WE0nra4lA.jpeg" alt="Artificial Flower 2"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "CustomLabels": []
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
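&lt;p&gt;The JSON responses above can be consumed programmatically. Below is a minimal sketch (not part of the original walkthrough; the helper name and the 90% threshold are illustrative assumptions) of how a DetectCustomLabels-style response could be post-processed:&lt;/p&gt;

```python
import json

# Hypothetical helper: turn a DetectCustomLabels-style JSON response
# into the best label above a confidence threshold, or None when the
# model returns no labels (as it does for the artificial flowers above).
def top_label(response_json, min_confidence=90.0):
    labels = json.loads(response_json).get("CustomLabels", [])
    labels = [l for l in labels if l["Confidence"] >= min_confidence]
    if not labels:
        return None
    best = max(labels, key=lambda l: l["Confidence"])
    return best["Name"], best["Confidence"]

print(top_label('{"CustomLabels": [{"Name": "sunflower", "Confidence": 99.87}]}'))
print(top_label('{"CustomLabels": []}'))
```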



&lt;h1&gt;
  
  
  &lt;strong&gt;Cleaning Up Resources&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;You are charged for the amount of time your model is running. If you have finished using the model, you should stop it. You can use the AWS CLI command below to stop the model and avoid incurring unnecessary costs.&lt;/p&gt;

&lt;p&gt;You should also delete the Custom Labels project and the datasets in the S3 bucket if they are no longer needed, to avoid further storage costs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws rekognition stop-project-version \
  --project-version-arn "MODEL_ARN" \
  --region us-east-1 \
  --profile customlabels-iam
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stopping the model is faster than starting it; it takes approximately 5 minutes in this case. You should see the model status change to STOPPED in the console.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AzVsg9aHltNkpOqW_ZQDv9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmiro.medium.com%2Fmax%2F1250%2F1%2AzVsg9aHltNkpOqW_ZQDv9w.png" alt="Model Stopped"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Conclusions and Next Steps&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;This post shows the complete step-by-step walkthrough to create a natural flower classifier using Amazon Rekognition Custom Labels with AWS best practices based on the AWS Well-Architected Framework. It also shows that you can build a high-performance custom computer vision model with Amazon Rekognition Custom Labels without machine learning expertise.&lt;/p&gt;

&lt;p&gt;The model built in this walkthrough has an &lt;em&gt;F1 score&lt;/em&gt; of &lt;strong&gt;0.997&lt;/strong&gt;, which is not easy to achieve on the same dataset if built from scratch, even with extensive machine learning expertise. The model also performs well on unseen samples of natural flowers and, as expected, returns no predictions for samples of artificial flowers.&lt;/p&gt;
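&lt;p&gt;For reference, the F1 score is the harmonic mean of precision and recall. A quick sketch (the input values here are illustrative, not taken from the actual model evaluation):&lt;/p&gt;

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative values only: balanced precision and recall of 0.997
# yield an F1 score of 0.997.
print(round(f1_score(0.997, 0.997), 3))
```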

&lt;p&gt;If you are interested in building a natural flower classifier from scratch, you might be interested in my post: &lt;a href="https://towardsdatascience.com/build-train-and-deploy-a-real-world-flower-classifier-of-102-flower-types-a90f66d2092a" rel="noopener noreferrer"&gt;&lt;strong&gt;Build, Train and Deploy A Real-World Flower Classifier of 102 Flower Types&lt;/strong&gt; — With TensorFlow 2.3, Amazon SageMaker Python SDK 2.x, and Custom SageMaker Training &amp;amp; Serving Docker Containers&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>computervision</category>
    </item>
    <item>
      <title>Build, Train and Deploy A Real-World Flower Classifier of 102 Flower Types</title>
      <dc:creator>Juv Chan</dc:creator>
      <pubDate>Fri, 11 Sep 2020 13:05:35 +0000</pubDate>
      <link>https://forem.com/aws-builders/build-train-and-deploy-a-real-world-flower-classifier-of-102-flower-types-2ko7</link>
      <guid>https://forem.com/aws-builders/build-train-and-deploy-a-real-world-flower-classifier-of-102-flower-types-2ko7</guid>
      <description>&lt;h3&gt;
  
  
  &lt;em&gt;With TensorFlow 2.3, Amazon SageMaker Python SDK 2.5.x and Custom SageMaker Training &amp;amp; Serving Docker Containers&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_600-cC2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/791/1%2AU5fka0ETq6k3mVVd4zyrMA.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_600-cC2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/791/1%2AU5fka0ETq6k3mVVd4zyrMA.jpeg" alt="Copyright Juv Chan's Flower"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;I love flowers. The lotus flower above is one of my favorite flower photos, taken during my visit to the Summer Palace in Beijing in 2008. Since I am a developer who enjoys learning and working on artificial intelligence and cloud projects, I decided to write this blog post to share my project on building a real-world flower classifier with TensorFlow, Amazon SageMaker and Docker.&lt;/p&gt;

&lt;p&gt;This post shows a step-by-step guide on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using ready-to-use &lt;a href="https://www.tensorflow.org/datasets/catalog/oxford_flowers102"&gt;&lt;strong&gt;Flower dataset&lt;/strong&gt;&lt;/a&gt; from &lt;a href="https://www.tensorflow.org/datasets"&gt;&lt;strong&gt;TensorFlow Datasets&lt;/strong&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Using &lt;a href="https://www.tensorflow.org/tutorials/images/transfer_learning"&gt;&lt;strong&gt;Transfer Learning&lt;/strong&gt;&lt;/a&gt; for &lt;strong&gt;feature extraction&lt;/strong&gt; from a &lt;strong&gt;pre-trained model&lt;/strong&gt; from &lt;a href="https://www.tensorflow.org/hub"&gt;&lt;strong&gt;TensorFlow Hub&lt;/strong&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Using &lt;a href="https://www.tensorflow.org/guide/data"&gt;&lt;strong&gt;tf.data&lt;/strong&gt;&lt;/a&gt; API to build &lt;strong&gt;input pipelines&lt;/strong&gt; for the dataset split into &lt;strong&gt;training&lt;/strong&gt;, &lt;strong&gt;validation&lt;/strong&gt; and &lt;strong&gt;test&lt;/strong&gt; datasets.&lt;/li&gt;
&lt;li&gt;Using &lt;a href="https://www.tensorflow.org/api_docs/python/tf/keras"&gt;&lt;strong&gt;tf.keras&lt;/strong&gt;&lt;/a&gt; API to build, train and evaluate the model.&lt;/li&gt;
&lt;li&gt;Using &lt;a href="https://www.tensorflow.org/guide/keras/custom_callback"&gt;&lt;strong&gt;Callback&lt;/strong&gt;&lt;/a&gt; to define &lt;strong&gt;early stopping&lt;/strong&gt; threshold for model training.&lt;/li&gt;
&lt;li&gt;Preparing a &lt;a href="https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html#id1"&gt;&lt;strong&gt;training script&lt;/strong&gt;&lt;/a&gt; to train and export the model in &lt;a href="https://www.tensorflow.org/guide/saved_model"&gt;&lt;strong&gt;SavedModel&lt;/strong&gt;&lt;/a&gt; format for deployment with &lt;strong&gt;TensorFlow 2.x&lt;/strong&gt; and &lt;a href="https://sagemaker.readthedocs.io/en/stable/v2.html"&gt;&lt;strong&gt;Amazon SageMaker Python SDK 2.x&lt;/strong&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Preparing &lt;strong&gt;inference code&lt;/strong&gt; and configuration to run the &lt;a href="https://www.tensorflow.org/tfx/serving/serving_advanced"&gt;&lt;strong&gt;TensorFlow Serving ModelServer&lt;/strong&gt;&lt;/a&gt; for serving the model.&lt;/li&gt;
&lt;li&gt;Building custom &lt;a href="https://www.docker.com/resources/what-container"&gt;&lt;strong&gt;Docker Containers&lt;/strong&gt;&lt;/a&gt; for training and serving the &lt;strong&gt;TensorFlow model&lt;/strong&gt; with &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/build-container-to-train-script-get-started.html"&gt;&lt;strong&gt;Amazon SageMaker Python SDK&lt;/strong&gt;&lt;/a&gt; and &lt;a href="https://github.com/aws/sagemaker-tensorflow-training-toolkit"&gt;&lt;strong&gt;SageMaker TensorFlow Training Toolkit&lt;/strong&gt;&lt;/a&gt; in &lt;strong&gt;Local mode&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The project is available to the public at:&lt;br&gt;
&lt;a href="https://github.com/juvchan/amazon-sagemaker-tensorflow-custom-containers"&gt;https://github.com/juvchan/amazon-sagemaker-tensorflow-custom-containers&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Setup&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Below is the list of system, hardware, software and Python packages used to develop and test the project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ubuntu 18.04.5 LTS&lt;/li&gt;
&lt;li&gt;Docker 19.03.12&lt;/li&gt;
&lt;li&gt;Python 3.8.5&lt;/li&gt;
&lt;li&gt;Conda 4.8.4&lt;/li&gt;
&lt;li&gt;NVIDIA GeForce RTX 2070&lt;/li&gt;
&lt;li&gt;NVIDIA Container Runtime Library 1.20&lt;/li&gt;
&lt;li&gt;NVIDIA CUDA Toolkit 10.1&lt;/li&gt;
&lt;li&gt;sagemaker 2.5.3&lt;/li&gt;
&lt;li&gt;sagemaker-tensorflow-training 20.1.2&lt;/li&gt;
&lt;li&gt;tensorflow-gpu 2.3.0&lt;/li&gt;
&lt;li&gt;tensorflow-datasets 3.2.1&lt;/li&gt;
&lt;li&gt;tensorflow-hub 0.9.0&lt;/li&gt;
&lt;li&gt;tensorflow-model-server 2.3.0&lt;/li&gt;
&lt;li&gt;jupyterlab 2.2.6&lt;/li&gt;
&lt;li&gt;Pillow 7.2.0&lt;/li&gt;
&lt;li&gt;matplotlib 3.3.1&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Flower Dataset&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;TensorFlow Datasets (TFDS) is a collection of public datasets ready to use with TensorFlow, JAX and other machine learning frameworks. All TFDS datasets are exposed as &lt;a href="https://www.tensorflow.org/api_docs/python/tf/data/Dataset"&gt;tf.data.Datasets&lt;/a&gt;, which are easy to use for high-performance input pipelines.&lt;/p&gt;

&lt;p&gt;There are a total of &lt;strong&gt;195&lt;/strong&gt; ready-to-use datasets available in TFDS to date, including two flower datasets: &lt;strong&gt;oxford_flowers102&lt;/strong&gt; and &lt;strong&gt;tf_flowers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;oxford_flowers102&lt;/strong&gt; dataset is used because it has both a larger dataset size and a larger number of flower categories.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ds_name = 'oxford_flowers102'
splits = ['test', 'validation', 'train']
ds, info = tfds.load(ds_name, split = splits, with_info=True)
(train_examples, validation_examples, test_examples) = ds
print(f"Number of flower types {info.features['label'].num_classes}")
print(f"Number of training examples: {tf.data.experimental.cardinality(train_examples)}")
print(f"Number of validation examples: {tf.data.experimental.cardinality(validation_examples)}")
print(f"Number of test examples: {tf.data.experimental.cardinality(test_examples)}\n")

print('Flower types full list:')
print(info.features['label'].names)

tfds.show_examples(train_examples, info, rows=2, cols=8)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RGq_CaKM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AVQkETmU-buuodJkwnHFtTg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RGq_CaKM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AVQkETmU-buuodJkwnHFtTg.png" alt="Flower Samples"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Create SageMaker TensorFlow Training Script&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Amazon SageMaker&lt;/strong&gt; allows users to use a training script or inference code in much the same way as they would outside SageMaker to run custom training or inference algorithms.&lt;br&gt;One of the differences is that a training script used with Amazon SageMaker can make use of the &lt;a href="https://github.com/aws/sagemaker-containers#important-environment-variables"&gt;&lt;strong&gt;SageMaker Containers Environment Variables&lt;/strong&gt;&lt;/a&gt;, e.g. &lt;strong&gt;SM_MODEL_DIR&lt;/strong&gt;, &lt;strong&gt;SM_NUM_GPUS&lt;/strong&gt; and &lt;strong&gt;SM_NUM_CPUS&lt;/strong&gt;, inside the SageMaker container.&lt;/p&gt;
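&lt;p&gt;These environment variables can be read with ordinary &lt;code&gt;os.environ&lt;/code&gt; lookups. A minimal sketch (the fallback values here are assumptions for running the script outside a container, not SageMaker defaults):&lt;/p&gt;

```python
import os

# Read SageMaker Containers environment variables with local fallbacks,
# so the same script also runs outside a SageMaker container.
# The fallback values are illustrative assumptions for local testing.
model_dir = os.environ.get("SM_MODEL_DIR", "/tmp/model")
num_gpus = int(os.environ.get("SM_NUM_GPUS", "0"))
num_cpus = int(os.environ.get("SM_NUM_CPUS", str(os.cpu_count() or 1)))

print(model_dir, num_gpus, num_cpus)
```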

&lt;blockquote&gt;
&lt;p&gt;Amazon SageMaker always uses Docker containers when running scripts, training algorithms or deploying models. Amazon SageMaker provides containers for its built-in algorithms and pre-built Docker images for some of the most common machine learning frameworks.  You can also create your own container images to manage more advanced use cases not addressed by the containers provided by Amazon SageMaker.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The custom training script is as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import argparse
import numpy as np
import os
import logging
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds


EPOCHS = 5
BATCH_SIZE = 32
LEARNING_RATE = 0.001
DROPOUT_RATE = 0.3
EARLY_STOPPING_TRAIN_ACCURACY = 0.995
TF_AUTOTUNE = tf.data.experimental.AUTOTUNE
TF_HUB_MODEL_URL = 'https://tfhub.dev/google/inaturalist/inception_v3/feature_vector/4'
TF_DATASET_NAME = 'oxford_flowers102'
IMAGE_SIZE = (299, 299)
SHUFFLE_BUFFER_SIZE = 473
MODEL_VERSION = '1'


class EarlyStoppingCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if(logs.get('accuracy') &amp;gt; EARLY_STOPPING_TRAIN_ACCURACY):
            print(
                f"\nEarly stopping at {logs.get('accuracy'):.4f} &amp;gt; {EARLY_STOPPING_TRAIN_ACCURACY}!\n")
            self.model.stop_training = True


def parse_args():
    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script
    parser.add_argument('--epochs', type=int, default=EPOCHS)
    parser.add_argument('--batch_size', type=int, default=BATCH_SIZE)
    parser.add_argument('--learning_rate', type=float, default=LEARNING_RATE)

    # model_dir is always passed in from SageMaker. By default this is a S3 path under the default bucket.
    parser.add_argument('--model_dir', type=str)
    parser.add_argument('--sm_model_dir', type=str,
                        default=os.environ.get('SM_MODEL_DIR'))
    parser.add_argument('--model_version', type=str, default=MODEL_VERSION)

    return parser.parse_known_args()


def set_gpu_memory_growth():
    gpus = tf.config.list_physical_devices('GPU')

    if gpus:
        print("\nGPU Available.")
        print(f"Number of GPU: {len(gpus)}")
        try:
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
                print(f"Enabled Memory Growth on {gpu.name}\n")
                print()
        except RuntimeError as e:
            print(e)

    print()


def get_datasets(dataset_name):
    tfds.disable_progress_bar()

    splits = ['test', 'validation', 'train']
    splits, ds_info = tfds.load(dataset_name, split=splits, with_info=True)
    (ds_train, ds_validation, ds_test) = splits

    return (ds_train, ds_validation, ds_test), ds_info


def parse_image(features):
    image = features['image']
    image = tf.image.resize(image, IMAGE_SIZE) / 255.0
    return image, features['label']


def training_pipeline(train_raw, batch_size):
    train_preprocessed = train_raw.shuffle(SHUFFLE_BUFFER_SIZE).map(
        parse_image, num_parallel_calls=TF_AUTOTUNE).cache().batch(batch_size).prefetch(TF_AUTOTUNE)

    return train_preprocessed


def test_pipeline(test_raw, batch_size):
    test_preprocessed = test_raw.map(parse_image, num_parallel_calls=TF_AUTOTUNE).cache(
    ).batch(batch_size).prefetch(TF_AUTOTUNE)

    return test_preprocessed


def create_model(train_batches, val_batches, learning_rate):
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    base_model = hub.KerasLayer(TF_HUB_MODEL_URL,
                                input_shape=IMAGE_SIZE + (3,), trainable=False)

    early_stop_callback = EarlyStoppingCallback()

    model = tf.keras.Sequential([
        base_model,
        tf.keras.layers.Dropout(DROPOUT_RATE),
        tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')
    ])

    model.compile(optimizer=optimizer,
                  loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    model.summary()

    model.fit(train_batches, epochs=args.epochs,
              validation_data=val_batches,
              callbacks=[early_stop_callback])

    return model


if __name__ == "__main__":
    args, _ = parse_args()
    batch_size = args.batch_size
    epochs = args.epochs
    learning_rate = args.learning_rate
    print(
        f"\nBatch Size = {batch_size}, Epochs = {epochs}, Learning Rate = {learning_rate}\n")

    set_gpu_memory_growth()

    (ds_train, ds_validation, ds_test), ds_info = get_datasets(TF_DATASET_NAME)
    NUM_CLASSES = ds_info.features['label'].num_classes

    print(
        f"\nNumber of Training dataset samples: {tf.data.experimental.cardinality(ds_train)}")
    print(
        f"Number of Validation dataset samples: {tf.data.experimental.cardinality(ds_validation)}")
    print(
        f"Number of Test dataset samples: {tf.data.experimental.cardinality(ds_test)}")
    print(f"Number of Flower Categories: {NUM_CLASSES}\n")

    train_batches = training_pipeline(ds_train, batch_size)
    validation_batches = test_pipeline(ds_validation, batch_size)
    test_batches = test_pipeline(ds_test, batch_size)

    model = create_model(train_batches, validation_batches, learning_rate)
    eval_results = model.evaluate(test_batches)

    for metric, value in zip(model.metrics_names, eval_results):
        print(metric + ': {:.4f}'.format(value))

    export_path = os.path.join(args.sm_model_dir, args.model_version)
    print(
        f'\nModel version: {args.model_version} exported to: {export_path}\n')

    model.save(export_path)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;strong&gt;Transfer Learning with TensorFlow Hub (TF-Hub)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://tfhub.dev/"&gt;&lt;strong&gt;TensorFlow Hub&lt;/strong&gt;&lt;/a&gt; is a library of reusable pre-trained machine learning models for transfer learning in different problem domains.&lt;br&gt;For this flower classification problem, we evaluate the &lt;strong&gt;pre-trained image feature vectors&lt;/strong&gt; based on different image model architectures and datasets from TF-Hub as below for transfer learning on the &lt;strong&gt;oxford_flowers102&lt;/strong&gt; dataset.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://tfhub.dev/tensorflow/resnet_50/feature_vector/1"&gt;ResNet50 Feature Vector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4"&gt;MobileNet V2 (ImageNet) Feature Vector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tfhub.dev/google/imagenet/inception_v3/feature_vector/4"&gt;Inception V3 (ImageNet) Feature Vector&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tfhub.dev/google/inaturalist/inception_v3/feature_vector/4"&gt;Inception V3 (iNaturalist) Feature Vector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the final training script, the &lt;strong&gt;Inception V3 (iNaturalist) feature vector&lt;/strong&gt; pre-trained model is used for transfer learning because it performs best among the candidates above &lt;strong&gt;(~95% test accuracy over 5 epochs without fine-tuning)&lt;/strong&gt;. This model uses the Inception V3 architecture and was trained on the &lt;a href="https://arxiv.org/abs/1707.06642"&gt;&lt;strong&gt;iNaturalist (iNat) 2017&lt;/strong&gt;&lt;/a&gt; dataset of &lt;strong&gt;over 5,000&lt;/strong&gt; different species of plants and animals from &lt;a href="https://www.inaturalist.org/"&gt;https://www.inaturalist.org/&lt;/a&gt;. In contrast, the &lt;a href="https://www.tensorflow.org/datasets/catalog/imagenet2012"&gt;&lt;strong&gt;ImageNet 2012&lt;/strong&gt;&lt;/a&gt; dataset has only 1,000 classes, very few of which are flower types.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Serve Flower Classifier with TensorFlow Serving&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TensorFlow Serving&lt;/strong&gt; is a flexible, high-performance serving system for machine learning models, designed for production environments. It is part of &lt;a href="https://www.tensorflow.org/tfx"&gt;&lt;strong&gt;TensorFlow Extended (TFX)&lt;/strong&gt;&lt;/a&gt;, an end-to-end platform for deploying production Machine Learning (ML) pipelines. The &lt;a href="https://www.tensorflow.org/tfx/serving/setup#available_binaries"&gt;&lt;strong&gt;TensorFlow Serving ModelServer binary&lt;/strong&gt;&lt;/a&gt; is available in two variants: &lt;strong&gt;tensorflow-model-server&lt;/strong&gt; and &lt;strong&gt;tensorflow-model-server-universal&lt;/strong&gt;. The &lt;strong&gt;TensorFlow Serving ModelServer&lt;/strong&gt; supports both &lt;a href="https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto"&gt;gRPC APIs&lt;/a&gt; and &lt;a href="https://www.tensorflow.org/tfx/serving/api_rest"&gt;RESTful APIs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the inference code, the &lt;strong&gt;tensorflow-model-server&lt;/strong&gt; binary is used to serve the model via RESTful APIs from the location where it is exported in the SageMaker container. It is a fully optimized server that uses platform-specific compiler optimizations and should be the preferred option for most users. The inference code is shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/usr/bin/env python

# This file implements the hosting solution, which just starts TensorFlow Model Serving.
import subprocess
import os

TF_SERVING_DEFAULT_PORT = 8501
MODEL_NAME = 'flowers_model'
MODEL_BASE_PATH = '/opt/ml/model'


def start_server():
    print('Starting TensorFlow Serving.')

    # link the log streams to stdout/err so they will be logged to the container logs
    subprocess.check_call(
        ['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
    subprocess.check_call(
        ['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])

    # start nginx server
    nginx = subprocess.Popen(['nginx', '-c', '/opt/ml/code/nginx.conf'])

    # start TensorFlow Serving
    # https://www.tensorflow.org/serving/api_rest#start_modelserver_with_the_rest_api_endpoint
    tf_model_server = subprocess.call(['tensorflow_model_server',
                                       '--rest_api_port=' +
                                       str(TF_SERVING_DEFAULT_PORT),
                                       '--model_name=' + MODEL_NAME,
                                       '--model_base_path=' + MODEL_BASE_PATH])


# The main routine just invokes the start function.
if __name__ == '__main__':
    start_server()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;strong&gt;Build Custom Docker Image and Container for SageMaker Training and Inference&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Amazon SageMaker utilizes Docker containers to run all training jobs and inference endpoints.&lt;br&gt;&lt;/p&gt;

&lt;p&gt;Amazon SageMaker provides pre-built Docker containers that support machine learning frameworks such as &lt;a href="https://github.com/aws/sagemaker-scikit-learn-container"&gt;&lt;strong&gt;SageMaker Scikit-learn Container&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://github.com/aws/sagemaker-xgboost-container"&gt;&lt;strong&gt;SageMaker XGBoost Container&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://github.com/aws/sagemaker-sparkml-serving-container"&gt;&lt;strong&gt;SageMaker SparkML Serving Container&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://github.com/aws/deep-learning-containers"&gt;&lt;strong&gt;Deep Learning Containers&lt;/strong&gt;&lt;/a&gt; (TensorFlow, PyTorch, MXNet and Chainer) as well as &lt;a href="https://github.com/aws/sagemaker-rl-container"&gt;&lt;strong&gt;SageMaker RL (Reinforcement Learning) Container&lt;/strong&gt;&lt;/a&gt; for training and inference. These pre-built SageMaker containers should be sufficient for general purpose machine learning training and inference scenarios.&lt;br&gt;&lt;/p&gt;

&lt;p&gt;There are some scenarios that the pre-built SageMaker containers cannot support, e.g.:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using unsupported machine learning framework versions&lt;/li&gt;
&lt;li&gt;Using third-party packages, libraries, run-times or dependencies which are not available in the pre-built SageMaker container&lt;/li&gt;
&lt;li&gt;Using custom machine learning algorithms
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Amazon SageMaker supports user-provided custom Docker images and containers for the advanced scenarios above.&lt;br&gt;Users can use any programming language, framework or packages to build their own Docker image and container tailored to their machine learning scenario with Amazon SageMaker.&lt;/p&gt;

&lt;p&gt;In this flower classification scenario, a custom Docker image and custom containers are used for training and inference because the pre-built SageMaker TensorFlow containers do not include the packages required for training, i.e. &lt;strong&gt;tensorflow_hub&lt;/strong&gt; and &lt;strong&gt;tensorflow_datasets&lt;/strong&gt;. Below is the &lt;strong&gt;Dockerfile&lt;/strong&gt; used to build the custom Docker image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Copyright 2020 Juv Chan. All Rights Reserved.
FROM tensorflow/tensorflow:2.3.0-gpu

LABEL maintainer="Juv Chan &amp;lt;juvchan@hotmail.com&amp;gt;"

RUN apt-get update &amp;amp;&amp;amp; apt-get install -y --no-install-recommends nginx curl
RUN pip install --no-cache-dir --upgrade pip tensorflow-hub tensorflow-datasets sagemaker-tensorflow-training

RUN echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list
RUN curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
RUN apt-get update &amp;amp;&amp;amp; apt-get install -y tensorflow-model-server

ENV PATH="/opt/ml/code:${PATH}"

# /opt/ml and all subdirectories are utilized by SageMaker, we use the /code subdirectory to store our user code.
COPY /code /opt/ml/code
WORKDIR /opt/ml/code

RUN chmod 755 serve
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The Docker command below builds the custom Docker image used for both training and hosting with SageMaker in this project.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker build ./container/ -t sagemaker-custom-tensorflow-container-gpu:1.0&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;After the Docker image is built successfully, use the Docker command below to verify that the new image is listed as expected.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker images&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sr4cJ8fk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ANai3DAhGO37qAd8OxTRsSw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sr4cJ8fk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ANai3DAhGO37qAd8OxTRsSw.png" alt="Docker images built"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;SageMaker Training In Local Mode&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;SageMaker Python SDK&lt;/strong&gt; supports &lt;a href="https://sagemaker.readthedocs.io/en/stable/overview.html#local-mode"&gt;&lt;strong&gt;local mode&lt;/strong&gt;&lt;/a&gt;, which allows users to create estimators, train models and deploy them to their local environments. This is very useful and cost-effective for anyone who wants to prototype, build, develop and test their machine learning projects in a Jupyter Notebook with the SageMaker Python SDK on a local instance before running in the cloud.&lt;/p&gt;

&lt;p&gt;The Amazon SageMaker local mode supports &lt;strong&gt;local CPU instance (single and multiple-instance)&lt;/strong&gt; and &lt;strong&gt;local GPU instance (single instance)&lt;/strong&gt;. It also allows users to switch seamlessly between local and cloud instances (i.e. &lt;a href="https://aws.amazon.com/ec2/instance-types/"&gt;Amazon EC2 instance&lt;/a&gt;) by changing the &lt;strong&gt;instance_type&lt;/strong&gt; argument for the SageMaker Estimator object (Note: This argument is previously known as &lt;strong&gt;train_instance_type&lt;/strong&gt; in SageMaker Python SDK 1.x). Everything else works the same.&lt;/p&gt;

&lt;p&gt;In this scenario, the local GPU instance is used by default if available; otherwise it falls back to the local CPU instance. Note that &lt;strong&gt;output_path&lt;/strong&gt; is set to the local current directory (&lt;strong&gt;file://.&lt;/strong&gt;), which outputs the trained model artifacts to the local current directory instead of uploading them to Amazon S3. The &lt;strong&gt;image_uri&lt;/strong&gt; is set to the locally built custom Docker image so that SageMaker does not fetch one of the pre-built Docker images based on framework and version. You can refer to the latest &lt;a href="https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/sagemaker.tensorflow.html"&gt;SageMaker TensorFlow Estimator&lt;/a&gt; and &lt;a href="https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html"&gt;SageMaker Estimator Base&lt;/a&gt; API documentation for full details.&lt;/p&gt;

&lt;p&gt;In addition, &lt;strong&gt;hyperparameters&lt;/strong&gt; can be passed to the training script by setting the &lt;strong&gt;hyperparameters&lt;/strong&gt; of the SageMaker Estimator object. The hyperparameters that can be set depend on the hyperparameters used in the training script. In this case, they are &lt;em&gt;'epochs'&lt;/em&gt;, &lt;em&gt;'batch_size'&lt;/em&gt; and &lt;em&gt;'learning_rate'&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sagemaker.tensorflow import TensorFlow

instance_type = 'local_gpu' # For Local GPU training. For Local CPU Training, type = 'local'

gpu = tf.config.list_physical_devices('GPU')

if len(gpu) == 0:
    instance_type = 'local'

print(f'Instance type = {instance_type}')

role = 'SageMakerRole' # Import get_execution_role from sagemaker and use get_execution_role() on SageMaker Notebook instance

hyperparams = {'epochs': 5}

tf_local_estimator = TensorFlow(entry_point='train.py', role=role, 
                                instance_count=1, instance_type=instance_type, output_path='file://.',
                                image_uri='sagemaker-custom-tensorflow-container-gpu:1.0',
                                hyperparameters=hyperparams)
tf_local_estimator.fit()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BsDWreLP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ApaJHgdSvSbGlG8uHlZf77Q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BsDWreLP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ApaJHgdSvSbGlG8uHlZf77Q.png" alt="SageMaker Training Job Completed Successfully"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;SageMaker Local Endpoint Deployment and Model Serving&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;After the SageMaker training job is completed, the Docker container that ran the job exits. When the training completes successfully, the trained model can be deployed to a &lt;strong&gt;local SageMaker endpoint&lt;/strong&gt; by calling the &lt;strong&gt;deploy&lt;/strong&gt; method of the SageMaker Estimator object and setting &lt;strong&gt;instance_type&lt;/strong&gt; to a local instance type (i.e. &lt;strong&gt;local_gpu&lt;/strong&gt; or &lt;strong&gt;local&lt;/strong&gt;).&lt;/p&gt;

&lt;p&gt;A new Docker container will be started to run the custom inference code (i.e. the &lt;strong&gt;serve&lt;/strong&gt; program), which runs the TensorFlow Serving ModelServer to serve the model for real-time inference. The ModelServer serves in RESTful API mode and expects both request and response data in JSON format. When the local SageMaker endpoint is deployed successfully, users can make prediction requests to the endpoint and get prediction responses in real time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tf_local_predictor = tf_local_estimator.deploy(initial_instance_count=1,                                                          
                          instance_type=instance_type)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
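As a sketch of the JSON payload shape the ModelServer works with (the `instances` key holds a batch of inputs; the tiny dummy array below is a hypothetical stand-in for a real 299&times;299&times;3 image), the request body can be built with the standard library alone:

```python
import json

# Build a request body in the row format TensorFlow Serving's REST API expects:
# {"instances": [<input 1>, <input 2>, ...]}. A single dummy 2x2x3 "image"
# stands in for a real preprocessed flower image, purely to show the structure.
dummy_image = [[[0.5, 0.5, 0.5], [0.1, 0.2, 0.3]],
               [[0.9, 0.8, 0.7], [0.0, 0.0, 0.0]]]
request_body = json.dumps({"instances": [dummy_image]})

# A successful response mirrors this shape under a "predictions" key,
# with one score vector per input instance.
parsed = json.loads(request_body)
batch_size = len(parsed["instances"])
```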



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RG_IIdPn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2Ax5YU3qgptmEoZ9tM9D_uZw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RG_IIdPn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2Ax5YU3qgptmEoZ9tM9D_uZw.png" alt="SageMaker Local Inference Endpoint Deployed Successfully"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iIoYPdeu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2ArPFlzGoHZvNuvc-fu1aMRQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iIoYPdeu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2ArPFlzGoHZvNuvc-fu1aMRQ.png" alt="SageMaker Containers Status"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Predict Flower Type with External Sources of Flower Images&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To evaluate the flower classification model's performance using the &lt;strong&gt;accuracy&lt;/strong&gt; metric, flower images from external sources independent of the &lt;strong&gt;oxford_flowers102&lt;/strong&gt; dataset are used. These test images come mainly from websites that provide high-quality free images, such as &lt;a href="https://unsplash.com/"&gt;&lt;strong&gt;unsplash.com&lt;/strong&gt;&lt;/a&gt; and &lt;a href="https://pixabay.com"&gt;&lt;strong&gt;pixabay.com&lt;/strong&gt;&lt;/a&gt;, as well as self-taken photos.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def preprocess_input(image_path):
    if (os.path.exists(image_path)):
        originalImage = Image.open(image_path)
        image = originalImage.resize((299, 299))
        image = np.asarray(image) / 255.
        image = tf.expand_dims(image,0)
        input_data = {'instances': np.asarray(image).astype(float)}
        return input_data
    else:
        print(f'{image_path} does not exist!\n')
        return None

def display(image, predicted_label, confidence_score, actual_label):
    fig, ax = plt.subplots(figsize=(8, 6))
    fig.suptitle(f'Predicted: {predicted_label}     Score: {confidence_score}     Actual: {actual_label}', \
                 fontsize='xx-large', fontweight='extra bold')
    ax.imshow(image, aspect='auto')
    ax.axis('off')
    plt.show()

def predict_flower_type(image_path, actual_label):
    input_data = preprocess_input(image_path)

    if (input_data):
        result = tf_local_predictor.predict(input_data)
        CLASSES = info.features['label'].names
        predicted_class_idx = np.argmax(result['predictions'][0], axis=-1)
        predicted_class_label = CLASSES[predicted_class_idx]
        predicted_score = round(result['predictions'][0][predicted_class_idx], 4)
        original_image = Image.open(image_path)
        display(original_image, predicted_class_label, predicted_score, actual_label)
    else:
        print(f'Unable to predict {image_path}!\n')
        return None
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
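The response parsing inside `predict_flower_type` (an argmax over the first prediction vector, then a label lookup) can be exercised on a mocked ModelServer response without a running endpoint; the class names below are hypothetical stand-ins for `info.features['label'].names`:

```python
# Mocked prediction response in the shape TensorFlow Serving returns:
# one score vector per input instance under the "predictions" key.
mock_result = {"predictions": [[0.01, 0.03, 0.9163, 0.0437]]}
mock_classes = ["daffodil", "rose", "sunflower", "tulip"]  # hypothetical labels

scores = mock_result["predictions"][0]
predicted_class_idx = max(range(len(scores)), key=scores.__getitem__)  # argmax
predicted_class_label = mock_classes[predicted_class_idx]
predicted_score = round(scores[predicted_class_idx], 4)

print(predicted_class_label, predicted_score)  # sunflower 0.9163
```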



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OrYgNH1l--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AK7XY4BPsbJCYGN46XpwsQw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OrYgNH1l--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AK7XY4BPsbJCYGN46XpwsQw.png" alt="Lotus Flower"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JEk_vKDQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AcoY6aq4ENGZXq48o2rxxRg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JEk_vKDQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AcoY6aq4ENGZXq48o2rxxRg.png" alt="Hibiscus"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rwxTt4xv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ASK8hozPQhpoGenkQlFNQtg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rwxTt4xv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2ASK8hozPQhpoGenkQlFNQtg.png" alt="Sunflower"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GecbYds6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Ap2-h_SGAdzh_Xsgjik2W9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GecbYds6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Ap2-h_SGAdzh_Xsgjik2W9w.png" alt="Waterlily"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Vqk4Vb4X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Ak-QnzcsjwMAk4J1uWVxvvg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Vqk4Vb4X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Ak-QnzcsjwMAk4J1uWVxvvg.png" alt="Carnation"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XAfFXuK5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Axr_vNWqQ7zsTpnOdlerTEA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XAfFXuK5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2Axr_vNWqQ7zsTpnOdlerTEA.png" alt="Marigold"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--A7tnwCQZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AMeZNInB-awNUgru4KIcqlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A7tnwCQZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/875/1%2AMeZNInB-awNUgru4KIcqlw.png" alt="MoonOrchid"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Wrap-up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The final flower classification model is evaluated against a set of real-world flower images of different types from external sources to test how well it generalizes to unseen data. The model classifies all of the unseen flower images correctly. At 80 MB, it is also reasonably compact and efficient for edge or web deployment in production.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Proposed Enhancements&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Due to time and resource constraints, the solution here may not provide the best practices or optimal design and implementation.&lt;br&gt;
Here are some ideas that could be useful for anyone interested in contributing to improve the current solution.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Apply &lt;a href="https://www.tensorflow.org/tutorials/images/data_augmentation"&gt;&lt;strong&gt;Data Augmentation&lt;/strong&gt;&lt;/a&gt;, i.e. random (but realistic) transformations such as rotation, flip, crop, brightness and contrast etc. on the training dataset to increase its size and diversity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use &lt;a href="https://keras.io/guides/preprocessing_layers/"&gt;&lt;strong&gt;Keras preprocessing layers&lt;/strong&gt;&lt;/a&gt;. &lt;a href="https://keras.io/"&gt;&lt;strong&gt;Keras&lt;/strong&gt;&lt;/a&gt; provides preprocessing layers such as &lt;strong&gt;Image preprocessing layers&lt;/strong&gt; and &lt;strong&gt;Image Data Augmentation preprocessing layers&lt;/strong&gt; which can be combined and exported as part of a Keras SavedModel. As a result, the model can accept raw images as input. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Convert the TensorFlow model (SavedModel format) to a &lt;a href="https://www.tensorflow.org/lite/"&gt;&lt;strong&gt;TensorFlow Lite&lt;/strong&gt;&lt;/a&gt; model (.tflite) for edge deployment and optimization on mobile and IoT devices.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Optimize the TensorFlow Serving signature (&lt;strong&gt;SignatureDefs&lt;/strong&gt; in SavedModel) to minimize the prediction output data structure and payload size. The current model prediction output returns the predicted class and score for all 102 flower types.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use &lt;a href="https://www.tensorflow.org/guide/profiler"&gt;&lt;strong&gt;TensorFlow Profiler&lt;/strong&gt;&lt;/a&gt; tools to track, analyze and optimize the performance of TensorFlow model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use &lt;a href="https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html"&gt;&lt;strong&gt;Intel Distribution of OpenVINO toolkit&lt;/strong&gt;&lt;/a&gt; for the model's optimization and high-performance inference on Intel hardware such as CPU, iGPU, VPU or FPGA. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Optimize the Docker image size.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add unit test for the TensorFlow training script.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add unit test for the Dockerfile.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
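To make the data augmentation idea above concrete, here is a minimal, library-free sketch of two of the transformations mentioned (horizontal flip and brightness shift) on an image represented as nested lists of pixel values; a real pipeline would use tf.image ops or Keras preprocessing layers instead:

```python
import random

def random_flip_and_brightness(image, max_delta=0.2, seed=None):
    """Toy augmentation: randomly flip an image horizontally and shift its
    brightness. `image` is a nested list of pixel values in [0, 1]."""
    rng = random.Random(seed)
    if rng.random() < 0.5:                      # horizontal flip
        image = [row[::-1] for row in image]
    delta = rng.uniform(-max_delta, max_delta)  # brightness shift
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]

# Grayscale 2x3 toy "image"; each call yields a slightly different variant,
# which is how augmentation grows the effective size of a training set.
img = [[0.1, 0.5, 0.9],
       [0.2, 0.6, 1.0]]
augmented = random_flip_and_brightness(img, seed=42)
```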
&lt;h2&gt;
  
  
  &lt;strong&gt;Next Steps&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;After the machine learning workflow has been verified to work as expected in the local environment, the next step is to migrate it fully to &lt;a href="https://aws.amazon.com/"&gt;AWS Cloud&lt;/a&gt; with &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html"&gt;&lt;strong&gt;Amazon SageMaker Notebook Instance&lt;/strong&gt;&lt;/a&gt;. In the next guide, I will demonstrate how to adapt this Jupyter notebook to run on a SageMaker Notebook Instance, as well as how to push the custom Docker image to &lt;a href="https://aws.amazon.com/ecr/"&gt;&lt;strong&gt;Amazon Elastic Container Registry (ECR)&lt;/strong&gt;&lt;/a&gt; so that the whole workflow is fully hosted and managed in AWS.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Clean-up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;It is always a best practice to clean up obsolete resources or sessions at the end to reclaim compute, memory and storage resources, and to save cost when working in a cloud or distributed environment. For this scenario, the local SageMaker inference endpoint and the SageMaker containers are deleted as shown below.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;tf_local_predictor.delete_endpoint()&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--atfSl4nm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/704/1%2A30DTqcXLfEZsnQqKt7QM-Q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--atfSl4nm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/704/1%2A30DTqcXLfEZsnQqKt7QM-Q.png" alt="Delete SageMaker endpoint"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker container ls -a&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oHOIt27E--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AYznAb1ZJadpP7BqaB7pTdg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oHOIt27E--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AYznAb1ZJadpP7BqaB7pTdg.png" alt="List Containers"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker rm $(docker ps -a -q)&lt;br&gt;
docker container ls -a&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qj_Q3ftC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AeRmIe9xQCKroPU9Y8nGTxw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qj_Q3ftC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1250/1%2AeRmIe9xQCKroPU9Y8nGTxw.png" alt="All Containers Deleted"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazonsagemaker</category>
      <category>tensorflow</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
