<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: nicolaspinocea</title>
    <description>The latest articles on Forem by nicolaspinocea (@nicolaspinocea).</description>
    <link>https://forem.com/nicolaspinocea</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F698698%2F904eaa23-7ebe-44f5-820b-fbffdbcb510b.jpg</url>
      <title>Forem: nicolaspinocea</title>
      <link>https://forem.com/nicolaspinocea</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nicolaspinocea"/>
    <language>en</language>
    <item>
      <title>ML-Ops: An automated routine of training and deployment of a model using the AWS MLOps Orchestrator and Step Functions</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Thu, 08 Dec 2022 04:24:12 +0000</pubDate>
      <link>https://forem.com/aws-builders/ml-ops-an-automated-routine-of-training-and-deployment-of-a-model-using-aws-ml-ops-orchestrator-and-step-functions-4ppm</link>
      <guid>https://forem.com/aws-builders/ml-ops-an-automated-routine-of-training-and-deployment-of-a-model-using-aws-ml-ops-orchestrator-and-step-functions-4ppm</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;One of the greatest challenges for an organization that develops AI solutions, and one of vital importance for adding value, is distributing those solutions among its different stakeholders and turning them into a support tool for decision-making. Achieving this collaboration and visibility for the data science team requires evolving toward operational solutions: scaling a model beyond a Jupyter notebook so it can be consumed by different teams within the company, while optimizing the different stages of the machine learning lifecycle. In this post we show how to create and deploy a machine learning model automatically. Specifically, we build a Step Functions state machine that orchestrates a hyperparameter tuning job in SageMaker, queries the resulting metrics to obtain the name and path of the best-performing model, registers that artifact in SageMaker, and deploys an endpoint ready to be queried from external applications. Because this last step is executed through the MLOps Orchestrator, a Lambda attached to API Gateway is provisioned with a REST API that can be called from outside AWS using the corresponding IAM credentials.&lt;/p&gt;

&lt;h1&gt;
  
  
  Resource creation
&lt;/h1&gt;

&lt;p&gt;Since the resources attached to the endpoint (the Lambda, the API Gateway, and its respective API) will be provisioned through the orchestrator, we first need to upload its CloudFormation template to the development account (the CloudFormation link is provided in the reference section).&lt;/p&gt;

&lt;h2&gt;
  
  
  Lambda
&lt;/h2&gt;

&lt;p&gt;The main objective is to automate the deployment of this pipeline, so the first Lambda is in charge of generating a random name for the SageMaker hyperparameter tuning job.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
import uuid

def lambda_handler(event, context):
  uuid_tmp = str(uuid.uuid4())
  random_uuid = uuid_tmp[:6]
  nick=f'training-hyperparameter-{random_uuid}'

  return nick
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Extract the path of the best model
&lt;/h3&gt;

&lt;p&gt;We then create a new Lambda function in charge of extracting the path of the best-performing artifact from the hyperparameter job.&lt;/p&gt;

&lt;p&gt;The metric that defines the 'best' model is indicated when the tuning job is created, so here we only handle the name of the training job that obtained the best performance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def lambda_handler(event, context):

  bestTrainingJobName=event['BestTrainingJob']['TrainingJobName']

  return '"models/'+bestTrainingJobName+'/output/model.tar.gz"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following Lambda receives the previous result, i.e. the path of the best training model, and passes it to the orchestrator, which executes a commit that deploys the endpoint with its respective API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
import uuid
import os
import boto3

def lambda_handler(event, context):
  client = boto3.client('codecommit')
  uuid_tmp = str(uuid.uuid4())
  random_uuid = uuid_tmp[:6]


  json_tmp = f"""
  "pipeline_type": "byom_realtime_builtin",
  "model_framework": "xgboost",
  "model_framework_version": "1",
  "model_name": "best-model-{random_uuid}",
  "model_artifact_location": "{event}",       
  "data_capture_location": "stepfunctionssample-sagemak-bucketformodelanddata-1vkv7vhuej3kt/capture",
  "inference_instance": "ml.m5.large",
  "endpoint_name": "endpoint-best-{random_uuid}"
  """

  payload = '{' + json_tmp + '}'  # renamed to avoid shadowing the imported json module
  response = client.get_branch(
  repositoryName=str(os.environ.get('REPOSITORY_NAME')),
  branchName='main',
  )
  last_commit_id=response['branch']['commitId']

  response = client.create_commit(
      repositoryName=str(os.environ.get('REPOSITORY_NAME')),
      branchName='main',
      ... 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Orchestration using Step Functions
&lt;/h1&gt;

&lt;p&gt;Once we have developed the fundamental elements of this sequence, we orchestrate each previously designed stage using the Step Functions service. We start by creating a state machine, which lets us diagram the process logic through a graphical interface and concatenate each developed component.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flx3uf5aflutri4jju0db.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flx3uf5aflutri4jju0db.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
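&lt;p&gt;The chained stages above can be sketched in Amazon States Language (ASL). The definition below is a hand-written illustration, not the exported state machine: the Lambda ARNs are placeholders and the tuning-job parameters are omitted for brevity.&lt;/p&gt;

```python
import json

# Minimal ASL sketch of the pipeline: name generation, tuning, extraction
# of the best model path, and the commit that triggers the orchestrator.
# All ARNs are placeholders; the real tuning state also needs Parameters.
definition = {
    "StartAt": "GenerateJobName",
    "States": {
        "GenerateJobName": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:generate-job-name",
            "Next": "HyperparameterTuning",
        },
        "HyperparameterTuning": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createHyperParameterTuningJob.sync",
            "Next": "ExtractBestModelPath",
        },
        "ExtractBestModelPath": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:extract-best-model",
            "Next": "CommitToOrchestrator",
        },
        "CommitToOrchestrator": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:commit-to-orchestrator",
            "End": True,
        },
    },
}

# json.dumps(definition) is the string you would hand to Step Functions
# when creating the state machine from code instead of the console.
print(json.loads(json.dumps(definition))["StartAt"])
```

&lt;p&gt;The same definition can of course be assembled with the graphical editor, which is what we do in this post.&lt;/p&gt;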

&lt;h3&gt;
  
  
  Execute State Machine
&lt;/h3&gt;

&lt;p&gt;Finally, our designed logic is executed, launching a hyperparameter tuning job and selecting the best model, which is deployed behind a REST API using the &lt;strong&gt;&lt;em&gt;AWS MLOps Workload Orchestrator&lt;/em&gt;&lt;/strong&gt; deployment pipeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1v8o74gtjtofphvgv3b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1v8o74gtjtofphvgv3b.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In this publication we presented one way to build a machine learning pipeline covering training and deployment in the AWS cloud, combining two alternatives for process orchestration. Given the flexibility of Lambda functions, it is possible to keep improving this logic, for example by launching the pipeline when files arrive in S3, re-training a solution, or adding monitoring jobs to analyze the behavior of a solution in production.&lt;/p&gt;

&lt;h1&gt;
  
  
  Acknowledgement
&lt;/h1&gt;

&lt;p&gt;I would like to highlight the collaboration of my friend and teammate &lt;a class="mentioned-user" href="https://dev.to/matiasgonzalezes"&gt;@matiasgonzalezes&lt;/a&gt;, who was fundamental in the different developments shown in this document.&lt;/p&gt;

&lt;h1&gt;
  
  
  Reference
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/solutions/latest/mlops-workload-orchestrator/aws-cloudformation-template.html" rel="noopener noreferrer"&gt;Cloud Formation &lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/solutions/implementations/mlops-workload-orchestrator/" rel="noopener noreferrer"&gt;MLOps Workload Orchestrator&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html" rel="noopener noreferrer"&gt;Documentation Sagemaker &lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://towardsdatascience.com/get-started-with-mlops-fd7062cab018" rel="noopener noreferrer"&gt;Cover image&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>machinelearning</category>
      <category>mlops</category>
      <category>aws</category>
      <category>deploy</category>
    </item>
    <item>
      <title>AI Use Case: Developing image classification model with Sagemaker JumpStart</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Sun, 07 Aug 2022 00:12:51 +0000</pubDate>
      <link>https://forem.com/aws-builders/ai-use-case-developing-image-classification-model-with-sagemaker-jumpstart-jcl</link>
      <guid>https://forem.com/aws-builders/ai-use-case-developing-image-classification-model-with-sagemaker-jumpstart-jcl</guid>
      <description>&lt;p&gt;An image classifier is a tool that is developed in the context of supervised learning, where through techniques associated with the field of deep learning, mainly convolutional neural networks (CNN), we seek to extract and learn features, shapes, and textures within an image, in order to achieve a classification model with a level of accuracy according to the conditions of the business context.&lt;/p&gt;

&lt;p&gt;To build an image classification model we must first gather records, i.e. images of the classes in the context at hand, and then start a modeling process based on the widely studied machine learning (ML) lifecycle. That lifecycle rests on four fundamental pillars: a data engineering procedure, application of artificial intelligence algorithms, evaluation of the achieved model's performance, and finally deployment and monitoring of the solution.&lt;/p&gt;

&lt;p&gt;Each of these stages involves long development times, both to reach the best model and to deploy and use it within an organization, if suitable tools or work environments are not considered; the result can be a solution over-fitted at the level of the model code, non-scalable from the deployment point of view, and therefore without business impact. In this post we will build an image classification model, relying on the pre-built solutions offered by &lt;strong&gt;Sagemaker Studio&lt;/strong&gt; through the &lt;strong&gt;JumpStart&lt;/strong&gt; service, and adding a click-based deployment alternative within the &lt;strong&gt;AWS&lt;/strong&gt; cloud. In detail we will develop the following points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JumpStart service review&lt;/li&gt;
&lt;li&gt;Development of a solution using the ResNet 152 algorithm from the family of models offered by PyTorch.&lt;/li&gt;
&lt;li&gt;Solution deployment using a Lambda function and notifications through Simple Notification Service (SNS).
&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  JumpStart in SageMaker Studio
&lt;/h1&gt;

&lt;p&gt;JumpStart can be understood as the evolution of the concept of built-in algorithms within AWS, offering a series of algorithms, not only pre-built but also pre-trained, such as computer vision algorithms. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1wy2lo3k5f1bx1qshtn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1wy2lo3k5f1bx1qshtn.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the same time, JumpStart allows you to train a model with your own data and deploy the solution in the cloud in a few clicks. The following image shows all the JumpStart options:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1vl695niqs8sd6fnehc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1vl695niqs8sd6fnehc.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the example developed below, we will show the simple steps that allow us to build a deep learning solution with the JumpStart service.&lt;/p&gt;

&lt;h1&gt;
  
  
  Practical example
&lt;/h1&gt;

&lt;p&gt;The case to be developed applies to the classification of construction materials, in which there are 6 types of objects, that is, 6 classes that we must identify by means of a machine learning model. For this case we selected the ResNet 152 model from the PyTorch framework, one of the multiple solutions offered by JumpStart. This algorithm has the best performance within the ResNet family, proposed by &lt;em&gt;Kaiming He et al., 2015&lt;/em&gt; in the article &lt;strong&gt;Deep Residual Learning for Image Recognition&lt;/strong&gt;.&lt;br&gt;
One of its particularities is the depth of the implemented layers: 152 layers deep.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusunic1gbheu8u75rema.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusunic1gbheu8u75rema.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Data
&lt;/h2&gt;

&lt;p&gt;The amount of information for each class is sufficient to start iterating directly with JumpStart, since the smallest class has close to 300 observations and the largest about 2,000 records. Using code developed in a Jupyter notebook, we validate the extension of each file, since it must be in &lt;em&gt;.jpg&lt;/em&gt; format, and generate one dataset for training and another for testing, in a 90/10 proportion.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9a8toedxnfk7d376clj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9a8toedxnfk7d376clj3.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
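&lt;p&gt;As a rough sketch of that validation-and-split step (the helper below is illustrative and assumes a flat list of file names; it is not the notebook code from the exercise):&lt;/p&gt;

```python
import os
import random

def split_dataset(filenames, train_frac=0.9, seed=42):
    """Keep only .jpg files, then split them into train/test sets."""
    jpgs = [f for f in filenames if os.path.splitext(f)[1].lower() == ".jpg"]
    skipped = len(filenames) - len(jpgs)   # files rejected by the format check
    shuffled = jpgs[:]
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for the demo
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:], skipped

# Hypothetical file names standing in for one class of images.
files = ["img_%03d.jpg" % i for i in range(300)] + ["notes.txt"]
train, test, skipped = split_dataset(files)
print(len(train), len(test), skipped)  # 270 30 1
```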
&lt;h2&gt;
  
  
  Train and evaluation
&lt;/h2&gt;

&lt;p&gt;Once the data is configured in S3, we deploy the SageMaker Studio service, which facilitates the use of JumpStart. With the Studio interface enabled, we go to the section containing the models, search for the ResNet 152 algorithm, select it, and complete the information requested to train with the data from our exercise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferg32dsqykjekbazo9cm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferg32dsqykjekbazo9cm.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, in 3 clicks we are already training a solution, which takes approximately 20 minutes to finish; the training metrics can be tracked in CloudWatch.&lt;/p&gt;

&lt;p&gt;The most important part of this process is to generate our &lt;em&gt;.pth&lt;/em&gt; file compressed in a &lt;em&gt;.tar.gz&lt;/em&gt; file, also called the model artifact.&lt;/p&gt;
&lt;h2&gt;
  
  
  Batch Transform
&lt;/h2&gt;

&lt;p&gt;A common alternative is to test by deploying an endpoint and configuring a routine that runs inference over a batch of images. However, a question arose: what would be the cost of inferring over thousands of images that do not necessarily need real-time answers? I therefore decided to explore how to run a batch job with the artifact generated by the JumpStart training.&lt;/p&gt;

&lt;p&gt;For this process, the most important requirement is that the model artifact have a specific structure, storing a Python inference file, also known as an &lt;em&gt;entrypoint.py&lt;/em&gt; when we develop a model locally. The structure of that artifact is shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/tar.gz
|-- tar
|   |-- label_info.json
|   |-- model.pth
|   `-- code
|        -- __init__.py
|        -- inference.py
|        -- version
|        `-- lib
|        `-- constants
|               -- __init__.py
|               -- constants.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
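&lt;p&gt;To illustrate that layout, the sketch below packs placeholder files into a &lt;em&gt;model.tar.gz&lt;/em&gt; with the same structure using Python's tarfile module. The file contents are empty stand-ins and the &lt;em&gt;lib&lt;/em&gt; folder is omitted for brevity; real contents would come from the training job.&lt;/p&gt;

```python
import os
import tarfile
import tempfile

# Relative paths mirroring the artifact layout described above.
members = [
    "label_info.json",
    "model.pth",
    "code/__init__.py",
    "code/inference.py",
    "code/version",
    "code/constants/__init__.py",
    "code/constants/constants.py",
]

workdir = tempfile.mkdtemp()
for rel in members:
    path = os.path.join(workdir, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()  # empty stand-in file for the demo

# Pack everything into model.tar.gz, keeping the relative paths intact.
out_path = os.path.join(workdir, "model.tar.gz")
with tarfile.open(out_path, "w:gz") as tar:
    for rel in members:
        tar.add(os.path.join(workdir, rel), arcname=rel)

with tarfile.open(out_path) as tar:
    names = tar.getnames()
print(sorted(names))
```

&lt;p&gt;The resulting archive is what gets registered as the model artifact for the batch job.&lt;/p&gt;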



&lt;p&gt;Where the file &lt;em&gt;inference.py&lt;/em&gt; contains the following functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
import logging
import os
import re

import numpy as np
import torch
from constants import constants
from PIL import Image
from sagemaker_inference import content_types
from sagemaker_inference import encoder
from six import BytesIO


logging.basicConfig(format="%(asctime)s %(message)s", level=logging.INFO)

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class PyTorchIC:
    """Model class that wraps around the Torch model."""

    def __init__(self, model, model_dir):
        """Image classification class that wraps around the Torch model.
        Stores the model inside class and read the labels from a JSON file in the model directory.
        """
        self.model = model
        with open(os.path.join(model_dir, constants.LABELS_INFO)) as f:
            self.labels = json.loads(f.read())[constants.LABELS]

    def forward(self, tensors):
        """Make a forward pass."""
        input_batch = tensors.unsqueeze(0)
        input_batch = input_batch.to(DEVICE)
        with torch.no_grad():
            output = self.model(input_batch)
        return torch.nn.functional.softmax(output[0], dim=0)

    @classmethod
    def tensorize(cls, input_data):
        """Prepare the image, return the tensor."""
        try:
            from torchvision import transforms
        except ImportError:
            raise
        transform = transforms.Compose(
            [
                transforms.Resize(256),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ]
        )
        return transform(input_data).to(DEVICE)

    @classmethod
    def decode(cls, input_data, content_type):
        """Decode input with content_type"""
        _ = content_type
        return input_data

    @classmethod
    def encode(cls, predictions, accept):
        """Encode results with accept"""
        return encoder.encode(predictions, accept)

    def __call__(self, input_data, content_type=content_types.JSON, accept=content_types.JSON, **kwargs):
        """Decode the image, tensorize it, make a forward pass and return the encoded prediction."""
        input_data = self.decode(input_data=input_data, content_type=content_type)
        tensors = self.tensorize(input_data)
        predictions = self.forward(tensors)
        predictions = predictions.cpu()
        output = {constants.PROBABILITIES: predictions}
        if accept.endswith(constants.VERBOSE_EXTENSION):
            output[constants.LABELS] = self.labels
            predicted_label_idx = np.argmax(predictions)
            output[constants.PREDICTED_LABEL] = self.labels[predicted_label_idx]
        accept = accept.rstrip(constants.VERBOSE_EXTENSION)
        return self.encode(output, accept=accept)


def model_fn(model_dir):
    """Create our inference task as a delegate to the model.
    This runs only once per one worker.
    """
    for root, dirs, files in os.walk(model_dir):
        for file in files:
            if re.compile(".*\\.pth").match(file):
                checkpoint = re.compile(".*\\.pth").match(file).group()
    try:
        model = torch.load(checkpoint)
        if torch.cuda.is_available():
            model.to("cuda")
        model.eval()
        return PyTorchIC(model=model, model_dir=model_dir)
    except Exception:
        logging.exception("Failed to load model")
        raise


def transform_fn(task: PyTorchIC, input_data, content_type, accept):
    """Make predictions against the model and return a serialized response.
    The function signature conforms to the SM contract.
    Args:
        task (obj): model loaded by model_fn, in our case is one of the Task.
        input_data (obj): the request data.
        content_type (str): the request content type.
        accept (str): accept header expected by the client.
    Returns:
        obj: the serialized prediction result or a tuple of the form
            (response_data, content_type)
    """
    # input_data = decoder.decode(input_data, content_type)
    if content_type == "application/x-image":
        input_data = Image.open(BytesIO(input_data)).convert("RGB")
        try:
            output = task(input_data=input_data, content_type=content_type, accept=accept)
            return output
        except Exception:
            logging.exception("Failed to do transform")
            raise
    raise ValueError('{{"error": "unsupported content type {}"}}'.format(content_type or "unknown"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this structure set up, we proceed to run the batch jobs for each class, where we generate a confusion matrix for multiple classes. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxte01ydtlj9l8088as8c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxte01ydtlj9l8088as8c.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
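&lt;p&gt;A multi-class confusion matrix like the one above can be tallied with a few lines of pure Python; the labels and predictions below are made-up examples, not the data from this exercise.&lt;/p&gt;

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count, for each true class (rows), how predictions (columns) were distributed."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Tiny illustrative example with three classes.
labels = ['class-0', 'class-1', 'class-2']
y_true = ['class-0', 'class-0', 'class-1', 'class-2', 'class-2']
y_pred = ['class-0', 'class-1', 'class-1', 'class-2', 'class-2']
print(confusion_matrix(y_true, y_pred, labels))
# [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
```

&lt;p&gt;Off-diagonal counts, like the class-0 image predicted as class-1 above, are the cases that motivate the data augmentation described next.&lt;/p&gt;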

&lt;p&gt;While the metrics show high values, two of the classes, class-1 and class-4, do not perform well on data outside the training sample, so we decided to apply data augmentation to those classes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Augmentation
&lt;/h2&gt;

&lt;p&gt;For this process we used functions predefined by PyTorch: transforms that are applied to the images to obtain new files, incorporating modifications to the base image. The transforms applied are shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from PIL import Image
from torchvision import transforms as tr
from torchvision.transforms import Compo

pipeline1 = Compose(
             [tr.RandomRotation(degrees = 90),
              tr.RandomRotation(degrees = 270)
             ])

augmented_image1 = pipeline1(img = im)
augmented_image1.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An example is shown below; the original image is:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2z5igycpdscv4g8tnyzl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2z5igycpdscv4g8tnyzl.jpg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The new image generated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhht7gv2jzi9vvyap2tl3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhht7gv2jzi9vvyap2tl3.jpg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The process was applied to the training data of the two lowest-performing classes, randomly selecting 20% of the images from each class to run these transformations. After training the algorithm again, the new confusion matrix showed an increase in the evaluation metrics, making this the model to be deployed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjp438xu1zz173avki067.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjp438xu1zz173avki067.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
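&lt;p&gt;The 20% selection can be sketched as follows, using hypothetical file names; each selected image would then be passed through the rotation pipeline shown earlier and saved as a new training record.&lt;/p&gt;

```python
import random

def pick_for_augmentation(filenames, frac=0.2, seed=7):
    """Randomly choose a fraction of a class's images for augmentation."""
    k = int(len(filenames) * frac)
    rng = random.Random(seed)  # seeded so the demo is reproducible
    return rng.sample(filenames, k)

# Hypothetical file names standing in for one of the low-performing classes.
class_1 = ["class1_%03d.jpg" % i for i in range(300)]
selected = pick_for_augmentation(class_1)
print(len(selected))  # 60
```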

&lt;h1&gt;
  
  
  Model use: Inference notification by e-mail
&lt;/h1&gt;

&lt;p&gt;For this exercise, we chose to use an AWS lambda function, which allows us to invoke the model when an image is received in an S3 bucket.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9agsbzsvx6guypmx1ue2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9agsbzsvx6guypmx1ue2.gif" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We also added the feature of sending an e-mail notification every time an inference is generated, that is, every time an image is uploaded to the bucket, the user is notified by e-mail of the model's inference.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36us4b0to6c51ojrlklo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36us4b0to6c51ojrlklo.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
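&lt;p&gt;A minimal sketch of such a Lambda is shown below. The endpoint name, SNS topic ARN, and message format are placeholder assumptions, not the values used in this exercise; boto3 is imported inside the handler so the message helper can be exercised without AWS.&lt;/p&gt;

```python
import json

def build_notification(bucket, key, predicted_label, probability):
    """Compose the notification body sent after each inference (format is an assumption)."""
    return json.dumps({
        "image": "s3://{}/{}".format(bucket, key),
        "predicted_label": predicted_label,
        "probability": round(probability, 4),
    })

def lambda_handler(event, context):
    import boto3  # imported lazily so the helper above is testable offline

    # Locate the image that triggered the S3 event.
    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()

    # Ask the deployed model for a verbose prediction.
    response = boto3.client("sagemaker-runtime").invoke_endpoint(
        EndpointName="ENDPOINT_NAME",       # placeholder
        ContentType="application/x-image",
        Accept="application/json;verbose",
        Body=body,
    )
    result = json.loads(response["Body"].read())

    # Notify the subscribed e-mail address with the inference result.
    boto3.client("sns").publish(
        TopicArn="TOPIC_ARN",               # placeholder
        Subject="New image classified",
        Message=build_notification(bucket, key,
                                   result["predicted_label"],
                                   max(result["probabilities"])),
    )
    return result["predicted_label"]
```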

&lt;h1&gt;
  
  
  Closing words
&lt;/h1&gt;

&lt;p&gt;In this post we developed an end-to-end process in the context of deep learning algorithms, applying the 4 fundamental pillars of artificial intelligence modeling and building a simple application on top of them. The next challenge is to deploy this model through the AWS MLOps orchestrator and test consumption of the solution under a production architecture. The core value of this publication was to demonstrate how the JumpStart service lets us obtain artificial intelligence solutions in a few clicks, training with our own data and deploying the solution in the cloud.&lt;/p&gt;

&lt;h1&gt;
  
  
  Reference
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/sagemaker/jumpstart/" rel="noopener noreferrer"&gt;Documentation JumpStart&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/sagemaker/jumpstart/" rel="noopener noreferrer"&gt;Page AWS of JumpStart &lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/blogs/machine-learning/run-image-classification-with-amazon-sagemaker-jumpstart/" rel="noopener noreferrer"&gt;Blog image classfication model with JumpStart&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://arxiv.org/abs/1512.03385" rel="noopener noreferrer"&gt;Deep Residual Learning for Image Recognition&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sagemaker</category>
      <category>jumpstart</category>
      <category>computervisions</category>
      <category>resnet152</category>
    </item>
    <item>
      <title>Segmentación: Review general sobre técnicas de Machine Learning</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Tue, 19 Jul 2022 21:00:18 +0000</pubDate>
      <link>https://forem.com/nicolaspinocea/segmentacion-review-general-sobre-tecnicas-de-machine-learning-5a77</link>
      <guid>https://forem.com/nicolaspinocea/segmentacion-review-general-sobre-tecnicas-de-machine-learning-5a77</guid>
      <description></description>
      <category>spanish</category>
    </item>
    <item>
      <title>Sagemaker CANVAS: Machine learning solution in a few clicks</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Thu, 12 May 2022 16:09:24 +0000</pubDate>
      <link>https://forem.com/aws-builders/sagemaker-canvas-machine-learning-solution-in-few-clics-ph8</link>
      <guid>https://forem.com/aws-builders/sagemaker-canvas-machine-learning-solution-in-few-clics-ph8</guid>
      <description>&lt;p&gt;One of the main trends due to the advances in the industry in relation to artificial intelligence is to give access to people without the need to have an academic or technical background in the area of computer science, mathematics, or statistics. Moreover, the goal of large companies linked to the design and implementation of solutions based on artificial intelligence is to create platforms, which through simple steps and a friendly interface, manage to deploy AI solutions, machine learning, or even deep learning, bringing such tools to certain departments within an organization, streamlining and achieving greater accuracy when making decisions.&lt;/p&gt;

&lt;p&gt;In this publication we will walk through Amazon Sagemaker CANVAS, a functionality born under the no-code concept. With it you can load data from your local computer or connect to a database already deployed in AWS, analyze and transform your dataset (feature engineering), build the best model your data allows, generate predictions in batch or one at a time, and review the metrics behind each model CANVAS produces. To explore the service in detail, we will use a customer churn dataset to build a binary classification model, without writing a single line of code, highlighting the main advantages of the service.&lt;/p&gt;

&lt;h1&gt;
  
  
  Setting up the service
&lt;/h1&gt;

&lt;p&gt;Amazon Sagemaker CANVAS lives in the Sagemaker console, where we first create a user for the service:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bzcakbllubwucvo8hqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bzcakbllubwucvo8hqn.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When creating a user, we must complete the information requested in the panel, and then wait for the successful creation of the user:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmgz0qh6tdgokdj5stn4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmgz0qh6tdgokdj5stn4.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the user has been created, we launch the CANVAS application:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh4w4wc7grvg2jyhzyj8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh4w4wc7grvg2jyhzyj8.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdm73m6bk09v1v82y8oqk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdm73m6bk09v1v82y8oqk.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Main interface
&lt;/h1&gt;

&lt;p&gt;Once the application is launched, the initial interface is shown below, where the option to create our first model is explicitly enabled.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyfw3oyx38n9x0ssyvxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyfw3oyx38n9x0ssyvxx.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99n51f0ds685a2embulf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99n51f0ds685a2embulf.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;CANVAS addresses the four main technical stages of building a machine learning model: feature engineering, i.e. analyzing, visualizing, cleaning, and transforming the features that will enter the model; configuring the type of model to train (binary or multi-class classification, regression, or time series); reviewing dashboards with the training evaluation, showing overall metrics as well as the impact of each variable; and finally making predictions. In other words, the tool performs the fundamental steps to deploy a machine learning solution in a customized yet automated way, without requiring code or elaborate data preprocessing functions. Below is the overview of each stage that the tool itself shows the first time you run a solution in the service:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp2skz0klmo5xvcl775i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp2skz0klmo5xvcl775i.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1f1fvezu4q1qijtfoy3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1f1fvezu4q1qijtfoy3.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxccsu9uktk10jj56iiz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxccsu9uktk10jj56iiz.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv00zp12kghmpqa2yveq0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv00zp12kghmpqa2yveq0.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F337gqq5hg55t2mhwhruq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F337gqq5hg55t2mhwhruq.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  First step: Load data
&lt;/h1&gt;

&lt;p&gt;With the main concepts of Sagemaker CANVAS introduced, we can begin exploring each stage. The first is loading our data, with two options: connect to data in services such as S3 or another database within the AWS ecosystem, or upload data from your laptop. I chose the latter, which requires additional configuration to enable, explained below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgh2cid0npwsxbjy3hhk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgh2cid0npwsxbjy3hhk.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional: Configuring the S3 bucket for local data import
&lt;/h2&gt;

&lt;p&gt;As shown in the following figure, a policy must be added to the default bucket that S3 provides for the Sagemaker service, granting the special permissions that enable local uploads. (The application guides this process, directing you to the documentation needed to complete it.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxj61iuzd61tv4pgxy8ug.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxj61iuzd61tv4pgxy8ug.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxex11esgj5xzy8gidlax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxex11esgj5xzy8gidlax.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5bdlgzelkmkkzjuvgzd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5bdlgzelkmkkzjuvgzd.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After loading the information, you can see the header of the dataset. It is recommended to use comma-separated value files.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bj5ujablpy5qk5n4135.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bj5ujablpy5qk5n4135.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we select the dataset and start the build stage.&lt;/p&gt;

&lt;h1&gt;
  
  
  Build
&lt;/h1&gt;

&lt;p&gt;In this step we analyze our dataset: a table with basic, but no less important, information about the condition of our variables is displayed immediately. We can see the missing values and unique values of each column (for a categorical variable, the latter is its cardinality), as well as other descriptive indicators of the dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvfei6p49y8i0m5z12km.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvfei6p49y8i0m5z12km.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is also another way to visualize the data, using graphs that show the distribution of each column in the dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F604mpqok3jefqtj2m231.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F604mpqok3jefqtj2m231.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another feature CANVAS offers is selecting the target variable, i.e. what we want our model to answer; for this example, whether or not the customer abandons the service. In addition, we can drop the columns that should not be part of training the classification algorithm, all with clicks and no code. For this example, we drop two columns from the dataset's attributes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs935dr6v1jstfwtwcnrl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs935dr6v1jstfwtwcnrl.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can also review and change the type of model to employ:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo86wlatx45md47sbxuck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo86wlatx45md47sbxuck.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the settings defined, a model preview is displayed in the lower-right part of the screen: a model built quickly, with speed taking precedence over accuracy, that reports the first partial training values together with the impact of each variable in the fitting process, revealing which variables matter in the learning process. For our dataset and the problem being modeled, the results are encouraging, even bordering on overfitting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk90fo0nc6nhf8esjq65e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk90fo0nc6nhf8esjq65e.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F768yf18rf55cx1sxhr7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F768yf18rf55cx1sxhr7w.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The last action at this stage is launching the final training job, and there are two ways to generate this predictor. One option builds a model quickly but may not reach adequate accuracy (in our case the quick build did perform well); the other prioritizes reaching a good accuracy metric at the cost of a longer training time. The choice is up to the user.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92c1va9s0mnupeyjyl2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92c1va9s0mnupeyjyl2c.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Analyze
&lt;/h1&gt;

&lt;p&gt;After indicating the type of training in the previous step, a third window lets us inspect the results at a general level, through the metrics typical of a classification problem, and at the variable level, where we can see how each variable behaves against the target, with different graphs displayed to explain the process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi1bg2lag1n9egbsj5l1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi1bg2lag1n9egbsj5l1.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr7dtqydxy2mu5gjarh8g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr7dtqydxy2mu5gjarh8g.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb101gw1rcsmkv1mvczrk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb101gw1rcsmkv1mvczrk.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For our example, we see very good performance across all the metrics offered for a classification problem: Accuracy, Recall, F1 Score, and even the AUC of the model. Keep in mind that the importance of each metric depends on the phenomenon being modeled and on the business cost of a prediction error.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr2x2fzq498eymc0jj92.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr2x2fzq498eymc0jj92.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Predict
&lt;/h1&gt;

&lt;p&gt;Finally, once a model has been built, we can generate predictions both in batch and one at a time. Batch predictions need a file containing the attributes considered during training, while for single predictions a panel lets you modify the value of each variable and shows the resulting prediction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8014pcqs8ld1u36hkacp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8014pcqs8ld1u36hkacp.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtvf02qbqyuo9d6hazf0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtvf02qbqyuo9d6hazf0.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Discussion and conclusion
&lt;/h1&gt;

&lt;p&gt;With this exercise we reviewed each step of the Amazon Sagemaker CANVAS service, a great tool for developing machine learning solutions under the no-code concept, allowing us to create classification, regression, and time-series models. At each stage, natural questions arise for a model developer: which algorithms are considered for each type of problem (is XGBoost used for classification, for instance)? What portion of the data is used for training and testing? How does it perform on a classification problem with imbalanced labels? Regardless of those open questions, it is a great tool that, above all, brings people closer to creating their own artificial intelligence models, machine learning models in particular.&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://medium.com/analytics-vidhya/time-series-forecasting-c73dec0b7533" rel="noopener noreferrer"&gt;Documentation Sagemaker CANVAS&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.science.org/content/article/no-coding-required-companies-make-it-easier-ever-scientists-use-artificial-intelligence" rel="noopener noreferrer"&gt;Cover image&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sagemaker</category>
      <category>aws</category>
      <category>machinelearning</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Amazon Forecast: Model performance employing AutoML and AutoPredictor (New feature)</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Fri, 22 Apr 2022 05:12:30 +0000</pubDate>
      <link>https://forem.com/aws-builders/amazon-forecast-models-performance-employing-automl-and-autopredictor-new-feature-1l88</link>
      <guid>https://forem.com/aws-builders/amazon-forecast-models-performance-employing-automl-and-autopredictor-new-feature-1l88</guid>
      <description>&lt;p&gt;In machine learning, and deep learning, is known that there is no "best" algorithm, let alone a set of standard hyperparameters, it all depends on the data. For this when theren´t clarity about the algorithms to employ, or put another way, when we want find the technique that reaches the best perfomance for this dataset, is recommendable generate a baseline. &lt;/p&gt;

&lt;p&gt;In this post, I will review the new feature of the AWS forecasting service, Amazon Forecast, explain the two methods that allow a first approach to time-series forecasting using machine learning algorithms, and finally compare these methods through an example.&lt;/p&gt;

&lt;h1&gt;
  
  
  A brief review of Amazon Forecast
&lt;/h1&gt;

&lt;p&gt;Amazon Forecast is the AWS forecasting service. It enables you to create forecasts over millions of records in simple steps, using machine learning techniques that Amazon.com has applied internally for more than 20 years.&lt;br&gt;&lt;br&gt;
The main aim of Amazon Forecast is to make it easy to generate forecasts without being an expert machine learning developer, offering everything from classic models to complex deep learning models such as DeepAR and CNN-QR, and guiding the process through a comfortable interface. The service can currently incorporate weather and holiday data, depending on the country being forecast, and Amazon Forecast has also added a feature for explaining the impact of covariates, in other words, how related time series affect the predictor. The next image represents the core of the service:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feegov62x8owqmfh38in2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feegov62x8owqmfh38in2.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, when we want to run some first tests with Amazon Forecast and it is not clear which algorithm to employ, the service offers two alternatives: &lt;/p&gt;
&lt;h2&gt;
  
  
  AutoML: The search for the best algorithm
&lt;/h2&gt;

&lt;p&gt;A classic strategy to start any machine learning project, in the context of forecasting, is to create a baseline of algorithm results and select the technique that offers the best performance. AutoML responds to this need by training a series of algorithms of different natures and returning, by default, the model with the best metrics. In addition, within the Amazon Forecast predictor, the performance of each trained model can be reviewed.&lt;/p&gt;
&lt;h2&gt;
  
  
  AutoPredictor: The search for the best combination
&lt;/h2&gt;

&lt;p&gt;Some time-series phenomena behave differently across periods, such as seasons of the year, holidays, or unexpected events, producing observations that are hard to predict and that a single algorithm can rarely handle across the whole series. A suggested, but hard to implement, methodology is ensembling models over a time series, that is, searching for the combination of mathematical models that best adapts to the target scenario. Amazon Forecast has developed functionality that generates this ensemble of models without any programming or advanced development: AutoPredictor, currently the default predictor of the service. Next, we will validate the expected advantage of this type of solution with an example comparing it against the AutoML functionality.&lt;/p&gt;
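&lt;p&gt;AutoPredictor can also be created programmatically through the CreateAutoPredictor API of the Forecast client in boto3. The request below is a sketch mirroring the settings used in this example; the predictor name is hypothetical and the dataset group ARN is elided:&lt;/p&gt;

```python
# Request body for create_auto_predictor, the API behind the console default.
# Only the request is built here; the call itself is sketched at the end.
auto_predictor_request = {
    "PredictorName": "demo_auto_predictor",  # hypothetical name
    "ForecastHorizon": 14,
    "ForecastTypes": ["0.1", "0.5", "0.9"],  # quantiles used in this example
    "ForecastFrequency": "D",                # daily data
    "DataConfig": {
        "DatasetGroupArn": "arn:aws:forecast:...",  # elided ARN
        # Built-in holiday calendar for Chile, as in the comparison below:
        "AdditionalDatasets": [
            {"Name": "holiday", "Configuration": {"CountryCode": ["CL"]}}
        ],
    },
}

print(auto_predictor_request["PredictorName"])

# boto3.client("forecast").create_auto_predictor(**auto_predictor_request)
```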
&lt;h1&gt;
  
  
  Performance comparison
&lt;/h1&gt;

&lt;p&gt;The dataset used to compare the new Amazon Forecast feature against AutoML comes from the retail context, where the main objective is an accurate target prediction to improve planning in a customer's internal process. The dataset contains 340 observations at a daily frequency; training uses a forecast horizon of 14 days and predicts the 0.1, 0.5, and 0.9 quantiles. The data schema requested by Amazon Forecast is shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lpdzat7co4arjm832gf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lpdzat7co4arjm832gf.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To create the predictor with AutoML, we work from Sagemaker through the Forecast client of boto3, which lets us deploy the process end to end. The code used in Sagemaker to generate the predictor in Amazon Forecast is shown next:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;create_predictor_response = forecast.create_predictor(PredictorName='smu_iteracion_1_predictor0',
                              ForecastHorizon=FORECAST_LENGTH,
                              PerformAutoML=True,
                              PerformHPO=False,
                              EvaluationParameters= {"NumberOfBacktestWindows": 4, 
                                                                         "BackTestWindowOffset": 21},
                              InputDataConfig= {"DatasetGroupArn": dataset_group_arn,      
                                                "SupplementaryFeatures": [          
                                                                        {             
                                                                            "Name": "holiday",            
                                                                            "Value": "CL"

                                                                                    },      
                                                                            ]   
                                              }, 

                              FeaturizationConfig= {"ForecastFrequency": DATASET_FREQUENCY,
                                                   "Featurizations": 
                                                                        [
                                                                          {"AttributeName": "target_value", 
                                                                           "FeaturizationPipeline": 
                                                                            [
                                                                              {"FeaturizationMethodName": "filling", 
                                                                               "FeaturizationMethodParameters": 
                                                                                {"aggregation": "sum",
                                                                                 "middlefill": "zero",
                                                                                 "backfill": "zero"}
                                                                              }
                                                                            ]
                                                                          },
                                                                        ]
                                                   } )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the other hand, the Amazon Forecast console works with the AutoPredictor model by default, so we generated that predictor from the console, incorporating the holiday feature in both cases (the country selected was Chile). In the next image we can see both predictors; the global metrics already indicate that the AutoPredictor performs better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F493utla95dbg46cafxfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F493utla95dbg46cafxfn.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Meanwhile, the metrics that describe the model's behavior at specific points of the estimated probability distribution show that the accuracy at the analyzed quantiles (0.1, 0.5, and 0.9) is better for the AutoPredictor than for AutoML. Remember that these metrics deserve closer analysis when over-forecasting or under-forecasting carries a high cost for the business.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1snhnk5oo59ny0c8xx3q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1snhnk5oo59ny0c8xx3q.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the evaluation on data outside the training sample, a 14-day window is considered. The next figures show the behavior of both models: the AutoML curve is more erratic than the AutoPredictor one, and for March 25 the AutoML model does not even capture the real value within its quantile range. Remember that Amazon Forecast produces probabilistic predictions; for example, the value predicted at P10 means there is a 10% probability that the actual value of the time series is less than or equal to that prediction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgfiy47qtuhcsgpraxxuq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgfiy47qtuhcsgpraxxuq.png" alt="Image description"&gt;&lt;/a&gt; &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftz92psjm8ocydxvxy8et.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftz92psjm8ocydxvxy8et.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Discussion and conclusion
&lt;/h1&gt;

&lt;p&gt;This post reviewed Amazon Forecast as a tool for creating a multi-algorithm baseline, the advantage of using an ensemble of models for a time series, and how Amazon Forecast makes this capability simple for users. The next step is to evaluate related time series and the new explainability feature, and to investigate how to obtain better information about the ensemble, in order to understand which models were most relevant in explaining the target time series.&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/forecast/" rel="noopener noreferrer"&gt;Amazon Forecast Service&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/blogs/machine-learning/new-amazon-forecast-api-that-creates-up-to-40-more-accurate-forecasts-and-provides-explainability/" rel="noopener noreferrer"&gt;New Amazon Forecast API&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://medium.com/analytics-vidhya/prediction-intervals-in-forecasting-quantile-loss-function-18f72501586f" rel="noopener noreferrer"&gt;Quantile Loss Function - Medium&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://aws.amazon.com/es/blogs/machine-learning/understand-drivers-that-influence-your-forecasts-with-explainability-impact-scores-in-amazon-forecast/" rel="noopener noreferrer"&gt;New function of explainability&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/forecast.html" rel="noopener noreferrer"&gt;ForecastService - boto3&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://medium.com/analytics-vidhya/time-series-forecasting-c73dec0b7533" rel="noopener noreferrer"&gt;Source cover image&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>forecast</category>
      <category>datascience</category>
      <category>sagemaker</category>
    </item>
    <item>
      <title>Deploying the Xgboost model to AWS from locally developed artifacts, adding inference pipeline</title>
      <dc:creator>nicolaspinocea</dc:creator>
      <pubDate>Wed, 23 Mar 2022 00:56:28 +0000</pubDate>
      <link>https://forem.com/aws-builders/deploying-the-xgboost-model-to-aws-from-locally-developed-artifacts-adding-inference-pipeline-fa5</link>
      <guid>https://forem.com/aws-builders/deploying-the-xgboost-model-to-aws-from-locally-developed-artifacts-adding-inference-pipeline-fa5</guid>
      <description>&lt;p&gt;There are companies or clients that need to deploy models in the cloud, without training their models in the AWS environment, since a new training can modify performance, change metrics and ultimately not respond to fundamental needs.&lt;/p&gt;

&lt;p&gt;This blog will show how to deploy an Xgboost model binary built locally by a developer, adding a post-processing layer through a SageMaker inference pipeline and deploying an endpoint.&lt;/p&gt;

&lt;h2&gt;
  
  
  Xgboost algorithm
&lt;/h2&gt;

&lt;p&gt;Tree-based ensemble methods frequently achieve good performance and also offer an interpretation of the variables they employ, making them popular within the machine learning community. Extreme gradient boosting (Xgboost) is a variant of tree-based ensembles that is widely useful for handling sparse data, uses a minimal amount of resources, and is highly scalable. Xgboost is a supervised learning model whose learning process is sequential: each new learner captures the error of the previous one, which makes it an adaptive algorithm, and it employs gradient descent for learning. The next figure shows how gradient boosting works:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktj1i8czp0ug2x87wogn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktj1i8czp0ug2x87wogn.png" alt="Work Xgboost"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Add the booster from .pkl to tar.gz
&lt;/h2&gt;

&lt;p&gt;The key process, and the central theme of this publication, is captured in this section. The fundamental artifact produced by training a tree-based model is the booster. When we train and save a model with the Xgboost library, a series of attributes from the modeling stage is saved along with it, and for inference purposes these contribute nothing. By rescuing only the booster from the saved model, we can communicate with the pre-built AWS Xgboost container, then deploy and use the solution.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import xgboost
import joblib
import tarfile
model_pkl=joblib.load('model_client.pkl')
booster=model_pkl.get_booster()
booster.save_model('xgboost-model')
# add xgboost-model to tar.gz file
fp = tarfile.open("model.tar.gz","w:gz")
fp.add('xgboost-model')
fp.close()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
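
&lt;p&gt;The layout inside model.tar.gz matters: SageMaker extracts the archive into /opt/ml/model, so the booster file should sit at the root of the archive. The following self-contained sketch (using a placeholder file instead of a real booster) verifies that layout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import tarfile

# Placeholder file standing in for the saved booster
with open('xgboost-model', 'wb') as f:
    f.write(b'placeholder')

with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('xgboost-model')

# The booster must appear at the archive root
with tarfile.open('model.tar.gz', 'r:gz') as tar:
    names = tar.getnames()
print(names)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;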



&lt;h2&gt;
  
  
  Create the model in SageMaker
&lt;/h2&gt;

&lt;p&gt;The first step is to indicate the URL of the algorithm's container. In this case, we employ the container provided by AWS.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;region = Session().boto_region_name
xgboost_container = sagemaker.image_uris.retrieve("xgboost", region, "1.0-1")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next step is to create a model with the SageMaker Python SDK, providing the S3 location of the artifacts and the algorithm container.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sagemaker.model import Model
xgboost_model=Model(xgboost_container,
                  model_data='s3://file_path_in_s3/model.tar.gz',
                  role=sagemaker.get_execution_role())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting up the inference pipeline
&lt;/h2&gt;

&lt;p&gt;The next step is to set up the processing of the model's output. For this, we create a post-processing model through SKLearnModel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Post-processing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sagemaker.sklearn.model import SKLearnModel

FRAMEWORK_VERSION = '0.23-1'
entry_point = 'postprocessing.py'

postprocessing_model = SKLearnModel(
    model_data='s3://file_path_in_s3/model.tar.gz',
    role=role,
    entry_point=entry_point,
    framework_version=FRAMEWORK_VERSION,
    sagemaker_session=sagemaker_session
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entry point is a Python file containing functions that manage the model's output (strings), associating a context with it. This takes into account whether the problem is binary or multi-class, as well as the context of the project. The following is an extract of this code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
def output_fn(prediction, accept):

    accept, params = cgi.parse_header(accept.lower())

    if accept == "application/json":
        results = []
        classes = prediction['classes']
        score=prediction['scores']
        score.insert(0,1-score[0])
        score=[score]
        for scores in score: 
            row = []
            for class_, score in zip(classes, scores):
                row.append({
                    'id': class_,
                    'score': score
                })

            results.append(row)

        json_output = {"context": results[0]}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
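
&lt;p&gt;The score handling in the extract turns the single positive-class probability returned by a binary Xgboost model into a two-element distribution by prepending the complement. In isolation, with an illustrative probability:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Single probability for the positive class, as the binary model returns it
scores = [0.75837]
# Prepend the complementary probability for class 0
scores.insert(0, 1 - scores[0])
print(scores)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;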



&lt;h3&gt;
  
  
  Pipeline model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sagemaker.pipeline import PipelineModel

model_name='name-model'
inference_model = PipelineModel(
    name=model_name, 
    role=sagemaker.get_execution_role(), 
    models=[ 
        xgboost_model,
        postprocessing_model,
    ])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deploying and testing the endpoint
&lt;/h2&gt;

&lt;p&gt;Finally, we deploy the models behind a single endpoint, where they run sequentially, producing the output according to the configuration designed by the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;inference_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',  
    endpoint_name=endpoint_name
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
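
&lt;p&gt;Once the endpoint is in service, it can be invoked from any AWS SDK. The following is a minimal sketch using boto3's sagemaker-runtime client; the region default and the CSV payload format are assumptions for this example, and credentials are resolved by boto3 as usual.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

def query_endpoint(endpoint_name, csv_row, region='us-east-1'):
    # Placeholder region; pass your own endpoint name and feature row
    runtime = boto3.client('sagemaker-runtime', region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='text/csv',
        Accept='application/json',
        Body=csv_row,
    )
    return response['Body'].read()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;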



&lt;p&gt;In the following code, you can see the response when invoking the endpoint that contains a post-processing container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;b'{"context": [{"id": "class-0", "score": 0.24162}, {"id": "class-1", "score": 0.75837}]}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
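
&lt;p&gt;A client consuming this response only needs the standard library to decode it. A minimal sketch, using the exact bytes shown above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json

# Raw bytes as returned by the endpoint in the example above
raw = b'{"context": [{"id": "class-0", "score": 0.24162}, {"id": "class-1", "score": 0.75837}]}'
payload = json.loads(raw)
# Pick the class with the highest score
best = max(payload['context'], key=lambda c: c['score'])
print(best['id'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;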



&lt;h2&gt;
  
  
  Conclusion and discussion
&lt;/h2&gt;

&lt;p&gt;Using the steps listed above, you can deploy a locally built model to the AWS Cloud while preserving its consistency and performance. A natural extension of this path is to work on the algorithm's preprocessing and add a preprocessing layer to the inference pipeline, configuring that stage according to your needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1603.02754" rel="noopener noreferrer"&gt;XGBoost: A Scalable Tree Boosting System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html" rel="noopener noreferrer"&gt;XGBoost Algorithm - AWS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sagemaker.readthedocs.io/en/stable/workflows/pipelines/sagemaker.workflow.pipelines.html" rel="noopener noreferrer"&gt;SDK Sagemaker python - PipelineModel&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>deploy</category>
    </item>
  </channel>
</rss>
