<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tomas</title>
    <description>The latest articles on Forem by Tomas (@tkeyo).</description>
    <link>https://forem.com/tkeyo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F641054%2Fe4a6a810-1508-4049-9cef-0098f8e4c91b.PNG</url>
      <title>Forem: Tomas</title>
      <link>https://forem.com/tkeyo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/tkeyo"/>
    <language>en</language>
    <item>
      <title>Debugging AWS Lambda + Serverless Framework Locally</title>
      <dc:creator>Tomas</dc:creator>
      <pubDate>Thu, 10 Mar 2022 22:31:08 +0000</pubDate>
      <link>https://forem.com/tkeyo/debugging-aws-lambda-locally-5bk</link>
      <guid>https://forem.com/tkeyo/debugging-aws-lambda-locally-5bk</guid>
      <description>&lt;h3&gt;
  
  
  Working with Lambdas
&lt;/h3&gt;

&lt;p&gt;I've been working with AWS Lambdas + &lt;a href="https://www.serverless.com"&gt;&lt;strong&gt;Serverless Framework&lt;/strong&gt;&lt;/a&gt; on my projects lately. When I started to work with AWS Lambda I was a bit lost - I was not sure about the best way to develop🛠, debug🐛 and test🧪 AWS Lambdas locally. &lt;strong&gt;One thing I knew for sure - AWS web IDE is not the way to go.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This approach uses Serverless Framework-specific CLI commands, but the idea can be generalized to other frameworks. I should also mention that this post will not walk you through the project setup - you can check the sample repository &lt;a href="https://github.com/tkeyo/aws_lambda_debug_sample"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Project &amp;amp; Workflow
&lt;/h3&gt;

&lt;p&gt;While working with Lambdas I converged on my current workflow - the topic of this post.&lt;/p&gt;

&lt;h4&gt;
  
  
  Project Setup
&lt;/h4&gt;

&lt;p&gt;Consider the following setup for a Lambda project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;serverless.yaml&lt;/code&gt; contains the definition of the function(s) - handler path, deployment settings, etc. It's specific to Serverless Framework. More &lt;a href="https://www.serverless.com/framework/docs/providers/aws/guide/serverless.yml/"&gt;here&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lambda_1&lt;/code&gt; directory contains an AWS Lambda function. It's possible to have multiple Lambdas per project. You could have something like &lt;code&gt;lambda_2&lt;/code&gt; in your project. You just need to add additional definitions in your &lt;code&gt;serverless.yaml&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt;I prefer to use &lt;code&gt;src&lt;/code&gt; directories that hold the source code for a function. &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;handler.py&lt;/code&gt; contains the entry point ("lambda handler") for invocation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;event.json&lt;/code&gt; is a sample event for local invocation. More on that soon.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# directory structure

aws_lambda_project
|-lambda_1
|  |-src
|    |-__init__.py
|    |-util.py
|    |-db.py
|  |-event.json
|  |-handler.py
|-serverless.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
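&lt;p&gt;For completeness, &lt;code&gt;event.json&lt;/code&gt; can hold any payload the real trigger would send. A minimal illustrative one, matching the handler shown later (the field names are made up for the sample):&lt;/p&gt;

```json
{
  "Records": [
    { "name": "John", "age": "30" },
    { "name": "Jane", "age": "25" }
  ]
}
```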



&lt;h4&gt;
  
  
  Local Debugging with Serverless Framework
&lt;/h4&gt;

&lt;p&gt;Now, if you want to run/debug your Lambda you can use the Serverless Framework CLI command. Something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sls invoke &lt;span class="nb"&gt;local&lt;/span&gt; &lt;span class="nt"&gt;--function&lt;/span&gt; my_funnction 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or if you have environment variables and events you'd do something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sls invoke &lt;span class="nb"&gt;local&lt;/span&gt; &lt;span class="nt"&gt;--function&lt;/span&gt; my_function &lt;span class="nt"&gt;--path&lt;/span&gt; event.json &lt;span class="nt"&gt;--env&lt;/span&gt; &lt;span class="nv"&gt;KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;VALUE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using these commands is a perfectly legitimate way to run your AWS Lambdas locally. However, in my experience this invocation is &lt;strong&gt;slow(ish)&lt;/strong&gt; and gives you &lt;strong&gt;no debug breakpoints&lt;/strong&gt;. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;❗️ Please, let me know if there's a way to execute with Serverless Framework CLI in debug mode with breakpoints. &lt;br&gt;
❗️ Apparently, if you use AWS SAM it &lt;a href="https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-debugging.html"&gt;works OOB&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Making local debugging more effective &amp;amp; efficient
&lt;/h4&gt;

&lt;p&gt;Here's the workaround and workflow I converged on: an additional script, &lt;code&gt;local_handler.py&lt;/code&gt;, which wraps &lt;code&gt;handler.py&lt;/code&gt;. This allows you to set up your variables, "events", environment variables, and everything else you might need. And best of all - you can use &lt;strong&gt;breakpoints in your code.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws_lambda_project
|-lambda_1
|  |-src
|    |-__init__.py
|    |-util.py
|    |-db.py
|  |-event.json
|  |-handler.py
|  |-local_handler.py &amp;lt;- THIS
|-serverless.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's consider the following &lt;code&gt;handler.py&lt;/code&gt; and &lt;code&gt;local_handler.py&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# handler.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lambda_1.src.util&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_db&lt;/span&gt;  

&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
    &lt;span class="s"&gt;"Sample lambda function handler."&lt;/span&gt;
    &lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Records"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"statusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Hello from Lambda!"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Sidenote: This is the handler you can also invoke with &lt;code&gt;sls invoke local&lt;/code&gt;, which simulates an AWS trigger locally. If it works locally, it will likely work on AWS when deployed. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Your local handler should look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# local_handler.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;  

&lt;span class="n"&gt;sample_event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"Records"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"John"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"30"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Jane"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"25"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;  
&lt;span class="p"&gt;}&lt;/span&gt;  

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"__main__"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
    &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now in &lt;code&gt;local_handler.py&lt;/code&gt;, I wrap the &lt;code&gt;handler&lt;/code&gt; function with a &lt;code&gt;main&lt;/code&gt; function which is called when you run &lt;code&gt;local_handler.py&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;This has a couple of &lt;strong&gt;advantages&lt;/strong&gt; IMO:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you can run your code with Python 🐍  - i.e. breakpoints, IDE capabilities&lt;/li&gt;
&lt;li&gt;you can keep your lambda handler intact and deploy it directly to AWS&lt;/li&gt;
&lt;li&gt;faster startup - no need to initialize Serverless Framework &lt;/li&gt;
&lt;/ul&gt;
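&lt;p&gt;One more note on environment variables: if your handler reads them at import time (like the module-level &lt;code&gt;get_db()&lt;/code&gt; call above), set them &lt;em&gt;before&lt;/em&gt; the import. A minimal sketch - the variable name and the stand-in handler are illustrative, not part of the sample repo:&lt;/p&gt;

```python
# local_handler.py variant that also sets environment variables
import os

# set env vars BEFORE importing the handler, since module-level code
# may read them at import time
os.environ["STAGE"] = "local"  # illustrative variable

# in the real project this line would be: from handler import handler
def handler(event, context):
    """Stand-in for handler.py so this sketch runs on its own."""
    return {"statusCode": 200, "stage": os.environ["STAGE"]}

sample_event = {"Records": [{"name": "John", "age": "30"}]}

def main():
    print(handler(sample_event, None))

if __name__ == "__main__":
    main()
```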

&lt;h3&gt;
  
  
  Un-cluttering deployments
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;serverless.yml&lt;/code&gt; it's possible to define files that you do not want to deploy to AWS. An exclamation mark excludes a pattern - those files won't be packaged and pushed, keeping the deployment tidy. See the example below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;span class="na"&gt;functions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;sample_lambda&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
    &lt;span class="na"&gt;package&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
      &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="c1"&gt;# include  &lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;src/**'&lt;/span&gt;  
        &lt;span class="c1"&gt;# exclude &lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;!local_handler.py'&lt;/span&gt;  
        &lt;span class="s"&gt;...&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;!venv/**'&lt;/span&gt;  
    &lt;span class="na"&gt;handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda_1.handler.handler&lt;/span&gt;  
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${file(env/.env.sls.json):stage}&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;🏁  Fin. This approach has proven to be the most flexible in my experience with AWS Lambdas.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Feel free to leave a comment or reach out.  📥 💫.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/tkeyo/aws_lambda_debug_sample"&gt;AWS Sample Repo&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Catch me on:&lt;/strong&gt; &lt;a href="https://github.com/tkeyo"&gt;github&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Catch me on:&lt;/strong&gt; &lt;a href="https://twitter.com/tkeyo_"&gt;Twitter&lt;/a&gt;&lt;/p&gt;




</description>
      <category>aws</category>
      <category>lambda</category>
      <category>cloud</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Data Engineering Pipeline with AWS Step Functions, CodeBuild and Dagster</title>
      <dc:creator>Tomas</dc:creator>
      <pubDate>Thu, 30 Dec 2021 20:13:52 +0000</pubDate>
      <link>https://forem.com/tkeyo/data-engineering-pipeline-with-aws-step-functions-codebuild-and-dagster-5290</link>
      <guid>https://forem.com/tkeyo/data-engineering-pipeline-with-aws-step-functions-codebuild-and-dagster-5290</guid>
      <description>&lt;h2&gt;
  
  
  What are we building?
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;An end-to-end project to collect, process, and visualize housing data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The goal of this project is to collect Slovak real estate market data, process it, and aggregate it. Aggregated data is consumed by a web application to display a price map of 2 Slovak cities - Bratislava and Kosice.&lt;/p&gt;

&lt;p&gt;Data is collected once per month. My intention is to create a snapshot of the housing market in a given month and check on changing price trends, market statistics, ROIs, and others. You could call it a &lt;strong&gt;business intelligence application&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Collect -&amp;gt; Process -&amp;gt; Visualize  //  🏠📄 -&amp;gt; 🛠 -&amp;gt; 💻📈
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Currently, the web application frontend shows the median rent and sell prices by borough. Still a WIP 💻🛠. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesq8kidz080nb8kq621j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesq8kidz080nb8kq621j.png" alt="Front End"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have a backlog of features I want to implement in the upcoming months. Also, feature ideas are welcome 💡. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why am I building this?&lt;/strong&gt;&lt;br&gt;
I am interested in price trends and whatnot. Plus, I wanted to build a project on AWS using new exciting technologies like Dagster. &lt;/p&gt;
&lt;h2&gt;
  
  
  What's in it for you?
&lt;/h2&gt;

&lt;p&gt;It's not a tutorial by any means. More of a walkthrough and reasoning behind the design and gotchas along the way. I will talk about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS Step Functions&lt;/strong&gt; and how I implemented my pipeline using this service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CodeBuild&lt;/strong&gt; and why I think it is the optimal service for my use-case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dagster&lt;/strong&gt; and how it fits in the picture.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Going technical
&lt;/h2&gt;
&lt;h3&gt;
  
  
  The Workflow &amp;amp; Architecture
&lt;/h3&gt;

&lt;p&gt;From a technical perspective the project is &lt;strong&gt;implemented as 3 separate microservices&lt;/strong&gt;. This allows flexibility in deployments, managing Step Functions, and developing the project part-by-part. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmsq6uj925ufflerm9d3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmsq6uj925ufflerm9d3.png" alt="Workflow and Microservices"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is a side project, so I have to keep costs as low as possible while still having a &lt;em&gt;"fully running product"&lt;/em&gt;. I built the project around serverless services, which introduced a couple of constraints to keep the price low - mainly using GCP alongside AWS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fho1tt7n8yxy5e4qkmsi8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fho1tt7n8yxy5e4qkmsi8.png" alt="Project Architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why AWS and GCP?&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Cost savings 💸.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I wanted to build this project solely on AWS... but AWS App Runner (the GCP Cloud Run analog), which would run the web application, does not support scaling down to 0 instances. That means there's a fixed base cost for 1 running instance, which I wanted to avoid. &lt;/p&gt;

&lt;p&gt;GCP Cloud Run supports scaling down to 0 instances, which is ideal. I only pay for resources when the web application is accessed, and I do not have to keep a constantly running instance. &lt;/p&gt;
&lt;h3&gt;
  
  
  Services &amp;amp; Tools
&lt;/h3&gt;

&lt;p&gt;I will write about the lesser-known AWS services and the reasons I selected them for the project - everyone knows about S3. Plus, Dagster, which is an awesome pipeline orchestrator.&lt;/p&gt;
&lt;h3&gt;
  
  
  AWS Step Functions
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;AWS Step Functions is a low-code, visual workflow service that developers use to build distributed applications, automate IT and business processes, and build data and machine learning pipelines using AWS services.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There is a great blog post about AWS Step Function use cases. It goes in-depth on patterns, use-cases, and pros/cons of each. Check &lt;a href="https://blog.bassemdy.com/2020/06/08/aws/architecture/microservices/patterns/aws-step-functions-think-again.html" rel="noopener noreferrer"&gt;this link&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For my use-case it was the ideal orchestration tool, because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pipelines run infrequently&lt;/strong&gt; - with AWS Step Functions + CodeBuild + Dagster I avoided the overhead of deploying to EC2, Fargate, ECS. Everything is executed on demand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low complexity&lt;/strong&gt; - Ideal for AWS Step Functions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cheap (free in my case)&lt;/strong&gt; - Low number of state transitions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native integration&lt;/strong&gt; with CodeBuild, CloudWatch, and other Step Functions. No need to fiddle with Lambda triggers.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  CodeBuild
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I use CodeBuild as it's the &lt;strong&gt;easiest way to get long-running on-demand compute&lt;/strong&gt;. It has native support in Step Functions and comes with 100 free build minutes. It would be possible to use EC2 instances to execute workloads in the same manner, but CodeBuild is quicker to spin up and requires less maintenance - not to mention it's easy to scale and run in parallel. &lt;/p&gt;

&lt;p&gt;The drawback is that build jobs are ephemeral, so data is lost if it isn't saved. This required a bit of engineering: handling errors gracefully in the containers and uploading data artifacts right after they are produced.&lt;/p&gt;
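&lt;p&gt;The "upload right after it's produced" idea can be sketched roughly like this - the bucket name and layout are illustrative, and the uploader is injected so the sketch stays self-contained (in the real pipeline it would be a boto3 S3 call):&lt;/p&gt;

```python
import json
import pathlib

def save_and_upload(records, stage, upload):
    """Persist a stage's output locally, then push it to S3 immediately,
    so the artifact survives even if a later step fails and the
    ephemeral CodeBuild container is torn down."""
    path = pathlib.Path("data") / stage / "output.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records))
    # in CodeBuild this would be an S3 upload, e.g. boto3's upload_file
    upload(str(path), f"my-bucket/{stage}/output.json")
    return path
```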
&lt;h3&gt;
  
  
  Dagster
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Dagster is &lt;strong&gt;a data orchestrator for machine learning, analytics, and ETL&lt;/strong&gt;. It lets you define pipelines in terms of the data flow between reusable, logical components, then test locally and run anywhere."&lt;/em&gt; Great intro &lt;a href="https://hackernoon.com/a-quick-introduction-to-machine-learning-with-dagster-gh53336m" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I tried two other tools before settling on Dagster: &lt;a href="https://www.prefect.io" rel="noopener noreferrer"&gt;Prefect&lt;/a&gt; and &lt;a href="https://kedro.readthedocs.io/en/stable/" rel="noopener noreferrer"&gt;Kedro&lt;/a&gt;. While both are great, they were not ideal for this project: Prefect needs a running Docker instance, and I felt Kedro had too steep a learning curve; it is also intended for ML project management. I will dig deeper into Kedro in future projects, as I liked how it's organized - I even used its Data Engineering convention in this project, which I will talk about later.&lt;/p&gt;

&lt;p&gt;Back to Dagster - I ultimately chose it because it doesn't need a running Docker instance. It's a &lt;code&gt;pip install dagster&lt;/code&gt; away, lightweight, extensible, and can run anywhere: locally, on Airflow, on Kubernetes - you choose.&lt;/p&gt;

&lt;p&gt;Dagster comes in two parts: Dagster (orchestration) and Dagit (web UI). They are installed separately, which proved to be a benefit in my development workflow. &lt;/p&gt;

&lt;p&gt;As already mentioned I use CodeBuild as an accessible compute resource where I run my Dagster pipeline. I don't think Dagster was intended to be used this way (inside a Docker build) but everything worked seamlessly.&lt;/p&gt;


&lt;h2&gt;
  
  
  Making It All Work
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Step Functions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Main Step Function&lt;/strong&gt;&lt;br&gt;
Everything is orchestrated by the &lt;em&gt;Main&lt;/em&gt; state machine, which triggers the &lt;em&gt;Data Collect&lt;/em&gt; and &lt;em&gt;Data Process&lt;/em&gt; state machines containing the CodeBuild blocks where the "real work" is done. &lt;/p&gt;

&lt;p&gt;My main state machine contains two choice blocks. This allows collect and process to run independently by defining an input at execution time.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Main Step Function inputs
{
    "run_data_collect": true or false,
    "run_data_process": true or false
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
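&lt;p&gt;A choice block keyed on those inputs looks roughly like this in Amazon States Language - the state names and targets are illustrative:&lt;/p&gt;

```json
{
  "ShouldCollect": {
    "Type": "Choice",
    "Choices": [
      {
        "Variable": "$.run_data_collect",
        "BooleanEquals": true,
        "Next": "TriggerDataCollect"
      }
    ],
    "Default": "ShouldProcess"
  }
}
```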


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsdvy3sxjhld0k516py2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsdvy3sxjhld0k516py2.png" alt="Step Function - Main"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why triggering a Step Function from a Step Function?&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Easier debugging. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By decoupling collect and process into two child Step Functions, debugging became easier - I was able to run the workflows separately, which made the whole development process friendlier. On top of that, making changes in the underlying Step Functions doesn't affect the overall flow, and I can easily swap the Step Function that is called.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note on triggering Step Functions &amp;amp; CodeBuilds&lt;/strong&gt;&lt;br&gt;
My use-case requires sequential execution of steps. By default, AWS Step Functions triggers another Step Function in an async, &lt;em&gt;"fire and forget"&lt;/em&gt; manner - if the child Step Function trigger succeeds, it proceeds to the next step.&lt;/p&gt;

&lt;p&gt;To wait for the child Step Function execution to finish and return a Success (or Failure) state, you should use &lt;code&gt;startExecution.sync&lt;/code&gt;. This ensures that the parent Step Function waits until the child finishes its work.&lt;/p&gt;
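&lt;p&gt;A synchronous child-execution task looks roughly like this in ASL - the ARN and state names are illustrative:&lt;/p&gt;

```json
{
  "TriggerDataCollect": {
    "Type": "Task",
    "Resource": "arn:aws:states:::states:startExecution.sync",
    "Parameters": {
      "StateMachineArn": "arn:aws:states:eu-west-1:123456789012:stateMachine:DataCollect"
    },
    "Next": "TriggerDataProcess"
  }
}
```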


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;



&lt;p&gt;Similarly for CodeBuild triggers. To wait for the build task to finish use &lt;code&gt;startBuild.sync&lt;/code&gt;.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Note on environment variable overrides in AWS Step Functions&lt;/strong&gt;&lt;br&gt;
The same code is used for all data collection and processing CodeBuild jobs. To make that possible, I pass environment variables extensively to define parameters - I define them in Step Functions and use them as Docker &lt;code&gt;--build-arg&lt;/code&gt;s in CodeBuild.&lt;/p&gt;

&lt;p&gt;To make this work, I had to override the env vars in the Step Function's CodeBuild trigger. This gave me a headache, as the AWS documentation (&lt;a href="https://docs.aws.amazon.com/step-functions/latest/dg/connect-codebuild.html" rel="noopener noreferrer"&gt;Call AWS CodeBuild with Step Functions&lt;/a&gt;) and the API reference (&lt;a href="https://docs.aws.amazon.com/codebuild/latest/APIReference/API_StartBuild.html#API_StartBuild_RequestParameters" rel="noopener noreferrer"&gt;StartBuild&lt;/a&gt;) say to use camelCase parameter names (e.g. &lt;code&gt;environmentVariablesOverride&lt;/code&gt;).&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;That's incorrect - the Step Functions integration actually expects &lt;em&gt;PascalCase&lt;/em&gt; (&lt;code&gt;EnvironmentVariablesOverride&lt;/code&gt;) instead of &lt;em&gt;camelCase&lt;/em&gt;. &lt;/p&gt;
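&lt;p&gt;A working CodeBuild task with overrides therefore looks roughly like this - the project name and variable are illustrative; note the PascalCase &lt;code&gt;EnvironmentVariablesOverride&lt;/code&gt;, &lt;code&gt;Name&lt;/code&gt;, &lt;code&gt;Type&lt;/code&gt;, and &lt;code&gt;Value&lt;/code&gt;:&lt;/p&gt;

```json
{
  "RunDataCollect": {
    "Type": "Task",
    "Resource": "arn:aws:states:::codebuild:startBuild.sync",
    "Parameters": {
      "ProjectName": "data-collect-build",
      "EnvironmentVariablesOverride": [
        {
          "Name": "CITY",
          "Type": "PLAINTEXT",
          "Value": "bratislava"
        }
      ]
    },
    "End": true
  }
}
```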


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Collect Data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rm8qlqq636ri8ooxc2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rm8qlqq636ri8ooxc2f.png" alt="Step Function - Data Collect"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I use BeautifulSoup to collect data. There are great articles and tutorials on it out there, so I will only mention that I run data collection sequentially to be a good internet citizen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Process &amp;amp; Aggregate Data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbjp79irsy6709vnh1m1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbjp79irsy6709vnh1m1.png" alt="Step Function - Data Process &amp;amp; Aggregate"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The magic happens inside the CodeBuild block, where a Dagster pipeline is executed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deeper into Dagster
&lt;/h3&gt;

&lt;p&gt;Dagster offers a number of ways to deploy and execute pipelines - see &lt;a href="https://docs.dagster.io/deployment#deployment" rel="noopener noreferrer"&gt;here&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;But that's not what I do - I run Dagster inside a Docker build on CodeBuild. I am still questioning whether it's the right approach. Nonetheless, taking the pipeline from local development to AWS was painless.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuw9ghlu7g2fdsunih597.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuw9ghlu7g2fdsunih597.png" alt="Image description"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;I mentioned that &lt;strong&gt;Dagster comes with a UI component - Dagit - with a full suite of features&lt;/strong&gt; to make development enjoyable. While working locally, I used both components. Dagit has a great UI for launching pipelines and re-executing from a selected step; it also saves intermediary results and keeps a DB of runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dagit is not necessary&lt;/strong&gt; to execute Dagster runs and I did not install it at all for Docker builds. Thanks to Poetry it was easy to separate dev installs and &lt;strong&gt;save time while building.&lt;/strong&gt;&lt;/p&gt;
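&lt;p&gt;With Poetry, that separation is just a matter of where the dependency lives - an illustrative &lt;code&gt;pyproject.toml&lt;/code&gt; fragment (the versions are made up):&lt;/p&gt;

```toml
[tool.poetry.dependencies]
python = "^3.8"
dagster = "^0.13"

[tool.poetry.dev-dependencies]
# Dagit stays out of the Docker image: install locally with the default
# `poetry install`, skip in the build with `poetry install --no-dev`
dagit = "^0.13"
```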

&lt;h4&gt;
  
  
  Dev Workflow - from Local to Step Functions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a6ed4s2cx9d656q4qds.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a6ed4s2cx9d656q4qds.png" alt="Dagster Dev Workflow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Local Dev Runs - At this step I used my computer to execute the runs. &lt;/li&gt;
&lt;li&gt;Local Docker Runs - I executed the pipeline in a local Docker build.&lt;/li&gt;
&lt;li&gt;AWS CodeBuild Runs - Same as the previous step but on AWS.&lt;/li&gt;
&lt;li&gt;AWS Step Function Runs - End-to-end testing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I must say that using Dagster might have been overkill, but this project was a great opportunity to learn it. It also provides future-proofing in case I want to restructure the project (add data collection to the Dagster pipeline, etc.), add machine learning pipelines to Dagster's repository, or execute on Spark.&lt;/p&gt;

&lt;h4&gt;
  
  
  Data Process Steps
&lt;/h4&gt;

&lt;p&gt;When doing my research, I ran into Kedro as one of the alternatives. While I didn't use it on this project, I repurposed &lt;a href="https://kedro.readthedocs.io/en/0.15.3/06_resources/01_faq.html#what-is-data-engineering-convention" rel="noopener noreferrer"&gt;Kedro's Data Engineering convention&lt;/a&gt;, which works with "layers" for each stage of the data engineering pipeline. I am only using the first 3 layers - &lt;em&gt;Raw&lt;/em&gt;, &lt;em&gt;Intermediate&lt;/em&gt;, and &lt;em&gt;Primary&lt;/em&gt; - as I am not &lt;em&gt;(yet)&lt;/em&gt; running any machine learning jobs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Raw&lt;/td&gt;
&lt;td&gt;"Raw" data that is gathered in the "Data Gathering" step of the State machine is downloaded to this folder&lt;/td&gt;
&lt;td&gt;txt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intermediate&lt;/td&gt;
&lt;td&gt;Cleaned "raw" data. At this stage redundant columns are removed. Data is cleaned, validated, and mapped.&lt;/td&gt;
&lt;td&gt;csv&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary&lt;/td&gt;
&lt;td&gt;Aggregated data that will be consumed by the front-end.&lt;/td&gt;
&lt;td&gt;csv&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The above stages and associated directories contain data after each group of tasks was executed. Output files from &lt;em&gt;Raw&lt;/em&gt;, &lt;em&gt;Intermediate&lt;/em&gt;, and &lt;em&gt;Primary&lt;/em&gt; are uploaded to S3. Locally, I used them for debugging and sanity checks.&lt;/p&gt;
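&lt;p&gt;As an illustration of this layout, a small helper can build the per-layer paths (the directory and file names here are hypothetical, not from the project):&lt;/p&gt;

```python
from pathlib import Path

# Sketch of the Kedro-style layer convention used above;
# names are made up for illustration.
LAYERS = ("raw", "intermediate", "primary")

def stage_path(layer: str, filename: str, base: str = "data") -> Path:
    """Return the local path of an artifact in the given pipeline layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return Path(base) / layer / filename

print(stage_path("raw", "listings.txt").as_posix())        # data/raw/listings.txt
print(stage_path("primary", "aggregates.csv").as_posix())  # data/primary/aggregates.csv
```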

&lt;h4&gt;
  
  
  Dagster pipeline
&lt;/h4&gt;

&lt;p&gt;Dagster separates business logic from the execution. You can write the business logic inside the components and Dagster takes care of the orchestration. The underlying execution engine is abstracted away. It's possible to use Dagster's executor, Dask, Celery, etc.&lt;/p&gt;

&lt;p&gt;Three main Dagster concepts are: &lt;code&gt;@op&lt;/code&gt;, &lt;code&gt;@job&lt;/code&gt; and &lt;code&gt;@graph&lt;/code&gt;. You can read about them &lt;a href="https://docs.dagster.io/concepts/ops-jobs-graphs/ops" rel="noopener noreferrer"&gt;here&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Briefly, an &lt;code&gt;@op&lt;/code&gt; is a unit of compute work - it should be simple and written in a functional style. Several &lt;code&gt;@op&lt;/code&gt;s can be connected into a &lt;code&gt;@graph&lt;/code&gt; for convenience. I connected mapping, cleaning, and validation steps into graphs - a logical grouping of ops based on job type. A &lt;code&gt;@job&lt;/code&gt; is a fully connected graph of &lt;code&gt;@op&lt;/code&gt; and &lt;code&gt;@graph&lt;/code&gt; units that can be triggered to process data.&lt;/p&gt;

&lt;p&gt;As I am processing both rent and sell data, I run the same &lt;code&gt;@op&lt;/code&gt;s in the same job in parallel, reusing them by aliasing. See the gist below:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
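&lt;p&gt;The aliasing idea can be sketched without Dagster - one function definition serves both branches under two names (the data and names below are made up; in Dagster itself this is done with &lt;code&gt;clean.alias("clean_rent")&lt;/code&gt; inside a graph or job):&lt;/p&gt;

```python
# Framework-free sketch of Dagster op aliasing: one shared "op"
# serves both the rent and sell branches in parallel.
def clean(records):
    """Shared cleaning step: drop records without a price."""
    return [r for r in records if r.get("price") is not None]

clean_rent = clean  # alias for the rent branch
clean_sell = clean  # alias for the sell branch

rent_raw = [{"price": 700}, {"price": None}]
sell_raw = [{"price": 250000}]

print(clean_rent(rent_raw))  # [{'price': 700}]
print(clean_sell(sell_raw))  # [{'price': 250000}]
```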


&lt;p&gt;The full Dagster &lt;code&gt;@job&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcno0ubela24bs7tbrzp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcno0ubela24bs7tbrzp.png" alt="Dagster @job"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;@graph&lt;/code&gt; implementation in Dagster.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;When expanded in Dagit it looks like the image below. &lt;code&gt;@graph&lt;/code&gt; helps to group operations together and unclutter the UI compared to an &lt;code&gt;@op&lt;/code&gt;-only implementation. Furthermore, you can test a full block of operations instead of testing operation by operation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ed51cl6jdnrgzoa5msr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ed51cl6jdnrgzoa5msr.png" alt="Dagster @graph"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Scope &amp;amp; Room for Improvement
&lt;/h2&gt;

&lt;p&gt;I finished the first version of my project, and I already see how parts of the code could be improved - mostly in the data processing part where I use Dagster. It's my first time working with this tool and I missed some important features that would have made development, testing, and data handling easier.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;S3 File Handling&lt;/strong&gt; I wrote my own S3 manager to upload and download data from S3 buckets. I only recently found out that a &lt;code&gt;dagster-aws&lt;/code&gt; module exists. Looking at the module, it does exactly what I need, minus the code I had to write.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Artifact/Data Handling&lt;/strong&gt; I use the &lt;em&gt;Raw, Intermediate, Primary&lt;/em&gt; stages for data artifacts created during processing. To save them to the respective folders I implemented a simple write &lt;code&gt;@op&lt;/code&gt;. It's a legit approach, but &lt;code&gt;AssetMaterialization&lt;/code&gt; seems like a better, more Dagster-y, way to do it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Settings &amp;amp; Config&lt;/strong&gt; I created a global &lt;code&gt;Settings&lt;/code&gt; class which contained all settings and configs. In hindsight I should have added the &lt;code&gt;Settings&lt;/code&gt; class to Dagster's &lt;code&gt;context&lt;/code&gt; or just used Dagster's config. (I think I carried over the mindset from the previous pure-Python implementation of the data processing pipeline.)&lt;/li&gt;
&lt;/ol&gt;
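&lt;p&gt;For reference, the global &lt;code&gt;Settings&lt;/code&gt; approach from point 3 can be sketched as a frozen dataclass (the fields and values below are hypothetical, not the project's real config); Dagster's run config would replace this:&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical fields - a stand-in for the project's real Settings class.
@dataclass(frozen=True)
class Settings:
    bucket: str = "my-data-bucket"
    raw_prefix: str = "raw"
    primary_prefix: str = "primary"

settings = Settings()
print(settings.bucket)  # my-data-bucket
```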




&lt;p&gt;Feel free to leave a comment 📥 💫.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Catch me on&lt;/strong&gt; &lt;a href="https://github.com/tkeyo" rel="noopener noreferrer"&gt;github&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Catch me on&lt;/strong&gt; &lt;a href="https://twitter.com/tkeyo_" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Links:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://blog.bassemdy.com/2020/06/08/aws/architecture/microservices/patterns/aws-step-functions-think-again.html" rel="noopener noreferrer"&gt;Planning on using AWS Step Functions? Think again&lt;/a&gt;&lt;br&gt;
&lt;a href="https://hackernoon.com/a-quick-introduction-to-machine-learning-with-dagster-gh53336m" rel="noopener noreferrer"&gt;A Quick Introduction to Machine Learning with Dagster&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/step-functions/latest/dg/connect-codebuild.html" rel="noopener noreferrer"&gt;Call AWS CodeBuild with Step Functions&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/codebuild/latest/APIReference/API_StartBuild.html#API_StartBuild_RequestParameters" rel="noopener noreferrer"&gt;StartBuild&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.dagster.io/deployment#deployment" rel="noopener noreferrer"&gt;Dagster - Deployment&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dagster</category>
      <category>dataengineering</category>
      <category>python</category>
    </item>
    <item>
      <title>Export FastAI ResNet models to ONNX</title>
      <dc:creator>Tomas</dc:creator>
      <pubDate>Wed, 07 Jul 2021 23:25:55 +0000</pubDate>
      <link>https://forem.com/tkeyo/export-fastai-resnet-models-to-onnx-2gj7</link>
      <guid>https://forem.com/tkeyo/export-fastai-resnet-models-to-onnx-2gj7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;A short guide on FastAI vision model conversion to ONNX. Code included. 👀&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What is FastAI?
&lt;/h2&gt;

&lt;p&gt;FastAI is &lt;em&gt;"making neural nets uncool again"&lt;/em&gt;. It offers a high-level API to PyTorch - it could be considered the Keras of PyTorch. FastAI and the accompanying course taught by Jeremy Howard and Rachel Thomas take a practical approach to deep learning and encourage students to train DL models from the first minute. BUT I am sure you already have hands-on experience with this framework if you are looking to convert your FastAI models to ONNX.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is ONNX?
&lt;/h2&gt;

&lt;p&gt;Open Neural Network Exchange or ONNX is a unified format for deep learning and traditional machine learning models. The idea behind ONNX is to create a common interface for all ML frameworks and increase the interoperability between frameworks and devices. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ONNX is an open specification that consists of a definition of an extensible computation graph model, definition of standard data types, and definition of built-in operators. Extensible computation graph and definition of standard data types make up the Intermediate Representation (IR). &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Source [&lt;a href="https://github.com/onnx/onnx/blob/master/docs/IR.md" rel="noopener noreferrer"&gt;link&lt;/a&gt;]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625693023787%2FyhXdGzeXJ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625693023787%2FyhXdGzeXJ.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source [&lt;a href="https://microsoft.github.io/ai-at-edge/docs/onnx/" rel="noopener noreferrer"&gt;link&lt;/a&gt;]&lt;/p&gt;

&lt;p&gt;ONNX, and its implementation - ONNX Runtime - make it easier to put your models into production. You can train your models using the framework of your choice and deploy to a target that uses ONNX Runtime. This way, bloated environments with a large number of dependencies can be minimized to (pretty much) only ONNX Runtime. There's growing support for ONNX, and exports are natively supported by frameworks like PyTorch, MXNet, etc. Find all of them here [&lt;a href="https://onnx.ai/supported-tools" rel="noopener noreferrer"&gt;link&lt;/a&gt;]. Although in some cases exporting/importing might be tricky due to opset compatibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use ONNX and ONNX Runtime?
&lt;/h3&gt;

&lt;p&gt;A couple of reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faster inference [&lt;a href="https://medium.com/microsoftazure/faster-and-smaller-quantized-nlp-with-hugging-face-and-onnx-runtime-ec5525473bb7" rel="noopener noreferrer"&gt;link&lt;/a&gt;], [&lt;a href="https://cloudblogs.microsoft.com/opensource/2020/12/17/accelerate-simplify-scikit-learn-model-inference-onnx-runtime/" rel="noopener noreferrer"&gt;link&lt;/a&gt;]&lt;/li&gt;
&lt;li&gt;Lower number of dependencies*&lt;/li&gt;
&lt;li&gt;Smaller environment size*&lt;/li&gt;
&lt;li&gt;One, universal target framework for deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;*See conda environment and dependency comparison at the end of the article.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Process
&lt;/h2&gt;

&lt;p&gt;FastAI currently doesn't natively support ONNX exports from FastAI learners. But by design FastAI is a high-level API of PyTorch. This allows us to extract the wrapped PyTorch model. And luckily, PyTorch models can be natively exported to ONNX. It's a 2-step process with a couple of gotchas. This guide intends to make it a smooth experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625693254745%2FcVtex5wkQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625693254745%2FcVtex5wkQ.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can find the entire process in my repository [&lt;a href="https://github.com/tkeyo/fastai-onnx" rel="noopener noreferrer"&gt;link&lt;/a&gt;]. It also includes an optional ResNet model training. You can skip it and proceed with model export to ONNX. I included a link to a pre-trained model in the notebooks. &lt;em&gt;&lt;strong&gt;Or BYOFM - Bring Your Own FastAI Model&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Export (Extract) the PyTorch model
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625698022279%2FqH5_G0jep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625698022279%2FqH5_G0jep.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's break down what's happening.&lt;/p&gt;

&lt;p&gt;If you check the associated notebooks you will find that I exported the FastAI ResNet learner in the previous steps and named it &lt;em&gt;&lt;code&gt;hot_dog_model_resnet18_256_256.pkl&lt;/code&gt;&lt;/em&gt;. With &lt;code&gt;load_learner()&lt;/code&gt; I am loading the previously exported FastAI model on &lt;strong&gt;line 7&lt;/strong&gt;. If you trained your own model you can skip the load step - your model is already stored in &lt;code&gt;learn&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To get the PyTorch model from the FastAI wrapper we use the &lt;code&gt;model&lt;/code&gt; attribute on &lt;code&gt;learn&lt;/code&gt; - see &lt;strong&gt;line 12&lt;/strong&gt;. I don't want to train the model in subsequent steps, thus I am also setting it to evaluation mode with &lt;code&gt;eval()&lt;/code&gt;. For more details on &lt;code&gt;eval()&lt;/code&gt; and &lt;code&gt;torch.no_grad()&lt;/code&gt; see the discussion [&lt;a href="https://discuss.pytorch.org/t/model-eval-vs-with-torch-no-grad/19615" rel="noopener noreferrer"&gt;link&lt;/a&gt;].&lt;/p&gt;

&lt;p&gt;FastAI wraps the PyTorch model with additional layers for convenience - &lt;strong&gt;Softmax&lt;/strong&gt;, &lt;strong&gt;Normalization&lt;/strong&gt;, and other transformations (defined in the FastAI DataBlock API). When using the bare-bones PyTorch model I have to make up for this, otherwise I'll be getting &lt;em&gt;weird&lt;/em&gt; results.&lt;/p&gt;

&lt;p&gt;First, I define the softmax layer. This turns the inference results into a more human-readable format - from something like &lt;code&gt;('not_hot_dog', array([[-3.0275817, 1.2424631]], dtype=float32))&lt;/code&gt; into &lt;code&gt;('not_hot_dog', array([[0.01378838, 0.98621166]], dtype=float32))&lt;/code&gt;. Notice the range of the inference results - with the added &lt;strong&gt;softmax&lt;/strong&gt; layer the results are scaled between &lt;strong&gt;0 and 1&lt;/strong&gt;.&lt;/p&gt;
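&lt;p&gt;The numbers above can be reproduced with a few lines of plain Python - softmax is just exponentiation and normalization:&lt;/p&gt;

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw model outputs."""
    m = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# The raw scores from the example above:
probs = softmax([-3.0275817, 1.2424631])
print([round(p, 6) for p in probs])  # [0.013788, 0.986212]
```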

&lt;p&gt;On &lt;strong&gt;line 18&lt;/strong&gt;, the normalization layer is defined. I am reusing the suggested ImageNet mean and standard deviation values as described here [&lt;a href="https://pytorch.org/vision/stable/models.html" rel="noopener noreferrer"&gt;link&lt;/a&gt;]. If you are interested in an in-depth conversation on the topic of normalization, see this [&lt;a href="https://discuss.pytorch.org/t/understanding-transform-normalize/21730/8" rel="noopener noreferrer"&gt;link&lt;/a&gt;].&lt;/p&gt;
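&lt;p&gt;The normalization itself is simple per-channel arithmetic with the ImageNet statistics (a plain-Python sketch; the real layer operates on whole tensors at once):&lt;/p&gt;

```python
# ImageNet channel statistics, as suggested in the torchvision docs.
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # R, G, B
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one RGB pixel whose values are already scaled to 0-1."""
    return tuple(
        (value - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )

# A mid-gray pixel ends up slightly positive on every channel:
print(normalize_pixel((0.5, 0.5, 0.5)))
```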


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;On &lt;strong&gt;lines 21-25&lt;/strong&gt;, I am pulling it all together into the final model, which will be used for the ONNX conversion. The FastAI learner also handles resizing, but for PyTorch and ONNX this will be handled outside the model by an extra function.&lt;/p&gt;

&lt;h2&gt;
  
  
  Export PyTorch to ONNX
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625697979846%2Ftgc5n0j7C.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625697979846%2Ftgc5n0j7C.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PyTorch natively supports ONNX exports; I only need to define the export parameters. As you can see, we are (re)using &lt;code&gt;final_model&lt;/code&gt; for the export. On line 5 I am creating a dummy tensor that is used to define the input dimensions of my ONNX model. These dimensions follow the &lt;code&gt;batch x channels x height x width - BCHW&lt;/code&gt; format. My FastAI model was trained on images with 256 x 256 dimensions, which were defined in the FastAI DataBlock API. The same dimensions must be used for the ONNX export - &lt;code&gt;torch.randn(1, 3, 256, 256)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I got this wrong a couple of times - the dummy tensor had different dimensions than the images the model was trained on. &lt;em&gt;Example: Dummy tensor &lt;code&gt;torch.randn(1, 3, 320, 320)&lt;/code&gt; while training image dimensions were &lt;code&gt;3 x 224 x 224&lt;/code&gt;. It took me a while to figure out why I got poor results from my ONNX models.&lt;/em&gt;&lt;/p&gt;
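&lt;p&gt;A cheap guard against this mistake is deriving the dummy tensor's BCHW shape from the training size in one place (a sketch; &lt;code&gt;torch.randn(*shape)&lt;/code&gt; would then consume it):&lt;/p&gt;

```python
# The image size the model was trained on, as defined in the
# FastAI DataBlock - keep it in a single place.
TRAIN_HEIGHT, TRAIN_WIDTH = 256, 256

def dummy_input_shape(batch=1, channels=3,
                      height=TRAIN_HEIGHT, width=TRAIN_WIDTH):
    """Shape of the export dummy tensor in BCHW order."""
    return (batch, channels, height, width)

shape = dummy_input_shape()
# Fails loudly if the dummy tensor drifts from the training dimensions:
assert shape == (1, 3, 256, 256), "dummy tensor must match training size"
print(shape)  # (1, 3, 256, 256)
```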


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;export_param&lt;/code&gt; argument, if set to &lt;code&gt;True&lt;/code&gt;, includes the parameters of the trained model in the export. It's important to use &lt;code&gt;True&lt;/code&gt; in this case. We want our model with parameters. As you might have guessed, &lt;code&gt;export_params=False&lt;/code&gt; exports a model without parameters. Full &lt;code&gt;torch.onnx&lt;/code&gt; documentation [&lt;a href="https://pytorch.org/docs/master/onnx.html" rel="noopener noreferrer"&gt;link&lt;/a&gt;].&lt;/p&gt;

&lt;h2&gt;
  
  
  Inference with ONNX Runtime
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625698218983%2F_ZorSiRJJ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1625698218983%2F_ZorSiRJJ.png" alt="onnx_runtime_logo.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;line 10&lt;/strong&gt;, I am creating an ONNX Runtime inference session and loading the exported model. For debugging purposes, or if you get your hands on an ONNX model with unknown input dimensions, you can run &lt;code&gt;get_inputs()[0].shape&lt;/code&gt; on the inference session instance to get the expected inputs. If you prefer a GUI, Netron [&lt;a href="https://netron.app" rel="noopener noreferrer"&gt;link&lt;/a&gt;] can help you visualize the architecture of the neural network.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;The inference itself is done by using the &lt;code&gt;run()&lt;/code&gt; method which returns a numpy array with softmaxed probabilities. See &lt;strong&gt;line 21&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storage, Dependencies &amp;amp; Inference Speed
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Storage
&lt;/h4&gt;

&lt;p&gt;The advantage of using ONNX Runtime is the small storage footprint compared to PyTorch and FastAI. A conda environment with ONNX Runtime (+ Pillow for convenience) is &lt;strong&gt;~ 25% of the PyTorch&lt;/strong&gt; environment and only &lt;strong&gt;~ 15% of the FastAI&lt;/strong&gt; environment. Important for serverless deployments.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;h4&gt;
  
  
  Dependencies
&lt;/h4&gt;

&lt;p&gt;See for yourself.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;h4&gt;
  
  
  Inference speed
&lt;/h4&gt;

&lt;p&gt;I mentioned inference speed as an advantage of ONNX. I tested the inference speed of all three versions of the same model; the differences were negligible. Other experiments had more favourable results for ONNX - see the references in &lt;strong&gt;Why use ONNX and ONNX Runtime?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;FastAI is a great tool to get you up and running with model training in a (VERY) short time. It has everything you need to get top-notch results with minimal effort in a practical manner. But when it comes to deployment, tools like ONNX &amp;amp; ONNX Runtime can save resources with their smaller footprint and efficient implementation. I hope this guide was helpful and that you managed to successfully convert your model to ONNX.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repository/Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tkeyo/fastai-onnx" rel="noopener noreferrer"&gt;FastAI-ONNX GitHub&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Feel free to reach out.👏&lt;/strong&gt;&lt;/p&gt;


</description>
      <category>machinelearning</category>
      <category>fastai</category>
      <category>pytorch</category>
      <category>onnx</category>
    </item>
    <item>
      <title>TinyML: Machine Learning on ESP32 with MicroPython</title>
      <dc:creator>Tomas</dc:creator>
      <pubDate>Sat, 26 Jun 2021 11:03:15 +0000</pubDate>
      <link>https://forem.com/tkeyo/tinyml-machine-learning-on-esp32-with-micropython-38a6</link>
      <guid>https://forem.com/tkeyo/tinyml-machine-learning-on-esp32-with-micropython-38a6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Detecting gestures from time-series data with ESP32, accelerometer, and MicroPython in near real-time.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why this project?
&lt;/h2&gt;

&lt;p&gt;I wanted to build a &lt;strong&gt;TinyML&lt;/strong&gt; application that uses &lt;strong&gt;time-series data&lt;/strong&gt; and could be deployed to edge devices - an ESP32 microcontroller in this case. I looked into machine learning projects that use &lt;strong&gt;MicroPython on ESP32&lt;/strong&gt; but could not find any (let me know if I am missing something 🙃). There is, however, a growing number of C/C++ TinyML projects using TensorFlow Lite Micro in combination with neural networks. For the first iteration of this project I skipped neural networks and explored what's possible with &lt;strong&gt;standard machine learning algorithms&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Before jumping into code, let's clear the basics...&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing TinyML
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What's TinyML?
&lt;/h3&gt;

&lt;p&gt;TinyML is the overlap between machine learning and embedded (IoT) devices. It gives embedded devices more "intelligence" to power advanced applications using machine learning. The idea is simple - for complex use-cases where rule-based logic is insufficient, apply ML algorithms and run them on low-power devices at the edge. Sounds simple; execution gets tougher.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmr70ffxar9pdp4c6p3v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmr70ffxar9pdp4c6p3v.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;TinyML is a fairly new concept, with first mentions dating back to around 2018. There's still ambiguity about what is considered TinyML. For the purpose of this article, TinyML applications are applications running on anything from microcontrollers with MHz clock speeds up to more powerful devices like the Nvidia Jetson family - Raspberry Pi included. Other names for TinyML are AIoT, Edge Analytics, Edge AI, and far-edge computing. Choose the one you like the most.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why TinyML?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pdpcqur9ycgmsrwaeri.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pdpcqur9ycgmsrwaeri.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Bandwidth&lt;/strong&gt; - As an example, a device with a 100Hz sampling rate produces 360,000 data points each hour. Now imagine the amount of data produced by a fleet of these devices. It gets even trickier with images and video.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt; - "&lt;em&gt;time between when a system takes in a sensory input and responds to it&lt;/em&gt;". In conventional ML deployments data must first be sent to an ML application. This increases the time in which an edge device can take action, as it waits for the response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Economics&lt;/strong&gt; - Cloud is cheap, but not that cheap. It still costs money to ingest large amounts of data, especially if it must happen in real-time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt; - Revisiting the bandwidth example, in case of high-frequency sampling, it might be hard to ensure that data arrives to a target in the same order as it was produced by an edge device.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt; - TinyML processes data on-device, so data is not sent over the network. This reduces the surface for data abuse.&lt;/li&gt;
&lt;/ol&gt;
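&lt;p&gt;The bandwidth figure from point 1 is easy to reproduce (the fleet size below is a made-up example):&lt;/p&gt;

```python
# One signal sampled at 100Hz for one hour:
SAMPLING_HZ = 100
SECONDS_PER_HOUR = 60 * 60

samples_per_hour = SAMPLING_HZ * SECONDS_PER_HOUR
print(samples_per_hour)  # 360000

# A 6-DoF IMU produces six such signals per device,
# and a hypothetical fleet of 1,000 devices multiplies that again:
values_per_hour = 6 * samples_per_hour
fleet_values_per_hour = 1000 * values_per_hour
print(values_per_hour, fleet_values_per_hour)  # 2160000 2160000000
```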

&lt;h3&gt;
  
  
  TinyML use cases
&lt;/h3&gt;

&lt;p&gt;TinyML use cases range from predictive maintenance all the way to virtual assistants. I might write an article on the current landscape, use cases, and the business case behind TinyML. &lt;/p&gt;




&lt;h2&gt;
  
  
  What's this TinyML Project about?
&lt;/h2&gt;

&lt;p&gt;I set out to build a TinyML system that detects 3 types of gestures from a time-series (I will be using gestures/movements interchangeably throughout this article), stores the results, and visualizes them on a webpage. &lt;/p&gt;

&lt;p&gt;The system has a static webpage hosted on S3 buckets, DynamoDB, a Golang microservice, and obviously the edge device with a TinyML application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project architecture&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq385htj4ab55xldxzadn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq385htj4ab55xldxzadn.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While there are more components to this project, this article will be about machine learning and some parts of the implementation on ESP32. If you are interested in the full code you can find the links to the repositories at the end of this article.&lt;/p&gt;

&lt;p&gt;Let's go to the edge and see the hardware.&lt;/p&gt;
&lt;h3&gt;
  
  
  On the edge
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugyewwjoe0j4m72nn55o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugyewwjoe0j4m72nn55o.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core of the system is an ESP32 - a microcontroller produced by Espressif with a 240MHz clock speed, built-in WiFi+BLE, and the ability to run MicroPython🐍 (I used MicroPython 1.14). The IMU used for this project was an MPU6500 with 6 degrees of freedom (DoF) - 3 accelerations (X,Y,Z) and 3 angular velocities (X,Y,Z). Plus a breadboard and jumper wires to connect it all together.&lt;/p&gt;
&lt;h3&gt;
  
  
  MicroPython
&lt;/h3&gt;

&lt;p&gt;If you haven't yet heard about MicroPython - it's Python for microcontrollers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimized to run on microcontrollers and in constrained environments." [Link]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It might not be as performant as C or C++, but it provides plenty to make prototyping enjoyable - especially for IoT applications which are not latency-sensitive. (It worked fine with a 100Hz sampling rate.)&lt;/p&gt;
&lt;h2&gt;
  
  
  Data &amp;amp; Machine Learning
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Gestures definition
&lt;/h3&gt;

&lt;p&gt;You can find the 3 gestures for which I collected data below. I call them 'circle', 'X' and 'Y' - the gifs follow the same order. 'Circle' is self-explanatory; 'X' and 'Y' because the gesture was along the X and Y axis of the sensor, respectively. Ideally, I would have wanted to detect anomalies on real machine data, but that type of data is hard to come by and also hard to replicate. My defined gestures, on the other hand, were easy to generate and more than enough to test the possibilities of MicroPython and ESP32.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2Fcircle_x_y_optimized.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2Fcircle_x_y_optimized.gif" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Experiments
&lt;/h3&gt;

&lt;p&gt;Experimentation with machine learning was divided into two parts. The first explored the effects of time-series labelling on model performance, using all available signals from the sensor - X,Y,Z accelerations and X,Y,Z angular velocities. Additionally, I tested the viability of ML on ESP32 from the inference-time perspective - whether it would be possible to achieve low enough inference times. &lt;/p&gt;

&lt;p&gt;The focus of the second set of experiments was model optimization, reducing feature space, selecting the right sampling frequency, and reducing incorrect inference results.&lt;/p&gt;
&lt;h3&gt;
  
  
  Collecting data
&lt;/h3&gt;

&lt;p&gt;I simplified data collection by using the Terminal Capture VS Code extension. It let me save sensor data from VSC's terminal to a txt file which I later wrangled into csv format. For printing out sensor data I wrote the script below. It runs on the ESP32 at startup with a 10ms sampling period (100Hz sampling rate). 10ms was the lowest I could get with consistent results - I tried &lt;code&gt;period=5&lt;/code&gt; but the readings were inconsistent, coming in between 5-7ms. Hitting the first limitation of the stack. Nonetheless, 10ms (100Hz) was more than enough.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;And this is how it works:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2F100q_80cols_25rows.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2F100q_80cols_25rows.gif" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Labelling and label distribution
&lt;/h3&gt;

&lt;p&gt;There are great labelling tools out there; I used &lt;a href="http://labelstud.io" rel="noopener noreferrer"&gt;labelstud.io&lt;/a&gt; to label my time-series data. Among the 3 defined gestures, 'circle' is the longest at around 800-1000ms, while 'X' and 'Y' take between 400-600ms. To have a buffer, I used a 1000ms label span for all three labels.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flf4peucuo57e4ml6b1w2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flf4peucuo57e4ml6b1w2.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Exploring the dataset - EDA
&lt;/h3&gt;

&lt;p&gt;I used 3D plots to see if there's a relationship between the signals. All data points of the 1000ms time span are plotted (101 data points).&lt;/p&gt;

&lt;h4&gt;
  
  
  Acceleration 3D plot
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftr49ivtr5t0y2959n3b3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftr49ivtr5t0y2959n3b3.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Angular velocity 3D plot
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3buiu9jpk5r99ihv8k8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3buiu9jpk5r99ihv8k8m.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's clear that there's a pattern in gesture accelerations and angular velocities.&lt;/p&gt;

&lt;p&gt;Let's double check by plotting all signals against their mean. You can find plots of all gestures and correlation matrices in jupyter notebooks in the associated repository.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4656afh0ddufalry0lvv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4656afh0ddufalry0lvv.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: The cutoff at the top and bottom of the signals is due to the sensor range, which was set to 2G (~19.6 m/s^2).&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Machine Learning to detect gestures
&lt;/h3&gt;

&lt;p&gt;Since the ESP32, and microcontrollers in general, are resource constrained, there are a couple of requirements for my TinyML application:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Inference time &amp;lt;&amp;lt; sampling period&lt;/li&gt;
&lt;li&gt;ML model &amp;lt; 20kB - it's hard to load files larger than 20kB onto the ESP32 (at least with MPY)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;MicroPython is still a young project. It is supported by an active community and many libraries have already been developed. &lt;strong&gt;Unfortunately, there's no scikit-learn or a dedicated time-series machine learning library for MicroPython&lt;/strong&gt;. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How to overcome this?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;The answer is pure-python machine learning models.&lt;/strong&gt; Luckily, I found a great library (&lt;a href="https://github.com/BayesWitnesses/m2cgen/tree/master/generated_code_examples" rel="noopener noreferrer"&gt;m2cgen&lt;/a&gt;) that lets you export scikit-learn models to Python, Go, Java, and many other programming languages. It doesn't support exporting time-series-specific ML models, so I'll be using standard scikit-learn algorithms.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm29itv3ijno3j577ywky.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm29itv3ijno3j577ywky.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In practice it looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Train models with scikit-learn on tabular data&lt;/li&gt;
&lt;li&gt;Convert scikit-learn models to pure-python code&lt;/li&gt;
&lt;li&gt;Use pure-python models for inference&lt;/li&gt;
&lt;/ol&gt;
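&lt;p&gt;&lt;em&gt;To illustrate step 2, here is a hand-written example of the kind of dependency-free code m2cgen emits for a small decision tree - nothing but nested if/else on the flattened input, which is why it runs under MicroPython as-is. The thresholds, feature indices and class order below are made up for the example, not actual exported output:&lt;/em&gt;&lt;/p&gt;

```python
# Illustration only: thresholds and feature indices are invented.
# Class order assumed: 0 = circle, 1 = X, 2 = Y.
def score(inp):
    """inp: flattened feature vector (one value per time step per signal)."""
    if inp[12] <= -4.5:
        return [1.0, 0.0, 0.0]      # circle
    else:
        if inp[40] <= 2.1:
            return [0.0, 1.0, 0.0]  # X
        else:
            return [0.0, 0.0, 1.0]  # Y

def predict(inp):
    """Return the index of the most probable class."""
    probs = score(inp)
    return probs.index(max(probs))
```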

&lt;h3&gt;
  
  
  Caveats of using scikit-learn for time-series data
&lt;/h3&gt;

&lt;p&gt;Using scikit-learn for time-series comes with a price - data must be in a tabular format to train the models. There are two ways to go about this [&lt;a href="https://www.sktime.org/en/latest/examples/02_classification_univariate.html" rel="noopener noreferrer"&gt;link&lt;/a&gt;]:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Tabularizing (reducing) the data&lt;/p&gt;

&lt;p&gt;In this case each time point is considered a feature and we lose the ordering of the data in time. There's no dependency of one point on the previous or next in the series.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Feature extraction&lt;/p&gt;

&lt;p&gt;With feature extraction, the time-series data is used to calculate the mean, max, min, variance and other time-series-specific variables, which are then used as features for model training. We move away from the time-series domain and operate in the domain of features.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;I chose data tabularization&lt;/strong&gt;. While it's simple to call advanced libraries in Python, MicroPython has a limited mathematical toolset - I might not be able to extract all the features in MPY. Secondly, I had to consider the speed at which these features could be calculated - given the limited resources it might take longer than the sampling period. Maybe in the next iteration of this TinyML project.&lt;/p&gt;
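&lt;p&gt;&lt;em&gt;A minimal sketch of the tabularization step - each (time step, signal) pair becomes one column of the training row, so temporal order survives only implicitly via column position:&lt;/em&gt;&lt;/p&gt;

```python
def tabularize(window):
    """Flatten a labelled window into one tabular row.

    window: list of samples, each a list of 6 sensor readings.
    A 1000 ms window at 50 Hz (51 samples) yields a 306-value row.
    """
    row = []
    for sample in window:
        row.extend(sample)
    return row
```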

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgrrapvu7xbb9m0zlby9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgrrapvu7xbb9m0zlby9.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Dataset variations for ML model training
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Event position in label
&lt;/h4&gt;

&lt;p&gt;This affected the 'X' and 'Y' gestures: since their execution takes between 400-600ms, it was possible to change their position within the 1000ms label window. 'Circle' takes 800-1000ms, so I left this gesture as labelled.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fog49mk6hbwrkurwvggp6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fog49mk6hbwrkurwvggp6.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset description&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Baseline dataset&lt;/p&gt;

&lt;p&gt;The dataset used for training and validation contained the movements as collected and labelled.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Centered X and Y move signals&lt;/p&gt;

&lt;p&gt;The 'circle' movement takes up the whole 1000ms span and cannot be manipulated by moving it along the time axis. However, 'X' and 'Y' have a shorter execution at around 400-600ms and allow for flexibility. I centered these movements within the 1000ms window to see if the model would perform better with this setup.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Centered X and Y move signals + Augmentation&lt;/p&gt;

&lt;p&gt;As in the previous case, 'X' and 'Y' movements were placed in the center of the 1000ms window. Additionally, a simple form of augmentation was introduced. Since labelling the movements is not 'exact', some signals might have a misaligned start. To make up for this, and possibly achieve better generalization, I added shifted copies - using a range of small shifts.&lt;/p&gt;

&lt;p&gt;For 'X' and 'Y' movements the center is at -20 steps, and for augmentation a range between -20 and -15 was used, where one step is 10ms.&lt;/p&gt;

&lt;p&gt;For 'circle' a range between -2 and 2 was used.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: If the original label starts at 0ms and the augmented copy is shifted by -1 step, the augmented copy will start at -10ms.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Centered X and Y move signals + SMOTE&lt;/p&gt;

&lt;p&gt;Similarly to the previous two cases, 'X' and 'Y' are centered, but additional synthetic oversampling (SMOTE) is used and an equal amount of labels is created for the training dataset.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;X and Y signal at the end of the window&lt;/p&gt;

&lt;p&gt;In this case the 'X' and 'Y' movements are put at the end of the 1000ms sampling window. &lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
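&lt;p&gt;&lt;em&gt;The shift-based augmentation from variant 3 can be sketched as follows - the function name and the exact cutting logic are my illustration, not the original notebook code:&lt;/em&gt;&lt;/p&gt;

```python
def shifted_windows(series, start, length, shifts):
    """Cut one labelled window per shift (in steps) from a longer recording.

    series: full recording (list of samples), start: index where the
    label begins, length: window length in steps, shifts: iterable of
    step offsets, e.g. range(-20, -14) for centered X/Y plus jitter
    or range(-2, 3) for 'circle'.
    """
    windows = []
    for s in shifts:
        begin = start + s
        if begin < 0 or begin + length > len(series):
            continue  # shifted window would fall outside the recording
        windows.append(series[begin:begin + length])
    return windows
```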

&lt;p&gt;&lt;strong&gt;Data sampling rate&lt;/strong&gt;&lt;br&gt;
Data was collected at 100Hz which allowed me to downsample. You can find the frequencies used for model training below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qt0e7x6u1mf2jbkoboh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qt0e7x6u1mf2jbkoboh.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
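&lt;p&gt;&lt;em&gt;Since the data was collected at 100Hz, the lower rates can be produced by integer decimation - a sketch, assuming the target frequency divides the source frequency evenly:&lt;/em&gt;&lt;/p&gt;

```python
def downsample(samples, src_hz, dst_hz):
    """Keep every n-th sample, e.g. 100 Hz -> 50 Hz keeps every 2nd."""
    assert src_hz % dst_hz == 0, "use an integer decimation factor"
    step = src_hz // dst_hz
    return samples[::step]
```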
&lt;h3&gt;
  
  
  Model evaluation
&lt;/h3&gt;

&lt;p&gt;After the initial model training, deployment and inference on live data, I noticed that inference on the ESP32 was too sensitive - multiple detections for the same movement occurrence. I collected a validation dataset to see what happens with live data. Each validation dataset - 'circle', 'X', 'Y' - contained 5-6 gesture events.&lt;/p&gt;

&lt;p&gt;I emulated a live data feed through a sliding inference window, which makes an inference at each step while sliding through the time-series. Each green (circle), blue (X), and red (Y) line represents one inference. These lines are placed at the center of the sliding window (i.e. T + 500ms).&lt;/p&gt;
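&lt;p&gt;&lt;em&gt;The sliding-window emulation can be sketched like this, with &lt;code&gt;predict&lt;/code&gt; standing in for any trained model:&lt;/em&gt;&lt;/p&gt;

```python
def sliding_inference(series, predict, window, step=1):
    """Run predict on every window position over the series.

    Returns (center_index, label) pairs - one inference per step,
    mirroring one inference per incoming sample on the device; the
    center index is where the plot marker sits (mid-window).
    """
    results = []
    for start in range(0, len(series) - window + 1, step):
        label = predict(series[start:start + window])
        center = start + window // 2
        results.append((center, label))
    return results
```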

&lt;p&gt;See the example below - all of these models had 0.95+ accuracy yet still produced incorrect inference results when emulating live data on the validation datasets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqphl8ccn49m4um9ocvbc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqphl8ccn49m4um9ocvbc.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluation equation&lt;/strong&gt;&lt;br&gt;
I used the equations below to evaluate the models. For each dataset I calculated the ratio of incorrect labels that shouldn't be there. I acknowledge that I should have labelled my evaluation datasets, but I needed a quick way to quantitatively evaluate models.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Label&lt;/th&gt;
&lt;th&gt;Equation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Circle&lt;/td&gt;
&lt;td&gt;circle_error = (X+Y) / (X+Y+Circle)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X&lt;/td&gt;
&lt;td&gt;x_error = (Circle+Y) / (X+Y+Circle)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;y_error = (Circle+X) / (X+Y+Circle)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
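&lt;p&gt;&lt;em&gt;The three equations generalize to "share of inferences that are not the expected gesture"; a sketch:&lt;/em&gt;&lt;/p&gt;

```python
from collections import Counter

def gesture_error(inferences, true_label):
    """Share of inferences on a single-gesture validation set that are
    not the expected gesture, e.g. circle_error = (X+Y) / (X+Y+Circle)."""
    counts = Counter(inferences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts[true_label]) / total
```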
&lt;h3&gt;
  
  
  Baseline model training results
&lt;/h3&gt;

&lt;p&gt;Initially I used 5 models for baseline training - Decision Tree, Random Forest, Support Vector Machine, Logistic Regression, and Naive Bayes - but reduced the set to Decision Trees and Random Forests. m2cgen doesn't support Naive Bayes, so I was unable to convert NB models to pure-python, and Logistic Regression and SVMs had inference-time issues when converted to pure-python.&lt;/p&gt;

&lt;p&gt;Both Random Forest and Decision Tree settings were left at their defaults.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17n71leztfz1bd6ztmiv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17n71leztfz1bd6ztmiv.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, there's nothing conclusive with regard to sampling frequency and event position in the label. On top of that, I noticed large variations in results just by changing the random_seed of the model. I assume this could be solved by collecting more data.&lt;/p&gt;

&lt;p&gt;The means across all 3 movement errors can be found below. Again there's no clear winner, so going forward I will be using the baseline dataset to train optimized models - sticking to the basics.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg10azmtkuzb2k9qdynvn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg10azmtkuzb2k9qdynvn.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Testing inference time
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqz6lyn3duqwoc763uhpe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqz6lyn3duqwoc763uhpe.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I tested inference times of Random Forests with different numbers of estimators to find the highest number that is still usable. Inference time with 10 estimators is approximately 4ms, which is viable even at a 10ms sampling period. Additionally, the ESP32 was set to a 160MHz clock speed; for the actual script I will be using 240MHz (a 50% increase), which will further decrease inference times.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: Random Forests are just ensembles of Decision Trees - if a Random Forest passes, Decision Trees will pass as well.&lt;/em&gt;&lt;/p&gt;
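&lt;p&gt;&lt;em&gt;A measurement loop like the following can be used for the timing test - this is a desktop sketch using &lt;code&gt;time.perf_counter&lt;/code&gt;, not the original script; on the device, MicroPython's &lt;code&gt;time.ticks_us&lt;/code&gt; would be the closest equivalent:&lt;/em&gt;&lt;/p&gt;

```python
import time

def time_inference(predict, inp, runs=100):
    """Average wall-clock inference time in ms over `runs` calls."""
    t0 = time.perf_counter()
    for _ in range(runs):
        predict(inp)
    return (time.perf_counter() - t0) * 1000.0 / runs
```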
&lt;h3&gt;
  
  
  Optimizing models
&lt;/h3&gt;


&lt;p&gt;&lt;strong&gt;Considered optimization&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Optimizing the number of estimators&lt;/p&gt;

&lt;p&gt;The number of estimators must be kept low - ideally between 3 and 5 - because of inference-time constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Optimizing the number of collected inputs&lt;/p&gt;

&lt;p&gt;The X, Y, Z acceleration signals must be collected for a different part of the application. I considered creating a combination of the acceleration signals and 1 or 2 angular velocity signals.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Optimizing sampling rate&lt;/p&gt;

&lt;p&gt;A sampling rate of 100Hz might be overkill for the application, and based on the evaluation results it doesn't offer any benefit over a 50Hz or 20Hz sampling rate. On the other hand, 10Hz might be too slow. Therefore, for the experiments I will be using 20, 25 and 50Hz sampling rates. &lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To train optimized models I used a grid search over the parameters below.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftexnf88rr24ikla1dfz3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftexnf88rr24ikla1dfz3.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Comparing the results
&lt;/h4&gt;

&lt;p&gt;In the charts you can see the results for all 3 gestures. Blue dots and lines represent the baseline (non-optimized) models, and red dots and lines the optimized models (results of the grid search). Horizontal lines in each of the charts are the means across all 3 gestures.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76zqik8qh1mht4fwi37z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76zqik8qh1mht4fwi37z.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The best model is ID #2. &lt;/p&gt;
&lt;h4&gt;
  
  
  Comparing model #2 to baseline model
&lt;/h4&gt;

&lt;p&gt;The baseline model was trained with all 6 signals at 50Hz with default settings. It's counterintuitive, but by reducing the number of signals it was possible to reduce the number of incorrect inferences. The same evaluation methods were used as described previously.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssp5m69zbkkbbzl2h2i6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssp5m69zbkkbbzl2h2i6.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although inference improved by reducing the number of signals and tuning the hyperparameters, it's far from perfect. There are two phenomena in the inference results - first, there are groups of identical CORRECT inferences for a single event, and secondly there's a trailing inference (mostly for circle gestures). The trailing inferences are due to residual movement at the end of the circle motion. While these might be correctly classified, they are unwanted and must be filtered out. Ideally, only one correct inference per event is sent to the REST API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fco1qrllzyis62lipg7l7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fco1qrllzyis62lipg7l7.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Debouncing inference results
&lt;/h3&gt;

&lt;p&gt;I am assuming models have a window around the 'true' center of a movement - meaning models will make inferences a few ms before and after the 'true' movement point. Additionally, there are incorrect 'trailing' inferences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo2ql2wj0ldp7k2rcxtu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo2ql2wj0ldp7k2rcxtu.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My debounce implementation is based on two conditions. One of them compares the &lt;strong&gt;time difference between the first and last inference&lt;/strong&gt; in an inference buffer ('Circle', 'X', and 'Y' inference results are added to the inference buffer). The other &lt;strong&gt;evaluates the number of inferences in the inference buffer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For the window around the 'true' movement I am assuming 200ms, which practically allows 9 inferences at a 50Hz sampling rate. Therefore, the inference buffer must contain at least 9 values.&lt;/p&gt;

&lt;p&gt;The time difference threshold is set to 450ms; after experimentation this worked best at a 50Hz sampling rate. It filtered out trailing inferences of the 'Circle' gesture while still detecting 'X' and 'Y' gestures. Values above 450ms were unable to detect them, while values below 400ms classified 'trailing' inferences as separate gestures (often of an incorrect type).&lt;/p&gt;

&lt;p&gt;If the above conditions are met, the &lt;strong&gt;most frequent value of the first 9 elements in the inference buffer is returned as the final inference result&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: This is still work in progress and I am thinking about smarter re-implementations.&lt;/em&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Debouncing implementation
&lt;/h4&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
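&lt;p&gt;&lt;em&gt;A minimal sketch of the debounce logic described above - the 450ms span and 9-inference minimum follow the text, but the exact buffering and flush behaviour is my interpretation, not the original implementation:&lt;/em&gt;&lt;/p&gt;

```python
from collections import Counter

def debounce(events, span_ms=450, min_count=9):
    """Debounce raw (timestamp_ms, label) inferences into final results.

    Buffers incoming inferences; once the buffer spans at least span_ms
    and holds at least min_count entries, emits the most frequent label
    among the first min_count entries and starts a fresh buffer.
    """
    buffer = []   # list of (timestamp_ms, label)
    results = []
    for t, label in events:
        buffer.append((t, label))
        if len(buffer) >= min_count and buffer[-1][0] - buffer[0][0] >= span_ms:
            first = [lab for _, lab in buffer[:min_count]]
            results.append(Counter(first).most_common(1)[0][0])
            buffer = []  # start over for the next event
    return results
```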


&lt;h4&gt;
  
  
  Comparing raw and debounced inference results
&lt;/h4&gt;

&lt;p&gt;The results are cleaner, but there's still room for improvement - there should be only one inference per event in the 'Circle' eval data, and all events should be picked up in 'X' and 'Y'.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Circle&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0crjmgaxx0hbrhwrpo1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0crjmgaxx0hbrhwrpo1.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;X&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgfgzvf69a307t2bf2kbq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgfgzvf69a307t2bf2kbq.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Y&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgeyjsma33ymojm7zu0t0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgeyjsma33ymojm7zu0t0.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In general the results look better than those produced by baseline models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inference on ESP32
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2Frender1622581364829.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Ftinyml-article-gifs%2Frender1622581364829.gif" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Future tweaks
&lt;/h2&gt;

&lt;p&gt;I am already thinking about how to tweak this project to achieve faster classification and more accurate results. Here's a couple of ideas:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Improve model evaluation by labelling validation data or by designing better evaluation methods.&lt;/li&gt;
&lt;li&gt;Implement feature extraction in addition to time-series data to (possibly) achieve better inference results.&lt;/li&gt;
&lt;li&gt;Implement async writes to DB on backend. Shorter response time -&amp;gt; shorter blocking. *MPY requests module implementation does not yet support async.&lt;/li&gt;
&lt;li&gt;Replace HTTP requests (does not support async) with MQTT (supports async)&lt;/li&gt;
&lt;li&gt;Implement digital signal processing methods to smooth out signals.&lt;/li&gt;
&lt;li&gt;Improve data handling - memory allocation errors with 900 data points.&lt;/li&gt;
&lt;li&gt;Compare evaluation results to dedicated time-series models  and neural networks.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;To conclude - it is clearly possible to classify gestures on an ESP32 microcontroller using standard machine learning algorithms and MicroPython, but some corners need to be cut. Among others, time-series data must be tabularized and the highest possible sampling rate is 100Hz (with the current setup).&lt;/p&gt;

&lt;h2&gt;
  
  
  Future scope
&lt;/h2&gt;

&lt;p&gt;While working on this project I found many interesting sources and projects implementing TinyML with TensorFlow Lite Micro, deepC and similar. Next, I'd like to explore implementing gesture classification using neural networks to compare the results between standard ML and DL. &lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Reach out if you have any questions or suggestions. 👏&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Repos
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tkeyo/tinyml-esp-data" rel="noopener noreferrer"&gt;Data &amp;amp; ML notebooks&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/tkeyo/tinyml-esp" rel="noopener noreferrer"&gt;ESP32 TinyML&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/tkeyo/tinyml-be" rel="noopener noreferrer"&gt;Golang REST API&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: Code is work-in-progress.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>iot</category>
      <category>micropython</category>
      <category>esp32</category>
    </item>
  </channel>
</rss>
