<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Austin Deyan</title>
    <description>The latest articles on Forem by Austin Deyan (@austin_deyan_6c9b2445aed6).</description>
    <link>https://forem.com/austin_deyan_6c9b2445aed6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3616341%2F630eda26-d8e2-4bef-b04b-ced93e823573.png</url>
      <title>Forem: Austin Deyan</title>
      <link>https://forem.com/austin_deyan_6c9b2445aed6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/austin_deyan_6c9b2445aed6"/>
    <language>en</language>
    <item>
      <title>How I Built a Full-Stack ML App (and Fixed a 3GB Docker Image) 🐳</title>
      <dc:creator>Austin Deyan</dc:creator>
      <pubDate>Thu, 01 Jan 2026 17:05:26 +0000</pubDate>
      <link>https://forem.com/austin_deyan_6c9b2445aed6/how-i-built-a-full-stack-ml-app-and-fixed-a-3gb-docker-image-44ck</link>
      <guid>https://forem.com/austin_deyan_6c9b2445aed6/how-i-built-a-full-stack-ml-app-and-fixed-a-3gb-docker-image-44ck</guid>
      <description>&lt;p&gt;Most Machine Learning tutorials have a fatal flaw: They stop at the Notebook.&lt;/p&gt;

&lt;p&gt;You train a model, get a nice accuracy score, and then... nothing. The model sits in a .ipynb file gathering digital dust.&lt;/p&gt;

&lt;p&gt;I wanted to change that. I recently built an end-to-end Customer Conversion System that takes raw data, predicts purchasing behavior, and triggers automated marketing actions via a live API.&lt;/p&gt;

&lt;p&gt;Here is the journey from "Localhost" to "Production"—including how I accidentally built a 3.3GB Docker container and how I slashed it by 65%.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏗️ The Tech Stack
&lt;/h2&gt;

&lt;p&gt;We aren't just fitting curves; we are shipping code.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Model:&lt;/strong&gt; XGBoost (Classification + Regression)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Backend:&lt;/strong&gt; Flask (Python)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Container:&lt;/strong&gt; Docker&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud:&lt;/strong&gt; Google Cloud Run (Serverless)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; Streamlit&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 1: The Logic (Beyond "0.85 Accuracy")
&lt;/h3&gt;

&lt;p&gt;A raw probability score isn't actionable. Marketing teams don't want to know "User 123 has a 0.82 score." They want to know what to do.&lt;/p&gt;

&lt;p&gt;I wrapped my XGBoost model in a "Decision Engine" function inside Python:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def determine_action(prob, days_to_buy, value):
    # High probability, High spender
    if prob &amp;gt; 0.8 and value &amp;gt; 2000:
        return f"VIP ALERT: Send Early Access Catalog. (Expected buy in {int(days_to_buy)} days)"

    # High probability, Low spender
    elif prob &amp;gt; 0.8:
        return "PROMO: Send 'Bundle Discount' to increase basket size."

    # Low probability, High historic value (Churn Risk)
    elif prob &amp;lt; 0.3 and value &amp;gt; 2000:
        return "RISK: Trigger Personal Outreach Call."

    else:
        return "NURTURE: Add to General Newsletter."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the API returns business strategy, not just math.&lt;/p&gt;
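&lt;p&gt;A quick smoke test that hits each branch once (the function is copied from above so the snippet runs standalone):&lt;/p&gt;

```python
# Re-stating the decision engine from the post so this snippet is self-contained
def determine_action(prob, days_to_buy, value):
    # High probability, High spender
    if prob > 0.8 and value > 2000:
        return f"VIP ALERT: Send Early Access Catalog. (Expected buy in {int(days_to_buy)} days)"
    # High probability, Low spender
    elif prob > 0.8:
        return "PROMO: Send 'Bundle Discount' to increase basket size."
    # Low probability, High historic value (Churn Risk)
    elif prob < 0.3 and value > 2000:
        return "RISK: Trigger Personal Outreach Call."
    else:
        return "NURTURE: Add to General Newsletter."

# One call per branch
print(determine_action(0.9, 4.2, 2500))   # VIP branch
print(determine_action(0.85, 10, 300))    # PROMO branch
print(determine_action(0.1, 90, 5000))    # RISK branch
print(determine_action(0.5, 30, 300))     # NURTURE branch
```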

&lt;h3&gt;
  
  
  Phase 2: The Docker Nightmare 🐳
&lt;/h3&gt;

&lt;p&gt;This was the biggest hurdle. I wrote a standard Dockerfile to wrap up my Flask API.&lt;/p&gt;

&lt;p&gt;I ran docker build, went to grab coffee, came back, and saw this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Successfully built...
Image size: 3.36 GB
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;3.36 GB. For a simple API? That’s unacceptable. It makes deployment slow and storage expensive.&lt;/p&gt;
&lt;h3&gt;
  
  
  🕵️‍♂️ The Investigation
&lt;/h3&gt;

&lt;p&gt;I ran a deep scan inside the container to see where the fat was hiding:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm my-app du -ah /usr/local/lib/python3.9/site-packages | sort -rh | head -n 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output was shocking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;900MB+ in &lt;code&gt;nvidia/&lt;/code&gt; drivers.&lt;/li&gt;
&lt;li&gt;1GB+ in my local &lt;code&gt;.venv&lt;/code&gt; folder that I accidentally copied over.&lt;/li&gt;
&lt;/ul&gt;
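&lt;p&gt;If you'd rather run the "where is the fat hiding" scan in Python, a stdlib-only equivalent of that &lt;code&gt;du | sort | head&lt;/code&gt; pipeline might look like this (the &lt;code&gt;site-packages&lt;/code&gt; path in the usage comment is an assumption; adjust it to your image):&lt;/p&gt;

```python
import os

def dir_sizes(root):
    """Total bytes per immediate subdirectory of `root`, largest first."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # skip broken symlinks and vanished files
        sizes[entry.name] = total
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical usage inside the container:
# for name, size in dir_sizes("/usr/local/lib/python3.9/site-packages")[:10]:
#     print(f"{size / 1e6:8.1f} MB  {name}")
```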

&lt;h3&gt;
  
  
  🛠️ The Fixes
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The &lt;code&gt;.dockerignore&lt;/code&gt; file.&lt;/strong&gt; I was lazy and didn't create a &lt;code&gt;.dockerignore&lt;/code&gt; file, so Docker copied my local virtual environment (&lt;code&gt;.venv&lt;/code&gt;), Git history, and raw data into the image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Added &lt;code&gt;.venv&lt;/code&gt;, &lt;code&gt;.git&lt;/code&gt;, and &lt;code&gt;data/&lt;/code&gt; to &lt;code&gt;.dockerignore&lt;/code&gt;.&lt;/p&gt;
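&lt;p&gt;For reference, the resulting &lt;code&gt;.dockerignore&lt;/code&gt; looks roughly like this (the last two entries are extra suggestions on top of the fix above, not something the build strictly required):&lt;/p&gt;

```
# .dockerignore — keep local-only artifacts out of the build context
.venv/
.git/
data/
__pycache__/
*.ipynb
```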

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;The XGBoost/NVIDIA trap.&lt;/strong&gt; It turns out that &lt;code&gt;pip install xgboost&lt;/code&gt; (latest version) often bundles massive NVIDIA CUDA libraries, even if you are only running on a CPU.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; I pinned the version to a lighter release in &lt;code&gt;requirements.txt&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;xgboost==1.7.6&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The Result: The image dropped from 3.36GB -&amp;gt; 1.2GB. Much better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3: Serverless Deployment (Google Cloud Run)
&lt;/h3&gt;

&lt;p&gt;I love Cloud Run for side projects. You give it a container, and it gives you an HTTPS URL. It scales to zero when no one is using it, meaning it costs $0/month for low traffic.&lt;/p&gt;

&lt;p&gt;Deploying was just three commands:&lt;/p&gt;


&lt;h3&gt;
  
  
  1. Tag the image
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;docker tag conversion-api gcr.io/my-project/conversion-api&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Push to Google Container Registry
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;docker push gcr.io/my-project/conversion-api&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Deploy
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;gcloud run deploy conversion-service --image gcr.io/my-project/conversion-api --platform managed&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Boom. A live API endpoint accessible from anywhere in the world.&lt;/p&gt;
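&lt;p&gt;Calling the live endpoint needs nothing beyond the standard library. The URL and payload fields below are placeholders (your Cloud Run URL and feature names will differ):&lt;/p&gt;

```python
import json
import urllib.request

# Hypothetical Cloud Run URL and input schema
API_URL = "https://conversion-service-xxxx.a.run.app/predict"
payload = {"prob_features": [0.4, 1.2, 3.0], "customer_value": 2500}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```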

&lt;h2&gt;
  
  
  Phase 4: The Frontend
&lt;/h2&gt;

&lt;p&gt;To make this usable for non-technical users, I threw together a Streamlit dashboard in about 50 lines of Python.&lt;/p&gt;

&lt;p&gt;It connects to the Cloud Run API and provides a UI for testing customer profiles.&lt;/p&gt;

&lt;h3&gt;
  
  
  📝 Key Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ML isn't done until it's deployed.&lt;/strong&gt; A model in a notebook delivers zero value.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Watch your dependencies.&lt;/strong&gt; &lt;code&gt;pip install&lt;/code&gt; is dangerous if you don't check what's being installed. That single XGBoost line cost me 1GB of space.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Context matters.&lt;/strong&gt; Transforming a probability score into a "Next Best Action" makes your model 10x more valuable to stakeholders.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Have you ever struggled with massive Docker images in Python? Let me know in the comments!&lt;/p&gt;

</description>
      <category>python</category>
      <category>docker</category>
      <category>datascience</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Zero-to-Scale ML: Deploying ONNX Models on Kubernetes with FastAPI and HPA</title>
      <dc:creator>Austin Deyan</dc:creator>
      <pubDate>Mon, 15 Dec 2025 18:42:28 +0000</pubDate>
      <link>https://forem.com/austin_deyan_6c9b2445aed6/zero-to-scale-ml-deploying-onnx-models-on-kubernetes-with-fastapi-and-hpa-l78</link>
      <guid>https://forem.com/austin_deyan_6c9b2445aed6/zero-to-scale-ml-deploying-onnx-models-on-kubernetes-with-fastapi-and-hpa-l78</guid>
      <description>&lt;p&gt;The path to scalable ML deployment requires high-performance APIs and robust orchestration. This post walks through setting up a local, highly available, and auto-scaling inference service using &lt;strong&gt;FastAPI&lt;/strong&gt; for speed and &lt;strong&gt;Kind&lt;/strong&gt; for Kubernetes orchestration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 1: The FastAPI Inference Service
&lt;/h2&gt;

&lt;p&gt;Our Python service handles ONNX model inference. The critical component for K8s stability is the &lt;code&gt;/health&lt;/code&gt; endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python

# app.py snippet
# ... model loading logic ...

@app.get("/health")
def health_check():
    # K8s Probes will hit this endpoint frequently
    return {"status": "ok", "model_loaded": True}

# ... /predict endpoint ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Phase 2: Docker and Kubernetes Deployment
&lt;/h2&gt;

&lt;p&gt;After building the image (&lt;code&gt;clothing-classifier:latest&lt;/code&gt;) and loading it into Kind, we define the Deployment. Note the crucial resource constraints and probes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;YAML

# deployment.yaml (Snippet focusing on probes and resources)
        resources:
          requests:
            cpu: "250m"  # For scheduling
            memory: "500Mi"
          limits:
            cpu: "500m"  # To prevent monopolizing the node
            memory: "1Gi"
        livenessProbe:
          httpGet: {path: /health, port: 8000}
          initialDelaySeconds: 5
        readinessProbe:
          httpGet: {path: /health, port: 8000}
          initialDelaySeconds: 5 # Gives time for the ONNX model to load
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Phase 3: Implementing Horizontal Pod Autoscaler (HPA)
&lt;/h2&gt;

&lt;p&gt;Scalability is handled by the HPA, which requires the Metrics Server to be running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;YAML

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: clothing-classifier-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: clothing-classifier-deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # Scale up if CPU exceeds 50%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: Under load, the HPA dynamically adjusts replica count. This is the definition of &lt;strong&gt;elastic, cost-effective MLOps&lt;/strong&gt;.&lt;/p&gt;
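&lt;p&gt;Under the hood, the HPA's scaling rule is simple arithmetic: &lt;code&gt;desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)&lt;/code&gt;, clamped to the min/max bounds. A sketch with the values from the manifest above:&lt;/p&gt;

```python
import math

def desired_replicas(current, utilization, target=50, lo=2, hi=5):
    """Kubernetes HPA rule: ceil(current * utilization/target), clamped to [lo, hi]."""
    want = math.ceil(current * utilization / target)
    return max(lo, min(hi, want))

print(desired_replicas(2, 120))  # heavy load: scale out
print(desired_replicas(5, 10))   # idle: fall back to the floor
```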

&lt;p&gt;Read the full guide &lt;a href="https://medium.com/@meediax.digital/deploying-machine-learning-models-on-kubernetes-a-practical-guide-with-fastapi-docker-and-kind-048fdf1483f4" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're deploying any Python API, adopting this pattern for resource management and scaling will save you major headaches down the road.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>kubernetes</category>
      <category>fastapi</category>
      <category>ai</category>
    </item>
    <item>
      <title>Serverless Deep Learning: From Notebook to Production with AWS Lambda</title>
      <dc:creator>Austin Deyan</dc:creator>
      <pubDate>Mon, 08 Dec 2025 10:02:25 +0000</pubDate>
      <link>https://forem.com/austin_deyan_6c9b2445aed6/serverless-deep-learning-from-notebook-to-production-with-aws-lambda-3386</link>
      <guid>https://forem.com/austin_deyan_6c9b2445aed6/serverless-deep-learning-from-notebook-to-production-with-aws-lambda-3386</guid>
      <description>&lt;p&gt;Training a model in a Jupyter Notebook is satisfying. But deploying it? That's where the headaches usually start. Today, I'm going to show you how to deploy a Keras image classifier using &lt;strong&gt;AWS Lambda&lt;/strong&gt; and &lt;strong&gt;TensorFlow Lite&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Serverless?
&lt;/h2&gt;

&lt;p&gt;AWS Lambda is "Serverless," meaning you don't manage the OS or hardware. You just upload code. It's cheap because you only pay when your code runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Heavyweight Problem 🏋️
&lt;/h2&gt;

&lt;p&gt;Standard TensorFlow is huge (approx. 1.7 GB). If you try to shove this into a Lambda function, you'll run into storage issues and slow performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lightweight Solution ⚡
&lt;/h2&gt;

&lt;p&gt;We use TensorFlow Lite. It optimizes the model for inference (prediction) only, stripping out all the training logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: The Handler Code
&lt;/h2&gt;

&lt;p&gt;Your Python script needs a special function to handle the AWS event:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import tflite_runtime.interpreter as tflite
from keras_image_helper import create_preprocessor

# Load once at module level so warm Lambda invocations reuse the interpreter
interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
interpreter.allocate_tensors()

def lambda_handler(event, context):
    url = event['url']
    # ... preprocessing and inference logic ...
    return result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: The Dockerfile
&lt;/h2&gt;

&lt;p&gt;We use Docker to package our dependencies. Crucial Tip: When installing the TF-Lite runtime from a URL, ensure you use the &lt;code&gt;raw&lt;/code&gt; version of the link, or &lt;code&gt;pip&lt;/code&gt; will throw a &lt;code&gt;BadZipFile&lt;/code&gt; error.&lt;/p&gt;
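&lt;p&gt;A Dockerfile sketch of the setup (the layout is an assumption; swap in your Python version, and use the &lt;em&gt;raw&lt;/em&gt; wheel link for &lt;code&gt;tflite_runtime&lt;/code&gt; where the placeholder is):&lt;/p&gt;

```dockerfile
# Dockerfile sketch — assumed file layout, placeholder wheel URL
FROM public.ecr.aws/lambda/python:3.10

RUN pip install keras-image-helper
RUN pip install <raw-link-to-tflite-runtime-wheel>

COPY clothing-model.tflite .
COPY lambda_function.py .

CMD ["lambda_function.lambda_handler"]
```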

&lt;h2&gt;
  
  
  Step 3: Deploy with Serverless Framework
&lt;/h2&gt;

&lt;p&gt;Instead of clicking buttons in the AWS console, we can use a &lt;code&gt;serverless.yml&lt;/code&gt; file to describe our infrastructure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;service: clothing-model
provider:
  name: aws
  ecr:
    images:
      appimage:
        path: ./
functions:
  predict:
    image:
      name: appimage
    events:
      - http:
          path: predict
          method: post
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running &lt;code&gt;serverless deploy&lt;/code&gt; handles the Docker build, ECR upload, and Lambda creation automatically!&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>lambda</category>
      <category>tensorflow</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>PyTorch in Practice: Engineering a Custom CNN for Hair Texture Classification</title>
      <dc:creator>Austin Deyan</dc:creator>
      <pubDate>Mon, 01 Dec 2025 19:04:56 +0000</pubDate>
      <link>https://forem.com/austin_deyan_6c9b2445aed6/pytorch-in-practice-engineering-a-custom-cnn-for-hair-texture-classification-1b37</link>
      <guid>https://forem.com/austin_deyan_6c9b2445aed6/pytorch-in-practice-engineering-a-custom-cnn-for-hair-texture-classification-1b37</guid>
      <description>&lt;p&gt;In the current landscape of Computer Vision, the default move is often Transfer Learning—taking a massive model like ResNet50 and fine-tuning it. While effective, this often abstracts away the fundamental mechanics of how a network actually "sees" texture.&lt;/p&gt;

&lt;p&gt;For my latest project, I decided to build a Convolutional Neural Network (CNN) entirely from scratch using &lt;strong&gt;PyTorch&lt;/strong&gt;. My goal? To build a binary classifier capable of distinguishing between hair textures (e.g., &lt;strong&gt;Curly vs. Straight&lt;/strong&gt;) using the Kaggle Hair Type dataset.&lt;/p&gt;

&lt;p&gt;Here is a look under the hood of the architecture and the engineering decisions I made.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Data Pipeline: Why Augmentation Matters
&lt;/h2&gt;

&lt;p&gt;The input images were standardized to $200 \times 200$ pixels. However, training a model from scratch on a smaller dataset poses a high risk of overfitting—where the model memorizes the images rather than learning the features.&lt;/p&gt;

&lt;p&gt;To combat this, I engineered a robust training pipeline using &lt;code&gt;torchvision.transforms&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of feeding the model static images, I applied dynamic transformations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Random Rotations (50°):&lt;/strong&gt; To handle different head tilts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Random Resized Crop:&lt;/strong&gt; To force the model to look at different scales of the hair strands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal Flips:&lt;/strong&gt; To ensure directional invariance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Crucially, I kept the &lt;strong&gt;Test Set&lt;/strong&gt; deterministic (only resizing and normalizing) to ensure I had a stable benchmark for evaluation.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Architecture
&lt;/h2&gt;

&lt;p&gt;I opted for a lightweight, shallow architecture to test how much information could be extracted with minimal compute.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2cjh72zjj46lcou515ym.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2cjh72zjj46lcou515ym.jpeg" alt="Convolutional Neural Network" width="800" height="235"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; &lt;code&gt;(3, 200, 200)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature Extraction:&lt;/strong&gt; A generic convolutional layer (32 filters, $3\times3$ kernel) followed by ReLU activation and $2\times2$ Max Pooling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dimensionality Reduction:&lt;/strong&gt; A Flatten layer converting the 2D feature maps into a vector of over 313,000 features.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Classification Head:&lt;/strong&gt; A dense hidden layer (64 neurons) leading to a &lt;strong&gt;single output neuron&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. The "Binary" Nuance
&lt;/h2&gt;

&lt;p&gt;Since I designed this as a binary classifier, the output layer and loss function had to be paired perfectly.&lt;/p&gt;

&lt;p&gt;I used a &lt;strong&gt;Sigmoid&lt;/strong&gt; activation on the final neuron to squash the output between 0 and 1 (representing probability). Consequently, I utilized &lt;strong&gt;Binary Cross Entropy Loss (&lt;code&gt;BCELoss&lt;/code&gt;)&lt;/strong&gt; rather than the standard Cross Entropy used in multi-class problems.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The Classification Head
&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;99&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fc2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sigmoid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sigmoid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
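&lt;p&gt;For intuition, here is what &lt;code&gt;BCELoss&lt;/code&gt; computes on a single prediction, written out with plain &lt;code&gt;math&lt;/code&gt; (no PyTorch needed):&lt;/p&gt;

```python
import math

def bce(p, y):
    """Binary cross entropy for one sigmoid output p and true label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confident and correct -> small loss; confident and wrong -> large loss
print(round(bce(0.9, 1), 4))  # 0.1054
print(round(bce(0.9, 0), 4))  # 2.3026
```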



&lt;h2&gt;
  
  
  4. Training for Reproducibility
&lt;/h2&gt;

&lt;p&gt;One of the biggest challenges in ML engineering is reproducibility. To ensure my results weren't just a fluke of random initialization, I strictly seeded the environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;SEED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SEED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;manual_seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SEED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backends&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cudnn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deterministic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used &lt;strong&gt;Stochastic Gradient Descent (SGD)&lt;/strong&gt; with a learning rate of 0.002 and momentum of 0.8. I tracked the &lt;strong&gt;Median Training Accuracy&lt;/strong&gt; across epochs to filter out noise and the &lt;strong&gt;Mean Test Loss&lt;/strong&gt; to monitor generalization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Building this from scratch reinforced several core Deep Learning concepts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Input math is critical:&lt;/strong&gt; Calculating the exact feature map size after convolution and pooling is necessary to line up the Linear layers.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Data is king:&lt;/strong&gt; The model performance improved significantly after introducing the RandomResizedCrop augmentation.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Simplicity works:&lt;/strong&gt; You don't always need a Transformer. For distinct textural differences, a simple CNN is fast, lightweight, and effective.&lt;/li&gt;
&lt;/ol&gt;
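&lt;p&gt;The "input math" in takeaway 1 is worth spelling out for this exact network: a 3×3 valid convolution then 2×2 max pooling on a 200×200 input lands precisely on the &lt;code&gt;32 * 99 * 99&lt;/code&gt; used in &lt;code&gt;fc1&lt;/code&gt;:&lt;/p&gt;

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (the same rule covers pooling)."""
    return (size + 2 * padding - kernel) // stride + 1

size = conv_out(200, kernel=3)             # 3x3 conv, no padding -> 198
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pool -> 99
flattened = 32 * size * size               # 32 filters -> 313,632 features

print(size, flattened)  # 99 313632
```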

&lt;p&gt;#MachineLearning #PyTorch #ComputerVision #DeepLearning #DataScience #CNN&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>datascience</category>
      <category>github</category>
    </item>
    <item>
      <title>🌾 How I Built &amp; Deployed a Crop Yield Prediction API in the Cloud</title>
      <dc:creator>Austin Deyan</dc:creator>
      <pubDate>Mon, 17 Nov 2025 23:15:51 +0000</pubDate>
      <link>https://forem.com/austin_deyan_6c9b2445aed6/how-i-built-deployed-a-crop-yield-prediction-api-in-the-cloud-o8h</link>
      <guid>https://forem.com/austin_deyan_6c9b2445aed6/how-i-built-deployed-a-crop-yield-prediction-api-in-the-cloud-o8h</guid>
      <description>&lt;p&gt;Hey devs! 👋&lt;/p&gt;

&lt;p&gt;I just wrapped up a super interesting project and wanted to share the entire journey—wins, fails, and everything in between.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;An AI-powered crop yield prediction system that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predicts harvest yields with an R² of 0.91&lt;/li&gt;
&lt;li&gt;Serves predictions via REST API&lt;/li&gt;
&lt;li&gt;Runs on Google Cloud Run&lt;/li&gt;
&lt;li&gt;Has a beautiful web UI&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Backend:  Python + Flask + Scikit-learn
DevOps:   Docker + Google Cloud Run
Frontend: Vanilla JS + HTML/CSS (keeping it simple!)
ML:       Gradient Boosting Regressor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Journey (Story Time! 📖)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Week 1: Data Exploration Hell 😅
&lt;/h3&gt;

&lt;p&gt;Started with messy agricultural data. Spent days just cleaning and understanding it. Pro tip: ALWAYS look at your data distributions first!&lt;/p&gt;
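&lt;p&gt;Even without pandas, a first look at a distribution is a few lines of stdlib (toy yield values here, not the project's data):&lt;/p&gt;

```python
import statistics

# Toy yield values (tons/hectare) standing in for a real column
yields = [2.1, 2.4, 2.2, 2.5, 9.8, 2.3, 2.6, 2.2]

print("mean  :", round(statistics.mean(yields), 2))
print("median:", round(statistics.median(yields), 2))
print("stdev :", round(statistics.stdev(yields), 2))
# A mean far above the median is a quick smell test for outliers like 9.8
```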

&lt;h3&gt;
  
  
  Week 2: Model Selection Drama 🤖
&lt;/h3&gt;

&lt;p&gt;Trained 7 models. Linear Regression? Terrible. Decision Trees? Overfitting. Random Forest? Better but slow. Gradient Boosting? &lt;em&gt;Chef's kiss&lt;/em&gt; 👌&lt;/p&gt;

&lt;p&gt;Here's the comparison:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;Model&lt;/span&gt;               &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="err"&gt;²&lt;/span&gt; &lt;span class="n"&gt;Score&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;MAE&lt;/span&gt;
&lt;span class="o"&gt;--------------------|----------|--------&lt;/span&gt;
&lt;span class="n"&gt;Gradient&lt;/span&gt; &lt;span class="n"&gt;Boosting&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.913&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.31&lt;/span&gt;
&lt;span class="n"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;Forest&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.895&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.35&lt;/span&gt;
&lt;span class="n"&gt;Linear&lt;/span&gt; &lt;span class="n"&gt;Regression&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.623&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mf"&gt;0.89&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
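&lt;p&gt;If the metrics in that table feel fuzzy, both are a few lines of arithmetic (sketched with toy numbers, not the project's data):&lt;/p&gt;

```python
def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """R²: 1 minus residual variance over total variance."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [3.0, 4.0, 5.0, 6.0]
y_pred = [2.8, 4.1, 5.3, 5.9]

print(round(mae(y_true, y_pred), 3))  # 0.175
print(round(r2(y_true, y_pred), 3))   # 0.97
```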



&lt;h3&gt;
  
  
  Week 3: Docker Nightmares 🐳
&lt;/h3&gt;

&lt;p&gt;"Works on my machine" → Real problem.&lt;/p&gt;

&lt;p&gt;Issue #1: Model files not loading in container&lt;br&gt;
Solution: Load model at module level, not in &lt;code&gt;if __name__ == '__main__'&lt;/code&gt;&lt;/p&gt;
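&lt;p&gt;The module-level-load pattern in miniature (with a stand-in for the real pickle load):&lt;/p&gt;

```python
# app.py pattern: anything the WSGI server imports must run at module level
def load_model():
    # stand-in for something like joblib.load("model.pkl")
    return {"name": "gradient_boosting", "ready": True}

model = load_model()  # runs on import, so Gunicorn inside the container sees it

if __name__ == "__main__":
    # Only runs with `python app.py`; a WSGI server never enters this block,
    # which is why loading the model here broke inside the container
    print("dev server would start here")
```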

&lt;p&gt;Issue #2: CORS blocking requests&lt;br&gt;
Solution: &lt;code&gt;pip install flask-cors&lt;/code&gt; saved my life&lt;/p&gt;
&lt;h3&gt;
  
  
  Week 4: Cloud Deployment Victory! ☁️
&lt;/h3&gt;

&lt;p&gt;Google Cloud Run = Amazing for ML models&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serverless (scales to zero!)&lt;/li&gt;
&lt;li&gt;Easy Docker deployment&lt;/li&gt;
&lt;li&gt;Built-in HTTPS&lt;/li&gt;
&lt;li&gt;Pay per request&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Code Snippets
&lt;/h2&gt;

&lt;p&gt;Here's the prediction endpoint (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/predict&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Preprocessing
&lt;/span&gt;    &lt;span class="n"&gt;input_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;input_encoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_dummies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;input_scaled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scaler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_encoded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Prediction
&lt;/span&gt;    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_scaled&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;predicted_yield&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tons_per_hectare&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Biggest Learnings
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Data &amp;gt; Models&lt;/strong&gt;: Feature engineering mattered more than model selection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment is Hard&lt;/strong&gt;: Spend time on DevOps early&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UI Matters&lt;/strong&gt;: Built a simple HTML interface—users loved it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: Write it as you code, not after!&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;a href="https://medium.com/@meediax.digital/from-raw-data-to-production-building-a-crop-yield-prediction-system-ml-zoomcamp-project-772f0b597e12?postPublishedType=initial" rel="noopener noreferrer"&gt;More info on Medium&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Planning to add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Time-series forecasting&lt;/li&gt;
&lt;li&gt;[ ] Weather API integration&lt;/li&gt;
&lt;li&gt;[ ] Mobile app&lt;/li&gt;
&lt;li&gt;[ ] Model retraining pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Questions?
&lt;/h2&gt;

&lt;p&gt;Drop them in the comments! Happy to discuss anything about ML deployment, Docker, or agricultural AI! 👇&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>devops</category>
      <category>api</category>
    </item>
  </channel>
</rss>
