<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: marcusmayo</title>
    <description>The latest articles on Forem by marcusmayo (@marcusmayo).</description>
    <link>https://forem.com/marcusmayo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3549315%2Fd732ee96-0496-413e-847b-5957b8567d6d.png</url>
      <title>Forem: marcusmayo</title>
      <link>https://forem.com/marcusmayo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/marcusmayo"/>
    <language>en</language>
    <item>
      <title>💭 PromptOps Policy Coach — From Metrics to Mechanisms You Can Trust</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Tue, 21 Oct 2025 16:13:09 +0000</pubDate>
      <link>https://forem.com/marcusmayo/promptops-policy-coach-from-metrics-to-mechanisms-you-can-trust-c5f</link>
      <guid>https://forem.com/marcusmayo/promptops-policy-coach-from-metrics-to-mechanisms-you-can-trust-c5f</guid>
      <description>&lt;p&gt;If you’ve ever tried to scale AI inside a big company, you already know the truth: models aren’t the hard part — &lt;strong&gt;trust is.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
And trust doesn’t show up because we ask for it; it shows up because we can &lt;strong&gt;measure&lt;/strong&gt; what’s happening and &lt;strong&gt;govern&lt;/strong&gt; how it behaves.&lt;/p&gt;

&lt;p&gt;Last week I shared &lt;em&gt;Why Metrics Matter&lt;/em&gt; — how velocity, predictability, and flow efficiency quietly fixed delivery pain on our AI teams.&lt;br&gt;&lt;br&gt;
Today I’m taking that one step further with &lt;strong&gt;PromptOps Policy Coach&lt;/strong&gt; — a platform that turns those same delivery insights into &lt;strong&gt;governable AI systems&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 This article is part of my &lt;strong&gt;Weekend AI Project Series&lt;/strong&gt;, where I turn weekend build ideas into production-ready AI systems.&lt;br&gt;&lt;br&gt;
👉 Read Part 1 — &lt;a href="https://dev.to/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1"&gt;Adventures in Vibe Coding&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;🎯 TL;DR&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🧠 &lt;strong&gt;What it is:&lt;/strong&gt; A policy Q&amp;amp;A system that runs one question through &lt;strong&gt;five&lt;/strong&gt; prompt frameworks and tracks cost/performance in real time.
&lt;/li&gt;
&lt;li&gt;💡 &lt;strong&gt;Why it exists:&lt;/strong&gt; To make AI &lt;strong&gt;consistent, explainable, and affordable&lt;/strong&gt; across HR/Legal/IT.
&lt;/li&gt;
&lt;li&gt;⚙️ &lt;strong&gt;What it proves:&lt;/strong&gt; Enterprise AI isn’t “just prompts.” It’s &lt;strong&gt;patterns + governance + metrics&lt;/strong&gt; working together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; CRAFT, CRISPE, Chain-of-Thought, Constitutional AI, ReAct&lt;br&gt;&lt;br&gt;
&lt;strong&gt;RAG:&lt;/strong&gt; Custom numpy-based retrieval&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; &amp;lt; $0.01/query (OpenAI GPT-4o-mini)&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Deploy:&lt;/strong&gt; Docker or Google Cloud Shell&lt;/p&gt;


&lt;h2&gt;💬 The backstory&lt;/h2&gt;

&lt;p&gt;In big orgs, three things kill AI rollouts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Inconsistent outputs&lt;/strong&gt; — same question, five answers.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runaway costs&lt;/strong&gt; — invisible API usage that eats budgets alive.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slow adoption&lt;/strong&gt; — heavy infra that scares off internal teams.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So I standardized how the AI &lt;strong&gt;reasons&lt;/strong&gt;, made sources explicit with &lt;strong&gt;RAG&lt;/strong&gt;, and surfaced &lt;strong&gt;cost &amp;amp; performance&lt;/strong&gt; as first-class citizens. That became PromptOps.&lt;/p&gt;


&lt;h2&gt;⚙️ A quick tour&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1) One clean interface&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Ask a policy question. Pick a framework. Get the answer &lt;em&gt;and&lt;/em&gt; see what it cost. Simple and transparent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2) Five brains, one question&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ReAct&lt;/strong&gt; — think → act → observe
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRISPE&lt;/strong&gt; — helpful, human tone
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRAFT&lt;/strong&gt; — exec-level structure
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chain-of-Thought&lt;/strong&gt; — careful reasoning
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constitutional AI&lt;/strong&gt; — principles + self-checks&lt;/li&gt;
&lt;/ul&gt;
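The repo's actual prompt templates aren't reproduced in this post, but the "five brains, one question" idea can be sketched as a template registry keyed by framework name. The template wording below is illustrative, not the project's real prompts:

```python
# Hypothetical sketch: one question routed through five prompt frameworks.
# Template wording is illustrative, not the repo's actual prompts.
FRAMEWORKS = {
    "CRAFT": "Context: {context}\nRole: policy analyst\nAction: answer\nFormat: exec summary\nTask: {question}",
    "CRISPE": "Capacity: helpful HR assistant\nInsight: {context}\nStatement: {question}\nPersonality: warm, plain language",
    "Chain-of-Thought": "Using only this context:\n{context}\n\nQuestion: {question}\nThink step by step, then answer.",
    "Constitutional AI": "Principles: cite sources, admit uncertainty.\nContext: {context}\nQuestion: {question}\nAnswer, then self-check against the principles.",
    "ReAct": "Context: {context}\nQuestion: {question}\nLoop: Thought -> Action -> Observation, then give a final answer.",
}

def build_prompt(framework: str, question: str, context: str) -> str:
    """Render the chosen framework's template for a single query."""
    return FRAMEWORKS[framework].format(question=question, context=context)
```

The point of the registry shape is that every framework sees the identical question and retrieved context, so output differences are attributable to the framework alone.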

&lt;p&gt;&lt;strong&gt;3) RAG that’s boring on purpose&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Custom, numpy-based, dependency-light. Fast and deployable anywhere.&lt;/p&gt;
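The project's retrieval code isn't shown in this post; a minimal numpy-only sketch of the idea (normalize embeddings, rank chunks by cosine similarity) looks like this:

```python
import numpy as np

def top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k chunks most cosine-similar to the query.

    doc_matrix: (n_chunks, dim) chunk-embedding matrix; query_vec: (dim,).
    """
    doc_norms = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_norms @ q                      # cosine similarity per chunk
    return np.argsort(scores)[::-1][:k].tolist()
```

Nothing here needs a vector database: a few hundred policy chunks fit in one in-memory matrix, which is what keeps the image small.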

&lt;p&gt;&lt;strong&gt;4) Live metrics&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Tokens, cost, response time — per framework — because if you can’t &lt;em&gt;see&lt;/em&gt; it, you can’t &lt;em&gt;trust&lt;/em&gt; it.&lt;/p&gt;
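Per-query cost tracking is plain arithmetic once a price sheet is pinned. A sketch, where the per-million-token rates are assumptions to be checked against OpenAI's current pricing page:

```python
# Assumed GPT-4o-mini rates (USD per 1M tokens) -- verify against current pricing.
PRICE_IN, PRICE_OUT = 0.15, 0.60

def query_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one call, from the token counts the API reports back."""
    return prompt_tokens * PRICE_IN / 1e6 + completion_tokens * PRICE_OUT / 1e6

# e.g. a 1,200-token prompt with a 400-token answer stays well under $0.01
cost = query_cost(1200, 400)
```

Logging this per framework is what lets the dashboard compare the five approaches on cost as well as answer quality.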


&lt;h2&gt;🏗️ Architecture&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TB
    A[Company Docs] --&amp;gt; B[Chunk &amp;amp; Index]
    B --&amp;gt; C["Vector Search (numpy + embeddings)"]
    E[User Query] --&amp;gt; D[Multi-Framework Engine]
    C --&amp;gt; D
    D --&amp;gt; F[CRAFT]
    D --&amp;gt; G[CRISPE]
    D --&amp;gt; H[Chain-of-Thought]
    D --&amp;gt; I[Constitutional AI]
    D --&amp;gt; J[ReAct]
    F --&amp;gt; K[OpenAI GPT-4o-mini]
    G --&amp;gt; K
    H --&amp;gt; K
    I --&amp;gt; K
    J --&amp;gt; K
    K --&amp;gt; L[Response + Sources + Metrics]
    L --&amp;gt; M[Cost Tracking]
    L --&amp;gt; O[Dashboard]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;h2&gt;🧩 Engineering highlights&lt;/h2&gt;

&lt;p&gt;✅ Custom RAG &amp;gt; heavy stacks — smaller image, fewer headaches, clearer control.&lt;/p&gt;

&lt;p&gt;✅ Cloud Shell optimized — consistent demo environment (no local setup drama).&lt;/p&gt;

&lt;p&gt;✅ OpenAI v1 client — the current SDK surface, paired with GPT-4o-mini for low per-query cost.&lt;/p&gt;

&lt;p&gt;✅ Defensive code — guarded inputs and fallbacks, so demos degrade gracefully instead of crashing.&lt;/p&gt;

&lt;p&gt;Benchmarks: 2.4–8.4s response | $0.0001–$0.0002/query | &amp;lt;200MB footprint.&lt;/p&gt;

&lt;h2&gt;🚀 What it means for enterprise teams&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;HR/IT/Legal → consistent answers with source links&lt;/li&gt;
&lt;li&gt;Finance → predictable usage and spend&lt;/li&gt;
&lt;li&gt;Compliance → logs and auditability&lt;/li&gt;
&lt;li&gt;Product → compare frameworks and ship what users prefer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a working prototype of how AI governance should feel — transparent, fast, dependable.&lt;/p&gt;

&lt;h2&gt;🛠️ Quick start&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Cloud Shell / Local&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/marcusmayo/machine-learning-portfolio.git
cd machine-learning-portfolio/prompt-ops-policy-coach
pip install -r requirements.txt
streamlit run app/enhanced_app.py \
  --server.port 8501 --server.address 0.0.0.0 \
  --browser.serverAddress localhost \
  --browser.gatherUsageStats false \
  --server.enableCORS false --server.enableXsrfProtection false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Docker&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t policy-coach .
docker run -d -p 8080:8080 --name policy-coach-prod --env-file .env policy-coach
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;🧭 What’s next&lt;/h2&gt;

&lt;p&gt;Framework marketplace · SSO/RBAC · QA suite for prompt consistency · Cost optimizer · Kubernetes scaling.&lt;/p&gt;

&lt;h2&gt;💬 Final thought&lt;/h2&gt;

&lt;p&gt;If &lt;em&gt;Why Metrics Matter&lt;/em&gt; was about measuring, PromptOps is about governing.&lt;br&gt;
Measure → improve. Govern → trust.&lt;/p&gt;

&lt;p&gt;🧠 Read My AI Build Logs&lt;br&gt;
Medium → &lt;a href="https://medium.com/@mayo.marcus" rel="noopener noreferrer"&gt;https://medium.com/@mayo.marcus&lt;/a&gt;&lt;br&gt;
Dev.to → &lt;a href="https://dev.to/marcusmayo"&gt;https://dev.to/marcusmayo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📇 Connect&lt;br&gt;
LinkedIn → &lt;a href="https://lnkd.in/e9CBVihC" rel="noopener noreferrer"&gt;https://lnkd.in/e9CBVihC&lt;/a&gt;&lt;br&gt;
X / Twitter → &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;https://x.com/MarcusMayoAI&lt;/a&gt;&lt;br&gt;
Email → &lt;a href="//marcusmayo.ai@gmail.com"&gt;marcusmayo.ai@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💻 Portfolio&lt;br&gt;
Part 1 — &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio" rel="noopener noreferrer"&gt;https://github.com/marcusmayo/machine-learning-portfolio&lt;/a&gt;&lt;br&gt;
Part 2 — &lt;a href="https://github.com/marcusmayo/ai-ml-portfolio-2" rel="noopener noreferrer"&gt;https://github.com/marcusmayo/ai-ml-portfolio-2&lt;/a&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>devops</category>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>From ML Beginner to Production Engineer: How I’m Leveling Up My AI Projects</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Sun, 12 Oct 2025 16:26:59 +0000</pubDate>
      <link>https://forem.com/marcusmayo/from-ml-beginner-to-production-engineer-how-im-leveling-up-my-ai-projects-2glg</link>
      <guid>https://forem.com/marcusmayo/from-ml-beginner-to-production-engineer-how-im-leveling-up-my-ai-projects-2glg</guid>
      <description>&lt;p&gt;🎯 &lt;em&gt;From training toy models to shipping real ML systems — here’s what that journey really looks like.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most people start their ML learning journey in Jupyter notebooks. But when you want your model to serve real users, things get serious — and a lot more complex.&lt;/p&gt;

&lt;p&gt;Here’s how the levels break down 👇&lt;/p&gt;




&lt;h3&gt;🧩 Level 1 – Learning the Basics&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Clean datasets (Kaggle, UCI)&lt;/li&gt;
&lt;li&gt;Jupyter notebooks &amp;amp; visualization&lt;/li&gt;
&lt;li&gt;Simple metrics and evaluation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;⚙️ Level 2 – Professional Data Science&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Handling messy, real-world data&lt;/li&gt;
&lt;li&gt;Organized code + config files&lt;/li&gt;
&lt;li&gt;Feature engineering &amp;amp; tuning&lt;/li&gt;
&lt;li&gt;Git for reproducibility&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;🚀 Level 3 – Machine Learning Engineering&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Containerized model APIs (Docker/FastAPI)&lt;/li&gt;
&lt;li&gt;MLflow for tracking + model registry&lt;/li&gt;
&lt;li&gt;CI/CD pipelines&lt;/li&gt;
&lt;li&gt;Monitoring &amp;amp; scaling on AWS/GCP&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;I'm documenting my path across these levels — moving from education to &lt;em&gt;execution&lt;/em&gt;.&lt;br&gt;&lt;br&gt;
The next phase: &lt;strong&gt;Level 4&lt;/strong&gt;, where models scale, retrain automatically, and support real users.&lt;/p&gt;




&lt;h2&gt;🧠 Read My AI Build Logs&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/marcusmayo"&gt;Weekend AI Project Series on Dev.to&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/marcusmayo" rel="noopener noreferrer"&gt;LinkedIn Articles&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;📫 Get In Touch&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LinkedIn:&lt;/strong&gt; &lt;a href="https://linkedin.com/in/marcusmayo" rel="noopener noreferrer"&gt;Connect with me&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;X / Twitter:&lt;/strong&gt; &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;@MarcusMayoAI&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Email:&lt;/strong&gt; &lt;a href="mailto:marcusmayo.ai@gmail.com"&gt;marcusmayo.ai@gmail.com&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Portfolio Part 1:&lt;/strong&gt; &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio" rel="noopener noreferrer"&gt;AI &amp;amp; MLOps Projects&lt;/a&gt;  &lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>mlops</category>
      <category>ai</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>🎙️ What Building the AI Interview Analyzer Taught Me About Production ML</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Sat, 11 Oct 2025 12:36:22 +0000</pubDate>
      <link>https://forem.com/marcusmayo/what-building-the-ai-interview-analyzer-taught-me-about-production-ml-26ij</link>
      <guid>https://forem.com/marcusmayo/what-building-the-ai-interview-analyzer-taught-me-about-production-ml-26ij</guid>
      <description>&lt;p&gt;After shipping the AI Interview Analyzer on GCP&lt;br&gt;
, I realized that production-ready AI isn’t about adding more models — it’s about orchestrating them efficiently.&lt;/p&gt;

&lt;p&gt;This build used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI + Whisper for fast audio transcription&lt;/li&gt;
&lt;li&gt;RoBERTa + Toxic-BERT + mDeBERTa for tone and competency scoring&lt;/li&gt;
&lt;li&gt;Gemini 2.0 Flash for contextual feedback&lt;/li&gt;
&lt;li&gt;Compute Engine to handle large audio workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It taught me three truths about real ML deployment:&lt;br&gt;
1️⃣ Infrastructure matters more than model size.&lt;br&gt;
2️⃣ Feedback loops make AI useful, not just functional.&lt;br&gt;
3️⃣ Performance visibility (CloudWatch / GCP Monitoring) builds trust.&lt;/p&gt;

&lt;p&gt;Full article 👇&lt;br&gt;
🔗 &lt;a href="https://dev.to/marcusmayo/building-an-ai-powered-interview-analyzer-on-gcp-31ia"&gt;https://dev.to/marcusmayo/building-an-ai-powered-interview-analyzer-on-gcp-31ia&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📢 Follow my AI builds &amp;amp; insights:&lt;br&gt;
 🐦 &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;@MarcusMayoAI&lt;/a&gt;&lt;br&gt;
 | 🧠 &lt;a href="https://dev.to/marcusmayo"&gt;Dev.to/marcusmayo&lt;/a&gt;&lt;br&gt;
 | 💻 &lt;a href="https://lnkd.in/ezrSSDUR" rel="noopener noreferrer"&gt;GitHub/marcusmayo&lt;/a&gt;&lt;br&gt;
 | 💼 &lt;a href="https://lnkd.in/eNSvdtpH" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>architecture</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>🧠 The Simplest Neural Network That Actually Works</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Fri, 10 Oct 2025 21:59:09 +0000</pubDate>
      <link>https://forem.com/marcusmayo/the-simplest-neural-network-that-actually-works-3j54</link>
      <guid>https://forem.com/marcusmayo/the-simplest-neural-network-that-actually-works-3j54</guid>
      <description>&lt;p&gt;Before tackling multi-layer or transformer architectures, I built the simplest neural network I could — a single-layer perceptron to classify 0s and 1s from the MNIST dataset.&lt;/p&gt;

&lt;p&gt;Project Highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Framework: TensorFlow + Keras&lt;/li&gt;
&lt;li&gt;Architecture: 1 Dense layer, 1 neuron, sigmoid activation&lt;/li&gt;
&lt;li&gt;Optimizer: SGD&lt;/li&gt;
&lt;li&gt;Accuracy: 99.9% test accuracy&lt;/li&gt;
&lt;li&gt;Dataset: MNIST (filtered to digits 0 and 1)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key Takeaway:&lt;br&gt;
Even a one-layer model can teach core ML principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data normalization&lt;/li&gt;
&lt;li&gt;Gradient descent&lt;/li&gt;
&lt;li&gt;Binary cross-entropy&lt;/li&gt;
&lt;li&gt;Evaluation with precision, recall, and F1&lt;/li&gt;
&lt;/ul&gt;
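The notebook itself uses TensorFlow + Keras; the same single-neuron idea can be sketched framework-free in numpy to show exactly what is being learned. The toy 2-D clusters below stand in for the filtered MNIST digits:

```python
import numpy as np

# Single neuron with sigmoid activation, trained by gradient descent on
# binary cross-entropy -- a framework-free sketch on toy separable data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(200):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # sigmoid output in (0, 1)
    grad = p - y                          # gradient of BCE w.r.t. the logit
    w -= lr * X.T @ grad / len(y)
    b -= lr * grad.mean()

accuracy = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
```

The `p - y` line is the whole trick: for sigmoid + cross-entropy the gradient with respect to the logit collapses to prediction minus label, which is why even a one-layer model demonstrates gradient descent cleanly.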

&lt;p&gt;Explore the notebook here 👇&lt;br&gt;
🔗 &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio/blob/main/simple-neural-network.ipynb" rel="noopener noreferrer"&gt;Simple Neural Network Project&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📢 Follow my AI builds &amp;amp; insights:&lt;br&gt;
 🐦 &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;@MarcusMayoAI&lt;/a&gt;&lt;br&gt;
 | 🧠 &lt;a href="https://dev.to/marcusmayo"&gt;Dev.to/marcusmayo&lt;/a&gt;&lt;br&gt;
 | 💻 &lt;a href="https://lnkd.in/ezrSSDUR" rel="noopener noreferrer"&gt;GitHub/marcusmayo&lt;/a&gt;&lt;br&gt;
 | 💼 &lt;a href="https://lnkd.in/eNSvdtpH" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>💡 What Serverless Design Taught Me About AI Cost Optimization</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Thu, 09 Oct 2025 17:36:46 +0000</pubDate>
      <link>https://forem.com/marcusmayo/what-serverless-design-taught-me-about-ai-cost-optimization-24f5</link>
      <guid>https://forem.com/marcusmayo/what-serverless-design-taught-me-about-ai-cost-optimization-24f5</guid>
      <description>&lt;p&gt;Building the Edenred Invoice Assistant&lt;br&gt;
 taught me something simple but powerful:&lt;/p&gt;

&lt;p&gt;Cost optimization is an architecture decision, not a finance one.&lt;/p&gt;

&lt;p&gt;When you design AI systems that know when not to run — like spinning down SageMaker endpoints when idle — you turn efficiency into intelligence.&lt;/p&gt;

&lt;p&gt;This approach saved 90% in AWS cost and improved uptime reliability.&lt;/p&gt;
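The decision behind that number is plain arithmetic: compare an always-on endpoint's monthly bill to paying only for the queries that fall through the fallback layer. The instance rate below is an assumed placeholder, not the project's actual bill:

```python
# Assumed numbers for illustration -- substitute your real rates.
ENDPOINT_HOURLY = 0.23          # small always-on inference instance, USD/hr
HOURS_PER_MONTH = 730

def monthly_cost(queries: int, fallback_ratio: float, per_inference: float = 0.0001) -> float:
    """Cost when the endpoint is torn down: only non-fallback queries
    pay per-inference; pattern-matched fallback answers are near-free."""
    return queries * (1 - fallback_ratio) * per_inference

always_on = ENDPOINT_HOURLY * HOURS_PER_MONTH            # idle or not, you pay
with_fallback = monthly_cost(10_000, fallback_ratio=0.95)
savings = 1 - with_fallback / always_on
```

Under these assumed rates the fallback design clears the ~90% savings mark easily, which is the architectural point: the model runs only when it has to.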

&lt;p&gt;You can explore the full architecture and lessons learned here 👇&lt;br&gt;
🔗 &lt;a href="https://dev.to/marcusmayo/edenred-invoice-assistant-serverless-ai-chatbot-for-invoice-payment-support-4bpn"&gt;https://dev.to/marcusmayo/edenred-invoice-assistant-serverless-ai-chatbot-for-invoice-payment-support-4bpn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👋 Follow for real-world AI builds:&lt;br&gt;
 🐦 &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;@MarcusMayoAI&lt;/a&gt;&lt;br&gt;
 | 🧠 &lt;a href="https://dev.to/marcusmayo"&gt;Dev.to&lt;/a&gt;&lt;br&gt;
 | 💻 &lt;a href="https://github.com/marcusmayo" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;br&gt;
 | 💼 &lt;a href="https://www.linkedin.com/in/marcusmayo" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>architecture</category>
      <category>ai</category>
      <category>aws</category>
    </item>
    <item>
      <title>🤖 Edenred Invoice Assistant – Serverless AI Chatbot for Invoice &amp; Payment Support</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Mon, 06 Oct 2025 23:55:53 +0000</pubDate>
      <link>https://forem.com/marcusmayo/edenred-invoice-assistant-serverless-ai-chatbot-for-invoice-payment-support-4bpn</link>
      <guid>https://forem.com/marcusmayo/edenred-invoice-assistant-serverless-ai-chatbot-for-invoice-payment-support-4bpn</guid>
      <description>&lt;p&gt;🤖 Edenred Invoice Assistant – Serverless AI Chatbot for Invoice &amp;amp; Payment Support&lt;br&gt;
Part of my ongoing Weekend AI Project Series where I turn weekend experiments into production-grade AI systems.&lt;/p&gt;

&lt;p&gt;🎯 Problem &amp;amp; Goal&lt;br&gt;
Invoice and payment queries overwhelm finance teams — repetitive, predictable, and time-consuming.&lt;br&gt;
I wanted to build a serverless AI assistant that handles these inquiries in real-time, with zero infrastructure overhead and full cost control.&lt;/p&gt;

&lt;p&gt;☁️ Architecture Overview&lt;br&gt;
Tech Stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧠 AWS Lambda — Serverless compute for sub-second responses&lt;/li&gt;
&lt;li&gt;🧩 Amazon SageMaker — Model fine-tuning and batch training&lt;/li&gt;
&lt;li&gt;📦 S3 — Training data, model artifacts, and logs&lt;/li&gt;
&lt;li&gt;🔗 API Gateway — REST endpoint serving chatbot responses&lt;/li&gt;
&lt;li&gt;📊 CloudWatch — Monitoring and error tracking&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Frontend → API Gateway → AWS Lambda → SageMaker Inference (trained model)
                                   ↳ Fallback logic → Pre-trained responses (cost-optimized)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Key Design Principle:&lt;br&gt;
The SageMaker endpoint spins down post-training. Fallback logic (pattern-based responses) handles common queries — maintaining user experience while cutting costs by ~90%.&lt;/p&gt;

&lt;p&gt;💡 Features Implemented&lt;br&gt;
✅ Smart Response System:&lt;br&gt;
Handles invoice submissions, payment status, and account queries.&lt;/p&gt;

&lt;p&gt;✅ SageMaker Training Pipeline:&lt;br&gt;
Fine-tuned Hugging Face model for domain-specific language understanding.&lt;/p&gt;

&lt;p&gt;✅ Cost-Optimized Fallback Logic:&lt;br&gt;
Pattern-matched responses cover 95% of user queries without active inference cost.&lt;/p&gt;

&lt;p&gt;✅ Lambda Optimization:&lt;br&gt;
Pre-loaded model weights and response caching = sub-second latency.&lt;/p&gt;

&lt;p&gt;✅ Enterprise Readiness:&lt;br&gt;
CORS-enabled, IAM roles configured, and robust error handling for production-grade uptime.&lt;/p&gt;

&lt;p&gt;🧩 System Highlights&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Metric&lt;/th&gt;&lt;th&gt;Performance&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Response Time&lt;/td&gt;&lt;td&gt;&amp;lt; 1s average&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Accuracy (trained scenarios)&lt;/td&gt;&lt;td&gt;95%+&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Uptime&lt;/td&gt;&lt;td&gt;100% (fallback system)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Cost Reduction&lt;/td&gt;&lt;td&gt;90% vs always-on SageMaker&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Deployment&lt;/td&gt;&lt;td&gt;Serverless (Lambda + API Gateway)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;📊 Chatbot in Action&lt;br&gt;
Demo Scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“How do I submit an invoice?”&lt;/li&gt;
&lt;li&gt;“Check payment status.”&lt;/li&gt;
&lt;li&gt;“Why was my invoice rejected?”&lt;/li&gt;
&lt;li&gt;“How do I update my bank details?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wue5uq5bmab8ujlk4f4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wue5uq5bmab8ujlk4f4.png" alt="AI Chatbot UI showing conversation flow and results" width="800" height="784"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🧱 Fallback ensures continuity even if the model endpoint is offline.&lt;/p&gt;

&lt;p&gt;⚙️ Technical Implementation&lt;br&gt;
Lambda Function&lt;br&gt;
Handles logic, invokes SageMaker if active, else triggers fallback:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Fallback response dictionary (pattern-matched, no inference cost)
fallback_responses = {
    "invoice submission": "You can submit invoices via the Finance Portal under 'Upload Invoice'.",
    "payment status": "Payments are processed every Thursday. You can track them via your dashboard.",
    "bank details": "Update your bank info under Profile &amp;gt; Payment Settings.",
}

def handler(event, context):
    query = event["queryStringParameters"]["q"].lower()
    # Serve a canned answer when a known topic appears in the query
    for pattern, response in fallback_responses.items():
        if pattern in query:
            return response
    # Otherwise invoke the (possibly active) SageMaker endpoint
    return sagemaker_inference(query)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;🧱 Deployment Details&lt;br&gt;
Environment Setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11&lt;/li&gt;
&lt;li&gt;Dependencies in requirements.txt&lt;/li&gt;
&lt;li&gt;Deploy via AWS SAM CLI or Serverless Framework&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Train model → Save artifact to S3&lt;/li&gt;
&lt;li&gt;Deploy Lambda with model fallback logic&lt;/li&gt;
&lt;li&gt;Integrate API Gateway endpoint&lt;/li&gt;
&lt;li&gt;Configure CloudWatch for monitoring&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🧭 Key Learnings&lt;br&gt;
1️⃣ Cost vs. Performance: Smart fallbacks drastically reduce inference costs without hurting UX.&lt;br&gt;
2️⃣ Serverless Design: Lambda is ideal for low-frequency workloads with instant scale.&lt;br&gt;
3️⃣ MLOps Simplified: SageMaker pipelines streamline model iteration and retraining.&lt;br&gt;
4️⃣ Enterprise Fit: Combining AI logic with predictable cost makes adoption easier for finance teams.&lt;/p&gt;

&lt;p&gt;🧱 Repository &amp;amp; Access&lt;br&gt;
📁 Code: github.com/marcusmayo/ai-ml-portfolio-2&lt;/p&gt;

&lt;p&gt;🧠 Main Portfolio: github.com/marcusmayo/machine-learning-portfolio&lt;/p&gt;

&lt;p&gt;💬 Closing Thoughts&lt;br&gt;
Building Edenred Invoice Assistant reinforced one key idea:&lt;/p&gt;

&lt;p&gt;“Intelligent cost optimization is as valuable as model accuracy.”&lt;/p&gt;

&lt;p&gt;This project shows how to merge AI innovation with business pragmatism — a skill every ML product engineer needs.&lt;/p&gt;

&lt;p&gt;Follow for More:&lt;/p&gt;

&lt;p&gt;🧠 &lt;a href="https://dev.to/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1"&gt;Weekend AI Project Series&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💼 LinkedIn &lt;a href="https://linkedin.com/in/marcusmayo" rel="noopener noreferrer"&gt;Connect with me&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;💻 GitHub &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio" rel="noopener noreferrer"&gt;Live Projects&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>mlops</category>
      <category>aws</category>
      <category>python</category>
    </item>
    <item>
      <title>🚀 Weekend AI Project Series: Adventures in Vibe Coding</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Mon, 06 Oct 2025 17:12:30 +0000</pubDate>
      <link>https://forem.com/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1</link>
      <guid>https://forem.com/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1</guid>
      <description>&lt;h1&gt;🚀 Weekend AI Project Series: Adventures in Vibe Coding&lt;/h1&gt;

&lt;p&gt;
The &lt;strong&gt;Weekend AI Project Series&lt;/strong&gt; turns 48-hour builds into &lt;strong&gt;production-ready AI systems&lt;/strong&gt;.  
Each episode explores a new MLOps challenge — from architecture tradeoffs to cost optimization and deployment pipelines — with real code, measurable outcomes, and product thinking at the core.
&lt;/p&gt;

&lt;h3&gt;✅ Every project demonstrates:&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Real-world ML pipelines&lt;/li&gt;
  &lt;li&gt;Cloud deployment (AWS / GCP)&lt;/li&gt;
  &lt;li&gt;MLOps best practices&lt;/li&gt;
  &lt;li&gt;Cost-optimization strategies&lt;/li&gt;
  &lt;li&gt;Lessons from production AI delivery&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;🎯 Recent Episodes&lt;/h2&gt;

&lt;p&gt;
  &lt;a href="https://dev.to/marcusmayo/building-an-ai-powered-interview-analyzer-on-gcp-113h"&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbc17s4te7du3wrtgbslo.png" alt="AI Interview Analyzer on GCP — Episode 1" width="800" height="533"&gt;
  &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://dev.to/marcusmayo/ederned-invoice-assistant-serverless-ai-chatbot-for-invoice-support-4g8n"&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faqcnrme3q2rahjqktxk9.png" alt="Ederned Invoice Assistant on AWS — Episode 2" width="800" height="533"&gt;
  &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://dev.to/marcusmayo/the-simplest-neural-network-that-actually-works-3j54"&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3o26nwc165fo0ys2ia1w.png" alt="The Simplest Neural Network That Actually Works — Episode 3" width="800" height="800"&gt;
  &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://dev.to/marcusmayo/promptops-policy-coach-from-metrics-to-mechanisms-you-can-trust-c5f"&gt;
    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy86lc7p9zchvrsxb5jez.png" alt="PromptOps Policy Coach — Episode 4" width="800" height="450"&gt;
  &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;em&gt;✨ Episode 5 drops this weekend — stay tuned for the next build in the Adventures in Vibe Coding series.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;🐙 All Projects on GitHub&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://github.com/marcusmayo/ai-ml-portfolio" rel="noopener noreferrer"&gt;AI + ML Portfolio (Part 1)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://github.com/marcusmayo/ai-ml-portfolio-2" rel="noopener noreferrer"&gt;AI + ML Portfolio (Part 2)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;🔗 Follow Along for Weekend AI Builds&lt;/h2&gt;

&lt;p&gt;
Each weekend = one real-world AI product built, shipped, and documented.  
Follow for hands-on experiments in &lt;strong&gt;AI Engineering&lt;/strong&gt;, &lt;strong&gt;MLOps&lt;/strong&gt;, and &lt;strong&gt;Product Strategy&lt;/strong&gt;.
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;📖 &lt;strong&gt;Weekend AI Project Series (Medium List):&lt;/strong&gt; &lt;a href="https://medium.com/@mayo.marcus/list/weekend-ai-project-series-adventures-in-vibe-coding-26527bc4d47b" rel="noopener noreferrer"&gt;medium.com/@mayo.marcus/list/weekend-ai-project-series-adventures-in-vibe-coding&lt;/a&gt;
&lt;/li&gt;
  &lt;li&gt;💡 &lt;strong&gt;Dev.to Hub:&lt;/strong&gt; &lt;a href="https://dev.to/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1"&gt;dev.to/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding&lt;/a&gt;
&lt;/li&gt;
  &lt;li&gt;💼 &lt;strong&gt;LinkedIn:&lt;/strong&gt; &lt;a href="https://lnkd.in/e9CBVihC" rel="noopener noreferrer"&gt;lnkd.in/e9CBVihC&lt;/a&gt;
&lt;/li&gt;
  &lt;li&gt;🐦 &lt;strong&gt;X (Twitter):&lt;/strong&gt; &lt;a href="https://x.com/MarcusMayoAI" rel="noopener noreferrer"&gt;x.com/MarcusMayoAI&lt;/a&gt;
&lt;/li&gt;
  &lt;li&gt;🐙 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/marcusmayo" rel="noopener noreferrer"&gt;github.com/marcusmayo&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;✨ Turning weekend ideas into production-grade AI systems — one episode at a time.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>portfolio</category>
      <category>devops</category>
      <category>cloud</category>
      <category>ai</category>
    </item>
    <item>
      <title>🎙️ Building an AI-Powered Interview Analyzer on GCP</title>
      <dc:creator>marcusmayo</dc:creator>
      <pubDate>Mon, 06 Oct 2025 16:01:31 +0000</pubDate>
      <link>https://forem.com/marcusmayo/building-an-ai-powered-interview-analyzer-on-gcp-31ia</link>
      <guid>https://forem.com/marcusmayo/building-an-ai-powered-interview-analyzer-on-gcp-31ia</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A production-grade AI project that listens, scores, and coaches — built over a single weekend as part of my &lt;a href="https://www.linkedin.com/pulse/weekend-ai-project-series-adventures-vibe-coding-marcus-wubie" rel="noopener noreferrer"&gt;Weekend AI Project Series: Adventures in Vibe Coding&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AI interviews are messy. Human feedback is subjective.&lt;br&gt;&lt;br&gt;
So I built a system that listens, transcribes, analyzes, and &lt;em&gt;mentors&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;In this deep dive, I’ll show you how I:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deployed a &lt;strong&gt;FastAPI backend&lt;/strong&gt; with &lt;strong&gt;Whisper ASR&lt;/strong&gt; for transcription
&lt;/li&gt;
&lt;li&gt;Integrated &lt;strong&gt;3 NLP models&lt;/strong&gt; (RoBERTa, Toxic-BERT, mDeBERTa) for sentiment and competency scoring
&lt;/li&gt;
&lt;li&gt;Added &lt;strong&gt;Gemini 2.0 Flash&lt;/strong&gt; for human-like feedback
&lt;/li&gt;
&lt;li&gt;Migrated from &lt;strong&gt;Cloud Run&lt;/strong&gt; to &lt;strong&gt;Compute Engine&lt;/strong&gt; for production workloads
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end, you’ll see how to turn a weekend experiment into a &lt;strong&gt;fully functional, production-ready AI application&lt;/strong&gt; — the kind of build that gets noticed by both engineers and hiring managers.  &lt;/p&gt;




&lt;p&gt;🚀 Project Overview&lt;/p&gt;

&lt;p&gt;This project demonstrates how to build a production-ready AI interview analysis system — one that evaluates communication quality, professionalism, and competency in recorded interviews.&lt;/p&gt;

&lt;p&gt;It combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🎙️ Speech-to-text (ASR) using Whisper&lt;/li&gt;
&lt;li&gt;🧠 NLP scoring with RoBERTa, Toxic-BERT, and mDeBERTa&lt;/li&gt;
&lt;li&gt;🤖 Feedback generation with Gemini 2.0 Flash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system produces quantitative scores, segment-level analytics, and contextual AI feedback — the kind that turns interview recordings into actionable coaching data.&lt;/p&gt;

&lt;p&gt;⚙️ Architecture Overview&lt;/p&gt;

&lt;p&gt;The pipeline runs on Google Cloud Compute Engine (n1-standard-16) with the following key components:&lt;/p&gt;

&lt;p&gt;Audio Upload → Whisper ASR → NLP Scoring → Ensemble Aggregation → Gemini Feedback → UI Visualization&lt;/p&gt;

&lt;p&gt;Components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Frontend (HTML + JS):&lt;/strong&gt; handles uploads, displays scores and feedback.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;FastAPI Backend (Python 3.11):&lt;/strong&gt; routes processing, manages inference requests.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Whisper Models (ASR):&lt;/strong&gt; supports tiny → medium variants for speed/accuracy tradeoffs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NLP Models (Hugging Face):&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cardiffnlp/twitter-roberta-base-sentiment&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;unitary/toxic-bert&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MoritzLaurer/mDeBERTa-v3-base-mnli-xnli&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LLM Feedback:&lt;/strong&gt; powered by Google Gemini 2.0 Flash for summarization and recommendations.&lt;/li&gt;
&lt;/ul&gt;
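&lt;p&gt;The flow above can be sketched as one function chaining the stages. This is an illustrative sketch, not the project’s actual code: the stage callables stand in for the real &lt;code&gt;utils/&lt;/code&gt; modules.&lt;/p&gt;

```python
# Illustrative sketch of the pipeline flow (stage callables are stand-ins
# for the real modules): Audio Upload -> Whisper ASR -> NLP Scoring ->
# Ensemble Aggregation -> Gemini Feedback.

def run_pipeline(audio_path, transcribe, score, aggregate, feedback):
    transcript = transcribe(audio_path)      # Whisper ASR
    scores = score(transcript)               # NLP model scoring
    overall = aggregate(scores)              # weighted ensemble
    advice = feedback(transcript, scores)    # Gemini summary
    return {"transcript": transcript, "scores": scores,
            "overall": overall, "feedback": advice}
```

Swapping any stage (say, a larger Whisper variant) then touches only one callable, which is what makes the tiny → medium model selection cheap to add.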

&lt;p&gt;🧠 Core ML Pipeline&lt;/p&gt;

&lt;p&gt;Here’s how each component works together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Transcription (ASR)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Whisper model transcribes the uploaded interview audio (MP3/WAV/M4A).&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from faster_whisper import WhisperModel

model = WhisperModel("tiny", device="cuda")
segments, _ = model.transcribe("interview.m4a")
transcript = " ".join([s.text for s in segments])
&lt;/code&gt;&lt;/pre&gt;

&lt;ol start="2"&gt;
&lt;li&gt;NLP Scoring&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each transcript segment is passed through three different models:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from transformers import pipeline

sentiment = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment")
toxicity = pipeline("text-classification", model="unitary/toxic-bert")
competency = pipeline("zero-shot-classification", model="MoritzLaurer/mDeBERTa-v3-base-mnli-xnli")

result = {
    "sentiment": sentiment(transcript[:512])[0]["score"],
    "toxicity": 1 - toxicity(transcript[:512])[0]["score"],  # inverted: higher = more professional
    "competency": competency(transcript[:512], ["leadership", "communication", "technical skill"]),
}
&lt;/code&gt;&lt;/pre&gt;
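&lt;p&gt;The segment-level analytics can be sketched like this; character-based chunking is an assumption for illustration (the real &lt;code&gt;timeline_analyzer&lt;/code&gt; may segment differently), and &lt;code&gt;score_fn&lt;/code&gt; stands in for any of the pipelines above.&lt;/p&gt;

```python
# Sketch of segment-level scoring: chunk the transcript and score each
# chunk separately to build a performance timeline. Character-based
# chunking is an assumption here, used only for illustration.

def segment_scores(transcript, score_fn, chunk_chars=512):
    chunks = [transcript[i:i + chunk_chars]
              for i in range(0, len(transcript), chunk_chars)]
    return [score_fn(chunk) for chunk in chunks]
```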

&lt;ol start="3"&gt;
&lt;li&gt;Ensemble Scoring System&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The scores are normalized and weighted across five dimensions:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Component&lt;/th&gt;&lt;th&gt;Weight&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Sentiment&lt;/td&gt;&lt;td&gt;0.25&lt;/td&gt;&lt;td&gt;Emotional tone&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Toxicity&lt;/td&gt;&lt;td&gt;0.20&lt;/td&gt;&lt;td&gt;Professionalism&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Competency&lt;/td&gt;&lt;td&gt;0.25&lt;/td&gt;&lt;td&gt;Skill fit&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Keywords&lt;/td&gt;&lt;td&gt;0.15&lt;/td&gt;&lt;td&gt;Domain-specific terms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Filler Words&lt;/td&gt;&lt;td&gt;0.15&lt;/td&gt;&lt;td&gt;Clarity of expression&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;This produces an overall “Interview Fit Score” from 0 to 100.&lt;/p&gt;
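&lt;p&gt;As a sketch, the weighted aggregation can look like this (weights taken from the table above; it assumes each component score has already been normalized to the 0–1 range):&lt;/p&gt;

```python
# Sketch of the five-component weighted ensemble. Assumes each component
# score has already been normalized to [0, 1].
WEIGHTS = {"sentiment": 0.25, "toxicity": 0.20, "competency": 0.25,
           "keywords": 0.15, "filler_words": 0.15}

def interview_fit_score(components):
    """Return the 0-100 Interview Fit Score from normalized components."""
    return round(100 * sum(WEIGHTS[k] * components[k] for k in WEIGHTS), 1)
```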

&lt;ol start="4"&gt;
&lt;li&gt;AI Feedback (Gemini Integration)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After scoring, Gemini 2.0 Flash generates structured feedback:&lt;/p&gt;

&lt;p&gt;prompt = f"""&lt;br&gt;
You are an AI interviewer. Based on the following transcript and scores:&lt;br&gt;
{transcript[:2000]}&lt;br&gt;
Scores: {result}&lt;br&gt;
Provide:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;3 Strengths&lt;/li&gt;
&lt;li&gt;3 Areas for Improvement&lt;/li&gt;
&lt;li&gt;2 Next Steps
"""&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;response = gemini.generate_text(prompt)&lt;/p&gt;

&lt;p&gt;Output Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Excellent communication and positive tone.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improvement:&lt;/strong&gt; Needs stronger technical examples.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Next Steps:&lt;/strong&gt; Practice STAR method; refine domain language.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🧩 Visualization&lt;/p&gt;

&lt;p&gt;The frontend visualizes scores with color-coded progress bars and an NLP-driven performance timeline:&lt;/p&gt;

&lt;p&gt;📊 Dashboard Example&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31ppvwxpwqqvo795xkp2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31ppvwxpwqqvo795xkp2.png" alt="📸 Figure 1: Scoring dashboard with live component breakdowns (Example dashboard showing sentiment, toxicity, and competency breakdowns)" width="530" height="707"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📋 AI Feedback Example&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftf035bfel61n2pcxkje2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftf035bfel61n2pcxkje2.png" alt="📸 Figure 2: AI-generated interview feedback and improvement plan (Gemini-powered strengths, improvement areas, and next steps)" width="533" height="677"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🧱 Deployment Details&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Cloud:&lt;/strong&gt; Google Cloud Compute Engine&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Machine:&lt;/strong&gt; n1-standard-16 (16 vCPUs, 64GB RAM)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Environment:&lt;/strong&gt; Dockerized FastAPI service&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Storage:&lt;/strong&gt; Local + Cloud Storage (optional for large files)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monitoring:&lt;/strong&gt; Basic logging via Cloud Logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note: The system originally ran on Cloud Run, but due to the 32MB file upload limit, it was migrated to Compute Engine for unrestricted workloads.&lt;/p&gt;

&lt;p&gt;⚠️ Challenges &amp;amp; Fixes&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Issue&lt;/th&gt;&lt;th&gt;Root Cause&lt;/th&gt;&lt;th&gt;Resolution&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Cloud Run upload limit&lt;/td&gt;&lt;td&gt;32MB request cap&lt;/td&gt;&lt;td&gt;Migrated to Compute Engine VM&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Long Whisper inference&lt;/td&gt;&lt;td&gt;Model size vs. time&lt;/td&gt;&lt;td&gt;Added model selection (tiny → medium)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Flat score ranges&lt;/td&gt;&lt;td&gt;Heuristic-only scoring&lt;/td&gt;&lt;td&gt;Replaced with NLP-based segment scoring&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Dependency errors&lt;/td&gt;&lt;td&gt;Missing &lt;code&gt;faster_whisper&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Pinned requirements + venv isolation&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Frontend API mismatch&lt;/td&gt;&lt;td&gt;Response schema drift&lt;/td&gt;&lt;td&gt;Unified response format + error handling&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
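&lt;p&gt;The last fix, a unified response format, can be as small as one envelope that every endpoint returns, so the frontend parses a single schema on both success and error paths. A minimal sketch (field names are assumptions for illustration):&lt;/p&gt;

```python
# Minimal sketch of a unified API response envelope. One shape for both
# success and error paths keeps the frontend parser simple and stops
# schema drift. Field names here are assumptions, not the project's.

def api_response(ok, data=None, error=None):
    return {"ok": ok, "data": data, "error": error}
```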

&lt;p&gt;🔍 Key Learnings&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Infrastructure matters&lt;/strong&gt; — serverless is not always production-friendly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Speed/accuracy tradeoff&lt;/strong&gt; — Whisper tiny can run roughly 8× faster than medium while keeping about 90% of the accuracy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Heuristics ≠ ML&lt;/strong&gt; — real models make insights meaningful.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;UX is part of ML&lt;/strong&gt; — users need visible progress and clear outcomes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🧭 Future Roadmap&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;WhisperX Word-Level Analysis&lt;/strong&gt; → enables clickable word-level scoring visualization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Role-Aware Rubrics&lt;/strong&gt; → zero-shot matching between candidate responses and job descriptions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Real-Time SSE Updates&lt;/strong&gt; → show live progress of transcription and analysis in the UI.&lt;/li&gt;
&lt;/ul&gt;
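&lt;p&gt;The SSE wire format itself is simple: each message is an &lt;code&gt;event:&lt;/code&gt; line, a &lt;code&gt;data:&lt;/code&gt; line, and a blank line. A minimal formatter sketch for the roadmap item (event names are assumptions):&lt;/p&gt;

```python
import json

# Minimal Server-Sent Events formatter: one "event:" line, one "data:"
# line with a JSON payload, then a blank line to terminate the message.

def sse_event(event, data):
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"
```

A streaming endpoint would yield one such message per pipeline stage so the UI can show live transcription and analysis progress.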

&lt;p&gt;🧰 Tech Stack Summary&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Category&lt;/th&gt;&lt;th&gt;Tools / Services&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Cloud&lt;/td&gt;&lt;td&gt;GCP Compute Engine&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Backend&lt;/td&gt;&lt;td&gt;FastAPI, Python 3.11&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;ML&lt;/td&gt;&lt;td&gt;Whisper, RoBERTa, Toxic-BERT, mDeBERTa&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;LLM&lt;/td&gt;&lt;td&gt;Gemini 2.0 Flash&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Frontend&lt;/td&gt;&lt;td&gt;HTML + JS&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Infra&lt;/td&gt;&lt;td&gt;Docker, venv, Cloud Logging&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;📂 Project Structure&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;interview-predictor/
├── app.py                    # FastAPI backend
├── utils/
│   ├── asr_processor.py      # Whisper transcription
│   ├── nlp_analyzer.py       # NLP model scoring
│   ├── ensemble_scorer.py    # Weighted aggregation
│   ├── timeline_analyzer.py  # Segment analysis
│   └── llm_feedback.py       # Gemini integration
├── static/
│   └── index.html            # Frontend UI
├── requirements.txt          # Dependencies
└── Dockerfile                # Deployment setup
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;📁 GitHub&lt;/p&gt;

&lt;p&gt;🧠 Main portfolio: &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio" rel="noopener noreferrer"&gt;https://github.com/marcusmayo/machine-learning-portfolio&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🚀 AI/ML portfolio (new repo): &lt;a href="https://github.com/marcusmayo/ai-ml-portfolio-2" rel="noopener noreferrer"&gt;https://github.com/marcusmayo/ai-ml-portfolio-2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(New projects will be added here as the portfolio expands.)&lt;/p&gt;

&lt;p&gt;💬 Closing Thoughts&lt;/p&gt;

&lt;p&gt;This project taught me that the hardest part of AI engineering isn’t model tuning — it’s designing systems that work under real-world constraints.&lt;/p&gt;

&lt;p&gt;If you’re an ML engineer, data scientist, or product builder exploring AI system design, this project is a great blueprint to start from.&lt;/p&gt;

&lt;p&gt;Connect &amp;amp; Collaborate&lt;br&gt;
I’m open to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🤝 AI Product Coaching&lt;/li&gt;
&lt;li&gt;🧠 Consulting on AI/ML System Design&lt;/li&gt;
&lt;li&gt;💼 Collaborations with startups &amp;amp; innovation teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Follow my work:&lt;br&gt;
🧠 &lt;a href="https://dev.to/marcusmayo/weekend-ai-project-series-adventures-in-vibe-coding-gk1"&gt;Weekend AI Project Series&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;💼 LinkedIn &lt;a href="https://linkedin.com/in/marcusmayo" rel="noopener noreferrer"&gt;Connect with me&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;💻 GitHub &lt;a href="https://github.com/marcusmayo/machine-learning-portfolio" rel="noopener noreferrer"&gt;Live Projects&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>mlops</category>
      <category>python</category>
      <category>gcp</category>
    </item>
  </channel>
</rss>
