Forem: Bibhu Pradhan

Building a Self-Improving Orchestration Layer for IoT Dashboards

Bibhu Pradhan — Tue, 26 May 2026 06:22:29 +0000

This is a submission for the Hermes Agent Challenge: Write About Hermes Agent

When mapping out the future roadmap for AirSense AI, the primary goal was to evolve the hyper-local air quality intelligence dashboard by integrating data directly from physical IoT sensors in specific localities. The bottleneck, as it turns out, isn't the hardware itself, but the orchestration. Managing unpredictable sensor streams, handling node dropouts, normalizing messy JSON payloads, and updating a dashboard autonomously requires more than a simple cron job and a Python script.

Enter Hermes Agent by Nous Research.

If you've been following the open-source agentic space, you already know that Hermes is turning heads because it doesn't just execute tool calls - it actually learns. In this post, I want to break down why Hermes Agent's architecture is a paradigm shift for developers building physical-to-digital pipelines, and how its specific capabilities solve the exact orchestration problems encountered when managing decentralized data nodes.

1. The Closed-Loop Learning System (SKILL.md)

Standard AI agents execute a prompt, return a result, and instantly forget the execution path. If an IoT sensor in a specific locality sends a malformed payload, a standard agent might figure out how to parse it using a code execution tool, but it will have to re-solve that same problem from scratch tomorrow.

Hermes features a built-in closed-loop learning system. When it runs through a complex trajectory (usually involving 5+ tool calls to troubleshoot and format data), it automatically reflects on its success and crystallizes the workflow into a permanent, reusable SKILL.md file stored locally. The next time the dashboard receives that same malformed payload, Hermes doesn't guess—it relies on its procedural memory, bypassing the costly reasoning steps and executing the fix immediately. The agent literally writes its own playbook for your specific edge cases.

2. The Three-Layer Memory Architecture

Handling a hyper-local intelligence dashboard requires context that spans across days and weeks, not just a single session. Hermes separates its memory gracefully:

Working Memory: The immediate context of the current sensor ingestion task.
Episodic Memory: Powered by a local SQLite FTS5 database, Hermes remembers cross-session facts. It learns that "Node #4 in the downtown locality drops packets when it rains" and retrieves that context automatically without requiring hardcoded logic.
Procedural Memory: The repository of the auto-created skills.

This means you aren't just building an app; you are training an autonomous operator that becomes uniquely attuned to the specific quirks of your infrastructure.

3. Execution Environment Flexibility

One of the biggest risks of building with AI agents is vendor lock-in. Hermes Agent completely decouples the orchestration logic from the model provider. You can run it against Claude or GPT-4 for complex reasoning tasks, or route it to a local, lightweight Qwen 3.6 model for repetitive sensor polling. It speaks standard OpenAI-compatible JSON to any backend, and even allows per-task provider overrides.

For a project dealing with continuous IoT sensor data integration, having the option to use local Docker containers or an E2B cloud sandbox for isolated, high-security code execution is a game-changer.

What This Means for the Future

We are moving past the era of "AI as a chat interface" and entering the era of "AI as persistent infrastructure." When an open-source framework like Hermes allows a developer to spin up an agent that self-improves, persists memory locally without privacy trade-offs, and seamlessly bridges the gap between raw hardware data and a polished dashboard, the barrier to building enterprise-grade intelligent systems essentially vanishes.

The true power of open agentic systems isn't just about automating tasks - it's about building tools that get smarter alongside us as our projects grow.

Over to You!

If you found this breakdown insightful, please leave a reaction below to support the post!

How are you approaching agentic workflows or IoT data orchestration in your own projects? Drop a comment below if you have any questions related to Hermes Agent, local setups, or memory management - let's spark a discussion!

FairLens AI: An Intelligent Dashboard for Automated Bias Auditing

Bibhu Pradhan — Tue, 26 May 2026 05:47:36 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

FairLens AI is a premium, high-end SaaS platform designed for AI-powered bias auditing. I built this tool to help data scientists and researchers easily identify, quantify, and mitigate hidden biases within their datasets before those datasets are used to train machine learning models.

My vision as a developer has always been to create meaningful impact in society through technology. Monitoring and detection systems are crucial for accountability in tech, and I realized that while many people talk about AI fairness, there are very few accessible, beautifully designed tools to actually measure it. FairLens AI bridges that gap. By simply uploading a CSV dataset, users receive instant insights into fairness metrics across protected attributes, visualized through an interactive, glassmorphism-styled dashboard. It calculates complex metrics like Demographic Parity Ratio and Disparate Impact, assigns an overall fairness score, and provides actionable mitigation recommendations.

Demo

Live Project Link: FairLens AI Platform
GitHub Repository: bibhupradhanofficial/fairlens-ai

Video Demo:

Screenshots:
Fairness score:

AI-generated executive summary and intersectional analysis:

The Comeback Story

This project originally started as an ambitious idea for a data visualization dashboard, but I hit a massive roadblock when it came to the actual data science and backend engineering. The Finish-Up-A-Thon gave me the exact push I needed to rethink my architecture and finally complete it.

Where the project was before:
Previously, FairLens AI was essentially a beautiful, static mockup. I had built out the frontend architecture using React 18, Vite, and Tailwind CSS, and perfected the UI using Framer Motion and Recharts to give it a premium feel. However, the project stalled completely at the backend. Writing a manual, hardcoded statistical engine capable of parsing diverse datasets, calculating edge cases for Disparate Impact, and figuring out "feature importance" was overwhelming. The dashboard was full of dummy data, and the repository sat untouched.

What I added and fixed to finish it up (The "After"):
To bring the project across the finish line, I completely abandoned the idea of hardcoding the statistical logic and pivoted to an AI-agentic architecture. I added the following major features:

Supabase Edge Functions: I implemented a robust, serverless backend using Deno (audit-bias/index.ts) to securely handle the dataset statistics over an API without bogging down the client.
Google Gemini 3 Integration: I connected the Edge Function to the Google Gemini 3 Flash Preview model via an AI gateway. I engineered a highly specific system prompt that feeds the CSV cross-tabulations to the LLM and forces it to act as a "Fairness Expert."
Structured JSON Insights: Instead of returning plain text, I configured the AI to return strictly typed JSON tool calls containing the exact fairness metrics, an overall 0-100 fairness score, and concrete mitigation steps.
Dynamic Frontend Wiring: I updated the AuditDashboard to dynamically map this live AI data into my Recharts visualizations and metric gauges, turning the UI into a fully functional, intelligent auditing tool.

My Experience with GitHub Copilot

GitHub Copilot was an absolute game-changer for pushing this project to completion, particularly when navigating the complex typing requirements between the frontend and the Supabase Edge Functions.

Type Safety & Boilerplate: Copilot anticipated the Zod schemas and TypeScript interfaces required for my AuditResult objects, saving me hours of manual typing.
Component Generation: When building the AuditDashboard.tsx and the MetricGauge components, Copilot suggested the repetitive Tailwind classes needed for the glassmorphism effects and conditional rendering (e.g., automatically suggesting the success/warning/destructive color mappings based on the metric status).
Data Parsing: Copilot was incredibly helpful in suggesting the logic for processing the CSV outputs and formatting the cross-tabulations accurately before sending them off to the Edge Function payload.

It acted as a constant pair programmer, allowing me to focus on the high-level architecture and the user experience rather than getting bogged down in syntax.

If you found FairLens AI interesting or helpful, please leave a ❤️ or 🦄 reaction on this post.

Got any questions about how I implemented the Supabase Edge Functions, engineered the Gemini prompts, or built the glassmorphism UI? Drop a comment below, and I'd be happy to answer them!

Looking for Teamates

Bibhu Pradhan — Tue, 03 Mar 2026 13:39:00 +0000

Are you a passionate programmer or problem solver eager to tackle real-world challenges using technology and AI, but often held back? This could be because you lack a team or don't possess all the necessary tech stacks required to solve a specific problem or participate in hackathons.

I am Bibhu Pradhan, an Engineering undergraduate student from India. I am looking for teamates from India to participate in upcoming hackathons with me. We will focus on solving real-world problems using technology and AI, and I invite you to join my team.

My LinkedIn Profile: https://www.linkedin.com/in/bibhupradhanofficial
My GitHub Profile: https://github.com/bibhupradhanofficial

If interested message me on LinkedIn

AI Pitch Deck Generator: A multimodal AI agent that generates complete startup pitch decks

Bibhu Pradhan — Sun, 01 Mar 2026 11:38:56 +0000

This is a submission for the Built with Google Gemini: Writing Challenge

What I Built with Google Gemini

Founders and entrepreneurs often spend countless hours agonizing over the formatting, narrative structure, and visual design of their pitch decks instead of focusing on building their actual product.

I built the AI Pitch Deck Generator to remove this friction entirely. It is a powerful, multimodal web application that takes a simple startup idea and transforms it into a comprehensive, cohesive, and investor-ready pitch package in under a minute.

Google Gemini's Role:
Google's Generative AI ecosystem is the core engine of this project. The application utilizes a multi-agent architecture powered by the new google-genai SDK:

Gemini 2.0 Flash (gemini-2.0-flash): Acts as the master orchestrator. It processes the user's idea and generates a highly structured JSON response containing the full narrative (8 slides, speaker notes, social media captions), specifications for data charts, and detailed prompts for the image and video models.
Imagen 3 (imagen-3.0-generate-002): Consumes the prompts written by Gemini to generate high-quality, photorealistic product mockups and thematic scene visuals.
Veo 2.0 (veo-2.0-generate-001): Creates a dynamic, 5-second cinematic promotional video clip for the startup based on Gemini's prompt.

The backend (FastAPI) then programmatically renders premium charts using matplotlib and assembles everything into a downloadable PowerPoint (.pptx) file.

Demo

bibhupradhanofficial / AI-Pitch-Deck-Generator

A multimodal AI agent that generates complete startup pitch decks including slides, charts, product mockup images, voiceover scripts, promo video clips, and social media captions from a single text prompt.

AI Pitch Deck Generator

Project Overview

AI Pitch Deck Generator is a powerful tool that leverages Google's Generative AI to automatically create and assemble pitch decks. It handles everything from drafting content to generating visual assets and charts, providing a seamless generation experience with real-time streaming feedback to the user.

Architecture

+------+      +----------+      +-----------------------+      +--------------+      +-----------------------+
|      |      |          |      |                       |      |              |      |                       |
| User | ---> | Frontend | ---> | FastAPI / Cloud Run   | ---> | Gemini Agent | ---> | [Imagen, Veo, Charts] |
|      |      |          |      |                       |      |              |      |                       |
+------+      +----------+      +-----------------------+      +--------------+      +-----------------------+
   ^                                                                                             |
   |                                                                                             |
   |                                                                                             v
   |                                                                                         +-------+
   +--------------------------------------- Response stream -------------------------------- |  GCS  |
                                                                                             +-------+

Prerequisites

Before you begin, ensure you have the following requirements met:

Python: 3.11 or higher
GCP Account: A Google Cloud project with an active billing account
Google Cloud…

View on GitHub

What I Learned

Building this application pushed me to learn a lot about orchestrating complex AI workflows and building reactive user interfaces:

Real-Time Streaming (SSE): Because generating images, videos, and complex charts takes time, I learned how to implement Server-Sent Events (SSE) using FastAPI. This allowed the backend to stream text, status updates, and individual assets to the vanilla JavaScript frontend as soon as they were ready, creating a magical, progressively revealing UI instead of a boring loading spinner.
Agentic Orchestration: I learned advanced techniques in prompt engineering to force Gemini to output strict, complex JSON structures reliably. Getting the model to act as a "director" that writes prompts for other models (Imagen and Veo) was a fascinating exercise in AI-to-AI communication.
Programmatic Asset Generation: I deepened my Python skills by using python-pptx to dynamically calculate layouts and build native PowerPoint files, and configuring matplotlib to render beautiful, premium dark-themed data visualizations.

Google Gemini Feedback

What worked well:

The new google-genai SDK is incredibly clean and intuitive. Being able to access text, image, and video generation models from a single unified client made the backend architecture much simpler.
Gemini 2.0 Flash is phenomenal. Its speed and ability to consistently adhere to a complex JSON schema (containing arrays of slides, chart data, and nested dictionaries) made it the perfect orchestration agent.

Where I ran into friction:

Video Generation Polling: Integrating Veo 2.0 required handling long-running operations. Since video generation isn't instant, I had to implement an asynchronous polling mechanism to check the operation status (client.operations.get(operation)) and eventually extract the video bytes. Figuring out how to do this smoothly without blocking the FastAPI event loop took some trial and error.
Cross-Model Prompting: Getting Gemini to write good prompts for Imagen was sometimes tricky. I had to inject strict system instructions and formatting rules (like appending specific style keywords) to ensure the generated images matched the overall dark-mode aesthetic of the application.

Challenges we ran into

Multimodal Orchestration: Coordinating asynchronous calls to three different AI models (Gemini, Imagen, and Veo) while ensuring the narrative, visual aesthetics, and generated data remained cohesive was complex.
Structured Output Formatting: Ensuring that the LLM consistently returned highly structured, valid JSON containing slide data, exact chart configurations, and specific image/video prompts required meticulous prompt engineering and fallback handling.
Real-Time User Experience: Generating heavy media assets like videos and images takes time. Keeping the user engaged required implementing an SSE (Server-Sent Events) pipeline to stream text, status updates, and individual assets to the frontend as soon as they were ready, rather than forcing the user to wait at a blank loading screen.
Programmatic PPTX Generation: Calculating layouts, scaling images, and ensuring the programmatically generated PowerPoint file looked professional and properly aligned required extensive fine-tuning using python-pptx.
Google Cloud Billing Requirements: We faced a significant roadblock when trying to enable the Google Cloud Storage (Buckets) service. The platform requires active billing information to be set up before allowing the service to be enabled.

APOD Mood Gallery: A visually rich, AI-powered interactive astronomy gallery

Bibhu Pradhan — Sun, 01 Mar 2026 10:47:52 +0000

This is a submission for the DEV Weekend Challenge: Community

The Community

This project was built for the community of space enthusiasts, astronomy lovers, and astrophotography fans who follow NASA's Astronomy Picture of the Day (APOD). It serves anyone who wants to explore the cosmos not just scientifically, but through visual aesthetics, emotional moods, and personalized collections.

What I Built

I built the APOD Mood Gallery, a Progressive Web App (PWA) that takes NASA's iconic APOD archive and transforms it into an interactive, visually stunning, and intelligent experience. The application includes the following features:

AI Image Analysis: Completely private, in-browser image analysis using TensorFlow.js to identify visual characteristics and classify images by their "mood".
Dynamic Color Palettes: Automatic extraction and display of beautiful color palettes from astronomical imagery, utilizing Web Workers to maintain a snappy UI.
3D Solar System & Exoplanets: Interactive exploration of real-time planetary positions using 3D rendering.
Personalized "For You" Feed: A local recommendation engine that learns what types of space images you appreciate over time.
Mood Board Creator: A tool to curate favorite images into a visual mood board that can be exported locally via PDF or ZIP.

Demo

✨LIVE DEMO: APOD Mood Gallery

Code

bibhupradhanofficial / APOD-Mood-Gallery

A visually rich, AI-powered interactive gallery using NASA’s Astronomy Picture of the Day (APOD) API, featuring mood classification, color palette extraction, and immersive space exploration.

🌌 APOD Mood Gallery

NASA Astronomy Pictures - Explore the cosmos through moods, palettes, and AI-powered collections.

APOD Mood Gallery takes NASA's iconic Astronomy Picture of the Day (APOD) archive and transforms it into an interactive, visually stunning, and intelligent experience. Using client-side machine learning and advanced 3D rendering, it analyzes celestial images to extract dominant color palettes, classify emotional moods, and provide personalized space discoveries.

✨ Features

PWA Support: Installable Progressive Web App with offline capabilities and background APOD synchronization.
AI Image Analysis: Completely private, in-browser image analysis using TensorFlow.js (MobileNet). Identifies visual characteristics and content to classify images by "mood".
Dynamic Color Palettes: Automatically extracts and displays beautiful, harmonious color palettes from astronomical imagery using Web Workers to keep the UI snappy.
3D Solar System & Exoplanets: Explore real-time planetary positions using astronomy-engine and react-three-fiber.
Personalized "For You" Feed: A local recommendation…

View on GitHub

How I Built It

The application was built with a strong focus on client-side performance, intelligent processing, and modern web standards:

Frontend Framework: Developed using React 19 and Vite for a fast, modern development experience.
Styling: Crafted with Tailwind CSS, PostCSS, and AutoPrefixer.
Machine Learning: Integrated @tensorflow/tfjs and the MobileNet model (@tensorflow-models/mobilenet) to run image classification directly in the browser.
3D Rendering: Built the interactive space environments using three, @react-three/fiber, and @react-three/drei, combined with astronomy-engine for accurate celestial math.
State & Performance: Utilized custom local storage services and Web Workers for parallel processing of image pixels to prevent main-thread blocking.