How’s My Day? — A Voice-First Mood Tracker Using AssemblyAI’s Speech Understanding & Real-Time TTS

Abhishek Taneja — Sun, 27 Jul 2025 15:53:08 +0000

This is a submission for the AssemblyAI Voice Agents Challenge

What I Built

Domain Expert Voice Agent

How’s My Day? is a one-shot voice check-in app that helps users feel heard — emotionally, not just functionally.

In just one tap:

The user speaks how they’re feeling
The app transcribes their voice in real time using AssemblyAI Universal-Streaming
It detects the emotional tone of their voice using AssemblyAI’s Speech Understanding
It matches that emotion to a hand-curated emotional tip using Algolia MCP Server
Finally, it reads the tip aloud using AssemblyAI’s Text-to-Speech — or falls back to ElevenLabs if needed
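
The steps above amount to a linear pipeline. As a minimal sketch (not the app's actual code), the flow can be modelled with each stage passed in as a plain function. The names `check_in` and `CheckInResult` and the stub stages are hypothetical; in the real app the stages would call AssemblyAI, Algolia, and a TTS service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckInResult:
    transcript: str
    emotion: str
    tip: str


def check_in(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    detect_emotion: Callable[[bytes, str], str],
    find_tip: Callable[[str], str],
    speak: Callable[[str], None],
) -> CheckInResult:
    """Run one voice check-in: transcribe, detect emotion, find a tip, speak it."""
    transcript = transcribe(audio)
    emotion = detect_emotion(audio, transcript)
    tip = find_tip(emotion)
    speak(tip)
    return CheckInResult(transcript, emotion, tip)


# Stub stages for illustration only; they stand in for the real services.
spoken: list[str] = []
result = check_in(
    b"fake-audio",
    transcribe=lambda audio: "I am worn out today",
    detect_emotion=lambda audio, text: "burnt out",
    find_tip=lambda emotion: f"Tip for feeling {emotion}: take a short break.",
    speak=spoken.append,
)
```

Passing the stages in as arguments keeps each service swappable, which also makes the flow easy to exercise with stubs like the ones shown.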

✨ The experience feels like being heard by someone who cares — not a chatbot.

🧠 How I Used AssemblyAI

I used AssemblyAI’s Universal-Streaming API to:

Capture and transcribe voice input with under 300 ms latency
Show a live transcript to the user while they speak
Handle punctuation and natural pauses cleanly
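
Showing a live transcript usually means displaying interim text that is later replaced by finalized text. The sketch below illustrates that bookkeeping only. The message shape `{"text": ..., "final": ...}` and the `LiveTranscript` class are assumptions made for illustration, not AssemblyAI's actual message schema.

```python
class LiveTranscript:
    """Keeps finalized text plus the latest interim text for display."""

    def __init__(self) -> None:
        self.finalized: list[str] = []
        self.interim = ""

    def update(self, message: dict) -> str:
        # `message` uses a hypothetical shape: {"text": str, "final": bool}.
        text = message["text"].strip()
        if message["final"]:
            if text:
                self.finalized.append(text)
            self.interim = ""
        else:
            # Interim text replaces the previous interim text instead of appending.
            self.interim = text
        return self.display()

    def display(self) -> str:
        parts = self.finalized + ([self.interim] if self.interim else [])
        return " ".join(parts)


live = LiveTranscript()
live.update({"text": "i feel", "final": False})
live.update({"text": "I feel tired.", "final": True})
shown = live.update({"text": "but hopeful", "final": False})
```

Replacing the interim text on every update, rather than appending it, is what stops the on-screen transcript from repeating words while the user is still speaking.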

What I Learned

AssemblyAI’s emotion detection is shockingly accurate — tone alone can reveal so much more than words
Transcription feels like magic when done right — and AssemblyAI nailed it
Using voice as input and output feels more natural than a chatbot for mental wellness apps
People want calm, one-shot interactions, not 20-message bots

Challenges

Browser-based mic streaming and latency management were tricky
Emotion ↔ tip mapping needed thoughtful writing
Not all users want to hear their feelings read back — we added a toggle
AssemblyAI TTS is clean, but fallback was needed for broader support
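
The TTS fallback mentioned above follows a common pattern: try engines in order and use the first that succeeds. This is a sketch of that pattern under stated assumptions, not the app's implementation. `speak_with_fallback`, `primary_tts`, and `fallback_tts` are hypothetical names, and the stub engines do not call any real service.

```python
from typing import Callable, Sequence

TtsEngine = Callable[[str], bytes]


def speak_with_fallback(
    text: str, engines: Sequence[tuple[str, TtsEngine]]
) -> tuple[str, bytes]:
    """Try each engine in order; return (engine_name, audio) from the first success."""
    errors: list[str] = []
    for name, synthesize in engines:
        try:
            return name, synthesize(text)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))


def primary_tts(text: str) -> bytes:
    # Stub that simulates an outage of the preferred engine.
    raise TimeoutError("primary engine unavailable")


def fallback_tts(text: str) -> bytes:
    # Stub that returns fake "audio" bytes.
    return text.encode("utf-8")


used, audio = speak_with_fallback(
    "You are doing fine.",
    [("primary", primary_tts), ("fallback", fallback_tts)],
)
```

Returning the engine name alongside the audio makes it easy to log how often the fallback is actually used.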

Demo

GitHub Repository

abhishektaneja09 / how-is-your-day

How's My Day? - AI-Powered Voice Mood Tracker

A sophisticated, voice-powered mood tracking web application that listens to your feelings and provides supportive, human-like responses using cutting-edge AI technology.

✨ Features

🎤 Professional voice recording - File-based audio capture with high quality
🎯 AI-powered transcription - AssemblyAI integration for accurate speech-to-text
🧠 Enhanced mood detection - Local algorithm with scoring and emotion mapping
🤖 GPT-4o-mini responses - Human-like, empathetic AI-generated support messages
🔊 High-quality TTS - OpenAI text-to-speech with natural voice synthesis
⌨️ Real-time typing animation - Text appears character-by-character during speech
🎨 Modern UI - Clean, responsive design with smooth animations
🚀 Full-stack architecture - Node.js backend with Express server
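
The README does not show how the local mood detection works. One plausible reading of "scoring and emotion mapping" is a keyword-count scorer, sketched below. The keyword lists, the `detect_mood` function, and the `neutral` default are all assumptions for illustration, not the project's code.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the project's real vocabulary is not shown in the post.
MOOD_KEYWORDS = {
    "anxious": {"anxious", "worried", "nervous", "stressed"},
    "burnt out": {"exhausted", "drained", "overwhelmed", "tired"},
    "grateful": {"grateful", "thankful", "blessed"},
    "lonely": {"lonely", "alone", "isolated"},
}


def detect_mood(transcript: str, default: str = "neutral") -> tuple[str, dict[str, int]]:
    """Score each mood by keyword hits and return the best mood with all scores."""
    words = Counter(re.findall(r"[a-z']+", transcript.lower()))
    scores = {
        mood: sum(words[w] for w in keywords)
        for mood, keywords in MOOD_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the default when no keyword matched at all.
    return (best if scores[best] > 0 else default), scores


mood, scores = detect_mood("I'm so tired and drained, honestly a bit worried too")
```

Returning the full score table alongside the winning mood leaves room for the caller to handle close calls, for example when two moods score the same.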

🚀 Quick Setup Guide

Prerequisites

Node.js (v14 or higher)
Git (optional) - For cloning the repository
Modern web browser - Chrome, Firefox, Safari, or Edge

1. Installation

# Clone or download the project
git clone <

…


How’s My Day? — A One-Shot Voice Check-In That Listens, Understands, and Responds

Abhishek Taneja — Sun, 27 Jul 2025 15:47:03 +0000

This is a submission for the Algolia MCP Server Challenge

What I Built

How’s My Day? is a voice-first emotional wellness app that helps people reflect on their mood — all through a single spoken sentence.

Users simply tap a mic button, speak freely, and the app:

Transcribes their voice in real time using AssemblyAI
Detects the emotion behind their voice using Speech Understanding
Searches Algolia MCP Server for a warm, human-written emotional tip
Reads the tip back using AssemblyAI’s or ElevenLabs’ Text-to-Speech

The result is a calm, empathetic micro-interaction that helps people feel heard — no chatbot, no complexity.

Demo

📺 Video Walkthrough

📦 GitHub Repo

https://github.com/abhishektaneja09/how-is-your-day

How I Utilized the Algolia MCP Server

Algolia’s MCP Server powers the emotional-intelligence backend of our app.

Here's how we used it:

We created a static moods.json index with handcrafted emotional responses for moods like anxious, burnt out, grateful, lonely, etc.
When the emotion is detected (via AssemblyAI), we query Algolia MCP with that mood keyword
MCP returns a personalized, human-style tip, emoji, and short comforting message
This response is instantly delivered to the user, giving the illusion of true emotional understanding
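
To make the lookup concrete, here is a small sketch of the mood-to-tip mapping. It uses an in-memory list as a stand-in for the Algolia index and does not call Algolia or the MCP Server. The record fields (`mood`, `tip`, `emoji`, `message`), the sample entries, and the `find_tip` helper are assumptions based on the description above.

```python
from typing import Optional

# Hypothetical records shaped like entries in a moods.json index.
MOODS = [
    {"objectID": "anxious", "mood": "anxious", "emoji": "🌿",
     "tip": "Try three slow breaths.", "message": "It is okay to feel unsettled."},
    {"objectID": "lonely", "mood": "lonely", "emoji": "🤝",
     "tip": "Message one person you trust.", "message": "You are not alone in this."},
]

# Returned when the detected mood has no curated entry.
FALLBACK = {"mood": "neutral", "emoji": "🙂",
            "tip": "Take a moment for yourself.", "message": "Thanks for checking in."}


def find_tip(mood: str, records: list[dict] = MOODS) -> dict:
    """Return the record whose mood matches, ignoring case and extra spaces."""
    wanted = " ".join(mood.lower().split())
    match: Optional[dict] = next((r for r in records if r["mood"] == wanted), None)
    return match if match is not None else FALLBACK


tip = find_tip("  Lonely ")
```

A neutral fallback record matters here because the emotion detector may return a label that the curated list does not cover.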

The fast response time of MCP and its simple API made it the perfect choice for a real-time emotional UX.

Key Takeaways

Voice-first interactions can feel personal if designed with empathy
AssemblyAI’s Speech Understanding feature can infer emotional tone from voice — not just text
Algolia MCP makes it easy to build “domain brains” — in our case, an emotional advice brain
Text-to-Speech should be slow, warm, and natural — we used AssemblyAI, and ElevenLabs as a fallback
People don’t want a chatbot to solve their feelings — they want to be heard
