<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Kiran Baby</title>
    <description>The latest articles on Forem by Kiran Baby (@kiranbaby14).</description>
    <link>https://forem.com/kiranbaby14</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F744589%2F12dd11a6-9fe2-4dd2-b2d1-262e7f7e7567.jpeg</url>
      <title>Forem: Kiran Baby</title>
      <link>https://forem.com/kiranbaby14</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kiranbaby14"/>
    <language>en</language>
    <item>
      <title>🌍 I Built MapMeet: A 3D Globe Event Platform for the Mux + DEV Challenge</title>
      <dc:creator>Kiran Baby</dc:creator>
      <pubDate>Wed, 31 Dec 2025 19:13:04 +0000</pubDate>
      <link>https://forem.com/kiranbaby14/i-built-mapmeet-a-3d-globe-event-platform-for-the-mux-dev-challenge-5ai7</link>
      <guid>https://forem.com/kiranbaby14/i-built-mapmeet-a-3d-globe-event-platform-for-the-mux-dev-challenge-5ai7</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/mux-2025-12-03"&gt;DEV's Worldwide Show and Tell Challenge Presented by Mux&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🦈 Alright Sharks... I Mean, Judges!
&lt;/h2&gt;

&lt;p&gt;I'll be honest with you. I've binge-watched way too many episodes of Shark Tank. The drama, the pitches, the "I'm out" moments... I'm completely hooked.&lt;/p&gt;

&lt;p&gt;So when I saw this challenge was literally described as &lt;strong&gt;"Shark Tank but without the sharks"&lt;/strong&gt; I knew this was my moment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Today I'm here to pitch you &lt;strong&gt;MapMeet&lt;/strong&gt;, a global event discovery platform that lets anyone create, discover, and join events visualized on a stunning &lt;strong&gt;interactive 3D globe&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But here's the twist that makes MapMeet different from every other event platform out there:&lt;/p&gt;

&lt;p&gt;🌐 &lt;strong&gt;Real-time geographic arcs&lt;/strong&gt; connect attendees to events on the globe. When someone RSVPs and shares their location, a beautiful animated arc draws from their location to the event, showing the &lt;em&gt;global reach&lt;/em&gt; of your event in the most visually striking way possible.&lt;/p&gt;

&lt;p&gt;Imagine hosting a hackathon and watching arcs light up from Tokyo, Lagos, Berlin, San Francisco, and cities across India, all converging on your event marker. &lt;em&gt;That's&lt;/em&gt; the MapMeet experience.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Psst... I created a live event for this hackathon so you can see it in action yourself. Link below! 👀)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  My Pitch Video
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/PoySx1xvSXhMei3qc02BKKW6BVmooortd00dt1YMpt4Lg" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🌍 &lt;strong&gt;Live App:&lt;/strong&gt; &lt;a href="https://www.mapmeet.co" rel="noopener noreferrer"&gt;https://www.mapmeet.co&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🎉 JOIN THE MAPMEET EVENT I CREATED FOR THIS HACKATHON!
&lt;/h3&gt;

&lt;p&gt;I've created a special event on MapMeet to celebrate this Mux + DEV challenge. Join it to show your support and see the platform in action! I'm on Premium so &lt;strong&gt;unlimited people can join&lt;/strong&gt; - let's see how global we can make this! 🌍&lt;/p&gt;

&lt;p&gt;I did some digging and set the event location at &lt;strong&gt;Mux HQ in San Francisco&lt;/strong&gt; so all our arcs will converge right on their doorstep 😄 Also made a custom Mux + DEV cover image for it because why not go all in?&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://www.mapmeet.co/event/62e4buqr" rel="noopener noreferrer"&gt;JOIN: MapMeet Launch Party - Mux + DEV Hackathon 🌍&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No account needed to view, just click and explore! If you RSVP, you'll become one of those beautiful arcs on the globe. Let's light it up together! 🌈&lt;/p&gt;

&lt;h2&gt;
  
  
  How MapMeet Works - Complete Overview
&lt;/h2&gt;

&lt;p&gt;

&lt;iframe src="https://player.mux.com/FfGH0201WK8LlcSO9i00Aupn4NBqxQ35zXBdB902uPIfTn8" width="710" height="399"&gt;
&lt;/iframe&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  The Story Behind It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem I Saw
&lt;/h3&gt;

&lt;p&gt;Every event platform feels &lt;em&gt;flat&lt;/em&gt;. You create an event, share a link, and hope people show up. There's no visual excitement, no sense of global community, no "wow factor" that makes people &lt;em&gt;want&lt;/em&gt; to share your event.&lt;/p&gt;

&lt;p&gt;I asked myself: &lt;strong&gt;What if attending an event felt like being part of something global?&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The MapMeet Vision
&lt;/h3&gt;

&lt;p&gt;MapMeet transforms event hosting into a visual experience. Concert organizers can show fans flying in from around the world. Hackathon hosts can visualize their global developer community. Marathon coordinators can display runners coming from every continent. Conference speakers can see their audience's geographic spread. Community meetups can prove their worldwide reach to sponsors.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Shareability Secret
&lt;/h3&gt;

&lt;p&gt;Here's something I'm really proud of: &lt;strong&gt;Event pages don't require login to view.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is huge. When you share your MapMeet event link on WhatsApp, Instagram, Twitter, or LinkedIn, anyone can see your stunning 3D globe visualization, view attendee arcs from around the world, read all event details, and get hyped about joining.&lt;/p&gt;

&lt;p&gt;No friction. No "sign up to see more" walls. Just pure, shareable, eye-catching event pages that make people stop scrolling and say &lt;em&gt;"Wait, what is THIS?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This means your event promotion just got a serious upgrade. Instead of sharing a boring event link, you're sharing an interactive 3D experience. That's the kind of link people actually click.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building in Public: The Real Journey
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Timeline
&lt;/h3&gt;

&lt;p&gt;I started building MapMeet around December 9th. I had the vision clear in my head: a 3D globe, real-time connections, the whole thing.&lt;/p&gt;

&lt;p&gt;But somewhere around week two, I hit a wall. You know that feeling when you're deep in code, nothing's working the way you want, and suddenly every other project idea seems more exciting? Yeah. I started drifting to other side projects, telling myself I'd come back to MapMeet "later."&lt;/p&gt;

&lt;p&gt;Then I saw this hackathon announcement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shark Tank-style pitches? Video submissions? $3,000 in prizes?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That was the kick I needed. Having a deadline and a reason to ship changed everything. I went from "maybe I'll finish this someday" to "this is going live, and I'm pitching it to the world."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thank you, Mux and DEV, for the accountability.&lt;/em&gt; 🙏&lt;/p&gt;

&lt;h3&gt;
  
  
  First-Time Integrations
&lt;/h3&gt;

&lt;p&gt;This project pushed me into territory I'd never explored before.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔐 Supabase (First Time)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'd heard about Supabase but never actually built with it. MapMeet uses Supabase Auth for Google OAuth and email/password authentication, Supabase Realtime for broadcasting live arc updates, and Supabase Storage for event cover images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;💳 Stripe (First Time)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'd never implemented payments before. The idea of handling real money in my code was honestly intimidating.&lt;/p&gt;

&lt;p&gt;But Stripe's documentation is incredible. I set up checkout sessions for upgrading to Premium, and webhooks for syncing subscription status.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson learned:&lt;/strong&gt; The integrations you're scared of are usually the ones with the best documentation. Just start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;While MapMeet isn't open-source (yet 👀), here's the architecture powering the platform:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Stack
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Frontend&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Next.js, Tailwind CSS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Backend&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;FastAPI, SQLModel ORM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Database&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL (on Supabase)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Auth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supabase Auth (Google OAuth + Email/Password)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Realtime&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supabase Realtime (broadcast channels)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supabase Storage (event cover images)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Payments&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stripe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3D Globe&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mapbox&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Domain &amp;amp; Hosting Setup
&lt;/h3&gt;

&lt;p&gt;Quick story: I snagged &lt;strong&gt;mapmeet.co&lt;/strong&gt; from GoDaddy because their pricing was great AND it included custom email addresses for the first year.&lt;/p&gt;

&lt;p&gt;Frontend is hosted on &lt;strong&gt;Vercel&lt;/strong&gt;. I just pointed my nameservers from GoDaddy to Vercel, and we're live with edge-fast global performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Business Model
&lt;/h3&gt;

&lt;p&gt;MapMeet runs on a freemium model:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Free&lt;/th&gt;
&lt;th&gt;Premium&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Active Events&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attendees per Event&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time Arcs&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom Marker Colors&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;$19/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The free tier is genuinely useful for small meetups and testing the platform. Premium unlocks MapMeet for serious event organizers who need scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use of Mux
&lt;/h2&gt;

&lt;p&gt;Let's talk about &lt;strong&gt;Mux&lt;/strong&gt; because this was a genuine discovery for me.&lt;/p&gt;

&lt;h3&gt;
  
  
  Instant Thumbnails via URL
&lt;/h3&gt;

&lt;p&gt;Need a still frame from your video? With YouTube, you'd have to take a screenshot manually and upload it.&lt;/p&gt;

&lt;p&gt;With Mux? You just construct a URL. That's it. Just a URL.&lt;/p&gt;
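&lt;p&gt;&lt;em&gt;A quick sketch of what that looks like, using Mux's image service and the playback ID from my pitch video embed above (the &lt;code&gt;time&lt;/code&gt; and &lt;code&gt;width&lt;/code&gt; values are just example parameters):&lt;/em&gt;&lt;/p&gt;

```python
# Build a Mux thumbnail URL: image.mux.com serves a frame of the video
# directly, no manual screenshots. The playback ID is from the pitch
# video embedded above; time/width are example query parameters.
from urllib.parse import urlencode

def mux_thumbnail_url(playback_id: str, time_s: float = 5.0, width: int = 640) -> str:
    """Return a Mux image URL that renders a frame from the video."""
    params = urlencode({"time": time_s, "width": width})
    return f"https://image.mux.com/{playback_id}/thumbnail.jpg?{params}"

url = mux_thumbnail_url("PoySx1xvSXhMei3qc02BKKW6BVmooortd00dt1YMpt4Lg", time_s=12, width=710)
```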

&lt;h2&gt;
  
  
  What's Next for MapMeet?
&lt;/h2&gt;

&lt;p&gt;This hackathon was the push to ship v1, but I'm just getting started:&lt;/p&gt;

&lt;p&gt;🎥 &lt;strong&gt;Video integration&lt;/strong&gt; (now that I've discovered Mux!)&lt;br&gt;
🌐 &lt;strong&gt;Event categories&lt;/strong&gt; for better discovery&lt;br&gt;
📊 &lt;strong&gt;Analytics dashboard&lt;/strong&gt; for organizers&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's Connect!
&lt;/h2&gt;

&lt;p&gt;If you've made it this far, thank you. Seriously. It means the world.&lt;/p&gt;

&lt;p&gt;Here's how you can support MapMeet:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. 🌍 Join the Hackathon Event!
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.mapmeet.co/event/62e4buqr" rel="noopener noreferrer"&gt;JOIN: MapMeet Launch Party - Mux + DEV.to Hackathon 🌍&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Be one of the arcs on the globe! Let's make this the most globally distributed hackathon celebration ever.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. 💬 Tell Me What You Think
&lt;/h3&gt;

&lt;p&gt;Drop a comment below. I read and respond to every single one.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. ❤️ React If This Resonated
&lt;/h3&gt;

&lt;h3&gt;
  
  
  4. 🔗 Share With Event Organizers
&lt;/h3&gt;

&lt;p&gt;Know someone who hosts meetups, conferences, or hackathons? Share MapMeet with them!&lt;/p&gt;

&lt;h2&gt;
  
  
  One Last Thing
&lt;/h2&gt;

&lt;p&gt;Building MapMeet taught me that the scariest part of any project is showing it to the world. It's easy to keep tweaking forever, telling yourself "it's not ready yet."&lt;/p&gt;

&lt;p&gt;This hackathon gave me a deadline and a stage. I'm grateful for that push.&lt;/p&gt;

&lt;p&gt;To everyone building something and waiting for the "right moment" to share it: &lt;strong&gt;this is your sign.&lt;/strong&gt; Ship it. Pitch it. Let the world see what you've made.&lt;/p&gt;

&lt;p&gt;The globe is waiting for your arcs. 🌍✨&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Try MapMeet:&lt;/strong&gt; &lt;a href="https://www.mapmeet.co" rel="noopener noreferrer"&gt;https://www.mapmeet.co&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Join the Event:&lt;/strong&gt; &lt;a href="https://www.mapmeet.co/event/62e4buqr" rel="noopener noreferrer"&gt;https://www.mapmeet.co/event/62e4buqr&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What would YOU host on a 3D globe?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A 24-hour coding marathon across time zones? A worldwide marathon watch party? A concert with fans lighting up from every continent?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drop yours below!&lt;/strong&gt; 👇&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>muxchallenge</category>
      <category>showandtell</category>
      <category>video</category>
    </item>
    <item>
      <title>Video Libraries Made Searchable by AI</title>
      <dc:creator>Kiran Baby</dc:creator>
      <pubDate>Fri, 26 Dec 2025 12:35:13 +0000</pubDate>
      <link>https://forem.com/kiranbaby14/i-built-a-video-search-engine-that-understands-what-youre-looking-for-51m7</link>
      <guid>https://forem.com/kiranbaby14/i-built-a-video-search-engine-that-understands-what-youre-looking-for-51m7</guid>
      <description>&lt;p&gt;&lt;strong&gt;Ever tried finding that ONE moment in a 2-hour video? Yeah, me too. It sucks.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Back again with another project! Hope y'all had an amazing Christmas! 🎄 Jingle bells, jingle bells, jingle all the way&lt;/em&gt; ✌️&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You recorded a meeting. Or a lecture. Or your kid's recital. Now you need to find that specific part where someone said something important, or that exact scene you vaguely remember.&lt;/p&gt;

&lt;p&gt;Your options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Scrub through the entire video like a caveman&lt;/li&gt;
&lt;li&gt;Hope YouTube's auto-chapters got it right (they didn't)&lt;/li&gt;
&lt;li&gt;Give up and rewatch the whole thing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What if you could just... describe what you're looking for?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Find the part where he talks about the budget"&lt;/p&gt;

&lt;p&gt;"Show me when there's a red car on screen"&lt;/p&gt;

&lt;p&gt;"Jump to where she mentions the deadline"&lt;/p&gt;

&lt;p&gt;That's what I built.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing SearchLightAI 🔦
&lt;/h2&gt;

&lt;p&gt;SearchLightAI lets you search your videos by describing what you see OR what was said. Upload a video, wait for it to process, then search with natural language.&lt;/p&gt;

&lt;p&gt;It returns the exact timestamp. Click it. You're there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Search your videos like you search your documents.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tech Stack 🤓
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tech&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;FastAPI + SQLModel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Databases&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL (metadata) + Qdrant (vectors)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vision AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SigLIP2 (google/siglip2-base-patch16-512)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Speech AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;faster-whisper + Sentence Transformers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Video Processing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;FFmpeg + PySceneDetect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Frontend&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Next.js 16, React 19, Tailwind CSS, shadcn/ui&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;📥 Ingestion Pipeline&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Video Upload
    ↓
PySceneDetect → finds scene changes
    ↓
FFmpeg → extracts keyframes + audio
    ↓
faster-whisper → transcribes speech
    ↓
SigLIP2 → embeds keyframes (768-dim)
Sentence Transformers → embeds transcript (384-dim)
    ↓
Qdrant → stores all vectors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;🔍 Search Pipeline&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your query: "when he talks about the budget"
    ↓
Same models embed your query
    ↓
Cosine similarity search in Qdrant
    ↓
Results ranked by relevance
    ↓
Click → jump to exact timestamp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
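&lt;p&gt;&lt;em&gt;The ranking step boils down to a dot product over normalized vectors. Here's a minimal NumPy sketch standing in for Qdrant's cosine search (assumed shapes and random data, not the production code):&lt;/em&gt;&lt;/p&gt;

```python
# The ranking step from the search pipeline, sketched with plain NumPy
# instead of Qdrant. Vectors are assumed L2-normalized, so cosine
# similarity reduces to a dot product.
import numpy as np

def rank_by_cosine(query: np.ndarray, index: np.ndarray, top_k: int = 3):
    """Return (row, score) pairs for the top_k most similar vectors."""
    scores = index @ query                    # cosine similarity per stored vector
    order = np.argsort(scores)[::-1][:top_k]  # highest similarity first
    return [(int(i), float(scores[i])) for i in order]

rng = np.random.default_rng(0)
index = rng.normal(size=(100, 8))
index /= np.linalg.norm(index, axis=1, keepdims=True)
query = index[42]                             # a perfect match should rank first
results = rank_by_cosine(query, index)
```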






&lt;h2&gt;
  
  
  Three Search Modes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🎬 Visual Search&lt;/strong&gt; - Describe what you see&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"man standing near whiteboard"&lt;/li&gt;
&lt;li&gt;"outdoor scene with trees"&lt;/li&gt;
&lt;li&gt;"someone holding a laptop"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🎤 Speech Search&lt;/strong&gt; - What was said&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"when they mentioned the quarterly results"&lt;/li&gt;
&lt;li&gt;"the part about machine learning"&lt;/li&gt;
&lt;li&gt;"discussion about the timeline"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🔀 Hybrid Search&lt;/strong&gt; - Best of both&lt;br&gt;
Combines visual and speech results. Usually what you want.&lt;/p&gt;
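&lt;p&gt;&lt;em&gt;Hybrid merging can be as simple as "best score per timestamp wins", assuming both result sets are already rescaled to a comparable 0-1 range. A minimal sketch, not the exact production logic:&lt;/em&gt;&lt;/p&gt;

```python
# Merge visual and speech hits: keep the best score per timestamp,
# then rank the combined results. Scores are assumed comparable
# (already rescaled to 0-1 by the pipelines above).
def hybrid_merge(visual: dict[float, float], speech: dict[float, float]) -> list[tuple[float, float]]:
    """Combine two {timestamp: score} maps; best score wins per timestamp."""
    merged: dict[float, float] = {}
    for hits in (visual, speech):
        for ts, score in hits.items():
            merged[ts] = max(merged.get(ts, 0.0), score)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

results = hybrid_merge({12.0: 0.9, 40.0: 0.4}, {12.0: 0.6, 75.0: 0.8})
```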


&lt;h2&gt;
  
  
  The Secret Sauce: SigLIP2
&lt;/h2&gt;

&lt;p&gt;Most visual search uses CLIP. I went with SigLIP2 instead.&lt;/p&gt;

&lt;p&gt;Why? SigLIP uses sigmoid loss instead of softmax contrastive loss. The practical difference: better zero-shot performance, especially for fine-grained visual details.&lt;/p&gt;

&lt;p&gt;One quirk though - raw SigLIP scores are lower than you'd expect. A "great match" might be 0.25-0.35 cosine similarity. So I rescale them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rescale_siglip_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cosine_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Maps SigLIP scores to intuitive 0-1 range.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;midpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.18&lt;/span&gt;
    &lt;span class="n"&gt;steepness&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cosine_score&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;midpoint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;steepness&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now 0.35 → ~90%, 0.25 → ~70%, which feels right in the UI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Smart Keyframe Extraction
&lt;/h2&gt;

&lt;p&gt;I'm not extracting every frame (that would be insane). PySceneDetect uses adaptive content detection to find actual scene changes.&lt;/p&gt;

&lt;p&gt;For each scene, I grab:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frame at the start&lt;/li&gt;
&lt;li&gt;Frame at the middle (for scenes &amp;gt; 2 seconds)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives good coverage without exploding storage or processing time.&lt;/p&gt;
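&lt;p&gt;&lt;em&gt;That per-scene policy is easy to express as a standalone function. A sketch, assuming scenes arrive as &lt;code&gt;(start, end)&lt;/code&gt; pairs in seconds:&lt;/em&gt;&lt;/p&gt;

```python
# The keyframe policy described above, as a standalone function:
# one frame at the start of each scene, plus a midpoint frame for
# scenes longer than 2 seconds. The (start, end) list format is
# illustrative, not PySceneDetect's native output type.
def keyframe_times(scenes: list[tuple[float, float]], min_mid_len: float = 2.0) -> list[float]:
    """scenes: (start_s, end_s) pairs; returns timestamps to extract."""
    times: list[float] = []
    for start, end in scenes:
        times.append(start)                   # frame at the start
        if end - start > min_mid_len:
            times.append((start + end) / 2.0) # midpoint for longer scenes
    return times

# A 1.5s cut gets one frame; a 10s scene gets start + middle.
times = keyframe_times([(0.0, 1.5), (1.5, 11.5)])
```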




&lt;h2&gt;
  
  
  Running It Yourself
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Docker Compose (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/kiranbaby14/searchlightai.git
&lt;span class="nb"&gt;cd &lt;/span&gt;searchlightai

&lt;span class="nb"&gt;cp &lt;/span&gt;apps/server/.env.example apps/server/.env
&lt;span class="nb"&gt;cp &lt;/span&gt;apps/client/.env.example apps/client/.env

docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait for the models to load (around 2-3 minutes the first time), then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend: &lt;a href="http://localhost:3000" rel="noopener noreferrer"&gt;http://localhost:3000&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;API: &lt;a href="http://localhost:8000" rel="noopener noreferrer"&gt;http://localhost:8000&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;NVIDIA GPU with CUDA support&lt;/li&gt;
&lt;li&gt;Docker + Docker Compose&lt;/li&gt;
&lt;li&gt;~4 GB of VRAM or more should work (SigLIP2, faster-whisper, and Sentence Transformers are all relatively lightweight)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;⏱️ &lt;strong&gt;Heads up:&lt;/strong&gt; Processing time depends on video length. A 10-min video takes a couple minutes, but longer videos (1hr+) will need more patience. Scene detection, transcription, and embedding generation all add up.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Could You Build With This?
&lt;/h2&gt;

&lt;p&gt;Some ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📹 &lt;strong&gt;Meeting search&lt;/strong&gt; - Find decisions across hundreds of recorded meetings&lt;/li&gt;
&lt;li&gt;🎓 &lt;strong&gt;Lecture navigation&lt;/strong&gt; - Students jumping to specific topics&lt;/li&gt;
&lt;li&gt;📺 &lt;strong&gt;Media asset management&lt;/strong&gt; - Search through footage libraries&lt;/li&gt;
&lt;li&gt;📱 &lt;strong&gt;Personal video search&lt;/strong&gt; - Your phone videos, finally searchable&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Code Is Yours
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/kiranbaby14/searchlightai" rel="noopener noreferrer"&gt;github.com/kiranbaby14/SearchLightAI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Star it ⭐ if you think video search should be this easy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Shoutouts 🙏
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SigLIP2&lt;/strong&gt; from Google for visual embeddings that actually work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PySceneDetect&lt;/strong&gt; for making scene detection actually usable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qdrant&lt;/strong&gt; for a vector DB that just works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;faster-whisper&lt;/strong&gt; for Whisper that's actually fast&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  That's It. Go Break It.
&lt;/h2&gt;

&lt;p&gt;Clone it, throw your weirdest videos at it, see what breaks. File issues. Send PRs. Roast my code in the comments.&lt;/p&gt;

&lt;p&gt;The best part of putting stuff out there? Finding out all the ways you didn't think of using it.&lt;/p&gt;

&lt;p&gt;Catch you in the next one. ✌️&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with ⚡, mass Claude Code sessions, and an unhealthy amount of caffeine ☕ by &lt;a href="https://github.com/kiranbaby14" rel="noopener noreferrer"&gt;@kiranbaby14&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Built a 3D AI Avatar That Actually Sees and Talks Back 🎭</title>
      <dc:creator>Kiran Baby</dc:creator>
      <pubDate>Fri, 26 Dec 2025 11:11:25 +0000</pubDate>
      <link>https://forem.com/kiranbaby14/i-built-a-3d-ai-avatar-that-actually-sees-and-talks-back-4j1a</link>
      <guid>https://forem.com/kiranbaby14/i-built-a-3d-ai-avatar-that-actually-sees-and-talks-back-4j1a</guid>
      <description>&lt;p&gt;&lt;strong&gt;Chatbots are so 2020. Let me show you what I built instead.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;It's been ages since I last posted here. Hope y'all had a great Christmas! 🎄 Feels good to be back.&lt;/em&gt; ✌️&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Every AI Assistant Right Now
&lt;/h2&gt;

&lt;p&gt;You know what's annoying? Typing. &lt;/p&gt;

&lt;p&gt;Every AI tool out there wants you to &lt;em&gt;type type type&lt;/em&gt; like it's 1995. And don't even get me started on the ones that "listen" but can't see what you're showing them.&lt;/p&gt;

&lt;p&gt;So I asked myself: &lt;strong&gt;What if I built an AI that works like an actual conversation?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;👀 &lt;strong&gt;Sees&lt;/strong&gt; what you show it (camera feed)&lt;/li&gt;
&lt;li&gt;👂 &lt;strong&gt;Hears&lt;/strong&gt; you naturally (no push-to-talk nonsense)&lt;/li&gt;
&lt;li&gt;🗣️ &lt;strong&gt;Responds&lt;/strong&gt; with voice and perfectly synced lip movements&lt;/li&gt;
&lt;li&gt;🎭 &lt;strong&gt;Expresses emotions&lt;/strong&gt; through a 3D avatar&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And runs &lt;strong&gt;100% locally&lt;/strong&gt; on your machine. No API keys bleeding your wallet dry.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing TalkMateAI 🚀
&lt;/h2&gt;

&lt;p&gt;TalkMateAI is a real-time, multimodal AI companion. You talk to it, show it things through your camera, and it responds with natural speech while a 3D avatar lip-syncs perfectly to every word.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's like having a conversation with a character from a video game, except it's actually intelligent.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tech Stack (For My Fellow Nerds 🤓)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backend (Python)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FastAPI + WebSockets → Real-time bidirectional communication
PyTorch + Flash Attention 2 → GPU go brrrrr
OpenAI Whisper (tiny) → Speech recognition
SmolVLM2-256M-Video-Instruct → Vision-language understanding
Kokoro TTS → Natural voice synthesis with word-level timing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Frontend (TypeScript)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Next.js 15 → Because Turbopack is fast af
Tailwind CSS + shadcn/ui → Pretty buttons
TalkingHead.js → 3D avatar with lip-sync magic
Web Audio API + AudioWorklet → Low-latency audio processing
Native WebSocket → None of that socket.io bloat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How It Actually Works
&lt;/h2&gt;

&lt;p&gt;Here's the flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You speak → 
  VAD detects speech → 
    Audio (+ camera frame if enabled) sent via WebSocket → 
      Whisper transcribes → 
        SmolVLM2 understands text + image together → 
          Generates response → 
            Kokoro synthesizes speech with timing data → 
              Audio + lip-sync data sent back → 
                3D avatar speaks with perfect sync
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All of this happens in &lt;strong&gt;real-time&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Secret Sauce: Native Word Timing 🎯
&lt;/h2&gt;

&lt;p&gt;Most TTS solutions give you audio and that's it. You're left guessing when each word starts for lip-sync.&lt;/p&gt;

&lt;p&gt;Kokoro TTS gives you &lt;strong&gt;word-level timing data&lt;/strong&gt; out of the box:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;speakData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;audioBuffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;world&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;wtimes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;      &lt;span class="c1"&gt;// when each word starts&lt;/span&gt;
  &lt;span class="na"&gt;wdurations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   &lt;span class="c1"&gt;// how long each word lasts&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// TalkingHead uses this for pixel-perfect lip sync&lt;/span&gt;
&lt;span class="nx"&gt;headRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speakAudio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;speakData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result? Lips that move &lt;em&gt;exactly&lt;/em&gt; when they should. No uncanny valley weirdness.&lt;/p&gt;




&lt;h2&gt;
  
  
  Voice Activity Detection That Actually Works
&lt;/h2&gt;

&lt;p&gt;I didn't want push-to-talk. I wanted natural conversation flow.&lt;/p&gt;

&lt;p&gt;So I built a custom VAD using the Web Audio API's AudioWorklet. It calculates energy levels in real time and tracks speech frames versus silence frames, entirely on the frontend - so no backend processing power is wasted just deciding whether you're speaking.&lt;/p&gt;

&lt;p&gt;You just... talk. When you pause naturally, it processes. When you keep talking, it waits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It respects conversational flow.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Heads up:&lt;/strong&gt; This version doesn't support barge-in (interrupting the avatar mid-speech) or sophisticated turn-taking detection. It's purely pause-based - you talk, pause, it responds.&lt;/p&gt;
&lt;/blockquote&gt;
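&lt;p&gt;To make the pause-based flow concrete, here's a minimal sketch of the idea - illustrative names and thresholds, not TalkMateAI's actual code. In the real app, logic like this would sit inside an &lt;code&gt;AudioWorkletProcessor&lt;/code&gt;'s &lt;code&gt;process()&lt;/code&gt; callback:&lt;/p&gt;

```javascript
// Illustrative energy-based VAD sketch. The names and thresholds here are
// made up for the example, not TalkMateAI's actual constants. In the browser
// this would run inside an AudioWorkletProcessor; the classification logic
// itself is plain JavaScript.

// RMS energy of one audio frame (Float32 samples in [-1, 1])
function frameEnergy(samples) {
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / samples.length);
}

// Pause-based detector: after at least one speech frame, enough consecutive
// silent frames trigger a "commit" - the cue to send the utterance off for
// transcription.
function createVad({ threshold = 0.01, silenceFramesToCommit = 25 } = {}) {
  let speechFrames = 0;
  let silenceFrames = 0;
  return function classifyFrame(samples) {
    if (frameEnergy(samples) > threshold) {
      speechFrames += 1;
      silenceFrames = 0;
      return "speech";
    }
    silenceFrames += 1;
    if (speechFrames > 0 && silenceFrames >= silenceFramesToCommit) {
      speechFrames = 0;
      silenceFrames = 0;
      return "commit"; // natural pause detected: process the segment
    }
    return "silence";
  };
}
```

&lt;p&gt;Keeping this on the frontend means the backend only ever sees complete utterances, never raw silence.&lt;/p&gt;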




&lt;h2&gt;
  
  
  The Vision Component 👁️
&lt;/h2&gt;

&lt;p&gt;Here's where it gets spicy. The camera isn't just for show.&lt;/p&gt;

&lt;p&gt;When enabled, every audio segment gets sent &lt;em&gt;with&lt;/em&gt; a camera snapshot. SmolVLM2 processes both together - the audio transcription AND what it sees.&lt;/p&gt;

&lt;p&gt;You can literally say &lt;em&gt;"What am I holding?"&lt;/em&gt; and it'll tell you.&lt;/p&gt;
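&lt;p&gt;As a rough sketch of what that pairing can look like on the wire - my illustration, not TalkMateAI's actual message format - each audio segment gets bundled with an optional JPEG snapshot before being sent to the backend, where SmolVLM2 receives both together:&lt;/p&gt;

```javascript
// Illustrative sketch of pairing an audio segment with a camera snapshot.
// The field names and message shape are assumptions, not TalkMateAI's real
// protocol. In the browser, jpegBytes would come from drawing the video
// element onto a canvas and encoding it; here we just base64-pack the bytes.
function buildSegmentMessage(audioBytes, jpegBytes) {
  const msg = {
    type: "audio_segment",
    audio: Buffer.from(audioBytes).toString("base64"),
  };
  if (jpegBytes) {
    msg.image = Buffer.from(jpegBytes).toString("base64"); // camera snapshot
  }
  return JSON.stringify(msg); // e.g. sent over the app's WebSocket
}
```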




&lt;h2&gt;
  
  
  Running It Yourself
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Node.js 20+&lt;/li&gt;
&lt;li&gt;Python 3.10&lt;/li&gt;
&lt;li&gt;NVIDIA GPU - ~4GB+ VRAM should work (I used an RTX 3070 8GB, but the models are lightweight: Whisper tiny + SmolVLM2-256M + Kokoro TTS)&lt;/li&gt;
&lt;li&gt;PNPM &amp;amp; UV package managers&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone it&lt;/span&gt;
git clone https://github.com/kiranbaby14/TalkMateAI.git
&lt;span class="nb"&gt;cd &lt;/span&gt;TalkMateAI

&lt;span class="c"&gt;# Install everything&lt;/span&gt;
pnpm run monorepo-setup

&lt;span class="c"&gt;# Run both frontend and backend&lt;/span&gt;
pnpm dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Frontend: &lt;code&gt;http://localhost:3000&lt;/code&gt;&lt;br&gt;
Backend: &lt;code&gt;http://localhost:8000&lt;/code&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Can You Build With This?
&lt;/h2&gt;

&lt;p&gt;This is open source. Fork it. Break it. Make it weird.&lt;/p&gt;

&lt;p&gt;Some ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📚 &lt;strong&gt;Language tutors&lt;/strong&gt; that watch your pronunciation&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;Creative companions&lt;/strong&gt; that see your art and give feedback&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;Screen assistants&lt;/strong&gt; - combine with &lt;a href="https://github.com/mediar-ai/screenpipe" rel="noopener noreferrer"&gt;Screenpipe&lt;/a&gt; for an AI that knows what you've been doing&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Code Is Yours
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/kiranbaby14/TalkMateAI" rel="noopener noreferrer"&gt;github.com/kiranbaby14/TalkMateAI&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🛠️ &lt;strong&gt;Fair warning:&lt;/strong&gt; This was a curiosity-driven project, not a polished product. There are rough edges, things I'd do differently now, and probably bugs I haven't found yet. But that's the fun of open source, right? Dig in, break stuff, make it better.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Star it ⭐ if you think chatbots should evolve.&lt;/p&gt;




&lt;h2&gt;
  
  
  Shoutouts 🙏
&lt;/h2&gt;

&lt;p&gt;Big thanks to &lt;a href="https://github.com/met4citizen" rel="noopener noreferrer"&gt;met4citizen&lt;/a&gt; for the incredible &lt;a href="https://github.com/met4citizen/TalkingHead" rel="noopener noreferrer"&gt;TalkingHead&lt;/a&gt; library. The 3D avatar rendering and lip-sync magic? That's all their work. I just plugged it in and fed it audio + timing data. Absolute legend.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Would You Build?
&lt;/h2&gt;

&lt;p&gt;Seriously, drop a comment. I want to know what wild ideas you have for real-time multimodal AI.&lt;/p&gt;

&lt;p&gt;AI that sees + hears + responds naturally? That's not the future anymore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's right now. And you can run it on your GPU.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with ❤️ and probably too much caffeine by &lt;a href="https://github.com/kiranbaby14" rel="noopener noreferrer"&gt;@kiranbaby14&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>opensource</category>
      <category>learning</category>
    </item>
    <item>
      <title>My First Blog and My First Game</title>
      <dc:creator>Kiran Baby</dc:creator>
      <pubDate>Thu, 27 Jan 2022 07:38:42 +0000</pubDate>
      <link>https://forem.com/kiranbaby14/my-first-blog-and-my-first-game-33dd</link>
      <guid>https://forem.com/kiranbaby14/my-first-blog-and-my-first-game-33dd</guid>
      <description>&lt;p&gt;Hey guys, so I am new to the DEV community and I am really excited to share my first blog about the first game that I created. The game was named as &lt;strong&gt;"Spheron-The ball game"&lt;/strong&gt; because the protagonist of the game was obviously a 'sphere' and I don't know from where the 'spheron' name popped up in my head. But anyway, I created this game a long while ago back in 2020 while I was doing my undergrad, and I managed to complete the game and upload it to the PlayStore once the colleges were closed due to the pandemic. I guess I am thankful for that which I shouldn't be, but hey, I got a lot of free time to develop the game. The game was made using unity engine and C# as its prgramming language. As I was a beginner into game dev I looked into and learned from a lot of youtube tutorials on how to build a game using unity. Brackey's youtube channel helped me a lot, I am sure Unity devs would've at least heard of this channel once in their lifetime. I know that the game is not an extraordinary or over-the-top one but it was my first game so it holds a special place in my heart. The genre of the game is an endless runner type and you could also collect coins along the way. I would link the game at the bottom of the post so you guys can check it out if you're interested.&lt;/p&gt;

&lt;h4&gt;
  
  
  Controls
&lt;/h4&gt;

&lt;p&gt;The controls are fairly simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Touch the right side of the screen to move right&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Touch the left side of the screen to move left&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The objective of the game is to get the protagonist, i.e. the sphere, to dodge all the obstacles that come along the way without falling off the platform, and to push your score as high as you can. I've also created a coin system so the player can collect coins along the way; these can later be used to buy different skins for the character, and, if the player dies midway, for resurrection.&lt;/p&gt;

&lt;p&gt;I've also incorporated ads into the app - but only rewarded ads, so you don't have to worry about ads popping up here and there and annoying you every time. Once the player dies, a popup menu appears with an ad button to resurrect the player and continue playing, so watching an ad is completely optional. I used Google AdMob for the implementation. At first I messed things up: after uploading the game to the Play Store, I clicked on the ads many times myself on my own phone, and Google, the all-seeing eye, came to know of it and blocked my AdMob account. It got resolved later, though.&lt;/p&gt;

&lt;p&gt;So this was my first blog. I know it took me two years to write about the first game I made, but hey, I got it written in the end, and I hope to keep writing on this wonderful platform. The next blog will likely be about the second game I made, and once it's done I'll update the link here. I hope you guys enjoyed reading, and if you'd like to check out my game and give me feedback, the link's down below.&lt;/p&gt;

&lt;h4&gt;
  
  
  PlayStore Link
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://play.google.com/store/apps/details?id=com.Jbk.Spheron" rel="noopener noreferrer"&gt;https://play.google.com/store/apps/details?id=com.Jbk.Spheron&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Screenshots
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmt1owbp62j2ekggl2zeo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmt1owbp62j2ekggl2zeo.png" alt=" " width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5buogjxgo1ghqkafgg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5buogjxgo1ghqkafgg.png" alt=" " width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyxj4icijjcnm8klwuy8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyxj4icijjcnm8klwuy8.png" alt=" " width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>gamedev</category>
      <category>unity3d</category>
      <category>beginners</category>
      <category>android</category>
    </item>
  </channel>
</rss>
