<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mallarpu Deepak sai</title>
    <description>The latest articles on Forem by Mallarpu Deepak sai (@mdeepaksai).</description>
    <link>https://forem.com/mdeepaksai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3892173%2F5d33140a-16b6-4bec-87c9-d27e8ad42e8a.jpeg</url>
      <title>Forem: Mallarpu Deepak sai</title>
      <link>https://forem.com/mdeepaksai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mdeepaksai"/>
    <language>en</language>
    <item>
      <title>How to detect AI-generated voices in Tamil, Hindi and Telugu — I built a free tool for this</title>
      <dc:creator>Mallarpu Deepak sai</dc:creator>
      <pubDate>Wed, 22 Apr 2026 09:53:42 +0000</pubDate>
      <link>https://forem.com/mdeepaksai/how-i-built-a-free-ai-voice-detector-that-supports-5-indian-languages-p24</link>
      <guid>https://forem.com/mdeepaksai/how-i-built-a-free-ai-voice-detector-that-supports-5-indian-languages-p24</guid>
      <description>&lt;p&gt;Most AI voice detectors only support English. I'm a second-year ECE student at KIT, Tamil Nadu, so I built one that actually works for Indian languages. Here's exactly how I did it and what broke along the way.&lt;/p&gt;

&lt;p&gt;The Problem&lt;br&gt;
Deepfake audio is a real threat. Politicians, scammers, and bad actors are using AI-cloned voices to spread misinformation and fraud. The tools that exist to detect this? Almost all English-only.&lt;br&gt;
Nobody was building for Tamil. Hindi. Malayalam. Telugu.&lt;br&gt;
So I did.&lt;/p&gt;

&lt;p&gt;What I Built&lt;br&gt;
VoiceID — a free AI voice detector that analyses 88 acoustic features from any MP3 or WAV file and classifies the voice as Human or AI with a confidence score.&lt;br&gt;
🔴 Live demo: &lt;a href="https://mdeepaksai.github.io/human-or-AI/" rel="noopener noreferrer"&gt;https://mdeepaksai.github.io/human-or-AI/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upload any MP3 or WAV file (or record live)&lt;br&gt;
Select your language&lt;br&gt;
Get a Human or AI verdict with confidence score&lt;br&gt;
Free. No login. No signup.&lt;/p&gt;

&lt;p&gt;Supports: English, Tamil, Hindi, Malayalam, Telugu.&lt;/p&gt;

&lt;p&gt;Tech Stack&lt;/p&gt;

&lt;p&gt;FastAPI backend deployed on Railway&lt;br&gt;
Vanilla JS frontend hosted on GitHub Pages&lt;br&gt;
Supabase for live visitor and analysis tracking&lt;br&gt;
Librosa for acoustic feature extraction (88 features per audio file)&lt;br&gt;
Scikit-learn for the classification model&lt;/p&gt;

&lt;p&gt;How It Works&lt;br&gt;
When you upload an audio file, the backend extracts 88 acoustic features including:&lt;/p&gt;

&lt;p&gt;MFCCs (Mel-frequency cepstral coefficients)&lt;br&gt;
Pitch variation and breathiness&lt;br&gt;
Spectral entropy and rolloff&lt;br&gt;
Zero crossing rate&lt;/p&gt;

&lt;p&gt;These features differ significantly between real human speech and AI-synthesised voices. The model was trained on a dataset of both human and TTS-generated audio across all 5 supported languages.&lt;br&gt;
The result is a confidence score — not just a binary yes/no — so you can see how certain the model is.&lt;/p&gt;
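
&lt;p&gt;To make two of those features concrete, here is a minimal sketch that uses only the Python standard library. It is an illustration, not the project's extraction code: the backend uses Librosa, and the function names, the 440 Hz test tone, and the 8 kHz sample rate below are all assumptions made for the example.&lt;/p&gt;

```python
import cmath
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(samples) - 1)


def spectral_entropy(samples):
    """Shannon entropy (in bits) of the normalised power spectrum."""
    n = len(samples)
    power = []
    # Naive DFT over the non-negative frequency bins; fine for one short frame.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical test signal: a 440 Hz tone, 256 samples at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(256)]

features = {
    "zcr": zero_crossing_rate(tone),
    "spectral_entropy": spectral_entropy(tone),
}
```

&lt;p&gt;For a pure tone the zero crossing rate lands near twice the frequency divided by the sample rate (about 0.11 here), and the spectral entropy stays low because the energy sits in a few frequency bins. Noisier signals score higher on both.&lt;/p&gt;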

&lt;p&gt;What Actually Broke (The Real Lessons)&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;CORS almost killed the project&lt;br&gt;
My first deployment had allow_credentials=True combined with allow_origins=["*"] in FastAPI. The CORS spec does not allow a wildcard origin on credentialed requests, so the browser rejects the response and the frontend sees only a generic network error. I spent 3 hours debugging that before I found the cause.&lt;br&gt;
Fix: either use credentials with specific origins, or use a wildcard without credentials.&lt;/li&gt;
&lt;li&gt;Audio format compatibility is a nightmare&lt;br&gt;
WAV files from different recording tools have different sample rates, bit depths, and channel configurations. I had to add preprocessing that normalises all incoming audio before feature extraction; without it, the model failed silently on some files.&lt;/li&gt;
&lt;li&gt;Railway cold starts&lt;br&gt;
Free-tier Railway deployments sleep after inactivity. The first request after a sleep takes 10-15 seconds. I added a loading state to the UI so users don't think it's broken.&lt;/li&gt;
&lt;li&gt;File size vs accuracy tradeoff&lt;br&gt;
Larger files give more accurate results but hit Railway's memory limits. I settled on a 10 MB maximum, with a recommendation to use at least 3 seconds of audio for reliable results.&lt;/li&gt;
&lt;/ol&gt;
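
&lt;p&gt;Lessons 2 and 4 can be sketched together. This is a standard-library illustration under stated assumptions, not the preprocessing the backend actually runs: the function name load_normalised is hypothetical, the 16 kHz mono target is an assumed value, and only 16-bit PCM WAV input is handled. The 10 MB cap and the 3 second minimum come from the post; treating the minimum as a hard rejection is a choice made for this sketch.&lt;/p&gt;

```python
import io
import wave

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MB cap from lesson 4
MIN_DURATION_SECONDS = 3.0           # minimum length recommended in lesson 4
TARGET_RATE = 16000                  # assumed target rate, not from the post


def load_normalised(wav_bytes, target_rate=TARGET_RATE):
    """Decode 16-bit PCM WAV bytes into mono floats in [-1, 1] at target_rate."""
    if len(wav_bytes) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 10 MB upload limit")
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    if width != 2:
        # Fail loudly here instead of letting the model fail silently later.
        raise ValueError("only 16-bit PCM is handled in this sketch")

    # Decode little-endian signed 16-bit samples.
    ints = [
        int.from_bytes(raw[i:i + 2], "little", signed=True)
        for i in range(0, len(raw), 2)
    ]
    # Downmix to mono by averaging the channels of each frame, scaled to [-1, 1].
    mono = [
        sum(ints[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(ints), channels)
    ]
    if MIN_DURATION_SECONDS > len(mono) / rate:
        raise ValueError("need at least 3 seconds of audio")
    if rate == target_rate:
        return mono

    # Linear-interpolation resampling: a simplification with no anti-alias
    # filter, so a production pipeline would use a proper resampler.
    out_len = int(len(mono) * target_rate / rate)
    resampled = []
    for n in range(out_len):
        pos = n * rate / target_rate
        left = int(pos)
        right = min(left + 1, len(mono) - 1)
        frac = pos - left
        resampled.append(mono[left] * (1 - frac) + mono[right] * frac)
    return resampled
```

&lt;p&gt;Calling load_normalised on the uploaded bytes before feature extraction means every file reaches the model in the same shape, and unsupported files raise a clear error. In a Librosa-based backend, librosa.load with its sr and mono arguments covers the resampling and downmixing steps.&lt;/p&gt;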

&lt;p&gt;Results So Far&lt;/p&gt;

&lt;p&gt;133+ visitors since launch&lt;br&gt;
32+ analyses run&lt;br&gt;
Indexed on Google&lt;br&gt;
Launched on Uneed, Fazier, Peerlist&lt;/p&gt;

&lt;p&gt;Try It&lt;br&gt;
Free. No login. No signup.&lt;br&gt;
👉 &lt;a href="https://mdeepaksai.github.io/human-or-AI/" rel="noopener noreferrer"&gt;https://mdeepaksai.github.io/human-or-AI/&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/mdeepaksai/human-or-AI" rel="noopener noreferrer"&gt;https://github.com/mdeepaksai/human-or-AI&lt;/a&gt;&lt;br&gt;
If you test it, drop a comment — would love to know how it performs on your audio.&lt;/p&gt;

&lt;p&gt;Built by Mallarpu Deepak Sai — 2nd year ECE @ KIT, Tamil Nadu. Building and shipping AI-powered web apps. Open to internships.&lt;/p&gt;

</description>
      <category>codenewbie</category>
      <category>showdev</category>
      <category>python</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
