<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Aryan Khola</title>
    <description>The latest articles on Forem by Aryan Khola (@randumb).</description>
    <link>https://forem.com/randumb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3627206%2F0f9037c6-466c-4e0d-8266-7ea0104d0a83.png</url>
      <title>Forem: Aryan Khola</title>
      <link>https://forem.com/randumb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/randumb"/>
    <language>en</language>
    <item>
      <title>Scaling with Celery and Redis</title>
      <dc:creator>Aryan Khola</dc:creator>
      <pubDate>Tue, 02 Dec 2025 09:57:13 +0000</pubDate>
      <link>https://forem.com/randumb/scaling-with-celery-and-redis-5aop</link>
      <guid>https://forem.com/randumb/scaling-with-celery-and-redis-5aop</guid>
      <description>&lt;p&gt;Task queues and background workers are fundamental patterns in modern software architecture. If you're building an app that does anything heavier than a simple database query and you want it to scale, you need Task Queues. &lt;/p&gt;

&lt;p&gt;I learned this while building something of my own.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Task
&lt;/h3&gt;

&lt;p&gt;So, I was working on something, a more personalized version of &lt;strong&gt;Filmot&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you aren’t familiar, the idea is simple: I wanted to build a search engine that could index my personal YouTube playlists and let me instantly search through the transcripts (as well as titles and descriptions) of every video in those playlists.&lt;/p&gt;

&lt;p&gt;The user experience was designed to be pretty straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You log in using Google OAuth and grant read access to your YouTube account.&lt;/li&gt;
&lt;li&gt;You pick a playlist from your dashboard.&lt;/li&gt;
&lt;li&gt;You hit the "Start Indexing" button.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Issue
&lt;/h3&gt;

&lt;p&gt;Now, under the hood, the moment the indexing started, my backend initiated a synchronous loop. It attempted to process every single video in the playlist sequentially before returning any response to the user.&lt;/p&gt;

&lt;p&gt;I was pretty happy because I could index my playlists effortlessly. &lt;strong&gt;However, when I showed it to a friend, he tried a playlist with around 1,000 videos, and it failed.&lt;/strong&gt; The screen froze, the spinner spun forever, and eventually the app crashed.&lt;/p&gt;

&lt;p&gt;The bottleneck was simple math: if fetching and indexing a transcript takes roughly 1.5 seconds, doing that for 1,000 videos takes &lt;strong&gt;25 minutes&lt;/strong&gt;. The browser gave up after about 60 seconds. I knew the app was synchronous and understood the issues that came with that, but I expected it to fail for some other reason; failing during indexing kind of defeats the whole purpose of the application.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;The issue was obvious: indexing was taking too much time, and this was blocking my application from doing anything else. While looking for solutions, I came across a video by &lt;a href="https://www.youtube.com/@sriniously" rel="noopener noreferrer"&gt;Sriniously&lt;/a&gt; on task queues, an insanely insightful video; I truly learned a lot. It broke down the producer-consumer model and emphasized why offloading task execution from user requests is essential for building scalable applications.&lt;/p&gt;

&lt;p&gt;This prompted me to revamp the architecture entirely, shifting from a synchronous, blocking system to an asynchronous, non-blocking one.&lt;/p&gt;

&lt;p&gt;I'll try to unpack task queues and background workers methodically, from what they are to how they work, and show how they made my app &lt;strong&gt;way more efficient&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Do We Need Task Queues and Background Workers?
&lt;/h3&gt;

&lt;p&gt;When your app is under heavy load, a fully synchronous design quickly becomes a bottleneck (just like in my case). Long-running operations block the main thread, requests pile up behind each other, latency grows worse, and eventually the system starts dropping connections or even crashing. That’s exactly what happened with the 1,000-video playlist: every transcript fetch happened one after another on the main thread, taking roughly 25 minutes, while the client timed out after just 60 seconds.&lt;/p&gt;

&lt;p&gt;Task queues solve this by shifting work away from the request–response cycle and into a background process. Instead of the server waiting for every API call or slow operation to finish, the HTTP handler simply records the intent, places a job in a queue, returns an acknowledgment, and hands control back immediately.&lt;/p&gt;
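&lt;p&gt;The whole handoff can be sketched with nothing but the standard library (a toy stand-in for the real broker and worker, just to show the shape of the pattern):&lt;/p&gt;

```python
import queue
import threading
import time

jobs = queue.Queue()   # stands in for the broker
results = []

def worker():
    # The consumer side: block until a job arrives, then do the slow part.
    while True:
        job = jobs.get()
        time.sleep(0.01)   # pretend this is a slow transcript fetch
        results.append(job)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(video_id):
    # The producer side: record the intent and acknowledge immediately.
    jobs.put(video_id)
    return {"status": "accepted", "video_id": video_id}

ack = handle_request("abc123")   # returns instantly
jobs.join()                      # a real client would poll for status instead
```
&lt;p&gt;The handler never touches the slow work; it only enqueues and acknowledges, which is the entire trick.&lt;/p&gt;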

&lt;p&gt;Meanwhile, separate &lt;strong&gt;worker processes&lt;/strong&gt; consume tasks from the queue and execute them in parallel, or concurrently, depending on how you’ve set things up. This keeps the main server responsive because it’s no longer forced to wait on slow operations or external APIs. Scaling becomes far more straightforward as well: whenever you need additional throughput, you simply add more workers without touching the web tier. And because each worker is isolated, a failure in one won’t cascade through the entire system.&lt;/p&gt;

&lt;p&gt;A particularly useful feature for me was the ability to rate-limit the queue, which helped avoid IP bans from YouTube.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Components
&lt;/h3&gt;

&lt;p&gt;At its core, a task queue is just a queue data structure to store work that should be done later instead of during the main request. You can think of it as a first-in-first-out list of jobs that won’t disappear if the &lt;strong&gt;web application restarts&lt;/strong&gt;, usually backed by a message broker that guarantees the tasks aren’t lost. Around this simple idea, you get a whole ecosystem that lets different parts of this system communicate without blocking each other, almost like lightweight microservices passing messages around.&lt;/p&gt;
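&lt;p&gt;Stripped of everything else, the core idea is just FIFO ordering (shown here with a plain &lt;code&gt;deque&lt;/code&gt;; a broker like Redis adds persistence and cross-process access on top of the same shape):&lt;/p&gt;

```python
from collections import deque

q = deque()   # the queue itself: first in, first out

# Producer side: append jobs in arrival order.
for job in ("video-1", "video-2", "video-3"):
    q.append(job)

# Consumer side: jobs come out in exactly the order they went in.
processed = [q.popleft() for _ in range(len(q))]
```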

&lt;p&gt;A &lt;strong&gt;task&lt;/strong&gt; is a single piece of work the system needs to do. In my app, that unit of work is &lt;code&gt;process_video_task&lt;/code&gt;. This function takes one video ID, fetches its transcript through a proxy, and indexes the data into Elasticsearch. Each video is processed independently by a worker.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@celery.task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_video_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;transcript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_video_transcript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;index_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;producer&lt;/strong&gt; is the part of the application that creates tasks and sends them to the queue. In my setup, the Flask endpoint is the producer. When a user starts indexing a playlist, the endpoint gathers all the needed info and sends the main job to Celery. This immediately pushes the job into the background without blocking the user.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;broker&lt;/strong&gt; is the software that actually stores those tasks and moves them between producers and consumers. In practice, the broker exposes a set of primitives: append a message, pop a message, block and wait for a message, acknowledge a message, and so on. The queue is the logical structure you build on top of those primitives.&lt;/p&gt;

&lt;p&gt;I chose &lt;strong&gt;Redis&lt;/strong&gt; not only as the message broker for the queue but also as the storage for tracking task status, enabling real-time progress updates for the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Storing the task ID in Redis so we can look it up later
&lt;/span&gt;&lt;span class="n"&gt;task_id_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;TASK_KEY_PREFIX&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;playlist_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;redis_conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
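&lt;p&gt;The read side is symmetric: a status endpoint looks the task ID back up by playlist. Here is a sketch of that lookup (the helper and the stub are hypothetical; in the real app &lt;code&gt;redis_conn&lt;/code&gt; is a redis-py client):&lt;/p&gt;

```python
TASK_KEY_PREFIX = "task:"   # assumed value; mirrors the prefix used when storing

def get_task_id(redis_conn, playlist_id):
    # Look the Celery task ID back up for a given playlist.
    raw = redis_conn.get(f"{TASK_KEY_PREFIX}{playlist_id}")
    if raw is None:
        return None
    # redis-py returns bytes unless the client uses decode_responses=True
    return raw.decode() if isinstance(raw, bytes) else raw

class FakeRedis:
    # Minimal stand-in exposing only the two methods the sketch touches.
    def __init__(self):
        self.store = {}
    def set(self, key, value, ex=None):
        self.store[key] = value
    def get(self, key):
        return self.store.get(key)

conn = FakeRedis()
conn.set("task:PL123", b"9f8e-task-id", ex=7200)
```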



&lt;p&gt;The &lt;strong&gt;consumers&lt;/strong&gt;, a.k.a. the workers, are the ones that actually execute the tasks, taking them from the message broker. &lt;strong&gt;They run in a different process&lt;/strong&gt;; by running the consumer as a &lt;strong&gt;background worker&lt;/strong&gt;, or sometimes even on a &lt;strong&gt;separate&lt;/strong&gt; server, the main application stays fast and responsive.&lt;/p&gt;

&lt;p&gt;In my application, I am using Celery to manage these consumers. Celery wraps the complex logic of spawning worker processes, managing memory, and acknowledging messages, so I don't have to write that infrastructure code from scratch.&lt;/p&gt;

&lt;h4&gt;
  
  
  How they all tie up together
&lt;/h4&gt;

&lt;p&gt;The definition of a broker according to Wikipedia is "a person or entity that arranges transactions between a buyer and a seller." This is exactly the role Redis plays in this ecosystem.&lt;/p&gt;

&lt;p&gt;Technically, Redis is an in-memory data structure store. Because it lives entirely in RAM, it is incredibly fast. However, in this specific architecture, it acts as a "dumb" middleman. It has no idea what it is storing and doesn't understand the business logic of my application. Its only job is to accept tasks from the producer, hold them safely in memory, and facilitate adding or removing them with microsecond latency.&lt;/p&gt;

&lt;p&gt;On the other hand, Celery acts as the brain of the operation. It sits on top of Redis and provides the logic. Celery handles the &lt;strong&gt;serialization,&lt;/strong&gt;  manages the creation of worker processes, and monitors task health. If a worker fails, Celery is the one that detects it and executes the retry logic.&lt;/p&gt;

&lt;p&gt;If I were building this in other languages, the specific tools would change, but the roles would remain exactly the same:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In Node.js:&lt;/strong&gt;  You might use &lt;strong&gt;BullMQ&lt;/strong&gt; with &lt;strong&gt;Redis.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In Go:&lt;/strong&gt;  You might use &lt;strong&gt;Asynq&lt;/strong&gt; with &lt;strong&gt;Redis&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  My Approach
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxoiikxl7nf4gb0x504p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxoiikxl7nf4gb0x504p.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The architecture I implemented with the task queues and background workers is based on the Fan-Out Pattern (also known as the Orchestrator Pattern). Basically, there are two roles: a Manager who plans the work, and the Workers who do the work. When the user clicks "Index Playlist," the backend triggers just one single task: &lt;code&gt;index_playlist_task&lt;/code&gt;. This task doesn't actually download any transcripts. Its job is to act as the foreman. It connects to the YouTube API, fetches the list of all videos in the playlist, and calculates exactly what needs to be done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@celery.task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index_playlist_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;playlist_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;playlist_title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;credentials_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...):&lt;/span&gt;

    &lt;span class="c1"&gt;# 1. The Manager fetches the "To-Do List" from YouTube
&lt;/span&gt;    &lt;span class="n"&gt;videos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_playlist_videos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;playlist_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the manager has the list of videos (assume 500), it creates a "Signature" (a task blueprint) for every single video. It packages these hundreds of signatures into a Celery &lt;strong&gt;Group&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the critical moment where the interaction with &lt;strong&gt;Redis&lt;/strong&gt; happens.&lt;/p&gt;

&lt;p&gt;When the code executes &lt;code&gt;job_group.apply_async()&lt;/code&gt;, Celery takes those 500 task signatures, serializes them (turns them into JSON messages), and &lt;strong&gt;blasts them into the Redis queue&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tasks_to_run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. The Manager prepares orders for the workers
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;videos&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# .s() creates a "signature" - a task message that is ready to be sent
&lt;/span&gt;        &lt;span class="n"&gt;tasks_to_run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_video_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;s&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tasks_to_run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# 3. THE FAN-OUT
&lt;/span&gt;        &lt;span class="c1"&gt;# 'group' bundles all these tasks together
&lt;/span&gt;        &lt;span class="n"&gt;job_group&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tasks_to_run&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 4. SEND TO REDIS
&lt;/span&gt;        &lt;span class="c1"&gt;# This single line pushes hundreds of messages into the Redis list instantly.
&lt;/span&gt;        &lt;span class="n"&gt;result_group&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;job_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply_async&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this specific millisecond, the Redis list (the queue) spikes. It goes from having 0 items to having 500 pending jobs.&lt;/p&gt;

&lt;p&gt;Redis acts as the high-speed buffer here. It holds these 500 serialized messages safely in memory. It doesn't care that they are video processing tasks; it just knows it has 500 items that need to be popped off the list.&lt;/p&gt;

&lt;p&gt;Now, my fleet of Celery workers (running on the server) wakes up. They see the queue in Redis is full and start "popping" tasks off the list as fast as they can. To make this truly efficient, I utilize &lt;strong&gt;Green Threads&lt;/strong&gt; (via Gevent).&lt;/p&gt;
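&lt;p&gt;To see why overlapping the waits helps so much, here is the same effect simulated with plain OS threads instead of gevent: ten fake 0.05-second "downloads" finish in roughly the time of one, because the waiting overlaps.&lt;/p&gt;

```python
import threading
import time

results = []

def fake_download(i):
    time.sleep(0.05)   # stand-in for waiting on the network
    results.append(i)

start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start   # roughly 0.05 s, not 0.5 s
```
&lt;p&gt;Gevent gets the same overlap with far cheaper green threads, which is how a single worker process can keep dozens of transcript downloads in flight.&lt;/p&gt;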

&lt;p&gt;Each worker runs this isolated logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@celery.task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_video_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# The worker pops one message from Redis and processes just that ONE video
&lt;/span&gt;    &lt;span class="n"&gt;transcript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_video_transcript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;index_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;video_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By "fanning out" the workload and using Green Threads to handle the I/O waiting, I can index a massive playlist in a fraction of the time.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Result
&lt;/h3&gt;

&lt;p&gt;After revamping the architecture, the math completely changed.&lt;/p&gt;

&lt;p&gt;Because I was processing videos with multiple workers, I was no longer limited to the speed of a single thread. I configured my system to handle &lt;strong&gt;50 tasks concurrently&lt;/strong&gt;. Since downloading transcripts is mostly waiting on the network, my server could easily juggle those 50 requests at almost the same time.&lt;/p&gt;

&lt;p&gt;Now the math is,&lt;br&gt;
&lt;strong&gt;(1,000 videos / 50 concurrent tasks) × 1.5 seconds ≈ 30 seconds&lt;/strong&gt;&lt;br&gt;
Compared to the old,&lt;br&gt;
&lt;strong&gt;1,000 videos × 1.5 seconds/video = 1,500 seconds (≈ 25 minutes)&lt;/strong&gt;&lt;/p&gt;
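&lt;p&gt;A quick back-of-the-envelope check of those figures:&lt;/p&gt;

```python
videos = 1000
seconds_per_video = 1.5
concurrency = 50

sequential = videos * seconds_per_video                  # one after another
fanned_out = (videos / concurrency) * seconds_per_video  # ideal fan-out case
```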

&lt;p&gt;These numbers are just approximate figures based on average response times, but the speed did &lt;strong&gt;increase&lt;/strong&gt; considerably. The application went from choking on a 1,000-video playlist to indexing a massive &lt;strong&gt;3,000-video playlist effortlessly&lt;/strong&gt;. Truly, task queues for the win.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The project is fully deployed on &lt;strong&gt;GCP&lt;/strong&gt;, and you can try the live demo here: &lt;a href="https://yts-88.com" rel="noopener noreferrer"&gt;https://yts-88.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A quick note on stability:&lt;/em&gt;  YouTube blocks transcript requests coming from data center IPs (like GCP), flagging them as bots. To bypass this, I implemented a &lt;strong&gt;residential routing proxy using Webshare&lt;/strong&gt;, which worked flawlessly. However, maintaining residential proxies became too expensive for a side project, so I am currently trying to find some cost-effective workarounds.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>architecture</category>
      <category>python</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
