<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Akshaya</title>
    <description>The latest articles on Forem by Akshaya (@akshaya-dev).</description>
    <link>https://forem.com/akshaya-dev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3791154%2F00b84767-95de-4cda-9bb8-36491c452399.png</url>
      <title>Forem: Akshaya</title>
      <link>https://forem.com/akshaya-dev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akshaya-dev"/>
    <language>en</language>
    <item>
      <title>Efficient Parallelism in Python: A Practical Guide to concurrent.futures module</title>
      <dc:creator>Akshaya</dc:creator>
      <pubDate>Thu, 05 Mar 2026 12:55:38 +0000</pubDate>
      <link>https://forem.com/akshaya-dev/efficient-parallelism-in-python-a-practical-guide-to-concurrentfutures-package-i2a</link>
      <guid>https://forem.com/akshaya-dev/efficient-parallelism-in-python-a-practical-guide-to-concurrentfutures-package-i2a</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8eg1xpef63tlduhmn4p.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8eg1xpef63tlduhmn4p.jpg" alt="Modern data center processors capable of handling heavy CPU-bound workloads and concurrent data ingestion pipelines" width="800" height="599"&gt;&lt;/a&gt;&lt;br&gt;
Modern computing environments are built on multi-core processors, yet Python’s default execution model is single threaded due to the &lt;strong&gt;Global Interpreter Lock (GIL)&lt;/strong&gt;. To achieve high concurrency, the &lt;code&gt;concurrent.futures&lt;/code&gt; module can be utilized to distribute tasks across multiple execution units.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the basics
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Concurrency&lt;/strong&gt; - It is the ability of a system to handle multiple tasks while making progress on the tasks over time. The tasks may not run simultaneously, but the system switches between tasks efficiently such that progress is made on each task.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Parallelism&lt;/strong&gt; - It is the simultaneous execution of multiple tasks using multiple CPU cores.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Process&lt;/strong&gt; - An instance of a running program. Each process has its own memory space &amp;amp; CPU resources and runs independently from other processes. Processes can communicate with each other using pipes, queues, or shared memory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Thread&lt;/strong&gt; - The smallest unit of execution within a process. When a process starts, a main thread is created automatically. Additional threads can be created to achieve task concurrency within the same process and they share the same memory space within that process. Also, threads are lightweight compared to processes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global Interpreter Lock (GIL)&lt;/strong&gt; - A mutex (as in mutual exclusion) lock in CPython that allows only one thread to execute Python bytecode at a time.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Understanding the Bottleneck: CPU-Bound vs I/O-Bound
&lt;/h2&gt;

&lt;p&gt;Before choosing an executor, we must identify the nature of our task's latency.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;I/O-Bound:&lt;/strong&gt; The program spends most of its time waiting for external responses (Network requests, Database queries, Disk Read/Write).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU-Bound:&lt;/strong&gt; The program spends its time performing computations on the CPU (Mathematical calculations, Image processing, Data transformations).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Executors in &lt;code&gt;concurrent.futures&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Python provides two executors for parallel and concurrent execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  ThreadPoolExecutor
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; manages a pool of worker threads and distributes tasks among them so multiple operations can run &lt;strong&gt;concurrently&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is suitable for &lt;strong&gt;I/O-bound tasks&lt;/strong&gt; because when one thread waits for a network or I/O response, the GIL is released, allowing another thread to start its request.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Example: Data ingestion from multiple APIs&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;code&gt;max_workers&lt;/code&gt;&lt;strong&gt;:&lt;/strong&gt; We set &lt;code&gt;max_workers&lt;/code&gt; to 4. So, the &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; uses at most 4 threads to perform the data ingestion process concurrently. In our case, each thread is assigned to one of the four APIs and performs data ingestion.&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;executor.map()&lt;/code&gt;&lt;strong&gt;:&lt;/strong&gt; Concurrent version of &lt;code&gt;map()&lt;/code&gt; function. It applies a given function to all items in an iterable, executing the calls concurrently using a pool of worker threads.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ingest_source&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# data ingestion and cloud storage happens here
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;source_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ingested successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;concurrent_ingest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ingest_source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;api_sources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orders_api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inventory_api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payments_api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Initiating Data Ingestion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;final_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;concurrent_ingest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_sources&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Data Ingestion Complete: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;final_results&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Expected Output:&lt;/strong&gt; Since this is an I/O-bound task, we will see all four "Initiating" messages appear almost instantly. The total execution time would be determined by the single slowest API call (e.g., ~2 seconds), rather than the sum of all four calls (8+ seconds).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ProcessPoolExecutor
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This creates entirely separate instances (processes) of the Python interpreter, each with its own memory space and its own GIL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is suitable for &lt;strong&gt;CPU-bound tasks&lt;/strong&gt; because separate processes allow the program to utilize multiple CPU cores and achieve &lt;strong&gt;true parallelism&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; It can incur high memory overhead and serialization costs to move data between processes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Example scenario: Calculating the sum of factorials for the input number&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProcessPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_int_max_str_digits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;factorial_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Heavy math: Calculating the sum of factorials for the input number
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;factorial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;number&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; -&amp;gt; Result length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; digits&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;concurrent_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ProcessPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factorial_sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;data_shards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2700&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2800&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;final_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;factorial_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data_shards&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   
    &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Time taken (without ProcessPoolExecutor): &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;final_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;process_pool_start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;process_pool_final_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;concurrent_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_shards&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   
    &lt;span class="n"&gt;process_pool_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;process_pool_start_time&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Time taken (with ProcessPoolExecutor): &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;process_pool_duration&lt;/span&gt; &lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;process_pool_final_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Output:

Time taken (without ProcessPoolExecutor): 1.00 seconds

Input 2500 -&amp;gt; Result length: 7408 digits
Input 2600 -&amp;gt; Result length: 7749 digits
Input 2700 -&amp;gt; Result length: 8091 digits
Input 2800 -&amp;gt; Result length: 8435 digits

Time taken (with ProcessPoolExecutor): 0.62 seconds

Input 2500 -&amp;gt; Result length: 7408 digits
Input 2600 -&amp;gt; Result length: 7749 digits
Input 2700 -&amp;gt; Result length: 8091 digits
Input 2800 -&amp;gt; Result length: 8435 digits
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;colgroup&gt;
&lt;col&gt;
&lt;col&gt;
&lt;col&gt;
&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;ThreadPoolExecutor&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;ProcessPoolExecutor&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Task Type&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;I/O-bound&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;CPU-bound&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Parallelism&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;No (Concurrent only)&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;Yes (True Parallel)&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Memory Usage&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;Low&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;High&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Data Sharing&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;Easy (Shared Memory)&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;Hard (Serialization)&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;&lt;strong&gt;Python Version&lt;/strong&gt;&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;3.2+&lt;/p&gt;&lt;/td&gt;
&lt;td colspan="1" rowspan="1"&gt;&lt;p&gt;3.2+&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Things to note
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;InterpreterPoolExecutor:&lt;/strong&gt; Introduced in Python v3.14, it is a subclass of &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; where each worker thread runs in its own isolated Python interpreter. This allows for &lt;strong&gt;true multi-core parallelism&lt;/strong&gt; (multiple GILs) while avoiding the heavy memory overhead and slow startup times associated with creating entirely new processes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The&lt;/strong&gt; &lt;code&gt;__main__&lt;/code&gt; &lt;strong&gt;Guard:&lt;/strong&gt; When using &lt;code&gt;ProcessPoolExecutor&lt;/code&gt; , you must wrap your entry point in &lt;code&gt;if __name__ == "__main__":&lt;/code&gt; to prevent infinite recursion during sub-process creation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The&lt;/strong&gt; &lt;code&gt;with&lt;/code&gt; &lt;strong&gt;statement:&lt;/strong&gt; Always use the &lt;code&gt;with&lt;/code&gt; statement. It ensures the threads are cleaned once block execution completes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Worker Limits:&lt;/strong&gt; For CPU-bound tasks, setting &lt;code&gt;max_workers&lt;/code&gt; significantly higher than your CPU core count will decrease performance due to context-switching overhead.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Understanding when to use threads versus processes is essential for writing efficient Python applications. By identifying whether a task is I/O-bound or CPU-bound, we can choose the appropriate executor and significantly improve performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.amazon.in/Fluent-Python-Effective-Programming-Grayscale/dp/9355420838?source=ps-sl-shoppingads-lpcontext&amp;amp;ref_=fplfs&amp;amp;psc=1&amp;amp;smid=AP6IZ69K79O66" rel="noopener noreferrer"&gt;Fluent Python&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.python.org/3/library/concurrent.futures.html" rel="noopener noreferrer"&gt;Python Documentation- Launching parallel tasks&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>backend</category>
      <category>programming</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
