<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Rachit Avasthi</title>
    <description>The latest articles on Forem by Rachit Avasthi (@rachit_avasthi).</description>
    <link>https://forem.com/rachit_avasthi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1226671%2F2ff776b8-faed-4313-a269-881fb742efff.png</url>
      <title>Forem: Rachit Avasthi</title>
      <link>https://forem.com/rachit_avasthi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/rachit_avasthi"/>
    <language>en</language>
    <item>
      <title>How Platforms Like Zomato, Swiggy, Uber, and Ola Update Rider’s Location in Real Time</title>
      <dc:creator>Rachit Avasthi</dc:creator>
      <pubDate>Fri, 13 Mar 2026 08:11:05 +0000</pubDate>
      <link>https://forem.com/rachit_avasthi/how-platforms-like-zomato-swiggy-uber-and-ola-update-riders-location-in-real-time-3ic5</link>
      <guid>https://forem.com/rachit_avasthi/how-platforms-like-zomato-swiggy-uber-and-ola-update-riders-location-in-real-time-3ic5</guid>
      <description>&lt;p&gt;When you order food on &lt;strong&gt;Zomato or Swiggy&lt;/strong&gt;, or book a ride on &lt;strong&gt;Uber or Ola&lt;/strong&gt;, you can see the rider or driver moving live on a map.&lt;br&gt;&lt;br&gt;
This real‑time tracking feels simple on the surface—but behind it lies a &lt;strong&gt;well‑orchestrated system of GPS, mobile apps, backend servers, and real‑time communication protocols&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this blog, we’ll break down &lt;strong&gt;how these platforms update a rider’s location in real time&lt;/strong&gt;, step by step, in a way that’s easy to understand—even if you’re not deeply technical.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;1. The Core Building Block: GPS on the Rider’s Phone&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Everything starts with the &lt;strong&gt;rider’s smartphone&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;How GPS Works&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  The rider’s phone continuously receives signals from &lt;strong&gt;GPS satellites&lt;/strong&gt; (it only listens; it never transmits to them)
&lt;/li&gt;
&lt;li&gt;  From the timing of those signals, the phone calculates:

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Latitude&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Longitude&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Speed&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Direction&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Accuracy Factors&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;GPS accuracy depends on:&lt;/p&gt;

&lt;ul&gt;
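&lt;li&gt;  Sky visibility (open sky vs dense, tall buildings that block or reflect signals)&lt;/li&gt;
&lt;li&gt;  Network quality (phones also use cell towers and Wi‑Fi to assist positioning)&lt;/li&gt;
&lt;li&gt;  Device sensors (accelerometer, gyroscope) that help fill gaps between GPS fixes&lt;/li&gt;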
&lt;/ul&gt;

&lt;p&gt;📍 Typically, accuracy ranges between &lt;strong&gt;5–20 meters&lt;/strong&gt;, which is good enough for real‑time tracking.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;2. Rider App Continuously Captures Location&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;delivery or driver app&lt;/strong&gt; (Zomato Delivery App, Uber Driver App, etc.) runs a &lt;strong&gt;background location service&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;What the App Does&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Fetches GPS coordinates every:

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;2–5 seconds (Uber/Ola)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;5–10 seconds (Zomato/Swiggy)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;  Adjusts frequency based on:

&lt;ul&gt;
&lt;li&gt;  Movement speed&lt;/li&gt;
&lt;li&gt;  Battery level&lt;/li&gt;
&lt;li&gt;  Ride/order state&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Smart Optimization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To save battery:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Location updates become less frequent when the rider is idle&lt;/li&gt;
&lt;li&gt;  Updates become more frequent when the rider is moving&lt;/li&gt;
&lt;/ul&gt;
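
&lt;p&gt;The adaptive behaviour described above can be sketched as a small function. This is only an illustration: the function name &lt;code&gt;choose_update_interval&lt;/code&gt;, the speed thresholds, and the battery cutoff are assumptions made for the example, not values published by any of these platforms.&lt;/p&gt;

```python
def choose_update_interval(speed_mps, battery_percent, on_active_trip):
    """Return how many seconds to wait before the next location fix.

    All thresholds here are illustrative assumptions.
    """
    if not on_active_trip:
        return 30.0              # no active order or ride: sample rarely
    if speed_mps >= 8.0:
        interval = 2.0           # moving fast: frequent updates
    elif speed_mps >= 1.0:
        interval = 5.0           # moving slowly
    else:
        interval = 10.0          # stationary during a trip, e.g. waiting at pickup
    if 20 > battery_percent:
        interval *= 2            # low battery: halve the update rate
    return interval


print(choose_update_interval(12.0, 80, True))   # 2.0
print(choose_update_interval(0.2, 15, True))    # 20.0
```

&lt;p&gt;Keeping the policy in one pure function makes it easy to tune the numbers without touching the code that talks to the GPS hardware.&lt;/p&gt;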


&lt;h2&gt;
  
  
  &lt;strong&gt;3. Sending Location to Backend Servers&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Once the app captures the location, it must &lt;strong&gt;send it to the platform’s servers&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;How Data Is Sent&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The app sends:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  riderId,
  latitude,
  longitude,
  timestamp,
  speed,
  heading
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;HTTPS (REST APIs)&lt;/strong&gt; for periodic updates&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;WebSockets / gRPC / MQTT&lt;/strong&gt; for real‑time streaming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ Secure&lt;br&gt;&lt;br&gt;
✅ Low latency&lt;br&gt;&lt;br&gt;
✅ Scalable&lt;/p&gt;
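
&lt;p&gt;As a rough sketch, the payload shown above could be assembled and serialized like this before it is sent over whichever transport the app uses. The helper name &lt;code&gt;build_location_payload&lt;/code&gt;, the rider ID, and the units (meters per second, degrees, epoch milliseconds) are assumptions for illustration.&lt;/p&gt;

```python
import json
import time


def build_location_payload(rider_id, latitude, longitude, speed_mps, heading_deg):
    """Bundle one GPS fix into a JSON string ready to send to the backend."""
    if abs(latitude) > 90 or abs(longitude) > 180:
        raise ValueError("coordinates out of range")
    payload = {
        "riderId": rider_id,
        "latitude": round(latitude, 6),
        "longitude": round(longitude, 6),
        "timestamp": int(time.time() * 1000),  # milliseconds since the Unix epoch
        "speed": speed_mps,
        "heading": heading_deg % 360,           # normalize to the 0-359 range
    }
    return json.dumps(payload)


message = build_location_payload("rider-42", 12.9716, 77.5946, 6.5, 370)
print(message)
```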




&lt;h2&gt;
  
  
  &lt;strong&gt;4. Real‑Time Communication: The Secret Sauce&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If the customer app polled the server every second, most requests would return nothing new while still costing bandwidth, battery, and server capacity.&lt;br&gt;&lt;br&gt;
Instead, platforms use &lt;strong&gt;real‑time messaging systems&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Common Technologies Used&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;WebSockets&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Firebase Realtime Database&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Kafka + WebSocket Gateway&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;AWS AppSync / Google Pub‑Sub&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why Real‑Time Tech Matters&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Server &lt;strong&gt;pushes updates instantly&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  No repeated API calls&lt;/li&gt;
&lt;li&gt;  Smooth map movement on the customer’s screen&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;5. Backend Systems Process Location Updates&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The backend doesn’t just forward location—it &lt;strong&gt;processes and optimizes it&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Backend Responsibilities&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Validate rider authenticity&lt;/li&gt;
&lt;li&gt;  Smooth noisy GPS signals&lt;/li&gt;
&lt;li&gt;  Snap location to roads (Map Matching)&lt;/li&gt;
&lt;li&gt;  Detect anomalies (GPS jumps)&lt;/li&gt;
&lt;/ul&gt;
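
&lt;p&gt;One simple way to detect a GPS jump is to compute the speed implied by two consecutive fixes and reject anything physically implausible. The sketch below uses the haversine formula for distance; the 50 m/s speed cap and the function names are assumptions chosen for the example.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6371000.0
MAX_PLAUSIBLE_SPEED_MPS = 50.0  # assumed cap, roughly 180 km/h


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_gps_jump(prev, curr):
    """prev and curr are (latitude, longitude, unix_seconds) tuples."""
    elapsed = curr[2] - prev[2]
    if not elapsed > 0:
        return True  # out-of-order or duplicate timestamp: treat as suspect
    distance = haversine_m(prev[0], prev[1], curr[0], curr[1])
    return distance / elapsed > MAX_PLAUSIBLE_SPEED_MPS


previous = (12.9716, 77.5946, 1000)
print(is_gps_jump(previous, (12.9717, 77.5946, 1005)))  # False: about 11 m in 5 s
print(is_gps_jump(previous, (12.9816, 77.5946, 1005)))  # True: over 1 km in 5 s
```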

&lt;h3&gt;
  
  
  &lt;strong&gt;Map Matching Example&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If GPS says the rider is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“10 meters inside a building”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The backend adjusts it to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Nearest road on Google Maps”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This makes movement look &lt;strong&gt;natural and realistic&lt;/strong&gt;.&lt;/p&gt;
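
&lt;p&gt;A heavily simplified version of this snapping step projects the noisy point onto each nearby road segment and keeps the closest result. The sketch assumes coordinates have already been converted to flat x/y meters and that roads are plain straight segments; production map matching also considers heading, road direction, and the previous positions.&lt;/p&gt;

```python
def snap_to_segment(point, seg_start, seg_end):
    """Project point onto a road segment; all values are (x, y) in meters."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return seg_start  # degenerate segment: both ends are the same point
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp so the result stays on the segment
    return (ax + t * dx, ay + t * dy)


def snap_to_nearest_road(point, segments):
    """Return the closest snapped position across all candidate segments."""
    def squared_distance(candidate):
        return (candidate[0] - point[0]) ** 2 + (candidate[1] - point[1]) ** 2

    return min((snap_to_segment(point, a, b) for a, b in segments), key=squared_distance)


roads = [((0.0, 0.0), (100.0, 0.0)), ((0.0, 50.0), (100.0, 50.0))]
print(snap_to_nearest_road((40.0, 10.0), roads))  # (40.0, 0.0)
```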




&lt;h2&gt;
  
  
  &lt;strong&gt;6. Customer App Receives Live Updates&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now comes the magic users actually see.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Customer App Flow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Customer opens order/ride screen&lt;/li&gt;
&lt;li&gt; App subscribes to rider’s location channel&lt;/li&gt;
&lt;li&gt; Backend pushes live coordinates&lt;/li&gt;
&lt;li&gt; Map updates smoothly every few seconds&lt;/li&gt;
&lt;/ol&gt;
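
&lt;p&gt;The subscribe-and-push flow above can be illustrated with a toy in-process hub. This is only a stand-in for a real channel such as a WebSocket gateway: the class name &lt;code&gt;LocationHub&lt;/code&gt; and the rider IDs are hypothetical, and no networking is involved.&lt;/p&gt;

```python
from collections import defaultdict


class LocationHub:
    """Toy in-process stand-in for a real-time location channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, rider_id, callback):
        """Register a callback to run whenever this rider's location changes."""
        self._subscribers[rider_id].append(callback)

    def publish(self, rider_id, latitude, longitude):
        """Push a new coordinate to every subscriber of this rider."""
        for callback in self._subscribers[rider_id]:
            callback(latitude, longitude)


received = []
hub = LocationHub()
hub.subscribe("rider-42", lambda lat, lon: received.append((lat, lon)))
hub.publish("rider-42", 12.9716, 77.5946)   # delivered to the subscriber
hub.publish("rider-7", 28.6139, 77.2090)    # nobody subscribed, so nothing happens
print(received)                              # [(12.9716, 77.5946)]
```

&lt;p&gt;The key point is that the customer app registers interest once and then simply reacts to pushes, rather than asking repeatedly.&lt;/p&gt;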

&lt;h3&gt;
  
  
  &lt;strong&gt;Smooth Animations&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Instead of jumping points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Apps &lt;strong&gt;interpolate movement&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  Animate markers&lt;/li&gt;
&lt;li&gt;  Predict next position using speed &amp;amp; direction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates the illusion of &lt;strong&gt;continuous motion&lt;/strong&gt;.&lt;/p&gt;
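
&lt;p&gt;Interpolation here can be as simple as splitting the gap between the last known position and the newly received one into small steps, then drawing the marker at each step. The sketch below uses straight-line (linear) interpolation with made-up coordinates; the function name is an assumption, and real apps may follow the road geometry instead.&lt;/p&gt;

```python
def interpolate_positions(start, end, steps):
    """Return evenly spaced (lat, lon) points from start to end, ending at end."""
    if not steps >= 1:
        raise ValueError("steps must be at least 1")
    points = []
    for i in range(1, steps + 1):
        fraction = i / steps
        lat = start[0] + (end[0] - start[0]) * fraction
        lon = start[1] + (end[1] - start[1]) * fraction
        points.append((lat, lon))
    return points


# One new fix arrives; the app renders four intermediate frames instead of one jump.
frames = interpolate_positions((10.0, 20.0), (12.0, 24.0), 4)
print(frames)  # [(10.5, 21.0), (11.0, 22.0), (11.5, 23.0), (12.0, 24.0)]
```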




&lt;h2&gt;
  
  
  &lt;strong&gt;7. Maps Integration (Google Maps / Mapbox)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;These platforms don’t build maps from scratch.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Maps Providers&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Google Maps&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Mapbox&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;OpenStreetMap (with custom layers)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What Maps APIs Do&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Render roads and buildings&lt;/li&gt;
&lt;li&gt;  Calculate ETA&lt;/li&gt;
&lt;li&gt;  Suggest shortest routes&lt;/li&gt;
&lt;li&gt;  Handle traffic data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📍 Real‑time traffic also helps recalculate ETA dynamically.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;8. Handling Network &amp;amp; GPS Failures&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Real‑world conditions are messy.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Common Problems&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Poor internet&lt;/li&gt;
&lt;li&gt;  GPS signal loss&lt;/li&gt;
&lt;li&gt;  Phone battery saver mode&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How Apps Handle This&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Cache last known location&lt;/li&gt;
&lt;li&gt;  Predict movement temporarily&lt;/li&gt;
&lt;li&gt;  Display “Updating location…” messages&lt;/li&gt;
&lt;li&gt;  Fall back to lower‑frequency updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures the experience doesn’t completely break.&lt;/p&gt;
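
&lt;p&gt;A minimal sketch of this fallback logic decides what to show based on how old the last received fix is. The function name, the status strings, and the 15 and 60 second thresholds are assumptions for illustration.&lt;/p&gt;

```python
def tracking_status(last_fix_time, now, stale_after=15.0, lost_after=60.0):
    """Classify the cached location by its age in seconds."""
    age = now - last_fix_time
    if age >= lost_after:
        return "location unavailable"   # too old to trust at all
    if age >= stale_after:
        return "updating location"      # show the cached point with a notice
    return "live"


print(tracking_status(100.0, 105.0))   # live
print(tracking_status(100.0, 120.0))   # updating location
print(tracking_status(100.0, 200.0))   # location unavailable
```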




&lt;h2&gt;
  
  
  &lt;strong&gt;9. Security &amp;amp; Privacy Considerations&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Tracking users is sensitive, so strict rules apply.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Privacy Safeguards&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Location tracking only during active orders/rides&lt;/li&gt;
&lt;li&gt;  Data encrypted in transit&lt;/li&gt;
&lt;li&gt;  Automatic tracking stop after completion&lt;/li&gt;
&lt;li&gt;  Compliance with GDPR &amp;amp; local laws&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🚫 Riders are &lt;strong&gt;not tracked 24/7&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;10. High‑Level Architecture Summary&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rider Phone
   ↓ (GPS)
Rider App
   ↓ (WebSocket / API)
Backend Servers
   ↓ (Real-time Push)
Customer App
   ↓
Live Map View
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Simple to look at, but &lt;strong&gt;highly optimized at scale&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Why This Matters at Scale&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Platforms like Uber or Swiggy handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Millions of location updates per second&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Low latency requirements&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Global scale traffic&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even a &lt;strong&gt;1‑second delay&lt;/strong&gt; can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Increase customer anxiety&lt;/li&gt;
&lt;li&gt;  Cause missed pickups&lt;/li&gt;
&lt;li&gt;  Affect trust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s why this system is &lt;strong&gt;engineered with precision&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Final Thoughts&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Real‑time rider tracking is a &lt;strong&gt;perfect blend of mobile sensors, networking, backend engineering, and UX design&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What looks like a small moving dot on your screen is actually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  GPS satellites&lt;/li&gt;
&lt;li&gt;  Background services&lt;/li&gt;
&lt;li&gt;  Real‑time streaming&lt;/li&gt;
&lt;li&gt;  Distributed systems&lt;/li&gt;
&lt;li&gt;  Map intelligence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All working together—seamlessly.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>beginners</category>
      <category>mobile</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Fixing PySpark “Cannot run program python3” Error on Windows</title>
      <dc:creator>Rachit Avasthi</dc:creator>
      <pubDate>Fri, 19 Dec 2025 04:56:25 +0000</pubDate>
      <link>https://forem.com/rachit_avasthi/fixing-pyspark-cannot-run-program-python3-error-on-windows-2h17</link>
      <guid>https://forem.com/rachit_avasthi/fixing-pyspark-cannot-run-program-python3-error-on-windows-2h17</guid>
      <description>&lt;p&gt;When running PySpark on Windows, many beginners (and even experienced developers) encounter the following error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java.io.IOException: Cannot run program "python3":
CreateProcess error=2, The system cannot find the file specified

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This article explains &lt;strong&gt;why this error happens&lt;/strong&gt;, &lt;strong&gt;why one solution works and another doesn’t&lt;/strong&gt;, and the &lt;strong&gt;correct, professional way to fix it permanently&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding the Problem
&lt;/h2&gt;

&lt;p&gt;Apache Spark is written in &lt;strong&gt;Java/Scala&lt;/strong&gt;, but PySpark allows us to write Spark applications in &lt;strong&gt;Python&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When Spark executes Python code, it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Starts the &lt;strong&gt;JVM (Java Virtual Machine)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Spawns a &lt;strong&gt;Python worker process&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Communicates between Java and Python using &lt;strong&gt;Py4J&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By default, Spark tries to launch a Python executable named:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works on &lt;strong&gt;Linux and macOS&lt;/strong&gt;, but on &lt;strong&gt;Windows&lt;/strong&gt;, the Python executable is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python.exe

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since a standard Windows Python installation usually does not put a &lt;code&gt;python3&lt;/code&gt; command on the PATH, Spark fails to start the Python worker and the job crashes.&lt;/p&gt;
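
&lt;p&gt;To see what your own machine resolves, a short check with the standard library can help. This sketch only inspects the PATH of the current process; the function name is an assumption.&lt;/p&gt;

```python
import shutil
import sys


def describe_python_commands():
    """Report which interpreter names resolve on PATH for the current process."""
    return {name: shutil.which(name) for name in ("python3", "python")}


commands = describe_python_commands()
for name, path in commands.items():
    print(f"{name}: {path or 'not found on PATH'}")
print("current interpreter:", sys.executable)
```

&lt;p&gt;If &lt;code&gt;python3&lt;/code&gt; shows as not found while &lt;code&gt;python&lt;/code&gt; resolves, you are in the situation this article describes.&lt;/p&gt;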




&lt;h2&gt;
  
  
  Why Setting Python Inside the Code Works
&lt;/h2&gt;

&lt;p&gt;A common workaround is setting the Python executable directly in the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PYSPARK_PYTHON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C:\Users\User\Desktop\Training\Week5\venv\Scripts\python.exe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PYSPARK_DRIVER_PYTHON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C:\Users\User\Desktop\Training\Week5\venv\Scripts\python.exe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pyspark.sql&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SparkSession&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The environment variables are set &lt;strong&gt;before SparkSession is created&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Spark reads these variables when it launches the JVM and the Python workers&lt;/li&gt;
&lt;li&gt;The correct Python interpreter is used&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, this approach is &lt;strong&gt;not ideal&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The same code must be repeated in every PySpark file&lt;/li&gt;
&lt;li&gt;Scripts become cluttered&lt;/li&gt;
&lt;li&gt;It is not how Spark is configured in real-world projects&lt;/li&gt;
&lt;/ul&gt;
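
&lt;p&gt;If you do keep the in-script workaround for a while, one variation avoids the hardcoded path by reusing the interpreter that is already running the script. This is a sketch, not an official PySpark recipe; the helper name &lt;code&gt;point_pyspark_at&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import os
import sys


def point_pyspark_at(interpreter, env=None):
    """Set both PySpark interpreter variables and return the updated mapping."""
    env = os.environ if env is None else env
    env["PYSPARK_PYTHON"] = interpreter
    env["PYSPARK_DRIVER_PYTHON"] = interpreter
    return env


# Call this before the first SparkSession is created.
point_pyspark_at(sys.executable)
print(os.environ["PYSPARK_PYTHON"])
```

&lt;p&gt;Because &lt;code&gt;sys.executable&lt;/code&gt; follows whichever virtual environment is active, the same lines work unchanged on another machine, though they still have to be repeated in every script.&lt;/p&gt;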




&lt;h2&gt;
  
  
  Why the PowerShell Method Often Fails
&lt;/h2&gt;

&lt;p&gt;You may try setting environment variables in PowerShell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PYSPARK_PYTHON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"C:\path\to\python.exe"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PYSPARK_DRIVER_PYTHON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"C:\path\to\python.exe"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;python&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Lab1.py&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The commands above do work in the terminal where they were typed, but the approach is fragile because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Variables set with &lt;code&gt;$env:&lt;/code&gt; are &lt;strong&gt;session-scoped&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Closing the terminal clears them&lt;/li&gt;
&lt;li&gt;Running Spark from another terminal or editor loses them&lt;/li&gt;
&lt;li&gt;Spark must see these variables &lt;strong&gt;before the JVM starts&lt;/strong&gt;, so setting them after a session is already running has no effect
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes the approach unreliable for long-term use.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Correct and Permanent Solution (Best Practice)
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;recommended and professional solution&lt;/strong&gt; is to set these variables as &lt;strong&gt;persistent Windows environment variables&lt;/strong&gt; for your user account.&lt;/p&gt;

&lt;p&gt;This ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spark always knows which Python to use&lt;/li&gt;
&lt;li&gt;No code changes are required&lt;/li&gt;
&lt;li&gt;Works across all projects and terminals&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step-by-Step: Setting Environment Variables on Windows
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Open Environment Variables
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Press &lt;strong&gt;Windows + R&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Type:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sysdm.cpl

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Go to the &lt;strong&gt;Advanced&lt;/strong&gt; tab&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Environment Variables&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Step 2: Add User Environment Variables
&lt;/h3&gt;

&lt;p&gt;Under &lt;strong&gt;User variables&lt;/strong&gt;, click &lt;strong&gt;New&lt;/strong&gt; and add the following:&lt;/p&gt;

&lt;h3&gt;
  
  
  Variable 1
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: &lt;code&gt;PYSPARK_PYTHON&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Value&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\Users\User\Desktop\Training\Week5\venv\Scripts\python.exe

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Variable 2
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt;: &lt;code&gt;PYSPARK_DRIVER_PYTHON&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Value&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\Users\User\Desktop\Training\Week5\venv\Scripts\python.exe

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Click &lt;strong&gt;OK&lt;/strong&gt; on all windows.&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 3: Restart Your Terminal (Very Important)
&lt;/h3&gt;

&lt;p&gt;A process reads its environment variables when it starts, so terminals that are already open will not see the new values.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Close &lt;strong&gt;all&lt;/strong&gt; PowerShell / CMD / VS Code terminals&lt;/li&gt;
&lt;li&gt;Open a new PowerShell&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Activate your virtual environment:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;\venv\Scripts\activate&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Step 4: Verify the Setup
&lt;/h3&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;PYSPARK_PYTHON&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the command prints your Python path, any Spark process started from this terminal will see it too.&lt;/p&gt;
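
&lt;p&gt;The same check can be done from Python, which also confirms that each path points at a real file. This is a small sketch using only the standard library; the function name &lt;code&gt;check_pyspark_env&lt;/code&gt; is an assumption.&lt;/p&gt;

```python
import os


def check_pyspark_env(env):
    """Return a list of problems found with the PySpark interpreter variables."""
    problems = []
    for name in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isfile(value):
            problems.append(f"{name} points to a missing file: {value}")
    return problems


issues = check_pyspark_env(os.environ)
print("OK" if not issues else "\n".join(issues))
```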




&lt;h2&gt;
  
  
  Clean PySpark Code (After Fix)
&lt;/h2&gt;

&lt;p&gt;Once the environment is set, your PySpark script stays clean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pyspark.sql&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SparkSession&lt;/span&gt;

&lt;span class="n"&gt;spark&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SparkSession&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getOrCreate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Laptop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Electronics&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mobile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Electronics&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tablet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Electronics&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Headphones&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accessories&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Keyboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accessories&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createDataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No environment setup code is required anymore.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Spark defaults to &lt;code&gt;python3&lt;/code&gt;, a command that is usually missing from the PATH on Windows&lt;/li&gt;
&lt;li&gt;Setting Python inside the script works but is not scalable&lt;/li&gt;
&lt;li&gt;PowerShell environment variables are temporary&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows Environment Variables are the correct solution&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Always configure Spark &lt;strong&gt;outside your code&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;If you are learning Spark on Windows, this configuration step is &lt;strong&gt;mandatory&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once set correctly, PySpark becomes stable and predictable for local development.&lt;/p&gt;

&lt;p&gt;Happy Spark learning 🚀&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>pyspark</category>
      <category>spark</category>
    </item>
    <item>
      <title>Fixing PySpark on Windows: Downgrading from Python 3.13 to 3.11 (Complete Guide)</title>
      <dc:creator>Rachit Avasthi</dc:creator>
      <pubDate>Fri, 19 Dec 2025 04:55:11 +0000</pubDate>
      <link>https://forem.com/rachit_avasthi/fixing-pyspark-on-windows-downgrading-from-python-313-to-311-complete-guide-12d6</link>
      <guid>https://forem.com/rachit_avasthi/fixing-pyspark-on-windows-downgrading-from-python-313-to-311-complete-guide-12d6</guid>
      <description>&lt;p&gt;If you’re trying to run &lt;strong&gt;PySpark on Windows&lt;/strong&gt; with &lt;strong&gt;Python 3.13&lt;/strong&gt;, you’ll quickly run into errors like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AttributeError: module 'socketserver' has no attribute 'UnixStreamServer'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This can be frustrating—especially when your code is perfectly fine.&lt;/p&gt;

&lt;p&gt;In this post, I’ll walk you through a &lt;strong&gt;complete, working setup&lt;/strong&gt; for PySpark on Windows by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installing &lt;strong&gt;Python 3.11 alongside Python 3.13&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Creating a clean virtual environment&lt;/li&gt;
&lt;li&gt;Installing a &lt;strong&gt;compatible PySpark version&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Optionally fixing Windows-specific Spark warnings using &lt;strong&gt;winutils.exe&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup is stable, beginner-friendly, and recommended for learning and local development.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why PySpark Fails with Python 3.13
&lt;/h2&gt;

&lt;p&gt;The problem isn’t your code—it’s compatibility.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;PySpark 3.5.x does not officially support Python 3.13&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PySpark 4.x has known issues on Windows&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Some internal APIs were removed in Python 3.13 that PySpark still relies on&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ The correct combination on Windows is:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Python 3.11&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PySpark 3.5.x&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Java 8, 11, or 17&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
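
&lt;p&gt;A quick way to confirm which Python a script is actually running under is to compare it against the version this guide targets. The sketch below checks only the Python side (not PySpark or Java), and the names &lt;code&gt;RECOMMENDED&lt;/code&gt; and &lt;code&gt;matches_recommended&lt;/code&gt; are assumptions for the example.&lt;/p&gt;

```python
import sys

RECOMMENDED = (3, 11)  # the Python version this guide targets


def matches_recommended(version_info, recommended=RECOMMENDED):
    """Compare the major and minor version against the recommended pair."""
    return tuple(version_info[:2]) == recommended


if matches_recommended(sys.version_info):
    print("Running on Python 3.11, as recommended in this guide")
else:
    print(f"Running on Python {sys.version_info[0]}.{sys.version_info[1]}, not 3.11")
```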




&lt;h2&gt;
  
  
  Step 1: Install Python 3.11 (Side-by-Side)
&lt;/h2&gt;

&lt;p&gt;Do &lt;strong&gt;not&lt;/strong&gt; uninstall Python 3.13. Instead, install Python 3.11 alongside it.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Download Python 3.11 (64-bit):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.python.org/downloads/release/python-3119/" rel="noopener noreferrer"&gt;https://www.python.org/downloads/release/python-3119/&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Run the installer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check &lt;strong&gt;“Add Python to PATH”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Customize installation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Enable &lt;strong&gt;“Install for all users”&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finish the installation&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verify it worked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--version&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python 3.11.x

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2: Allow Virtual Environment Activation in PowerShell
&lt;/h2&gt;

&lt;p&gt;By default, PowerShell on Windows blocks script execution, which prevents the virtual environment’s &lt;code&gt;Activate.ps1&lt;/code&gt; script from running.&lt;/p&gt;

&lt;p&gt;Run this once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Set-ExecutionPolicy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;RemoteSigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Scope&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;CurrentUser&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Press &lt;strong&gt;Y&lt;/strong&gt; to confirm.&lt;/p&gt;

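&lt;p&gt;&lt;code&gt;RemoteSigned&lt;/code&gt; lets scripts created on your machine run while still requiring a signature on scripts downloaded from the internet, and &lt;code&gt;-Scope CurrentUser&lt;/code&gt; limits the change to your user account.&lt;/p&gt;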




&lt;h2&gt;
  
  
  Step 3: Create a Python 3.11 Virtual Environment
&lt;/h2&gt;

&lt;p&gt;Navigate to your project directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\Users\User\Desktop\Training\Week5&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



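&lt;p&gt;Remove any old virtual environment (skip this step if no &lt;code&gt;venv&lt;/code&gt; folder exists, otherwise the command reports an error):&lt;br&gt;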
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;venv&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a new one using Python 3.11:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-m&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;venv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;venv&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Activate it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;venv\Scripts\activate&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(venv)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Confirm the Python version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;python&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--version&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python 3.11.x

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
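
&lt;p&gt;The &lt;code&gt;(venv)&lt;/code&gt; prefix is only a prompt decoration, so it can help to confirm activation from Python itself. The sketch below relies on the standard-library behaviour that &lt;code&gt;sys.prefix&lt;/code&gt; and &lt;code&gt;sys.base_prefix&lt;/code&gt; differ inside a virtual environment; the function name &lt;code&gt;in_virtual_env&lt;/code&gt; is just an illustrative choice.&lt;/p&gt;

```python
import sys


def in_virtual_env(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Inside a venv, sys.prefix points at the venv folder while
    sys.base_prefix still points at the base installation."""
    return prefix != base_prefix


# Show which interpreter is running and whether it lives in a venv.
print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_env())
```

&lt;p&gt;With the environment active, the interpreter path should sit inside your project's &lt;code&gt;venv\Scripts&lt;/code&gt; folder.&lt;/p&gt;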






&lt;h2&gt;
  
  
  Step 4: Install the Correct PySpark Version
&lt;/h2&gt;

&lt;p&gt;Do &lt;strong&gt;not&lt;/strong&gt; install the latest PySpark blindly. An unpinned install pulls whatever release is newest, which may be a 4.x version.&lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ Avoid
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;install&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;pyspark&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ✅ Install the Windows-safe version
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;install&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;pyspark&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;1&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;show&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;pyspark&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Version: 3.5.1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
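
&lt;p&gt;The same check can be scripted, which is handy if you rebuild the environment often. This is a small sketch using the standard-library &lt;code&gt;importlib.metadata&lt;/code&gt; module; &lt;code&gt;EXPECTED&lt;/code&gt; and &lt;code&gt;installed_version&lt;/code&gt; are names made up for this example.&lt;/p&gt;

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "3.5.1"  # the pin used in the install command above


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("pyspark")
if found is None:
    print("pyspark is not installed in this environment")
elif found == EXPECTED:
    print(f"pyspark {found} matches the pin")
else:
    print(f"pyspark {found} installed, expected {EXPECTED}")
```

&lt;p&gt;Run it with the virtual environment active, otherwise it inspects whichever Python happens to be first on your PATH.&lt;/p&gt;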






&lt;h2&gt;
  
  
  Step 5: Test Your Spark Setup
&lt;/h2&gt;

&lt;p&gt;Create a file called &lt;code&gt;Lab1.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pyspark.sql&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SparkSession&lt;/span&gt;

&lt;span class="n"&gt;spark&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SparkSession&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;appName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getOrCreate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;python&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Lab1.py&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the output is a single &lt;code&gt;id&lt;/code&gt; column listing &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;9&lt;/code&gt;, Spark is running successfully 🎉&lt;/p&gt;




&lt;h2&gt;
  
  
  Common Mistakes to Avoid
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Running Python explicitly from 3.13:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;C:\...\Python313\python.exe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Lab1.py&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Installing PySpark 4.x on Windows&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using Python 3.12 or newer with Spark&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Forgetting to activate the virtual environment&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Golden Rule
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;When &lt;code&gt;(venv)&lt;/code&gt; is active, always run &lt;code&gt;python&lt;/code&gt;, never a full Python path.&lt;/p&gt;
&lt;/blockquote&gt;
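
&lt;p&gt;One optional way to apply the same rule inside a script is to tell Spark explicitly which interpreter to use for its Python worker processes. Spark reads the &lt;code&gt;PYSPARK_PYTHON&lt;/code&gt; environment variable for this. The sketch below is an addition to the steps in this post, not a required part of them, and &lt;code&gt;pin_spark_python&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import os
import sys


def pin_spark_python(env=None, executable=sys.executable):
    """Point Spark's Python workers at the given interpreter.

    By default this is the interpreter running the script, which is the
    venv's python when the environment is active.
    """
    env = os.environ if env is None else env
    env["PYSPARK_PYTHON"] = executable
    return env


# Call this before SparkSession.builder ... getOrCreate() in a script.
pin_spark_python()
print("Spark workers will use:", os.environ["PYSPARK_PYTHON"])
```

&lt;p&gt;If the printed path points outside your &lt;code&gt;venv&lt;/code&gt; folder, the script was started with the wrong Python.&lt;/p&gt;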




&lt;h2&gt;
  
  
  Optional: Fix winutils.exe Warnings on Windows
&lt;/h2&gt;

&lt;p&gt;You may see warnings like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Did not find winutils.exe
HADOOP_HOME and hadoop.home.dir are unset

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spark runs the test above &lt;strong&gt;fine without winutils&lt;/strong&gt;, but adding it removes these warnings.&lt;/p&gt;




&lt;h3&gt;
  
  
  Which winutils Version to Use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hadoop version:&lt;/strong&gt; 3.3.6&lt;/li&gt;
&lt;li&gt;Compatible with &lt;strong&gt;Spark 3.5.x&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Setup winutils
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create this folder:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\hadoop\bin\

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Place &lt;code&gt;winutils.exe&lt;/code&gt; inside:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\hadoop\bin\winutils.exe

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Set environment variables:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;setx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;HADOOP_HOME&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\hadoop&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c"&gt;# %PATH% is cmd.exe syntax and is not expanded in PowerShell, so append to the user PATH explicitly&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;[Environment]::SetEnvironmentVariable("Path", [Environment]::GetEnvironmentVariable("Path", "User") + ";C:\hadoop\bin", "User")&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;Restart PowerShell and verify:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;winutils.exe&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If usage info prints, it’s working.&lt;/p&gt;
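
&lt;p&gt;You can also confirm the setup from Python, which checks the same variable that the warning above complains about. This is a rough sketch assuming the folder layout used in this section; &lt;code&gt;find_winutils&lt;/code&gt; is an illustrative helper, not a Spark or Hadoop API.&lt;/p&gt;

```python
import os
from pathlib import Path


def find_winutils(env=None):
    """Return the path to winutils.exe in the bin folder of HADOOP_HOME,
    or None when the variable is unset or the file is missing."""
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return None
    candidate = Path(hadoop_home) / "bin" / "winutils.exe"
    return candidate if candidate.is_file() else None


print("winutils.exe:", find_winutils() or "not found")
```

&lt;p&gt;Open a new PowerShell window before running it, since variables saved with &lt;code&gt;setx&lt;/code&gt; only appear in sessions started afterwards.&lt;/p&gt;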




&lt;h2&gt;
  
  
  Final Working Setup
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;3.11.x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PySpark&lt;/td&gt;
&lt;td&gt;3.5.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hadoop (winutils)&lt;/td&gt;
&lt;td&gt;3.3.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS&lt;/td&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Virtual Environment&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Setting up PySpark on Windows requires careful version alignment, but once configured correctly, it works reliably.&lt;/p&gt;

&lt;p&gt;By:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keeping Python 3.13 installed&lt;/li&gt;
&lt;li&gt;Using Python 3.11 in a virtual environment&lt;/li&gt;
&lt;li&gt;Pinning PySpark to 3.5.1&lt;/li&gt;
&lt;li&gt;Optionally configuring winutils&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;you now have a &lt;strong&gt;stable Spark development environment on Windows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Happy coding 🚀&lt;/p&gt;




</description>
      <category>pyspark</category>
      <category>python</category>
      <category>programming</category>
      <category>spark</category>
    </item>
    <item>
      <title>RESTART Series: Day 1 – Laying the Foundations</title>
      <dc:creator>Rachit Avasthi</dc:creator>
      <pubDate>Mon, 04 Nov 2024 19:20:37 +0000</pubDate>
      <link>https://forem.com/rachit_avasthi/restart-series-day-1-laying-the-foundations-2dhe</link>
      <guid>https://forem.com/rachit_avasthi/restart-series-day-1-laying-the-foundations-2dhe</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"The journey of a thousand miles begins with a single step." — Lao Tzu&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hello everyone!&lt;/p&gt;

&lt;p&gt;Today marks &lt;strong&gt;Day 1&lt;/strong&gt; of the RESTART journey, and it’s already been a packed day of learning, building, and revisiting. Here's a breakdown of what I tackled and some highlights:&lt;/p&gt;

&lt;h2&gt;
  
  
  Revisiting HTML &amp;amp; CSS Basics
&lt;/h2&gt;

&lt;p&gt;To kick things off, I dove back into the &lt;strong&gt;Frontend Domination course by Sheriyans Coding School&lt;/strong&gt;, where I focused on HTML and CSS. Despite having experience with these tools, going back to basics has been refreshing. It's easy to overlook small yet crucial details when you're working at a high level, and today was all about strengthening that foundation. Revisiting HTML tags, CSS positioning, and flexbox tricks reminded me of how powerful even the simplest elements can be when used correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  First Full-Stack Project: A Social Media Platform for College
&lt;/h2&gt;

&lt;p&gt;Alongside the HTML and CSS revision, I started building my &lt;strong&gt;first full-stack project&lt;/strong&gt;—a social media platform designed specifically for a college community. It’s ambitious but exciting! I began by planning the core features and laying out the structure. Starting with frontend basics, I’m crafting a design that’s visually appealing but also functional and efficient for student interactions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Progress in Cloud Computing with AWS
&lt;/h2&gt;

&lt;p&gt;I also completed another module in &lt;strong&gt;AWS Cloud Practitioner from GeeksForGeeks&lt;/strong&gt;. Cloud computing has always been a field I’ve wanted to understand better, and today’s module brought me a step closer to that goal. Gaining more insights into how cloud infrastructure works has already started sparking ideas for future projects. Each lesson in this module is helping me build confidence in AWS and in understanding the potential it brings to backend development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Goldman Sachs Virtual Internship: Cryptography &amp;amp; Password Security
&lt;/h2&gt;

&lt;p&gt;The highlight of the day was receiving my certificate for completing the &lt;strong&gt;Goldman Sachs virtual internship&lt;/strong&gt;. This experience was a deep dive into the world of &lt;strong&gt;cryptography&lt;/strong&gt; and &lt;strong&gt;password security&lt;/strong&gt;. During the internship, I learned how to analyse password structures, worked on cracking techniques, and explored the best practices for maintaining robust password protection. Knowing these techniques and their ethical applications has been incredibly eye-opening, especially when considering how vital security is in today’s tech-driven world.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Day 1 has set the stage for an incredible journey. From brushing up on HTML and CSS fundamentals to diving deeper into cloud computing and cybersecurity, I’m excited about what lies ahead. Tomorrow’s focus will be to progress further in my project and maybe tackle some backend concepts. Stay tuned for more!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>beginners</category>
      <category>productivity</category>
      <category>certification</category>
    </item>
    <item>
      <title>Announcing the Start of RESTART: A Journey Back to Basics and Beyond</title>
      <dc:creator>Rachit Avasthi</dc:creator>
      <pubDate>Thu, 31 Oct 2024 06:43:02 +0000</pubDate>
      <link>https://forem.com/rachit_avasthi/announcing-the-start-of-restart-a-journey-back-to-basics-and-beyond-2d54</link>
      <guid>https://forem.com/rachit_avasthi/announcing-the-start-of-restart-a-journey-back-to-basics-and-beyond-2d54</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;“Every moment is a fresh beginning.” — T.S. Eliot&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hello everyone!&lt;/p&gt;

&lt;p&gt;I’m thrilled to introduce a new series on this blog called RESTART! This isn’t just any regular journey; it’s a commitment to go back to the basics, relearn and reinforce the skills I already have, and dive into new territories. I’ll be starting from scratch, progressing one step at a time, and I’ll be sharing daily updates on every milestone, challenge, and “aha!” moment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why RESTART?
&lt;/h2&gt;

&lt;p&gt;Life moves fast, and with it, our skills and knowledge evolve. However, sometimes in the rush to get better, we miss refining the fundamentals that serve as the foundation of everything we do. RESTART is my way of taking a deep breath, hitting the reset button, and rebuilding from the ground up. I want to ensure my skills are solid, adaptable, and future-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Expect
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Daily Updates:&lt;/strong&gt; Each day, I’ll post about what I’ve worked on. This includes recaps, insights, resources, and any breakthroughs along the way.&lt;br&gt;
&lt;strong&gt;Skill Building:&lt;/strong&gt; I’ll be revisiting old skills to make sure they’re sharp, covering basics to advanced levels, and sharing tips to master them. Alongside, I’ll dive into new skill sets, so expect to see fresh insights on tools and technologies.&lt;br&gt;
&lt;strong&gt;Engagement:&lt;/strong&gt; I’d love for this to be an interactive journey. Feel free to comment, ask questions, or share your insights if you’re on a similar journey or just want to follow along.&lt;/p&gt;

&lt;h2&gt;
  
  
  Goals of the RESTART Series
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Strengthen the Core:&lt;/strong&gt; Ensuring that fundamental knowledge is solid.&lt;br&gt;
&lt;strong&gt;Learn Something New:&lt;/strong&gt; Exploring skills that are essential today, with a focus on growth and relevance.&lt;br&gt;
&lt;strong&gt;Document the Process:&lt;/strong&gt; Through daily posts, I aim to capture the real experience of learning, making mistakes, and growing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let’s Begin!
&lt;/h2&gt;

&lt;p&gt;Starting today, I’ll be sharing my journey here, one step at a time. Whether you’re here for inspiration, guidance, or just out of curiosity, I hope you’ll join me as we learn and grow together in this RESTART series.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
