<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: mojoatomic</title>
    <description>The latest articles on Forem by mojoatomic (@mojoatomic).</description>
    <link>https://forem.com/mojoatomic</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3707729%2F53cd8390-01cc-450a-81be-a3cedb072234.png</url>
      <title>Forem: mojoatomic</title>
      <link>https://forem.com/mojoatomic</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mojoatomic"/>
    <language>en</language>
    <item>
      <title>Your "Atomic" Deploys Probably Aren't Atomic</title>
      <dc:creator>mojoatomic</dc:creator>
      <pubDate>Mon, 12 Jan 2026 22:13:09 +0000</pubDate>
      <link>https://forem.com/mojoatomic/your-atomic-deploys-probably-arent-atomic-3p7a</link>
      <guid>https://forem.com/mojoatomic/your-atomic-deploys-probably-arent-atomic-3p7a</guid>
      <description>&lt;p&gt;If you're using symlink swaps for zero-downtime deployments, you've probably written something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-sfn&lt;/span&gt; releases/20260112 current
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks atomic. It's not.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bug
&lt;/h2&gt;

&lt;p&gt;Run that command through strace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;symlink("releases/20260112", "current") = -1 EEXIST
unlink("current") = 0
symlink("releases/20260112", "current") = 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See the problem? Between &lt;code&gt;unlink&lt;/code&gt; and &lt;code&gt;symlink&lt;/code&gt;, the &lt;code&gt;current&lt;/code&gt; symlink doesn't exist. Under load, some percentage of requests hit that gap and get &lt;code&gt;ENOENT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Your "zero-downtime" deploy just caused downtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Linux Fix
&lt;/h2&gt;

&lt;p&gt;This is well-documented:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; .tmp
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; releases/20260112 .tmp/current.&lt;span class="nv"&gt;$$&lt;/span&gt;
&lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="nt"&gt;-T&lt;/span&gt; .tmp/current.&lt;span class="nv"&gt;$$&lt;/span&gt; current
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



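&lt;p&gt;Create a temp symlink, then use &lt;code&gt;mv -T&lt;/code&gt; to atomically replace &lt;code&gt;current&lt;/code&gt;. &lt;code&gt;mv&lt;/code&gt; already renames with &lt;code&gt;rename(2)&lt;/code&gt; when source and destination are on the same filesystem, and &lt;code&gt;rename(2)&lt;/code&gt; is atomic on POSIX filesystems. What &lt;code&gt;-T&lt;/code&gt; (&lt;code&gt;--no-target-directory&lt;/code&gt;) adds is that &lt;code&gt;mv&lt;/code&gt; treats &lt;code&gt;current&lt;/code&gt; as the thing to replace rather than as a directory to move into. Keep the temp symlink on the same filesystem as &lt;code&gt;current&lt;/code&gt;; across filesystems &lt;code&gt;mv&lt;/code&gt; falls back to copy-and-delete, which is not atomic.&lt;/p&gt;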

&lt;p&gt;Problem solved. Unless you're on macOS.&lt;/p&gt;

&lt;h2&gt;
  
  
  The macOS Problem
&lt;/h2&gt;

&lt;p&gt;BSD &lt;code&gt;mv&lt;/code&gt; doesn't have &lt;code&gt;-T&lt;/code&gt;. Without it, &lt;code&gt;mv&lt;/code&gt; follows the destination: if &lt;code&gt;current&lt;/code&gt; is a symlink to a directory, &lt;code&gt;mv .tmp/current.$$ current&lt;/code&gt; moves the temp symlink &lt;em&gt;into&lt;/em&gt; that directory instead of replacing &lt;code&gt;current&lt;/code&gt;.&lt;/p&gt;
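
&lt;p&gt;The pitfall is easy to reproduce in a scratch directory. The sketch below is illustrative only: the &lt;code&gt;next&lt;/code&gt; link name and the temp directory are assumptions, not part of any deploy tool. GNU &lt;code&gt;mv&lt;/code&gt; should behave the same way when &lt;code&gt;-T&lt;/code&gt; is omitted.&lt;/p&gt;

```shell
#!/bin/bash
# Reproduce the "moved into the directory" pitfall in a throwaway directory.
# All names here (next, the temp dir) are illustrative assumptions.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p releases/v1 releases/v2
ln -s releases/v1 current    # live symlink, points at v1
ln -s releases/v2 next       # replacement symlink we would like to swap in

# Plain mv resolves "current" to a directory and moves "next" INTO it.
mv next current

readlink current             # still releases/v1: the swap did not happen
ls releases/v1               # the stray "next" symlink landed here
```

&lt;p&gt;The live link is untouched and the new link ends up inside the old release, which is why the swap needs either &lt;code&gt;-T&lt;/code&gt; or a direct &lt;code&gt;rename(2)&lt;/code&gt; call.&lt;/p&gt;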

&lt;p&gt;The Capistrano community has known about this for over a decade. Their workaround is clever - create the symlink in a subdirectory and move it via relative path - but it requires their Ruby runtime.&lt;/p&gt;

&lt;p&gt;Most deployment tools just accept the race condition on Mac, or tell you to develop on Linux.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Different Approach
&lt;/h2&gt;

&lt;p&gt;I needed something that works on both platforms. I manage infrastructure across Linux servers and Mac dev machines, and "just use Linux" wasn't an option.&lt;/p&gt;

&lt;p&gt;Python's &lt;code&gt;os.replace()&lt;/code&gt; calls &lt;code&gt;rename(2)&lt;/code&gt; directly on all POSIX systems, so it can stand in for &lt;code&gt;mv -T&lt;/code&gt; wherever GNU coreutils isn't available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;detect_platform&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="nt"&gt;--version&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s1"&gt;'GNU'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'linux'&lt;/span&gt;
    &lt;span class="k"&gt;else
        &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'bsd'&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

activate_release&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
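    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;release_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;tmp_link&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;".tmp/current.&lt;/span&gt;&lt;span class="nv"&gt;$$&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;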
    &lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"releases/&lt;/span&gt;&lt;span class="nv"&gt;$release_id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_link&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;detect_platform&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"linux"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="nt"&gt;-T&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_link&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"current"&lt;/span&gt;
    &lt;span class="k"&gt;else
        &lt;/span&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'import os, sys; os.replace(sys.argv[1], "current")'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_link&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Platform detection checks for GNU coreutils by running &lt;code&gt;mv --version&lt;/code&gt; rather than relying on &lt;code&gt;uname&lt;/code&gt;. This handles edge cases like a Mac where Homebrew's GNU coreutils come first on &lt;code&gt;PATH&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Proving the Race Condition
&lt;/h2&gt;

&lt;p&gt;Want to see the bug yourself? Here's a test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; releases/v1 releases/v2
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"v1"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; releases/v1/version
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"v2"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; releases/v2/version
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; releases/v1 current

&lt;span class="c"&gt;# Reader loop in background&lt;/span&gt;
&lt;span class="o"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..10000&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;current/version 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"ENOENT"&lt;/span&gt;
  &lt;span class="k"&gt;done&lt;/span&gt;
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; reads.log &amp;amp;

&lt;span class="nv"&gt;reader_pid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$!&lt;/span&gt;

&lt;span class="c"&gt;# Writer loop - rapidly swap symlink&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..1000&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-sfn&lt;/span&gt; releases/v1 current
  &lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-sfn&lt;/span&gt; releases/v2 current
&lt;span class="k"&gt;done

&lt;/span&gt;&lt;span class="nb"&gt;wait&lt;/span&gt; &lt;span class="nv"&gt;$reader_pid&lt;/span&gt;

&lt;span class="nv"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ENOENT reads.log &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Errors: &lt;/span&gt;&lt;span class="nv"&gt;$errors&lt;/span&gt;&lt;span class="s2"&gt; / 10000 reads"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On a typical system, you'll see 10-50 errors per run. With the atomic approach, zero.&lt;/p&gt;
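
&lt;p&gt;For the atomic run, the two &lt;code&gt;ln -sfn&lt;/code&gt; lines in the writer loop can be swapped for a helper along these lines. This is a minimal sketch: &lt;code&gt;swap_atomic&lt;/code&gt; is a hypothetical name, the loop count is arbitrary, and it takes the &lt;code&gt;python3&lt;/code&gt; path on every platform for simplicity.&lt;/p&gt;

```shell
#!/bin/bash
# Atomic variant of the writer loop, run in a throwaway directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p releases/v1 releases/v2 .tmp
ln -s releases/v1 current

# swap_atomic (hypothetical helper): create the new link under a temp name,
# then rename(2) it over "current" so a reader sees the old or the new link.
swap_atomic() {
  local tmp_link=".tmp/current.$$"
  ln -s "$1" "$tmp_link"
  python3 -c 'import os, sys; os.replace(sys.argv[1], sys.argv[2])' "$tmp_link" current
}

for i in {1..50}; do
  swap_atomic releases/v1
  swap_atomic releases/v2
done

readlink current   # releases/v2 after the final swap
```

&lt;p&gt;Because the rename consumes the temp link each time, &lt;code&gt;.tmp&lt;/code&gt; is empty again after every swap.&lt;/p&gt;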

&lt;h2&gt;
  
  
  The Full Script
&lt;/h2&gt;

&lt;p&gt;I wrapped this into a deployment script with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Atomic symlink swap&lt;/strong&gt; on both Linux and macOS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Directory-based locking&lt;/strong&gt; with stale PID detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic rollback&lt;/strong&gt; on SIGINT/SIGTERM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State machine cleanup&lt;/strong&gt; that knows whether to rollback or just clean up temp files&lt;/li&gt;
&lt;/ul&gt;
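
&lt;p&gt;As a rough idea of what directory-based locking with stale PID detection can look like, here is a minimal sketch. It is not the code from the repository: &lt;code&gt;acquire_lock&lt;/code&gt;, &lt;code&gt;release_lock&lt;/code&gt;, and the lock path are assumed names, and the stale-lock recovery still leaves a small window in which two processes could both decide the lock is stale.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch of a mkdir-based lock with stale-PID detection.
# Function names and the lock location are hypothetical.
set -u
lockdir="$(mktemp -d)/deploy.lock"

acquire_lock() {
  # mkdir either creates the directory or fails, so only one caller wins.
  if mkdir "$lockdir" 2>/dev/null; then
    echo "$$" > "$lockdir/pid"
    return 0
  fi
  local owner
  owner=$(cat "$lockdir/pid" 2>/dev/null || true)
  if [ -z "$owner" ]; then
    return 1                  # holder has not recorded its PID yet
  fi
  if kill -0 "$owner" 2>/dev/null; then
    return 1                  # holder is still running
  fi
  rm -rf "$lockdir"           # stale: the recorded holder is gone
  if mkdir "$lockdir" 2>/dev/null; then
    echo "$$" > "$lockdir/pid"
    return 0
  fi
  return 1                    # someone else grabbed it first
}

release_lock() {
  rm -rf "$lockdir"
}

if acquire_lock; then
  echo "lock acquired"
  release_lock
else
  echo "another deploy is running"
fi
```

&lt;p&gt;The design leans on &lt;code&gt;mkdir&lt;/code&gt; failing when the directory already exists, which gives mutual exclusion without any extra tooling.&lt;/p&gt;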

&lt;p&gt;It's a single file, no dependencies beyond bash and python3.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/mojoatomic/atomic-deploy" rel="noopener noreferrer"&gt;github.com/mojoatomic/atomic-deploy&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Doesn't Do
&lt;/h2&gt;

&lt;p&gt;Intentionally out of scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared directories&lt;/strong&gt; - No Capistrano-style shared folder symlinking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remote deployment&lt;/strong&gt; - Wrap it in ssh/rsync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release pruning&lt;/strong&gt; - Add a cron job&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Service restarts&lt;/strong&gt; - Use your init system&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One thing, done right.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why not just use Capistrano/Deployer?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They're great if you're already in Ruby/PHP. This is a single script you can drop into any CI pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not containers?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not everyone is on Kubernetes. VMs, bare metal, and edge devices still exist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Python is a dependency.&lt;/strong&gt;&lt;/p&gt;

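&lt;p&gt;Yes, but python3 is preinstalled on most Linux distros, and on macOS it comes with Apple's Command Line Tools, which most development machines already have. In practice it's nearly as ubiquitous as bash.&lt;/p&gt;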

&lt;p&gt;&lt;strong&gt;What about &lt;code&gt;renameat2()&lt;/code&gt; with &lt;code&gt;RENAME_EXCHANGE&lt;/code&gt;?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's Linux 3.15+ with glibc 2.28+. It does a true atomic swap, but it's not portable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does this work on NFS?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. &lt;code&gt;rename(2)&lt;/code&gt; atomicity guarantees don't hold on network filesystems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://temochka.com/blog/posts/2017/02/17/atomic-symlinks.html" rel="noopener noreferrer"&gt;Atomic symlinks&lt;/a&gt; - Deep dive on the problem&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://rcrowley.org/2010/01/06/things-unix-can-do-atomically.html" rel="noopener noreferrer"&gt;Things UNIX can do atomically&lt;/a&gt; - The &lt;code&gt;mv -T&lt;/code&gt; insight&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/capistrano/capistrano/issues/346" rel="noopener noreferrer"&gt;Capistrano issue #346&lt;/a&gt; - Original bug report from 2013&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>deployment</category>
      <category>linux</category>
      <category>macos</category>
    </item>
  </channel>
</rss>
