<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tal Hoffman</title>
    <description>The latest articles on Forem by Tal Hoffman (@talhof8).</description>
    <link>https://forem.com/talhof8</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F564296%2F49cfcc95-c4c0-4237-b7b8-7f4180d779a4.jpg</url>
      <title>Forem: Tal Hoffman</title>
      <link>https://forem.com/talhof8</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/talhof8"/>
    <language>en</language>
    <item>
      <title>Software Transactional Memory: a stairway to lock-free programming heaven?</title>
      <dc:creator>Tal Hoffman</dc:creator>
      <pubDate>Fri, 26 Mar 2021 18:48:26 +0000</pubDate>
      <link>https://forem.com/talhof8/software-transactional-memory-a-stairway-to-lock-free-programming-heaven-194j</link>
      <guid>https://forem.com/talhof8/software-transactional-memory-a-stairway-to-lock-free-programming-heaven-194j</guid>
      <description>&lt;p&gt;When it comes to synchronizing access to shared state under concurrency, developer-controlled locking has always been a double-edged sword. Although effective, it has proven deadly when done wrong.&lt;/p&gt;

&lt;p&gt;Deadlocks, livelocks, and &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FLock_%28computer_science%29%23Lack_of_composability" rel="noopener noreferrer"&gt;a lack of composability&lt;/a&gt; are all far too common and, frankly, hard to avoid, especially when dealing with large, complex applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enter Software Transactional Memory.
&lt;/h3&gt;

&lt;p&gt;One alternative approach to this pain in the arse problem is moving the locking part inside the runtime, basically freeing the developer from locking decisions. Each access to a critical section is then performed as an "atomic transaction".&lt;/p&gt;

&lt;p&gt;Before we proceed any further, one has to understand the difference between "lock-less programming" and "lock-less programs" as used in this blog post. STM is meant to be a solution to the former: it takes care of locking for us, the developers, hence "lock-less programming". This means no deadlocks, fewer livelocks, and much better composability.&lt;/p&gt;

&lt;p&gt;We'll also be using a specific STM algorithm which is lock-free in the sense that it doesn't use traditional locking primitives, but rather bounded spinlocks which, if already acquired, do not block. Therefore, our implementation also counts as a "lock-less program".&lt;/p&gt;

&lt;p&gt;Now let's proceed…  &lt;/p&gt;

&lt;p&gt;STM was first introduced by &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fgroups.csail.mit.edu%2Ftds%2Fpapers%2FShavit%2FShavitTouitou-podc95.pdf" rel="noopener noreferrer"&gt;Shavit and Touitou&lt;/a&gt; back in 1995. It was an exciting improvement on an earlier concept called Hardware Transactional Memory, in which hardware support was used to achieve the same goal, only now it could be done at the software level, either entirely or as a hybrid software-hardware solution.&lt;/p&gt;

&lt;p&gt;It works by isolating a set of reads and writes to shared memory locations in a construct called "a transaction". The runtime executes the user code as if it were running alone, with no other threads interfering. It then attempts to commit all of the transaction's writes to memory, and aborts if it notices a conflict with other threads. The runtime keeps re-running the transaction until it is able to successfully commit all shared memory modifications.&lt;/p&gt;

&lt;p&gt;However, as previously stated, it is important to understand that STM by itself doesn't necessarily mean lock-free concurrency. Some implementations are indeed lock-free, while some are not. Software Transactional Memory is merely an abstraction which frees the developer from typical locking concerns.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FHans-Kestler%2Fpublication%2F43049227%2Ffigure%2Ffig2%2FAS%3A267575359701012%401440806345935%2FSoftware-transactional-memory-Software-transactional-memory-circumvents-the-need-for.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FHans-Kestler%2Fpublication%2F43049227%2Ffigure%2Ffig2%2FAS%3A267575359701012%401440806345935%2FSoftware-transactional-memory-Software-transactional-memory-circumvents-the-need-for.png"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Credit: &lt;a href="https://www.researchgate.net/figure/Software-transactional-memory-Software-transactional-memory-circumvents-the-need-for_fig2_43049227" rel="noopener noreferrer"&gt;https://researchgate.net&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sounds familiar?&lt;/p&gt;

&lt;p&gt;Many of you might have already heard of this modus operandi. A similar form of this concurrency control is used in databases and version control systems. It is known as &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fen.m.wikipedia.org%2Fwiki%2FOptimistic_concurrency_control" rel="noopener noreferrer"&gt;optimistic concurrency&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/N1OBW2fPuXh91zXlPh/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/N1OBW2fPuXh91zXlPh/giphy.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Transactional Locking II (with a slight touch)
&lt;/h3&gt;

&lt;p&gt;There are several approaches and algorithms for implementing STM. We'll focus on one of them here: "Transactional Locking II (TL2)", proposed by Dice, Shalev, and Shavit.&lt;/p&gt;

&lt;p&gt;The algorithm is pretty straightforward, and can be split into three parts: reading a variable, writing to a variable, and committing the transaction.   &lt;/p&gt;

&lt;p&gt;The secret to it all is "versioning". Basically, each time a value is read or a commit needs to go through, the runtime makes sure that versions are up-to-date and that no other thread is in the middle of messing with that memory location at the time of inspection. Otherwise, the transaction starts over, until it is looking at a consistent view of things.&lt;/p&gt;

&lt;p&gt;It is, in a sense, a form of double-checked locking integrated into the runtime, if you wish.  &lt;/p&gt;

&lt;p&gt;This works by maintaining a global version "clock" and wrapping each globally shareable variable with its own (shared) version and its own lock. Let's call it a "versioned lock" from now on.&lt;/p&gt;

&lt;p&gt;That versioned lock is a 64-bit word (&lt;code&gt;uint64&lt;/code&gt;), where the most significant bit is the lock and the other 63 bits hold the version. It is kind of like the &lt;code&gt;seqlocks&lt;/code&gt; used in the Linux kernel:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// VersionedLock consists of a lock bit and a version number.&lt;/span&gt;
&lt;span class="c"&gt;// Note that this lock doesn't enforce ownership!&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;VersionedLock&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;

&lt;span class="c"&gt;// Tries to acquire lock.&lt;/span&gt;
&lt;span class="c"&gt;// Non-blocking.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;TryAcquire&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;currentlyLocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;currentlyLocked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrAlreadyLocked&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="c"&gt;// Lock = true; Version = current&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tryCompareAndSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Releases lock.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Release&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;currentlyLocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;currentlyLocked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrAlreadyReleased&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="c"&gt;// Lock = false; Version = current&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tryCompareAndSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Atomically updates lock version and releases it.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;VersionedRelease&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newVersion&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;currentlyLocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;currentlyLocked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrAlreadyReleased&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="c"&gt;// Lock = false; Version = new&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tryCompareAndSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Retrieves lock state - whether it is locked, its version, and its raw form.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;atomic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LoadUint64&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
   &lt;span class="n"&gt;locked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;locked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;tryCompareAndSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doLock&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;desiredVersion&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;compareTo&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;newLock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;serialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doLock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;desiredVersion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"try compare and swap"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;swapped&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;atomic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CompareAndSwapUint64&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;compareTo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newLock&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;swapped&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ErrLockModified&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;serialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locked&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;versionOffset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;// Version mustn't override our lock bit.&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ErrVersionOverflow&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;locked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;versionOffset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;VersionedLock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serialized&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;versionOffset&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;serialized&lt;/span&gt;
   &lt;span class="n"&gt;lockedBit&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;serialized&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;versionOffset&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lockedBit&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Reads
&lt;/h4&gt;

&lt;p&gt;On each read we make sure of three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;That no other thread is currently trying to modify that memory location (lock bit should be off)&lt;/li&gt;
&lt;li&gt;That the shared version of that memory location is older than or equal to our transaction's read version (the global version sampled when the transaction started), meaning we hold an up-to-date view of it&lt;/li&gt;
&lt;li&gt;That the relevant memory location wasn't modified beneath our feet in case a context switch occurred during our read operation (i.e., variable version pre-read == variable version post-read)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If this validation fails, the transaction restarts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;StmContext&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;readLog&lt;/span&gt;      &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
   &lt;span class="n"&gt;writeLog&lt;/span&gt;     &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
   &lt;span class="n"&gt;restart&lt;/span&gt;      &lt;span class="kt"&gt;bool&lt;/span&gt;
   &lt;span class="n"&gt;readVersion&lt;/span&gt;  &lt;span class="kt"&gt;uint64&lt;/span&gt;
   &lt;span class="n"&gt;writeVersion&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// ...&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmVariable&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;newVal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;foundInWriteLog&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeLog&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;stmVariable&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;foundInWriteLog&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;// Short road to success...&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;newVal&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;preReadVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stmVariable&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;readVal&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stmVariable&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;locked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;postReadVersion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stmVariable&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sample&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

   &lt;span class="c"&gt;// Fail transaction if:&lt;/span&gt;
   &lt;span class="c"&gt;// 1. Variable is currently being changed by some other goroutine; or if&lt;/span&gt;
   &lt;span class="c"&gt;// 2. Variable was changed before/after being read; or if&lt;/span&gt;
   &lt;span class="c"&gt;// 3. Variable is too new meaning our read version is outdated&lt;/span&gt;
   &lt;span class="n"&gt;sc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;restart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;locked&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;preReadVersion&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;postReadVersion&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;preReadVersion&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;sc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readVersion&lt;/span&gt;

   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;readVal&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Writes
&lt;/h4&gt;

&lt;p&gt;When it comes to writing to shared locations, each update is cached in-memory, waiting for the transaction to end so it can be committed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmVariable&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newVal&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;sc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeLog&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;stmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newVal&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Commits
&lt;/h4&gt;

&lt;p&gt;We will implement a slightly modified version of TL2 in terms of committing transactions.&lt;/p&gt;

&lt;p&gt;When user code is done executing and commit time is due, we try to "lock" each write log variable &lt;strong&gt;and each read set variable*&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The next step is to atomically increment-and-fetch the global clock and make the result our transaction's write version. This becomes the official version of each write-log variable in case everything goes well and our transaction commits successfully.&lt;/p&gt;

&lt;p&gt;Next up, we need to validate our read-set once again, making sure nothing has changed between executing the user's code and our attempt to commit the transaction. If validation fails, you know the drill: the transaction restarts. A slight optimization here is to skip this validation if no other transaction has committed since we started (that is, the transaction's read version + 1 == the transaction's write version that we've just set).&lt;/p&gt;

&lt;p&gt;If validation passes, modifications are committed one by one in an atomic manner (say, with compare-and-swap) and the versioned locks are released.&lt;/p&gt;

&lt;p&gt;This entire process repeats itself until the transaction finally looks at a consistent memory layout and is able to commit itself successfully.  &lt;/p&gt;
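
&lt;p&gt;The clock handling in the steps above can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the global clock is modelled here as an &lt;code&gt;atomic.Uint64&lt;/code&gt; (Go 1.19 or later), and &lt;code&gt;nextWriteVersion&lt;/code&gt; and &lt;code&gt;canSkipValidation&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// versionClock stands in for the global version clock (assumed type).
var versionClock atomic.Uint64

// nextWriteVersion atomically increments the clock and returns the new value,
// which becomes the committing transaction's write version.
func nextWriteVersion() uint64 {
	return versionClock.Add(1)
}

// canSkipValidation reports whether no other transaction took a write version
// between the moment we sampled readVersion and the moment we took ours.
func canSkipValidation(readVersion, writeVersion uint64) bool {
	return readVersion+1 == writeVersion
}

func main() {
	// Uncontended case: nobody commits between start and commit.
	readVersion := versionClock.Load()
	writeVersion := nextWriteVersion()
	fmt.Println(canSkipValidation(readVersion, writeVersion)) // prints true

	// Contended case: another transaction bumps the clock in between.
	readVersion = versionClock.Load()
	nextWriteVersion() // some other transaction commits
	writeVersion = nextWriteVersion()
	fmt.Println(canSkipValidation(readVersion, writeVersion)) // prints false
}
```

&lt;p&gt;The shortcut is sound only because every committing transaction draws its write version from the same clock, so a gap of exactly one means the clock moved once and the move was ours.&lt;/p&gt;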

&lt;p&gt;Please note that the original TL2 specification doesn't mention anything about locking the read-set. This left me thinking: what would happen if a context switch occurs right after the commit-phase's read-set validation, but before actually updating the write-set memory? In that case, a second thread might modify variables which were already validated by the first one, causing it to have a false sense of memory consistency.  &lt;/p&gt;

&lt;p&gt;Despite &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fwww.reddit.com%2Fr%2Fhaskell%2Fcomments%2Fleva71%2Fhelp_understanding_software_transactional_memory%2F" rel="noopener noreferrer"&gt;desperately trying to get good explanations&lt;/a&gt; as to why this isn't really an issue, I could not find any. None of the official papers seemed to address this concern either.&lt;/p&gt;

&lt;p&gt;Therefore, I decided to go with what I believe is the safer way (until proven otherwise) and lock the read-set as well, accepting the additional performance cost.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;StmAtomic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;StmContext&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="n"&gt;readLog&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="n"&gt;writeLog&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="n"&gt;restart&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;readVersion&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;versionClock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
         &lt;span class="n"&gt;writeVersion&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;retVal&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;restart&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="k"&gt;continue&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="c"&gt;// "And she's buying a stairway to heaven..."&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeLog&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;retVal&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;lockSet&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StmVariable&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tryAcquireSets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lockSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;fatal&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;isFatalAcquireErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;fatal&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;// Avoid a panic if lock is already acquired.&lt;/span&gt;
            &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fatal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
         &lt;span class="p"&gt;}&lt;/span&gt;
         &lt;span class="k"&gt;continue&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeVersion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;versionClock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Increment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

      &lt;span class="c"&gt;// Now that our read and write sets are locked, we need to ensure that nothing has changed in terms of our&lt;/span&gt;
      &lt;span class="c"&gt;// read set, in-between running the user's code and locking everything.&lt;/span&gt;
      &lt;span class="c"&gt;// However, if no other concurrent actors were involved (readVersion == writeVersion - 1), there is no need to&lt;/span&gt;
      &lt;span class="c"&gt;// validate anything cause we were all alone.&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readVersion&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeVersion&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;validated&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;validateReadSet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lockSet&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;validated&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
         &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;commitTransaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lockSet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;retVal&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
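
&lt;p&gt;To make the read-set locking decision a bit more concrete, here is a minimal, self-contained sketch of acquiring both sets without blocking. It is not the repository's &lt;code&gt;tryAcquireSets&lt;/code&gt; (whose body isn't shown here); the &lt;code&gt;variable&lt;/code&gt; type and the &lt;code&gt;tryLockAll&lt;/code&gt; and &lt;code&gt;unlockAll&lt;/code&gt; helpers are hypothetical names, and it assumes Go 1.18+ for &lt;code&gt;sync.Mutex.TryLock&lt;/code&gt;.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sync"
)

// variable is a hypothetical stand-in for a transactional variable.
type variable struct {
	mu sync.Mutex
}

// tryLockAll tries to lock every variable in both sets without blocking.
// If any lock is already held, it releases what it took and reports false,
// so the caller can restart the transaction instead of waiting.
func tryLockAll(readSet, writeSet []*variable) ([]*variable, bool) {
	seen := make(map[*variable]bool)
	var locked []*variable
	for _, set := range [][]*variable{readSet, writeSet} {
		for _, v := range set {
			if seen[v] {
				continue // already locked via the other set
			}
			if !v.mu.TryLock() {
				unlockAll(locked)
				return nil, false
			}
			seen[v] = true
			locked = append(locked, v)
		}
	}
	return locked, true
}

// unlockAll releases every lock taken by tryLockAll.
func unlockAll(vars []*variable) {
	for _, v := range vars {
		v.mu.Unlock()
	}
}

func main() {
	a, b := new(variable), new(variable)

	locked, ok := tryLockAll([]*variable{a, b}, []*variable{b})
	fmt.Println(len(locked), ok)

	// A competing transaction that needs a cannot acquire it and must retry.
	_, ok = tryLockAll([]*variable{a}, nil)
	fmt.Println(ok)

	unlockAll(locked)
}
```

&lt;p&gt;Because nothing ever blocks while holding a lock, two transactions cannot deadlock on each other here; the price is that a failed acquisition restarts the whole transaction.&lt;/p&gt;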



&lt;h4&gt;
  
  
  The entire repository code can be found here: &lt;a href="https://github.com/talhof8/kashmir" rel="noopener noreferrer"&gt;https://github.com/talhof8/kashmir&lt;/a&gt;.
&lt;/h4&gt;

&lt;h3&gt;
  
  
  Caveats
&lt;/h3&gt;

&lt;p&gt;No rose without a thorn, though. STM has some very noticeable downsides, and was therefore considered a research toy for a long time. First off, STM raises the inevitable question of what to do about non-idempotent operations, that is, I/O, network calls, print statements, and so on. Such operations cannot simply be undone when a memory conflict is detected and the transaction is retried. The default answer (which perhaps conveniently looks the other way) is simply to avoid putting side-effecting operations inside STM atomic blocks. Another approach is to queue all side-effecting operations in a buffer and run them only after a successful commit, outside the transaction.  &lt;/p&gt;
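
&lt;p&gt;The buffering approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the &lt;code&gt;tx&lt;/code&gt; type, &lt;code&gt;onCommit&lt;/code&gt; and &lt;code&gt;atomically&lt;/code&gt; are hypothetical names that are not part of the implementation above, and the commit step is reduced to a caller-supplied &lt;code&gt;tryCommit&lt;/code&gt; function.&lt;/p&gt;

```go
package main

import "fmt"

// tx is a hypothetical transaction handle that buffers side effects.
type tx struct {
	effects []func()
}

// onCommit queues a side effect instead of running it immediately.
func (t *tx) onCommit(effect func()) {
	t.effects = append(t.effects, effect)
}

// atomically re-runs block until tryCommit succeeds, then runs the
// effects buffered by the successful attempt exactly once.
func atomically(block func(t *tx), tryCommit func() bool) {
	for {
		t := new(tx)
		block(t)
		if !tryCommit() {
			continue // conflict: drop the buffered effects and retry
		}
		for _, effect := range t.effects {
			effect()
		}
		return
	}
}

func main() {
	attempts := 0
	var printed []string
	atomically(
		func(t *tx) {
			attempts++
			t.onCommit(func() { printed = append(printed, "balance updated") })
		},
		// Pretend the first two commit attempts hit a conflict.
		func() bool { return attempts == 3 },
	)
	fmt.Println(attempts, len(printed))
}
```

&lt;p&gt;The block runs three times, yet the queued effect fires once, because each retry starts from a fresh, empty buffer.&lt;/p&gt;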

&lt;p&gt;Another potential issue, which is more implementation-dependent than inherent to STM, is livelocks. Threads can still conflict with one another and keep retrying their transactions, so that none of them makes progress. This can be mitigated by using some backoff mechanism (random, exponential, etc.) in-between transaction attempts. Backing off comes with a performance hit of its own, which raises the question of how important performance is in this regard.  &lt;/p&gt;

&lt;p&gt;In addition, a major concern is what to do about languages which do not have garbage collection. Unless given official implementation-level support, some STM algorithms would conceptually allow threads to &lt;code&gt;free&lt;/code&gt; heap-allocated variables while other threads are de-referencing them - causing segmentation faults. Have a look &lt;a href="https://stackoverflow.com/questions/8424684/optimistic-reads-and-locking-stm-software-transactional-memory-with-c-c" rel="noopener noreferrer"&gt;here&lt;/a&gt; for a deeper explanation.&lt;/p&gt;

&lt;p&gt;Over and above that, our implementation uses a form of spin-locking, meaning that long-running commit attempts might blow up CPU usage (unless preempted by the runtime/OS). As a result, our STM implementation performs best in scenarios where transactions succeed on the first attempt. Overall, STM can offer quite good performance (depending on the implementation), albeit typically somewhat worse than classic fine-grained locking.  &lt;/p&gt;

&lt;p&gt;Moreover, long-running programs could obviously make the global version clock overflow. This case can quite easily be resolved by the runtime, by making it reset to zero, for example, and syncing all relevant transactions to use the new, reset version clock, at a slight performance cost, when such cases occur. &lt;/p&gt;

&lt;h3&gt;
  
  
  Applications
&lt;/h3&gt;

&lt;p&gt;Those of you who are familiar with the likes of CPython and Ruby MRI have probably heard of the infamous Global Interpreter Lock (GIL). It is basically a mutex which prevents some interpreters from running threads in parallel (they still run concurrently), in order to support specific memory management methods and to call unsafe native extensions safely. Attempts have been made to get rid of the GIL using our dear friend, Transactional Memory. For instance, a fork of pypy called pypy-stm was created - &lt;a href="https://doc.pypy.org/en/latest/stm.html" rel="noopener noreferrer"&gt;https://doc.pypy.org/en/latest/stm.html&lt;/a&gt; - but it seems to have been abandoned and is no longer maintained.  &lt;/p&gt;

&lt;p&gt;In addition, several languages such as Clojure and Haskell support Software Transactional Memory out-of-the-box as part of the core language/runtime, offering an alternative approach to both standard locking and actor model concurrency. There's also an official proposal and implementation by core Ruby committer Koichi Sasada to add STM to Ruby's parallelism feature named "Ractors" - &lt;a href="https://bugs.ruby-lang.org/issues/17261" rel="noopener noreferrer"&gt;https://bugs.ruby-lang.org/issues/17261&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Although STM is officially an integral part of both Clojure and Haskell, overall adoption of STM has been lacking, to say the least. Attempts to get rid of the GIL by using STM have not come to fruition yet either.  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to read some reflections on that topic, please take a look at &lt;a href="https://dl.acm.org/doi/abs/10.1145/3359619.3359747" rel="noopener noreferrer"&gt;https://dl.acm.org/doi/abs/10.1145/3359619.3359747&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;STM is often compared to Garbage Collection&lt;/strong&gt;. Both operate on memory at runtime. Whilst the former manages the state of memory, the latter manages references to memory. Please do keep in mind that garbage collection was doubted and faced a lot of backlash when first introduced to the world, and it suffered from significant performance costs. It has been vastly improved over the years, to say the least, and has become a major, integral part of many programming languages. The community's hope is to apply what we have learned from improving garbage collection to transactional memory, making it much more mature, performant, and production-ready.&lt;/p&gt;

&lt;h4&gt;
  
  
  Originally posted on my &lt;a href="https://www.talhoffman.com" rel="noopener noreferrer"&gt;personal website&lt;/a&gt;
&lt;/h4&gt;

</description>
      <category>go</category>
      <category>programming</category>
      <category>concurrency</category>
      <category>haskell</category>
    </item>
    <item>
      <title>Introduction to CRIU and Live Migration</title>
      <dc:creator>Tal Hoffman</dc:creator>
      <pubDate>Thu, 21 Jan 2021 09:07:01 +0000</pubDate>
      <link>https://forem.com/talhof8/introduction-to-criu-and-live-migration-3en1</link>
      <guid>https://forem.com/talhof8/introduction-to-criu-and-live-migration-3en1</guid>
      <description>&lt;p&gt;In this blog post I will try and explain what CRIU is and how it works, what Live Migration is, and how those two are related.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;So what the heck is CRIU?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Checkpoint-Restore in Userspace&lt;/strong&gt; (or &lt;strong&gt;CRIU&lt;/strong&gt;) is a really (really) cool open-source project started by the virtualization software company &lt;a href="https://www.virtuozzo.com/"&gt;Virtuozzo&lt;/a&gt;, also known for being the creator of&lt;br&gt;
&lt;a href="https://openvz.org/"&gt;OpenVZ&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;CRIU lets you freeze running Linux processes and checkpoint their state to disk as a collection of files. Those files can later be used to restore a process right from the point at which it was frozen, multiple times, on&lt;br&gt;
any other CRIU-supported Linux machine!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/35H0pwQNaO2iLTnnBf/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/35H0pwQNaO2iLTnnBf/giphy.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Among its many usage scenarios, CRIU can be used for slow-boot service speed up, remote debugging, snapshots, process duplication, and for what is our main topic today — &lt;strong&gt;live migrations.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Take a look at&lt;br&gt;
&lt;a href="https://criu.org/Usage_scenarios"&gt;https://criu.org/Usage_scenarios&lt;/a&gt; for more&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;CRIU is now integrated as part of Podman, Docker (experimental), OpenVZ, LXC/LXD, and can also be used independently.&lt;/p&gt;

&lt;h3&gt;
  
  
  The mechanics (in a nutshell)
&lt;/h3&gt;

&lt;p&gt;Let’s take a little dive into the internals of how this magic happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checkpoint&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first step when checkpointing a process is walking recursively through its tree, and freezing it so it will not change its state while CRIU&lt;br&gt;
needs to dump it. &lt;/p&gt;

&lt;p&gt;CRIU supports two different methods for freezing the state of the process and its sub tasks.&lt;/p&gt;

&lt;p&gt;By default, CRIU makes use of &lt;code&gt;ptrace&lt;/code&gt; to stop the process. For those not familiar with &lt;code&gt;ptrace&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The ptrace() system call provides a means by which one process (the "tracer") may observe and control the execution of another process (the "tracee"), and examine and change the tracee's memory and registers. It is primarily used to implement breakpoint debugging and system call tracing.

See https://man7.org/linux/man-pages/man2/ptrace.2.html.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this method, CRIU first lists and goes through the relevant &lt;code&gt;/proc/$pid&lt;/code&gt; entries. Thread ids are collected through &lt;code&gt;/proc/$pid/task&lt;/code&gt;, whereas sub-processes are recursively collected through reading &lt;code&gt;/proc/$pid/task/$tid/children&lt;/code&gt; files. &lt;/p&gt;

&lt;p&gt;Each task it encounters (the parent process itself, sub-processes, and threads) is attached to CRIU’s tracer process by dispatching a &lt;code&gt;PTRACE_SEIZE&lt;/code&gt; request, after which a &lt;code&gt;PTRACE_INTERRUPT&lt;/code&gt; request is also dispatched in order to&lt;br&gt;
stop that task.&lt;/p&gt;

&lt;p&gt;The second method for freezing the process tree is using Linux’s &lt;strong&gt;Cgroup Freezer&lt;/strong&gt; — available through CRIU’s &lt;code&gt;--freeze-cgroup&lt;/code&gt; flag. Cgroup Freezer is a subsystem supported by the Linux Kernel which lets us start and stop a set of&lt;br&gt;
tasks (i.e, processes and threads), by defining a control group. &lt;/p&gt;

&lt;p&gt;Here is a simple example of usage:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create a cgroup freezer directory
mkdir /sys/fs/cgroup/freezer

# Mount directory against a cgroup filesystem of type 'freezer'  
mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer

# Create a child cgroup directory
mkdir /sys/fs/cgroup/freezer/whatever

# Put a task into this cgroup
echo $some_pid &amp;gt; /sys/fs/cgroup/freezer/whatever/tasks

# Freeze cgroup
echo FROZEN &amp;gt; /sys/fs/cgroup/freezer/whatever/freezer.state

# cat /sys/fs/cgroup/freezer/whatever/freezer.state
FREEZING

# cat /sys/fs/cgroup/freezer/whatever/freezer.state
FROZEN

# Thaw (unfreeze) it
echo THAWED &amp;gt; /sys/fs/cgroup/freezer/whatever/freezer.state
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Read more at&lt;br&gt;
&lt;a href="https://man7.org/linux/man-pages/man7/cgroups.7.html"&gt;https://man7.org/linux/man-pages/man7/cgroups.7.html&lt;/a&gt;&lt;br&gt;
and&lt;br&gt;
&lt;a href="https://www.kernel.org/doc/Documentation/cgroup-v1/freezer-subsystem.txt"&gt;https://www.kernel.org/doc/Documentation/cgroup-v1/freezer-subsystem.txt&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now that the process’s tasks tree is all frozen, CRIU needs to collect all relevant tasks’ resources and write them to dump files, which should later be used for restore.&lt;/p&gt;

&lt;p&gt;The first set of resources being dumped are collected simply by reading  procfs.&lt;br&gt;
These resources are VMAs (Virtual Memory Areas), memory-mapped files, and opened file descriptors. In addition, registers and other core task parameters are collected using &lt;code&gt;ptrace&lt;/code&gt; and parsing &lt;code&gt;/proc/$pid/stat&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Afterwards, CRIU injects parasite code into each task’s address space, whose job is to collect some more information such as credentials and actual memory&lt;br&gt;
contents. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;More information about how the parasite code injection is done can be found&lt;br&gt;
&lt;a href="https://criu.org/Parasite_code"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The final step in the checkpointing process is cleaning up CRIU’s parasite code, restoring the original code, and detaching ptrace. By default &lt;code&gt;criu dump&lt;/code&gt; then kills the checkpointed tasks; specifying &lt;code&gt;-R|--leave-running&lt;/code&gt; resumes them from where they were stopped instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Restore&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The restore process is pretty straightforward. During restore CRIU gradually morphs itself into the target process. &lt;/p&gt;

&lt;p&gt;First off, the restorer process reads all dumped image files and finds out which processes share which resources. Next, it re-creates all processes in the tree by calling &lt;code&gt;fork()&lt;/code&gt;. Note that threads are &lt;strong&gt;not&lt;/strong&gt; restored here but rather on&lt;br&gt;
the last stage. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The PID Dance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each forked process is supposed to be assigned its original pid. But how&lt;br&gt;
exactly?&lt;/p&gt;

&lt;p&gt;Well, in order to do so, CRIU utilizes a feature introduced in Kernel v3.3 used by the kernel for keeping track of the last pid it has assigned. &lt;/p&gt;

&lt;p&gt;It is accessible through the sysctl file &lt;code&gt;/proc/sys/kernel/ns_last_pid&lt;/code&gt; and is basically an incrementing counter of process ids. It requires &lt;code&gt;CONFIG_CHECKPOINT_RESTORE&lt;/code&gt; to be set, which is enabled by default in most&lt;br&gt;
Linux distributions.&lt;/p&gt;

&lt;p&gt;So in order to fork a process with a desired pid, say 3214, CRIU does the following:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Opens /proc/sys/kernel/ns_last_pid
2. Locks file
3. Sets ns_last_pid's new value to pid-1 (3214 - 1)
4. Closes /proc/sys/kernel/ns_last_pid
5. Clones process so that the child process is supposed to have pid 3214
6. Calls getpid() inside the child process to validate desired pid
7. Unlocks file
8. Voilà
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Very simple and yet cool!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Bear in mind that if any pid already exists when trying to restore, then the&lt;br&gt;
restore will fail. The solution is to  restore the process inside a different pid namespace (and mount namespace — see&lt;br&gt;
&lt;a href="https://criu.org/CR_in_namespace"&gt;https://criu.org/CR_in_namespace&lt;/a&gt;). &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are a few caveats to this approach, the main ones being that it is slow, due to the multiple syscalls required for each such clone, and that it is open to race conditions.&lt;/p&gt;

&lt;p&gt;As explained in&lt;br&gt;
&lt;a href="https://lisas.de/~adrian/criu-and-the-pid-dance-article.pdf"&gt;https://lisas.de/~adrian/criu-and-the-pid-dance-article.pdf&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;It can always happen that between setting the desired PID via ns_last_pid and the actual clone() another process, independent of the restore, is created, which means that getpid() will not return the desired PID and CRIU will abort.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Enter &lt;code&gt;clone3()&lt;/code&gt; and &lt;code&gt;set_tid&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Using Kernel v5.3’s &lt;code&gt;clone3()&lt;/code&gt; (note there’s no matching glibc wrapper yet), and the &lt;code&gt;set_tid&lt;/code&gt; array available from v5.5, we can now explicitly select specific process ids for a cloned process, in some or all of the PID namespaces where it is present, directly when we call &lt;code&gt;clone3()&lt;/code&gt;. This essentially eliminates the race conditions and saves us the multiple syscalls required by the ns_last_pid method. &lt;strong&gt;It is currently supported by CRIU&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;Now that all of the required processes have been created, CRIU will carry on and restore resources: opening file descriptors, preparing namespaces, opening anonymous shared mappings, opening file mappings, opening &amp;amp; pre-mapping&lt;br&gt;
private memory areas, opening sockets, and more…&lt;/p&gt;

&lt;p&gt;For the final step, CRIU will switch to the restorer context — cleaning up its own memory mappings — and restore all  other resources left: threads, timers (so they will fire as late as possible), credentials &amp;amp; security settings (so they won’t limit us during the restore process), private memory areas re-mappings (using &lt;code&gt;mremap&lt;/code&gt;), file mappings (using &lt;code&gt;mmap&lt;/code&gt;), and anonymous shared mappings (also using &lt;code&gt;mmap&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;From now on, the process is restored and will continue to run from where it was originally checkpointed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/yoJC2COHSxjIqadyZW/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/yoJC2COHSxjIqadyZW/giphy.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you’d like to dive even deeper inside the internals, have a look at&lt;br&gt;
&lt;a href="https://criu.org/Category:Under_the_hood"&gt;https://criu.org/Category:Under_the_hood&lt;/a&gt;&lt;br&gt;
:)&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Live Migration
&lt;/h3&gt;

&lt;p&gt;Live migration is the process of moving a running Virtual Machine or Application between two different nodes while keeping clients connected. Memory, relevant storage and network connectivity should all be transferred. &lt;/p&gt;

&lt;p&gt;CRIU is a perfect match for this kind of task, and is actually used in production by some big companies. For instance, Google uses CRIU for &lt;a href="https://www.slideshare.net/mobile/RohitJnagal/task-migration-using-criu"&gt;live migrating containers inside its Borg clusters&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s use CRIU ourselves to demonstrate the migration of a simple loop script from our local machine to a virtual machine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wyuxcHjE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://www.talhoffman.com/assets/images/criu-demo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wyuxcHjE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://www.talhoffman.com/assets/images/criu-demo.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our script looks like this (&lt;code&gt;test.sh&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/sh
while :; do
    sleep 1
    date
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In order to checkpoint the script’s process we run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo criu dump -t &amp;lt;pid&amp;gt; --images-dir ~/demo/images --shell-job &amp;amp;&amp;amp; echo OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--images-dir&lt;/code&gt; indicates where to dump the image files&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--shell-job&lt;/code&gt; tells CRIU that our process was spawned from a shell&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We then &lt;code&gt;scp&lt;/code&gt; (i.e, transfer) both the script (so CRIU is able to restore its fd) and the&lt;br&gt;
dumped images to our VM, and finally restore the process using:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; sudo criu restore -D &amp;lt;path-to-images-dir&amp;gt; -vvv --shell-job -d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-vvv&lt;/code&gt; for higher verbosity&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--shell-job&lt;/code&gt; again to let CRIU know it was spawned from a shell&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-d&lt;/code&gt; so that the restored process will run in background&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s it!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Some more cool CRIU tutorials can be found&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=roJ91Kqeq5w&amp;amp;list=PL86FC0XuGZPISge_th8F5Jjj-IbGXEfE6&amp;amp;index=1"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Originally posted on my personal blog: &lt;a href="https://www.talhoffman.com/introduction-to-criu-and-live-migration"&gt;https://www.talhoffman.com/introduction-to-criu-and-live-migration&lt;/a&gt;&lt;/p&gt;

</description>
      <category>linux</category>
      <category>opensource</category>
      <category>containers</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
