<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Christian Stefanescu</title>
    <description>The latest articles on Forem by Christian Stefanescu (@stchris).</description>
    <link>https://forem.com/stchris</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F557641%2Fb0ec90f1-bd30-4756-a2a3-9e493188d727.JPG</url>
      <title>Forem: Christian Stefanescu</title>
      <link>https://forem.com/stchris</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/stchris"/>
    <language>en</language>
    <item>
      <title>A tiny CI system</title>
      <dc:creator>Christian Stefanescu</dc:creator>
      <pubDate>Tue, 04 Apr 2023 07:44:58 +0000</pubDate>
      <link>https://forem.com/stchris/a-tiny-ci-system-3f60</link>
      <guid>https://forem.com/stchris/a-tiny-ci-system-3f60</guid>
      <description>&lt;p&gt;This is a small demonstration of how little you need to host your own git repositories and run a modest &lt;a href="https://en.wikipedia.org/wiki/Continuous_integration"&gt;Continuous Integration&lt;/a&gt; system for them. All you need is a unixy server you can ssh into, although you can try this out locally as well.&lt;/p&gt;

&lt;p&gt;We will use Redis at one point to queue tasks, but strictly speaking this can be achieved without additional software. To keep things simple this will only work with one repository, since this is only describing a pattern.&lt;/p&gt;

&lt;p&gt;The source code for everything that follows can be found &lt;a href="https://github.com/stchris/tiny-ci"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hosting bare git repositories
&lt;/h2&gt;

&lt;p&gt;Assuming you can ssh into a server and create a directory, this is all you need to create a shareable git repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;git init &lt;span class="nt"&gt;--bare&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ideally you are using a distinct user for it (named &lt;code&gt;git&lt;/code&gt;) and have it set to use &lt;code&gt;git-shell&lt;/code&gt; as its default shell. By convention bare repositories are stored in directories which end in &lt;code&gt;.git&lt;/code&gt;. You can now clone this repository from your machine with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;git clone ssh://git@host.example.com/~git/repo.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
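
&lt;p&gt;To try this out locally without a server, something along these lines should work. This is only a sketch: the temporary directory and the names &lt;code&gt;repo.git&lt;/code&gt; and &lt;code&gt;checkout&lt;/code&gt; are placeholders, and a plain file path stands in for the ssh URL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash
# Local try-out: a bare repository and a clone of it, both in a temp directory.
# All paths and names here are placeholders, not part of the original setup.
set -e
workdir=$(mktemp -d)

# The "server side": a bare repository, by convention ending in .git
git init --bare --quiet "$workdir/repo.git"

# The "client side": clone it via a plain file path instead of ssh
git clone --quiet "$workdir/repo.git" "$workdir/checkout"

echo "bare repository: $workdir/repo.git"
echo "working clone:   $workdir/checkout"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;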



&lt;h2&gt;
  
  
  post-receive hooks
&lt;/h2&gt;

&lt;p&gt;A &lt;a href="https://git-scm.com/docs/githooks#post-receive"&gt;post-receive hook&lt;/a&gt; is an executable which can do some work as soon as something new has been pushed to the repository. We will use an executable shell script named &lt;code&gt;post-receive&lt;/code&gt;, which needs to go inside the &lt;code&gt;hooks&lt;/code&gt; directory of the (bare) repository on the server side.&lt;/p&gt;

&lt;p&gt;The most trivial approach would be to do the actual work right in the hook, but that would block the &lt;code&gt;git push&lt;/code&gt; on the client side until the work is finished. If what you do takes only a short amount of time, that is fine and you can stop here. Alternatively you can use this repository for deployments only, by defining it as a separate remote. But the goal here is to have tests run on every push, so the hook will just enqueue a new job, print a handle and exit, and we will split the job creation from the actual run.&lt;/p&gt;

&lt;p&gt;This is where Redis comes into play for the job queueing. We will assume Redis is installed and running, and we will use &lt;code&gt;redis-cli&lt;/code&gt; to access it from the script. We will use two data structures: a list of jobs waiting to be executed, each referenced by a UUID we generate, and one hash per job where we store the git revision and ref associated with the job, as well as its state and output.&lt;/p&gt;

&lt;p&gt;Note that git passes the script one line per updated ref on stdin, each with three values: the old revision before the push, the new revision and the name of the ref.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; _ newrev ref
&lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;uuid&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Starting CI job &lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    redis-cli hset &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; rev &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$newrev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
    redis-cli hset &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; ref &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ref&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
    redis-cli lpush &lt;span class="nb"&gt;jobs&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
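
&lt;p&gt;One detail the hook glosses over: when a branch is deleted, git reports the new revision as all zeros, and there is nothing to check out for such a push. A possible guard could look like the sketch below. The function names &lt;code&gt;is_deletion&lt;/code&gt; and &lt;code&gt;handle_push&lt;/code&gt; are made up for this example, and the &lt;code&gt;echo&lt;/code&gt; merely stands in for the &lt;code&gt;redis-cli&lt;/code&gt; calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# True when the revision consists only of zeros, which is how git
# reports the new revision of a deleted ref
is_deletion() {
    [[ "$1" =~ ^0+$ ]]
}

# Reads "oldrev newrev ref" lines from stdin, like the hook's read loop
handle_push() {
    local newrev ref
    while read -r _ newrev ref
    do
        if is_deletion "$newrev"; then
            echo "Skipping deleted ref $ref"
            continue
        fi
        # placeholder for the redis-cli calls that enqueue the job
        echo "Would enqueue $newrev for $ref"
    done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Inside a hook, calling &lt;code&gt;handle_push&lt;/code&gt; as the last line would make it consume the lines git sends on stdin.&lt;/p&gt;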



&lt;h2&gt;
  
  
  Defining build jobs
&lt;/h2&gt;

&lt;p&gt;By convention our system will run whatever is in an executable script named &lt;code&gt;ci.sh&lt;/code&gt; at the top level of the repository. The drawback is that this is only safe when everyone with push access is trusted, so access to the repository needs to be guarded to prevent arbitrary code execution. The big advantage is that we don't need to come up with a job definition DSL or cumbersome file format.&lt;/p&gt;

&lt;p&gt;Our convention will also be that the script will be passed one argument: the name of the git ref, so we can decide what to do based on the branch we are on.&lt;/p&gt;

&lt;p&gt;Let's just put this into a file named &lt;code&gt;ci.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# the git ref gets passed in as the only argument&lt;/span&gt;
&lt;span class="nv"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# pretend we're running tests&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"running tests"&lt;/span&gt;

&lt;span class="c"&gt;# only deploy if we're on the main branch&lt;/span&gt;
&lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ref&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"refs/heads/main"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deploying"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
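
&lt;p&gt;Since the script receives the full ref name, such as &lt;code&gt;refs/heads/main&lt;/code&gt;, it may be handy to derive the short branch name from it. A small sketch using bash parameter expansion follows; the function name &lt;code&gt;branch_name&lt;/code&gt; is made up for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# Strip the "refs/heads/" prefix; refs that are not branches (e.g. tags)
# are printed unchanged
branch_name() {
    echo "${1#refs/heads/}"
}

branch_name "refs/heads/main"      # prints: main
branch_name "refs/tags/v1.0"       # prints: refs/tags/v1.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;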



&lt;h2&gt;
  
  
  The build runner
&lt;/h2&gt;

&lt;p&gt;Now that jobs are queued the last piece missing is a job runner. We will make use of Redis' &lt;a href="https://redis.io/commands/blpop"&gt;BLPOP command&lt;/a&gt; to block until the jobs list has a new job for us. That job id will give us the revision we need to check out and will allow us to write back the output and status of the job.&lt;/p&gt;

&lt;p&gt;Note that this assumes a clone of the repository, in a directory called &lt;code&gt;test&lt;/code&gt;, already exists right next to the script.&lt;/p&gt;

&lt;p&gt;tiny-ci.sh&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# ./runner.sh is supposed to run on the server where your git repository lives&lt;/span&gt;

&lt;span class="c"&gt;# the logic in here will run in an infinite loop:&lt;/span&gt;
&lt;span class="c"&gt;# * (block and) wait for a job&lt;/span&gt;
&lt;span class="c"&gt;# * run it&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; :
&lt;span class="k"&gt;do&lt;/span&gt;

&lt;span class="c"&gt;# Announce that we're waiting&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Job runner waiting"&lt;/span&gt;

&lt;span class="c"&gt;# We are using https://redis.io/commands/blpop to block until we have a new&lt;/span&gt;
&lt;span class="c"&gt;# message on the "jobs" list. We use `tail` to get the last line because the&lt;/span&gt;
&lt;span class="c"&gt;# output of BLPOP is of the form "list-that-got-an-element\nelement"&lt;/span&gt;
&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;redis-cli blpop &lt;span class="nb"&gt;jobs &lt;/span&gt;0 | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 1&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# The message we received will have the job uuid&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running job &lt;/span&gt;&lt;span class="nv"&gt;$jobid&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Get the git revision we're supposed to check out&lt;/span&gt;
&lt;span class="nv"&gt;rev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;redis-cli hget &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"rev"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo &lt;/span&gt;Checking out revision &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$rev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Get the git ref&lt;/span&gt;
&lt;span class="nv"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;redis-cli hget &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"ref"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Prepare the repository (hardcoded path) by getting that commit&lt;/span&gt;
&lt;span class="nb"&gt;cd test&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; git fetch &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git reset &lt;span class="nt"&gt;--hard&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$rev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c"&gt;# Actually runs the job and saves the output&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;./ci.sh &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ref&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;&amp;amp;1&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"failed"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c"&gt;# Update the result status&lt;/span&gt;
redis-cli hset &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"status"&lt;/span&gt; &lt;span class="nv"&gt;$status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c"&gt;# Update the job output&lt;/span&gt;
redis-cli hset &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"output"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Job &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;jobid&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; done"&lt;/span&gt;

&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Running it
&lt;/h2&gt;

&lt;p&gt;Summing up:&lt;/p&gt;

&lt;ul&gt;
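&lt;li&gt;there's a bare git repository somewhere, called &lt;code&gt;test.git&lt;/code&gt;, with the &lt;code&gt;post-receive&lt;/code&gt; hook from above in its &lt;code&gt;hooks&lt;/code&gt; directory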
&lt;/li&gt;
&lt;li&gt;we can clone the empty repo (or create a new one and add the respective remote)&lt;/li&gt;
&lt;li&gt;on the server hosting the git repository we clone &lt;code&gt;test.git&lt;/code&gt; into &lt;code&gt;test&lt;/code&gt; and place &lt;code&gt;tiny-ci.sh&lt;/code&gt; next to it&lt;/li&gt;
&lt;li&gt;we run builds by starting &lt;code&gt;tiny-ci.sh&lt;/code&gt; on the server hosting the repository&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now if we &lt;code&gt;git push&lt;/code&gt; a new commit to the &lt;code&gt;main&lt;/code&gt; branch with the &lt;code&gt;ci.sh&lt;/code&gt; file from above, the push output will include the job id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Enumerating objects: 5, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
...
remote: Starting CI job dab82634-21cc-11eb-b3b3-9b8767dff47c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Checking build status
&lt;/h2&gt;

&lt;p&gt;Knowing a job uuid, the easiest way to get the status of a build is to run the &lt;a href="https://redis.io/commands/hgetall"&gt;HGETALL&lt;/a&gt; command through &lt;code&gt;redis-cli&lt;/code&gt; with its &lt;code&gt;--csv&lt;/code&gt; output option.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;ssh example.com redis-cli &lt;span class="nt"&gt;--csv&lt;/span&gt; hgetall &lt;span class="nv"&gt;$JOB_UUID&lt;/span&gt;
&lt;span class="s2"&gt;"rev"&lt;/span&gt;,&lt;span class="s2"&gt;"f0706ea18a22031f84619b1161c8fbdb0dcd6850"&lt;/span&gt;,&lt;span class="s2"&gt;"ref"&lt;/span&gt;,&lt;span class="s2"&gt;"refs/heads/master"&lt;/span&gt;,&lt;span class="s2"&gt;"status"&lt;/span&gt;,&lt;span class="s2"&gt;"success"&lt;/span&gt;,&lt;span class="s2"&gt;"output"&lt;/span&gt;,&lt;span class="s2"&gt;"running tests&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Deploying"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Possible further improvements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;multi-repo support&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This would mean changes to the &lt;code&gt;post-receive&lt;/code&gt; hook to put jobs in a list named &lt;code&gt;job-${REPONAME}&lt;/code&gt; and then have the worker also react based on that. Notice how &lt;code&gt;redis-cli blpop&lt;/code&gt; takes several lists to watch and will also return the name of the list.&lt;/p&gt;
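
&lt;p&gt;As a rough sketch of the worker side: when its output is piped, &lt;code&gt;redis-cli blpop&lt;/code&gt; prints the list name on the first line and the element on the second, so the repository name could be recovered from the list name. The function name &lt;code&gt;parse_blpop_reply&lt;/code&gt; and the list names &lt;code&gt;job-test&lt;/code&gt; and &lt;code&gt;job-website&lt;/code&gt; are hypothetical.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# Split a two-line BLPOP reply (list name, then element) read from stdin
# into "repository-name job-id"
parse_blpop_reply() {
    local list jobid
    read -r list
    read -r jobid
    # drop the "job-" prefix to get the repository name
    echo "${list#job-} $jobid"
}

# In the runner this could be fed by a call watching several lists:
#   redis-cli blpop job-test job-website 0 | parse_blpop_reply
# Here a simulated reply is used instead:
printf 'job-test\nabc-123\n' | parse_blpop_reply    # prints: test abc-123
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;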

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;job cleanup&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

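&lt;p&gt;Creating a key for every job pollutes the Redis database unnecessarily. The job keys could be given a time-to-live so that they go away after one hour / one day / one week. Since each job is stored as a hash, this would be done with &lt;a href="https://redis.io/commands/expire"&gt;EXPIRE&lt;/a&gt; on the job key (&lt;a href="https://redis.io/commands/setex"&gt;SETEX&lt;/a&gt; only applies to plain string values). The purpose of Redis here is short-term storage and not long-term archival of job results.&lt;/p&gt;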
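
&lt;p&gt;A sketch of what expiring job keys could look like: since each job lives in a hash, one option is to call &lt;code&gt;EXPIRE&lt;/code&gt; on the job key right when the job is enqueued. The function name &lt;code&gt;enqueue_job&lt;/code&gt;, the &lt;code&gt;REDIS_CLI&lt;/code&gt; override (handy for dry runs) and the default of one day are assumptions made for this example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# Store a job as a hash, let Redis drop it after ttl seconds and queue it.
# Arguments: job id, revision, ref, optional ttl in seconds (default: one day)
enqueue_job() {
    # REDIS_CLI is a hypothetical override, e.g. REDIS_CLI=echo for a dry run
    local redis=${REDIS_CLI:-redis-cli}
    local id=$1 rev=$2 ref=$3 ttl=${4:-86400}

    "$redis" hset "$id" rev "$rev"
    "$redis" hset "$id" ref "$ref"
    "$redis" expire "$id" "$ttl"
    "$redis" lpush jobs "$id"
}

# Example call, expiring the job after one hour:
#   enqueue_job "$(uuid)" "$newrev" "$ref" 3600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;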

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;more workers&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Scaling to multiple workers on the same machine would need different working folders (and some process isolation depending on the tasks run in there). Scaling to multiple machines would need access to a central redis instance for job distribution.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;worker isolation / sandboxing&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more complex tasks some kind of process and file-system isolation is necessary. The worker could spin up VMs or Docker containers. The build system used on &lt;a href="https://builds.sr.ht"&gt;builds.sr.ht&lt;/a&gt; for instance uses a &lt;a href="https://man.sr.ht/builds.sr.ht/installation.md#security-model"&gt;Docker container run as an unprivileged user in a KVM qemu machine&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;timestamps&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For convenience you would definitely want timestamps for every operation. This also allows queries like "the last five jobs" and maintenance on job results based on their age.&lt;/p&gt;
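
&lt;p&gt;One possible shape for this, as a sketch only: store the time in the job hash and additionally index the job ids by time in a sorted set. The field name &lt;code&gt;queued_at&lt;/code&gt;, the key &lt;code&gt;jobs-by-time&lt;/code&gt;, the function name &lt;code&gt;record_job_time&lt;/code&gt; and the &lt;code&gt;REDIS_CLI&lt;/code&gt; override are all made up for this example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# Record when a job was queued, both in its hash and in a sorted set
# that is scored by the Unix timestamp
record_job_time() {
    # REDIS_CLI is a hypothetical override, e.g. REDIS_CLI=echo for a dry run
    local redis=${REDIS_CLI:-redis-cli}
    local id=$1 now
    now=$(date -u +%s)

    "$redis" hset "$id" queued_at "$now"
    "$redis" zadd jobs-by-time "$now" "$id"
}

# "The last five jobs", newest first, could then be listed with:
#   redis-cli zrevrange jobs-by-time 0 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;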

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;notifications&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any CI system will have some form of notifications, and the simplest form would be to do something in the script, right at the end. But this covers only the success case, so a better approach would be to create a notification queue and have a notification worker react to that.&lt;/p&gt;

</description>
      <category>git</category>
      <category>bash</category>
      <category>ssh</category>
      <category>ci</category>
    </item>
  </channel>
</rss>
