<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Augusto Pascutti</title>
    <description>The latest articles on Forem by Augusto Pascutti (@augustohp).</description>
    <link>https://forem.com/augustohp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F169486%2F09f7f992-fcfd-47b5-b78f-ffc0596aeaad.png</url>
      <title>Forem: Augusto Pascutti</title>
      <link>https://forem.com/augustohp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/augustohp"/>
    <language>en</language>
    <item>
      <title>Shell for anxious developers</title>
      <dc:creator>Augusto Pascutti</dc:creator>
      <pubDate>Tue, 10 Jun 2025 03:16:55 +0000</pubDate>
      <link>https://forem.com/augustohp/shell-primer-for-anxious-developers-38i8</link>
      <guid>https://forem.com/augustohp/shell-primer-for-anxious-developers-38i8</guid>
      <description>&lt;p&gt;There are &lt;a href="https://developer.apple.com/library/archive/documentation/OpenSource/Conceptual/ShellScripting/Introduction/Introduction.html" rel="noopener noreferrer"&gt;great&lt;/a&gt; &lt;a href="https://tldp.org/LDP/Bash-Beginners-Guide/html/" rel="noopener noreferrer"&gt;guides&lt;/a&gt; on &lt;code&gt;bash&lt;/code&gt; (or Bourne-compatible shell: &lt;code&gt;sh&lt;/code&gt;, &lt;code&gt;zsh&lt;/code&gt;, &lt;code&gt;ksh&lt;/code&gt;) &lt;a href="https://blog.sanctum.geek.nz/series/unix-as-ide/" rel="noopener noreferrer"&gt;out there&lt;/a&gt;. I don't want to teach you bash, or any special trick. I want to &lt;del&gt;convince&lt;/del&gt; show you why I think it is worth learning. It won't be much but, hopefully, it is enough, if the following itches your curiosity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" \
  | sed '/^\s*$/'d \
  | sort \
  | uniq -c \
  | sort -rn \
  | head
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I assume you already know how to use a shell to run commands, and that you have &lt;code&gt;git&lt;/code&gt; installed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Composition using Pipes
&lt;/h2&gt;

&lt;p&gt;On a &lt;a href="https://en.wikipedia.org/wiki/POSIX" rel="noopener noreferrer"&gt;POSIX&lt;/a&gt; shell, &lt;a href="https://www.gnu.org/software/bash/" rel="noopener noreferrer"&gt;bash&lt;/a&gt; for example, you can use &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Pipelines.html" rel="noopener noreferrer"&gt;pipes&lt;/a&gt; (&lt;code&gt;|&lt;/code&gt;) to use the &lt;em&gt;output&lt;/em&gt; of a program as &lt;em&gt;input&lt;/em&gt; of another:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ seq 1 5
1
2
3
4
5

$ seq 1 5 | sort -n -r
5
4
3
2
1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
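
&lt;p&gt;As a minimal sketch of the same idea with a longer pipeline, the snippet below counts how often each line appears on standard input. The function name &lt;code&gt;rank_lines&lt;/code&gt; is hypothetical, made up for this illustration, and it assumes every input line is a single word:&lt;/p&gt;

```shell
# Hypothetical helper (not from the article): rank the lines read from
# standard input by how many times each one appears.
# sort groups identical lines, uniq -c counts each group, sort -rn puts the
# highest count first and awk normalizes the padding that uniq adds.
rank_lines() {
  sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

printf '%s\n' b a b c b a | rank_lines
# prints:
# 3 b
# 2 a
# 1 c
```

&lt;p&gt;Each program only reads text and writes text, which is why they can be chained in any order that makes sense.&lt;/p&gt;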

&lt;p&gt;To learn what a command does you can use &lt;code&gt;man &amp;lt;command&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;command&amp;gt; --help&lt;/code&gt;, &lt;code&gt;info &amp;lt;command&amp;gt;&lt;/code&gt; or &lt;code&gt;help &amp;lt;command&amp;gt;&lt;/code&gt;. An excerpt from the &lt;code&gt;man&lt;/code&gt; pages of commands above shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;seq &amp;lt;first&amp;gt; &amp;lt;last&amp;gt;&lt;/code&gt; prints a sequence of numbers from &lt;code&gt;first&lt;/code&gt; to &lt;code&gt;last&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sort [options] [file]&lt;/code&gt; sorts lines of text files. Without a &lt;code&gt;file&lt;/code&gt;, it reads from standard input.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notice things between &lt;code&gt;[brackets]&lt;/code&gt; and &lt;code&gt;&amp;lt;less-greater signs&amp;gt;&lt;/code&gt;? They mean &lt;code&gt;[optional]&lt;/code&gt; and &lt;code&gt;&amp;lt;required&amp;gt;&lt;/code&gt;, a convention almost everyone follows. All programs used in these examples are available even on the most basic distributions, including Alpine Linux, which is known for being very small and lean:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker run --rm -it alpine sh
# seq 1 3 | sort -n -r
3
2
1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It is worth noting that &lt;code&gt;man&lt;/code&gt; (and its counterparts) work offline. &lt;a href="https://wiki.archlinux.org/title/Man_page" rel="noopener noreferrer"&gt;Getting to know them&lt;/a&gt; and the pager (which renders content bigger than the display, &lt;code&gt;man less&lt;/code&gt; to know more) will give you access to invaluable knowledge (git man pages are a treat).&lt;/p&gt;

&lt;h2&gt;
  
  
  Loops and Conditionals
&lt;/h2&gt;

&lt;p&gt;You can think about a shell as a "place to run other programs". It really is an &lt;a href="https://en.wikipedia.org/wiki/Infinite_loop" rel="noopener noreferrer"&gt;infinite loop&lt;/a&gt; running one command: &lt;a href="https://www.gnu.org/software/readline" rel="noopener noreferrer"&gt;readline&lt;/a&gt;. Once you wrap your head around that, you can quickly develop and debug small programs, like a never-ending test-suite.&lt;/p&gt;

&lt;p&gt;I like to approach this by using &lt;a href="https://www.gnu.org/software/bash/manual/html_node/History-Interaction.html" rel="noopener noreferrer"&gt;history expansion&lt;/a&gt; (zsh, the default shell on macOS, also &lt;a href="https://zsh.sourceforge.io/Doc/Release/Expansion.html#History-Expansion" rel="noopener noreferrer"&gt;has it&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;!!&lt;/code&gt; expands to the last command executed, whether it succeeded or not.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;!&amp;lt;prefix&amp;gt;&lt;/code&gt; expands to the most recent command that starts with &lt;code&gt;prefix&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;!$&lt;/code&gt; expands to the last (&lt;code&gt;$&lt;/code&gt; on regex is used as "the end of a string") argument of the last executed command (works on both &lt;code&gt;bash&lt;/code&gt; and &lt;code&gt;zsh&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Well-behaved CLI programs play along with them. After a &lt;code&gt;git clone &amp;lt;repo&amp;gt; [dir]&lt;/code&gt;, for example, you can &lt;code&gt;cd !$&lt;/code&gt; to enter the directory you've just cloned. Notice how the last argument is the one useful to other commands. That, my great comrade, is good design. Remember this and you will remember the order of arguments for some pretty useful programs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ln -s &amp;lt;path/to/file&amp;gt; &amp;lt;path/to/symlink&amp;gt;&lt;/code&gt;: The &lt;em&gt;symlink&lt;/em&gt; is the useful part, so it is last. You can &lt;code&gt;!$&lt;/code&gt; to run the binary or &lt;code&gt;cd !$&lt;/code&gt; if it is a directory.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cp &amp;lt;source [source [source]]&amp;gt; &amp;lt;dest&amp;gt;&lt;/code&gt;: You can copy multiple files and directories to one &lt;em&gt;destination&lt;/em&gt;, which is the useful part. So it is last.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Back to our "never-ending test-suite": I try a command until I am satisfied with its result and then pass it on with &lt;a href="https://www.gnu.org/software/bash/manual/html_node/History-Interaction.html" rel="noopener noreferrer"&gt;history expansion&lt;/a&gt; to a loop or another command.&lt;/p&gt;

&lt;p&gt;Suppose you want to update all Git repositories inside your &lt;code&gt;$HOME&lt;/code&gt; directory. The outline of the idea: (1) find all directories with &lt;code&gt;.git&lt;/code&gt; inside of them, (2) for every repository &lt;code&gt;cd &amp;lt;repo&amp;gt;&lt;/code&gt; into it and (3) run &lt;code&gt;git pull&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ find "$HOME" -type d -name ".git"
/home/augustohp/.tmux/plugins/tpm/.git
/home/augustohp/.vim/bundle/vim-nerdtree-tabs/.git
/home/augustohp/.vim/bundle/nvim-lspconfig/.git
/home/augustohp/.vim/bundle/trouble.nvim/.git
/home/augustohp/src/github.com/expressjs/.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command above lists all &lt;code&gt;.git&lt;/code&gt; (&lt;code&gt;-name&lt;/code&gt;) directories (&lt;code&gt;-type d&lt;/code&gt;) inside &lt;code&gt;$HOME&lt;/code&gt;. Note that each result ends in &lt;code&gt;.git&lt;/code&gt;, while we want its parent directory. So I use &lt;code&gt;sed&lt;/code&gt; to remove &lt;code&gt;/.git&lt;/code&gt; from the end of each line, and keep trying until I have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ !find | sed 's/\/\.git$//'
/home/augustohp/.tmux/plugins/tpm
/home/augustohp/.vim/bundle/vim-nerdtree-tabs
/home/augustohp/.vim/bundle/nvim-lspconfig
/home/augustohp/.vim/bundle/trouble.nvim
/home/augustohp/src/github.com/expressjs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sed&lt;/code&gt; accepts almost any character as the &lt;em&gt;regular expression&lt;/em&gt; delimiter. We are using &lt;code&gt;/&lt;/code&gt; (as most examples you will see do), but when dealing with paths (which use &lt;code&gt;/&lt;/code&gt; as directory separator) it is useful to pick another one, avoiding the escape (&lt;code&gt;\&lt;/code&gt;). Dot is also a special character we need to escape (&lt;code&gt;\&lt;/code&gt;), because it matches "any character". Using another delimiter the command becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ find "$HOME" -type d -name ".git" | sed 's#/\.git$##'
/home/augustohp/.tmux/plugins/tpm
/home/augustohp/.vim/bundle/vim-nerdtree-tabs
/home/augustohp/.vim/bundle/nvim-lspconfig
/home/augustohp/.vim/bundle/trouble.nvim
/home/augustohp/src/github.com/expressjs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
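
&lt;p&gt;To see why both the escaped dot and the &lt;code&gt;$&lt;/code&gt; anchor matter, here is a small sketch that feeds two made-up paths through the same substitution. The function name &lt;code&gt;strip_git_suffix&lt;/code&gt; and the paths are hypothetical, chosen only for this illustration:&lt;/p&gt;

```shell
# Hypothetical helper (not from the article): remove a trailing "/.git"
# from every path read on standard input.
strip_git_suffix() {
  # "#" is the delimiter here, so the "/" in the pattern needs no escaping.
  sed 's#/\.git$##'
}

printf '%s\n' /tmp/example/.git /tmp/agit | strip_git_suffix
# prints:
# /tmp/example
# /tmp/agit
```

&lt;p&gt;The second path is left untouched because the escaped dot only matches a literal dot. Without the escape, &lt;code&gt;/agit&lt;/code&gt; would also match and be removed.&lt;/p&gt;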



&lt;p&gt;&lt;a href="https://www.gnu.org/software/bash/" rel="noopener noreferrer"&gt;Bash&lt;/a&gt;, like other shells, has &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html" rel="noopener noreferrer"&gt;conditions&lt;/a&gt; and &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Looping-Constructs.html" rel="noopener noreferrer"&gt;loops&lt;/a&gt;. With &lt;em&gt;variables&lt;/em&gt; and &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html" rel="noopener noreferrer"&gt;command substitution&lt;/a&gt;, we can start to compose more complex instructions:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ find "$HOME" -type d -name ".git" | sed 's/\/\.git$//'
$ repositories=$(!!)
$ for repo in $repositories
do
  cd "$repo"
  git pull --autostash
  cd -
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;$(!!)&lt;/code&gt; executes the previous command (&lt;code&gt;!!&lt;/code&gt;) inside a &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html" rel="noopener noreferrer"&gt;sub-shell&lt;/a&gt; and returns its &lt;em&gt;output&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;repositories=$(!!)&lt;/code&gt; stores the output of the previous command (&lt;code&gt;$(!!)&lt;/code&gt;) in the &lt;code&gt;repositories&lt;/code&gt; variable.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;for name [ [in [words …] ] ; ] do commands; done&lt;/code&gt; executes a loop:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cd "$repo"&lt;/code&gt; enters the repository. It is good to always quote (&lt;code&gt;"&lt;/code&gt;) &lt;em&gt;paths&lt;/em&gt; because they might have spaces in their names.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;git pull --autostash&lt;/code&gt; stashes any uncommitted changes, updates the repository and then reapplies the stash.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cd -&lt;/code&gt; returns to the previous directory, the one we were in before &lt;code&gt;cd "$repo"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If you want to do that in one line, you need to change &lt;code&gt;\n&lt;/code&gt; (new line) to &lt;code&gt;;&lt;/code&gt;. If you search the command using &lt;code&gt;history&lt;/code&gt;, you will see it on that short format.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Let's say you don't want to update repositories that have uncommitted changes in them. For that, the output of &lt;code&gt;git status --porcelain&lt;/code&gt; (a short, script-friendly format that prints nothing when the working tree is clean) should be empty, which can be tested with &lt;code&gt;test -z&lt;/code&gt; (&lt;code&gt;man test&lt;/code&gt; to see available operators for &lt;code&gt;if&lt;/code&gt; conditions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ for repo in $(find "$HOME" -type d -name ".git" | sed 's/\/\.git$//')
do
  cd "$repo"
  git_status_output="$(git status --porcelain)"
  if [ -z "$git_status_output" ]
  then
    git pull --autostash
  else
    echo "Error: $repo has uncommitted changes."
  fi
  cd -
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conditionals and exit codes
&lt;/h2&gt;

&lt;p&gt;You know conditionals, right? On shells they look the same but they have a twist, one that is useful for running commands: the exit status of a command can always be evaluated as a conditional. If it runs successfully, it is &lt;code&gt;true&lt;/code&gt;. Every command that returns &lt;code&gt;0&lt;/code&gt; (zero) is successful; any other value means failure, so commands can have as many error codes as they want.&lt;br&gt;
I've made the instructions above longer to improve understanding, usually I'd one-line them with &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; (AND) and &lt;code&gt;||&lt;/code&gt; (OR) operators. First, see what a failing command returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cd /tmp/non-existing-directory
-bash: cd: /tmp/non-existing-directory: No such file or directory
$ echo $?
1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The special variable &lt;code&gt;$?&lt;/code&gt; has the return code of the previous command. Since it is &lt;code&gt;1&lt;/code&gt; it was an error, if the error message did not give it away. As you've guessed, you can do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ if cd /tmp/non-existing-directory
then
    echo "great success!"
else
    echo "not"
fi
-bash: cd: /tmp/non-existing-directory: No such file or directory
not
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can, of course, get rid of these error messages using &lt;a href="https://www.gnu.org/software/bash/manual/html_node/Redirections.html" rel="noopener noreferrer"&gt;redirections&lt;/a&gt;:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cd /tmp/non-existing-directory 2&amp;gt; /dev/null
$ echo $?
1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;2&amp;gt;&lt;/code&gt; redirects &lt;em&gt;file descriptor&lt;/em&gt; &lt;code&gt;2&lt;/code&gt; (&lt;code&gt;stderr&lt;/code&gt;) to &lt;code&gt;/dev/null&lt;/code&gt;. You can also shorten every conditional using &lt;code&gt;||&lt;/code&gt; and &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; operators:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ test -z "$git_status_output" &amp;amp;&amp;amp; git pull --autostash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This executes &lt;code&gt;git pull&lt;/code&gt; only if &lt;code&gt;test -z&lt;/code&gt; succeeds, that is, returns status code (&lt;code&gt;$?&lt;/code&gt;) &lt;code&gt;0&lt;/code&gt;. With &lt;code&gt;||&lt;/code&gt; it is the other way around: the command on the right runs only if the one on the left fails. As the shell already has conditions built into the REPL, the &lt;code&gt;test&lt;/code&gt; program just adds some handy operators:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-z&lt;/code&gt; for testing for empty strings and &lt;code&gt;-n&lt;/code&gt; for non empty strings.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-f&lt;/code&gt; for existing file and &lt;code&gt;-d&lt;/code&gt; for directories.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-lt&lt;/code&gt; for "less than" and &lt;code&gt;-le&lt;/code&gt; for "less than or equal".&lt;/li&gt;
&lt;/ul&gt;
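
&lt;p&gt;A minimal sketch of how these operators combine with the exit status of a command. The function name &lt;code&gt;describe_path&lt;/code&gt; is hypothetical, invented for this illustration; only &lt;code&gt;test&lt;/code&gt;, &lt;code&gt;if&lt;/code&gt; and &lt;code&gt;return&lt;/code&gt; are standard:&lt;/p&gt;

```shell
# Hypothetical helper (not from the article): say what kind of thing a path
# is. The function itself returns 0 on success and 1 when nothing exists at
# that path, so it can be used in conditionals like any other command.
describe_path() {
  if test -d "$1"
  then
    echo "directory"
  elif test -f "$1"
  then
    echo "file"
  else
    echo "missing"
    return 1
  fi
}

describe_path /
# prints: directory
```

&lt;p&gt;Because the function sets its own exit status, something like &lt;code&gt;if describe_path "$HOME/.bashrc"&lt;/code&gt; behaves exactly like the &lt;code&gt;if cd&lt;/code&gt; example above.&lt;/p&gt;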

&lt;p&gt;How do you see other &lt;em&gt;conditional operators&lt;/em&gt;? Since it is a program: &lt;code&gt;man test&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What can you do with it?
&lt;/h2&gt;

&lt;p&gt;This may look like "too much" at first glance but think about it: how many things could you automate, since everything is a program and follows the same conventions?&lt;/p&gt;

&lt;p&gt;If, for example, you have &lt;code&gt;gh&lt;/code&gt; (GitHub CLI program) installed, you can clone all the repositories of an organisation with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for repo in $(gh repo list --limit 200 --source --no-archived "$owner" | awk '{print $1 }')
do
  gh repo clone "$repo"
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As long as programs return text (spoiler alert: they will) you can compose them with other programs. If you need to transform text, for example, you have some great tools already available. Here are the ones I've used the most:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ alias rank="sort | uniq -c | sort -nr"
$ alias second_column_only='awk "{ print \$2 }"'
$ alias top10="rank | head -n 10 | second_column_only"
$ history | second_column_only | top10
awk
column
sed
cut
cat
tr
split
mktemp
fg
z - (zoxide, this one needs installation)
fzf - (fuzzy finder, this too needs installation)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What seems like a limitation at first, the output is just text, is actually great software design. You will notice everything is already done for you: from getting the nth column of an output to splitting a huge file into smaller ones (with &lt;code&gt;split&lt;/code&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  What now?
&lt;/h2&gt;

&lt;p&gt;Time to make your own &lt;code&gt;history&lt;/code&gt;. Make sure it is configured right on your shell, I like to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Keep it big, disk space is cheap. The default usually only holds a few hundred commands. I like to keep a lot more. You can use &lt;code&gt;CTRL-R&lt;/code&gt; to search it and, since its output is text, you can... you get the idea.&lt;/li&gt;
&lt;li&gt;Ignore entries that start with space. You will always type something (e.g.: API Key) you don't want to keep saved in a file somewhere.&lt;/li&gt;
&lt;/ol&gt;
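
&lt;p&gt;For bash, the two preferences above could look like the sketch below, placed in a startup file such as &lt;code&gt;~/.bashrc&lt;/code&gt;. The variable names are the ones bash documents; the sizes are arbitrary values picked for illustration, and zsh uses different variable names:&lt;/p&gt;

```shell
# A sketch for ~/.bashrc (bash only). The numbers are arbitrary examples.
HISTSIZE=100000          # commands kept in memory for the current session
HISTFILESIZE=200000      # lines kept in the history file on disk
HISTCONTROL=ignorespace  # do not record commands that start with a space
```

&lt;p&gt;Open a new shell after changing the file, then check the values with &lt;code&gt;echo "$HISTSIZE"&lt;/code&gt;.&lt;/p&gt;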

&lt;p&gt;I know it is tempting to Google for one-liners and such. Try not to. The best feature of a &lt;em&gt;shell&lt;/em&gt; is that you can make it your own. Unlike an IDE or GUI, it expects you to customize it: to make its output your own. So use it: find a pattern, create a shortcut to it and learn something new (&lt;code&gt;man&lt;/code&gt; pages). All shells allow you to load custom files on startup, use them.&lt;/p&gt;

&lt;p&gt;The shell is a program. If you are a programmer, make it a good one. The journey will teach you &lt;strong&gt;a lot&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>bash</category>
      <category>shell</category>
      <category>linux</category>
    </item>
    <item>
      <title>Cleaning Quake server logs to generate score boards</title>
      <dc:creator>Augusto Pascutti</dc:creator>
      <pubDate>Mon, 03 Mar 2025 21:29:09 +0000</pubDate>
      <link>https://forem.com/augustohp/awk-your-way-parsing-logs-108e</link>
      <guid>https://forem.com/augustohp/awk-your-way-parsing-logs-108e</guid>
      <description>&lt;p&gt;It is a &lt;a href="https://gist.github.com/augustohp/073936cc213fe96bc99a498932c18be7" rel="noopener noreferrer"&gt;common&lt;/a&gt; &lt;a href="https://github.com/misaku/QuakeLogParser/blob/master/DOC/DESAFIO.md" rel="noopener noreferrer"&gt;challenge&lt;/a&gt; for &lt;em&gt;technical interviews&lt;/em&gt; to parse Quake 3 server logs and display:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Players in a match&lt;/li&gt;
&lt;li&gt;Player score card, listing player names and kill count:

&lt;ol&gt;
&lt;li&gt;Ignore &lt;code&gt;&amp;lt;world&amp;gt;&lt;/code&gt; as a &lt;em&gt;player&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;&amp;lt;world&amp;gt;&lt;/code&gt; kills a &lt;em&gt;player&lt;/em&gt;, add &lt;code&gt;-1&lt;/code&gt; to &lt;em&gt;player&lt;/em&gt;'s kill count&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;(optional) Group &lt;em&gt;outputs&lt;/em&gt; above by &lt;em&gt;match&lt;/em&gt;
&lt;/li&gt;

&lt;li&gt;(optional) Death cause report by &lt;em&gt;match&lt;/em&gt;
&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;Working with files is a common practice for any developer. Using &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;awk&lt;/a&gt; not so much, even though it is IMHO one of the best tools for doing so:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The language is built for (1) text matching and (2) manipulation.&lt;/li&gt;
&lt;li&gt;Working with small files is as easy as it is working with very large files.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Intending to spread the knowledge of the &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;tool&lt;/a&gt; to more people, let's solve the challenge with &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;AWK&lt;/a&gt; and see how you can effectively start using it in your workflow today. I assume you know a programming language well, know your way around a (*nix) CLI and that we are using &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;GNU awk&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The beginning of a not so usual program
&lt;/h2&gt;

&lt;p&gt;As is common with other Unix tools, it is better to break the program into smaller pieces: &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;Awk&lt;/a&gt; programs &lt;a href="https://www.gnu.org/software/gawk/manual/html_node/When.html#:~:text=If%20you%20find%20yourself%20writing%20awk%20scripts%20of%20more%20than%2C%20say%2C%20a%20few%20hundred%20lines%2C%20you%20might%20consider%20using%20a%20different%20programming%20language." rel="noopener noreferrer"&gt;longer than a few hundred lines are difficult to maintain&lt;/a&gt;. &lt;br&gt;
Here are the different programs we are going to create:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;clean.awk&lt;/code&gt; will read &lt;em&gt;input&lt;/em&gt; files, which are the original &lt;a href="https://gist.githubusercontent.com/cloudwalk-tests/be1b636e58abff14088c8b5309f575d8/raw/df6ef4a9c0b326ce3760233ef24ae8bfa8e33940/qgames.log" rel="noopener noreferrer"&gt;log&lt;/a&gt; files, and &lt;em&gt;output&lt;/em&gt; a cleaner version of their content. Containing just the data we need to manipulate and use.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scoreboard.awk&lt;/code&gt; will use the &lt;em&gt;output&lt;/em&gt; from the previous &lt;em&gt;program&lt;/em&gt; to produce the score boards for each game.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's create a &lt;em&gt;walking skeleton&lt;/em&gt; to run and debug our progress while tackling the challenge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /tmp/awk-quake
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;--remote-name&lt;/span&gt; &lt;span class="nt"&gt;-L&lt;/span&gt; https://gist.githubusercontent.com/augustohp/073936cc213fe96bc99a498932c18be7/raw/9e52e4da221f2f0ce1dfc11f57c1679a2cdb77f5/qgames.log
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;tail &lt;/span&gt;qgames.log
 13:55 Kill: 3 4 6: Oootsimo killed Dono da Bola by MOD_ROCKET
 13:55 Exit: Fraglimit hit.
 13:55 score: 20  ping: 8  client: 3 Oootsimo
 13:55 score: 19  ping: 14  client: 6 Zeh
 13:55 score: 17  ping: 1  client: 2 Isgalamido
 13:55 score: 13  ping: 0  client: 5 Assasinu Credi
 13:55 score: 10  ping: 8  client: 4 Dono da Bola
 13:55 score: 6  ping: 19  client: 7 Mal
 14:11 ShutdownGame:
 14:11 &lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;clean.awk
&lt;span class="o"&gt;{&lt;/span&gt; print &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;watch gawk &lt;span class="nt"&gt;-f&lt;/span&gt; clean.awk qgames.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Above we:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Downloaded &lt;code&gt;qgames.log&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Created &lt;code&gt;clean.awk&lt;/code&gt; that prints everything passed to it&lt;/li&gt;
&lt;li&gt;Executed the program every couple of seconds (with &lt;code&gt;watch&lt;/code&gt;) to see its result while we change it in another session (to stop &lt;code&gt;watch&lt;/code&gt;, use CTRL-C)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's change &lt;code&gt;clean.awk&lt;/code&gt; to filter just the lines useful to us, and help us debug what to do with them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight awk"&gt;&lt;code&gt;&lt;span class="kr"&gt;BEGIN&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kc"&gt;FS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;" "&lt;/span&gt;
    &lt;span class="kc"&gt;RS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"\n"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/Init/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/kill/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;debug_fields&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;debug_fields&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="kc"&gt;NF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%d: %s\n"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Don't despair yet, what we are doing is pretty simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;BEGIN&lt;/code&gt; is a &lt;em&gt;special block&lt;/em&gt; that gets executed &lt;strong&gt;once&lt;/strong&gt; at the start
of the parsing:

&lt;ol&gt;
&lt;li&gt;We use it to (re-)define some &lt;em&gt;special variables&lt;/em&gt;:

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;FS&lt;/code&gt; defines the &lt;strong&gt;field separator&lt;/strong&gt; (space). It is used to break a &lt;em&gt;matching line&lt;/em&gt; into smaller pieces, the &lt;em&gt;fields&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RS&lt;/code&gt; defines the &lt;strong&gt;record separator&lt;/strong&gt; (new line). Everything until that character will be treated as a &lt;em&gt;line&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;/match/ { action }&lt;/code&gt; blocks execute a set of &lt;code&gt;actions&lt;/code&gt; when a &lt;code&gt;match&lt;/code&gt; (regex supported) is found:

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;/Init/ { print }&lt;/code&gt; prints every line that has &lt;code&gt;Init&lt;/code&gt; on it, without doing anything more.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/kill/ { debug_fields() }&lt;/code&gt; executes the &lt;code&gt;debug_fields()&lt;/code&gt; function for every line that has a matching &lt;code&gt;kill&lt;/code&gt; string on it.&lt;/li&gt;
&lt;li&gt;Every line that doesn't match the rules above is ignored.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;function debug_fields()&lt;/code&gt; prints all fields identified after breaking the &lt;em&gt;line&lt;/em&gt; with &lt;code&gt;FS&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;NF&lt;/code&gt; is a &lt;em&gt;special variable&lt;/em&gt; containing the &lt;em&gt;number of fields&lt;/em&gt; parsed for the current line.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;$n&lt;/code&gt; is the &lt;em&gt;field&lt;/em&gt; &lt;code&gt;n&lt;/code&gt; parsed. Inside the &lt;em&gt;loop&lt;/em&gt; &lt;code&gt;$i&lt;/code&gt; will become &lt;code&gt;$1&lt;/code&gt;, &lt;code&gt;$2&lt;/code&gt;, &lt;code&gt;$3&lt;/code&gt; and so on up to the last field, allowing us to retrieve the contents of every field on that line, displaying something like:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1: 20:54
2: Kill:
3: 1022
4: 2
5: 22:
6: &amp;lt;world&amp;gt;
7: killed
8: Isgalamido
9: by
10: MOD_TRIGGER_HURT
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;em&gt;output&lt;/em&gt; above is useful to debug the current line contents we can work with. Try changing &lt;code&gt;debug_fields()&lt;/code&gt; &lt;em&gt;action&lt;/em&gt; to &lt;code&gt;print $6 " killed " $8&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;With small changes, we can use &lt;code&gt;$6&lt;/code&gt; (killer) and &lt;code&gt;$8&lt;/code&gt; (killed) to display who killed whom, which is pretty much everything we need.&lt;/p&gt;
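
&lt;p&gt;The field numbering can be tried on a single line without touching &lt;code&gt;clean.awk&lt;/code&gt;. The sketch below feeds two lines taken from the log excerpt shown earlier to a one-rule awk program. The function name &lt;code&gt;who_killed_whom&lt;/code&gt; is hypothetical, made up for this illustration:&lt;/p&gt;

```shell
# Hypothetical helper (not from the article): for each Kill line on standard
# input, print field 6 (killer) and field 8 (first word of the victim name).
who_killed_whom() {
  awk '/Kill:/ { print $6, "killed", $8 }'
}

printf '%s\n' \
  ' 13:55 Kill: 3 4 6: Oootsimo killed Dono da Bola by MOD_ROCKET' \
  ' 13:55 Exit: Fraglimit hit.' \
  | who_killed_whom
# prints: Oootsimo killed Dono
```

&lt;p&gt;The second line does not match the pattern, so it produces no output at all.&lt;/p&gt;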

&lt;p&gt;🐛 If player names did not contain spaces we'd be done. But &lt;code&gt;Assasinu Credi&lt;/code&gt;, for example, breaks our algorithm because we use &lt;em&gt;spaces&lt;/em&gt; to separate fields. &lt;br&gt;
When he kills someone &lt;code&gt;$8&lt;/code&gt; will be &lt;code&gt;killed&lt;/code&gt; instead of the other player's name. &lt;/p&gt;

&lt;p&gt;Let's see this happening:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight awk"&gt;&lt;code&gt;&lt;span class="kr"&gt;BEGIN&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kc"&gt;FS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;" "&lt;/span&gt;
    &lt;span class="kc"&gt;RS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"\n"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/Init/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;next&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/Assas/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nv"&gt;$6&lt;/span&gt; &lt;span class="s2"&gt;" killed "&lt;/span&gt; &lt;span class="nv"&gt;$8&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The program above ignores (with the &lt;code&gt;next&lt;/code&gt; action) lines matching &lt;code&gt;Init&lt;/code&gt; and prints only the lines matching &lt;code&gt;Assas&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; clean.awk qgames.log
Zeh killed Assasinu
&amp;lt;world&amp;gt; killed Assasinu
Isgalamido killed Assasinu
Zeh killed Assasinu
Assasinu killed killed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that the &lt;code&gt;Assasinu killed killed&lt;/code&gt; line is wrong. It doesn't have the name of the &lt;em&gt;killed&lt;/em&gt; player. Let's fix this!&lt;/p&gt;

&lt;h2&gt;
  
  
  Making things more reliable with regex
&lt;/h2&gt;

&lt;p&gt;The final &lt;code&gt;clean.awk&lt;/code&gt; program is below. It &lt;em&gt;substitutes&lt;/em&gt; some strings with the &lt;code&gt;nfs&lt;/code&gt; (new field separator) variable and removes the prefix from lines that report a kill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight awk"&gt;&lt;code&gt;&lt;span class="kr"&gt;BEGIN&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kc"&gt;FS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;" "&lt;/span&gt;
    &lt;span class="nx"&gt;LFS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"\n"&lt;/span&gt;
    &lt;span class="nx"&gt;nfs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"|"&lt;/span&gt;
    &lt;span class="nx"&gt;current_game&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/Init/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;current_game&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="sr"&gt;/kill/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt; 0-9:&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+ Kill: &lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;0-9: &lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/ killed /&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;nfs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/ by /&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;nfs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt; &lt;span class="nx"&gt;nfs&lt;/span&gt; &lt;span class="nx"&gt;current_game&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The &lt;code&gt;BEGIN&lt;/code&gt; section declares 2 new variables:

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;nfs&lt;/code&gt; separates output fields with something other than spaces, so that the next programs can easily handle player names that contain them.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;current_game&lt;/code&gt; is a variable that gets incremented every time a new game starts.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;/Init/&lt;/code&gt; marks a new game:

&lt;ol&gt;
&lt;li&gt;Increments the variable &lt;code&gt;current_game&lt;/code&gt; for the next time it gets used&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;For every &lt;code&gt;/kill/&lt;/code&gt;:

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;sub(regex, replacement, target)&lt;/code&gt; replaces the first match of &lt;code&gt;regex&lt;/code&gt; in &lt;code&gt;target&lt;/code&gt; with &lt;code&gt;replacement&lt;/code&gt;, modifying &lt;code&gt;target&lt;/code&gt; in place. &lt;code&gt;$0&lt;/code&gt; is the whole current line.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sub(/^[ ...&lt;/code&gt; removes the prefix of the line until the player name.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sub(/ by...&lt;/code&gt; and &lt;code&gt;sub(/ killed...&lt;/code&gt; replace these matches with &lt;code&gt;nfs&lt;/code&gt; (the new field separator), allowing us to easily identify the killer (&lt;code&gt;$1&lt;/code&gt;), who got killed (&lt;code&gt;$2&lt;/code&gt;) and how he got killed (&lt;code&gt;$3&lt;/code&gt;) once the output is read with &lt;code&gt;|&lt;/code&gt; as the field separator.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;print&lt;/code&gt; will print the current line (&lt;code&gt;$0&lt;/code&gt;) with the &lt;em&gt;current game&lt;/em&gt; as a suffix:

&lt;ul&gt;
&lt;li&gt;As every &lt;code&gt;sub()&lt;/code&gt; replaces the current line (&lt;code&gt;$0&lt;/code&gt;), we now have only what we needed.&lt;/li&gt;
&lt;li&gt;As &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;awk&lt;/a&gt; programs operate on lines, it is easier to have everything we need on them. That is why we add &lt;em&gt;current game&lt;/em&gt; to every line.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;
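
&lt;p&gt;To watch the three substitutions work on a single line, here is a minimal sketch. The sample line is made up for illustration (it only mimics the log format shown earlier) and &lt;code&gt;clean_kill_line&lt;/code&gt; is a hypothetical helper name, not part of &lt;code&gt;clean.awk&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: applies the same three sub() calls to one line.
clean_kill_line() {
    printf '%s\n' "$1" | awk '{
        sub(/^[ 0-9:]+ Kill: [0-9: ]+/, "", $0)  # drop timestamp and ids
        sub(/ killed /, "|", $0)                 # killer|rest of the line
        sub(/ by /, "|", $0)                     # killer|killed|how
        print $0
    }'
}

# Made-up sample line; the killer has a space in the name on purpose.
sample=' 21:07 Kill: 2 3 7: Assasinu Credi killed Zeh by MOD_ROCKET'
clean_kill_line "$sample"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The printed line should be &lt;code&gt;Assasinu Credi|Zeh|MOD_ROCKET&lt;/code&gt;: the name with a space survives intact because we no longer rely on spaces to split fields.&lt;/p&gt;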

&lt;p&gt;Executing the program above produces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; clean.awk qgames.log | &lt;span class="nb"&gt;tee &lt;/span&gt;qgames-clean.log
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
Isgalamido|Mocinha|MOD_ROCKET_SPLASH|2
Isgalamido|Isgalamido|MOD_ROCKET_SPLASH|2
Isgalamido|Isgalamido|MOD_ROCKET_SPLASH|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_FALLING|2
&amp;lt;world&amp;gt;|Isgalamido|MOD_TRIGGER_HURT|2
Isgalamido|Mocinha|MOD_ROCKET|3
&amp;lt;world&amp;gt;|Zeh|MOD_TRIGGER_HURT|3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the &lt;code&gt;qgames-clean.log&lt;/code&gt; file we can now easily achieve every objective of the original challenge without having to deal with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unneeded context.&lt;/li&gt;
&lt;li&gt;Space separators. With &lt;code&gt;FS = "|"&lt;/code&gt; we use &lt;code&gt;|&lt;/code&gt; as &lt;em&gt;field separator&lt;/em&gt; and have:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;$1&lt;/code&gt; as the killer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$2&lt;/code&gt; who got killed&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$3&lt;/code&gt; how &lt;em&gt;killer&lt;/em&gt; killed &lt;em&gt;killed&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$4&lt;/code&gt; in which game that happened&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;A "checkpoint". If the log changes format, or we discover a bug, as long we produce an &lt;em&gt;output&lt;/em&gt; conforming the current format we are good to use the next programs.&lt;/li&gt;

&lt;/ul&gt;
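
&lt;p&gt;As a small sketch of how a follow-up program could read the clean file, the snippet below feeds two lines copied from the output above into awk with &lt;code&gt;FS = "|"&lt;/code&gt;. The function name &lt;code&gt;describe_kills&lt;/code&gt; and the sentence it prints are my own choices for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: reads cleaned lines on stdin, one kill per line.
describe_kills() {
    awk 'BEGIN { FS = "|" }
         { print $1 " killed " $2 " with " $3 " in game " $4 }'
}

# Two lines taken from the qgames-clean.log output shown above.
printf '%s\n' \
    'Isgalamido|Mocinha|MOD_ROCKET_SPLASH|2' \
    'Isgalamido|Mocinha|MOD_ROCKET|3' | describe_kills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the real file you would run &lt;code&gt;describe_kills&lt;/code&gt; on &lt;code&gt;qgames-clean.log&lt;/code&gt; instead of the two sample lines.&lt;/p&gt;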

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;How about you try to figure out the rest? I will post my solution and, if you learned something from this, I promise you will learn something else on the next one as well.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.gnu.org/software/gawk/manual/gawk.html" rel="noopener noreferrer"&gt;GNU awk manual&lt;/a&gt; is &lt;strong&gt;really&lt;/strong&gt; good - from a time when technical documents were worth reading. You don't need to read everything; the &lt;em&gt;index&lt;/em&gt; will take you where you need to go. Pinky promise! &lt;/p&gt;

&lt;p&gt;I won't leave you empty-handed, though. Here is a starting point for &lt;code&gt;scoreboard.awk&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight awk"&gt;&lt;code&gt;&lt;span class="kr"&gt;BEGIN&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kc"&gt;FS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"|"&lt;/span&gt;   
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Sets a player as a key in the players array&lt;/span&gt;
    &lt;span class="nx"&gt;players&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$1&lt;/span&gt;
    &lt;span class="nx"&gt;players&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$2&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kr"&gt;END&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Skips &amp;lt;world&amp;gt; when printing the players&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;players&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;world&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let me know of your solution, suggestions or doubts in the comments! ❤️&lt;/p&gt;

</description>
      <category>bash</category>
      <category>shell</category>
      <category>awk</category>
      <category>data</category>
    </item>
    <item>
      <title>Files with the most changes on Git repository</title>
      <dc:creator>Augusto Pascutti</dc:creator>
      <pubDate>Sun, 26 Apr 2020 22:31:34 +0000</pubDate>
      <link>https://forem.com/augustohp/files-with-the-most-changes-on-git-repository-46l1</link>
      <guid>https://forem.com/augustohp/files-with-the-most-changes-on-git-repository-46l1</guid>
      <description>&lt;p&gt;Reading the history of a repository is useful for many things. There are many ways to go through it; below we will &lt;strong&gt;list the files with the most changes&lt;/strong&gt; and then &lt;strong&gt;filter the changes made just to them&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d | sort | uniq -c | sort -r | head
$ git log --stat -- $(!!)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Other than &lt;code&gt;$(!!)&lt;/code&gt;, which tells bash to re-run the previous command (&lt;code&gt;!!&lt;/code&gt;, a history expansion) and paste its output into the command line (&lt;code&gt;$()&lt;/code&gt;, a command substitution, which runs in a "sub-shell"), I will detail what we executed below.&lt;/p&gt;
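
&lt;p&gt;The &lt;code&gt;$()&lt;/code&gt; part can be tried on its own. History expansion (&lt;code&gt;!!&lt;/code&gt;) is normally only available in an interactive shell, so this minimal sketch substitutes the output of a plain &lt;code&gt;printf&lt;/code&gt; instead. The file names and the &lt;code&gt;count_args&lt;/code&gt; helper are hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo "$#"
}

# $(...) runs printf first and pastes its output into the command line.
# It is left unquoted on purpose, so each line becomes one argument.
count_args $(printf '%s\n' 'README.md' 'src/app.js')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This prints &lt;code&gt;2&lt;/code&gt;: each line produced by the inner command arrived as a separate argument, which is how the list of files reaches &lt;code&gt;git log --stat --&lt;/code&gt;.&lt;/p&gt;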

&lt;h2&gt;
  
  
  Using the history
&lt;/h2&gt;

&lt;p&gt;You can read commits on the current branch using &lt;code&gt;git log&lt;/code&gt;, but what if you want to focus on the changes to a single file?&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log -- src/The/Path/To/The/File/With/Most/Changes.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Explaining the command above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;git log&lt;/code&gt; shows the changes made to the repository, from the most recent to the oldest. By default it displays each commit's hash, author, date and message, without the changed content itself;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--&lt;/code&gt; tells Git to stop trying to parse options (stuff like &lt;code&gt;-p&lt;/code&gt; or &lt;code&gt;--reverse&lt;/code&gt;) and start parsing arguments. For &lt;code&gt;git log&lt;/code&gt;, the arguments are paths in the repository (one or more);&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;src/The/Path/To/The/File/With/Most/Changes.js&lt;/code&gt; is a file that exists on our hypothetical repository, it makes git log filter changes affecting only that path. You could use other things, instead of a single &lt;em&gt;path&lt;/em&gt; pointing to a file:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;src/**&lt;/code&gt; to filter changes made just inside this path,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;*.txt&lt;/code&gt; to filter changes made to files with the txt extension&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Focusing on one or more paths allows you to go deeper into the history of a single part of the project, which could provide a rough idea of what the team can achieve and in what period, for example.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to list files with the most changes?
&lt;/h2&gt;

&lt;p&gt;You know how to navigate the history of specific files; now we want to know which files changed the most in our repository. That can be achieved in 3 steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;List files changed in a commit, for every commit;&lt;/li&gt;
&lt;li&gt;Count how many times each file appears on that list;&lt;/li&gt;
&lt;li&gt;Display only the top ones&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  List files changed in a commit
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;git log&lt;/code&gt; has the option &lt;code&gt;--name-only&lt;/code&gt; which will display the path to all files changed in a &lt;em&gt;commit&lt;/em&gt;. Formatting the &lt;em&gt;commit message&lt;/em&gt; to an &lt;strong&gt;empty format&lt;/strong&gt; will only display the files:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you try the command above, you will notice that an empty line appears for every &lt;em&gt;commit&lt;/em&gt;. Those empty lines are where the &lt;em&gt;commit messages&lt;/em&gt; we removed used to be. To get rid of them we can use &lt;code&gt;sed '/^\s*$/'d&lt;/code&gt;, making the whole command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
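
&lt;p&gt;The filter can be tried without a repository. In this minimal sketch the input lines are made-up stand-ins for what &lt;code&gt;git log --name-only&lt;/code&gt; prints, and &lt;code&gt;drop_blank_lines&lt;/code&gt; is a hypothetical helper. It uses &lt;code&gt;[[:space:]]&lt;/code&gt; instead of &lt;code&gt;\s&lt;/code&gt;: that is my swap, since &lt;code&gt;\s&lt;/code&gt; is not part of POSIX regular expressions and some &lt;code&gt;sed&lt;/code&gt; implementations may not understand it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: deletes lines that are empty or only whitespace.
drop_blank_lines() {
    sed '/^[[:space:]]*$/d'
}

# Made-up paths separated by an empty line and a whitespace-only line.
printf '%s\n' 'src/app.js' '' 'README.md' '   ' 'src/app.js' | drop_blank_lines
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only the three lines that contain a path should be printed.&lt;/p&gt;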
&lt;h3&gt;
  
  
  Count how many times each file appears in the list
&lt;/h3&gt;

&lt;p&gt;You can use &lt;code&gt;uniq&lt;/code&gt; to collapse duplicate lines; with the &lt;code&gt;-c&lt;/code&gt; option it also counts their occurrences.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d | uniq -c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since &lt;code&gt;uniq&lt;/code&gt; only joins &lt;em&gt;consecutive lines&lt;/em&gt;, we need to &lt;code&gt;sort&lt;/code&gt; our list before passing it to &lt;code&gt;uniq&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d | sort | uniq -c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each line of the output of the command above will be &lt;code&gt;&amp;lt;count&amp;gt; &amp;lt;path&amp;gt;&lt;/code&gt;, so we can use &lt;code&gt;sort&lt;/code&gt; with the &lt;code&gt;--reverse&lt;/code&gt; option (&lt;code&gt;-r&lt;/code&gt;) to display the files with the most occurrences first:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d | sort | uniq -c | sort -r
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
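
&lt;p&gt;To see the counting steps in isolation, here is a minimal sketch that works on a made-up list of paths instead of real &lt;code&gt;git log&lt;/code&gt; output. &lt;code&gt;count_paths&lt;/code&gt; is a hypothetical helper, and two details are my additions rather than part of the command above: &lt;code&gt;-n&lt;/code&gt; makes the last &lt;code&gt;sort&lt;/code&gt; compare the counts as numbers, which should not depend on how they are padded or on the locale, and the final &lt;code&gt;sed&lt;/code&gt; only trims the leading spaces that &lt;code&gt;uniq -c&lt;/code&gt; adds:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: reads one path per line, prints "count path".
count_paths() {
    # sort puts identical paths next to each other,
    # uniq -c collapses each group and prefixes its size,
    # sort -rn puts the biggest counts first.
    sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Made-up list of changed paths, deliberately out of order.
printf '%s\n' 'src/app.js' 'README.md' 'src/app.js' \
    'src/app.js' 'README.md' 'LICENSE' | count_paths
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first line printed should be &lt;code&gt;3 src/app.js&lt;/code&gt;. Removing the first &lt;code&gt;sort&lt;/code&gt; shows why it is needed: the same path then appears on several lines, each with its own partial count.&lt;/p&gt;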
&lt;h3&gt;
  
  
  Limiting output
&lt;/h3&gt;

&lt;p&gt;We can use &lt;code&gt;head&lt;/code&gt; to keep only the first lines, or &lt;code&gt;tail&lt;/code&gt; to keep only the last ones. The &lt;code&gt;-n &amp;lt;items&amp;gt;&lt;/code&gt; option tells how many lines we want to keep:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git log --name-only --pretty="format:" | sed '/^\s*$/'d | sort | uniq -c | sort -r | head -n 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  What else?
&lt;/h2&gt;

&lt;p&gt;I usually limit the output to changes made in the last year (&lt;code&gt;git log --since "1 year ago"&lt;/code&gt;). I use this every time I get in touch with a new team; it allows me to get to know them better.&lt;/p&gt;

&lt;p&gt;I also don't blindly go into the "most changed files" in the project. As I want to know more about the project and people, I try to focus on &lt;em&gt;controllers&lt;/em&gt; or &lt;em&gt;models&lt;/em&gt; first so I get a grasp of what kind of changes they go through.&lt;/p&gt;

&lt;p&gt;Do you think this will help you? In what way?&lt;/p&gt;

</description>
      <category>git</category>
      <category>productivity</category>
      <category>bash</category>
      <category>shell</category>
    </item>
  </channel>
</rss>
