<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Thomas Albertini</title>
    <description>The latest articles on Forem by Thomas Albertini (@thomscoder).</description>
    <link>https://forem.com/thomscoder</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F934715%2F7401cdda-20f8-4b1d-9352-d4b9b8b771eb.jpeg</url>
      <title>Forem: Thomas Albertini</title>
      <link>https://forem.com/thomscoder</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/thomscoder"/>
    <language>en</language>
    <item>
      <title>Let's build a WebAssembly compiler and runtime - Tokenizer</title>
      <dc:creator>Thomas Albertini</dc:creator>
      <pubDate>Thu, 22 Dec 2022 12:44:04 +0000</pubDate>
      <link>https://forem.com/thomscoder/lets-build-a-webassembly-compiler-and-runtime-tokenizer-2cp5</link>
      <guid>https://forem.com/thomscoder/lets-build-a-webassembly-compiler-and-runtime-tokenizer-2cp5</guid>
      <description>&lt;p&gt;Holy smokes.&lt;br&gt;
A whole month has passed since my &lt;a href="https://dev.to/thomscoder/lets-build-a-webassembly-compiler-and-runtime-webassembly-text-format-1pn"&gt;last article&lt;/a&gt;. I apologize, but I've been quite busy with work.&lt;/p&gt;

&lt;p&gt;But I'm back to continue the series of articles in which I explain how I built &lt;a href="https://github.com/thomscoder/luna" rel="noopener noreferrer"&gt;Luna&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the last article we started building &lt;a href="https://github.com/thomscoder/luna" rel="noopener noreferrer"&gt;Luna&lt;/a&gt;, a toy WebAssembly compiler, and we explored WAT (the WebAssembly text format).&lt;/p&gt;

&lt;p&gt;Now that the Christmas holidays are coming, it won't take another month for the next article.&lt;/p&gt;

&lt;p&gt;Today I want to talk about Luna's tokenizer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First things first&lt;/strong&gt;&lt;br&gt;
Luna is basically composed of &lt;strong&gt;three parts&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tokenizer&lt;/li&gt;
&lt;li&gt;Parser&lt;/li&gt;
&lt;li&gt;Compiler&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The flow is simple: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the input is divided into tokens&lt;/li&gt;
&lt;li&gt;these tokens are passed onto the parser which further 
analyzes them and builds an AST (Abstract Syntax Tree) &lt;/li&gt;
&lt;li&gt;the AST is passed to the compiler and transformed into a 
WebAssembly module. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what is a &lt;strong&gt;Tokenizer?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;tokenizer&lt;/strong&gt;, sometimes called a &lt;em&gt;Lexer&lt;/em&gt;, is responsible for dividing the input stream into individual tokens, identifying each token's type, and passing the tokens to the &lt;strong&gt;Parser&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Token&lt;/strong&gt;&lt;br&gt;
It is basically a sequence of characters that are treated as a unit as it cannot be further broken down. - &lt;a href="https://www.geeksforgeeks.org/token-patterns-and-lexems/" rel="noopener noreferrer"&gt;geeksforgeeks&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, the first step is to define tokens.&lt;br&gt;
Let's say we have this input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(module
    (func (export "example") (param i32 i32) (result i32)
        local.get 0
        local.get 1
        i32.add)
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This example exports a function called &lt;strong&gt;"example"&lt;/strong&gt; that takes two arguments and adds them.&lt;/p&gt;

&lt;p&gt;Let's define the tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"func"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"module"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"export"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"param"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While developing your own compiler, the choice of what counts as a token and what doesn't is yours.&lt;br&gt;
Since Luna is compiling a &lt;strong&gt;WAT-like language&lt;/strong&gt;, there are also "special tokens" called instructions (e.g. local.get, i32.add, etc.).&lt;/p&gt;

&lt;p&gt;For simplicity, in Luna's code I've grouped the tokens by type (because of the different manipulations the parser will later perform on them).&lt;/p&gt;

&lt;p&gt;So, for instance, we have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;instructions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"local&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;.get"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"i32&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;.(add|sub|mul|div|const)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Wait a moment!!!&lt;/strong&gt;&lt;br&gt;
We were talking about tokens, so what are these instructions?&lt;br&gt;
Citing the WASM specification:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WebAssembly code consists of sequences of instructions. Its computational model is based on a stack machine in that instructions manipulate values on an implicit operand stack, consuming (popping) argument values and producing or returning (pushing) result values.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;TL;DR: instructions are the means of manipulating values.&lt;br&gt;
Some instructions are immediately followed by an argument (such as &lt;code&gt;local.get 0&lt;/code&gt; in our example above), while others (such as &lt;code&gt;i32.add&lt;/code&gt;) take their operands from the stack. &lt;a href="https://webassembly.github.io/spec/core/syntax/instructions.html#control-instructions" rel="noopener noreferrer"&gt;Control instructions&lt;/a&gt; are structured differently.&lt;/p&gt;
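
&lt;p&gt;To make the stack-machine idea concrete, here is a minimal sketch in Go of how the three instructions from the example could be evaluated. The &lt;code&gt;instr&lt;/code&gt; type and the &lt;code&gt;run&lt;/code&gt; function are hypothetical names made up for this illustration; they are not part of Luna, and only &lt;code&gt;local.get&lt;/code&gt; and &lt;code&gt;i32.add&lt;/code&gt; are handled.&lt;/p&gt;

```go
package main

import "fmt"

// instr is a hypothetical, simplified instruction: an operation name plus
// an optional immediate argument (only local.get uses it here).
type instr struct {
	op  string
	arg int
}

// run evaluates the instructions on an operand stack and returns the value
// left on top. params plays the role of the function's locals.
func run(params []int32, code []instr) int32 {
	stack := []int32{}
	for _, in := range code {
		switch in.op {
		case "local.get":
			// Push the local at index arg onto the stack.
			stack = append(stack, params[in.arg])
		case "i32.add":
			// Pop two operands and push their sum.
			n := len(stack)
			a, b := stack[n-2], stack[n-1]
			stack = append(stack[:n-2], a+b)
		}
	}
	return stack[len(stack)-1]
}

func main() {
	code := []instr{{"local.get", 0}, {"local.get", 1}, {"i32.add", 0}}
	fmt.Println(run([]int32{2, 3}, code)) // prints 5
}
```

&lt;p&gt;Note how &lt;code&gt;i32.add&lt;/code&gt; never names its operands: it simply consumes whatever the two preceding &lt;code&gt;local.get&lt;/code&gt; instructions pushed.&lt;/p&gt;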

&lt;blockquote&gt;
&lt;p&gt;Instructions are encoded by opcodes. Each opcode is represented by a single byte.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We will look into &lt;strong&gt;Opcodes&lt;/strong&gt; when we start talking about the parser.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Back to the Tokenizer&lt;/strong&gt;&lt;br&gt;
Okay, we have our tokens and instructions. As you might've already guessed, I decided to implement the tokenizer with &lt;strong&gt;Regex&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I'm trying to keep Luna as simple as possible, and I do not care much about performance.&lt;br&gt;
I'm considering switching to some FSM (finite-state machine) algorithm or doing some refactoring soon, but the idea behind the tokenizer will still be as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Step zero&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's the complete list of my tokens&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"func"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"module"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"export"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"param"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;instructions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"local&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;.get"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"i32&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;.(add|sub|mul|div|const)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;numTypes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"i32"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"i64"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"f32"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"f64"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// We could hardcode a bunch of names for function exports,&lt;/span&gt;
&lt;span class="c"&gt;// but to avoid changing this list every time&lt;/span&gt;
&lt;span class="c"&gt;// we use a simple regex that catches anything between quotes&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;literals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;([^&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;]+)&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;First step - Preparation&lt;/strong&gt; &lt;/li&gt;
&lt;/ol&gt;

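&lt;p&gt;Since we are using regexes, we compile them once up front. Each pattern starts with &lt;code&gt;^&lt;/code&gt; so that it only matches at the very beginning of the text it is given.&lt;br&gt;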
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;tokensRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^("&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"|"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;")"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;instructionRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^("&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"|"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;")"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;typeNumRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^("&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;numTypes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"|"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;")"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;literalsRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^("&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;literals&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;")"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;numberRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^[0-9]+"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;whitespaceRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`^\s+`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
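
&lt;p&gt;The leading &lt;code&gt;^&lt;/code&gt; in each pattern matters: it lets a regex answer the question "does a token start exactly here?" when it is run against the rest of the input. A minimal sketch of that behaviour follows; &lt;code&gt;keywords&lt;/code&gt;, &lt;code&gt;keywordRegex&lt;/code&gt; and &lt;code&gt;matchAt&lt;/code&gt; are hypothetical names used only for this illustration.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keywords mirrors the tokens slice from the article.
var keywords = []string{"func", "module", "export", "result", "param"}

// keywordRegex is built the same way as tokensRegex in the article.
var keywordRegex = regexp.MustCompile("^(" + strings.Join(keywords, "|") + ")")

// matchAt returns the keyword that starts exactly at position index,
// or "" if no keyword starts there.
func matchAt(input string, index int) string {
	return keywordRegex.FindString(input[index:])
}

func main() {
	input := "(module)"
	fmt.Printf("%q\n", matchAt(input, 0)) // prints "" because "(" is not a keyword
	fmt.Printf("%q\n", matchAt(input, 1)) // prints "module"
}
```

&lt;p&gt;Without the anchor, the first call would also return "module", and the tokenizer would lose track of where in the input the token actually begins.&lt;/p&gt;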



&lt;p&gt;A Luna token has a:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type: number, literal, token, instruction, etc.&lt;/li&gt;
&lt;li&gt;Value: the token's value&lt;/li&gt;
&lt;li&gt;Index: where the token is positioned in the input&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Easily implemented with a Token struct&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Token&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Type&lt;/span&gt;  &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Index&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Optionally, a helper &lt;strong&gt;Matcher&lt;/strong&gt; struct helps with handling the regex matches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Matcher&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Type&lt;/span&gt;  &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Second step&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;code&gt;Tokenize&lt;/code&gt; function&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Token&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Token&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

    &lt;span class="n"&gt;matchers&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokensRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeToken&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instructionRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeInstruction&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;typeNumRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeNum&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;literalsRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeLiteral&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;numberRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;whitespaceRegex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Whitespace&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;matchers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;matchFound&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;notFound&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c"&gt;// Prevent panic if no match is found&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;notFound&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;matchFound&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Token&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Index&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;"whitespace"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



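&lt;p&gt;This is quite intuitive.&lt;br&gt;
The Tokenize function walks through the input and, at each position, runs all the regexes we compiled earlier. The first matcher that succeeds wins; if none matches (as with parentheses), the tokenizer skips one character and moves on.&lt;/p&gt;

&lt;p&gt;This is not the fastest approach, since every regex is tried at every position. If speed is your goal, please look into other algorithms.&lt;/p&gt;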
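
&lt;p&gt;The &lt;code&gt;texts.*&lt;/code&gt; identifiers used by Tokenize are not shown in this article. As a minimal sketch, they could be plain string constants like the ones below. The values are an assumption, inferred from the "whitespace" comparison in Tokenize and from the output printed at the end of this article, not copied from Luna's source; they are declared in package main here only to keep the sketch self-contained.&lt;/p&gt;

```go
package main

import "fmt"

// Hypothetical stand-ins for the texts.* identifiers used by Tokenize.
// The string values are inferred from the article's printed output,
// not taken from Luna's source code.
const (
	TypeToken       = "token"
	TypeInstruction = "instruction"
	TypeNum         = "typeNum"
	TypeLiteral     = "literal"
	Number          = "number"
	Whitespace      = "whitespace"
)

func main() {
	// Print every token type so the assumed values are easy to inspect.
	fmt.Println(TypeToken, TypeInstruction, TypeNum, TypeLiteral, Number, Whitespace)
}
```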

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Third step&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Does the token exist?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;matchChecker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rxp&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Regexp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;whichType&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="n"&gt;substr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rxp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;whichType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Matcher&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"no match found"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;Now try passing the input above to your Tokenize function, and if you see the output that follows the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;`(module
                (func (export "example") (param i32 i32) (result i32)
                    local.get 0
                    local.get 1
                    i32.add)
                )
            `&lt;/span&gt;
       &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;Tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Tokens:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Tokens: &lt;span class="o"&gt;[{&lt;/span&gt;token module 1&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;token func 13&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;token &lt;span class="nb"&gt;export &lt;/span&gt;19&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;literal &lt;span class="s2"&gt;"example"&lt;/span&gt; 26&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;token param 38&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;typeNum i32 44&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;typeNum i32 48&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;token result 54&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;typeNum i32 61&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;instruction local.get 71&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;number 0 81&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;instruction local.get 88&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;number 1 98&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;instruction i32.add 105&lt;span class="o"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Congratulations!&lt;br&gt;
You have a tokenizer.&lt;/p&gt;

&lt;p&gt;If my explanation was not clear enough, I apologize. If you have any code improvements, you can check and contribute to the source code directly, &lt;a href="https://github.com/thomscoder/luna" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Try Luna: &lt;a href="https://luna-demo.vercel.app" rel="noopener noreferrer"&gt;https://luna-demo.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Resources:&lt;br&gt;
&lt;a href="https://webassembly.github.io/spec/core/intro/index.html" rel="noopener noreferrer"&gt;WebAssembly Specs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next we'll tackle the parser!&lt;/p&gt;

&lt;p&gt;Stay safe and Merry Christmas 🫱🏻‍🫲🏽✨&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>database</category>
    </item>
    <item>
      <title>Let's build a WebAssembly compiler and runtime - WebAssembly Text Format</title>
      <dc:creator>Thomas Albertini</dc:creator>
      <pubDate>Tue, 15 Nov 2022 20:24:49 +0000</pubDate>
      <link>https://forem.com/thomscoder/lets-build-a-webassembly-compiler-and-runtime-webassembly-text-format-1pn</link>
      <guid>https://forem.com/thomscoder/lets-build-a-webassembly-compiler-and-runtime-webassembly-text-format-1pn</guid>
      <description>&lt;p&gt;Today I want to start a series of articles on how I managed to build a WebAssembly compiler.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/thomscoder/luna"&gt;Luna&lt;/a&gt; is a really tiny compiler written mainly as a quest to conquer the WebAssembly dungeon.&lt;/p&gt;

&lt;p&gt;If you do not know what I'm talking about, I suggest reading the &lt;a href="https://dev.to/thomscoder/luna-wrote-a-tiny-webassembly-compiler-that-runs-in-browser-built-for-demonstrative-and-educational-purposes-4o31"&gt;introductory article&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Long story short, a few weeks ago I decided to build my own WebAssembly compiler.&lt;br&gt;
This was quite the challenge.&lt;/p&gt;

&lt;p&gt;First of all what the heck is WAT??&lt;br&gt;
Second problem: I didn't know how to write a compiler.&lt;/p&gt;

&lt;p&gt;So here I am, trying to dissect my journey and the information I acquired while writing &lt;a href="https://github.com/thomscoder/luna"&gt;Luna&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Today I'll give you an overview of the WebAssembly Text Format (the thing we are going to compile).&lt;/p&gt;


&lt;h2&gt;
  
  
  WAT (WebAssembly Text Format)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Understanding_the_text_format"&gt;The WebAssembly text format&lt;/a&gt; is a textual representation of the WASM binary format.&lt;/p&gt;

&lt;p&gt;Let's say we have the following &lt;code&gt;.wat&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(module
  (func (export "add") (param i32) (param i32) (result i32)
  local.get 0
  local.get 1
  i32.add)
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What the code above does is simple (and quite intuitive): it exports a function under the name "add" that takes two arguments and returns their sum.&lt;/p&gt;

&lt;p&gt;Easy.&lt;/p&gt;

&lt;p&gt;So let's analyze each word.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Module&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;A module is the fundamental unit of code in WebAssembly and it is loaded by a WASM runtime. In textual format a module is represented as a big &lt;a href="https://it.wikipedia.org/wiki/S-expression"&gt;S-Expression&lt;/a&gt; which Wikipedia defines as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;an expression in a like-named notation for nested list (tree-structured) data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;NOTE: the shortest WASM program you can write is &lt;code&gt;(module)&lt;/code&gt; which does absolutely nothing lol, but it is still a valid WASM program.&lt;/p&gt;




&lt;p&gt;Inside our module above there are two structures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Func: the first structure is a function declared by the func keyword. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Export: the second structure is an export declared by the export keyword&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The function has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An export name &lt;code&gt;"add"&lt;/code&gt;, under which the function is exposed to the host (it has no &lt;code&gt;$&lt;/code&gt; identifier of its own)&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Two unnamed parameters of type &lt;code&gt;i32&lt;/code&gt;, referenced by their indices &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt; (WebAssembly provides only four basic number types: i32, i64, f32, f64)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A result of type &lt;code&gt;i32&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Three instructions &lt;code&gt;local.get 0&lt;/code&gt;, &lt;code&gt;local.get 1&lt;/code&gt;, &lt;code&gt;i32.add&lt;/code&gt; (to understand them we need to first learn about &lt;strong&gt;Stack Machines&lt;/strong&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Stack Machines&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;WASM execution is defined in terms of a stack machine. The idea behind a &lt;strong&gt;stack machine&lt;/strong&gt; is that instructions are executed in order, and each one &lt;strong&gt;pushes&lt;/strong&gt; numbers (i32/i64/f32/f64) onto a stack, &lt;strong&gt;pops&lt;/strong&gt; them from it, or both.&lt;br&gt;
There are basically two types of instructions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Simple instructions&lt;/code&gt; (e.g. i32.add, f32.sub etc...): generally &lt;strong&gt;pop&lt;/strong&gt; arguments from the stack and push the result back onto it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Control instructions&lt;/code&gt;: alter the control flow (we won't be seeing them in this series).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(func (param $n i32) (result i32)
  local.get $n
  local.get $n
  i32.add)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;For example, each &lt;code&gt;local.get&lt;/code&gt; takes the value of the param &lt;code&gt;$n&lt;/code&gt; and pushes it onto the stack; the &lt;code&gt;i32.add&lt;/code&gt; instruction then pops the two topmost values, adds them, and pushes the result back onto the stack.&lt;/p&gt;
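&lt;p&gt;The push and pop behaviour can be sketched as a tiny interpreter in Go. This is only an illustration, not how Luna or a real runtime works: the &lt;code&gt;instr&lt;/code&gt; type and the &lt;code&gt;run&lt;/code&gt; function are hypothetical, and only the two instructions used in the example are supported.&lt;/p&gt;

```go
package main

import "fmt"

// instr is a hypothetical, simplified instruction: an opcode name plus an
// operand (only local.get uses the operand here).
type instr struct {
	op  string
	arg int
}

// run executes the instructions in order against a fresh stack and returns
// the value left on top. params plays the role of the function's locals.
func run(code []instr, params []int32) int32 {
	var stack []int32
	for _, in := range code {
		switch in.op {
		case "local.get":
			// Push the value of the local with the given index.
			stack = append(stack, params[in.arg])
		case "i32.add":
			// Pop the two topmost values and push their sum.
			n := len(stack)
			a, b := stack[n-2], stack[n-1]
			stack = append(stack[:n-2], a+b)
		default:
			panic("unsupported instruction: " + in.op)
		}
	}
	return stack[len(stack)-1]
}

func main() {
	// (func (param $n i32) (result i32) local.get $n local.get $n i32.add)
	double := []instr{{"local.get", 0}, {"local.get", 0}, {"i32.add", 0}}
	fmt.Println(run(double, []int32{21})) // prints 42
}
```

&lt;p&gt;Swapping the second instruction for &lt;code&gt;local.get 1&lt;/code&gt; and passing two params gives the "add" function from the first example.&lt;/p&gt;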

&lt;p&gt;So, I hope I've given you an idea of how to start writing your own WAT modules. They ain't that scary, are they?&lt;/p&gt;

&lt;p&gt;But there's one last thing I want to show before we get to the code.&lt;/p&gt;

&lt;p&gt;At the beginning of the article I said that the WebAssembly Text Format is a textual representation of the WASM binary format, but what does the WASM binary format look like?&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;WASM Binary Format&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;.wat&lt;/code&gt; example&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(module
  (func (export "add") (param i32) (param i32) (result i32)
  local.get 0
  local.get 1
  i32.add)
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;would look like this (compiled with &lt;a href="https://webassembly.github.io/wabt/demo/wat2wasm/"&gt;wat2wasm&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0000000: 0061 736d                                 ; WASM_BINARY_MAGIC
0000004: 0100 0000                                 ; WASM_BINARY_VERSION
; section "Type" (1)
0000008: 01                                        ; section code
0000009: 00                                        ; section size (guess)
000000a: 01                                        ; num types
; func type 0
000000b: 60                                        ; func
000000c: 02                                        ; num params
000000d: 7f                                        ; i32
000000e: 7f                                        ; i32
000000f: 01                                        ; num results
0000010: 7f                                        ; i32
0000009: 07                                        ; FIXUP section size
; section "Function" (3)
0000011: 03                                        ; section code
0000012: 00                                        ; section size (guess)
0000013: 01                                        ; num functions
0000014: 00                                        ; function 0 signature index
0000012: 02                                        ; FIXUP section size
; section "Export" (7)
0000015: 07                                        ; section code
0000016: 00                                        ; section size (guess)
0000017: 01                                        ; num exports
0000018: 03                                        ; string length
0000019: 6164 64                                  add  ; export name
000001c: 00                                        ; export kind
000001d: 00                                        ; export func index
0000016: 07                                        ; FIXUP section size
; section "Code" (10)
000001e: 0a                                        ; section code
000001f: 00                                        ; section size (guess)
0000020: 01                                        ; num functions
; function body 0
0000021: 00                                        ; func body size (guess)
0000022: 00                                        ; local decl count
0000023: 20                                        ; local.get
0000024: 00                                        ; local index
0000025: 20                                        ; local.get
0000026: 01                                        ; local index
0000027: 6a                                        ; i32.add
0000028: 0b                                        ; end
0000021: 07                                        ; FIXUP func body size
000001f: 09                                        ; FIXUP section size
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or like this (compiled with &lt;a href="https://luna-demo.vercel.app"&gt;Luna&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--O72GOjM_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/32vp3ck9mwp8fok83hgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--O72GOjM_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/32vp3ck9mwp8fok83hgy.png" alt="Luna binary format" width="880" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, each module is divided into sections and each section has its own rules: there's the magic number and the version, then a section for the function types, a section for the functions, a section for the exports, a section for the code (the function bodies) and whatnot... &lt;/p&gt;

&lt;p&gt;Do not worry,&lt;br&gt;
we will be tackling them all and we will conquer this WebAssembly dungeon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Thank you for reading! I hope you've enjoyed this, and if you want to go deeper, I've left some useful resources below. See ya in the next article!!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://webassembly.github.io/spec/core/intro/index.html"&gt;WebAssembly Spec&lt;/a&gt;&lt;br&gt;
&lt;a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Understanding_the_text_format"&gt;Understanding WebAssembly Text Format&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/thomscoder/luna"&gt;Luna's Repo&lt;/a&gt;&lt;br&gt;
&lt;a href="https://luna-demo.vercel.app"&gt;Luna's Demo&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>webassembly</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Luna 🌙 - Wrote a tiny WebAssembly compiler, that also runs in browser, built for demonstrative and educational purposes.</title>
      <dc:creator>Thomas Albertini</dc:creator>
      <pubDate>Tue, 01 Nov 2022 21:27:07 +0000</pubDate>
      <link>https://forem.com/thomscoder/luna-wrote-a-tiny-webassembly-compiler-that-runs-in-browser-built-for-demonstrative-and-educational-purposes-4o31</link>
      <guid>https://forem.com/thomscoder/luna-wrote-a-tiny-webassembly-compiler-that-runs-in-browser-built-for-demonstrative-and-educational-purposes-4o31</guid>
      <description>&lt;p&gt;I've been wanting to learn how to make my own programming language for a really long time, but I could not find my way through the hard theory behind compilers, parsers, ASTs etc... I found the tutorials hard, and my attention never lasted more than 5 minutes before I completely shut the project down.&lt;/p&gt;

&lt;p&gt;Some days ago, though, something changed.&lt;br&gt;
I'm currently in love with WebAssembly and had this absurd idea of developing a programming language that compiles to WebAssembly.&lt;/p&gt;

&lt;p&gt;I didn't know how WebAssembly internals worked so it was the perfect stage to learn something new and make a cool project.&lt;/p&gt;

&lt;p&gt;Sold!&lt;/p&gt;

&lt;p&gt;I started doing some research and found no tutorials except two amazing articles: &lt;a href="https://blog.scottlogic.com/2019/05/17/webassembly-compiler.html"&gt;one&lt;/a&gt; was about how to build a compiler for a custom programming language that compiles to WebAssembly (exactly what I was looking for), &lt;a href="https://www.bitfalter.com/webassembly-compiler-building-a-compiler"&gt;the other&lt;/a&gt; was a series of articles on how to build a WebAssembly compiler (exactly what I was looking for, again!)&lt;/p&gt;

&lt;p&gt;But here come the problems: the first was written in Typescript and the second in Rust.&lt;/p&gt;

&lt;p&gt;I wanted to practice Go and I don't know anything about Rust.&lt;/p&gt;

&lt;p&gt;So, long story short, I studied the theory behind those two amazing articles, the &lt;a href="https://webassembly.github.io/spec/core/intro/index.html"&gt;WebAssembly specification&lt;/a&gt; etc... and I started writing &lt;a href="https://github.com/thomscoder/luna"&gt;a compiler in Go&lt;/a&gt; (I love Go).&lt;/p&gt;

&lt;p&gt;I really found no articles about writing WebAssembly compilers with Go, so I decided to document the building of Luna so that future Gophers who have the same idea won't have to struggle as much as I did to find resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uUZghtBA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lufz9oyy136j3hba0f65.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uUZghtBA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lufz9oyy136j3hba0f65.png" alt="Luna screenshot" width="880" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Luna mainly compiles WebAssembly Text Format to Wasm, but with small modifications in the tokenizer and parser, it could also serve as a blueprint for developing custom programming languages in the browser that compile to WebAssembly.&lt;/p&gt;

&lt;p&gt;So this is Luna and in this series of articles I will explain EXACTLY how I built it.&lt;/p&gt;

&lt;p&gt;Meanwhile you can have a look at: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Source code: &lt;a href="https://github.com/thomscoder/luna"&gt;https://github.com/thomscoder/luna&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Demo: &lt;a href="https://luna-demo.vercel.app"&gt;https://luna-demo.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Luna is really tiny and still at an early stage. Feel free to contribute (update docs, open PRs or issues, leave feedback or suggestions etc...), anything is welcome!!&lt;/p&gt;

</description>
      <category>go</category>
      <category>webassembly</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Building a Git tool to play with Git entirely in-memory on the browser</title>
      <dc:creator>Thomas Albertini</dc:creator>
      <pubDate>Sun, 09 Oct 2022 13:46:53 +0000</pubDate>
      <link>https://forem.com/thomscoder/building-a-git-tool-that-runs-entirely-in-memory-on-the-browser-1c2j</link>
      <guid>https://forem.com/thomscoder/building-a-git-tool-that-runs-entirely-in-memory-on-the-browser-1c2j</guid>
      <description>&lt;p&gt;Hello everyone,&lt;br&gt;
today I released v0.0.3 of Harmony, a Git tool, powered by WebAssembly, that runs entirely in-memory in the browser.&lt;/p&gt;

&lt;p&gt;Harmony was born as a tool to create and/or modify local files, on the fly, in your browser. &lt;br&gt;
A few weeks ago I decided to try to implement a sort of version control system in it. &lt;/p&gt;

&lt;p&gt;I think it would be cool one day to use it either for personal projects or to teach Git concepts in a sandboxed area. Harmony is powered by WebAssembly and it runs all the Git-related stuff in-memory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BVZwDJZy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bk50pxpa5cxhcz9489aw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BVZwDJZy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bk50pxpa5cxhcz9489aw.png" alt="Image description" width="880" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This release brings initial support for directories and the ability to check out a particular commit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NXYJ9LBc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i1e2wpqourn5qpdhyocs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NXYJ9LBc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i1e2wpqourn5qpdhyocs.png" alt="Image description" width="880" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is still in its very early stages (a couple of weeks of development).&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/thomscoder/harmony"&gt;https://github.com/thomscoder/harmony&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://harmonyland.vercel.app"&gt;https://harmonyland.vercel.app&lt;/a&gt;&lt;/p&gt;

</description>
      <category>github</category>
      <category>webassembly</category>
      <category>opensource</category>
      <category>devjournal</category>
    </item>
    <item>
      <title>Harmony ✨- Create, upload, edit (multiple) files on the fly, in the browser and track them with Git, via Web Assembly.</title>
      <dc:creator>Thomas Albertini</dc:creator>
      <pubDate>Thu, 29 Sep 2022 18:04:23 +0000</pubDate>
      <link>https://forem.com/thomscoder/harmony-create-upload-edit-on-the-fly-in-the-browser-and-track-them-with-git-via-web-assembly-4pic</link>
      <guid>https://forem.com/thomscoder/harmony-create-upload-edit-on-the-fly-in-the-browser-and-track-them-with-git-via-web-assembly-4pic</guid>
      <description>&lt;p&gt;I'm writing my first post here to present &lt;strong&gt;Harmony&lt;/strong&gt;, my latest project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QDIOC3MI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/64s4x97fqkwj0dbr8d93.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QDIOC3MI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/64s4x97fqkwj0dbr8d93.png" alt="Harmony view" width="880" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some months ago, I built &lt;a href="https://github.com/thomscoder/nova-git"&gt;Nova&lt;/a&gt;, a Git playground in Go in which you can clone repositories, create files, branches, commits etc... all in-memory.&lt;/p&gt;

&lt;p&gt;What does that have to do with &lt;a href="https://github.com/thomscoder/harmony"&gt;Harmony&lt;/a&gt;? Well, I'm currently experimenting with WebAssembly and the first idea that I had was to port Nova to the Web to give other people a better UX.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ReXFRlKE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s1a022p4o20pgh6r0lze.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ReXFRlKE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s1a022p4o20pgh6r0lze.png" alt="Harmony editor" width="880" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So Golang -&amp;gt; WASM -&amp;gt; Javascript -&amp;gt; React BOOM! &lt;br&gt;
I can finally present a &lt;a href="https://harmonyland.vercel.app"&gt;stable version of Harmony&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;What Harmony does is simple: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allows the creation, upload and editing of multiple (local) files. &lt;/li&gt;
&lt;li&gt;Allows the creation of Git Branches and Commits to save different "workspaces"&lt;/li&gt;
&lt;li&gt;Allows switching between files and between "workspaces" in one click.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jrJYM_tF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3b7qhi30ftbrxup18r6y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jrJYM_tF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3b7qhi30ftbrxup18r6y.png" alt="Harmony commit" width="880" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It does all that right in your browser. In-memory. Refresh the page to start anew, in another empty repository.&lt;/p&gt;

&lt;p&gt;I won't go into how, that's for another post (maybe). &lt;/p&gt;

&lt;p&gt;Here's the &lt;a href="https://github.com/thomscoder/harmony"&gt;repository&lt;/a&gt;, which includes a video.&lt;/p&gt;

&lt;p&gt;Here's the &lt;a href="https://harmonyland.vercel.app"&gt;demo&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webassembly</category>
      <category>git</category>
      <category>react</category>
      <category>devjournal</category>
    </item>
  </channel>
</rss>
