<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: shaafiee</title>
    <description>The latest articles on Forem by shaafiee (@shaafiee).</description>
    <link>https://forem.com/shaafiee</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F422435%2F0620454a-c74d-4312-8b5a-0dd03cde208f.png</url>
      <title>Forem: shaafiee</title>
      <link>https://forem.com/shaafiee</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/shaafiee"/>
    <language>en</language>
    <item>
      <title>FizzBuzz</title>
      <dc:creator>shaafiee</dc:creator>
      <pubDate>Tue, 26 Jan 2021 02:19:07 +0000</pubDate>
      <link>https://forem.com/shaafiee/fizzbuzz-2pj5</link>
      <guid>https://forem.com/shaafiee/fizzbuzz-2pj5</guid>
      <description>&lt;p&gt;It’s exam day and, as soon as you wake up, your mind starts racing through all manners of algorithms, until it occurs to you that FizzBuzz was something you never attempted. You quickly fill your moka pot with water, spoon some cheap espresso grind into the filter, tighten the kettle and put it on the hob at low heat.&lt;/p&gt;

&lt;p&gt;You flop into the only chair in your dorm room and skillfully enter your PIN on the Windows box in front of you. It takes some time for the window manager to compete with the likes of your Claymore miner to allow you to interface with your machine. As soon as the cursor on the screen synchronizes with your jiggling of the mouse (yes, the mouse), you summon Visual Studio, and start a new C++ console project.&lt;/p&gt;

&lt;p&gt;You hastily type in a program for the typical FizzBuzz algorithm, demonstrating, to nobody in particular, the touch-typing skills that you have been honing since you were a pup. You sit back, admiring your work for two seconds, and immediately feel like challenging yourself. Leaning forward and typing furiously, you insert a high-resolution timer and change the algorithm to count (instead of displaying) the number of fizzes, buzzes and fizzbuzzes for numbers 1 through 4 billion. You end up with the following program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;chrono&amp;gt;


int main()
{
    int fizz = 0, buzz = 0, fizzbuzz = 0;


    auto startTime = std::chrono::high_resolution_clock::now();


    for (unsigned int i = 1; i &amp;lt;= 4000000000; i++) {
        if (i % 3 == 0 &amp;amp;&amp;amp; i % 5 == 0) {
            fizzbuzz++;
        }
        else if (i % 3 == 0) {
            fizz++;
        }
        else if (i % 5 == 0) {
            buzz++;
        }
    }


    auto endTime = std::chrono::high_resolution_clock::now();
    auto totalTime = endTime - startTime;


    printf("\t fizz : %d, buzz: %d, fizzbuzz: %d, duration %lld milliseconds\n", fizz, buzz, fizzbuzz, (totalTime / std::chrono::milliseconds(1)));


    return 0;


}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you turn on optimizations, compile it and run it. Before the run completes, you go over to the hob, pick up a semi-clean mug nearby and fill it with the hot black liquid from the moka pot.&lt;br&gt;
You sit back down and, taking a sip from the mug, note that your algorithm took 6 seconds to complete.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Domzk7zE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/94kwwu8hfnga6g10aevn.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Domzk7zE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/94kwwu8hfnga6g10aevn.PNG" alt="regular fizzbuzz"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even for a machine as old as your’s (Intel i7-7700K), this feels like a long time to crunch through 4 billion numbers. So, you start optimizing your algorithm.&lt;/p&gt;

&lt;p&gt;Being a computer science student, you note (with a mildly pompous sense of pride) that all the time is consumed by the code inside your for-loop. You recognize that for 5% (fizzbuzz) of the 4 billion numbers there are 2 modulus operations being performed, whilst for about 23% (fizz) of the iterations 3 modulus operations were performed and for the remaining iterations, 4 modulus operations were performed (not taking into account the optimizations performed by the compiler). After a flurry of keystrokes, you end up with the following modified for-loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    for (unsigned int i = 1; i &amp;lt;= 4000000000; i++) {
        if (i % 3 == 0) {
            if (i % 5 == 0) {
                fizzbuzz++;
            }
            else {
                fizz++;
            }
        }
        else if (i % 5 == 0) {
            buzz++;
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compile. Run! Down to 5.55 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--X2l-JPHm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/i8dyo5tll40vks5x0yck.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--X2l-JPHm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/i8dyo5tll40vks5x0yck.PNG" alt="halfway fizzbuzz"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You go through the algorithm and find yourself bemused upon realizing that each cycle of the for-loop performed exactly 2 modulus operations. That’s when you realize that the overheads of managing the for-loop and the logical operations were considerable. You start work on further optimizing your for-loop. Minutes later, you end up with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    for (unsigned int i = 1; i &amp;lt;= 4000000000; i++) {
        isFizz = false;
        if (i % 3 == 0) {
            isFizz = true;
            fizz++;
        }
        if (i % 5 == 0) {
            if (isFizz) {
                fizz--;
                fizzbuzz++;
            }
            else {
                buzz++;
            }
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the new program, you get a result of under 5.2 seconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zPVGs9u6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/asx0l8yhhl3ssfrsx2bu.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zPVGs9u6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/asx0l8yhhl3ssfrsx2bu.PNG" alt="optimized fizzbuzz"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once again, your program performs only 2 modulus operations in every iteration. You realize that in the previous optimization your program performed 2 logical operations in each iteration. However, in your new optimization, your program performs 3 logical operations through 13% of the cycles and 2 logical operations each throughout the rest of the cycles. You have now established that  incursions also have a cost, and you start wondering if you could limit them.&lt;/p&gt;

&lt;p&gt;You get cracking on transforming your entire linear program into a multithreaded one. So, relying on native Windows multithreading, you fire off three CreateThread() calls, one for fizz, the other for buzz, and the last for fizzbuzz counting. This way, each thread does only 2 modulus operations with a single logic operation that does not have an  incursion. Sometime after your mug is drained, you end up with the following program:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/shaafiee/fizzbuzz/blob/main/ConsoleApplication2/fizzbuzz_multithreaded.cpp"&gt;Mutithreaded Fizzbuzz&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you compile and run the program, you are thrilled to yield a computation time of 4.6 seconds. Success!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---hpnZ5FY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/05jxffipu96tvb50kjd7.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---hpnZ5FY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/05jxffipu96tvb50kjd7.PNG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You then look at your watch and realize there was ample time to take the long way through the campus to the exam halls. Smelling your breath, and establishing that it was just south of unbearable, you pick up your bag and head outside. As you lock the door, you chuckle to yourself, “those humans don’t know what’s coming.” Then you run your tentacles through your antennae and slither off to the exam halls.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/shaafiee/fizzbuzz"&gt;All the source&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>algorithms</category>
      <category>cpp</category>
    </item>
    <item>
      <title>WASM: Memory Management</title>
      <dc:creator>shaafiee</dc:creator>
      <pubDate>Thu, 02 Jul 2020 16:37:50 +0000</pubDate>
      <link>https://forem.com/shaafiee/wasm-memory-management-33l6</link>
      <guid>https://forem.com/shaafiee/wasm-memory-management-33l6</guid>
      <description>&lt;p&gt;So you have chosen to write your new web app in WASM - exciting! On top of that, you want to write it in C++ to have fine-grained control over data storage and manipulation.&lt;/p&gt;

&lt;p&gt;Here's some great advice that will help you overcome serious headaches.&lt;/p&gt;

&lt;p&gt;Firstly, because the memory available to your program is actually a JS object, it is available as one contiguous chunk that is limited to linear scaling. This means that you have to be very careful about deleting objects and freeing memory. In fact, stop deleting objects altogether. If you feel the need to get rid of temporary memory objects then create a separate temporary memory object within JS for that operation, like so:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WBOHNXEO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/mmngiq9nrnjdtbmgzrot.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WBOHNXEO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/mmngiq9nrnjdtbmgzrot.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second big hint is, align your data structures. When you have lots of data structures that go in and out of the execution scope, you will run into lots of segmentation faults due to memory misalignments, particularly if your structures have many levels of invariably scaling sub-structures, such as in the case of Markov chains.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2EmoqezG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/esebq0tujxfbdrepjvhl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2EmoqezG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/esebq0tujxfbdrepjvhl.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No alt text provided for this image&lt;br&gt;
Explicit memory alignment will have penalties in terms of growth of memory as your Markov chains' complexities increase - this is where multiple memory objects come in handy. This drawback is worth the performance and stability bonuses, which you will learn as you dig into WASM.&lt;/p&gt;

&lt;p&gt;Have fun in your WASM journey!&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>webassembly</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
