<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mainak Bhattacharjee</title>
    <description>The latest articles on Forem by Mainak Bhattacharjee (@mainak55512).</description>
    <link>https://forem.com/mainak55512</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2195600%2F9ed4ccd0-3977-436d-a614-310a07e3773b.jpeg</url>
      <title>Forem: Mainak Bhattacharjee</title>
      <link>https://forem.com/mainak55512</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mainak55512"/>
    <language>en</language>
    <item>
      <title>From Regex Rampage to Lazy Bliss: My rjq Performance Adventure</title>
      <dc:creator>Mainak Bhattacharjee</dc:creator>
      <pubDate>Sat, 12 Oct 2024 16:19:08 +0000</pubDate>
      <link>https://forem.com/mainak55512/from-regex-rampage-to-lazy-bliss-my-rjq-performance-adventure-5bih</link>
      <guid>https://forem.com/mainak55512/from-regex-rampage-to-lazy-bliss-my-rjq-performance-adventure-5bih</guid>
      <description>&lt;p&gt;Hey there, fellow Rustaceans 🦀! &lt;/p&gt;

&lt;p&gt;I've been building a JSON filter tool called &lt;code&gt;rjq&lt;/code&gt;, inspired by the awesome &lt;code&gt;jq&lt;/code&gt;. But things took a turn for the worse when I hit a performance wall during lexing. The culprit? Compiling regular expressions in a hot loop. It turns out regexes are like hungry hippos – they chomp up performance if you're not careful!&lt;br&gt;
Here's the story of how I tamed the regex beast and saved my program from a slow, sluggish fate:&lt;/p&gt;
&lt;h2&gt;
  
  
  The Regex Rampage 🦖:
&lt;/h2&gt;

&lt;p&gt;At first, I naively compiled the regex patterns inside the lexing loop, so every iteration built a brand new regex object. Think of it like baking a whole new pizza for every bite – inefficient, right? This constant recompilation was a major bottleneck: roughly 80% of the execution time went into it.&lt;/p&gt;
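&lt;p&gt;Here's a rough, std-only sketch of that pizza problem. It is not rjq's actual lexer: &lt;code&gt;compile_number_pattern&lt;/code&gt; is a made-up stand-in for an expensive &lt;code&gt;Regex::new&lt;/code&gt; call, and &lt;code&gt;COMPILE_CALLS&lt;/code&gt; exists only to count how often it runs.&lt;/p&gt;

<![CDATA[
```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the (simulated) pattern compiler runs.
static COMPILE_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for an expensive `Regex::new(...)` call.
// It only records the call and returns a trivial "starts with a digit" matcher.
fn compile_number_pattern() -> impl Fn(&str) -> bool {
    COMPILE_CALLS.fetch_add(1, Ordering::SeqCst);
    |s: &str| s.chars().next().is_some_and(|c| c.is_ascii_digit())
}

// Naive loop: the pattern is rebuilt on every iteration.
fn count_numbers_naive(inputs: &[&str]) -> usize {
    let mut hits = 0;
    for &s in inputs {
        let pattern = compile_number_pattern(); // rebuilt every time
        if pattern(s) {
            hits += 1;
        }
    }
    hits
}

// Hoisted version: the pattern is built once, before the loop.
fn count_numbers_hoisted(inputs: &[&str]) -> usize {
    let pattern = compile_number_pattern(); // built once
    let mut hits = 0;
    for &s in inputs {
        if pattern(s) {
            hits += 1;
        }
    }
    hits
}

fn main() {
    let inputs = ["42", "abc", "7.5", "x1"];

    COMPILE_CALLS.store(0, Ordering::SeqCst);
    let naive_hits = count_numbers_naive(&inputs);
    let naive_compiles = COMPILE_CALLS.load(Ordering::SeqCst);

    COMPILE_CALLS.store(0, Ordering::SeqCst);
    let hoisted_hits = count_numbers_hoisted(&inputs);
    let hoisted_compiles = COMPILE_CALLS.load(Ordering::SeqCst);

    println!("naive: {naive_hits} hits, {naive_compiles} compiles");
    println!("hoisted: {hoisted_hits} hits, {hoisted_compiles} compiles");
}
```
]]>

&lt;p&gt;With four inputs, the naive loop builds the pattern four times while the hoisted version builds it once, and both find the same two matches. With a real regex, each of those extra builds is a full compilation.&lt;/p&gt;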
&lt;h2&gt;
  
  
  The LazyLock Solution 🧙‍♂️:
&lt;/h2&gt;

&lt;p&gt;Thankfully, the Rust gods (and some helpful folks on the &lt;a href="https://www.reddit.com/r/rust/s/Uqx8gxnhkO" rel="noopener noreferrer"&gt;r/Rust&lt;/a&gt; subreddit) pointed me towards lazy initialization, which is available through the &lt;code&gt;lazy_static&lt;/code&gt; crate or the standard library's &lt;code&gt;LazyLock&lt;/code&gt;. With &lt;code&gt;LazyLock&lt;/code&gt;, I could compile each regex only once and keep it in a thread-safe static. Now it's like having a box of pizza ready, with a fresh slice whenever you need one – much more efficient!&lt;/p&gt;
&lt;h2&gt;
  
  
  The Lazy Bliss ✨:
&lt;/h2&gt;

&lt;p&gt;The impact was phenomenal! Performance soared, and my lexing code became as smooth as butter. No more regex rampage, just happy filtering.&lt;br&gt;
Want to See the Code?&lt;br&gt;
Curious about the details? Head over to my GitHub repo for rjq: &lt;a href="https://github.com/mainak55512/rjq" rel="noopener noreferrer"&gt;https://github.com/mainak55512/rjq&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Lessons Learned 📚:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Regex compilation can be expensive; keep it out of hot loops!&lt;/li&gt;
&lt;li&gt;Embrace lazy initialization for performance gains.&lt;/li&gt;
&lt;li&gt;There's always a better way to do things in Rust (and life!)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, the next time you encounter a performance bottleneck, remember – there might be a lazy solution waiting to be discovered!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P.S.&lt;/strong&gt; If you have any other tips or tricks for optimizing JSON filtering in Rust, leave a comment below!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But wait, there's more!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's dive deeper into the technical aspects of this adventure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding &lt;code&gt;lazy_static&lt;/code&gt; and &lt;code&gt;LazyLock&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
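&lt;li&gt;
&lt;code&gt;lazy_static&lt;/code&gt;: A macro from the &lt;code&gt;lazy_static&lt;/code&gt; crate for declaring static variables that are initialized only once, on first access, even in a multi-threaded environment.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LazyLock&lt;/code&gt;: A standard-library type (&lt;code&gt;std::sync::LazyLock&lt;/code&gt;, stable since Rust 1.80) that does the same job without an extra crate: it runs its initializer on first access and guarantees thread-safe, one-time initialization.&lt;/li&gt;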
&lt;/ul&gt;

&lt;p&gt;Here's a simplified example of how I used &lt;code&gt;LazyLock&lt;/code&gt; to optimize the regex compilation in rjq:&lt;/p&gt;

&lt;p&gt;Outside the hot loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;LazyLock&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;regex&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Regex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Matches integers and decimals at the cursor, e.g. 5, 42, 3.14&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;MATCH_NUMBER&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LazyLock&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Regex&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;LazyLock&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(||&lt;/span&gt; &lt;span class="nn"&gt;Regex&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;r"^\d+(?:\.\d+)?"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;// ...and so on for the other token patterns&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the hot loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;    &lt;span class="c1"&gt;// One `find` call both tests for a match and returns it, so the regex runs once per token.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MATCH_NUMBER&lt;/span&gt;&lt;span class="nf"&gt;.find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;source_string&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="nf"&gt;.as_str&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="nf"&gt;.len&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;token_array&lt;/span&gt;&lt;span class="nf"&gt;.push_back&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;token&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;TokenType&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;NUMBER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="n"&gt;so&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, &lt;code&gt;MATCH_NUMBER&lt;/code&gt; is declared as a &lt;code&gt;LazyLock&lt;/code&gt; static, so the regex is compiled only once, the first time it is accessed, and every later loop iteration reuses it. &lt;code&gt;LazyLock&lt;/code&gt; also guarantees that this one-time initialization is thread-safe.&lt;/p&gt;
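&lt;p&gt;If you'd like to see the "only once, even across threads" behaviour for yourself, here's a small sketch under a few assumptions: &lt;code&gt;DIGITS&lt;/code&gt; is a hypothetical stand-in for a compiled regex (a &lt;code&gt;Vec&amp;lt;char&amp;gt;&lt;/code&gt;, so nothing outside the standard library is needed), and &lt;code&gt;INIT_CALLS&lt;/code&gt; is there purely to count initializer runs. It needs Rust 1.80 or newer.&lt;/p&gt;

<![CDATA[
```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;
use std::thread;

// Counts how many times the initializer closure actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a compiled regex: the set of accepted digit characters.
// In a real lexer this slot would hold a `Regex`.
static DIGITS: LazyLock<Vec<char>> = LazyLock::new(|| {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    ('0'..='9').collect::<Vec<char>>()
});

// Returns true when `input` starts with an ASCII digit, using the shared table.
fn starts_with_digit(input: &str) -> bool {
    input.chars().next().is_some_and(|c| DIGITS.contains(&c))
}

// Spawns `threads` workers that all touch DIGITS, then counts the matches.
fn run_workers(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|i| thread::spawn(move || starts_with_digit(&format!("{i} items"))))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .filter(|&matched| matched)
        .count()
}

fn main() {
    let matches = run_workers(8);
    println!(
        "matches: {matches}, initializer runs: {}",
        INIT_CALLS.load(Ordering::SeqCst)
    );
}
```
]]>

&lt;p&gt;Eight threads race to read &lt;code&gt;DIGITS&lt;/code&gt;, yet the initializer runs exactly once; the other threads simply wait for the first one to finish.&lt;/p&gt;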

&lt;p&gt;&lt;strong&gt;Additional Performance Tips&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Profiling: Use tools like &lt;code&gt;perf&lt;/code&gt; or &lt;code&gt;cargo-flamegraph&lt;/code&gt; to identify other performance bottlenecks in your code.&lt;/li&gt;
&lt;li&gt;Data Structures: Choose appropriate data structures for your use case. For example, consider using HashMap for efficient lookups.&lt;/li&gt;
&lt;li&gt;Algorithms: Optimize algorithms to reduce computational complexity.&lt;/li&gt;
&lt;li&gt;Memory Management: Be mindful of memory allocations and deallocations.&lt;/li&gt;
&lt;/ul&gt;
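&lt;p&gt;As one illustration of the data-structure tip, here's a sketch of a keyword table for a lexer. The &lt;code&gt;Keyword&lt;/code&gt; enum and the &lt;code&gt;and&lt;/code&gt;/&lt;code&gt;or&lt;/code&gt;/&lt;code&gt;not&lt;/code&gt; entries are hypothetical and are not rjq's real token set.&lt;/p&gt;

<![CDATA[
```rust
use std::collections::HashMap;
use std::sync::LazyLock;

// Hypothetical keyword kinds for a small query language.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Keyword {
    And,
    Or,
    Not,
}

// Keyword table built once on first use and shared afterwards.
static KEYWORDS: LazyLock<HashMap<&'static str, Keyword>> = LazyLock::new(|| {
    HashMap::from([
        ("and", Keyword::And),
        ("or", Keyword::Or),
        ("not", Keyword::Not),
    ])
});

// Hash lookup (average O(1)) instead of a chain of string comparisons.
fn classify(word: &str) -> Option<Keyword> {
    KEYWORDS.get(word).copied()
}

fn main() {
    for word in ["and", "or", "price"] {
        println!("{word:?} -> {:?}", classify(word));
    }
}
```
]]>

&lt;p&gt;For only a handful of keywords a plain &lt;code&gt;match&lt;/code&gt; may be just as fast, so it's worth profiling before switching.&lt;/p&gt;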

&lt;p&gt;By following these tips and leveraging techniques like lazy initialization, you can significantly improve the performance of your Rust applications.&lt;/p&gt;

&lt;p&gt;Happy coding 🎉!&lt;/p&gt;

</description>
      <category>rust</category>
      <category>regex</category>
      <category>performance</category>
      <category>linux</category>
    </item>
    <item>
      <title>Introducing rjq: A Fast and Lightweight CLI JSON Filtering Tool</title>
      <dc:creator>Mainak Bhattacharjee</dc:creator>
      <pubDate>Fri, 11 Oct 2024 06:05:31 +0000</pubDate>
      <link>https://forem.com/mainak55512/introducing-rjq-a-fast-and-lightweight-cli-json-filtering-tool-2ifo</link>
      <guid>https://forem.com/mainak55512/introducing-rjq-a-fast-and-lightweight-cli-json-filtering-tool-2ifo</guid>
      <description>&lt;p&gt;In the world of data manipulation, JSON has become a ubiquitous format, but filtering and querying JSON data can be cumbersome without the right tools. Enter &lt;a href="https://github.com/mainak55512/rjq" rel="noopener noreferrer"&gt;rjq&lt;/a&gt;, a command-line JSON filtering tool developed in Rust🦀, designed to be a performant and lightweight alternative to the popular jq tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Motivation Behind rjq
&lt;/h2&gt;

&lt;p&gt;rjq began as a hobby project, driven by a desire to create a tool that prioritizes performance and simplicity. It aims to be a robust alternative to jq and runs on both Linux and Windows, making it accessible to a wider audience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Performance:
&lt;/h2&gt;

&lt;p&gt;rjq has been optimized for speed, running nearly 2x faster than jq when tested on a Linux machine with 4GB RAM and an Intel i3 6th Gen processor. This performance boost can significantly enhance workflows, especially for users dealing with large datasets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0zprniev626lok0bwtg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0zprniev626lok0bwtg.png" alt="rjq vs jq benchmark" width="800" height="190"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Simplicity:
&lt;/h2&gt;

&lt;p&gt;The query structure of rjq is designed to be intuitive. Writing queries feels akin to crafting simple conditional statements in any programming language, which lowers the barrier to entry for new users.&lt;/p&gt;
&lt;h2&gt;
  
  
  Lightweight:
&lt;/h2&gt;

&lt;p&gt;With a minimalistic approach, rjq ensures that users can quickly load and filter JSON data without unnecessary overhead.&lt;/p&gt;
&lt;h2&gt;
  
  
  How to Use rjq
&lt;/h2&gt;

&lt;p&gt;Using rjq is straightforward. You can load JSON data from a file using the --load flag, or you can pipe input directly into the tool. Here are some usage examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
rjq &lt;span class="nt"&gt;--load&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"test.json"&lt;/span&gt; &lt;span class="nt"&gt;--query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;query string&amp;gt;"&lt;/span&gt; &lt;span class="nt"&gt;--params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;comma separated parameter list&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, you can pipe JSON output from other commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
stto &lt;span class="nt"&gt;--json&lt;/span&gt; cpython | rjq &lt;span class="nt"&gt;--query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;query string&amp;gt;"&lt;/span&gt; &lt;span class="nt"&gt;--params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;comma separated parameter list&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Development Journey
&lt;/h2&gt;

&lt;p&gt;The development of rjq has been a valuable learning experience, particularly in mastering the intricacies of Rust. The support from the Reddit community was instrumental in overcoming challenges. You can check out some of the discussions and insights from fellow developers in this &lt;a href="https://www.reddit.com/r/rust/s/AdvcGtZhkD" rel="noopener noreferrer"&gt;Reddit post&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Plans
&lt;/h2&gt;

&lt;p&gt;Looking ahead, the goal for rjq is to become the go-to choice for JSON filtering on both Linux and Windows. rjq is still in the early stages of development, and there are plans to add new features and extend the tool's capabilities to meet users' day-to-day needs. Contributions, stars ⭐ and forks 🔗 on the &lt;a href="https://github.com/mainak55512/rjq" rel="noopener noreferrer"&gt;rjq repo&lt;/a&gt; are greatly appreciated 👍.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Can Benefit from rjq?
&lt;/h2&gt;

&lt;p&gt;rjq is tailored for:&lt;/p&gt;

&lt;h4&gt;
  
  
  Developers:
&lt;/h4&gt;

&lt;p&gt;Those working with JSON data who need a reliable filtering tool.&lt;/p&gt;

&lt;h4&gt;
  
  
  Data Analysts:
&lt;/h4&gt;

&lt;p&gt;Professionals seeking efficient data extraction methods.&lt;/p&gt;

&lt;h4&gt;
  
  
  DevOps Teams:
&lt;/h4&gt;

&lt;p&gt;Teams automating data processing tasks in their workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;Getting started with rjq is easy. Binaries for both Linux and Windows are available in the &lt;a href="https://github.com/mainak55512/rjq/releases" rel="noopener noreferrer"&gt;releases section&lt;/a&gt; of the GitHub repository, allowing users to install the tool without hassle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Whether you’re a developer, data analyst, or part of a DevOps team, rjq offers a fast, lightweight solution for filtering JSON data. With its performance, simplicity, and growing feature set, rjq is poised to become an essential tool in your data processing arsenal. Check out the &lt;a href="https://github.com/mainak55512/rjq" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt; to learn more and get started today!&lt;/p&gt;

</description>
      <category>jq</category>
      <category>rust</category>
      <category>microsoft</category>
      <category>linux</category>
    </item>
  </channel>
</rss>
