<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Aniruddha Bhattacharjee</title>
    <description>The latest articles on Forem by Aniruddha Bhattacharjee (@anirudd).</description>
    <link>https://forem.com/anirudd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F242889%2F2368b4af-f55f-4698-9a8f-e495abaa73ec.jpeg</url>
      <title>Forem: Aniruddha Bhattacharjee</title>
      <link>https://forem.com/anirudd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/anirudd"/>
    <language>en</language>
    <item>
      <title>Algorithmic Deep-dive: Dynamic String Matching</title>
      <dc:creator>Aniruddha Bhattacharjee</dc:creator>
      <pubDate>Sun, 08 Nov 2020 12:19:31 +0000</pubDate>
      <link>https://forem.com/anirudd/algorithmic-deep-dive-dynamic-string-matching-430h</link>
      <guid>https://forem.com/anirudd/algorithmic-deep-dive-dynamic-string-matching-430h</guid>
      <description>&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;p&gt;Consider a minimalistic and austere typing game wherein a player is required to type a bunch of words appearing on the screen before the timer expires.&lt;/p&gt;

&lt;p&gt;Something like this:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fx7ibq94kbllbkkjpdt4l.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fx7ibq94kbllbkkjpdt4l.PNG" alt="Pygame Typing Challenge"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The entire implementation of the game is &lt;a href="https://github.com/anirudnits/Pygame-Typing-Challenge" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Okay! Lets get to it then.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem
&lt;/h2&gt;

&lt;p&gt;Lets consider the problem of actually &lt;strong&gt;matching&lt;/strong&gt; the &lt;em&gt;player string&lt;/em&gt; to the list of words. The typical implementation of it would work something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Record the player input, we will refer to its present state as &lt;em&gt;player_string&lt;/em&gt; &lt;/li&gt;
&lt;li&gt;When the player hits enter, match the current player string with the list of words like
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;player_string&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;word_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# do something
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The time complexity of this is &lt;strong&gt;O(len(player_string)*len(word_list))&lt;/strong&gt;, which is fine if we are only processing such matching when the player hits enter. But say we don't want that, what we want is whenever the player presses any key, the &lt;em&gt;player_string&lt;/em&gt; is matched with words in the word_list. Now the earlier time complexity seems a bit extraneous and we will try to come with a more elegant and efficient approach to solve this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;Before going into the nitty-gritty of the different approaches used, let us take a pause and list down our exact requirements.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The string matching needs to be done with each keystroke, not just when player hits enter.&lt;/li&gt;
&lt;li&gt;We should support the following user actions:

&lt;ul&gt;
&lt;li&gt;The player can insert an alphanumeric character.&lt;/li&gt;
&lt;li&gt;The player can delete the character left of the cursor                     position, if any, by pressing backspace.&lt;/li&gt;
&lt;li&gt;The player can move the cursor position using the left-right direction keys.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Trie Approach
&lt;/h2&gt;

&lt;p&gt;If you are not familiar with the &lt;em&gt;Trie&lt;/em&gt; data structure, refer &lt;a href="https://www.youtube.com/watch?v=AXjmTQ8LEoI" rel="noopener noreferrer"&gt;here1&lt;/a&gt;,&lt;br&gt;
&lt;a href="https://www.hackerearth.com/practice/data-structures/advanced-data-structures/trie-keyword-tree/tutorial/" rel="noopener noreferrer"&gt;here2&lt;/a&gt;. Very briefly, &lt;em&gt;Trie&lt;/em&gt; or prefix tree is a data structure that stores a list of strings in a tree structure and helps maintain the time complexity of many operations like insertion, deletion and search to order of &lt;strong&gt;O(len(word))&lt;/strong&gt;. Note &lt;em&gt;Trie&lt;/em&gt; can be used in various contexts other than just strings, refer &lt;a href="https://www.hackerearth.com/practice/notes/lalitkundu95/tutorial-on-trie-and-example-problems/" rel="noopener noreferrer"&gt;here3&lt;/a&gt; to understand how &lt;em&gt;tries&lt;/em&gt; can be used to solve &lt;em&gt;bit manipulation&lt;/em&gt; questions.&lt;/p&gt;

&lt;p&gt;Now that we have got the introductions out of the way, lets see how &lt;em&gt;Tries&lt;/em&gt; are relevant to our problem. Tries are pertinent to us in two ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store the list of words:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Something like &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ffb1azzu32cpkvatfyb8e.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ffb1azzu32cpkvatfyb8e.jpg" alt="Trie data structure to store words"&gt;&lt;/a&gt;&lt;br&gt;
The time required to store all the words in the list will be of order &lt;strong&gt;O(n*m)&lt;/strong&gt;, where n = len(words) and m = max(len(word))&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check if the &lt;em&gt;player_string&lt;/em&gt; matches with any word:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now, if you're following along, this part is rather trivial. We start at the root of the tree and for every character entered we traverse the tree accordingly and if we ever arrive at a node with &lt;code&gt;last == True&lt;/code&gt;, VOILA! there is a match and we can strike that word out from the list, easy peasy.&lt;/p&gt;

&lt;p&gt;But what happens if the player presses &lt;em&gt;backspace&lt;/em&gt;, our data structure still doesn't have the capability to process this user action. A typical &lt;em&gt;Trie&lt;/em&gt; implementation doesn't support going back along a path but this can be easily added by either appending an extra attribute &lt;em&gt;parent&lt;/em&gt; to each &lt;em&gt;trie node&lt;/em&gt; that way we can move back one node whenever the user presses backspace or the way I did it, by augmenting a list to keep track of all the nodes we have visited while traversing some input and then popping nodes from this list with each backspace. This does increase the space complexity to &lt;strong&gt;O(len(word))&lt;/strong&gt; but keeps the time complexity to &lt;strong&gt;O(1)&lt;/strong&gt; for insertion and backspace user actions.&lt;/p&gt;

&lt;p&gt;The implementation for this &lt;em&gt;Trie&lt;/em&gt; based approach can be found &lt;a href="https://github.com/anirudnits/Pygame-Typing-Challenge/blob/4b79ea29dfac1e086c64848bb94daea03657f0b9/data_structures.py#L25" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Rolling hash + Deque Approach
&lt;/h2&gt;

&lt;p&gt;The problem with &lt;em&gt;Trie&lt;/em&gt; based approach is that, at least to the best of my knowledge, there's no way to incorporate the cursor moving actions. Come to think of it this does make sense, &lt;em&gt;Trie&lt;/em&gt; or prefix tree, is a data structure which is designed to process, no surprises here, &lt;em&gt;prefixes&lt;/em&gt; and as soon a player starts moving his/her cursor and start typing from arbitrary positions in the string that prerequisite is broken. So what do we do?&lt;/p&gt;

&lt;p&gt;Rolling hash comes to the rescue. Rolling hash is a hashing technique which processes one character at a time for computing the hash value and depending how one implements it, can efficiently recompute hash values if some character(s) are to inserted/deleted. Rolling hash is used in the &lt;em&gt;Rabin Karp&lt;/em&gt; algorithm for pattern matching &lt;a href="https://en.wikipedia.org/wiki/Rolling_hash" rel="noopener noreferrer"&gt;here1&lt;/a&gt; and also palindrome based queries &lt;a href="https://www.quora.com/q/threadsiiithyderabad/String-Hashing-for-competitive-programming?share=1" rel="noopener noreferrer"&gt;here2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Rolling hash intuitively fits the bill as it allows us to efficiently compute hash values for insertion and deletion operations. But what about the cursor moving actions? Well, for that we have to employ &lt;em&gt;deques&lt;/em&gt; to dynamically maintain characters to the right and left of the cursor position. This approach works this way:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store the list of words:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We compute the hash value for all the words in the list and store them in a dictionary.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check if the &lt;em&gt;player_string&lt;/em&gt; matches with any word:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We dynamically compute the hash value of the &lt;em&gt;player_string&lt;/em&gt; with each user action and check if the hash value is present in the dictionary of stored hashes, if yes then we ensure that the strings actually match by brute-force string-matching of the player_string with the potential match.&lt;/p&gt;

&lt;p&gt;This approach maintains a time complexity of &lt;strong&gt;O(log(mod))&lt;/strong&gt; per user action, where &lt;em&gt;mod&lt;/em&gt; is the modulo used in calculating the hash values of the strings. Note that this approach does depend on the uniqueness of the hashing algorithm and key space used. For this instance I used a modulo closer to ~10^10.&lt;/p&gt;

&lt;p&gt;The implementation of this Rolling hash + Deque approach can be found &lt;a href="https://github.com/anirudnits/Pygame-Typing-Challenge/blob/4b79ea29dfac1e086c64848bb94daea03657f0b9/data_structures.py#L64" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So that's it, the &lt;em&gt;rolling hash + deque&lt;/em&gt; approach maintains a minimal time complexity for our problem while supporting all the intended user actions.&lt;/p&gt;

&lt;p&gt;I've tried to explain the approaches and their essential points succinctly and so they may not be a 100% comprehensive but do look at the code, I believe that will clear a lot of confusion. Plus do comment if you believe anything is wrong or you think there is a more elegant or efficient way of solving this, I would love to learn about it.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>tutorial</category>
      <category>python</category>
      <category>datastructures</category>
    </item>
    <item>
      <title>Python's Import System</title>
      <dc:creator>Aniruddha Bhattacharjee</dc:creator>
      <pubDate>Sat, 15 Aug 2020 10:49:14 +0000</pubDate>
      <link>https://forem.com/anirudd/python-s-import-system-11ij</link>
      <guid>https://forem.com/anirudd/python-s-import-system-11ij</guid>
      <description>&lt;p&gt;In this blog, I want to shed some light on a somewhat overlooked and taken-for-granted feature of Python, and many programming languages for that matter, the &lt;strong&gt;Import System&lt;/strong&gt;, specifically how Python goes about importing various modules and the subtle nuances involved in the process.&lt;/p&gt;

&lt;p&gt;Before delving into the topic at hand, I want to briefly outline the difference between a &lt;em&gt;Python module&lt;/em&gt; and a &lt;em&gt;Python script&lt;/em&gt;, this will come in handy when we discuss relative imports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python Module &amp;amp; Python Script
&lt;/h3&gt;

&lt;p&gt;In essence, both a python module and a python script are executable python code but serve different purposes, a &lt;em&gt;script&lt;/em&gt; is a python file that is executed on its own like &lt;code&gt;python xyz.py&lt;/code&gt; whereas a &lt;em&gt;module&lt;/em&gt; is a python file that is imported by other python files and typically not executed on its own. Another subtle implementation difference shows up in the value of their global attribute &lt;code&gt;__name__&lt;/code&gt;. The &lt;code&gt;__name__&lt;/code&gt; value for a python script is &lt;code&gt;__main__&lt;/code&gt; while that of a module is its &lt;em&gt;path&lt;/em&gt;. To try it out, write an innocuous python file &lt;em&gt;X.py&lt;/em&gt; and add the solitary line &lt;code&gt;print("the __name__ attribute for X is", __name__)&lt;/code&gt;. Now run X.py directly with &lt;code&gt;python X.py&lt;/code&gt; and observe the output. Now, write another python file Y.py in the same directory and just add &lt;code&gt;import X&lt;/code&gt; inside it. Run &lt;em&gt;Y.py&lt;/em&gt; and Voila! see the difference in the value in the &lt;code&gt;__name__&lt;/code&gt; attribute for X.&lt;/p&gt;

&lt;p&gt;When you first started learning to code in Python, you might have come across a statement at the end of the python program that read something like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if __name__ == "__main__":
# do something
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now you know the reason why! &lt;a href="https://docs.python.org/3/library/__main__.html#module-__main__"&gt;Python's &lt;code&gt;__main__&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Python Import System
&lt;/h3&gt;

&lt;p&gt;For the sake of brevity, I'll confine this blog to the "basic" and "pertinent" (obviously basic and pertinent are used in relative terms, so don't take them absolutely) steps on what happens when Python executes an &lt;code&gt;import XYZ&lt;/code&gt; statement without going into the more esoteric details, for that, you can refer &lt;a href="https://docs.python.org/3/reference/import.html#:~:text=Python%20code%20in%20one%20module,Functions%20such%20as%20importlib"&gt;Python's Import System&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;From Python's official documentation, importing a module involves two stages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Searching the named module.&lt;/li&gt;
&lt;li&gt;Binding the result of the search to some name in the local namespace.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, the question is where does Python search the module in?&lt;br&gt;
So, Python searches the required module in the following places in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;sys.modules (the module cache)&lt;/li&gt;
&lt;li&gt;the directories listed in sys.path (in order)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Open up a new python file and type in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sys
print(sys.path)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;When you run this, it lists the directories Python searches to see if the requested module exists.&lt;/p&gt;

&lt;p&gt;Another small experiment to test the precedence of imports, try this:&lt;/p&gt;

&lt;p&gt;Create a file random.py and type in whatever you feel like it. Fire up the python shell in the same directory and import random. Which random module do you think is imported? The answer actually depends on whether you have already import random some earlier time, if yes then the in-built random module would have been cached in sys.modules and will be imported. If not then your recently created random module will be imported because the current working directory(in the interpreter it will be termed as '', an empty string) comes first in sys.path.&lt;/p&gt;

&lt;p&gt;So, if you're stuck with an infuriating ModuleNotFoundError, try printing &lt;code&gt;sys.path&lt;/code&gt; and append the intended module's directory in sys.path, &lt;code&gt;sys.path.append('path_to_the_module_directory')&lt;/code&gt;. Be wary of name clashes though!&lt;/p&gt;

&lt;h3&gt;
  
  
  Relative Imports
&lt;/h3&gt;

&lt;p&gt;This is one of the conundrums that has vexed a lot of people and mostly because the official Python documentation isn't very clear on this topic (I think). To be fair, Python's documentation does refer to the section as &lt;br&gt;
&lt;a href="https://docs.python.org/3/reference/import.html#package-relative-imports"&gt;&lt;strong&gt;Package Relative Imports&lt;/strong&gt;&lt;/a&gt; and &lt;strong&gt;not Relative Imports&lt;/strong&gt; and this is where we return to the distinction between &lt;em&gt;modules&lt;/em&gt; and &lt;em&gt;scripts&lt;/em&gt;, because again a &lt;em&gt;Package&lt;/em&gt; is not supposed to be run on its own and anytime you call a &lt;em&gt;Package&lt;/em&gt; you are essentially importing the &lt;code&gt;__init__.py&lt;/code&gt; file defined in the top directory of the package.&lt;/p&gt;

&lt;p&gt;So, here's the gist &lt;strong&gt;you cannot do relative imports on a file whose &lt;code&gt;__name__&lt;/code&gt; is &lt;code&gt;__main__&lt;/code&gt;&lt;/strong&gt; because python uses this very attribute to construct paths for importing relative modules. So, if the &lt;code&gt;__name__&lt;/code&gt; of some file is &lt;em&gt;/xyz/subpackage1/path2&lt;/em&gt; then this is the base path used for any relative imports, &lt;code&gt;import .moduleX&lt;/code&gt; basically implies to import a python module located in &lt;em&gt;/xyz/subpackage1/path2&lt;/em&gt; and &lt;code&gt;import ..moduleY&lt;/code&gt; means to import &lt;em&gt;moduleY&lt;/em&gt; from &lt;em&gt;/xyz/subpackage1&lt;/em&gt; (if it exists that is!)&lt;/p&gt;

&lt;p&gt;If you're familiar with Django, its standard to see code like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from . import views
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;in urls.py but again we never really run urls.py on its own, we call &lt;em&gt;manage.py&lt;/em&gt; and it implicitly calls &lt;em&gt;urls.py&lt;/em&gt; in the background to do perform some tasks.&lt;/p&gt;

&lt;p&gt;So there you have it, a small and hopefully definitive guide to imports in Python. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3UP1yw3w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/4er6rz7l3ut8696p73l6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3UP1yw3w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/4er6rz7l3ut8696p73l6.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks :)&lt;/p&gt;

</description>
      <category>python</category>
      <category>imports</category>
      <category>tutorial</category>
      <category>codenewbie</category>
    </item>
  </channel>
</rss>
