<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Deepak Prasad</title>
    <description>The latest articles on Forem by Deepak Prasad (@dpkpr).</description>
    <link>https://forem.com/dpkpr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3921187%2F7208419b-ccd2-487e-a296-842a47675b03.jpg</url>
      <title>Forem: Deepak Prasad</title>
      <link>https://forem.com/dpkpr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dpkpr"/>
    <language>en</language>
    <item>
      <title>Python Performance Analysis</title>
      <dc:creator>Deepak Prasad</dc:creator>
      <pubDate>Sat, 09 May 2026 06:43:56 +0000</pubDate>
      <link>https://forem.com/dpkpr/python-performance-analysis-1kkc</link>
      <guid>https://forem.com/dpkpr/python-performance-analysis-1kkc</guid>
      <description>&lt;p&gt;When I first started using pandas for data analysis, I wrote explicit loops over rows. Loops are slow on large DataFrames, so I looked for faster alternatives; this post summarizes what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  The apply() Method
&lt;/h2&gt;

&lt;p&gt;We can use &lt;code&gt;apply&lt;/code&gt; with a lambda function.&lt;/p&gt;

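&lt;p&gt;The performance of &lt;code&gt;apply&lt;/code&gt; depends on the function passed to it: with &lt;code&gt;axis=1&lt;/code&gt; that function is called once per row in Python, so an expensive expression is paid for on every row.&lt;/p&gt;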

&lt;h2&gt;
  
  
  Swifter apply()
&lt;/h2&gt;

&lt;p&gt;Swifter combines pandas &lt;code&gt;apply&lt;/code&gt; (non-parallel) and dask &lt;code&gt;apply&lt;/code&gt; (parallelized). Because it switches between these two modes depending on the workload, it can avoid dask's parallelization overhead where plain pandas would be quicker. See the References section for details.&lt;/p&gt;

&lt;p&gt;Swifter is a third-party library and must be installed before it can be imported.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numpy vectorize()
&lt;/h2&gt;

&lt;p&gt;The vectorized function evaluates &lt;code&gt;pyfunc&lt;/code&gt; over successive tuples of the input arrays like the Python &lt;code&gt;map&lt;/code&gt; function, except it uses the broadcasting rules of numpy.&lt;/p&gt;

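&lt;p&gt;The data type of the output of the vectorized function is determined by calling the function with the first element of the input. This can be avoided by specifying the &lt;code&gt;otypes&lt;/code&gt; argument. Note that the NumPy documentation describes &lt;code&gt;vectorize&lt;/code&gt; as provided primarily for convenience, not for performance: the implementation is essentially a for loop.&lt;/p&gt;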
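&lt;p&gt;The effect of &lt;code&gt;otypes&lt;/code&gt; is easiest to see with a function whose return type varies. The following is a minimal sketch; &lt;code&gt;half_if_odd&lt;/code&gt; is a made-up function used only for illustration.&lt;/p&gt;

```python
import numpy as np


def half_if_odd(x):
    # Returns an int for even inputs and a float for odd inputs.
    return x // 2 if x % 2 == 0 else x / 2


values = np.array([2, 3, 4, 5])

# The first element (2) yields an int, so the whole output is cast to an
# integer dtype and the fractional results are truncated.
inferred = np.vectorize(half_if_odd)(values)

# Declaring the output type up front keeps the fractional results.
explicit = np.vectorize(half_if_odd, otypes=[float])(values)

print(inferred.dtype.kind, inferred.tolist())
print(explicit.dtype.kind, explicit.tolist())
```

&lt;p&gt;Setting &lt;code&gt;otypes&lt;/code&gt; also avoids the extra call on the first element that is otherwise made just to infer the type.&lt;/p&gt;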

&lt;h2&gt;
  
  
  Vectorization
&lt;/h2&gt;

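&lt;p&gt;Vectorization means expressing an operation on whole columns or arrays at once instead of looping over individual rows in Python.&lt;/p&gt;

&lt;p&gt;The loop still happens, but inside optimized compiled code rather than the Python interpreter, which can cut the running time substantially.&lt;/p&gt;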
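&lt;p&gt;As a rough sketch of the difference, the snippet below computes the same column three ways on a small made-up DataFrame (the column names &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; mirror the Sample Code section; the data is invented).&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({"a": range(1000), "b": range(1000, 2000)})

# Row-wise: the lambda runs once per row in the Python interpreter.
row_wise = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Vectorized: one column-level addition executed in compiled code.
vectorized = df["a"] + df["b"]

# Same arithmetic on the underlying NumPy arrays, skipping index alignment.
numeric = df["a"].to_numpy() + df["b"].to_numpy()

print(row_wise.equals(vectorized))
```

&lt;p&gt;All three produce the same values. Timing each variant with &lt;code&gt;timeit&lt;/code&gt; on your own data is the reliable way to see how large the gap is.&lt;/p&gt;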

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Not every operation can be expressed as vectorized arithmetic. For common cases such as string handling, pandas provides column-level methods (for example the &lt;code&gt;.str&lt;/code&gt; accessor).&lt;/p&gt;
&lt;/blockquote&gt;

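&lt;p&gt;&lt;strong&gt;Split each row:&lt;/strong&gt; We use &lt;code&gt;map&lt;/code&gt; because it is the fastest of these approaches, but pandas string methods such as &lt;code&gt;str.split&lt;/code&gt; and &lt;code&gt;str.partition&lt;/code&gt; are alternatives. Note that &lt;code&gt;partition&lt;/code&gt; splits only at the first separator and returns the separator as its own column. For the code, see the Sample Code section.&lt;/p&gt;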
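&lt;p&gt;The difference between &lt;code&gt;str.split&lt;/code&gt; and &lt;code&gt;str.partition&lt;/code&gt; is easy to miss, so here is a small sketch on invented route codes. The target column names &lt;code&gt;first&lt;/code&gt;, &lt;code&gt;second&lt;/code&gt; and &lt;code&gt;third&lt;/code&gt; are placeholders.&lt;/p&gt;

```python
import pandas as pd

routes = pd.DataFrame({"ROUTE_CODE": ["A@B@C", "D@E@F"]})

# split with expand=True returns one column per piece.
split_cols = routes["ROUTE_CODE"].str.split("@", n=2, expand=True)

# partition splits at the FIRST separator only and returns three columns:
# the text before it, the separator itself, and everything after it.
part_cols = routes["ROUTE_CODE"].str.partition("@")

# Assign several new columns at once by passing a list of column names.
routes[["first", "second", "third"]] = split_cols

print(split_cols.iloc[0].tolist())
print(part_cols.iloc[0].tolist())
```

&lt;p&gt;Unpacking a DataFrame into several targets with a comma-separated left-hand side iterates over its column labels, not its columns, which is why the list-of-names form is used here.&lt;/p&gt;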

&lt;h2&gt;
  
  
  Other Observations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use inplace
&lt;/h3&gt;

&lt;p&gt;Many Pandas operations have an &lt;code&gt;inplace&lt;/code&gt; parameter, always defaulting to &lt;code&gt;False&lt;/code&gt;, meaning the original DataFrame is untouched and the operation returns a new DataFrame.&lt;/p&gt;

&lt;p&gt;When setting &lt;code&gt;inplace=True&lt;/code&gt;, the operation might work on the original DataFrame, but it might still work on a copy behind the scenes and just reassign the reference when done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Can be both faster and less memory intensive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prevents chained/functional syntax such as &lt;code&gt;df.dropna().rename().sum()...&lt;/code&gt;, because a method called with &lt;code&gt;inplace=True&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt; instead of a DataFrame.&lt;/li&gt;
&lt;li&gt;When using &lt;code&gt;inplace=True&lt;/code&gt; on an object which is potentially a slice/view of an underlying DataFrame, Pandas has to do a &lt;code&gt;SettingWithCopy&lt;/code&gt; check, which is expensive. &lt;code&gt;inplace=False&lt;/code&gt; avoids this.&lt;/li&gt;
&lt;li&gt;Less consistent and predictable behaviour behind the scenes.&lt;/li&gt;
&lt;/ul&gt;
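&lt;p&gt;The chaining limitation can be seen in a short sketch on an invented DataFrame:&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4.0, 5.0, 6.0]})

# Default behaviour: each step returns a new object, so calls can be chained
# and the original DataFrame is left untouched.
total = df.dropna().rename(columns={"a": "x"})["x"].sum()
rows_before = len(df)

# With inplace=True the method returns None, so nothing can be chained on it.
result = df.dropna(inplace=True)
rows_after = len(df)

print(total, rows_before, result, rows_after)
```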

&lt;h3&gt;
  
  
  Use Hash for Lookups
&lt;/h3&gt;

&lt;p&gt;Looking up a key in a Python dict (a hash table) is very fast, and typically faster than repeated label lookups on a Series. A Series can be converted to a dict with a single call to &lt;code&gt;to_dict()&lt;/code&gt;.&lt;/p&gt;
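&lt;p&gt;A minimal sketch of the pattern, using invented data and the hypothetical names &lt;code&gt;prices&lt;/code&gt; and &lt;code&gt;price_by_name&lt;/code&gt;:&lt;/p&gt;

```python
import pandas as pd

prices = pd.Series([1.5, 2.0, 0.75], index=["apple", "banana", "cherry"])

# Convert once; the Series index becomes the dict keys.
price_by_name = prices.to_dict()

orders = ["banana", "apple", "banana"]

# Each lookup is now a plain dict access instead of a Series label lookup.
total = sum(price_by_name[name] for name in orders)

print(total)
```

&lt;p&gt;The conversion itself costs time and memory, so it pays off when the same mapping is queried many times.&lt;/p&gt;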

&lt;p&gt;Separately, use &lt;code&gt;describe()&lt;/code&gt; to get summary statistics of your DataFrame and &lt;code&gt;memory_usage()&lt;/code&gt; to see how much memory each column consumes.&lt;/p&gt;
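&lt;p&gt;For example, on a small invented DataFrame (the column names are placeholders):&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({"city": ["Pune", "Delhi", "Pune"], "sales": [10, 20, 30]})

# Summary statistics; by default only numeric columns are included.
stats = df.describe()

# Bytes used by the index and by each column; deep=True also counts the
# contents of object (string) columns.
usage = df.memory_usage(deep=True)

print(stats.loc["mean", "sales"])
print(usage.index.tolist())
```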

&lt;h2&gt;
  
  
  Sample Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# APPLY with lambda FUNCTION
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lamdba&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# APPLY with FUNCTION
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lambda_func&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;addTwoCols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;swifter&lt;/span&gt;
&lt;span class="c1"&gt;# SWIFTER APPLY with lambda FUNCTION
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lambda_swifter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;swifter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# NP VECTORIZE
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vectorize&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# VECTORIZATION
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vec&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# NUMERIC VECTORIZATION
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;num_vec&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;

&lt;span class="c1"&gt;# STRING SPLIT - using map
&lt;/span&gt;&lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# STRING SPLIT - using Partition function
&lt;/span&gt;&lt;span class="n"&gt;capaConsPeg7&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg7&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg7&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; \
    &lt;span class="n"&gt;capaConsPeg7&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;partition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# STRING SPLIT - using str.split
&lt;/span&gt;&lt;span class="n"&gt;capaConsPeg2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; \
    &lt;span class="n"&gt;capaConsPeg3&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expand&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# STRING SPLIT - using str.split (another approach)
&lt;/span&gt;&lt;span class="n"&gt;capaConsPeg4&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg4&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;capaConsPeg4&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; \
    &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;capaConsPeg4&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROUTE_CODE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FROM_ACTIVITY1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FROM_ACTIVITY2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FROM_ACTIVITY3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use vectorization wherever possible. It is the most efficient approach that needs no third-party libraries.&lt;/li&gt;
&lt;li&gt;For numeric data, operate on the underlying NumPy arrays (for example via &lt;code&gt;.values&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Swifter is not available on the o9 platform, so ignore it for now.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;inplace=True&lt;/code&gt; only where it measurably helps; as noted above, it can still copy behind the scenes and it prevents method chaining.&lt;/li&gt;
&lt;li&gt;Use hash for lookups.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/jmcarpenter2/swifter" rel="noopener noreferrer"&gt;swifter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/one-word-of-code-to-stop-using-pandas-so-slowly-793e0a81343c" rel="noopener noreferrer"&gt;One word of code to stop using pandas so slowly&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/how-to-make-your-pandas-loop-71-803-times-faster-805030df4f06" rel="noopener noreferrer"&gt;How to make your pandas loop 71,803 times faster&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://numpy.org/doc/stable/reference/generated/numpy.vectorize.html" rel="noopener noreferrer"&gt;numpy.vectorize documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stackoverflow.com/questions/52673285/performance-of-pandas-apply-vs-np-vectorize-to-create-new-column-from-existing-c" rel="noopener noreferrer"&gt;Performance of pandas apply vs np.vectorize&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>pandas</category>
      <category>performance</category>
    </item>
    <item>
      <title>I Built a Personal Finance App That Doesn't Touch Your Data</title>
      <dc:creator>Deepak Prasad</dc:creator>
      <pubDate>Sat, 09 May 2026 06:40:54 +0000</pubDate>
      <link>https://forem.com/dpkpr/i-built-a-personal-finance-app-that-doesnt-touch-your-data-2fl4</link>
      <guid>https://forem.com/dpkpr/i-built-a-personal-finance-app-that-doesnt-touch-your-data-2fl4</guid>
      <description>&lt;p&gt;It started with distrust.&lt;/p&gt;

&lt;p&gt;I'd been using a couple of popular finance apps — the kind you sign up for, link your bank, and hope for the best. They worked fine. But there was always this low-level discomfort: somewhere on a server I've never seen, in a database I'll never inspect, sits every salary credit, every embarrassing impulse purchase, every month I blew the grocery budget. That felt wrong.&lt;/p&gt;

&lt;p&gt;So I built my own. This is the story of how it works, the decisions I made, and the things that surprised me along the way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Constraint That Shaped Everything
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of code, I made one rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The app will have no backend. My data stays in my Google account.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That sounds simple. It isn't. Almost every interesting feature in a finance app — sync, multi-device access, historical data — assumes a server somewhere. Removing the server meant I had to find creative replacements for things I'd been taking for granted.&lt;/p&gt;

&lt;p&gt;The answer turned out to be something hiding in plain sight: &lt;strong&gt;Google Sheets&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Google Sheets as a Database
&lt;/h2&gt;

&lt;p&gt;I know what you're thinking. But hear me out.&lt;/p&gt;

&lt;p&gt;Google Sheets gives you a spreadsheet that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lives in your own Google Drive&lt;/li&gt;
&lt;li&gt;Is readable and writable via a REST API&lt;/li&gt;
&lt;li&gt;Has fine-grained OAuth scopes&lt;/li&gt;
&lt;li&gt;Is human-inspectable (you can just... open it and look)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The app creates one spreadsheet — named &lt;code&gt;DKP Finance - your@email.com&lt;/code&gt; — with tabs for Transactions, Budgets, Goals, NetWorth, Categories, and Settings. Each tab has a header row. The app reads and writes those tabs on sign-in and after changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Transactions tab:
id | type | amount | group | category | description | date | notes | tags | ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's not blazing fast. It's not a real database. But for a personal finance app with a few thousand rows at most, it's more than enough — and the user can open their sheet any time and see exactly what the app knows about them.&lt;/p&gt;

&lt;p&gt;That last part matters. &lt;strong&gt;Transparency is a feature.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The OAuth Scope I'm Most Proud Of
&lt;/h2&gt;

&lt;p&gt;When you sign in with Google, apps request scopes — permissions to access parts of your account. Most apps request broad access. I was careful here.&lt;/p&gt;

&lt;p&gt;DKP Finance requests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;openid, email, profile
https://www.googleapis.com/auth/spreadsheets
https://www.googleapis.com/auth/drive.file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last one — &lt;code&gt;drive.file&lt;/code&gt; — is the interesting one. It grants access only to files the app itself creates. The app literally cannot see any other file in your Drive. Not your documents, not your photos, not any other spreadsheet. If it didn't create the file, it's invisible.&lt;/p&gt;

&lt;p&gt;This isn't just a privacy talking point. It's enforced by Google's API at the scope level. The app is technically incapable of snooping.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tabs as Schema, Headers as Contracts
&lt;/h2&gt;

&lt;p&gt;One early problem: Google Sheets has no schema enforcement. A row is just a list of values. If I add a new column next week, existing sheets won't have that header.&lt;/p&gt;

&lt;p&gt;I solved this with a header-based read pattern. When the app reads a tab, it uses the first row as a key map:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Row: ["id", "amount", "description", ...]&lt;/span&gt;
&lt;span class="c1"&gt;// Data rows: ["abc123", "2500", "Groceries", ...]&lt;/span&gt;
&lt;span class="c1"&gt;// Result: { id: "abc123", amount: "2500", description: "Groceries" }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means &lt;strong&gt;additive changes are safe by default&lt;/strong&gt; — add a new column to the code, and existing sheets just return &lt;code&gt;undefined&lt;/code&gt; for that field, which the app handles gracefully. The new column appears in the sheet after the next sync.&lt;/p&gt;
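&lt;p&gt;The idea fits in a few lines. This is a language-neutral sketch written in Python rather than the app's TypeScript; &lt;code&gt;rows_to_records&lt;/code&gt; is a hypothetical name, and it assumes rows may arrive shorter than the header when trailing cells are empty.&lt;/p&gt;

```python
def rows_to_records(rows):
    """Turn [header, row, row, ...] into a list of dicts keyed by header."""
    header, *data = rows
    records = []
    for row in data:
        # Pad short rows so missing trailing cells become None.
        padded = list(row) + [None] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


rows = [
    ["id", "amount", "description"],
    ["abc123", "2500", "Groceries"],
    ["def456", "120"],
]
records = rows_to_records(rows)

# A column the sheet does not have yet simply comes back empty.
missing = records[0].get("tags")
print(records[0], missing)
```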

&lt;p&gt;For destructive changes — renaming a column, changing a value format — I built a schema versioning system. The Settings tab stores a &lt;code&gt;schema_version&lt;/code&gt; number. On sign-in, if the sheet version is behind the app version, a migration function runs in memory first, then pushes the upgraded data back. The sheet is never written to before the data is safely transformed.&lt;/p&gt;

&lt;p&gt;I genuinely enjoyed this part. It felt like designing a tiny database migration system for a spreadsheet.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Token Problem
&lt;/h2&gt;

&lt;p&gt;Here's the part that caused the most grief.&lt;/p&gt;

&lt;p&gt;The app uses Google Identity Services (GIS) for authentication. GIS gives you an &lt;strong&gt;access token&lt;/strong&gt; that expires after one hour. For apps with a backend, this is fine — you store a refresh token on the server and silently get new access tokens forever.&lt;/p&gt;

&lt;p&gt;I have no backend. So I only get the access token.&lt;/p&gt;

&lt;p&gt;My first approach was lazy: check the token before every sync, refresh it if expired. This worked until it didn't — sometimes the refresh triggered a popup, interrupting the user mid-session. Not great for something supposed to feel like a native app.&lt;/p&gt;

&lt;p&gt;The fix was proactive: schedule a refresh &lt;strong&gt;5 minutes before the token expires&lt;/strong&gt;. The GIS silent refresh (&lt;code&gt;prompt: ''&lt;/code&gt;) is truly invisible if the user's Google session cookie is still alive, which it typically is for a few weeks on the same browser.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;msUntilRefresh&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;expiry&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="nx"&gt;refreshTimer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expiresIn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;silentTokenRefresh&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="nf"&gt;scheduleTokenRefresh&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;expiresIn&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// reschedule for new token&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;msUntilRefresh&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the session stays alive indefinitely on the same browser, self-renewing every hour in the background. After a few weeks of inactivity, or if the user clears cookies, one re-login is required — that's Google's hard limit for apps without a backend, and there's no way around it without a server.&lt;/p&gt;

&lt;p&gt;I made peace with that.&lt;/p&gt;




&lt;h2&gt;
  
  
  Making It a Real App (PWA)
&lt;/h2&gt;

&lt;p&gt;A finance app you can only use at a desk defeats the purpose. I wanted it on my phone, feeling like a native app.&lt;/p&gt;

&lt;p&gt;Progressive Web Apps are underrated for this. With &lt;code&gt;vite-plugin-pwa&lt;/code&gt; and Workbox, I got:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Installable&lt;/strong&gt; on Android (Chrome → Add to Home Screen) and iOS (Safari → Share → Add to Home Screen)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offline shell&lt;/strong&gt; — all JS, CSS, fonts precached at install time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standalone display&lt;/strong&gt; — no browser chrome, full screen, dark status bar matching the app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Fonts cached&lt;/strong&gt; for a year (CacheFirst strategy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole PWA config is about 30 lines in &lt;code&gt;vite.config.ts&lt;/code&gt;. The service worker is generated at build time. It just works.&lt;/p&gt;

&lt;p&gt;One thing I learned: iOS is more aggressive than Android about clearing service worker caches if the app isn't used for a few weeks. Budget for that in your UX — don't assume cached assets are always there on iOS.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Features That Emerged
&lt;/h2&gt;

&lt;p&gt;I started with just transactions and a budget. Then I kept using the app and noticing things I wanted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Budget violation history&lt;/strong&gt; — not just "are you over this month" but "how many of the last 6 months did you blow this budget?" One line of text: &lt;code&gt;Over budget 3 of 6 months · avg 112% used&lt;/code&gt;. Simple, but surprisingly useful for spotting patterns you'd otherwise rationalize away month by month.&lt;/p&gt;
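
&lt;p&gt;A minimal sketch of how that one-liner could be computed. The field names &lt;code&gt;spent&lt;/code&gt; and &lt;code&gt;limit&lt;/code&gt; are assumptions, and so is the choice to average the per-month usage percentages rather than divide total spending by total budget.&lt;/p&gt;

```javascript
// Hypothetical input: one entry per month, e.g. { spent: 130, limit: 100 }.
function budgetHistory(months) {
  // Ignore months without a positive limit to avoid dividing by zero.
  const usable = months.filter((m) => m.limit > 0);
  if (usable.length === 0) return null;

  // A month counts as "over" only when spending exceeds the limit.
  const overCount = usable.filter((m) => m.spent > m.limit).length;

  // Average of the monthly usage ratios (spent / limit).
  const avgUsed =
    usable.reduce((sum, m) => sum + m.spent / m.limit, 0) / usable.length;

  return {
    overCount,
    total: usable.length,
    avgUsedPct: Math.round(avgUsed * 100),
  };
}

// Turn the summary into the single line of text shown in the UI.
function formatBudgetHistory(h) {
  return `Over budget ${h.overCount} of ${h.total} months · avg ${h.avgUsedPct}% used`;
}
```

&lt;p&gt;Feeding it the last six months of a category yields the summary object, and &lt;code&gt;formatBudgetHistory&lt;/code&gt; turns that into the display string.&lt;/p&gt;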

&lt;p&gt;&lt;strong&gt;Savings transfer exclusion&lt;/strong&gt; — early on, my savings rate calculation looked terrible. It was counting money I moved to my own investment account as an expense. Of course the rate looked bad — I was counting saving money as spending it. Fixed by tagging savings-destination categories and filtering them out of expense totals.&lt;/p&gt;
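
&lt;p&gt;In code the fix is a single filter. This sketch assumes a transaction shape of &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;category&lt;/code&gt; and &lt;code&gt;amount&lt;/code&gt;, made-up category names, and a savings rate defined as (income − spending) / income. None of these are taken from the app itself.&lt;/p&gt;

```javascript
// Assumed category names for money moved into the user's own savings.
const SAVINGS_CATEGORIES = new Set(["Investments", "Emergency Fund"]);

// Assumed transaction shape: { type: "income" | "expense", category, amount }.
function savingsRate(transactions) {
  let income = 0;
  let spending = 0;
  for (const t of transactions) {
    if (t.type === "income") {
      income += t.amount;
    } else if (!SAVINGS_CATEGORIES.has(t.category)) {
      // Transfers into savings categories are skipped, not counted as spending.
      spending += t.amount;
    }
  }
  // With no income the rate is undefined, so return null instead of dividing by zero.
  return income > 0 ? (income - spending) / income : null;
}
```

&lt;p&gt;With 1000 of income, 600 of rent and 300 moved to investments, this reports a 40% savings rate instead of the 10% you would get by treating the transfer as an expense.&lt;/p&gt;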

&lt;p&gt;&lt;strong&gt;Contextual predictions&lt;/strong&gt; — not ML, just arithmetic. Daily burn rate extrapolated to end of month. Three-month moving average per category to flag anomalies (↑23% this month vs average). Savings rate trend (last 3 months vs prior 3). These feel smart but the math is embarrassingly simple.&lt;/p&gt;
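
&lt;p&gt;To show how little math is involved, here is a sketch of the three calculations. Function names and input shapes are hypothetical. The inputs are plain numbers and arrays of monthly totals or rates, ordered oldest first.&lt;/p&gt;

```javascript
// Daily burn rate so far, extrapolated to the end of the month.
function projectMonthEnd(spentSoFar, dayOfMonth, daysInMonth) {
  if (dayOfMonth === 0) return null; // no days elapsed yet
  return (spentSoFar / dayOfMonth) * daysInMonth;
}

// Percent change of the current month vs the average of the previous three.
// `history` holds earlier monthly totals and excludes the current month.
function changeVsThreeMonthAvg(history, current) {
  const lastThree = history.slice(-3);
  if (lastThree.length === 0) return null;
  const avg = lastThree.reduce((a, b) => a + b, 0) / lastThree.length;
  if (avg === 0) return null; // a percent change from zero is undefined
  return Math.round(((current - avg) / avg) * 100);
}

// Savings rate trend: mean of the last 3 months minus mean of the prior 3.
function savingsRateTrend(rates) {
  if (6 > rates.length) return null; // needs at least six months of data
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return mean(rates.slice(-3)) - mean(rates.slice(-6, -3));
}
```

&lt;p&gt;For example, 300 spent by day 10 of a 30-day month projects to 900, and a category at 123 against a three-month average of 100 is flagged as 23% up.&lt;/p&gt;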

&lt;p&gt;&lt;strong&gt;No recurring transactions&lt;/strong&gt; — I started building a recurring rules engine that auto-populated transactions. Then I removed it. The problem: transactions appearing in your ledger without you explicitly adding them is unsettling in a finance app. Trust matters. The user should always know where every transaction came from.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use the Authorization Code flow with PKCE.&lt;/strong&gt; The implicit token flow I'm using works, but it never issues a refresh token, so the 1-hour access token expiry is a real constraint. With the Authorization Code flow plus PKCE you get a refresh token and can stay logged in indefinitely, but the code-for-token exchange needs a small server-side endpoint. A single Cloudflare Worker could handle it for pennies a month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IndexedDB instead of localStorage.&lt;/strong&gt; All data currently lives in &lt;code&gt;localStorage&lt;/code&gt;, which is synchronous (every read and write blocks the main thread) and is capped at roughly 5-10 MB per origin, depending on the browser. For a few hundred transactions it's fine. For a few years of data, it could get tight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better conflict resolution.&lt;/strong&gt; Right now every sync is a full push or a full pull. If you open the app on two devices at once and change data on both, the last write wins and the other device's edits are silently overwritten. That's tolerable for a single-user app, but fragile.&lt;/p&gt;
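
&lt;p&gt;One possible direction, sketched here rather than implemented in the app, is to merge per transaction instead of per dataset. The sketch assumes every record carries a stable &lt;code&gt;id&lt;/code&gt; and an &lt;code&gt;updatedAt&lt;/code&gt; millisecond timestamp, neither of which is confirmed to exist in the current data model.&lt;/p&gt;

```javascript
// Hypothetical record shape: { id, updatedAt, ...otherFields }.
function mergeTransactions(local, remote) {
  const byId = new Map();
  for (const record of [...local, ...remote]) {
    const existing = byId.get(record.id);
    // Keep whichever copy of a record was edited most recently.
    // On a tie the local copy stays, because local records are visited first.
    if (!existing || record.updatedAt > existing.updatedAt) {
      byId.set(record.id, record);
    }
  }
  return [...byId.values()];
}
```

&lt;p&gt;This is still last-write-wins, just at the level of a single transaction, so edits to different records on two devices both survive. It relies on device clocks roughly agreeing, and deletions would need a tombstone flag so that a removed record does not reappear from the other side.&lt;/p&gt;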




&lt;h2&gt;
  
  
  The Thing I Didn't Expect
&lt;/h2&gt;

&lt;p&gt;I thought the hard part would be the UI — charts, responsive layout, dark mode, themes. It wasn't. The UI was straightforward.&lt;/p&gt;

&lt;p&gt;The hard part was &lt;strong&gt;trust&lt;/strong&gt;. Specifically, designing every feature so that the user — me, primarily — could trust what the numbers meant. That meant being careful about what counted as income vs savings vs expense. It meant not auto-generating transactions. It meant keeping the data visible and inspectable in a spreadsheet anyone can open.&lt;/p&gt;

&lt;p&gt;A finance app that's clever but untrustworthy is worse than useless. Every design decision eventually came back to: &lt;em&gt;does this make the numbers more or less trustworthy?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That turned out to be a pretty good question to ask about software in general.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try it:&lt;/strong&gt; The app is live. &lt;a href="https://beingdpkpr.github.io/aryas-finance" rel="noopener noreferrer"&gt;Sign in with Google&lt;/a&gt;, and your spreadsheet is created in your own Drive — I never see it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ko-fi:&lt;/strong&gt; If it saves you money or time, &lt;a href="https://ko-fi.com/deepakprasad" rel="noopener noreferrer"&gt;a coffee is appreciated&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>privacy</category>
      <category>showdev</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
