<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Rafael Lourenço</title>
    <description>The latest articles on Forem by Rafael Lourenço (@kingjotaro).</description>
    <link>https://forem.com/kingjotaro</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1037857%2F516bd2ef-6ec4-4791-86af-6d5c2a80b956.jpg</url>
      <title>Forem: Rafael Lourenço</title>
      <link>https://forem.com/kingjotaro</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kingjotaro"/>
    <language>en</language>
    <item>
      <title>Setting Up a PySpark Cluster with Docker: Guide</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Thu, 02 May 2024 06:46:01 +0000</pubDate>
      <link>https://forem.com/kingjotaro/setting-up-a-pyspark-cluster-with-docker-guide-b35</link>
      <guid>https://forem.com/kingjotaro/setting-up-a-pyspark-cluster-with-docker-guide-b35</guid>
      <description>&lt;p&gt;Diving back into Python with a focus on expanding my knowledge in data processing, which has sparked my interest in creating a Proof of Concept (POC) for a data playground. With help of &lt;a class="mentioned-user" href="https://dev.to/caiocampoos"&gt;@caiocampoos&lt;/a&gt; we started this repo &lt;a href="https://github.com/caiocampoos/data-playground" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Even with almost zero knowledge of the area, we began discussing ideas about what we wanted to do. The main idea is to create knowledge through learning, and one of the best ways to learn is by building something. In this project, we're going to develop a basic data analysis platform and document the code throughout the process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Talk is cheap, let's code!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's start with PySpark. PySpark is essentially Apache Spark tailored to integrate smoothly with Python. While Apache Spark supports various languages, Python's dominance in the data science realm makes PySpark an ideal choice for our business-oriented project, so we'll proceed with PySpark to ensure seamless integration with our Python-centric workflow.&lt;/p&gt;
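
&lt;p&gt;To make the goal concrete, here is a minimal sketch of what using the cluster from Python will look like once everything is up. This is an illustration, not part of the build: it assumes the pyspark package is installed and a master is reachable at the URL we configure later in this post.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Minimal sketch (assumes pyspark is installed and a master is running)
from pyspark.sql import SparkSession

# Connect to the standalone master we will set up below
spark = (
    SparkSession.builder
    .appName("data-playground")
    .master("spark://spark-master:7077")
    .getOrCreate()
)

# A tiny DataFrame just to confirm the session works
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
df.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;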

&lt;p&gt;PySpark has a &lt;a href="https://hub.docker.com/r/apache/spark-py/tags" rel="noopener noreferrer"&gt;convenience Docker container image&lt;/a&gt;; you can also browse the &lt;a href="https://spark.apache.org/downloads.html" rel="noopener noreferrer"&gt;Apache Spark&lt;/a&gt; website to check other versions.&lt;/p&gt;

&lt;p&gt;We start our project by creating a Dockerfile with this code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;python:3.10-bullseye&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;spark-base&lt;/span&gt;
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; SPARK_VERSION=3.5.1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first line sets the base image to a Python 3.10 image on the Debian "Bullseye" distribution and names the build stage spark-base.&lt;/p&gt;

&lt;p&gt;The second line defines the version of Spark we're going to use; you may change that in the future.&lt;/p&gt;

&lt;p&gt;Next, we install the OS-level packages Spark needs, including OpenJDK 11, since Spark runs on the JVM.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;      &lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;      curl &lt;span class="se"&gt;\
&lt;/span&gt;      vim &lt;span class="se"&gt;\
&lt;/span&gt;      unzip &lt;span class="se"&gt;\
&lt;/span&gt;      rsync &lt;span class="se"&gt;\
&lt;/span&gt;      openjdk-11-jdk &lt;span class="se"&gt;\
&lt;/span&gt;      build-essential &lt;span class="se"&gt;\
&lt;/span&gt;      software-properties-common &lt;span class="se"&gt;\
&lt;/span&gt;      ssh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get clean &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the next lines of code, we set up the directories for our Spark and Hadoop installations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_HOME=${SPARK_HOME:-"/opt/spark"}&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; HADOOP_HOME=${HADOOP_HOME:-"/opt/hadoop"}&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HADOOP_HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${SPARK_HOME}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command downloads Spark (pre-built for Hadoop 3), extracts it into /opt/spark, and removes the archive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;RUN &lt;/span&gt;curl https://dlcdn.apache.org/spark/spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="nt"&gt;-o&lt;/span&gt; spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;tar &lt;/span&gt;xvzf spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="nt"&gt;--directory&lt;/span&gt; /opt/spark &lt;span class="nt"&gt;--strip-components&lt;/span&gt; 1 &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next lines of the Dockerfile start a new stage and install all the Python dependencies from a requirements file we'll create shortly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;spark-base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;pyspark&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements/requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we're going to set all the environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PATH="/opt/spark/sbin:/opt/spark/bin:${PATH}"&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER="spark://spark-master:7077"&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER_HOST spark-master&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER_PORT 7077&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PYSPARK_PYTHON python3&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next lines copy the default configuration file, make the Spark scripts executable, and add Spark's Python libraries to PYTHONPATH.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; conf/spark-defaults.conf "$SPARK_HOME/conf"&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;u+x /opt/spark/sbin/&lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;chmod &lt;/span&gt;u+x /opt/spark/bin/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we copy the entrypoint script and set it as the image's entrypoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; entrypoint.sh .&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["./entrypoint.sh"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the full Dockerfile.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;python:3.10-bullseye&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;spark-base&lt;/span&gt;

&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; SPARK_VERSION=3.5.1&lt;/span&gt;

&lt;span class="c"&gt;# Install tools required by the OS&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;      &lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;      curl &lt;span class="se"&gt;\
&lt;/span&gt;      vim &lt;span class="se"&gt;\
&lt;/span&gt;      unzip &lt;span class="se"&gt;\
&lt;/span&gt;      rsync &lt;span class="se"&gt;\
&lt;/span&gt;      openjdk-11-jdk &lt;span class="se"&gt;\
&lt;/span&gt;      build-essential &lt;span class="se"&gt;\
&lt;/span&gt;      software-properties-common &lt;span class="se"&gt;\
&lt;/span&gt;      ssh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get clean &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;


&lt;span class="c"&gt;# Setup the directories for our Spark and Hadoop installations&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_HOME=${SPARK_HOME:-"/opt/spark"}&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; HADOOP_HOME=${HADOOP_HOME:-"/opt/hadoop"}&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HADOOP_HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${SPARK_HOME}&lt;/span&gt;

&lt;span class="c"&gt;# Download and install Spark&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;curl https://dlcdn.apache.org/spark/spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="nt"&gt;-o&lt;/span&gt; spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;tar &lt;/span&gt;xvzf spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz &lt;span class="nt"&gt;--directory&lt;/span&gt; /opt/spark &lt;span class="nt"&gt;--strip-components&lt;/span&gt; 1 &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; spark-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SPARK_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="nt"&gt;-bin-hadoop3&lt;/span&gt;.tgz


&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;spark-base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;pyspark&lt;/span&gt;

&lt;span class="c"&gt;# Install python deps&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements/requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Setup Spark related environment variables&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PATH="/opt/spark/sbin:/opt/spark/bin:${PATH}"&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER="spark://spark-master:7077"&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER_HOST spark-master&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; SPARK_MASTER_PORT 7077&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PYSPARK_PYTHON python3&lt;/span&gt;

&lt;span class="c"&gt;# Copy the default configurations into $SPARK_HOME/conf&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; conf/spark-defaults.conf "$SPARK_HOME/conf"&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;u+x /opt/spark/sbin/&lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;chmod &lt;/span&gt;u+x /opt/spark/bin/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH&lt;/span&gt;

&lt;span class="c"&gt;# Copy appropriate entrypoint script&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; entrypoint.sh .&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["./entrypoint.sh"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need to create an entrypoint.sh file. This Bash script starts a different Apache Spark component depending on its first argument, which it stores in the SPARK_WORKLOAD variable. We use the &lt;strong&gt;--memory 1g&lt;/strong&gt; flag to limit the memory used by workers; you can change that if you want.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;SPARK_WORKLOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SPARK_WORKLOAD: &lt;/span&gt;&lt;span class="nv"&gt;$SPARK_WORKLOAD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SPARK_WORKLOAD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"master"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;then
  &lt;/span&gt;start-master.sh &lt;span class="nt"&gt;-p&lt;/span&gt; 7077
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SPARK_WORKLOAD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"worker"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;then
  &lt;/span&gt;start-worker.sh spark://spark-master:7077 &lt;span class="nt"&gt;--memory&lt;/span&gt; 1g
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SPARK_WORKLOAD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"history"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;then
  &lt;/span&gt;start-history-server.sh
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And a requirements folder with a requirements.txt file inside.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ipython
pandas
pyarrow
numpy
pyspark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our Docker Compose configuration is divided into three key components: the Spark Master, responsible for orchestrating the Spark cluster; the Spark History Server, which provides historical data on completed Spark applications; and the Spark Worker, representing a worker node within the cluster.&lt;/p&gt;

&lt;p&gt;Here is the complete docker-compose file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;version: '3.8'

services:
  spark-master:
    container_name: da-spark-master
    build: .
    image: da-spark-image
    entrypoint: ['./entrypoint.sh', 'master']
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:8080" ]
      interval: 5s
      timeout: 3s
      retries: 3
    volumes:
      - ./book_data:/opt/spark/data
      - ./spark_apps:/opt/spark/apps
      - spark-logs:/opt/spark/spark-events
    env_file:
      - .env.spark
    ports:
      - '9090:8080'
      - '7077:7077'


  spark-history-server:
    container_name: da-spark-history
    image: da-spark-image
    entrypoint: ['./entrypoint.sh', 'history']
    depends_on:
      - spark-master
    env_file:
      - .env.spark
    volumes:
      - spark-logs:/opt/spark/spark-events
    ports:
      - '18080:18080'

  spark-worker:
#    container_name: da-spark-worker
    image: da-spark-image
    entrypoint: ['./entrypoint.sh', 'worker']
    depends_on:
      - spark-master
    env_file:
      - .env.spark
    volumes:
      - ./book_data:/opt/spark/data
      - ./spark_apps:/opt/spark/apps
      - spark-logs:/opt/spark/spark-events


volumes:
  spark-logs:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also need to create a folder named 'conf' (the path the Dockerfile's COPY instruction expects) with a file named 'spark-defaults.conf' to store all configurations for Spark.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
spark.master                            spark://localhost:7077
spark.eventLog.enabled                  true
spark.eventLog.dir                      /opt/spark/spark-events
spark.history.fs.logDirectory           /opt/spark/spark-events

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And an ssh_config file with the following.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
Host *
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And a .env.spark file with the following; SPARK_NO_DAEMONIZE keeps the Spark processes in the foreground so the containers don't exit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SPARK_NO_DAEMONIZE=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally, our Makefile.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;
&lt;span class="nl"&gt;build&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker-compose build

&lt;span class="nl"&gt;build-nc&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker-compose build &lt;span class="nt"&gt;--no-cache&lt;/span&gt;

&lt;span class="nl"&gt;build-progress&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker-compose build &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="nt"&gt;--progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plain

&lt;span class="nl"&gt;down&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker-compose down &lt;span class="nt"&gt;--volumes&lt;/span&gt; &lt;span class="nt"&gt;--remove-orphans&lt;/span&gt;

&lt;span class="nl"&gt;run&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    make down &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker-compose up

&lt;span class="nl"&gt;run-scaled&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    make down &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker-compose up &lt;span class="nt"&gt;--scale&lt;/span&gt; spark-worker&lt;span class="o"&gt;=&lt;/span&gt;3 

&lt;span class="nl"&gt;run-d&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    make down &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="nl"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker-compose stop

&lt;span class="nl"&gt;submit&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker &lt;span class="nb"&gt;exec &lt;/span&gt;da-spark-master spark-submit &lt;span class="nt"&gt;--master&lt;/span&gt; spark://spark-master:7077 &lt;span class="nt"&gt;--deploy-mode&lt;/span&gt; client ./apps/&lt;span class="p"&gt;$(&lt;/span&gt;app&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nl"&gt;submit-da-book&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    make submit &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;data_analysis_book/&lt;span class="p"&gt;$(&lt;/span&gt;app&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nl"&gt;rm-results&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; book_data/results/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can build the image and run as many workers as we need!&lt;/p&gt;
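
&lt;p&gt;For reference, a typical workflow with these Makefile targets looks like this; the app filename below is a placeholder for whatever PySpark script you put in the spark_apps folder.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Build the image and start the cluster with 3 workers
make build
make run-scaled

# Submit a PySpark script from the spark_apps folder (example filename)
make submit app=my_job.py

# Stop everything and remove the volumes
make down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the port mappings in the compose file, the Spark master UI is available at http://localhost:9090 and the history server at http://localhost:18080.&lt;/p&gt;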

</description>
    </item>
    <item>
      <title>Just one more time</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Thu, 25 Jan 2024 14:27:55 +0000</pubDate>
      <link>https://forem.com/kingjotaro/just-one-more-time-d2l</link>
      <guid>https://forem.com/kingjotaro/just-one-more-time-d2l</guid>
      <description>&lt;p&gt;"Just one more state, bro! I promise this is the last one."&lt;/p&gt;

&lt;p&gt;Perhaps you've already heard a technobro (technology enthusiast) saying something like that. Well, that technobro is me, just mentioning one more challenge. I promise this is the last one.&lt;/p&gt;

&lt;p&gt;I'm back to the Woovi challenge again, and this time I'll try to make something simpler. I'll attempt to create a droplist wiki for a game fueled by players with a few moderators on Discord.&lt;/p&gt;

&lt;p&gt;The idea is pretty straightforward—a CRUD with GraphQL, Relay, and React. I'll use Next.js version 13 because I'm more familiar with it, and every new version of Next.js adds overhead to understand with tons of new documentation. So, I hope this time things work better.&lt;/p&gt;

&lt;p&gt;I have talked with people who want to use the wiki, and the design should be something very simple. The interface should be as simple as possible—perhaps this is the most minimalist project that I have done.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>The end of a project!</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Tue, 23 Jan 2024 17:30:26 +0000</pubDate>
      <link>https://forem.com/kingjotaro/the-end-of-a-project-3ib8</link>
      <guid>https://forem.com/kingjotaro/the-end-of-a-project-3ib8</guid>
      <description>&lt;p&gt;Today, I found myself pondering a question: When can we genuinely declare a project as complete? This question becomes even more complex in the programming world, where even minor projects can undergo refactoring over an extended duration. The reason is straightforward: with an expanding understanding, there’s always an opportunity for enhancing and optimizing your code.&lt;/p&gt;

&lt;p&gt;In my view, a project can be deemed complete when specific objectives are established. After all, we wouldn’t want to be perpetually coding one thing – the universe of programming is vast! Setting clear objectives not only steers the process but also serves as a roadmap to finish a project without any regrets.&lt;/p&gt;

&lt;p&gt;Now, if you find the outcome unsatisfactory after some time, don’t hesitate to establish new objectives. Just be wary not to plunge into coding without a plan; you might lose track of your initial motivation and end up feeling somewhat lost in the process.&lt;/p&gt;

&lt;p&gt;As for my latest project, the Leaky Bucket with Redis and Node, it’s complete, and I’m satisfied with it for the time being. Perhaps in the future, equipped with more knowledge and new tools, I’ll revisit and revamp it. But for now, I believe the initial objective has been accomplished.&lt;/p&gt;

&lt;p&gt;The next step is to return to the Woovi Challenge, but I’m contemplating doing something simpler than an e-commerce site. Since my friend has already established a page with no-code tools, I just need to think about what I can do. There are numerous problems in the world that need solutions, but your personal issues should be the top priority.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>learning</category>
      <category>newbie</category>
    </item>
    <item>
      <title>What I have learned in the last project</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Thu, 11 Jan 2024 22:02:48 +0000</pubDate>
      <link>https://forem.com/kingjotaro/what-i-have-learned-in-the-last-project-bi6</link>
      <guid>https://forem.com/kingjotaro/what-i-have-learned-in-the-last-project-bi6</guid>
      <description>&lt;p&gt;My last project is almost done, it's a Leaky Bucket algorithm with Node and Redis, and a simple front end showing requests and data. Here are a few things that I learned while building this.&lt;/p&gt;

&lt;p&gt;First of all, it's almost impossible to build things without a list of tasks. I tried skipping a to-do list in the beginning because I thought it would be faster, but I encountered a few problems that I had to think about. Most of the problems are easy to solve, but they appear in abundance when you are very inexperienced.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A to-do list can help you stay focused on what matters most.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The second thing I really learned is how easy it is to use tools that are well documented. Good documentation can be a real game-changer when choosing technologies. No matter how good a poorly documented technology is, at the end of the day you want to choose the simplest, best-documented tech.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The most valuable weapon is the one that you can use&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The third and last, but not least, is the importance of writing down what you have learned. Programming is not easy; it starts to become easy because you have practiced so many times. Writing can help you solidify that practice. You can write the same code three times or write it once and create a text explaining how and why you have done it. When you explain what you have done, you are learning more, because explaining something is the ultimate level of acquiring knowledge. If you don't know how to explain something, you really don't know that thing, you only kinda know.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You only really know when you can explain. If you don't explain, you kinda know.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>learning</category>
      <category>codenewbie</category>
    </item>
    <item>
      <title>How am I using AI to get the best of myself</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Thu, 28 Dec 2023 17:31:11 +0000</pubDate>
      <link>https://forem.com/kingjotaro/how-am-i-using-ai-to-get-the-best-of-myself-b9f</link>
      <guid>https://forem.com/kingjotaro/how-am-i-using-ai-to-get-the-best-of-myself-b9f</guid>
      <description>&lt;p&gt;In this text, I'll try to explain how I am using AI to bring out the best in myself, and I think I can start with one of my most significant weaknesses.&lt;/p&gt;

&lt;p&gt;Writing can be quite a challenge for me sometimes due to my dyslexia, and it's not uncommon to encounter misspellings in my personal handbook, like 'wrods.' Thankfully, with the help of AI, I don't struggle as much in the virtual world. All of my texts pass through AI to catch misspellings and errors. Thus, my most crucial and effective use of AI is in enhancing my writing.&lt;/p&gt;

&lt;p&gt;My second most important use of AI is for searching. The Bing search engine is actually more powerful than Google. It's easy for me to find anything on the internet using Bing. Even if I don't fully understand, Bing can summarize and assist me in the search.&lt;/p&gt;

&lt;p&gt;The third use case is the most frequent, although not necessarily the most important: I use AI to generate code blocks. I spend a significant amount of time writing and incorporating these code blocks into my projects. The reason for using AI is the convenience it provides in reviewing the meaning of the code block and identifying necessary corrections to achieve my desired outcome. Creating code blocks with AI not only streamlines the process but also saves a considerable amount of time.&lt;/p&gt;

&lt;p&gt;All three of these use cases are very important in my life for bringing out the best in myself. Without them, it would probably be much harder for me to be where I am right now.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>tutorial</category>
      <category>productivity</category>
      <category>learning</category>
    </item>
    <item>
      <title>I made a logical mistake</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Tue, 19 Dec 2023 14:51:24 +0000</pubDate>
      <link>https://forem.com/kingjotaro/i-made-a-logical-mistake-peh</link>
      <guid>https://forem.com/kingjotaro/i-made-a-logical-mistake-peh</guid>
      <description>&lt;p&gt;I made a logical mistake in my code, but that's okay. Making mistakes is entirely normal. In fact, every attempt involves a few failures, and my error is a straightforward logical resolution in a problem that involves a leaky bucket algorithm with Redis.&lt;/p&gt;

&lt;p&gt;Redis must keep a bucket to count how many tokens we have, and only API requests should reduce our tokens. However, we can only call the API if we have enough tokens to spend in our bucket. I made two errors here. First, my Redis calls are depleting the bucket too; second, all my calls go to the API first, and only after that are the tokens reduced. The implementation ideas are all mine, but once implemented, things just went wrong.&lt;/p&gt;
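
&lt;p&gt;A minimal, self-contained sketch of the corrected ordering, with in-memory stand-ins for Redis and the DICT API (all names here are illustrative, not from my actual code): cache reads never spend tokens, and the bucket is checked before the API call, not after.&lt;/p&gt;

```javascript
// In-memory stand-ins for Redis (cache) and the token bucket.
const cache = new Map();
let tokens = 5; // coins left in the bucket

async function callDictApi(key) {
  // stand-in for the real DICT API call
  return 'answer for ' + key;
}

async function handleQuery(key) {
  if (cache.has(key)) return cache.get(key);         // cache hit: free, no token spent
  if (tokens === 0) throw new Error('rate limited'); // check the bucket BEFORE calling
  tokens -= 1;                                       // only real API calls consume tokens
  const answer = await callDictApi(key);
  cache.set(key, answer);
  return answer;
}
```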

&lt;p&gt;Practical learning is not easy; you're going to be wrong most of the time. However, every mistake you solve contributes to solid knowledge. If you want to be one of the best, you should practice more.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>codenewbie</category>
      <category>career</category>
    </item>
    <item>
      <title>The Bubble</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Mon, 18 Dec 2023 12:47:22 +0000</pubDate>
      <link>https://forem.com/kingjotaro/the-bubble-281l</link>
      <guid>https://forem.com/kingjotaro/the-bubble-281l</guid>
      <description>&lt;p&gt;This weekend, I started thinking about what non-coding problems I could solve. After spending some time reflecting, I realized there are many things I can do to help solve problems. First of all, there's basic accountability. I just helped a friend who doesn't know much about managing money by creating a spreadsheet template for him. Then I realized he doesn't know how to search for things for his business, so I introduced him to the Bing search AI.&lt;/p&gt;

&lt;p&gt;I just realized there are so many people who don't use AI engines and others who don't even know about the existence of powerful AIs. And courses like 'How to use AI in your job' are starting to make sense to me right now. It's curious how we live in a bubble of knowledge, and sometimes reality breaks through that bubble.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Export Global Variables in Node.js</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Fri, 15 Dec 2023 15:27:01 +0000</pubDate>
      <link>https://forem.com/kingjotaro/how-to-export-global-variables-in-nodejs-3084</link>
      <guid>https://forem.com/kingjotaro/how-to-export-global-variables-in-nodejs-3084</guid>
      <description>&lt;p&gt;Imagine you are using Koa.js to create a route, and within that route, you retrieve a key that you intend to use in another router, function, or elsewhere. In such cases, you may want to use the easiest method to export that key for subsequent use.&lt;/p&gt;

&lt;p&gt;One of the easiest approaches is to export that key as a global variable. To do that, you may try weird things like I did:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/createkey&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ParameterizedContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getToken&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;access_token&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;createkey&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;access_token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;EX&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2399&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Key created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Key created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Trust me, that doesn't work: the imported key keeps the value it had when the module was first loaded. The easiest way to achieve this in Node.js is the global object: attach the value as a property, as shown below, and then use that property inside the route.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nb"&gt;global&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/createkey&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ParameterizedContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getToken&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nb"&gt;global&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;access_token&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;createkey&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tokenData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;access_token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;EX&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2399&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Key created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Key created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;global&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;


&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that object, you can access and modify your global variable in every part of your project.&lt;/p&gt;
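
&lt;p&gt;As a hypothetical two-step illustration (the names here are mine): set the property once, and any other function or module in the same Node.js process can read it, no import needed.&lt;/p&gt;

```javascript
// Set the property once, e.g. inside the route handler.
global.key = 'abc123';

function getKeyElsewhere() {
  // any other module in the same process sees the same value
  return global.key;
}
```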

</description>
    </item>
    <item>
      <title>Understanding the Leaky Bucket Problem with Redis</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Thu, 14 Dec 2023 12:45:14 +0000</pubDate>
      <link>https://forem.com/kingjotaro/understanding-the-leaky-bucket-problem-with-redis-35o</link>
      <guid>https://forem.com/kingjotaro/understanding-the-leaky-bucket-problem-with-redis-35o</guid>
      <description>&lt;p&gt;In my path to become a software engineer, I was challenged by my mentor &lt;a class="mentioned-user" href="https://dev.to/sibelius"&gt;@sibelius&lt;/a&gt; to demonstrate how I implement the leaky bucket with this &lt;a href="https://www.bcb.gov.br/content/estabilidadefinanceira/pix/API-DICT.html#section/Seguranca/Assinatura-digital" rel="noopener noreferrer"&gt;API&lt;/a&gt; documentation.&lt;/p&gt;

&lt;p&gt;To work on that, I started drawing on &lt;a href="https://excalidraw.com/" rel="noopener noreferrer"&gt;Excalidraw&lt;/a&gt; and came up with this design.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxvm68kivusmrab1qc04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxvm68kivusmrab1qc04.png" alt="leaky-bucket design api"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Understanding this picture is pretty simple; all the queries are going to hit Redis to check if the information already exists in memory. If the information exists in memory, we get the answer back. If not, we have to check if we have 'coins' in our bucket to make a query to the DICT API and get the answer back to the client. We then save that in Redis. If our bucket is empty, we don't call the DICT API and respond with a rate limit.&lt;/p&gt;

&lt;p&gt;This initial application will have a very simple calculation for bucket usage:&lt;/p&gt;

&lt;p&gt;Status 200: Remove 1 coin&lt;br&gt;
Status 400: Remove 20 coins&lt;br&gt;
Bucket refill: 2 coins per minute&lt;br&gt;
Bucket size: 100 for individuals and 1000 for legal entities&lt;/p&gt;
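
&lt;p&gt;The rules above could be sketched like this (a hypothetical sketch; the names and the lazy-refill structure are mine, not from the project):&lt;/p&gt;

```javascript
const BUCKET_SIZE = 100;     // 100 for individuals, 1000 for legal entities
const REFILL_PER_MINUTE = 2; // bucket refill: 2 coins per minute

// Refill lazily, based on how many whole minutes passed since the last refill.
function refill(bucket, nowMs) {
  const minutes = Math.floor((nowMs - bucket.lastRefillMs) / 60000);
  if (minutes > 0) {
    bucket.coins = Math.min(BUCKET_SIZE, bucket.coins + minutes * REFILL_PER_MINUTE);
    bucket.lastRefillMs = nowMs;
  }
  return bucket;
}

// Charge the bucket after an API response: status 200 costs 1 coin, 400 costs 20.
function charge(bucket, status) {
  const cost = status === 200 ? 1 : 20;
  bucket.coins = Math.max(0, bucket.coins - cost);
  return bucket;
}
```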

</description>
      <category>programming</category>
      <category>redis</category>
      <category>tutorial</category>
      <category>beginners</category>
    </item>
    <item>
      <title>The Hydra Knowledge Theorem</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Wed, 13 Dec 2023 13:42:57 +0000</pubDate>
      <link>https://forem.com/kingjotaro/the-hydra-knowledge-theorem-3dhk</link>
      <guid>https://forem.com/kingjotaro/the-hydra-knowledge-theorem-3dhk</guid>
      <description>&lt;p&gt;I start feeling something that I am considering good. I wake up every day with doubt, and I go to sleep with more doubts than I had when I woke up.&lt;/p&gt;

&lt;p&gt;So, it's like a Hydra, where you cut down one head, and two more sprout in its place. The path I'm taking in software development is always expanding my doubts.&lt;/p&gt;

&lt;p&gt;Yesterday, I woke up trying to figure out how to implement the leaky-bucket algorithm with Node and Redis, and I went to sleep with the doubt of how to configure and connect mutual TLS. The difficulties in programming seem to escalate every time I take a step up and they don't seem to have an end.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>codenewbie</category>
      <category>career</category>
    </item>
    <item>
      <title>What problem am I solving?</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Mon, 11 Dec 2023 14:26:58 +0000</pubDate>
      <link>https://forem.com/kingjotaro/what-problem-am-i-solving-29nd</link>
      <guid>https://forem.com/kingjotaro/what-problem-am-i-solving-29nd</guid>
      <description>&lt;p&gt;I read a tweet today and started asking myself, What problem am I solving?&lt;/p&gt;

&lt;p&gt;Solving other people's problems is the way to be valued. If I want to be highly valued, I should solve problems. The manner in which I solve them doesn't matter too much, whether a problem can be resolved with code or a call doesn't matter at the end of the day. As long as the problem is solved, the result should be the same.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Custom Paths for mocks Folder in Monorepo using Jest</title>
      <dc:creator>Rafael Lourenço</dc:creator>
      <pubDate>Fri, 08 Dec 2023 11:16:46 +0000</pubDate>
      <link>https://forem.com/kingjotaro/custom-paths-for-mocks-folder-in-monorepo-using-jest-omp</link>
      <guid>https://forem.com/kingjotaro/custom-paths-for-mocks-folder-in-monorepo-using-jest-omp</guid>
      <description>&lt;p&gt;Imagine you have a monorepo with tons of repositories. For each repository, you encounter the same issue: a need to maintain an identical mock in each repository. Whenever you update one, you must manually update all the others. This can result in a significant waste of your time.&lt;/p&gt;

&lt;p&gt;You have a good idea to place your mock at the root level for importing into all tests as needed. While this approach works, it introduces a new small problem: &lt;strong&gt;Your imports can become like Morse code&lt;/strong&gt;, especially if your tests are deeply nested.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6nga0mga1r9bk4rlrw3g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6nga0mga1r9bk4rlrw3g.png" alt="Morse Mock"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To solve that, we're going to create a custom path for your mocks folder inside the Jest configuration of each package in your monorepo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt; &lt;span class="nx"&gt;moduleNameMapper&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;^@mocks/(.*)$&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;&amp;lt;rootDir&amp;gt;/&amp;lt;your_path_to_your_mock_folder&amp;gt;/$1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now, we can import our mock inside each test using this syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mockAdd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mocks/math&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;


&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uses mockAdd to add 1 + 2 to equal 3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;mockAdd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the mock itself is exported like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mockAdd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;mockAdd&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your tests are correct, this should be the result.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcc5sq5z632jpo1jk0bkg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcc5sq5z632jpo1jk0bkg.png" alt="Test Pass"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
