<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ahmet Can Aydemir</title>
    <description>The latest articles on Forem by Ahmet Can Aydemir (@ahmetcanaydemir).</description>
    <link>https://forem.com/ahmetcanaydemir</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F765261%2Ff221a575-5ca4-4446-b29f-0ac8e23ec4ff.jpeg</url>
      <title>Forem: Ahmet Can Aydemir</title>
      <link>https://forem.com/ahmetcanaydemir</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ahmetcanaydemir"/>
    <language>en</language>
    <item>
      <title>How to Use Apache Airflow</title>
      <dc:creator>Ahmet Can Aydemir</dc:creator>
      <pubDate>Wed, 16 Mar 2022 08:26:26 +0000</pubDate>
      <link>https://forem.com/ahmetcanaydemir/how-to-use-apache-airflow-3jcg</link>
      <guid>https://forem.com/ahmetcanaydemir/how-to-use-apache-airflow-3jcg</guid>
      <description>&lt;p&gt;Airflow is an orchestration tool that ensures that tasks are running at the right time, in the correct order, and in the right way.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why You Need Apache Airflow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9803a05zin2dg81g3gs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9803a05zin2dg81g3gs.png" alt="ETL"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Imagine you have a data pipeline like the one above.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What if an error occurs in any of these stages? There may be an error in the API from which you are pulling the data, there may be an error while processing the data, or there may be an error while saving to the DB.&lt;/li&gt;
&lt;li&gt;If you have a lot of data pipelines like this, it will eventually become overwhelming.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Roughly, as in the example above, taking the data from a source and saving it to the target after certain operations are called ETL (Extract Transform Load). Such transactions can be managed in an advanced way by using Airflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benefits of Apache Airflow
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dynamic:&lt;/strong&gt; What can be done with Python also can be done with Airflow. As a result, Airflow provides tremendous dynamics when creating our tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scalable:&lt;/strong&gt; As many tasks as desired can be easily run in parallel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User Interface:&lt;/strong&gt; Airflow has a useful UI. Errors that occur in data pipelines and where they occur can be easily observed. Problematic tasks can be restarted etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extensible:&lt;/strong&gt; No need to wait for Airflow update when a new tool comes out. You can write your plugins and integrate them easily.&lt;/p&gt;

&lt;h3&gt;
  
  
  Apache Airflow’s Core Components
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqxxkmass3ie637tzjvd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqxxkmass3ie637tzjvd.png" alt="Airflow's Core Components"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Web Server:&lt;/strong&gt; A Flask server that serves the UI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scheduler:&lt;/strong&gt; The daemon that schedules the workflows. It is the most important component of Airflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Metastore:&lt;/strong&gt; Database where metadata is stored.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Executor:&lt;/strong&gt; Class that defines how tasks should work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Worker:&lt;/strong&gt; The process or sub-process executing the task.&lt;/p&gt;

&lt;h3&gt;
  
  
  DAG (Directed Acyclic Graph)
&lt;/h3&gt;

&lt;p&gt;A DAG (Directed Acyclic Graph) is the core concept of Airflow, collecting Tasks together, organized with dependencies and relationships to say how they should run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmldbmor2s91ffb7ho5z5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmldbmor2s91ffb7ho5z5.png" alt="DAG"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Example DAG above defines four Tasks - A, B, C, and D - and dictates the order in which they have to run, and which tasks depend on what others. It will also say how often to run the DAG - maybe “every 5 minutes starting tomorrow”, or “every day since January 1st, 2020”.&lt;/li&gt;
&lt;li&gt;The DAG itself doesn’t care about &lt;em&gt;what&lt;/em&gt; is happening inside the tasks; it is merely concerned with &lt;em&gt;how&lt;/em&gt; to execute them - the order to run them in, how many times to retry them, if they have timeouts, and so on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Basic DAG example&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_dag_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;schedule_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@daily&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DummyOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Operators
&lt;/h3&gt;

&lt;p&gt;Operators are wrappers that cover the task. By default, there are many different types of operators and can be viewed at &lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/index.html" rel="noopener noreferrer"&gt;this link&lt;/a&gt;. In addition, &lt;a href="https://airflow.apache.org/docs/apache-airflow-providers/operators-and-hooks-ref/index.html" rel="noopener noreferrer"&gt;much more&lt;/a&gt; can be added as needed.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Action Operators:&lt;/strong&gt; Operators which executes functions or commands eg. &lt;code&gt;bash scripts&lt;/code&gt;, &lt;code&gt;python code&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transfer Operators:&lt;/strong&gt; Operators for moving data from source to destination eg. &lt;code&gt;ElasticSearch to Mysql&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sensor Operators:&lt;/strong&gt; Operators which wait for something to happen before moving on to another task, eg. checking the file in the directory and continuing to the other task after that.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What Isn’t Airflow?
&lt;/h3&gt;

&lt;p&gt;Airflow is not a data streaming solution or data processing framework. If you need to process data every second, instead of using Airflow, Spark or Flink would be a better solution. If terabytes of data are being processed, it is recommended to run the Spark job with the operator in Airflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Apache Airflow Setup
&lt;/h2&gt;

&lt;p&gt;Although Airflow can be installed with Docker, Kubernetes or different methods, in this article, we will install it locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Airflow needs a home. `~/airflow` is the default, but you can put it
# somewhere else if you prefer (optional)
&lt;/span&gt;&lt;span class="n"&gt;export&lt;/span&gt; &lt;span class="n"&gt;AIRFLOW_HOME&lt;/span&gt;&lt;span class="o"&gt;=~/&lt;/span&gt;&lt;span class="n"&gt;airflow&lt;/span&gt;

&lt;span class="c1"&gt;# Install Airflow using the constraints file. For this example, we will use Python 3.6 and Airflow 2.2.4
&lt;/span&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apache-airflow==${AIRFLOW_VERSION}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;constraint&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://raw.githubusercontent.com/apache/airflow/constraints-2.2.4/constraints-3.6.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# The Standalone command will initialize the database, make a user,
# and start all components for you.
&lt;/span&gt;&lt;span class="n"&gt;airflow&lt;/span&gt; &lt;span class="n"&gt;standalone&lt;/span&gt;

&lt;span class="c1"&gt;# Visit localhost:8080 in the browser and use the admin account details
# shown on the terminal to login.
# Enable the example_bash_operator dag on the home page
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  First Pipeline with Apache Airflow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl4of3ly2k0c3zaqi00o9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl4of3ly2k0c3zaqi00o9.png" alt="Pipeline"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, let's create a data pipeline like the one in the figure using Airflow. For this, let's create a new python file in &lt;code&gt;~/airflow/dags&lt;/code&gt; directory eg: &lt;code&gt;user_processing.py&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Importing Modules
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json_normalize&lt;/span&gt;

&lt;span class="c1"&gt;# DAG object; We will need this to create DAG
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;

&lt;span class="c1"&gt;# Operators
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.operators.bash&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BashOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.operators.python&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PythonOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.sqlite.operators.sqlite&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SqliteOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.http.operators.http&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleHttpOperator&lt;/span&gt;

&lt;span class="c1"&gt;# Sensors
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.http.sensors.http&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HttpSensor&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Instantiate a DAG
&lt;/h3&gt;

&lt;p&gt;We’ll need a DAG object to nest our tasks into. Here we pass a string that defines the &lt;code&gt;dag_id&lt;/code&gt;, which serves as a unique identifier for your DAG. We also pass a default argument dictionary and define a &lt;code&gt;schedule_interval&lt;/code&gt; of 1 day for the DAG.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;default_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;start_date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_processing&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schedule_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@daily&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="c1"&gt;# Define tasks/operators
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Creating the SQLite Table
&lt;/h3&gt;

&lt;p&gt;We can use &lt;code&gt;SqliteOperator&lt;/code&gt; for this task. But &lt;code&gt;SqliteOperator&lt;/code&gt; will need a parameter named &lt;code&gt;conn_id&lt;/code&gt;, so first let's create a connection for SQLite DB.&lt;/p&gt;

&lt;p&gt;In Airflow, connections are kept in the metadata database. We will need to do the same for the &lt;code&gt;HTTPOperator&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can use UI to create a connection&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="http://localhost:8080/connection/add" rel="noopener noreferrer"&gt;http://localhost:8080/connection/add&lt;/a&gt; (&lt;strong&gt;Admin / Connections / +&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Type &lt;code&gt;Conn Id&lt;/code&gt; field a unique name e.g.: &lt;code&gt;db_sqlite&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;code&gt;Conn Type&lt;/code&gt; field as &lt;code&gt;SQLite&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;In the &lt;code&gt;Host&lt;/code&gt; field, type the path to the SQLite database file. In this example, we can use Airflow's metadata database instead of creating a new SQLite database file. &lt;code&gt;/home/airflow/airflow.db&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Save&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After this process, we can create the operator. Each operator must have a unique &lt;code&gt;task_id&lt;/code&gt;. SqliteOperator also has &lt;code&gt;sqlite_conn_id&lt;/code&gt; and &lt;code&gt;sql&lt;/code&gt; parameters. We will use the &lt;code&gt;db_sqlite&lt;/code&gt; as the sqlite_conn_id (see step 2).&lt;/p&gt;

&lt;p&gt;In SQL code; if the &lt;code&gt;users&lt;/code&gt; table does not already exist; It creates a &lt;code&gt;users&lt;/code&gt; table with fields &lt;code&gt;firstname&lt;/code&gt;, &lt;code&gt;lastname&lt;/code&gt;, &lt;code&gt;country&lt;/code&gt;, &lt;code&gt;username&lt;/code&gt;, &lt;code&gt;password&lt;/code&gt; and &lt;code&gt;email&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt; &lt;span class="n"&gt;creating_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SqliteOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;creating_table&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sqlite_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;db_sqlite&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
            CREATE TABLE IF NOT EXISTS users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
        &lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt; 
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Verify API
&lt;/h3&gt;

&lt;p&gt;We can add a sensor to check if there is a problem with the API by making an HTTP request. &lt;code&gt;HttpSensor&lt;/code&gt; will be useful for this, but &lt;code&gt;HttpSensor&lt;/code&gt; needs &lt;code&gt;http_conn_id&lt;/code&gt; parameter. So we have to create a new connection as we did in the previous step. Unlike before, this time we will create a connection with &lt;code&gt;HTTP&lt;/code&gt; type, not an &lt;code&gt;SQLite&lt;/code&gt; type.&lt;/p&gt;

&lt;p&gt;We can use UI to create a connection&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="http://localhost:8080/connection/add" rel="noopener noreferrer"&gt;http://localhost:8080/connection/add&lt;/a&gt; (&lt;strong&gt;Admin / Connections / +&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Type &lt;code&gt;Conn Id&lt;/code&gt; field a unique name e.g.: &lt;code&gt;user_api&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;code&gt;Conn Type&lt;/code&gt; field as &lt;code&gt;HTTP&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;In the &lt;code&gt;Host&lt;/code&gt; field, enter the address of the API that brings random users. eg. &lt;a href="https://randomuser.me/" rel="noopener noreferrer"&gt;https://randomuser.me/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Save&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now we can create the sensor. Since the API we use works at &lt;code&gt;[https://randomuser.me/api/](https://randomuser.me/api)&lt;/code&gt;, we used &lt;code&gt;api/&lt;/code&gt; in the endpoint parameter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;is_api_available&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpSensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;is_api_available&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;http_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_api&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Retrieve Random Users from API
&lt;/h3&gt;

&lt;p&gt;With using &lt;code&gt;SimpleHttpOperator&lt;/code&gt; we send a &lt;code&gt;GET&lt;/code&gt; HTTP request to the &lt;code&gt;user_api&lt;/code&gt; connection we created before. With &lt;code&gt;response_filter&lt;/code&gt; we convert the JSON string in the &lt;code&gt;response.text&lt;/code&gt; field into a python object.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt; &lt;span class="n"&gt;extracting_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleHttpOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extracting_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;http_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_api&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;GET&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;response_filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;log_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Processing the API Result
&lt;/h3&gt;

&lt;p&gt;With using &lt;code&gt;PythonOperator&lt;/code&gt;, we will keep only the desired fields of the python object that we obtained in step 5. And we will save the result as csv file.&lt;/p&gt;

&lt;p&gt;Any python function can be run directly with the &lt;code&gt;python_callable&lt;/code&gt; parameter.&lt;/p&gt;

&lt;p&gt;The important part here is how we will get the response from the previous section. For this, we can send the &lt;code&gt;task instance (ti)&lt;/code&gt; object as a parameter to the python function and get the results of the &lt;code&gt;task_ids&lt;/code&gt; with &lt;code&gt;users = ti.xcom_pull(task_ids=['extracting_user'])&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;processing_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;processing_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;python_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_processing_user&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Somewhere outside the scope of DAG
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_processing_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extracting_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;User is empty&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;processed_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;json_normalize&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;firstname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lastname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;login&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;password&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;login&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;password&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;processed_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/tmp/processed_user.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Saving Processed Users to SQLite DB
&lt;/h3&gt;

&lt;p&gt;We can use &lt;code&gt;BashOperator&lt;/code&gt; for trying new operators. &lt;code&gt;BashOperator&lt;/code&gt; allows us to run bash commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;storing_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BashOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;storing_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;bash_command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;echo -e &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.separator &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;.import /tmp/processed_user.csv users&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | sqlite3 /home/airflow/airflow/airflow.db&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8. Relationship Between Tasks
&lt;/h3&gt;

&lt;p&gt;Simply, the direction of the tasks and their connection with each other can be defined as follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Somewhere outside the scope of DAG
&lt;/span&gt;&lt;span class="n"&gt;creating_table&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;is_api_available&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;extracting_user&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;processing_user&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;storing_user&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Final DAG File
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json_normalize&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.operators.bash&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BashOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.operators.python&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PythonOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.sqlite.operators.sqlite&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SqliteOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.http.operators.http&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleHttpOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.providers.http.sensors.http&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HttpSensor&lt;/span&gt;

&lt;span class="n"&gt;default_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;start_date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_processing_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extracting_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;User is empty&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;processed_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;json_normalize&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;firstname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lastname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;login&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;password&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;login&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;password&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;processed_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/tmp/processed_user.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_processing&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schedule_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@daily&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Define tasks/operators
&lt;/span&gt;
    &lt;span class="n"&gt;creating_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SqliteOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;creating_table&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sqlite_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;db_sqlite&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
            CREATE TABLE IF NOT EXISTS users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
        &lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt; 
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;is_api_available&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpSensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;is_api_available&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;http_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_api&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;extracting_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleHttpOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extracting_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;http_conn_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user_api&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;GET&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;response_filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;log_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;processing_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;processing_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;python_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_processing_user&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;storing_user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BashOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;storing_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;bash_command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;echo -e &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.separator &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;.import /tmp/processed_user.csv users&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | sqlite3 /home/airflow/airflow/airflow.db&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;creating_table&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;is_api_available&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;extracting_user&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;processing_user&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;storing_user&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;To understand the concept, we have defined a simple and parallelism-free data pipeline. Much more complex, parallel tasks also can be created using Airflow. Thank you for reading!&lt;/p&gt;

&lt;h2&gt;
  
  
  Screenshots
&lt;/h2&gt;

&lt;p&gt;
  Show Screenshots
  &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3a0r96acrkak6cfceck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3a0r96acrkak6cfceck.png" alt="Graph View"&gt;&lt;/a&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgc2uehcap4a2azidlppc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgc2uehcap4a2azidlppc.png" alt="Tree View"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj68phdyzzbll0wfx7q6q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj68phdyzzbll0wfx7q6q.png" alt="Gantt"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vjunmrg6fd6gq0gg5wi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vjunmrg6fd6gq0gg5wi.png" alt="Log"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6rb1hz3e4qs1pmdpc53.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6rb1hz3e4qs1pmdpc53.png" alt="Homepage"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.udemy.com/course/the-complete-hands-on-course-to-master-apache-airflow/" rel="noopener noreferrer"&gt;The Complete Hands-On Introduction to Apache Airflow by Marc Lamberti&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://airflow.apache.org/docs/" rel="noopener noreferrer"&gt;Apache Airflow Docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>airflow</category>
      <category>apache</category>
      <category>orchestration</category>
      <category>etl</category>
    </item>
    <item>
      <title>KrakenD Monitoring with Grafana</title>
      <dc:creator>Ahmet Can Aydemir</dc:creator>
      <pubDate>Mon, 17 Jan 2022 12:16:21 +0000</pubDate>
      <link>https://forem.com/ahmetcanaydemir/krakend-monitoring-with-grafana-5b6b</link>
      <guid>https://forem.com/ahmetcanaydemir/krakend-monitoring-with-grafana-5b6b</guid>
      <description>&lt;p&gt;KrakenD is an ultra-performant open-source Gateway that can transform, aggregate, or remove data from multiple services, with linear scalability.&lt;/p&gt;

&lt;p&gt;I will explain how to monitor the status of KrakenD services with the Grafana dashboard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuit8k4zbtgrh648g7wac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuit8k4zbtgrh648g7wac.png" alt="KrakenD Grafana Dashboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of creating a dashboard from scratch, we can use one of the pre-configured dashboards available on grafana.com. In this article, we will use the KrakenD dashboard with ID 5722, prepared by &lt;code&gt;dlopez&lt;/code&gt;. If you want, you can customize the dashboard according to your needs.&lt;/p&gt;

&lt;p&gt;By default, this dashboard provides us with the following metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests from users to KrakenD&lt;/li&gt;
&lt;li&gt;Requests from KrakenD to your backends&lt;/li&gt;
&lt;li&gt;Response times&lt;/li&gt;
&lt;li&gt;Memory usage and details&lt;/li&gt;
&lt;li&gt;Endpoints and status codes&lt;/li&gt;
&lt;li&gt;Heatmaps&lt;/li&gt;
&lt;li&gt;Open connections&lt;/li&gt;
&lt;li&gt;Throughput&lt;/li&gt;
&lt;li&gt;Distributions, timers, garbage collection, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;KrakenD&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  InfluxDB Setup
&lt;/h2&gt;

&lt;p&gt;The dashboard we will use uses InfluxDB to read metric data. You can easily run InfluxDB with Docker.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8086:8086 &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;INFLUXDB_DB&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;krakend &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;INFLUXDB_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;myusername &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;INFLUXDB_USER_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mypassword &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;INFLUXDB_ADMIN_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;admin &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;INFLUXDB_ADMIN_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;myadminpassword &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;influx-1.8 &lt;span class="se"&gt;\ &lt;/span&gt;
  influxdb:1.8 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; influx-1.8 /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Grafana Setup
&lt;/h2&gt;

&lt;p&gt;If Grafana is not installed, you run it with Docker. After the following command, Grafana will be running at &lt;code&gt;http://localhost:3000&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 &lt;span class="se"&gt;\ &lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;grafana &lt;span class="se"&gt;\ &lt;/span&gt;
  grafana/grafana 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  KrakenD Configuration
&lt;/h2&gt;

&lt;p&gt;Add the following configuration to your krakend.json at the root level. After this, your KrakenD metrics will start saving to InfluxDB at &lt;code&gt;&amp;lt;your-influx-db-server-ip&amp;gt;:8086&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"extra_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github_com/letgoapp/krakend-influx"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"http://&amp;lt;your-influx-db-server-ip&amp;gt;:8086"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ttl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"25s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"buffer_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github_com/devopsfaith/krakend-metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"collection_time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"30s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"listen_address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"127.0.0.1:8090"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Importing the Grafana Dashboard
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Go to the browser and open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Use &lt;code&gt;admin&lt;/code&gt; for both username and password&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;code&gt;Configuration&lt;/code&gt; from the side menu and find the button to add the data source. Select InfluxDB as the database and fill in the details you provided when starting InfluxDB:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Query Language: &lt;code&gt;InfluxQL&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;URL: &lt;code&gt;http://localhost:8086&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Access: &lt;code&gt;Browser&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Database: &lt;code&gt;krakend&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;User: &lt;code&gt;admin&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Password: &lt;code&gt;myadminpassword&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;HTTP Method: &lt;code&gt;GET&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;To import the dashboard: Click the &lt;code&gt;+&lt;/code&gt; icon in the side menu from the Grafana interface and then click &lt;code&gt;Import&lt;/code&gt;. Select &lt;code&gt;Import via Grafana.com&lt;/code&gt;. Enter &lt;code&gt;5722&lt;/code&gt; as the ID and click &lt;code&gt;Load&lt;/code&gt;. Your dashboard is ready, enjoy it.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://grafana.com/grafana/dashboards/5722" rel="noopener noreferrer"&gt;KrakenD Grafana Dashboard on Grafana.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.krakend.io/docs/extended-metrics/influxdb/" rel="noopener noreferrer"&gt;Native InfluxDB exporter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.krakend.io/docs/extended-metrics/grafana/" rel="noopener noreferrer"&gt;Preconfigured Grafana dashboard&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>krakend</category>
      <category>grafana</category>
      <category>influxdb</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Elasticsearch Snapshots to Minio</title>
      <dc:creator>Ahmet Can Aydemir</dc:creator>
      <pubDate>Thu, 23 Dec 2021 18:23:05 +0000</pubDate>
      <link>https://forem.com/ahmetcanaydemir/elasticsearch-snapshots-to-minio-54ob</link>
      <guid>https://forem.com/ahmetcanaydemir/elasticsearch-snapshots-to-minio-54ob</guid>
      <description>&lt;p&gt;Minio is a high-performance, self-hosted AWS S3 compatible object storage solution.&lt;/p&gt;

&lt;p&gt;I will explain how we can store Elasticsearch snapshots in Minio.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Minio&lt;/li&gt;
&lt;li&gt;Elasticsearch&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Creating Bucket
&lt;/h2&gt;

&lt;p&gt;First, we need a bucket to store Easticsearch snapshots.&lt;/p&gt;

&lt;p&gt;You can create a bucket at Minio UI Console. Click &lt;strong&gt;Buckets&lt;/strong&gt; from the left menu and click &lt;strong&gt;Create Bucket&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LZNTidDS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/iippohmwlinkwrcxern2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LZNTidDS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/iippohmwlinkwrcxern2.png" alt="Creating Bucket" width="880" height="395"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating a Service Account
&lt;/h2&gt;

&lt;p&gt;A service account is also required to establish a Minio connection with Elasticsearch.&lt;/p&gt;

&lt;p&gt;You can create a service account at Minio UI Console. Click &lt;strong&gt;Service Accounts&lt;/strong&gt; from the left menu and click &lt;strong&gt;Create Service Account&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Save the &lt;strong&gt;Secret Key&lt;/strong&gt; and &lt;strong&gt;Access Key&lt;/strong&gt;, we will use this information during the plugin installation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GUlTJ_7c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8ejh1gjw6wu1nw53ds1s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GUlTJ_7c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8ejh1gjw6wu1nw53ds1s.png" alt="Creating Service Account" width="880" height="505"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Elasticsearch S3 Repository Plugin Installation
&lt;/h2&gt;

&lt;p&gt;Since Minio is compatible with Amazon AWS S3, we will use the S3 repository plugin. To install the plugin, connect to all Elasticsearch nodes and follow the steps below.&lt;/p&gt;

&lt;p&gt;First of all, install &lt;code&gt;repository-s3&lt;/code&gt; plugin with following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/usr/share/elasticsearch/bin/elasticsearch-plugin &lt;span class="nb"&gt;install &lt;/span&gt;repository-s3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An S3 client named &lt;code&gt;default&lt;/code&gt; will be created after the plugin is installed. Actions to be taken can be performed by this client. We will define the &lt;code&gt;access_key&lt;/code&gt; and &lt;code&gt;secret_key&lt;/code&gt; information to the default client for accessing Minio.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.default.secret_key
&lt;span class="c"&gt;# Waiting for secret key&lt;/span&gt;

/usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.default.access_key
&lt;span class="c"&gt;# Waiting for access key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this stage, the Elasticsearch service should be restarted.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl restart elasticsearch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Adding Elasticsearch Repository
&lt;/h2&gt;

&lt;p&gt;A repository named &lt;code&gt;my-minio-repository&lt;/code&gt; can be created by sending the following request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;PUT _snapshot/my-minio-repository &lt;span class="o"&gt;{&lt;/span&gt;
   &lt;span class="s2"&gt;"type"&lt;/span&gt;: &lt;span class="s2"&gt;"s3"&lt;/span&gt;,
   &lt;span class="s2"&gt;"settings"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;"client"&lt;/span&gt;: &lt;span class="s2"&gt;"default"&lt;/span&gt;,
      &lt;span class="s2"&gt;"bucket"&lt;/span&gt;: &lt;span class="s2"&gt;"elasticsearch-snapshots"&lt;/span&gt;,
      &lt;span class="s2"&gt;"endpoint"&lt;/span&gt;: &lt;span class="s2"&gt;"minio.example.com"&lt;/span&gt;,
      &lt;span class="s2"&gt;"path_style_access"&lt;/span&gt;: &lt;span class="s2"&gt;"true"&lt;/span&gt;,
      &lt;span class="s2"&gt;"protocol"&lt;/span&gt;: &lt;span class="s2"&gt;"http"&lt;/span&gt;
   &lt;span class="o"&gt;}&lt;/span&gt; 
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Manual Snapshot
&lt;/h2&gt;

&lt;p&gt;A snapshot named &lt;code&gt;my-first-snapshot&lt;/code&gt; can be taken manually by sending the following request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;PUT /_snapshot/my-minio-repository/my-first-snapshot?wait_for_completion&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Taking Snapshots with Policy
&lt;/h2&gt;

&lt;p&gt;If desired, the snapshot process can be automated using &lt;code&gt;snapshot policy&lt;/code&gt;. For example, the policy below will save snapshots of all indexes to the Minio repository at 07:30 every day.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;PUT _slm/policy/daily-snapshots
&lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"&amp;lt;daily-snapshot-{now/d}&amp;gt;"&lt;/span&gt;,
    &lt;span class="s2"&gt;"schedule"&lt;/span&gt;: &lt;span class="s2"&gt;"0 30 7 * * ?"&lt;/span&gt;,
    &lt;span class="s2"&gt;"repository"&lt;/span&gt;: &lt;span class="s2"&gt;"my-minio-repository"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Restoring snapshots
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;my-first-snapshot&lt;/code&gt; backup can be restored with the following request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST /_snapshot/my-minio-repository/my-first-snapshot/_restore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html"&gt;Snapshot and restore&lt;/a&gt;&lt;/p&gt;

</description>
      <category>elasticsearch</category>
      <category>minio</category>
      <category>backup</category>
      <category>snapshot</category>
    </item>
    <item>
      <title>GitLab CI: Update MR With Jira Issue Id</title>
      <dc:creator>Ahmet Can Aydemir</dc:creator>
      <pubDate>Thu, 02 Dec 2021 16:50:36 +0000</pubDate>
      <link>https://forem.com/ahmetcanaydemir/gitlab-ci-update-mr-with-jira-issue-id-3amp</link>
      <guid>https://forem.com/ahmetcanaydemir/gitlab-ci-update-mr-with-jira-issue-id-3amp</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/ahmetcanaydemir/git-hooks-enforce-commit-message-and-branch-names-576d"&gt;previous article&lt;/a&gt;, we enforced branch names and commit messages to certain rules. Now I will go a step further. When an MR is opened, if there is a Jira issue id (eg. ISSUE-0000) in the title, we will pull the Jira title and description and update MR. CI will fail if there is no Jira Issue Id in the title.&lt;/p&gt;




&lt;p&gt;To do this, we can briefly follow these steps.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check if there is an issue id in the MR title. Otherwise, return fail exit code. We will use regex for this. &lt;code&gt;$CI_MERGE_REQUEST_TITLE&lt;/code&gt; will give us the MR header.&lt;/li&gt;
&lt;li&gt;Extract Issue Id and pull data from Jira API. When you make a GET request to the &lt;code&gt;https://jira.&amp;lt;your-hostname&amp;gt;/rest/api/2/issue/ISSUE-0000&lt;/code&gt; Jira address, Issue details will be returned to you as JSON.&lt;/li&gt;
&lt;li&gt;Take the &lt;code&gt;summary&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; parts from the JSON data returned by Jira API with &lt;code&gt;jq&lt;/code&gt; and assign them to variables.&lt;/li&gt;
&lt;li&gt;Update MR's title and description using GitLab API.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Curl ve jq Setup
&lt;/h2&gt;

&lt;p&gt;We need &lt;code&gt;curl&lt;/code&gt; for our API requests and &lt;code&gt;jq&lt;/code&gt; to parse JSON.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apk add &lt;span class="nt"&gt;--update&lt;/span&gt; curl &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apk add &lt;span class="nt"&gt;--update&lt;/span&gt; jq &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/cache/apk/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Issue Id Check
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"^(ISSUE-[0-9]+)"&lt;/span&gt;
&lt;span class="nv"&gt;ISSUE_ID_IN_MR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CI_MERGE_REQUEST_TITLE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID_IN_MR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then &lt;/span&gt;&lt;span class="nb"&gt;exit &lt;/span&gt;1&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;fi&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Jira Issue Parsing
&lt;/h2&gt;

&lt;p&gt;We store new lines (eg. &lt;code&gt;\n&lt;/code&gt;) as characters in &lt;code&gt;DESCRIPTION_JSON&lt;/code&gt; variable because newlines are the problem when sending curl requests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;JSON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; &amp;lt;jira-username&amp;gt;:&amp;lt;jira-password&amp;gt; &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type:application/json"&lt;/span&gt; &lt;span class="s2"&gt;"https://jira.&amp;lt;your-hostname&amp;gt;/rest/api/2/issue/&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID_IN_MR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;SUMMARY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$JSON&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;".fields.summary"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DESCRIPTION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$JSON&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;".fields.description"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DESCRIPTION_JSON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="nt"&gt;--null-input&lt;/span&gt; &lt;span class="nt"&gt;--compact-output&lt;/span&gt; &lt;span class="nt"&gt;--arg&lt;/span&gt; msg &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DESCRIPTION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s1"&gt;'$msg'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Updating Merge Request
&lt;/h2&gt;

&lt;p&gt;You need a private token for this &lt;a href="https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html"&gt;check GitLab documentation&lt;/a&gt; and learn how to do that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ISSUE_ID_IN_MR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SUMMARY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\"\"&lt;/span&gt;&lt;span class="s2"&gt;description&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DESCRIPTION_JSON&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--fail&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type:application/json; charset=utf-8"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"PRIVATE-TOKEN:&amp;lt;private-token&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--output&lt;/span&gt; &lt;span class="s2"&gt;"/dev/null"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--request&lt;/span&gt; PUT &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--show-error&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--silent&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--trace-ascii&lt;/span&gt; &lt;span class="s2"&gt;"/dev/stderr"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--write-out&lt;/span&gt; &lt;span class="s2"&gt;"HTTP response: %{http_code}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CI_API_V4_URL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/projects/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CI_PROJECT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/merge_request&lt;/span&gt;&lt;span class="nv"&gt;$CI_MERGE_REQUEST_IID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Result
&lt;/h2&gt;

&lt;p&gt;When we combine all this into one GitLab CI yml file it will look like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;check_mr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alpine&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apk add --update curl &amp;amp;&amp;amp; apk add --update jq &amp;amp;&amp;amp; rm -rf /var/cache/apk/*&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;IFS=$"\n"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REGEX_ISSUE_ID="^(ISSUE-[0-9]+)"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ISSUE_ID_IN_MR=$(echo "$CI_MERGE_REQUEST_TITLE" | grep -o -E "$REGEX_ISSUE_ID")&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;if [[ -z "$ISSUE_ID_IN_MR" ]]; then exit 1; fi&lt;/span&gt; 
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;JSON=$(curl -s -u &amp;lt;jira-username&amp;gt;:&amp;lt;jira-password&amp;gt; -X GET -H "Content-Type:application/json" "https://jira.&amp;lt;your-hostname&amp;gt;/rest/api/2/issue/$ISSUE_ID_IN_MR")&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;SUMMARY=$(echo "$JSON" | jq -r ".fields.summary")&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;DESCRIPTION=$(echo "$JSON" | jq -r ".fields.description")&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;DESCRIPTION_JSON=$(jq --null-input --compact-output --arg msg "$DESCRIPTION" '$msg')&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;curl -v \&lt;/span&gt;
      &lt;span class="s"&gt;--data "{\"title\": \"${ISSUE_ID_IN_MR} ${SUMMARY}\", \"description\": ${DESCRIPTION_JSON}}" \&lt;/span&gt;
      &lt;span class="s"&gt;--fail \&lt;/span&gt;
      &lt;span class="s"&gt;--header "Content-Type:application/json; charset=utf-8" \&lt;/span&gt;
      &lt;span class="s"&gt;--header "PRIVATE-TOKEN:&amp;lt;private-token&amp;gt;" \&lt;/span&gt;
      &lt;span class="s"&gt;--output "/dev/null" \&lt;/span&gt;
      &lt;span class="s"&gt;--request PUT \&lt;/span&gt;
      &lt;span class="s"&gt;--show-error \&lt;/span&gt;
      &lt;span class="s"&gt;--silent \&lt;/span&gt;
      &lt;span class="s"&gt;--trace-ascii "/dev/stderr" \&lt;/span&gt;
      &lt;span class="s"&gt;--write-out "HTTP response: %{http_code}\n\n" \&lt;/span&gt;
      &lt;span class="s"&gt;"${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/$CI_MERGE_REQUEST_IID"&lt;/span&gt;
    &lt;span class="na"&gt;only&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;merge_requests&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;You should add this file you created to your GitLab project as a CI file. After all these processes, your MR titles and descriptions will now be much more organized and understandable!&lt;/p&gt;

</description>
      <category>gitlab</category>
      <category>ci</category>
      <category>jira</category>
      <category>mr</category>
    </item>
    <item>
      <title>Git Hooks: Enforce Commit Message and Branch Name</title>
      <dc:creator>Ahmet Can Aydemir</dc:creator>
      <pubDate>Wed, 01 Dec 2021 19:05:48 +0000</pubDate>
      <link>https://forem.com/ahmetcanaydemir/git-hooks-enforce-commit-message-and-branch-names-576d</link>
      <guid>https://forem.com/ahmetcanaydemir/git-hooks-enforce-commit-message-and-branch-names-576d</guid>
      <description>&lt;p&gt;If you want to standardize things among your team, you may need to enforce strict rules. Some examples of rules might be for git are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Branches must be created as ISSUE-0000/feature-name.&lt;/li&gt;
&lt;li&gt;All commit messages must start as ISSUE-0000 (Jira issue id).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this post, I will show you how we can add the two rules I mentioned using git hooks.&lt;/p&gt;




&lt;p&gt;There are multiple &lt;a href="https://git-scm.com/docs/githooks"&gt;git hooks&lt;/a&gt; that work in different stages. We will use two client-side hooks for our rules.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;pre-commit:&lt;/strong&gt; This hook is invoked by git-commit, and can be bypassed with the &lt;code&gt;--no-verify&lt;/code&gt; option. We will use this hook for checking the branch name before committing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;commit-msg:&lt;/strong&gt; This hook is invoked by git-commit and git-merge, and can be bypassed with the &lt;code&gt;--no-verify&lt;/code&gt; option. It takes a single parameter, the name of the file that holds the proposed commit message. We will use this hook for checking the commit message. And we will modify the commit message if needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Creating Hooks Folder
&lt;/h2&gt;

&lt;p&gt;Go to your home directory and create a folder named &lt;code&gt;hooks&lt;/code&gt;. And create hook files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; ~/hooks
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/hooks

&lt;span class="nb"&gt;touch &lt;/span&gt;pre-commit
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x pre-commit

&lt;span class="nb"&gt;touch &lt;/span&gt;commit-msg
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x commit-msg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Restricting Branch Names
&lt;/h2&gt;

&lt;p&gt;We will accept branch names as ISSUE-0000/feature-name but we have to consider also other branch names like &lt;code&gt;main&lt;/code&gt;, &lt;code&gt;develop&lt;/code&gt;, &lt;code&gt;release&lt;/code&gt; etc. For this, we can use the following regular expression.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"^(ISSUE-[0-9]+&lt;/span&gt;&lt;span class="se"&gt;\/&lt;/span&gt;&lt;span class="s2"&gt;([a-zA-Z0-9]|-)+|develop|main|release"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also need the branch name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git rev-parse &lt;span class="nt"&gt;--abbrev-ref&lt;/span&gt; HEAD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it! We have regex and we have branch name. We will check for branch name starts with &lt;code&gt;ISSUE-0000/feature-name&lt;/code&gt; or &lt;code&gt;develop&lt;/code&gt; or &lt;code&gt;main&lt;/code&gt; or &lt;code&gt;release&lt;/code&gt;. If our check fails we will echo our error message and exit with &lt;code&gt;exit code 1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pre-commit&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"^(ISSUE-[0-9]+&lt;/span&gt;&lt;span class="se"&gt;\/&lt;/span&gt;&lt;span class="s2"&gt;([a-zA-Z0-9]|-)+|develop|main|release"&lt;/span&gt;

&lt;span class="nv"&gt;ISSUE_ID_IN_BRANCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;git rev-parse &lt;span class="nt"&gt;--abbrev-ref&lt;/span&gt; HEAD&lt;span class="si"&gt;)&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID_IN_BRANCH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[pre-commit-hook] Your branch name is illegal. Please rename your branch with using following regex: &lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Restricting Commit Messages
&lt;/h2&gt;

&lt;p&gt;Steps are similar to restricting branch names. But if there is no issue id in the commit message, we will look at the branch name and try to get it from there. If the issue id is also not in the branch name, we will throw an error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"^(ISSUE-[0-9]+|Merge|hotfix)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our first argument &lt;code&gt;$1&lt;/code&gt; gives us the commit-msg file. We can read the commit message with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We have regex and we have to commit message name, and from the last step, we know how to get branch name. We will check for commit message starts with &lt;code&gt;ISSUE-0000&lt;/code&gt; or &lt;code&gt;Merge&lt;/code&gt; or &lt;code&gt;hotfix&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If the commit check fails we will check our branch name and parse the issue id and insert to commit message.&lt;/p&gt;

&lt;p&gt;If our check fails again we will echo our error message and exit with &lt;code&gt;exit code 1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;commit-msg&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="nv"&gt;REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"^(ISSUE-[0-9]+|Merge|hotfix)"&lt;/span&gt;
&lt;span class="nv"&gt;ISSUE_ID_IN_COMMIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID_IN_COMMIT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nv"&gt;BRANCH_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;git symbolic-ref &lt;span class="nt"&gt;--short&lt;/span&gt; HEAD&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;ISSUE_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BRANCH_NAME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[commit-msg-hook] Your commit message is illegal. Please rename your branch with using following regex: &lt;/span&gt;&lt;span class="nv"&gt;$REGEX_ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nb"&gt;exit &lt;/span&gt;1
    &lt;span class="k"&gt;fi

    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ISSUE_ID&lt;/span&gt;&lt;span class="s2"&gt; | &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Usage
&lt;/h2&gt;

&lt;p&gt;Go to the project folder and run the following command to use your hooks. You need to make this for every project that you want to use hooks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &amp;lt;your-team-repository&amp;gt;
git config core.hooksPath ~/hooks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Sharing With Team Members
&lt;/h2&gt;

&lt;p&gt;You need to share your &lt;code&gt;~/hooks&lt;/code&gt; folder. You can share with a git repository or you can send the folder directly to your team. &lt;/p&gt;

</description>
      <category>jira</category>
      <category>precommit</category>
      <category>commitmsg</category>
      <category>githooks</category>
    </item>
  </channel>
</rss>
