<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Michael Salata</title>
    <description>The latest articles on Forem by Michael Salata (@michael_salata).</description>
    <link>https://forem.com/michael_salata</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3671495%2F1161f879-87a4-4e9f-921b-7c196f35451f.jpg</url>
      <title>Forem: Michael Salata</title>
      <link>https://forem.com/michael_salata</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/michael_salata"/>
    <language>en</language>
    <item>
      <title>The NoFluff Cheatsheet for the Airflow 3 Fundamentals</title>
      <dc:creator>Michael Salata</dc:creator>
      <pubDate>Sat, 20 Dec 2025 05:54:22 +0000</pubDate>
      <link>https://forem.com/michael_salata/the-nofluff-cheatsheet-for-the-airflow-3-fundamentals-4j76</link>
      <guid>https://forem.com/michael_salata/the-nofluff-cheatsheet-for-the-airflow-3-fundamentals-4j76</guid>
      <description>&lt;h1&gt;
  
  
  Meta Article
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Curated Info for the Airflow 3 Fundamentals Certification
&lt;/h2&gt;

&lt;p&gt;If you’re looking to get up to speed on Airflow 3 or master the essentials while earning a certification, the &lt;a href="https://academy.astronomer.io/certification-exam-apache-airflow-3-fundamentals" rel="noopener noreferrer"&gt;Airflow Fundamentals Certification&lt;/a&gt; is a solid option. That said, the existing study guides can be outdated or over-scoped. This article is an &lt;strong&gt;updated cheatsheet, validated for correctness&lt;/strong&gt; and &lt;strong&gt;curated for the Airflow topics that &lt;em&gt;directly&lt;/em&gt; helped answer questions on the certification.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  My Background
&lt;/h2&gt;

&lt;p&gt;I’m a Software Engineer who used Airflow to complete the &lt;a href="https://github.com/DataTalksClub/data-engineering-zoomcamp" rel="noopener noreferrer"&gt;&lt;strong&gt;data-engineering-zoomcamp&lt;/strong&gt;&lt;/a&gt; by &lt;a href="https://datatalks.club/blog/data-engineering-zoomcamp.html" rel="noopener noreferrer"&gt;Datatalks.club&lt;/a&gt; and build my &lt;a href="https://github.com/MichaelSalata/compare-my-biometrics" rel="noopener noreferrer"&gt;Fitbit ETL pipeline&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I aced the &lt;a href="https://academy.astronomer.io/certification-exam-apache-airflow-3-fundamentals" rel="noopener noreferrer"&gt;Airflow 3 Fundamentals Certification&lt;/a&gt; after completing the &lt;a href="https://academy.astronomer.io/path/airflow-101" rel="noopener noreferrer"&gt;Astronomer Airflow 3 Learning Path&lt;/a&gt; and watching Marc Lamberti’s live Airflow 3 Crash Course.&lt;/p&gt;

&lt;p&gt;You can learn more about me on &lt;a href="https://github.com/MichaelSalata/" rel="noopener noreferrer"&gt;my GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Material in BOLD
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pay close attention to what’s in BOLD.&lt;/strong&gt; The info in bold was specifically asked about in the exam, often word-for-word.&lt;/p&gt;

&lt;p&gt;Topics that are not bold are indirectly relevant and often necessary background for identifying problems or potential solutions.&lt;/p&gt;




&lt;h1&gt;
  
  
  The Cheatsheet
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Airflow Architecture and Purpose
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DAG Parser, API Server, Scheduler, Executor, Worker&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/overview.html#airflow-components" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Life of a DAG &amp;amp; Task
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DAG Parser&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scans the DAGs folder for new DAG files every &lt;strong&gt;5 minutes&lt;/strong&gt; by default and re-parses known files every 30 seconds
&lt;/li&gt;
&lt;li&gt;serializes DAGs into the Metadata DB&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scheduler&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;reads DAGs and state from the Metadata DB&lt;/li&gt;
&lt;li&gt;schedules &lt;strong&gt;&lt;em&gt;Task Instances&lt;/em&gt;&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html#task-instances" rel="noopener noreferrer"&gt;ref&lt;/a&gt;) with the Executor&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Executor&lt;/strong&gt; (&lt;em&gt;component of the Scheduler&lt;/em&gt;) (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/executor/index.html" rel="noopener noreferrer"&gt;Executors&lt;/a&gt;)&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;pushes Task Instances to the queue&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Worker&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;picks up Task Instances from the queue&lt;/li&gt;
&lt;li&gt;updates Task Instance status through the API Server&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Previously&lt;/em&gt;, Airflow 2 workers updated the metadata DB directly.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;executes the Task Instance&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;h3&gt;
  
  
  Key Properties
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Airflow 2 Webserver is now the Airflow 3 API Server.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The default time zone for Airflow is UTC (Coordinated Universal Time).&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Refresh interval to display NEW DAGs is&lt;/strong&gt; &lt;code&gt;refresh_interval&lt;/code&gt; (called &lt;code&gt;dag_dir_list_interval&lt;/code&gt; in Airflow 2) &lt;strong&gt;and defaults to 5 minutes&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#config-dag-processor-refresh-interval" rel="noopener noreferrer"&gt;ref&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Refresh interval to update MODIFIED DAGs is&lt;/strong&gt; &lt;code&gt;min_file_process_interval&lt;/code&gt; &lt;strong&gt;and defaults to 30 seconds&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#min-file-process-interval" rel="noopener noreferrer"&gt;ref&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;“The Executor determines where and how tasks are run.”&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;TransferOperator&lt;/code&gt; &lt;strong&gt;moves or copies data.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  CLI
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;airflow db migrate&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#db" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;initializes or upgrades the metadata database (replaces &lt;code&gt;airflow db init&lt;/code&gt;, which was removed in Airflow 3)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;code&gt;airflow users create&lt;/code&gt; (ref)&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow standalone&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#standalone" rel="noopener noreferrer"&gt;ref&lt;/a&gt; &lt;a href="https://airflow.apache-airflow/2.0.0/howto/initialize-database.html" rel="noopener noreferrer"&gt;ref2&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;initializes the DB and starts the API server and scheduler&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow info&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#info" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;prints Airflow environment info:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;providers installed + provider versions&lt;/li&gt;
&lt;li&gt;paths&lt;/li&gt;
&lt;li&gt;tools&lt;/li&gt;
&lt;li&gt;system info (Python version, OS)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow cheat-sheet&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#cheat-sheet" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;quick reference for common commands&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow * export&lt;/code&gt; (ref)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;paired with &lt;code&gt;airflow * import&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;exports &lt;code&gt;connections&lt;/code&gt;, &lt;code&gt;pools&lt;/code&gt;, &lt;code&gt;users&lt;/code&gt;, &lt;code&gt;variables&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;note: environment variables are NOT exported/imported&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow tasks test &amp;lt;dag_id&amp;gt; &amp;lt;task_id&amp;gt; &amp;lt;logical_date&amp;gt;&lt;/code&gt; (ref)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;runs a single task without checking dependencies or recording its state in the database&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;airflow backfill create --dag-id &amp;lt;DAG_ID&amp;gt; --from-date &amp;lt;START_DATE&amp;gt; --to-date &amp;lt;END_DATE&amp;gt;&lt;/code&gt; (Airflow 2 used &lt;code&gt;airflow dags backfill&lt;/code&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#backfill" rel="noopener noreferrer"&gt;CLI ref1&lt;/a&gt;, &lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/backfill.html#backfill" rel="noopener noreferrer"&gt;ref2&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Airflow Connections (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;conn_id&lt;/strong&gt; — the required unique ID for a connection&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;parameters&lt;/strong&gt; — specific to the connection (login, password, hostname, compute_type, etc)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stored encrypted by default&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Connection Creation Options
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;UI, CLI, environment variables, API Server REST API, Python code, Secrets Backend&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Connections Created from Environment
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;environment connection variables start with&lt;/strong&gt; &lt;code&gt;AIRFLOW_CONN_...&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#storing-connections-in-environment-variables" rel="noopener noreferrer"&gt;Env connections&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;append a custom unique&lt;/strong&gt; &lt;code&gt;conn_id&lt;/code&gt; &lt;strong&gt;to the end&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Know the URI format&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#uri-format-example" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;br&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AIRFLOW_CONN_MY_HTTP=my-conn-type://login:password@host:port/schema?param1=val1&amp;amp;param2=val2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
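
&lt;p&gt;To make the pieces of that URI concrete, here is a minimal sketch that uses only Python’s standard library, not Airflow’s own parser. The helper names &lt;code&gt;env_var_name&lt;/code&gt; and &lt;code&gt;split_conn_uri&lt;/code&gt;, the &lt;code&gt;postgres&lt;/code&gt; connection type, and the port number are assumptions made for illustration; Airflow’s real parsing also handles URL-encoding and other details that are skipped here.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, urlsplit


def env_var_name(conn_id: str) -> str:
    """Build the environment variable name for a conn_id (upper-cased)."""
    return "AIRFLOW_CONN_" + conn_id.upper()


def split_conn_uri(uri: str) -> dict:
    """Split a connection URI into the fields named in the format above."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,
        "login": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "schema": parts.path.lstrip("/"),
        "extra": dict(parse_qsl(parts.query)),
    }


# Hypothetical connection used only for illustration.
fields = split_conn_uri("postgres://login:password@host:5432/schema?param1=val1")
print(env_var_name("my_http"), fields)
```

&lt;p&gt;Everything after the &lt;code&gt;?&lt;/code&gt; ends up in the connection’s extra parameters, which is why the query string is collected into its own dict in this sketch.&lt;/p&gt;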



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Connections created via environment variables have special visibility (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#visibility-in-ui-and-cli" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NOT stored in the metadata DB&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NOT shown in the UI&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;YES — still accessible from tasks&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Airflow Variables (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/variables.html" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;JSON key–value store&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;composed of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unique Key/ID&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value&lt;/strong&gt; (JSON-serializable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt; (optional)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API URLs &amp;amp; keys&lt;/li&gt;
&lt;li&gt;values that change across environments (dev, staging, prod)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;example usage:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Variable&lt;/span&gt;
&lt;span class="n"&gt;Variable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deserialize_json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creation Options
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Airflow REST API, Airflow CLI, Python inside Airflow (not advised), Airflow UI, environment variables&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Creation via environment variable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AIRFLOW_VAR_...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AIRFLOW_VAR_MY_VAR='{"my_params": [1,2,3,4]}'&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
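
&lt;p&gt;The environment-variable route can be pictured with a small standard-library sketch. It imitates the lookup rather than calling Airflow: &lt;code&gt;read_env_variable&lt;/code&gt; is a hypothetical helper, and the only behavior it assumes is the &lt;code&gt;AIRFLOW_VAR_&lt;/code&gt; prefix plus optional JSON parsing, as described above.&lt;/p&gt;

```python
import json
import os

# Hypothetical variable, set the way a shell export would set it.
os.environ["AIRFLOW_VAR_MY_VAR"] = '{"my_params": [1, 2, 3, 4]}'


def read_env_variable(key: str, deserialize_json: bool = False):
    """Look up AIRFLOW_VAR_ plus the upper-cased key, optionally parsing JSON."""
    raw = os.environ["AIRFLOW_VAR_" + key.upper()]
    return json.loads(raw) if deserialize_json else raw


as_text = read_env_variable("my_var")
as_dict = read_env_variable("my_var", deserialize_json=True)
print(type(as_text).__name__, as_dict["my_params"])
```

&lt;p&gt;Without JSON deserialization the value stays a plain string, which is the difference the &lt;code&gt;deserialize_json&lt;/code&gt; flag makes.&lt;/p&gt;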

&lt;h3&gt;
  
  
  Certain Keywords will hide a Variable from the UI &amp;amp; Logs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;variables containing certain keywords (&lt;code&gt;access_token&lt;/code&gt;, &lt;code&gt;api_key&lt;/code&gt;, &lt;code&gt;password&lt;/code&gt;, etc.) are hidden from the UI &amp;amp; Logs (&lt;a href="https://www.astronomer.io/docs/learn/airflow-variables#hide-sensitive-information-in-airflow-variables" rel="noopener noreferrer"&gt;ref&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;
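
&lt;p&gt;As a rough picture of how keyword-based hiding works, here is a sketch in plain Python. It is not Airflow’s masking code: the keyword tuple holds only the three examples named above (the real list is longer), and the case-insensitive substring match and the &lt;code&gt;***&lt;/code&gt; mask are assumptions for illustration.&lt;/p&gt;

```python
# Partial keyword list taken from the bullet above; the real list is longer.
SENSITIVE_KEYWORDS = ("access_token", "api_key", "password")


def is_sensitive(variable_key: str) -> bool:
    """Return True when the key contains one of the sensitive keywords."""
    lowered = variable_key.lower()
    return any(word in lowered for word in SENSITIVE_KEYWORDS)


def display_value(variable_key: str, value: str) -> str:
    """Show the value, or a mask when the key looks sensitive."""
    return "***" if is_sensitive(variable_key) else value


print(display_value("snowflake_password", "hunter2"))
print(display_value("batch_size", "500"))
```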

&lt;h2&gt;
  
  
  DAG Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;dag_id&lt;/code&gt; &lt;strong&gt;is the only REQUIRED parameter.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Valid DAG declaration syntax (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html#declaring-a-dag" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;
&lt;span class="n"&gt;dag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;


&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(...):&lt;/span&gt;
    &lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;


&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="nd"&gt;@dag&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_dag&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_task&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="nf"&gt;my_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;my_dag&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  default_args (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html#default-arguments" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Purpose: avoid repetition&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;It is a dict of default task parameters applied to all tasks in the DAG.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Task-level args override the &lt;code&gt;default_args&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
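
&lt;p&gt;The override rule is easiest to see as a dict merge. This is a sketch of the idea in plain Python, not how Airflow applies &lt;code&gt;default_args&lt;/code&gt; internally; &lt;code&gt;effective_args&lt;/code&gt; and the example values are hypothetical.&lt;/p&gt;

```python
from datetime import timedelta

# DAG-wide defaults shared by every task (example values only).
default_args = {"retries": 2, "retry_delay": timedelta(minutes=5), "owner": "data-team"}


def effective_args(task_args: dict) -> dict:
    """Start from the DAG-wide defaults, then let task-level values win."""
    return {**default_args, **task_args}


extract = effective_args({"task_id": "extract"})
load = effective_args({"task_id": "load", "retries": 0})
print(extract["retries"], load["retries"])
```

&lt;p&gt;Here &lt;code&gt;extract&lt;/code&gt; inherits the default retry count while &lt;code&gt;load&lt;/code&gt; overrides it, and both keep the shared owner.&lt;/p&gt;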

&lt;h3&gt;
  
  
  &lt;strong&gt;DAG runs&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dag-run.html" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Created by the scheduler&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;properties:&lt;/strong&gt; state, &lt;strong&gt;dag_id&lt;/strong&gt;, logical_date, start_date, end_date, duration, run_id&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;logical_date&lt;/code&gt; is the timestamp associated with the run&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;run_id&lt;/code&gt; is a timestamp-based identifier&lt;/li&gt;
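&lt;li&gt;
&lt;code&gt;start_date&lt;/code&gt;: on a DAG run, when the run actually started; as a DAG parameter, the earliest logical time from which the Scheduler considers creating runs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;end_date&lt;/code&gt;: on a DAG run, when the run finished; as a DAG parameter, the latest logical time to create runs for&lt;/li&gt;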
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;State transitions: queued → running → [success or failed]&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
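&lt;strong&gt;A DAG run is&lt;/strong&gt; &lt;code&gt;success&lt;/code&gt; &lt;strong&gt;if its last task succeeds&lt;/strong&gt; (more precisely, if every leaf task ends in &lt;code&gt;success&lt;/code&gt; or &lt;code&gt;skipped&lt;/code&gt;).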
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;How many runs will happen when unpausing a DAG with certain parameters &amp;amp; scenarios?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;often asked in the context of backfilling&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;start_date&lt;/code&gt;, &lt;code&gt;end_date&lt;/code&gt;, &lt;code&gt;schedule&lt;/code&gt;, and &lt;code&gt;catchup&lt;/code&gt; are varied&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The behavior around&lt;/strong&gt; &lt;code&gt;logical_date&lt;/code&gt; &lt;strong&gt;changed from Airflow 2 to 3,&lt;/strong&gt; so older resources may be outdated.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;dag_id&lt;/code&gt; &lt;strong&gt;is the only mandatory parameter,&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;but it’s good practice to set &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;tags&lt;/code&gt;, &lt;code&gt;schedule&lt;/code&gt;, &lt;code&gt;start_date&lt;/code&gt;, &lt;code&gt;end_date&lt;/code&gt;, &lt;code&gt;catchup&lt;/code&gt;, &lt;code&gt;default_args&lt;/code&gt;, &lt;code&gt;max_active_runs&lt;/code&gt; and &lt;code&gt;max_active_tasks&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;catchup&lt;/code&gt; (DAG parameter)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;catchup=False&lt;/code&gt; &lt;strong&gt;is the default in Airflow 3&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The most recent missed DAG run still executes immediately after unpausing; earlier missed runs are not created.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
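
&lt;p&gt;For the “how many runs after unpausing?” style of question, a deliberately simplified model can help with the arithmetic. The sketch below is not Airflow’s scheduler or timetable logic: it assumes a fixed &lt;code&gt;timedelta&lt;/code&gt; schedule, treats &lt;code&gt;start_date&lt;/code&gt; itself as the first tick, and ignores &lt;code&gt;end_date&lt;/code&gt;. Real timetables can differ by one run at the edges, so treat the counts as an illustration and check them against the docs for your schedule type. The function names are hypothetical.&lt;/p&gt;

```python
from datetime import datetime, timedelta


def schedule_ticks(start_date: datetime, interval: timedelta, now: datetime) -> list:
    """List every schedule tick from start_date up to and including now."""
    ticks = []
    tick = start_date
    while now >= tick:
        ticks.append(tick)
        tick += interval
    return ticks


def runs_on_unpause(start_date, interval, now, catchup: bool) -> list:
    """Simplified model: all missed ticks with catchup, else only the latest."""
    ticks = schedule_ticks(start_date, interval, now)
    return ticks if catchup else ticks[-1:]


# Example scenario: daily schedule, unpaused a few days after start_date.
start = datetime(2025, 1, 1)
now = datetime(2025, 1, 4, 12)
daily = timedelta(days=1)
print(len(runs_on_unpause(start, daily, now, catchup=True)))
print(len(runs_on_unpause(start, daily, now, catchup=False)))
```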

&lt;h3&gt;
  
  
  acceptable values for &lt;code&gt;schedule&lt;/code&gt; (DAG parameter)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;None&lt;/code&gt; &lt;strong&gt;(only manual/API-triggered runs),&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;cron expressions,&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;datetime.timedelta&lt;/code&gt; &lt;strong&gt;objects,&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;presets:&lt;/strong&gt; &lt;code&gt;@once&lt;/code&gt;&lt;strong&gt;,&lt;/strong&gt; &lt;code&gt;@hourly&lt;/code&gt;&lt;strong&gt;,&lt;/strong&gt; &lt;code&gt;@daily&lt;/code&gt; &lt;strong&gt;(aka&lt;/strong&gt; &lt;code&gt;@midnight&lt;/code&gt;&lt;strong&gt;),&lt;/strong&gt; &lt;code&gt;@weekly&lt;/code&gt;&lt;strong&gt;,&lt;/strong&gt; &lt;code&gt;@monthly&lt;/code&gt;&lt;strong&gt;,&lt;/strong&gt; &lt;code&gt;@quarterly&lt;/code&gt;&lt;strong&gt;,&lt;/strong&gt; &lt;code&gt;@yearly&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/2.2.1/dag-run.html#cron-presets" rel="noopener noreferrer"&gt;ref&lt;/a&gt;),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;@continuous&lt;/code&gt; = run as soon as the previous run finishes&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  XCOMs (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/xcoms.html" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;for passing small metadata between tasks&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;must be JSON-serializable&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;written to the metadata DB via the API Server&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;XCom&lt;/strong&gt; &lt;code&gt;pull&lt;/code&gt; &lt;strong&gt;requirements&lt;/strong&gt; — example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Task1 &lt;code&gt;push&lt;/code&gt;es a value and key to the Metadata DB&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Task2 &lt;code&gt;pull&lt;/code&gt;s the value by key and one of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;run_id&lt;/code&gt;, &lt;code&gt;task_id&lt;/code&gt;, &lt;code&gt;dag_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Usually &lt;code&gt;key&lt;/code&gt; + &lt;code&gt;task_id&lt;/code&gt; are sufficient.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'ti'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'my_key'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'ti'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'task1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'my_key'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;XCom size limits depend on the metadata database&lt;/strong&gt; (&lt;a href="https://www.astronomer.io/docs/learn/airflow-passing-data-between-tasks#when-to-use-xcoms" rel="noopener noreferrer"&gt;Astronomer ref&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SQLite = 2GB&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Postgres = 1GB&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MySQL = 64KB&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;XComs are for passing metadata necessary for the pipeline, not the pipeline’s bulk data&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tasks tested with the&lt;/strong&gt; &lt;code&gt;airflow tasks test&lt;/code&gt; &lt;strong&gt;command still store their XComs in the Metadata DB and may need to be cleared manually using&lt;/strong&gt; &lt;code&gt;airflow xcom clear&lt;/code&gt;&lt;strong&gt;.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
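
&lt;p&gt;A quick way to sanity-check whether a value is a reasonable XCom is to serialize it and measure it. The sketch below is plain Python, not an Airflow API: &lt;code&gt;fits_in_xcom&lt;/code&gt; is a hypothetical helper, the limits are just the three figures above turned into bytes, and reading 1GB as 1024**3 bytes is an assumption.&lt;/p&gt;

```python
import json

# Size limits from the list above, expressed in bytes (binary units assumed).
XCOM_LIMIT_BYTES = {
    "sqlite": 2 * 1024**3,
    "postgres": 1024**3,
    "mysql": 64 * 1024,
}


def fits_in_xcom(value, backend: str) -> bool:
    """Return True if value is JSON-serializable and within the backend limit."""
    try:
        payload = json.dumps(value).encode("utf-8")
    except TypeError:
        # json.dumps raises TypeError for unsupported types such as sets.
        return False
    return XCOM_LIMIT_BYTES[backend] >= len(payload)


print(fits_in_xcom({"row_count": 10}, "mysql"))
print(fits_in_xcom("x" * 70_000, "mysql"))
print(fits_in_xcom({1, 2, 3}, "postgres"))
```

&lt;p&gt;A small dict of metadata passes everywhere, while a 70,000-character string already exceeds the MySQL figure, which is the practical reason to pass a file path or table name instead of the data itself.&lt;/p&gt;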

&lt;h3&gt;
  
  
  &lt;strong&gt;Task dependency orchestration&lt;/strong&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html#relationships" rel="noopener noreferrer"&gt;Task relationships&lt;/a&gt;)
&lt;/h3&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;task1 &amp;gt;&amp;gt; [task2, task3] &amp;gt;&amp;gt; task4&lt;/code&gt; = task1 runs, then task2 &amp;amp; task3 in parallel, then task4&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;task1 &amp;lt;&amp;lt; [task2, task3] &amp;lt;&amp;lt; task4&lt;/code&gt; = reverse dependency notation&lt;/p&gt;&lt;/li&gt;
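&lt;li&gt;&lt;p&gt;&lt;code&gt;[t1, t2] &amp;gt;&amp;gt; [t3, t4]&lt;/code&gt; raises a &lt;code&gt;TypeError&lt;/code&gt;, because plain Python lists do not support &lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt; with each other&lt;/p&gt;&lt;/li&gt;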
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;chain&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;t1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t4&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# pairwise dependencies: t1 &amp;gt;&amp;gt; t3 and t2 &amp;gt;&amp;gt; t4 (lists must be the same length)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
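
&lt;p&gt;To see which edges the dependency notations produce, here is a toy re-implementation that works on task names instead of real operators. It is a sketch of the pairing rules, not Airflow’s &lt;code&gt;chain&lt;/code&gt;: &lt;code&gt;chain_edges&lt;/code&gt; is a hypothetical name, and raising &lt;code&gt;ValueError&lt;/code&gt; on mismatched list lengths is this sketch’s choice of exception.&lt;/p&gt;

```python
def chain_edges(*steps):
    """Return (upstream, downstream) pairs for consecutive steps.

    Two adjacent lists are paired element by element; a single task next
    to a list is connected to every task in that list.
    """
    edges = []
    for up, down in zip(steps, steps[1:]):
        ups = up if isinstance(up, list) else [up]
        downs = down if isinstance(down, list) else [down]
        if isinstance(up, list) and isinstance(down, list):
            if len(ups) != len(downs):
                raise ValueError("adjacent lists must have the same length")
            edges.extend(zip(ups, downs))
        else:
            edges.extend((u, d) for u in ups for d in downs)
    return edges


print(chain_edges(["t1", "t2"], ["t3", "t4"]))
print(chain_edges("task1", ["task2", "task3"], "task4"))
```

&lt;p&gt;The first call yields only the pairs t1 to t3 and t2 to t4; the second fans out from task1 and back in to task4, matching the bracket notation shown in the examples.&lt;/p&gt;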



&lt;h3&gt;
  
  
  &lt;strong&gt;Options to Backfill a DAG&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;CLI, Airflow UI, REST API call&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Sensors (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/sensors.html" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;checks a condition and waits &lt;code&gt;poke_interval&lt;/code&gt; seconds before checking again
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;PythonSensor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"waiting_for_condition"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;python_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_condition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;poke_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"poke"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;timeout&lt;/code&gt; &lt;strong&gt;and&lt;/strong&gt; &lt;code&gt;poke_interval&lt;/code&gt; &lt;strong&gt;are specified in seconds&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Default&lt;/strong&gt; &lt;code&gt;timeout&lt;/code&gt; &lt;strong&gt;is 1 week&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setting a meaningful &lt;code&gt;timeout&lt;/code&gt; is important because the default can stall a worker&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Sensor modes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;mode="poke"&lt;/code&gt; is the default&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Live&lt;/strong&gt; &lt;code&gt;poke&lt;/code&gt; &lt;strong&gt;Sensors hold worker control and consume a worker slot.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It’s easy to freeze an entire Airflow instance like this (tasks are scheduled but not started).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use&lt;/strong&gt; &lt;code&gt;poke&lt;/code&gt; &lt;strong&gt;Sensors when the&lt;/strong&gt; &lt;code&gt;poke_interval&lt;/code&gt; &lt;strong&gt;&amp;lt;= 5 minutes.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;mode="reschedule"&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;allows workers to do other tasks between checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sensor Task Instance is put in&lt;/strong&gt; &lt;code&gt;up_for_reschedule&lt;/code&gt; &lt;strong&gt;state between condition checks.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
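
&lt;p&gt;The poke loop itself is simple enough to sketch in plain Python, which makes it clearer why a long &lt;code&gt;timeout&lt;/code&gt; in poke mode ties up a worker. This is an illustration, not Airflow’s sensor implementation: &lt;code&gt;wait_for&lt;/code&gt; is a hypothetical function, it raises the built-in &lt;code&gt;TimeoutError&lt;/code&gt; rather than Airflow’s own exception, and the &lt;code&gt;sleep&lt;/code&gt; and &lt;code&gt;clock&lt;/code&gt; parameters exist only so the loop can be exercised without real waiting.&lt;/p&gt;

```python
import time


def wait_for(condition, poke_interval, timeout, sleep=time.sleep, clock=time.monotonic):
    """Check condition, sleeping poke_interval seconds between checks.

    Raises TimeoutError once timeout seconds have passed without success.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            raise TimeoutError("condition not met before the timeout")
        # In poke mode this sleep happens while still holding the worker slot.
        sleep(poke_interval)


# Hypothetical condition that succeeds on the third check.
checks = iter([False, False, True])
print(wait_for(lambda: next(checks), poke_interval=0.01, timeout=1))
```

&lt;p&gt;In reschedule mode the equivalent of that &lt;code&gt;sleep&lt;/code&gt; happens with the slot released, which is the whole difference between the two modes.&lt;/p&gt;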

&lt;h2&gt;
  
  
  Airflow Providers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Airflow Providers are third-party packages and integrations.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;often include Connections, Operators, Hooks, Python modules, etc&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Registry of Providers: &lt;a href="https://registry.astronomer.io/" rel="noopener noreferrer"&gt;https://registry.astronomer.io/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Sensor best practices
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Always define a meaningful &lt;code&gt;timeout&lt;/code&gt; parameter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The default is seven days, so a Sensor whose condition is never met can block a DAG for a week.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;If&lt;/strong&gt; &lt;code&gt;poke_interval&lt;/code&gt; &lt;strong&gt;≤ 5 minutes, set&lt;/strong&gt; &lt;code&gt;mode="poke"&lt;/code&gt;&lt;strong&gt;.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Define a meaningful &lt;code&gt;poke_interval&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
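
&lt;p&gt;These rules can be folded into one small helper. The sketch below is plain Python, not an Airflow API: &lt;code&gt;sensor_settings&lt;/code&gt; is a hypothetical name, and picking &lt;code&gt;reschedule&lt;/code&gt; for intervals longer than five minutes is an assumption that simply complements the rule above. The dictionary keys match the Sensor parameters discussed in this article, with values in seconds.&lt;/p&gt;

```python
from datetime import timedelta

FIVE_MINUTES = int(timedelta(minutes=5).total_seconds())  # 300 seconds


def sensor_settings(poke_interval, timeout):
    """Turn two timedelta values into Sensor keyword arguments (hypothetical helper)."""
    poke_seconds = int(poke_interval.total_seconds())
    timeout_seconds = int(timeout.total_seconds())
    # max() equals the threshold only when poke_interval is at most 5 minutes.
    short_interval = max(poke_seconds, FIVE_MINUTES) == FIVE_MINUTES
    return {
        "poke_interval": poke_seconds,  # seconds
        "timeout": timeout_seconds,     # seconds
        "mode": "poke" if short_interval else "reschedule",
    }


print(sensor_settings(timedelta(minutes=1), timedelta(hours=2)))
print(sensor_settings(timedelta(minutes=30), timedelta(hours=12)))
```

&lt;p&gt;Writing the durations as &lt;code&gt;timedelta&lt;/code&gt; values and converting once keeps the units explicit, which helps avoid passing minutes where seconds are expected.&lt;/p&gt;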

&lt;h3&gt;
  
  
  Task lifecycle states (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html#task-instances" rel="noopener noreferrer"&gt;ref&lt;/a&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;scheduled: Task Instance created and waiting for a slot&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;queued: handed to the executor; waiting for a worker&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;running: executing on a worker&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;success: finished successfully&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;failed: finished with an error and no retries left&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;up_for_retry: failed but will retry after &lt;code&gt;retry_delay&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;up_for_reschedule: Sensor in reschedule mode, sleeping until the next check&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;deferred: deferrable operator yielded to a trigger; not using a worker slot&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;skipped: bypassed by branching/short-circuit/trigger rules&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;upstream_failed: did not run because upstream tasks failed and the trigger rule wasn’t met&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
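
&lt;p&gt;As a rough mental model, the states above can be grouped with a few lines of plain Python. The state names come from the list; the helper functions are hypothetical and are not part of Airflow.&lt;/p&gt;

```python
# State names are taken from the list above; the helpers are hypothetical.
FINISHED_STATES = {"success", "failed", "skipped", "upstream_failed"}


def state_after_error(retries_left):
    """Return the state a task instance moves to when its code raises an error."""
    return "up_for_retry" if retries_left else "failed"


def is_finished(state):
    """True when the task instance will not change state again on its own."""
    return state in FINISHED_STATES


def executes_on_worker(state):
    """Only a running task instance is executing on a worker."""
    return state == "running"


happy_path = ["scheduled", "queued", "running", "success"]
print([is_finished(state) for state in happy_path])  # [False, False, False, True]
print(state_after_error(retries_left=2))             # up_for_retry
print(state_after_error(retries_left=0))             # failed
print(executes_on_worker("deferred"))                # False
```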

&lt;h2&gt;
  
  
  DAG Debugging
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deleting a DAG from the UI removes all run history &amp;amp; task instances from the metadata database and temporarily hides the DAG until it is re-parsed. It does not remove the DAG file itself.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Always &lt;code&gt;import&lt;/code&gt; using full paths starting from the &lt;code&gt;dags&lt;/code&gt; folder.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;avoid relative imports.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
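
&lt;p&gt;The import rule can be tried outside Airflow. The sketch below builds a throwaway folder that stands in for the &lt;code&gt;dags&lt;/code&gt; folder and imports a helper by its full path from that folder. The package and function names (&lt;code&gt;sales_pipeline&lt;/code&gt;, &lt;code&gt;table_name&lt;/code&gt;) are made up for illustration, and the &lt;code&gt;sys.path&lt;/code&gt; line imitates Airflow adding the &lt;code&gt;dags&lt;/code&gt; folder to the import path.&lt;/p&gt;

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway folder that stands in for the dags folder (names are hypothetical).
dags_folder = Path(tempfile.mkdtemp()) / "dags"
package = dags_folder / "sales_pipeline"
package.mkdir(parents=True)
(package / "__init__.py").write_text("")
(package / "helpers.py").write_text("def table_name():\n    return 'daily_sales'\n")

# Airflow puts the dags folder on sys.path; this line imitates that.
sys.path.insert(0, str(dags_folder))

# Full path from the dags folder, equivalent to
# "from sales_pipeline.helpers import table_name" inside a DAG file.
helpers = importlib.import_module("sales_pipeline.helpers")
print(helpers.table_name())  # daily_sales
```

&lt;p&gt;A relative import such as &lt;code&gt;from .helpers import table_name&lt;/code&gt; depends on how the file happens to be loaded, which is why the full path is the safer habit.&lt;/p&gt;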

&lt;h3&gt;
  
  
  DAG not showing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Wait for the UI &lt;strong&gt;refresh interval for new DAGs:&lt;/strong&gt; &lt;code&gt;dag_dir_list_interval&lt;/code&gt; &lt;strong&gt;(default 5 min).&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wait for the UI &lt;strong&gt;refresh interval for modified DAGs:&lt;/strong&gt; &lt;code&gt;min_file_process_interval&lt;/code&gt; &lt;strong&gt;(default 30 sec).&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ensure the &lt;code&gt;dag_id&lt;/code&gt; is unique.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When two DAGs share the same&lt;/strong&gt; &lt;code&gt;dag_id&lt;/code&gt;&lt;strong&gt;, the one that’s displayed will be random.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Check if the DAG is in &lt;code&gt;.airflowignore&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;By default, Airflow only parses files that contain both “dag” and “airflow” (case-insensitive) when looking for DAGs.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
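
&lt;p&gt;The last check in the list is easy to reproduce by hand. This sketch imitates the default file filter as described above, assuming a case-insensitive match; &lt;code&gt;looks_like_dag_file&lt;/code&gt; is a hypothetical name, not an Airflow function.&lt;/p&gt;

```python
def looks_like_dag_file(source):
    """Imitate the default file filter: both words must appear, in any letter case."""
    text = source.lower()
    return all(word in text for word in ("dag", "airflow"))


print(looks_like_dag_file("from airflow.sdk import dag"))   # True
print(looks_like_dag_file("import pandas as pd"))           # False
print(looks_like_dag_file("from airflow.sdk import task"))  # False
```

&lt;p&gt;The third file is skipped even though it imports from Airflow, because the word “dag” never appears in it.&lt;/p&gt;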

&lt;h3&gt;
  
  
  DAG not running
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Check that the DAG is unpaused.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure &lt;code&gt;start_date&lt;/code&gt; is in the past.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Confirm &lt;code&gt;end_date&lt;/code&gt; is in the future.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Allow multiple versions to run at the same time if intended.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Check &lt;code&gt;max_active_runs_per_dag&lt;/code&gt; (defaults to 16)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Check &lt;code&gt;max_active_tasks_per_dag&lt;/code&gt; (defaults to 16)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Set &lt;code&gt;parallelism&lt;/code&gt; (max Task Instances that can run per scheduler; default 32)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
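
&lt;p&gt;Part of this checklist can be expressed as a small function. The sketch below is plain Python and does not query Airflow: &lt;code&gt;blocking_reasons&lt;/code&gt; and its arguments are hypothetical, and the caller is assumed to look the values up in the UI or the DAG definition.&lt;/p&gt;

```python
from datetime import datetime, timezone


def blocking_reasons(is_paused, start_date, end_date, active_runs, max_active_runs, now):
    """List checklist items that would stop a new run (hypothetical helper)."""
    reasons = []
    if is_paused:
        reasons.append("DAG is paused")
    # sorted() leaves the pair unchanged only when it is already in time order.
    if sorted([start_date, now]) != [start_date, now]:
        reasons.append("start_date is in the future")
    if end_date is not None and sorted([now, end_date]) != [now, end_date]:
        reasons.append("end_date is in the past")
    # range(n) holds 0 .. n-1, so membership means there is room for one more run.
    if active_runs not in range(max_active_runs):
        reasons.append("max active runs reached")
    return reasons


now = datetime(2025, 12, 20, tzinfo=timezone.utc)
print(blocking_reasons(
    is_paused=False,
    start_date=datetime(2026, 1, 1, tzinfo=timezone.utc),
    end_date=None,
    active_runs=16,
    max_active_runs=16,
    now=now,
))
# ['start_date is in the future', 'max active runs reached']
```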

&lt;h3&gt;
  
  
  Validate Airflow Connections
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Airflow UI → Admin → Connections → enter password → click &lt;code&gt;TEST&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  New Changes moving from Airflow 2 to 3
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;start_date=None&lt;/code&gt; &lt;strong&gt;is acceptable and now the default&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The logical date is now the moment the DAG run starts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the run starts immediately and doesn’t wait for the interval to end&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

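&lt;li&gt;&lt;p&gt;&lt;code&gt;airflow db migrate&lt;/code&gt; initializes (and upgrades) the Metadata DB; it replaces &lt;code&gt;airflow db init&lt;/code&gt;, which was removed in Airflow 3.&lt;/p&gt;&lt;/li&gt;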

&lt;li&gt;&lt;p&gt;&lt;code&gt;catchup=False&lt;/code&gt; &lt;strong&gt;by default&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Airflow 2 Webserver is now the Airflow 3 API Server.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;When&lt;/strong&gt; &lt;code&gt;CREATE_CRON_DATA_INTERVALS=True&lt;/code&gt;&lt;strong&gt;, DAG scheduling behaves like Airflow 2.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Airflow 2:&lt;/strong&gt; DAGs execute after the interval ends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Airflow 3:&lt;/strong&gt; DAGs execute at the start of the interval.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;code&gt;schedule_interval&lt;/code&gt; is now named &lt;code&gt;schedule&lt;/code&gt; (&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/scheduling.html" rel="noopener noreferrer"&gt;Scheduling API&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
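
&lt;p&gt;The scheduling change is easiest to see with dates. The following is a simplified model in plain Python, not Airflow code: &lt;code&gt;logical_date&lt;/code&gt; is a hypothetical function, and the flag stands in for the &lt;code&gt;CREATE_CRON_DATA_INTERVALS&lt;/code&gt; setting described above.&lt;/p&gt;

```python
from datetime import datetime, timedelta


def logical_date(run_starts_at, interval, create_cron_data_intervals):
    """Return the logical date of a scheduled run (simplified, hypothetical model)."""
    if create_cron_data_intervals:
        # Airflow 2 style: the run starts once its interval has ended,
        # so the logical date points at the start of that interval.
        return run_starts_at - interval
    # Airflow 3 default: the logical date is the moment the run starts.
    return run_starts_at


midnight = datetime(2025, 12, 20)
print(logical_date(midnight, timedelta(days=1), True))   # 2025-12-19 00:00:00
print(logical_date(midnight, timedelta(days=1), False))  # 2025-12-20 00:00:00
```

&lt;p&gt;For a daily DAG whose run starts at midnight on 20 December, the Airflow 2 behavior labels the run with 19 December, while the Airflow 3 default labels it with 20 December.&lt;/p&gt;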

&lt;h2&gt;
  
  
  Certification Topics NOT covered here
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Identify the most helpful Airflow UI view for real-world scenarios:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://airflow.apache.org/docs/apache-airflow/stable/ui.html" rel="noopener noreferrer"&gt;Documentation — UI Overview&lt;/a&gt; or &lt;a href="https://academy.astronomer.io/path/airflow-101/airflow-ui" rel="noopener noreferrer"&gt;Astronomer 17-minute UI course&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Given a specific scenario, identify if Airflow is an applicable solution.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;cron expressions&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Final Certification Tips
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If it’s a multi-select problem, always select more than one box.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Currently, the industry is migrating from Airflow 2 to 3, so the differences between the two versions were highlighted in this Certification. They may not be in the future.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Good luck → &lt;a href="https://academy.astronomer.io/certification-exam-apache-airflow-3-fundamentals" rel="noopener noreferrer"&gt;Certification Exam: Apache Airflow 3 Fundamentals&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>airflow</category>
      <category>astronomer</category>
      <category>python</category>
      <category>certification</category>
    </item>
  </channel>
</rss>
