<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Pascal Gillet</title>
    <description>The latest articles on Forem by Pascal Gillet (@psclgllt).</description>
    <link>https://forem.com/psclgllt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F399556%2F540f4292-a0a7-4700-8255-a78fd7918c5c.jpeg</url>
      <title>Forem: Pascal Gillet</title>
      <link>https://forem.com/psclgllt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/psclgllt"/>
    <language>en</language>
    <item>
      <title>Restricting Access by Geographical Location using NGINX with Helm</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Wed, 29 May 2024 08:35:00 +0000</pubDate>
      <link>https://forem.com/psclgllt/restricting-access-by-geographical-location-using-nginx-with-helm-1o8m</link>
      <guid>https://forem.com/psclgllt/restricting-access-by-geographical-location-using-nginx-with-helm-1o8m</guid>
      <description>&lt;p&gt;This article explains how you can restrict content distribution to a particular country from services in your Kubernetes cluster, using the GeoIP2 dynamic module. &lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Install NGINX Ingress Controller in your Kubernetes cluster using Helm.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting the GeoLite2 databases from MaxMind
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;MaxMind&lt;/strong&gt; company provides the &lt;a href="https://dev.maxmind.com/geoip/geolite2-free-geolocation-data"&gt;&lt;strong&gt;GeoLite2&lt;/strong&gt;&lt;/a&gt; free IP geolocation databases. You need to create an account on the MaxMind website and generate a &lt;strong&gt;license key&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring the NGINX Ingress Controller
&lt;/h2&gt;

&lt;p&gt;Override the NGINX Helm chart with the following values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;controller&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Maxmind license key to download GeoLite2 Databases&lt;/span&gt;
  &lt;span class="na"&gt;maxmindLicenseKey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
  &lt;span class="na"&gt;extraArgs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# GeoLite2 Databases to download (default "GeoLite2-City,GeoLite2-ASN")&lt;/span&gt;
    &lt;span class="na"&gt;maxmind-edition-ids&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GeoLite2-Country&lt;/span&gt;
  &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Preserve source IP...&lt;/span&gt;
    &lt;span class="na"&gt;externalTrafficPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Local&lt;/span&gt;
    &lt;span class="c1"&gt;# ...Which is only supported if we enable the v2 proxy protocol for the OVH load balancer (specific to OVH Cloud provider)&lt;/span&gt;
    &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service.beta.kubernetes.io/ovh-loadbalancer-proxy-protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v2"&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;use-proxy-protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
    &lt;span class="c1"&gt;# Enable Ingress to parse and add -snippet annotations/directives&lt;/span&gt;
    &lt;span class="na"&gt;allow-snippet-annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
    &lt;span class="c1"&gt;# Enable geoip2 module&lt;/span&gt;
    &lt;span class="na"&gt;use-geoip&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;false"&lt;/span&gt;
    &lt;span class="na"&gt;use-geoip2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
    &lt;span class="c1"&gt;# Configure access by geographical location.&lt;/span&gt;
    &lt;span class="c1"&gt;# Here, we create a variable $allowed_country whose values &lt;/span&gt;
    &lt;span class="c1"&gt;# depend on values of GeoIP2 variable $geoip2_country_code,&lt;/span&gt;
    &lt;span class="c1"&gt;# which lists all ISO 3166 country codes.&lt;/span&gt;
    &lt;span class="c1"&gt;# Map directives are only allowed at Ingress Controller level.&lt;/span&gt;
    &lt;span class="na"&gt;http-snippet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;map $geoip2_country_code $allowed_country {&lt;/span&gt;
        &lt;span class="s"&gt;default no;&lt;/span&gt;
        &lt;span class="s"&gt;FR yes;&lt;/span&gt;
        &lt;span class="s"&gt;US yes;&lt;/span&gt;
      &lt;span class="s"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Example Ingress
&lt;/h2&gt;

&lt;p&gt;A minimal Ingress resource example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;minimal-ingress&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/rewrite-target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;
    &lt;span class="c1"&gt;# Restrict access by geographical location&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/server-snippet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;if ($allowed_country = no) {&lt;/span&gt;
        &lt;span class="s"&gt;return 451;&lt;/span&gt;
      &lt;span class="s"&gt;}&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ingressClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-example&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/testpath&lt;/span&gt;
        &lt;span class="na"&gt;pathType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Prefix&lt;/span&gt;
        &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The &lt;a href="https://en.wikipedia.org/wiki/HTTP_451"&gt;HTTP status code 451&lt;/a&gt; was chosen as a reference to the novel &lt;br&gt;
&lt;a href="https://en.wikipedia.org/wiki/Fahrenheit_451"&gt;"Fahrenheit 451"&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.nginx.com/nginx-ingress-controller/installation/installing-nic/installation-with-helm/"&gt;NGINX Installation with Helm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.nginx.com/nginx/admin-guide/security-controls/controlling-access-by-geoip/"&gt;Restricting Access by Geographical Location&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.nginx.com/nginx-ingress-controller/configuration/global-configuration/configmap-resource/"&gt;NGINX ConfigMap Resource&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>nginx</category>
      <category>geoip2</category>
      <category>helm</category>
    </item>
    <item>
      <title>KubeCon 2022: nouveautés dans Kubernetes v1.24</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Thu, 19 May 2022 21:14:13 +0000</pubDate>
      <link>https://forem.com/stack-labs/kubecon-2022-nouveautes-dans-kubernetes-v124-3ob8</link>
      <guid>https://forem.com/stack-labs/kubecon-2022-nouveautes-dans-kubernetes-v124-3ob8</guid>
      <description>&lt;p&gt;Au deuxième jour de la KubeCon 2022, la keynote démarre par l'annonce des nouveautés du projet Kubernetes v1.24.&lt;br&gt;
En voici les principales:&lt;/p&gt;

&lt;h1&gt;
  
  
  Suppression de Dockershim
&lt;/h1&gt;

&lt;p&gt;Rappelez-vous, Kubernetes v1.20 avait annoncé que &lt;strong&gt;Dockershim&lt;/strong&gt; serait bientôt supprimé, c’est chose faite dans Kubernetes v1.24!&lt;/p&gt;

&lt;p&gt;Pour rappel, les premières versions de Kubernetes ne fonctionnaient qu'avec un seul &lt;em&gt;runtime&lt;/em&gt; de conteneurs: Docker Engine. Pourtant il en existe plein d'autres: containerd, CRI-O, Mirantis Container Runtime, etc.&lt;br&gt;
Plus tard, Kubernetes a ajouté la prise en charge de l'utilisation d'autres runtimes de conteneurs, au travers de la norme CRI. La norme CRI a été créée pour permettre l'interopérabilité entre des orchestrateurs (comme Kubernetes) et de nombreux runtimes de conteneurs différents. &lt;br&gt;
Or, Docker Engine n'implémente pas cette interface (CRI), donc le projet Kubernetes a dû créer un code spécifique pour continuer à supporter Docker Engine le temps de la transition et a intégré ce code dockershim à Kubernetes lui-même.&lt;/p&gt;

&lt;p&gt;Le code dockershim a toujours été destiné à être une solution temporaire (d'où le nom : shim). En fait, la maintenance de dockershim était même devenue un lourd fardeau pour les mainteneurs de Kubernetes: des fonctionnalités qui sont maintenant implémentées dans les nouveaux runtimes CRI étaient largement incompatibles avec le dockershim, telles que cgroups v2 et les espaces de noms utilisateurs. La suppression du dockershim de Kubernetes permet de poursuivre le développement dans ces domaines.&lt;/p&gt;

&lt;p&gt;Pour en savoir plus sur ce que cela signifie vraiment, consultez le billet de blog &lt;a href="https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/" rel="noopener noreferrer"&gt;"Don't Panic: Kubernetes and Docker"&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pour déterminer l'impact que la suppression de dockershim peut avoir sur vos projets, vous pouvez lire &lt;a href="https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/check-if-dockershim-removal-affects-you/" rel="noopener noreferrer"&gt;"Check whether dockershim removal affects you"&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Kubernetes Cluster API
&lt;/h1&gt;

&lt;p&gt;Cluster API v1.0 est sortie!&lt;/p&gt;

&lt;p&gt;Cluster API est un sous-projet Kubernetes axé sur la fourniture d'API déclaratives et d'outils pour simplifier le provisionnement, la mise à niveau et l'exploitation de clusters Kubernetes.&lt;/p&gt;

&lt;p&gt;Lancé par le &lt;em&gt;Special Interest Group&lt;/em&gt; (SIG) Kubernetes "Cluster Lifecycle", le projet Cluster API utilise des APIs et des modèles de style Kubernetes pour automatiser la gestion du cycle de vie des clusters pour les opérateurs de plates-formes. Cela signifie que l'infrastructure de votre cluster Kubernetes, c'est-à-dire les machines virtuelles, les réseaux, les &lt;em&gt;load balancers&lt;/em&gt; et les VPCs, ainsi que la configuration du cluster lui-même sont définis de la même manière que les charges de travail qui y sont déployées: au travers d'objets Kubernetes définis dans des fichiers de configuration au format yaml. Cela permet des déploiements de cluster cohérents et reproductibles dans différents environnements d'infrastructure.&lt;/p&gt;

&lt;h1&gt;
  
  
  Storage
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Volume Expansion
&lt;/h2&gt;

&lt;p&gt;L'"expansion de volume" a été introduite en tant que fonctionnalité &lt;em&gt;alpha&lt;/em&gt; dans Kubernetes 1.8, passée en &lt;em&gt;bêta&lt;/em&gt; dans 1.11, et finalement GA avec Kubernetes 1.24.&lt;/p&gt;

&lt;p&gt;Cette fonctionnalité permet de spécifier simplement une nouvelle taille dans la &lt;em&gt;Spec&lt;/em&gt; des objets &lt;code&gt;PersistentVolumeClaim&lt;/code&gt; et Kubernetes étendra automatiquement le volume à l'aide du backend de stockage et étendra également à chaud (quand cela est possible) le système de fichiers sous-jacent utilisé par le pod.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storage Capacity Tracking
&lt;/h2&gt;

&lt;p&gt;Kubernetes v1.24 inclut la prise en charge de l'API (&lt;em&gt;cluster-level&lt;/em&gt;) pour le suivi de la capacité de stockage. &lt;br&gt;
Pour l'utiliser, vous devez également utiliser un driver CSI prenant en charge le suivi de la capacité.&lt;/p&gt;

&lt;p&gt;Kubernetes garde une trace de la capacité de stockage pour permettre au &lt;em&gt;scheduler&lt;/em&gt; de placer des pods sur des noeuds avec une capacité de stockage suffisante pour les volumes manquants restants. Sans suivi de la capacité de stockage, le scheduler peut choisir un noeud qui n'a pas assez de capacité pour provisionner un volume et plusieurs tentatives de scheduling seront alors nécessaires.&lt;/p&gt;

&lt;h1&gt;
  
  
  Security - Pod Security Admission
&lt;/h1&gt;

&lt;p&gt;Dans Kubernetes v1.24, la fonctionnalité en bêta &lt;code&gt;PodSecurity&lt;/code&gt; est activée par défaut.&lt;/p&gt;

&lt;p&gt;Pour rappel, la fonctionnalité "Pod Security Admission" permet de définir différents niveaux d'isolation pour les pods, pour restreindre le comportement des pods de manière claire et cohérente. Plus d'infos &lt;a href="https://kubernetes.io/docs/concepts/security/pod-security-admission/" rel="noopener noreferrer"&gt;ici&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Performances
&lt;/h1&gt;

&lt;p&gt;Les performances du scheduler sont accrues dans Kubernetes 1.24: notamment la latence est réduite et le débit de placement atteint maintenant 300 pods/sec (avez-vous besoin d'autant dans votre projet?)&lt;/p&gt;

&lt;p&gt;D'autre part, le SIG "Batch Working Group" a été créé pour discuter et améliorer la prise en charge des workloads de type &lt;em&gt;batch&lt;/em&gt; (HPC, AI/ML, analyse de données, CI, etc.) dans Kubernetes. Le but est d'unifier la façon dont les utilisateurs déploient ce type de charges de travail pour améliorer la portabilité et simplifier la prise en charge pour les &lt;em&gt;providers&lt;/em&gt; Kubernetes.&lt;/p&gt;

&lt;p&gt;Le travail se poursuit également au sein du SIG Scheduling avec le support d'une dizaine de plugins, pour notamment faire du co-scheduling avec les &lt;code&gt;PodGroup&lt;/code&gt;s (analogue au &lt;em&gt;gang scheduling&lt;/em&gt; de Volcano?).&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>kubecon</category>
    </item>
    <item>
      <title>My Journey With Spark On Kubernetes... In Python (3/3)</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Mon, 12 Apr 2021 08:11:03 +0000</pubDate>
      <link>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-3-3-536e</link>
      <guid>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-3-3-536e</guid>
      <description>&lt;p&gt;We need to operate Kubernetes as part of a Python client application. So, we need to interact with the Kubernetes REST API. Luckily we do not need to implement the API calls and manage HTTP requests/responses ourselves: we can rely on the &lt;a href="https://github.com/kubernetes-client/python" rel="noopener noreferrer"&gt;Kubernetes Python client&lt;/a&gt;, among other officially-supported Kubernetes client libraries for other languages such as Go, Java, .NET, JavaScript and Haskell (there are also a lot of community-maintained client libraries for many languages).&lt;/p&gt;

&lt;h1&gt;
  
  
  Kubernetes Python Client
&lt;/h1&gt;

&lt;p&gt;The Kubernetes Python Client is compatible with Python 2.7 and 3.4+. See the &lt;a href="https://github.com/kubernetes-client/python#compatibility" rel="noopener noreferrer"&gt;compatibility matrix&lt;/a&gt; for the supported versions of Kubernetes.&lt;/p&gt;

&lt;p&gt;When using the client library, we must first load authentication and cluster information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Load Authentication And Cluster Information
&lt;/h2&gt;

&lt;p&gt;First, you need to setup the required service account and roles.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k8s/python-client-sa-rbac.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-role&lt;/span&gt;
&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;configmaps"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods/log"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods/status"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;services"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;networking.k8s.io"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ingresses"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ingresses/status"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkoperator.k8s.io"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;sparkapplications&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RoleBinding&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-role-binding&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;subjects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;roleRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-role&lt;/span&gt;
  &lt;span class="na"&gt;apiGroup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterRole&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node-reader&lt;/span&gt;
&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nodes"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterRoleBinding&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-cluster-role-binding&lt;/span&gt;
&lt;span class="na"&gt;subjects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python-client-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;roleRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterRole&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node-reader&lt;/span&gt;
  &lt;span class="na"&gt;apiGroup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/python-client-sa-rbac.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command creates a new service account named &lt;code&gt;python-client-sa&lt;/code&gt;, a new role with the needed permissions in the &lt;code&gt;spark-jobs&lt;/code&gt; namespace and then binds the new role to the newly created service account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WARNING&lt;/strong&gt;: The &lt;code&gt;python-client-sa&lt;/code&gt; is the service account that will provide the identity for the Kubernetes Python Client in our application. Do not confuse this service account with the &lt;code&gt;driver-sa&lt;/code&gt; service account for driver pods.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Easy Way
&lt;/h3&gt;

&lt;p&gt;In this method, we can use an helper utility to load authentication and cluster information from a &lt;code&gt;kubeconfig&lt;/code&gt; file and store them in &lt;code&gt;kubernetes.client.configuration&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoreV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Listing pods with their IPs:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_namespaced_pod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pod_ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But we &lt;strong&gt;DO NOT&lt;/strong&gt; want to rely on the default &lt;code&gt;kubeconfig&lt;/code&gt; file, denoted by the environment variable &lt;code&gt;KUBECONFIG&lt;/code&gt; or, failing that, in &lt;code&gt;~/.kube/config&lt;/code&gt;. This &lt;code&gt;kubeconfig&lt;/code&gt; file is yours, as user of the &lt;code&gt;kubectl&lt;/code&gt; command. Concretely, with this &lt;code&gt;kubeconfig&lt;/code&gt; file, you have the right to do almost everything in the Kubernetes cluster, and in all namespaces. Instead, we're going to generate one especially for the service account created above, with the help of the script &lt;code&gt;kubeconfig-gen.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# set -eux&lt;/span&gt;

&lt;span class="c"&gt;# Reads the API server name from the default `kubeconfig` file.&lt;/span&gt;
&lt;span class="c"&gt;# Here we suppose that the kubectl command-line tool is already configured to communicate with our cluster.&lt;/span&gt;
&lt;span class="nv"&gt;APISERVER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl config view &lt;span class="nt"&gt;--minify&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.clusters[0].cluster.server}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;SERVICE_ACCOUNT_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;python&lt;/span&gt;&lt;span class="p"&gt;-client-sa&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;NAMESPACE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;-jobs&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get serviceaccount &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SERVICE_ACCOUNT_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NAMESPACE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.secrets[0].name}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get secret &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NAMESPACE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.data.token}'&lt;/span&gt; | &lt;span class="nb"&gt;base64&lt;/span&gt; &lt;span class="nt"&gt;--decode&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;CACERT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get secret &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NAMESPACE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{['data']['ca&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;crt']}"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;


&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; kubeconfig-sa &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority-data: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACERT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
    server: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;APISERVER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NAMESPACE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
    user: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SERVICE_ACCOUNT_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
  name: default-context
current-context: default-context
users:
- user:
    token: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
  name: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SERVICE_ACCOUNT_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;kubeconfig&lt;/code&gt; file thus created configures access to the cluster for the &lt;code&gt;python-client-sa&lt;/code&gt; service account, with only the rights needed for our client application and in the single namespace &lt;code&gt;spark-jobs&lt;/code&gt; (&lt;em&gt;"principle of least privilege"&lt;/em&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  The Hard Way
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Fetch credentials
&lt;/h4&gt;

&lt;p&gt;Here, we're going to configure the Python client in the most programmatic way possible.&lt;br&gt;&lt;br&gt;
First, we need to fetch the credentials to access the Kubernetes cluster. We’ll store these in Python environmental variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;APISERVER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl config view &lt;span class="nt"&gt;--minify&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.clusters[0].cluster.server}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get serviceaccount python-client-sa &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.secrets[0].name}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get secret &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.data.token}'&lt;/span&gt; | &lt;span class="nb"&gt;base64&lt;/span&gt; &lt;span class="nt"&gt;--decode&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CACERT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get secret &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;SECRET_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{['data']['ca&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;crt']}"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that environment variables are captured the first time the &lt;code&gt;os&lt;/code&gt; module is imported, typically during IDE/Python startup. Changes to the environment made after this time are not reflected in &lt;code&gt;os.environ&lt;/code&gt; (except for changes made by modifying os.environ directly).&lt;/p&gt;

&lt;h4&gt;
  
  
  Python sample usage
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NamedTemporaryFile&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;

&lt;span class="n"&gt;api_server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;APISERVER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;cacert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CACERT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Set the configuration
&lt;/span&gt;&lt;span class="n"&gt;configuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;NamedTemporaryFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b64decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacert&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ssl_ca_cert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_server&lt;/span&gt;
&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verify_ssl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;debug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoreV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Listing pods with their IPs:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_namespaced_pod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pod_ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Kubernetes Object Management
&lt;/h3&gt;

&lt;p&gt;With the Kubernetes Python Client, you can create and manage Kubernetes objects programmatically.&lt;/p&gt;

&lt;p&gt;In the following example (provided in the &lt;a href="https://githubcom/kubernetes-client/python/blob/master/examples/deployment_crud.py" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;), we create, update then delete a &lt;code&gt;Deployment&lt;/code&gt; using &lt;code&gt;AppsV1Api&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Creates, updates, and deletes a deployment using AppsV1Api.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;

&lt;span class="n"&gt;DEPLOYMENT_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx-deployment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_deployment_object&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Configureate Pod template container
&lt;/span&gt;    &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1Container&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx:1.15.4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ports&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1ContainerPort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;container_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1ResourceRequirements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200Mi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500Mi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Create and configurate a spec section
&lt;/span&gt;    &lt;span class="n"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1PodTemplateSpec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1ObjectMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1PodSpec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
    &lt;span class="c1"&gt;# Create the specification of deployment
&lt;/span&gt;    &lt;span class="n"&gt;spec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1DeploymentSpec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;selector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;matchLabels&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nginx&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}})&lt;/span&gt;
    &lt;span class="c1"&gt;# Instantiate the deployment object
&lt;/span&gt;    &lt;span class="n"&gt;deployment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1Deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;api_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apps/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1ObjectMeta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DEPLOYMENT_NAME&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Create deployement
&lt;/span&gt;    &lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_namespaced_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment created. status=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Update container image
&lt;/span&gt;    &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx:1.16.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="c1"&gt;# Update the deployment
&lt;/span&gt;    &lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;patch_namespaced_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DEPLOYMENT_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment updated. status=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delete_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Delete deployment
&lt;/span&gt;    &lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_namespaced_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DEPLOYMENT_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;V1DeleteOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;propagation_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Foreground&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;grace_period_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment deleted. status=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Configs can be set in Configuration class directly or using helper
&lt;/span&gt;    &lt;span class="c1"&gt;# utility. If no argument provided, the config will be loaded from
&lt;/span&gt;    &lt;span class="c1"&gt;# default location.
&lt;/span&gt;    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;apps_v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AppsV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;deployment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_deployment_object&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="nf"&gt;create_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apps_v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;update_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apps_v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;delete_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apps_v1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is great, but this involves mastering the client's API, and above all we must configure our objects &lt;em&gt;imperatively&lt;/em&gt;: we specify the desired operation (create, replace, etc.) on Python objects that represent Kubernetes objects. Here, we prefer to manage our objects in a &lt;em&gt;declarative&lt;/em&gt; way and operate on object configuration files (stored locally along the Python code source) like we usually do with the &lt;code&gt;kubectl&lt;/code&gt; command. Indeed, Python code should only be a simple execution backend to trigger Kubernetes operations, and &lt;em&gt;business&lt;/em&gt; logic, so to speak, should be concentrated in manifest files.&lt;/p&gt;

&lt;p&gt;The deployment we created above is the same as in the &lt;code&gt;nginx-deployment.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-deployment&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx:1.15.4&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can directly load the manifest as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;


&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx-deployment.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;safe_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_apps_v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AppsV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;k8s_apps_v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_namespaced_deployment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dep&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment created. status=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the equivalent in Python of &lt;code&gt;kubectl create -f nginx-deployment.yaml&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As you can see, you must call &lt;code&gt;create_namespaced_deployment&lt;/code&gt; to create a Deployment. In the same way, you would call &lt;code&gt;create_namespaced_pod&lt;/code&gt; to create a Pod, and so on. This is because the Python client is automatically generated following the &lt;code&gt;OpenAPI&lt;/code&gt; specifications of the Kubernetes API.&lt;/p&gt;

&lt;p&gt;It's a shame to have to call a specific method to create a particular type of object, even though the type of object itself is already specified in the manifest that we load through this method. Luckily, the Kubernetes Python Client provides a utility method that acts as an input hub for any kind of object.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;


&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nginx-deployment.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;safe_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ApiClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dep&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployment created. status=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;utils.create_from_dict&lt;/code&gt; is the magic method here. It only takes a &lt;code&gt;Dict&lt;/code&gt; holding valid kubernetes objects. It is a blessing to have found it, because it is well hidden in the client and not documented at all.&lt;/p&gt;

&lt;p&gt;So, to launch a Spark job with spark-submit, you could just call the code snippet above with a single YAML file which groups all the needed resources (separated by --- in YAML). &lt;/p&gt;

&lt;p&gt;But what about the Spark Operator? &lt;code&gt;utils.create_from_dict&lt;/code&gt; does not support &lt;a href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/" rel="noopener noreferrer"&gt;custom resources&lt;/a&gt;, that means object types that are not part of the core Kubernetes API, namely &lt;code&gt;SparkApplication&lt;/code&gt; from the Spark Operator. To run a Spark job with the Spark Operator, you have no other choice than calling the &lt;code&gt;create_namespaced_custom_object&lt;/code&gt; function of &lt;code&gt;CustomObjectsApi&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;


&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k8s/spark-operator/pyspark-pi.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dep&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;safe_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;custom_object_api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CustomObjectsApi&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;custom_object_api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_namespaced_custom_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkoperator.k8s.io&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v1beta2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;plural&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkapplications&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dep&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SparkApplication created&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Regarding spark-submit, there is an even more direct method &lt;code&gt;utils.create_from_yaml&lt;/code&gt;, which reads Kubernetes objects from a YAML file. But we cannot use it, as we need to "parameterize" our YAML files before submitting them to the Kubernetes Python client.&lt;/p&gt;

&lt;h1&gt;
  
  
  Templating
&lt;/h1&gt;

&lt;p&gt;As you know, when you apply a manifest file to Kubernetes - the YAML-formatted resource descriptions that Kubernetes can understand - you must specify the resource name which must be unique for that type of resource (and within the same namespace), otherwise Kubernetes will complain that the resource already exists.&lt;/p&gt;

&lt;p&gt;For example, you can only have one Pod named &lt;code&gt;myapp-1234&lt;/code&gt; within the same namespace, but you can have one Pod and one Deployment that are each named &lt;code&gt;myapp-1234&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As we want to run multiple Spark jobs simultaneously, and as these Spark jobs are mostly identical except for a few runtime parameters, we need to parameterize, or &lt;em&gt;templatize&lt;/em&gt;, our Kubernetes YAML files.&lt;/p&gt;

&lt;p&gt;Normally, you don't do that, at least that's not in Kubernetes' philosophy: Kubernetes files should be template-free and should only be patched, by the means of &lt;a href="https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/" rel="noopener noreferrer"&gt;Kustomize&lt;/a&gt; for instance. &lt;a href="https://helm.sh/" rel="noopener noreferrer"&gt;Helm&lt;/a&gt; has also its own templating system.  &lt;/p&gt;

&lt;p&gt;You cannot use such a tool with the Kubernetes Python client. Instead, we are going to substitute references to variables of the form &lt;code&gt;$VAR&lt;/code&gt; or &lt;code&gt;${VAR}&lt;/code&gt; with the corresponding values, exactly like &lt;a href="https://www.gnu.org/software/gettext/manual/html_node/envsubst-Invocation.html" rel="noopener noreferrer"&gt;envsubst&lt;/a&gt;, but programmatically.&lt;/p&gt;

&lt;p&gt;Let's get back to Spark (native). We saw earlier the YAML file that defines a driver pod to run the Pi example program. As you can see, we have placeholders to specify the &lt;code&gt;namespace&lt;/code&gt;, the &lt;code&gt;priorityClassName&lt;/code&gt;, the &lt;br&gt;
&lt;code&gt;serviceAccountName&lt;/code&gt;, the &lt;code&gt;nodeAffinity&lt;/code&gt; and a &lt;code&gt;NAME_SUFFIX&lt;/code&gt; to make the pod's name unique.&lt;/p&gt;

&lt;p&gt;Now, in the Python code, we replace at runtime these variables with the desired values before creating the pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;binascii&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;listdir&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yaml_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yaml_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;safe_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Configs can be set in Configuration class directly or using helper utility
&lt;/span&gt;    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;name_suffix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;binascii&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b2a_hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urandom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;priority_class_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;routine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;env_subst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${NAMESPACE}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${SERVICE_ACCOUNT_NAME}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;driver-sa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${DRIVER_NODE_AFFINITIES}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;driver&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${EXECUTOR_NODE_AFFINITIES}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${NAME_SUFFIX}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name_suffix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${PRIORITY_CLASS_NAME}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;priority_class_name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;k8s_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ApiClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="c1"&gt;# Create driver pod
&lt;/span&gt;    &lt;span class="n"&gt;k8s_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k8s/spark-native&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-driver-pod.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_objects&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# TODO: create the other resources
&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Submitted %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_objects&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;app-name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Putting the pieces together
&lt;/h1&gt;

&lt;p&gt;We have now the general mechanics and we can create all the resources needed.&lt;br&gt;
Remember, the driver pod consumes a &lt;code&gt;ConfigMap&lt;/code&gt; to define environment variables and to mount configuration files in the Spark container (including the template for the executor pods). We also have a &lt;code&gt;Service&lt;/code&gt; that allows executors to communicate back with the driver. And finally, we have another &lt;code&gt;Service&lt;/code&gt;, backed by an &lt;code&gt;Ingress&lt;/code&gt;, to expose the Spark UI.&lt;/p&gt;

&lt;p&gt;We just iterate over the YAML files that define these resources and just call the same method &lt;code&gt;utils.create_from_dict&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="c1"&gt;# List all YAML files in k8s/spark-native directory, except the driver pod definition file
&lt;/span&gt;    &lt;span class="n"&gt;other_resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;other_resources&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-driver-pod.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;other_resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Submitted %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_objects&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;app-name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we've launched a &lt;em&gt;full&lt;/em&gt; Spark application, let's see what happens when we kill it 😈 or when the application completes normally.&lt;/p&gt;

&lt;h1&gt;
  
  
  Garbage collection
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Killed applications
&lt;/h2&gt;

&lt;p&gt;The role of the Kubernetes garbage collector is to delete certain objects that once had an owner, but no longer have one. The goal is to make sure that the garbage collector properly deletes resources that are no longer needed when killing a Spark application.&lt;br&gt;
It is important to free up the resources of the Kubernetes cluster when you are going to run tens / hundreds of Spark applications in parallel.&lt;/p&gt;

&lt;p&gt;For this, certain Kubernetes objects can be declared owners of other objects. "Owned" objects are called &lt;em&gt;dependent&lt;/em&gt; on the owner object. Each dependent object has a &lt;code&gt;metadata.ownerReferences&lt;/code&gt; field that points to the owning object. When deleting an owner object, all dependent objects are also automatically deleted (&lt;em&gt;cascading deletion&lt;/em&gt;) by default.&lt;/p&gt;

&lt;p&gt;Example of an executor pod owned by its driver pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spark-role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;executor&lt;/span&gt;
  &lt;span class="na"&gt;ownerReferences&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
      &lt;span class="na"&gt;controller&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi-routine-0245dc3d340cd533-driver&lt;/span&gt;
      &lt;span class="na"&gt;uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3b10fa97-c847-4fce-b3e1-71f779cffbef&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Spark Operator automatically sets the value of &lt;code&gt;ownerReference&lt;/code&gt;, at different levels: the custom &lt;code&gt;SparkApplication&lt;/code&gt; resource owns the driver pod which owns its executors.&lt;/p&gt;

&lt;p&gt;For applications that are submitted &lt;em&gt;natively&lt;/em&gt; (without the Spark Operator), the highest level owner object is the driver pod: the executor pods automatically set the &lt;code&gt;ownerReference&lt;/code&gt; field, pointing to the driver pod. But we must manage the ownership relationship ourselves for the other &lt;code&gt;ConfigMap&lt;/code&gt;, &lt;code&gt;Service&lt;/code&gt; and &lt;code&gt;Ingress&lt;/code&gt; resources.&lt;br&gt;
For this, we must retrieve the auto-generated &lt;code&gt;uid&lt;/code&gt; of the newly created driver pod and inject it into the dependent objects: it is impossible to manually set the &lt;code&gt;uid&lt;/code&gt; in the YAML definition files, this can only be done at runtime through code (and that's why we cannot put all the resources in a single YAML file).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;binascii&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;listdir&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes.client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApiClient&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yaml_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;yaml_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;safe_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Configs can be set in Configuration class directly or using helper utility
&lt;/span&gt;    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_kube_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path/to/kubeconfig_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;name_suffix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;binascii&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b2a_hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urandom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;priority_class_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;routine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;env_subst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${NAMESPACE}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${SERVICE_ACCOUNT_NAME}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;driver-sa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${DRIVER_NODE_AFFINITIES}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;driver&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${EXECUTOR_NODE_AFFINITIES}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${NAME_SUFFIX}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name_suffix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${PRIORITY_CLASS_NAME}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;priority_class_name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;k8s_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApiClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="c1"&gt;# Create driver pod
&lt;/span&gt;    &lt;span class="n"&gt;k8s_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k8s/spark-native&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-driver-pod.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k8s_objects&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Prepare ownership on dependent objects
&lt;/span&gt;    &lt;span class="n"&gt;owner_refs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apiVersion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;controller&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kind&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pod&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;k8s_objects&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;k8s_objects&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;

    &lt;span class="c1"&gt;# List all YAML files in k8s/spark-native directory, except the driver pod definition file
&lt;/span&gt;    &lt;span class="n"&gt;other_resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;other_resources&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-driver-pod.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;other_resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_k8s_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;env_subst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Set ownership
&lt;/span&gt;        &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ownerReferences&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;owner_refs&lt;/span&gt;
        &lt;span class="nf"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k8s_object_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Submitted %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k8s_objects&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;app-name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Applications normally completed
&lt;/h2&gt;

&lt;p&gt;When an application completes normally, the executor pods terminate and are cleaned up, but the driver pod persists logs and remains in "completed" state in the Kubernetes API "until it's eventually garbage collected or cleaned up manually".&lt;/p&gt;

&lt;p&gt;Note that in the completed state, the driver pod does not use any compute or memory resources.&lt;/p&gt;

&lt;p&gt;The Spark Operator has TTL support for &lt;code&gt;SparkApplications&lt;/code&gt; through the optional field named &lt;code&gt;.spec.timeToLiveSeconds&lt;/code&gt;, which if set, defines the Time-To-Live (TTL) duration in seconds for a &lt;code&gt;SparkAplication&lt;/code&gt; after its termination. The &lt;code&gt;SparkApplication&lt;/code&gt; object will be garbage collected if the current time is more than the &lt;code&gt;.spec.timeToLiveSeconds&lt;/code&gt; since its termination. The example below illustrates how to use the field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;timeToLiveSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;86400&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the native Spark side, there is nothing in the doc that specifies how driver pods are ultimately deleted. We could set up a simple Kubernetes &lt;a href="https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/" rel="noopener noreferrer"&gt;&lt;code&gt;CronJob&lt;/code&gt;&lt;/a&gt; that would run periodically to delete them automatically.&lt;/p&gt;

&lt;p&gt;At the time of writing this article, there are pending requests in Kubernetes to support TTL in &lt;code&gt;Pods&lt;/code&gt; like in &lt;code&gt;Jobs&lt;/code&gt;: &lt;em&gt;"TTL controller only handles Jobs for now, and may be expanded to handle other resources that will finish execution, such as Pods and custom resources."&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Background cascading deletion
&lt;/h2&gt;

&lt;p&gt;When killing an application from the Python code, we delete the owner object using the &lt;em&gt;Background&lt;/em&gt; cascading deletion policy.&lt;br&gt;
In background cascading deletion, Kubernetes deletes the owner object immediately and the garbage collector then deletes the dependents in the background. This is useful so as not to delay the main execution thread.&lt;/p&gt;

&lt;p&gt;To delete a Spark job launched by &lt;code&gt;spark-submit&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;


&lt;span class="n"&gt;core_v1_api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoreV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;core_v1_api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_namespaced_pod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;driver-pod-name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;propagation_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Background&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To delete a Spark job launched with the Spark Operator, we must delete the enclosing &lt;code&gt;SparkApplication&lt;/code&gt; resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;

&lt;span class="n"&gt;custom_object_api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CustomObjectsApi&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;custom_object_api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_namespaced_custom_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkoperator.k8s.io&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v1beta2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;plural&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkapplications&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;app_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;propagation_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Background&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Pod priority &amp;amp; preemption
&lt;/h1&gt;

&lt;p&gt;Whether it is Volcano or the default &lt;code&gt;kube-scheduler&lt;/code&gt;, job preemption relies on job priorities. For two jobs, the scheduler decides whose priority is higher by comparing &lt;code&gt;.spec.priorityClassName&lt;/code&gt; (then &lt;code&gt;createTime&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The priority is propagated to driver and executor pods, whether with native spark-submit or with Spark Operator, and regardless of the node affinities.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to know which pods have been preempted
&lt;/h2&gt;

&lt;p&gt;You can retrieve high-level information on what is happening in the cluster. To list all events in the namespace &lt;code&gt;spark-jobs&lt;/code&gt; you can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List Events sorted by timestamp&lt;/span&gt;
kubectl get events &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.metadata.creationTimestamp &lt;span class="nt"&gt;--namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;spark-jobs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;pre&gt;
LAST SEEN   TYPE      REASON             OBJECT                                                   MESSAGE
69s         Normal    Scheduled          podgroup/podgroup-6083b4f9-f240-4a0e-95f2-06882aec2942   pod group is ready
93s         Normal    Scheduled          pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Successfully assigned spark-jobs/pyspark-pi-driver-routine-bf20cae50b6a8253 to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-jpfp
92s         Normal    Started            pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Started container pyspark-pi
92s         Normal    Pulled             pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
92s         Normal    Created            pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Created container pyspark-pi
82s         Normal    Scheduled          pod/pythonpi-34b3597593d246a1-exec-2                     Successfully assigned spark-jobs/pythonpi-34b3597593d246a1-exec-2 to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-pvdl
82s         Normal    Scheduled          pod/pythonpi-34b3597593d246a1-exec-1                     Successfully assigned spark-jobs/pythonpi-34b3597593d246a1-exec-1 to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-wxck
82s         Normal    Created            pod/pythonpi-34b3597593d246a1-exec-1                     Created container spark-kubernetes-executor
82s         Normal    Started            pod/pythonpi-34b3597593d246a1-exec-1                     Started container spark-kubernetes-executor
82s         Normal    Pulled             pod/pythonpi-34b3597593d246a1-exec-1                     Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
81s         Normal    Created            pod/pythonpi-34b3597593d246a1-exec-2                     Created container spark-kubernetes-executor
81s         Normal    Pulled             pod/pythonpi-34b3597593d246a1-exec-2                     Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
80s         Normal    Started            pod/pythonpi-34b3597593d246a1-exec-2                     Started container spark-kubernetes-executor
42s         Normal    Killing            pod/pythonpi-34b3597593d246a1-exec-1                     Stopping container spark-kubernetes-executor
42s         Normal    Killing            pod/pythonpi-34b3597593d246a1-exec-2                     Stopping container spark-kubernetes-executor
42s         Warning   FailedScheduling   pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              all nodes are unavailable: 3 node(s) resource fit failed.
&lt;b&gt;42s         Normal    Killing            pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Stopping container pyspark-pi
42s         Warning   Evict              pod/pyspark-pi-driver-routine-bf20cae50b6a8253           Pod is evicted, because of preempt&lt;/b&gt;
34s         Warning   Unschedulable      podgroup/podgroup-ee4b9210-35c2-4d68-841a-2daf7712a816   0/1 tasks in gang unschedulable: pod group is not ready, 1 Pipelined, 1 minAvailable.
41s         Warning   FailedScheduling   pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              1/1 tasks in gang unschedulable: pod group is not ready, 1 Pipelined, 1 minAvailable.
18s         Normal    Scheduled          podgroup/podgroup-ee4b9210-35c2-4d68-841a-2daf7712a816   pod group is ready
33s         Normal    Scheduled          pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              Successfully assigned spark-jobs/pyspark-pi-driver-rush-bb25fc0b8efe7c4d to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-jpfp
32s         Normal    Started            pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              Started container pyspark-pi
32s         Normal    Created            pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              Created container pyspark-pi
32s         Normal    Pulled             pod/pyspark-pi-driver-rush-bb25fc0b8efe7c4d              Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
22s         Normal    Scheduled          pod/pythonpi-f36cce7593d332f1-exec-1                     Successfully assigned spark-jobs/pythonpi-f36cce7593d332f1-exec-1 to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-wxck
22s         Normal    Scheduled          pod/pythonpi-f36cce7593d332f1-exec-2                     Successfully assigned spark-jobs/pythonpi-f36cce7593d332f1-exec-2 to gke-yippee-spark-k8s-clus-default-pool-0b72dd1d-pvdl
22s         Normal    Pulled             pod/pythonpi-f36cce7593d332f1-exec-1                     Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
21s         Normal    Started            pod/pythonpi-f36cce7593d332f1-exec-1                     Started container spark-kubernetes-executor
21s         Normal    Created            pod/pythonpi-f36cce7593d332f1-exec-1                     Created container spark-kubernetes-executor
21s         Normal    Pulled             pod/pythonpi-f36cce7593d332f1-exec-2                     Container image "eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1" already present on machine
21s         Normal    Created            pod/pythonpi-f36cce7593d332f1-exec-2                     Created container spark-kubernetes-executor
21s         Normal    Started            pod/pythonpi-f36cce7593d332f1-exec-2                     Started container spark-kubernetes-executor
&lt;/pre&gt;

&lt;p&gt;In the output above, we can see that the pod &lt;code&gt;pyspark-pi-driver-routine-bf20cae50b6a8253&lt;/code&gt; has been "evicted because of preempt" by another job with the "rush" priority.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future work
&lt;/h2&gt;

&lt;p&gt;Kubernetes provides containers with lifecycle hooks. The hooks enable containers to be aware of events in their management lifecycle and run code implemented in a handler when the corresponding lifecycle hook is executed.&lt;/p&gt;

&lt;p&gt;In particular, the &lt;code&gt;PreStop&lt;/code&gt; hook can be called immediately before a container is terminated due to preemption (among other events).&lt;br&gt;
Thus, we can consider an action, whatever it is, to be triggered in case of preemption. All you need to do is implement and register a handler for this hook.&lt;/p&gt;

&lt;p&gt;See &lt;a href="https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/" rel="noopener noreferrer"&gt;Container Lifecycle Hooks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We are soon coming to the end of our journey. Before wishing you good night, let's take a look at how we can monitor a Spark application from Python code.&lt;/p&gt;
&lt;h1&gt;
  
  
  Monitoring
&lt;/h1&gt;
&lt;h2&gt;
  
  
  Getting the status of a Spark application
&lt;/h2&gt;

&lt;p&gt;Many operations in the Kubernetes Python client can be &lt;em&gt;watched&lt;/em&gt;. This allows our Python program to watch for changes in a specific resource until you get the desired result or the watch expires.&lt;/p&gt;

&lt;p&gt;Here, we want to monitor the lifecycle of the pod driver, starting in the &lt;code&gt;Pending&lt;/code&gt; phase, moving through &lt;code&gt;Running&lt;/code&gt; if the Spark container starts OK, and then through either the &lt;code&gt;Succeeded&lt;/code&gt; or &lt;code&gt;Failed&lt;/code&gt; phases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;

&lt;span class="n"&gt;app_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-routine-bf20cae50b6a8253&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoreV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Watch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;label_selector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app-name=%s,spark-role=driver&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;app_name&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;list_namespaced_pod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label_selector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;label_selector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Event: %s&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, the driver pod (hence, the Spark application) is expected to complete, successfully or not, within 60 seconds or less. Its status should only change twice during this period: ideally &lt;code&gt;Pending&lt;/code&gt; &amp;gt; &lt;code&gt;Running&lt;/code&gt; &amp;gt; &lt;code&gt;Succeeded&lt;/code&gt;. &lt;/p&gt;

&lt;h2&gt;
  
  
  Getting the logs
&lt;/h2&gt;

&lt;p&gt;It is possible to retrieve the logs of the driver pod and mix them into those of the host application.&lt;br&gt;
Getting the logs is just as easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Thread&lt;/span&gt;

&lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CoreV1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;pod_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-routine-bf20cae50b6a8253-driver&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;logs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pod_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Watch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_namespaced_pod_log&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pod_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pod_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;spark-jobs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_request_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;

&lt;span class="c1"&gt;# We surely don't want to block the main thread while reading the logs
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;logs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pod_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Getting the Ingress URI
&lt;/h2&gt;

&lt;p&gt;Once a Spark application has started, the ingress (at least, its public IP address) that exposes the Spark UI may take a while before it becomes available. Here too, we can monitor the Ingress resource as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;

&lt;span class="n"&gt;app_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pyspark-pi-routine-bf20cae50b6a8253&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="n"&gt;networking_v1_beta1_api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;NetworkingV1beta1Api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Watch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;label_selector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app-name=%s&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;app_name&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;networking_v1_beta1_api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;list_namespaced_ingress&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;label_selector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;label_selector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ingress&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_balancer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ingress&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ingress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;external_ip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ingress&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;ip&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Event: The Spark Web UI is available at http://%s/%s&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;external_ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;app_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Event: Ingress not yet available&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;What a hell of a journey!&lt;/p&gt;

&lt;p&gt;We have seen that the Python code for launching or deleting Spark applications is slightly different depending on whether we are using the Spark Operator or spark-submit. But since we name and label the Kubernetes objects consistently between the two, and as we set the ownership relationships properly, we can monitor our Spark applications and manage their lifecycle equally. &lt;/p&gt;

&lt;p&gt;Would you like to know more? The Python scripts explained in this last article are available in this &lt;a href="https://github.com/pgillet/k8s-python-client-examples" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. Serve yourself.&lt;/p&gt;

</description>
      <category>spark</category>
      <category>kubernetes</category>
      <category>python</category>
    </item>
    <item>
      <title>My Journey With Spark On Kubernetes... In Python (2/3)</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Mon, 12 Apr 2021 08:10:36 +0000</pubDate>
      <link>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-2-3-5e8j</link>
      <guid>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-2-3-5e8j</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/stack-labs/my-journey-with-spark-on-kubernetes-in-python-1-3-4nl3"&gt;previous article&lt;/a&gt;, we saw how to launch Spark applications with the Spark Operator. In this article, we'll see how to do the same thing, but natively with spark-submit. Let's first explain the differences between the two ways of deploying your driver on the worker nodes.&lt;/p&gt;

&lt;h1&gt;
  
  
  Client vs Cluster Mode
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Spark-submit in cluster mode
&lt;/h2&gt;

&lt;p&gt;In &lt;em&gt;cluster mode&lt;/em&gt;, your application is submitted from a machine far from the worker machines (e.g. locally on your laptop). You need a Spark distribution installed on this machine to be able to actually run the &lt;code&gt;spark-submit&lt;/code&gt; script. &lt;br&gt;
In this mode, the script exits normally as soon as the application has been submitted. The driver is then detached and can run on its own in the kubernetes cluster. You can print the logs of the driver pod with the &lt;code&gt;kubectl logs&lt;/code&gt; command to see the output of the application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fm50tip9stkxemr7jpf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fm50tip9stkxemr7jpf.png" alt="k8s cluster mode" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is possible to use the authenticating &lt;code&gt;kubectl&lt;/code&gt; proxy to communicate to the Kubernetes API.&lt;/p&gt;

&lt;p&gt;The local proxy can be started by:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl proxy &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the local proxy is running at localhost:8001, the remote Kubernetes cluster can be reached by &lt;code&gt;spark-submit&lt;/code&gt; by specifying &lt;code&gt;--master k8s://http://127.0.0.1:8001&lt;/code&gt; as an argument to &lt;code&gt;spark-submit&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Spark-submit in client mode
&lt;/h2&gt;

&lt;p&gt;In &lt;em&gt;client mode&lt;/em&gt;, the &lt;code&gt;spark-submit&lt;/code&gt; command is directly passed with its arguments to the Spark container in the driver pod. With the &lt;code&gt;deploy-mode&lt;/code&gt; option set to &lt;code&gt;client&lt;/code&gt;, the driver is launched directly within the &lt;code&gt;spark-submit&lt;/code&gt; process which acts as a client to the cluster. The input and output of the application are attached to the logs from the pod.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnqt2jv1t0pnshvxiy5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnqt2jv1t0pnshvxiy5p.png" alt="k8s client mode" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Does What?
&lt;/h2&gt;

&lt;p&gt;With "native" Spark, we will execute Spark applications in client mode, so as not to depend on a local Spark distribution. Specifically, the user creates a driver pod resource with &lt;code&gt;kubectl&lt;/code&gt;, and the driver pod will then run &lt;code&gt;spark-submit&lt;/code&gt; in client mode internally to run the driver program.&lt;/p&gt;

&lt;p&gt;With Spark Operator, a &lt;code&gt;SparkApplication&lt;/code&gt; should set &lt;code&gt;.spec.deployMode&lt;/code&gt; to &lt;code&gt;cluster&lt;/code&gt;, as &lt;code&gt;client&lt;/code&gt; is not currently implemented. Behind the scenes, the behavior is exactly the same as with native Spark: the operator's controller thus embeds a Spark distribution which plays the role of Spark scheduler; driver pods are spawned from the controller... and then run &lt;code&gt;spark-submit&lt;/code&gt; in client mode internally to run the driver program. But this is globally transparent for the end user.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8bwquafdakm4ea310ha.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8bwquafdakm4ea310ha.png" alt="Spark Operator architecture diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Additional details of how &lt;code&gt;SparkApplications&lt;/code&gt; are run can be found in the &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/design.md#architecture" rel="noopener noreferrer"&gt;design documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Driver Pod
&lt;/h1&gt;

&lt;p&gt;With native Spark, the main resource is the driver pod.&lt;br&gt;
To run the Pi example program like with the Spark Operator, the driver pod must be created using the data in the following YAML file:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pyspark-pi-driver-pod.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
    &lt;span class="na"&gt;spark-role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-driver&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${NAMESPACE}&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1&lt;/span&gt;
    &lt;span class="na"&gt;imagePullPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;IfNotPresent&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5678&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;headless-svc&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4040&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-ui&lt;/span&gt;
    &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;512Mi&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1200m&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Overriding configuration directory&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SPARK_CONF_DIR&lt;/span&gt;
      &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/spark-conf&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SPARK_HOME&lt;/span&gt;
      &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/opt/spark&lt;/span&gt;
    &lt;span class="c1"&gt;# Configure all key-value pairs in ConfigMap as container environment variables&lt;/span&gt;
    &lt;span class="na"&gt;envFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;configMapRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-cm&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;$(SPARK_HOME)/bin/spark-submit&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/opt/spark/examples/src/main/python/pi.py&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10"&lt;/span&gt;
    &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-config&lt;/span&gt;
        &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/spark-conf&lt;/span&gt;
        &lt;span class="na"&gt;readOnly&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;affinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;nodeAffinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;requiredDuringSchedulingIgnoredDuringExecution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;nodeSelectorTerms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type&lt;/span&gt;
              &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;In&lt;/span&gt;
              &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="pi"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;DRIVER_NODE_AFFINITIES&lt;/span&gt;&lt;span class="pi"&gt;}]&lt;/span&gt;
  &lt;span class="na"&gt;priorityClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${PRIORITY_CLASS_NAME}&lt;/span&gt;
  &lt;span class="na"&gt;restartPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OnFailure&lt;/span&gt;
  &lt;span class="na"&gt;schedulerName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;volcano&lt;/span&gt;
  &lt;span class="na"&gt;serviceAccountName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${SERVICE_ACCOUNT_NAME}&lt;/span&gt;
  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Add the executor pod template in read-only volume, for the driver to read&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-config&lt;/span&gt;
      &lt;span class="na"&gt;configMap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-cm&lt;/span&gt;
        &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-defaults.conf&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-defaults.conf&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-env.sh&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-env.sh&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;executor-pod-template.yaml&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;executor-pod-template.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Don't pay attention to the variable placeholders for now&lt;/strong&gt;, even though you can imagine what they can be used for.&lt;/p&gt;

&lt;p&gt;As you can see, the driver pod depends on other Kubernetes resources. We will detail each of them.&lt;/p&gt;

&lt;h1&gt;
  
  
  Spark Configuration
&lt;/h1&gt;

&lt;p&gt;We use a &lt;a href="https://kubernetes.io/docs/concepts/configuration/configmap/" rel="noopener noreferrer"&gt;ConfigMap&lt;/a&gt; for setting Spark configuration data separately from the driver pod definition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ConfigMap&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-cm&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${NAMESPACE}&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Comma-separated list of .zip, .egg, or .py files dependencies for Python apps.&lt;/span&gt;
  &lt;span class="c1"&gt;# spark.submit.pyFiles can be used instead in spark-defaults.conf below.&lt;/span&gt;
  &lt;span class="c1"&gt;# PYTHONPATH: ...&lt;/span&gt;
  &lt;span class="na"&gt;spark-env.sh&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;#!/usr/bin/env bash&lt;/span&gt;

    &lt;span class="s"&gt;export DUMMY=dummy&lt;/span&gt;
    &lt;span class="s"&gt;echo "Here we are, ${DUMMY}!"&lt;/span&gt;
  &lt;span class="na"&gt;spark-defaults.conf&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;spark.master                                k8s://https://kubernetes.default&lt;/span&gt;
    &lt;span class="s"&gt;spark.submit.deployMode                     client&lt;/span&gt;

    &lt;span class="s"&gt;spark.executor.instances                    2&lt;/span&gt;
    &lt;span class="s"&gt;spark.executor.cores                        1&lt;/span&gt;
    &lt;span class="s"&gt;spark.executor.memory                       512m&lt;/span&gt;

    &lt;span class="s"&gt;spark.kubernetes.executor.container.image   eu.gcr.io/yippee-spark-k8s/spark-py:3.0.1&lt;/span&gt;
    &lt;span class="s"&gt;spark.kubernetes.container.image.pullPolicy IfNotPresent&lt;/span&gt;
    &lt;span class="s"&gt;spark.kubernetes.namespace                  spark-jobs&lt;/span&gt;
    &lt;span class="s"&gt;# Must match the mount path of the ConfigMap volume in driver pod&lt;/span&gt;
    &lt;span class="s"&gt;spark.kubernetes.executor.podTemplateFile   /spark-conf/executor-pod-template.yaml&lt;/span&gt;
    &lt;span class="s"&gt;spark.kubernetes.pyspark.pythonVersion      2&lt;/span&gt;
    &lt;span class="s"&gt;spark.kubernetes.driver.pod.name            spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-driver&lt;/span&gt;

    &lt;span class="s"&gt;spark.driver.host                           spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-driver-svc&lt;/span&gt;
    &lt;span class="s"&gt;spark.driver.port                           5678&lt;/span&gt;

    &lt;span class="s"&gt;# Config params for use with an ingress to expose the Web UI&lt;/span&gt;
    &lt;span class="s"&gt;spark.ui.proxyBase                          /spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
    &lt;span class="s"&gt;# spark.ui.proxyRedirectUri                   http://&amp;lt;load balancer static IP address&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;executor-pod-template.yaml&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;apiVersion: v1&lt;/span&gt;
    &lt;span class="s"&gt;kind: Pod&lt;/span&gt;
    &lt;span class="s"&gt;metadata:&lt;/span&gt;
      &lt;span class="s"&gt;labels:&lt;/span&gt;
        &lt;span class="s"&gt;app-name: spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
        &lt;span class="s"&gt;spark-role: executor&lt;/span&gt;
    &lt;span class="s"&gt;spec:&lt;/span&gt;
      &lt;span class="s"&gt;affinity:&lt;/span&gt;
        &lt;span class="s"&gt;nodeAffinity:&lt;/span&gt;
          &lt;span class="s"&gt;requiredDuringSchedulingIgnoredDuringExecution:&lt;/span&gt;
            &lt;span class="s"&gt;nodeSelectorTerms:&lt;/span&gt;
            &lt;span class="s"&gt;- matchExpressions:&lt;/span&gt;
              &lt;span class="s"&gt;- key: type&lt;/span&gt;
                &lt;span class="s"&gt;operator: In&lt;/span&gt;
                &lt;span class="s"&gt;values: [${EXECUTOR_NODE_AFFINITIES}]&lt;/span&gt;
      &lt;span class="s"&gt;priorityClassName: ${PRIORITY_CLASS_NAME}&lt;/span&gt;
      &lt;span class="s"&gt;schedulerName: volcano&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Executor Pod Template
&lt;/h2&gt;

&lt;p&gt;We use a template file in the &lt;code&gt;ConfigMap&lt;/code&gt; to define the executor pod configuration. &lt;a href="https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template" rel="noopener noreferrer"&gt;Templates&lt;/a&gt; are typically used for configuring Spark pods that cannot be configured otherwise, via Spark properties or environment variables. Thus, template files mostly contain fine-grained configuration related to deployment at the Kubernetes level: here, the node affinity and the priority class name.&lt;/p&gt;

&lt;p&gt;To make the pod template file accessible to the spark-submit process, we must set the Spark property &lt;code&gt;spark.kubernetes.executor.podTemplateFile&lt;/code&gt; with its local pathname in the driver pod. To do so, the file will be automatically mounted onto a volume in the driver pod when it’s created.&lt;/p&gt;

&lt;h2&gt;
  
  
  Executor Pod Garbage Collection
&lt;/h2&gt;

&lt;p&gt;We must also set &lt;code&gt;spark.kubernetes.driver.pod.name&lt;/code&gt; for the executors to the name of the driver pod. When this property is set, the Spark scheduler will deploy the executor pods with an &lt;code&gt;ownerReference&lt;/code&gt;, which in turn will ensure that once the driver pod is deleted from the cluster, all of the application’s executor pods will also be deleted.&lt;/p&gt;

&lt;h1&gt;
  
  
  Client Mode Networking
&lt;/h1&gt;

&lt;p&gt;In client mode, the driver runs inside a pod. Spark executors must be able to connect to the Spark driver by the means of Kubernetes networking. To do that, we use a &lt;a href="https://kubernetes.io/docs/concepts/services-networking/service/#headless-services" rel="noopener noreferrer"&gt;headless&lt;/a&gt; service to allow the driver pod to be routable from the executors by a stable hostname. When deploying the headless service, we ensure that the service will only match the driver pod and no other pods by assigning the driver pod a (sufficiently) unique label and by using that label in the &lt;code&gt;label selector&lt;/code&gt; of the headless service. We can then pass to the executors the driver’s hostname via &lt;code&gt;spark.driver.host&lt;/code&gt; with the service name and the spark driver’s port to &lt;code&gt;spark.driver.port&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;spark-driver-svc.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-driver-svc&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;clusterIP&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;None&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5678-5678&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5678&lt;/span&gt;
    &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5678&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
    &lt;span class="na"&gt;spark-role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Spark UI
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Ingress
&lt;/h2&gt;

&lt;p&gt;The Spark UI is accessible by creating a service of type &lt;code&gt;ClusterIP&lt;/code&gt; which exposes the UI from the driver pod:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;spark-ui-svc.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-ui-svc&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-driver-ui-port&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4040&lt;/span&gt;
    &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4040&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
    &lt;span class="na"&gt;spark-role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this service alone, the UI is only accessible from inside the cluster. We must then create an &lt;code&gt;Ingress&lt;/code&gt; to expose the UI outside the cluster. In order for the Ingress resource to work, the cluster must have an ingress controller &lt;br&gt;
running. You can choose the &lt;a href="https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/" rel="noopener noreferrer"&gt;ingress controller implementation&lt;/a&gt; that best fits your cluster. &lt;br&gt;
&lt;a href="https://doc.traefik.io/traefik/providers/kubernetes-ingress/" rel="noopener noreferrer"&gt;Traefik&lt;/a&gt; and &lt;a href="https://kubernetes.github.io/ingress-nginx/" rel="noopener noreferrer"&gt;Nginx&lt;/a&gt; are very popular choices. The ingress below is configured for Nginx: &lt;/p&gt;

&lt;p&gt;&lt;code&gt;spark-ui-ingress.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1beta1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-ui-ingress&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;kubernetes.io/ingress.class&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/rewrite-target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/$2&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/proxy-redirect-from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://$host/"&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/proxy-redirect-to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://$host/spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}/"&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}(/|$)(.*)&lt;/span&gt;
        &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;serviceName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-${PRIORITY_CLASS_NAME}${NAME_SUFFIX}-ui-svc&lt;/span&gt;
          &lt;span class="na"&gt;servicePort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4040&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Ingress is backed by as many services as driver pods that run concurrently in the cluster. Each UI must then be addressed with a unique &lt;code&gt;path&lt;/code&gt; in the Ingress. The path in question is directly derived from the name of the Spark application, with its unique ID. &lt;/p&gt;

&lt;p&gt;The interface is always available at &lt;code&gt;http://&amp;lt;driver-pod-ip&amp;gt;:4040&lt;/code&gt; (from inside the cluster), as each driver pod is assigned a unique IP and there is only one &lt;code&gt;SparkContext&lt;/code&gt; running in the pod host (otherwise, the UI would be bound to successive ports beginning with 4040: 4041, 4042, etc).&lt;/p&gt;

&lt;p&gt;So, natively, the different UIs that might be available at a given time on a same host are only differentiated by their assigned port number. The UI has been designed following that principle, and all HTTP operations are performed relatively to the same root path in the URL, which is...&lt;code&gt;/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So for the UI to be effectively accessible through the Ingress, we must set up HTTP redirection with an alternative root path (the one configured in the Ingress). To work smoothly, the UI itself must be aware of this redirection by setting &lt;code&gt;spark.ui.proxyBase&lt;/code&gt; to this root path...and that's it!&lt;/p&gt;

&lt;h3&gt;
  
  
  Spark Operator
&lt;/h3&gt;

&lt;p&gt;The operator supports creating an optional Ingress for the Spark Web UI. This can be turned on by setting the &lt;code&gt;ingress-url-format&lt;/code&gt; command-line flag. The &lt;code&gt;ingress-url-format&lt;/code&gt; should be a template like &lt;code&gt;{{$appName}}.{ingress_suffix}/{{$appNamespace}}/{{$appName}}&lt;/code&gt;. The &lt;code&gt;{ingress_suffix}&lt;/code&gt; should be replaced by the user to indicate the cluster's Ingress url and the operator will replace the &lt;code&gt;{{$appName}}&lt;/code&gt; and &lt;code&gt;{{$appNamespace}}&lt;/code&gt; with the appropriate value. Please note that Ingress support requires that cluster's Ingress url routing is correctly setup. For e.g. if the &lt;code&gt;ingress-url-format&lt;/code&gt; is &lt;code&gt;{{$appName}}.ingress.cluster.com&lt;/code&gt;, it requires that anything &lt;code&gt;*.ingress.cluster.com&lt;/code&gt; should be routed to the Ingress controller on the K8s cluster.&lt;/p&gt;

&lt;p&gt;Example of Spark Operator install with the &lt;code&gt;ingress-url-format&lt;/code&gt; command-line flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./helm &lt;span class="nb"&gt;install &lt;/span&gt;spark-operator incubator/sparkoperator &lt;span class="nt"&gt;--namespace&lt;/span&gt; spark-operator &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;enableWebhook&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;enableBatchScheduler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;ingressUrlFormat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\{\{\$&lt;/span&gt;&lt;span class="s2"&gt;appName&lt;/span&gt;&lt;span class="se"&gt;\}\}&lt;/span&gt;&lt;span class="s2"&gt;.ingress.cluster.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that curly braces must be escaped.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Work
&lt;/h2&gt;

&lt;p&gt;I could not enable the Spark Operator Ingress during my experiments, simply because a DNS name is required. Instead, the same Ingress as for native Spark is "grafted" to the &lt;code&gt;SparkApplication&lt;/code&gt;, with path-based routing.&lt;/p&gt;

&lt;p&gt;The operation proposed by the Spark Operator, with routing based on hostname wildcards (for example &lt;code&gt;*.ingress.cluster.com&lt;/code&gt;), is nevertheless interesting as it would overcome the problem of HTTP redirect described above.&lt;/p&gt;

&lt;p&gt;With hostname wildcards, and therefore without the HTTP redirect, the UI service could be switched to &lt;code&gt;NodePort&lt;/code&gt; type (a NodePort service exposes the Service on each Node's IP at a static port) and still be compatible with the Ingress.&lt;br&gt;
The UI would thus be accessible outside the cluster through both the Ingress with its external URL configured, and the &lt;code&gt;NodePort&lt;/code&gt; service at &lt;code&gt;http://&amp;lt;node-ip&amp;gt;:&amp;lt;service-port&amp;gt;&lt;/code&gt;. A service of type &lt;code&gt;NodePort&lt;/code&gt; is still relevant in a private cloud.&lt;/p&gt;

&lt;h1&gt;
  
  
  Object Names and Labels
&lt;/h1&gt;

&lt;p&gt;We use the &lt;code&gt;app-name&lt;/code&gt; label to semantically group all the Kubernetes resources related to a single Spark application.&lt;br&gt;
In our case, this label provides uniqueness, and we do not expect multiple Spark applications to carry the same value for this label (at least in the &lt;em&gt;namespace-time&lt;/em&gt; considered).&lt;/p&gt;

&lt;p&gt;For a given Spark application, all the object names are then derived from the &lt;code&gt;app-name&lt;/code&gt;. We simply add a suffix which also qualifies the type of the object: &lt;code&gt;-driver&lt;/code&gt; for the driver pod, &lt;code&gt;-driver-svc&lt;/code&gt; for the driver service, &lt;code&gt;-ui-svc&lt;/code&gt; for the Spark UI service, &lt;code&gt;-ui-ingress&lt;/code&gt; for the Spark UI ingress, and &lt;code&gt;-cm&lt;/code&gt; for the ConfigMap.&lt;/p&gt;

&lt;p&gt;We also set the label &lt;code&gt;spark-role&lt;/code&gt; at the pod level to differentiate the drivers from their executors.&lt;/p&gt;

&lt;p&gt;This naming and labeling is consistent with the Spark Operator. As a result, Spark applications can be treated and filtered in the same way, whether they are launched with the Spark Operator or with spark-submit. 👍&lt;/p&gt;

&lt;h1&gt;
  
  
  Deployment
&lt;/h1&gt;

&lt;p&gt;We have now managed to mimic the same behavior we got with the Spark Operator, here is what we have deployed in Kubernetes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63lp45g1c8bh3tf9rprx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63lp45g1c8bh3tf9rprx.png" alt="spark submit" width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we've defined all of our Kubernetes resources for spark-submit, we're going to get our hands on some Python code to orchestrate all of this. See you in the &lt;a href="https://dev.to/stack-labs/my-journey-with-spark-on-kubernetes-in-python-3-3-536e"&gt;third and final article&lt;/a&gt; in this series.&lt;/p&gt;

</description>
      <category>spark</category>
      <category>kubernetes</category>
      <category>python</category>
    </item>
    <item>
      <title>My Journey With Spark On Kubernetes... In Python (1/3)</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Mon, 12 Apr 2021 08:09:38 +0000</pubDate>
      <link>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-1-3-4nl3</link>
      <guid>https://forem.com/stack-labs/my-journey-with-spark-on-kubernetes-in-python-1-3-4nl3</guid>
      <description>&lt;p&gt;&lt;em&gt;Je vous parle d'un temps&lt;br&gt;
Que les moins de vingt ans&lt;br&gt;
Ne peuvent pas connaître&lt;/em&gt; 🎶&lt;/p&gt;

&lt;p&gt;Until not long ago, the way to go to run Spark on a cluster was either with Spark's own standalone cluster manager, Mesos or YARN. In the meantime, the Kingdom of Kubernetes has risen and spread widely.&lt;/p&gt;

&lt;p&gt;And when it comes to run Spark on Kubernetes, you now have two choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Use "native" Spark's Kubernetes capabilities: Spark can run on clusters managed by Kubernetes since Spark 2.3. Kubernetes support was still flagged as experimental until very recently, but as per &lt;a href="https://issues.apache.org/jira/browse/SPARK-33005" rel="noopener noreferrer"&gt;SPARK-33005 Kubernetes GA Preparation&lt;/a&gt;, Spark on Kubernetes is now fully supported and production ready! 🎊&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use the Spark Operator, proposed and maintained by Google, which is still in beta version (and always will be).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This series of 3 articles tells the story of my experiments with both methods, and how I launch Spark applications from Python code.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Cabin crew, arm doors and cross check"&lt;/em&gt;. Let's go! ✈️&lt;/p&gt;


&lt;h1&gt;
  
  
  Prerequisites
&lt;/h1&gt;
&lt;h2&gt;
  
  
  Service Account for Driver Pods
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember, &lt;em&gt;Spark applications run as independent sets of processes on a cluster, coordinated by the &lt;code&gt;SparkContext&lt;/code&gt; object in your main program, called the &lt;code&gt;driver&lt;/code&gt;. Once connected, the SparkContext acquires &lt;code&gt;executors&lt;/code&gt; on nodes in the cluster, which are the processes that run computations and store data for your application&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thus, Spark driver pods need a Kubernetes service account in the pod's namespace that has permissions to create, get, list, and delete executor pods. Below an example RBAC setup that creates a driver service account named &lt;code&gt;driver-sa&lt;/code&gt; in the namespace &lt;code&gt;spark-jobs&lt;/code&gt;, with a RBAC role binding giving the service account the needed permissions.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k8s/driver-sa-rbac.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-role&lt;/span&gt;
&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;services"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RoleBinding&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-role-binding&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;subjects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;roleRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-role&lt;/span&gt;
  &lt;span class="na"&gt;apiGroup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create namespace spark-jobs
kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/driver-sa-rbac.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Node Affinity
&lt;/h2&gt;

&lt;p&gt;By default, the scheduler automatically places pods on nodes by ensuring nodes have sufficient free resources, distributing pods evenly across nodes, etc.&lt;br&gt;
But there are circumstances where you may want more control on a node where a pod lands, for example to ensure that a pod ends up on a memory or compute-optimized machine, or with an SSD attached to it.&lt;br&gt;
In our case, Spark executors need more resources than drivers. We thus need to constrain driver pods and executor pods to only be able to run on particular node(s). We will use &lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/" rel="noopener noreferrer"&gt;Node Affinities&lt;/a&gt; with &lt;a href="https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/" rel="noopener noreferrer"&gt;label selectors&lt;/a&gt; to make the selection.&lt;/p&gt;

&lt;p&gt;Execute the following command for the node(s) intended to execute driver pods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl label nodes &amp;lt;node-name&amp;gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;driver
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For executor pods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl label nodes &amp;lt;node-name&amp;gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;compute
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Pod Priority and Preemption
&lt;/h2&gt;

&lt;p&gt;In my project, we aim to run multiple Spark jobs simultaneously in parallel. But some workloads have higher priority than others. If a job cannot be scheduled, the scheduler (here, Volcano) tries to preempt (evict) lower priority Pods to make scheduling of the pending Pod possible.&lt;br&gt;
To use priority and preemption capabilities, we must first create the necessary &lt;code&gt;PriorityClasses&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k8s/priorities.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;routine&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;span class="na"&gt;preemptionPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Never&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Routine&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;priority"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;urgent&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
&lt;span class="na"&gt;preemptionPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Never&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Urgent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;priority"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;exceptional&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
&lt;span class="na"&gt;preemptionPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Never&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Exceptional&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;priority"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rush&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
&lt;span class="na"&gt;preemptionPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PreemptLowerPriority&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rush&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;priority"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/priorities.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, only the PriorityClass "rush" is allowed to preempt lower-priority pods. Pods with other priorities will be placed in the scheduling queue ahead of lower-priority pods, but they cannot preempt other pods. They'll just have to wait until sufficient resources are free to be scheduled.&lt;/p&gt;

&lt;h1&gt;
  
  
  Volcano Scheduler
&lt;/h1&gt;

&lt;p&gt;For our experiments, we will use &lt;a href="https://github.com/volcano-sh/volcano" rel="noopener noreferrer"&gt;Volcano&lt;/a&gt; which is a batch scheduler for Kubernetes, well-suited for scheduling Spark applications pods with a better efficiency than the default &lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/" rel="noopener noreferrer"&gt;kube-scheduler&lt;/a&gt;.&lt;br&gt;
The main reason is that Volcano allows &lt;strong&gt;"group scheduling"&lt;/strong&gt; or &lt;strong&gt;"gang scheduling"&lt;/strong&gt;: while the default scheduler of Kubernetes schedules containers one by one, Volcano ensures that a &lt;em&gt;gang&lt;/em&gt; of related containers (here, the Spark driver and its executors) can be scheduled at the same time. If for any reason it is not possible to deploy all the containers in a gang, Volcano will not schedule that gang. &lt;br&gt;
This &lt;a href="https://www.cncf.io/blog/2021/02/10/three-reasons-why-you-need-volcano/" rel="noopener noreferrer"&gt;article&lt;/a&gt; explains in more detail the reasons for using Volcano.&lt;/p&gt;
&lt;h2&gt;
  
  
  Install Volcano
&lt;/h2&gt;

&lt;p&gt;Install Volcano on the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Enable job preemption
&lt;/h2&gt;

&lt;p&gt;The preempt action is responsible for preemptive scheduling of high priority tasks in the same queue according to priority rules.&lt;/p&gt;

&lt;p&gt;This action is disabled by default in Volcano. To enable job preemption, edit the Volcano configuration as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl edit configmap volcano-scheduler-configmap &lt;span class="nt"&gt;--namespace&lt;/span&gt; volcano-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And add &lt;code&gt;preempt&lt;/code&gt; to the list of &lt;code&gt;actions&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Please edit the object below. Lines beginning with a '#' will be ignored,&lt;/span&gt;
&lt;span class="c1"&gt;# and an empty file will abort the edit. If an error occurs while saving this file will be&lt;/span&gt;
&lt;span class="c1"&gt;# reopened with the relevant failures.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;volcano-scheduler.conf&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;actions: "enqueue, allocate, preempt, backfill"&lt;/span&gt;
    &lt;span class="s"&gt;tiers:&lt;/span&gt;
    &lt;span class="s"&gt;- plugins:&lt;/span&gt;
      &lt;span class="s"&gt;- name: priority&lt;/span&gt;
      &lt;span class="s"&gt;- name: gang&lt;/span&gt;
      &lt;span class="s"&gt;- name: conformance&lt;/span&gt;
    &lt;span class="s"&gt;- plugins:&lt;/span&gt;
      &lt;span class="s"&gt;- name: drf&lt;/span&gt;
      &lt;span class="s"&gt;- name: predicates&lt;/span&gt;
      &lt;span class="s"&gt;- name: proportion&lt;/span&gt;
      &lt;span class="s"&gt;- name: nodeorder&lt;/span&gt;
      &lt;span class="s"&gt;- name: binpack&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ConfigMap&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;...&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that job preemption in Volcano relies on the priority plugin that compares the priorities of two jobs or tasks. For two jobs, it decides whose priority is higher by comparing &lt;code&gt;job.spec.priorityClassName&lt;/code&gt;. For two tasks, it decides whose priority is higher by comparing &lt;code&gt;task.priorityClassName&lt;/code&gt;, &lt;code&gt;task.createTime&lt;/code&gt;, and &lt;code&gt;task.id&lt;/code&gt; in order.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enable Volcano scheduling in your workload
&lt;/h2&gt;

&lt;p&gt;For your workload to be scheduled by Volcano, you just need to set &lt;code&gt;schedulerName: volcano&lt;/code&gt; in your pod's &lt;code&gt;spec&lt;/code&gt; (or &lt;code&gt;batchScheduler: volcano&lt;/code&gt; in the &lt;code&gt;SparkApplication&lt;/code&gt;'s &lt;code&gt;spec&lt;/code&gt; if you use the Spark Operator). By default, the workload is scheduled with the default &lt;code&gt;kube-scheduler&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To be consistent, we will ensure that the same scheduler is used for driver and executor pods.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Spark In Docker
&lt;/h1&gt;

&lt;p&gt;To run Spark applications on Kubernetes ... ehh ... You need a Docker image that embeds a Spark distribution.&lt;br&gt;
This section explains how to build an "official" Spark Docker image and how to run a basic Spark application with it.&lt;/p&gt;
&lt;h2&gt;
  
  
  Spark Docker image
&lt;/h2&gt;

&lt;p&gt;Spark (starting with version 2.3) ships with Dockerfiles that can be used to build different Spark Docker images (and customize them to match an individual application’s needs) to use with a Kubernetes backend. They can be found in the &lt;code&gt;kubernetes/dockerfiles/&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;Spark also ships with a &lt;code&gt;bin/docker-image-tool.sh&lt;/code&gt; script that eases the building (and the publishing) of the Docker images.&lt;/p&gt;

&lt;p&gt;Example usage to build an image with the Python binding (PySpark):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./bin/docker-image-tool.sh &lt;span class="nt"&gt;-t&lt;/span&gt; &amp;lt;tag&amp;gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a local Docker image named &lt;code&gt;spark-py:&amp;lt;tag&amp;gt;&lt;/code&gt;. You can set the &lt;code&gt;&amp;lt;tag&amp;gt;&lt;/code&gt; with the Spark's actual version. &lt;/p&gt;

&lt;h2&gt;
  
  
  Running the Examples
&lt;/h2&gt;

&lt;p&gt;Once the image is bundled, you can launch a Spark application using the &lt;code&gt;bin/spark-submit&lt;/code&gt; script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run Spark locally with two worker threads&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; spark-py:3.0.1 /opt/spark/bin/spark-submit &lt;span class="nt"&gt;--master&lt;/span&gt; &lt;span class="nb"&gt;local&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;2] /opt/spark/examples/src/main/python/pi.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Recommendation
&lt;/h2&gt;

&lt;p&gt;It is strongly recommended starting from an official &lt;em&gt;base&lt;/em&gt; image to create any custom Spark image.&lt;/p&gt;

&lt;h1&gt;
  
  
  Spark Operator
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Configuring and installing the Kubernetes Operator for Apache Spark
&lt;/h2&gt;

&lt;p&gt;In this section, you use &lt;a href="https://github.com/kubernetes/helm" rel="noopener noreferrer"&gt;Helm&lt;/a&gt; to deploy the &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator" rel="noopener noreferrer"&gt;Kubernetes Operator for Apache Spark&lt;/a&gt; from the incubator &lt;a href="https://github.com/helm/charts/tree/master/incubator/sparkoperator" rel="noopener noreferrer"&gt;Chart&lt;/a&gt; repository. Helm is a package manager you can use to configure and deploy Kubernetes apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install Helm
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Download and install the &lt;code&gt;Helm&lt;/code&gt; binary:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget https://get.helm.sh/helm-v3.3.4-linux-amd64.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Unzip the file to your local system:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tar &lt;/span&gt;zxfv helm-v2.12.3-linux-amd64.tar.gz
&lt;span class="nb"&gt;cp &lt;/span&gt;linux-amd64/helm &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Ensure that Helm is properly installed by running the following command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./helm version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Helm is correctly installed, you should see the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;version.BuildInfo&lt;span class="o"&gt;{&lt;/span&gt;Version:&lt;span class="s2"&gt;"v3.3.4"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"a61ce5633af99708171414353ed49547cf05013d"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.14.9"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Install the chart
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
kubectl create namespace spark-operator
helm &lt;span class="nb"&gt;install &lt;/span&gt;spark-operator incubator/sparkoperator &lt;span class="nt"&gt;--namespace&lt;/span&gt; spark-operator &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;enableWebhook&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;enableBatchScheduler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flag &lt;code&gt;enableBatchScheduler=true&lt;/code&gt; enables Volcano. To install the operator with Vocano enabled, you must also install the mutating admission webhook with the flag &lt;code&gt;enableWebhook=true&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now you should see the operator running in the cluster by checking the status of the Helm release:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./helm status spark-operator &lt;span class="nt"&gt;--namespace&lt;/span&gt; spark-operator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  About the Spark Job Namespace and the Service Account for Driver Pods
&lt;/h3&gt;

&lt;p&gt;We did not set a specific value for the Helm chart property &lt;code&gt;sparkJobNamespace&lt;/code&gt; when installing the operator, that means the Spark Operator supports deploying &lt;code&gt;SparkApplications&lt;/code&gt; to all namespaces.&lt;br&gt;
As a consequence, the Spark Operator did not automatically create the service account for driver pods, and we must set up the RBAC for driver pods of our &lt;code&gt;SparkApplications&lt;/code&gt; to be able to manipulate executor pods in a specific namespace.&lt;/p&gt;

&lt;p&gt;See &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/quick-start-guide.md#about-the-spark-job-namespace" rel="noopener noreferrer"&gt;About the Spark Job Namespace&lt;/a&gt; and &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/quick-start-guide.md#about-the-service-account-for-driver-pods" rel="noopener noreferrer"&gt;About the Service Account for Driver Pods&lt;/a&gt; sections for more details.&lt;/p&gt;
&lt;h2&gt;
  
  
  Running the Examples
&lt;/h2&gt;

&lt;p&gt;To run the Spark Pi example provided within the operator, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; examples/spark-py-pi.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Spark-submit vs Spark Operator
&lt;/h1&gt;

&lt;p&gt;Let's take a closer look at the &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/examples/spark-py-pi.yaml" rel="noopener noreferrer"&gt;Pi example&lt;/a&gt; from the Spark Operator. A single YAML file is needed, adapted to our configuration: &lt;code&gt;.metadata.namespace&lt;/code&gt; must be set to "spark-jobs" and &lt;code&gt;.spec.driver.serviceAccount&lt;/code&gt; is set to the name of the service account "driver-sa" previously created.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sparkoperator.k8s.io/v1beta2"&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SparkApplication&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pyspark-pi&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spark-jobs&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;batchScheduler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;volcano&lt;/span&gt;
  &lt;span class="na"&gt;batchSchedulerOptions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;priorityClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;routine&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Python&lt;/span&gt;
  &lt;span class="na"&gt;pythonVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2"&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cluster&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gcr.io/spark-operator/spark-py:v3.0.0"&lt;/span&gt;
  &lt;span class="na"&gt;imagePullPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Always&lt;/span&gt;
  &lt;span class="na"&gt;mainApplicationFile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;local:///opt/spark/examples/src/main/python/pi.py&lt;/span&gt;
  &lt;span class="na"&gt;arguments&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10"&lt;/span&gt;
  &lt;span class="na"&gt;sparkVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0.0"&lt;/span&gt;
  &lt;span class="na"&gt;restartPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OnFailure&lt;/span&gt;
    &lt;span class="na"&gt;onFailureRetries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
    &lt;span class="na"&gt;onFailureRetryInterval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
    &lt;span class="na"&gt;onSubmissionFailureRetries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;onSubmissionFailureRetryInterval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
  &lt;span class="na"&gt;timeToLiveSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;86400&lt;/span&gt;
  &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;affinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;nodeAffinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;requiredDuringSchedulingIgnoredDuringExecution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;nodeSelectorTerms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type&lt;/span&gt;
              &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;In&lt;/span&gt;
              &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;cores&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;coreLimit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1200m"&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512m"&lt;/span&gt;
    &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3.0.0&lt;/span&gt;
    &lt;span class="na"&gt;serviceAccount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver-sa&lt;/span&gt;
  &lt;span class="na"&gt;executor&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;affinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;nodeAffinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;requiredDuringSchedulingIgnoredDuringExecution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;nodeSelectorTerms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type&lt;/span&gt;
              &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;In&lt;/span&gt;
              &lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;compute&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;cores&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;instances&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512m"&lt;/span&gt;
    &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3.0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pretty simple, right!?&lt;/p&gt;

&lt;p&gt;The Spark Operator aims to make specifying and running Spark applications in a cloud-native way and as easy and idiomatic as running other workloads on Kubernetes. It uses Kubernetes &lt;a href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/" rel="noopener noreferrer"&gt;custom resources&lt;/a&gt; for specifying, running, and monitoring Spark applications.&lt;/p&gt;

&lt;p&gt;With the high-level resource &lt;code&gt;SparkApplication&lt;/code&gt;, the operator greatly reduces the boilerplate YAML configuration files and takes care of all the needed plumbing for you: networking between the driver and its executors, garbage collection, pod configuration, access to the driver UI.&lt;/p&gt;

&lt;p&gt;The following diagram shows what is actually deployed in Kubernetes under the hood:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lmiwqxh1nsq4jl69nv2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lmiwqxh1nsq4jl69nv2.png" alt="spark operator" width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In use, the operator is way much easier than &lt;code&gt;spark-submit&lt;/code&gt;. But spark-submit is definitely not going away and is still the Spark native way of launching applications. &lt;em&gt;"In the long term, for application submission, the operator will not semantically nor functionally diverge from spark-submit and will always use it under the hood"&lt;/em&gt;. More importantly, &lt;em&gt;"the spark-submit script use all of Spark’s supported cluster managers through a uniform interface so you don’t have to configure your application especially for each one"&lt;/em&gt; (see &lt;a href="https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/225" rel="noopener noreferrer"&gt;here&lt;/a&gt;). Still, that shouldn't prevent the Apache Spark project from developing its own operator in my opinion.&lt;/p&gt;

&lt;p&gt;Eventually, choosing between the Spark Operator and spark-submit is a matter of if you are more &lt;em&gt;Kubernetes&lt;/em&gt;-centric and you run Spark workloads among other types of workloads, or you do Spark first, and Kubernetes is just a mean to allocate resources on a cluster.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://dev.to/stack-labs/my-journey-with-spark-on-kubernetes-in-python-2-3-5e8j"&gt;following article&lt;/a&gt;, we will see how the magic of the Spark Operator operates, by reproducing all of its internals with spark-submit.&lt;/p&gt;

</description>
      <category>spark</category>
      <category>kubernetes</category>
      <category>python</category>
    </item>
    <item>
      <title>Asguard, a security solution for bringing sensitive code into the Cloud</title>
      <dc:creator>Pascal Gillet</dc:creator>
      <pubDate>Tue, 02 Jun 2020 09:31:17 +0000</pubDate>
      <link>https://forem.com/stack-labs/asguard-a-security-solution-for-bringing-sensitive-code-into-the-cloud-37o1</link>
      <guid>https://forem.com/stack-labs/asguard-a-security-solution-for-bringing-sensitive-code-into-the-cloud-37o1</guid>
      <description>&lt;p&gt;This article presents a security solution implemented as part of a project led by Stack Labs on behalf of one of our customers in the space sector. &lt;br&gt;
The purpose of this project was to migrate a legacy application to Google Cloud (&lt;em&gt;Lift &amp;amp; Shift&lt;/em&gt;). It is a sensitive application, as its source code is the fruit of a real &lt;em&gt;savoir-faire&lt;/em&gt; in satellite imagery analysis, with algorithms developed in-house over many years.  &lt;/p&gt;



&lt;p&gt;The application is protected by a legal framework which aims to protect access to the knowledge and technologies of the organization that requests it. Conversely, and as far as I understand, the said organization can be prosecuted if it does not put in place the necessary measures to protect their assets, like the access control, physical or digital, to sensitive information.&lt;/p&gt;
&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;Our customer brings a lot of its activities into public cloud platforms and plans to do the same with this application, but in this particular case, &lt;em&gt;he&lt;/em&gt; needed more security guarantees.&lt;/p&gt;

&lt;p&gt;Yes, the cloud can be an important source of security risks. The cloud's responsibility for security essentially extends to the services provided, but not to the hosted applications. Thus, cloud providers already offer integrated security solutions at storage, infrastructure and network levels, but these may not be sufficient due to intellectual property or local regulations. The &lt;a href="https://en.wikipedia.org/wiki/OWASP" rel="noopener noreferrer"&gt;OWASP&lt;/a&gt;, in its &lt;a href="https://owasp.org/www-pdf-archive/Cloud-Top10-Security-Risks.pdf" rel="noopener noreferrer"&gt;"Cloud Top 10 Security Risks"&lt;/a&gt;, reveals that data ownership and regulatory compliance are respectively the 1st and 3rd highest risks when migrating into the cloud. Businesses need additional guarantees and the ability to extend, or even replace, these security solutions to the applications themselves.&lt;/p&gt;

&lt;p&gt;Here, our application is originally a desktop application. The long-term goal is to &lt;em&gt;servicize&lt;/em&gt; it and use it as a main “shared library” among the satellite imagery services delivered to our customer's customers. The application is also a plain old good monolith, and will be broken later into microservices. The first project milestone was to take this monolith to the cloud and attach a REST API facade to it, and above all to put the security solution in place, and to test it.&lt;/p&gt;
&lt;h2&gt;
  
  
  Security requirements
&lt;/h2&gt;

&lt;p&gt;The security requirements are the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Execution control&lt;/strong&gt;: the application must be “enclaved”, that means the application can only be loaded and executed inside a trusted execution environment, here in this case a GCP project. The containerized application is basically encrypted at build time, then decrypted at runtime, with the help of Vault from Hashicorp. It should not be possible to decrypt the application outside of this execution context. &lt;em&gt;This article is mainly about how we fulfilled this requirement.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code obfuscation&lt;/strong&gt;: protection against reverse engineering. As a last bastion of security, and in the event that an attacker manages to grab the image, decrypt it and inspect it, the application source code must be obfuscated, so as to make it difficult for humans to understand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Access control&lt;/strong&gt;: this is more classic. The application will expose a REST API. Access control is provided by an in-house API gateway - &lt;em&gt;a server program which is the only possible entry point and acts as an API front-end, receives API requests, passes requests to the back-end service(s) and then passes the response back to the requester&lt;/em&gt; - that supports authentication, authorization, and security audit. The API gateway is used by many projects within our customer's organization and is an internal project by itself.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"The application can only be loaded and executed inside a trusted execution environment. A user should not be able to inspect the contents of the application outside of this execution context."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A last requirement, but not least, is to use only open source tools to implement this security solution, and to be cloud agnostic (you know, to change the provider if necessary).&lt;/p&gt;
&lt;h2&gt;
  
  
  Secured CI/CD
&lt;/h2&gt;

&lt;p&gt;We set up a CI/CD pipeline for building, securing and deploying the application into the targeted GCP project. &lt;a href="https://www.vaultproject.io/" rel="noopener noreferrer"&gt;Vault from Hashicorp&lt;/a&gt; is the corner stone of this workflow. Vault provides secrets management, and encryption of application data (without storing it). In particular, you can generate, use, rotate, and destroy AES256, RSA 2048, RSA 4096, EC P256, EC P384 and EC P521 cryptographic keys. Vault provides a Web interface, as well as a CLI and a HTTP API, to interact with it. It can be viewed as "encryption as a service".&lt;/p&gt;

&lt;p&gt;The CI/CD pipeline, implemented with Gitlab CI/CD, is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbh4rv0tg9vzecznh0or5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbh4rv0tg9vzecznh0or5.png" alt="Asguard secured CI/CD" width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The steps are numbered and detailed below:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The source code is checked out, built and obfuscated.&lt;/li&gt;
&lt;li&gt;We create (first time) or rotate (successive times) a "key encryption key" (KEK) in Vault. Here, we authenticate to 
Vault using a simple token, stored in a Gitlab protected environment variable.&lt;/li&gt;
&lt;li&gt;Envelope encryption: we implement a key hierarchy with a local "data encryption key" (DEK), protected by the KEK. &lt;/li&gt;
&lt;li&gt;We make an archive out of the application, and we use the DEK to encrypt it using OpenSSL or GnuPG locally.&lt;/li&gt;
&lt;li&gt;We store the encrypted application in a Docker container image, along with the &lt;strong&gt;encrypted&lt;/strong&gt; DEK and a bootstrap bash script that will be used at runtime to decrypt and start running the application.&lt;/li&gt;
&lt;li&gt;We push the image in Google Container Registry.&lt;/li&gt;
&lt;li&gt;Self-starting process. The bootstrap script is the entrypoint of the Docker image. The workloads are orchestrated with Google Kubernetes Engine.&lt;/li&gt;
&lt;li&gt;Here, we authenticate to Vault with a method specific to Google Cloud, detailed below.&lt;/li&gt;
&lt;li&gt;We decrypt the DEK, by giving to Vault the ciphertext and the name of the KEK to decrypt against.&lt;/li&gt;
&lt;li&gt;We decrypt the archive locally with OpenSSL or GnuPG, and we unpack the application. &lt;/li&gt;
&lt;li&gt;The application can finally do the job for which it is intended.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You may ask why we create the "intermediate" DEK (step 3) to encrypt the application (step 4) when we could directly encrypt the application with the KEK in the first place. This would require to send the whole application to Vault over the network. We would therefore take the risk of exposing the application in transit, even if TLS is enabled, and above all, the application can be heavy.&lt;/p&gt;

&lt;p&gt;So let's remember these two important definitions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data encryption key (DEK)&lt;/strong&gt;: is an encryption key whose function is to encrypt and decrypt data. The DEK is meant to
be downloaded to encrypt the application locally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encryption key (KEK)&lt;/strong&gt;: is an encryption key whose function is to encrypt and decrypt the DEK. The KEK never leaves
Vault.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Vault Auth methods
&lt;/h2&gt;

&lt;p&gt;Authentication in Vault is the process of verifying that the system that logs in is which it says it is, and within a particular context. Vault supports multiple auth methods including Github, LDAP, AWS, Azure, Google Cloud, and more. &lt;/p&gt;

&lt;p&gt;Before a client can interact with Vault, it must authenticate against a chosen auth method. Once authenticated, a token is granted to the client, like a session ID, for subsequent exchanges with Vault.&lt;/p&gt;

&lt;p&gt;In steps 2 and 8 above, we use two different methods to authenticate to Vault:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;At built time, we authenticate using the built-in token auth method (a simple password generated by Vault). This is a 
"shortcut" method, as if we were authenticating directly with the session ID mentioned above.&lt;/li&gt;
&lt;li&gt;At runtime, we authenticate with a method specific to Google Cloud, and this is precisely what will allow us to ensure 
that we are executing our application in our GCP project, and not elsewhere.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Vault GCE Auth workflow
&lt;/h3&gt;

&lt;p&gt;The gcp auth method allows Google Cloud Platform entities to authenticate to Vault. Vault treats Google Cloud as a trusted third party and verifies authenticating entities against the Google Cloud APIs. This Vault backend allows for authentication of Google Cloud IAM service accounts, but here we are interested by authenticating Google Compute Engine (GCE) instances that make up our Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;The auth method must be configured in advance before machines can authenticate. In particular, we must declare that only the machines in the targeted GCP project, in certain zones and with certain labels will be able to authenticate.&lt;/p&gt;

&lt;p&gt;Also, the GCE instances that are authenticating against Vault must have the following role: &lt;br&gt;
roles/iam.serviceAccountTokenCreator&lt;/p&gt;

&lt;p&gt;In order to check the authentication process, Vault itself must authenticate to the Google Cloud Platform. For that, we specify the credentials of a service account with the minimum scope &lt;a href="https://www.googleapis.com/auth/cloud-platform" rel="noopener noreferrer"&gt;https://www.googleapis.com/auth/cloud-platform&lt;/a&gt; &lt;br&gt;
and the role &lt;code&gt;roles/compute.viewer&lt;/code&gt; or a custom role with the following exact permissions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;iam.serviceAccounts.get&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iam.serviceAccountKeys.get&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;compute.instances.get&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;compute.instanceGroups.list&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;compute.instanceGroups.listInstances&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These allow Vault to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify the service account, either directly authenticating or associated with authenticating GCE instance, exists&lt;/li&gt;
&lt;li&gt;Get the corresponding public keys for verifying JWTs signed by service account private keys.&lt;/li&gt;
&lt;li&gt;Verify authenticating GCE instances exist&lt;/li&gt;
&lt;li&gt;Compare bound fields for GCE roles (zone/region, labels, or membership in given instance groups)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following diagram shows how Vault communicates with Google Cloud to authenticate and authorize JWT tokens:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fw9uzbka4rzakalxsfanu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fw9uzbka4rzakalxsfanu.png" alt="Vault GCE Auth workflow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The client obtains an instance identity metadata token on a GCE instance.&lt;/li&gt;
&lt;li&gt;The client sends this JWT to Vault along with a role name.&lt;/li&gt;
&lt;li&gt;Vault extracts the kid header value, which contains the ID of the key-pair used to generate the JWT, to find the OAuth2 public cert to verify this JWT.&lt;/li&gt;
&lt;li&gt;Vault authorizes the confirmed instance against the given role, ensuring the instance matches the bound zones, regions, or instance groups. If that is successful, a Vault token with the proper policies is returned.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;See &lt;a href="https://www.vaultproject.io/docs/auth/gcp/" rel="noopener noreferrer"&gt;https://www.vaultproject.io/docs/auth/gcp/&lt;/a&gt; for even more details.&lt;/p&gt;
&lt;h2&gt;
  
  
  Putting the pieces together
&lt;/h2&gt;

&lt;p&gt;Now that we have the theory, it is time to assemble things. As an example, our target application is a very basic web server in Python, serving files relative to its current directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleHTTPServer&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SocketServer&lt;/span&gt;

&lt;span class="n"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8000&lt;/span&gt;

&lt;span class="n"&gt;Handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SimpleHTTPServer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SimpleHTTPRequestHandler&lt;/span&gt;

&lt;span class="n"&gt;httpd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SocketServer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TCPServer&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;serving at port&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PORT&lt;/span&gt;
&lt;span class="n"&gt;httpd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;serve_forever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In particular, it will serve the following &lt;code&gt;index.html&lt;/code&gt; page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiftaazdp6ppmj4sxnf96.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiftaazdp6ppmj4sxnf96.png" alt="Asguard demo index html" width="800" height="671"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The goal of our demo will be to modify this index page, commit our changes, and create a git tag to see the CI automatically triggered and our new version deployed, and see in the meantime what are the security elements that have been applied.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code obfuscation
&lt;/h3&gt;

&lt;p&gt;Code obfuscation is done by using a command line tool. A variety of tools exist to perform or assist with code obfuscation, and there is no such thing as a universal tool for obfuscation, as it depends of the architecture and the characteristics of the language.&lt;/p&gt;

&lt;p&gt;We use &lt;a href="https://github.com/dashingsoft/pyarmor" rel="noopener noreferrer"&gt;PyArmor&lt;/a&gt; to obfuscate our Python script above, and this will be typically the result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pytransform&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pyarmor_runtime&lt;/span&gt;
&lt;span class="nf"&gt;pyarmor_runtime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;__pyarmor__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\x50\x59\x41\x52\x4d\x4f\x52\x00\x00\x03\x05\x00\x17\x0d\x0d\x0a\x02\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x40\x00\x00\x00\x97\x01\x00\x00\x00\x00\x00\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x50\x8c\x64\x26\x42\xd6\x01\xf3\x0d\x68\x2c\x25\x9f\x19\xaa\x05\x4c\x7b\x3c\x45\xce\x28\xdf\x84\xef\xd3\x8e\xed\xac\x78\x4b\x2c\xc5\x39\x30\xcf\xc1\x8b\x68\x89\x12\x6a\xe2\x4a\x89\x4c\x3a\xf8\xbd\x5d\xb2\xc9\xea\xf8\x6e\x38\xa4\x87\x32\xa3\x44\xe1\x0c\x7c\x1c\x75\x30\xc0\xe7\x58\xe5\xad\x5b\xe6\xaa\x02\x76\x03\x59\x08\xc1\x4f\x69\x0f\xee\xf7\x42\xc3\x9c\x6f\xb0\x82\x0f\x2a\x55\x34\x6b\x4d\x3a\xd6\xb6\xae\xdf\x0b\x3c\xf9\x6f\xde\xef\x0f\x90\xd6\x75\x3b\xe8\xc1\xc8\x42\x2b\xb0\x5f\x54\x1d\xd9\xee\xcd\x5d\xa2\x09\xcb\x7b\x5c\xd5\x26\xc8\xf7\x00\x4e\xd3\x3b\xc7\x44\xca\x94\xa6\xdf\x00\x83\x59\x2d\x7b\x75\xf0\x03\x9e\xae\xe2\x79\x8d\xaa\xa0\xbf\x3b\xc0\xc1\x7b\xaa\x84\x7a\xc7\xcb\x33\x82\x54\x5a\x6a\x25\xea\x30\xc4\x53\x0b\x01\xd6\x36\x67\x28\xdb\x80\x19\xe6\x1a\x81\x76\x4a\xb4\x98\xef\xff\xaa\x25\xaa\xe2\x81\x93\x05\xf2\x78\x80\xc6\xc7\x76\x66\xf0\x44\xe0\x15\xec\xe4\xe2\x40\x19\x80\x72\xc2\x0b\x56\x8e\x5e\xe5\x8c\x6a\x14\x2c\x55\x2a\xc8\xdc\x7a\x3b\x69\x3c\x56\x12\xd0\x8a\xe5\x0d\x66\x8e\x51\x4a\xbd\xe7\x8b\x91\xfb\x68\x2d\x05\x9f\x29\x1d\xd5\xd7\x2f\x20\xd3\x89\x76\x14\x69\xe6\xc7\x1a\x54\xa2\xf4\x10\x5e\x6d\xcf\xb9\x54\xd1\x9f\x92\x78\xa5\xb9\xcf\x80\xee\x40\xb7\x04\xd6\xbf\xa9\xca\xc4\x3e\x17\x16\xd6\x9a\x47\xa3\x73\x35\xf6\xd7\xfc\xcc\x3c\x74\x12\x60\x68\x3b\x67\x97\x9f\x12\xc3\x0a\x59\x3e\xe5\xe9\x2d\xf5\xcf\x24\x06\x7d\xdf\xd3\x1a\xec\x3c\x3b\xe3\x8a\xed\x61\x6f\xfc\x49\x0a\xcf\x46\x53\xcb\x55\xdf\x1c\x37\x5b\x1e\xe8\x58\xeb\x94\x15\xdc\xb5\x65\xd1\x77\xaa\xa5\x3c\x24\xf5\x49\x10\x00\x69\x01\x2f\x8f\x51\x48\x49\xa5\xe7\x33\x8e\x04\xd0\x95\x59\xec\x17\x14\x01\x39\xf9\xb4\xbf\x68\xee\xa3\x8e\x73&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The obfuscated script is a normal Python script, with an extra module &lt;code&gt;pytransform.py&lt;/code&gt; and a few extra runtime files. The plain Python script can be replaced with the obfuscated one seamlessly.&lt;/p&gt;

&lt;p&gt;Code obfuscation does not alter the output of a program, which remains fully functional. Just remember that code obfuscation is not encryption. It is just a way to delay human understanding of the code, and to make reverse-engineering of a program difficult and economically unfeasible, as this would need too much time and effort. &lt;/p&gt;

&lt;h3&gt;
  
  
  Vault configuration
&lt;/h3&gt;

&lt;p&gt;Vault must be accessible from both build time and runtime environments. It can be deployed either on-premises or in the cloud, as long as this environment is trusted. This &lt;a href="https://learn.hashicorp.com/vault/operations/production-hardening" rel="noopener noreferrer"&gt;guide&lt;/a&gt; provides guidance for a production hardened deployment of Vault. &lt;/p&gt;

&lt;p&gt;For the purposes of a demo, you can use the following script which automates the deployment and configuration of a Vault server in development mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Download Vault&lt;/span&gt;
&lt;span class="nv"&gt;VAULT_VERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.2.3
curl https://releases.hashicorp.com/vault/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/vault_&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;_linux_amd64.zip &lt;span class="nt"&gt;-o&lt;/span&gt; vault.zip
unzip vault.zip
&lt;span class="nb"&gt;chmod &lt;/span&gt;u+x vault

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;VAULT_ADDR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://127.0.0.1:8200
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;VAULT_ROOT_TOKEN_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;root

&lt;span class="c"&gt;# Start Vault in dev mode&lt;/span&gt;
&lt;span class="nb"&gt;nohup &lt;/span&gt;vault server &lt;span class="nt"&gt;-dev&lt;/span&gt; &lt;span class="nt"&gt;-dev-listen-address&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0:8200 &lt;span class="nt"&gt;-dev-root-token-id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ROOT_TOKEN_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &amp;amp;

&lt;span class="c"&gt;# Wait until the Vault server has actually started&lt;/span&gt;
&lt;span class="nb"&gt;sleep &lt;/span&gt;5

&lt;span class="c"&gt;# Login to vault&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$VAULT_ROOT_TOKEN_ID&lt;/span&gt; | vault login &lt;span class="nt"&gt;--no-print&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; -

&lt;span class="c"&gt;# Enable the Transit secrets engine which generates or encrypts data in-transit&lt;/span&gt;
vault secrets &lt;span class="nb"&gt;enable &lt;/span&gt;transit

&lt;span class="c"&gt;# Create dev policy&lt;/span&gt;
vault policy write asguard-policy policy-dev.hcl

&lt;span class="c"&gt;# Enable the Google Cloud auth method&lt;/span&gt;
vault auth &lt;span class="nb"&gt;enable &lt;/span&gt;gcp

&lt;span class="c"&gt;# Give Vault server the JSON key of a service account with the required GCP permissions:&lt;/span&gt;
&lt;span class="c"&gt;# roles/iam.serviceAccountKeyAdmin&lt;/span&gt;
&lt;span class="c"&gt;# roles/compute.viewer&lt;/span&gt;

&lt;span class="c"&gt;# These allow Vault to:&lt;/span&gt;
&lt;span class="c"&gt;# - Verify that the service account associated with authenticating GCE instance exists&lt;/span&gt;
&lt;span class="c"&gt;# - Get the corresponding public keys for verifying JWTs signed by service account private keys.&lt;/span&gt;
&lt;span class="c"&gt;# - Verify authenticating GCE instances exist&lt;/span&gt;
&lt;span class="c"&gt;# - Compare bound fields for GCE roles (zone/region, labels, or membership in given instance groups)&lt;/span&gt;

vault write auth/gcp/config &lt;span class="nv"&gt;credentials&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;@credentials.json

&lt;span class="c"&gt;# The GCE instances that are authenticating against Vault must have the following role: roles/iam.serviceAccountTokenCreator&lt;/span&gt;

&lt;span class="nv"&gt;ROLE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;asguard-gce-role

&lt;span class="c"&gt;# Create role for GCE instance authentication&lt;/span&gt;
vault write auth/gcp/role/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ROLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"gce"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;policies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"asguard-policy"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;bound_projects&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"asguard"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;bound_zones&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"europe-west1-b"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;bound_labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"foo:bar,zip:zap,gruik:grok"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with &lt;code&gt;policy-dev.hcl&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# This section grants all access on "transit/*".
path "transit/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vault provide many "&lt;a href="https://www.vaultproject.io/docs/secrets" rel="noopener noreferrer"&gt;secrets engines&lt;/a&gt;" to store, generate, or encrypt data. &lt;br&gt;
Here, we use the &lt;a href="https://www.vaultproject.io/docs/secrets/transit" rel="noopener noreferrer"&gt;transit&lt;/a&gt; secrets engine to handle cryptographic functions on data in-transit only. With the exception of the main key (KEK), Vault doesn't store the data sent to the secrets engine. &lt;/p&gt;

&lt;p&gt;The last section in the bash script states that only GCE instances in the GCP project named "asguard", in the zone &lt;code&gt;europe-west1-b&lt;/code&gt; and with the labels &lt;code&gt;foo:bar&lt;/code&gt;, &lt;code&gt;zip:zap&lt;/code&gt;, and &lt;code&gt;gruik:grok&lt;/code&gt; will be able to authenticate to Vault. &lt;/p&gt;
&lt;h3&gt;
  
  
  The Docker images
&lt;/h3&gt;

&lt;p&gt;We distinguish 2 containers for our app: the app container itself, and one &lt;a href="https://kubernetes.io/docs/concepts/workloads/pods/init-containers/" rel="noopener noreferrer"&gt;init container&lt;/a&gt;, which is a specialized container that will run before the app container in a Kubernetes pod. The init container contains the bash script that will decrypt then start the application.  &lt;/p&gt;

&lt;p&gt;Here is our pod configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;asguard-demo&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;asguard-demo&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;asguard-demo&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu.gcr.io/$GCLOUD_PROJECT_ID/$CI_PROJECT_NAME:$CI_COMMIT_TAG&lt;/span&gt;
    &lt;span class="na"&gt;imagePullPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Always&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8000&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BOOTSTRAP_DIR&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/asguard&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CACHE_DIR&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/cache&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VAULT_ADDR&lt;/span&gt;
        &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-secret&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-addr&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VAULT_KEY_NAME&lt;/span&gt;
        &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-secret&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-key-name&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VAULT_ROLE&lt;/span&gt;
        &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-secret&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vault-role&lt;/span&gt;
    &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workdir&lt;/span&gt;
        &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/asguard&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cache-volume&lt;/span&gt;
        &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/cache&lt;/span&gt;
  &lt;span class="c1"&gt;# These containers are run during pod initialization&lt;/span&gt;
  &lt;span class="na"&gt;initContainers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;install-bootstrap&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu.gcr.io/$GCLOUD_PROJECT_ID/init-$CI_PROJECT_NAME:$CI_COMMIT_TAG&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cp&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-R"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;."&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/work-dir"&lt;/span&gt;
    &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workdir&lt;/span&gt;
        &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/work-dir"&lt;/span&gt;
  &lt;span class="na"&gt;dnsPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Default&lt;/span&gt;
  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Volume shared by init and app containers&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workdir&lt;/span&gt;
    &lt;span class="na"&gt;emptyDir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;medium&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Memory&lt;/span&gt;
  &lt;span class="c1"&gt;# Volume to decrypt the application in memory&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cache-volume&lt;/span&gt;
    &lt;span class="na"&gt;emptyDir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;medium&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Memory&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the configuration file, you can see that the pod has a volume that the init container and the application container share.&lt;/p&gt;

&lt;p&gt;The init container mounts the shared Volume at &lt;code&gt;/work-dir&lt;/code&gt;, and the application container mounts it at &lt;code&gt;/asguard&lt;/code&gt;. The init container runs the following command and then terminates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp -R . /work-dir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command copies the bootstrap script and other utilities from the init container's working directory to the shared volume. This done, the application container has now access to the bootstrap script, which is its actual entrypoint.&lt;/p&gt;

&lt;p&gt;The app container has another &lt;code&gt;tmpfs&lt;/code&gt; mount at &lt;code&gt;/cache&lt;/code&gt;. A tmpfs volume is a RAM-backed filesystem, that means a tmpfs mount is temporary, and only persisted in the host memory and outside the container's writable layer. This is particularly useful to decrypt our application in this directory: when the container stops, the tmpfs mount is removed, and files written there won't be persisted.&lt;/p&gt;

&lt;p&gt;Here is the Dockerfile for our application image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Start by obfuscating the application
FROM python:3.5.3-slim AS obfuscate

RUN pip install pyarmor
COPY app/sensitive-app.py app/
# Python version must be the same as the target version in the final image
RUN pyarmor obfuscate app/sensitive-app.py
# Creates a folder dist with the obfuscated Python script


# Then encrypting the application
FROM gcr.io/distroless/base:debug AS build-env

SHELL ["/busybox/sh", "-c"]

# Define build arg
ARG VAULT_KEY_NAME

# Define env variable
ENV VAULT_KEY_NAME ${VAULT_KEY_NAME}

ARG VAULT_ADDR
ARG VAULT_ROOT_TOKEN_ID

WORKDIR /dist
ENV PATH /dist:${PATH}

# Install jq
ARG JQ_VERSION=1.6
RUN wget https://github.com/stedolan/jq/releases/download/jq-${JQ_VERSION}/jq-linux64 -O jq
RUN chmod u+x jq

# Embed the application. DO NOT USE ADD HERE!
COPY app/ app/
# With only the obfuscated Python script
RUN rm app/sensitive-app.py
COPY --from=obfuscate /dist/ app/

# Package the application
RUN tar cvf app.tar.gz app/

# Create a named encryption key
# If the key already exists, this has no effect
RUN wget \
    --header "X-Vault-Token: ${VAULT_ROOT_TOKEN_ID}" \
    --post-data "" \
    ${VAULT_ADDR}/v1/transit/keys/${VAULT_KEY_NAME}

# Rotates the version of the named key
RUN wget \
    --header "X-Vault-Token: ${VAULT_ROOT_TOKEN_ID}" \
    --post-data "" \
    ${VAULT_ADDR}/v1/transit/keys/${VAULT_KEY_NAME}/rotate

# Generate data key
# The name specified as part of the URL is the name of the encryption key
# created in the init container and used to encrypt the datakey
RUN wget \
    --header "X-Vault-Token: ${VAULT_ROOT_TOKEN_ID}" \
    --post-data "" \
    ${VAULT_ADDR}/v1/transit/datakey/plaintext/${VAULT_KEY_NAME} -O datakey.json

RUN cat datakey.json | jq -r ".data.ciphertext" &amp;gt; passphrase.enc

# Encrypt the binary with the plaintext key 
RUN cat datakey.json | jq -r ".data.plaintext" | openssl aes-256-cbc -salt -in app.tar.gz -out app.enc -pass stdin

# Delete unsecured files (delete all files that do not match *.enc)
RUN find /dist/* ! -name '*.enc' -exec rm -rf '{}' +

RUN chmod -R 500 /dist


# App container
FROM gcr.io/distroless/python3

USER nobody:nobody
COPY --chown=nobody:nobody --from=build-env /dist/ /dist/
CMD ["python app/sensitive-app.py"]
# The entrypoint will be mounted from the init container
ENTRYPOINT ["/asguard/bootstrap.sh"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is the Dockerfile for the init container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM gcr.io/distroless/base:debug AS build-env

SHELL ["/busybox/sh", "-c"]

WORKDIR /asguard
ENV PATH /asguard:${PATH}

# Install jq
ARG JQ_VERSION=1.6
RUN wget https://github.com/stedolan/jq/releases/download/jq-${JQ_VERSION}/jq-linux64 -O jq

# Install cat, tar, wget and cp for bootstrap.sh later execution
ARG BUSYBOX_VERSION=1.30.0-i686
RUN wget https://busybox.net/downloads/binaries/${BUSYBOX_VERSION}/busybox_CAT -O cat
RUN wget https://busybox.net/downloads/binaries/${BUSYBOX_VERSION}/busybox_TAR -O tar
RUN wget https://busybox.net/downloads/binaries/${BUSYBOX_VERSION}/busybox_WGET -O wget
RUN wget https://busybox.net/downloads/binaries/${BUSYBOX_VERSION}/busybox_CP -O cp

COPY bootstrap.sh ./

RUN chmod -R 500 /asguard


# Init container
FROM gcr.io/distroless/base

USER nobody:nobody
COPY --chown=nobody:nobody --from=build-env /asguard /asguard
WORKDIR /asguard
ENV PATH /asguard:${PATH}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two Dockerfiles alone will perform steps 1 to 5 of the CI pipeline. Nothing more than &lt;code&gt;docker build&lt;/code&gt; is needed to create the automated build.&lt;/p&gt;

&lt;p&gt;We can make some comments on these Dockerfiles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We use Docker &lt;a href="https://docs.docker.com/develop/develop-images/multistage-build/" rel="noopener noreferrer"&gt;multi-stage builds&lt;/a&gt; to separate the main steps of image build and to keep the final images as light as possible.&lt;/li&gt;
&lt;li&gt;We use "&lt;a href="https://github.com/GoogleContainerTools/distroless" rel="noopener noreferrer"&gt;distroless&lt;/a&gt;" images to harden the images as much as possible.&lt;/li&gt;
&lt;li&gt;Distroless images contain only the application and its runtime dependencies. They do not contain package managers, shells (you cannot even do a &lt;code&gt;ls&lt;/code&gt; or a &lt;code&gt;cd&lt;/code&gt;) or any other programs you would expect to find in a standard Linux distribution. You can find different distroless base images depending on your application stack: Java, Python, Golang, Node.js, dotnet, Rust.&lt;/li&gt;
&lt;li&gt;For intermediate build stages, we use distroless &lt;strong&gt;debug&lt;/strong&gt; images which provide a &lt;a href="https://busybox.net/" rel="noopener noreferrer"&gt;busybox&lt;/a&gt; shell. This allows to run all the necessary commands to actually build the image, and for the init image, to download and install all the busybox commands (&lt;code&gt;cat&lt;/code&gt;, &lt;code&gt;tar&lt;/code&gt;, &lt;code&gt;wget&lt;/code&gt;, in a unitary way) needed for &lt;code&gt;bootstrap.sh&lt;/code&gt; later execution in the non-debug final image. We also install &lt;code&gt;cp&lt;/code&gt; that will be used to copy the bootstrap script in the shared volume.&lt;/li&gt;
&lt;li&gt;We use the HTTP API to interact with Vault. We could have added the CLI into the image, but it is more secure and lighter not to embed a tool that offers far more services that the ones we only need.&lt;/li&gt;
&lt;li&gt;As a consequence, we must install &lt;a href="https://github.com/stedolan/jq" rel="noopener noreferrer"&gt;jq&lt;/a&gt; to process the JSON output from HTTP calls.&lt;/li&gt;
&lt;li&gt;We can use &lt;a href="https://stackoverflow.com/questions/28247821/openssl-vs-gpg-for-encrypting-off-site-backups" rel="noopener noreferrer"&gt;OpenSSL or GnuPG&lt;/a&gt; to encrypt the application locally in the container. Here, we use OpenSSL as it is already shipped in distroless images.&lt;/li&gt;
&lt;li&gt;We run the final images with a non-root user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The build commands are the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t init-asguard-demo -f Dockerfile-init .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t asguard-demo \
    --network=host \
    --build-arg VAULT_ADDR=https://vaulthost:8200 \
    --build-arg VAULT_ROOT_TOKEN_ID=s3cr3tt0k3n \
    --build-arg VAULT_KEY_NAME=aname \
    -f Dockerfile-app .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final images are expected to run in Kubernetes (due to the init container and the mounted volumes), and the bootstrap script will attempt to authenticate to Vault with the GCP method. So, you won't be able to run the final images locally with the default entrypoint (and that is our goal).&lt;/p&gt;

&lt;p&gt;If you edit the Dockerfile to change the final image to &lt;code&gt;:debug&lt;/code&gt;, you can nevertheless get a busybox shell to enter and see that the application is indeed encrypted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -it --rm --entrypoint=sh asguard-demo:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Gitlab CI/CD
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://docs.gitlab.com/ee/ci/" rel="noopener noreferrer"&gt;GitLab CI/CD&lt;/a&gt; is configured by a file called &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; placed at the &lt;br&gt;
repository’s root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;

&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gcr.io/kaniko-project/executor:debug&lt;/span&gt;
    &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;export GOOGLE_APPLICATION_CREDENTIALS=/kaniko/kaniko-secret.json&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo $GOOGLE_APPLICATION_CREDENTIALS_BASE64 | base64 -d &amp;gt; $GOOGLE_APPLICATION_CREDENTIALS&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/kaniko/executor --context $CI_PROJECT_DIR --dockerfile Dockerfile-init --build-arg VAULT_ADDR=$VAULT_ADDR --build-arg VAULT_ROOT_TOKEN_ID=$VAULT_ROOT_TOKEN_ID --build-arg VAULT_KEY_NAME=$VAULT_KEY_NAME --destination eu.gcr.io/$GCLOUD_PROJECT_ID/init-$CI_PROJECT_NAME:$CI_COMMIT_TAG&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/kaniko/executor --context $CI_PROJECT_DIR --dockerfile Dockerfile-app --build-arg VAULT_ADDR=$VAULT_ADDR --build-arg VAULT_ROOT_TOKEN_ID=$VAULT_ROOT_TOKEN_ID --build-arg VAULT_KEY_NAME=$VAULT_KEY_NAME --destination eu.gcr.io/$GCLOUD_PROJECT_ID/$CI_PROJECT_NAME:$CI_COMMIT_TAG&lt;/span&gt;
  &lt;span class="na"&gt;only&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;tags&lt;/span&gt;


&lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;google/cloud-sdk:latest&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Authenticate with GKE&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo $GOOGLE_APPLICATION_CREDENTIALS_BASE64 | base64 -di &amp;gt; key.json&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;gcloud auth activate-service-account --key-file=key.json&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;gcloud config set project $GCLOUD_PROJECT_ID&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;gcloud container clusters get-credentials $CLUSTER_NAME --zone europe-west1-b&lt;/span&gt;
    &lt;span class="c1"&gt;# Create Kubernetes secret with Vault info&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl delete secret vault-secret || &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl create secret generic vault-secret --from-literal=vault-addr=$VAULT_ADDR --from-literal=vault-key-name=$VAULT_KEY_NAME --from-literal=vault-role=$VAULT_ROLE&lt;/span&gt;
    &lt;span class="c1"&gt;# Install envsubst&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apt-get install -y gettext-base&lt;/span&gt;
    &lt;span class="c1"&gt;# Deploy&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cat k8s/deployment.yml | envsubst | kubectl apply -f -&lt;/span&gt;

  &lt;span class="na"&gt;only&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;tags&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pipeline of scripts builds the images (steps 1 to 5), pushes them into the Google Container Registry (step 6), then deploys the application into GKE (step 7) at every tag in the &lt;code&gt;master&lt;/code&gt; branch of the repository.  &lt;/p&gt;

&lt;p&gt;Some observations that are worth noting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each stage in the file is executed by the &lt;a href="https://docs.gitlab.com/runner/" rel="noopener noreferrer"&gt;Gitlab Runner&lt;/a&gt; in a container image.&lt;/li&gt;
&lt;li&gt;Instead of &lt;code&gt;docker build&lt;/code&gt;, we use &lt;a href="https://github.com/GoogleContainerTools/kaniko" rel="noopener noreferrer"&gt;Kaniko&lt;/a&gt; to build the application images from the Dockerfiles. Kaniko is meant to be run inside a container or Kubernetes cluster. Kaniko does not depend 
on a Docker daemon and it is therefore an alternative to &lt;a href="https://hub.docker.com/_/docker" rel="noopener noreferrer"&gt;Docker-in-Docker&lt;/a&gt;, which is subject to potential security issues. &lt;/li&gt;
&lt;li&gt;We use &lt;a href="https://docs.gitlab.com/ee/ci/variables/README.html#protected-environment-variables" rel="noopener noreferrer"&gt;protected environment variables&lt;/a&gt; in Gitlab to configure the pipeline (Vault host, Vault's root token, Google credentials and so on). Protected variables are securely passed to the pipeline running.&lt;/li&gt;
&lt;li&gt;We use &lt;code&gt;envsubst&lt;/code&gt; to substitute the values of environment variables into Kubernetes YAML files. I can already hear the haters saying that Kubernetes files should be template-free and should only be patched (see this other article on &lt;a href="https://dev.to/code/kustomize-101"&gt;kubectl kustomize&lt;/a&gt;), but hey, this is a demo...😱&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9w63fsxedz4cgivvivuc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9w63fsxedz4cgivvivuc.png" alt="Gitlab protected environment variables" width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Gitlab itself must be a trusted part of the chain: it can be either self-hosted on-premises or online (SaaS). The Gitlab project must have at least restricted access to the only developers needed for the project. If online, the project repository should be private so as not to publish the source code publicly. Gitlab protects the source code with encryption in transit and at rest, and AFAIK you can manage your own encryption keys. Regardless of the project repository, the Gitlab Runner can also be installed on-premises. &lt;/p&gt;

&lt;h2&gt;
  
  
  Let's run
&lt;/h2&gt;

&lt;p&gt;Here we are! We committed and pushed our changes, we created a git tag, the images have been pulled, and the Kubernetes pod has started. The bootstrap script can finally proceed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/sh&lt;/span&gt;

&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-eux&lt;/span&gt;

&lt;span class="nv"&gt;VAULT_ADDR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ADDR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;VAULT_KEY_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_KEY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;VAULT_ROLE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ROLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;WORK_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/dist
&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="p"&gt;/cache&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;BOOTSTRAP_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BOOTSTRAP_DIR&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="p"&gt;/asguard&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BOOTSTRAP_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BOOTSTRAP_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# Get the JWT token for the GCE instance that runs this container&lt;/span&gt;
&lt;span class="nv"&gt;INST_ID_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;wget &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"Metadata-Flavor: Google"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=vault/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ROLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;format=full"&lt;/span&gt; &lt;span class="nt"&gt;-qO&lt;/span&gt; -&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;AUTH_PAYLOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;role&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ROLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;jwt&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;INST_ID_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;

&lt;span class="c"&gt;# GCE login&lt;/span&gt;
&lt;span class="nv"&gt;CLIENT_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;wget &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--post-data&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AUTH_PAYLOAD&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ADDR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/v1/auth/gcp/login &lt;span class="nt"&gt;-O&lt;/span&gt; - | &lt;span class="se"&gt;\&lt;/span&gt;
    jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.auth.client_token'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;PAYLOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;ciphertext&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WORK_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/passphrase.enc&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;

&lt;span class="c"&gt;# Decrypt the application&lt;/span&gt;
wget &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"X-Vault-Token: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLIENT_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--post-data&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PAYLOAD&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_ADDR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/v1/transit/decrypt/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VAULT_KEY_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-O&lt;/span&gt; - | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.data.plaintext'&lt;/span&gt; | openssl aes-256-cbc &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-salt&lt;/span&gt; &lt;span class="nt"&gt;-in&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WORK_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/app.enc &lt;span class="nt"&gt;-out&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/app.tar.gz &lt;span class="nt"&gt;-pass&lt;/span&gt; stdin

&lt;span class="c"&gt;# Unpack&lt;/span&gt;
&lt;span class="nb"&gt;tar &lt;/span&gt;xvf &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/app.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# Execute arg command&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CACHE_DIR&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
/bin/sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script performs final steps 7 to 11 of the CI pipeline. The last instruction executes the command &lt;code&gt;python app/sensitive-app.py&lt;/code&gt; passed as an argument to the image's entrypoint. At this point, the application is now decrypted and unpacked: the Python web server can start and Grumpy the cat is now happy!&lt;/p&gt;

&lt;p&gt;If you visit the &lt;a href="https://gitlab.com/stack-labs/oss/asguard-demo/-/blob/master/k8s/ingress.yml" rel="noopener noreferrer"&gt;Kubernetes ingress&lt;/a&gt;'s endpoint, you will see the following web page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbczqmmnak0tt5ok2h9t6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbczqmmnak0tt5ok2h9t6.png" alt="Asguard demo index html" width="800" height="682"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that the self-starting process here can be implemented in whatever language that suits you: you can pick the right distroless image for the desired stack, and Vault provides many &lt;a href="https://www.vaultproject.io/api/libraries.html" rel="noopener noreferrer"&gt;client libraries&lt;/a&gt; to consume the API more conveniently. &lt;/p&gt;

&lt;h2&gt;
  
  
  The "need to know", and the principle of the least privilege
&lt;/h2&gt;

&lt;p&gt;We cannot emphasize this enough, all these security measures are only relevant if all parts of the pipeline are trusted themselves. Most attacks are indeed initiated by users with regular access to the system or communication channel (man-at-the-end attack). That means Gitlab, Vault, the GCP project, and all communication media between them, must limit their access to the smallest number of people (developers, administrators). And within these systems, users must be able to access only the information and resources that are necessary for their job (ex: all users who end up to be administrators of all GCP projects in an organization, that is not a good thing).&lt;/p&gt;

&lt;h3&gt;
  
  
  Decrypting in memory
&lt;/h3&gt;

&lt;p&gt;We saw that the application is decrypted in memory in a tmpfs mount. If such a mount is temporary, the filesystem is still visible from the host. If an attacker has access to the Kubernetes node where the pod is running, he can find our sensitive source code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# find / -name sensitive-app.py&lt;/span&gt;
/home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/cb74781d-7fc7-11ea-adc6-42010a8400f8/volumes/kubernetes.io~empty-dir/cache-volume/app/sensitive-app.py
/var/lib/kubelet/pods/cb74781d-7fc7-11ea-adc6-42010a8400f8/volumes/kubernetes.io~empty-dir/cache-volume/app/sensitive-app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course the source code is obfuscated, but still. &lt;/p&gt;

&lt;h2&gt;
  
  
  What about trust?
&lt;/h2&gt;

&lt;p&gt;When transferring container images over internet, &lt;em&gt;trust&lt;/em&gt; is a central concern. It would be good to ensure their integrity and that the CI pipeline is the source of these images. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.docker.com/engine/security/trust/content_trust/" rel="noopener noreferrer"&gt;Docker Content Trust&lt;/a&gt; (DCT) provides this ability to use digital signatures for data sent to and received from remote Docker registries. These signatures allow client-side or runtime verification of the integrity and publisher of specific image tags.&lt;/p&gt;

&lt;p&gt;Through DCT, the CI pipeline could sign the images and Kubernetes could ensure that the images it pulls are signed. But a prerequisite for signing an image with DCT is a Docker Registry with a "notary" server attached (such as the Docker Hub). This is not currently supported in Google Container Registry (see &lt;a href="https://stackoverflow.com/questions/53380782/pushing-signed-docker-images-to-gcr?rq=1" rel="noopener noreferrer"&gt;here&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Google Cloud now implements the concept of &lt;a href="https://cloud.google.com/binary-authorization?hl=en" rel="noopener noreferrer"&gt;Binary Authorization&lt;/a&gt; based off of open source &lt;a href="https://github.com/grafeas/kritis/blob/master/docs/binary-authorization.md" rel="noopener noreferrer"&gt;Kritis&lt;/a&gt;. However, this is specific to Google Cloud and I don't know if it's possible to set up a vanilla Kritis there. &lt;/p&gt;

&lt;p&gt;With Vault, we could easily sign our application and verify this data from the bootstrap script. But then, an attacker could modify the script to not verify the data, and thus break the trust chain. Consequently, the script (or the init image) should also be signed. But it cannot verify itself, and we end up with a chicken-or-egg problem... &lt;/p&gt;

&lt;h2&gt;
  
  
  Criticism
&lt;/h2&gt;

&lt;p&gt;An alternative to the images' build would be putting the encrypted datakey next to the bootstrap script in the init image, instead of placing it next to the application in the app image. In that way, we would separate the encrypted application from the key needed to decrypt it in two separate images. This may be viewed as an additional security. But in that case, the key must be created (or rotated) outside of Docker - as it must be both placed in the init image and used in the app image to actually encrypt it - and &lt;code&gt;docker build&lt;/code&gt; alone does not suffice anymore to build the images. The first solution is therefore an assumed choice.&lt;/p&gt;

&lt;p&gt;Also, the solution we presented here relies on mechanisms that are internal to the image: this is security by the contents, which is a form of security through obscurity. Like &lt;code&gt;docker trust&lt;/code&gt;, it would be nice if Docker could encrypt an image with a &lt;code&gt;docker encrypt&lt;/code&gt; command line. The result image would be an envelope with clear metadata and encrypted contents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;We named this security solution for bringing sensitive apps into the cloud "Asguard". It is named after the location of Ásgard in the Norse religion (not Marvel, please 😠). Ásgard is an impregnable divine fortress surrounded by an impassable wall that shelters the gods from all invasions.&lt;/p&gt;

&lt;p&gt;We made a demo which includes all the elements explained in this article. You will find the source code &lt;a href="https://gitlab.com/stack-labs/oss/asguard-demo" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We also presented this subject at the "Capitole du Libre" conference at Toulouse in November 2019. Here is the YouTube &lt;a href="https://youtu.be/bLoNUJizaSI" rel="noopener noreferrer"&gt;video&lt;/a&gt; (in French):&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/bLoNUJizaSI"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  See also
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://coreos.com/rkt/" rel="noopener noreferrer"&gt;rkt&lt;/a&gt;: an alternative to Docker with security as a first-class citizen (container isolation, 
seccomp, SELinux, non-root user, containers are signed and verified by default)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/mozilla/sops" rel="noopener noreferrer"&gt;Mozilla SOPS&lt;/a&gt;: an alternative to protected environment variables in Gitlab. You can store encrypted values directly in git-versioned files.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/shyiko/kubesec" rel="noopener noreferrer"&gt;kubesec&lt;/a&gt;: like SOPS, but dedicated to Kubernetes secrets only. This 
&lt;a href="https://dev.to/code/keep-your-kubernetes-secrets-in-git-with-kubesec/"&gt;article&lt;/a&gt; on our blog introduces kubesec.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.vaultproject.io/docs/platform/k8s/injector" rel="noopener noreferrer"&gt;vault-k8s&lt;/a&gt;: Vault is used more and more for managing Kubernetes secrets.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>security</category>
      <category>cloud</category>
      <category>gitlab</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
