<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Lucien Boix</title>
    <description>The latest articles on Forem by Lucien Boix (@lboix).</description>
    <link>https://forem.com/lboix</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F265599%2Fe696ad61-bae6-4fb0-81e6-d567b3ce3530.jpg</url>
      <title>Forem: Lucien Boix</title>
      <link>https://forem.com/lboix</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lboix"/>
    <language>en</language>
    <item>
      <title>AWS IAM : how to list unused access keys in your account</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Tue, 24 Sep 2024 20:06:34 +0000</pubDate>
      <link>https://forem.com/lboix/aws-iam-how-to-list-unused-access-keys-in-your-account-3kcf</link>
      <guid>https://forem.com/lboix/aws-iam-how-to-list-unused-access-keys-in-your-account-3kcf</guid>
      <description>&lt;p&gt;You have two options here.&lt;/p&gt;

&lt;p&gt;The best one is to activate the IAM &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/access-analyzer-getting-started.html" rel="noopener noreferrer"&gt;unused access analyzer&lt;/a&gt;, if you are willing to pay around 50 USD monthly for this service.&lt;br&gt;
It constantly scans your whole IAM section and lists warning events such as unused roles, unused permissions, unused passwords and, what interests us most here, unused access keys.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can even use EventBridge to be notified about these findings through an email or a Lambda (one that writes to your Slack channel, for example)&lt;/li&gt;
&lt;li&gt;Or simply add this check to your morning routine at work&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Otherwise you can launch the simple bash script I made &lt;a href="https://gist.github.com/lboix/f7981a3e573d110fbc01da99b9500a1a" rel="noopener noreferrer"&gt;here&lt;/a&gt;: it lists the active access keys that have not been used for more than 90 days.&lt;br&gt;
You can confidently start by deactivating them, then remove them after a few days.&lt;/p&gt;
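&lt;p&gt;The gist linked above is a bash script around the AWS CLI; the core staleness check can be sketched in Python like this (an illustrative sketch, not the author's script: the &lt;code&gt;keys&lt;/code&gt; mapping stands in for the output of &lt;code&gt;aws iam get-access-key-last-used&lt;/code&gt;):&lt;/p&gt;

```python
from datetime import datetime, timezone

STALE_AFTER_DAYS = 90

def stale_keys(keys, now=None):
    """Return the access key ids not used for more than STALE_AFTER_DAYS days.

    `keys` maps an access key id to its last-used datetime
    (None if the key was never used, which we also treat as stale).
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for key_id, last_used in keys.items():
        # range(STALE_AFTER_DAYS) holds 0..89, so "not in range" means 90 days or more
        if last_used is None or (now - last_used).days not in range(STALE_AFTER_DAYS):
            stale.append(key_id)
    return stale

# Example: one recently used key, one key untouched for half a year
keys = {
    "AKIA-FRESH": datetime(2024, 9, 1, tzinfo=timezone.utc),
    "AKIA-OLD": datetime(2024, 3, 1, tzinfo=timezone.utc),
}
print(stale_keys(keys, now=datetime(2024, 9, 24, tzinfo=timezone.utc)))  # ['AKIA-OLD']
```

The deactivation itself stays an AWS CLI call (&lt;code&gt;aws iam update-access-key --status Inactive&lt;/code&gt;), as in the gist.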

</description>
      <category>aws</category>
      <category>iam</category>
    </item>
    <item>
      <title>Datadog : how to filter metrics on tag "team"</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Tue, 17 Sep 2024 18:57:27 +0000</pubDate>
      <link>https://forem.com/lboix/datadog-how-to-filter-metrics-on-tag-team-2f83</link>
      <guid>https://forem.com/lboix/datadog-how-to-filter-metrics-on-tag-team-2f83</guid>
      <description>&lt;p&gt;We created a Datadog dashboard to monitor, across our organization, basic metrics about the health of our apps: error logs by service, Kubernetes container restarts, APM errors by service, etc.&lt;/p&gt;

&lt;p&gt;A few weeks ago, I wanted to add a "Team" filter to it: the goal was to help our different teams use it during their "morning routine" (a daily check of their application metrics).&lt;/p&gt;

&lt;p&gt;It turned out to be more challenging than I expected, but with the help of Datadog Support we managed to figure it out. I am sharing this knowledge here in case it helps you achieve the same goal.&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;The most important link is this one, listing all the available metrics in your Datadog account:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://app.datadoghq.com/metric/summary" rel="noopener noreferrer"&gt;https://app.datadoghq.com/metric/summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If in the "Tags" section of your metric (the one you want to use in your dashboard) you do not see the one you want to filter on (in our case "team"), then it means it has not been propagated correctly.&lt;/p&gt;

&lt;p&gt;We discovered with Datadog Support that the tagging is different given the nature of the metric you want to use in your dashboard.&lt;/p&gt;

&lt;p&gt;Here are the 3 usecases we identified, but first let's do some preparation if you plan to filter your metrics on teams.&lt;/p&gt;

&lt;h2&gt;Preparation&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;make sure all your different teams are described here: &lt;a href="https://app.datadoghq.com/teams" rel="noopener noreferrer"&gt;https://app.datadoghq.com/teams&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;make sure all of your services have the right team assigned: &lt;a href="https://app.datadoghq.com/services" rel="noopener noreferrer"&gt;https://app.datadoghq.com/services&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;make sure all your pods have a "team" label defined in their Deployment or DaemonSet manifest:&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spec:
  template:
    metadata:
      labels:
        team: your-team-name
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2&gt;Logs metric&lt;/h2&gt;

&lt;p&gt;Make sure your Datadog agent has this environment variable in its configuration: it will map the "team" label of your pods to the "team" tag on the metrics collected from them.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- name: DD_KUBERNETES_POD_LABELS_AS_TAGS
  value: '{"team":"team"}'
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Important: if you use a &lt;strong&gt;custom&lt;/strong&gt; logs metric in your dashboard (meaning it is defined &lt;a href="https://app.datadoghq.com/logs/pipelines/generate-metrics" rel="noopener noreferrer"&gt;here&lt;/a&gt;), then edit it and make sure to add the "team" tag in the "Group By" section like this:&lt;br&gt;
&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4gkjjq4v868gc5qarqxw.png" alt="Image description"&gt;&lt;br&gt;
Click on "Update Metric".&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Wait and see if it populates correctly in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://app.datadoghq.com/logs" rel="noopener noreferrer"&gt;https://app.datadoghq.com/logs&lt;/a&gt; (click on one recent log to see its details)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://app.datadoghq.com/metric/summary" rel="noopener noreferrer"&gt;https://app.datadoghq.com/metric/summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Kubernetes metric&lt;/h2&gt;

&lt;p&gt;If you want to filter your &lt;code&gt;kubernetes_state.container.*&lt;/code&gt; metrics for example, make sure this option is activated in your Datadog &lt;strong&gt;Cluster&lt;/strong&gt; agent configuration.&lt;/p&gt;

&lt;p&gt;If you set it up through &lt;a href="https://github.com/DataDog/helm-charts/blob/bc09ff3950999aeea1ee142e055b6be452902feb/charts/datadog/values.yaml#L194" rel="noopener noreferrer"&gt;Helm&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;datadog:
  kubeStateMetricsCore:
    labelsAsTags:
      pod:
        team: team
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If you set it up manually through a YAML manifest, make sure to update this &lt;a href="https://github.com/DataDog/datadog-agent/blob/main/Dockerfiles/manifests/kubernetes_state_core/cluster-agent-confd-configmap.yaml#L38" rel="noopener noreferrer"&gt;ConfigMap&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;labels_as_tags:
  pod:
    team: team
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Wait and see if it populates correctly in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://app.datadoghq.com/metric/summary" rel="noopener noreferrer"&gt;https://app.datadoghq.com/metric/summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;APM metric&lt;/h2&gt;

&lt;p&gt;If you want to filter on &lt;code&gt;trace.servlet.request.errors.by_http_status&lt;/code&gt; for example, you will need to add this environment variable to your Datadog agent configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- name: DD_APM_FEATURES
  value: 'enable_cid_stats'
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then go &lt;a href="https://app.datadoghq.com/apm/settings" rel="noopener noreferrer"&gt;here&lt;/a&gt; and "Aggregate APM metrics" by "team" like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xwqftztjkkby2uf7r5b0.png" alt="Image description"&gt;&lt;/p&gt;

&lt;p&gt;Wait and see if it populates correctly in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://app.datadoghq.com/metric/summary" rel="noopener noreferrer"&gt;https://app.datadoghq.com/metric/summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Plugging the team filter into your dashboard&lt;/h2&gt;

&lt;p&gt;Finally it's time to use this new tag you populated! On the upper right of your dashboard, click on the &lt;strong&gt;+&lt;/strong&gt; ("Add Variable") and specify it like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pb2a8eefeglyset3yl5q.png" alt="Image description"&gt;&lt;/p&gt;

&lt;p&gt;Then edit all the sections to add it to the scope of the metric displayed in each one of them, and save, like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mrx8g736r0xcundfe3c7.png" alt="Image description"&gt;&lt;/p&gt;

&lt;p&gt;Hope it helped, have a great day and happy monitoring!&lt;/p&gt;

&lt;h2&gt;Sources&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.datadoghq.com/containers/kubernetes/tag/?tab=manualdaemonset#pod-labels-as-tags" rel="noopener noreferrer"&gt;https://docs.datadoghq.com/containers/kubernetes/tag/?tab=manualdaemonset#pod-labels-as-tags&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.datadoghq.com/tracing/guide/setting_primary_tags_to_scope/?tab=kuberneteswithouthelm#container-based-second-primary-tags" rel="noopener noreferrer"&gt;https://docs.datadoghq.com/tracing/guide/setting_primary_tags_to_scope/?tab=kuberneteswithouthelm#container-based-second-primary-tags&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sre</category>
      <category>devops</category>
      <category>datadog</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Testing Flux V2 (or migrating from Flux V1) : TLDR</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Wed, 11 Oct 2023 19:51:17 +0000</pubDate>
      <link>https://forem.com/lboix/testing-flux-v2-or-migrating-from-flux-v1-tldr-3ih5</link>
      <guid>https://forem.com/lboix/testing-flux-v2-or-migrating-from-flux-v1-tldr-3ih5</guid>
      <description>&lt;p&gt;Whether you are migrating from deprecated &lt;a href="https://github.com/fluxcd/flux"&gt;FluxV1&lt;/a&gt; or decided to go GitOps by testing &lt;a href="https://fluxcd.io/flux/"&gt;FluxV2&lt;/a&gt;, the existing documentation can be intimidating. You may want to quickly test FluxV2 without using the CLI and its default behaviour of bootstrapping a new repo containing your Flux setup.&lt;/p&gt;

&lt;p&gt;That was my case, so I created &lt;a href="https://github.com/lboix/flux2-lite"&gt;this repo&lt;/a&gt; to help you get hands-on quickly by providing the simplest manifest templates.&lt;/p&gt;

&lt;p&gt;I hope it will help you discover the GitOps philosophy and start great things following it!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>gitops</category>
    </item>
    <item>
      <title>How many pods can run by default on an EKS node ?</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Wed, 12 Jul 2023 16:46:55 +0000</pubDate>
      <link>https://forem.com/lboix/how-many-pods-can-run-by-default-on-an-eks-node--3gh6</link>
      <guid>https://forem.com/lboix/how-many-pods-can-run-by-default-on-an-eks-node--3gh6</guid>
      <description>&lt;p&gt;As you know, in EKS each of your pods has a private IP assigned. This means that the maximum number of pods a node can handle is directly linked to the maximum number of &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html"&gt;ENIs&lt;/a&gt; (Elastic Network Interfaces) possible, and their IP addresses, for the EC2 instance type of the node you are using.&lt;/p&gt;

&lt;p&gt;I recently discovered that there are two hard limits applicable here, so I am sharing them in this small post in case it saves you some time.&lt;/p&gt;

&lt;p&gt;First take the number from &lt;a href="https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt"&gt;this file&lt;/a&gt; regarding your node instance type.&lt;/p&gt;

&lt;p&gt;If your node is inside a managed node group where the AMI is pinned:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the number above is your max&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your node is inside a managed node group where the AMI is NOT pinned:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if the number above is &amp;lt; 110 then this is your max&lt;/li&gt;
&lt;li&gt;if the number above is &amp;gt; 110 and your instance type has less than 30 vCPU, then 110 is your max (explained &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/create-managed-node-group.html#:~:text=maximum%20number%20is-,110,-.%20For%20instances%20with"&gt;here&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;if the number above is &amp;gt; 110 and your instance type has more than 30 vCPU, then 250 is your max (explained &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/create-managed-node-group.html#:~:text=number%20jumps%20to-,250,-.%20These%20numbers%20are"&gt;here&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So technically you multiply this final number by the number of nodes in your cluster (assuming they all have the same instance type) and you get the maximum number of pods that can run inside your EKS cluster.&lt;/p&gt;

&lt;p&gt;If you want to double-check this value for a node, you can simply run these kubectl commands:&lt;br&gt;
&lt;code&gt;kubectl get nodes&lt;/code&gt;&lt;br&gt;
&lt;code&gt;kubectl describe node NODE_NAME | grep 'pods\|PrivateIP'&lt;/code&gt;&lt;/p&gt;
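&lt;p&gt;The decision tree above can be summed up in a small helper (an illustrative Python sketch: &lt;code&gt;eni_file_number&lt;/code&gt; is the value you read from the eni-max-pods.txt file for your instance type):&lt;/p&gt;

```python
def eks_max_pods(eni_file_number, vcpus, ami_pinned):
    """Max pods per node for an EKS managed node group.

    eni_file_number: the value from eni-max-pods.txt for the instance type
    vcpus: vCPU count of the instance type
    ami_pinned: True if the node group pins a specific AMI
    """
    if ami_pinned:
        # Pinned AMI: the ENI-based number applies as-is
        return eni_file_number
    # Unpinned AMI: EKS caps the value at 110 (less than 30 vCPUs)
    # or 250 (30 vCPUs or more); range(30) holds 0..29
    cap = 110 if vcpus in range(30) else 250
    return min(eni_file_number, cap)

# m5.large: 29 in eni-max-pods.txt, 2 vCPUs
print(eks_max_pods(29, 2, ami_pinned=False))    # 29
# m5.24xlarge: 737 in eni-max-pods.txt, 96 vCPUs
print(eks_max_pods(737, 96, ami_pinned=False))  # 250
```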

</description>
      <category>kubernetes</category>
      <category>aws</category>
    </item>
    <item>
      <title>A CloudFront Function to remove a specific value at the beginning of an URL</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Wed, 25 Jan 2023 22:48:38 +0000</pubDate>
      <link>https://forem.com/lboix/a-cloudfrontfunction-to-remove-a-specific-value-at-the-beginning-of-an-url-3f3d</link>
      <guid>https://forem.com/lboix/a-cloudfrontfunction-to-remove-a-specific-value-at-the-beginning-of-an-url-3f3d</guid>
      <description>&lt;p&gt;My use case was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the CDN was delivering images from a S3 bucket&lt;/li&gt;
&lt;li&gt;the images URL pattern was &lt;code&gt;https://URL/images/something.jpg&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;there was no "images" folder at the root of the S3 bucket: images like &lt;code&gt;something.jpg&lt;/code&gt; were directly there&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I used this function and associated it with the right Behavior ("Viewer request" option) of my CloudFront Distribution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function handler(event) {

    var request = event.request;

    if (request.uri.startsWith("/images/")) {
        // substring(7) drops the "/images" prefix (7 characters)
        // while keeping the leading slash of the object key
        request.uri = request.uri.substring(7);
    }

    return request;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let me know if that helped you or if you have suggestions for improvement. Take care!&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>writing</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>How to do a thread dump on a pod running a Java app ?</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Tue, 24 Jan 2023 22:46:59 +0000</pubDate>
      <link>https://forem.com/lboix/how-to-do-a-thread-dump-on-a-pod-running-a-java-app--1cl8</link>
      <guid>https://forem.com/lboix/how-to-do-a-thread-dump-on-a-pod-running-a-java-app--1cl8</guid>
      <description>&lt;p&gt;If your Java app is struggling with busy threads piling up, there is nothing better than having a look at the state of those threads to see what their last action was before they hung.&lt;/p&gt;

&lt;p&gt;Here is a simple TODO to achieve that if your app is running inside a Kubernetes pod (we will assume the pod runs only one container).&lt;/p&gt;

&lt;p&gt;Open your terminal and tail the logs of your pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get po |grep "YOUR_APP"
kubectl logs -f POD_NAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open a new tab in your terminal, and launch the thread dump:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# connect to your pod's container
kubectl exec -it POD_NAME -- sh
# find the PID of your Java process (it should be 1)
ps aux
# force a thread dump to stdout (do not worry : this will not kill the application)
kill -3 YOUR_PID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Go back to your first tab and analyse the results.&lt;/p&gt;
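&lt;p&gt;With large dumps, a quick tally of the thread states helps you spot piles of BLOCKED or WAITING threads; here is an illustrative Python sketch (assuming the usual &lt;code&gt;java.lang.Thread.State:&lt;/code&gt; lines of a HotSpot thread dump):&lt;/p&gt;

```python
from collections import Counter

def thread_state_summary(dump_text):
    """Count thread states in a HotSpot thread dump."""
    counter = Counter()
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("java.lang.Thread.State:"):
            # Keep only the state keyword, e.g. "BLOCKED" from "BLOCKED (on object monitor)"
            state = line.split(":", 1)[1].strip().split(" ")[0]
            counter[state] += 1
    return counter

dump = """
"http-nio-8080-exec-1" #42 daemon prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
"http-nio-8080-exec-2" #43 daemon prio=5
   java.lang.Thread.State: RUNNABLE
"""
print(thread_state_summary(dump))  # Counter({'BLOCKED': 1, 'RUNNABLE': 1})
```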

&lt;p&gt;For example, it once allowed me to quickly find out that I had a locked key in my Redis instance. What else have you discovered through thread dumps? Please share your experiences in the comments.&lt;/p&gt;

&lt;p&gt;Take care and have a great day!&lt;/p&gt;

</description>
      <category>node</category>
      <category>oauth</category>
      <category>backend</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>ElasticSearch cluster sanity check and first-aid kit</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Fri, 14 Oct 2022 18:46:44 +0000</pubDate>
      <link>https://forem.com/lboix/elasticsearch-cluster-first-aid-kit-484d</link>
      <guid>https://forem.com/lboix/elasticsearch-cluster-first-aid-kit-484d</guid>
      <description>&lt;p&gt;Here are some useful commands I have used in the past to help you fix your yellow or red cluster, especially when you have unassigned shards. If you have suggestions for improvements, please let me know in the comments. I wish you a great day!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# see cluster health
GET _cluster/health?pretty

# see nodes status
GET _cat/nodes?pretty&amp;amp;v=true

# see a summary of the JVM statistics (memory usage, is the GC triggering a lot, etc.)
GET /_nodes/stats/jvm?pretty

# see shards status
GET /_cat/shards?v

# see shards allocation (useful to detect if a node has a disk space full)
GET /_cat/allocation?v

# get detailed reason for the first unassigned shard
GET /_cluster/allocation/explain

# get the reason for any unhealthy shard
GET _cat/shards?h=index,shard,prirep,state,unassigned.reason

# the detail of an unhealthy shard can be found here : https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html#_example_with_reasons_for_unassigned_shards:~:text=unassigned.-,reason,-%2C%20ur

# if the unassigned shard belongs to an index you can get rid of (logs of a past day for example), the easiest fix is to remove the related index
GET _cat/indices?v
DELETE /your_index

# if the unassigned shard belongs to an index you can NOT get rid of (production data), then try to reroute it to another node (if it fails the precise reason will be described) : example for primary shard #2 (use "allow_primary": false for a replica shard) of your-index (remove the ?dry_run parameter to actually reroute the shard)
POST _cluster/reroute?dry_run
{
    "commands" : [
        {
          "allocate" : {
              "index" : "your-index", "shard" : 2, "node" : "new-node-name", "allow_primary": true
          }
        }
    ]
}

# if your stuck shard is not in UNASSIGNED status but rather in INITIALIZING status
## if you are with ES7+ then you can force the reassignment of the shard with the command above, but replace allocate with allocate_stale (I never tested it myself actually, only read about this)
## if not and you are comfortable, you can try to reboot the node currently assigned to this shard : after the restart, the shard should be back to UNASSIGNED status and you will be able to use the command above (I never tested it myself actually, only read about this)

# check your cluster settings (allocation rules for example)
GET _cluster/settings

# exclude the IP of a bad node for the shard allocation
PUT _cluster/settings
{
  "transient" :{
      "cluster.routing.allocation.exclude._ip" : "your-node-ip"
  }
}

# check your index settings (shards and replicas number for example)
GET /your-index/_settings

# if you have a replica unassigned shard, a known workaround is to put to 0 the number of replicas (it will delete replica shards) then put it back to its original value (it will recreate them). But I recommend to AVOID doing this as it will put a big load on the cluster, and it's a risky procedure especially if the state of the cluster is red
PUT /your-index/_settings
{
    "index" : {
        "number_of_replicas" : 0
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
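&lt;p&gt;To go through a long shard list faster, you can also filter the output of the &lt;code&gt;GET _cat/shards?h=index,shard,prirep,state,unassigned.reason&lt;/code&gt; command above with a few lines of Python (an illustrative sketch, assuming the column order given in the &lt;code&gt;h&lt;/code&gt; parameter):&lt;/p&gt;

```python
def unassigned_shards(cat_shards_output):
    """Filter _cat/shards rows (index shard prirep state [reason]) to the UNASSIGNED ones."""
    result = []
    for line in cat_shards_output.strip().splitlines():
        fields = line.split()
        # range(4) holds 0..3, so this skips rows with fewer than 4 columns
        if len(fields) in range(4):
            continue
        if fields[3] == "UNASSIGNED":
            result.append({"index": fields[0], "shard": fields[1], "reason": fields[-1]})
    return result

sample = """
logs-2022.10.13 0 p STARTED
logs-2022.10.14 0 p UNASSIGNED NODE_LEFT
logs-2022.10.14 0 r UNASSIGNED REPLICA_ADDED
"""
print(unassigned_shards(sample))
```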



</description>
    </item>
    <item>
      <title>TODO for smoothly upgrading Kubernetes version</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Fri, 07 Oct 2022 21:54:46 +0000</pubDate>
      <link>https://forem.com/lboix/todo-for-upgrading-kubernetes-version-lnk</link>
      <guid>https://forem.com/lboix/todo-for-upgrading-kubernetes-version-lnk</guid>
      <description>&lt;p&gt;During the last months, I tried to come up with a simple TODO to optimize the upgrade process and make it as smooth as possible for your workload, with no downtime (by avoiding too many pods starting at the same time, hitting your container registry rate limit, etc.).&lt;/p&gt;

&lt;p&gt;So I am sharing it here in case it can help you with your first upgrade:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Preparation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First start reading this article for potential breaking changes (especially regarding deleted apiVersion : you need to update them before going further!) : &lt;a href="https://kubernetes.io/docs/reference/using-api/deprecation-guide/"&gt;https://kubernetes.io/docs/reference/using-api/deprecation-guide/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;If you have one, always start by upgrading your testing / staging cluster before your production one (I strongly suggest it)&lt;/li&gt;
&lt;li&gt;Monitor it for a few days, just to be sure that there is no bad side effect on your workload with this upgrade&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Upgrading master nodes&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start by upgrading the Kubernetes version of your Control Plane aka your master nodes, depending on the tool you are using (kops, EKS, AKS, etc.)&lt;/li&gt;
&lt;li&gt;If you are using cluster-autoscaler, scale it down :
&lt;code&gt;kubectl scale --replicas=0 deployment/cluster-autoscaler -n kube-system&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If you are using a GitOps agent (like flux in this example), scale it down :
&lt;code&gt;kubectl scale --replicas=0 deployment/flux -n flux&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create a new node group with the same new Kubernetes version&lt;/li&gt;
&lt;li&gt;Set a 2-hour maintenance window in your monitoring tool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rolling out worker nodes&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Drain each node one by one on the old node group, use &lt;code&gt;kubectl get nodes -o wide&lt;/code&gt; to pick the right ones (running old Kubernetes version) :
&lt;code&gt;kubectl drain node_name --ignore-daemonsets --delete-emptydir-data&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;After one drain, wait for all Evicted pods to restart correctly by running this command to get unhealthy pods across the cluster :
&lt;code&gt;kubectl get po -A | grep "0/" | grep -v "Completed"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Wait a few minutes until you only have a few lines left, then move to the next node&lt;/li&gt;
&lt;li&gt;After the last node, once the command above returns no results at all, you are good to continue!&lt;/li&gt;
&lt;/ul&gt;
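&lt;p&gt;If you prefer to script the wait loop, the &lt;code&gt;kubectl get po -A | grep "0/" | grep -v "Completed"&lt;/code&gt; check above can be expressed as a small Python filter over the kubectl output (an illustrative sketch of the same logic):&lt;/p&gt;

```python
def unhealthy_pods(kubectl_output):
    """Return (namespace, pod) pairs with 0 ready containers, ignoring Completed pods.

    Mirrors: kubectl get po -A | grep "0/" | grep -v "Completed"
    """
    unhealthy = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        namespace, name, ready, status = fields[0], fields[1], fields[2], fields[3]
        if ready.startswith("0/") and status != "Completed":
            unhealthy.append((namespace, name))
    return unhealthy

sample = """NAMESPACE   NAME        READY   STATUS      RESTARTS   AGE
default     api-1       0/1     Pending     0          1m
default     api-2       1/1     Running     0          5d
kube-system job-x       0/1     Completed   0          2h
"""
print(unhealthy_pods(sample))  # [('default', 'api-1')]
```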




&lt;ul&gt;
&lt;li&gt;Once ALL nodes of old node group have been drained, you can delete it&lt;/li&gt;
&lt;li&gt;Check the completed deletion using again &lt;code&gt;kubectl get nodes -o wide&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Wrapping up&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you are using cluster-autoscaler, upgrade the version used:&lt;/li&gt;
&lt;li&gt;find the latest release number that matches the new Kubernetes version of your cluster: &lt;a href="https://github.com/kubernetes/autoscaler/releases"&gt;https://github.com/kubernetes/autoscaler/releases&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;type the major version number in the search field at the top right to filter easily&lt;/li&gt;
&lt;li&gt;update the Docker image tag used by cluster-autoscaler:
&lt;code&gt;kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=k8s.gcr.io/autoscaling/cluster-autoscaler:v1.MAJOR.minor&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;scale it back up and make sure the pod starts correctly by checking its logs:
&lt;code&gt;kubectl scale --replicas=1 deployment/cluster-autoscaler -n kube-system&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;ul&gt;
&lt;li&gt;If you are using a GitOps agent (like flux in this example), scale it up, check the logs and make sure that it syncs well:
&lt;code&gt;kubectl scale --replicas=1 deployment/flux -n flux&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;ul&gt;
&lt;li&gt;Check your monitoring tools and resolve muted alerts that may have been triggered by the rollout&lt;/li&gt;
&lt;li&gt;Announce to your team that the rollout is done and all went well :)&lt;/li&gt;
&lt;li&gt;Commit and push all the version modifications you made in your cluster repo, if you have one (I strongly suggest it)&lt;/li&gt;
&lt;li&gt;That's it!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have any suggestion to improve this TODO, do not hesitate to let me know in the comments below. Thanks for reading and I wish you a great day!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
    </item>
    <item>
      <title>Filebeat config on k8s after switching to containerd</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Thu, 18 Aug 2022 20:55:00 +0000</pubDate>
      <link>https://forem.com/lboix/filebeat-config-on-k8s-after-switching-to-containerd-1p6o</link>
      <guid>https://forem.com/lboix/filebeat-config-on-k8s-after-switching-to-containerd-1p6o</guid>
      <description>&lt;p&gt;You cannot have missed it: dockershim (the layer that lets Kubernetes use the Docker runtime) &lt;a href="https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/"&gt;will be removed starting with 1.24&lt;/a&gt;. Do not worry, it is a pretty seamless change and your images built with Docker will remain fully functional.&lt;/p&gt;

&lt;p&gt;But if your current cluster nodes are running on the Docker runtime, you most likely have some hardcoded configuration tied to Docker.&lt;/p&gt;

&lt;p&gt;In this article we will focus on a &lt;strong&gt;filebeat&lt;/strong&gt; configuration originally setup for Docker Runtime, and what needs to be done after the switch to &lt;strong&gt;containerd&lt;/strong&gt; in order to keep getting your precious logs.&lt;/p&gt;

&lt;p&gt;The main steps are updating your &lt;strong&gt;filebeat&lt;/strong&gt; config file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;activate the &lt;code&gt;symlinks&lt;/code&gt; option&lt;/li&gt;
&lt;li&gt;update the path of the log files&lt;/li&gt;
&lt;li&gt;use the &lt;code&gt;dissect&lt;/code&gt; and &lt;code&gt;drop_fields&lt;/code&gt; processors together to parse and keep only what is necessary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then update the &lt;em&gt;volumeMounts&lt;/em&gt; section of your &lt;strong&gt;filebeat&lt;/strong&gt; &lt;em&gt;DaemonSet&lt;/em&gt; definition:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;each existing &lt;em&gt;mountPath&lt;/em&gt; or &lt;em&gt;path&lt;/em&gt; with value &lt;code&gt;/var/lib/docker/containers&lt;/code&gt; will need to be changed to &lt;code&gt;/var/log/containers&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a snippet of a &lt;strong&gt;filebeat&lt;/strong&gt; config file that worked for me; do not hesitate to let us know if it helped you in some way or if you have a suggestion for improvement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
data:
  filebeat.yml: |-
    setup.ilm.enabled: false
    filebeat.inputs:
    - type: log
      symlinks: true
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            in_cluster: true
            default_matchers.enabled: false
            matchers:
            - logs_path:
                logs_path: /var/log/containers/

    processors:
      - add_cloud_metadata:
      - drop_event:
          when:
            equals:
              kubernetes.namespace: "kube-system"
      - dissect:
          tokenizer: "%{timestamp} %{std} %{capital-letter} %{parsed-message}"
          field: "message"
          target_prefix: ""
      - decode_json_fields:
          fields: ["message","log","logs.log","parsed-message"]
          target: "logs"
          process_array: true
      - drop_fields:
          when:
            regexp:
              message: "^{\""
          fields: ["message"]
          ignore_missing: true
      - drop_fields:
          fields: ["log.file.path","timestamp","std","capital-letter","parsed-message"]
          ignore_missing: true

...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
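&lt;p&gt;To make the &lt;code&gt;dissect&lt;/code&gt; processor in the config above more concrete, here is an illustrative Python equivalent of what its tokenizer does to a containerd log line (a sketch of the behaviour, not how filebeat is implemented):&lt;/p&gt;

```python
def dissect(line):
    """Split a containerd log line the way the dissect tokenizer in the config does.

    Tokenizer: "%{timestamp} %{std} %{capital-letter} %{parsed-message}"
    The first three tokens are space-delimited; the rest is the message.
    """
    timestamp, std, letter, message = line.split(" ", 3)
    return {
        "timestamp": timestamp,
        "std": std,
        "capital-letter": letter,
        "parsed-message": message,
    }

# A containerd-style log line: timestamp, stream, partial/full flag, payload
line = '2022-08-18T20:55:00.000000000Z stdout F {"level":"info","msg":"started"}'
print(dissect(line)["parsed-message"])  # {"level":"info","msg":"started"}
```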



&lt;p&gt;Have a great day!&lt;/p&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Go snippet for creating an Ingress rule</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Thu, 18 Aug 2022 16:16:00 +0000</pubDate>
      <link>https://forem.com/lboix/go-snippet-for-creating-an-ingress-rule-2ici</link>
      <guid>https://forem.com/lboix/go-snippet-for-creating-an-ingress-rule-2ici</guid>
      <description>&lt;p&gt;You probably need to migrate your Ingress rules to apiVersion &lt;code&gt;networking.k8s.io/v1&lt;/code&gt; (as of Kubernetes 1.22, the old apiVersions &lt;code&gt;extensions/v1beta1&lt;/code&gt; and &lt;code&gt;networking.k8s.io/v1beta1&lt;/code&gt; simply disappear). If you are managing your Ingress rules through Go, here is a snippet to generate a valid Ingress rule, in case that can help you (I struggled a little to find the correct template, so I am sharing this post).&lt;/p&gt;

&lt;p&gt;Please let me know if this snippet was useful or if you see some improvements we can make to it.&lt;/p&gt;

&lt;p&gt;Have a great day!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import (
    v1Networking "k8s.io/api/networking/v1"
    v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func MapIngress(ingressName string, hostName string) *v1Networking.Ingress {

    annotations := map[string]string{}
    annotations["kubernetes.io/ingress.provider"] = "nginx"
    annotations["kubernetes.io/ingress.class"] = "yourIngressClass"
    annotations["kubernetes.io/tls-acme"] = "true"
    // add other annotations you need

    meta := v1.ObjectMeta{
        Name:        ingressName,
        Annotations: annotations,
    }

    pathTypeImplementationSpecific := v1Networking.PathTypeImplementationSpecific

    return &amp;amp;v1Networking.Ingress{
        ObjectMeta: meta,
        Spec: v1Networking.IngressSpec{
            TLS: []v1Networking.IngressTLS{
                v1Networking.IngressTLS{
                    Hosts:      []string{hostName},
                    SecretName: "yourSecretName",
                },
            },
            Rules: []v1Networking.IngressRule{
                v1Networking.IngressRule{
                    Host: hostName,
                    IngressRuleValue: v1Networking.IngressRuleValue{
                        HTTP: &amp;amp;v1Networking.HTTPIngressRuleValue{
                            Paths: []v1Networking.HTTPIngressPath{
                                v1Networking.HTTPIngressPath{
                                    Path: "/",
                                    PathType: &amp;amp;pathTypeImplementationSpecific,
                                    Backend: v1Networking.IngressBackend{
                                        Service: &amp;amp;v1Networking.IngressServiceBackend{
                                            Name: "yourServiceName",
                                            Port: v1Networking.ServiceBackendPort{
                                                Number: 80,
                                            },
                                        },
                                    },
                                },
                            },
                        },
                    },
                },
            },
        },
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>kubernetes</category>
      <category>go</category>
    </item>
    <item>
      <title>EKS : migrate your Service to a Network Load Balancer</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Wed, 17 Aug 2022 22:59:00 +0000</pubDate>
      <link>https://forem.com/lboix/eks-migrate-your-service-to-a-network-load-balancer-nkh</link>
      <guid>https://forem.com/lboix/eks-migrate-your-service-to-a-network-load-balancer-nkh</guid>
      <description>&lt;p&gt;Whatever your use case (better performance, a slightly lower AWS bill, etc.), switching to a Network Load Balancer for your Kubernetes cluster is often a good call. You get a performance gain because traffic is received on a more basic layer (layer 4 of the &lt;a href="https://en.wikipedia.org/wiki/OSI_model#Layer_architecture" rel="noopener noreferrer"&gt;OSI model&lt;/a&gt;), and since the routing logic is often already handled at the application level by your Ingress Controller or a service mesh like Istio, you lose nothing by dropping down a layer.&lt;/p&gt;

&lt;p&gt;In my case, for example, I specifically needed the possibility to use static IPs for my Network Load Balancer (through the Amazon Elastic IP feature).&lt;/p&gt;

&lt;p&gt;After hours of tests and digging, here is a snippet that can be a good starting point for your switch. You will first create a new Service exposing the same Deployment as your existing Service, so you end up with two load balancers reachable and forwarding traffic to the same app. This is really useful to gracefully switch the traffic through DNS, test things, and roll back quickly if needed (a TTL of 300 seconds is acceptable for that).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

kind: Service
apiVersion: v1
metadata:
  name: public-ingress-nginx-nlb
  namespace: prod
  labels:
    app: public-ingress-nginx-nlb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: '60'
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'
    service.beta.kubernetes.io/aws-load-balancer-type: 'nlb'
    service.beta.kubernetes.io/aws-load-balancer-eip-allocations: "eipalloc-AAA,eipalloc-BBB,eipalloc-CCC"
    service.beta.kubernetes.io/aws-load-balancer-subnets: "subnet-AAA,subnet-BBB,subnet-CCC"
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: proxy_protocol_v2.enabled=true
    # unless you use the AWS Load Balancer Controller, this last annotation needs to be activated manually in the Target groups / Attributes tab
spec:
  type: LoadBalancer
  selector:
    app: public-ingress-nginx
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    - name: https
      port: 443
      protocol: TCP
      targetPort: https


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Notes :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The annotations &lt;code&gt;service.beta.kubernetes.io/aws-load-balancer-eip-allocations&lt;/code&gt; and &lt;code&gt;service.beta.kubernetes.io/aws-load-balancer-subnets&lt;/code&gt; are optional: you only need them if you want to attach static IPs to your Network Load Balancer. If you do, you first need to allocate the Elastic IPs in EC2 (they must belong to your account and not be currently in use). You do not need three; one will work, but I recommend three for production traffic. For redundancy, AWS forces you to define one public subnet per Elastic IP, and each subnet must be in a different Availability Zone of the Region you are using.&lt;/li&gt;
&lt;li&gt;To be able to use the &lt;a href="https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#use-proxy-protocol" rel="noopener noreferrer"&gt;PROXY protocol&lt;/a&gt; correctly, note that the annotation &lt;code&gt;service.beta.kubernetes.io/aws-load-balancer-target-group-attributes&lt;/code&gt; will not work if you have not set up the &lt;a href="https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/deploy/installation/" rel="noopener noreferrer"&gt;AWS Load Balancer Controller&lt;/a&gt; on your cluster. Until you have, do not forget to activate this option through EC2: edit each Target Group of your Network Load Balancer and check this option:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foj5o60w7ks1piebrbyrs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foj5o60w7ks1piebrbyrs.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After a few days or weeks, if everything is working as expected, do not forget to delete your original Service: this will automatically tear down the old Classic or Application Load Balancer you were using, with no downtime or impact on the Service linked to your Network Load Balancer. &lt;strong&gt;Also do not forget to update the &lt;code&gt;--publish-service&lt;/code&gt; argument of your NGINX Ingress Controller containers managed by your DaemonSet or Deployment specs.&lt;/strong&gt;&lt;/p&gt;
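&lt;p&gt;For reference, the switch itself can be sketched like this with &lt;code&gt;kubectl&lt;/code&gt; (the old Service name below is a placeholder, adapt it to your setup):&lt;/p&gt;

```shell
# Create the new Service next to the existing one
kubectl apply -f public-ingress-nginx-nlb.yaml

# Wait until the new Network Load Balancer hostname shows up under EXTERNAL-IP
kubectl get service public-ingress-nginx-nlb --namespace prod --watch

# Days or weeks later, once DNS points to the new NLB and everything works,
# delete the old Service (this tears down the old load balancer)
kubectl delete service your-old-service --namespace prod
```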

&lt;p&gt;Let me know if this page helped you in some way or if you have some suggestions for improvements.&lt;/p&gt;

&lt;p&gt;Have a great day!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>aws</category>
    </item>
    <item>
      <title>How to fix an npm install issue?</title>
      <dc:creator>Lucien Boix</dc:creator>
      <pubDate>Mon, 08 Aug 2022 21:31:00 +0000</pubDate>
      <link>https://forem.com/lboix/how-to-fix-a-npm-install-issue--37kp</link>
      <guid>https://forem.com/lboix/how-to-fix-a-npm-install-issue--37kp</guid>
      <description>&lt;p&gt;Sometimes it can be good to start from scratch, especially if you are opening an old or legacy project. Here are the steps to follow to get a working dependency graph again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;delete the existing &lt;strong&gt;package-lock.json&lt;/strong&gt; file&lt;/li&gt;
&lt;li&gt;run the following commands:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;node -v&lt;/code&gt; (make sure you are using the same Node version as the pipeline that will build and deploy your project; if not, see below)&lt;br&gt;
&lt;code&gt;npm cache clean -f&lt;/code&gt;&lt;br&gt;
&lt;code&gt;npm install&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;All should be good now.&lt;/p&gt;
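&lt;p&gt;If you prefer to run those steps in one go, they can be sketched as a small script (run it from your project root):&lt;/p&gt;

```shell
#!/usr/bin/env bash
set -e

# start from a clean dependency state
rm -f package-lock.json

# check the Node version against the one used by your pipeline
node -v

# force-clean the npm cache and rebuild the dependency graph
npm cache clean -f
npm install
```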

&lt;blockquote&gt;
&lt;p&gt;If you need to quickly switch to a different Node version from the one currently set up on your workstation, you can use the handy package &lt;a href="https://www.npmjs.com/package/n"&gt;n&lt;/a&gt;:&lt;br&gt;
&lt;code&gt;npm install -g n&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sudo n stable&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sudo n&lt;/code&gt; (choose a version and press Enter)&lt;br&gt;
&lt;code&gt;node -v&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Note: the same remark applies if you need to run an &lt;code&gt;npm audit fix&lt;/code&gt;: make sure you are using the same Node version as the pipeline that will build and deploy your project.&lt;/p&gt;

</description>
      <category>node</category>
    </item>
  </channel>
</rss>
