<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Jens Gerke</title>
    <description>The latest articles on Forem by Jens Gerke (@jensgst).</description>
    <link>https://forem.com/jensgst</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F658380%2Fcc107870-d93c-480c-974d-c4a7d97c92d3.png</url>
      <title>Forem: Jens Gerke</title>
      <link>https://forem.com/jensgst</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jensgst"/>
    <language>en</language>
    <item>
      <title>Preloading Ollama Models</title>
      <dc:creator>Jens Gerke</dc:creator>
      <pubDate>Tue, 26 Mar 2024 15:44:18 +0000</pubDate>
      <link>https://forem.com/jensgst/preloading-ollama-models-221k</link>
      <guid>https://forem.com/jensgst/preloading-ollama-models-221k</guid>
      <description>&lt;p&gt;A few weeks ago, I started using &lt;a href="https://github.com/ollama/ollama"&gt;Ollama&lt;/a&gt; to run large language models (LLMs), and I've been enjoying it a lot. After getting the hang of it, I thought it was about time to try it out on one of our real-world cases (I'll share more about this later).&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://github.com/direktiv/direktiv"&gt;Direktiv&lt;/a&gt; we use Kubernetes for all our deployments, and when I tried to run Ollama as a pod, I ran into a couple of issues.&lt;/p&gt;

&lt;p&gt;The first issue was that Ollama downloads models on demand, which is logical given its support for multiple models. The required model has to be fetched at startup, and with model sizes ranging from 1.5GB to 40GB this significantly extends the time until the container is ready to serve requests.&lt;/p&gt;

&lt;p&gt;To start the download, you either make an API call or use the CLI to fetch the model you need. In a Kubernetes setup, you can handle this with a &lt;code&gt;postStart&lt;/code&gt; lifecycle hook. Here's a simple example of an Ollama deployment I put together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama/ollama:0.1.29&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http&lt;/span&gt;
          &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;11434&lt;/span&gt;
          &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
        &lt;span class="na"&gt;lifecycle&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;postStart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/bin/sh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pull&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;gemma:2b"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That went okay, but the startup problem remained: the lifecycle hook took ages to run, and it won't work on Kubernetes nodes without internet access. At &lt;a href="https://github.com/direktiv/direktiv"&gt;Direktiv&lt;/a&gt; we also use Knative a lot, which does not support lifecycle events. So, my plan was to create a container image that uses the Ollama image as its base and has the model pre-downloaded.&lt;/p&gt;

&lt;p&gt;So, a little hiccup is that Ollama runs as an HTTP service with an API, which makes it a bit tricky to run the &lt;code&gt;pull model&lt;/code&gt; command when building the container image to have the models ready to go right from the start. No services in &lt;code&gt;docker build&lt;/code&gt;, remember?&lt;/p&gt;

&lt;p&gt;There have been a couple of GitHub issues pointing out this problem, and the usual workaround is to start an Ollama container, pull the model, and then copy the downloaded models into a new container build. Personally, I didn't find this process a good fit for an automated build.&lt;/p&gt;

&lt;p&gt;Got my developer gloves on and thought, "How hard can it be?" 🧤 Excited that all the download functions in the project were exported, but oh boy, the dependencies didn't play nice! Ended up having to copy and tweak the existing setup. Voila! Now we've got a neat little container for a multi-stage build. Check out the project here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/jensg-st/ollama-pull"&gt;https://github.com/jensg-st/ollama-pull&lt;/a&gt; 💥&lt;/p&gt;

&lt;p&gt;With this container, you can fetch the model in the first stage - in this scenario, it's &lt;code&gt;gemma:2b&lt;/code&gt;. For the main container you can still use the default &lt;code&gt;ollama/ollama&lt;/code&gt; image. The model simply needs to be copied from the &lt;code&gt;downloader&lt;/code&gt; to the main container at &lt;code&gt;/root/.ollama&lt;/code&gt;. You can even download multiple models in the first stage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;gerke74/ollama-model-loader&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;downloader&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;/ollama-pull gemma:2b

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; ollama/ollama &lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; OLLAMA_HOST="0.0.0.0"&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=downloader /root/.ollama /root/.ollama&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's build it and run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;' &amp;gt; Dockerfile
FROM gerke74/ollama-model-loader AS downloader
RUN /ollama-pull gemma:2b
FROM ollama/ollama
ENV OLLAMA_HOST="0.0.0.0"
COPY --from=downloader /root/.ollama /root/.ollama
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; gemma &lt;span class="nb"&gt;.&lt;/span&gt; 
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 11437:11434 gemma
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following curl command sends a prompt to the container. The &lt;code&gt;model&lt;/code&gt; field must match the model that was baked into the image, in this case &lt;code&gt;gemma:2b&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:11437/api/generate &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "gemma:2b",
  "prompt": "Why is the sky blue?"
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The container streams its response as a sequence of JSON objects, one per line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"model":"gemma:2b","created_at":"2024-03-26T15:16:56.780177872Z","response":"The","done":false}
{"model":"gemma:2b","created_at":"2024-03-26T15:16:57.003156881Z","response":" sky","done":false}
{"model":"gemma:2b","created_at":"2024-03-26T15:16:57.223483082Z","response":" appears","done":false}
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
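
&lt;p&gt;With the model baked into the image, the Deployment from the beginning of this post no longer needs the &lt;code&gt;postStart&lt;/code&gt; hook. The following is only a minimal sketch, assuming the image built above has been pushed to a registry the cluster can reach. The image name &lt;code&gt;registry.example.com/ollama-gemma:latest&lt;/code&gt; is a placeholder, not a published image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  selector:
    matchLabels:
      name: ollama
  template:
    metadata:
      labels:
        name: ollama
    spec:
      containers:
      - name: ollama
        # Placeholder image name: the image built above, pushed to your own registry.
        image: registry.example.com/ollama-gemma:latest
        ports:
        - name: http
          containerPort: 11434
          protocol: TCP
        # No lifecycle.postStart hook here: the model is already in /root/.ollama.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because nothing is downloaded at startup, the same image should also work on nodes without internet access, as long as they can pull from the registry.&lt;/p&gt;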



&lt;p&gt;Please feel free to comment if that was helpful or if something is not working. In the next few posts I will add some real-life functionality to this. &lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>docker</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Knative Serverless in 2024</title>
      <dc:creator>Jens Gerke</dc:creator>
      <pubDate>Wed, 20 Mar 2024 08:17:04 +0000</pubDate>
      <link>https://forem.com/jensgst/knative-serverless-in-2024-dom</link>
      <guid>https://forem.com/jensgst/knative-serverless-in-2024-dom</guid>
      <description>&lt;p&gt;At &lt;a href="https://github.com/direktiv/direktiv"&gt;Direktiv&lt;/a&gt;, we're big fans of &lt;a href="https://knative.dev/docs/"&gt;Knative&lt;/a&gt;. It's not just for serverless – it's a fantastic deployment tool for Kubernetes too. &lt;/p&gt;

&lt;p&gt;The project emphasizes its serverless nature, but it is a great general-purpose deployment tool as well. In my opinion, it simplifies deployment because, for example, an HTTP service comes down to one file that describes the whole service you want to provide to your applications or users. &lt;/p&gt;

&lt;p&gt;I could provide a big overview of how Knative works, but in this little tutorial I want to show you the basic installation and configuration and how to deploy your first &lt;a href="https://knative.dev/docs/"&gt;Knative&lt;/a&gt; service. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installation&lt;/li&gt;
&lt;li&gt;Configuration&lt;/li&gt;
&lt;li&gt;Creating a Service&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  Knative Installation
&lt;/h3&gt;

&lt;p&gt;Starting with Knative can be a bit daunting, especially when it comes to choosing the right installation method. There are two primary ways to install Knative: YAML-based installation and the Knative operator.&lt;/p&gt;

&lt;p&gt;The YAML-based installation is straightforward, but it's somewhat limited in flexibility. If you need to modify configurations during installation, this method falls short. That's where the Knative operator comes in handy. The operator installs the serving component and, if required, the eventing component as well, and it offers more flexibility and customization options.&lt;/p&gt;

&lt;p&gt;To get started with the Knative operator, you can use the following command. Do note that it can only be installed in the &lt;code&gt;default&lt;/code&gt; namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://github.com/knative/operator/releases/download/knative-v1.13.3/operator.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you've executed the command, the operator should be up and running with two pods in the &lt;code&gt;default&lt;/code&gt; namespace.&lt;/p&gt;

&lt;p&gt;To verify that the operator is running, use &lt;code&gt;kubectl get pods&lt;/code&gt;, which should list both pods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;knative-operator-6d768fb7-xnjgs      
operator-webhook-7d6b54d78b-q66fh              
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Knative requires a network layer, and you have three options: Istio, Kourier, and Contour. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Istio:&lt;/strong&gt; Istio is a powerful service mesh that provides advanced networking, security, and observability features. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kourier:&lt;/strong&gt; Kourier is purpose-built for Knative, providing a lightweight and efficient network layer specifically designed for serverless workloads. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contour:&lt;/strong&gt; Contour is a Kubernetes ingress controller that can also be used as the network layer for Knative. It provides basic routing and load balancing capabilities.&lt;/p&gt;

&lt;p&gt;When deciding which option to choose, consider your specific environment, requirements, and preferences. At &lt;a href="https://github.com/direktiv/direktiv"&gt;Direktiv&lt;/a&gt;, we typically opt for Contour due to its simplicity. However, your choice may vary depending on your use case and infrastructure setup. &lt;/p&gt;

&lt;p&gt;In this tutorial we will use Contour as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;--filename&lt;/span&gt; https://github.com/knative/net-contour/releases/download/knative-v1.13.0/contour.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Contour installs an internal and external service in two namespaces. If external access to your Knative services isn't needed, you can optimize your setup by deleting the &lt;code&gt;contour-external&lt;/code&gt; namespace. This eliminates the allocation of an unnecessary external IP within the cluster. Simply run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl delete ns contour-external
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  Knative Configuration
&lt;/h3&gt;

&lt;p&gt;Installing Knative with the operator and a single YAML file is convenient, but configuring it can be challenging, and I find the documentation a bit thin. So I will explain the basics here, although there is a lot more to it.&lt;/p&gt;

&lt;p&gt;The basic YAML would look like the following snippet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;operator.knative.dev/v1beta1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;KnativeServing&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;knative-serving&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Usually you want to configure the network layer, features and other settings in Knative. You can modify this file to change the settings during installation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;operator.knative.dev/v1beta1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;KnativeServing&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;direktiv-knative&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ingress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;contour&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;deployments&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;activator&lt;/span&gt;
    &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;linkerd.io/inject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;enabled&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;features&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;multi-container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enabled"&lt;/span&gt;  
      &lt;span class="na"&gt;kubernetes.podspec-volumes-emptydir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enabled"&lt;/span&gt;
      &lt;span class="na"&gt;kubernetes.podspec-init-containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enabled"&lt;/span&gt;
    &lt;span class="na"&gt;autoscaler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;initial-scale&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0"&lt;/span&gt;
      &lt;span class="na"&gt;allow-zero-initial-scale&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
      &lt;span class="na"&gt;min-scale&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0"&lt;/span&gt;
    &lt;span class="na"&gt;deployment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;registries-skipping-tag-resolving&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kind.local,ko.local,dev.local,localhost:5000,localhost:31212"&lt;/span&gt;
    &lt;span class="na"&gt;network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;ingress-class&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contour.ingress.networking.knative.dev"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This YAML is a very simple installation file for Knative. Individual components can be addressed under &lt;code&gt;deployments&lt;/code&gt;. These components can be &lt;code&gt;activator&lt;/code&gt;, &lt;code&gt;autoscaler&lt;/code&gt;, &lt;code&gt;controller&lt;/code&gt;, &lt;code&gt;webhook&lt;/code&gt; or &lt;code&gt;autoscaler-hpa&lt;/code&gt;. In this YAML we are setting an annotation for the &lt;code&gt;activator&lt;/code&gt; pod.&lt;/p&gt;

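&lt;p&gt;The &lt;code&gt;config&lt;/code&gt; section holds the settings for Knative's ConfigMaps in Kubernetes. Each key under &lt;code&gt;config&lt;/code&gt; is the ConfigMap name without the &lt;code&gt;config-&lt;/code&gt; prefix, so &lt;code&gt;features&lt;/code&gt; in the YAML above ends up in &lt;code&gt;config-features&lt;/code&gt;. The available ConfigMaps are:&lt;/p&gt;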

&lt;ul&gt;
&lt;li&gt;config-autoscaler
&lt;/li&gt;
&lt;li&gt;config-defaults
&lt;/li&gt;
&lt;li&gt;config-deployment &lt;/li&gt;
&lt;li&gt;config-domain
&lt;/li&gt;
&lt;li&gt;config-features
&lt;/li&gt;
&lt;li&gt;config-gc
&lt;/li&gt;
&lt;li&gt;config-leader-election
&lt;/li&gt;
&lt;li&gt;config-logging
&lt;/li&gt;
&lt;li&gt;config-network
&lt;/li&gt;
&lt;li&gt;config-observability
&lt;/li&gt;
&lt;li&gt;config-tracing
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can look up all the available settings in these ConfigMaps after installation and tweak your Knative installation, for example by modifying timeouts and maximum connections.&lt;/p&gt;
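
&lt;p&gt;As a rough sketch of such a tweak, the fragment below sets two values in &lt;code&gt;config-defaults&lt;/code&gt; through the operator. It assumes that "maximum connections" refers to the number of concurrent requests per container (&lt;code&gt;container-concurrency&lt;/code&gt;), and the numbers are purely illustrative, not recommendations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: direktiv-knative
spec:
  config:
    # Everything under "defaults" is written to the config-defaults ConfigMap.
    defaults:
      # Illustrative values only; check the ConfigMap for the current defaults.
      revision-timeout-seconds: "120"
      container-concurrency: "50"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;All values are strings because they end up as ConfigMap data, which is why the numbers are quoted.&lt;/p&gt;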

&lt;p&gt;The most important setting in this case is &lt;code&gt;ingress-class: "contour.ingress.networking.knative.dev"&lt;/code&gt; under &lt;code&gt;network&lt;/code&gt;. It has to be configured because we are using Contour as the network layer in this tutorial.&lt;/p&gt;

&lt;p&gt;We recently had one installation where a proxy server was required. Because it took me a while to figure out how to set environment variables for the pods, I'd like to share a snippet that shows how to do it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;...&lt;/span&gt;
  &lt;span class="na"&gt;deployments&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;controller&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;controllerserving-certs-ctrl-ca&lt;/span&gt;
      &lt;span class="na"&gt;envVars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTP_PROXY&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://myproxy:3128"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTPS_PROXY&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://myproxy:3128"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NO_PROXY&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.svc,.default,.local,.cluster.local,localhost"&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After applying the YAML with &lt;code&gt;kubectl apply -f knative.yaml&lt;/code&gt;, the list of pods in the &lt;code&gt;default&lt;/code&gt; namespace will look like this, and we are ready to install the first service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;knative-operator-6d768fb7-jthff                          1/1     Running   
operator-webhook-7d6b54d78b-75v46                        1/1     Running  
autoscaler-79d9fb98c-5mtnd                               1/1     Running  
controller-cdf856494-lv9qk                               1/1     Running  
webhook-dddf6fcff-jvdjc                                  1/1     Running    
autoscaler-hpa-7969f4f665-kdhv7                          1/1     Running     
activator-74cc7497c9-vqch9                               1/1     Running    
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  Creating a Service
&lt;/h3&gt;

&lt;p&gt;After setting up Knative it is time to run the first service. Applying the following file will create a Knative service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;serving.knative.dev/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;helloworld-go&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;direktiv/simple-hello&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TARGET&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Go&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Sample&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;v1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To check whether the service is up and running, execute &lt;code&gt;kubectl get ksvc&lt;/code&gt;. The service will show up in the list of available Knative services together with its status.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAMESPACE   NAME            URL                                              LATESTCREATED         LATESTREADY           READY   REASON
default     helloworld-go   http://helloworld-go.default.svc.cluster.local   helloworld-go-00001   helloworld-go-00001   True 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, Knative would start a pod instance right away, but because we have configured &lt;code&gt;allow-zero-initial-scale&lt;/code&gt; and &lt;code&gt;initial-scale&lt;/code&gt;, the service is only prepared for consumption and not started. A simple &lt;code&gt;curl&lt;/code&gt; will "activate" the pod though.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;--restart&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Never &lt;span class="nt"&gt;--image&lt;/span&gt; curlimages/curl curl-test &lt;span class="nt"&gt;--&lt;/span&gt; curl http://helloworld-go.default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Maybe you have noticed the delay when calling the service. This happens when the service does a "cold start" with zero pods available. &lt;/p&gt;

&lt;p&gt;At the beginning we said that Knative is a great deployment tool even without the serverless component. We can configure the service to keep at least X pods available at all times to avoid those cold starts. With that approach, Knative can be used as a simplified Kubernetes deployment tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;serving.knative.dev/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;helloworld-go&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;autoscaling.knative.dev/min-scale&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;direktiv/simple-hello&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TARGET&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Go&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Sample&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;v1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The annotation &lt;code&gt;autoscaling.knative.dev/min-scale&lt;/code&gt; sets the minimum number of pods to 1, meaning there is always at least one pod running at any given time.&lt;/p&gt;

&lt;p&gt;I hope this quick introduction to Knative helps you get started. There is so much more to explore with Knative, e.g. traffic management and versioning, but I will write about that in a different post. &lt;/p&gt;

&lt;p&gt;If you have any questions, just leave a comment!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>serverless</category>
      <category>tutorial</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
