<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Foram Jaguwala</title>
    <description>The latest articles on Forem by Foram Jaguwala (@foram_jaguwala_46b596a8f6).</description>
    <link>https://forem.com/foram_jaguwala_46b596a8f6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1763947%2Ffe23ae62-51b0-4786-a9c1-482f886c78fe.jpg</url>
      <title>Forem: Foram Jaguwala</title>
      <link>https://forem.com/foram_jaguwala_46b596a8f6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/foram_jaguwala_46b596a8f6"/>
    <language>en</language>
    <item>
      <title>🚀 Fixing Ollama Not Using GPU with Docker Desktop (Step-by-Step + Troubleshooting)</title>
      <dc:creator>Foram Jaguwala</dc:creator>
      <pubDate>Sun, 29 Mar 2026 18:52:56 +0000</pubDate>
      <link>https://forem.com/foram_jaguwala_46b596a8f6/fixing-ollama-not-using-gpu-with-docker-desktop-step-by-step-troubleshooting-42b8</link>
      <guid>https://forem.com/foram_jaguwala_46b596a8f6/fixing-ollama-not-using-gpu-with-docker-desktop-step-by-step-troubleshooting-42b8</guid>
      <description>&lt;p&gt;Running LLMs locally with Ollama is exciting… until you realize everything is running on CPU 😅&lt;/p&gt;

&lt;p&gt;I recently ran into this exact issue — models were working, but GPU wasn’t being used at all.&lt;/p&gt;

&lt;p&gt;Here’s how I fixed it using &lt;strong&gt;Docker Desktop with GPU support&lt;/strong&gt;, along with the debugging steps that helped me understand the real problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔴 The Problem
&lt;/h2&gt;

&lt;p&gt;My initial setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ollama installed locally ✅&lt;/li&gt;
&lt;li&gt;Models running successfully ✅&lt;/li&gt;
&lt;li&gt;GPU usage ❌&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Result:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Slow responses&lt;/li&gt;
&lt;li&gt;High CPU usage&lt;/li&gt;
&lt;li&gt;Poor performance&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 Root Cause
&lt;/h2&gt;

&lt;p&gt;After debugging, I realized:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The issue wasn’t entirely Ollama — it was how my local environment handled GPU access.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even though the GPU was available, it wasn’t properly exposed to Ollama in my local setup.&lt;/p&gt;

&lt;p&gt;However, the same GPU worked perfectly inside Docker, which confirmed that the environment, not the hardware, was the real problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  🟢 The Solution: Docker Desktop + GPU
&lt;/h2&gt;

&lt;p&gt;Instead of continuing to debug locally, I moved Ollama into a Docker container with GPU enabled.&lt;/p&gt;

&lt;p&gt;This approach turned out to be much simpler and more reliable.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚙️ Prerequisites
&lt;/h2&gt;

&lt;p&gt;Make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Docker Desktop installed&lt;/li&gt;
&lt;li&gt;NVIDIA GPU (RTX / GTX)&lt;/li&gt;
&lt;li&gt;Latest NVIDIA drivers&lt;/li&gt;
&lt;li&gt;WSL2 enabled (for Windows users; a quick check follows this list)&lt;/li&gt;
&lt;/ul&gt;
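
&lt;p&gt;If you're on Windows, here's a quick sanity check (a sketch, assuming WSL2 and the NVIDIA driver are already installed) before touching Docker at all:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Confirm WSL is installed and defaulting to version 2
wsl --status

# Confirm the NVIDIA driver is visible from inside WSL2
wsl nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If &lt;code&gt;wsl nvidia-smi&lt;/code&gt; prints your GPU details, the driver side is ready and anything left to fix is on the Docker side.&lt;/p&gt;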




&lt;h2&gt;
  
  
  ✅ Step 1: Verify GPU Access in Docker (Critical Step)
&lt;/h2&gt;

&lt;p&gt;Before running Ollama, verify that Docker can access your GPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;--gpus&lt;/span&gt; all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;👉 If this command shows GPU details, your setup is correctly configured.&lt;/p&gt;




&lt;h2&gt;
  
  
  🐳 Step 2: Run Ollama with GPU
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpus&lt;/span&gt; all &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ollama:/root/.ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 11434:11434 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  ollama/ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
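
&lt;p&gt;Once the container is up, it's worth peeking at the startup logs. If the GPU was detected, Ollama typically mentions CUDA and the GPU name there (the exact wording varies by version):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Check whether the Ollama container picked up the GPU at startup
docker logs ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;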






&lt;h2&gt;
  
  
  ⚡ Step 3: Run a Model
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Inside Container
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; ollama ollama run llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
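
&lt;p&gt;The first run downloads the model into the &lt;code&gt;ollama&lt;/code&gt; volume. To see which models are already pulled inside the container:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# List models already downloaded inside the container
docker exec -it ollama ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;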



&lt;h3&gt;
  
  
  Option 2: Using API (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--location&lt;/span&gt; &lt;span class="s1"&gt;'http://localhost:11434/api/generate'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: text/plain'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "llama3.2:1b",
  "prompt": "Hello"
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
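
&lt;p&gt;One gotcha: &lt;code&gt;/api/generate&lt;/code&gt; expects the model to already be pulled. If it isn't, you can pull it through the API first (a quick sketch; &lt;code&gt;docker exec -it ollama ollama pull llama3.2:1b&lt;/code&gt; works too):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Pull the model through the Ollama API before generating
curl --location 'http://localhost:11434/api/pull' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "llama3.2:1b"
}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;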






&lt;h2&gt;
  
  
  🔍 Step 4: Confirm GPU Usage
&lt;/h2&gt;

&lt;p&gt;Open another terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;👉 You should see GPU memory usage increasing while the model is running.&lt;/p&gt;
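
&lt;p&gt;On recent Ollama versions you can also ask Ollama itself: &lt;code&gt;ollama ps&lt;/code&gt; shows a PROCESSOR column that should read something like "100% GPU" for a loaded model:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Show loaded models and whether they are running on GPU or CPU
docker exec -it ollama ollama ps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;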




&lt;h2&gt;
  
  
  ⚡ Results
&lt;/h2&gt;

&lt;p&gt;After switching to Docker:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚀 Faster inference&lt;/li&gt;
&lt;li&gt;🔥 GPU utilization working&lt;/li&gt;
&lt;li&gt;🧠 Smooth local LLM experience&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🛠️ Troubleshooting Guide
&lt;/h1&gt;

&lt;p&gt;Here are some real issues I encountered:&lt;/p&gt;




&lt;h2&gt;
  
  
  ❌ GPU Not Working in Docker
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ Checklist:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;nvidia-smi&lt;/code&gt; works on host&lt;/li&gt;
&lt;li&gt;Docker Desktop is updated&lt;/li&gt;
&lt;li&gt;WSL2 is enabled&lt;/li&gt;
&lt;li&gt;Container is started with &lt;code&gt;--gpus all&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
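
&lt;p&gt;A quick way to walk through that checklist from a terminal (a sketch, assuming an NVIDIA GPU and Docker Desktop on WSL2):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# 1. Driver works on the host
nvidia-smi

# 2. Docker is installed and current
docker --version

# 3. WSL2 status (Windows only)
wsl --status

# 4. The GPU is actually passed into a container
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;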




&lt;h2&gt;
  
  
  ❌ GPU Works in Docker but Not in Ollama
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ Fix:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Restart container&lt;/li&gt;
&lt;li&gt;Re-run with GPU flag&lt;/li&gt;
&lt;li&gt;Try a smaller model that fits in your GPU memory (e.g., mistral)&lt;/li&gt;
&lt;/ul&gt;
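
&lt;p&gt;In practice, "restart and re-run with the GPU flag" looks like this (recreating the container is harmless, since the models live in the &lt;code&gt;ollama&lt;/code&gt; volume):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Remove the old container (downloaded models are kept in the named volume)
docker rm -f ollama

# Recreate it with the GPU flag
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;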




&lt;h2&gt;
  
  
  ❌ &lt;code&gt;nvidia-smi&lt;/code&gt; Not Found
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cause:
&lt;/h3&gt;

&lt;p&gt;NVIDIA drivers are not installed on the host.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fix:
&lt;/h3&gt;

&lt;p&gt;Install the latest NVIDIA drivers and reboot the system.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GPU issues are often &lt;strong&gt;environment-related&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Always verify GPU using a CUDA container first&lt;/li&gt;
&lt;li&gt;Docker Desktop simplifies GPU access significantly&lt;/li&gt;
&lt;li&gt;Running LLMs with GPU drastically improves performance&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📌 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;If your Ollama setup is stuck on CPU:&lt;/p&gt;

&lt;p&gt;👉 Don’t spend too much time debugging locally&lt;br&gt;
👉 Try Docker with GPU support&lt;/p&gt;

&lt;p&gt;It’s simple, reliable, and works consistently.&lt;/p&gt;




&lt;h2&gt;
  
  
  🙌 Need Help?
&lt;/h2&gt;

&lt;p&gt;If you get stuck at any step, feel free to reach out — happy to help!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>docker</category>
      <category>ollama</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
