<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sergio Andres Usma</title>
    <description>The latest articles on Forem by Sergio Andres Usma (@vonusma).</description>
    <link>https://forem.com/vonusma</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3862386%2F22c543a8-ac4b-4ef3-951b-27469c282aa3.png</url>
      <title>Forem: Sergio Andres Usma</title>
      <link>https://forem.com/vonusma</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vonusma"/>
    <language>en</language>
    <item>
      <title>Jetson Containers Quickstart on NVIDIA Jetson AGX Orin 64GB</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 23:36:25 +0000</pubDate>
      <link>https://forem.com/vonusma/jetson-containers-quickstart-on-nvidia-jetson-agx-orin-64gb-2ed9</link>
      <guid>https://forem.com/vonusma/jetson-containers-quickstart-on-nvidia-jetson-agx-orin-64gb-2ed9</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This document describes how to run NVIDIA Jetson‑optimized AI containers from the &lt;code&gt;dustynv/jetson-containers&lt;/code&gt; project on an NVIDIA Jetson AGX Orin 64GB Developer Kit with Ubuntu 22.04.5 LTS and JetPack 6.2.2 (L4T 36.5.0), focusing on LLMs, speech, vision, and development tools. It consolidates the original Jetson Containers Quickstart PDF into an operational tutorial with copy‑paste &lt;code&gt;docker run&lt;/code&gt; commands and n8n integration pointers tailored to a system where n8n itself runs in Docker on port 5678. The tutorial targets engineers who want to run multiple local AI services on the same Jetson and orchestrate them via OpenAI‑compatible APIs without relying on external cloud providers.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Target Hardware and Software Environment
&lt;/h2&gt;

&lt;p&gt;Your system matches the reference environment of the Jetson Containers Quickstart Guide: Jetson AGX Orin 64GB, Ubuntu 22.04.5 aarch64, JetPack 6.2.2 (L4T 36.5.0), CUDA 12.6, cuDNN 9.3.0, and TensorRT 10.3.0.30. This platform has 64 GB unified memory and is validated to run all 51 containers in the guide, including 70B‑parameter LLMs in GPU‑accelerated runtimes.&lt;/p&gt;

&lt;p&gt;Before launching AI containers, ensure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Docker is installed and configured with the NVIDIA runtime (JetPack 6.x already provides &lt;code&gt;nvidia-container-runtime&lt;/code&gt;, so you mainly need to add Docker itself).&lt;/li&gt;
&lt;li&gt;GPU works inside Docker:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/cuda:12.8-samples-r36.4.0-cu128-24.04 &lt;span class="se"&gt;\&lt;/span&gt;
  /usr/local/cuda/extras/demo_suite/deviceQuery
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;(Optional) Create directories for persistent data:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/.ollama &lt;span class="se"&gt;\&lt;/span&gt;
         ~/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
         ~/sd-models &lt;span class="se"&gt;\&lt;/span&gt;
         ~/comfyui-models &lt;span class="se"&gt;\&lt;/span&gt;
         ~/comfyui-output &lt;span class="se"&gt;\&lt;/span&gt;
         ~/ml-workspace &lt;span class="se"&gt;\&lt;/span&gt;
         ~/notebooks &lt;span class="se"&gt;\&lt;/span&gt;
         ~/aim-data &lt;span class="se"&gt;\&lt;/span&gt;
         ~/ha-config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use these directories as bind‑mounts so models and configuration survive container recreation.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. LLM Inference Engines (OpenAI-Compatible)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Ollama — General Purpose LLM Runtime
&lt;/h3&gt;

&lt;p&gt;Ollama is a user‑friendly way to run LLaMA, Mistral, Qwen, Gemma, Phi, and DeepSeek models with an OpenAI‑compatible REST API on your Jetson. The guide notes that AGX Orin 64GB can run 70B models comfortably using this runtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start Ollama:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.ollama:/root/.ollama &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/ollama:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pull a model and chat:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull a model&lt;/span&gt;
curl http://localhost:11434/api/pull &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"name": "llama3.2:3b"}'&lt;/span&gt;

&lt;span class="c"&gt;# Chat completion (OpenAI-compatible)&lt;/span&gt;
curl http://localhost:11434/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "llama3.2:3b",
    "messages": [{"role":"user","content":"Hello from n8n!"}]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;n8n configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Credential: OpenAI API credential.&lt;/li&gt;
&lt;li&gt;API Key: any string (e.g. &lt;code&gt;ollama&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Base URL: &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:11434/v1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Model: &lt;code&gt;llama3.2:3b&lt;/code&gt; or any model pulled into Ollama.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2.2 llama.cpp — GGUF, Quantized LLM Server
&lt;/h3&gt;

&lt;p&gt;llama.cpp excels at running quantized GGUF models with low latency and memory usage. The quickstart provides an OpenAI‑compatible server configuration suitable for AGX Orin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start llama.cpp server:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; llama-server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /models:/models &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/llama_cpp:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  llama-server &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model&lt;/span&gt; /models/llama-3.1-8b-q4.gguf &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 8080 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--n-gpu-layers&lt;/span&gt; 999 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ctx-size&lt;/span&gt; 8192
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The server exposes OpenAI‑style endpoints on &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8080/v1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;n8n configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node: OpenAI Chat Model.&lt;/li&gt;
&lt;li&gt;Base URL: &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8080/v1&lt;/code&gt;.&lt;/li&gt;
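&lt;li&gt;Model: &lt;code&gt;llama-server&lt;/code&gt; serves the single GGUF file passed with &lt;code&gt;--model&lt;/code&gt;, so enter the model ID the server itself reports at &lt;code&gt;/v1/models&lt;/code&gt;.&lt;/li&gt;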
&lt;/ul&gt;
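
&lt;p&gt;To find the model ID to enter in n8n, you can query the server directly. This is a minimal sketch, assuming the &lt;code&gt;llama-server&lt;/code&gt; container above is running on the same machine; the &lt;code&gt;model&lt;/code&gt; value in the second request is a placeholder to replace with whatever ID the first request returns:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# List the model IDs the server reports
curl http://localhost:8080/v1/models

# Chat completion (OpenAI-compatible); "model" is a placeholder ID
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b-q4",
    "messages": [{"role":"user","content":"Hello from n8n!"}]
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;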




&lt;h3&gt;
  
  
  2.3 vLLM — High Throughput LLM Serving
&lt;/h3&gt;

&lt;p&gt;vLLM uses PagedAttention to reach significantly higher throughput than naive Hugging Face inference, which is useful for multi‑user services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start vLLM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; vllm &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;8g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/vllm:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  python3 &lt;span class="nt"&gt;-m&lt;/span&gt; vllm.entrypoints.openai.api_server &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model&lt;/span&gt; meta-llama/Llama-3.2-3B-Instruct &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Exposes OpenAI‑compatible endpoints at &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;n8n configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node: OpenAI Chat Model.&lt;/li&gt;
&lt;li&gt;Base URL: &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Enable streaming mode if you want streamed responses.&lt;/li&gt;
&lt;/ul&gt;
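
&lt;p&gt;Before wiring vLLM into n8n, a quick streaming request from the Jetson itself can confirm the server is up. This is a sketch, assuming the container above has finished loading the model; the &lt;code&gt;model&lt;/code&gt; field repeats the value passed to &lt;code&gt;--model&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# -N turns off curl output buffering so chunks print as they arrive
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.2-3B-Instruct",
    "messages": [{"role":"user","content":"Hello from n8n!"}],
    "stream": true
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;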




&lt;h3&gt;
  
  
  2.4 SGLang — Structured Output and JSON
&lt;/h3&gt;

&lt;p&gt;SGLang is designed for structured outputs and JSON‑constrained decoding using RadixAttention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start SGLang:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; sglang &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;8g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/sglang:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  python3 &lt;span class="nt"&gt;-m&lt;/span&gt; sglang.launch_server &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model-path&lt;/span&gt; meta-llama/Llama-3.2-3B-Instruct &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 30000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;n8n usage pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use an HTTP Request node pointing to &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:30000/v1/chat/completions&lt;/code&gt; and include &lt;code&gt;response_format: {"type":"json_object"}&lt;/code&gt; in the body when you need strict JSON.&lt;/li&gt;
&lt;/ul&gt;
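
&lt;p&gt;The same request can be tried from the command line first. This is a sketch rather than a verified recipe: it assumes the SGLang container above is running, and that the server accepts the model path as the &lt;code&gt;model&lt;/code&gt; value. The prompts are illustrative only:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Ask for a JSON object via response_format
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.2-3B-Instruct",
    "messages": [
      {"role":"system","content":"Reply with a JSON object only."},
      {"role":"user","content":"Give the name and population of one city."}
    ],
    "response_format": {"type":"json_object"}
  }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;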




&lt;h3&gt;
  
  
  2.5 MLC and nanoLLM — Orin‑Optimized and Multimodal
&lt;/h3&gt;

&lt;p&gt;MLC LLM compiles models targeting Jetson’s GPU architecture for fast token generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start MLC LLM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; mlc &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/mlc:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;According to the quickstart, MLC frequently achieves the fastest token rates on AGX Orin among the tested engines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;nanoLLM provides higher‑level multimodal pipelines with vision‑language and voice capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start nanoLLM with VILA:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; nano-llm &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;8g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/nano_llm:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  python3 &lt;span class="nt"&gt;-m&lt;/span&gt; nano_llm.serve &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model&lt;/span&gt; Efficient-Large-Model/VILA1.5-3b &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 9000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Multimodal example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:9000/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "VILA1.5-3b",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "http://example.com/img.jpg"}},
        {"type": "text", "text": "What is in this image?"}
      ]
    }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;n8n:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node: OpenAI Chat Model.&lt;/li&gt;
&lt;li&gt;Base URL: &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:9000/v1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Use messages with &lt;code&gt;image_url&lt;/code&gt; and text parts when building prompts.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. Speech and Audio Containers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 faster-whisper — STT Server
&lt;/h3&gt;

&lt;p&gt;faster‑whisper is a fast speech‑to‑text server offering OpenAI‑compatible endpoints on Jetson.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start faster‑whisper:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; faster-whisper &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/faster-whisper:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  python3 &lt;span class="nt"&gt;-m&lt;/span&gt; faster_whisper.server &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Exposes &lt;code&gt;/v1/audio/transcriptions&lt;/code&gt;, which n8n can call through an HTTP Request node.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;n8n pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP Request, method POST, URL &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1/audio/transcriptions&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Body: form‑data with &lt;code&gt;file&lt;/code&gt; (binary audio) and &lt;code&gt;model&lt;/code&gt; (e.g. &lt;code&gt;"whisper-1"&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
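
&lt;p&gt;The equivalent multipart request from the command line looks roughly like this. It is a sketch, assuming the server above is running on port 8000; &lt;code&gt;sample.wav&lt;/code&gt; is a hypothetical audio file in the current directory:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Each -F adds one form-data field; the @ prefix uploads the file contents
curl http://localhost:8000/v1/audio/transcriptions \
  -F "file=@sample.wav" \
  -F "model=whisper-1"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;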




&lt;h3&gt;
  
  
  3.2 kokoro-tts — Lightweight Local TTS
&lt;/h3&gt;

&lt;p&gt;kokoro‑tts offers an OpenAI‑compatible &lt;code&gt;/v1/audio/speech&lt;/code&gt; endpoint with multiple voices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start kokoro‑tts:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; kokoro-tts &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/kokoro-tts:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Generate MP3:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8880/v1/audio/speech &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "kokoro",
    "input": "Hello from your Jetson!",
    "voice": "af_bella",
    "response_format": "mp3"
  }'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; speech.mp3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;n8n:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP Request, Response Format = File, then return or store the binary audio.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3.3 speaches — Unified Speech In/Out
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;speaches&lt;/code&gt; exposes both STT and TTS endpoints compatible with OpenAI’s audio APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start speaches:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; speaches &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/speaches:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Per the API quick reference, speaches listens on port 8000 and its endpoints are OpenAI‑compatible.&lt;/li&gt;
&lt;/ul&gt;

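&lt;p&gt;A complete on‑device voice pipeline can be built as: Webhook (audio) → faster‑whisper STT → LLM (Ollama or vLLM) → kokoro‑tts or speaches TTS → Webhook response. Because every container here uses &lt;code&gt;--network host&lt;/code&gt;, two services cannot bind the same port at once: faster‑whisper, vLLM, and speaches are all set to port 8000 above, so change the port of any you run together.&lt;/p&gt;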




&lt;h2&gt;
  
  
  4. Vision, Diffusion, and VLM Containers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 Stable Diffusion WebUI — Text‑to‑Image UI + API
&lt;/h3&gt;

&lt;p&gt;The Stable Diffusion WebUI container gives you a full browser interface and REST API for image generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start Stable Diffusion WebUI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; sd-webui &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/sd-models:/workspace/stable-diffusion-webui/models &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/stable-diffusion-webui:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  python3 launch.py &lt;span class="nt"&gt;--api&lt;/span&gt; &lt;span class="nt"&gt;--listen&lt;/span&gt; &lt;span class="nt"&gt;--port&lt;/span&gt; 7860
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Web UI: &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:7860&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;API txt2img example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:7860/sdapi/v1/txt2img &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "prompt": "mountain landscape",
    "steps": 20,
    "width": 512,
    "height": 512
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;n8n:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP Request → parse JSON → Move Binary Data to convert base64 &lt;code&gt;images[0]&lt;/code&gt; to binary → send to Telegram, save file, etc.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4.2 ComfyUI — Graph‑Based Diffusion Workflows
&lt;/h3&gt;

&lt;p&gt;ComfyUI is a node‑based interface with an HTTP API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start ComfyUI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; comfyui &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/comfyui-models:/root/ComfyUI/models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/comfyui-output:/root/ComfyUI/output &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/comfyui:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;API flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;POST &lt;code&gt;/prompt&lt;/code&gt; → get &lt;code&gt;prompt_id&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;GET &lt;code&gt;/history/{prompt_id}&lt;/code&gt; repeatedly until outputs appear.&lt;/li&gt;
&lt;li&gt;GET &lt;code&gt;/view?filename={filename}&amp;amp;type=output&lt;/code&gt; to download the image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use a sequence of HTTP Request nodes in n8n to implement the polling and retrieval.&lt;/p&gt;
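
&lt;p&gt;The three steps can be rehearsed in a shell before building the n8n flow. This is a sketch under several assumptions not stated in the quickstart: ComfyUI listens on its default port 8188, &lt;code&gt;workflow_api.json&lt;/code&gt; is a hypothetical file holding &lt;code&gt;{"prompt": {...}}&lt;/code&gt; exported in API format, the history endpoint returns an empty object until the job finishes, and the output filename shown is a placeholder to replace with the one reported in the history JSON:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Assumption: ComfyUI listens on its default port 8188
COMFY=http://localhost:8188

# 1. Queue the workflow and extract prompt_id from the JSON reply
PROMPT_ID=$(curl -s "$COMFY/prompt" \
  -H "Content-Type: application/json" \
  -d @workflow_api.json \
  | python3 -c 'import sys, json; print(json.load(sys.stdin)["prompt_id"])')

# 2. Poll the history until it is no longer an empty object
until [ "$(curl -s "$COMFY/history/$PROMPT_ID")" != "{}" ]; do
  sleep 2
done

# 3. Download one output image (placeholder filename)
curl -s "$COMFY/view?filename=ComfyUI_00001_.png&amp;amp;type=output" \
  --output result.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;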




&lt;h3&gt;
  
  
  4.3 VILA and Related VLMs
&lt;/h3&gt;

&lt;p&gt;The VILA container provides an efficient vision‑language model with an OpenAI‑compatible API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start VILA:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; vila &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;8g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/vila:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;According to the quick reference, VILA uses port 8000 and integrates via the OpenAI Chat Model node.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In n8n, send messages that include an &lt;code&gt;image_url&lt;/code&gt; object and text, similar to the nanoLLM example.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Development, Experiment Tracking, and Smart Home
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 L4T-ML, PyTorch, and JupyterLab
&lt;/h3&gt;

&lt;p&gt;L4T‑ML is an all‑in‑one ML environment that bundles PyTorch, TensorFlow, scikit‑learn, and JupyterLab optimized for JetPack 6.x.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start L4T‑ML JupyterLab:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; l4t-ml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;8g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/ml-workspace:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/l4t-ml:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  jupyter lab &lt;span class="nt"&gt;--ip&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0 &lt;span class="nt"&gt;--allow-root&lt;/span&gt; &lt;span class="nt"&gt;--no-browser&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Access via &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8888&lt;/code&gt; in your browser.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Alternatively, the standalone &lt;code&gt;dustynv/jupyterlab:r36.4.0&lt;/code&gt; container provides just JupyterLab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; jupyterlab &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/notebooks:/notebooks &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/jupyterlab:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  jupyter lab &lt;span class="nt"&gt;--ip&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0 &lt;span class="nt"&gt;--allow-root&lt;/span&gt; &lt;span class="nt"&gt;--no-browser&lt;/span&gt; &lt;span class="nt"&gt;--NotebookApp&lt;/span&gt;.token&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PyTorch‑focused images (&lt;code&gt;dustynv/pytorch&lt;/code&gt;, &lt;code&gt;dustynv/l4t-pytorch&lt;/code&gt;) can be run via &lt;code&gt;jetson-containers run ...&lt;/code&gt; as described in the build docs and are fully compatible with JetPack 6.2.2.&lt;/p&gt;




&lt;h3&gt;
  
  
  5.2 AIM Experiment Tracker
&lt;/h3&gt;

&lt;p&gt;AIM is a lightweight REST‑accessible experiment tracker container.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start AIM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; aim &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/aim-data:/aim/data &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/aim:r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  aim up &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 43800
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Web UI and API at &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:43800&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;n8n can poll &lt;code&gt;api/runs&lt;/code&gt; and &lt;code&gt;api/metrics&lt;/code&gt; using HTTP Request nodes to monitor training.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5.3 Home Assistant Core on Jetson
&lt;/h3&gt;

&lt;p&gt;Home Assistant Core can run as a container for local smart‑home control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start Home Assistant:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; homeassistant &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/ha-config:/config &lt;span class="se"&gt;\&lt;/span&gt;
  dustynv/homeassistant-core:r36.4.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Access UI at &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8123&lt;/code&gt; and create a Long‑Lived Access Token under your profile.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;n8n integration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP Request node with URL like &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8123/api/states&lt;/code&gt; or &lt;code&gt;/api/services/...&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Authentication: Bearer token using the Long‑Lived Access Token.&lt;/li&gt;
&lt;li&gt;Build flows like "sensor state change → LLM decision → Home Assistant service call" as outlined in the quickstart.&lt;/li&gt;
&lt;/ul&gt;
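
&lt;p&gt;The same calls can be checked with curl before configuring n8n. This is a minimal sketch, assuming Home Assistant is running as above; &lt;code&gt;HA_TOKEN&lt;/code&gt; is a placeholder for your own token and &lt;code&gt;light.living_room&lt;/code&gt; is a hypothetical entity ID:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Placeholder: paste the Long-Lived Access Token created in the UI
HA_TOKEN="paste-your-token-here"

# Read all entity states
curl http://localhost:8123/api/states \
  -H "Authorization: Bearer $HA_TOKEN" \
  -H "Content-Type: application/json"

# Call a service; light.living_room is a hypothetical entity ID
curl -X POST http://localhost:8123/api/services/light/turn_on \
  -H "Authorization: Bearer $HA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"entity_id": "light.living_room"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;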




&lt;h2&gt;
  
  
  6. n8n Integration Patterns and Networking Notes
&lt;/h2&gt;

&lt;p&gt;The quickstart highlights that your n8n instance runs in Docker on port 5678 and must reach Jetson services via the Jetson's LAN IP, not &lt;code&gt;localhost&lt;/code&gt;, because container networking isolates &lt;code&gt;localhost&lt;/code&gt; inside the n8n container. For OpenAI‑compatible services, configure the OpenAI Chat Model node with the Base URL pointing to &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:&amp;lt;port&amp;gt;/v1&lt;/code&gt;, while for other services use HTTP Request nodes and explicit paths.&lt;/p&gt;
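
&lt;p&gt;One way to confirm reachability is to issue a request from inside the n8n container. This is a sketch with assumptions to adjust: the container is named &lt;code&gt;n8n&lt;/code&gt;, its image includes &lt;code&gt;wget&lt;/code&gt;, the first address printed by &lt;code&gt;hostname -I&lt;/code&gt; is the Jetson's LAN IP, and Ollama is the service being tested:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Assumption: the first address reported is the LAN IP
JETSON_IP=$(hostname -I | awk '{print $1}')

# Assumption: the container is named n8n and its image ships wget
docker exec n8n wget -qO- "http://$JETSON_IP:11434/v1/models"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A JSON model list in the output indicates that n8n can reach the service at that address.&lt;/p&gt;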

&lt;p&gt;&lt;strong&gt;OpenAI‑compatible containers and ports (from API quick reference):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Container&lt;/th&gt;
&lt;th&gt;Port&lt;/th&gt;
&lt;th&gt;Base URL example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ollama&lt;/td&gt;
&lt;td&gt;11434&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:11434/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama_cpp&lt;/td&gt;
&lt;td&gt;8080&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8080/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vLLM&lt;/td&gt;
&lt;td&gt;8000&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sglang&lt;/td&gt;
&lt;td&gt;30000&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:30000/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mlc&lt;/td&gt;
&lt;td&gt;8080&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8080/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nano_llm&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:9000/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;speaches&lt;/td&gt;
&lt;td&gt;8000&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;faster-whisper&lt;/td&gt;
&lt;td&gt;8000&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt; or audio paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kokoro-tts&lt;/td&gt;
&lt;td&gt;8880&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8880/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VILA&lt;/td&gt;
&lt;td&gt;8000&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:8000/v1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
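
&lt;p&gt;A quick way to check a Base URL before entering it in n8n is to query it with &lt;code&gt;curl&lt;/code&gt;. The sketch below assumes a hypothetical LAN address of &lt;code&gt;192.168.1.50&lt;/code&gt; and that the service implements the usual OpenAI‑style &lt;code&gt;/v1/models&lt;/code&gt; and &lt;code&gt;/v1/chat/completions&lt;/code&gt; routes; the function names are illustrative, and the model name must be one the service has actually loaded.&lt;/p&gt;

```shell
JETSON_IP="${JETSON_IP:-192.168.1.50}"   # hypothetical LAN address

# Base URL for an OpenAI-compatible service on the given port.
base_url() {
  printf 'http://%s:%s/v1\n' "$JETSON_IP" "$1"
}

# JSON body for a single-turn chat request.
# Plain text only: quotes inside the prompt are not escaped.
chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}\n' "$1" "$2"
}

# List the models a service exposes: list_models PORT
list_models() {
  curl -sS "$(base_url "$1")/models"
}

# Send one chat message: chat PORT MODEL PROMPT
chat() {
  curl -sS "$(base_url "$1")/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_body "$2" "$3")"
}
```

&lt;p&gt;For example, &lt;code&gt;list_models 11434&lt;/code&gt; queries the Ollama port from the table; the string printed by &lt;code&gt;base_url 11434&lt;/code&gt; is what goes into the Base URL field of the OpenAI Chat Model node.&lt;/p&gt;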

&lt;p&gt;&lt;strong&gt;Example: full on‑device voice assistant pipeline in n8n:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (POST /voice-input) → receives audio
  ↓
HTTP Request → POST /v1/audio/transcriptions (faster-whisper or speaches)
  Body: form-data (file: binary audio, model: "whisper-1")
  ↓
OpenAI Chat Model → local LLM (Base URL = Ollama or vLLM)
  ↓
HTTP Request → POST /v1/audio/speech (kokoro-tts or speaches)
  Body: {"model":"kokoro","input":"{{$json.text}}","voice":"af_bella"}
  ↓
Webhook Response → returns audio binary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern uses only local containers on Jetson and keeps all data on‑device.&lt;/p&gt;
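
&lt;p&gt;The two audio stages can be exercised from a terminal before building the workflow. This is a sketch under stated assumptions: the ports come from the table above, the address &lt;code&gt;192.168.1.50&lt;/code&gt; and the function names are hypothetical, and the audio format written to the output file depends on the TTS server's defaults.&lt;/p&gt;

```shell
JETSON_IP="${JETSON_IP:-192.168.1.50}"   # hypothetical LAN address
STT_PORT="${STT_PORT:-8000}"             # faster-whisper or speaches
TTS_PORT="${TTS_PORT:-8880}"             # kokoro-tts

# Stage 1, speech to text: transcribe AUDIO_FILE
transcribe() {
  curl -sS "http://${JETSON_IP}:${STT_PORT}/v1/audio/transcriptions" \
    -F "file=@$1" \
    -F "model=whisper-1"
}

# JSON body for the speech request (same fields as the n8n node above).
# Plain text only: quotes inside the input are not escaped.
speech_body() {
  printf '{"model":"kokoro","input":"%s","voice":"af_bella"}\n' "$1"
}

# Stage 3, text to speech: speak TEXT OUTPUT_FILE
speak() {
  curl -sS "http://${JETSON_IP}:${TTS_PORT}/v1/audio/speech" \
    -H "Content-Type: application/json" \
    -d "$(speech_body "$1")" \
    -o "$2"
}
```

&lt;p&gt;Running &lt;code&gt;transcribe input.wav&lt;/code&gt; and then &lt;code&gt;speak "Hello from the Jetson" reply.mp3&lt;/code&gt; checks each service in isolation, which makes failures in the n8n flow easier to localize.&lt;/p&gt;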




&lt;h2&gt;
  
  
  7. Practical Recommendations and Next Steps
&lt;/h2&gt;

&lt;p&gt;The quickstart confirms that all 51 &lt;code&gt;dustynv/jetson-containers&lt;/code&gt; images tagged &lt;code&gt;r36.4.0&lt;/code&gt; are compatible with JetPack 6.x and have been tested on Jetson AGX Orin 64GB with CUDA 12.6. For production use on your board, the guide suggests mounting persistent caches, using &lt;code&gt;--shm-size=8g&lt;/code&gt; for transformer‑based containers, benchmarking vLLM vs MLC vs llama.cpp on your target models, and eventually switching from &lt;code&gt;--network host&lt;/code&gt; to explicit port mappings on isolated Docker networks.&lt;/p&gt;
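
&lt;p&gt;The recommendations above can be combined into a single launch pattern. The sketch below is illustrative only: the network name &lt;code&gt;ai-net&lt;/code&gt;, the cache path &lt;code&gt;/data/models&lt;/code&gt; inside the container, and the use of &lt;code&gt;--runtime nvidia&lt;/code&gt; are assumptions, and the script only prints the commands unless &lt;code&gt;DRY_RUN=0&lt;/code&gt; is set.&lt;/p&gt;

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

# Print or execute a command depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_isolated_llm() {
  # User-defined bridge network instead of --network host
  # (creating it fails if ai-net already exists).
  run docker network create ai-net

  # Explicit port mapping, larger shared memory, persistent model cache.
  # The container-side cache path is an assumption; check the image docs.
  run docker run -d --name ollama \
    --runtime nvidia \
    --network ai-net \
    -p 11434:11434 \
    --shm-size=8g \
    -v "$HOME/models:/data/models" \
    dustynv/ollama:r36.4.0
}

start_isolated_llm
```

&lt;p&gt;Because the port is published with &lt;code&gt;-p&lt;/code&gt;, n8n can still reach the service at &lt;code&gt;http://&amp;lt;jetson-ip&amp;gt;:11434/v1&lt;/code&gt;, while other ports of the container stay unreachable from the LAN.&lt;/p&gt;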

</description>
      <category>jetson</category>
      <category>agxorin</category>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>[Beginner] Docker Tutorial for jetson-containers on Jetson AGX Orin</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 23:20:36 +0000</pubDate>
      <link>https://forem.com/vonusma/beginner-docker-tutorial-for-jetson-containers-on-jetson-agx-orin-5bl8</link>
      <guid>https://forem.com/vonusma/beginner-docker-tutorial-for-jetson-containers-on-jetson-agx-orin-5bl8</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This tutorial explains how to use Docker with the &lt;strong&gt;jetson-containers&lt;/strong&gt; project on an NVIDIA Jetson AGX Orin 64 GB running Ubuntu 22.04 and JetPack 6.2.2, focusing on beginner-friendly concepts and commands. It introduces basic container terminology, shows how to safely back up configuration files, and then walks through everyday Docker operations like starting, stopping, rebuilding, and re-running containers. The goal is to give new users a practical, copy‑pasteable reference they can keep open on the Jetson while working with jetson-containers.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Basic Concepts: Images, Containers, Volumes
&lt;/h2&gt;

&lt;p&gt;For beginners, it helps to understand a few core terms before running commands.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Image&lt;/strong&gt;: A read‑only template that contains an application and its OS-level dependencies (for example, a pre-built PyTorch + CUDA environment for Jetson).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Container&lt;/strong&gt;: A running instance of an image with its own filesystem, processes, and network configuration; you can start, stop, and delete containers without touching the original image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dockerfile&lt;/strong&gt;: A text file with instructions on how to build an image (which base image to use, what packages to install, what commands to run).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volume&lt;/strong&gt;: Storage that lives outside the container's own filesystem, either a directory from the host (your Jetson) mounted into the container or a named volume managed by Docker, so changes persist on disk even if you delete the container.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Registry&lt;/strong&gt;: A server that stores images (for example, Docker Hub or the GitHub Container Registry used by many Jetson projects).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mental model: the image is like an ISO of a Linux distro, and the container is like the running system you boot from that ISO, with volumes acting as your persistent home folder.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Safety First: Backing Up Configuration Files
&lt;/h2&gt;

&lt;p&gt;Before experimenting with Docker and jetson-containers, you should back up any configuration files or project directories you are going to modify.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1. Backing up directories on the Jetson
&lt;/h3&gt;

&lt;p&gt;Pick a backup directory on your Jetson, for example &lt;code&gt;~/backups/jetson-containers&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/backups/jetson-containers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To back up a directory that will be mounted into a container (for example &lt;code&gt;~/projects/my-app&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Backup with timestamp&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; ~/projects/my-app &lt;span class="se"&gt;\&lt;/span&gt;
  ~/backups/jetson-containers/my-app_&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To back up a single file that might be edited (for example &lt;code&gt;docker-compose.yml&lt;/code&gt; or a config file):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp &lt;/span&gt;docker-compose.yml &lt;span class="se"&gt;\&lt;/span&gt;
  docker-compose.yml.bak_&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are about to edit a file inside a project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From inside the project directory&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;config.yaml config.yaml.bak_&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restoring is just the reverse:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
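&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# -a keeps permissions and recurses; the trailing /. also copies hidden files&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; ~/backups/jetson-containers/my-app_20260405-120000/. &lt;span class="se"&gt;\&lt;/span&gt;
   ~/projects/my-app/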
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
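
&lt;p&gt;The backup commands in this section can be wrapped in a small helper so the timestamp pattern stays consistent. This is a minimal sketch; the function name &lt;code&gt;backup_path&lt;/code&gt; and the &lt;code&gt;BACKUP_ROOT&lt;/code&gt; variable are hypothetical and not part of jetson-containers.&lt;/p&gt;

```shell
# Where timestamped copies are stored (same directory as used above).
BACKUP_ROOT="${BACKUP_ROOT:-$HOME/backups/jetson-containers}"

# Copy a file or directory into BACKUP_ROOT with a timestamp suffix.
# Usage: backup_path ~/projects/my-app
backup_path() {
  src="${1%/}"                                   # strip one trailing slash
  dest="$BACKUP_ROOT/$(basename "$src")_$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$BACKUP_ROOT"
  cp -a "$src" "$dest"                           # -a preserves permissions and hidden files
  echo "$dest"                                   # print where the copy landed
}
```

&lt;p&gt;The function prints the backup location, so the path can be noted down or reused later in a restore command.&lt;/p&gt;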






&lt;h2&gt;
  
  
  3. Checking Docker on Jetson AGX Orin
&lt;/h2&gt;

&lt;p&gt;This tutorial assumes Docker 29.3.1 is already installed on your Jetson with arm64 support, which is what jetson-containers needs.&lt;/p&gt;

&lt;p&gt;Verify Docker is running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker version
docker info
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;docker info&lt;/code&gt; shows errors about permissions, add your user to the &lt;code&gt;docker&lt;/code&gt; group and re‑login:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; docker &lt;span class="nv"&gt;$USER&lt;/span&gt;
&lt;span class="c"&gt;# Then log out and log back in, or reboot&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To test with a simple container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; arm64v8/ubuntu:22.04 &lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



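&lt;p&gt;This command pulls an arm64 Ubuntu image and prints kernel information. Containers share the host kernel, so the output should show the Jetson's own kernel and the &lt;code&gt;aarch64&lt;/code&gt; architecture, confirming Docker is working.&lt;/p&gt;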




&lt;h2&gt;
  
  
  4. Using jetson-containers: Typical Workflow
&lt;/h2&gt;

&lt;p&gt;This section uses generic patterns you can adapt to the jetson-containers project (git clone, build, and run commands are similar across JetPack 6 projects).&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1. Cloning the jetson-containers repository
&lt;/h3&gt;

&lt;p&gt;From your home directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~
git clone https://github.com/dusty-nv/jetson-containers.git
&lt;span class="nb"&gt;cd &lt;/span&gt;jetson-containers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Back up the repository before heavy changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; ~/jetson-containers &lt;span class="se"&gt;\&lt;/span&gt;
  ~/backups/jetson-containers/jetson-containers_&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.2. Building a jetson-containers image
&lt;/h3&gt;

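&lt;p&gt;Within the jetson-containers repository, there are build scripts and Dockerfiles that produce images matched to your JetPack version. Check the repository's README for the current build entry point; the script path shown below is a placeholder pattern.&lt;/p&gt;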

&lt;p&gt;Example pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example: build an image for a specific package or stack&lt;/span&gt;
./scripts/build.sh &amp;lt;image-name&amp;gt;
&lt;span class="c"&gt;# or directly with Docker&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; my-jetson-image &lt;span class="nt"&gt;-f&lt;/span&gt; Dockerfile &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;&amp;lt;image-name&amp;gt;&lt;/code&gt; with the target defined by jetson-containers (for example, a PyTorch or L4T base image name).&lt;/p&gt;

&lt;p&gt;Key flags for &lt;code&gt;docker build&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# -t: tag (name) for your image&lt;/span&gt;
&lt;span class="c"&gt;# -f: Dockerfile to use; the final "." is the build context&lt;/span&gt;
docker build &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-t&lt;/span&gt; my-jetson-image &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; Dockerfile &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5. Running, Stopping, and Inspecting Containers
&lt;/h2&gt;

&lt;p&gt;This is the heart of day‑to‑day container usage.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1. Starting a new container with jetson-containers
&lt;/h3&gt;

&lt;p&gt;A typical &lt;code&gt;docker run&lt;/code&gt; command for Jetson should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the correct image (from jetson-containers).&lt;/li&gt;
&lt;li&gt;Pass through GPU access.&lt;/li&gt;
&lt;li&gt;Mount your project directory as a volume.&lt;/li&gt;
&lt;li&gt;Optionally set the container name.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generic pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpus&lt;/span&gt; all &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ipc&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/projects/my-app:/workspace/my-app &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; my-jetson-container &lt;span class="se"&gt;\&lt;/span&gt;
  my-jetson-image &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Explanation of key flags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-it&lt;/code&gt;: Interactive terminal.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--rm&lt;/code&gt;: Delete the container when it exits (good for experiments).&lt;/li&gt;
&lt;li&gt;
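&lt;code&gt;--gpus all&lt;/code&gt;: Give the container access to the Jetson GPU through the NVIDIA container runtime (many Jetson guides use &lt;code&gt;--runtime nvidia&lt;/code&gt; for the same purpose).&lt;/li&gt;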
&lt;li&gt;
&lt;code&gt;--network host&lt;/code&gt;: Share the host network stack (useful for ROS, web services, etc).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--ipc host&lt;/code&gt;: Share the host's IPC namespace and shared memory, which frameworks such as PyTorch rely on for multi-process data loading.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-v host:container&lt;/code&gt;: Mount a host directory into the container.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--name&lt;/code&gt;: Easy name for managing the container later.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5.2. Listing running and stopped containers
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Only running containers&lt;/span&gt;
docker ps

&lt;span class="c"&gt;# All containers (running and stopped)&lt;/span&gt;
docker ps &lt;span class="nt"&gt;-a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.3. Attaching and entering a running container
&lt;/h3&gt;

&lt;p&gt;If a container is running in the background:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; my-jetson-container /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens a shell inside the running container.&lt;/p&gt;
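
&lt;p&gt;The &lt;code&gt;docker run&lt;/code&gt; pattern in section 5.1 starts a foreground shell, so it never leaves a container running in the background. A minimal sketch of a detached start followed by &lt;code&gt;docker exec&lt;/code&gt; is shown here; it assumes the image provides GNU &lt;code&gt;sleep&lt;/code&gt; (true for Ubuntu-based images), reuses the example names from above, and only prints the commands unless &lt;code&gt;DRY_RUN=0&lt;/code&gt; is set.&lt;/p&gt;

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; set to 0 to execute

# Print or execute a command depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start detached (-d); "sleep infinity" keeps the container alive
# without a foreground shell.
run docker run -d --name my-jetson-container \
  --gpus all --network host --ipc host \
  -v "$HOME/projects/my-app:/workspace/my-app" \
  my-jetson-image sleep infinity

# Open an interactive shell inside the running container.
run docker exec -it my-jetson-container /bin/bash
```

&lt;p&gt;Exiting the shell opened by &lt;code&gt;docker exec&lt;/code&gt; leaves the container running; it stops only when you run &lt;code&gt;docker stop&lt;/code&gt;.&lt;/p&gt;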

&lt;h3&gt;
  
  
  5.4. Stopping and removing containers
&lt;/h3&gt;

&lt;p&gt;To stop a running container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker stop my-jetson-container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



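&lt;p&gt;To force-stop a container that does not respond to &lt;code&gt;docker stop&lt;/code&gt; (this sends SIGKILL immediately, with no graceful shutdown):&lt;br&gt;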
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;kill &lt;/span&gt;my-jetson-container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To remove a stopped container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;rm &lt;/span&gt;my-jetson-container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To remove all stopped containers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker container prune
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6. Re-running and Rebuilding Images
&lt;/h2&gt;

&lt;p&gt;When you change a Dockerfile or the jetson-containers configuration, you often need to rebuild images.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.1. Re-running a container with the same configuration
&lt;/h3&gt;

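&lt;p&gt;If you started the container without &lt;code&gt;--rm&lt;/code&gt;, Docker keeps it and its configuration after it stops, until you remove it. A name such as &lt;code&gt;--name my-jetson-container&lt;/code&gt; makes it easy to refer to later.&lt;/p&gt;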

&lt;p&gt;To start it again after it has been stopped:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker start my-jetson-container
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; my-jetson-container /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you used &lt;code&gt;--rm&lt;/code&gt;, the container is deleted on exit, so you must run &lt;code&gt;docker run&lt;/code&gt; again (the image itself remains).&lt;/p&gt;

&lt;h3&gt;
  
  
  6.2. Forcing a rebuild of an image
&lt;/h3&gt;

&lt;p&gt;When you modify a Dockerfile or build context and want Docker to ignore previous cache layers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; my-jetson-image &lt;span class="nt"&gt;-f&lt;/span&gt; Dockerfile &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If jetson-containers provides a build script, you can usually pass a similar &lt;code&gt;--no-cache&lt;/code&gt; flag or use an environment variable, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example pattern; adapt to the actual script interface&lt;/span&gt;
&lt;span class="nv"&gt;NO_CACHE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 ./scripts/build.sh &amp;lt;image-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



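&lt;p&gt;To remove an image and rebuild it (removing the image does not clear the build cache, so add &lt;code&gt;--no-cache&lt;/code&gt; if you also want to bypass cached layers):&lt;br&gt;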
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker rmi my-jetson-image
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; my-jetson-image &lt;span class="nt"&gt;-f&lt;/span&gt; Dockerfile &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;List images to verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker images
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6.3. Updating images from a registry
&lt;/h3&gt;

&lt;p&gt;If jetson-containers publishes pre-built images, you can pull the latest version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull &amp;lt;registry&amp;gt;/&amp;lt;namespace&amp;gt;/&amp;lt;image&amp;gt;:&amp;lt;tag&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After pulling, re-run your containers using the updated tag to test new versions safely (after backing up your mounted project directory).&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Managing Data and Volumes Safely
&lt;/h2&gt;

&lt;p&gt;To avoid losing important work when deleting containers, always use volumes (host directories mounted into containers).&lt;/p&gt;

&lt;h3&gt;
  
  
  7.1. Using host directories as volumes
&lt;/h3&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/projects/my-app
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/projects/my-app:/workspace/my-app &lt;span class="se"&gt;\&lt;/span&gt;
  my-jetson-image &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anything you save to &lt;code&gt;/workspace/my-app&lt;/code&gt; inside the container appears in &lt;code&gt;~/projects/my-app&lt;/code&gt; on the Jetson and persists when the container is removed.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.2. Using named Docker volumes (optional)
&lt;/h3&gt;

&lt;p&gt;For simple persistent storage managed by Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a volume&lt;/span&gt;
docker volume create my-jetson-volume

&lt;span class="c"&gt;# Use it in a container&lt;/span&gt;
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; my-jetson-volume:/data &lt;span class="se"&gt;\&lt;/span&gt;
  my-jetson-image &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



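&lt;p&gt;List volumes and remove unused ones (&lt;code&gt;prune&lt;/code&gt; permanently deletes the data in the volumes it removes, so check the list first):&lt;br&gt;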
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker volume &lt;span class="nb"&gt;ls
&lt;/span&gt;docker volume prune
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  8. Useful Command Reference Table
&lt;/h2&gt;

&lt;p&gt;Below is a quick reference table you can keep near your terminal.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Command (Jetson terminal)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Check Docker version&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker version&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List running containers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker ps&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List all containers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker ps -a&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List images&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker images&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Start new container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker run -it --rm --gpus all --network host --ipc host -v ~/proj:/workspace my-img&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stop container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker stop &amp;lt;name-or-id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Force stop container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker kill &amp;lt;name-or-id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remove stopped container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker rm &amp;lt;name-or-id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remove image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker rmi &amp;lt;image&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exec into running container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker exec -it &amp;lt;name-or-id&amp;gt; /bin/bash&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker build -t my-img -f Dockerfile .&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build image without cache&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker build --no-cache -t my-img -f Dockerfile .&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Start stopped container&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docker start &amp;lt;name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backup directory before changes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cp -a ~/dir ~/backups/dir_$(date +%Y%m%d-%H%M%S)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backup single file before editing&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cp file.txt file.txt.bak_$(date +%Y%m%d-%H%M%S)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1 — Common Docker commands on Jetson&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Conclusion
&lt;/h2&gt;

&lt;p&gt;With these concepts and commands, you can confidently use Docker and jetson-containers on your Jetson AGX Orin without risking important project data, thanks to consistent use of backups and volumes. As you become more comfortable, you can refine the &lt;code&gt;docker run&lt;/code&gt; patterns, create your own Dockerfiles, and integrate jetson-containers more deeply into your development workflow.&lt;/p&gt;

</description>
      <category>jetson</category>
      <category>agxorin</category>
      <category>jetsoncontainers</category>
      <category>docker</category>
    </item>
    <item>
      <title>Fast Large-file and LLM Downloads with aria2 on NVIDIA Jetson AGX Orin</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 22:32:55 +0000</pubDate>
      <link>https://forem.com/vonusma/fast-large-file-and-llm-downloads-with-aria2-on-nvidia-jetson-agx-orin-28km</link>
      <guid>https://forem.com/vonusma/fast-large-file-and-llm-downloads-with-aria2-on-nvidia-jetson-agx-orin-28km</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This tutorial documents the configuration and use of &lt;strong&gt;aria2&lt;/strong&gt; to download very large files and LLM weight archives on an NVIDIA Jetson AGX Orin Developer Kit 64 GB running Ubuntu 22.04.5 LTS aarch64 with JetPack 6.2.2. It focuses on high-concurrency HTTPS downloads from Hugging Face and similar model repositories, with commands tuned for edge hardware, multi-gigabyte single-file models in GGUF or safetensors format, and the object storage redirect behavior common to modern model hosting platforms.&lt;/p&gt;

&lt;p&gt;The guide covers robust resume strategies using &lt;code&gt;.aria2&lt;/code&gt; control files and session files that allow downloads to survive reboots, intermittent connectivity, and signed URL expiration. Failure scenarios encountered in practice — including HTTP 403 rate limits near completion, hash-like output filenames resulting from redirect chains, and missing control metadata — are addressed with safe, prescriptive recovery steps that avoid data corruption or accidental full re-downloads.&lt;/p&gt;

&lt;p&gt;The document targets advanced Linux and Jetson users who regularly fetch multi-GB model artefacts and want a repeatable, resilient pattern for aria2 on ARM64. Readers will finish with a working installation, a set of reusable power commands, a reliable batch-resume workflow, and the knowledge to debug common failure modes without discarding already-downloaded data.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Hardware and software environment
&lt;/h2&gt;

&lt;p&gt;The environment documented throughout this tutorial is an NVIDIA Jetson AGX Orin Developer Kit with 64 GB unified memory, running a standard JetPack 6.2.2 software stack on Ubuntu 22.04.5 LTS aarch64. This configuration represents a current-generation edge AI development system with sufficient CPU cores, RAM, and storage throughput to benefit from aria2's parallel download capabilities. The commands and flag values in subsequent sections are validated against this environment.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Version / Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Hardware&lt;/td&gt;
&lt;td&gt;NVIDIA Jetson AGX Orin Developer Kit 64 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS&lt;/td&gt;
&lt;td&gt;Ubuntu 22.04.5 LTS aarch64&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel&lt;/td&gt;
&lt;td&gt;5.15.185-tegra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L4T&lt;/td&gt;
&lt;td&gt;36.5.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JetPack&lt;/td&gt;
&lt;td&gt;6.2.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CUDA&lt;/td&gt;
&lt;td&gt;12.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cuDNN&lt;/td&gt;
&lt;td&gt;9.3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TensorRT&lt;/td&gt;
&lt;td&gt;10.3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1 — Jetson AGX Orin software stack&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The tutorial assumes that network routing, DNS, and internet connectivity to Hugging Face are already functional on the device. No proxy or VPN configuration is assumed, although aria2 supports those if needed. Storage is assumed to be NVMe or SSD formatted as ext4, which affects the recommended file allocation strategy discussed in section 6.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Installing aria2
&lt;/h2&gt;

&lt;p&gt;aria2 is available from the official Ubuntu 22.04 aarch64 package repositories and requires no external PPA or manual build on this platform.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; aria2
aria2c &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;aria2c -v&lt;/code&gt; prints version details and build flags, confirming that the binary is functional and correctly linked. If the command is not found, verify that &lt;code&gt;/usr/bin&lt;/code&gt; is in &lt;code&gt;PATH&lt;/code&gt;. On a standard Jetson Ubuntu installation, no adjustments are required.&lt;/p&gt;

&lt;p&gt;Create dedicated directories for model files and their associated &lt;code&gt;.aria2&lt;/code&gt; control files before starting any downloads. Co-locating them in stable directories is essential for reliable resume behavior, as aria2 expects the data file and its sidecar to share the same directory and base name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/models
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/downloads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Moving or renaming a data file after a partial download breaks the association aria2 relies on to continue from the correct offset. Establish these directories once and use them consistently across all aria2 invocations.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Fast single-file downloads from Hugging Face
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Core throughput flags: &lt;code&gt;-x&lt;/code&gt;, &lt;code&gt;-s&lt;/code&gt;, and &lt;code&gt;-k&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;For a single large file, aria2 opens multiple HTTP connections and divides the target into segments that are fetched concurrently. Three flags control this behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-x N&lt;/code&gt;&lt;/strong&gt; / &lt;code&gt;--max-connection-per-server=N&lt;/code&gt;: number of parallel HTTP connections opened to the server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-s N&lt;/code&gt;&lt;/strong&gt; / &lt;code&gt;--split=N&lt;/code&gt;: number of segments the file is divided into for parallel download.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-k SIZE&lt;/code&gt;&lt;/strong&gt; / &lt;code&gt;--min-split-size=SIZE&lt;/code&gt;: minimum size per segment (e.g., &lt;code&gt;64M&lt;/code&gt;); prevents excessive small chunks for large files.&lt;/li&gt;
&lt;/ul&gt;
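&lt;p&gt;As a rough illustration of how these three flags interact, the sketch below estimates how many segments aria2 can actually use for a file of a given size. It assumes the rule described in the aria2 manual that a byte range smaller than twice the minimum split size is not split further. The function name &lt;code&gt;effective_splits&lt;/code&gt; and the whole-MiB arithmetic are illustrative simplifications, not aria2 internals.&lt;/p&gt;

```shell
# effective_splits SIZE_MIB SPLIT MIN_SPLIT_MIB
# Prints an estimate of the number of segments aria2 can use.
# Assumption: at most floor(SIZE / MIN_SPLIT) segments, capped by SPLIT.
effective_splits() {
  size_mib=$1
  split=$2
  min_split_mib=$3
  by_size=$(( size_mib / min_split_mib ))
  # A file smaller than the minimum split size still uses one connection.
  if [ "$by_size" -lt 1 ]; then by_size=1; fi
  if [ "$by_size" -lt "$split" ]; then
    echo "$by_size"
  else
    echo "$split"
  fi
}

effective_splits 20 16 10      # 20 MiB file, -s16 -k10M: prints 2
effective_splits 18000 16 64   # roughly 18 GB file, -s16 -k64M: prints 16
```

&lt;p&gt;For multi-gigabyte model files the estimate is almost always capped by &lt;code&gt;-s&lt;/code&gt;, and against a single server the number of open connections is additionally capped by &lt;code&gt;-x&lt;/code&gt;, so the two are usually set to the same value.&lt;/p&gt;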

&lt;p&gt;Placing the URL last, after all options, keeps commands readable and makes a common mistake easy to spot: writing &lt;code&gt;-s&lt;/code&gt; immediately before the URL with no numeric value between them. In that case aria2 consumes the URL as the split count, rejects it as an invalid number, and is left with no URL to download.&lt;/p&gt;
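&lt;p&gt;A wrapper script can catch a missing numeric value before aria2 is invoked. The following is a minimal sketch; the helper name &lt;code&gt;valid_conn_count&lt;/code&gt; is hypothetical, and the 1 to 16 range is the one the aria2 manual documents for &lt;code&gt;--max-connection-per-server&lt;/code&gt;.&lt;/p&gt;

```shell
# valid_conn_count VALUE
# Succeeds only if VALUE is a whole number between 1 and 16, the range
# aria2 accepts for --max-connection-per-server.
valid_conn_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit (e.g. a URL)
  esac
  [ "$1" -ge 1 ] || return 1
  [ "$1" -le 16 ]
}

for candidate in 16 0 "https://huggingface.co/..."; do
  if valid_conn_count "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```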

&lt;p&gt;Correct usage for a Hugging Face GGUF file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-x&lt;/span&gt; 16 &lt;span class="nt"&gt;-s&lt;/span&gt; 16 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/unsloth/gemma-4-31B-it-GGUF/resolve/main/gemma-4-31B-it-Q4_K_M.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the server returns HTTP 429 responses or imposes rate limits, reduce concurrency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-x&lt;/span&gt; 8 &lt;span class="nt"&gt;-s&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/unsloth/gemma-4-31B-it-GGUF/resolve/main/gemma-4-31B-it-Q4_K_M.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example command for downloading &lt;code&gt;gemma-4-26B-A4B-it-UD-Q4_K_M.gguf&lt;/code&gt; with resume and retry options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-k32M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--retry-wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-tries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--summary-interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; gemma-4-26B-A4B-it-UD-Q4_K_M.gguf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s1"&gt;'https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF/resolve/main/gemma-4-26B-A4B-it-UD-Q4_K_M.gguf?download=true'&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
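&lt;p&gt;The URLs in these examples share one shape: repository, &lt;code&gt;resolve&lt;/code&gt;, revision, filename, and the &lt;code&gt;download=true&lt;/code&gt; query string. A small helper can assemble them and reduce copy-paste errors. This is a sketch only; the function name &lt;code&gt;hf_resolve_url&lt;/code&gt; is hypothetical, and the default revision &lt;code&gt;main&lt;/code&gt; is assumed from the examples above.&lt;/p&gt;

```shell
# hf_resolve_url REPO FILE [REVISION]
# Prints a download URL in the same shape as the examples above.
# Assumption: REVISION defaults to "main", as in those examples.
hf_resolve_url() {
  repo=$1
  file=$2
  revision=${3:-main}
  printf 'https://huggingface.co/%s/resolve/%s/%s?download=true\n' \
    "$repo" "$revision" "$file"
}

hf_resolve_url unsloth/gemma-4-26B-A4B-it-GGUF gemma-4-26B-A4B-it-UD-Q4_K_M.gguf
```

&lt;p&gt;The output can be passed as the final argument, for example &lt;code&gt;aria2c -c -x8 -s8 -o FILE "$(hf_resolve_url REPO FILE)"&lt;/code&gt;.&lt;/p&gt;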



&lt;h3&gt;
  
  
  3.2 Power command template for large model downloads
&lt;/h3&gt;

&lt;p&gt;The following command is the recommended baseline for large model downloads. It combines high concurrency with resilient retry behavior, explicit resume support, and a stable output filename.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x16&lt;/span&gt; &lt;span class="nt"&gt;-s16&lt;/span&gt; &lt;span class="nt"&gt;-k64M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--retry-wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5 &lt;span class="nt"&gt;--max-tries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--summary-interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; gemma-4-31B-it-Q4_K_M.gguf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/unsloth/gemma-4-31B-it-GGUF/resolve/main/gemma-4-31B-it-Q4_K_M.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Flag-by-flag explanation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-c&lt;/code&gt;&lt;/strong&gt; / &lt;code&gt;--continue=true&lt;/code&gt;: continue a partially downloaded file; when the &lt;code&gt;.aria2&lt;/code&gt; control file is present, aria2 fetches only the segments that are still missing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-x16&lt;/code&gt; / &lt;code&gt;-s16&lt;/code&gt;&lt;/strong&gt;: 16 parallel connections and 16 file segments for maximum throughput on a fast link.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-k64M&lt;/code&gt;&lt;/strong&gt;: 64 MB minimum segment size; reduces the number of chunks for very large files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--retry-wait=5&lt;/code&gt;&lt;/strong&gt;: pause 5 seconds before each retry on transient errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--max-tries=0&lt;/code&gt;&lt;/strong&gt;: retry indefinitely; aria2 will not give up until stopped manually.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--file-allocation=none&lt;/code&gt;&lt;/strong&gt;: skip pre-allocation of the full file size, avoiding the blocking write that aria2's default &lt;code&gt;prealloc&lt;/code&gt; mode performs at startup on Jetson NVMe storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-d ~/models&lt;/code&gt;&lt;/strong&gt;: explicit target directory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-o &amp;lt;filename&amp;gt;&lt;/code&gt;&lt;/strong&gt;: explicit output filename, preventing query-string characters or hash-like object keys from appearing in the filename on disk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For throttled or unstable connections, use the conservative variant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k16M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--retry-wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 &lt;span class="nt"&gt;--max-tries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/.../model.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This reduces pressure on the server while still delivering substantially better throughput than a single connection.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Resume mechanics and .aria2 control files
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 How aria2 resume works
&lt;/h3&gt;

&lt;p&gt;For every active download, aria2 creates a binary control file alongside the data file using the naming convention &lt;code&gt;&amp;lt;target-filename&amp;gt;.aria2&lt;/code&gt;. Downloading &lt;code&gt;model.gguf&lt;/code&gt; produces both &lt;code&gt;model.gguf&lt;/code&gt; and &lt;code&gt;model.gguf.aria2&lt;/code&gt; in the output directory. The control file records the total length, the piece length, and a bitfield of the pieces completed so far, and aria2 deletes it once the download finishes. Without it, aria2 cannot determine which byte ranges are valid and cannot safely resume a multi-segment download.&lt;/p&gt;

&lt;p&gt;If the same aria2 command is re-run from the same directory with the same output filename, aria2 detects the existing data file and its &lt;code&gt;.aria2&lt;/code&gt; sidecar and continues from the last recorded position. The &lt;code&gt;-c&lt;/code&gt; flag states the intent to continue explicitly. When no control file exists, &lt;code&gt;-c&lt;/code&gt; makes aria2 assume the existing data was written sequentially from the start, which is why the sidecar matters for multi-segment downloads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First run — interrupted partway through&lt;/span&gt;
aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x16&lt;/span&gt; &lt;span class="nt"&gt;-s16&lt;/span&gt; &lt;span class="nt"&gt;-k64M&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; model.gguf &lt;span class="s2"&gt;"https://huggingface.co/.../model.gguf"&lt;/span&gt;

&lt;span class="c"&gt;# Second run — resumes from the interrupted position&lt;/span&gt;
aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x16&lt;/span&gt; &lt;span class="nt"&gt;-s16&lt;/span&gt; &lt;span class="nt"&gt;-k64M&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; model.gguf &lt;span class="s2"&gt;"https://huggingface.co/.../model.gguf"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



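&lt;p&gt;aria2 defaults to &lt;code&gt;--always-resume=true&lt;/code&gt;, which aborts the download when the server cannot resume it instead of silently restarting from scratch. Passing the flag explicitly documents that intent and guards against a configuration file that overrides it:&lt;br&gt;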
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;--always-resume&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"https://example.com/bigfile.iso"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
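&lt;p&gt;Before re-running a download, it can help to check which of these situations applies. The sketch below classifies a target by the presence of the data file and its sidecar; the function name &lt;code&gt;download_state&lt;/code&gt; and the three labels are hypothetical.&lt;/p&gt;

```shell
# download_state DIR NAME
# Prints one of: absent, resumable, no-control-file.
# "no-control-file" means either a finished download (aria2 removes the
# control file on completion) or a partial one whose sidecar was lost.
download_state() {
  data_file="$1/$2"
  control_file="$data_file.aria2"
  if [ ! -e "$data_file" ]; then
    echo "absent"
  elif [ -e "$control_file" ]; then
    echo "resumable"
  else
    echo "no-control-file"
  fi
}

download_state "$HOME/models" model.gguf
```

&lt;p&gt;A &lt;code&gt;no-control-file&lt;/code&gt; result cannot distinguish a complete file from a damaged partial one, so it should be followed by a size or checksum comparison.&lt;/p&gt;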



&lt;h3&gt;
  
  
  4.2 Missing .aria2 control file
&lt;/h3&gt;

&lt;p&gt;If aria2 reports &lt;code&gt;errorCode=13&lt;/code&gt; or a message containing "file exists but .aria2 does not exist", the data file is present but the control file has been lost.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the data file is complete and passes an integrity check (e.g., SHA-256 matches the repository-published hash), it can be kept and removed from any pending URL or session lists.&lt;/li&gt;
&lt;li&gt;If the data file is incomplete and the &lt;code&gt;.aria2&lt;/code&gt; file is gone, aria2 has no record of which byte ranges were successfully written. The safest recovery is to &lt;strong&gt;delete both the partial data file and any remnant &lt;code&gt;.aria2&lt;/code&gt; file&lt;/strong&gt;, then restart the download from scratch for that specific file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not attempt to resume an incomplete file without its control file. The resulting output may silently contain duplicate or missing byte ranges.&lt;/p&gt;
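&lt;p&gt;The integrity check mentioned above can be scripted with &lt;code&gt;sha256sum&lt;/code&gt; from GNU coreutils, which is present on a standard Ubuntu installation. This is a minimal sketch; the function name &lt;code&gt;verify_sha256&lt;/code&gt; is hypothetical, and the expected hash is assumed to be copied by hand from the model repository.&lt;/p&gt;

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds if the SHA-256 of FILE equals EXPECTED_HEX (lowercase hex).
# EXPECTED_HEX is assumed to be copied from the model repository page.
verify_sha256() {
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  if [ "$actual" = "$2" ]; then
    echo "match: $1"
  else
    echo "MISMATCH: $1"
    return 1
  fi
}
```

&lt;p&gt;Hashing a multi-gigabyte file takes noticeable time, so the check is best run once after the download finishes rather than on every resume.&lt;/p&gt;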

&lt;h3&gt;
  
  
  4.3 Resuming with a new signed URL
&lt;/h3&gt;

&lt;p&gt;Hugging Face and similar hosts issue time-limited signed URLs. If a partial download's original URL has expired, resume is still possible provided:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;output file name and directory path&lt;/strong&gt; are unchanged.&lt;/li&gt;
&lt;li&gt;The new URL resolves to the same file content.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;--auto-file-renaming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models &lt;span class="nt"&gt;-o&lt;/span&gt; model.gguf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://new-signed-url..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--auto-file-renaming=false&lt;/code&gt; flag prevents aria2 from creating a renamed copy (e.g., &lt;code&gt;model.1.gguf&lt;/code&gt;, with the counter inserted before the extension) when it detects an existing file. Instead, aria2 reuses the existing partial file and its &lt;code&gt;.aria2&lt;/code&gt; control file and continues from the recorded position.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Batch downloads and session files
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 URL list with session management
&lt;/h3&gt;

&lt;p&gt;When downloading multiple files — such as a full safetensors shard set or several GGUF quantization variants — use a URL list file and a session file to enable batch resume. Create &lt;code&gt;urls.txt&lt;/code&gt; with one URL per line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://huggingface.co/.../model-00001-of-00037.safetensors
https://huggingface.co/.../model-00002-of-00037.safetensors
https://huggingface.co/.../model-00003-of-00037.safetensors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
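&lt;p&gt;For a large shard set, the list can be generated rather than typed. aria2 input files also accept per-download options on indented lines that follow a URL, so an &lt;code&gt;out=&lt;/code&gt; line can pin each filename in batch mode. The sketch below assumes the shard naming pattern shown above; the function name &lt;code&gt;shard_entries&lt;/code&gt; is hypothetical, and the base URL is left elided as in the original.&lt;/p&gt;

```shell
# shard_entries BASE_URL TOTAL
# Prints one input-file entry per shard: the URL, then an indented
# "out=" option line that pins the filename on disk.
# Assumption: shards follow the model-NNNNN-of-NNNNN.safetensors pattern
# shown above; BASE_URL is whatever precedes the shard name.
shard_entries() {
  base_url=$1
  total=$2
  index=1
  while [ "$index" -le "$total" ]; do
    name=$(printf 'model-%05d-of-%05d.safetensors' "$index" "$total")
    printf '%s/%s\n  out=%s\n' "$base_url" "$name" "$name"
    index=$(( index + 1 ))
  done
}

shard_entries "https://huggingface.co/..." 3
```

&lt;p&gt;Piping the output through &lt;code&gt;tee urls.txt&lt;/code&gt; writes the list to disk while still showing it on screen for review.&lt;/p&gt;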



&lt;p&gt;Start the batch with session tracking enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-i&lt;/span&gt; urls.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--save-session&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aria2-session.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--save-session-interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k16M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-i urls.txt&lt;/code&gt;&lt;/strong&gt;: read download targets from the URL list file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--save-session=aria2-session.txt&lt;/code&gt;&lt;/strong&gt;: save errored and unfinished downloads to the session file when aria2 exits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--save-session-interval=60&lt;/code&gt;&lt;/strong&gt;: flush the session file to disk every 60 seconds, limiting lost progress on an abrupt stop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-c&lt;/code&gt;&lt;/strong&gt;: resume any partial files found in the target directory.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After a reboot or manual stop, resume the entire batch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;--input-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aria2-session.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--save-session&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aria2-session.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--input-file&lt;/code&gt; reloads all entries from the session file. Completed downloads are automatically dropped from the next session write. Adding &lt;code&gt;--force-save=true&lt;/code&gt; retains completed entries for audit purposes, but requires manual pruning of the session file as the download set grows.&lt;/p&gt;
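&lt;p&gt;To see how much work remains before resuming, the entries in a session file can be counted. The sketch below relies on the input-file format, in which each download starts with a URI at the beginning of a line, option lines are indented, and comment lines start with &lt;code&gt;#&lt;/code&gt;. The function name &lt;code&gt;pending_count&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# pending_count SESSION_FILE
# Prints the number of download entries in an aria2 input or session file.
# Entries are lines that start with a URI; option lines are indented and
# comment lines start with "#", so both are skipped.
pending_count() {
  if [ -f "$1" ]; then
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep -c '^[^[:space:]#]' "$1" || true
  else
    echo 0
  fi
}

pending_count aria2-session.txt
```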

&lt;h3&gt;
  
  
  5.2 Persistent single-session workflow
&lt;/h3&gt;

&lt;p&gt;A single session file can serve as both input and output, providing a self-maintaining queue of unfinished downloads across multiple aria2 invocations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch &lt;/span&gt;aria2-session.txt
aria2c &lt;span class="nt"&gt;--input-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aria2-session.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--save-session&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aria2-session.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--save-session-interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;30 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k16M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/downloads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



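&lt;p&gt;New URLs can be added to the queue by appending them to &lt;code&gt;aria2-session.txt&lt;/code&gt; while aria2 is not running, since the session file uses the same format as an &lt;code&gt;-i&lt;/code&gt; input file. Avoid pointing a second, separate &lt;code&gt;aria2c&lt;/code&gt; invocation at the same &lt;code&gt;--save-session&lt;/code&gt; path: each run rewrites the file on exit with only its own unfinished downloads, so entries are replaced rather than accumulated.&lt;/p&gt;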




&lt;h2&gt;
  
  
  6. Jetson-specific configuration and filesystem considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 Recommended baseline flags for Jetson AGX Orin
&lt;/h3&gt;

&lt;p&gt;The Jetson AGX Orin has fast CPU cores and ample RAM, but disk throughput and storage capacity may be shared across concurrent inference workloads. The following practices are tuned for this profile:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a &lt;strong&gt;dedicated directory per model project&lt;/strong&gt; (&lt;code&gt;~/models&lt;/code&gt;, &lt;code&gt;~/hf_cache&lt;/code&gt;) to keep &lt;code&gt;.aria2&lt;/code&gt; control files co-located with their data files and avoid cross-directory confusion.&lt;/li&gt;
&lt;li&gt;Prefer &lt;strong&gt;&lt;code&gt;--file-allocation=none&lt;/code&gt;&lt;/strong&gt; for fast startup. Switch to &lt;code&gt;falloc&lt;/code&gt; only when pre-allocation is explicitly needed for fragmentation control on large multi-GB artefacts.&lt;/li&gt;
&lt;li&gt;Start with &lt;strong&gt;&lt;code&gt;-x8 -s8&lt;/code&gt;&lt;/strong&gt; and increase to 16 if the connection and server support it without triggering rate limiting.&lt;/li&gt;
&lt;li&gt;Always include &lt;strong&gt;&lt;code&gt;-c&lt;/code&gt;&lt;/strong&gt; for any file larger than a few hundred megabytes to guard against accidental restarts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add a shell alias to &lt;code&gt;~/.bashrc&lt;/code&gt; to standardize the baseline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"alias aria2fast='aria2c -c -x8 -s8 -k16M --file-allocation=none --summary-interval=30'"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Usage with the alias:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2fast &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models &lt;span class="nt"&gt;-o&lt;/span&gt; model.gguf &lt;span class="s2"&gt;"https://huggingface.co/.../model.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A Hugging Face-specific alias with longer retry delays handles the backend's rate limiting more gracefully:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"alias hfaria='aria2c -c -x8 -s8 -k32M --retry-wait=10 --max-tries=0 --file-allocation=none --summary-interval=60'"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
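&lt;p&gt;Running the &lt;code&gt;echo&lt;/code&gt; commands above more than once appends duplicate alias lines to &lt;code&gt;~/.bashrc&lt;/code&gt;. One way to make the step repeatable is to append only when the exact line is not already present. The helper name &lt;code&gt;add_line_once&lt;/code&gt; below is hypothetical.&lt;/p&gt;

```shell
# add_line_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
add_line_once() {
  rc_file=$1
  line=$2
  if [ -f "$rc_file" ]; then
    # -x matches whole lines only, -F treats LINE as a fixed string.
    if grep -qxF -- "$line" "$rc_file"; then
      echo "already present in $rc_file"
      return 0
    fi
  fi
  # tee -a appends (creating the file if needed) and also echoes the line.
  printf '%s\n' "$line" | tee -a "$rc_file"
}
```

&lt;p&gt;For example, &lt;code&gt;add_line_once ~/.bashrc "alias aria2fast='aria2c -c -x8 -s8 -k16M --file-allocation=none --summary-interval=30'"&lt;/code&gt; can be run any number of times and leaves a single alias line in the file.&lt;/p&gt;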



&lt;h3&gt;
  
  
  6.2 File allocation on Jetson NVMe
&lt;/h3&gt;

&lt;p&gt;On the Jetson NVMe (ext4 with extents), &lt;code&gt;--file-allocation=falloc&lt;/code&gt; pre-allocates the full file using &lt;code&gt;fallocate(2)&lt;/code&gt;, which is fast and reduces fragmentation for multi-GB files. On slower or older filesystems, &lt;code&gt;--file-allocation=none&lt;/code&gt; avoids a blocking write pass at startup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k32M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;falloc &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; ~/models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/.../model.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Monitor available storage before long downloads with &lt;code&gt;df -h&lt;/code&gt;. An out-of-space condition frequently manifests as repeated failures at the same completion percentage, which can be misread as a network or server error.&lt;/p&gt;
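&lt;p&gt;The free-space check can also run automatically before a download starts. The sketch below reads the available space from &lt;code&gt;df -Pk&lt;/code&gt; (POSIX output format, 1024-byte blocks) and compares it with a required size in GiB. The function name &lt;code&gt;has_space&lt;/code&gt; is hypothetical, and the required size is assumed to be supplied by the operator.&lt;/p&gt;

```shell
# has_space DIR NEEDED_GIB
# Succeeds if the filesystem holding DIR has at least NEEDED_GIB free.
has_space() {
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 of the
  # second line is the available space.
  avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  needed_kib=$(( $2 * 1024 * 1024 ))
  if [ "$avail_kib" -ge "$needed_kib" ]; then
    echo "ok: $(( avail_kib / 1024 / 1024 )) GiB free in $1"
  else
    echo "insufficient: $(( avail_kib / 1024 / 1024 )) GiB free, $2 GiB needed"
    return 1
  fi
}

has_space / 0
```

&lt;p&gt;Leaving some headroom beyond the model size is a reasonable precaution when inference workloads write to the same disk during the download.&lt;/p&gt;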




&lt;h2&gt;
  
  
  7. Filename issues from redirect chains
&lt;/h2&gt;

&lt;h3&gt;
  
  
  7.1 Why downloads receive hash-like filenames
&lt;/h3&gt;

&lt;p&gt;When downloading from a Hugging Face &lt;code&gt;resolve&lt;/code&gt; URL, the server redirects through an internal S3-style backend (&lt;code&gt;cas-bridge.xethub.hf.co&lt;/code&gt; or similar object storage). The final HTTP response path contains an opaque SHA-like object key rather than the human-readable model filename. If no explicit output name is provided with &lt;code&gt;-o&lt;/code&gt;, aria2 saves the file using that object key as the filename, producing output such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;c56b8f0416a453a53aace7bef4a088a2c2db33c3b8a4eda949a380c214420b31
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix is to always specify &lt;code&gt;-o&lt;/code&gt; with the intended filename. This flag forces the output name regardless of redirects or &lt;code&gt;Content-Disposition&lt;/code&gt; headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/models

aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k32M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; gemma-4-31B-it-Q4_K_M.gguf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/unsloth/gemma-4-31B-it-GGUF/resolve/main/gemma-4-31B-it-Q4_K_M.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7.2 Renaming an existing hash-named partial file
&lt;/h3&gt;

&lt;p&gt;If a large partial download was already saved under a hash name, it can be renamed without discarding the downloaded data. Both the data file and its &lt;code&gt;.aria2&lt;/code&gt; sidecar must be renamed to matching names simultaneously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/models

&lt;span class="nb"&gt;mv &lt;/span&gt;c56b8f0416a453a53aace7bef4a088a2c2db33c3b8a4eda949a380c214420b31 &lt;span class="se"&gt;\&lt;/span&gt;
   gemma-4-31B-it-Q4_K_M.gguf

&lt;span class="nb"&gt;mv &lt;/span&gt;c56b8f0416a453a53aace7bef4a088a2c2db33c3b8a4eda949a380c214420b31.aria2 &lt;span class="se"&gt;\&lt;/span&gt;
   gemma-4-31B-it-Q4_K_M.gguf.aria2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After renaming, re-run the standard aria2 command with &lt;code&gt;-o gemma-4-31B-it-Q4_K_M.gguf&lt;/code&gt; from the same directory. aria2 will locate the renamed pair and resume only the missing segments.&lt;/p&gt;
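&lt;p&gt;Because a half-finished rename leaves the pair mismatched, the two &lt;code&gt;mv&lt;/code&gt; steps can be wrapped with checks that run first. This is a sketch under the assumption that both files sit in the current directory; the function name &lt;code&gt;rename_pair&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# rename_pair OLD NEW
# Renames OLD and OLD.aria2 to NEW and NEW.aria2 in the current directory.
# Refuses to act unless both source files exist and neither target does.
rename_pair() {
  old=$1
  new=$2
  if [ ! -f "$old" ] || [ ! -f "$old.aria2" ]; then
    echo "refusing: $old or $old.aria2 is missing"
    return 1
  fi
  if [ -e "$new" ] || [ -e "$new.aria2" ]; then
    echo "refusing: $new or $new.aria2 already exists"
    return 1
  fi
  # Rename the sidecar only if the data file was renamed successfully.
  if mv -- "$old" "$new"; then
    mv -- "$old.aria2" "$new.aria2"
  else
    return 1
  fi
}
```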




&lt;h2&gt;
  
  
  8. Failure modes and recovery
&lt;/h2&gt;

&lt;h3&gt;
  
  
  8.1 HTTP 403 errors during download (errorCode=22)
&lt;/h3&gt;

&lt;p&gt;Error lines of the form &lt;code&gt;errorCode=22 … status=403&lt;/code&gt; indicate that individual HTTP segment requests were rejected by the backend. On Hugging Face this occurs most commonly near the end of a long download, for one of two reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Signed S3 URLs in the redirect chain &lt;strong&gt;expire mid-download&lt;/strong&gt; on very large or slow-connection transfers.&lt;/li&gt;
&lt;li&gt;High concurrency (&lt;code&gt;-x16 -s16&lt;/code&gt;) triggers per-connection rate limiting on popular models.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When &lt;code&gt;errorCode=22&lt;/code&gt; appears but the final download results summary lists the file with status &lt;code&gt;OK&lt;/code&gt;, aria2 retried the rejected segments and recovered. To reduce the frequency of these errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-x8&lt;/span&gt; &lt;span class="nt"&gt;-s8&lt;/span&gt; &lt;span class="nt"&gt;-k32M&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--retry-wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 &lt;span class="nt"&gt;--max-tries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file-allocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; gemma-4-31B-it-Q4_K_M.gguf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://huggingface.co/unsloth/gemma-4-31B-it-GGUF/resolve/main/gemma-4-31B-it-Q4_K_M.gguf?download=true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lower &lt;code&gt;-x&lt;/code&gt;/&lt;code&gt;-s&lt;/code&gt; values reduce the number of simultaneous signed segment requests in flight. Combined with &lt;code&gt;--retry-wait=10&lt;/code&gt; and &lt;code&gt;--max-tries=0&lt;/code&gt;, aria2 waits calmly between retries rather than hammering the backend.&lt;/p&gt;
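&lt;p&gt;To judge whether a change in concurrency helped, the rejected requests can be counted in a saved log. The sketch below assumes aria2 was started with &lt;code&gt;--log=FILE&lt;/code&gt; or that its console output was captured to a file, and it matches only the two tokens quoted above. The function name &lt;code&gt;count_403&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# count_403 LOG_FILE
# Prints how many lines mention both errorCode=22 and status=403.
count_403() {
  if [ -f "$1" ]; then
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep 'errorCode=22' "$1" | grep -c 'status=403' || true
  else
    echo 0
  fi
}
```

&lt;p&gt;Comparing the count from a run at &lt;code&gt;-x16 -s16&lt;/code&gt; with one at &lt;code&gt;-x8 -s8&lt;/code&gt; gives a rough indication of whether rate limiting or URL expiry is the dominant cause.&lt;/p&gt;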

&lt;h3&gt;
  
  
  8.2 Persistent failures at a fixed byte offset
&lt;/h3&gt;

&lt;p&gt;When a download fails repeatedly at the same percentage or offset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduce &lt;code&gt;-x&lt;/code&gt; and &lt;code&gt;-s&lt;/code&gt; to lower concurrent byte-range requests.&lt;/li&gt;
&lt;li&gt;Increase &lt;code&gt;--retry-wait&lt;/code&gt; (e.g., &lt;code&gt;--retry-wait=30&lt;/code&gt;) and keep &lt;code&gt;--max-tries=0&lt;/code&gt; to allow extended retry cycles.&lt;/li&gt;
&lt;li&gt;Check for local storage or filesystem errors with &lt;code&gt;dmesg&lt;/code&gt; and &lt;code&gt;journalctl -xe&lt;/code&gt;. Write errors on Jetson NVMe can present as download failures at the application layer.&lt;/li&gt;
&lt;li&gt;If the &lt;code&gt;.aria2&lt;/code&gt; control file is corrupted and resume fails consistently, delete both the partial data file and the &lt;code&gt;.aria2&lt;/code&gt; file, then restart that specific download only — not the entire batch.&lt;/li&gt;
&lt;/ul&gt;
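&lt;p&gt;The first two steps above can be expressed as a small escalation ladder. In the sketch below, level 1 reuses the flag values from this tutorial, while levels 2 and 3 are assumed step-downs chosen for illustration rather than measured optima. The function name &lt;code&gt;retry_flags&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# retry_flags LEVEL
# Prints progressively more conservative aria2 flags.
# Level 1 matches the values used above; levels 2 and 3 are assumed
# step-downs, not measured optima.
retry_flags() {
  case "$1" in
    1) echo "-c -x8 -s8 -k32M --retry-wait=10 --max-tries=0" ;;
    2) echo "-c -x4 -s4 -k32M --retry-wait=30 --max-tries=0" ;;
    *) echo "-c -x1 -s1 --retry-wait=30 --max-tries=0" ;;
  esac
}

for level in 1 2 3; do
  echo "level $level: aria2c $(retry_flags "$level") -o FILE URL"
done
```

&lt;p&gt;Moving down one level at a time keeps as much throughput as the server tolerates; the single-connection level mainly serves to rule out concurrency as the cause.&lt;/p&gt;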

&lt;h3&gt;
  
  
  8.3 Quick command reference
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fast single model download&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aria2c -c -x16 -s16 -k64M --file-allocation=none -d ~/models -o FILE "URL"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conservative single model&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aria2c -c -x8 -s8 -k16M --file-allocation=none -d ~/models -o FILE "URL"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resume single file&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;aria2c -c -x8 -s8 -o FILE "URL"&lt;/code&gt; (run from same dir, &lt;code&gt;.aria2&lt;/code&gt; present)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch from URL list&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aria2c -i urls.txt --save-session=aria2-session.txt -c -x8 -s8 -d ~/models&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resume batch via session&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aria2c --input-file=aria2-session.txt --save-session=aria2-session.txt -c&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New signed URL for partial&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aria2c -c --auto-file-renaming=false -d DIR -o FILE "NEW_URL"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retain completed in session&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;--force-save=true&lt;/code&gt;; prune session file manually as needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 2 — aria2 command reference for Jetson AGX Orin&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Practical outcomes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Established a correct and efficient aria2 command pattern for large Hugging Face model downloads on Jetson AGX Orin, including proper flag ordering for &lt;code&gt;-x&lt;/code&gt;, &lt;code&gt;-s&lt;/code&gt;, and &lt;code&gt;-k&lt;/code&gt;, and the mandatory use of &lt;code&gt;-o&lt;/code&gt; to avoid hash-like filenames from object storage redirects.&lt;/li&gt;
&lt;li&gt;Documented resume mechanics using &lt;code&gt;.aria2&lt;/code&gt; control files, the &lt;code&gt;-c&lt;/code&gt; flag, and &lt;code&gt;--always-resume=true&lt;/code&gt;, with explicit guidance on what to do when the control file is missing or the data file has been renamed.&lt;/li&gt;
&lt;li&gt;Provided a signed-URL resume pattern using &lt;code&gt;--auto-file-renaming=false&lt;/code&gt; that handles expiring links without restarting partial downloads.&lt;/li&gt;
&lt;li&gt;Defined batch download patterns using URL list files and persistent session files (&lt;code&gt;--save-session&lt;/code&gt;, &lt;code&gt;--input-file&lt;/code&gt;) for multi-shard model repositories such as safetensors split sets.&lt;/li&gt;
&lt;li&gt;Captured Jetson-specific defaults covering file allocation modes (&lt;code&gt;none&lt;/code&gt; vs. &lt;code&gt;falloc&lt;/code&gt;), directory hygiene, concurrency tuning, disk space monitoring, and error recovery for long-running downloads.&lt;/li&gt;
&lt;li&gt;Identified the root cause of HTTP 403 (&lt;code&gt;errorCode=22&lt;/code&gt;) errors during large Hugging Face downloads as expiring signed URLs and per-connection rate limiting, with a prescriptive mitigation using reduced concurrency and extended retry delays.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  10. Conclusions and recommendations
&lt;/h2&gt;

&lt;p&gt;aria2 reliably saturates available bandwidth for large LLM downloads on Jetson AGX Orin and handles interruptions gracefully, provided three practices are consistently followed: always pass &lt;code&gt;-c&lt;/code&gt; for large files; use a stable output directory so &lt;code&gt;.aria2&lt;/code&gt; control files remain co-located with their data files; and set &lt;code&gt;--max-tries=0&lt;/code&gt; so aria2 recovers from transient failures without manual intervention.&lt;/p&gt;

&lt;p&gt;For daily workflows, standardize on one or two shell aliases (&lt;code&gt;aria2fast&lt;/code&gt;, &lt;code&gt;hfaria&lt;/code&gt;) and keep the download directory paths the same from run to run. When downloading from Hugging Face, specify &lt;code&gt;-o&lt;/code&gt; with an explicit filename, as the platform's object storage backend assigns opaque hash-like keys that aria2 will use as the filename in the absence of an explicit override. This eliminates the most common source of filename confusion and simplifies subsequent resume operations.&lt;/p&gt;

&lt;p&gt;When troubleshooting stubborn failures, apply interventions in order: reduce concurrency first, then increase retry delay, then inspect disk and filesystem health with &lt;code&gt;dmesg&lt;/code&gt; and &lt;code&gt;journalctl&lt;/code&gt;. Delete a partial file and its &lt;code&gt;.aria2&lt;/code&gt; sidecar only as a last resort, and only for the specific file that is failing — not for the entire batch. If a transient Hugging Face backend outage is causing persistent 403 errors near completion, the most effective response is to wait and retry rather than to restart a near-complete multi-gigabyte download from scratch.&lt;/p&gt;

</description>
      <category>jetson</category>
      <category>orinagx</category>
      <category>aria2</category>
      <category>linux</category>
    </item>
    <item>
      <title>Network Optimization Tutorial For NVIDIA Jetson AGX Orin 64 GB</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 22:14:49 +0000</pubDate>
      <link>https://forem.com/vonusma/network-optimization-tutorial-for-nvidia-jetson-agx-orin-64-gb-a9e</link>
      <guid>https://forem.com/vonusma/network-optimization-tutorial-for-nvidia-jetson-agx-orin-64-gb-a9e</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This tutorial documents a systematic approach to network performance optimization on an NVIDIA Jetson AGX Orin Developer Kit 64 GB running Ubuntu 22.04.5 LTS (aarch64) with JetPack 6.2.2, CUDA 12.6, cuDNN 9.3.0, OpenCV 4.8.0, and TensorRT 10.3.0.30. The procedure covers kernel TCP buffer tuning, MTU adjustment on the &lt;code&gt;eno1&lt;/code&gt; wired interface, APT parallel download configuration, and aria2 multi-connection download tooling. All steps include pre-change backups and a dedicated revert procedure.&lt;/p&gt;

&lt;p&gt;The guide is structured as a production-oriented, step-by-step procedure rather than a reference summary. It includes a consolidated interactive Bash script that auto-detects the primary wired interface (&lt;code&gt;eno1&lt;/code&gt; or &lt;code&gt;eth0&lt;/code&gt;), backs up affected configuration files before any changes, and applies each optimization only with explicit operator consent. Troubleshooting guidance is included for the &lt;code&gt;RTNETLINK answers: Device or resource busy&lt;/code&gt; error encountered during MTU changes on Jetson hardware.&lt;/p&gt;

&lt;p&gt;System administrators and Edge AI developers with intermediate Linux experience will benefit from this document when preparing a Jetson AGX Orin for workloads that involve frequent large model downloads, frequent package updates, or sustained high-bandwidth data transfers. The tutorial assumes shell access with &lt;code&gt;sudo&lt;/code&gt; privileges and familiarity with a terminal text editor such as &lt;code&gt;nano&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Prerequisites and Environment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Hardware and Software Specifications
&lt;/h3&gt;

&lt;p&gt;The following table describes the environment in which all commands were validated. Applying these optimizations on a different JetPack release or kernel version may require adjusting parameter values.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Hardware&lt;/td&gt;
&lt;td&gt;NVIDIA Jetson AGX Orin Developer Kit 64 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS&lt;/td&gt;
&lt;td&gt;Ubuntu 22.04.5 LTS aarch64 (L4T 36.5.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JetPack&lt;/td&gt;
&lt;td&gt;nvidia-jetpack 6.2.2+b24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CUDA&lt;/td&gt;
&lt;td&gt;12.6.68&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cuDNN&lt;/td&gt;
&lt;td&gt;9.3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenCV&lt;/td&gt;
&lt;td&gt;4.8.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TensorRT&lt;/td&gt;
&lt;td&gt;10.3.0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel&lt;/td&gt;
&lt;td&gt;5.15.185-tegra&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1 — Validated hardware and software environment&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 Required Permissions and Tools
&lt;/h3&gt;

&lt;p&gt;All system configuration steps require &lt;code&gt;sudo&lt;/code&gt; access. The following tools are used throughout the tutorial and are available by default on JetPack installations: &lt;code&gt;nano&lt;/code&gt;, &lt;code&gt;cp&lt;/code&gt;, &lt;code&gt;sysctl&lt;/code&gt;, &lt;code&gt;ip&lt;/code&gt;, &lt;code&gt;apt&lt;/code&gt;, &lt;code&gt;ping&lt;/code&gt;, and &lt;code&gt;bash&lt;/code&gt;. The &lt;code&gt;aria2c&lt;/code&gt; binary is installed in Section 6.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.3 Primary Wired Interface Name
&lt;/h3&gt;

&lt;p&gt;On the Jetson AGX Orin Developer Kit, the wired Ethernet interface is exposed as &lt;code&gt;eno1&lt;/code&gt; under predictable network interface naming (udev rules). Some custom images or older configurations may still use &lt;code&gt;eth0&lt;/code&gt;. Where commands target a specific interface, both names are provided. The &lt;code&gt;ip link&lt;/code&gt; command identifies the correct name on any given system.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Pre-Change Backup Procedure
&lt;/h2&gt;

&lt;p&gt;Before modifying any system configuration file, create backups with the fixed &lt;code&gt;.backup-pre-netopt&lt;/code&gt; suffix. This convention makes backup files easy to identify and is expected by the revert commands in Section 9.&lt;/p&gt;

&lt;p&gt;Run the following once before proceeding to any subsequent section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Back up the sysctl configuration&lt;/span&gt;
&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /etc/sysctl.conf /etc/sysctl.conf.backup-pre-netopt

&lt;span class="c"&gt;# Back up the APT parallel config only if it already exists&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/apt/apt.conf.d/99parallel &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /etc/apt/apt.conf.d/99parallel /etc/apt/apt.conf.d/99parallel.backup-pre-netopt
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify the backup was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-lh&lt;/span&gt; /etc/sysctl.conf.backup-pre-netopt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The backup captures the unmodified state of &lt;code&gt;/etc/sysctl.conf&lt;/code&gt;. If the APT configuration file does not yet exist (first-time setup), no APT backup is created; the revert script handles this case by removing the file rather than restoring a backup.&lt;/p&gt;
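&lt;p&gt;Because the suffix is fixed, repeating the copy after changes have been applied would replace the pristine backup with the modified file. A minimal sketch of a guard against that is shown here; the helper name &lt;code&gt;backup_once&lt;/code&gt; is hypothetical and not part of the original procedure.&lt;/p&gt;

```shell
# Create FILE.backup-pre-netopt only if no backup exists yet.
# backup_once is a hypothetical helper name used for illustration.
backup_once() {
  src="$1"
  dst="$1.backup-pre-netopt"
  if [ -e "$dst" ]; then
    echo "keeping existing $dst"
  elif [ -f "$src" ]; then
    sudo cp -p "$src" "$dst"
    echo "created $dst"
  else
    echo "nothing to back up: $src"
  fi
}
```

&lt;p&gt;Calling &lt;code&gt;backup_once /etc/sysctl.conf&lt;/code&gt; a second time leaves the first backup untouched.&lt;/p&gt;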




&lt;h2&gt;
  
  
  3. Maximum Performance Mode
&lt;/h2&gt;

&lt;p&gt;Dynamic CPU and GPU frequency scaling can reduce network throughput indirectly by limiting the processing available to TCP stack operations, protocol encryption, and receive-side data handling. NVIDIA provides two utilities to force the Jetson into its highest power and clock configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nvpmodel &lt;span class="nt"&gt;-m&lt;/span&gt; 0
&lt;span class="nb"&gt;sudo &lt;/span&gt;jetson_clocks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;nvpmodel -m 0&lt;/code&gt; selects power model 0, which is the maximum performance profile on Jetson AGX Orin.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;jetson_clocks&lt;/code&gt; locks CPU, GPU, and memory frequencies to their maximum values and disables dynamic frequency scaling.&lt;/li&gt;
&lt;/ul&gt;

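&lt;p&gt;Both commands take effect immediately. The power mode selected with &lt;code&gt;nvpmodel&lt;/code&gt; is normally retained across reboots, but the clock locks applied by &lt;code&gt;jetson_clocks&lt;/code&gt; are not. If your workload restarts after system reboots, run &lt;code&gt;jetson_clocks&lt;/code&gt; from a startup service or manually before beginning large download or inference sessions.&lt;/p&gt;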
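&lt;p&gt;To confirm which power mode and clock state are active, for example after a reboot, both utilities can be queried. This is a small sketch; the wrapper name &lt;code&gt;show_power_state&lt;/code&gt; is an assumption made for illustration, while &lt;code&gt;nvpmodel -q&lt;/code&gt; and &lt;code&gt;jetson_clocks --show&lt;/code&gt; are the query forms of the two tools.&lt;/p&gt;

```shell
# Print the active power mode and the current clock state.
# show_power_state is a hypothetical wrapper name.
show_power_state() {
  sudo nvpmodel -q
  sudo jetson_clocks --show
}
```

&lt;p&gt;Run &lt;code&gt;show_power_state&lt;/code&gt; before a long transfer to check that the expected mode is still in effect.&lt;/p&gt;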




&lt;h2&gt;
  
  
  4. Kernel Network Parameter Tuning
&lt;/h2&gt;

&lt;p&gt;Kernel-level TCP parameters govern how much memory is allocated to socket buffers and how the TCP stack behaves under high-throughput conditions. The defaults in a stock Ubuntu image are conservative and were not tuned for sustained high-bandwidth transfers of the kind required when pulling large AI model checkpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Edit sysctl Configuration
&lt;/h3&gt;

&lt;p&gt;Open the sysctl configuration file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/sysctl.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Append the following block at the end of the file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Disable IPv6 (optional — avoids latency in name resolution on IPv4-only networks)&lt;/span&gt;
net.ipv6.conf.all.disable_ipv6 &lt;span class="o"&gt;=&lt;/span&gt; 1
net.ipv6.conf.default.disable_ipv6 &lt;span class="o"&gt;=&lt;/span&gt; 1
net.ipv6.conf.lo.disable_ipv6 &lt;span class="o"&gt;=&lt;/span&gt; 1

&lt;span class="c"&gt;# Increase TCP buffers for high-speed downloads&lt;/span&gt;
net.core.rmem_max &lt;span class="o"&gt;=&lt;/span&gt; 16777216
net.core.wmem_max &lt;span class="o"&gt;=&lt;/span&gt; 16777216
net.ipv4.tcp_rmem &lt;span class="o"&gt;=&lt;/span&gt; 4096 87380 16777216
net.ipv4.tcp_wmem &lt;span class="o"&gt;=&lt;/span&gt; 4096 65536 16777216
net.ipv4.tcp_slow_start_after_idle &lt;span class="o"&gt;=&lt;/span&gt; 0
net.ipv4.tcp_window_scaling &lt;span class="o"&gt;=&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The IPv6 disable block is optional. Omit those three lines if the system connects to IPv6-only or dual-stack services. The interactive script in Section 7 prompts for this choice at runtime.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  4.2 Apply the Configuration
&lt;/h3&gt;

&lt;p&gt;Save the file and apply all settings immediately without rebooting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output lists each applied parameter and its new value. The settings raise the maximum socket receive and send buffer sizes to 16 MiB (16777216 bytes), set the minimum, default, and maximum TCP buffer sizes used by autotuning, disable the return to slow start after a connection has been idle, and keep TCP window scaling enabled.&lt;/p&gt;
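&lt;p&gt;One common way to judge whether a 16777216-byte ceiling is enough is the bandwidth-delay product: link speed multiplied by round-trip time gives the number of bytes that must be in flight to keep the link full. The sketch below works through that arithmetic; the function name &lt;code&gt;bdp_bytes&lt;/code&gt; and the example speeds and delays are assumptions, not measurements from the validated system.&lt;/p&gt;

```shell
# Bandwidth-delay product in bytes.
# $1 = link speed in Mbit/s, $2 = round-trip time in milliseconds.
# bdp_bytes is a hypothetical helper; the inputs below are example values.
bdp_bytes() {
  echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

bdp_bytes 1000 100   # 1 Gbit/s at 100 ms RTT
bdp_bytes 1000 20    # 1 Gbit/s at 20 ms RTT
```

&lt;p&gt;The first call prints 12500000, which fits under the configured maximum. Compare the result for your own link with the value reported by &lt;code&gt;sysctl -n net.core.rmem_max&lt;/code&gt;.&lt;/p&gt;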




&lt;h2&gt;
  
  
  5. MTU Adjustment for Wired Interface
&lt;/h2&gt;

&lt;p&gt;The Maximum Transmission Unit (MTU) controls the largest payload that can be sent in a single Ethernet frame without IP fragmentation. A mismatch between the Jetson's MTU and the network path MTU can cause silent retransmissions and degraded throughput. On some networks, setting MTU to 1450 bytes avoids fragmentation caused by VPN or tunnel encapsulation overhead.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 Identify the Active Wired Interface
&lt;/h3&gt;

&lt;p&gt;List all network interfaces and their current MTU values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ip &lt;span class="nb"&gt;link&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A typical output on Jetson AGX Orin looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1: lo: &amp;lt;LOOPBACK,UP,LOWER_UP&amp;gt; mtu 65536 ...
3: wlP1p1s0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 ...
5: eno1: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 4X:bX:4X:4X:6X:XX brd ff:ff:ff:ff:ff:ff
6: l4tbr0: &amp;lt;BROADCAST,MULTICAST&amp;gt; mtu 1500 ...
7: usb0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 ...
8: usb1: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 ...
9: docker0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



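&lt;p&gt;The active wired interface is the one that reports &lt;code&gt;state UP&lt;/code&gt; with the &lt;code&gt;LOWER_UP&lt;/code&gt; flag (link detected) and a &lt;code&gt;link/ether&lt;/code&gt; hardware address. Interfaces flagged &lt;code&gt;NO-CARRIER&lt;/code&gt; have no link. On Jetson AGX Orin the active wired interface is typically &lt;code&gt;eno1&lt;/code&gt;. Only change the MTU of the interface that carries the traffic being optimized.&lt;/p&gt;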
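&lt;p&gt;When it is unclear which interface carries the traffic, the default route is a useful hint. The following sketch pulls the interface name out of a route line; &lt;code&gt;default_iface&lt;/code&gt; is a hypothetical helper name, and it assumes the usual &lt;code&gt;ip route&lt;/code&gt; output in which the name follows the word &lt;code&gt;dev&lt;/code&gt;.&lt;/p&gt;

```shell
# Extract the interface name that follows "dev" in a route line read from stdin.
# default_iface is a hypothetical helper name.
default_iface() {
  sed -n 's/.* dev \([^ ]*\).*/\1/p' | head -n 1
}

# Usage on the Jetson:
#   ip route show default | default_iface
```

&lt;p&gt;If the printed name is neither &lt;code&gt;eno1&lt;/code&gt; nor &lt;code&gt;eth0&lt;/code&gt;, use that name in the commands that follow.&lt;/p&gt;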

&lt;h3&gt;
  
  
  5.2 Change MTU at Runtime
&lt;/h3&gt;

&lt;p&gt;For &lt;code&gt;eno1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eno1 mtu 1450
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For systems using &lt;code&gt;eth0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eth0 mtu 1450
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This change applies immediately but resets to 1500 on reboot. For a persistent configuration, use NetworkManager (Section 5.3, Option 3) or Netplan (Section 5.4).&lt;/p&gt;
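&lt;p&gt;One way to check that a given MTU fits the network path is to send ICMP probes with fragmentation prohibited. The payload must be the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). This is a sketch under those assumptions; the helper names &lt;code&gt;icmp_payload&lt;/code&gt; and &lt;code&gt;probe_mtu&lt;/code&gt; are hypothetical, and the target host is whatever gateway or server you want to test against.&lt;/p&gt;

```shell
# ICMP payload size that exactly fills a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload() {
  echo $(( $1 - 28 ))
}

# Send three probes that must not be fragmented, sized for the given MTU.
# $1 = target host, $2 = MTU to test (defaults to 1450).
probe_mtu() {
  ping -M do -c 3 -s "$(icmp_payload "${2:-1450}")" "$1"
}
```

&lt;p&gt;For example, &lt;code&gt;probe_mtu 192.168.1.1 1450&lt;/code&gt; (the address is a placeholder) should receive replies if the path carries 1450-byte packets. A "message too long" error or a total loss of replies suggests that the tested size exceeds what the interface or path allows.&lt;/p&gt;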

&lt;h3&gt;
  
  
  5.3 Troubleshooting: "RTNETLINK answers: Device or resource busy"
&lt;/h3&gt;

&lt;p&gt;This error indicates that a higher-level service holds the interface or that the interface is a member of a bridge. It is common on Jetson because &lt;code&gt;eno1&lt;/code&gt; can be associated with the &lt;code&gt;l4tbr0&lt;/code&gt; bridge used for USB networking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1 — Bring the interface down, change MTU, then bring it back up:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eno1 down
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eno1 mtu 1450
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eno1 up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;eno1&lt;/code&gt; with &lt;code&gt;eth0&lt;/code&gt; if that is the active interface. Connectivity is interrupted briefly while the interface is down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2 — Change MTU on the bridge instead of the physical interface:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If traffic passes through &lt;code&gt;l4tbr0&lt;/code&gt;, apply the MTU to the bridge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev l4tbr0 mtu 1450
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify connectivity immediately after this change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 3 — Use the network manager to apply the change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For NetworkManager-managed connections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nmcli connection show
nmcli connection modify &lt;span class="s2"&gt;"&amp;lt;connection-name&amp;gt;"&lt;/span&gt; 802-3-ethernet.mtu 1450
nmcli connection down &lt;span class="s2"&gt;"&amp;lt;connection-name&amp;gt;"&lt;/span&gt;
nmcli connection up &lt;span class="s2"&gt;"&amp;lt;connection-name&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Netplan-managed interfaces, edit the relevant YAML file (see Section 5.4) and apply.&lt;/p&gt;

&lt;p&gt;If none of these options resolve the error without disrupting connectivity, leave the MTU at 1500 and focus on the kernel TCP tuning in Section 4 and the aria2 tooling in Section 6.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.4 Persistent MTU via Netplan
&lt;/h3&gt;

&lt;p&gt;To make the MTU change survive reboots, edit the Netplan configuration file for the interface. The file is typically located at &lt;code&gt;/etc/netplan/01-netcfg.yaml&lt;/code&gt; or a similarly named file in &lt;code&gt;/etc/netplan/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/netplan/01-netcfg.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add or update the &lt;code&gt;mtu&lt;/code&gt; key for the interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;network:
  ethernets:
    eno1:
      mtu: 1450
      dhcp4: true
  version: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;netplan apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6. APT Download Optimization and aria2 Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 APT Parallel Download Configuration
&lt;/h3&gt;

&lt;p&gt;By default APT opens one connection per target host and fetches the files queued for that host one after another, which can underutilize available bandwidth when many packages come from a single mirror. A drop-in configuration file in &lt;code&gt;/etc/apt/apt.conf.d/&lt;/code&gt; can adjust this behavior without modifying the main APT configuration.&lt;/p&gt;

&lt;p&gt;Create the configuration snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/apt/apt.conf.d/99parallel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the following content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Acquire::Languages "none";
Acquire::Queue-Mode "access";
Acquire::Retries "3";
Acquire::http::Pipeline-Depth "5";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Acquire::Languages "none"&lt;/code&gt; suppresses the download of translated package description files, which are rarely needed on a headless Edge AI system.&lt;/li&gt;
&lt;li&gt;
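&lt;code&gt;Acquire::Queue-Mode "access"&lt;/code&gt; tells APT to open one connection per URI type (for example &lt;code&gt;http&lt;/code&gt;) instead of the default of one connection per target host, which limits the number of simultaneous connections.&lt;/li&gt;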
&lt;li&gt;
&lt;code&gt;Acquire::Retries "3"&lt;/code&gt; retries failed downloads up to three times before failing.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Acquire::http::Pipeline-Depth "5"&lt;/code&gt; sends up to five HTTP requests in flight simultaneously on persistent connections, improving throughput on reliable links.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The changes take effect on the next &lt;code&gt;sudo apt update&lt;/code&gt; or &lt;code&gt;sudo apt upgrade&lt;/code&gt; invocation.&lt;/p&gt;
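&lt;p&gt;To confirm that APT has picked up the drop-in file, the effective configuration can be inspected with &lt;code&gt;apt-config dump&lt;/code&gt;. The filter below is a small sketch; the function name &lt;code&gt;show_acquire_settings&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# Keep only the four Acquire settings from an apt-config dump read on stdin.
# show_acquire_settings is a hypothetical helper name.
show_acquire_settings() {
  grep -E '^Acquire::(Languages|Queue-Mode|Retries|http::Pipeline-Depth)'
}

# Usage:
#   apt-config dump | show_acquire_settings
```

&lt;p&gt;Each of the four settings should appear with the value written to &lt;code&gt;99parallel&lt;/code&gt;, unless a file that sorts later in &lt;code&gt;/etc/apt/apt.conf.d/&lt;/code&gt; overrides it.&lt;/p&gt;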

&lt;h3&gt;
  
  
  6.2 aria2 for Large File Downloads
&lt;/h3&gt;

&lt;p&gt;For AI model checkpoints, dataset archives, or container images that exceed several gigabytes, &lt;code&gt;aria2&lt;/code&gt; opens multiple parallel connections to the same server and splits the file into segments. This approach can saturate available bandwidth more effectively than single-threaded tools such as &lt;code&gt;wget&lt;/code&gt; or &lt;code&gt;curl&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Install aria2:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;aria2 &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Download a large file using 16 parallel connections and 16 segments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aria2c &lt;span class="nt"&gt;-x&lt;/span&gt; 16 &lt;span class="nt"&gt;-s&lt;/span&gt; 16 &lt;span class="s2"&gt;"URL_TO_LARGE_FILE"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-x 16&lt;/code&gt; sets the maximum number of simultaneous connections per server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-s 16&lt;/code&gt; splits the download into 16 segments, each fetched by a separate connection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reduce both values on congested networks or when the target server enforces per-IP connection limits.&lt;/p&gt;
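&lt;p&gt;A more conservative invocation might look like the sketch below, which halves the connection count, resumes partial files with &lt;code&gt;-c&lt;/code&gt;, and names the target directory and output file explicitly with &lt;code&gt;-d&lt;/code&gt; and &lt;code&gt;-o&lt;/code&gt;. The wrapper name &lt;code&gt;fetch_large&lt;/code&gt; and the choice of 8 connections are assumptions for illustration.&lt;/p&gt;

```shell
# Resume-friendly download with moderate parallelism.
# $1 = URL, $2 = output filename, $3 = target directory (defaults to the current one).
# fetch_large is a hypothetical wrapper name; 8 connections is an example value.
fetch_large() {
  aria2c -c -x 8 -s 8 -d "${3:-.}" -o "$2" "$1"
}
```

&lt;p&gt;Call it as &lt;code&gt;fetch_large "URL_TO_LARGE_FILE" model.bin&lt;/code&gt;. Rerunning the same command from the same directory resumes the download, because the &lt;code&gt;.aria2&lt;/code&gt; control file stays next to the partial data.&lt;/p&gt;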




&lt;h2&gt;
  
  
  7. Consolidated Automation Script
&lt;/h2&gt;

&lt;p&gt;The interactive script below consolidates all optimization steps into a single file. It detects the primary wired interface automatically, creates backups before any changes, and prompts for confirmation before each optimization section. IPv6 disabling and APT tuning are presented as optional to preserve compatibility with environments that depend on those behaviors.&lt;/p&gt;

&lt;p&gt;Create the script file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nano ~/jetson_network_opt.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste the following content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== Jetson Network Optimization Script ==="&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"This script will:"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Backup /etc/sysctl.conf and /etc/apt/apt.conf.d/99parallel (if present)"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Optionally tune kernel TCP parameters"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Optionally disable IPv6"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Optionally adjust MTU for the primary wired interface"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Optionally optimize APT downloads"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  - Optionally install aria2"&lt;/span&gt;
&lt;span class="nb"&gt;echo

read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Continue? [y/N]: "&lt;/span&gt; CONTINUE
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CONTINUE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Aborting."&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== Detecting primary wired interface =="&lt;/span&gt;

&lt;span class="nv"&gt;PRIMARY_IF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;ip &lt;span class="nb"&gt;link &lt;/span&gt;show eno1 &amp;amp;&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nv"&gt;PRIMARY_IF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"eno1"&lt;/span&gt;
&lt;span class="k"&gt;elif &lt;/span&gt;ip &lt;span class="nb"&gt;link &lt;/span&gt;show eth0 &amp;amp;&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nv"&gt;PRIMARY_IF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"eth0"&lt;/span&gt;
&lt;span class="k"&gt;fi

if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Warning: neither eno1 nor eth0 detected. You may need to edit this script to use your interface name."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Primary wired interface detected: &lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 1) Backing up configuration files =="&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/sysctl.conf &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
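  &lt;/span&gt;&lt;span class="c"&gt;# -n keeps an existing backup so a re-run cannot overwrite the pristine copy&lt;/span&gt;
  &lt;span class="nb"&gt;sudo cp&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; /etc/sysctl.conf /etc/sysctl.conf.backup-pre-netopt
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup: /etc/sysctl.conf.backup-pre-netopt is in place."&lt;/span&gt;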
&lt;span class="k"&gt;fi

if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/apt/apt.conf.d/99parallel &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="c"&gt;# -n keeps an existing backup so a re-run cannot overwrite it&lt;/span&gt;
  &lt;span class="nb"&gt;sudo cp&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; /etc/apt/apt.conf.d/99parallel /etc/apt/apt.conf.d/99parallel.backup-pre-netopt
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Backup: /etc/apt/apt.conf.d/99parallel.backup-pre-netopt is in place."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 2) Kernel TCP parameter tuning =="&lt;/span&gt;

&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Apply TCP buffer and window tuning in /etc/sysctl.conf? [y/N]: "&lt;/span&gt; APPLY_TCP
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$APPLY_TCP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'cat &amp;gt;&amp;gt; /etc/sysctl.conf &amp;lt;&amp;lt;EOF

# Jetson network optimization - TCP buffers
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_window_scaling = 1
EOF'&lt;/span&gt;
  &lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"TCP parameters applied."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Skipping TCP parameter tuning."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 3) IPv6 behavior =="&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"If you rely on IPv6 (e.g. IPv6-only or dual-stack networks), DO NOT disable it."&lt;/span&gt;
&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Disable IPv6 via /etc/sysctl.conf? [y/N]: "&lt;/span&gt; DISABLE_IPV6
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DISABLE_IPV6&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'cat &amp;gt;&amp;gt; /etc/sysctl.conf &amp;lt;&amp;lt;EOF

# Jetson network optimization - disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF'&lt;/span&gt;
  &lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"IPv6 disabled (sysctl)."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Keeping IPv6 enabled."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 4) MTU adjustment for primary wired interface =="&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"No primary interface detected (eno1/eth0). Skipping MTU change."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;ip &lt;span class="nb"&gt;link &lt;/span&gt;show &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Set MTU 1450 on &lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt; (runtime only, reset on reboot)? [y/N]: "&lt;/span&gt; SET_MTU
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SET_MTU&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; mtu 1450&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Failed to set MTU on &lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt; (device or resource busy?)."&lt;/span&gt;
      &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"You may need to adjust MTU via NetworkManager, Netplan, or a bridge (e.g. l4tbr0)."&lt;/span&gt;
    &lt;span class="k"&gt;else
      &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"MTU set to 1450 on &lt;/span&gt;&lt;span class="nv"&gt;$PRIMARY_IF&lt;/span&gt;&lt;span class="s2"&gt;."&lt;/span&gt;
    &lt;span class="k"&gt;fi
  else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Skipping MTU change."&lt;/span&gt;
  &lt;span class="k"&gt;fi
fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 5) APT optimization =="&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"This will create or overwrite /etc/apt/apt.conf.d/99parallel."&lt;/span&gt;
&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Apply APT optimization? [y/N]: "&lt;/span&gt; APPLY_APT
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$APPLY_APT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'cat &amp;gt; /etc/apt/apt.conf.d/99parallel &amp;lt;&amp;lt;EOF
Acquire::Languages "none";
Acquire::Queue-Mode "access";
Acquire::Retries "3";
Acquire::http::Pipeline-Depth "5";
EOF'&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"APT optimization applied."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Skipping APT optimization."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== 6) aria2 installation =="&lt;/span&gt;

&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-rp&lt;/span&gt; &lt;span class="s2"&gt;"Install aria2 for multi-connection downloads? [y/N]: "&lt;/span&gt; INSTALL_ARIA2
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INSTALL_ARIA2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^[Yy]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
  &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; aria2
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"aria2 installed."&lt;/span&gt;
&lt;span class="k"&gt;else
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Skipping aria2 installation."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"All selected steps completed."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make the script executable and run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x ~/jetson_network_opt.sh
~/jetson_network_opt.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script skips any step the operator declines and continues with the next one. Interface detection runs once at startup and the result is reused for MTU operations. If neither &lt;code&gt;eno1&lt;/code&gt; nor &lt;code&gt;eth0&lt;/code&gt; is present, the MTU section is skipped automatically with a diagnostic message.&lt;/p&gt;
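&lt;p&gt;The interface detection described above can be sketched as follows. This is a minimal illustration, not the exact code from the script: the function name &lt;code&gt;detect_primary_if&lt;/code&gt; and the &lt;code&gt;SYSFS_NET&lt;/code&gt; override are hypothetical and exist only so the logic can be tried off-device.&lt;/p&gt;

```shell
# Hypothetical helper: print the first candidate interface that exists.
# SYSFS_NET is an assumed override so the check can be tested off-device.
detect_primary_if() {
  local net_dir="${SYSFS_NET:-/sys/class/net}"
  local candidate
  for candidate in eno1 eth0; do
    if [ -e "$net_dir/$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# An empty result means neither interface is present, so MTU steps are skipped.
PRIMARY_IF="$(detect_primary_if || true)"
```

&lt;p&gt;Preferring &lt;code&gt;eno1&lt;/code&gt; and falling back to &lt;code&gt;eth0&lt;/code&gt; matches the order the tutorial uses for MTU changes.&lt;/p&gt;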




&lt;h2&gt;
  
  
  8. Reverting All Changes
&lt;/h2&gt;

&lt;p&gt;If connectivity degrades or system behavior changes unexpectedly after applying these optimizations, revert to the pre-optimization state using the backups created in Section 2.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.1 Restore sysctl and APT Configuration
&lt;/h3&gt;

&lt;p&gt;Run the following to restore both configuration files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Restore sysctl configuration if backup exists&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/sysctl.conf.backup-pre-netopt &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /etc/sysctl.conf.backup-pre-netopt /etc/sysctl.conf
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Restored /etc/sysctl.conf from backup."&lt;/span&gt;
  &lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Restore APT optimization file if backup exists, otherwise remove it&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/apt/apt.conf.d/99parallel.backup-pre-netopt &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /etc/apt/apt.conf.d/99parallel.backup-pre-netopt /etc/apt/apt.conf.d/99parallel
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Restored /etc/apt/apt.conf.d/99parallel from backup."&lt;/span&gt;
&lt;span class="k"&gt;else
  if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/apt/apt.conf.d/99parallel &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;sudo rm&lt;/span&gt; /etc/apt/apt.conf.d/99parallel
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Removed /etc/apt/apt.conf.d/99parallel created by optimization."&lt;/span&gt;
  &lt;span class="k"&gt;fi
fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;sysctl -p&lt;/code&gt; call within the restore block immediately reloads the original kernel parameters without requiring a reboot.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.2 Revert MTU
&lt;/h3&gt;

&lt;p&gt;The MTU change is runtime-only and resets automatically on the next reboot. To revert it immediately without rebooting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eno1 mtu 1500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or for &lt;code&gt;eth0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;dev eth0 mtu 1500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
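&lt;p&gt;To confirm which MTU is active before and after the revert, the value can be read straight from sysfs. The sketch below assumes a hypothetical helper named &lt;code&gt;current_mtu&lt;/code&gt;; the &lt;code&gt;SYSFS_NET&lt;/code&gt; override is likewise an illustration aid, not part of the original script.&lt;/p&gt;

```shell
# Hypothetical helper: print the current MTU of an interface from sysfs.
current_mtu() {
  local ifname="$1"
  local net_dir="${SYSFS_NET:-/sys/class/net}"
  cat "$net_dir/$ifname/mtu"
}

# Usage on the device:
#   current_mtu eno1
```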



&lt;h3&gt;
  
  
  8.3 Remove aria2
&lt;/h3&gt;

&lt;p&gt;If &lt;code&gt;aria2&lt;/code&gt; was installed solely for this workflow and is no longer needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt remove &lt;span class="nt"&gt;-y&lt;/span&gt; aria2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After reverting, re-run the connectivity verification in Section 9 to confirm normal operation.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Practical Outcomes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Maximum performance mode:&lt;/strong&gt; &lt;code&gt;nvpmodel -m 0&lt;/code&gt; and &lt;code&gt;jetson_clocks&lt;/code&gt; select the maximum power profile and lock CPU and GPU clocks at their highest frequencies, which removes dynamic frequency scaling as a source of variance during sustained network activity. Both commands must be re-run after reboot unless integrated into a startup service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved TCP buffer sizing:&lt;/strong&gt; Kernel parameters raise socket buffer limits to 16 MB and disable the slow-start penalty after idle periods, measurably improving throughput on high-bandwidth links carrying large file transfers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional IPv6 control:&lt;/strong&gt; The IPv6 disable block is presented as an explicit choice rather than a default, preserving compatibility with dual-stack and IPv6-only environments. The interactive script enforces this distinction at runtime.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Correct interface targeting:&lt;/strong&gt; MTU changes target &lt;code&gt;eno1&lt;/code&gt; by default on Jetson AGX Orin hardware, with automatic fallback to &lt;code&gt;eth0&lt;/code&gt; in the automation script. Three resolution paths for the &lt;code&gt;RTNETLINK answers: Device or resource busy&lt;/code&gt; error are documented and tested.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;APT efficiency:&lt;/strong&gt; The &lt;code&gt;99parallel&lt;/code&gt; drop-in configuration reduces unnecessary package list downloads, enables HTTP pipelining, and adds retry resilience without modifying core APT behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-connection downloads:&lt;/strong&gt; &lt;code&gt;aria2c -x 16 -s 16&lt;/code&gt; saturates available bandwidth when downloading large AI model archives from servers that permit multiple concurrent connections. The tool installs and removes cleanly via APT.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safe configuration management:&lt;/strong&gt; Backups with the &lt;code&gt;.backup-pre-netopt&lt;/code&gt; suffix and a dedicated revert procedure reduce the risk of persistent misconfiguration. All changes can be undone without rebooting; the runtime MTU change also resets on its own at the next reboot.&lt;/li&gt;
&lt;/ul&gt;
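&lt;p&gt;As an illustration of the multi-connection download item above, a small wrapper around &lt;code&gt;aria2c&lt;/code&gt; might look like the sketch below. The function name &lt;code&gt;fetch_model&lt;/code&gt; and the URL in the usage comment are placeholders, not values from this tutorial.&lt;/p&gt;

```shell
# Hypothetical helper: download one URL with up to 16 connections.
# -x sets the per-server connection limit, -s the number of splits,
# and -d the directory the file is saved to.
fetch_model() {
  local url="$1"
  local out_dir="${2:-.}"
  aria2c -x 16 -s 16 -d "$out_dir" "$url"
}

# Usage (placeholder URL, replace with a real model archive):
#   fetch_model "https://example.com/model.gguf" "$HOME/models"
```

&lt;p&gt;Servers that cap concurrent connections will simply use fewer than 16, so the flags act as an upper bound.&lt;/p&gt;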




&lt;h2&gt;
  
  
  10. Conclusion
&lt;/h2&gt;

&lt;p&gt;Applying the optimizations described in this tutorial to a Jetson AGX Orin 64 GB running JetPack 6.2.2 produces a measurable improvement in network throughput for Edge AI development workflows, particularly those involving repeated large model downloads and frequent JetPack package updates. The combination of maximum performance mode, expanded TCP buffers, interface-aware MTU adjustment, APT pipeline tuning, and aria2 multi-connection downloads addresses the principal bottlenecks encountered on this platform without requiring kernel rebuilds or third-party drivers.&lt;/p&gt;

&lt;p&gt;The backup and revert procedures in Sections 2 and 8, together with the interactive automation script in Section 7, reduce operational risk to a level appropriate for both development and production-adjacent systems. Each optimization is applied with explicit consent and can be reversed independently.&lt;/p&gt;

&lt;p&gt;For production deployments, consider encoding these settings into a configuration management tool such as Ansible, committing the Netplan YAML changes for persistent MTU configuration, and coordinating with the local network team to confirm that a 1450-byte MTU is appropriate for the network path in use. Document any deviations from the values shown here alongside the JetPack version in use, as kernel and L4T updates may change default TCP stack behavior.&lt;/p&gt;

</description>
      <category>jetson</category>
      <category>agxorin</category>
      <category>network</category>
    </item>
    <item>
      <title>Creating a 50 GB Swap File on Jetson AGX Orin (Root on NVMe)</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 17:59:02 +0000</pubDate>
      <link>https://forem.com/vonusma/creating-a-50-gb-swap-file-on-jetson-agx-orin-root-on-nvme-ijd</link>
      <guid>https://forem.com/vonusma/creating-a-50-gb-swap-file-on-jetson-agx-orin-root-on-nvme-ijd</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This document describes the process of creating, tuning, and managing a large swap file on an NVIDIA Jetson AGX Orin 64 GB running Ubuntu 22.04.5 LTS aarch64. The configuration is specifically optimized for running large language models (LLMs) alongside CUDA, cuDNN, and TensorRT by leveraging a fast NVMe SSD as the primary swap backing store.&lt;/p&gt;

&lt;p&gt;The implementation was validated using a 50 GB swap file configuration alongside existing zram layers. The procedure successfully extended the usable memory capacity, allowing for the deployment of larger models without triggering immediate Out-Of-Memory (OOM) errors, provided the storage-to-RAM paging latency is acceptable.&lt;/p&gt;

&lt;p&gt;This tutorial serves as a technical reference for advanced Jetson and Linux users. It provides a reproducible method for extending virtual memory on edge AI hardware to support demanding 34B–70B parameter models.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Hardware and Software Environment
&lt;/h2&gt;

&lt;p&gt;The target environment is an NVIDIA Jetson AGX Orin Developer Kit equipped with 64 GB of unified memory. The system runs Ubuntu 22.04.5 LTS on an aarch64 kernel (5.15.185-tegra). The installation includes JetPack 6.2.2, providing the necessary software stack for AI inference, including CUDA 12.6, cuDNN 9.3.0, and TensorRT 10.3.0.&lt;/p&gt;

&lt;p&gt;The primary storage for the swap file is the NVMe SSD, which serves as the root filesystem. This choice is critical for minimizing the performance penalty during memory paging operations.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hardware&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;NVIDIA Jetson AGX Orin Developer Kit 64 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ubuntu 22.04.5 LTS aarch64&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kernel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5.15.185-tegra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;64 GB unified memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JetPack&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6.2.2+b24 (nvidia-jetpack)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CUDA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;12.6 (nvcc 12.6.68)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cuDNN&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9.3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TensorRT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10.3.0.30-1+cuda12.5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1 — Jetson AGX Orin environment for swap configuration&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Swap Location Strategy
&lt;/h2&gt;

&lt;p&gt;Effective swap placement is determined by the throughput and endurance of the underlying storage media. On the Jetson AGX Orin, the system utilizes eMMC for the boot partition and an NVMe SSD for the primary root filesystem.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Approx Speed&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NVMe SSD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2000 MB/s&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Best&lt;/strong&gt; — primary location for swap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eMMC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~400 MB/s&lt;/td&gt;
&lt;td&gt;Secondary fallback; higher wear risk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;USB Drive&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~100 MB/s&lt;/td&gt;
&lt;td&gt;Not recommended due to high latency&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 2 — Recommended swap backing storage on Jetson AGX Orin&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For this configuration, the swap file is placed directly on the NVMe-backed root filesystem (&lt;code&gt;/&lt;/code&gt;) at &lt;code&gt;/swapfile&lt;/code&gt;. This ensures the highest possible I/O performance for paging operations.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Step-by-Step Swap File Creation
&lt;/h2&gt;

&lt;p&gt;The following steps outline the allocation and initialization of a 50 GB swap file.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Check Devices and Free Space
&lt;/h3&gt;

&lt;p&gt;Before allocation, verify the available space on the target partition. The &lt;code&gt;lsblk&lt;/code&gt; command confirms the mount points, while &lt;code&gt;df -h&lt;/code&gt; verifies the capacity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List block devices and mount points&lt;/span&gt;
lsblk &lt;span class="nt"&gt;-o&lt;/span&gt; NAME,SIZE,TYPE,MOUNTPOINT,ROTA

&lt;span class="c"&gt;# Check free space on the root filesystem&lt;/span&gt;
&lt;span class="nb"&gt;df&lt;/span&gt; &lt;span class="nt"&gt;-h&lt;/span&gt; /
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The current configuration shows approximately 636 GB of available space on &lt;code&gt;/dev/nvme0n1p1&lt;/code&gt;, which is more than sufficient for a 50 GB allocation.&lt;/p&gt;
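&lt;p&gt;The free-space check can also be automated before allocating. The sketch below is one possible approach; the helper name &lt;code&gt;has_free_gb&lt;/code&gt; is hypothetical, and it treats 1 GB as 1024 × 1024 KiB to match how &lt;code&gt;fallocate -l 50G&lt;/code&gt; interprets its size.&lt;/p&gt;

```shell
# Hypothetical helper: succeed only if the filesystem holding the given path
# has at least the requested number of GiB available.
has_free_gb() {
  local path="$1"
  local need_gb="$2"
  local avail_kb
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
  avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

if has_free_gb / 50; then
  echo "Enough room for a 50 GB swap file on /."
else
  echo "Not enough free space on / for a 50 GB swap file."
fi
```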

&lt;h3&gt;
  
  
  3.2 Create the Swap File
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;fallocate&lt;/code&gt; utility is used to pre-allocate the file space efficiently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Allocate 50 GB for the swap file on the root filesystem&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;fallocate &lt;span class="nt"&gt;-l&lt;/span&gt; 50G /swapfile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Secure and Format the Swap File
&lt;/h3&gt;

&lt;p&gt;Security is paramount; the swap file must be restricted to root-only access to prevent sensitive data leakage from memory to disk.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Restrict permissions to root read/write only&lt;/span&gt;
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;600 /swapfile

&lt;span class="c"&gt;# Format the file as swap space&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;mkswap /swapfile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Enable the Swap File
&lt;/h3&gt;

&lt;p&gt;Once formatted, the swap file must be activated in the running kernel.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enable the swap file&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;swapon /swapfile

&lt;span class="c"&gt;# Verify active swap devices&lt;/span&gt;
swapon &lt;span class="nt"&gt;--show&lt;/span&gt;

&lt;span class="c"&gt;# Confirm memory and swap totals&lt;/span&gt;
free &lt;span class="nt"&gt;-h&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Making Swap Persistent Across Reboots
&lt;/h2&gt;

&lt;p&gt;To ensure the swap file is automatically re-enabled upon system restart, an entry must be added to the &lt;code&gt;/etc/fstab&lt;/code&gt; configuration file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Append the swap file definition to /etc/fstab&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'/swapfile none swap sw 0 0'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/fstab

&lt;span class="c"&gt;# Verify the entry exists&lt;/span&gt;
&lt;span class="nb"&gt;grep &lt;/span&gt;swap /etc/fstab
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
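&lt;p&gt;Because &lt;code&gt;tee -a&lt;/code&gt; appends unconditionally, running the command twice leaves a duplicate line in &lt;code&gt;/etc/fstab&lt;/code&gt;. A guarded variant is sketched below; the function name &lt;code&gt;ensure_swap_entry&lt;/code&gt; is hypothetical, and it takes the file as an argument so it can be tried on a copy first.&lt;/p&gt;

```shell
# Hypothetical helper: append the swap entry only if no /swapfile line exists.
# Run it as root against /etc/fstab, or against a copy for a dry run.
ensure_swap_entry() {
  local fstab="$1"
  if grep -q '^/swapfile[[:space:]]' "$fstab"; then
    echo "Entry already present in $fstab."
  else
    echo '/swapfile none swap sw 0 0' >> "$fstab"
    echo "Entry added to $fstab."
  fi
}

# Dry run on a copy first:
#   cp /etc/fstab /tmp/fstab.test
#   ensure_swap_entry /tmp/fstab.test
```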






&lt;h2&gt;
  
  
  5. Tuning Swappiness and zram for LLM Workloads
&lt;/h2&gt;

&lt;p&gt;Optimal performance for LLM inference requires tuning the kernel to prioritize physical RAM and the compressed &lt;code&gt;zram&lt;/code&gt; layer over the disk-backed swap file.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 Adjust Swappiness and Cache Pressure
&lt;/h3&gt;

&lt;p&gt;Lowering the &lt;code&gt;swappiness&lt;/code&gt; value instructs the kernel to avoid swapping pages to the NVMe SSD unless absolutely necessary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Apply settings immediately&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl vm.swappiness&lt;span class="o"&gt;=&lt;/span&gt;10
&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl vm.vfs_cache_pressure&lt;span class="o"&gt;=&lt;/span&gt;50

&lt;span class="c"&gt;# Persist the settings across reboots&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'vm.swappiness=10'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/sysctl.conf
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'vm.vfs_cache_pressure=50'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/sysctl.conf

&lt;span class="c"&gt;# Reload sysctl configuration&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Swappiness&lt;/th&gt;
&lt;th&gt;Behavior Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Swap only when absolutely out of RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;10&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Recommended&lt;/strong&gt; for LLM workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;60&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Typical Linux default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;100&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Very aggressive swapping&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 3 — Swappiness values and behavior for Jetson LLM use&lt;/em&gt;&lt;/p&gt;
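&lt;p&gt;To confirm that the values in effect match the ones just set, the live settings can be read from &lt;code&gt;/proc/sys/vm&lt;/code&gt;. This is a small sketch; the function name &lt;code&gt;show_vm_tuning&lt;/code&gt; and the &lt;code&gt;PROC_VM&lt;/code&gt; override are hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: print the two tunables as the running kernel sees them.
show_vm_tuning() {
  local proc_vm="${PROC_VM:-/proc/sys/vm}"
  echo "swappiness=$(cat "$proc_vm/swappiness")"
  echo "vfs_cache_pressure=$(cat "$proc_vm/vfs_cache_pressure")"
}

# Usage on the device:
#   show_vm_tuning
```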




&lt;h2&gt;
  
  
  6. Relationship Between zram and /swapfile
&lt;/h2&gt;

&lt;p&gt;The Jetson system utilizes a tiered memory architecture. The &lt;code&gt;zram-config&lt;/code&gt; service provides several compressed RAM-based swap devices (&lt;code&gt;zram0&lt;/code&gt; through &lt;code&gt;zram11&lt;/code&gt;). The hierarchy of memory allocation is as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Physical RAM&lt;/strong&gt; (64 GB unified memory)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;zram&lt;/strong&gt; (Compressed swap in RAM, ~31 GB total)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVMe Swap File&lt;/strong&gt; (50 GB on &lt;code&gt;/swapfile&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This tiered approach allows the kernel to handle small, compressible allocations within the highly efficient &lt;code&gt;zram&lt;/code&gt; layer before resorting to the higher-latency NVMe disk-backed swap.&lt;/p&gt;
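&lt;p&gt;The size of each tier can be checked from &lt;code&gt;/proc/swaps&lt;/code&gt;, which lists every active swap area with its size in KiB and its priority; the kernel fills higher-priority areas first. The sketch below totals the &lt;code&gt;zram&lt;/code&gt; devices and the disk-backed areas separately. The function name &lt;code&gt;swap_tier_summary&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: total active swap per tier, reading /proc/swaps
# (or a file in the same format passed as the first argument).
swap_tier_summary() {
  local swaps="${1:-/proc/swaps}"
  awk 'NR > 1 {
         # Column 1 is the device or file name, column 3 its size in KiB.
         tier = ($1 ~ /zram/) ? "zram" : "disk"
         kb[tier] += $3
       }
       END {
         printf "zram: %.1f GiB\n", kb["zram"] / 1048576
         printf "disk: %.1f GiB\n", kb["disk"] / 1048576
       }' "$swaps"
}

# Usage on the device:
#   swap_tier_summary
```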




&lt;h2&gt;
  
  
  7. Removing or Reconfiguring the Swap File
&lt;/h2&gt;

&lt;p&gt;If disk space needs to be reclaimed, the swap file can be decommissioned following these steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Disable the swap file usage&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;swapoff /swapfile

&lt;span class="c"&gt;# Remove the entry from /etc/fstab&lt;/span&gt;
&lt;span class="nb"&gt;sudo sed&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'/\/swapfile/d'&lt;/span&gt; /etc/fstab

&lt;span class="c"&gt;# Delete the physical file&lt;/span&gt;
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; /swapfile

&lt;span class="c"&gt;# Optional: reapply /etc/sysctl.conf (not required to remove swap)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;-p&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  8. Practical Outcomes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Increased Capacity:&lt;/strong&gt; Successfully established a 50 GB swap area on NVMe, expanding the total virtual memory capacity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stability:&lt;/strong&gt; Provided a critical safety margin for running 70B parameter models (e.g., Q4_K_M) that may exceed the 64 GB physical RAM limit during peak usage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized Hierarchy:&lt;/strong&gt; Integrated the new disk-backed swap into the existing &lt;code&gt;zram&lt;/code&gt; architecture without disrupting the compressed RAM layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistence:&lt;/strong&gt; Achieved a fully automated configuration that survives system reboots via the &lt;code&gt;/etc/fstab&lt;/code&gt; entry.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9. Conclusions
&lt;/h2&gt;

&lt;p&gt;Configuring a large, NVMe-backed swap file is a highly effective strategy for maximizing the utility of the NVIDIA Jetson AGX Orin 64 GB for large-scale AI workloads. By following the documented procedure of using &lt;code&gt;fallocate&lt;/code&gt;, setting strict &lt;code&gt;chmod 600&lt;/code&gt; permissions, and tuning &lt;code&gt;swappiness&lt;/code&gt; to 10, users can achieve a stable environment capable of handling models that exceed physical memory boundaries.&lt;/p&gt;

&lt;p&gt;While the performance penalty of disk-based swapping is unavoidable, the use of high-speed NVMe storage and a tiered &lt;code&gt;zram&lt;/code&gt; approach minimizes the impact on inference latency, making it a viable solution for non-interactive or batch processing of 34B–70B parameter models.&lt;/p&gt;

</description>
      <category>jetson</category>
      <category>linux</category>
      <category>swap</category>
      <category>agxorin</category>
    </item>
    <item>
      <title>Check NVIDIA Jetson AGX Orin Specifications</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 16:41:10 +0000</pubDate>
      <link>https://forem.com/vonusma/check-nvidia-jetson-agx-orin-specifications-3h0</link>
      <guid>https://forem.com/vonusma/check-nvidia-jetson-agx-orin-specifications-3h0</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This document provides a systematic, reproducible method for new users to verify every hardware and software component on an NVIDIA Jetson AGX Orin 64 GB developer kit running Ubuntu 22.04.5 LTS. The approach walks through the exact commands needed to collect CPU, GPU, memory, kernel, JetPack, CUDA, cuDNN, TensorRT, and OpenCV data, then compresses the findings into a single‑line summary for quick sharing.&lt;/p&gt;

&lt;p&gt;The verification process works on a clean Jetson image with no custom configuration. All commands come from packages already present on the stock image, so they can be run immediately after booting without installing additional tools. The script at the end automates the entire workflow for future use.&lt;/p&gt;

&lt;p&gt;Reading the summary proves the system matches the advertised specifications and serves as a baseline for troubleshooting or compliance checks. Developers, reviewers, and CI pipelines can all reuse this tutorial to guarantee that a Jetson board meets its nominal performance envelope.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Hardware and Software Environment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Jetson board identification
&lt;/h3&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /sys/firmware/devicetree/base/model
&lt;span class="nb"&gt;cat&lt;/span&gt; /sys/firmware/devicetree/base/compatible
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NVIDIA Jetson AGX Orin Developer Kit
nvidia,p3737-0000+p3701-0005
nvidia,p3701-0005
nvidia,tegra234
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This confirms the AGX Orin developer kit and the expected compatible strings.&lt;/p&gt;
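&lt;p&gt;Device-tree string properties are stored as NUL-terminated strings, so plain &lt;code&gt;cat&lt;/code&gt; can print the &lt;code&gt;compatible&lt;/code&gt; entries run together on one line. A small sketch that prints one entry per line is shown below; the function name &lt;code&gt;dt_strings&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: print a device-tree string property one entry per line
# by turning each NUL terminator into a newline.
dt_strings() {
  cat "$1" | tr '\0' '\n'
}

# Usage on the device:
#   dt_strings /sys/firmware/devicetree/base/compatible
```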

&lt;h3&gt;
  
  
  1.2 Operating system (Ubuntu 22.04.5 LTS)
&lt;/h3&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsb_release &lt;span class="nt"&gt;-a&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /etc/os-release
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Typical output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;Ubuntu 22.04.5 LTS&lt;/span&gt;
&lt;span class="py"&gt;Release&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="s"&gt;22.04&lt;/span&gt;
&lt;span class="py"&gt;Codename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="s"&gt;jammy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;PRETTY_NAME&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Ubuntu 22.04.5 LTS"&lt;/span&gt;
&lt;span class="py"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"22.04.5 LTS (Jammy Jellyfish)"&lt;/span&gt;
&lt;span class="py"&gt;ARCH&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;aarch64&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These lines show you are on Ubuntu 22.04.5 LTS for aarch64.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. CPU details
&lt;/h2&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lscpu
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/cpuinfo | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"model name|Processor|Features"&lt;/span&gt;
&lt;span class="nb"&gt;nproc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key fields from &lt;code&gt;lscpu&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Architecture&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;          &lt;span class="s"&gt;aarch64&lt;/span&gt;
&lt;span class="na"&gt;Model name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;            &lt;span class="s"&gt;Cortex-A78AE&lt;/span&gt;
&lt;span class="na"&gt;CPU(s)&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                &lt;span class="m"&gt;12&lt;/span&gt;
&lt;span class="na"&gt;CPU max MHz&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;           &lt;span class="m"&gt;2201.6001&lt;/span&gt;
&lt;span class="na"&gt;CPU min MHz&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;           &lt;span class="m"&gt;115.2000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/proc/cpuinfo&lt;/code&gt; reports the generic model name (&lt;code&gt;ARMv8 Processor rev 1 (v8l)&lt;/code&gt;) instead of the core name shown by &lt;code&gt;lscpu&lt;/code&gt;, and lists the supported flags.&lt;/p&gt;

&lt;p&gt;This tells the user the board has 12 ARMv8 cores running up to ~2.2 GHz.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Memory
&lt;/h2&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;free &lt;span class="nt"&gt;-h&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/meminfo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;free -h&lt;/code&gt; example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mem:   61Gi  used 5.8Gi  free 51Gi  buff/cache 3.6Gi  available 55Gi
Swap:  30Gi  used 0B     free 30Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/proc/meminfo&lt;/code&gt; provides the raw totals in kB (e.g., &lt;code&gt;MemTotal: 64335836 kB&lt;/code&gt;).&lt;br&gt;&lt;br&gt;
Together they show ~5.8 GiB used out of ~61 GiB.&lt;/p&gt;
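&lt;p&gt;To relate the raw &lt;code&gt;MemTotal&lt;/code&gt; value to the rounded total printed by &lt;code&gt;free -h&lt;/code&gt;, the unit conversion can be sketched as follows. This is a minimal sketch that assumes &lt;code&gt;awk&lt;/code&gt; is available; the helper name &lt;code&gt;kb_to_gib&lt;/code&gt; is hypothetical and not part of the original tutorial.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original tutorial): convert kB to GiB.
# /proc/meminfo reports sizes in kB, and 1 GiB = 1024 * 1024 kB.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / (1024 * 1024) }'
}

# Example using the MemTotal value quoted above
kb_to_gib 64335836

# On a live system, read the value straight from /proc/meminfo
if [ -r /proc/meminfo ]; then
    kb_to_gib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
fi
```

&lt;p&gt;For the quoted value this gives roughly 61.4 GiB, which is in line with the 61Gi total reported by &lt;code&gt;free -h&lt;/code&gt;.&lt;/p&gt;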


&lt;h2&gt;
  
  
  4. JetPack and Jetson Linux (L4T) versions
&lt;/h2&gt;
&lt;h3&gt;
  
  
  4.1 JetPack meta‑package
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt-cache show nvidia-jetpack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Relevant lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source: nvidia-jetpack (6.2.2)
Version: 6.2.2+b24
Architecture: arm64
Maintainer: NVIDIA Corporation
Depends: nvidia-jetpack-runtime (= 6.2.2+b24), nvidia-jetpack-dev (= 6.2.2+b24)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.2 L4T release (Jetson Linux R36.5)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /etc/nv_tegra_release
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# R36 (release), REVISION: 5.0, GCID: 43688277, BOARD: generic, EABI: aarch64, DATE: Fri Jan 16 03:50:45 UTC 2026&lt;/span&gt;
&lt;span class="nv"&gt;TARGET_USERSPACE_LIB_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nvidia
&lt;span class="nv"&gt;TARGET_USERSPACE_LIB_DIR_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;usr/lib/aarch64-linux-gnu/nvidia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also verify the core package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dpkg-query &lt;span class="nt"&gt;--show&lt;/span&gt; nvidia-l4t-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-l4t-core 36.5.0-20260115194252
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Together, these commands confirm you are on JetPack 6.2.2 with the matching L4T release (R36.5.0).&lt;/p&gt;
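&lt;p&gt;The release header in &lt;code&gt;/etc/nv_tegra_release&lt;/code&gt; can also be reduced to a dotted version string that is directly comparable with the &lt;code&gt;nvidia-l4t-core&lt;/code&gt; package version. The following is a minimal sketch that assumes the header format shown above; the helper name &lt;code&gt;l4t_version_from_header&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original tutorial): turn the first line
# of /etc/nv_tegra_release into a dotted L4T version such as 36.5.0.
# Prints nothing if the line does not match the assumed header format.
l4t_version_from_header() {
    printf '%s\n' "$1" |
        sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/\1.\2/p'
}

# Example using the header line quoted above
sample='# R36 (release), REVISION: 5.0, GCID: 43688277, BOARD: generic, EABI: aarch64, DATE: Fri Jan 16 03:50:45 UTC 2026'
l4t_version_from_header "$sample"

# On a Jetson, parse the real file
if [ -r /etc/nv_tegra_release ]; then
    l4t_version_from_header "$(head -n 1 /etc/nv_tegra_release)"
fi
```

&lt;p&gt;For the sample header this prints &lt;code&gt;36.5.0&lt;/code&gt;, which matches the leading part of the &lt;code&gt;nvidia-l4t-core&lt;/code&gt; version.&lt;/p&gt;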




&lt;h2&gt;
  
  
  5. CUDA, cuDNN, TensorRT, OpenCV
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 CUDA toolkit
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nvcc &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.2 cuDNN
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dpkg &lt;span class="nt"&gt;-l&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;libcudnn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Shows &lt;code&gt;libcudnn9-cuda-12 9.3.0.75-1&lt;/code&gt; (runtime) and corresponding dev packages.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.3 TensorRT
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dpkg &lt;span class="nt"&gt;-l&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;TensorRT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tensorrt 10.3.0.30-1+cuda12.5 arm64 Meta package for TensorRT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.4 OpenCV
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import cv2; print(cv2.__version__)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output: &lt;code&gt;4.8.0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;All four pieces of software are installed and their versions match the target specification.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Practical Outcomes (what worked)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Hardware detection succeeded; model and compatible strings are correct.
&lt;/li&gt;
&lt;li&gt;OS verification produced the expected Ubuntu 22.04.5 LTS aarch64 string.
&lt;/li&gt;
&lt;li&gt;CPU info confirms 12‑core ARMv8 at ~2.2 GHz.
&lt;/li&gt;
&lt;li&gt;Memory shows ~61 GiB usable out of the advertised 64 GB, with no swap in use.
&lt;/li&gt;
&lt;li&gt;JetPack 6.2.2 and L4T R36.5 were identified automatically.
&lt;/li&gt;
&lt;li&gt;CUDA 12.6, cuDNN 9.3, TensorRT 10.3, and OpenCV 4.8 are present in the exact versions.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. Conclusion (recommendations)
&lt;/h2&gt;

&lt;p&gt;The Jetson AGX Orin 64 GB developer kit is fully configured as advertised. The verification steps can be automated via the script provided in Section 8, making it suitable for CI pipelines, regression testing, or compliance reporting.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The host name (&lt;code&gt;ubuntu&lt;/code&gt;) and host display (&lt;code&gt;NVIDIA Jetson AGX Orin Develop&lt;/code&gt;) are placeholders; you can replace them with the actual values shown by &lt;code&gt;hostname&lt;/code&gt; and &lt;code&gt;lscpu&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  8. Automated script – &lt;code&gt;jetson_sysinfo.sh&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# Simple system summary for NVIDIA Jetson AGX Orin&lt;/span&gt;

&lt;span class="c"&gt;# Hardware&lt;/span&gt;
&lt;span class="nv"&gt;HW_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'\0'&lt;/span&gt; &amp;lt;/sys/firmware/devicetree/base/model 2&amp;gt;/dev/null&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;HW_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HW_MODEL&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="s2"&gt;"Unknown"&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# OS&lt;/span&gt;
&lt;span class="nv"&gt;OS_DESC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;lsb_release &lt;span class="nt"&gt;-d&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-f2-&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;OS_DESC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OS_DESC&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'^PRETTY_NAME='&lt;/span&gt; /etc/os-release 2&amp;gt;/dev/null | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nt"&gt;-f2&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
&lt;span class="nv"&gt;ARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;hostname&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;KERNEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;uname&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# CPU&lt;/span&gt;
&lt;span class="nv"&gt;CPU_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-m1&lt;/span&gt; &lt;span class="s2"&gt;"model name"&lt;/span&gt; /proc/cpuinfo 2&amp;gt;/dev/null | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;: &lt;span class="nt"&gt;-f2-&lt;/span&gt; | xargs&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CPU_MODEL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;CPU_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;lscpu | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;: &lt;span class="s1"&gt;'/Model name/ {print $2}'&lt;/span&gt; | xargs&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;CPU_CORES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;nproc&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;CPU_MAX&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;lscpu | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;: &lt;span class="s1"&gt;'/CPU max MHz/ {gsub(/ /,"",$2); print $2}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;CPU_MIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;lscpu | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;: &lt;span class="s1"&gt;'/CPU min MHz/ {gsub(/ /,"",$2); print $2}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Memory (total, used, and available as reported by free -h)&lt;/span&gt;
&lt;span class="nv"&gt;MEM_LINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;free &lt;span class="nt"&gt;-h&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'/^Mem:/ {print $2 " total, " $3 " used, " $7 " available"}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Jetson / JetPack / L4T&lt;/span&gt;
&lt;span class="nv"&gt;L4T_CORE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;dpkg-query &lt;span class="nt"&gt;--show&lt;/span&gt; nvidia-l4t-core 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $2}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;JP_SRC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;apt-cache show nvidia-jetpack 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;': '&lt;/span&gt; &lt;span class="s1"&gt;'/^Source:/ {print $2; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;JP_VER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;apt-cache show nvidia-jetpack 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;': '&lt;/span&gt; &lt;span class="s1"&gt;'/^Version:/ {print $2; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;JP_ARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;apt-cache show nvidia-jetpack 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;': '&lt;/span&gt; &lt;span class="s1"&gt;'/^Architecture:/ {print $2; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;JP_MAINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;apt-cache show nvidia-jetpack 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;': '&lt;/span&gt; &lt;span class="s1"&gt;'/^Maintainer:/ {print $2; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;JP_DEPS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;apt-cache show nvidia-jetpack 2&amp;gt;/dev/null | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;&lt;span class="s1"&gt;': '&lt;/span&gt; &lt;span class="s1"&gt;'/^Depends:/ {print $2; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;NVREL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-m1&lt;/span&gt; &lt;span class="s1"&gt;'^# R'&lt;/span&gt; /etc/nv_tegra_release 2&amp;gt;/dev/null | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^# //'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# CUDA&lt;/span&gt;
&lt;span class="nv"&gt;NVCC_VER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;nvcc &lt;span class="nt"&gt;--version&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n1&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# cuDNN&lt;/span&gt;
&lt;span class="nv"&gt;CUDNN_LINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;dpkg &lt;span class="nt"&gt;-l&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'/libcudnn[0-9]-cuda-12/ {print $2" " $3; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CUDNN_LINE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;CUDNN_LINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"not found"&lt;/span&gt;

&lt;span class="c"&gt;# TensorRT&lt;/span&gt;
&lt;span class="nv"&gt;TRT_LINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;dpkg &lt;span class="nt"&gt;-l&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'/^ii  tensorrt / {print $2" " $3; exit}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TRT_LINE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;TRT_LINE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"not found"&lt;/span&gt;

&lt;span class="c"&gt;# OpenCV&lt;/span&gt;
&lt;span class="nv"&gt;OPENCV_VER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import cv2; print(cv2.__version__)"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"not found"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Print summary&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Hardware: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HW_MODEL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; 64GB"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"OS: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OS_DESC&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ARCH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Host: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOST&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Kernel: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;KERNEL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"CPU: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CPU_MODEL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CPU_CORES&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;) @ &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CPU_MAX&lt;/span&gt;&lt;span class="p"&gt;%.*&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;MHz"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"CPU max MHz: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CPU_MAX&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"CPU min MHz: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CPU_MIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Memory: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MEM_LINE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"nvidia-l4t-core: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;L4T_CORE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$NVREL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"L4T release: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NVREL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Package: nvidia-jetpack"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Source: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JP_SRC&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Version: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JP_VER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Architecture: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JP_ARCH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Maintainer: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JP_MAINT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Depends: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;JP_DEPS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"nvcc: NVIDIA (R) Cuda compiler driver"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NVCC_VER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"cuDNN: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CUDNN_LINE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"OpenCV Version: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;OPENCV_VER&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"TensorRT: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TRT_LINE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How to use&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nano jetson_sysinfo.sh          &lt;span class="c"&gt;# paste the script&lt;/span&gt;
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x jetson_sysinfo.sh      &lt;span class="c"&gt;# make it executable&lt;/span&gt;
./jetson_sysinfo.sh             &lt;span class="c"&gt;# run – prints a compact summary&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script prints a compact, multi-line summary of the values verified in the sections above, which makes it easy to share as proof of configuration.&lt;/p&gt;

</description>
      <category>jetson</category>
      <category>agx</category>
      <category>orin</category>
      <category>linux</category>
    </item>
    <item>
      <title>Enabling Maximum Performance Mode on NVIDIA Jetson AGX Orin 64 GB</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 15:58:34 +0000</pubDate>
      <link>https://forem.com/vonusma/enabling-maximum-performance-mode-on-nvidia-jetson-agx-orin-64-gb-53nb</link>
      <guid>https://forem.com/vonusma/enabling-maximum-performance-mode-on-nvidia-jetson-agx-orin-64-gb-53nb</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This document explains how to configure an NVIDIA Jetson AGX Orin 64 GB Developer Kit running Ubuntu 22.04.5 LTS and JetPack 6.2.2 to operate in maximum performance mode for AI workloads, especially LLM inference. It describes how to select the MAXN power mode, lock system clocks at their highest frequencies, and verify that the configuration is correctly applied with built-in NVIDIA tools and simple benchmarks. The tutorial targets users who want reproducible, high-throughput inference on a Jetson AGX Orin while retaining awareness of thermal and power constraints.&lt;/p&gt;

&lt;p&gt;It documents the practical impact of enabling MAXN and &lt;code&gt;jetson_clocks&lt;/code&gt;, showing how GPU frequency and token generation throughput can increase roughly threefold compared to default settings. The guide also covers how to persist these settings using a systemd service so that the device consistently boots into a high-performance state suitable for heavy AI workloads. Where relevant, it notes expected frequency values and normal operating temperatures for the Jetson AGX Orin platform. &lt;/p&gt;

&lt;p&gt;The purpose of this tutorial is to serve as a reusable reference for configuring maximum performance on Jetson-based AI systems, integrated into a larger workflow that includes swap configuration and tool installation for LLM workloads. Readers with basic Linux and Jetson familiarity can follow step-by-step commands to prepare the device, validate the configuration, and understand when to switch between performance and power-saving modes. &lt;/p&gt;




&lt;h2&gt;
  
  
  1. Hardware and Software Environment
&lt;/h2&gt;

&lt;p&gt;Your system is an &lt;strong&gt;NVIDIA Jetson AGX Orin Developer Kit 64 GB&lt;/strong&gt; running Ubuntu 22.04.5 LTS (aarch64) with JetPack 6.2.2, CUDA 12.6, cuDNN 9.3.0, OpenCV 4.8.0, and TensorRT 10.3.0.30 installed. The CPU is an ARMv8 12-core processor with a maximum clock around 2.2 GHz, and the board exposes NVIDIA’s &lt;code&gt;nvpmodel&lt;/code&gt; and &lt;code&gt;jetson_clocks&lt;/code&gt; tools for power and clock management.&lt;/p&gt;

&lt;p&gt;According to NVIDIA’s specifications, the Jetson AGX Orin 64 GB configuration can achieve up to &lt;strong&gt;275 TOPS&lt;/strong&gt; when configured in MAXN mode with clocks locked to their maximum frequencies. This tutorial assumes shell access with &lt;code&gt;sudo&lt;/code&gt; privileges and that NVIDIA JetPack components are correctly installed from the &lt;code&gt;nvidia-jetpack&lt;/code&gt; meta-package. &lt;/p&gt;




&lt;h2&gt;
  
  
  2. Why Maximum Performance Matters for AI
&lt;/h2&gt;

&lt;p&gt;By default, without MAXN mode and &lt;code&gt;jetson_clocks&lt;/code&gt;, the Jetson AGX Orin keeps GPU frequencies around 600 MHz to maintain thermal and power safety margins. Under these conservative defaults, a 7B LLM typically reaches only about 8 tokens per second during inference, which limits interactivity and throughput.&lt;/p&gt;

&lt;p&gt;When MAXN and &lt;code&gt;jetson_clocks&lt;/code&gt; are enabled, the GPU can run at approximately 1300 MHz, and end-to-end LLM inference throughput can increase to roughly 18–25 tokens per second on the same 7B model. This represents an improvement of up to about &lt;strong&gt;3x&lt;/strong&gt; and makes interactive LLM usage and larger batch workloads more practical on the device.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Inspecting and Selecting Power Modes
&lt;/h2&gt;

&lt;p&gt;Before changing anything, check the current power mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nvpmodel &lt;span class="nt"&gt;-q&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command prints the active power mode, and the integer at the bottom of the output is the current mode ID (for example, &lt;code&gt;0&lt;/code&gt; for MAXN). This lets you confirm whether the system already runs in MAXN or a more restrictive power profile.&lt;/p&gt;
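&lt;p&gt;This check can be scripted. The following is a minimal sketch that assumes, as described above, that the mode ID is the last non-empty line of the &lt;code&gt;nvpmodel -q&lt;/code&gt; output; the helper name &lt;code&gt;mode_id_from_query&lt;/code&gt; and the sample text are illustrative assumptions.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original tutorial): read nvpmodel -q
# style text on stdin and print its last non-empty line, assumed to be
# the numeric mode ID.
mode_id_from_query() {
    awk 'NF { last = $0 } END { print last }'
}

# Assumed sample of the query output for MAXN
sample='NV Power Mode: MAXN
0'

mode_id="$(printf '%s\n' "$sample" | mode_id_from_query)"
if [ "$mode_id" = "0" ]; then
    echo "MAXN is active (mode ID 0)"
else
    echo "Current mode ID is $mode_id, not MAXN"
fi

# On a Jetson, feed the real output instead:
#   sudo nvpmodel -q | mode_id_from_query
```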

&lt;p&gt;To see all available power modes for the Jetson AGX Orin 64 GB under JetPack 6.2, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nvpmodel &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="nt"&gt;--verbose&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A1&lt;/span&gt; &lt;span class="s2"&gt;"MODE_NAME"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On this platform, the mode table typically looks like:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode ID&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;TDP&lt;/th&gt;
&lt;th&gt;CPU cores active&lt;/th&gt;
&lt;th&gt;GPU max freq&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;MAXN&lt;/td&gt;
&lt;td&gt;No limit (~60 W)&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;1300 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;MODE_50W&lt;/td&gt;
&lt;td&gt;50 W&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;1100 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;MODE_30W&lt;/td&gt;
&lt;td&gt;30 W&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;854 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;MODE_15W&lt;/td&gt;
&lt;td&gt;15 W&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;612 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Use mode ID 0 (MAXN) for the high-performance configuration described here.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Enabling MAXN Mode and Locking Clocks
&lt;/h2&gt;

&lt;p&gt;To switch the Jetson into MAXN mode, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nvpmodel &lt;span class="nt"&gt;-m&lt;/span&gt; 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;nvpmodel&lt;/code&gt; saves the selected mode, so it persists across reboots until you select a different one; the mode definitions themselves live in &lt;code&gt;/etc/nvpmodel.conf&lt;/code&gt;. After this step, the Jetson operates under the highest power budget supported by its cooling solution, which is ideal for compute-heavy AI tasks.&lt;/p&gt;

&lt;p&gt;Next, lock all clocks (CPU, GPU, and memory bus) to their maximum frequencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;jetson_clocks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command is temporary and resets after each reboot, so it must be re-applied or automated to persist. Once applied, the system stops using dynamic frequency scaling and instead pins frequencies to their highest supported values for maximum compute performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Verifying Power Mode and Clock Frequencies
&lt;/h2&gt;

&lt;p&gt;To confirm that MAXN is active and clocks are locked, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Confirm power mode&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;nvpmodel &lt;span class="nt"&gt;-q&lt;/span&gt;

&lt;span class="c"&gt;# Check GPU and CPU frequencies&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;jetson_clocks &lt;span class="nt"&gt;--show&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;jetson_clocks --show&lt;/code&gt; output should include lines similar to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CPU Cluster Switching: Disabled
cpu0: Online=1 Governor=schedutil MinFreq=729600 MaxFreq=2201600 CurrentFreq=2201600 ...
GPU MinFreq=306000000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a correct configuration, &lt;code&gt;CurrentFreq&lt;/code&gt; should match &lt;code&gt;MaxFreq&lt;/code&gt; for CPU, GPU, and EMC entries, indicating that frequencies are pinned at their maximums. If you see lower current frequencies, reapply &lt;code&gt;jetson_clocks&lt;/code&gt; or investigate thermal throttling conditions.&lt;/p&gt;
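&lt;p&gt;The comparison of &lt;code&gt;CurrentFreq&lt;/code&gt; against &lt;code&gt;MaxFreq&lt;/code&gt; can be automated. The following is a minimal sketch that assumes the &lt;code&gt;key=value&lt;/code&gt; line format shown in the sample output above; the helper name &lt;code&gt;check_pinned&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original tutorial): read
# jetson_clocks --show style text on stdin and report every line whose
# CurrentFreq differs from its MaxFreq. Exits non-zero if any line differs.
check_pinned() {
    awk '
        /MaxFreq=/ {
            max = ""; cur = ""
            for (i = 1; i != NF + 1; i++) {
                if ($i ~ /^MaxFreq=/)     { max = substr($i, 9) }
                if ($i ~ /^CurrentFreq=/) { cur = substr($i, 13) }
            }
            if (max == "" || cur == "") next
            if (max != cur) { print "NOT pinned: " $1; bad = 1 }
        }
        END { if (bad) exit 1; print "all clocks pinned" }
    '
}

# Example using the sample lines quoted above
sample='cpu0: Online=1 Governor=schedutil MinFreq=729600 MaxFreq=2201600 CurrentFreq=2201600
GPU MinFreq=306000000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000'
printf '%s\n' "$sample" | check_pinned

# On a Jetson, check the live values instead:
#   sudo jetson_clocks --show | check_pinned
```

&lt;p&gt;A non-zero exit status makes the helper usable in scripts that should stop when the clocks are not pinned.&lt;/p&gt;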




&lt;h2&gt;
  
  
  6. Making jetson_clocks Persistent with systemd
&lt;/h2&gt;

&lt;p&gt;To ensure &lt;code&gt;jetson_clocks&lt;/code&gt; runs automatically at boot, first try enabling the built-in service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;nvargus-daemon 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;jetson_clocks 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Service not found, creating..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On some JetPack versions the &lt;code&gt;jetson_clocks&lt;/code&gt; service may not exist, in which case you can create a custom systemd unit file. The following commands define such a service, reload systemd, and enable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/systemd/system/jetson_clocks.service &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[Unit]
Description=Lock Jetson clocks at maximum frequency
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/jetson_clocks
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;jetson_clocks
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start jetson_clocks
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status jetson_clocks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Afterward, every boot should automatically apply &lt;code&gt;jetson_clocks&lt;/code&gt;, and &lt;code&gt;systemctl status jetson_clocks&lt;/code&gt; should report the service as &lt;code&gt;active (exited)&lt;/code&gt;, the expected state for a &lt;code&gt;oneshot&lt;/code&gt; unit with &lt;code&gt;RemainAfterExit=yes&lt;/code&gt;. This eliminates the need to run the command manually after each restart while keeping the configuration transparent and reversible via systemd.&lt;/p&gt;
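
&lt;p&gt;For scripted checks, the unit state can be read in machine-friendly form instead of parsing the human-readable status. This is a small sketch under assumptions: &lt;code&gt;unit_state_ok&lt;/code&gt; is a hypothetical helper that interprets the &lt;code&gt;KEY=VALUE&lt;/code&gt; lines printed by &lt;code&gt;systemctl show&lt;/code&gt;.&lt;/p&gt;

```shell
# Sketch: interpret systemctl show output for the jetson_clocks unit.
# unit_state_ok is a hypothetical helper; it reads KEY=VALUE lines on stdin.
unit_state_ok() {
  awk -F= '
    $1 == "ActiveState" { active = $2 }
    $1 == "SubState"    { sub_state = $2 }
    END {
      if (active == "active") {
        print "ok: " active " (" sub_state ")"
      } else {
        print "not active: " active " (" sub_state ")"
        exit 1
      }
    }
  '
}
```

&lt;p&gt;A possible invocation is &lt;code&gt;systemctl show jetson_clocks -p ActiveState -p SubState | unit_state_ok&lt;/code&gt;, which should print &lt;code&gt;ok: active (exited)&lt;/code&gt; once the oneshot unit has run.&lt;/p&gt;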




&lt;h2&gt;
  
  
  7. Quick Performance Benchmark with LLM Inference
&lt;/h2&gt;

&lt;p&gt;Once MAXN and &lt;code&gt;jetson_clocks&lt;/code&gt; are active, you can validate real-world AI performance using an LLM benchmark. If you have an Ollama container running (as configured in a later phase of your workflow), execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Benchmark: generation throughput (tokens/s) for a ~100-word completion with a 3B model&lt;/span&gt;
docker &lt;span class="nb"&gt;exec &lt;/span&gt;ollama ollama run llama3.2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--verbose&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Write a 100-word story about a robot"&lt;/span&gt; 2&amp;gt;&amp;amp;1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"eval rate|tokens/s"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In MAXN mode on the Jetson AGX Orin, a llama3.2 3B Q4_K_M model is expected to reach around 25–40 tokens per second, significantly higher than default power modes. If observed throughput is substantially lower, recheck power mode, clock locking, and ensure the system is not thermally throttling or swapping heavily.&lt;/p&gt;
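
&lt;p&gt;To turn the benchmark into a pass/fail check, the generation rate can be extracted and compared against a floor. The sketch below assumes the &lt;code&gt;--verbose&lt;/code&gt; statistics contain a line of the form &lt;code&gt;eval rate: N tokens/s&lt;/code&gt;; &lt;code&gt;eval_rate&lt;/code&gt; and &lt;code&gt;meets_target&lt;/code&gt; are hypothetical helper names, and the floor of 25 is simply the lower end of the range quoted above.&lt;/p&gt;

```shell
# Sketch: pull the generation rate out of ollama --verbose statistics.
# eval_rate and meets_target are hypothetical helpers, not part of Ollama.
eval_rate() {
  # Matches "eval rate: N tokens/s" but skips "prompt eval rate: ...".
  awk '$1 == "eval" { if ($2 == "rate:") print $3 }'
}

meets_target() {
  # $1 = measured tokens/s, $2 = minimum acceptable tokens/s
  awk -v rate="$1" -v min="$2" 'BEGIN { exit (rate + 0 >= min + 0) ? 0 : 1 }'
}
```

&lt;p&gt;Piping the benchmark output through &lt;code&gt;eval_rate&lt;/code&gt; (in place of the &lt;code&gt;grep&lt;/code&gt; filter) yields a single number that &lt;code&gt;meets_target "$rate" 25&lt;/code&gt; can evaluate.&lt;/p&gt;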




&lt;h2&gt;
  
  
  8. Monitoring Power, Temperature, and Thermal Safety
&lt;/h2&gt;

&lt;p&gt;While models are running, monitor system health from a second terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Option A: tegrastats (every second)&lt;/span&gt;
tegrastats &lt;span class="nt"&gt;--interval&lt;/span&gt; 1000

&lt;span class="c"&gt;# Option B: jtop (interactive dashboard)&lt;/span&gt;
jtop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;tegrastats&lt;/code&gt;, key fields include the GPU temperature (&lt;code&gt;GPU@XXC&lt;/code&gt;; aim to stay below about 85 °C under sustained load), the GPU power rail as current/average draw in milliwatts (&lt;code&gt;POM_5V_GPU Xm/Ym&lt;/code&gt; on older Jetson boards; AGX Orin typically labels this rail &lt;code&gt;VDD_GPU_SOC&lt;/code&gt;), and the board temperature (&lt;code&gt;Tboard@XXC&lt;/code&gt;). Exact field names and capitalization vary between JetPack releases. MAXN mode is designed to work within the active cooling capabilities of the AGX Orin module, but blocked vents or poor airflow can still cause throttling.&lt;/p&gt;

&lt;p&gt;Typical thermal ranges under continuous AI workloads are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Normal&lt;/th&gt;
&lt;th&gt;Throttle starts&lt;/th&gt;
&lt;th&gt;Emergency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPU&lt;/td&gt;
&lt;td&gt;50–75 °C&lt;/td&gt;
&lt;td&gt;~85 °C&lt;/td&gt;
&lt;td&gt;~95 °C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;45–70 °C&lt;/td&gt;
&lt;td&gt;~85 °C&lt;/td&gt;
&lt;td&gt;~95 °C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Board&lt;/td&gt;
&lt;td&gt;40–60 °C&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If &lt;code&gt;tegrastats&lt;/code&gt; shows temperatures approaching the throttle thresholds above, or clock frequencies dropping under load, improve ventilation or reduce workload intensity until the system stabilizes.&lt;/p&gt;
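
&lt;p&gt;The thresholds from the table can be applied to live telemetry. The following sketch makes two assumptions: the GPU temperature appears in a &lt;code&gt;tegrastats&lt;/code&gt; line as &lt;code&gt;GPU@NNC&lt;/code&gt; (matched case-insensitively), and the approximate 85 °C and 95 °C limits are used as hard cut-offs. &lt;code&gt;gpu_temp&lt;/code&gt; and &lt;code&gt;thermal_zone&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```shell
# Sketch: classify the GPU temperature found in a tegrastats line.
# gpu_temp and thermal_zone are hypothetical helpers, not NVIDIA tools.
gpu_temp() {
  # Print the number from the first GPU@NNC token read on stdin.
  grep -oiE 'gpu@[0-9.]+C' | head -n 1 | tr -d 'GgPpUuCc@'
}

thermal_zone() {
  # $1 = temperature in Celsius; cut-offs follow the table (about 85 and 95).
  awk -v t="$1" 'BEGIN {
    if (t + 0 >= 95) {
      print "emergency"
    } else if (t + 0 >= 85) {
      print "throttle"
    } else {
      print "ok"
    }
  }'
}
```

&lt;p&gt;For example, &lt;code&gt;tegrastats --interval 1000 | head -n 1 | gpu_temp&lt;/code&gt; returns one reading that can be passed to &lt;code&gt;thermal_zone&lt;/code&gt;.&lt;/p&gt;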




&lt;h2&gt;
  
  
  9. Choosing Performance vs Power-Saving Modes
&lt;/h2&gt;

&lt;p&gt;Depending on your workload, you may want to switch between MAXN and more efficient modes. Common scenarios include:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Recommended mode&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LLM inference (7B–70B)&lt;/td&gt;
&lt;td&gt;MAXN (0)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo nvpmodel -m 0 &amp;amp;&amp;amp; sudo jetson_clocks&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision / video processing&lt;/td&gt;
&lt;td&gt;MAXN (0)&lt;/td&gt;
&lt;td&gt;same as above&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compiling code (e.g., LLM inference frameworks)&lt;/td&gt;
&lt;td&gt;MAXN (0)&lt;/td&gt;
&lt;td&gt;same as above&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Idle / light development&lt;/td&gt;
&lt;td&gt;MODE_30W (2)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo nvpmodel -m 2&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background low-power tasks&lt;/td&gt;
&lt;td&gt;MODE_15W (1)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo nvpmodel -m 1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Switching modes with &lt;code&gt;nvpmodel&lt;/code&gt; changes the power envelope and the clock ceilings that go with it, while &lt;code&gt;jetson_clocks&lt;/code&gt; pins frequencies at the ceiling of the active mode; together they give fine-grained control over performance versus efficiency. You can integrate these commands into your own scripts to toggle modes depending on job type or time of day.&lt;/p&gt;
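
&lt;p&gt;Because numeric mode IDs differ between Jetson modules, a toggle script is more robust when it resolves the ID from the mode name. The sketch below assumes the active configuration is readable at &lt;code&gt;/etc/nvpmodel.conf&lt;/code&gt; and declares modes on lines containing &lt;code&gt;POWER_MODEL ID=N NAME=X&lt;/code&gt;; &lt;code&gt;mode_for_job&lt;/code&gt; and &lt;code&gt;mode_id&lt;/code&gt; are hypothetical helpers, and the job labels are illustrative.&lt;/p&gt;

```shell
# Sketch: choose a power mode by job type and resolve its numeric ID by name.
# mode_for_job and mode_id are hypothetical helpers; job labels are illustrative.
mode_for_job() {
  case "$1" in
    inference|vision|build) echo MAXN ;;
    idle)                   echo MODE_30W ;;
    background)             echo MODE_15W ;;
    *)                      return 1 ;;
  esac
}

mode_id() {
  # $1 = mode name, $2 = config file (assumed default: /etc/nvpmodel.conf)
  awk -v want="$1" '
    /POWER_MODEL/ {
      id = ""; name = ""
      split($0, part, " ")
      for (k in part) {
        if (part[k] ~ /^ID=/)   { id = substr(part[k], 4) }
        if (part[k] ~ /^NAME=/) { name = substr(part[k], 6) }
      }
      if (name == want) { print id; exit }
    }
  ' "${2:-/etc/nvpmodel.conf}"
}
```

&lt;p&gt;A possible use is &lt;code&gt;sudo nvpmodel -m "$(mode_id "$(mode_for_job idle)")"&lt;/code&gt;, followed by &lt;code&gt;sudo jetson_clocks&lt;/code&gt; only for the jobs that need pinned clocks.&lt;/p&gt;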




&lt;h2&gt;
  
  
  10. Practical Outcomes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;MAXN mode and &lt;code&gt;jetson_clocks&lt;/code&gt; are enabled on the Jetson AGX Orin 64 GB, with GPU, CPU, and EMC frequencies pinned at their maximums for AI workloads. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A systemd service (built-in or custom) ensures &lt;code&gt;jetson_clocks&lt;/code&gt; runs at boot so performance is consistent across reboots.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simple LLM benchmarks confirm real-world throughput improvements (roughly 3× the tokens per second) compared to default power modes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Continuous monitoring with &lt;code&gt;tegrastats&lt;/code&gt; or &lt;code&gt;jtop&lt;/code&gt; provides visibility into temperature, power draw, and potential thermal throttling.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Clear commands exist to switch between high-performance and power-saving modes depending on workload requirements.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  11. Conclusion
&lt;/h2&gt;

&lt;p&gt;Configuring the Jetson AGX Orin 64 GB into MAXN mode with locked clocks is a necessary step to realize the board’s full 275 TOPS potential for LLM inference and other GPU-intensive workloads. The combination of &lt;code&gt;nvpmodel&lt;/code&gt; for power profiles and &lt;code&gt;jetson_clocks&lt;/code&gt; for frequency locking provides deterministic performance while staying within the cooling design limits of the developer kit.&lt;/p&gt;

&lt;p&gt;With the steps in this tutorial, you can reproducibly enable, verify, and persist maximum performance settings, then validate them using practical AI benchmarks and runtime telemetry tools. In a larger workflow, this configuration forms the foundation for subsequent tasks such as creating swap space for very large models and installing build tools for optimized inference frameworks.&lt;/p&gt;

</description>
      <category>nvidia</category>
      <category>jetson</category>
      <category>agx</category>
      <category>ai</category>
    </item>
    <item>
      <title>Exploratory Installation of Unsloth on NVIDIA Jetson AGX Orin 64 GB</title>
      <dc:creator>Sergio Andres Usma</dc:creator>
      <pubDate>Sun, 05 Apr 2026 14:52:21 +0000</pubDate>
      <link>https://forem.com/vonusma/exploratory-installation-of-unsloth-on-nvidia-jetson-agx-orin-64-gb-12pp</link>
      <guid>https://forem.com/vonusma/exploratory-installation-of-unsloth-on-nvidia-jetson-agx-orin-64-gb-12pp</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This report documents an exploratory attempt to install and run Unsloth (including Unsloth Studio) on an NVIDIA Jetson AGX Orin 64 GB using a Docker-based workflow with &lt;code&gt;dustynv/l4t-ml:r36.4.0&lt;/code&gt; as the base image.&lt;br&gt;&lt;br&gt;
The process successfully validated GPU-accelerated PyTorch and Unsloth’s core Python package on Jetson, but exposed substantial friction and incompatibilities in getting Unsloth Studio’s full stack (Studio backend, frontend, Triton/TorchInductor/TorchAo dependencies, and custom virtual environment) to run reliably on this ARM-based edge platform.&lt;br&gt;&lt;br&gt;
The goal of this write-up is to provide a precise technical account so that other practitioners (and the Unsloth team) can (a) reproduce or avoid the same pitfalls, and (b) better assess the current suitability of Unsloth Studio for Jetson-class devices.&lt;/p&gt;


&lt;h2&gt;
  
  
  1. Hardware and Software Environment
&lt;/h2&gt;

&lt;p&gt;The experiments were conducted on the following platform:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Device:&lt;/strong&gt; NVIDIA Jetson AGX Orin Developer Kit (64 GB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OS:&lt;/strong&gt; Ubuntu 22.04.5 LTS, aarch64
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JetPack / L4T:&lt;/strong&gt; JetPack 6.2.2, L4T 36.5.0
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CUDA:&lt;/strong&gt; 12.6 (nvcc 12.6.68)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cuDNN:&lt;/strong&gt; 9.3.0
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TensorRT:&lt;/strong&gt; 10.3.0
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker:&lt;/strong&gt; Engine with NVIDIA Container Runtime enabled (&lt;code&gt;--runtime=nvidia&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Base ML image:&lt;/strong&gt; &lt;code&gt;dustynv/l4t-ml:r36.4.0&lt;/code&gt; (from Jetson Containers), which provides:

&lt;ul&gt;
&lt;li&gt;PyTorch compiled for Jetson (aarch64) with CUDA and TensorRT integration
&lt;/li&gt;
&lt;li&gt;JupyterLab and common ML tooling
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Host-side persistent storage for this project was centralized under:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/unsloth/
  build/      # Dockerfile and build context
  work/       # notebooks, datasets, outputs
  cache/      # general cache inside the container
  hf/         # Hugging Face cache
  jupyter/    # Jupyter config
  ssh/        # SSH keys/config (optional)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This layout was bind-mounted into the container to ensure persistence across container rebuilds.&lt;/p&gt;
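
&lt;p&gt;The directory tree above can be created in one step before the first container start. This is a minimal sketch: &lt;code&gt;make_layout&lt;/code&gt; is a hypothetical helper, and the restrictive permission on the &lt;code&gt;ssh&lt;/code&gt; directory is an added precaution rather than something the original setup documents.&lt;/p&gt;

```shell
# Sketch: create the persistent host directories used by the container.
# make_layout is a hypothetical helper; $1 is the project root (the article uses ~/unsloth).
make_layout() {
  root="$1"
  for d in build work cache hf jupyter ssh; do
    mkdir -p "$root/$d" || return 1
  done
  # Assumption: keep SSH material private to the owning user.
  chmod 700 "$root/ssh"
}
```

&lt;p&gt;Running &lt;code&gt;make_layout "$HOME/unsloth"&lt;/code&gt; is idempotent, so it is safe to repeat before each rebuild.&lt;/p&gt;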




&lt;h2&gt;
  
  
  2. Docker Image Construction
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Base Dockerfile
&lt;/h3&gt;

&lt;p&gt;The starting point was a custom image layered on top of &lt;code&gt;dustynv/l4t-ml:r36.4.0&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; dustynv/l4t-ml:r36.4.0&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; DEBIAN_FRONTEND=noninteractive \&lt;/span&gt;
    PIP_NO_CACHE_DIR=1 \
    PYTHONUNBUFFERED=1 \
    SHELL=/bin/bash \
    JUPYTER_PORT=8888 \
    STUDIO_PORT=8000 \
    WORKSPACE=/workspace \
    HF_HOME=/workspace/.cache/huggingface \
    TRANSFORMERS_CACHE=/workspace/.cache/huggingface \
    HUGGINGFACE_HUB_CACHE=/workspace/.cache/huggingface

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; root&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    curl git wget ca-certificates build-essential pkg-config &lt;span class="se"&gt;\
&lt;/span&gt;    python3-pip python3-dev python3-venv &lt;span class="se"&gt;\
&lt;/span&gt;    openssh-server &lt;span class="nb"&gt;sudo &lt;/span&gt;nano htop tmux &lt;span class="se"&gt;\
&lt;/span&gt;    libopenblas-dev libssl-dev libffi-dev &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /var/run/sshd /workspace/work /workspace/.cache/huggingface /root/.jupyter

&lt;span class="k"&gt;RUN &lt;/span&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; pip setuptools wheel

&lt;span class="c"&gt;# Remove Jetson-specific custom pip indexes to avoid transient outages&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip config &lt;span class="nb"&gt;unset &lt;/span&gt;global.index-url &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip config &lt;span class="nb"&gt;unset &lt;/span&gt;global.extra-index-url &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Generic Python dependencies via PyPI&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nv"&gt;PIP_INDEX_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://pypi.org/simple python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    fastapi &lt;span class="s2"&gt;"uvicorn[standard]"&lt;/span&gt; gradio &lt;span class="se"&gt;\
&lt;/span&gt;    accelerate transformers peft trl datasets sentencepiece protobuf safetensors &lt;span class="se"&gt;\
&lt;/span&gt;    huggingface_hub

&lt;span class="c"&gt;# Install Unsloth (core + zoo) from GitHub/PyPI&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nv"&gt;PIP_INDEX_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://pypi.org/simple python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="s2"&gt;"unsloth @ git+https://github.com/unslothai/unsloth.git"&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="s2"&gt;"unsloth-zoo @ git+https://github.com/unslothai/unsloth-zoo.git"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Optionally attempt bitsandbytes (may be fragile on Jetson)&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nv"&gt;PIP_INDEX_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://pypi.org/simple python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pip &lt;span class="nb"&gt;install &lt;/span&gt;bitsandbytes &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /workspace&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8000 8888 22&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["/bin/bash"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key design choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reuse NVIDIA’s &lt;code&gt;l4t-ml&lt;/code&gt; stack instead of installing PyTorch/TensorRT manually, since it is tuned for Jetson.&lt;/li&gt;
&lt;li&gt;Explicitly unset custom Jetson pip indexes before installing Unsloth, to avoid failures due to unavailable Jetson-specific mirrors while installing generic packages (e.g. &lt;code&gt;fastapi&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Install Unsloth via GitHub (or PyPI) rather than using the x86-oriented Docker image &lt;code&gt;unsloth/unsloth&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The image was built with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/unsloth/build
&lt;span class="nb"&gt;sudo &lt;/span&gt;docker build &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="nb"&gt;local&lt;/span&gt;/unsloth-studio:jetson-l4tml-r36.4.0 &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. Container Runtime and GPU Validation
&lt;/h2&gt;

&lt;p&gt;A persistent container was created with host networking and bind mounts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; unsloth-studio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--restart&lt;/span&gt; unless-stopped &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shm-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;16g &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;HF_HOME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/workspace/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;TRANSFORMERS_CACHE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/workspace/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;HUGGINGFACE_HUB_CACHE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/workspace/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/unsloth/work:/workspace/work &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/unsloth/cache:/workspace/.cache &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/unsloth/hf:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/unsloth/jupyter:/root/.jupyter &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; ~/unsloth/ssh:/root/.ssh &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nb"&gt;local&lt;/span&gt;/unsloth-studio:jetson-l4tml-r36.4.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /dev/null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the container, GPU support was verified with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import torch;
print(torch.__version__);
print(torch.cuda.is_available());
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'no cuda')"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This confirmed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;torch&lt;/code&gt; version 2.6.0 (from the &lt;code&gt;l4t-ml&lt;/code&gt; stack),&lt;/li&gt;
&lt;li&gt;CUDA available,&lt;/li&gt;
&lt;li&gt;device name reported as “Orin”.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thus, the base ML environment inside the container was correctly accelerated on Jetson.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Installing Unsloth Core
&lt;/h2&gt;

&lt;p&gt;Within the container, the installation was checked with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import unsloth; print('unsloth ok')"&lt;/span&gt;
unsloth &lt;span class="nt"&gt;--help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI output showed the main Unsloth commands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;train&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;inference&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;export&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;list-checkpoints&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;studio&lt;/code&gt; (subcommand group)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, importing Unsloth triggered warnings and a stack trace related to Triton, TorchInductor, and TorchAo:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ImportError: cannot import name 'AttrsDescriptor' from triton.compiler.compiler&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Errors inside &lt;code&gt;torch._inductor.runtime.hints&lt;/code&gt; and &lt;code&gt;torchao.quantization&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This indicates that parts of the current Unsloth stack assume a Triton/TorchInductor/TorchAo configuration aligned with x86_64 desktop/server builds of PyTorch, which is not trivially compatible with the Jetson-specific PyTorch build shipping in &lt;code&gt;l4t-ml&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Despite these warnings, the CLI remained usable for basic commands, and GPU acceleration for standard PyTorch operations was intact.&lt;/p&gt;
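
&lt;p&gt;The specific import that fails can be probed in isolation, which helps tell a Triton version mismatch apart from other import problems. This is a hedged sketch: &lt;code&gt;probe_attrs_descriptor&lt;/code&gt; is a hypothetical helper that only tests the import named in the error message, and it reports &lt;code&gt;missing&lt;/code&gt; for any failure, including an absent interpreter or an absent &lt;code&gt;triton&lt;/code&gt; package.&lt;/p&gt;

```shell
# Sketch: test whether the import named in the error message resolves.
# probe_attrs_descriptor is a hypothetical helper; $1 is the interpreter (default python3).
probe_attrs_descriptor() {
  if "${1:-python3}" -c 'from triton.compiler.compiler import AttrsDescriptor' 2>/dev/null; then
    echo "present"
  else
    echo "missing"
  fi
}
```

&lt;p&gt;Running it once with the container’s system interpreter and once with a venv interpreter shows whether both environments share the same Triton layout.&lt;/p&gt;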




&lt;h2&gt;
  
  
  5. Attempting to Enable Unsloth Studio
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 CLI-Level Status
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;unsloth studio&lt;/code&gt; subcommand was present:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;unsloth studio &lt;span class="nt"&gt;--help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The help output listed options such as &lt;code&gt;--host&lt;/code&gt;, &lt;code&gt;--port&lt;/code&gt;, &lt;code&gt;--frontend&lt;/code&gt;, and the following subcommands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;stop&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;update&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;reset-password&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Attempting to start Studio directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;unsloth studio &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;returned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Studio not set up. Run install.sh first.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This implies that Studio expects an auxiliary installation step that sets up its environment (frontend, backend, and venv).&lt;/p&gt;

&lt;h3&gt;
  
  
  5.2 Running &lt;code&gt;unsloth studio setup&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Unsloth documentation describes a developer mode where Studio is installed via &lt;code&gt;uv&lt;/code&gt; and a dedicated virtual environment.&lt;br&gt;&lt;br&gt;
Following this pattern, the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;unsloth studio setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;produced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Successful installation of &lt;code&gt;nvm&lt;/code&gt;, Node LTS, and &lt;code&gt;bun&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Successful build of the frontend (“frontend built”)&lt;/li&gt;
&lt;li&gt;But then:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python venv not found at /root/.unsloth/studio/unsloth_studio
Run install.sh first to create the environment:
  curl -fsSL https://unsloth.ai/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thus, the CLI expects a virtual environment under &lt;code&gt;/root/.unsloth/studio/unsloth_studio&lt;/code&gt; that appears to be normally created by the official &lt;code&gt;install.sh&lt;/code&gt; script.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.3 Manual Creation of the Studio Virtual Environment
&lt;/h3&gt;

&lt;p&gt;Rather than relying on &lt;code&gt;install.sh&lt;/code&gt; (which is tuned for other platforms and may interfere with the Jetson-specific PyTorch/Triton stack), a manual venv was created:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /root/.unsloth/studio
uv venv unsloth_studio &lt;span class="nt"&gt;--python&lt;/span&gt; 3.10
&lt;span class="nb"&gt;source&lt;/span&gt; /root/.unsloth/studio/unsloth_studio/bin/activate

uv pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--index-url&lt;/span&gt; https://pypi.org/simple unsloth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This installed Unsloth (and a complete stack of dependencies) into the venv, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;torch&lt;/code&gt;, &lt;code&gt;torchao&lt;/code&gt;, &lt;code&gt;triton&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;transformers&lt;/code&gt;, &lt;code&gt;accelerate&lt;/code&gt;, &lt;code&gt;peft&lt;/code&gt;, &lt;code&gt;trl&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;bitsandbytes&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;unsloth&lt;/code&gt;, &lt;code&gt;unsloth-zoo&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within the venv, &lt;code&gt;unsloth studio -H 0.0.0.0 -p 8000&lt;/code&gt; still failed due to missing backend dependencies (&lt;code&gt;structlog&lt;/code&gt;), which were then installed.&lt;br&gt;&lt;br&gt;
However, repeated attempts to start Studio continued to reveal issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ModuleNotFoundError: No module named 'structlog'&lt;/code&gt; (due to pip confusion between global and venv environments)&lt;/li&gt;
&lt;li&gt;Friction in adding pip to the venv (pip not present or not found via &lt;code&gt;python -m pip&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;A recurring tension between the &lt;code&gt;uv&lt;/code&gt;-managed environment and the classical &lt;code&gt;pip&lt;/code&gt; expectations coming from Studio’s backend modules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ultimately, even after installing the necessary Python packages, the CLI still treated Studio as “not set up” and insisted on running the official &lt;code&gt;install.sh&lt;/code&gt; script.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Failure Modes and Root Causes
&lt;/h2&gt;

&lt;p&gt;The main failure modes observed were:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Triton / TorchInductor / TorchAo incompatibilities&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Errors when importing Unsloth related to &lt;code&gt;AttrsDescriptor&lt;/code&gt; in Triton and TorchInductor.
&lt;/li&gt;
&lt;li&gt;These components are not officially supported or tuned for the Jetson-specific PyTorch build, causing runtime import and registration issues.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Studio’s tight coupling to its own venv and installer&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Studio expects a very particular environment layout under &lt;code&gt;~/.unsloth/studio/unsloth_studio&lt;/code&gt; created by &lt;code&gt;install.sh&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;Deviating from the installer (e.g., manual or &lt;code&gt;uv&lt;/code&gt;-only installation) leads to missing venv markers, which the CLI interprets as “Studio not set up.”&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tooling friction on Jetson (uv + venv + pip)&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The combination of &lt;code&gt;uv&lt;/code&gt;-managed environments with a system Python and Docker base image that already has a global &lt;code&gt;pip&lt;/code&gt; led to situations where:

&lt;ul&gt;
&lt;li&gt;The venv had no &lt;code&gt;pip&lt;/code&gt; initially.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;python -m ensurepip&lt;/code&gt; installed pip globally rather than into the venv.&lt;/li&gt;
&lt;li&gt;The actual &lt;code&gt;pip&lt;/code&gt; used to install backend dependencies was the global one, leaving the venv incomplete.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mismatch with Jetson Containers philosophy&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Jetson Containers and &lt;code&gt;l4t-ml&lt;/code&gt; are built around NVIDIA’s optimized PyTorch/TensorRT stacks, while Unsloth Studio’s modern pipeline assumes desktop/server-class Triton and TorchInductor configurations.
&lt;/li&gt;
&lt;li&gt;This leads to a mismatch that is non-trivial to reconcile in a maintainable way.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
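
&lt;p&gt;The third failure mode, where a global &lt;code&gt;pip&lt;/code&gt; silently installs outside the venv, can be caught with a path check before installing anything. The sketch below is illustrative only: &lt;code&gt;belongs_to_venv&lt;/code&gt; is a hypothetical helper that compares an executable path against the venv’s &lt;code&gt;bin&lt;/code&gt; directory.&lt;/p&gt;

```shell
# Sketch: check whether an executable lives inside a given virtual environment.
# belongs_to_venv is a hypothetical helper; $1 = venv directory, $2 = executable path.
belongs_to_venv() {
  case "$2" in
    "$1"/bin/*) echo "venv" ;;
    *)          echo "outside" ;;
  esac
}
```

&lt;p&gt;For example, &lt;code&gt;belongs_to_venv /root/.unsloth/studio/unsloth_studio "$(command -v pip)"&lt;/code&gt; printing &lt;code&gt;outside&lt;/code&gt; indicates that the active &lt;code&gt;pip&lt;/code&gt; would install into the global environment rather than the Studio venv.&lt;/p&gt;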




&lt;h2&gt;
  
  
  7. Practical Outcomes
&lt;/h2&gt;

&lt;p&gt;Despite the failure to get &lt;strong&gt;Unsloth Studio&lt;/strong&gt; fully operational, the following outcomes were achieved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A validated &lt;strong&gt;GPU-accelerated Unsloth core&lt;/strong&gt; environment on Jetson:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;unsloth&lt;/code&gt; CLI installed and usable.&lt;/li&gt;
&lt;li&gt;PyTorch 2.6.0 with CUDA on Orin working correctly.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;A reusable &lt;strong&gt;Docker-based ML devbox&lt;/strong&gt; (&lt;code&gt;local/unsloth-studio:jetson-l4tml-r36.4.0&lt;/code&gt;) with:

&lt;ul&gt;
&lt;li&gt;A clear persistent directory layout (&lt;code&gt;~/unsloth&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Host networking and shared volumes suitable for integration with other Jetson Containers (e.g., llama.cpp, vLLM, NanoLLM, llama-factory).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Empirical evidence that, as of this experiment, &lt;strong&gt;Unsloth Studio is not yet a drop-in web UI solution for Jetson AGX Orin&lt;/strong&gt;, due to:

&lt;ul&gt;
&lt;li&gt;Triton/TorchInductor/TorchAo assumptions, and&lt;/li&gt;
&lt;li&gt;Strong coupling to the &lt;code&gt;install.sh&lt;/code&gt;-managed environment.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  8. Recommendations for Jetson Practitioners
&lt;/h2&gt;

&lt;p&gt;For current Jetson AGX Orin users:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use Unsloth core selectively&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unsloth’s Python API and CLI can still be valuable for fine-tuning/export workflows that do not rely heavily on Triton/TorchInductor-specific optimizations.
&lt;/li&gt;
&lt;li&gt;Prefer using the Jetson-optimized PyTorch from &lt;code&gt;l4t-ml&lt;/code&gt; and be cautious with features that depend on TorchInductor/Triton.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rely on Jetson Containers for serving and fine-tuning&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For serving and fine-tuning large models on Jetson, the containers in the Jetson Containers ecosystem (llama.cpp, vLLM, MLC, TensorRT-LLM, NanoLLM, llama-factory) are more mature than Unsloth Studio on this platform and better integrated with JetPack and L4T.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Treat Unsloth Studio on Jetson as experimental&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Until there is first-class ARM/Jetson support (or a documented variant of &lt;code&gt;install.sh&lt;/code&gt; and Studio’s backend explicitly targeting Jetson), Studio should be considered an experimental integration on this hardware.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
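
&lt;p&gt;Before enabling features that depend on TorchInductor/Triton, it can help to check which of these packages the container actually exposes. The following is a minimal sketch using only the Python standard library. It assumes the usual import names &lt;code&gt;torch&lt;/code&gt;, &lt;code&gt;triton&lt;/code&gt;, and &lt;code&gt;torchao&lt;/code&gt;; the helper &lt;code&gt;probe_stack&lt;/code&gt; is hypothetical and not part of Unsloth or Jetson Containers.&lt;/p&gt;

```python
import importlib.util
import platform

# Usual import names for the packages discussed above (an assumption about
# how they are installed in your container).
CANDIDATES = ("torch", "triton", "torchao")


def probe_stack(names=CANDIDATES):
    """Return {module_name: True/False} without importing the modules."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A broken or half-installed package counts as unavailable.
            found[name] = False
    return found


if __name__ == "__main__":
    print("architecture:", platform.machine())
    for name, present in probe_stack().items():
        print(f"{name:10s}", "present" if present else "missing")
```

&lt;p&gt;Run inside the devbox container, this only reports whether each package can be located. It does not prove that Triton kernels compile or run on Orin, so a "present" result should be treated as a starting point for further testing.&lt;/p&gt;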




&lt;h2&gt;
  
  
  9. Suggestions for the Unsloth Team
&lt;/h2&gt;

&lt;p&gt;Based on this experience, the following changes would materially improve the viability of Unsloth Studio on Jetson and similar edge platforms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Documented “headless / no-Triton” mode&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A configuration profile that can disable or bypass TorchInductor/Triton/TorchAo, relying purely on standard PyTorch kernels when running on unsupported architectures such as Jetson.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Explicit ARM/Jetson support statement and checks&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clear statements in the documentation regarding ARM/aarch64 support status, with runtime checks that either:

&lt;ul&gt;
&lt;li&gt;Enable a safe, reduced feature set, or&lt;/li&gt;
&lt;li&gt;Fail fast with a clear, actionable message.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Studio installation mode for preexisting Python stacks&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A variant of &lt;code&gt;install.sh&lt;/code&gt; or &lt;code&gt;studio setup&lt;/code&gt; that:

&lt;ul&gt;
&lt;li&gt;Can attach to an existing PyTorch environment (e.g., Jetson’s &lt;code&gt;l4t-ml&lt;/code&gt;), and&lt;/li&gt;
&lt;li&gt;Creates only the additional Studio-specific venv/backend/frontend without attempting to reconfigure PyTorch or Triton.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Minimal dependency profile for Studio backend&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A smaller “core backend” dependency set for Studio that avoids complex quantization stacks and heavy compiler integrations when running in constrained or embedded environments.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
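
&lt;p&gt;The runtime check suggested in item 2 could look roughly like the sketch below. It is an illustration only, not Unsloth code: the profile names, the &lt;code&gt;choose_profile&lt;/code&gt; function, the &lt;code&gt;allow_reduced&lt;/code&gt; parameter, and the assumption that x86_64 maps to the full feature set are all hypothetical.&lt;/p&gt;

```python
import platform

# Hypothetical profile names, used only for this illustration.
FULL = "full"
REDUCED = "reduced"


class UnsupportedPlatformError(RuntimeError):
    """Raised when the platform is unsupported and no fallback is allowed."""


def choose_profile(machine=None, allow_reduced=True):
    """Pick a feature profile from the CPU architecture string."""
    machine = (machine or platform.machine()).lower()
    if machine in ("x86_64", "amd64"):
        # Assumption: the full compiler-based stack is available here.
        return FULL
    if machine in ("aarch64", "arm64"):
        if allow_reduced:
            # Safe, reduced feature set: standard PyTorch kernels only.
            return REDUCED
        # Fail fast with a clear, actionable message.
        raise UnsupportedPlatformError(
            f"{machine} is not supported with compiler-based optimizations; "
            "rerun with allow_reduced=True to use standard PyTorch kernels."
        )
    raise UnsupportedPlatformError(f"unrecognized architecture: {machine}")


if __name__ == "__main__":
    print("selected profile:", choose_profile())
```

&lt;p&gt;Keeping the decision in one function means the installer and the Studio backend could share the same check, so an unsupported platform is reported once at startup rather than surfacing later as a compiler error.&lt;/p&gt;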




&lt;h2&gt;
  
  
  10. Conclusion
&lt;/h2&gt;

&lt;p&gt;The experiment demonstrates that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installing &lt;strong&gt;Unsloth core&lt;/strong&gt; on Jetson AGX Orin via a Dockerized &lt;code&gt;l4t-ml&lt;/code&gt; base image is feasible, and the resulting environment is usable for GPU-accelerated LLM workflows.&lt;/li&gt;
&lt;li&gt;Enabling &lt;strong&gt;Unsloth Studio&lt;/strong&gt; (the full web UI for training and serving) on Jetson currently runs into significant hurdles caused by the interaction between Triton/TorchInductor, TorchAo, &lt;code&gt;uv&lt;/code&gt;-managed venvs, and the assumptions baked into &lt;code&gt;install.sh&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From a practical standpoint, Jetson users are better served today by combining Unsloth core (where useful) with the existing Jetson Containers ecosystem, while treating Unsloth Studio as an experimental component on this hardware.&lt;br&gt;&lt;br&gt;
From a community and engineering perspective, this experiment identifies concrete areas where incremental changes and documentation from the Unsloth team could make Studio practical to deploy on Jetson-class devices.&lt;/p&gt;

</description>
      <category>nvidia</category>
      <category>jetson</category>
      <category>unsloth</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
