<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Luke Hinds</title>
    <description>The latest articles on Forem by Luke Hinds (@lukehinds).</description>
    <link>https://forem.com/lukehinds</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F978931%2F0fa78d0e-564a-46f3-98ff-2b76b40859d3.jpeg</url>
      <title>Forem: Luke Hinds</title>
      <link>https://forem.com/lukehinds</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lukehinds"/>
    <language>en</language>
    <item>
      <title>nono : the ultimate coding agent security tool - sandbox and supercharge Claude Code in just two simple steps!</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Tue, 10 Mar 2026 17:03:08 +0000</pubDate>
      <link>https://forem.com/lukehinds/nono-the-ultimate-coding-agent-security-tool-sandbox-and-supercharge-claude-code-in-just-two-87l</link>
      <guid>https://forem.com/lukehinds/nono-the-ultimate-coding-agent-security-tool-sandbox-and-supercharge-claude-code-in-just-two-87l</guid>
      <description>&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/d6Y8S-tzJ0U"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;Claude Code is one of the most capable coding agents available today. It can navigate your codebase, edit files across multiple directories, run shell commands, install packages, and execute tests — all autonomously.&lt;/p&gt;

&lt;p&gt;It also runs with your full user permissions. Every file on your system is readable. Every credential in your environment is accessible. Every command it runs executes as you.&lt;/p&gt;

&lt;p&gt;We've all seen the stories. Claude deleted my database. Claude deleted all my code. Claude wiped my home directory. These aren't hypothetical risks — they're things that happen when an agent operates with no isolation between what it wants to do and what it can do.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permission prompts don't scale
&lt;/h2&gt;

&lt;p&gt;The default response to this problem has been permission dialogs. The agent wants to run a command — do you approve? The agent wants to write a file — do you approve?&lt;/p&gt;

&lt;p&gt;The issue is predictable. After the third prompt, you're clicking "allow" reflexively. Approval fatigue turns a security feature into a rubber stamp. You end up with interrupted flow &lt;em&gt;and&lt;/em&gt; reduced security.&lt;/p&gt;

&lt;p&gt;Claude Code now ships with a built-in sandbox mode, which is a meaningful step forward. It uses OS-level primitives to enforce filesystem and network boundaries, reducing the need for constant approval prompts.&lt;/p&gt;

&lt;p&gt;But it includes an escape hatch. When a command fails due to sandbox restrictions, Claude can retry it outside the sandbox with the &lt;code&gt;dangerouslyDisableSandbox&lt;/code&gt; parameter. The name is honest — it is dangerous. One retry outside the sandbox and the agent has full access again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kernel-level enforcement with no escape
&lt;/h2&gt;

&lt;p&gt;nono takes a different approach. It uses Landlock (Linux) and Seatbelt (macOS) to enforce restrictions at the kernel level. Once applied, these restrictions are irreversible for the process. There is no API to widen them. There is no escape hatch. Not even nono itself can remove the restrictions after they're set.&lt;/p&gt;

&lt;p&gt;This is what we mean by "structurally impossible." The kernel denies unauthorised operations directly. There's no interception layer, no policy filter, no mechanism to bypass. If the agent tries to read your SSH keys or delete files outside its granted path, the operation fails at the syscall level.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;Install nono:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;nono
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Launch Claude Code inside a sandbox using the built-in profile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--allow-cwd&lt;/span&gt; &lt;span class="nt"&gt;--profile&lt;/span&gt; claude-code &lt;span class="nt"&gt;--&lt;/span&gt; claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Claude Code now runs with default-deny filesystem access. Only the current directory is writable. SSH keys, AWS credentials, system configs — all blocked at the kernel level.&lt;/p&gt;

&lt;p&gt;If you want to run silently without the nono banner:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--allow-cwd&lt;/span&gt; &lt;span class="nt"&gt;--silent&lt;/span&gt; &lt;span class="nt"&gt;--profile&lt;/span&gt; claude-code &lt;span class="nt"&gt;--&lt;/span&gt; claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For more granular control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Read-only access to reference docs, write access to project&lt;/span&gt;
nono run &lt;span class="nt"&gt;--read&lt;/span&gt; ~/docs &lt;span class="nt"&gt;--write&lt;/span&gt; ./src &lt;span class="nt"&gt;--profile&lt;/span&gt; claude-code &lt;span class="nt"&gt;--&lt;/span&gt; claude

&lt;span class="c"&gt;# Check why a specific path would be blocked&lt;/span&gt;
nono why &lt;span class="nt"&gt;--path&lt;/span&gt; ~/.ssh/id_rsa &lt;span class="nt"&gt;--op&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;nono is free, open source, and available today. We're actively developing fine-grained network controls, atomic rollbacks, and library bindings for Python, TypeScript, and Rust so other tools can integrate nono's security primitives directly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://nono.sh/docs" rel="noopener noreferrer"&gt;nono.sh/docs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/always-further/nono" rel="noopener noreferrer"&gt;github.com/always-further/nono&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discord&lt;/strong&gt;: &lt;a href="https://discord.gg/pPcjYzGvbS" rel="noopener noreferrer"&gt;Join the community&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't yolo when you can nono.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>security</category>
      <category>terminal</category>
    </item>
    <item>
      <title>Introducing nono: A Secure Sandbox for AI Agents</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Mon, 02 Feb 2026 23:52:43 +0000</pubDate>
      <link>https://forem.com/lukehinds/introducing-nono-a-secure-sandbox-for-ai-agents-1lo2</link>
      <guid>https://forem.com/lukehinds/introducing-nono-a-secure-sandbox-for-ai-agents-1lo2</guid>
      <description>&lt;p&gt;AI coding agents like Claude Code, OpenCode, and others are incredibly powerful — they can write code, refactor entire codebases, and automate tedious tasks. But there's a catch: they run with &lt;em&gt;your&lt;/em&gt; permissions. That means they can read your SSH keys, access your AWS credentials, and potentially exfiltrate data to remote servers. We have seen this play out the past few days with the attacks against OpenClaw servers.&lt;/p&gt;

&lt;p&gt;Current solutions often rely on the agent to police itself. But bugs happen. Security vulnerabilities are common. What if there was a way to make unauthorized operations &lt;em&gt;structurally impossible&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;nono&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is nono?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/lukehinds/nono" rel="noopener noreferrer"&gt;nono&lt;/a&gt; is a capability-based security shell that leverages kernel-level primitives to sandbox AI agents and untrusted processes. It uses &lt;strong&gt;Landlock&lt;/strong&gt; on Linux and &lt;strong&gt;Seatbelt&lt;/strong&gt; on macOS to create an environment where once restrictions are applied, there's no API to escape them — not even for nono itself.&lt;/p&gt;

&lt;p&gt;The name says it all: "no" to unauthorized filesystem access, "no" to secret exfiltration, "no" to destructive commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Just Use Docker?
&lt;/h2&gt;

&lt;p&gt;Great question. Containers are fantastic for many use cases, but they come with overhead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heavyweight&lt;/strong&gt;: Containers require images, layers, and a runtime&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: Managing volumes, networking, and permissions adds friction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overkill&lt;/strong&gt;: Sometimes you just want to run a single command safely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;nono is lightweight by design. It applies kernel-enforced restrictions directly to a process without spinning up containers, VMs, or additional infrastructure. You get security without the weight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Install via Homebrew (macOS):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew tap lukehinds/nono
brew &lt;span class="nb"&gt;install &lt;/span&gt;nono
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Or build from source:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/lukehinds/nono.git
&lt;span class="nb"&gt;cd &lt;/span&gt;nono
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Run your first sandboxed command:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Give Claude Code read/write access only to the current directory&lt;/span&gt;
nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; claude

&lt;span class="c"&gt;# Separate read and write permissions&lt;/span&gt;
nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; ./project-dir &lt;span class="nt"&gt;--write&lt;/span&gt; ./output claude

&lt;span class="c"&gt;# Block network access entirely&lt;/span&gt;
nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--net-block&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; my-agent

&lt;span class="c"&gt;# Preview what would happen (dry run)&lt;/span&gt;
nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--dry-run&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; my-command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  No Escape Hatch
&lt;/h3&gt;

&lt;p&gt;Once nono applies the sandbox, there is no mechanism to bypass it. The kernel denies unauthorized operations — this isn't policy-based filtering that can be circumvented; it's structural enforcement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent Agnostic
&lt;/h3&gt;

&lt;p&gt;nono works with any CLI tool or AI agent: Claude Code, GPT wrappers, custom scripts, build tools — if it runs in a terminal, nono can sandbox it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defense in Depth
&lt;/h3&gt;

&lt;p&gt;nono provides multiple layers of protection:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Protection&lt;/th&gt;
&lt;th&gt;Can Be Overridden?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Command blocklist&lt;/td&gt;
&lt;td&gt;Blocks dangerous binaries (rm, dd, chmod, sudo)&lt;/td&gt;
&lt;td&gt;Yes, with &lt;code&gt;--allow-command&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel (delete)&lt;/td&gt;
&lt;td&gt;Blocks unlink/rmdir syscalls&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel (truncate)&lt;/td&gt;
&lt;td&gt;Prevents zeroing out files&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filesystem sandbox&lt;/td&gt;
&lt;td&gt;Restricts path access&lt;/td&gt;
&lt;td&gt;Only via explicit &lt;code&gt;--allow&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network sandbox&lt;/td&gt;
&lt;td&gt;Blocks network access&lt;/td&gt;
&lt;td&gt;Only by removing &lt;code&gt;--net-block&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Sensitive Paths Protected by Default
&lt;/h3&gt;

&lt;p&gt;SSH keys, AWS credentials, shell configs, and other sensitive paths are blocked by default. Your secrets stay secret.&lt;/p&gt;

&lt;h3&gt;
  
  
  Process Inheritance
&lt;/h3&gt;

&lt;p&gt;When a sandboxed process spawns children, those children inherit all restrictions. There's no privilege escalation through subprocess tricks.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────┐
│  Terminal                                       │
│                                                 │
│  $ nono run --allow ./project -- claude         │
│                                                 │
│  ┌───────────────────────────────────────────┐  │
│  │  nono (applies sandbox, then exec)        │  │
│  │                                           │  │
│  │  ┌─────────────────────────────────────┐  │  │
│  │  │  Claude Code (sandboxed)            │  │  │
│  │  │  ✓ Can read/write ./project         │  │  │
│  │  │  ✗ Cannot access ~/.ssh, ~/.aws     │  │  │
│  │  │  ✗ Cannot delete files              │  │  │
│  │  └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Command validation&lt;/strong&gt; — Dangerous commands are blocked before execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sandbox applied&lt;/strong&gt; — OS-level restrictions are set (irreversibly)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kernel enforcement&lt;/strong&gt; — Even allowed paths can't have files deleted&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process execution&lt;/strong&gt; — Your command runs with only granted capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Child inheritance&lt;/strong&gt; — All subprocesses run under the same restrictions&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Platform Support
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Kernel Requirement&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;Landlock LSM&lt;/td&gt;
&lt;td&gt;5.13+&lt;/td&gt;
&lt;td&gt;✅ Filesystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;Landlock LSM&lt;/td&gt;
&lt;td&gt;6.7+&lt;/td&gt;
&lt;td&gt;✅ Filesystem + Network&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Seatbelt&lt;/td&gt;
&lt;td&gt;10.5+&lt;/td&gt;
&lt;td&gt;✅ Filesystem + Network&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;🚧 Not yet supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Debugging: The &lt;code&gt;why&lt;/code&gt; Command
&lt;/h2&gt;

&lt;p&gt;Ever wonder why a path is blocked? nono can tell you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;nono why ~/.ssh/id_rsa
Path: /Users/you/.ssh/id_rsa
Status: BLOCKED
Reason: Sensitive path - SSH private keys are protected by default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes it easy to understand and debug sandbox behavior without trial and error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Running AI coding agents safely:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; ./my-project &lt;span class="nt"&gt;--&lt;/span&gt; claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Building code with restricted write access:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--read&lt;/span&gt; ./src &lt;span class="nt"&gt;--write&lt;/span&gt; ./target &lt;span class="nt"&gt;--&lt;/span&gt; cargo build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Running tests without network access:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--allow&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--net-block&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; npm &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Processing files without risk of deletion:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nono run &lt;span class="nt"&gt;--read&lt;/span&gt; ./input &lt;span class="nt"&gt;--write&lt;/span&gt; ./output &lt;span class="nt"&gt;--&lt;/span&gt; python process.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Get Involved
&lt;/h2&gt;

&lt;p&gt;nono is open source under the Apache 2.0 license. The project is still early (security auditing is ongoing), but it's ready for experimentation and feedback.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/lukehinds/nono" rel="noopener noreferrer"&gt;github.com/lukehinds/nono&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="https://docs.nono.sh" rel="noopener noreferrer"&gt;docs.nono.sh&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;As AI agents become more powerful and autonomous, the security model of "trust the agent to behave" becomes increasingly untenable. nono offers a different approach: make bad behavior impossible at the kernel level.&lt;/p&gt;

&lt;p&gt;Give it a try, break things (safely), and let us know what you think.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;nono is created by &lt;a href="https://github.com/lukehinds" rel="noopener noreferrer"&gt;Luke Hinds&lt;/a&gt;. Star the repo if you find it useful!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>claude</category>
      <category>openclaw</category>
      <category>security</category>
    </item>
    <item>
      <title>🚀🔥 Train a 4B Model to Beat Claude Sonnet 4.5 🤖 and Gemini Pro 2.5 ✨ at Tool Calling 🛠️ - and for all Free! 💸🎉</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Thu, 25 Dec 2025 20:07:07 +0000</pubDate>
      <link>https://forem.com/lukehinds/train-a-4b-model-to-beat-claude-sonnet-45-and-gemini-pro-25-at-tool-calling-and-for-3a7m</link>
      <guid>https://forem.com/lukehinds/train-a-4b-model-to-beat-claude-sonnet-45-and-gemini-pro-25-at-tool-calling-and-for-3a7m</guid>
      <description>&lt;p&gt;&lt;em&gt;How we used synthetic data to fine-tune Qwen3-4B into a specialist that outperforms frontier models — and how you can do it too (for free)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;There's a common assumption in the AI community that bigger is always better. That if you want state-of-the-art performance, you need to use the largest, most expensive models from frontier labs like Anthropic, OpenAI, or Google.&lt;/p&gt;

&lt;p&gt;We're here to challenge that assumption.&lt;/p&gt;

&lt;p&gt;Using &lt;a href="https://github.com/always-further/deepfabric" rel="noopener noreferrer"&gt;DeepFabric&lt;/a&gt;, an open-source tool, we trained Qwen3-4B — a model with just 4 billion parameters — to outperform Claude Sonnet 4.5 and Google Gemini Pro 2.5 at tool calling tasks. And we're sharing exactly how we did it so you can replicate our results.&lt;/p&gt;

&lt;p&gt;The secret? &lt;strong&gt;Specialization beats generalization&lt;/strong&gt;. Frontier models are generalists — they're designed to handle everything from poetry to protein folding. But if you need a model that excels at one specific task, a small, focused model trained on high-quality synthetic data can beat the giants.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with Generalist Models
&lt;/h2&gt;

&lt;p&gt;Frontier models like Claude Sonnet 4.5 and Gemini Pro 2.5 are remarkable achievements. They can write code, analyze documents, engage in nuanced conversations, and much more. But this versatility comes at a cost.&lt;/p&gt;

&lt;p&gt;When you're building an agentic system — one where the model needs to interact with tools, APIs, or MCP servers — you don't need a model that can do everything. You need a model that can do &lt;em&gt;one thing exceptionally well&lt;/em&gt;: understand when to call a tool, which tool to call, and what parameters to pass within the tools schema.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpohv0ethe7s897wopo62.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpohv0ethe7s897wopo62.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where the generalist approach shows its weaknesses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inconsistent tool calling:&lt;/strong&gt; Large models sometimes hallucinate tool names or parameters that don't exist&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema violations:&lt;/strong&gt; They might return responses that don't match the expected JSON schema&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning gaps:&lt;/strong&gt; When faced with complex multi-step tool interactions, they can lose track of state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost and latency:&lt;/strong&gt; API calls to frontier models are expensive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data privacy concerns:&lt;/strong&gt; Sending sensitive data to third-party APIs can be risky&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The DeepFabric Approach
&lt;/h2&gt;

&lt;p&gt;Instead of trying to create one model that handles everything, we help you create &lt;strong&gt;specialist models&lt;/strong&gt; that are experts at specific tool-calling tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5buplzso2immn1xio2lu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5buplzso2immn1xio2lu.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The process works in three stages:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Topic Graph Generation
&lt;/h3&gt;

&lt;p&gt;One of the biggest challenges in synthetic data generation is ensuring diversity. If you just ask an LLM to generate training samples, you'll quickly get repetitive, homogeneous data that leads to overfitting.&lt;/p&gt;

&lt;p&gt;DeepFabric solves this with a unique &lt;strong&gt;topic graph generation algorithm&lt;/strong&gt;. Given a domain (like "Blender 3D modeling" or "REST API development"), it builds a comprehensive graph of subtopics, ensuring that generated samples cover the full breadth of the domain without redundancy.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Synthetic Sample Generation
&lt;/h3&gt;

&lt;p&gt;For each topic in the graph, DeepFabric generates training samples that combine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chain-of-thought reasoning:&lt;/strong&gt; The model learns to think through problems step by step&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool calling patterns:&lt;/strong&gt; Proper syntax and parameter usage for each tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real execution traces:&lt;/strong&gt; Using Spin's WebAssembly sandbox, tools actually execute, producing authentic results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is critical. Traditional synthetic data generators &lt;em&gt;simulate&lt;/em&gt; tool outputs, which means the training data often contains hallucinated results. DeepFabric actually &lt;em&gt;executes&lt;/em&gt; the tools in isolated sandboxes, so the model learns from real cause-and-effect relationships.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Fine-Tuning and Evaluation
&lt;/h3&gt;

&lt;p&gt;The generated dataset is automatically formatted for popular training frameworks (TRL, Unsloth) and can be uploaded directly to Hugging Face. After training, DeepFabric's built-in evaluation engine measures performance on held-out test samples.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;p&gt;We put this approach to the test using the Blender MCP server — a tool interface for 3D modeling operations in Blender. Here's how the models performed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Tool Calling Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DeepFabric Fine-Tuned (Qwen3-4B)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;93.50%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.5&lt;/td&gt;
&lt;td&gt;80.50%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Gemini Pro 2.5&lt;/td&gt;
&lt;td&gt;47.00%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A 4 billion parameter model — small enough to run on consumer hardware — outperforming models that are orders of magnitude larger. How is this possible?&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;p&gt;The key insight is that &lt;strong&gt;tool calling is a narrow, well-defined task&lt;/strong&gt;. Unlike open-ended conversation or creative writing, successful tool calling requires:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Recognizing when a tool is needed&lt;/li&gt;
&lt;li&gt;Selecting the correct tool from available options&lt;/li&gt;
&lt;li&gt;Formatting parameters according to a strict schema&lt;/li&gt;
&lt;li&gt;Interpreting results and deciding next steps&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These are learnable patterns. A small model with high-quality, domain-specific training data can master these patterns more effectively than a large generalist model that has to balance thousands of different capabilities.&lt;/p&gt;

&lt;p&gt;Think of it like hiring for a job. Would you rather have a brilliant generalist who's pretty good at everything, or a specialist who has spent years mastering exactly the skill you need? For tool calling, specialization wins.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Replicate Our Results
&lt;/h2&gt;

&lt;p&gt;We've made this entire process available for free. Here's how to get started:&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: Use Our Google Colab (Easiest)
&lt;/h3&gt;

&lt;p&gt;We've created a Google Colab notebook using the awesome &lt;a href="https://unsloth.ai" rel="noopener noreferrer"&gt;Unsloth.ai&lt;/a&gt; for optimized fine-tuning -  this walks you through the entire process — from generating your dataset to training and evaluating your model. No local setup required, and it runs on free GPU instances.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://colab.research.google.com/drive/1EG1V40v5xkJKLf6Ra6W4378vYqlZNVWq#scrollTo=Z0d_s14-H4Of" rel="noopener noreferrer"&gt;[Link to Colab Notebook]&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: Run Locally
&lt;/h3&gt;

&lt;p&gt;If you prefer to run things on your own hardware:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install DeepFabric&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;deepfabric

&lt;span class="c"&gt;# Set your API key (for dataset generation)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;

&lt;span class="c"&gt;# Generate a dataset&lt;/span&gt;
deepfabric generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic-prompt&lt;/span&gt; &lt;span class="s2"&gt;"Your domain here"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mode&lt;/span&gt; graph &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--depth&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--degree&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--num-samples&lt;/span&gt; 100 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-save-as&lt;/span&gt; dataset.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Training Your Model
&lt;/h3&gt;

&lt;p&gt;Once you have your dataset, training is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dataset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;trl&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SFTTrainer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SFTConfig&lt;/span&gt;

&lt;span class="c1"&gt;# Load your dataset
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dataset.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Split into train/eval
&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Format and train
&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Qwen/Qwen2.5-7B-Instruct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SFTTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SFTConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_train_epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Evaluating Your Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;deepfabric.evaluation&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Evaluator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EvaluatorConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;InferenceConfig&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EvaluatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;inference_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;InferenceConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./output/checkpoint-final&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transformers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;evaluator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;splits&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tool Selection Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_selection_accuracy&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Parameter Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameter_accuracy&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Overall Score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;overall_score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What Makes DeepFabric Different
&lt;/h2&gt;

&lt;p&gt;There are other tools for generating synthetic datasets. Here's what sets DeepFabric apart:&lt;/p&gt;

&lt;h3&gt;
  
  
  Topic Graph Diversity
&lt;/h3&gt;

&lt;p&gt;Most dataset generators produce samples that cluster around common patterns. DeepFabric's graph-based approach ensures comprehensive coverage of your domain, hitting edge cases and uncommon scenarios that make your model robust.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real Tool Execution
&lt;/h3&gt;

&lt;p&gt;Simulated tool outputs create unrealistic training data. DeepFabric uses Spin's WebAssembly sandbox to actually execute tools, producing authentic cause-and-effect relationships in your training data.&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Integration
&lt;/h3&gt;

&lt;p&gt;DeepFabric natively supports the Model Context Protocol (MCP) schema, making it easy to import tool definitions from any MCP server. If you already have an MCP server, you can start generating training data immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  End-to-End Pipeline
&lt;/h3&gt;

&lt;p&gt;From dataset generation to training to evaluation, DeepFabric handles the entire workflow. No need to stitch together multiple tools or write custom integration code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;p&gt;This approach isn't limited to Blender. Any domain where you need reliable tool calling is a candidate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Developer tools:&lt;/strong&gt; Code execution, file manipulation, git operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data pipelines:&lt;/strong&gt; Database queries, API calls, data transformation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business automation:&lt;/strong&gt; CRM updates, email composition, calendar management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative software:&lt;/strong&gt; Image editing, audio processing, video production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IoT and hardware:&lt;/strong&gt; Device control, sensor reading, automation scripts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you can define it as a tool with a schema, you can train a specialist model for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;We believe this represents a shift in how AI systems will be built. Rather than relying on ever-larger generalist models, we'll see architectures that combine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A reasoning backbone&lt;/strong&gt; (which could be a frontier model) for complex decision-making&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specialist tool-calling models&lt;/strong&gt; for reliable, schema-compliant interactions with external systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This hybrid approach gives you the best of both worlds: the reasoning capability of large models with the reliability and efficiency of focused specialists.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Ready to train your own specialist model? Here are your next steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Try the Colab notebook&lt;/strong&gt; to see the full pipeline in action&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Star the GitHub repo&lt;/strong&gt; to stay updated on new features&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join our Discord&lt;/strong&gt; to connect with other builders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share your results&lt;/strong&gt; — we'd love to see what specialist models you create&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/always-further/deepfabric" rel="noopener noreferrer"&gt;github.com/always-further/deepfabric&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="http://docs.deepfabric.dev" rel="noopener noreferrer"&gt;docs.deepfabric.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discord:&lt;/strong&gt; &lt;a href="https://discord.gg/pPcjYzGvbS" rel="noopener noreferrer"&gt;discord.gg/pPcjYzGvbS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/deepfabric/" rel="noopener noreferrer"&gt;pypi.org/project/deepfabric&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reddit:&lt;/strong&gt; &lt;a href="https://www.reddit.com/r/deepfabric/" rel="noopener noreferrer"&gt;r/deepfabric&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Small models. Big results. Let's build the future of AI tooling together.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;— The Always Further AI Team&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Fine-Tuning with GRPO Datasets: A Developer's Guide to DeepFabric's GRPO Formatter</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Tue, 21 Oct 2025 09:34:02 +0000</pubDate>
      <link>https://forem.com/lukehinds/fine-tuning-with-grpo-datasets-a-developers-guide-to-deepfabrics-grpo-formatter-245h</link>
      <guid>https://forem.com/lukehinds/fine-tuning-with-grpo-datasets-a-developers-guide-to-deepfabrics-grpo-formatter-245h</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When training language models for mathematical reasoning, one of the key challenges is getting the model to not just produce correct answers, but to show its work in a structured, verifiable way. This is where GRPO (Generalized Reward-based Policy Optimization) comes in.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/lukehinds/deepfabric/" rel="noopener noreferrer"&gt;DeepFabric's&lt;/a&gt; GRPO formatter transforms your datasets into the precise format needed for GRPO training pipelines, wrapping reasoning traces and solutions in configurable tags that enable reward-based optimization. In this post, we'll dive deep into how to use it effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is GRPO?
&lt;/h2&gt;

&lt;p&gt;GRPO is a reinforcement learning technique that optimizes language models using reward signals. For mathematical reasoning tasks, this typically means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Structured reasoning&lt;/strong&gt;: The model's thought process is wrapped in specific tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extractable solutions&lt;/strong&gt;: Final answers are clearly delineated for verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reward computation&lt;/strong&gt;: Numerical answers can be automatically validated against ground truth&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The format enables training systems to parse the model's output, extract the answer, compute rewards based on correctness, and update the model accordingly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The GRPO Format Structure
&lt;/h2&gt;

&lt;p&gt;A properly formatted GRPO sample looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You are given a problem. Think about the problem and provide your working out. Place it between &amp;lt;start_working_out&amp;gt; and &amp;lt;end_working_out&amp;gt;. Then, provide your solution between &amp;lt;SOLUTION&amp;gt; and &amp;lt;/SOLUTION&amp;gt;."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"What is 15% of 240?"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;start_working_out&amp;gt;To find 15% of 240, I need to multiply 240 by 0.15. 240 × 0.15 = 36&amp;lt;end_working_out&amp;gt;&amp;lt;SOLUTION&amp;gt;36&amp;lt;/SOLUTION&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the clear separation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning section&lt;/strong&gt;: Wrapped in &lt;code&gt;&amp;lt;start_working_out&amp;gt;...&amp;lt;/end_working_out&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution section&lt;/strong&gt;: Wrapped in &lt;code&gt;&amp;lt;SOLUTION&amp;gt;...&amp;lt;/SOLUTION&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structure allows the training pipeline to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract the numerical answer (36)&lt;/li&gt;
&lt;li&gt;Compare it against ground truth&lt;/li&gt;
&lt;li&gt;Compute reward signals&lt;/li&gt;
&lt;li&gt;Backpropagate through the reasoning process&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Using DeepFabric's GRPO Formatter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Basic Usage
&lt;/h3&gt;

&lt;p&gt;The GRPO formatter is built into DeepFabric and supports multiple input formats out of the box:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;deepfabric.formatters.builtin.grpo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GrpoFormatter&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize with default configuration
&lt;/span&gt;&lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GrpoFormatter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Your raw dataset sample
&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If a train travels 120 km in 2 hours, what is its average speed?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;60&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Speed = Distance / Time. Speed = 120 km / 2 hours = 60 km/h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Format for GRPO training
&lt;/span&gt;&lt;span class="n"&gt;formatted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_dataset&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuration Options
&lt;/h3&gt;

&lt;p&gt;The formatter is highly configurable to match your training pipeline's requirements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;think&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;/think&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;answer&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;/answer&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system_prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Solve the following problem step by step.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_numerical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# Enforce numerical answer extraction
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GrpoFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This flexibility means you can adapt to different GRPO implementations (like Qwen, DeepSeek, or custom pipelines) that might use different tag conventions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Input Formats
&lt;/h2&gt;

&lt;p&gt;One of the formatter's strengths is its ability to handle diverse dataset structures. Let's explore each:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Question-Answer Format
&lt;/h3&gt;

&lt;p&gt;The simplest format - just questions and answers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 25² ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;625&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formatter automatically generates a basic reasoning wrapper if no reasoning trace is provided.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Chain-of-Thought Format
&lt;/h3&gt;

&lt;p&gt;Includes explicit reasoning steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Solve: 3x + 5 = 20&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Subtract 5 from both sides: 3x = 15. Divide both sides by 3: x = 5.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Structured Chain-of-Thought
&lt;/h3&gt;

&lt;p&gt;The most detailed format with message structure and reasoning traces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Calculate 15! / 13!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_trace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Using factorial properties, 15! / 13! = 15 × 14 × 13! / 13!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The 13! cancels out, leaving 15 × 14&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Calculate: 15 × 14 = 210&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;210&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Conversation Format
&lt;/h3&gt;

&lt;p&gt;Already has messages but needs GRPO formatting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 2³ + 3²?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Let me calculate: 2³ = 8 and 3² = 9. Therefore 8 + 9 = 17&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formatter intelligently extracts reasoning and answer, then wraps them in GRPO tags.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Generic Format
&lt;/h3&gt;

&lt;p&gt;For datasets with non-standard field names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Find the area of a circle with radius 5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;78.54&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Area = πr². With r=5: Area = π × 5² = π × 25 ≈ 78.54&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formatter searches for common field name patterns like &lt;code&gt;problem&lt;/code&gt;, &lt;code&gt;prompt&lt;/code&gt;, &lt;code&gt;input&lt;/code&gt; for questions and &lt;code&gt;solution&lt;/code&gt;, &lt;code&gt;output&lt;/code&gt;, &lt;code&gt;response&lt;/code&gt; for answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Validation and Quality Control
&lt;/h2&gt;

&lt;p&gt;The formatter includes robust validation at two levels:&lt;/p&gt;

&lt;h3&gt;
  
  
  Input Validation
&lt;/h3&gt;

&lt;p&gt;Before formatting, each sample is validated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This returns False for invalid samples
&lt;/span&gt;&lt;span class="n"&gt;is_valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validation checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Required fields are present&lt;/li&gt;
&lt;li&gt;Data types are correct&lt;/li&gt;
&lt;li&gt;Format can be detected and handled&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Output Validation
&lt;/h3&gt;

&lt;p&gt;After formatting, samples are validated against GRPO requirements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;formatted_sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_format_single_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;is_grpo_compliant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;validate_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatted_sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All required roles (system, user, assistant) are present&lt;/li&gt;
&lt;li&gt;GRPO formatting tags are correctly applied&lt;/li&gt;
&lt;li&gt;Numerical answers are extractable (if validation enabled)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Numerical Answer Extraction
&lt;/h3&gt;

&lt;p&gt;When &lt;code&gt;validate_numerical: True&lt;/code&gt;, the formatter uses regex patterns to ensure answers can be extracted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The formatter compiles these patterns
&lt;/span&gt;&lt;span class="n"&gt;format_regex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;end_working_out&amp;gt;.*?&amp;lt;SOLUTION&amp;gt;(.+?)&amp;lt;/SOLUTION&amp;gt;\s*$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MULTILINE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;number_regex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;.*?\s*([+-]?[\d\.,]+)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MULTILINE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures the training pipeline can reliably extract answers for reward computation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Example: Formatting a Math Dataset
&lt;/h2&gt;

&lt;p&gt;Let's walk through a complete example using a mathematical reasoning dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;deepfabric.formatters.builtin.grpo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GrpoFormatter&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Configuration matching your training pipeline
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;start_working_out&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;end_working_out&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;/SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_numerical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GrpoFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Your raw dataset
&lt;/span&gt;&lt;span class="n"&gt;raw_samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A rectangle has length 8cm and width 5cm. What is its perimeter?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Perimeter of rectangle = 2(length + width). P = 2(8 + 5) = 2(13) = 26cm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;26&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Simplify: (x² - 9) / (x - 3)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Factor numerator: (x+3)(x-3) / (x-3). Cancel (x-3): x + 3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x + 3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Format the entire dataset
&lt;/span&gt;&lt;span class="n"&gt;formatted_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save for training
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;grpo_training_data.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;formatted_dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Formatted &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatted_dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; samples for GRPO training&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output JSONL file is ready to feed into your GRPO training pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration with DeepFabric Dataset Generation
&lt;/h2&gt;

&lt;p&gt;DeepFabric can generate synthetic datasets and format them for GRPO in one pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config.yaml&lt;/span&gt;
&lt;span class="na"&gt;dataset_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
  &lt;span class="s"&gt;Generate mathematical reasoning problems suitable for GRPO training.&lt;/span&gt;
  &lt;span class="s"&gt;Include step-by-step reasoning and numerical answers.&lt;/span&gt;

&lt;span class="na"&gt;topic_tree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4"&lt;/span&gt;
    &lt;span class="na"&gt;depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;branching_factor&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
    &lt;span class="na"&gt;root_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mathematical&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;algebra,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;geometry,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;arithmetic"&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topics.jsonl"&lt;/span&gt;

&lt;span class="na"&gt;data_engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4"&lt;/span&gt;
    &lt;span class="na"&gt;samples_per_topic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.7&lt;/span&gt;

&lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;creation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;formatter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;grpo"&lt;/span&gt;
    &lt;span class="na"&gt;formatter_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;start_working_out&amp;gt;"&lt;/span&gt;
      &lt;span class="na"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;end_working_out&amp;gt;"&lt;/span&gt;
      &lt;span class="na"&gt;solution_start_tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;"&lt;/span&gt;
      &lt;span class="na"&gt;solution_end_tag&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;/SOLUTION&amp;gt;"&lt;/span&gt;
      &lt;span class="na"&gt;validate_numerical&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;grpo_dataset.jsonl"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric start config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This generates a complete GRPO-formatted dataset from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Match Your Training Pipeline
&lt;/h3&gt;

&lt;p&gt;Different GRPO implementations use different tags. Always configure the formatter to match your training code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# For Qwen-style GRPO
&lt;/span&gt;&lt;span class="n"&gt;qwen_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;start_working_out&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;end_working_out&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;/SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# For custom pipeline
&lt;/span&gt;&lt;span class="n"&gt;custom_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[REASONING]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[/REASONING]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_start_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[ANSWER]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_end_tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[/ANSWER]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Enable Validation for Math Tasks
&lt;/h3&gt;

&lt;p&gt;For mathematical reasoning, always enable numerical validation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_numerical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures your reward function can extract answers reliably.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Provide Quality Reasoning Traces
&lt;/h3&gt;

&lt;p&gt;The better your input reasoning, the better your GRPO training:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Good: Detailed step-by-step
&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 15% of 80?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Convert percentage to decimal: 15% = 0.15. Multiply: 80 × 0.15 = 12&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;12&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Less ideal: Minimal reasoning
&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is 15% of 80?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;12&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Customize System Prompts
&lt;/h3&gt;

&lt;p&gt;The system prompt guides model behavior during training:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system_prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Solve mathematical problems by:
1. Breaking down the problem
2. Showing all calculation steps
3. Providing the final numerical answer&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Validate Your Output
&lt;/h3&gt;

&lt;p&gt;Always check a few formatted samples before training:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;formatted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Inspect first sample
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatted&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Validate all samples
&lt;/span&gt;&lt;span class="n"&gt;valid_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;formatted&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;validate_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;valid_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; samples are GRPO-compliant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Troubleshooting Common Issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Issue: Samples Being Filtered Out
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Some samples don't appear in the formatted output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Check validation errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_samples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid sample: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common causes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing required fields&lt;/li&gt;
&lt;li&gt;Empty answers&lt;/li&gt;
&lt;li&gt;Incompatible format&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Issue: Answer Extraction Fails
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: &lt;code&gt;validate_numerical&lt;/code&gt; rejects valid samples.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Check answer format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The regex expects numbers in the solution tags
# This works:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;42&amp;lt;/SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# This might fail:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;SOLUTION&amp;gt;The answer is forty-two&amp;lt;/SOLUTION&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For non-numerical answers, disable validation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_numerical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Issue: Reasoning Not Preserved
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Original reasoning is lost during formatting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Ensure reasoning is in a recognized field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Recognized fields for reasoning:
&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain_of_thought&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning_trace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Not recognized:
&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;explanation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;working&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Advanced: Custom Format Handling
&lt;/h2&gt;

&lt;p&gt;If you have a unique dataset structure, you can extend the formatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;deepfabric.formatters.builtin.grpo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GrpoFormatter&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CustomGrpoFormatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GrpoFormatter&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_format_custom_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Handle custom dataset structure.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem_statement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# List of steps
&lt;/span&gt;        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;correct_answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Combine steps into reasoning
&lt;/span&gt;        &lt;span class="n"&gt;reasoning&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;assistant_content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reasoning_start_tag&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;reasoning&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reasoning_end_tag&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;solution_start_tag&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;solution_end_tag&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;assistant_content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_format_single_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Try custom format first
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem_statement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_format_custom_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Fall back to standard formats
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;_format_single_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;DeepFabric's GRPO formatter provides a robust, flexible way to prepare datasets for reward-based optimization training. Key takeaways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Multiple format support&lt;/strong&gt;: Works with Q&amp;amp;A, CoT, conversations, and generic formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configurable tags&lt;/strong&gt;: Adapt to any GRPO training pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in validation&lt;/strong&gt;: Ensures quality and compliance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Numerical extraction&lt;/strong&gt;: Enables reliable reward computation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration ready&lt;/strong&gt;: Works seamlessly with DeepFabric's generation pipeline&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Whether you're formatting existing datasets or generating new synthetic data, the GRPO formatter handles the complex transformations needed for successful mathematical reasoning training.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://lukehinds.github.io/deepfabric/" rel="noopener noreferrer"&gt;DeepFabric Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2402.05749" rel="noopener noreferrer"&gt;Generalized Reward-based Policy Optimization Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy training!&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>agents</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>One Dataset, Many Formats: DeepFabric's Approach to Training Format Flexibility</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Mon, 20 Oct 2025 06:44:21 +0000</pubDate>
      <link>https://forem.com/lukehinds/one-dataset-many-formats-deepfabrics-approach-to-training-format-flexibility-m30</link>
      <guid>https://forem.com/lukehinds/one-dataset-many-formats-deepfabrics-approach-to-training-format-flexibility-m30</guid>
      <description>&lt;h2&gt;
  
  
  The Format Problem in ML Training
&lt;/h2&gt;

&lt;p&gt;Synthetic dataset generation for fine-tuning language models often leads to a format lock-in problem. After investing significant time generating high-quality synthetic data, teams discover their dataset is bound to a single training format. Experimenting with TRL's SFTTrainer requires one specific structure. Switching to GRPO for mathematical reasoning demands another. Testing OpenAI's Harmony format for reasoning-focused models needs yet another transformation.&lt;/p&gt;

&lt;p&gt;Teams typically respond by regenerating datasets for each framework, writing custom conversion scripts that are brittle and hard to maintain, limiting themselves to a single framework, or maintaining multiple versions of the same dataset with associated storage and versioning complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The DeepFabric Approach
&lt;/h2&gt;

&lt;p&gt;DeepFabric addresses this through a format-agnostic workflow built on three principles. First, all datasets are generated in a universal storage format using the OpenAI messages standard in JSONL. Second, HuggingFace Hub integration enables sharing datasets once in their generic format. Third, on-demand formatting allows pulling and reformatting datasets for any training framework at runtime.&lt;/p&gt;

&lt;p&gt;This architecture means datasets are generated once and uploaded to HuggingFace in a generic format where they can be shared with teams or the broader community. When needed, the dataset can be formatted for any training pipeline without regeneration, enabling experimentation with different frameworks using the same source data.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Complete Workflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Generate Your Dataset with Chain-of-Thought Reasoning and Custom Tools
&lt;/h3&gt;

&lt;p&gt;Chain-of-thought (CoT) reasoning in tool-calling datasets teaches models to think through problems step-by-step before invoking functions. This approach significantly improves structured output quality in tool and MCP (Model Context Protocol) calling scenarios. When models explicitly reason about which tool to use, what parameters to provide, and why that tool is appropriate, they produce more accurate and contextually appropriate function calls.&lt;/p&gt;

&lt;p&gt;DeepFabric supports custom tool definitions, allowing you to specify domain-specific functions that your model should learn to use. Let's create a configuration file that defines custom financial analysis tools and generates a dataset with chain-of-thought reasoning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# financial_cot_config.yaml&lt;/span&gt;
&lt;span class="na"&gt;dataset_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;expert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analyst.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;When&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;responding,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;explain&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;then&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;call&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;appropriate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;correct&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;parameters."&lt;/span&gt;

&lt;span class="na"&gt;topic_tree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;topic_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;portfolio&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;management&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;quantitative&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools"&lt;/span&gt;
  &lt;span class="na"&gt;topic_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;expert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analyst&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;creating&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;comprehensive&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;scenarios."&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai"&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o"&lt;/span&gt;
  &lt;span class="na"&gt;degree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
  &lt;span class="na"&gt;depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.7&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_topics.jsonl"&lt;/span&gt;

&lt;span class="na"&gt;data_engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;realistic&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;scenarios&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;before&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;calls"&lt;/span&gt;
  &lt;span class="na"&gt;generation_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;expert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analyst.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;When&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;responding,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;first&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;explain&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;then&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;call&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;appropriate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;financial&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;correct&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;parameters."&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai"&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o"&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;
  &lt;span class="na"&gt;max_retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;conversation_template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cot_structured"&lt;/span&gt;

  &lt;span class="c1"&gt;# Define custom financial analysis tools&lt;/span&gt;
  &lt;span class="na"&gt;custom_tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_portfolio_performance"&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retrieves&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;historical&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;performance&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;portfolio&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;including&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;returns,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;holdings,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;time&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;series&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data"&lt;/span&gt;
      &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object"&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;portfolio_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unique&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;portfolio"&lt;/span&gt;
          &lt;span class="na"&gt;start_date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Start&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;YYYY-MM-DD&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;format"&lt;/span&gt;
          &lt;span class="na"&gt;end_date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;End&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;YYYY-MM-DD&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;format"&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;portfolio_id"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;start_date"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end_date"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;calculate_sharpe_ratio"&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Calculates&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Sharpe&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ratio&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;risk-adjusted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;return&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;analysis"&lt;/span&gt;
      &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object"&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;returns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;array"&lt;/span&gt;
            &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Array&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;periodic&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;returns"&lt;/span&gt;
          &lt;span class="na"&gt;risk_free_rate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;number"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Annual&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;risk-free&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;as&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;decimal"&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;returns"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk_free_rate"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyze_sector_exposure"&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyzes&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;portfolio&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exposure&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;across&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;different&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;market&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sectors"&lt;/span&gt;
      &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object"&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;portfolio_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unique&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;portfolio"&lt;/span&gt;
          &lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;YYYY-MM-DD&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;format"&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;portfolio_id"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;creation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;num_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
    &lt;span class="na"&gt;sys_msg&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial_cot_dataset.jsonl"&lt;/span&gt;

&lt;span class="c1"&gt;# Automatically upload to HuggingFace Hub&lt;/span&gt;
&lt;span class="na"&gt;huggingface&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-org/financial-reasoning-cot"&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finance"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain-of-thought"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool-calling"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthetic"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate the dataset and upload it to HuggingFace in one step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;HF_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-hf-token"&lt;/span&gt;
deepfabric generate financial_cot_config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generated dataset contains examples where the model reasons through financial problems before making tool calls. This generic format includes the reasoning chain, tool definitions, and the conversation structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I need to analyze the risk-adjusted return of my portfolio over the last year"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Let me work through this systematically. To calculate risk-adjusted returns, I need three key metrics: the portfolio's actual return, the risk-free rate for comparison, and the portfolio's volatility (standard deviation). The Sharpe ratio is the standard metric for this analysis, calculated as (portfolio return - risk-free rate) / portfolio standard deviation.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;First, I should retrieve your portfolio's historical performance data for the past year. Then I'll need to fetch the current risk-free rate, typically using the 1-year Treasury yield. Finally, I'll calculate the standard deviation of returns and compute the Sharpe ratio.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Let me start by getting your portfolio performance data for the specified period."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"available_tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_portfolio_performance"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Retrieves historical performance data for a portfolio including returns, holdings, and time series data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"portfolio_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Unique identifier for the portfolio"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"start_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Start date in YYYY-MM-DD format"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"end_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"End date in YYYY-MM-DD format"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"portfolio_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"start_date"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"end_date"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"calculate_sharpe_ratio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Calculates the Sharpe ratio for risk-adjusted return analysis"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"returns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"array"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Array of periodic returns"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"risk_free_rate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Annual risk-free rate as a decimal"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"returns"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"risk_free_rate"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This format is framework-agnostic. It contains all necessary information including the chain-of-thought reasoning, tool schemas, and conversation flow, but isn't locked to any specific training library. The reasoning component teaches the model to think through domain-specific problems, improving both the quality of tool selection and parameter accuracy in production use.&lt;/p&gt;

&lt;p&gt;DeepFabric automatically handles the HuggingFace upload, repository creation, dataset card generation with appropriate metadata, and tag application for discoverability. The key insight is that you're uploading the generic format, not a training-specific format. This single upload serves all downstream use cases across different training frameworks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Pull and Format for Your Training Pipeline
&lt;/h3&gt;

&lt;p&gt;Anyone with access to the dataset can now pull it and format it for their specific training framework. The same source data can be reformatted multiple times for different purposes without regeneration.&lt;/p&gt;

&lt;h4&gt;
  
  
  Training with TRL SFTTrainer for Tool Calling
&lt;/h4&gt;

&lt;p&gt;TRL's SFTTrainer expects tools in an OpenAI-compatible schema with explicit function definitions. The chain-of-thought reasoning is preserved in the assistant's response, teaching the model to think through the problem before tool invocation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dataset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;trl&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SFTConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SFTTrainer&lt;/span&gt;

&lt;span class="c1"&gt;# Format the dataset for TRL
&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deepfabric&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--repo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-org/financial-reasoning-cot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--formatter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;training_trl.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Load the formatted dataset
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;training_trl.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Train with TRL - the reasoning helps the model learn better tool selection
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SFTConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SFTTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This converts the generic dataset to TRL's specific structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I need to analyze the risk-adjusted return of my portfolio over the last year"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Let me work through this systematically. To calculate risk-adjusted returns, I need three key metrics: the portfolio's actual return..."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"function"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"function"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_portfolio_performance"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Retrieves historical performance data for a portfolio including returns, holdings, and time series data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"portfolio_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Unique identifier for the portfolio"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"start_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Start date in YYYY-MM-DD format"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"end_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"End date in YYYY-MM-DD format"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"portfolio_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"start_date"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"end_date"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Training with GRPO for Mathematical Reasoning
&lt;/h4&gt;

&lt;p&gt;GRPO (Generalized Reward-based Policy Optimization) requires explicit reasoning tags to separate the working-out from the final solution. This format works particularly well with the chain-of-thought data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; your-org/financial-reasoning-cot &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; grpo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; training_grpo.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formatter wraps reasoning in tags that GRPO uses for reward modeling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You are a financial analysis assistant that shows your reasoning before providing solutions."&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I need to analyze the risk-adjusted return of my portfolio over the last year"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;start_working_out&amp;gt;To calculate risk-adjusted returns, I need three key metrics: the portfolio's actual return, the risk-free rate for comparison, and the portfolio's volatility. The Sharpe ratio is calculated as (portfolio return - risk-free rate) / portfolio standard deviation. I should retrieve the portfolio's historical performance data, fetch the current risk-free rate using the 1-year Treasury yield, calculate the standard deviation of returns, then compute the Sharpe ratio.&amp;lt;end_working_out&amp;gt;&amp;lt;SOLUTION&amp;gt;Retrieve portfolio performance data using get_portfolio_performance, then calculate Sharpe ratio with the returns and current risk-free rate.&amp;lt;/SOLUTION&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The explicit separation of reasoning from solution helps GRPO training optimize for both correct thinking processes and accurate final answers.&lt;/p&gt;

&lt;h4&gt;
  
  
  Training with Harmony for OpenAI Reasoning Models
&lt;/h4&gt;

&lt;p&gt;OpenAI's Harmony format (used in gpt-oss and reasoning-focused models) uses specific XML-style tags for reasoning chains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; your-org/financial-reasoning-cot &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; harmony &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; training_harmony.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I need to analyze the risk-adjusted return of my portfolio over the last year"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;reasoning&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;To calculate risk-adjusted returns, I need three key metrics: portfolio return, risk-free rate, and volatility. The Sharpe ratio is the standard metric: (return - risk_free_rate) / std_dev.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;I need to:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;1. Retrieve portfolio historical data&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;2. Get current risk-free rate&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;3. Calculate standard deviation&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;4. Compute Sharpe ratio&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/reasoning&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;output&amp;gt;I'll retrieve your portfolio performance data for the past year and calculate the Sharpe ratio for risk-adjusted return analysis.&amp;lt;/output&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Training with ChatML for Reasoning-Capable Chat Models
&lt;/h4&gt;

&lt;p&gt;For models that expect ChatML delimiters with preserved reasoning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; your-org/financial-reasoning-cot &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; im_format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; training_chatml.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formatter wraps everything in ChatML tags while maintaining the reasoning flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;|im_start|&amp;gt;user&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;I need to analyze the risk-adjusted return of my portfolio over the last year&amp;lt;|im_end|&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;|im_start|&amp;gt;assistant&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Let me work through this systematically. To calculate risk-adjusted returns, I need three key metrics: the portfolio's actual return, the risk-free rate for comparison, and the portfolio's volatility...&amp;lt;|im_end|&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Chain-of-Thought Improves Tool Calling with Custom Tools
&lt;/h2&gt;

&lt;p&gt;The chain-of-thought approach combined with custom tool definitions provides several concrete benefits for model training. Models trained with explicit reasoning learn to validate their tool selection before making calls, reducing errors where inappropriate functions are invoked. The reasoning chain provides context for parameter selection, leading to more accurate argument values. When models explain their approach, they're more likely to catch edge cases and error conditions before execution.&lt;/p&gt;

&lt;p&gt;Custom tools allow you to define domain-specific functions that match your production environment. For example, the financial analysis tools defined in the configuration above (&lt;code&gt;get_portfolio_performance&lt;/code&gt;, &lt;code&gt;calculate_sharpe_ratio&lt;/code&gt;, &lt;code&gt;analyze_sector_exposure&lt;/code&gt;) teach the model to work with your specific API schema. When fine-tuning models like SmolLM2-1.7B-Instruct with PEFT/LoRA on this custom tool dataset, the model learns both the reasoning patterns and the exact tool signatures it will encounter in production.&lt;/p&gt;

&lt;p&gt;Here's a practical training example using the formatted dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;trl&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SFTTrainer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SFTConfig&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dataset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;peft&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LoraConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_peft_model&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;

&lt;span class="c1"&gt;# Format the dataset for TRL
&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deepfabric&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--repo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-org/financial-reasoning-cot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--formatter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trl_sft_tools.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Load model and tokenizer
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HuggingFaceTB/SmolLM2-1.7B-Instruct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HuggingFaceTB/SmolLM2-1.7B-Instruct&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure LoRA for efficient fine-tuning
&lt;/span&gt;&lt;span class="n"&gt;peft_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoraConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;lora_alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;lora_dropout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;target_modules&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;down_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;o_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;k_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;q_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gate_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;up_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;v_proj&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;use_dora&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;init_lora_weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gaussian&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Apply PEFT model adaptation
&lt;/span&gt;&lt;span class="n"&gt;peft_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_peft_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;peft_config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Load dataset
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./trl_sft_tools.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Configure training
&lt;/span&gt;&lt;span class="n"&gt;training_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SFTConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial-reasoning-model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adamw_torch_fused&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bf16&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;push_to_hub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;report_to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Train
&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SFTTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;peft_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;training_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;processing_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting model learns to reason through financial problems using the exact custom tools you defined, making it production-ready for your specific use case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Format Conversion Script
&lt;/h2&gt;

&lt;p&gt;Here's a practical script demonstrating conversion of one chain-of-thought dataset to multiple formats:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# multi-format-reasoning.sh&lt;/span&gt;

&lt;span class="nv"&gt;REPO&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-org/financial-reasoning-cot"&lt;/span&gt;
&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"financial_reasoning"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Converting reasoning dataset to multiple formats..."&lt;/span&gt;

&lt;span class="c"&gt;# Format for TRL SFTTrainer with tool calling&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Formatting for TRL SFTTrainer..."&lt;/span&gt;
deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; &lt;span class="nv"&gt;$REPO&lt;/span&gt; &lt;span class="nt"&gt;--formatter&lt;/span&gt; trl &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_trl.jsonl"&lt;/span&gt;

&lt;span class="c"&gt;# Format for GRPO reasoning training&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Formatting for GRPO..."&lt;/span&gt;
deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; &lt;span class="nv"&gt;$REPO&lt;/span&gt; &lt;span class="nt"&gt;--formatter&lt;/span&gt; grpo &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_grpo.jsonl"&lt;/span&gt;

&lt;span class="c"&gt;# Format for Harmony (OpenAI reasoning models)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Formatting for Harmony..."&lt;/span&gt;
deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; &lt;span class="nv"&gt;$REPO&lt;/span&gt; &lt;span class="nt"&gt;--formatter&lt;/span&gt; harmony &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_harmony.jsonl"&lt;/span&gt;

&lt;span class="c"&gt;# Format for ChatML reasoning chat models&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Formatting for ChatML..."&lt;/span&gt;
deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; &lt;span class="nv"&gt;$REPO&lt;/span&gt; &lt;span class="nt"&gt;--formatter&lt;/span&gt; im_format &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_chatml.jsonl"&lt;/span&gt;

&lt;span class="c"&gt;# Format for XLAM v2 multi-turn reasoning&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Formatting for XLAM v2..."&lt;/span&gt;
deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; &lt;span class="nv"&gt;$REPO&lt;/span&gt; &lt;span class="nt"&gt;--formatter&lt;/span&gt; xlam_v2 &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASE_NAME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_xlam.jsonl"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Conversion complete. Created 5 training-ready formats from a single dataset."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running this script produces TRL-ready format with OpenAI-compatible tool schemas, GRPO format with explicit reasoning/solution separation, Harmony format with reasoning tags for OpenAI models, ChatML format with delimiter-wrapped reasoning, and XLAM v2 format for Salesforce's multi-turn tool calling framework. All formats preserve the chain-of-thought reasoning that improves tool calling accuracy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Formatters
&lt;/h2&gt;

&lt;p&gt;DeepFabric currently supports formatters for major training frameworks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Formatter&lt;/th&gt;
&lt;th&gt;Command Flag&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TRL SFT Tools&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;trl&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tool/function calling&lt;/td&gt;
&lt;td&gt;HuggingFace TRL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GRPO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;grpo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Mathematical/logical reasoning&lt;/td&gt;
&lt;td&gt;GRPO training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Harmony&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;harmony&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reasoning with XML tags&lt;/td&gt;
&lt;td&gt;OpenAI gpt-oss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ChatML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chatml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Chat models with structure&lt;/td&gt;
&lt;td&gt;ChatML-compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Im Format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;im_format&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Chat with delimiters&lt;/td&gt;
&lt;td&gt;ChatML variants&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Alpaca&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;alpaca&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Instruction following&lt;/td&gt;
&lt;td&gt;Stanford Alpaca&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;XLAM v2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;xlam_v2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multi-turn tool calling&lt;/td&gt;
&lt;td&gt;Salesforce xLAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool Calling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tool_calling&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generic tool calling&lt;/td&gt;
&lt;td&gt;Agent training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Single Tool&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;single_tool_call&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Individual function calls&lt;/td&gt;
&lt;td&gt;Function execution&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Custom formatters can be created for specific needs using DeepFabric's formatter API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Configuration
&lt;/h2&gt;

&lt;p&gt;Fine-grained control over formatting is available through YAML configuration files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# formatter_config.yaml&lt;/span&gt;
&lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;formatters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trl_custom"&lt;/span&gt;
      &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;builtin://trl_sft_tools"&lt;/span&gt;
      &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;custom_trl.jsonl"&lt;/span&gt;
      &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;include_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;system_prompt_override&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;You are a function calling AI model. You are provided with function&lt;/span&gt;
          &lt;span class="s"&gt;signatures within &amp;lt;tools&amp;gt;&amp;lt;/tools&amp;gt; XML tags. Think through your&lt;/span&gt;
          &lt;span class="s"&gt;reasoning before invoking tools.&lt;/span&gt;
        &lt;span class="na"&gt;validate_tool_schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;remove_available_tools_field&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric format &lt;span class="nt"&gt;--repo&lt;/span&gt; your-org/dataset &lt;span class="nt"&gt;-c&lt;/span&gt; formatter_config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Benefits of This Approach
&lt;/h2&gt;

&lt;p&gt;Storage efficiency improves because teams store one canonical dataset and generate formats on-demand rather than maintaining multiple versions. Collaboration becomes simpler when datasets are shared once in a universal format but consumed in framework-specific formats, eliminating coordination overhead. Experimentation costs drop dramatically since trying different frameworks no longer requires expensive regeneration. Reproducibility improves with clear lineage from source to formatted data and version control at the dataset level. Cost savings accumulate because expensive synthetic data generation happens once while cheap reformatting happens as needed.&lt;/p&gt;

&lt;p&gt;The approach is future-proof. When new training frameworks emerge, adding a formatter is straightforward. Datasets remain relevant as frameworks evolve without requiring regeneration as tools improve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Community Impact
&lt;/h2&gt;

&lt;p&gt;This architecture enables new collaboration patterns. Dataset creators can focus on data quality rather than format compatibility, reach wider audiences with a single upload, and receive feedback more easily. Dataset users find more usable datasets, can experiment with different frameworks without risk, and can contribute formatters back to the community. Research teams can compare approaches fairly using identical source data, reproduce results more reliably, and build on each other's work more effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Install DeepFabric:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;deepfabric
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a configuration file with your custom tools and chain-of-thought settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# my_cot_config.yaml&lt;/span&gt;
&lt;span class="na"&gt;dataset_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;expert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;software&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;engineer.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Explain&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;before&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;using&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools."&lt;/span&gt;

&lt;span class="na"&gt;topic_tree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;topic_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Software&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;developer&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools"&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai"&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o"&lt;/span&gt;
  &lt;span class="na"&gt;degree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.7&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debugging_topics.jsonl"&lt;/span&gt;

&lt;span class="na"&gt;data_engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;realistic&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;scenarios&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning"&lt;/span&gt;
  &lt;span class="na"&gt;generation_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;an&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;expert&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;software&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;engineer.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Explain&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;reasoning&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;before&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;using&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools."&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai"&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o"&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;
  &lt;span class="na"&gt;conversation_template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cot_structured"&lt;/span&gt;

  &lt;span class="na"&gt;custom_tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run_debugger"&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Executes&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugger&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;command&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;returns&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;output"&lt;/span&gt;
      &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object"&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugger&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;command&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;execute"&lt;/span&gt;
          &lt;span class="na"&gt;breakpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;integer"&lt;/span&gt;
            &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Line&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;number&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;breakpoint"&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;command"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;creation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;num_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
    &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;save_as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_dataset.jsonl"&lt;/span&gt;

&lt;span class="na"&gt;huggingface&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-username/debugging-reasoning"&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;software-engineering"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reasoning"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain-of-thought"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generate and upload in one step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;HF_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-token"&lt;/span&gt;
deepfabric generate my_cot_config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Format for your training pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; your-username/debugging-reasoning &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; trl &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; training_data.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then train using your preferred framework as shown in the training example above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Working with Existing Datasets
&lt;/h3&gt;

&lt;p&gt;You can format existing community datasets without generating new ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Format a community dataset for TRL&lt;/span&gt;
deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; lukehinds/smol-test-sample &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; trl &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; trl_training.jsonl

&lt;span class="c"&gt;# Or for GRPO reasoning training&lt;/span&gt;
deepfabric format &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--repo&lt;/span&gt; lukehinds/smol-test-sample &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--formatter&lt;/span&gt; grpo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; grpo_training.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;DeepFabric's format-agnostic architecture decouples data generation from training format requirements. Datasets are generated once in a universal format, shared through HuggingFace Hub, and reformatted on-demand for specific training frameworks. This approach reduces storage requirements, eliminates coordination overhead, enables cost-effective experimentation, and ensures datasets remain useful as the ecosystem evolves.&lt;/p&gt;

&lt;p&gt;The integration of chain-of-thought reasoning with tool calling further improves this workflow by teaching models to reason through problems before invoking functions, resulting in more accurate tool selection and parameter generation in production environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;Documentation: &lt;a href="https://lukehinds.github.io/deepfabric/" rel="noopener noreferrer"&gt;https://lukehinds.github.io/deepfabric/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/lukehinds/deepfabric" rel="noopener noreferrer"&gt;https://github.com/lukehinds/deepfabric&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Discord Community: &lt;a href="https://discord.gg/pPcjYzGvbS" rel="noopener noreferrer"&gt;https://discord.gg/pPcjYzGvbS&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Examples: &lt;a href="https://github.com/lukehinds/deepfabric/tree/main/examples" rel="noopener noreferrer"&gt;https://github.com/lukehinds/deepfabric/tree/main/examples&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;DeepFabric is open source and welcomes contributions. To add support for a new training format, see the &lt;a href="https://github.com/lukehinds/deepfabric/blob/main/docs/formatters/custom-formatter-guide.md" rel="noopener noreferrer"&gt;custom formatter guide&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>agents</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Build High Quality Reasoning Datasets in Minutes Using Natural Prompts</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Wed, 24 Sep 2025 21:03:45 +0000</pubDate>
      <link>https://forem.com/lukehinds/build-chain-of-thought-datasets-in-minutes-using-natural-prompts-4jae</link>
      <guid>https://forem.com/lukehinds/build-chain-of-thought-datasets-in-minutes-using-natural-prompts-4jae</guid>
      <description>&lt;p&gt;As developers, we've all been there. You have a brilliant idea for a Reasoning model you want to fine-tune, but then reality hits: you need training data. Quality training data. &lt;strong&gt;A lot&lt;/strong&gt; of quality training data.&lt;/p&gt;

&lt;p&gt;The traditional path? Weeks of manual data curation, complex prompt engineering, which looks good at first, but turns out to be full of duplicates with very little diversity. You have tried a few of the open source tools, but non of them cut-it. Most of us end up abandoning the project or settling for subpar datasets that produce mediocre models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if I told you there's a tool that can generate professional-grade CoT datasets in minutes using natural language prompts?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;DeepFabric&lt;/strong&gt; - and it's about to change how you think about dataset creation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: Dataset Creation is Broken
&lt;/h2&gt;

&lt;p&gt;Before DeepFabric, creating CoT datasets meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📝 &lt;strong&gt;Manual curation&lt;/strong&gt;: Spending days writing examples by hand&lt;/li&gt;
&lt;li&gt;🔧 &lt;strong&gt;Complex prompt engineering&lt;/strong&gt;: Wrestling with intricate templates&lt;/li&gt;
&lt;li&gt;💸 &lt;strong&gt;Expensive services&lt;/strong&gt;: Paying premium rates for quality data&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Limited diversity&lt;/strong&gt;: Struggling to create varied, non-repetitive examples&lt;/li&gt;
&lt;li&gt;⚖️ &lt;strong&gt;Quality vs. quantity&lt;/strong&gt;: Choosing between good data or enough data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most developers either gave up or shipped models trained on insufficient data.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: DeepFabric's Triple Threat
&lt;/h2&gt;

&lt;p&gt;DeepFabric doesn't just solve the dataset problem - it obliterates it with &lt;strong&gt;three different CoT formats&lt;/strong&gt; that cover every use case:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. 🔥 &lt;strong&gt;Free-text CoT&lt;/strong&gt; (GSM8K Style)
&lt;/h3&gt;

&lt;p&gt;Perfect for mathematical reasoning and step-by-step problem solving.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mode&lt;/span&gt; tree &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; openai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; gpt-4o-mini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--depth&lt;/span&gt; 2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--degree&lt;/span&gt; 2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--num-steps&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic-prompt&lt;/span&gt; &lt;span class="s2"&gt;"Mathematical word problems and logical reasoning"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--generation-system-prompt&lt;/span&gt; &lt;span class="s2"&gt;"You are a math tutor creating educational problems"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--conversation-type&lt;/span&gt; cot_freetext &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dataset-save-as&lt;/span&gt; math_reasoning.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output format:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sarah has 24 apples. She gives away 1/3 to her neighbors and keeps 1/4 for herself. How many apples are left?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chain_of_thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"First, I need to find 1/3 of 24 apples. 24 ÷ 3 = 8 apples given to neighbors. Next, I need to find 1/4 of 24 apples. 24 ÷ 4 = 6 apples kept for herself. Total apples used: 8 + 6 = 14 apples. Apples left: 24 - 14 = 10 apples."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"final_answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10 apples"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. 🏗️ &lt;strong&gt;Structured CoT&lt;/strong&gt; (Conversation Based)
&lt;/h3&gt;

&lt;p&gt;Ideal for educational dialogues and systematic problem-solving.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mode&lt;/span&gt; graph &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; qwen3:32b &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic-prompt&lt;/span&gt; &lt;span class="s2"&gt;"Computer science algorithms and data structures"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--conversation-type&lt;/span&gt; cot_structured &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--reasoning-style&lt;/span&gt; logical &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dataset-save-as&lt;/span&gt; cs_reasoning.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output format:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"How would you implement a binary search algorithm?"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I'll walk you through implementing binary search step by step..."&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reasoning_trace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"reasoning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Define the search space with left and right pointers"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"reasoning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Calculate middle index to divide the array"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"step"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"reasoning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Compare target with middle element"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"final_answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Binary search works by repeatedly dividing the search interval in half. Start with two pointers: `left = 0` and `right = len(arr) - 1`. While `left &amp;lt;= right`, compute `mid = (left + right) // 2`. If `arr[mid] == target`, return `mid`. If `arr[mid] &amp;lt; target`, move `left = mid + 1`; otherwise move `right = mid - 1`. If the loop ends without finding the target, return -1."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. 🚀 &lt;strong&gt;Hybrid CoT&lt;/strong&gt; (Best of Both Worlds)
&lt;/h3&gt;

&lt;p&gt;Combines natural reasoning with structured steps - perfect for complex domains.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; gemini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; gemini-2.5-flash &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic-prompt&lt;/span&gt; &lt;span class="s2"&gt;"Infrastructure outages and their resolution"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--conversation-type&lt;/span&gt; cot_hybrid &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--num-steps&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dataset-save-as&lt;/span&gt; science_hybrid.jsonl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output format:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A critical production service, 'Order Processing,' is experiencing intermittent failures. Users report orders are occasionally failing to process, leading to lost revenue. The service runs on Kubernetes, utilizing a deployment of 10 pods. Initial monitoring shows CPU and memory usage are within acceptable limits, but error logs indicate sporadic database connection timeouts. You suspect a resource contention issue or a misconfiguration somewhere in the infrastructure. Describe your systematic approach to diagnose and resolve this issue, detailing the tools and techniques you'd employ at each step."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chain_of_thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"To systematically troubleshoot this intermittent failure, I will follow a structured approach. First, I will gather more information about the failures, including frequency, error messages, and affected users. Then, I will examine the Kubernetes deployment, focusing on resource limits, liveness/readiness probes, and networking. Next, I will investigate the database connection pool and its configuration. I will also analyze the application code for potential resource leaks or inefficient database queries. Finally, I will implement monitoring and alerting to proactively detect and prevent future issues."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reasoning_trace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Gather detailed information about the failures. This includes the exact error messages, timestamps, affected users, and any patterns in the failures (e.g., time of day, specific order types)."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Analyze application logs, Kubernetes events, and user reports to collect failure data."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Examine the Kubernetes deployment configuration. Verify resource limits (CPU, memory) are appropriately set for the pods. Check liveness and readiness probes to ensure pods are healthy and responsive. Inspect networking configuration to rule out connectivity issues."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Use `kubectl describe deployment order-processing` and `kubectl get pods -o wide` to inspect the deployment and pod configurations. Check network policies and service definitions."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Investigate the database connection pool configuration. Verify the maximum number of connections is sufficient for the workload. Check for connection leaks or inefficient connection management in the application code."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Examine the application's database connection pool settings. Use database monitoring tools to track connection usage and identify potential leaks. Analyze application code for database query performance and connection handling."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Analyze application code for potential resource leaks or inefficient database queries. Use profiling tools to identify performance bottlenecks and memory leaks. Optimize database queries to reduce load on the database."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Use profiling tools like `pprof` or Java profilers to analyze application performance. Examine database query logs and use `EXPLAIN` to optimize slow queries."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implement enhanced monitoring and alerting. Set up alerts for database connection errors, high latency, and resource utilization. Use dashboards to visualize key metrics and identify trends."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Configure Prometheus and Grafana to monitor database connections, latency, and resource usage. Set up alerts in Alertmanager to notify on-call engineers of critical issues."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Consider implementing a circuit breaker pattern to prevent cascading failures. If the database becomes unavailable, the circuit breaker will prevent the application from overwhelming the database with requests."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implement a circuit breaker using a library like Hystrix or Resilience4j in the application code."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implement connection retry logic with exponential backoff. This will allow the application to gracefully handle temporary database connection issues."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Implement retry logic using a library like Guava Retry or Spring Retry in the application code."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Conduct load testing to simulate production traffic and identify potential bottlenecks. Use load testing tools like JMeter or Gatling to generate realistic workloads."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Run load tests with increasing traffic to identify the point at which the database connection timeouts occur."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"If the database is the bottleneck, consider scaling the database or optimizing its configuration. This may involve adding more resources to the database server, tuning database parameters, or implementing database sharding."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Scale the database vertically (add more CPU, memory) or horizontally (add more database replicas). Tune database parameters like `max_connections` and `shared_buffers`."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"step_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"thought"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Document the troubleshooting steps and the resolution. Create a runbook for future incidents. Share the knowledge with the team to prevent similar issues in the future."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Create a detailed runbook documenting the troubleshooting steps, the root cause, and the resolution. Share the runbook with the team and conduct a post-mortem analysis."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"final_answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The intermittent failures are likely due to database connection timeouts caused by resource contention or misconfiguration. The systematic approach involves gathering detailed information, examining the Kubernetes deployment, investigating the database connection pool, analyzing application code, implementing enhanced monitoring and alerting, and considering circuit breakers and retry logic. Load testing and database scaling may also be necessary. Finally, documenting the troubleshooting steps and sharing the knowledge with the team is crucial."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Developers Are Going Crazy for DeepFabric
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Smart Topic Generation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;DeepFabric doesn't just generate random examples. It creates a &lt;strong&gt;hierarchical topic tree&lt;/strong&gt; or &lt;strong&gt;graph-nodes&lt;/strong&gt; first, ensuring your dataset covers diverse subtopics without redundancy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mathematical Reasoning
├── Algebra Problems
│   ├── Linear Equations
│   └── Quadratic Functions
└── Geometry Problems
    ├── Area Calculations
    └── Volume Problems
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔧 &lt;strong&gt;YAML Configuration = Zero Complexity&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;No more complex prompt engineering. Just describe what you want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# cot_config.yaml&lt;/span&gt;
&lt;span class="na"&gt;dataset_system_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;helpful&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AI&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;that&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;solves&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;problems&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;step-by-step"&lt;/span&gt;

&lt;span class="na"&gt;topic_tree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;topic_prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Programming&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;challenges&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;algorithms"&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama"&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qwen3:32b"&lt;/span&gt;
  &lt;span class="na"&gt;depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;degree&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;

&lt;span class="na"&gt;data_engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;conversation_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cot_hybrid"&lt;/span&gt;
  &lt;span class="na"&gt;reasoning_style&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;logical"&lt;/span&gt;
  &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;coding&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;problems&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;that&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;require&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;systematic&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;thinking"&lt;/span&gt;

&lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;creation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;num_steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
    &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run: &lt;code&gt;deepfabric generate cot_config.yaml&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🌐 &lt;strong&gt;Multi-Provider Freedom&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Switch between providers based on your needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI GPT-4&lt;/strong&gt; for complex reasoning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; for local, private generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini&lt;/strong&gt; for fast bulk creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic Claude&lt;/strong&gt; for nuanced problems&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📤 &lt;strong&gt;Instant HuggingFace Integration&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepfabric generate config.yaml &lt;span class="nt"&gt;--hf-repo&lt;/span&gt; username/my-cot-dataset
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your dataset is automatically uploaded with a generated dataset card. No manual uploads, no fuss.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Impact: What Developers Are Building
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🎓 Educational AI&lt;/strong&gt;: Teachers creating personalized math tutoring datasets&lt;br&gt;
&lt;strong&gt;🤖 Agent Training&lt;/strong&gt;: Developers building reasoning agents for complex tasks&lt;br&gt;
&lt;strong&gt;📊 Research&lt;/strong&gt;: ML researchers generating evaluation benchmarks&lt;br&gt;
&lt;strong&gt;💼 Enterprise&lt;/strong&gt;: Companies creating domain-specific reasoning models&lt;/p&gt;
&lt;h2&gt;
  
  
  The Numbers Don't Lie
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;⏱️ 95% faster&lt;/strong&gt; than manual dataset creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;📈 10x more diverse&lt;/strong&gt; examples per domain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;💰 80% cost reduction&lt;/strong&gt; compared to data labeling services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;🎯 Zero prompt engineering&lt;/strong&gt; required&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Ready to Transform Your ML Pipeline?
&lt;/h2&gt;

&lt;p&gt;Getting started takes literally 30 seconds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;deepfabric

&lt;span class="c"&gt;# Generate your first CoT dataset&lt;/span&gt;
deepfabric generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic-prompt&lt;/span&gt; &lt;span class="s2"&gt;"Your domain here"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--conversation-type&lt;/span&gt; cot_freetext &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--num-steps&lt;/span&gt; 10 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; openai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; gpt-4o-mini

&lt;span class="c"&gt;# Watch the magic happen ✨&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;The ML community is moving fast, and quality training data is the bottleneck. DeepFabric removes that bottleneck entirely.&lt;/p&gt;

&lt;p&gt;Whether you're building the next breakthrough in reasoning AI or just need better training data for your side project, DeepFabric gives you superpowers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stop spending weeks on dataset creation. Start building better models today.&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Try DeepFabric Now:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;📚 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/lukehinds/deepfabric" rel="noopener noreferrer"&gt;https://github.com/lukehinds/deepfabric&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📖 &lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="https://lukehinds.github.io/DeepFabric/" rel="noopener noreferrer"&gt;https://lukehinds.github.io/DeepFabric/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💬 &lt;strong&gt;Discord&lt;/strong&gt;: Join the community for support and sharing datasets&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;What kind of CoT dataset will you build first? Drop a comment and let's discuss! 🚀&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags&lt;/strong&gt;: #MachineLearning #AI #Datasets #ChainOfThought #Python #OpenSource #MLOps #DataScience #DeepLearning #ArtificialIntelligence&lt;/p&gt;

</description>
      <category>llm</category>
      <category>machinelearning</category>
      <category>tooling</category>
    </item>
    <item>
      <title>🚀 Build a ☸️ Kubernetes ☸️ SRE Agent in Just Ten Minutes! ⏱️⚡</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Tue, 02 Sep 2025 09:42:55 +0000</pubDate>
      <link>https://forem.com/lukehinds/build-a-kubernetes-sre-agent-in-just-ten-minutes-ke4</link>
      <guid>https://forem.com/lukehinds/build-a-kubernetes-sre-agent-in-just-ten-minutes-ke4</guid>
      <description>&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/BQ0MT7UzDKg"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Hey dev.to community! Have you ever wished you could automate more of your Kubernetes operational tasks, freeing up valuable SRE time? I made a video recently on "Building an AgentUp Kubernetes Agent in only Ten Minutes," demonstrates how to do just that with incredible speed and efficiency using &lt;strong&gt;AgentUp&lt;/strong&gt;. This isn't just about scripting; it's about creating an &lt;strong&gt;autonomous, goal-driven agent&lt;/strong&gt; that can understand, plan, and execute complex Kubernetes operations.&lt;/p&gt;

&lt;p&gt;Let's dive into how the new and exciting &lt;strong&gt;&lt;a href="https://github.com/RedDotRocket/AgentUp" rel="noopener noreferrer"&gt;AgentUp&lt;/a&gt;&lt;/strong&gt; framework makes this possible and what makes this demonstration so compelling for developers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting Started: A New Agent in a Flash
&lt;/h3&gt;

&lt;p&gt;The video kicks off by showing just how easy it is to initialize a new agent using the &lt;code&gt;agentup init&lt;/code&gt; command. The key choices made set the stage for a powerful automation tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Iterative Agent:&lt;/strong&gt; This is the core of its intelligence – a &lt;strong&gt;goal-driven autonomous agent&lt;/strong&gt; capable of planning, decomposing tasks, and iterating through them until a goal is achieved. The demo generously allocates 50 iterations, showing its capacity for complex problem-solving.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Essential Components:&lt;/strong&gt; The initialization includes authentication, state management, an AI provider (OpenAI in this case), MCP (Model Context Protocol), and deployment options. For simplicity in the demo, a simple API key is used for authentication, with state persisted to a file.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Brain of AgentUp: &lt;code&gt;agentup.yaml&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Once initialized, the &lt;code&gt;agentup.yaml&lt;/code&gt; file becomes the central configuration point – effectively, the &lt;strong&gt;"brain of AgentUp"&lt;/strong&gt;. What's particularly appealing for developers here is its &lt;strong&gt;declarative runtime&lt;/strong&gt;, allowing you to set everything in a &lt;strong&gt;single point of truth&lt;/strong&gt;. This configuration is designed to be &lt;strong&gt;portable&lt;/strong&gt; and easily shared or stored in a repository.&lt;/p&gt;

&lt;p&gt;A critical part of the configuration involves the &lt;strong&gt;MCP (Model Context Protocol)&lt;/strong&gt;, where we use the &lt;a href="https://github.com/containers/kubernetes-mcp-server" rel="noopener noreferrer"&gt;&lt;strong&gt;Kubernetes MCP Server&lt;/strong&gt;&lt;/a&gt; from the containers team. In AgentUp, MCP tools are exposed as &lt;strong&gt;A2A (Agent-to-Agent) skills&lt;/strong&gt;, making them discoverable capabilities. This means one agent can query another to find out if it possesses the necessary skills, for example, to list pods or check cluster state.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Robust Security with Scopes&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;One of the most impressive aspects for a production environment is AgentUp's approach to security. Each tool available within the Kubernetes MCP server is mapped to a &lt;strong&gt;scope&lt;/strong&gt;, which acts as a security permission.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The demo uses a wildcard scope for simplicity, granting admin access.&lt;/li&gt;
&lt;li&gt;  However, the system supports a &lt;strong&gt;top-down scope hierarchy&lt;/strong&gt;, allowing for highly granular control over what the MCP server presents to the model.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Crucially, the model is not even aware of tools it has no scope for&lt;/strong&gt;, and any attempt to use them would be denied. This ensures that your agent operates only within its designated permissions. While a fixed API key is used for the demo, in a real-world scenario, these scopes would typically be within a signed token for robust validation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Putting the Agent to Work: Live Kubernetes Operations
&lt;/h3&gt;

&lt;p&gt;After a quick virtual environment setup and running the agent, the demonstration shifts to a front-end client that communicates with the agent via &lt;strong&gt;A2A JSON RPC methods&lt;/strong&gt;. This client immediately pulls and displays the available tools/skills from the agent.&lt;/p&gt;

&lt;p&gt;The demo beautifully illustrates the iterative agent's capabilities with a series of chained commands:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Discovering Namespaces and Pods:&lt;/strong&gt; The agent is asked, "How many namespaces do I have in my cluster and which namespaces contain the most pods?".

&lt;ul&gt;
&lt;li&gt;  The agent &lt;strong&gt;authenticates&lt;/strong&gt;, identifies available tools, and selects &lt;code&gt;namespaces list&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  It then proceeds to &lt;code&gt;pods list in namespace&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  The agent successfully breaks this down into &lt;strong&gt;three tasks&lt;/strong&gt;: listing namespaces, counting pods per namespace, and ranking namespaces by pod count. It correctly identifies the namespace with the most pods, showcasing its &lt;strong&gt;goal-driven task decomposition&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Checking Pod Running Times:&lt;/strong&gt; A follow-up query, "How long have the pods been running within the agent up cluster?", demonstrates the agent's ability to extract specific information, again calling &lt;code&gt;MCP pods listing namespace&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Performing a Deletion (and Verification!):&lt;/strong&gt; In a bold move, the demonstration requests, "Please remove the completed completed pod".

&lt;ul&gt;
&lt;li&gt;  The agent processes this, calling &lt;code&gt;pods delete&lt;/code&gt;, and successfully removes the specified pod. This highlights its &lt;strong&gt;write capabilities&lt;/strong&gt; and the impact of its assigned scopes.&lt;/li&gt;
&lt;li&gt;  To complete the loop, a verification step, "verify the pod has actually been removed," is performed, with the agent confirming the removal and providing a &lt;strong&gt;100% confidence score&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What This Means for Developers
&lt;/h3&gt;

&lt;p&gt;This AgentUp demo highlights several powerful takeaways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Speed of Development:&lt;/strong&gt; You can &lt;strong&gt;configure an agent incredibly quickly&lt;/strong&gt;, essentially needing to "set a persona for the agent, give it some tools, and then send it to work".&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Autonomous Operations:&lt;/strong&gt; The &lt;strong&gt;iterative, goal-driven nature&lt;/strong&gt; of these agents means they can handle complex tasks with minimal human intervention.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Secure &amp;amp; Discoverable Capabilities:&lt;/strong&gt; The &lt;strong&gt;MCP system with granular scopes&lt;/strong&gt; ensures secure operations, while A2A skills make agent capabilities discoverable, paving the way for advanced agent collaboration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The video hints at exciting future developments like &lt;strong&gt;agent collaboration&lt;/strong&gt;, which promises to unlock even more complex automation scenarios.&lt;/p&gt;

&lt;p&gt;If you're looking to bring intelligent automation to your Kubernetes clusters and streamline SRE workflows, AgentUp looks like a game-changer. Check out the video and consider exploring AgentUp yourself – there's even an upcoming agent workshop mentioned where you can build your own agents! This is definitely one to watch for anyone interested in the future of autonomous operations.&lt;/p&gt;

&lt;p&gt;The Agentup.yml is available here, to get you going in minutes!&lt;/p&gt;

&lt;p&gt;What other Agents could you build, using your favorite MCP servers?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/lukehinds/1013cfe94a5417715a71ae4c5e9b6c8c" rel="noopener noreferrer"&gt;https://gist.github.com/lukehinds/1013cfe94a5417715a71ae4c5e9b6c8c&lt;/a&gt; &lt;/p&gt;

</description>
      <category>agentup</category>
      <category>kubernetes</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>vibe coders welcome</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Sat, 30 Aug 2025 12:40:53 +0000</pubDate>
      <link>https://forem.com/lukehinds/vibe-coders-welcome-4cfp</link>
      <guid>https://forem.com/lukehinds/vibe-coders-welcome-4cfp</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg" class="crayons-story__hidden-navigation-link"&gt;🤖 Build Your First Fully Autonomous AI Agent - Free Workshop! 🚀&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/lukehinds" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F978931%2F0fa78d0e-564a-46f3-98ff-2b76b40859d3.jpeg" alt="lukehinds profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/lukehinds" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Luke Hinds
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Luke Hinds
                
              
              &lt;div id="story-author-preview-content-2809288" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/lukehinds" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F978931%2F0fa78d0e-564a-46f3-98ff-2b76b40859d3.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Luke Hinds&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Aug 30 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg" id="article-link-2809288"&gt;
          🤖 Build Your First Fully Autonomous AI Agent - Free Workshop! 🚀
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/agentaichallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;agentaichallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/aiops"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;aiops&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/opensource"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;opensource&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              3&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            1 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>agentaichallenge</category>
      <category>aiops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>🤖 Build Your First Fully Autonomous AI Agent - Free Workshop! 🚀</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Sat, 30 Aug 2025 12:27:43 +0000</pubDate>
      <link>https://forem.com/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg</link>
      <guid>https://forem.com/lukehinds/build-any-agent-you-like-at-the-free-agentup-workshop-come-try-out-the-hottest-open-source-agent-3alg</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvl88bzsh1wpdtfrkoqq8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvl88bzsh1wpdtfrkoqq8.jpg" alt=" " width="800" height="906"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ever wanted to build your own multi-turn autonomous AI Agent? &lt;/p&gt;

&lt;p&gt;A red teamer 👾, coding assistant 🧑🏽‍💻, CI bug fixer 💽, AI Dog Washer 🐶?&lt;/p&gt;

&lt;p&gt;Got a burning business idea you'd love to prototype with AI agents, but don't want to spend 2-3 days wrestling with spaghetti like over-bloated frameworks?&lt;/p&gt;

&lt;p&gt;Well come along to the "Build an Agent", &lt;a href="https://github.com/RedDotRocket/AgentUp" rel="noopener noreferrer"&gt;AgentUp&lt;/a&gt; Workshop and you get to take home your own working agent at the end! 🤖&lt;/p&gt;

&lt;p&gt;What you need:&lt;br&gt;
Ability to use a terminal ✅&lt;br&gt;
Can edit config files ✅&lt;br&gt;
Some imagination ✅&lt;br&gt;
✅ No coding skills required , vibe coders are welcome 🚀&lt;/p&gt;

&lt;p&gt;Bring whatever MCP servers you want, plug them into AgentUp, and bob's your uncle - you've got a working agent!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/RedDotRocket/AgentUp" rel="noopener noreferrer"&gt;AgentUp&lt;/a&gt; is the Open Source Agent Framework to learn for the future. It built by engineers who've created open-source solutions powering mission-critical systems at Google, GitHub, Nvidia, Red Hat, Shopify and more. We understand what it takes to build stable, secure, scalable software - and we're applying those same principles to make AI agents truly production-ready, secure and reliable.&lt;/p&gt;

&lt;p&gt;Friday 09/05 at 7PM C.E.T / 10:00 am PDT on the RDR discord:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/gqSg8YMm?event=1410340261778952263" rel="noopener noreferrer"&gt;https://discord.gg/gqSg8YMm?event=1410340261778952263&lt;/a&gt; &lt;/p&gt;

</description>
      <category>ai</category>
      <category>agentaichallenge</category>
      <category>aiops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The Agent Paradox: Why AI Agents Shine in Demos But Stumble in Production</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Tue, 26 Aug 2025 12:23:06 +0000</pubDate>
      <link>https://forem.com/lukehinds/the-agent-paradox-why-ai-agents-shine-in-demos-but-stumble-in-production-1he1</link>
      <guid>https://forem.com/lukehinds/the-agent-paradox-why-ai-agents-shine-in-demos-but-stumble-in-production-1he1</guid>
      <description>&lt;h2&gt;
  
  
  It worked so well before, what went wrong?
&lt;/h2&gt;

&lt;p&gt;A recurring pattern has emerged across the AI Agent world: Agents that demonstrate remarkable capabilities in controlled environments often struggle when deployed to real-world scenarios. These systems execute complex tasks with seemingly magical precision during demonstrations, impressing audiences and convincing stakeholders, when on strict guardrails with highly specific system prompts. However, production deployment tells a different story. &lt;/p&gt;

&lt;p&gt;The same agent that wow'ed with the demo, begins to stumble in the wild. It enters infinite loops, produces inconsistent outputs, or worse—fails catastrophically with real financial and reputational consequences. This phenomenon reveals the &lt;strong&gt;Agent Paradox&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This phenomenon isn't just an unfortunate coincidence; it's a fundamental challenge that exposes the mismatch between our traditional engineering practices and the probabilistic nature of AI systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Heart of the Problem
&lt;/h2&gt;

&lt;p&gt;The core issue lies in the &lt;strong&gt;probabilistic nature&lt;/strong&gt; of Large Language Models (LLMs) that power these agents. Unlike traditional software, LLMs don't produce the same output for identical inputs. This non-determinism creates a perfect storm of engineering challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unpredictable behavior&lt;/strong&gt; under identical conditions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Difficulty in comprehensive testing&lt;/strong&gt; of all potential outcomes
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex failure modes&lt;/strong&gt; that emerge only in production environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Coming aground" scenarios&lt;/strong&gt; where agents get stuck in loops or halt unexpectedly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When agents encounter novel user interactions, live data streams, or unexpected API responses, their brittleness becomes painfully apparent. This reality demands a fundamental rethinking of how we develop, test, and manage agentic software.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fundamental Engineering Conflict
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Traditional Software Engineering: The Deterministic World
&lt;/h3&gt;

&lt;p&gt;For decades, our engineering practices have been built on solid deterministic foundations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Predictable Logic&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;income&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;income&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;income&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;income&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Given input &lt;code&gt;x&lt;/code&gt;, function always produces output &lt;code&gt;y&lt;/code&gt;. Simple. Reliable. Testable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repeatable Testing&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_tax_calculation&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;calculate_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;calculate_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;12000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this test a million times—same result every time. CI passes, we ship!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transparent Debugging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When bugs occur, we set breakpoints, inspect variables, trace the call stack. The application's state is frozen in time, allowing systematic analysis of exactly what went wrong.&lt;/p&gt;

&lt;p&gt;This is engineering in a determined world.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Agents: Welcome to the Probabilistic Universe
&lt;/h3&gt;

&lt;p&gt;AI agents operate in a fundamentally different paradigm that &lt;strong&gt;shatters our deterministic world&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Probabilistic Nature&lt;/strong&gt;&lt;br&gt;
Instead of computing a single "correct" output, agents calculate probability distributions over vast possibility spaces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;P(Output | Input) = complex probability distribution
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent samples from this distribution, and the emergent behaviors are often beyond our complete understanding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Black Box Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There's no breakpoint you can set inside a neural network during inference. The agent's "state" exists as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-dimensional vectors that are opaque to human inspection&lt;/li&gt;
&lt;li&gt;Complex weight matrices with billions of parameters&lt;/li&gt;
&lt;li&gt;Emergent patterns that resist traditional analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When an agent fails, debugging becomes an exercise in educated guesswork rather than systematic analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Implications
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Demo Environment vs. Production Reality
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Demo Environment:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Controlled inputs and scenarios&lt;/li&gt;
&lt;li&gt;Curated test cases that showcase strengths&lt;/li&gt;
&lt;li&gt;Limited edge cases&lt;/li&gt;
&lt;li&gt;Forgiving error handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Production Environment:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unpredictable user behavior&lt;/li&gt;
&lt;li&gt;Malformed or unexpected data&lt;/li&gt;
&lt;li&gt;Integration with flaky external APIs&lt;/li&gt;
&lt;li&gt;Real stakes for failures&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Brittle Agent Phenomenon
&lt;/h3&gt;

&lt;p&gt;Consider a customer service agent that works perfectly in demos but fails in production when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users provide ambiguous requests&lt;/li&gt;
&lt;li&gt;External systems return unexpected error codes&lt;/li&gt;
&lt;li&gt;The conversation context becomes too complex&lt;/li&gt;
&lt;li&gt;Edge cases emerge that weren't anticipated during development&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Moving Forward: New Engineering Paradigms
&lt;/h2&gt;

&lt;p&gt;The Agent Paradox isn't just a technical challenge—it's a call for &lt;strong&gt;new engineering methodologies&lt;/strong&gt; adapted to probabilistic systems:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Probabilistic Testing Strategies
&lt;/h3&gt;

&lt;p&gt;Instead of asserting exact outputs, we need to test distributions and ranges of acceptable behaviors.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Observability Over Debugging
&lt;/h3&gt;

&lt;p&gt;Since we can't debug AI agents traditionally, we need robust monitoring, logging, and behavioral analysis systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Graceful Degradation by Design
&lt;/h3&gt;

&lt;p&gt;Systems must be architected to handle probabilistic failures elegantly, with multiple fallback strategies.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Continuous Behavioral Validation
&lt;/h3&gt;

&lt;p&gt;Ongoing monitoring of agent behavior in production, with automated detection of drift or degradation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Agent Paradox represents more than a technical hurdle—it's a paradigm shift that challenges decades of software engineering practices. As we continue to integrate AI agents into critical systems, we must evolve our methodologies to embrace probabilistic behavior while maintaining the reliability our users expect.&lt;/p&gt;

&lt;p&gt;The magic of AI agents in demos is real, but harnessing that magic reliably in production requires us to fundamentally rethink how we build, test, and maintain software systems.&lt;/p&gt;

&lt;p&gt;The future belongs to engineers who can bridge the gap between deterministic expectations and probabilistic realities.&lt;/p&gt;

&lt;p&gt;We are trying to our small part towards making that happen with &lt;a href="https://github.com/RedDotRocket/AgentUp" rel="noopener noreferrer"&gt;AgentUp&lt;/a&gt;, enterprise grade Agents , built with good fundamental principles of software engineering. &lt;/p&gt;




&lt;p&gt;&lt;em&gt;What strategies have you found effective for managing probabilistic AI systems in production? Share your experiences in the comments below.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; #ai #agents #llm #softwaredevelopment #engineering #testing #debugging #machinelearning&lt;/p&gt;

</description>
    </item>
    <item>
      <title>What is the Best Programming Language for AI Engineering!</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Thu, 07 Aug 2025 09:16:33 +0000</pubDate>
      <link>https://forem.com/lukehinds/-473a</link>
      <guid>https://forem.com/lukehinds/-473a</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/lukehinds/what-is-the-best-language-for-ai-engineering-4iom" class="crayons-story__hidden-navigation-link"&gt;What Is The Best Language 🏆for AI Engineering❓&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/lukehinds" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F978931%2F0fa78d0e-564a-46f3-98ff-2b76b40859d3.jpeg" alt="lukehinds profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/lukehinds" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Luke Hinds
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Luke Hinds
                
              
              &lt;div id="story-author-preview-content-2757717" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/lukehinds" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F978931%2F0fa78d0e-564a-46f3-98ff-2b76b40859d3.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Luke Hinds&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/lukehinds/what-is-the-best-language-for-ai-engineering-4iom" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Aug 7 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/lukehinds/what-is-the-best-language-for-ai-engineering-4iom" id="article-link-2757717"&gt;
          What Is The Best Language 🏆for AI Engineering❓
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/python"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;python&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/rust"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;rust&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/go"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;go&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/lukehinds/what-is-the-best-language-for-ai-engineering-4iom" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;7&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/lukehinds/what-is-the-best-language-for-ai-engineering-4iom#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              1&lt;span class="hidden s:inline"&gt; comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            2 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>python</category>
      <category>rust</category>
      <category>go</category>
    </item>
    <item>
      <title>What Is The Best Language 🏆for AI Engineering❓</title>
      <dc:creator>Luke Hinds</dc:creator>
      <pubDate>Thu, 07 Aug 2025 09:00:47 +0000</pubDate>
      <link>https://forem.com/lukehinds/what-is-the-best-language-for-ai-engineering-4iom</link>
      <guid>https://forem.com/lukehinds/what-is-the-best-language-for-ai-engineering-4iom</guid>
      <description>&lt;p&gt;I have had a few folks ask me about language choices in AI Engineering. Now before I get into this, there is never a one-size fits all answer, so much of it is based on many nuances, with a key one being the fluency of the individual or team in the choice of language.&lt;/p&gt;

&lt;p&gt;We have also seen AI become commoditized where you don't need to understand complex math or neural networks, and instead it's about slinging requests around, handling concurrency and other common web-like programming paradigms. &lt;/p&gt;

&lt;p&gt;Having said that , here are my views and why as someone who has historically been more of a static typed enthusiast - Python is coming out on top and its why I am building &lt;a href="https://github.com/RedDotRocket/AgentUp" rel="noopener noreferrer"&gt;AgentUp&lt;/a&gt; in Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  1 - Python
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Massive AI/ML ecosystem (transformers, pytorch, OpenAI libs, etc.)&lt;/li&gt;
&lt;li&gt;Performance when it matters - heavy computation happens in optimized C/CUDA libraries under the hood, so you get Python's simplicity for App logic + C's speed for AI operations&lt;/li&gt;
&lt;li&gt;Decent async support for concurrent agents (I love FastAPI)&lt;/li&gt;
&lt;li&gt;Yes, its dynamically typed, but Pydantic helps lots with data validation (and v2's Rust core from a performance standpoint)&lt;/li&gt;
&lt;li&gt;Lots of strong frameworks, langchain, autogen, langgraph, pydantic AI and more.&lt;/li&gt;
&lt;li&gt;Extensive libraries for common agent needs (web scraping, APIs, databases)&lt;/li&gt;
&lt;li&gt;It's been the go to data science lang for years, so has a strong pipeline of research to product - e.g. jupyter -&amp;gt; prototype -&amp;gt; product.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  No 2. Go
&lt;/h2&gt;

&lt;p&gt;Go would be my second choice, especially if performance and deployment simplicity are priorities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Excellent concurrency model with goroutines (ideal for multi-agent systems)&lt;/li&gt;
&lt;li&gt;Single binary deployments with no dependency management headaches&lt;/li&gt;
&lt;li&gt;Strong performance for I/O-heavy operations (common in agents)&lt;/li&gt;
&lt;li&gt;Growing ecosystem for AI integrations&lt;/li&gt;
&lt;li&gt;Excellent native networking (net/http is phenomenal and the gRPC ecosystem strength for microservices architectures)&lt;/li&gt;
&lt;li&gt;No native CUDA bindings (I am not a fan of CGo).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  No. 3 Rust
&lt;/h2&gt;

&lt;p&gt;Rust ranks third , another strong statically typed language, with great tooling support (cargo ftw!)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exceptional performance and memory safety&lt;/li&gt;
&lt;li&gt;Excellent for systems-level work&lt;/li&gt;
&lt;li&gt;But smaller AI ecosystem and steeper learning curve, but huggingface are building out more libraries there.&lt;/li&gt;
&lt;li&gt;Development velocity would likely be slower&lt;/li&gt;
&lt;li&gt;Better suited if you're building performance-critical infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Worthy Mention, Typescript!
&lt;/h2&gt;

&lt;p&gt;TypeScript/Node.js/Deno deserves a mention for teams already in that ecosystem! Vercel's AI SDK is decent framework!&lt;/p&gt;

&lt;p&gt;Of course there are many more nuances though. Certain languages can be applied where they are strong. If you're building Agents that deal with web APIs all day, you're not too concerned about gpu inference etc.&lt;/p&gt;

&lt;p&gt;As with any x vs y post, I am sure I have opened a can of worms! My next topic will be vim vs emacs, just kidding&lt;/p&gt;

&lt;h3&gt;
  
  
  What are your thoughts, which language did you choose and why?
&lt;/h3&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>rust</category>
      <category>go</category>
    </item>
  </channel>
</rss>
