<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Marius Momeu</title>
    <description>The latest articles on Forem by Marius Momeu (@mariusmomeu).</description>
    <link>https://forem.com/mariusmomeu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3757191%2Fd0db92e7-aed8-423b-906d-369d9ac1eebf.png</url>
      <title>Forem: Marius Momeu</title>
      <link>https://forem.com/mariusmomeu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mariusmomeu"/>
    <language>en</language>
    <item>
      <title>I Spent $174 Transpiling 12 Open-Source C Projects (28K Lines) to Rust. Here's What Happened.</title>
      <dc:creator>Marius Momeu</dc:creator>
      <pubDate>Thu, 12 Feb 2026 15:51:41 +0000</pubDate>
      <link>https://forem.com/mariusmomeu/i-spent-174-transpiling-12-open-source-c-projects-28k-lines-to-rust-heres-what-happened-3g9i</link>
      <guid>https://forem.com/mariusmomeu/i-spent-174-transpiling-12-open-source-c-projects-28k-lines-to-rust-heres-what-happened-3g9i</guid>
      <description>&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I've been curious about how far AI agents can go with real systems programming work — not toy examples, but actual C libraries with pointer arithmetic, &lt;code&gt;void*&lt;/code&gt; generics, linked lists, and cryptographic primitives. So I extracted 46 functions and modules from 12 open-source C projects and pointed our transpiler agent — designed via &lt;a href="https://github.com/ksenxx/kiss_ai" rel="noopener noreferrer"&gt;KISS AI&lt;/a&gt; and using Opus 4.5 as the LLM — at each one to see if it could produce idiomatic, memory-safe Rust without human intervention.&lt;/p&gt;

&lt;p&gt;Each project came with a gauntlet of JSON test vectors defining success criteria. But the agent couldn't simply aim for those: it first had to generate comprehensive unit tests and achieve byte-for-byte I/O equivalence with the C originals through FFI verification, and only then validate against the provided test vectors.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real-World Libraries (12 Projects)
&lt;/h2&gt;

&lt;p&gt;These are functions and modules pulled from 12 real-world open-source projects, covering everything from low-level string manipulation to full collision detection engines, procedural noise generation, and post-quantum cryptography:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/DaveGamble/cJSON" rel="noopener noreferrer"&gt;cJSON&lt;/a&gt;&lt;/strong&gt; (~3,800 lines) — a lightweight JSON parser with recursive tree structures using &lt;code&gt;next&lt;/code&gt;/&lt;code&gt;prev&lt;/code&gt;/&lt;code&gt;child&lt;/code&gt; pointers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/curl/curl" rel="noopener noreferrer"&gt;curl&lt;/a&gt;&lt;/strong&gt; (~50 lines) — path manipulation and string duplication functions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nothings/stb" rel="noopener noreferrer"&gt;stb&lt;/a&gt;&lt;/strong&gt; (~9,700 lines) — Sean Barrett's famous single-header C libraries, including hash maps, dynamic arrays, and string containers that achieve generics through &lt;code&gt;void*&lt;/code&gt; type erasure and macro magic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/wazuh/wazuh" rel="noopener noreferrer"&gt;Wazuh&lt;/a&gt;&lt;/strong&gt; (~1,200 lines) — base64 encoding/decoding, UTF-8 handling, search-and-replace, file queue management from this security monitoring platform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/RandyGaul/cute_headers" rel="noopener noreferrer"&gt;cute_c2&lt;/a&gt;&lt;/strong&gt; (~6,000 lines) — a 2D collision detection library implementing the GJK algorithm, AABB tests, capsule collisions, ray casting, and manifold generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/anael-seghezzi/Maratis-Tiny-C-library" rel="noopener noreferrer"&gt;Maratis Tiny C Library&lt;/a&gt;&lt;/strong&gt; (~3,300 lines) — image loading (PNG), pixel format conversion, inflate decompression, and agglomerative clustering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/id-Software/Quake-III-Arena" rel="noopener noreferrer"&gt;Quake III Arena&lt;/a&gt;&lt;/strong&gt; (~2,300 lines) — the legendary &lt;code&gt;q_math.c&lt;/code&gt; (with a certain famous expletive redacted from the comments)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/nothings/stb" rel="noopener noreferrer"&gt;stb_perlin&lt;/a&gt;&lt;/strong&gt; (~500 lines) — Ken Perlin's improved noise function from stb, covering 3D noise, fractal Brownian motion, ridge noise, and turbulence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/facebook/zstd" rel="noopener noreferrer"&gt;Zstandard (zstd)&lt;/a&gt;&lt;/strong&gt; (~100 lines) — filename generation and line-pointer utilities from Meta's fast compression library&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/cr-bgbcc" rel="noopener noreferrer"&gt;BTAC1C&lt;/a&gt;&lt;/strong&gt; (~550 lines) — audio sample prediction functions from Brendan Bohannon's audio compression codec&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt;&lt;/strong&gt; (~5,000 lines) — a NIST-standardized post-quantum signature scheme (FIPS 205 / SLH-DSA) combining WOTS+, FORS, and Merkle trees with BLAKE hash backend&lt;/li&gt;
&lt;li&gt;Entries from the &lt;strong&gt;&lt;a href="http://www.underhanded-c.org/" rel="noopener noreferrer"&gt;Underhanded C Contest&lt;/a&gt;&lt;/strong&gt; (~200 lines) — deliberately deceptive code designed to hide malicious behavior in innocent-looking C&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How the Agent Works
&lt;/h2&gt;

&lt;p&gt;Each transpilation follows a 10-step pipeline, entirely automated:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Read&lt;/strong&gt; all C source files and test vectors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compile&lt;/strong&gt; the C source with cmake to verify it builds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt; a Rust project (edition 2024) and transpile to idiomatic, memory-safe Rust&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate&lt;/strong&gt; unit tests (no mocks — real inputs and outputs only)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify&lt;/strong&gt; I/O equivalence by FFI-calling both C and Rust and comparing results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate&lt;/strong&gt; tests from the JSON test vectors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt; C-compatible FFI wrappers (&lt;code&gt;#[unsafe(no_mangle)] pub extern "C" fn&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remove&lt;/strong&gt; all &lt;code&gt;unsafe&lt;/code&gt; from core logic, confining it to FFI boundaries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark&lt;/strong&gt; Rust vs C performance (mean over 100 executions) and memory overhead (peak RSS via GNU &lt;code&gt;time -v&lt;/code&gt;, heap/stack profiling via valgrind massif and DHAT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write&lt;/strong&gt; a translation summary documenting everything&lt;/li&gt;
&lt;/ol&gt;
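&lt;p&gt;To make steps 7–8 concrete, here's a minimal sketch (the function names are illustrative, not taken from the actual translations): the core logic is safe Rust, and the only &lt;code&gt;unsafe&lt;/code&gt; lives in the C-compatible wrapper at the FFI boundary.&lt;/p&gt;

```rust
/// Safe core logic: sum the bytes of a slice.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// C-compatible wrapper. Rust 2024 spells the attribute
/// `#[unsafe(no_mangle)]`; `#[no_mangle]` is the pre-2024 form.
#[no_mangle]
pub extern "C" fn checksum_ffi(ptr: *const u8, len: usize) -> u32 {
    if ptr.is_null() {
        return 0;
    }
    // The single unsafe block: trusting the caller's pointer/length pair.
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    checksum(data)
}
```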

&lt;p&gt;The agent works autonomously and decides how to tackle issues at every step. When it hits a compile error or test failure, it reads the error, diagnoses the issue, fixes the code, and retries. Across all 46 translation tasks, it made ~4,500 tool calls — 35% shell commands (cmake, cargo), 22% file writes, 18% file reads, and 8% edits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Orchestration: No Subagents, Just Focused Sessions
&lt;/h3&gt;

&lt;p&gt;Looking at the raw agent logs for cJSON, Perlin Noise, and SPHINCS+ reveals something I didn't expect: the agent never spawns subagents. Despite having KISS's &lt;code&gt;Task&lt;/code&gt; tool available — which can delegate work to specialized child agents — every project is handled by a single KISS session working directly with its tools. No delegation, no fan-out, no divide-and-conquer.&lt;/p&gt;

&lt;p&gt;The architecture is two layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Outer orchestrator.&lt;/strong&gt; A Python script scans for projects, iterates through them sequentially, and launches a fresh KISS session for each one. Each session gets its own context window with no shared state between projects. If a transpilation fails, the script retries with a fresh session (up to 6 attempts) — though across all 46 translations, every one succeeded on the first attempt.&lt;/p&gt;
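&lt;p&gt;A minimal Rust sketch of that retry loop (the real orchestrator is a Python script; &lt;code&gt;run_fresh_session&lt;/code&gt; is a hypothetical stand-in for launching a KISS session with its own context window):&lt;/p&gt;

```rust
fn run_fresh_session(project: &str, attempt: u32) -> bool {
    // Stand-in: in the experiment, every project succeeded on attempt 1.
    let _ = (project, attempt);
    true
}

fn transpile_all(projects: &[&str]) -> Vec<(String, bool)> {
    projects
        .iter()
        .map(|p| {
            // Fresh session per attempt, up to 6 attempts per project.
            let ok = (1..=6).any(|attempt| run_fresh_session(p, attempt));
            (p.to_string(), ok)
        })
        .collect()
}
```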

&lt;p&gt;&lt;strong&gt;Inner agent.&lt;/strong&gt; A single KISS instance that receives the 10-step pipeline as a system prompt, then works through it autonomously using direct tool calls. Within each session:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Aggressive tool call batching.&lt;/strong&gt; Nearly every turn fires 3–8 parallel operations. The Perlin Noise session opens with 2 parallel glob patterns to discover source files and test vectors simultaneously, then immediately fires 3 parallel file reads, then 3 more test vector reads — all in the first 3 turns. The SPHINCS+ session reads 4 source files per turn during its initial analysis phase (20+ files across ~5 turns).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;TodoWrite&lt;/code&gt; as working memory.&lt;/strong&gt; Every session maintains a structured checklist of the 10 pipeline steps, updating statuses in real time. This gives the agent an explicit record of what's done and what's next — important because the context window is finite and the agent can't always "remember" what it did 50 turns ago.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context compaction for long sessions.&lt;/strong&gt; When a session exceeds the context window limit (~157K tokens), KISS automatically compacts the conversation into a structured summary and the agent continues from where it left off. Perlin Noise (96 turns) never compacted. cJSON (109 turns) compacted once — mid-way through FFI test creation. SPHINCS+ (265 turns) compacted twice: once while fixing &lt;code&gt;static mut&lt;/code&gt; references, and again while debugging the BLAKE byte-vs-bit count issue. After each compaction, the agent reads its own previously-generated files to re-orient rather than relying on conversation memory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No planning phase.&lt;/strong&gt; The agent never enters Plan mode or pauses to reason about &lt;em&gt;how&lt;/em&gt; to decompose the problem. It reads the C code, builds it, creates the Rust project, writes the translation, and iterates on errors — all in a single continuous loop. The 10-step pipeline structure comes entirely from the system prompt; the agent follows it implicitly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Session sizes varied predictably with project complexity. The 44 simpler library translations averaged 62 turns each (range: 41–96). cJSON required 109 turns due to its recursive tree structure and the UTF-8 byte accumulation fix. SPHINCS+ was the outlier at 265 turns — the combination of 20+ C source files, Rust 2024's &lt;code&gt;static mut&lt;/code&gt; prohibition, and the BLAKE byte/bit count debugging pushed it to nearly 4x the typical session length.&lt;/p&gt;

&lt;h3&gt;
  
  
  Notable Findings
&lt;/h3&gt;

&lt;p&gt;Beyond just translating code, the agent surfaced architectural insights and potential bugs in the original C codebases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unnecessary &lt;code&gt;Rc&amp;lt;RefCell&amp;lt;&amp;gt;&amp;gt;&lt;/code&gt; in cJSON.&lt;/strong&gt; The agent initially reached for &lt;code&gt;Rc&amp;lt;RefCell&amp;lt;CJson&amp;gt;&amp;gt;&lt;/code&gt; to model &lt;a href="https://github.com/DaveGamble/cJSON" rel="noopener noreferrer"&gt;cJSON&lt;/a&gt;'s tree of &lt;code&gt;next&lt;/code&gt;/&lt;code&gt;prev&lt;/code&gt;/&lt;code&gt;child&lt;/code&gt; pointers. When I had it investigate whether that was actually necessary, it found zero &lt;code&gt;Rc::clone()&lt;/code&gt; calls anywhere — items were never shared between multiple owners. The entire &lt;code&gt;Rc&amp;lt;RefCell&amp;lt;&amp;gt;&amp;gt;&lt;/code&gt; layer was unnecessary overhead, driven by the agent's defensive stance on C pointer aliasing patterns that didn't actually exist in this codebase. Removing it made the Rust translation both simpler and faster than the C original (see Performance below).&lt;/p&gt;

&lt;p&gt;This pattern repeated across several projects: C codebases had aliasing patterns that were technically safe but relied entirely on programmer discipline. Rust's borrow checker forced the agent to make ownership explicit, which revealed that many assumed-complex pointer relationships were actually simple tree structures or single-owner patterns.&lt;/p&gt;
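&lt;p&gt;A simplified sketch of the ownership insight (not the project's actual types): once no node is shared between owners, the &lt;code&gt;next&lt;/code&gt;/&lt;code&gt;prev&lt;/code&gt;/&lt;code&gt;child&lt;/code&gt; pointer web collapses into a plain tree of owned children.&lt;/p&gt;

```rust
enum Json {
    Number(f64),
    Text(String),
    Array(Vec<Json>), // each child has exactly one owner: its parent
}

fn count_nodes(v: &Json) -> usize {
    match v {
        Json::Array(items) => 1 + items.iter().map(count_nodes).sum::<usize>(),
        _ => 1,
    }
}
```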

&lt;p&gt;&lt;strong&gt;The BLAKE hash bug in SPHINCS+.&lt;/strong&gt; While translating the &lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt; post-quantum signature scheme (&lt;a href="https://csrc.nist.gov/pubs/fips/205/final" rel="noopener noreferrer"&gt;FIPS 205 / SLH-DSA&lt;/a&gt;), the agent noticed that the KAT test driver code passes &lt;strong&gt;byte counts&lt;/strong&gt; to &lt;code&gt;blake256_update()&lt;/code&gt; where the function signature expects &lt;strong&gt;bit counts&lt;/strong&gt; — effectively processing only 1/8th of the intended data per update call. We introduced this bug deliberately to test whether the agent would catch semantic mismatches between function signatures and call sites. The agent flagged it immediately but faithfully reproduced the behavior for I/O equivalence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// C code: blakeX_update(ctx, data, SPX_N);  // SPX_N=16, expects bits&lt;/span&gt;
&lt;span class="c1"&gt;// Rust:   state.update(data, SPX_N as u64); // Matching C's byte-vs-bit behavior&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The KAT (Known Answer Test) digest matches exactly: &lt;code&gt;97B452A98F312321D982CDE133B1BF6D7189DC0A9296338C9A823A6689670584&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Eliminating &lt;code&gt;unsafe&lt;/code&gt; from SPHINCS+.&lt;/strong&gt; The agent initially used &lt;code&gt;unsafe&lt;/code&gt; blocks in the address and thash modules, then successfully replaced them with safe alternatives using explicit &lt;code&gt;to_ne_bytes&lt;/code&gt;/&lt;code&gt;from_ne_bytes&lt;/code&gt; conversions — trading some performance for full memory safety in the core library.&lt;/p&gt;
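&lt;p&gt;A sketch of that safe-conversion pattern (the buffer layout is illustrative, not SPHINCS+'s actual address encoding): where C reinterprets a byte buffer as words via pointer casts, safe Rust round-trips through &lt;code&gt;to_ne_bytes&lt;/code&gt;/&lt;code&gt;from_ne_bytes&lt;/code&gt;.&lt;/p&gt;

```rust
/// Write a u32 into a byte buffer at `offset`, safely, in native byte order.
fn set_word(buf: &mut [u8; 32], offset: usize, value: u32) {
    buf[offset..offset + 4].copy_from_slice(&value.to_ne_bytes());
}

/// Read the u32 back out; no pointer casts, no `unsafe`.
fn get_word(buf: &[u8; 32], offset: usize) -> u32 {
    let mut w = [0u8; 4];
    w.copy_from_slice(&buf[offset..offset + 4]);
    u32::from_ne_bytes(w)
}
```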

&lt;h3&gt;
  
  
  Debugging Patterns I Observed
&lt;/h3&gt;

&lt;p&gt;The agent developed consistent debugging strategies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Never trusting its own test expectations.&lt;/strong&gt; When a Rust test failed, it compiled and ran the C version to establish ground truth before fixing the Rust code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creating minimal reproduction programs.&lt;/strong&gt; For integer overflow edge cases, it wrote one-off C programs to test &lt;code&gt;INT_MIN / -1&lt;/code&gt; behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clone-operate-reassign for borrow checker issues.&lt;/strong&gt; When hitting &lt;code&gt;cannot borrow as mutable because it is also borrowed as immutable&lt;/code&gt;, it consistently applied the pattern of cloning the data, operating on the clone, then reassigning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Byte accumulation for UTF-8.&lt;/strong&gt; It learned not to cast individual bytes to &lt;code&gt;char&lt;/code&gt; when parsing strings containing multibyte characters (this came up with Japanese text in JSON).&lt;/li&gt;
&lt;/ul&gt;
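&lt;p&gt;The clone-operate-reassign pattern on a toy task (collapsing adjacent duplicates), sketched under the assumption that a direct in-place rewrite would trip the borrow checker:&lt;/p&gt;

```rust
fn dedup_adjacent(items: &mut Vec<i32>) {
    let snapshot = items.clone(); // clone ...
    let mut out = Vec::with_capacity(snapshot.len());
    for &x in &snapshot {
        // ... operate on the clone ...
        if out.last() != Some(&x) {
            out.push(x);
        }
    }
    *items = out; // ... reassign
}
```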

&lt;h3&gt;
  
  
  Testing Strategy
&lt;/h3&gt;

&lt;p&gt;Every translation includes a comprehensive test suite with three tiers of verification:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Unit Tests&lt;/strong&gt; — verify individual functions and edge cases in isolation. The agent writes these while implementing each module, covering boundary conditions, error paths, and algorithmic correctness. Test counts range from 9 to 50+ per project depending on complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. FFI Equivalence Tests&lt;/strong&gt; — the critical layer. For each public API function, the agent compiles both the C original and Rust translation, calls them via FFI with identical inputs, and asserts byte-for-byte output equivalence. This catches subtle behavioral divergence (floating-point rounding, string encoding, integer overflow) that unit tests might miss. Typical coverage: 6–19 equivalence tests per project.&lt;/p&gt;
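&lt;p&gt;The shape of such an equivalence test, made self-contained for illustration: in the real harness the reference output comes from the compiled C library through an &lt;code&gt;extern "C"&lt;/code&gt; block, so the &lt;code&gt;c_reference_hex&lt;/code&gt; stand-in below is hypothetical.&lt;/p&gt;

```rust
fn c_reference_hex(input: &[u8]) -> String {
    // Stand-in for the FFI call into the compiled C original.
    input.iter().map(|b| format!("{:02x}", b)).collect()
}

fn rust_hex(input: &[u8]) -> String {
    let mut out = String::with_capacity(input.len() * 2);
    for b in input {
        out.push_str(&format!("{:02x}", b));
    }
    out
}

fn assert_equivalent(input: &[u8]) {
    // Identical inputs on both sides, byte-for-byte output equivalence.
    assert_eq!(c_reference_hex(input).into_bytes(), rust_hex(input).into_bytes());
}
```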

&lt;p&gt;&lt;strong&gt;3. Test Vector Tests&lt;/strong&gt; — validate against the provided JSON specifications. Each project ships with test vectors defining expected behavior; the agent parses these and generates corresponding test cases. Counts range from 3 to 30 vectors per project.&lt;/p&gt;

&lt;p&gt;Aggregate statistics across the 46 translations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test Type&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Unit Tests&lt;/td&gt;
&lt;td&gt;~1,100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FFI Equivalence Tests&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test Vector Tests&lt;/td&gt;
&lt;td&gt;~300&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1,800 tests&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three projects received detailed benchmarking (cJSON, stb_perlin, SPHINCS+) with additional performance and memory profiling suites.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Test Generation Gap.&lt;/strong&gt; Early experiments with dedicated test-generation subagents revealed that specialized agents produce 5–6× more comprehensive test suites than the single orchestrator approach used here. When a subagent's sole responsibility is "write exhaustive tests for this module," it explores corner cases (negative numbers, empty inputs, Unicode edge cases, allocation failures) that the orchestrator skips in favor of "happy path" coverage sufficient to pass the test vectors. The trade-off: test generation time increases proportionally, and most of that expanded coverage exercises unlikely edge cases. For this experiment, the orchestrator-only approach prioritized translation speed over test comprehensiveness, relying on the FFI equivalence layer to catch behavioral mismatches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Future work:&lt;/strong&gt; Test coverage could definitely be improved. We're planning to integrate a coverage tracking framework (like &lt;code&gt;cargo-llvm-cov&lt;/code&gt;) and deploy a specialized test-generation subagent to systematically target uncovered branches and edge cases, aiming for &amp;gt;90% line coverage across all translations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Projects&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;12&lt;/strong&gt; (46 translation tasks)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success Rate&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (46/46)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total Cost&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$173.73&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total Time&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;~8 hours&lt;/strong&gt; (~10.4 min/task)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C Lines In&lt;/td&gt;
&lt;td&gt;~28,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust Lines Out&lt;/td&gt;
&lt;td&gt;~37,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mean Cost/Task&lt;/td&gt;
&lt;td&gt;$3.78&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Cost per translation ranged from $1.29 to $18.81. The median was around $3.00. The most expensive translation was &lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt; at $18.81 — its 20+ source files and complex cryptographic pipeline required the longest sessions. Among the library translations, &lt;a href="https://github.com/DaveGamble/cJSON" rel="noopener noreferrer"&gt;cJSON&lt;/a&gt; was the costliest at $10.07 — its self-referential tree structure with doubly-linked sibling lists required the most iteration. The cheapest was a &lt;a href="https://github.com/wazuh/wazuh" rel="noopener noreferrer"&gt;Wazuh&lt;/a&gt; search-and-replace function at $1.29.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The economics are compelling.&lt;/strong&gt; At $174 for 12 projects (~28,000 lines of C), a single hour of senior developer time costs more than the median per-translation cost. Whether the output is production-ready is another question — but as a starting point for human review and a forcing function to surface architectural decisions (data structure choices, ownership patterns, API boundaries), the cost is hard to argue with.&lt;/p&gt;

&lt;p&gt;Every translation passed its full test suite. The Perlin Noise translation achieved exact f32 numerical matching across all 30 test vectors — no floating-point divergence.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;We benchmarked three projects in detail — cJSON (data-structure-heavy), stb_perlin (pure computation), and SPHINCS+ (cryptographic workload) — and compared the Rust translations against their C originals. All benchmarks were run with CPU frequency scaling disabled, and both sides were compiled with &lt;code&gt;-O3&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Runtime Speed
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;cJSON.&lt;/strong&gt; 100 individually-timed iterations, 10 warmup. C compiled via cmake; Rust compiled with &lt;code&gt;--release&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;C (mean +/- stddev)&lt;/th&gt;
&lt;th&gt;Rust (mean +/- stddev)&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple JSON Parse&lt;/td&gt;
&lt;td&gt;1.04 us +/- 747 ns&lt;/td&gt;
&lt;td&gt;659 ns +/- 37 ns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 36% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex JSON Parse&lt;/td&gt;
&lt;td&gt;10.98 us +/- 6.58 us&lt;/td&gt;
&lt;td&gt;5.90 us +/- 749 ns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 46% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large Array Parse (1000 elem)&lt;/td&gt;
&lt;td&gt;138.35 us +/- 8.71 us&lt;/td&gt;
&lt;td&gt;84.89 us +/- 7.46 us&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 39% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Simple JSON Print&lt;/td&gt;
&lt;td&gt;448 ns +/- 247 ns&lt;/td&gt;
&lt;td&gt;284 ns +/- 45 ns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 37% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex JSON Print&lt;/td&gt;
&lt;td&gt;2.67 us +/- 1.47 us&lt;/td&gt;
&lt;td&gt;2.78 us +/- 2.64 us&lt;/td&gt;
&lt;td&gt;Rust 4% slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Object Creation&lt;/td&gt;
&lt;td&gt;609 ns +/- 288 ns&lt;/td&gt;
&lt;td&gt;403 ns +/- 123 ns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 34% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array Creation (10 elem)&lt;/td&gt;
&lt;td&gt;1.22 us +/- 1.77 us&lt;/td&gt;
&lt;td&gt;927 ns +/- 224 ns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 24% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array Creation (100 elem)&lt;/td&gt;
&lt;td&gt;11.31 us +/- 5.55 us&lt;/td&gt;
&lt;td&gt;5.62 us +/- 1.45 us&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Rust 50% faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Rust is faster in &lt;strong&gt;7 of 8 benchmarks&lt;/strong&gt;, often by 30–50%. Two things stand out:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rust has much lower variance.&lt;/strong&gt; Look at Simple JSON Parse: C's stddev is 747 ns (72% of the mean), while Rust's is 37 ns (6% of the mean). The Rust binary's performance is dramatically more predictable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The speedup comes from data structure choice, not language overhead.&lt;/strong&gt; C's cJSON traverses linked lists (&lt;code&gt;next&lt;/code&gt;/&lt;code&gt;prev&lt;/code&gt;/&lt;code&gt;child&lt;/code&gt; pointers) while the Rust translation uses &lt;code&gt;Vec&lt;/code&gt;, which has better cache locality. Rust also eliminates per-node &lt;code&gt;strlen()&lt;/code&gt; calls since &lt;code&gt;String&lt;/code&gt; tracks its own length, and &lt;code&gt;Vec&lt;/code&gt; amortizes allocations where C does many small &lt;code&gt;malloc&lt;/code&gt; calls.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
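&lt;p&gt;A toy contrast of the two layouts (types are illustrative, not the translation's actual ones): a &lt;code&gt;Vec&lt;/code&gt;-based node stores its children contiguously, and &lt;code&gt;String&lt;/code&gt; carries its own length, so traversal needs neither pointer chasing nor per-node &lt;code&gt;strlen()&lt;/code&gt;.&lt;/p&gt;

```rust
struct Node {
    key: String,         // key.len() is O(1); C recomputes strlen() per visit
    children: Vec<Node>, // contiguous storage, cache-friendly iteration
}

fn total_key_bytes(n: &Node) -> usize {
    n.key.len() + n.children.iter().map(total_key_bytes).sum::<usize>()
}
```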

&lt;p&gt;&lt;strong&gt;Perlin Noise.&lt;/strong&gt; 100 iterations, 5 warmup:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;C (mean)&lt;/th&gt;
&lt;th&gt;Rust (mean)&lt;/th&gt;
&lt;th&gt;Overhead&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;noise3&lt;/code&gt; (1M samples)&lt;/td&gt;
&lt;td&gt;36.06 ms&lt;/td&gt;
&lt;td&gt;43.13 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+19.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;fbm_noise3&lt;/code&gt; (100K samples, 6 octaves)&lt;/td&gt;
&lt;td&gt;22.26 ms&lt;/td&gt;
&lt;td&gt;26.44 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+18.8%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;ridge_noise3&lt;/code&gt; (100K samples, 6 octaves)&lt;/td&gt;
&lt;td&gt;22.74 ms&lt;/td&gt;
&lt;td&gt;26.98 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+18.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A consistent ~19% overhead across all three functions. The same ratio held at &lt;code&gt;-O2&lt;/code&gt;, suggesting it's intrinsic to the translation — most likely Rust's bounds-checked array accesses on the permutation table lookups in the hot path.&lt;/p&gt;
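&lt;p&gt;One common mitigation, sketched here without having verified it against this translation: masking the index into a 256-entry permutation table lets the compiler prove every access is in bounds, so the bounds check can be elided.&lt;/p&gt;

```rust
fn perm_hash3(perm: &[u8; 256], x: usize, y: usize, z: usize) -> u8 {
    // `& 255` pins each index to 0..=255, provably in bounds for [u8; 256].
    let a = perm[x & 255] as usize;
    let b = perm[(a + y) & 255] as usize;
    perm[(b + z) & 255]
}
```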

&lt;p&gt;&lt;strong&gt;SPHINCS+.&lt;/strong&gt; 100 iterations, 2 warmup. C compiled with GCC; Rust compiled with &lt;code&gt;--release&lt;/code&gt;. Both use the same deterministic RNG seed and message:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;C (per op)&lt;/th&gt;
&lt;th&gt;Rust (per op)&lt;/th&gt;
&lt;th&gt;Overhead&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Keypair Generation&lt;/td&gt;
&lt;td&gt;2.97 ms&lt;/td&gt;
&lt;td&gt;3.17 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+6.7%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signing&lt;/td&gt;
&lt;td&gt;68.64 ms&lt;/td&gt;
&lt;td&gt;81.89 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+19.3%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verification&lt;/td&gt;
&lt;td&gt;4.29 ms&lt;/td&gt;
&lt;td&gt;4.58 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+6.7%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Signing shows the largest overhead (~19%), likely due to Rust's safe byte conversions in the inner hash loops and the &lt;code&gt;Mutex&lt;/code&gt;-protected global RNG state. Keypair generation and verification are within ~7% of C, indicating the core Merkle tree and FORS logic translate efficiently.&lt;/p&gt;

&lt;h3&gt;
  
  
  Binary Sizes
&lt;/h3&gt;

&lt;p&gt;The Rust binaries are &lt;em&gt;dramatically&lt;/em&gt; larger than their C counterparts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Artifact&lt;/th&gt;
&lt;th&gt;C&lt;/th&gt;
&lt;th&gt;Rust&lt;/th&gt;
&lt;th&gt;Binary Ratio&lt;/th&gt;
&lt;th&gt;
&lt;code&gt;.text&lt;/code&gt; Ratio&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/DaveGamble/cJSON" rel="noopener noreferrer"&gt;cJSON&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.so&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;14.0 KB&lt;/td&gt;
&lt;td&gt;459.3 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;32.7x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;93.4x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.so&lt;/code&gt; × 2&lt;/td&gt;
&lt;td&gt;72.2 KB&lt;/td&gt;
&lt;td&gt;377.2 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.2x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.5x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/nothings/stb" rel="noopener noreferrer"&gt;stb_perlin&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;executable&lt;/td&gt;
&lt;td&gt;18.4 KB&lt;/td&gt;
&lt;td&gt;416.8 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22.7x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;74.6x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;executable&lt;/td&gt;
&lt;td&gt;18.2 KB&lt;/td&gt;
&lt;td&gt;445.6 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24.5x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;46.1x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All artifacts stripped and compiled with &lt;code&gt;-O3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;.text&lt;/code&gt; ratio column tells the story. The small C libraries — cJSON (4.7 KB of code) and stb_perlin (4.0 KB) — show extreme ratios (74–93x) because Rust statically links the standard library, panic infrastructure, formatting machinery, and the memory allocator into every artifact. These have no equivalent in the C version, which relies on the system providing libc.&lt;/p&gt;

&lt;p&gt;SPHINCS+ shows the impact of code size on overhead ratios. Its shared library has a much more reasonable 6.5x &lt;code&gt;.text&lt;/code&gt; ratio because the C code is already substantial (42 KB across two &lt;code&gt;.so&lt;/code&gt; files), diluting the fixed Rust runtime overhead. But the SPHINCS+ executable shows 46x &lt;code&gt;.text&lt;/code&gt; overhead — closer to the small libraries — because the C driver is tiny (9.4 KB) and dynamically links its dependencies, while Rust statically links everything.&lt;/p&gt;

&lt;p&gt;The practical takeaway: Rust's binary size overhead is dominated by fixed costs (stdlib, panic handling, allocator). For small C codebases, expect 20–90x inflation. For larger projects (&amp;gt;40 KB of C code), the ratio drops to 5–10x as application code dominates. For server-side applications or CLIs, the extra few hundred KB is usually irrelevant; for embedded systems or shared libraries in size-constrained environments, it's worth planning for.&lt;/p&gt;
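&lt;p&gt;For size-constrained targets, Cargo's standard release-profile knobs can claw back much of that fixed overhead. The figures above were measured without them, so treat this as a mitigation sketch rather than a measured result:&lt;/p&gt;

```toml
# Cargo.toml: standard size-reduction settings for release builds.
[profile.release]
opt-level = "z"    # optimize for size instead of speed
lto = true         # link-time optimization across all crates
codegen-units = 1  # better cross-unit optimization, slower builds
panic = "abort"    # drop unwinding tables and panic machinery
strip = true       # strip symbols at link time
```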

&lt;h3&gt;
  
  
  Runtime Memory
&lt;/h3&gt;

&lt;p&gt;Profiled with valgrind (massif + DHAT) and GNU &lt;code&gt;time -v&lt;/code&gt;, both sides at &lt;code&gt;-O3&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Peak RSS Overhead&lt;/th&gt;
&lt;th&gt;Peak Heap Ratio&lt;/th&gt;
&lt;th&gt;Alloc Churn Ratio&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/DaveGamble/cJSON" rel="noopener noreferrer"&gt;cJSON&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+13%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.97x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.08x bytes, 1.75x blocks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://sphincs.org/" rel="noopener noreferrer"&gt;SPHINCS+&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+13%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.15x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;118x bytes, 133x blocks&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/nothings/stb" rel="noopener noreferrer"&gt;stb_perlin&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+51%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.29x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.61x bytes, 17x blocks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
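&lt;p&gt;Concretely, the numbers above come from runs along these lines (&lt;code&gt;./prog_c&lt;/code&gt; and &lt;code&gt;./prog_rs&lt;/code&gt; are placeholder binary names, not the project's actual harness):&lt;/p&gt;

```shell
# Peak RSS — GNU time, not the shell builtin:
/usr/bin/time -v ./prog_c  2>&1 | grep 'Maximum resident'
/usr/bin/time -v ./prog_rs 2>&1 | grep 'Maximum resident'

# Peak live heap over time (massif), then summarize the snapshots:
valgrind --tool=massif ./prog_rs
ms_print massif.out.*

# Allocation churn — total bytes/blocks ever allocated (DHAT);
# writes dhat.out.<pid>, viewed with Valgrind's dh_view.html:
valgrind --tool=dhat ./prog_rs
```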

&lt;p&gt;The story differs sharply by workload:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;cJSON&lt;/strong&gt; is the best case. Peak live heap is virtually identical (~25 KB) — the Rust &lt;code&gt;Rc&amp;lt;RefCell&amp;lt;&amp;gt;&amp;gt;&lt;/code&gt; tree and the C linked-list tree hold the same data at the same cost. Rust just fragments it into more, smaller allocations (1.75x block count). RSS overhead is a modest 13%.&lt;/p&gt;
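&lt;p&gt;The "more, smaller allocations" pattern falls out of the representation. A minimal sketch of a cJSON-style tree in Rust (names and shape are assumptions for illustration, not the agent's actual output):&lt;/p&gt;

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared, mutable tree node: each node is its own Rc heap allocation,
// and each Array holds a separate Vec of child handles — hence more,
// smaller heap blocks than C's intrusive next/prev linked list.
#[derive(Debug)]
enum Json {
    Number(f64),
    Str(String),
    Array(Vec<Rc<RefCell<Json>>>),
}

type Node = Rc<RefCell<Json>>;

fn node(j: Json) -> Node {
    Rc::new(RefCell::new(j))
}

fn array_len(n: &Node) -> usize {
    match &*n.borrow() {
        Json::Array(items) => items.len(),
        _ => 0,
    }
}
```

&lt;p&gt;In C, every cJSON node is one &lt;code&gt;malloc&lt;/code&gt; and children are chained through pointers stored inside the node itself; here each &lt;code&gt;Rc&lt;/code&gt; and each &lt;code&gt;Vec&lt;/code&gt; buffer is its own allocation, consistent with a higher block count at near-identical total bytes.&lt;/p&gt;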

&lt;p&gt;&lt;strong&gt;stb_perlin&lt;/strong&gt; is pure computation — no application-level heap use at all. The C driver makes exactly 2 allocations (libc stdio buffers). The entire 51% RSS overhead is the Rust runtime itself: the statically linked stdlib, panic/unwind tables, and formatting machinery. This is the worst-case scenario for runtime memory: a tiny program where the fixed overhead dominates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SPHINCS+&lt;/strong&gt; is the most interesting. The 118x allocation churn comes from Rust's use of &lt;code&gt;Vec&amp;lt;u8&amp;gt;&lt;/code&gt; for intermediate cryptographic buffers, whereas C uses fixed-size stack arrays or global &lt;code&gt;.bss&lt;/code&gt; buffers (C's &lt;code&gt;.bss&lt;/code&gt; is 35 KB vs Rust's 4 KB). Despite this churn, peak live heap is only 15% higher — Rust promptly frees temporaries. The obvious optimization: convert hot-path &lt;code&gt;Vec&amp;lt;u8&amp;gt;&lt;/code&gt; allocations to fixed-size &lt;code&gt;[u8; N]&lt;/code&gt; stack arrays, which would likely close both the allocation gap and the 19% overhead in signing performance.&lt;/p&gt;
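&lt;p&gt;For concreteness, here is the shape of that optimization on a stand-in chain step (the byte-increment "hash" and the 32-byte width are placeholders, not SPHINCS+ internals):&lt;/p&gt;

```rust
// Heap version: one Vec allocation per call, plus one per round via collect().
fn chain_heap(input: &[u8; 32], rounds: usize) -> Vec<u8> {
    let mut buf = input.to_vec();
    for _ in 0..rounds {
        buf = buf.iter().map(|b| b.wrapping_add(1)).collect();
    }
    buf
}

// Stack version: the buffer is a fixed-size array, updated in place —
// zero allocator traffic regardless of the round count.
fn chain_stack(input: &[u8; 32], rounds: usize) -> [u8; 32] {
    let mut buf = *input;
    for _ in 0..rounds {
        for b in buf.iter_mut() {
            *b = b.wrapping_add(1);
        }
    }
    buf
}
```

&lt;p&gt;The two versions are byte-for-byte equivalent; the stack variant simply removes every per-round trip through the allocator, which is exactly the gap the churn numbers point at.&lt;/p&gt;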




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;These 12 projects were a proving ground — small-to-medium C libraries and programs, most under a few thousand lines. The results are encouraging, but the real test is scaling up. We're now turning the agent loose on larger real-world targets: codebases with complex struct hierarchies spanning dozens of files, intricate memory management patterns (custom allocators, arena-based allocation, reference-counted object graphs), tens of thousands of lines of C, and the kind of deeply intertwined module dependencies that make manual translation a multi-month effort.&lt;/p&gt;

&lt;p&gt;The questions we want to answer next: How does the agent handle codebases where no single file fits in its context window? Can it maintain coherent data structure translations across 50+ modules? What happens when the C code relies heavily on platform-specific behavior or inline assembly? And critically, does the cost-per-line stay reasonable, or does it explode with complexity?&lt;/p&gt;

&lt;p&gt;Stay tuned.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This work is led and executed by &lt;a href="https://www.sec.in.tum.de/i20/people/momeu-marius" rel="noopener noreferrer"&gt;Marius Momeu&lt;/a&gt; (Postdoctoral Researcher, UC Berkeley and PhD Candidate, TU Munich) under the supervision of &lt;a href="https://people.eecs.berkeley.edu/~ksen/" rel="noopener noreferrer"&gt;Koushik Sen&lt;/a&gt; (Professor, UC Berkeley), and supported in part by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00112590134.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rust</category>
      <category>c</category>
    </item>
  </channel>
</rss>
