<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: CharmPic</title>
    <description>The latest articles on Forem by CharmPic (@charmpic).</description>
    <link>https://forem.com/charmpic</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3301778%2Fb25090e0-32de-44a7-a2b3-8691c3a7f56a.png</url>
      <title>Forem: CharmPic</title>
      <link>https://forem.com/charmpic</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/charmpic"/>
    <language>en</language>
    <item>
      <title>Language Barriers: A Struggle for Japanese Developers on Dev.to</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sun, 12 Apr 2026 06:58:52 +0000</pubDate>
      <link>https://forem.com/charmpic/language-barriers-a-struggle-for-japanese-developers-on-devto-kjc</link>
      <guid>https://forem.com/charmpic/language-barriers-a-struggle-for-japanese-developers-on-devto-kjc</guid>
      <description>&lt;p&gt;As a Japanese developer, I love browsing Dev.to to keep up with the latest tech trends. However, I often face a significant "wall" that hinders my learning experience: the language barrier.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The "I Can't Read English Fast Enough" Problem&lt;br&gt;
Let’s be honest—reading long technical articles in English is exhausting when it's not your native language. Even if a headline looks incredibly interesting, the psychological hurdle of clicking on a wall of English text is surprisingly high.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Limitations of Browser Translation&lt;br&gt;
You might say, "Just use Google Translate or built-in browser features!" But it’s not that simple:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Friction:&lt;/strong&gt; Having to manually trigger translation for every single page is a tedious extra step.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Accuracy:&lt;/strong&gt; Standard browser translations often struggle with technical context. They sometimes mangle code snippets or turn specific jargon into nonsensical Japanese, forcing me to switch back to the original text anyway.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;&lt;p&gt;A Dream Feature: AI-Powered Native Translation&lt;br&gt;
I often find myself wishing Dev.to would implement an integrated AI translation feature.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With the power of modern LLMs, we could have context-aware, high-quality translations at the click of a button. Imagine a "Read in Japanese" toggle right next to the article!&lt;/p&gt;

&lt;p&gt;I understand that API costs are a major concern, making this a difficult feature to implement for free. But it’s painful to think about how many amazing insights I’m missing out on just because of the language gap. T^T&lt;/p&gt;

&lt;p&gt;I’d love to hear from you:&lt;br&gt;
How do non-native English speakers handle this? Do you use any specific tools or extensions that make your Dev.to experience smoother?&lt;/p&gt;

</description>
      <category>community</category>
      <category>devjournal</category>
      <category>discuss</category>
      <category>learning</category>
    </item>
    <item>
      <title>Re-evaluating the ROI of GLM-5.1 Pro After a Massive Price Hike to $680</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sun, 12 Apr 2026 06:29:49 +0000</pubDate>
      <link>https://forem.com/charmpic/re-evaluating-the-roi-of-glm-51-pro-after-a-massive-price-hike-to-680-i2d</link>
      <guid>https://forem.com/charmpic/re-evaluating-the-roi-of-glm-51-pro-after-a-massive-price-hike-to-680-i2d</guid>
      <description>&lt;p&gt;Headline: GLM-5.1 Pro Price Hike: 3x Increase to $680/year — Time to Look for Alternatives?&lt;/p&gt;

&lt;p&gt;I recently received some shocking news regarding the GLM-5.1 Pro plan.&lt;br&gt;
The annual subscription, which used to be a reasonable $180, has suddenly spiked to over $680. That is nearly a fourfold increase.&lt;/p&gt;

&lt;p&gt;To be fair, the GLM-5.1 Pro plan offered incredible value. Its performance and limits were comparable to the Claude Code $200/month (Max) tier, making it a "hidden gem" for developers. Even at $680/year, one could argue it still offers decent value considering the high-end capabilities.&lt;/p&gt;

&lt;p&gt;However, a nearly fourfold price jump changes the equation. At this price point, we can no longer ignore other major AI players in the market. It’s time to start comparing the cost-to-performance ratio against other leading LLMs again.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>news</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Instant Glory: The App That Makes Every Coder a DEV Challenge Winner</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Fri, 10 Apr 2026 17:20:15 +0000</pubDate>
      <link>https://forem.com/charmpic/instant-glory-the-app-that-makes-every-coder-a-dev-challenge-winner-1fmo</link>
      <guid>https://forem.com/charmpic/instant-glory-the-app-that-makes-every-coder-a-dev-challenge-winner-1fmo</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/aprilfools-2026"&gt;DEV April Fools Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  The Ultimate Ego Booster: Challenge Winner Simulator 2026
&lt;/h1&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Have you ever felt the unbearable emptiness of not winning a DEV challenge? The sleepless nights. The existential dread. The nagging suspicion that your code may not be "useless" enough to qualify for greatness?&lt;/p&gt;

&lt;p&gt;I built a gloriously unnecessary victory machine that solves all of that.&lt;/p&gt;

&lt;p&gt;Challenge Winner Simulator 2026 is a delightfully over-the-top praise engine: enter your name, and the app transforms into a full-blown cosmic celebration of your alleged brilliance. You get dramatic compliments, absurd statistics, galactic proclamations, a cinematic Star Wars-style credit crawl, and enough visual excess to convince any developer that they are, in fact, the chosen one.&lt;/p&gt;

&lt;p&gt;And because the joke simply refused to stay in the browser, I also built a Windows desktop version of the app with Flutter and WebView2. So now the same majestic nonsense can be launched as a native Windows app, packaged like a serious piece of software despite being fundamentally unserious in every possible way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Web demo:&lt;br&gt;&lt;br&gt;
&lt;a href="https://moe-charm.github.io/dev_challenges/20260411winner/index.html" rel="noopener noreferrer"&gt;https://moe-charm.github.io/dev_challenges/20260411winner/index.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows release:&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/moe-charm/dev_challenges/releases/tag/winner-simulator-20260411" rel="noopener noreferrer"&gt;https://github.com/moe-charm/dev_challenges/releases/tag/winner-simulator-20260411&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Click the "CELEBRATE!" button to trigger the full auditory and visual experience.&lt;br&gt;&lt;br&gt;
And yes, the music absolutely matters.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And since this is an April Fools project, you can enjoy the whole thing locally anytime, even offline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;Web version:&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/moe-charm/dev_challenges/tree/main/20260411winner" rel="noopener noreferrer"&gt;https://github.com/moe-charm/dev_challenges/tree/main/20260411winner&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Windows version:&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/moe-charm/dev_challenges/tree/main/winner_simulator_app" rel="noopener noreferrer"&gt;https://github.com/moe-charm/dev_challenges/tree/main/winner_simulator_app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Release build:&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/moe-charm/dev_challenges/releases/tag/winner-simulator-20260411" rel="noopener noreferrer"&gt;https://github.com/moe-charm/dev_challenges/releases/tag/winner-simulator-20260411&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;p&gt;I wanted this to feel both ridiculous and weirdly overengineered, so I kept the web version lightweight while piling on just enough spectacle to make it feel expensive.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vanilla HTML/CSS/JS: no framework, no mercy, just pure DOM manipulation and theatrical confidence.&lt;/li&gt;
&lt;li&gt;CSS transforms and animations: used for the big cinematic crawl, dramatic fades, glowing text, and all the unnecessary grandeur.&lt;/li&gt;
&lt;li&gt;Canvas API: used for fireworks and particle effects so the whole thing could sparkle like it was accepting an award nobody asked for.&lt;/li&gt;
&lt;li&gt;Web Audio API: used for fanfares, drum rolls, cat-like sounds, and the kind of BGM that insists your name deserves a standing ovation.&lt;/li&gt;
&lt;li&gt;i18n logic: supports both English and Japanese, because winning should be internationally embarrassing.&lt;/li&gt;
&lt;li&gt;Flutter + WebView2: for the Windows desktop edition, which embeds the same HTML challenge into a standalone app so the joke can live outside the browser too.&lt;/li&gt;
&lt;li&gt;Embedded assets: the Windows build packages the challenge inside the app, so it can be distributed as a release ZIP without needing a separate content folder.&lt;/li&gt;
&lt;/ul&gt;
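&lt;p&gt;For flavor, here is a minimal sketch of the kind of particle update that can drive such Canvas fireworks. This is a hypothetical reconstruction, not the app's actual code: each frame applies velocity, gravity, and alpha decay, and faded sparks are culled before the (omitted) Canvas draw call.&lt;/p&gt;

```javascript
// Hypothetical firework particle physics (the real app's code may differ).
// The Canvas draw step is omitted so the logic runs anywhere.
const GRAVITY = 0.05;

function spawnBurst(x, y, count) {
  const particles = [];
  for (let i = 0; count > i; i++) {
    const angle = (2 * Math.PI * i) / count; // even radial spread
    const speed = 1 + Math.random() * 2;
    particles.push({
      x, y,
      vx: Math.cos(angle) * speed,
      vy: Math.sin(angle) * speed,
      alpha: 1, // fully opaque at birth
    });
  }
  return particles;
}

function step(particles) {
  for (const p of particles) {
    p.x += p.vx;
    p.y += p.vy;
    p.vy += GRAVITY;  // gravity bends each spark downward
    p.alpha -= 0.02;  // fade out over roughly 50 frames
  }
  return particles.filter((p) => p.alpha > 0); // cull dead sparks
}
```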

&lt;p&gt;The whole project is intentionally excessive for something fundamentally useless, which is exactly what made it fun to build.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prize Category
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best Ode to Larry Masinter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This project is basically a shrine to the spirit of playful protocol absurdity. It leans hard into the glorious nonsense of &lt;code&gt;418 I'm a teapot&lt;/code&gt;, celebrates the ritual of turning a tiny joke into a grand experience, and fully embraces the idea that the web can be both technically elaborate and completely ridiculous at the same time.&lt;/p&gt;

&lt;p&gt;The Golden Teapot is not just a trophy. It is a philosophy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google AI Usage (Best Google AI Usage Entry)
&lt;/h3&gt;

&lt;p&gt;This entire project was built in a deep pair-programming session with &lt;strong&gt;Antigravity&lt;/strong&gt;, Google’s agentic AI coding assistant. &lt;br&gt;
Antigravity wasn't just a code generator; it acted as a "Dramatic Consultant" and "Vibe Architect." Here’s how we used it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rapid Prototyping&lt;/strong&gt;: Antigravity generated the complex CSS 3D transforms for the credit crawl and the Canvas-based firework engine from scratch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic Iteration&lt;/strong&gt;: We iterated on the visual "wow factor" by asking the AI to "make it more over-the-top" and "add more galactic energy," which led to the inclusion of glitch effects, screen shakes, and dynamic starfields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative Writing&lt;/strong&gt;: The AI helped craft the hyperbolic, universe-shattering narratives in the crawl and world reaction sections, ensuring the "uselessness" was presented with the highest possible prestige.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sound Engineering&lt;/strong&gt;: The AI assisted in integrating the Web Audio API for real-time sound synthesis while managing the external BGM integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Platform Escalation&lt;/strong&gt;: When our ambitions outgrew the browser and we decided to build a native offline Windows desktop app with Flutter, the AI (with a strategic assist from ChatGPT on Windows WebView2 virtual hosting) helped us bypass local CORS restrictions and serve the embedded assets natively.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The collaboration felt less like "writing code" and more like "directing a digital movie." The AI let me focus on the humor and vision while it handled the heavy lifting of the visual and auditory implementation.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>418challenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>NyanZip: The Delightfully Useless Cat-Language Compression App</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Fri, 03 Apr 2026 04:44:05 +0000</pubDate>
      <link>https://forem.com/charmpic/nyanzip-the-delightfully-useless-cat-language-compression-app-5gj7</link>
      <guid>https://forem.com/charmpic/nyanzip-the-delightfully-useless-cat-language-compression-app-5gj7</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/aprilfools-2026"&gt;DEV April Fools Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  NyanZip: The Delightfully Useless Cat-Language Compression App
&lt;/h1&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I built NyanZip, a browser-based joke app that takes normal text and turns it into exaggerated cat language.&lt;/p&gt;

&lt;p&gt;It has two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A playful "compress" mode that expands text into a noisy stream of &lt;code&gt;meow&lt;/code&gt; and &lt;code&gt;MEOW!!&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;An "ultra" mode that uses real compression under the hood, but still wraps everything in cat-themed nonsense&lt;/li&gt;
&lt;/ul&gt;
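&lt;p&gt;To make the joke concrete, here is a hypothetical sketch of how such a "compress" mode can stay lossless while expanding the text: map each UTF-8 byte to two of sixteen meow tokens, one per 4-bit nibble. NyanZip's real encoder lives in 20260402april/js/engine.js and may work differently.&lt;/p&gt;

```javascript
// Hypothetical nibble-to-meow codec (NyanZip's actual engine may differ).
// 16 distinct tokens cover one 4-bit nibble each, so every byte becomes
// exactly two tokens and the "compressed" output is gloriously larger.
const MEOWS = ["meow", "Meow", "MEOW", "nyan", "Nyan", "NYAN", "mew", "Mew",
               "MEW", "nya", "Nya", "NYA", "mrrp", "Mrrp", "purr", "Purr"];

function catEncode(text) {
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  const tokens = [];
  for (const b of bytes) {
    tokens.push(MEOWS[Math.floor(b / 16)], MEOWS[b % 16]); // high, low nibble
  }
  return tokens.join(" ");
}

function catDecode(catText) {
  const tokens = catText.split(" ");
  const bytes = new Uint8Array(tokens.length / 2);
  for (let i = 0; bytes.length > i; i++) {
    bytes[i] = MEOWS.indexOf(tokens[2 * i]) * 16 + MEOWS.indexOf(tokens[2 * i + 1]);
  }
  return new TextDecoder().decode(bytes); // UTF-8 bytes -> string
}
```

&lt;p&gt;The round trip is lossless for any Unicode input, which keeps a "decompress" button honest.&lt;/p&gt;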

&lt;p&gt;It also includes optional Gemini-powered cat commentary, because every compression tool deserves a tiny, judgmental reviewer in a bow tie.&lt;/p&gt;

&lt;p&gt;The result is intentionally impractical, a little chaotic, and exactly the kind of project that feels right for April Fools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Live demo: &lt;a href="https://moe-charm.github.io/dev_challenges/20260402april/" rel="noopener noreferrer"&gt;https://moe-charm.github.io/dev_challenges/20260402april/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/moe-charm/dev_challenges" rel="noopener noreferrer"&gt;https://github.com/moe-charm/dev_challenges&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;The code is all in the repository above. The main pieces are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;App shell and UI: &lt;code&gt;20260402april/index.html&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Main interaction logic: &lt;code&gt;20260402april/app.js&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Cat-text encoder and decoder: &lt;code&gt;20260402april/js/engine.js&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Ultra compression pipeline: &lt;code&gt;20260402april/js/rans.js&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Optional AI cat review feature: &lt;code&gt;20260402april/js/chat.js&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Localization strings: &lt;code&gt;20260402april/js/i18n.js&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;p&gt;I built NyanZip with plain HTML, CSS, and JavaScript.&lt;/p&gt;

&lt;p&gt;A few things made it fun to put together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;TextEncoder&lt;/code&gt; and &lt;code&gt;TextDecoder&lt;/code&gt; for converting text to bytes and back&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CompressionStream&lt;/code&gt; and &lt;code&gt;DecompressionStream&lt;/code&gt; for the ultra mode&lt;/li&gt;
&lt;li&gt;A simple bilingual UI for Japanese and English&lt;/li&gt;
&lt;li&gt;Optional Gemini integration for the cat review comments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also published it with GitHub Pages so the joke works directly in the browser, with no setup required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prize Category
&lt;/h2&gt;

&lt;p&gt;I’d submit this for &lt;strong&gt;Community Favorite&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It is intentionally silly, easy to try, and built to make people smile first and ask questions later. The optional Google AI feature adds a fun extra layer, but the core joke stands on its own.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>418challenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Hakozuna v3.2 Released: Bringing Optimized Memory Allocation to M1 Mac</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Thu, 19 Mar 2026 23:29:20 +0000</pubDate>
      <link>https://forem.com/charmpic/hakozuna-v32-released-bringing-optimized-memory-allocation-to-m1-mac-b7m</link>
      <guid>https://forem.com/charmpic/hakozuna-v32-released-bringing-optimized-memory-allocation-to-m1-mac-b7m</guid>
      <description>&lt;p&gt;I've added the main text to chatgpt5.4 ↓&lt;/p&gt;

&lt;p&gt;I am pleased to announce the release of Hakozuna v3.2.&lt;br&gt;
While my previous update focused on Windows, this release marks a significant milestone: full support for M1 Mac.&lt;/p&gt;

&lt;p&gt;GitHub Release: &lt;a href="https://github.com/hakorune/hakozuna" rel="noopener noreferrer"&gt;https://github.com/hakorune/hakozuna&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Zenodo Record: 19120414&lt;/p&gt;

&lt;p&gt;DOI: 10.5281/zenodo.19120414&lt;/p&gt;

&lt;p&gt;What is Hakozuna?&lt;br&gt;
Hakozuna is a memory allocator designed for small objects, built upon the Box Theory framework. It is currently split into two specialized lineages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;hz3&lt;/strong&gt;: optimized for local-heavy / low-RSS workloads.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;hz4&lt;/strong&gt;: optimized for remote-heavy / high-thread environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What’s New in the M1 Mac Update&lt;br&gt;
The primary goal of this update was to establish a seamless workflow on M1 Mac—encompassing development, observation, and running benchmarks for academic papers.&lt;/p&gt;

&lt;p&gt;Key Improvements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Refined Mac entrypoints&lt;/strong&gt;: all Mac-specific logic is now consolidated in the mac/ directory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pipeline separation&lt;/strong&gt;: decoupled the Build Lane and Observe Lane for better modularity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Streamlined paper-suite&lt;/strong&gt;: the full suite of benchmarks required for research papers now runs from a single setup.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Comparative benchmarking&lt;/strong&gt;: integrated mimalloc and tcmalloc into the suite for direct performance comparisons against hz3 and hz4.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Performance Insights: Where it Shines&lt;br&gt;
Testing the paper-suite on Mac revealed clear strengths for each allocator:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hz3 showed dominant performance in the Larson benchmark.&lt;/li&gt;
&lt;li&gt;hz4 took the lead in MT remote (multi-threaded remote free) scenarios.&lt;/li&gt;
&lt;li&gt;In Redis-like workloads, the winner shifted depending on the specific workload characteristics.&lt;/li&gt;
&lt;li&gt;Note on mimalloc-bench: in our subset tests, certain malloc-large treatments were flagged as "no-go."&lt;/li&gt;
&lt;li&gt;Segment registry: for high-remote conditions, slots=32768 yielded better results.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Takeaway:&lt;br&gt;
The M1 Mac results reinforce our core philosophy: rather than trying to create a "one-size-fits-all" allocator, partitioning "boxes" based on specific conditions leads to superior efficiency.&lt;/p&gt;

&lt;p&gt;Conclusion&lt;br&gt;
With v3.2, "Mac support" is more than just a port—it is a functional environment ready for rigorous academic benchmarking.&lt;/p&gt;

&lt;p&gt;Summary of Gains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improved developer experience (DX) on M1 Mac.&lt;/li&gt;
&lt;li&gt;Automated, reliable comparative benchmarking via the paper-suite.&lt;/li&gt;
&lt;li&gt;Clearer functional boundaries between the hz3 and hz4 lineages.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, I plan to utilize this Mac environment to refine the supplementary data and further validate my research for the upcoming paper.&lt;/p&gt;

</description>
      <category>c</category>
      <category>hakozuna</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Porting Hakozuna to Windows Native: Lessons from Benchmarking hz3 and hz4 beyond Ubuntu</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Tue, 10 Mar 2026 14:36:09 +0000</pubDate>
      <link>https://forem.com/charmpic/porting-hakozuna-to-windows-native-lessons-from-benchmarking-hz3-and-hz4-beyond-ubuntu-4mfh</link>
      <guid>https://forem.com/charmpic/porting-hakozuna-to-windows-native-lessons-from-benchmarking-hz3-and-hz4-beyond-ubuntu-4mfh</guid>
      <description>&lt;p&gt;The Windows native support for Hakozuna has finally moved past the "it runs" stage to the "measurable and comparable" stage.&lt;/p&gt;

&lt;p&gt;Previously, my allocator research was focused on Ubuntu. The major milestone here is that the entire pipeline—from source builds to application benchmarks—is now fully operational on Windows.&lt;/p&gt;

&lt;p&gt;The TL;DR: hz3 remains incredibly strong on Windows. hz4 is functional and reproducible, but it hasn't yet consistently outperformed the alternatives in real-world application benchmarks on Windows without specific tuning. Investigation is ongoing.&lt;/p&gt;

&lt;p&gt;What’s New?&lt;br&gt;
This update isn't just about successful compilation. I've established a robust foundation for comparative allocator research on Windows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Native comparisons&lt;/strong&gt;: benchmark hz3, hz4, mimalloc, tcmalloc, and the CRT allocator on Windows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Real-world workloads&lt;/strong&gt;: not just synthetic benchmarks but also real-world Redis and memcached-style loads.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Infrastructure&lt;/strong&gt;: organized public runners, documentation, and benchmark summary repositories.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Publications&lt;/strong&gt;: updated both the Japanese and English versions of the research paper with Windows-specific appendices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Distribution&lt;/strong&gt;: updated GitHub Releases, Zenodo, and the public PDFs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Challenges of Windows Porting&lt;br&gt;
Porting to Windows was far from a simple "copy-paste" from Linux. The difficulties lay less in the allocator's hot path and more in the surrounding ecosystem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Build toolchains&lt;/strong&gt;: significant differences in build boxes and environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linking nuances&lt;/strong&gt;: handling DLL vs. static link mode variations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OS-specific APIs&lt;/strong&gt;: architecting around VirtualAlloc paths.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Porting workloads&lt;/strong&gt;: bringing memcached, memtier, and Redis into a native Windows environment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fixed costs&lt;/strong&gt;: OS-specific fixed costs that were negligible on Linux but prominent on Windows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Interestingly, some design choices and default "knobs" that worked perfectly for hz4 on Ubuntu didn't translate into a winning strategy for Windows application benchmarks. This highlights the fascinating—and exhausting—reality of how an allocator's behavior changes depending on the OS.&lt;/p&gt;

&lt;p&gt;Key Benchmark Findings&lt;br&gt;
While the Ubuntu results remain the primary baseline, the Windows native tests revealed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;hz3 dominance&lt;/strong&gt;: highly performant in real Redis workloads (balanced, kv_only, list_only, highpipe).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Workload sensitivity&lt;/strong&gt;: in memcached external-client tests, the "winning" allocator shifts with the specific workload.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;hz4 potential&lt;/strong&gt;: promising in synthetic benchmarks with specific tuning, but mixed signals in real Redis balanced tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Current Verdict:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Default&lt;/strong&gt;: use hz3.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Research focus&lt;/strong&gt;: use hz4 for remote-heavy, high-thread-count scenarios.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Paper and Release Updates&lt;br&gt;
I've synchronized all assets with this release:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Updated Japanese &amp;amp; English PDFs.&lt;/li&gt;
&lt;li&gt;Added Windows-native supplemental tables.&lt;/li&gt;
&lt;li&gt;GitHub Release v3.1 &amp;amp; Zenodo v3.1 (with updated DOI).&lt;/li&gt;
&lt;li&gt;Latest papers available directly in the repo at docs/paper/main_ja.pdf and docs/paper/main_en.pdf.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Personal Insights&lt;br&gt;
The most intriguing discovery was seeing "boxes" (design components) that were unremarkable on Ubuntu suddenly show significant impact on Windows—and vice versa.&lt;/p&gt;

&lt;p&gt;The gap between "performing well in synthetics" and "winning in real apps" is crucial. It’s a stark reminder that in allocator research, looking "fast" on paper matters far less than proving which workload you actually conquer.&lt;/p&gt;

&lt;p&gt;What’s Next?&lt;br&gt;
"Completion" of Windows support actually means reaching a level of maturity where research can truly begin. Moving forward, I plan to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Further optimize hz4 specifically for Windows.&lt;/li&gt;
&lt;li&gt;Refine common profiles and OS-specific configurations for Ubuntu and Windows.&lt;/li&gt;
&lt;li&gt;Improve paper and documentation readability.&lt;/li&gt;
&lt;li&gt;Evolve the "Box Theory" into its next architectural phase.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If there’s interest, my next posts will dive deeper into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why I separated hz3 and hz4.&lt;/li&gt;
&lt;li&gt;How to design an allocator using Box Theory.&lt;/li&gt;
&lt;li&gt;The technical nuances of what looks different on Windows compared to Linux.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/hakorune/hakozuna" rel="noopener noreferrer"&gt;https://github.com/hakorune/hakozuna&lt;/a&gt;&lt;/p&gt;

</description>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Does Audio Cable Affect Sound? I Built a Physics Simulator to Find Out.</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sat, 07 Mar 2026 07:54:38 +0000</pubDate>
      <link>https://forem.com/charmpic/does-audio-cable-affect-sound-i-built-a-physics-simulator-to-find-out-4aff</link>
      <guid>https://forem.com/charmpic/does-audio-cable-affect-sound-i-built-a-physics-simulator-to-find-out-4aff</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvgn2pfk0rucju4ejo456.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvgn2pfk0rucju4ejo456.png" alt=" " width="800" height="660"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Discussing whether audio cables change the sound often turns into a never-ending debate based on subjective impressions.&lt;br&gt;
"Long RCA cables make the sound feel muddy."&lt;br&gt;
"Silver wires add a glossy texture to the highs."&lt;br&gt;
"Extremely long speaker cables lose their punch."&lt;/p&gt;

&lt;p&gt;These anecdotes have existed for decades, but personal experience alone settles nothing. Conversely, simple idealized circuit models often fail to explain what audiophiles actually hear.&lt;/p&gt;

&lt;p&gt;That's why I decided to stop looking at the cable in isolation. Instead, I built a simulator that treats the entire signal path as a single physical system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Physical characteristics of the cable&lt;/li&gt;
&lt;li&gt;Interaction with connected equipment&lt;/li&gt;
&lt;li&gt;Amplifier behavior&lt;/li&gt;
&lt;li&gt;Response degradation in the time domain&lt;/li&gt;
&lt;li&gt;Analytical metrics closer to human perception&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Project&lt;br&gt;
GitHub: &lt;a href="https://github.com/moe-charm/audio-chain-physics" rel="noopener noreferrer"&gt;https://github.com/moe-charm/audio-chain-physics&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Live Demo: &lt;a href="https://audio-chain-physics.streamlit.app/" rel="noopener noreferrer"&gt;https://audio-chain-physics.streamlit.app/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Zenodo DOI: &lt;a href="https://doi.org/10.5281/zenodo.18898657" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.18898657&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What I Built&lt;br&gt;
Audio Chain Physics is a research-oriented simulator designed to handle the audio chain in stages. It models the following layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;RLGC model of the cable&lt;/strong&gt;: moving beyond simple resistance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Interface interaction&lt;/strong&gt;: output impedance, input capacitance, and common return paths.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-linear elements &amp;amp; small-signal stability&lt;/strong&gt;: how the amplifier reacts to the load.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dielectric absorption&lt;/strong&gt;: approximating "trailing" responses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time-domain analysis&lt;/strong&gt;: group delay, impulse response, step response, and "TailRatio."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core philosophy is not just analyzing the cable itself, but how the cable changes the operating conditions of the equipment.&lt;/p&gt;

&lt;p&gt;For example, the propagation delay of a 3-meter cable is negligible. It’s too small to be an "audible difference" on its own. However, when you combine Output Impedance + Cable Capacitance + Load Capacitance + Amp Phase Margin + Complex Speaker Load, the settling time, group delay, and ringing change. This synergy is likely what manifests as a perceptible difference in sound.&lt;/p&gt;
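&lt;p&gt;A back-of-the-envelope number illustrates the point. Source output impedance and total cable capacitance form a first-order RC low-pass, and only the combination decides whether its corner lands anywhere near the audio band. The component values below are illustrative assumptions, not figures from the simulator, whose full RLGC model is far more detailed.&lt;/p&gt;

```javascript
// -3 dB corner of a first-order RC low-pass: f_c = 1 / (2 * pi * R * C).
// All component values here are illustrative assumptions.
function cornerFrequencyHz(sourceOhms, capacitanceFarads) {
  return 1 / (2 * Math.PI * sourceOhms * capacitanceFarads);
}

// 100-ohm line output into 3 m of ~100 pF/m interconnect:
// corner in the MHz range, far above audio.
const healthy = cornerFrequencyHz(100, 3 * 100e-12); // ~5.3 MHz

// 10-kilo-ohm passive volume pot into a high-capacitance 10 nF run:
// the corner falls inside the audible band and treble rolls off.
const degraded = cornerFrequencyHz(10e3, 10e-9); // ~1.6 kHz
```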

&lt;p&gt;What the Simulator Reveals (So Far)&lt;br&gt;
The most significant result is that under extreme conditions, sound quality degradation can be clearly reproduced through calculation.&lt;/p&gt;

&lt;p&gt;In scenarios such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long, poor-quality RCA cables&lt;/li&gt;
&lt;li&gt;Excessively long speaker cables&lt;/li&gt;
&lt;li&gt;Amplifiers with high output impedance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The simulation clearly shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-frequency attenuation or peaking&lt;/li&gt;
&lt;li&gt;Distorted group delay&lt;/li&gt;
&lt;li&gt;"Tails" in the impulse response&lt;/li&gt;
&lt;li&gt;Ringing in the step response&lt;/li&gt;
&lt;li&gt;A drop in damping factor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This confirms a solid starting point: If cables or connection conditions are handled poorly, the response of the entire audio chain will degrade.&lt;/p&gt;

&lt;p&gt;What Still Remains Unexplained&lt;br&gt;
To be completely honest, I haven't yet fully explained the more subtle audible differences I’ve experienced myself—those "nuances" that are hard to put into numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The sense of "glossiness" or "air" in the highs&lt;/li&gt;
&lt;li&gt;Changes in "soundstage" or "depth"&lt;/li&gt;
&lt;li&gt;The feeling of the sound becoming "thicker" or "denser"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While I can reproduce degradation in extreme cases, I cannot yet claim to have simulated those delicate nuances under "normal" high-end equipment conditions.&lt;/p&gt;

&lt;p&gt;This is my current conclusion: While degradation due to extreme conditions is reproducible, explaining subtle sonic nuances requires further research. This feels like the most honest scientific stance at this stage.&lt;/p&gt;

&lt;p&gt;Future Roadmap&lt;br&gt;
There is still a lot of work to be done. My focus will move toward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Validation&lt;/strong&gt;: comparing simulation results with real-world measurement data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complex loads&lt;/strong&gt;: supporting measured speaker impedance curves.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stochastic models&lt;/strong&gt;: modeling RF interference, hum, and poor contact points.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Controlled listening tests&lt;/strong&gt;: correlating simulation data with subjective perception.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I want to move from "degradation happens at extremes" to "explaining why we hear what we hear in everyday listening."&lt;/p&gt;

&lt;p&gt;If you’re interested in the physics of audio, please check out the GitHub repo or the live demo!&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/moe-charm/audio-chain-physics" rel="noopener noreferrer"&gt;https://github.com/moe-charm/audio-chain-physics&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Live Demo: &lt;a href="https://audio-chain-physics.streamlit.app/" rel="noopener noreferrer"&gt;https://audio-chain-physics.streamlit.app/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>audio</category>
      <category>python</category>
      <category>simulation</category>
    </item>
    <item>
      <title>Building My Own Image Engine "HakoNyans": I Beat PNG, but WebP is a Wall</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sat, 21 Feb 2026 03:05:12 +0000</pubDate>
      <link>https://forem.com/charmpic/building-my-own-image-engine-hakonyans-i-beat-png-but-webp-is-a-wall-3kpj</link>
      <guid>https://forem.com/charmpic/building-my-own-image-engine-hakonyans-i-beat-png-but-webp-is-a-wall-3kpj</guid>
      <description>&lt;p&gt;Hi DEV community! 👋&lt;/p&gt;

&lt;p&gt;I originally started developing a custom image engine called HakoNyans for a DEV Challenge. The challenge has ended, but my passion hasn't—I'm still actively building and refining it every day.&lt;/p&gt;

&lt;p&gt;The Milestone: Beating PNG&lt;br&gt;
I'm happy to report that HakoNyans has officially surpassed PNG! Getting the architecture and logic right to beat such a classic, widely-used format was a huge milestone for this project.&lt;/p&gt;

&lt;p&gt;The Current Struggle: The WebP Wall&lt;br&gt;
However, my next target is WebP, and let me tell you... WebP is incredibly strong.&lt;/p&gt;

&lt;p&gt;Here is where HakoNyans currently stands against WebP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processing Speed: currently 20% slower.&lt;/li&gt;
&lt;li&gt;Compression Ratio: losing by 20% (resulting in larger file sizes).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I've hit a bit of a wall. Competing with WebP's highly optimized compression logic is no easy task. Closing this 20% gap in both speed and size is my current obsession.&lt;/p&gt;
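&lt;p&gt;One bit of arithmetic worth spelling out (my framing of the numbers above): if output is currently 1.2x the WebP size, closing the gap means shrinking the current output by about 16.7%, not by a full 20%.&lt;/p&gt;

```python
# "Losing by 20%" means current_size = 1.2 * webp_size.
# To match WebP, current output must shrink by 1 - 1/1.2.
ratio_vs_webp = 1.20
needed_shrink_pct = round((1 - 1 / ratio_vs_webp) * 100, 1)
print(needed_shrink_pct)  # 16.7
```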

&lt;p&gt;Let's Discuss!&lt;br&gt;
I'm continually pushing the limits of my current algorithms, but I'd love to hear from the community.&lt;/p&gt;

&lt;p&gt;Has anyone else here tried building an image codec or data compression engine from scratch?&lt;/p&gt;

&lt;p&gt;What optimization techniques or mental models helped you break through performance plateaus?&lt;/p&gt;

&lt;p&gt;Any advice, feedback, or shared experiences would be greatly appreciated. I'll keep pushing HakoNyans forward!&lt;/p&gt;

</description>
      <category>chatgpt</category>
    </item>
    <item>
      <title>4 Months of Developing a Memory Allocator: Updating "Hakozuna" to v3.0 (hz3/hz4)</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Tue, 17 Feb 2026 20:13:31 +0000</pubDate>
      <link>https://forem.com/charmpic/4-months-of-developing-a-memory-allocator-updating-hakozuna-to-v30-hz3hz4-9bb</link>
      <guid>https://forem.com/charmpic/4-months-of-developing-a-memory-allocator-updating-hakozuna-to-v30-hz3hz4-9bb</guid>
      <description>

&lt;h1&gt;
  
  
  4 Months of Developing a Memory Allocator: Updating "Hakozuna" to v3.0 (hz3/hz4)
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I am excited to announce the release of &lt;strong&gt;Hakozuna&lt;/strong&gt;, a high-performance memory allocator.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/hakorune/hakozuna" rel="noopener noreferrer"&gt;https://github.com/hakorune/hakozuna&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Paper &amp;amp; Artifacts (Zenodo v3.0):&lt;/strong&gt; &lt;a href="https://zenodo.org/records/18674502" rel="noopener noreferrer"&gt;https://zenodo.org/records/18674502&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over the past four months, I’ve been through countless cycles of implementation and benchmarking, optimizing the performance against industry standards like &lt;code&gt;mimalloc&lt;/code&gt; and &lt;code&gt;tcmalloc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The biggest takeaway from this journey? Instead of trying to create a "one-size-fits-all" configuration to win every race, the real solution was to &lt;strong&gt;branch out into specialized profiles based on use cases.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hz3&lt;/code&gt;&lt;/strong&gt;: Optimized for &lt;strong&gt;local-heavy / Redis-like&lt;/strong&gt; workloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hz4&lt;/code&gt;&lt;/strong&gt;: Optimized for &lt;strong&gt;remote-heavy / high-thread&lt;/strong&gt; workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is Hakozuna?
&lt;/h2&gt;

&lt;p&gt;Hakozuna is built on &lt;strong&gt;Box Theory&lt;/strong&gt;—a design philosophy centered on aggregating boundaries to isolate responsibilities. During development, I prioritized:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero overhead in the hot path:&lt;/strong&gt; Eliminating unnecessary operations where it matters most.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reversibility:&lt;/strong&gt; Ensuring every optimization can be toggled via compile flags.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; Using A/B benchmarking and one-shot counters to make performance data transparent.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Benchmark Summary (Ubuntu Native, Representative Values at RUNS=10)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  MT lane x remote% (Ops/s)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;&lt;code&gt;hz3&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;hz4&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;mimalloc&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;tcmalloc&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;main_r0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;375.4M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;137.4M&lt;/td&gt;
&lt;td&gt;224.2M&lt;/td&gt;
&lt;td&gt;232.7M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;main_r50&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;66.5M&lt;/td&gt;
&lt;td&gt;78.1M&lt;/td&gt;
&lt;td&gt;17.9M&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;84.3M&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;main_r90&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;62.6M&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;67.6M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13.0M&lt;/td&gt;
&lt;td&gt;54.9M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cross128_r90&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.80M&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50.65M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10.94M&lt;/td&gt;
&lt;td&gt;7.50M&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Redis-like (Median ops/s)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hz3&lt;/code&gt;&lt;/strong&gt;: &lt;strong&gt;571,199&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mimalloc&lt;/code&gt;&lt;/strong&gt;: 568,740&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;tcmalloc&lt;/code&gt;&lt;/strong&gt;: 568,052&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hz4&lt;/code&gt;&lt;/strong&gt;: 560,576&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Choosing the Right Version
&lt;/h2&gt;

&lt;p&gt;For most scenarios, I recommend starting with &lt;strong&gt;&lt;code&gt;hz3&lt;/code&gt;&lt;/strong&gt; as the default. Switch to &lt;strong&gt;&lt;code&gt;hz4&lt;/code&gt;&lt;/strong&gt; only if your workload is strictly remote-heavy or involves extremely high thread counts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# hz3 (Default)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;hakozuna/hz3
make clean all_ldpreload_scale
&lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./libhakozuna_hz3_scale.so ./your_app

&lt;span class="c"&gt;# hz4 (Remote-heavy / High-thread)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ../hz4
make clean all
&lt;span class="nv"&gt;LD_PRELOAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./libhakozuna_hz4.so ./your_app

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Through this development process, I gained two major insights:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Identify "NO-GOs" Early&lt;/strong&gt;
Documenting and archiving optimizations that &lt;em&gt;didn't&lt;/em&gt; work was just as important as the successes. Moving on quickly from failed paths ultimately accelerated the final progress.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;There is no single "winning" path&lt;/strong&gt;
Stability in performance numbers only came after separating the logic: &lt;code&gt;hz3&lt;/code&gt; for local-heavy and &lt;code&gt;hz4&lt;/code&gt; for remote-heavy/high-thread counts. Specialization is the key to outperforming general-purpose allocators in specific niches.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/hakorune/hakozuna" rel="noopener noreferrer"&gt;hakorune/hakozuna&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zenodo v3.0:&lt;/strong&gt; &lt;a href="https://zenodo.org/records/18674502" rel="noopener noreferrer"&gt;View Records&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOI:&lt;/strong&gt; 10.5281/zenodo.18674502&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>c</category>
      <category>codex</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>HakoNyans: A Transparent Lossless Codec Challenge (with GitHub Copilot CLI)</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sat, 14 Feb 2026 17:11:27 +0000</pubDate>
      <link>https://forem.com/charmpic/hakonyans-a-transparent-lossless-codec-challenge-with-github-copilot-cli-2imj</link>
      <guid>https://forem.com/charmpic/hakonyans-a-transparent-lossless-codec-challenge-with-github-copilot-cli-2imj</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-01-21"&gt;GitHub Copilot CLI Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;HakoNyans&lt;/strong&gt;, an experimental image codec focused on practical decode speed and transparent lossless results&lt;br&gt;
  across different image types (photo, anime, UI/screen).&lt;/p&gt;

&lt;p&gt;For this challenge, I focused on two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A clearer lossless workflow in CLI&lt;/strong&gt;&lt;br&gt;
Added a new command: &lt;code&gt;hakonyans encode-lossless &amp;lt;in.ppm&amp;gt; &amp;lt;out.hkn&amp;gt; [preset: fast|balanced|max]&lt;/code&gt;&lt;br&gt;
This makes lossless testing reproducible from the terminal without custom scripts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reproducible visual and metric snapshots&lt;/strong&gt;&lt;br&gt;
Prepared a dedicated asset pack for challenge demos: &lt;code&gt;docs/assets/devchallenge_2026_01_21/&lt;/code&gt;&lt;br&gt;
Included side-by-side comparisons and both &lt;strong&gt;win&lt;/strong&gt; and &lt;strong&gt;lose&lt;/strong&gt; cases (to keep reporting honest).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What this project means to me:&lt;/p&gt;

&lt;p&gt;I wanted to show not just “best-case screenshots,” but a realistic engineering snapshot: where HKN already wins (some natural-photo cases), and where PNG is still much stronger (some structured/UI-like cases).&lt;/p&gt;

&lt;h2&gt;Demo&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Repository: &lt;a href="https://github.com/hakorune/HakoNyans" rel="noopener noreferrer"&gt;https://github.com/hakorune/HakoNyans&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Challenge asset pack: &lt;a href="https://github.com/hakorune/HakoNyans/tree/main/docs/assets/devchallenge_2026_01_21" rel="noopener noreferrer"&gt;https://github.com/hakorune/HakoNyans/tree/main/docs/assets/devchallenge_2026_01_21&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Screenshots&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Main comparison&lt;/strong&gt;&lt;br&gt;
  &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flinerwry58cyhywxlham.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flinerwry58cyhywxlham.jpg" alt="Artoria side-by-side compare" width="800" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main comparison metrics (Artoria, lossless)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Source PNG (Artoria): &lt;code&gt;17,669,320 bytes&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Env: &lt;code&gt;HAKONYANS_THREADS=1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Command: &lt;code&gt;./build/hakonyans encode-lossless &amp;lt;in.ppm&amp;gt; &amp;lt;out.hkn&amp;gt; &amp;lt;preset&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Encode time: 2-run median wall time&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Preset&lt;/th&gt;
&lt;th&gt;HKN bytes&lt;/th&gt;
&lt;th&gt;PNG bytes&lt;/th&gt;
&lt;th&gt;PNG/HKN&lt;/th&gt;
&lt;th&gt;Encode time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;balanced&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;17,374,550&lt;/td&gt;
&lt;td&gt;17,669,320&lt;/td&gt;
&lt;td&gt;1.0170&lt;/td&gt;
&lt;td&gt;2.94 s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;16,767,516&lt;/td&gt;
&lt;td&gt;17,669,320&lt;/td&gt;
&lt;td&gt;1.0538&lt;/td&gt;
&lt;td&gt;101.18 s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;PNG/HKN &amp;gt; 1.0&lt;/code&gt; means HKN is smaller.&lt;/p&gt;
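&lt;p&gt;The ratio column can be recomputed directly from the raw byte counts in the table (a quick sanity check, nothing more):&lt;/p&gt;

```python
# Recomputing PNG/HKN from the byte counts reported above.
png_bytes = 17_669_320
hkn_balanced = 17_374_550
hkn_max = 16_767_516

ratio_balanced = round(png_bytes / hkn_balanced, 4)  # ~1.017
ratio_max = round(png_bytes / hkn_max, 4)            # ~1.0538
print(ratio_balanced, ratio_max)
```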

&lt;p&gt;&lt;strong&gt;Lossless win case (nature_01)&lt;/strong&gt;&lt;br&gt;
  &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcmfj47tsy0jd6txnx5h3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcmfj47tsy0jd6txnx5h3.jpg" alt="Lossless win case" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lossless lose case (hd_01)&lt;/strong&gt;&lt;br&gt;
  &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0egq5apvf0pezi0hwp1k.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0egq5apvf0pezi0hwp1k.jpg" alt="Lossless lose case" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Example metrics snapshot (fixed6, max preset)&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;nature_01&lt;/code&gt;: HKN 812,567 bytes vs PNG 1,281,481 bytes (&lt;code&gt;PNG/HKN = 1.577&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nature_02&lt;/code&gt;: HKN 999,685 bytes vs PNG 1,446,470 bytes (&lt;code&gt;PNG/HKN = 1.447&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;hd_01&lt;/code&gt;: HKN 699,858 bytes vs PNG 8,785 bytes (&lt;code&gt;PNG/HKN = 0.0126&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So currently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HKN can beat PNG on some photo-like content.&lt;/li&gt;
&lt;li&gt;PNG still dominates some structured/worst-case images.&lt;/li&gt;
&lt;li&gt;The project is actively improving both sides.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;My Experience with GitHub Copilot CLI&lt;/h2&gt;

&lt;p&gt;GitHub Copilot CLI was most useful for my fast implementation loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refactoring large headers into smaller units safely&lt;/li&gt;
&lt;li&gt;Adding a new CLI command (&lt;code&gt;encode-lossless&lt;/code&gt;) while preserving existing behavior&lt;/li&gt;
&lt;li&gt;Running repetitive verify loops quickly (&lt;code&gt;build&lt;/code&gt;, &lt;code&gt;ctest&lt;/code&gt;, benchmark checks)&lt;/li&gt;
&lt;li&gt;Generating and validating challenge-ready visual assets from command pipelines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest impact was &lt;strong&gt;speed + consistency&lt;/strong&gt;.&lt;br&gt;
  I could iterate quickly in terminal-first workflows while keeping changes verifiable (tests, checksums, RMSE checks,&lt;br&gt;
  benchmark CSVs).&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>cli</category>
      <category>githubcopilot</category>
    </item>
    <item>
      <title>AVX2 SIMD Optimization for 12-bit JPEG Decoding in libjpeg-turbo — Pair Programming with Copilot CLI</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Tue, 10 Feb 2026 15:49:32 +0000</pubDate>
      <link>https://forem.com/charmpic/avx2-simd-optimization-for-12-bit-jpeg-decoding-in-libjpeg-turbo-pair-programming-with-copilot-cli-3o37</link>
      <guid>https://forem.com/charmpic/avx2-simd-optimization-for-12-bit-jpeg-decoding-in-libjpeg-turbo-pair-programming-with-copilot-cli-3o37</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-01-21"&gt;GitHub Copilot CLI Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;I added &lt;strong&gt;AVX2 SIMD optimizations&lt;/strong&gt; to libjpeg-turbo's 12-bit JPEG decoding pipeline, achieving &lt;strong&gt;4.6% speedup&lt;/strong&gt; on&lt;br&gt;
  Full HD and &lt;strong&gt;2.5% on 4K&lt;/strong&gt; images.&lt;/p&gt;

&lt;p&gt;libjpeg-turbo is the world's most widely used JPEG library, with highly optimized SIMD paths for 8-bit JPEG. However,&lt;br&gt;
  &lt;strong&gt;12-bit JPEG&lt;/strong&gt; (used in medical imaging and high-precision workflows) had &lt;strong&gt;zero SIMD support&lt;/strong&gt; — everything ran as&lt;br&gt;
  scalar C code.&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;perf&lt;/code&gt; profiling, I identified &lt;strong&gt;3 hotspots&lt;/strong&gt; and implemented AVX2 intrinsics for each:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IDCT&lt;/strong&gt; (Inverse DCT)&lt;/td&gt;
&lt;td&gt;64-bit arithmetic + AVX2 parallelization&lt;/td&gt;
&lt;td&gt;~3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;YCC→RGB Color Conversion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SIMD compute + packed RGB interleave output&lt;/td&gt;
&lt;td&gt;~3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;H2V2 Fancy Upsample&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16-bit SIMD weighted interpolation&lt;/td&gt;
&lt;td&gt;~1.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;Why "just 4.6%" matters&lt;/h3&gt;

&lt;p&gt;libjpeg-turbo is &lt;strong&gt;already one of the most optimized codebases in existence&lt;/strong&gt;. Profiling reveals that &lt;strong&gt;37.6% of CPU&lt;br&gt;
  time&lt;/strong&gt; is spent in Huffman decoding — which is structurally impossible to SIMD-ize due to the sequential&lt;br&gt;
  bit-dependency in the JPEG spec. The SIMD-able portion (IDCT + color conversion + upsampling ≈ 44%) was effectively&lt;br&gt;
  optimized across all three targets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📊 Benchmarks (AMD Ryzen 9 9950X, GCC 13.3.0, -O3)&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resolution&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Full HD (1920×1080)&lt;/td&gt;
&lt;td&gt;27.87 ms&lt;/td&gt;
&lt;td&gt;26.58 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4K (3840×2160)&lt;/td&gt;
&lt;td&gt;113.07 ms&lt;/td&gt;
&lt;td&gt;110.26 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.5%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;🧪 All 662 tests pass&lt;/strong&gt; — JPEG compliance tests allow zero tolerance for bit-level differences.&lt;/p&gt;

&lt;h2&gt;Demo&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🔗 Repository&lt;/strong&gt;: moe-charm/dev_libjpeg-turbo-12bit-simd&lt;/p&gt;

&lt;h3&gt;Profiling Breakdown (4K 12-bit JPEG)&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;37.63%  decode_mcu                    ← Huffman decoding (cannot SIMD)
21.95%  jsimd_idct_islow_avx2_12bit   ← ✅ AVX2 optimized
11.38%  ycc_rgb_convert               ← ✅ AVX2 optimized
10.57%  h2v2_fancy_upsample           ← ✅ AVX2 optimized
 8.25%  put_rgb                       ← File I/O
 5.00%  jpeg_fill_bit_buffer          ← Bitstream parsing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;Key Implementation Detail&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// 12-bit samples are 16-bit → widen to 32-bit for arithmetic → pack back to 16-bit
__m256i y = _mm256_cvtepu16_epi32(_mm_loadu_si128((__m128i *)inptr0));
// ... AVX2 YCC→RGB conversion ...
__m256i r16 = _mm256_packus_epi32(r, zero);  // 32-bit → 16-bit pack
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2&gt;My Experience with GitHub Copilot CLI&lt;/h2&gt;

&lt;p&gt;This entire project was built &lt;strong&gt;exclusively through Copilot CLI in the terminal&lt;/strong&gt; — no IDE involved.&lt;/p&gt;

&lt;h3&gt;🔍 The Profile → Implement → Verify Cycle&lt;/h3&gt;

&lt;p&gt;Copilot CLI handled &lt;code&gt;perf record&lt;/code&gt; / &lt;code&gt;perf report&lt;/code&gt; execution and analysis, AVX2 intrinsics code generation, and running&lt;br&gt;
   all 662 &lt;code&gt;ctest&lt;/code&gt; tests — &lt;strong&gt;all within a single terminal session&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What stood out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Profiling-driven prioritization&lt;/strong&gt; — After running &lt;code&gt;perf&lt;/code&gt;, Copilot analyzed the results and suggested which
function to optimize next based on CPU time share. This data-driven approach kept the work focused on high-impact
targets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AVX2 intrinsics generation&lt;/strong&gt; — Instructions like &lt;code&gt;_mm256_packus_epi32&lt;/code&gt;, &lt;code&gt;_mm256_permute4x64_epi64&lt;/code&gt;, and
&lt;code&gt;_mm256_cvtepu16_epi32&lt;/code&gt; are notoriously hard to get right without reading Intel manuals. Copilot generated correct
sequences and understood the cross-lane behavior of AVX2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging bit-level failures&lt;/strong&gt; — The 12-bit IDCT initially had 1-bit rounding errors that failed compliance tests.
Copilot helped diagnose the overflow issue and switch from 32-bit to 64-bit intermediate arithmetic to fix it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A/B testing infrastructure&lt;/strong&gt; — Copilot proposed and implemented the &lt;code&gt;JPEG12_IDCT_FORCE_C&lt;/code&gt; environment variable for
toggling between SIMD and scalar paths, enabling clean before/after benchmarking.&lt;/li&gt;
&lt;/ul&gt;
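&lt;p&gt;The overflow behind those 1-bit rounding failures is easy to see in integer arithmetic (a toy reconstruction, not the actual libjpeg-turbo code): a maximum 12-bit sample, once pre-scaled for fixed-point math, exceeds the signed 32-bit range after a single constant multiply.&lt;/p&gt;

```python
# Toy illustration: max 12-bit sample * Q13 constant after fixed-point
# pre-scaling overflows a signed 32-bit intermediate.
sample = 4095                 # maximum 12-bit value
scaled = sample << 13         # typical fixed-point pre-scaling
const_q13 = 11585             # e.g. sqrt(2) in Q13 fixed point
product = scaled * const_q13

print(product > 2**31 - 1)    # exceeds INT32_MAX -> 64-bit intermediate needed
```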

&lt;h3&gt;💡 Why CLI Was Perfect for This&lt;/h3&gt;

&lt;p&gt;SIMD optimization lives and dies by the &lt;strong&gt;"write → build → test → profile → analyze → rewrite"&lt;/strong&gt; loop. Copilot CLI&lt;br&gt;
  keeps this cycle &lt;strong&gt;entirely within the terminal&lt;/strong&gt; — no context switching to an editor. Run &lt;code&gt;cmake --build&lt;/code&gt;, see the&lt;br&gt;
  result, fix the code, run 662 tests, benchmark — all in one continuous conversation.&lt;/p&gt;

&lt;h3&gt;⚠️ Challenges&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rare type system&lt;/strong&gt;: libjpeg-turbo's 12-bit types (&lt;code&gt;J12SAMPLE&lt;/code&gt;, &lt;code&gt;J12SAMPROW&lt;/code&gt;, &lt;code&gt;J12SAMPARRAY&lt;/code&gt;) barely exist in
training data. Copilot initially generated dispatch logic using the compile-time &lt;code&gt;BITS_IN_JSAMPLE&lt;/code&gt; macro, but the
correct approach requires runtime &lt;code&gt;data_precision&lt;/code&gt; checks — since libjpeg-turbo builds a single binary supporting
multiple precisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measuring small gains&lt;/strong&gt;: When your baseline is already world-class, proving that a 2-3% improvement is real (not
noise) requires careful benchmark design with multiple runs and statistical analysis.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Built with GitHub Copilot CLI + libjpeg-turbo 3.1.x on AMD Ryzen 9 9950X / Ubuntu / GCC 13.3.0&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>cli</category>
      <category>githubcopilot</category>
    </item>
    <item>
      <title>I suffered a crushing defeat against mimalloc in specific tuning scenarios.</title>
      <dc:creator>CharmPic</dc:creator>
      <pubDate>Sun, 25 Jan 2026 05:42:49 +0000</pubDate>
      <link>https://forem.com/charmpic/i-suffered-a-crushing-defeat-against-mimalloc-in-specific-tuning-scenarios-4dj3</link>
      <guid>https://forem.com/charmpic/i-suffered-a-crushing-defeat-against-mimalloc-in-specific-tuning-scenarios-4dj3</guid>
<description>&lt;ol&gt;
&lt;li&gt;The Scalability Wall (T=16)&lt;br&gt;
At 16 threads, hz3 hit a limit.&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;hz3: 76.6M ops/s&lt;/li&gt;
&lt;li&gt;mimalloc: 85.0M ops/s&lt;/li&gt;
&lt;li&gt;hz4 (AI): 106.3M ops/s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My allocator struggles to scale linearly at high thread counts compared to the AI's brute-force approach.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;The "Aggressive Purge" Defeat&lt;br&gt;
In the default benchmarks, hz3 used less memory (1.36GB) than mimalloc (1.52GB). However, when I enabled mimalloc's aggressive memory release mode (purge_delay=0), the tables turned dramatically.&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;mimalloc (tuned): 0.52 GB 😱&lt;/li&gt;
&lt;li&gt;hz3 (tuned): 1.39 GB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: hz3 used 2.7x more memory.&lt;/p&gt;

&lt;p&gt;mimalloc has a highly sophisticated page reclaiming system that I haven't implemented yet. While hz3 holds onto memory to keep speed up, mimalloc can strip down to the bare metal when asked. This is a clear victory for mimalloc's engineering maturity.&lt;/p&gt;
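&lt;p&gt;The tradeoff is easy to caricature in a few lines. This is my own toy, count-based stand-in for mimalloc's time-based purge_delay, not its actual reclaiming logic: freed pages either stay cached for fast reuse (higher RSS) or get handed back to the OS (lower RSS, more syscalls).&lt;/p&gt;

```python
# Toy purge model: every free adds a cached page; once the cache exceeds
# `purge_delay` pages, everything cached is returned to the OS in one call.
def simulate_frees(n_frees, purge_delay):
    cached = returned = syscalls = 0
    for _ in range(n_frees):
        cached += 1
        if cached > purge_delay:      # purge: hand pages back to the OS
            returned += cached
            cached = 0
            syscalls += 1
    return cached, returned, syscalls

print(simulate_frees(100, 0))     # aggressive purge: minimal RSS, many syscalls
print(simulate_frees(100, 1000))  # lazy: every page retained for speed
```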

</description>
      <category>cpp</category>
    </item>
  </channel>
</rss>
