<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tatsuya Ogawa</title>
    <description>The latest articles on Forem by Tatsuya Ogawa (@_4a49fbaa067787556beb).</description>
    <link>https://forem.com/_4a49fbaa067787556beb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3587889%2F7687c66b-6852-4a15-873e-b1d614706427.png</url>
      <title>Forem: Tatsuya Ogawa</title>
      <link>https://forem.com/_4a49fbaa067787556beb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_4a49fbaa067787556beb"/>
    <language>en</language>
    <item>
      <title>Exploring Metal 4 Placement Sparse Buffers: Granularity Limits in 3D Textures</title>
      <dc:creator>Tatsuya Ogawa</dc:creator>
      <pubDate>Sun, 19 Apr 2026 15:37:27 +0000</pubDate>
      <link>https://forem.com/_4a49fbaa067787556beb/exploring-metal-4-placement-sparse-buffers-granularity-limits-in-3d-textures-47ik</link>
      <guid>https://forem.com/_4a49fbaa067787556beb/exploring-metal-4-placement-sparse-buffers-granularity-limits-in-3d-textures-47ik</guid>
      <description>&lt;h1&gt;
  
  
  Exploring Metal 4 Placement Sparse Buffers: Granularity Limits in 3D Textures
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;With the release of &lt;strong&gt;Metal 4&lt;/strong&gt; at WWDC25, I was particularly interested in the updated &lt;strong&gt;Placement Sparse Buffers&lt;/strong&gt; (and textures). I wanted to see how the new APIs change memory management and whether they reduce memory use more effectively for 3D resources like Signed Distance Fields (SDFs).&lt;/p&gt;

&lt;p&gt;To find out, I built a verification app to audit the actual behavior on physical hardware.&lt;/p&gt;

&lt;p&gt;You can find the full source code here:&lt;br&gt;
&lt;a href="https://github.com/tatsuya-ogawa/MetalPlacementSparseVerification" rel="noopener noreferrer"&gt;tatsuya-ogawa/MetalPlacementSparseVerification&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Verification App Overview
&lt;/h2&gt;

&lt;p&gt;The app renders a sphere using &lt;strong&gt;Raymarching&lt;/strong&gt; from an SDF stored in a 3D texture. It compares three different memory strategies:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dense&lt;/strong&gt;: Fully allocated 3D texture (Baseline)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware Sparse (Metal 4)&lt;/strong&gt;: Using the new Placement Sparse APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Software Atlas (Bricks)&lt;/strong&gt;: A custom implementation of a 3D brick atlas&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kjqkc994l5qameclfpu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kjqkc994l5qameclfpu.png" alt="App Screenshot" width="800" height="513"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Finding 1: The "64x64x1" Granularity Bottleneck
&lt;/h2&gt;

&lt;p&gt;While implementing the Hardware Sparse mode using Metal 4, I encountered a physical constraint that significantly impacts 3D resource optimization: &lt;strong&gt;Sparse Page Granularity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On a physical iOS device using &lt;code&gt;R32Float&lt;/code&gt; and a 16KB page size, &lt;code&gt;device.sparseTileSize&lt;/code&gt; returns a tile dimension of &lt;strong&gt;64 x 64 x 1&lt;/strong&gt;.&lt;/p&gt;
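&lt;p&gt;The tile dimension and the page size are consistent with each other, which a quick calculation confirms. The sketch below is plain arithmetic; &lt;code&gt;tileSizeInBytes&lt;/code&gt; is a hypothetical helper written for this illustration, not a Metal API or something taken from the repo.&lt;/p&gt;

```swift
// Hypothetical helper for illustration; not part of the Metal API or the repo.
func tileSizeInBytes(width: Int, height: Int, depth: Int, bytesPerTexel: Int) -> Int {
    return width * height * depth * bytesPerTexel
}

// R32Float stores one 32-bit float, so 4 bytes per texel.
let sparseTileBytes = tileSizeInBytes(width: 64, height: 64, depth: 1, bytesPerTexel: 4)
print(sparseTileBytes) // 16384, i.e. one 16 KB page
```

&lt;p&gt;In other words, one 64 x 64 x 1 tile of 4-byte texels fills exactly one 16 KB page.&lt;/p&gt;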

&lt;h3&gt;
  
  
  Why is this a problem?
&lt;/h3&gt;

&lt;p&gt;When storing only the "shell" of an object (like an SDF surface), a tile that is one voxel deep is extremely thin, but its 64x64 footprint is large. If the surface even slightly grazes a tile, the entire 64x64x1 block is committed to memory.&lt;/p&gt;

&lt;p&gt;At a resolution of $256^3$:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dense&lt;/strong&gt;: ~102.4 MB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware Sparse&lt;/strong&gt;: &lt;strong&gt;48.4 MB&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a saving of a little over half, but it is less efficient than it could be because the coarse 64x64 footprint captures a lot of empty space around the surface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Finding 2: Superiority of the Software Brick Atlas
&lt;/h2&gt;

&lt;p&gt;To overcome this, I implemented a traditional "Software Brick Atlas" using $8 \times 8 \times 8$ blocks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Software Atlas (Bricks)&lt;/strong&gt;: &lt;strong&gt;15.8 MB&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result was clear: the software approach used &lt;strong&gt;only about one third of the memory&lt;/strong&gt; of the hardware-based approach (15.8 MB vs. 48.4 MB). By using small cubes ($8^3$) instead of thin plates ($64 \times 64 \times 1$), the atlas can tightly bound the surface of the sphere, excluding far more empty voxels.&lt;/p&gt;
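&lt;p&gt;The effect of the tile shape can be reproduced with a small CPU-side count. This is a minimal sketch under stated assumptions, not the measurement code from the app: it uses a hypothetical $64^3$ grid, a sphere of radius 20 voxels, a shell one voxel thick on either side of the surface, and 4 bytes per voxel. All type and function names are made up for the illustration, and the numbers it prints are not expected to match the figures above.&lt;/p&gt;

```swift
// Illustrative sketch only: every name here is hypothetical, not from the repo.
struct TileShape {
    let width: Int
    let height: Int
    let depth: Int
}

struct TileIndex: Hashable {
    let x: Int
    let y: Int
    let z: Int
}

/// Bytes that must be committed to cover every shell voxel of a centered sphere
/// when memory can only be committed in whole tiles of the given shape.
func committedBytes(resolution: Int, radius: Int, band: Int,
                    tile: TileShape, bytesPerVoxel: Int) -> Int {
    let center = resolution / 2
    // Compare squared integer distances to avoid floating-point rounding.
    let inner = (radius - band) * (radius - band)
    let outer = (radius + band) * (radius + band)
    var touched = [TileIndex: Bool]()
    for z in 0...(resolution - 1) {
        for y in 0...(resolution - 1) {
            for x in 0...(resolution - 1) {
                let dx = x - center, dy = y - center, dz = z - center
                let distanceSquared = dx * dx + dy * dy + dz * dz
                // A voxel belongs to the shell when it lies within `band` voxels of the surface.
                if (inner...outer).contains(distanceSquared) {
                    let index = TileIndex(x: x / tile.width, y: y / tile.height, z: z / tile.depth)
                    touched[index] = true
                }
            }
        }
    }
    // Every touched tile is committed in full, even if only one voxel needs it.
    return touched.count * tile.width * tile.height * tile.depth * bytesPerVoxel
}

let plateBytes = committedBytes(resolution: 64, radius: 20, band: 1,
                                tile: TileShape(width: 64, height: 64, depth: 1),
                                bytesPerVoxel: 4)
let brickBytes = committedBytes(resolution: 64, radius: 20, band: 1,
                                tile: TileShape(width: 8, height: 8, depth: 8),
                                bytesPerVoxel: 4)
print("64x64x1 tiles: \(plateBytes) bytes, 8x8x8 bricks: \(brickBytes) bytes")
```

&lt;p&gt;Every slice that the shell passes through commits a full 64x64 plate, while the bricks only cover the neighborhood of the surface, so the brick total comes out smaller in this setup.&lt;/p&gt;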




&lt;h2&gt;
  
  
  Metal 4 vs. Metal 3: What Actually Changed?
&lt;/h2&gt;

&lt;p&gt;The biggest takeaway from this verification is that while Metal 4 introduces a more refined API (Placement Sparse Buffers, improved residency sets, etc.), &lt;strong&gt;the underlying memory reduction efficiency remains identical to Metal 3.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The "minimum unit" of memory allocation is defined by the hardware's sparse page size. Since that hasn't changed, Metal 4 doesn't provide any inherent memory-saving advantage over Metal 3 for 3D textures. Both are limited by the same tile dimensions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Metal 4's Placement Sparse Buffers offer a modern and clean API for resource management. However, for 3D memory optimization, we must still respect the physical limits of hardware tile granularity.&lt;/p&gt;

&lt;p&gt;If you are dealing with sparse 3D data where high-density packing is critical, a Software Brick Atlas remains a superior choice despite the added implementation complexity.&lt;/p&gt;

&lt;p&gt;Feel free to check out the repo for the implementation details!&lt;br&gt;
&lt;a href="https://github.com/tatsuya-ogawa/MetalPlacementSparseVerification" rel="noopener noreferrer"&gt;tatsuya-ogawa/MetalPlacementSparseVerification&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Verified on physical iOS hardware with Metal 4 support.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mobile</category>
      <category>gpu</category>
      <category>ios</category>
      <category>graphics</category>
    </item>
    <item>
      <title>Investigating Missing Skeletons in RealityKit After Converting UniRig GLB Files</title>
      <dc:creator>Tatsuya Ogawa</dc:creator>
      <pubDate>Wed, 29 Oct 2025 17:27:25 +0000</pubDate>
      <link>https://forem.com/_4a49fbaa067787556beb/investigating-missing-skeletons-in-realitykit-after-converting-unirig-glb-files-2kfj</link>
      <guid>https://forem.com/_4a49fbaa067787556beb/investigating-missing-skeletons-in-realitykit-after-converting-unirig-glb-files-2kfj</guid>
      <description>&lt;p&gt;UniRig does a great job auto-generating skeletons for GLB models, which makes preparing assets for RealityKit much easier. The surprise came when I pulled that GLB into Reality Composer Pro, exported it as USDZ, and loaded it in a RealityKit app: asking the &lt;code&gt;ModelEntity&lt;/code&gt; for its skeleton just returned &lt;code&gt;nil&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;skeletonIterator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;components&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;ModelComponent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;!.&lt;/span&gt;&lt;span class="n"&gt;mesh&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;skeletons&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makeIterator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same code works flawlessly with Apple's sample &lt;code&gt;robot.usdz&lt;/code&gt;, so the issue clearly lives in the asset converted from UniRig. Here is what I discovered and how I worked around it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Observed
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Convert the UniRig GLB to USDZ via Reality Composer Pro.&lt;/li&gt;
&lt;li&gt;Load the USDZ in RealityKit and the &lt;code&gt;ModelEntity&lt;/code&gt; reports a &lt;code&gt;nil&lt;/code&gt; skeleton.&lt;/li&gt;
&lt;li&gt;Swap in Apple's &lt;code&gt;robot.usdz&lt;/code&gt; and everything behaves, confirming the runtime side is fine.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Digging Into The Asset
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Converted the problematic USDZ from binary (&lt;code&gt;usdc&lt;/code&gt;) to text (&lt;code&gt;usda&lt;/code&gt;) with &lt;code&gt;usdcat&lt;/code&gt; so I could diff it.&lt;/li&gt;
&lt;li&gt;Replaced its skeleton with the one from &lt;code&gt;robot.usdz&lt;/code&gt;; the scene loaded, which pointed to naming or structure in the skeleton data.&lt;/li&gt;
&lt;li&gt;Checked the &lt;code&gt;uniform token[] joints = []&lt;/code&gt; definition and noticed some joint names in the UniRig output contained slashes (&lt;code&gt;XXXX/YYYY&lt;/code&gt;). RealityKit appears to reject skeletons whose joint names include &lt;code&gt;/&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Fixing The Skeleton Names
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Unpack the USDZ you exported from Reality Composer Pro.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;usdcat&lt;/code&gt; to convert the &lt;code&gt;usdc&lt;/code&gt; file to &lt;code&gt;usda&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;code&gt;uniform token[] joints = []&lt;/code&gt; array, batch-replace joint names that contain &lt;code&gt;/&lt;/code&gt; with names that don't, e.g. turn &lt;code&gt;XXXX/YYYY&lt;/code&gt; into &lt;code&gt;XXXXYYYY&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Repack the edited &lt;code&gt;usda&lt;/code&gt; back into USDZ.&lt;/li&gt;
&lt;/ol&gt;
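&lt;p&gt;The renaming in step 3 can be sketched as a small string transform. This is a minimal illustration, assuming the joint names have already been pulled out of the &lt;code&gt;usda&lt;/code&gt; text as plain strings; &lt;code&gt;sanitizeJointName&lt;/code&gt; and &lt;code&gt;sanitizeJointNames&lt;/code&gt; are hypothetical helpers and do not parse USD. It also flags names that would collide once the slashes are gone, since &lt;code&gt;XXXX/YYYY&lt;/code&gt; and an existing &lt;code&gt;XXXXYYYY&lt;/code&gt; would end up identical.&lt;/p&gt;

```swift
// Hypothetical helpers for illustration; they work on plain strings and do not parse USD.

/// Removes every "/" from a joint name, e.g. "XXXX/YYYY" becomes "XXXXYYYY".
func sanitizeJointName(_ name: String) -> String {
    return String(name.filter { $0 != "/" })
}

/// Sanitizes a list of joint names and reports names that collide after the slashes are removed.
func sanitizeJointNames(_ names: [String]) -> (renamed: [String], collisions: [String]) {
    var seen = [String: Int]()
    var renamed = [String]()
    for name in names {
        let clean = sanitizeJointName(name)
        seen[clean, default: 0] += 1
        renamed.append(clean)
    }
    // Any sanitized name that occurs more than once is ambiguous.
    let collisions = seen.filter { $0.value > 1 }.keys.sorted()
    return (renamed, collisions)
}

let result = sanitizeJointNames(["Root", "XXXX/YYYY", "XXXXYYYY"])
print(result.renamed)    // ["Root", "XXXXYYYY", "XXXXYYYY"]
print(result.collisions) // ["XXXXYYYY"]
```

&lt;p&gt;If other attributes in the file refer to the same joint names, they would presumably need the same replacement so that everything stays consistent.&lt;/p&gt;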

&lt;p&gt;After stripping the slashes from those joint names and repacking, RealityKit once again exposes the skeleton without any code changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;RealityKit seems intolerant of joint names that include &lt;code&gt;/&lt;/code&gt;. When you run UniRig GLBs through Reality Composer Pro, double-check the skeleton joint names and sanitize them before publishing the USDZ if needed. That one tweak keeps RealityKit happy.&lt;/p&gt;

</description>
      <category>realitykit</category>
      <category>unirig</category>
    </item>
  </channel>
</rss>
