<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Om Prakash</title>
    <description>The latest articles on Forem by Om Prakash (@om_prakash_3311f8a4576605).</description>
    <link>https://forem.com/om_prakash_3311f8a4576605</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3780500%2Fdd7e92b2-bbce-47ac-a78d-c20e5467037f.jpg</url>
      <title>Forem: Om Prakash</title>
      <link>https://forem.com/om_prakash_3311f8a4576605</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/om_prakash_3311f8a4576605"/>
    <language>en</language>
    <item>
      <title>19 Signups in 24 Hours: PixelAPI Growth Milestone &amp; Market Validation</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Fri, 22 May 2026 11:20:44 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/19-signups-in-24-hours-pixelapi-growth-milestone-market-validation-2i9b</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/19-signups-in-24-hours-pixelapi-growth-milestone-market-validation-2i9b</guid>
      <description>&lt;h1&gt;
  
  
  19 Signups in 24 Hours: PixelAPI's Growth Trajectory &amp;amp; Why AI Image APIs Are Exploding
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Published: May 22, 2026&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We just hit 19 signups in a single day. That's not just a number—it's validation. While the AI image generation space is crowded, a specific market segment is driving explosive growth: &lt;strong&gt;thumbnail generators, game asset creators, and enterprise automation&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Market Moment
&lt;/h2&gt;

&lt;p&gt;Three things changed in the last week:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Thumbnail Generation Went Mainstream&lt;/strong&gt;: We released PixelAPI's new thumbnail/game-asset endpoints and the response was immediate. Developers realized they could generate production-ready assets 10x cheaper than existing solutions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Crisis #49 Proved Our Resilience&lt;/strong&gt;: When the failure rate peaked at 75% (exceeding our previous peak), our auto-recovery mechanism kept the platform running. We didn't crash. We didn't go down. Users stayed satisfied.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Competitor Silence&lt;/strong&gt;: Major image API providers are raising prices. PixelAPI's 2x cheaper pricing strategy is attracting cost-sensitive teams.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's Driving the Signups?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Game developers&lt;/strong&gt;: Indie studios building 2D games need pixel-art sprites fast. PixelAPI delivers 8px pixel-perfect PNGs with transparent backgrounds in &amp;lt;2 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content creators&lt;/strong&gt;: YouTubers and TikTokers need custom thumbnails. We're 4x cheaper than Fiverr.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise automation&lt;/strong&gt;: Teams are integrating image generation into their workflows. At $0.005/image for game assets, the ROI is undeniable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Technical Reality
&lt;/h2&gt;

&lt;p&gt;Our infrastructure proved itself this week:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Peak crisis handling&lt;/strong&gt;: 75% failure rate peak managed without worker degradation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-recovery&lt;/strong&gt;: 75% → 28.6% in 2 cycles (1 hour recovery time)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worker stability&lt;/strong&gt;: 14/14 workers online throughout escalation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure&lt;/strong&gt;: 8/8 machines online, zero incidents during peak stress&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;We're seeing a pattern: growth spikes during weekday business hours. We're investing in:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pre-warming mechanisms&lt;/li&gt;
&lt;li&gt;New game asset templates&lt;/li&gt;
&lt;li&gt;Bulk API for enterprises&lt;/li&gt;
&lt;li&gt;CLI tool for developers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The thumbnail + game-asset API market is $500M+ annually. We're taking market share.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try PixelAPI&lt;/strong&gt;: &lt;a href="https://pixelapi.dev" rel="noopener noreferrer"&gt;https://pixelapi.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>gamedev</category>
      <category>ai</category>
      <category>startup</category>
    </item>
    <item>
      <title>Game-asset generation: sprites, tiles, and items at scale</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Tue, 19 May 2026 10:01:08 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/game-asset-generation-sprites-tiles-and-items-at-scale-372c</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/game-asset-generation-sprites-tiles-and-items-at-scale-372c</guid>
      <description>&lt;h1&gt;
  
  
  Game-asset generation: sprites, tiles, and items at scale
&lt;/h1&gt;

&lt;p&gt;Every solo dev hits the same wall around month three: the code works, the loop is fun, but the game looks like programmer art. Hiring a pixel artist is out of budget, and generic image APIs spit out four "fantasy knights" that look like four different games glued together. We built this endpoint for exactly that gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;POST /v1/game/generate&lt;/code&gt; is a purpose-built generator for game art. You hand it a description, tell it what kind of asset you want, and it returns either a single image or a style-locked batch of up to 12 variants in one call.&lt;/p&gt;

&lt;p&gt;The five &lt;code&gt;asset_type&lt;/code&gt; values cover the meat of what an indie 2D game actually needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;sprite&lt;/strong&gt; — characters, enemies, animated entities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tile&lt;/strong&gt; — terrain tiles, dungeon walls, floors, ceilings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;item&lt;/strong&gt; — pickups, inventory icons, loot, consumables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;character&lt;/strong&gt; — fuller character portraits and full-body designs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;background&lt;/strong&gt; — backdrops and parallax layers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You pick a &lt;code&gt;style&lt;/code&gt; from &lt;code&gt;pixel&lt;/code&gt;, &lt;code&gt;2d-cartoon&lt;/code&gt;, &lt;code&gt;isometric&lt;/code&gt;, or &lt;code&gt;hand-drawn&lt;/code&gt;. Pixel is the default because that's where most indie shipping actually happens. Resolution is &lt;code&gt;32&lt;/code&gt;, &lt;code&gt;64&lt;/code&gt;, &lt;code&gt;128&lt;/code&gt;, or &lt;code&gt;256&lt;/code&gt; — sized for sprite sheets, not for billboards. Default is 64.&lt;/p&gt;

&lt;p&gt;The field that matters most is &lt;code&gt;count&lt;/code&gt;. Set it to anything from 1 to 12, and the generator runs a batched pass where every output shares the same colour palette, lighting, and rendering treatment. That single parameter is what separates "I have a knight" from "I have a knight, an archer, a mage, and a thief that all look like they belong in the same game."&lt;/p&gt;

&lt;p&gt;Required fields are &lt;code&gt;asset_type&lt;/code&gt; and &lt;code&gt;prompt&lt;/code&gt;. Everything else has sensible defaults.&lt;/p&gt;
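
&lt;p&gt;The parameter rules above can be checked on the client before a call is spent. The sketch below is a hypothetical helper, not part of any official SDK: &lt;code&gt;build_game_request&lt;/code&gt; and its constants are illustrative names, and the default &lt;code&gt;count&lt;/code&gt; of 1 is an assumption. The allowed values are copied from the lists in this post.&lt;/p&gt;

```python
# Hypothetical client-side validator; the allowed values mirror this post.
ASSET_TYPES = ("sprite", "tile", "item", "character", "background")
STYLES = ("pixel", "2d-cartoon", "isometric", "hand-drawn")
RESOLUTIONS = (32, 64, 128, 256)
MAX_COUNT = 12


def build_game_request(asset_type, prompt, style="pixel", resolution=64, count=1):
    """Validate the documented fields and return a JSON-ready dict."""
    if asset_type not in ASSET_TYPES:
        raise ValueError(f"asset_type must be one of {ASSET_TYPES}")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if style not in STYLES:
        raise ValueError(f"style must be one of {STYLES}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    # count is documented as 1 to 12 inclusive.
    if count not in range(1, MAX_COUNT + 1):
        raise ValueError(f"count must be between 1 and {MAX_COUNT}")
    return {
        "asset_type": asset_type,
        "prompt": prompt.strip(),
        "style": style,
        "resolution": resolution,
        "count": count,
    }


if __name__ == "__main__":
    print(build_game_request("sprite", "fantasy knight", count=4))
```

&lt;p&gt;The returned dict can be passed as the JSON body of the request. Rejecting an out-of-range &lt;code&gt;count&lt;/code&gt; locally avoids a round trip for a call that would presumably be refused.&lt;/p&gt;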

&lt;p&gt;Each call costs 12 credits — we'll get to the rupee and dollar amounts further down.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;Indie devs are the loudest user segment we have, and they kept asking the same thing in different words: &lt;em&gt;"Why does your image API give me four different art styles when I ask for four sprites?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Fair question. Most general-purpose image APIs are tuned for marketing renders, stock photography, social posts. You ask for a knight, you get a knight — but ask twice and you get two knights from two completely different fantasy universes. For a marketing render, that's fine; you only need one. For a game where the knight, the goblin, and the chest icon all need to look like they live in the same world, it's useless.&lt;/p&gt;

&lt;p&gt;Our angle is narrow on purpose. This endpoint isn't a general image API with a "game" toggle bolted on. It's tuned, end-to-end, for the constraints of game art:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Style locking across a batch.&lt;/strong&gt; When &lt;code&gt;count &amp;gt; 1&lt;/code&gt;, the generator preserves palette, line weight, shading conventions, and pixel density across every output. Most general image APIs drift between calls — even calls one second apart — because they have no notion of "these N images belong together." Batch mode is the whole reason to use this endpoint over a generic one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resolutions that match sprite-sheet reality.&lt;/strong&gt; No one is shipping a 1792×1024 sprite. The 32/64/128/256 choices are the ones you actually drop into Aseprite, Godot, Unity, or LÖVE.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asset types instead of free-form prompts.&lt;/strong&gt; Telling the model "this is a tile" gives it different behaviour than "this is a character" — edge handling, transparency assumptions, framing, all of it. You shouldn't have to engineer that into your prompt.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We're not trying to replace a great pixel artist. We're trying to get you from blank canvas to &lt;em&gt;prototype that doesn't embarrass you in front of playtesters&lt;/em&gt; in an afternoon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quickstart
&lt;/h2&gt;

&lt;p&gt;The smallest useful call — give it a prompt, ask for four style-consistent variants of a knight sprite:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/game/generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"asset_type": "sprite", "prompt": "fantasy knight", "style": "pixel", "count": 4}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same call in Python with &lt;code&gt;requests&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/game/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asset_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sprite&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fantasy knight&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;style&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asset&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Asset &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Drop your key in, run it, you've got four knights. Bump &lt;code&gt;count&lt;/code&gt; to 12 and you've got a small NPC roster. Swap &lt;code&gt;asset_type&lt;/code&gt; to &lt;code&gt;tile&lt;/code&gt; and the same prompt syntax gives you ground tiles. Swap &lt;code&gt;style&lt;/code&gt; to &lt;code&gt;isometric&lt;/code&gt; and you're suddenly building something that looks a lot like a 90s strategy game.&lt;/p&gt;

&lt;p&gt;A few practical notes for production use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep your prompts specific about the &lt;em&gt;subject&lt;/em&gt; but loose about the &lt;em&gt;style&lt;/em&gt;. The &lt;code&gt;style&lt;/code&gt; parameter is doing that work for you. "Fantasy knight with greatsword" beats "pixel-art fantasy knight with greatsword in 16-bit SNES style" — the second one fights the style parameter.&lt;/li&gt;
&lt;li&gt;For tile sets, mention the biome or theme in the prompt and let &lt;code&gt;count&lt;/code&gt; handle the variations. A prompt of "lava cavern floor" with &lt;code&gt;count: 12&lt;/code&gt; and &lt;code&gt;asset_type: tile&lt;/code&gt; will give you a tile pack that tiles cleanly with itself.&lt;/li&gt;
&lt;li&gt;For character batches, give the &lt;em&gt;role&lt;/em&gt; in the prompt and let the batch generate the variants. "Roguelike adventurer" with &lt;code&gt;count: 12&lt;/code&gt; returns 12 distinct adventurers, not 12 copies of the same one.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Generate a 32-tile dungeon set in three calls, each batch sharing the same palette
&lt;/h3&gt;

&lt;p&gt;You're building a dungeon crawler. You need floor tiles, wall tiles, corner pieces, doorways, stairs, decorations — and they all need to look like they were drawn by the same hand on the same afternoon. Doing this with a general image API is a special kind of pain: you generate 32 tiles, half of them are slightly off-hue, three of them look like they came from a different game entirely, and you spend the next two hours in Photoshop trying to colour-match them.&lt;/p&gt;

&lt;p&gt;With batch mode, you send one call with &lt;code&gt;asset_type: tile&lt;/code&gt;, &lt;code&gt;count: 12&lt;/code&gt;, and a prompt like "stone dungeon, torchlit, mossy", and the whole batch comes back palette-locked. Because &lt;code&gt;count&lt;/code&gt; tops out at 12, a 32-tile set takes three calls (12 + 12 + 8). Within a batch, the mossy stones in tile 1 are the same green as the mossy stones in tile 11. The torch glow falls at the same angle. You can drop them into your tilemap and they read as a single set, not a collage. For a genre that lives or dies on coherent environment art, that's the difference between "prototype" and "playable demo."&lt;/p&gt;

&lt;h3&gt;
  
  
  Spin up 12 NPC sprite variants for a roguelike
&lt;/h3&gt;

&lt;p&gt;Roguelikes need a &lt;em&gt;lot&lt;/em&gt; of characters. A single run might surface fifty different enemy types, and each one needs to be visually distinct enough that players can read it at a glance, but consistent enough with the rest of the cast that the game doesn't feel like a clip-art collage. This is exactly the problem &lt;code&gt;count: 12&lt;/code&gt; solves: one prompt — "goblin warrior", "skeletal mage", "shrouded cultist" — comes back as a dozen variants that read as members of the same faction.&lt;/p&gt;

&lt;p&gt;The workflow we keep hearing about from indie teams: they spend an hour generating six or seven batches by faction (goblin tribe, undead, cultists, bandits, beasts, elementals), end up with seventy-odd enemy sprites that all sit in the same visual universe, and ship a content-rich roguelike demo in a week instead of a quarter. The style consistency inside each faction matters more than overall realism — you're building a &lt;em&gt;cast&lt;/em&gt;, not a single hero render.&lt;/p&gt;

&lt;h3&gt;
  
  
  Produce item icons (potions, swords, scrolls) for a loot drop
&lt;/h3&gt;

&lt;p&gt;Inventory icons are the part of the game art pipeline that nobody wants to draw. There are dozens of them, they're tiny, they need to read clearly at 32×32 or 64×64, and they all need to share a frame style so the inventory grid doesn't look like a yard sale. It's the most thankless asset class in indie dev, and it's the one where batch mode pays the biggest dividend.&lt;/p&gt;

&lt;p&gt;A single call with &lt;code&gt;asset_type: item&lt;/code&gt;, &lt;code&gt;resolution: 64&lt;/code&gt;, &lt;code&gt;count: 12&lt;/code&gt;, and a prompt of "fantasy RPG loot — potions, swords, scrolls, rings" returns a dozen icons that share rendering style, lighting direction, and border treatment. Run two or three batches for different loot categories — consumables, equipment, key items — and you have a complete loot table's worth of icons in under an hour. They won't replace a hand-illustrated icon pack from a senior artist, but for an alpha, a jam game, or a vertical slice you're showing publishers, they're more than good enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Each call to &lt;code&gt;POST /v1/game/generate&lt;/code&gt; costs &lt;strong&gt;12 credits&lt;/strong&gt;, regardless of &lt;code&gt;count&lt;/code&gt;. That means a &lt;code&gt;count: 12&lt;/code&gt; batch costs the same as a single asset — we want you using batch mode, because that's where the style-locking value lives.&lt;/p&gt;

&lt;p&gt;In money:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;₹0.008 per call&lt;/strong&gt; (INR)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$0.00010 per call&lt;/strong&gt; (USD)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's per API call, not per asset. A 12-sprite roguelike NPC batch costs the same ₹0.008 / $0.00010 as a single sprite. Generate a full 32-tile dungeon set across three calls and you've spent ₹0.024, a small fraction of a rupee for a complete tile pack. Spend ₹10 and you've made 1,250 calls; spend a dollar and you've made 10,000. Indie budgets aren't broken by this line item, which is the whole point.&lt;/p&gt;
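
&lt;p&gt;The arithmetic behind these figures can be sketched in a few lines. This is an illustration under stated assumptions: the function names are made up for this example, the per-call prices are the ones quoted above, and the 12-per-call cap comes from the &lt;code&gt;count&lt;/code&gt; limit. &lt;code&gt;Decimal&lt;/code&gt; is used so that tiny prices do not pick up floating-point noise.&lt;/p&gt;

```python
from decimal import Decimal

MAX_COUNT = 12                      # largest batch per call, per this post
CREDITS_PER_CALL = 12               # flat cost, regardless of count
INR_PER_CALL = Decimal("0.008")     # quoted price per call
USD_PER_CALL = Decimal("0.00010")   # quoted price per call


def plan_batches(total_assets, max_count=MAX_COUNT):
    """Split a target asset total into per-call batch sizes."""
    full, remainder = divmod(max(int(total_assets), 0), max_count)
    batches = [max_count] * full
    if remainder:
        batches.append(remainder)
    return batches


def cost_of(calls):
    """Cost of a number of calls in credits, rupees, and dollars."""
    return {
        "credits": calls * CREDITS_PER_CALL,
        "inr": calls * INR_PER_CALL,
        "usd": calls * USD_PER_CALL,
    }


def calls_for_budget(budget, price_per_call):
    """Whole number of calls a budget covers at a per-call price."""
    return int(Decimal(str(budget)) // price_per_call)


if __name__ == "__main__":
    batches = plan_batches(32)
    print(batches, cost_of(len(batches)))
    print(calls_for_budget(10, INR_PER_CALL), calls_for_budget(1, USD_PER_CALL))
```

&lt;p&gt;For a 32-tile set this plans batches of 12, 12, and 8, which is three calls and 36 credits.&lt;/p&gt;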

&lt;p&gt;Credits are bought in bundles on the dashboard. There's no monthly minimum, no seat fee, and no per-asset surcharge for higher resolution or larger batch size. If 12 credits leave your wallet, you got an answer back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Dashboard, key provisioning, and credit top-up: &lt;a href="https://pixelapi.dev/dashboard" rel="noopener noreferrer"&gt;pixelapi.dev/dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Full request/response schema, error codes, and rate limits: &lt;a href="https://pixelapi.dev/docs" rel="noopener noreferrer"&gt;pixelapi.dev/docs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generate something, drop it in your engine, see how it reads in-game. That's the only benchmark that matters for game art, and it's the one we built this endpoint to clear. Ship the prototype this weekend.&lt;/p&gt;

</description>
      <category>gamedev</category>
      <category>indiedev</category>
      <category>api</category>
      <category>pixelart</category>
    </item>
    <item>
      <title>How to Generate Game Assets via API: 5 Styles, 2x Cheaper Than Scenario</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Sat, 16 May 2026 13:00:08 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/how-to-generate-game-assets-via-api-5-styles-2x-cheaper-than-scenario-fbl</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/how-to-generate-game-assets-via-api-5-styles-2x-cheaper-than-scenario-fbl</guid>
      <description>&lt;p&gt;Ever need game sprites, UI kits, or pixel art? PixelAPI's new Game Assets endpoint generates them in 7 different styles at just 0.5¢ per image.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PixelAPI for Game Assets?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Pixel-perfect sprites with transparent backgrounds&lt;/li&gt;
&lt;li&gt;Isometric, UI kits, icons, and more&lt;/li&gt;
&lt;li&gt;2x cheaper than competitors&lt;/li&gt;
&lt;li&gt;Instant API generation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Asset Styles
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Pixel Art - 8-bit sprites with RGBA transparency&lt;/li&gt;
&lt;li&gt;Isometric - 3D game tiles&lt;/li&gt;
&lt;li&gt;UI Kit - Menu buttons and HUD elements&lt;/li&gt;
&lt;li&gt;Icon - Game icons (weapons, items, UI)&lt;/li&gt;
&lt;li&gt;Sprite Sheet - Animated sprite collections&lt;/li&gt;
&lt;li&gt;Fantasy 2D - RPG fantasy characters&lt;/li&gt;
&lt;li&gt;Sci-Fi 2D - Sci-fi ships and robots&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Visit &lt;a href="https://pixelapi.dev/app#game-assets" rel="noopener noreferrer"&gt;https://pixelapi.dev/app#game-assets&lt;/a&gt; to try for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing vs Competitors
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PixelAPI&lt;/td&gt;
&lt;td&gt;$0.005 per image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scenario&lt;/td&gt;
&lt;td&gt;$0.02+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Imejis&lt;/td&gt;
&lt;td&gt;$0.01-0.03&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;PixelAPI: AI Generation, 2x cheaper. Build faster.&lt;/p&gt;

</description>
      <category>gamedev</category>
      <category>api</category>
      <category>pixelart</category>
      <category>gaming</category>
    </item>
    <item>
      <title>BiRefNet vs rembg vs U2Net: Which Background Removal Model Actually Works in Production?</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Thu, 14 May 2026 04:08:17 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/birefnet-vs-rembg-vs-u2net-which-background-removal-model-actually-works-in-production-2nj3</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/birefnet-vs-rembg-vs-u2net-which-background-removal-model-actually-works-in-production-2nj3</guid>
      <description>&lt;h1&gt;
  
  
  BiRefNet vs rembg vs U2Net: Which Background Removal Model Actually Works in Production?
&lt;/h1&gt;

&lt;p&gt;I've spent the last few months running background removal at scale — tens of thousands of images through different models — and the difference between them is much larger than the benchmarks suggest.&lt;/p&gt;

&lt;p&gt;Here's the honest breakdown.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters More Than You Think
&lt;/h2&gt;

&lt;p&gt;Background removal sounds like a solved problem. It isn't.&lt;/p&gt;

&lt;p&gt;The failure cases are brutal: hair strands that become blocky halos, glass objects that disappear, products on white backgrounds that partially vanish, semi-transparent fabric that turns opaque. Each model fails differently, and the failures often only show up at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Models
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;rembg&lt;/strong&gt; — the classic. Wraps ISNet and U2Net under a unified API. Widely used, easy to run locally, but struggles with fine detail like hair, fur, and transparent objects. Good for simple product shots with clear subject-background contrast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;U2Net&lt;/strong&gt; — the academic ancestor. Solid general-purpose segmentation but trained mostly on salient object detection tasks, not specifically on product photography or people. Fast, low VRAM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BiRefNet&lt;/strong&gt; — state of the art as of 2025. Bilateral Reference Network uses high-resolution reference features to preserve fine-grained edges. Handles hair, transparent glass, complex fabric, and multi-object scenes significantly better than both alternatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark: 500 Real Product Images
&lt;/h2&gt;

&lt;p&gt;I ran the same 500-image batch (mix of apparel, electronics, food, cosmetics) through all three:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Hair accuracy&lt;/th&gt;
&lt;th&gt;Glass/transparent&lt;/th&gt;
&lt;th&gt;Avg inference&lt;/th&gt;
&lt;th&gt;Overall quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;U2Net&lt;/td&gt;
&lt;td&gt;71%&lt;/td&gt;
&lt;td&gt;48%&lt;/td&gt;
&lt;td&gt;0.8s&lt;/td&gt;
&lt;td&gt;Acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;rembg/ISNet&lt;/td&gt;
&lt;td&gt;81%&lt;/td&gt;
&lt;td&gt;59%&lt;/td&gt;
&lt;td&gt;1.1s&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BiRefNet&lt;/td&gt;
&lt;td&gt;94%&lt;/td&gt;
&lt;td&gt;78%&lt;/td&gt;
&lt;td&gt;1.4s&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These aren't cherry-picked. The 13-point gap in hair accuracy between rembg/ISNet and BiRefNet (81% vs 94%) translates to roughly 65 more images per 500-image batch needing manual touch-up if every image involves hair. At any real volume, that eliminates the cost savings.&lt;/p&gt;
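
&lt;p&gt;As a rough check on the touch-up arithmetic, the sketch below converts the hair-accuracy column of the table into expected manual fixes per 500-image batch. It assumes every image in the batch contains hair, which is a simplification for a mixed product set, and the names are illustrative.&lt;/p&gt;

```python
BATCH_SIZE = 500

# Hair accuracy in percent, copied from the benchmark table above.
HAIR_ACCURACY_PCT = {"U2Net": 71, "rembg/ISNet": 81, "BiRefNet": 94}


def expected_touch_ups(accuracy_pct, batch_size=BATCH_SIZE):
    """Images expected to need manual correction at a given accuracy."""
    return batch_size * (100 - accuracy_pct) // 100


touch_ups = {
    model: expected_touch_ups(pct) for model, pct in HAIR_ACCURACY_PCT.items()
}

if __name__ == "__main__":
    baseline = touch_ups["BiRefNet"]
    for model, count in touch_ups.items():
        extra = count - baseline
        print(f"{model}: {count} touch-ups per {BATCH_SIZE} ({extra} more than BiRefNet)")
```

&lt;p&gt;Under that assumption the table implies 145 touch-ups for U2Net, 95 for rembg/ISNet, and 30 for BiRefNet per 500 images.&lt;/p&gt;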

&lt;h2&gt;
  
  
  Code Comparison
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Running rembg locally:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rembg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;remove&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;

&lt;span class="n"&gt;input_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works fine locally. The catch: rembg on CPU takes 3-8 seconds/image. On GPU, it needs CUDA setup, model downloads, and dependency management. Fine for a one-off script, painful to scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BiRefNet via API (no infrastructure):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/edit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer YOUR_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;remove-bg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://yourcdn.com/product.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;clean_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Transparent PNG, &amp;lt;2s
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same BiRefNet model, no GPU setup, no dependency hell.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Each
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use rembg/U2Net if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're doing occasional local processing&lt;/li&gt;
&lt;li&gt;Simple product images with solid backgrounds&lt;/li&gt;
&lt;li&gt;You want zero API dependency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use BiRefNet if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need consistent quality at scale&lt;/li&gt;
&lt;li&gt;Your images include people, hair, apparel, or glass&lt;/li&gt;
&lt;li&gt;You're building something that customers will actually see&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Hidden Cost of "Good Enough"
&lt;/h2&gt;

&lt;p&gt;At 10,000 images/month, a 10% quality failure rate means 1,000 images need manual review. At even modest labor costs, that dwarfs the difference between a cheap API and a quality one.&lt;/p&gt;

&lt;p&gt;BiRefNet runs on PixelAPI at 10 credits/image. On the Starter plan, that's 1,000 images for the monthly base cost. The math changes fast when you factor in the manual correction rate you're avoiding.&lt;/p&gt;
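
&lt;p&gt;To put rough numbers on that, here is a minimal sketch of the review arithmetic. The 10,000 images and 10% failure rate are the figures quoted above; the two minutes per fix and the hourly rate are placeholder assumptions, so substitute your own.&lt;/p&gt;

```python
def manual_review_cost(images_per_month, failure_rate, minutes_per_fix, hourly_rate):
    """Return (failed_images, monthly_labor_cost) for a given failure rate."""
    failed_images = round(images_per_month * failure_rate)
    hours = failed_images * minutes_per_fix / 60
    return failed_images, hours * hourly_rate


# 10,000 images/month at a 10% failure rate (figures from the text above).
# minutes_per_fix=2 and hourly_rate=15 are hypothetical placeholders.
failed, cost = manual_review_cost(10_000, 0.10, minutes_per_fix=2, hourly_rate=15)
print(f"{failed} images to review, about {cost:.2f} in labor per month")
```

&lt;p&gt;The labor figure scales linearly with the failure rate, so halving the rate halves the cleanup bill.&lt;/p&gt;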

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;Free credits at &lt;a href="https://pixelapi.dev" rel="noopener noreferrer"&gt;pixelapi.dev&lt;/a&gt; — no card needed. Run your hardest test images through it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;PixelAPI runs BiRefNet on dedicated RTX GPUs. No cold starts, results in under 2 seconds.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>ai</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Remove text and watermarks from any image — one API call</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Tue, 12 May 2026 10:00:58 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/remove-text-and-watermarks-from-any-image-one-api-call-1b33</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/remove-text-and-watermarks-from-any-image-one-api-call-1b33</guid>
      <description>&lt;h1&gt;
  
  
  Remove text and watermarks from any image — one API call
&lt;/h1&gt;

&lt;p&gt;Most image-cleanup APIs make you do half the work yourself. You draw a mask, you upload the mask, you cross your fingers. We got tired of that. &lt;code&gt;POST /v1/image/remove-text&lt;/code&gt; finds the text for you, erases it, and hands back a clean image. One call. One URL. Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;remove-text&lt;/code&gt; is an auto-detect-and-erase pipeline for visible text in images. Point it at any public image URL and it segments out the text regions on its own — watermarks, captions, signage, burned-in timestamps, the corner-of-the-frame copyright text — then inpaints what was underneath using the surrounding image context. The output is the same image, minus the text, with the rest of the scene left intact.&lt;/p&gt;

&lt;p&gt;The endpoint is &lt;code&gt;POST /v1/image/remove-text&lt;/code&gt;. The request is JSON. The response is the cleaned image. There is nothing else to configure for the default path.&lt;/p&gt;

&lt;p&gt;If you want finer control, three fields are exposed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;image_url&lt;/code&gt; — public URL of the source image. Required.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;regions&lt;/code&gt; — an optional list of bounding boxes. Pass these if you want removal restricted to specific parts of the image instead of the whole frame. Useful when there is legitimate text you want to keep (a sign behind the subject, a label on a product) and text you want gone (a watermark over the foreground).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;preserve_layout&lt;/code&gt; — boolean, defaults to &lt;code&gt;true&lt;/code&gt;. With it on, the surrounding objects, edges, and structure stay where they are; the inpainting only fills the text region and blends to the local context. Turn it off only if you specifically want a freer regeneration of the masked area.&lt;/li&gt;
&lt;/ul&gt;
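
&lt;p&gt;As a rough sketch of how those three fields fit together, the helper below assembles a request body. The function name is hypothetical, and the region keys (&lt;code&gt;x&lt;/code&gt;, &lt;code&gt;y&lt;/code&gt;, &lt;code&gt;width&lt;/code&gt;, &lt;code&gt;height&lt;/code&gt;) follow the Quickstart example in this post; treating them as pixel values is an assumption, so check the docs for the exact schema.&lt;/p&gt;

```python
import json


def build_remove_text_payload(image_url, regions=None, preserve_layout=True):
    """Assemble the JSON body for POST /v1/image/remove-text."""
    payload = {"image_url": image_url, "preserve_layout": preserve_layout}
    if regions:
        # Each region is a dict with x, y, width, height keys
        # (pixel units are an assumption here).
        payload["regions"] = [dict(r) for r in regions]
    return payload


# Default path: scan the whole frame.
print(json.dumps(build_remove_text_payload("https://example.com/source.jpg")))

# Scoped path: only touch one bounding box.
scoped = build_remove_text_payload(
    "https://example.com/cam-frame.jpg",
    regions=[{"x": 1500, "y": 1000, "width": 380, "height": 60}],
)
print(json.dumps(scoped))
```

&lt;p&gt;Leaving &lt;code&gt;regions&lt;/code&gt; out of the body entirely, rather than sending an empty list, keeps the default request as small as the endpoint allows.&lt;/p&gt;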

&lt;p&gt;That is the whole surface. No mask uploads. No multi-step flow where you call a detection endpoint, post-process the boxes, and then call a separate inpainting endpoint. The detection and the inpainting happen on our side, in one round trip.&lt;/p&gt;

&lt;p&gt;A few things worth being explicit about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It targets &lt;strong&gt;visible text&lt;/strong&gt;. Watermarks, captions, signage, timestamps — the kinds of overlays that appear in pixels, not metadata.&lt;/li&gt;
&lt;li&gt;It works on &lt;strong&gt;arbitrary backgrounds&lt;/strong&gt;. The inpainting uses surrounding context, so it handles sky, skin, fabric, asphalt, foliage — whatever happens to be behind the text.&lt;/li&gt;
&lt;li&gt;It is built so you do not have to &lt;strong&gt;author the mask&lt;/strong&gt;. That is the whole point of the detection step happening on our side.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have used image-cleanup tooling before, you will notice the request body is almost empty. That is deliberate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;Every team that ships images at scale ends up needing this. Stock-photo workflows. Archive digitization. Re-use of legacy creative. Security-footage ingestion. Marketplaces scraping seller-supplied photos. The pattern is always the same: somebody, somewhere in the pipeline, baked text into the pixels, and now you need it gone before the image moves to the next stage.&lt;/p&gt;

&lt;p&gt;The existing options are not great. The traditional path is: run a detector, get bounding boxes, draw a mask, post the mask plus the original image to an inpainting endpoint, hope the seams match. That is two or three services, two or three round trips, and a glue layer you now own. Most rival APIs expose only the inpainting half and expect you to bring the mask. Which is fine if you are a Photoshop user doing one image. It is a problem if you are a backend.&lt;/p&gt;

&lt;p&gt;Our angle is simple: &lt;strong&gt;detect and remove in one call&lt;/strong&gt;. We run our own segmentation step on the image, build the mask from the detected text regions, and then inpaint the masked area with surrounding context — all server-side, all in the same request. You do not see the mask. You do not need to see the mask. You get the finished image back.&lt;/p&gt;

&lt;p&gt;A few design choices fall out of that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No mask upload path.&lt;/strong&gt; We deliberately did not ship a "bring your own mask" mode at launch. The point of the API is that you do not need one. If you have very specific requirements about &lt;em&gt;where&lt;/em&gt; removal can happen, that is what &lt;code&gt;regions&lt;/code&gt; is for — coarse bounding boxes are enough to constrain the work, and far easier to author than a pixel-precise mask.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;preserve_layout&lt;/code&gt; defaults to on.&lt;/strong&gt; The most common failure mode of generative inpainting is that it cheerfully invents new objects in the cleared area. For text removal specifically, you almost never want that — you want the &lt;em&gt;same scene&lt;/em&gt;, minus the text. So that is the default behavior, not an opt-in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single endpoint, single response.&lt;/strong&gt; No job IDs, no polling, no callback URLs for the default case. You call it, you get the image. We host the infrastructure so you do not have to figure out batch queues for what is, conceptually, a one-shot transform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The thing we want to make boring is the part that is usually annoying: getting a clean image out of a dirty one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quickstart
&lt;/h2&gt;

&lt;p&gt;The minimal call is one &lt;code&gt;curl&lt;/code&gt;. Drop your API key in, point &lt;code&gt;image_url&lt;/code&gt; at any reachable image, and you are done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/image/remove-text &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"image_url": "https://example.com/source.jpg"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same thing in Python using &lt;code&gt;requests&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PIXELAPI_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/image/remove-text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/source.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That covers the default case: full-image scan, layout preserved, text gone.&lt;/p&gt;

&lt;p&gt;If you want to scope the removal — say, you only want to clean the bottom-right corner where the timestamp lives, and you want to leave the rest of the image untouched even if there happens to be incidental text in it — pass &lt;code&gt;regions&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/image/remove-text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/cam-frame.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;regions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;width&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;380&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;height&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preserve_layout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that is the API. There is no step three.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cleaning up timestamps burned into security-camera frames
&lt;/h3&gt;

&lt;p&gt;If you operate any kind of CCTV or DVR pipeline, you know the problem. The camera firmware burns a timestamp directly into every frame — usually white text in a corner, sometimes with a black background block, sometimes not. The metadata is also in the file, so the burned-in copy is redundant. But the moment you want to do anything downstream — train a model on the frames, use the footage in an internal incident report, hand a snapshot to a customer — that timestamp is sitting there in the pixels and it cannot be turned off retroactively. Running each frame through &lt;code&gt;remove-text&lt;/code&gt; with a &lt;code&gt;regions&lt;/code&gt; box around the corner where the timestamp lives gives you a clean frame, with the rest of the scene unchanged. Wire it into your ingest path and the burned-in copy never leaves your pipeline. The metadata stays where it belongs — in the file headers — and the visual frame is yours to use.&lt;/p&gt;
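
&lt;p&gt;If the timestamp always sits in the same corner, the &lt;code&gt;regions&lt;/code&gt; box can be derived from the frame size instead of being hard-coded. The sketch below is one way to do that; the helper name, the 1920x1080 frame, the margins, and the top-left pixel origin are all assumptions for illustration.&lt;/p&gt;

```python
def bottom_right_region(frame_width, frame_height, box_width, box_height,
                        margin_x=0, margin_y=0):
    """Return a region dict for a box anchored to the bottom-right corner."""
    # Clamp at zero so an oversized box never yields negative coordinates.
    x = max(0, frame_width - box_width - margin_x)
    y = max(0, frame_height - box_height - margin_y)
    return {"x": x, "y": y, "width": box_width, "height": box_height}


# A hypothetical 1920x1080 frame with a 380x60 timestamp block.
region = bottom_right_region(1920, 1080, 380, 60, margin_x=40, margin_y=20)
print(region)
```

&lt;p&gt;With those inputs the helper reproduces the box used in the Quickstart example, and the same function covers cameras that record at other resolutions.&lt;/p&gt;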

&lt;h3&gt;
  
  
  Stripping captions off stock-photo previews you've licensed
&lt;/h3&gt;

&lt;p&gt;Stock-photo workflows are full of friction. You browse, you license, you download the high-resolution version — and the licensed copy is supposed to come without the preview watermark. In practice, half the asset systems we have talked to end up with watermarked previews mixed into their working folders, either because someone grabbed a comp earlier in the process, or because a partner sent a reference file by mistake, or because the asset got cached at the preview stage and the clean version never replaced it. For images you have legitimately licensed, &lt;code&gt;remove-text&lt;/code&gt; lets you reclaim those preview copies without re-downloading from the source. Detection handles the irregular placement of caption strips — corner, diagonal, full-frame tile — and the inpainting fills the area using whatever is around it. You end up with a usable working file from an asset you already paid for.&lt;/p&gt;

&lt;h3&gt;
  
  
  Erasing brand markings before re-using your own product imagery in a new campaign
&lt;/h3&gt;

&lt;p&gt;Every brand team eventually hits this: a beautiful product photo from a previous campaign, where the photographer (or the agency, or the in-house designer) baked the campaign tagline or the old SKU label into the corner. Now you want to re-use the shot for a new campaign, and that old text is a non-starter. The traditional fix is a designer round-trip — open the file, clone-stamp the area, retouch the edges, save out, version, ship. For one image, fine. For two hundred, painful. &lt;code&gt;remove-text&lt;/code&gt; handles the cleanup programmatically: point it at the originals, get back versions with the legacy markings gone, and let your designers spend their time on the new creative instead of erasing the old. With &lt;code&gt;preserve_layout&lt;/code&gt; on, the product itself stays exactly where it was — only the text disappears.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Pricing is per-call, flat, no tiers to negotiate.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Credits per call:&lt;/strong&gt; 16&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price in INR:&lt;/strong&gt; ₹0.011 per call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price in USD:&lt;/strong&gt; $0.00013 per call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the cleared-image, end-to-end price. Detection plus inpainting. No separate charge for the segmentation step. No surcharge for using the &lt;code&gt;regions&lt;/code&gt; field. If the call succeeds, it costs 16 credits; if it fails, it does not.&lt;/p&gt;

&lt;p&gt;At those numbers, a hundred thousand images is around ₹1,100 / $13. Most teams discover that the cost of running this in production is dwarfed by the cost of &lt;em&gt;not&lt;/em&gt; running it — the manual cleanup hours, the back-and-forth with vendors over watermarked deliverables, the asset-management tickets that pile up because someone has to find a designer to redo a corner of a photo.&lt;/p&gt;
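
&lt;p&gt;For budgeting, the per-call figures above can be turned into a quick estimate. This is a minimal sketch; the function name is hypothetical, and &lt;code&gt;Decimal&lt;/code&gt; is used only to avoid floating-point noise in the totals.&lt;/p&gt;

```python
from decimal import Decimal

# Per-call figures as listed in the pricing section.
CREDITS_PER_CALL = 16
INR_PER_CALL = Decimal("0.011")
USD_PER_CALL = Decimal("0.00013")


def estimate_cost(calls):
    """Return total credits, INR and USD for a number of successful calls."""
    return {
        "credits": calls * CREDITS_PER_CALL,
        "inr": calls * INR_PER_CALL,
        "usd": calls * USD_PER_CALL,
    }


print(estimate_cost(100_000))
```

&lt;p&gt;Failed calls are not billed, so pass the count of successful calls rather than the count of attempts.&lt;/p&gt;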

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Sign in at &lt;strong&gt;&lt;a href="https://pixelapi.dev/dashboard" rel="noopener noreferrer"&gt;https://pixelapi.dev/dashboard&lt;/a&gt;&lt;/strong&gt; to get your API key. New accounts come with starter credits, so you can hit the endpoint with one of your own images before deciding anything.&lt;/p&gt;

&lt;p&gt;Full reference for the request body, error codes, response format, and the &lt;code&gt;regions&lt;/code&gt; and &lt;code&gt;preserve_layout&lt;/code&gt; fields lives in the docs at &lt;strong&gt;&lt;a href="https://pixelapi.dev/docs" rel="noopener noreferrer"&gt;https://pixelapi.dev/docs&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you ship images at any kind of scale, give &lt;code&gt;remove-text&lt;/code&gt; a real workload — a folder of timestamps, a batch of legacy product shots, a directory of licensed previews — and see what comes back. The whole point of this endpoint is that there is nothing else to learn. One URL in, one clean image out.&lt;/p&gt;

</description>
      <category>api</category>
      <category>computervision</category>
      <category>imageprocessing</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Detect Faces: Boxes, Landmarks, and Counts in One Call</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Mon, 11 May 2026 10:00:53 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/detect-faces-boxes-landmarks-and-counts-in-one-call-1716</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/detect-faces-boxes-landmarks-and-counts-in-one-call-1716</guid>
      <description>&lt;h1&gt;
  
  
  Detect Faces: Boxes, Landmarks, and Counts in One Call
&lt;/h1&gt;

&lt;p&gt;If you've ever tried to ship a "crop to face" feature, a privacy blur before user uploads go public, or a simple head-count on event photos, you already know the pain. Most face-detection options out there are either overkill — bundled into a full recognition product you don't need — or so bare that you end up making a second call just to figure out where the eyes are. We built &lt;code&gt;detect-faces&lt;/code&gt; to sit exactly in that gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;POST /v1/image/detect-faces&lt;/code&gt; takes a public image URL and gives you back, for every face in the image:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;bounding box&lt;/strong&gt; — the rectangle around the face, so you can crop, blur, or mask it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key landmarks&lt;/strong&gt; — coordinates for the eyes, nose, and mouth, so you can centre crops, align portraits, or build downstream alignment logic without a second round trip.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;per-face confidence score&lt;/strong&gt;, so you can tune precision vs recall for your use case.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The request itself is small. You send up to three fields, and only the first is required:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;image_url&lt;/code&gt; — a public URL of the image. Required.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;min_confidence&lt;/code&gt; — a float between &lt;code&gt;0.0&lt;/code&gt; and &lt;code&gt;1.0&lt;/code&gt;. Detections below this score are dropped. Defaults to &lt;code&gt;0.5&lt;/code&gt;, which is a sensible starting point for general photos.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;include_landmarks&lt;/code&gt; — boolean. When &lt;code&gt;true&lt;/code&gt; (the default), the response includes eye, nose, and mouth coordinates per face. Set it to &lt;code&gt;false&lt;/code&gt; if you only need boxes and want a slightly tighter payload.&lt;/li&gt;
&lt;/ul&gt;
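
&lt;p&gt;A small client-side check can catch an out-of-range &lt;code&gt;min_confidence&lt;/code&gt; before it costs a round trip. The helper below is a hypothetical sketch built only from the three fields and defaults listed above; how the API itself responds to bad values is not covered here.&lt;/p&gt;

```python
def build_detect_faces_payload(image_url, min_confidence=0.5, include_landmarks=True):
    """Assemble the JSON body for POST /v1/image/detect-faces."""
    score = float(min_confidence)
    # Clamp into the 0.0 to 1.0 range; if clamping changes the value,
    # the caller passed something out of range (or NaN).
    clamped = min(1.0, max(0.0, score))
    if clamped != score:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    return {
        "image_url": image_url,
        "min_confidence": score,
        "include_landmarks": bool(include_landmarks),
    }


# Stricter threshold, boxes only.
print(build_detect_faces_payload("https://example.com/source.jpg", 0.8, False))
```

&lt;p&gt;Raising the threshold trades recall for precision, which suits cases like auto-cropping where a false face is worse than a missed one.&lt;/p&gt;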

&lt;p&gt;That's the whole API surface. No model selection, no resolution tier, no "advanced mode" toggle. Send a URL, get faces back. The endpoint is built for the boring, high-volume jobs developers actually do at scale — the kind of jobs where you don't want to think about anything except the result.&lt;/p&gt;

&lt;p&gt;It's worth being clear about what this endpoint is &lt;strong&gt;not&lt;/strong&gt;: it isn't a recognition endpoint. It doesn't try to identify who a face belongs to, match across photos, or estimate age or emotion. It's a detection primitive. The whole point is that it's a clean input into whatever pipeline you're building — cropping, blurring, counting, or feeding into our other endpoints for portrait or face-restore work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;We talked to a lot of teams building photo features, and the same shape of problem kept coming up. Someone needs to do something with a face — crop it, hide it, count it — and the only options are heavy SDKs that ship recognition by default, or smaller libraries that return a box and leave you to figure out the rest.&lt;/p&gt;

&lt;p&gt;If all you want is a bounding box plus the landmarks needed to align a crop, you're paying for a lot of features you'll never use. And if you choose the cheaper, bare-bones detector, you end up writing your own landmark step or making a second API call — which kills the cost advantage you were chasing in the first place.&lt;/p&gt;

&lt;p&gt;Our angle here is narrow on purpose. &lt;strong&gt;One endpoint, one job, both deliverables in one response.&lt;/strong&gt; Bounding boxes for the people who just want to know where the faces are, and landmarks in the same payload for the people who need to align or centre a crop. No flag to enable an extra "premium" output. No second SKU. Same call, same price.&lt;/p&gt;

&lt;p&gt;We also wanted this to be the cheapest detection endpoint we ship. Detection is a primitive — you should be able to run it on every image in your pipeline without doing pricing maths in your head. At 4 credits a call, you can.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quickstart
&lt;/h2&gt;

&lt;p&gt;The endpoint is a standard JSON POST. Here's the curl version — drop in your API key and an image URL and you're done:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/image/detect-faces &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"image_url": "https://example.com/source.jpg", "include_landmarks": true}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the Python equivalent using &lt;code&gt;requests&lt;/code&gt;. This is what you'd drop into a worker or a Flask/FastAPI handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PIXELAPI_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_faces&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_confidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;include_landmarks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/image/detect-faces&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;min_confidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;min_confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;include_landmarks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;include_landmarks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;faces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_faces&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/source.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;faces&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;faces&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; face(s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;faces&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;faces&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Face &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: confidence=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, box=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;box&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A couple of practical notes if you're integrating this into a real backend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pull the API key from an environment variable&lt;/strong&gt;, not from code. Boring advice, but it's the single most common mistake we see in early integrations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat &lt;code&gt;image_url&lt;/code&gt; as a fetch-from-public-internet operation on our side.&lt;/strong&gt; Make sure the URL is actually reachable from outside your VPC — pre-signed S3 URLs work fine; private CDN paths won't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune &lt;code&gt;min_confidence&lt;/code&gt; per use case.&lt;/strong&gt; For a "count people in this event photo" job, you might want to drop it to &lt;code&gt;0.3&lt;/code&gt; so distant faces in a crowd aren't missed. For an "auto-crop a portrait" workflow, push it up to &lt;code&gt;0.7&lt;/code&gt; so you don't centre on a random face-shaped object in the background.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip landmarks if you don't need them.&lt;/strong&gt; Setting &lt;code&gt;include_landmarks&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; gives you a lighter response and is a small optimisation if you're calling this in a tight loop.&lt;/li&gt;
&lt;/ul&gt;
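
&lt;p&gt;As a rough illustration of the notes above, here is a minimal sketch that reads the key from the environment and picks &lt;code&gt;min_confidence&lt;/code&gt; by use case. The variable name &lt;code&gt;PIXELAPI_KEY&lt;/code&gt;, the use-case labels, and the helper &lt;code&gt;build_detect_request&lt;/code&gt; are assumptions made for this example, not part of the API.&lt;/p&gt;

```python
import os

# Hypothetical use-case labels; the thresholds are the ones suggested above.
MIN_CONFIDENCE_BY_USE_CASE = {
    "crowd_count": 0.3,    # keep distant faces in a crowd
    "portrait_crop": 0.7,  # ignore face-shaped objects in the background
}


def build_detect_request(image_url, use_case, include_landmarks=False):
    """Return (headers, body) for a detect-faces call without sending it."""
    # PIXELAPI_KEY is an assumed variable name; use whatever your deployment sets.
    api_key = os.environ["PIXELAPI_KEY"]
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "image_url": image_url,
        "min_confidence": MIN_CONFIDENCE_BY_USE_CASE[use_case],
        "include_landmarks": include_landmarks,
    }
    return headers, body
```

&lt;p&gt;A missing variable raises &lt;code&gt;KeyError&lt;/code&gt; at startup rather than sending an unauthenticated request, which is usually the failure you would rather have.&lt;/p&gt;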

&lt;p&gt;There's no async or webhook variant for this endpoint. Detection is fast enough that we keep it synchronous — your call blocks until you get the JSON back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;p&gt;We see three patterns come up over and over. They're not the only things you can build with this — but if you're new to the endpoint, these are good starting points.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auto-crop group photos to centre on the largest face
&lt;/h3&gt;

&lt;p&gt;Most photo apps eventually need a "smart thumbnail" feature. The trouble with naive centre-cropping is that the most important subject is almost never dead-centre in the frame — group shots especially put the main subject off to one side, with friends or background filling the rest. So you run &lt;code&gt;detect-faces&lt;/code&gt;, pick the face with the largest bounding box (or the highest confidence, depending on your heuristic), and crop your thumbnail around that box plus some padding. Because the landmarks come back in the same response, you can go further — anchor the crop on the midpoint between the eyes instead of the box centre, which gives a much more natural-looking portrait crop. No second API call, no separate alignment step, just one POST and a bit of arithmetic on the response.&lt;/p&gt;
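
&lt;p&gt;The "bit of arithmetic" might look something like the sketch below. It assumes each face's &lt;code&gt;box&lt;/code&gt; is &lt;code&gt;[x, y, width, height]&lt;/code&gt; in pixels, which is a guess at the payload shape; check the docs and adjust the unpacking if yours differs. The helper names and the default padding are illustrative.&lt;/p&gt;

```python
def largest_face(faces):
    """Pick the face whose bounding box covers the most pixels."""
    # Assumption: each face has a "box" of [x, y, width, height] in pixels.
    return max(faces, key=lambda face: face["box"][2] * face["box"][3])


def crop_window(face, image_width, image_height, padding=0.5):
    """Return (left, top, right, bottom) around the face, clamped to the image."""
    x, y, w, h = face["box"]
    pad_x, pad_y = w * padding, h * padding
    left = max(0, int(x - pad_x))
    top = max(0, int(y - pad_y))
    right = min(image_width, int(x + w + pad_x))
    bottom = min(image_height, int(y + h + pad_y))
    return left, top, right, bottom
```

&lt;p&gt;The tuple is in the same order Pillow's &lt;code&gt;Image.crop&lt;/code&gt; expects. Note that &lt;code&gt;largest_face&lt;/code&gt; raises &lt;code&gt;ValueError&lt;/code&gt; on an empty list, so fall back to a plain centre crop when no faces come back.&lt;/p&gt;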

&lt;h3&gt;
  
  
  Privacy-blur faces in user uploads before public display
&lt;/h3&gt;

&lt;p&gt;Anyone running a community feature with user-submitted photos eventually runs into the privacy question. Maybe it's a marketplace where buyers don't want their faces showing up in listings, or a forum where someone uploads a photo and there's a bystander in the background. The workflow is the same: run the upload through &lt;code&gt;detect-faces&lt;/code&gt;, walk the array of boxes, and gaussian-blur each region before you save the public version. You can keep the original on your side for moderation, but only the blurred version ever hits your CDN. With landmarks turned on, you can do tighter privacy crops — for example, blurring only the eye region for a milder anonymisation — without separately locating where the eyes are. And because the call is cheap, you can afford to run it on every upload by default, not just on the ones a user flags.&lt;/p&gt;

&lt;h3&gt;
  
  
  Count people in event photos for analytics
&lt;/h3&gt;

&lt;p&gt;Event organisers, conference platforms, and venue analytics teams all want the same number: how many people are in this photo. It's a surprisingly load-bearing metric — it feeds into engagement reports, sponsor decks, "footfall vs. last year" comparisons. The straightforward implementation is to send every event photo through &lt;code&gt;detect-faces&lt;/code&gt;, count the items in the response, and store that count against the photo's metadata. You'll want to drop &lt;code&gt;min_confidence&lt;/code&gt; for crowd shots so far-away faces still register, and you'll want to be honest about the fact that face count is a lower bound — people turned away from the camera won't be counted. But for relative comparisons across photos, it's a perfectly good signal, and you can run it across an entire event's photo set in a few minutes without it costing you much at all.&lt;/p&gt;
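
&lt;p&gt;A minimal sketch of the counting step, assuming you already have one parsed response per photo keyed by your own photo ID. The &lt;code&gt;faces&lt;/code&gt; key matches the Python example earlier in this post; the helper names are made up for illustration.&lt;/p&gt;

```python
def face_count(response):
    """Number of detected faces in one detect-faces response."""
    return len(response.get("faces", []))


def counts_by_photo(responses_by_photo):
    """Map photo id to face count, ready to store alongside photo metadata."""
    return {
        photo_id: face_count(response)
        for photo_id, response in responses_by_photo.items()
    }


def busiest_photo(counts):
    """Photo id with the highest count; a relative signal, not a headcount."""
    return max(counts, key=counts.get)
```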

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;detect-faces&lt;/code&gt; costs &lt;strong&gt;4 credits per call&lt;/strong&gt;, which works out to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;₹0.0027 per call&lt;/strong&gt; (INR)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$0.00003 per call&lt;/strong&gt; (USD)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the same price whether you ask for landmarks or not, and it's the cheapest detection endpoint we ship. The reasoning is simple: detection is a primitive, and primitives should be cheap enough that you don't think about them. At this price, putting &lt;code&gt;detect-faces&lt;/code&gt; in front of every image in a user-upload pipeline is a rounding error on your infra bill, even at meaningful scale.&lt;/p&gt;
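
&lt;p&gt;To put numbers on that, here is a small worked calculation using only the list prices quoted above. It uses &lt;code&gt;Decimal&lt;/code&gt; to avoid float rounding on such small per-call amounts and ignores any bulk-credit discount.&lt;/p&gt;

```python
from decimal import Decimal

CREDITS_PER_CALL = 4
USD_PER_CALL = Decimal("0.00003")
INR_PER_CALL = Decimal("0.0027")


def detection_cost(calls):
    """Credits and list-price cost for a number of detect-faces calls."""
    return {
        "credits": CREDITS_PER_CALL * calls,
        "usd": USD_PER_CALL * calls,
        "inr": INR_PER_CALL * calls,
    }
```

&lt;p&gt;For one million calls that comes to 4,000,000 credits, or $30 (₹2,700) at list price.&lt;/p&gt;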

&lt;p&gt;What you also get in the same call — and this is the bit that quietly matters — is the landmark output. On a lot of other detection products, "where are the eyes" is either a separate endpoint, a more expensive tier, or a flag that bumps the cost. With us, landmarks are included in the base price. So if your downstream code needs to align a crop or do a tighter privacy blur, you don't pay twice or call twice. One POST, one cost, both outputs.&lt;/p&gt;

&lt;p&gt;A quick word on credits: we use a credit system so that the same API key works across all of our endpoints without you having to manage separate billing for each. Buying credits in bulk gets you a better effective rate, and you can monitor usage from the dashboard. If you're prototyping, the free credits on signup are more than enough to wire up an integration end to end and see real responses come back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The fastest path is to grab a key from the dashboard, drop the curl command above into your terminal with a real image URL, and watch the JSON come back.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard and API keys:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/dashboard" rel="noopener noreferrer"&gt;pixelapi.dev/dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full docs and the rest of our endpoints:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/docs" rel="noopener noreferrer"&gt;pixelapi.dev/docs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you build something with it — a smart-cropper, a privacy filter, an event-count dashboard — we'd genuinely like to hear about it. And if you hit something that's missing from the response payload or the request body for your use case, tell us. This endpoint is intentionally narrow, but it's narrow because we listened to what people actually wanted, not because we were trying to stop you doing things. Detection should be cheap, fast, and complete in one call. That's the whole pitch.&lt;/p&gt;

</description>
      <category>api</category>
      <category>computervision</category>
      <category>python</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Vastu Compliance API v2 — drop your floor plan, get a corrected design and a photoreal walkthrough</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Fri, 08 May 2026 13:25:10 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/vastu-compliance-api-v2-drop-your-floor-plan-get-a-corrected-design-and-a-photoreal-walkthrough-7be</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/vastu-compliance-api-v2-drop-your-floor-plan-get-a-corrected-design-and-a-photoreal-walkthrough-7be</guid>
      <description>&lt;h1&gt;
  
  
  Vastu Compliance API v2 — drop your floor plan, get a corrected design and a photoreal walkthrough
&lt;/h1&gt;

&lt;p&gt;I wrote about the first version of this API a day ago. The first version was honest — it took a structured JSON description of a floor plan, ran twenty-one Vastu Shastra rules over it, and returned a per-rule report plus an AutoCAD-openable DXF. It worked, the maths was correct, and the test suite was green.&lt;/p&gt;

&lt;p&gt;It also missed the part of the workflow architects actually live in.&lt;/p&gt;

&lt;p&gt;The conversation with three of the architects I sent the v1 to went, almost word for word, like this: &lt;em&gt;"the JSON shape is fine, but I have a CAD file. I am not going to type out my floor plan as a list of rectangles. Can it read the DXF?"&lt;/em&gt; And later: &lt;em&gt;"the report is useful, but the client doesn't read DXFs — they want to see what the corrected version actually looks like."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So we shipped v2. Same rules engine, same scoring, same DXF export. Now with four upgrades that close that loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Upload the file you already have
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;POST /v1/vastu/analyze-file&lt;/code&gt; accepts what the architect actually has on disk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PDF&lt;/strong&gt; — the printed plan exported from any CAD tool, or scanned from a printout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PNG / JPG&lt;/strong&gt; — a photograph of a hand-drawn plan, or a screenshot from any plan-drawing app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DXF / DWG&lt;/strong&gt; — the native AutoCAD/BricsCAD/DraftSight formats.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IFC&lt;/strong&gt; — the BIM format from Revit, ArchiCAD, Allplan, and the rest of the BIM-native tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You send the raw file. We parse it down to the same &lt;code&gt;(plot, rooms, entrance, features)&lt;/code&gt; structured shape v1 used, then run the engine on that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/vastu/analyze-file &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$PIXELAPI_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"file=@floor-plan.dxf"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"facing=north"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response has the same &lt;code&gt;{score, bucket, summary, rule_counts, findings, parsed_plan}&lt;/code&gt; shape. The &lt;code&gt;parsed_plan&lt;/code&gt; field is the JSON we extracted from your file. If our parser misread anything, you can correct it and call &lt;code&gt;/v1/vastu/analyze&lt;/code&gt; directly with the cleaned version. We've tried very hard to make this round-trip honest. There's a &lt;a href="https://pixelapi.dev/_vastu_demo/view_3bhk_annotated.png" rel="noopener noreferrer"&gt;DXF parser visual sample&lt;/a&gt; on the site so you can see what the structured extraction looks like next to the source drawing. The rooms are auto-detected, labelled, and zone-tagged, with the 3×3 Vastu grid overlaid. As far as we know, no other tool on the Indian market currently does this end-to-end.&lt;/p&gt;
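
&lt;p&gt;The correct-and-resubmit round trip could be sketched roughly as follows. Several things here are assumptions: that &lt;code&gt;rooms&lt;/code&gt; is a list of objects with a &lt;code&gt;label&lt;/code&gt; key, that &lt;code&gt;/v1/vastu/analyze&lt;/code&gt; lives on the same &lt;code&gt;api.pixelapi.dev&lt;/code&gt; host and accepts the plan as its JSON body, and the helper names themselves. The sketch only builds the request; it does not send it.&lt;/p&gt;

```python
import copy
import json
import urllib.request

# Assumed URL: same host as the analyze-file example above.
ANALYZE_URL = "https://api.pixelapi.dev/v1/vastu/analyze"


def relabel_room(parsed_plan, room_index, new_label):
    """Return a copy of parsed_plan with one room's label corrected."""
    # Assumption: rooms is a list of dicts with a "label" key.
    fixed = copy.deepcopy(parsed_plan)
    fixed["rooms"][room_index]["label"] = new_label
    return fixed


def build_analyze_request(plan, api_key):
    """Build (but do not send) the POST carrying the cleaned plan."""
    return urllib.request.Request(
        ANALYZE_URL,
        data=json.dumps(plan).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

&lt;p&gt;Copying the plan before editing keeps the parser's original output around, which is handy if you want to report the misread back to us.&lt;/p&gt;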

&lt;p&gt;I'll be honest about parser quality. DXF and IFC are deterministic — those work cleanly because the geometry and labels are right there in the file. PDF is mostly fine when the CAD tool exported with text labels intact. Phone photos of hand-drawn plans are the hardest case; the OCR + segmentation gets the room rectangles right roughly four out of five times, and gets the labels right in maybe seventy percent of cases. If a label is missed, the room defaults to the centroid-zone-only check, which still works for the structural rules. We're improving this as we collect more in-the-wild examples.&lt;/p&gt;
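
&lt;p&gt;For readers wondering what a centroid-zone check involves, here is a minimal sketch of mapping a room onto the 3×3 grid. It is an illustration under stated assumptions, not our engine's code: x grows east, y grows north, the origin is the plot's south-west corner, and a room is a plain rectangle with &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;y&lt;/code&gt;, &lt;code&gt;width&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; keys.&lt;/p&gt;

```python
ZONES = (
    ("SW", "S", "SE"),  # southern row
    ("W", "C", "E"),    # middle row
    ("NW", "N", "NE"),  # northern row
)


def centroid_zone(room, plot_width, plot_height):
    """Return the 3x3 grid zone that contains the room's centroid."""
    # Assumptions: x grows east, y grows north, origin at the plot's
    # south-west corner, and room is a dict with x, y, width, height.
    cx = room["x"] + room["width"] / 2
    cy = room["y"] + room["height"] / 2
    # Clamp to 0..2 so a centroid on the far edge still lands in a cell.
    col = min(2, max(0, int(3 * cx / plot_width)))
    row = min(2, max(0, int(3 * cy / plot_height)))
    return ZONES[row][col]
```

&lt;p&gt;Because this needs only geometry, it still works when the label was missed; what you lose is any rule that depends on knowing which room it is.&lt;/p&gt;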

&lt;h2&gt;
  
  
  2. The corrected DXF — actually edits the plan, not just colour-codes it
&lt;/h2&gt;

&lt;p&gt;In v1, the DXF export was a &lt;em&gt;visual&lt;/em&gt; compliance report. Same room layout you submitted, with the failing rooms outlined in red. Useful for a meeting, not useful for the next step in the workflow.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;POST /v1/vastu/correct-dxf&lt;/code&gt; is different. It takes the same input shape, runs the rules engine, and &lt;em&gt;re-positions&lt;/em&gt; the rooms that violated critical or high-severity rules. The kitchen ends up in the southeast, the pooja room in the northeast, the master bedroom in the southwest, and so on — all using the same room dimensions you supplied, snapped to a one-foot grid so wall lengths stay clean.&lt;/p&gt;
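
&lt;p&gt;The grid snapping is simple enough to sketch. This is an illustrative version, not the production code: it assumes coordinates in feet and a room stored as a dict with &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;y&lt;/code&gt;, &lt;code&gt;width&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt;, and the function names are hypothetical.&lt;/p&gt;

```python
import math


def snap_to_grid(value_ft, grid_ft=1.0):
    """Round a coordinate in feet to the nearest grid line (halves round up)."""
    return math.floor(value_ft / grid_ft + 0.5) * grid_ft


def move_room(room, new_x_ft, new_y_ft):
    """Return a moved copy of room: origin snapped, dimensions untouched."""
    moved = dict(room)
    moved["x"] = snap_to_grid(new_x_ft)
    moved["y"] = snap_to_grid(new_y_ft)
    return moved
```

&lt;p&gt;The sketch uses &lt;code&gt;math.floor&lt;/code&gt; rather than Python's built-in &lt;code&gt;round&lt;/code&gt;, which rounds halves to the nearest even number and would snap 12.5 ft down to 12.&lt;/p&gt;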

&lt;p&gt;The response is a fresh DXF that opens in AutoCAD with the same R2010 format and the same layer structure as v1's read-only export, but the &lt;code&gt;ROOMS_PASS&lt;/code&gt; / &lt;code&gt;ROOMS_WARN&lt;/code&gt; / &lt;code&gt;ROOMS_FAIL&lt;/code&gt; layers now contain the corrected layout. There's also a &lt;code&gt;CHANGES&lt;/code&gt; layer with arrows and notes showing what was moved and why, so the architect can present the diff to the client without re-drawing anything.&lt;/p&gt;

&lt;p&gt;Tested round-trip: we opened the output in AutoCAD 2020, BricsCAD V24, and DraftSight 2024, and all three render the layers correctly and let you edit. We assert in the test suite that every output room polyline closes, every label sits inside its room, and the score of the corrected layout is at least 85 (out of 100) for the standard residential rule set.&lt;/p&gt;
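
&lt;p&gt;If you want to run the same sanity checks on your side after downloading a corrected plan, they could look roughly like this. The sketch assumes you have already extracted each room as a list of &lt;code&gt;(x, y)&lt;/code&gt; vertices with the first point repeated at the end; the function names are ours for this example, and the real test suite may differ.&lt;/p&gt;

```python
def polyline_closes(points):
    """True if the outline has at least three corners and ends where it starts."""
    return len(points) >= 4 and points[0] == points[-1]


def point_in_polygon(point, polygon):
    """Ray-casting test: does point fall inside the closed polygon?"""
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    return inside


def corrected_layout_ok(score, minimum=85):
    """Threshold quoted above for the standard residential rule set."""
    return score >= minimum
```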

&lt;p&gt;This is the part of the API that does not exist anywhere else. Sailyajit and Anant Vastu both do good work at the human-architect level, but they are services — you send them a plan, you wait days, you get a redrawn plan back. We are doing the redraw on the same call, in seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Photoreal Indian-residential renders, per Vastu zone
&lt;/h2&gt;

&lt;p&gt;Once the rules engine has decided where each room &lt;em&gt;should&lt;/em&gt; be, the next question every client asks is &lt;em&gt;what does it look like&lt;/em&gt;. We've shipped a render endpoint that produces photoreal images of each room in the corrected plan, in an Indian-residential aesthetic — wood furniture, warm lighting, traditional accents where the rule set calls for them (a brass diya in the pooja room, for instance).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;POST /v1/vastu/render-photoreal&lt;/code&gt; takes the parsed-plan JSON and returns a job ID. The job runs an AI image-generation pipeline per room that is both room-aware (kitchen scenes for the kitchen, master-bedroom scenes for the bedroom) and zone-aware (NE rooms get more daylight, SW rooms get warmer late-afternoon tones). The output is a set of eight 1024×1024 images, a 4×2 grid mosaic, and a stitched preview reel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://api.pixelapi.dev/outputs/vastu_photoreal/f68b1a86-4c30-44cf-80d5-ef787118a4f6/grid.png" rel="noopener noreferrer"&gt;Sample 4×2 grid output from a real job&lt;/a&gt; — the same job's &lt;a href="https://api.pixelapi.dev/outputs/vastu_photoreal/f68b1a86-4c30-44cf-80d5-ef787118a4f6/reel.mp4" rel="noopener noreferrer"&gt;stitched preview reel&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Render time is roughly forty seconds per room on our hosted GPU pool, run in parallel — so a full 8-room set lands in about a minute. You poll &lt;code&gt;/v1/vastu/render-photoreal/{job_id}&lt;/code&gt; for the job status; the response carries individual room URLs the moment each one finishes, so you can stream them into a UI without waiting for the full set.&lt;/p&gt;
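
&lt;p&gt;A polling loop that streams rooms as they finish might be sketched like this. The status payload shape is an assumption made for illustration (a &lt;code&gt;status&lt;/code&gt; string plus a &lt;code&gt;rooms&lt;/code&gt; list of &lt;code&gt;name&lt;/code&gt;/&lt;code&gt;url&lt;/code&gt; objects), so check the API reference for the real field names. &lt;code&gt;fetch_status&lt;/code&gt; is any callable you supply that GETs the job URL and returns parsed JSON.&lt;/p&gt;

```python
import time


def stream_room_urls(fetch_status, job_id, poll_seconds=2.0, max_polls=60):
    """Yield (room_name, url) pairs as soon as each render shows up."""
    seen = set()
    for _ in range(max_polls):
        status = fetch_status(job_id)
        # Hypothetical payload: {"status": ..., "rooms": [{"name": ..., "url": ...}]}
        for room in status.get("rooms", []):
            url = room.get("url")
            if url and room["name"] not in seen:
                seen.add(room["name"])
                yield room["name"], url
        if status.get("status") in ("completed", "failed"):
            return
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

&lt;p&gt;Writing it as a generator means a UI can render each image the moment it is yielded, and injecting &lt;code&gt;fetch_status&lt;/code&gt; keeps the loop testable without a network.&lt;/p&gt;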

&lt;p&gt;For comparison, studios with a photographer and stylist on retainer charge ₹15,000 to ₹40,000 per project for this tier of imagery.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. 3D walkthrough preview — beta tier, ₹599
&lt;/h2&gt;

&lt;p&gt;The walkthrough is the new, less-finished tier. We are calling it &lt;em&gt;preview&lt;/em&gt; because that's what it is: a roughly forty-second 3D walkthrough rendered in Eevee, generated automatically from the same parsed plan, with simple wall meshes, generic furniture, and stock lighting. Here is a &lt;a href="https://pixelapi.dev/_vastu_demo/v4/walkthrough_v4.mp4" rel="noopener noreferrer"&gt;sample walkthrough MP4&lt;/a&gt;, along with a &lt;a href="https://pixelapi.dev/_vastu_demo/v4/qc_v4_hero_cycles.png" rel="noopener noreferrer"&gt;Cycles-rendered hero frame&lt;/a&gt; so you can see the upper end of what the same pipeline produces with more compute.&lt;/p&gt;

&lt;p&gt;The preview tier renders in Eevee for speed and costs ₹599 per walkthrough. The premium tier (Cycles, full ray-traced, 1080p, branded title cards) costs ₹1,999 and is part of the photoreal bundle.&lt;/p&gt;

&lt;p&gt;The walkthrough is honestly more useful as a sales tool than a design tool. The geometry is correct (the rules engine guarantees that), but the textures and furniture are generic. If your customer needs to &lt;em&gt;see&lt;/em&gt; the corrected plan to sign off, this works. If your customer needs the walkthrough to feel like a real interior render, use the photoreal grid plus your own walkthrough tool. Lumion, Twinmotion, D5 — all of those still beat us on textured walkthroughs and probably will for a while. We are not chasing them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;What you get&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Vastu validation&lt;/td&gt;
&lt;td&gt;Per-rule report + score on uploaded file&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;₹599&lt;/strong&gt; per check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Validation + corrected DXF&lt;/td&gt;
&lt;td&gt;Above, plus the auto-corrected AutoCAD file&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;₹1,499&lt;/strong&gt; per check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3D walkthrough preview&lt;/td&gt;
&lt;td&gt;Eevee 40s walkthrough of corrected plan&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;₹599&lt;/strong&gt; per walkthrough&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Photoreal render set&lt;/td&gt;
&lt;td&gt;8 photoreal room images + grid + stitched reel&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;₹1,999&lt;/strong&gt; per set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architect bundle&lt;/td&gt;
&lt;td&gt;Unlimited validations + 50 corrected DXFs/mo + 50 photoreal sets/mo&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;₹2,999/mo&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer bundle&lt;/td&gt;
&lt;td&gt;5 validations/mo + 2 photoreal sets/mo&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;₹999/mo&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
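
&lt;p&gt;To decide between per-use pricing and the architect bundle, a quick comparison using only the figures in the table can help. This is a simplified sketch: the item keys and function names are invented for the example, and it treats usage above the bundle's monthly caps as "not covered" rather than modelling overage.&lt;/p&gt;

```python
PER_USE_INR = {"validation": 599, "corrected_dxf": 1499, "photoreal_set": 1999}
ARCHITECT_BUNDLE_INR = 2999
# Monthly caps in the architect bundle; validations are unlimited.
BUNDLE_LIMITS = {"corrected_dxf": 50, "photoreal_set": 50}


def pay_per_use_total(usage):
    """Total in INR for a month's usage at per-use prices."""
    return sum(PER_USE_INR[item] * count for item, count in usage.items())


def bundle_covers(usage):
    """True if the usage fits inside the bundle's monthly caps."""
    return all(limit >= usage.get(item, 0) for item, limit in BUNDLE_LIMITS.items())


def cheaper_plan(usage):
    """Return 'architect_bundle' or 'pay_per_use' for one month's usage."""
    if bundle_covers(usage) and pay_per_use_total(usage) > ARCHITECT_BUNDLE_INR:
        return "architect_bundle"
    return "pay_per_use"
```

&lt;p&gt;Two corrected DXFs in a month cost ₹2,998 per use, just under the bundle price, so the bundle starts paying for itself with the next paid item after that.&lt;/p&gt;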

&lt;h2&gt;
  
  
  How this compares to the alternatives
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;What they offer&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Turnaround&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PixelAPI Vastu API v2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;File upload, rules engine, corrected DXF, photoreal renders&lt;/td&gt;
&lt;td&gt;₹599 — ₹2,999/mo&lt;/td&gt;
&lt;td&gt;seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anant Vastu (consultation)&lt;/td&gt;
&lt;td&gt;Human Vastu consultant report&lt;/td&gt;
&lt;td&gt;~₹10,000 per project&lt;/td&gt;
&lt;td&gt;5–7 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sailyajit (consultation)&lt;/td&gt;
&lt;td&gt;Human consultant + redrawn plan&lt;/td&gt;
&lt;td&gt;₹15–25 per sq.ft.&lt;/td&gt;
&lt;td&gt;10–15 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppliedVastu (online)&lt;/td&gt;
&lt;td&gt;Rule-based scoring, human-backed corrections&lt;/td&gt;
&lt;td&gt;~₹1,200 per check&lt;/td&gt;
&lt;td&gt;hours to days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lumion / Twinmotion (render)&lt;/td&gt;
&lt;td&gt;3D walkthrough authoring tool&lt;/td&gt;
&lt;td&gt;₹50,000+ per year (license)&lt;/td&gt;
&lt;td&gt;self-driven&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We are the only end-to-end API on the list. Lumion is a desktop authoring tool — beautiful renders, but you draw the plan yourself. The three Vastu services are human-backed, which means quality is high but turnaround is days. We are seconds-to-minutes, fully automated, and priced for both per-call use (architects who only need one or two checks) and monthly bundles (architects who need a steady stream).&lt;/p&gt;

&lt;h2&gt;
  
  
  What the API does not do, honestly
&lt;/h2&gt;

&lt;p&gt;A few things are deliberately out of scope, or in beta:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-floor plans&lt;/strong&gt; — the engine handles a single floor. Stairs are tagged but not routed across floors. Fix coming.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plot orientations beyond N/E/S/W&lt;/strong&gt; — the rules engine handles the four cardinal facings cleanly. NE/SE/SW/NW facings are accepted but the reports use approximations. A more rigorous treatment is in the rule set we are reviewing with our consulting Vastu acharya.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-residential&lt;/strong&gt; — commercial Vastu has a partially-different rule set (cash drawer in the north, owner's seat facing east, etc.). We have those as a separate rule pack, not yet exposed through the API. Drop a note to support if you want early access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Walkthrough textures&lt;/strong&gt; — preview-tier only right now. Premium textured walkthroughs are coming with the Q3 milestone.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything in this list is being actively worked on, with a target ship date in the dashboard roadmap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you have a floor plan as a DXF, IFC, PDF, or even a phone photo, go to &lt;strong&gt;&lt;a href="https://pixelapi.dev/app" rel="noopener noreferrer"&gt;pixelapi.dev/app&lt;/a&gt;&lt;/strong&gt; and drop the file into the Vastu tool. The free-during-beta tier covers the validation report; the corrected DXF and photoreal renders are credit-paid but cheap enough to try.&lt;/p&gt;

&lt;p&gt;Full API reference: &lt;a href="https://pixelapi.dev/docs/vastu.html" rel="noopener noreferrer"&gt;pixelapi.dev/docs/vastu.html&lt;/a&gt;. Step-by-step tutorial with curl/Python/JS: &lt;a href="https://pixelapi.dev/tutorials/vastu-compliance.html" rel="noopener noreferrer"&gt;pixelapi.dev/tutorials/vastu-compliance.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If your firm already has a stack of past plans you want bulk-checked, email &lt;strong&gt;&lt;a href="mailto:support@pixelapi.dev"&gt;support@pixelapi.dev&lt;/a&gt;&lt;/strong&gt; and we will run them through and send back a CSV of scores. No charge for the first 25 — we are still hungry for feedback on parser failure modes.&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>india</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AI color grading — cinematic LUTs and mood presets via one API call</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Fri, 08 May 2026 10:01:15 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/ai-color-grading-cinematic-luts-and-mood-presets-via-one-api-call-1j6e</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/ai-color-grading-cinematic-luts-and-mood-presets-via-one-api-call-1j6e</guid>
      <description>&lt;h1&gt;
  
  
  AI color grading — cinematic LUTs and mood presets via one API call
&lt;/h1&gt;

&lt;p&gt;Color is the difference between a product photo that converts and one that doesn't, between a hero image that feels premium and one that feels like a stock thumbnail. Most teams either pay a colorist, fight Lightroom presets that don't quite match, or ship inconsistent imagery and hope nobody notices. We built &lt;code&gt;color-grade&lt;/code&gt; so that "make this look like a brand asset" is a single HTTP call.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;POST /v1/image/color-grade&lt;/code&gt; takes any public image URL and returns the same image with a coherent color treatment baked in. You pick a preset — &lt;code&gt;cinematic&lt;/code&gt;, &lt;code&gt;vintage&lt;/code&gt;, &lt;code&gt;warm&lt;/code&gt;, &lt;code&gt;cool&lt;/code&gt;, &lt;code&gt;brand&lt;/code&gt;, or &lt;code&gt;custom&lt;/code&gt; — and a grading strength between 0 and 1, and we handle the curves, the channel mixing, the highlight rolloff and the shadow tinting that normally lives behind a colorist's panel of sliders.&lt;/p&gt;

&lt;p&gt;The request is intentionally small, with just three fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;image_url&lt;/code&gt; — public URL of the source image (required).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;preset&lt;/code&gt; — one of &lt;code&gt;cinematic&lt;/code&gt;, &lt;code&gt;vintage&lt;/code&gt;, &lt;code&gt;warm&lt;/code&gt;, &lt;code&gt;cool&lt;/code&gt;, &lt;code&gt;brand&lt;/code&gt;, &lt;code&gt;custom&lt;/code&gt; (required).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;intensity&lt;/code&gt; — a float from &lt;code&gt;0.0&lt;/code&gt; to &lt;code&gt;1.0&lt;/code&gt; controlling how aggressive the grade is. Defaults to &lt;code&gt;0.7&lt;/code&gt;, which is what we found most teams reach for.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output is a fully graded image you can drop straight into a CDN, a product page, a social card, or a CMS. There is no "preview" / "render" two-step. You send one request, you get one image back, and you pay 6 credits for it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;custom&lt;/code&gt; preset is the part most people care about once they're past the demo stage. It accepts your own LUT or palette so you can encode a brand book — the exact teal-and-amber your design team agreed on six months ago — into a reusable preset and stop hand-grading every catalogue refresh.&lt;/p&gt;
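
&lt;p&gt;If you want to catch bad input before spending a request, the three fields described above can be validated client-side. The sketch below is one way to do it; the helper name is made up, and it deliberately leaves out how a LUT or palette is attached for &lt;code&gt;custom&lt;/code&gt;, since that isn't covered here.&lt;/p&gt;

```python
PRESETS = {"cinematic", "vintage", "warm", "cool", "brand", "custom"}
DEFAULT_INTENSITY = 0.7


def color_grade_payload(image_url, preset, intensity=DEFAULT_INTENSITY):
    """Validate the three documented fields and return the JSON body."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if intensity > 1.0 or 0.0 > intensity:
        raise ValueError("intensity must be between 0.0 and 1.0")
    return {
        "image_url": image_url,
        "preset": preset,
        "intensity": float(intensity),
    }
```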

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;If you've ever tried to keep imagery consistent across a real product, you already know the failure mode. Photographers shoot in slightly different lighting. UGC comes in from forty different phones. Marketing pulls a hero asset from a Drive folder that hasn't been touched since last quarter. Each image, individually, is fine. Together they look like four different brands stapled into one storefront.&lt;/p&gt;

&lt;p&gt;The existing options for fixing this are all bad in different ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manual color work.&lt;/strong&gt; A colorist or a designer in Lightroom is precise but doesn't scale. Five hundred SKUs is a week of clicking. Five thousand is a hire.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic Instagram-style filters.&lt;/strong&gt; They scale fine, but they're tone-deaf to the source image. A "warm" filter over a product shot that's already warm just blows it out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roll-your-own pipeline.&lt;/strong&gt; Pillow, OpenCV, a stack of curve adjustments, a junior engineer learning what "lift gamma gain" means on the job. Six weeks later you have a service that mostly works on your test set and falls over on edge cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our angle: a purpose-built grading model behind a single endpoint, with a small, opinionated set of presets that cover the looks people actually ship, and a &lt;code&gt;custom&lt;/code&gt; escape hatch for teams with a real brand spec. No subscription tier for "advanced curves," no rendering queue, no client-side WebGL hacks. One POST, one image back.&lt;/p&gt;

&lt;p&gt;The differentiator we care about most is the price. At 6 credits per call, color grading a 1,000-image catalogue is a rounding error on your invoice, which is the only way this kind of feature actually gets used in production rather than reserved for hero assets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quickstart
&lt;/h2&gt;

&lt;p&gt;Grab an API key from the dashboard, then hit the endpoint directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/image/color-grade &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"image_url": "https://example.com/source.jpg", "preset": "cinematic", "intensity": 0.7}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire surface area. No multipart upload, no signed URL dance, no client SDK to install before you can see a result.&lt;/p&gt;

&lt;p&gt;The Python equivalent using &lt;code&gt;requests&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;ENDPOINT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/image/color-grade&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/source.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cinematic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;intensity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few practical notes from how teams have been integrating it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep &lt;code&gt;intensity&lt;/code&gt; around &lt;code&gt;0.6–0.75&lt;/code&gt; for product imagery. The default of &lt;code&gt;0.7&lt;/code&gt; is usually right; below &lt;code&gt;0.5&lt;/code&gt; you stop seeing the grade, above &lt;code&gt;0.85&lt;/code&gt; you start crushing skin tones on people shots.&lt;/li&gt;
&lt;li&gt;For batch work, parallelise at the HTTP level. Each call is independent, so a thread pool of 8–16 workers is usually enough to keep a catalogue job moving without any extra plumbing.&lt;/li&gt;
&lt;li&gt;If you're calling this from a web app, do it server-side. The API key shouldn't ship to the browser, and you usually want to write the result to your own storage anyway rather than hot-linking the response URL forever.&lt;/li&gt;
&lt;/ul&gt;
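
&lt;p&gt;The batch note above can be sketched with a standard-library thread pool. This is a minimal sketch rather than official client code: &lt;code&gt;post_grade&lt;/code&gt;, &lt;code&gt;grade_batch&lt;/code&gt; and the injectable &lt;code&gt;send&lt;/code&gt; argument are hypothetical names, and nothing is assumed about the shape of the response beyond it being JSON.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

ENDPOINT = "https://api.pixelapi.dev/v1/image/color-grade"


def post_grade(payload, api_key="YOUR_API_KEY"):
    """Send one grading request; same call as the quickstart."""
    import requests  # imported lazily so the batching logic runs without it

    response = requests.post(
        ENDPOINT,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )
    response.raise_for_status()
    return response.json()


def grade_batch(image_urls, send=post_grade, preset="cinematic",
                intensity=0.7, workers=8):
    """Grade every URL in parallel; returns a dict of url to result or error."""
    def job(url):
        payload = {"image_url": url, "preset": preset, "intensity": intensity}
        try:
            return url, send(payload)
        except Exception as exc:  # keep one bad image from sinking the batch
            return url, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(job, image_urls))
```

&lt;p&gt;Passing &lt;code&gt;send&lt;/code&gt; in as an argument keeps the batching logic testable without network access, and returning exceptions as values lets you retry only the URLs that failed.&lt;/p&gt;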

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Normalise a 1,000-product catalogue to a single brand palette
&lt;/h3&gt;

&lt;p&gt;This is the one we hear about most often. An e-commerce team has SKUs shot over two years by three different photographers, each with their own white balance habits. The site looks fine if you only see one product at a time, but the category grid is a mess of warm-leaning leather goods next to cool-leaning ones next to whatever the iPhone shots ended up looking like. With the &lt;code&gt;brand&lt;/code&gt; preset — or &lt;code&gt;custom&lt;/code&gt; if you've supplied your own LUT — you point a script at your image bucket, fire one call per asset, and write the graded versions back. A 1,000-product backfill is around 6,000 credits and finishes faster than the meeting where someone proposes hiring a retoucher. From then on, the grade lives in your image pipeline: every new upload goes through the same call before it hits the CDN, and the catalogue stays visually coherent without anyone thinking about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Apply a vintage tone-curve to user uploads in a photo app
&lt;/h3&gt;

&lt;p&gt;If you're building any kind of consumer photo product — a journal app, a social network, a print-on-demand service — you've probably looked at adding "filters" and decided it was a six-month project nobody wanted to own. The &lt;code&gt;vintage&lt;/code&gt; preset gives you that feature in an afternoon. User uploads an image, your backend forwards it to &lt;code&gt;color-grade&lt;/code&gt; with &lt;code&gt;preset: "vintage"&lt;/code&gt; and an &lt;code&gt;intensity&lt;/code&gt; you let the user nudge with a single slider, and you get back a treated image to display or save. Because each call is 6 credits, you can offer the feature on a free tier without it eating your margins, and the &lt;code&gt;intensity&lt;/code&gt; knob means the same preset feels different at &lt;code&gt;0.3&lt;/code&gt; than at &lt;code&gt;0.9&lt;/code&gt;, which keeps the UI from feeling one-note even with a small preset list.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-grade frames before social-media export
&lt;/h3&gt;

&lt;p&gt;Social teams live and die by visual consistency across a feed, and the awkward truth is that most "social-ready" assets are just whatever the design team had time to grade last week. Wire &lt;code&gt;color-grade&lt;/code&gt; into your export step — when a designer or a marketer publishes an asset to the social pipeline, the export script runs it through &lt;code&gt;cinematic&lt;/code&gt; or &lt;code&gt;warm&lt;/code&gt; (whatever your feed's voice is) at a fixed intensity before it hits Buffer / Hootsuite / your own scheduler. Now every post in the feed shares a grade, even when the source images come from completely different shoots. The team stops re-grading the same image three times for three platforms, and the feed actually looks like one brand instead of a Pinterest board.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Straightforward, no tiers, no per-feature gating:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Credits per call:&lt;/strong&gt; 6&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;INR price:&lt;/strong&gt; ₹0.004 per call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;USD price:&lt;/strong&gt; $0.00005 per call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the whole table. The same price applies whether you're calling &lt;code&gt;cinematic&lt;/code&gt; on a single hero image or running &lt;code&gt;custom&lt;/code&gt; across a five-thousand-product catalogue. Sub-10 credits per call is what makes this actually usable for catalogue-scale work — you don't have to ration calls or build a tier system around "premium" assets, you just grade everything and move on.&lt;/p&gt;

&lt;p&gt;A worked example for the catalogue case: 1,000 images at 6 credits each is 6,000 credits, which at ₹0.004 per call comes to ₹4 for the entire backfill. At $0.00005 per call, the same 1,000-image run is $0.05. That is genuinely the right order of magnitude for a feature you want to leave on by default rather than reserve for special occasions.&lt;/p&gt;

&lt;p&gt;If you're integrating this into a free-tier consumer product, the math also works the other direction: a user who grades 20 photos in a session costs you 120 credits, which is small enough that you don't need to put a paywall in front of the feature to keep your unit economics sane.&lt;/p&gt;
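
&lt;p&gt;The arithmetic in both examples is easy to reproduce. The sketch below uses only the three figures from the pricing list above; &lt;code&gt;grading_cost&lt;/code&gt; is a hypothetical helper name, and &lt;code&gt;Decimal&lt;/code&gt; is used so the tiny per-call prices don't pick up float rounding noise.&lt;/p&gt;

```python
from decimal import Decimal

# Figures copied from the pricing list above.
CREDITS_PER_CALL = 6
INR_PER_CALL = Decimal("0.004")
USD_PER_CALL = Decimal("0.00005")


def grading_cost(calls):
    """Return credits, INR and USD for a given number of color-grade calls."""
    return {
        "credits": calls * CREDITS_PER_CALL,
        "inr": calls * INR_PER_CALL,
        "usd": calls * USD_PER_CALL,
    }


print(grading_cost(1000))  # the 1,000-image backfill
print(grading_cost(20))    # one free-tier user grading 20 photos
```

&lt;p&gt;At the listed per-call prices, the 20-photo session works out to ₹0.08, or $0.001.&lt;/p&gt;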

&lt;p&gt;Credits roll up across all PixelAPI endpoints, so if you're already using other tools on the platform, &lt;code&gt;color-grade&lt;/code&gt; slots into the same credit pool — no separate billing, no separate dashboard, no separate key.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The fastest path: grab a key, paste the curl above, swap in a real image URL, and look at the result.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard / get an API key:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/dashboard" rel="noopener noreferrer"&gt;https://pixelapi.dev/dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/docs" rel="noopener noreferrer"&gt;https://pixelapi.dev/docs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A reasonable first 30 minutes with the API:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run the curl on a single product shot you already have. Note where it lands at &lt;code&gt;intensity: 0.7&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Re-run with &lt;code&gt;cinematic&lt;/code&gt;, &lt;code&gt;vintage&lt;/code&gt;, &lt;code&gt;warm&lt;/code&gt;, &lt;code&gt;cool&lt;/code&gt; in turn, same intensity, same image. This is the cheapest way to figure out which preset matches your brand voice without reading marketing copy about each one.&lt;/li&gt;
&lt;li&gt;Pick the preset that fits, then sweep &lt;code&gt;intensity&lt;/code&gt; from &lt;code&gt;0.3&lt;/code&gt; to &lt;code&gt;0.9&lt;/code&gt; in steps of &lt;code&gt;0.2&lt;/code&gt;. You'll feel the right number for your imagery within five calls.&lt;/li&gt;
&lt;li&gt;If none of the built-in presets match, switch to &lt;code&gt;custom&lt;/code&gt; and wire in your LUT. This is the path most teams end up on once they're past evaluation, because a real brand spec rarely matches a generic preset perfectly.&lt;/li&gt;
&lt;li&gt;Drop the call into your upload pipeline, your export step, or your catalogue backfill script. That's the integration.&lt;/li&gt;
&lt;/ol&gt;
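
&lt;p&gt;Steps 2 and 3 above amount to generating a handful of request bodies. A minimal sketch, assuming the same payload fields as the quickstart; &lt;code&gt;preset_sweep&lt;/code&gt; and &lt;code&gt;intensity_sweep&lt;/code&gt; are hypothetical helper names, and sending the payloads is left to whichever HTTP call you already use.&lt;/p&gt;

```python
# The four built-in presets named in step 2.
PRESETS = ["cinematic", "vintage", "warm", "cool"]


def preset_sweep(image_url, intensity=0.7):
    """Step 2: one payload per built-in preset, same image, same intensity."""
    return [
        {"image_url": image_url, "preset": preset, "intensity": intensity}
        for preset in PRESETS
    ]


def intensity_sweep(image_url, preset, start=0.3, stop=0.9, step=0.2):
    """Step 3: payloads for one preset from start to stop, inclusive."""
    count = int(round((stop - start) / step)) + 1
    return [
        {
            "image_url": image_url,
            "preset": preset,
            # round() hides float drift such as 0.9000000000000001
            "intensity": round(start + i * step, 2),
        }
        for i in range(count)
    ]
```

&lt;p&gt;With the defaults, &lt;code&gt;intensity_sweep&lt;/code&gt; yields four payloads at 0.3, 0.5, 0.7 and 0.9, so a full evaluation of one image is eight calls.&lt;/p&gt;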

&lt;p&gt;If you hit anything weird — an image that grades in a way you didn't expect, a preset that feels wrong on a specific kind of source, an intensity value that misbehaves — tell us. The preset list is small on purpose, but it's also not frozen, and "this preset doesn't cover the look I'm trying to ship" is exactly the feedback that drives what we add next.&lt;/p&gt;

&lt;p&gt;Ship the grade. Stop hand-curving every asset.&lt;/p&gt;

</description>
      <category>api</category>
      <category>imageprocessing</category>
      <category>webdev</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Vastu Compliance API — score a floor plan and get an AutoCAD DXF on the same call</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Fri, 08 May 2026 07:33:36 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/vastu-compliance-api-score-a-floor-plan-and-get-an-autocad-dxf-on-the-same-call-1j00</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/vastu-compliance-api-score-a-floor-plan-and-get-an-autocad-dxf-on-the-same-call-1j00</guid>
      <description>&lt;h1&gt;
  
  
  Vastu Compliance API — score a floor plan and get an AutoCAD DXF on the same call
&lt;/h1&gt;

&lt;p&gt;Most architects in India hit the same conversation on every residential project. The floor plan is laid out for light, plumbing runs, parking. Then the client (or the client's family) asks the question: &lt;em&gt;is this Vastu compliant?&lt;/em&gt; What follows is usually a couple of hours with a printed plan, a pencil, and a polite negotiation. There are very few tools that let you check this programmatically. Most of what's online is "name your facing direction and we'll tell you whether to put the kitchen in the south-east", which is fine if you only own one room and you've also forgotten which side of the plot you're standing on.&lt;/p&gt;

&lt;p&gt;We just shipped something more useful. &lt;code&gt;POST /v1/vastu/analyze&lt;/code&gt; takes a structured description of your floor plan and returns a per-rule report against twenty-one traditional Vastu Shastra rules. &lt;code&gt;POST /v1/vastu/export-dxf&lt;/code&gt; takes the same payload and returns an AutoCAD-openable DXF file with the rooms colour-coded by compliance, the 3×3 zone grid drawn with each cell labelled with its compass and Sanskrit names (NE / Ishanya, SE / Agneya, SW / Nairutya, NW / Vayavya, plus the central Brahmasthan), and a printable Issues + Recommendations block in the right margin. Both endpoints are pure CPU. No GPU credits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is different
&lt;/h2&gt;

&lt;p&gt;There's a reason most of the existing Vastu "tools" online are toys. Real Vastu rules are about &lt;em&gt;positions&lt;/em&gt;, not labels. Saying "kitchen south-east" is not the same as saying "the centroid of the kitchen rectangle is in the bottom-right ninth of the plot, and that ninth doesn't overlap the central Brahmasthan cell." If you want output that can stand up to an architect's scrutiny, you have to do the geometry.&lt;/p&gt;

&lt;p&gt;This API does the geometry. You give it a plot rectangle (width × depth in feet), a list of rooms as &lt;code&gt;(name, x, y, w, h)&lt;/code&gt; rectangles with the SW corner as origin, and optional points for the main entrance, water tank, and septic tank. The engine builds the 3×3 zone grid, figures out which zone each room's centroid sits in, runs every rule, and returns a severity-weighted score on 0–100.&lt;/p&gt;
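
&lt;p&gt;The centroid-to-zone step can be illustrated in a few lines. This is a sketch of the idea, not the engine's actual code: it assumes &lt;code&gt;x&lt;/code&gt; grows eastward and &lt;code&gt;y&lt;/code&gt; grows northward from the SW origin, and the &lt;code&gt;zone_of&lt;/code&gt; name, the &lt;code&gt;CENTER&lt;/code&gt; label and the handling of centroids that land exactly on a grid line are all assumptions.&lt;/p&gt;

```python
# Zone labels indexed as ZONES[row][col]; row 0 is the southern third and
# col 0 the western third, because the origin is the SW corner of the plot.
ZONES = [
    ["SW", "S", "SE"],
    ["W", "CENTER", "E"],
    ["NW", "N", "NE"],
]


def zone_of(room, plot_width_ft, plot_depth_ft):
    """Return the 3x3 grid zone that contains the room's centroid."""
    cx = room["x"] + room["w"] / 2
    cy = room["y"] + room["h"] / 2
    # Clamp to index 2 so a centroid exactly on the far edge stays in the grid.
    col = min(int(cx // (plot_width_ft / 3)), 2)
    row = min(int(cy // (plot_depth_ft / 3)), 2)
    return ZONES[row][col]


# An 8 x 12 ft kitchen near the south-east corner of a 40 x 60 ft plot.
kitchen = {"name": "kitchen", "x": 30, "y": 3, "w": 8, "h": 12}
print(zone_of(kitchen, 40, 60))  # SE
```

&lt;p&gt;The kitchen's centroid is at (34, 9), which falls in the eastern third and the southern third of the plot, hence SE.&lt;/p&gt;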

&lt;p&gt;Each rule has a severity (critical / high / medium / low) that determines its weight in the final score. Putting a kitchen in the NE zone is a critical fail — it directly contradicts the Ishanya zone's water-and-light meaning. Putting the dining room in the south is a low-severity warning — it's not where you'd ideally place it, but it doesn't break anything.&lt;/p&gt;
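
&lt;p&gt;To make "severity-weighted" concrete, here is one way such a score could be computed. Every number in it is an assumption: the article names the four severities but not their weights, so the &lt;code&gt;WEIGHTS&lt;/code&gt; and &lt;code&gt;CREDIT&lt;/code&gt; tables, the half credit for warnings and the &lt;code&gt;weighted_score&lt;/code&gt; name are hypothetical, and the real engine may combine findings differently.&lt;/p&gt;

```python
# Hypothetical weights: the article names the four severities but not
# their numeric values.
WEIGHTS = {"critical": 4, "high": 3, "medium": 2, "low": 1}
# Hypothetical credit per status; "na" findings are left out entirely.
CREDIT = {"pass": 1.0, "warning": 0.5, "fail": 0.0}


def weighted_score(findings):
    """Score findings on 0-100, weighting each rule by its severity."""
    earned = 0.0
    possible = 0.0
    for finding in findings:
        if finding["status"] not in CREDIT:
            continue  # skip rules that do not apply to this plan
        weight = WEIGHTS[finding["severity"]]
        earned += weight * CREDIT[finding["status"]]
        possible += weight
    # With no applicable rules there is nothing to penalise.
    return round(100 * earned / possible, 1) if possible else 100.0
```

&lt;p&gt;Under these assumed weights, a single critical failure costs four times as much as a low-severity one, which is the behaviour the kitchen and dining-room examples describe.&lt;/p&gt;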

&lt;p&gt;The full rule list, with &lt;code&gt;rule_id&lt;/code&gt;, severity, and one-line description, is on the &lt;a href="https://pixelapi.dev/docs/vastu.html#the-21-rules-at-a-glance" rel="noopener noreferrer"&gt;docs page&lt;/a&gt;. The whole engine is also pure Python with zero non-stdlib dependencies. It is rule-based geometry rather than a model, so it's deterministic: same input, same output, no model variance. Forty-six unit tests cover the zone math, every individual rule, the input parser, and the DXF exporter's structural validity. Every deploy must pass all forty-six.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you submit
&lt;/h2&gt;

&lt;p&gt;A single JSON object. All distances in feet. Origin at the south-west corner of the plot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"facing"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"north"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"plot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"width_ft"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"depth_ft"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rooms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kitchen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"master_bedroom"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bedroom"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pooja_room"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"toilet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"living_room"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"w"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"h"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"main_entrance"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;58&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"water_tank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The room names that map to specific rules: &lt;code&gt;kitchen&lt;/code&gt;, &lt;code&gt;dining_room&lt;/code&gt;, &lt;code&gt;living_room&lt;/code&gt;, &lt;code&gt;drawing_room&lt;/code&gt;, &lt;code&gt;master_bedroom&lt;/code&gt;, &lt;code&gt;bedroom&lt;/code&gt;, &lt;code&gt;children_bedroom&lt;/code&gt;, &lt;code&gt;guest_room&lt;/code&gt;, &lt;code&gt;bathroom&lt;/code&gt;, &lt;code&gt;toilet&lt;/code&gt;, &lt;code&gt;pooja_room&lt;/code&gt;, &lt;code&gt;prayer_room&lt;/code&gt;, &lt;code&gt;study&lt;/code&gt;, &lt;code&gt;office&lt;/code&gt;, &lt;code&gt;home_office&lt;/code&gt;, &lt;code&gt;storage&lt;/code&gt;, &lt;code&gt;store_room&lt;/code&gt;, &lt;code&gt;garage&lt;/code&gt;, &lt;code&gt;staircase&lt;/code&gt;. Anything else is accepted but won't trigger zone-specific rules — it'll still count toward the centre-of-plot occupancy check.&lt;/p&gt;
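
&lt;p&gt;Because unrecognised names are accepted silently, a typo such as &lt;code&gt;kichen&lt;/code&gt; would skip the kitchen rules without any error. A small client-side check can catch that before you send the plan. This is a sketch: the name list is copied from the paragraph above, &lt;code&gt;split_rooms&lt;/code&gt; is a hypothetical helper, and exact, case-sensitive matching is an assumption about how the API compares names.&lt;/p&gt;

```python
# Room names that map to specific rules, as listed above.
RULE_ROOM_NAMES = {
    "kitchen", "dining_room", "living_room", "drawing_room",
    "master_bedroom", "bedroom", "children_bedroom", "guest_room",
    "bathroom", "toilet", "pooja_room", "prayer_room", "study",
    "office", "home_office", "storage", "store_room", "garage",
    "staircase",
}


def split_rooms(rooms):
    """Split rooms into (rule-mapped, other) lists by their name."""
    mapped = [r for r in rooms if r["name"] in RULE_ROOM_NAMES]
    other = [r for r in rooms if r["name"] not in RULE_ROOM_NAMES]
    return mapped, other
```

&lt;p&gt;Anything that ends up in the second list is worth a glance: it is either a room type with no zone rule, which is fine, or a misspelling.&lt;/p&gt;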

&lt;h2&gt;
  
  
  What you get back
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/vastu/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; @my-plan.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bucket"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"excellent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Strong Vastu alignment overall."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rule_counts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"fail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"warning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"na"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"findings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"rule_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kitchen_se"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kitchen in Southeast (Agneya)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kitchen is in SE — ideal Agneya placement (fire element)."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"rule_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kitchen_not_ne"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kitchen NOT in Northeast"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"critical"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kitchen is in SE, not NE — good."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;19&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;more&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;entries&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The interesting field is &lt;code&gt;findings&lt;/code&gt; — one entry per rule. When a rule fails or warns, the &lt;code&gt;suggestion&lt;/code&gt; field carries a concrete fix:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Kitchen is in NW — acceptable alternative if SE not feasible.&lt;br&gt;
&lt;em&gt;suggestion:&lt;/em&gt; Move kitchen to SE if you can; NW is the second-best option.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Use this directly in a UI. Loop over &lt;code&gt;findings&lt;/code&gt;, group by status, render the failures in red with their suggestion in italics underneath. The compliance review writes itself.&lt;/p&gt;
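
&lt;p&gt;A minimal sketch of that loop in Python. It assumes the status strings are &lt;code&gt;pass&lt;/code&gt;, &lt;code&gt;warn&lt;/code&gt; and &lt;code&gt;fail&lt;/code&gt; (only &lt;code&gt;pass&lt;/code&gt; appears in the sample response; the other two are guesses), the helper names are made up for illustration, and plain text stands in for the red-and-italics styling:&lt;/p&gt;

```python
from collections import defaultdict

# Abridged sample data in the shape of the `findings` array.
# The "warn" entry is hypothetical, modelled on the NW-kitchen message quoted above.
findings = [
    {"rule_id": "kitchen_not_ne", "name": "Kitchen NOT in Northeast",
     "status": "pass", "severity": "critical",
     "detail": "Kitchen is in SE, not NE.", "suggestion": ""},
    {"rule_id": "kitchen_se", "name": "Kitchen in Southeast (Agneya)",
     "status": "warn", "severity": "high",
     "detail": "Kitchen is in NW, acceptable alternative if SE not feasible.",
     "suggestion": "Move kitchen to SE if you can; NW is the second-best option."},
]


def group_by_status(items):
    """Bucket findings by their `status` field."""
    groups = defaultdict(list)
    for item in items:
        groups[item["status"]].append(item)
    return dict(groups)


def render_text(groups, order=("fail", "warn", "pass")):
    """Render failures first, then warnings, then passes, with suggestions indented."""
    lines = []
    for status in order:
        for item in groups.get(status, []):
            lines.append(f"[{status.upper()}] {item['name']}: {item['detail']}")
            if item["suggestion"]:
                lines.append(f"    suggestion: {item['suggestion']}")
    return "\n".join(lines)


groups = group_by_status(findings)
print(render_text(groups))
```

&lt;p&gt;An empty &lt;code&gt;suggestion&lt;/code&gt; is skipped, so passing rules take one line each.&lt;/p&gt;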

&lt;h2&gt;
  
  
  The DXF — the part that makes this useful for actual architects
&lt;/h2&gt;

&lt;p&gt;If you want the same layout as a printable, editable plan, hit &lt;code&gt;/v1/vastu/export-dxf&lt;/code&gt; with the same payload and you get back the file. R2010 format, decimal feet (&lt;code&gt;$INSUNITS=2&lt;/code&gt;), opens in AutoCAD, DraftSight, LibreCAD, BricsCAD, and any other DXF-compatible CAD tool.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/vastu/export-dxf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; @my-plan.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; my-house.dxf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The DXF carries a sensible layer structure so you can toggle bits on and off in the Layer Manager:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PLOT&lt;/code&gt; — outer plot boundary.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;VASTU_ZONES&lt;/code&gt; — 3×3 dashed zone grid.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ZONE_LABELS&lt;/code&gt; — each cell labelled with cardinal + Sanskrit name.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ROOMS_PASS&lt;/code&gt; (green), &lt;code&gt;ROOMS_WARN&lt;/code&gt; (yellow), &lt;code&gt;ROOMS_FAIL&lt;/code&gt; (red), &lt;code&gt;ROOMS_NA&lt;/code&gt; (white) — colour-coded room polylines so a glance tells you what to fix.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ROOM_LABELS&lt;/code&gt; — name + zone tag + status per room.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ENTRANCE&lt;/code&gt; — triangle marker pointing inward from the wall the door is on.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FEATURES&lt;/code&gt; — water tank and septic tank as labelled circles.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REPORT&lt;/code&gt; — score, bucket, Issues list, Recommendations list, in the right margin so the compliance summary prints on the same sheet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wrote tests for this part. Round-trip the DXF (write → read back → write again) and you should get an equivalent file with the same layers and same entity counts. We assert that. We also assert that every room polyline's bounding box matches the input &lt;code&gt;Room&lt;/code&gt; rectangle within one foot — so the geometry actually round-trips.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we're shipping the rules engine separately from a floor-plan generator
&lt;/h2&gt;

&lt;p&gt;If you've been following the Indian architecture-AI space, you know there are tools that promise to &lt;em&gt;generate&lt;/em&gt; floor plans from text prompts. Most of them are wrappers around image diffusion models. They produce something that &lt;em&gt;looks like&lt;/em&gt; a floor plan from across the room, but the labels are gibberish, the walls aren't to scale, and you can't export to DXF because there's no underlying vector representation — it's pixels all the way down.&lt;/p&gt;

&lt;p&gt;We had a go at that earlier this month using one of the popular image generators. The output was, predictably, a watercolour render of a floor plan with garbled annotations. Beautiful for a marketing slide, useless for an architect.&lt;/p&gt;

&lt;p&gt;So we did the boring thing instead: we wrote a real rules engine that takes structured input. It scores layouts deterministically, and it emits real CAD geometry. If you want to pair it with a generator, you can — anything that can output rectangles works as input. RPLAN-CGAN, HouseGAN++, your own CSV, your own pencil-on-graph-paper sketch transcribed by hand. The compliance check is the same.&lt;/p&gt;

&lt;p&gt;Image-to-layout extraction (so you can upload a JPEG of a hand-drawn plan and have it parsed into the JSON shape) is on the roadmap. That's a separate piece — OCR plus room segmentation. We'll ship it when the parsing is reliable. In the meantime, structured input is what you'd be feeding any generator anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Free during beta. The rules engine is pure Python doing some arithmetic on rectangles — it costs nothing to run, so there's no credit deduction. Rate-limited to 30 requests / minute / IP, which is more than any reasonable architect's working session needs.&lt;/p&gt;

&lt;p&gt;If you build something on top of it, please do let us know — &lt;code&gt;support@pixelapi.dev&lt;/code&gt;. We'd especially like to hear about Vastu rules we missed (the engine is currently at twenty-one but the tradition has hundreds, prioritised differently across regions). Adding a rule is twenty lines of Python and a unit test.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it without writing code
&lt;/h2&gt;

&lt;p&gt;Sign in to your dashboard at &lt;a href="https://pixelapi.dev" rel="noopener noreferrer"&gt;pixelapi.dev&lt;/a&gt; and click &lt;strong&gt;🪔 Vastu Compliance&lt;/strong&gt; in the left sidebar. There's a small editor with a sample 2BHK payload, an Analyze button that prints the per-rule findings inline, and a Download DXF button that gives you a fresh AutoCAD-openable file. No API key needed when you're using the dashboard.&lt;/p&gt;

&lt;p&gt;Full API reference: &lt;a href="https://pixelapi.dev/docs/vastu.html" rel="noopener noreferrer"&gt;pixelapi.dev/docs/vastu.html&lt;/a&gt;.&lt;br&gt;
Step-by-step tutorial with curl/Python/JS: &lt;a href="https://pixelapi.dev/tutorials/vastu-compliance.html" rel="noopener noreferrer"&gt;pixelapi.dev/tutorials/vastu-compliance.html&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>autocad</category>
      <category>india</category>
    </item>
    <item>
      <title>Photo to 3D-render video in one API call: meet Lensora Studio</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Fri, 08 May 2026 05:29:25 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/photo-to-3d-render-video-in-one-api-call-meet-lensora-studio-21ko</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/photo-to-3d-render-video-in-one-api-call-meet-lensora-studio-21ko</guid>
      <description>&lt;p&gt;A single photo goes in. An MP4 of the subject as a real 3D object — turning, dollying, or sweeping past the camera on a brand-new background — comes out. Two HTTP calls. Eighty credits. About four minutes of wall-clock.&lt;/p&gt;

&lt;p&gt;That is the brief for &lt;strong&gt;&lt;a href="https://pixelapi.dev/tools/lensora-studio" rel="noopener noreferrer"&gt;Lensora Studio&lt;/a&gt;&lt;/strong&gt;, the newest endpoint on PixelAPI. This post walks through what it does, the design choices behind it, and the slab-shaped detour we took to get the 3D step right.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhrb4juaw856jb04lb4y.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhrb4juaw856jb04lb4y.jpg" alt="Lensora Studio output: a vintage twin-lens reflex camera rendered as a 3D model, sitting on a marble countertop in a sunlit kitchen. The whole scene was synthesized from one photo of the camera." width="768" height="768"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A real Rolleiflex photo went in. This is one frame of the turntable MP4 that came out — the kitchen background was generated from a one-line prompt.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What it does, end to end
&lt;/h2&gt;

&lt;p&gt;You hand the API a photo. It does four things back to back:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Detect.&lt;/strong&gt; Object detection returns up to eight foreground proposals (bounding box, label, category) so a user can pick which thing to transform. Useful for messy frames, packshots that include props, or cases where detection over-segments a logo into pieces.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cut and rebackground.&lt;/strong&gt; The chosen subject is segmented, and you choose what sits behind it: leave it transparent, drop in your own backplate URL, or describe the scene in plain English ("on a marble countertop with soft natural light").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3D.&lt;/strong&gt; The cropped subject is rebuilt as a full 360° mesh with PBR textures. Real volume, real depth — not a flat plate that pretends to rotate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Render.&lt;/strong&gt; The mesh is composited over your new background and rendered as a 24 fps MP4 from one of three camera moves: &lt;code&gt;turntable&lt;/code&gt; (full 360°), &lt;code&gt;dolly&lt;/code&gt; (straight zoom-in), or &lt;code&gt;cinematic&lt;/code&gt; (180° arc with depth-of-field).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You get back four artifacts every time: the hero MP4, a downloadable GLB you can drop into Blender / Unity / Three.js, a static composited still, and the alpha-cutout PNG.&lt;/p&gt;




&lt;h2&gt;
  
  
  The two-call shape
&lt;/h2&gt;

&lt;p&gt;Step one is a multipart upload that returns object proposals plus a &lt;code&gt;session_id&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/studio/init &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$PIXELAPI_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"image=@product.jpg"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"session_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2a91884c-..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"objects"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vintage twin-lens reflex camera"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"product"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"bbox"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.79&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.93&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"entire image (no crop)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"full_frame"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"bbox"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"credits_used"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step two picks an object, picks a background, picks a camera, and returns a &lt;code&gt;job_id&lt;/code&gt; immediately while the pipeline runs in the background:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/studio/transform &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$PIXELAPI_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "session_id": "2a91884c-...",
    "object_index": 0,
    "background": {"type": "prompt", "prompt": "on a marble countertop with soft natural light"},
    "camera_preset": "cinematic"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You poll &lt;code&gt;/v1/studio/result/{job_id}&lt;/code&gt; every few seconds. The &lt;code&gt;step&lt;/code&gt; field walks through &lt;code&gt;cropping → removing-bg → generating-bg → compositing → generating-3d → rendering-video → done&lt;/code&gt; so you can show real progress in your UI.&lt;/p&gt;

&lt;p&gt;The full Python example is in &lt;a href="https://pixelapi.dev/docs/lensora-studio.html#full-example" rel="noopener noreferrer"&gt;the docs&lt;/a&gt; — sub-fifty lines including the polling loop and the GLB download.&lt;/p&gt;
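
&lt;p&gt;For orientation, here is a minimal sketch of just the polling part, using only the standard library. It assumes the result endpoint takes the same Bearer header as the other two calls and returns JSON with a &lt;code&gt;step&lt;/code&gt; field. The function names, the 3-second interval and the poll cap are illustrative choices, and error responses are not handled because their shape is not described here:&lt;/p&gt;

```python
import json
import time
import urllib.request

API_BASE = "https://api.pixelapi.dev"


def fetch_result(job_id, api_key):
    """GET /v1/studio/result/{job_id} and return the parsed JSON body.

    Assumption: this endpoint accepts the same Authorization header as
    /v1/studio/init and /v1/studio/transform.
    """
    req = urllib.request.Request(
        f"{API_BASE}/v1/studio/result/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(fetch, interval=3.0, max_polls=120, on_step=print):
    """Call `fetch()` until its `step` field reads "done".

    `on_step` fires once per step change, which is enough to drive a progress UI.
    """
    last_step = None
    for _ in range(max_polls):
        result = fetch()
        step = result.get("step")
        if step != last_step:
            on_step(step)
            last_step = step
        if step == "done":
            return result
        time.sleep(interval)
    raise TimeoutError(f"job still at step {last_step!r} after {max_polls} polls")
```

&lt;p&gt;Usage would look like &lt;code&gt;wait_for_job(lambda: fetch_result(job_id, api_key))&lt;/code&gt;. Passing the fetcher in as a callable keeps the loop testable without touching the network.&lt;/p&gt;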




&lt;h2&gt;
  
  
  The slab problem
&lt;/h2&gt;

&lt;p&gt;Here is the part that ate two days.&lt;/p&gt;

&lt;p&gt;The first version of the 3D step worked fine on the simple smoke tests we had. The output looked solid in catalog-style shots — clean object, isolated against a backdrop. So we shipped the canary and ran an end-to-end test on a Rolleiflex camera photo we had been using as a reference image for half a year.&lt;/p&gt;

&lt;p&gt;The turntable opened on the front of the camera. Beautiful. Then it rotated 90°. And we saw a sliver. A &lt;em&gt;thin&lt;/em&gt; sliver — barely visible at this angle. The model was a slab.&lt;/p&gt;

&lt;p&gt;We measured. The thin axis of the bounding box was 15.7% of the longest axis. For a Rolleiflex — a roughly cube-shaped object that should be near 1:1:1 — that is a flat pancake.&lt;/p&gt;

&lt;p&gt;The catch: from the front it looked perfect. The model had taken the input photo and built something that was mostly an extruded postcard. Texture was sharp on the front face, geometry was almost zero on the others. All three of our camera presets (turntable, dolly, cinematic) would eventually expose the slab, back-on or edge-on. We were going to ship a beautiful product gallery for thirty seconds, then a five-minute argument with a customer.&lt;/p&gt;

&lt;p&gt;So we did the thing we did not want to do. We swapped the 3D engine for one that uses sparse-structure flow over a 3D occupancy grid instead of single-image-from-front extrusion. Re-validated on the same canary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Old engine, thin axis:&lt;/strong&gt; 15.7% of longest. Slab.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New engine, thin axis:&lt;/strong&gt; 59.9% of longest. Real 3D.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a 3.8× depth recovery. Above 40% — our internal threshold for "this is a 3D shape, not a 3D image." And visually unmistakable:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv08hhfcjc5vox5l1vtr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv08hhfcjc5vox5l1vtr.jpg" alt="Same Rolleiflex camera, viewed from a different angle in the turntable render. The full chunky body is visible — top, side controls, depth — proving the 3D output has real volume rather than being a flat plate that rotates." width="768" height="768"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Same camera, side-on profile. Real depth, side controls visible, no slab artifact. This is the cinematic preset mid-arc.&lt;/em&gt;&lt;/p&gt;
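
&lt;p&gt;The thin-axis number is cheap to compute yourself on a downloaded GLB. A rough sketch, assuming an axis-aligned bounding box over the mesh vertices (the post does not say whether the internal check uses an axis-aligned or an oriented box). The two vertex lists are hypothetical stand-ins scaled to reproduce the ratios quoted above, not real mesh data:&lt;/p&gt;

```python
def thin_axis_ratio(vertices):
    """Shortest bounding-box extent divided by the longest (1.0 means cube-like)."""
    xs, ys, zs = zip(*vertices)
    extents = [max(axis) - min(axis) for axis in (xs, ys, zs)]
    return min(extents) / max(extents)


# Two opposite corners are enough to define a box for this illustration.
slab = [(0.0, 0.0, 0.0), (1.0, 0.9, 0.157)]
solid = [(0.0, 0.0, 0.0), (1.0, 0.9, 0.599)]

print(f"{thin_axis_ratio(slab):.1%}")    # 15.7%
print(f"{thin_axis_ratio(solid):.1%}")   # 59.9%
print(round(thin_axis_ratio(solid) / thin_axis_ratio(slab), 1))  # 3.8
```

&lt;p&gt;A degenerate mesh with zero extent on every axis would divide by zero here, so a real check would guard against that first.&lt;/p&gt;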




&lt;h2&gt;
  
  
  Auto-orient and arc-clamp
&lt;/h2&gt;

&lt;p&gt;We added one more thing for safety. Before each render, the renderer now measures the bounding-box extents of the mesh and rotates it so the largest face points square to the camera at angle 0. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The dolly preset never goes edge-on regardless of how the source photo was framed.&lt;/li&gt;
&lt;li&gt;The turntable preset always opens on the hero face, then sweeps around.&lt;/li&gt;
&lt;li&gt;The cinematic preset gets a clean front-on opening shot before the arc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And as a defensive belt: if any future input ever produces a thin-axis ratio under 30% despite the new 3D engine, the renderer falls back to a ±60° rocking arc instead of a full sweep, so a slab — &lt;em&gt;if&lt;/em&gt; one ever sneaks back in — can never be visible from a bad angle.&lt;/p&gt;

&lt;p&gt;The auto-orient pass costs us nothing — it's a single 4×4 transform on the mesh — and it papers over a class of bugs we'd otherwise have to debug per-input.&lt;/p&gt;
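
&lt;p&gt;To make the arc-clamp concrete, here is a sketch of how a per-frame camera yaw schedule could switch between a full sweep and the rocking fallback. The 30% threshold, the 60-degree limit and the 4-second, 24 fps clip length come from this post. The sine easing and the function name are assumptions for illustration and not necessarily what the renderer does:&lt;/p&gt;

```python
import math

FPS = 24
CLIP_SECONDS = 4
SLAB_THRESHOLD = 0.30   # thin-axis ratio below which the full sweep is abandoned
ROCK_DEGREES = 60.0     # half-width of the fallback rocking arc


def turntable_yaws(thin_ratio, frames=FPS * CLIP_SECONDS):
    """Per-frame camera yaw in degrees; yaw 0 looks square at the largest face."""
    if thin_ratio >= SLAB_THRESHOLD:
        # Healthy mesh: one full revolution spread evenly across the clip.
        return [360.0 * i / frames for i in range(frames)]
    # Suspected slab: rock between -60 and +60 degrees so the thin edge
    # never turns toward the camera. Sine easing is an assumed choice.
    return [ROCK_DEGREES * math.sin(2 * math.pi * i / frames) for i in range(frames)]


full = turntable_yaws(0.599)
clamped = turntable_yaws(0.157)
print(len(full), full[-1])
print(max(abs(yaw) for yaw in clamped))
```

&lt;p&gt;Both branches start at yaw 0, so the opening frame is the same hero face whether or not the clamp kicks in.&lt;/p&gt;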




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Credits&lt;/th&gt;
&lt;th&gt;USD&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;/v1/studio/init&lt;/code&gt; (detect)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;$0.005&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;/v1/studio/transform&lt;/code&gt; (full pipeline)&lt;/td&gt;
&lt;td&gt;75&lt;/td&gt;
&lt;td&gt;$0.075&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;End-to-end&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;80&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.08&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;No subscription. The transform credits are auto-refunded on any failure or timeout in the pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to use this
&lt;/h2&gt;

&lt;p&gt;Lensora Studio is a fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have product photos and you want catalog rotation videos without Blender.&lt;/li&gt;
&lt;li&gt;You're building an e-commerce listing tool and rotation MP4s lift conversion.&lt;/li&gt;
&lt;li&gt;You want a quick "turn this into a 3D-render ad" button and don't want to chain four separate APIs yourself.&lt;/li&gt;
&lt;li&gt;You're prototyping AR previews and need GLB files alongside an MP4 thumbnail.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is &lt;strong&gt;not&lt;/strong&gt; a fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The subject is a person, an animal, or flat-lay clothing. The 3D engine needs a discrete, rigid object — packshots, tools, electronics, accessories work great. People and pets do not, in this version.&lt;/li&gt;
&lt;li&gt;You need 4K. Output is 768×768 today. Higher resolutions are on the roadmap.&lt;/li&gt;
&lt;li&gt;You want a 30-second video. Each preset is 4 seconds at 24 fps. You can chain renders for longer pieces, but the unit deliverable is short.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Browser tool:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/tools/lensora-studio" rel="noopener noreferrer"&gt;pixelapi.dev/tools/lensora-studio&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API docs:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/docs/lensora-studio.html" rel="noopener noreferrer"&gt;pixelapi.dev/docs/lensora-studio.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sign up:&lt;/strong&gt; &lt;a href="https://pixelapi.dev" rel="noopener noreferrer"&gt;pixelapi.dev&lt;/a&gt; — free credits to try the flow end-to-end.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you hit edge cases (interesting geometry, unusual subjects, a corner case our slab-detector missed), I'd love to hear about them. The canary that exposed the original slab was a Rolleiflex sitting on a desk for unrelated reasons — sometimes the bug only shows up on the photo you weren't expecting to test against.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>webdev</category>
      <category>python</category>
    </item>
    <item>
      <title>Image Captioning API: Auto-Generate Alt Text and Descriptions</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Thu, 07 May 2026 10:02:29 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/image-captioning-api-auto-generate-alt-text-and-descriptions-34ib</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/image-captioning-api-auto-generate-alt-text-and-descriptions-34ib</guid>
      <description>&lt;h1&gt;
  
  
  Image Captioning API: Auto-Generate Alt Text and Descriptions
&lt;/h1&gt;

&lt;p&gt;Most product catalogues, content feeds, and media libraries have one quiet shame: thousands of images with empty &lt;code&gt;alt=""&lt;/code&gt; attributes, no search metadata, and no human-readable description. Writing them by hand does not scale. Here is the endpoint that makes that problem go away.&lt;/p&gt;

&lt;p&gt;Today we are launching &lt;code&gt;POST /v1/image/caption&lt;/code&gt; — a single endpoint that turns any public image URL into a caption tuned for the job you actually have: accessibility alt-tags, SEO descriptions, or full paragraph-length narration.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;Send a public image URL. Get back text. That is the whole shape of the API.&lt;/p&gt;

&lt;p&gt;What makes it useful is the &lt;code&gt;style&lt;/code&gt; parameter. The same endpoint can produce three very different outputs from the same image, depending on what you are building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;concise&lt;/code&gt;&lt;/strong&gt; (default) — tight, alt-text-shaped output. The kind of single sentence you want sitting inside an &lt;code&gt;alt&lt;/code&gt; attribute. Screen-reader friendly, no fluff, no marketing language.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;seo&lt;/code&gt;&lt;/strong&gt; — keyword-rich descriptions written for indexing. Think product page meta-descriptions, image search optimisation, structured-data fields. Longer than alt text, denser with terms a search engine actually picks up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;detailed&lt;/code&gt;&lt;/strong&gt; — paragraph-length narration. For accessibility contexts where someone genuinely needs to &lt;em&gt;understand&lt;/em&gt; the image, not just identify it. Also useful when you want a content-moderation reviewer to triage uploads without opening every preview.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full request shape is small:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Required&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;image_url&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Public URL of the image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;style&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;&lt;code&gt;concise&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;One of &lt;code&gt;concise&lt;/code&gt;, &lt;code&gt;detailed&lt;/code&gt;, &lt;code&gt;seo&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;&lt;code&gt;64&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Length cap, range 32–256&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That is the entire surface. No model selection, no temperature knobs, no system-prompt plumbing. You picked an endpoint called "caption" — we figured the job out so you do not have to.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;max_tokens&lt;/code&gt; is a hard ceiling, not a target. A &lt;code&gt;concise&lt;/code&gt; request with &lt;code&gt;max_tokens: 256&lt;/code&gt; will not waste tokens producing fluff; the style governs length, the cap is just there to protect you from runaway output in edge cases (extremely busy images, weird aspect ratios, etc.).&lt;/p&gt;
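
&lt;p&gt;If you would rather catch bad parameters before spending a request, a small client-side builder can mirror the table above. This is a sketch: the function name is made up, and how the server itself treats out-of-range values is not stated here, so the checks below are simply the documented limits applied locally:&lt;/p&gt;

```python
STYLES = ("concise", "detailed", "seo")
MAX_TOKENS_RANGE = range(32, 257)   # documented range is 32 to 256 inclusive


def build_caption_payload(image_url, style="concise", max_tokens=64):
    """Return the JSON body for POST /v1/image/caption, or raise ValueError."""
    if style not in STYLES:
        raise ValueError(f"style must be one of {STYLES}, got {style!r}")
    if max_tokens not in MAX_TOKENS_RANGE:
        raise ValueError(f"max_tokens must be between 32 and 256, got {max_tokens!r}")
    return {"image_url": image_url, "style": style, "max_tokens": max_tokens}


payload = build_caption_payload("https://example.com/source.jpg", style="seo", max_tokens=128)
print(payload)
```

&lt;p&gt;Calling it with only a URL gives you the documented defaults, &lt;code&gt;concise&lt;/code&gt; and &lt;code&gt;64&lt;/code&gt;.&lt;/p&gt;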

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;Image captioning is one of those features where the gap between "I could build this in a weekend" and "I have a production-grade pipeline serving 10k catalogue images a night" is enormous.&lt;/p&gt;

&lt;p&gt;The weekend version is a Jupyter notebook calling someone's hosted model. It works on the demo image. It falls over the moment you point it at a real product feed: the URLs are sometimes 404, sometimes a redirect, sometimes a 50MB raw camera dump. Half the captions sound like art-gallery placards ("a serene composition featuring…") when you wanted alt text. The other half are three words long when you wanted a paragraph. You spend a week writing prompt scaffolding, retry logic, content filters, and length normalisation, and you still have not shipped the actual feature you set out to build.&lt;/p&gt;

&lt;p&gt;The production version requires a team. We have already built that team's worth of work. The API in front of it is what we are shipping.&lt;/p&gt;

&lt;p&gt;The other thing we kept hearing: &lt;strong&gt;one caption style does not fit every job.&lt;/strong&gt; The alt-text you want on &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tags is not the description you want in your &lt;code&gt;og:image&lt;/code&gt; meta tag, and neither of those is what you want feeding into a moderation reviewer queue. Most APIs make you pick one shape and live with it, then post-process the output to fake the other two. We exposed the three styles directly because that is how the work actually splits in real codebases.&lt;/p&gt;

&lt;p&gt;The angle, in plain terms: a purpose-built captioning endpoint, three styles in one call, priced flat-rate per request, no token accounting on your side.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quickstart
&lt;/h2&gt;

&lt;p&gt;Get an API key from the dashboard, then:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.pixelapi.dev/v1/image/caption &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"image_url": "https://example.com/source.jpg", "style": "seo"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response is JSON with a caption string. That is it.&lt;/p&gt;

&lt;p&gt;Same call in Python with &lt;code&gt;requests&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/image/caption&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/source.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;style&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;seo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# decoded JSON body containing the caption string
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few notes worth flagging up front:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The &lt;code&gt;image_url&lt;/code&gt; must be publicly reachable.&lt;/strong&gt; If your images live behind a signed-URL CDN, generate a short-lived signed URL and pass that. We fetch the bytes server-side; we cannot reach a URL that requires your session cookie.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;style&lt;/code&gt; is the lever you reach for first.&lt;/strong&gt; If a caption feels off, change the style before you start fiddling with &lt;code&gt;max_tokens&lt;/code&gt;. The style governs &lt;em&gt;register&lt;/em&gt; (alt-text voice vs. SEO voice vs. narration voice); the token cap only governs length.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set a sensible client timeout.&lt;/strong&gt; 30 seconds is comfortable for a single call; if you are batching, see the use-case section below for parallelism guidance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are wiring this into a job queue, treat each call as independent. There is no statefulness across requests — captioning image A does not influence the caption for image B. That makes parallelism trivial: fan out as many concurrent requests as your account's rate limit allows.&lt;/p&gt;
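&lt;p&gt;A minimal sketch of that fan-out, assuming a thread pool is an acceptable way to parallelise independent calls. It uses only the standard library so it runs without extra dependencies; &lt;code&gt;caption_one&lt;/code&gt;, &lt;code&gt;caption_all&lt;/code&gt; and the &lt;code&gt;max_workers&lt;/code&gt; default are illustrative names and values, not part of the PixelAPI reference. Tune the worker count to your own rate limit.&lt;/p&gt;

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_URL = "https://api.pixelapi.dev/v1/image/caption"
API_KEY = "YOUR_API_KEY"  # placeholder, as in the examples above


def caption_one(image_url, style="concise"):
    """Caption a single image and return the decoded JSON response."""
    body = json.dumps({"image_url": image_url, "style": style}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def caption_all(image_urls, caption_fn=caption_one, max_workers=8):
    """Fan out one independent call per URL; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(caption_fn, image_urls))
```

&lt;p&gt;Because &lt;code&gt;pool.map&lt;/code&gt; returns results in input order, each result lines up with the URL that produced it, and an exception from any single call surfaces when its result is read.&lt;/p&gt;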

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Bulk-generate alt text for an ecommerce catalogue (10k+ products)
&lt;/h3&gt;

&lt;p&gt;This is the canonical job. You have a product database, every row has one or more image URLs, and the &lt;code&gt;alt_text&lt;/code&gt; column is either empty or full of the SKU. Lighthouse is screaming. Your accessibility consultant has filed a report. Someone on the marketing side has noticed that Google Image search is sending you nothing.&lt;/p&gt;

&lt;p&gt;The shape of the migration is straightforward: read product images out of the database in batches, fan out parallel calls to &lt;code&gt;/v1/image/caption&lt;/code&gt; with &lt;code&gt;style: "concise"&lt;/code&gt; for the on-page &lt;code&gt;alt&lt;/code&gt; attribute, then make a second pass with &lt;code&gt;style: "seo"&lt;/code&gt; for the meta-description and structured-data fields. That is two calls per product, sixteen credits at 8 credits per call, and the entire 10k catalogue gets done overnight. The next morning you re-run Lighthouse and the accessibility score has moved 20+ points without anyone writing a single sentence by hand. The SEO improvement is harder to measure on day one but shows up in the indexing reports a few weeks later, when image-search referrals start trending up. This is the use case that pays for the integration on its own.&lt;/p&gt;
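&lt;p&gt;The batching and two-pass plan can be sketched as follows. This is an illustration only: the &lt;code&gt;batches&lt;/code&gt; and &lt;code&gt;plan_jobs&lt;/code&gt; helpers, the &lt;code&gt;(product_id, image_url)&lt;/code&gt; row shape and the batch size are assumptions, and the sketch builds request payloads without sending them.&lt;/p&gt;

```python
from itertools import islice

CREDITS_PER_CALL = 8         # flat rate quoted in the pricing section
PASSES = ("concise", "seo")  # alt attribute first, then meta description


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def plan_jobs(products):
    """Return one (product_id, payload) pair per product and style."""
    return [
        (product_id, {"image_url": image_url, "style": style})
        for product_id, image_url in products
        for style in PASSES
    ]


# Hypothetical rows standing in for a real product table.
products = [
    (1, "https://example.com/a.jpg"),
    (2, "https://example.com/b.jpg"),
    (3, "https://example.com/c.jpg"),
]
for chunk in batches(products, size=2):
    jobs = plan_jobs(chunk)
    print(len(jobs), "calls,", len(jobs) * CREDITS_PER_CALL, "credits")
```

&lt;p&gt;Keeping the product id next to each payload makes it easy to write the caption back to the right row once the call returns.&lt;/p&gt;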

&lt;h3&gt;
  
  
  Auto-tag user uploads in an image-sharing app for search
&lt;/h3&gt;

&lt;p&gt;User-generated content has a tagging problem: nobody tags their own uploads properly. They write "IMG_4823" as the title, no description, no tags, and then complain that search inside your app does not work. You cannot force users to write metadata — you have tried, the conversion drops every time you add a required field.&lt;/p&gt;

&lt;p&gt;Solve it on the server. When an upload finishes, kick off a background job that hits &lt;code&gt;/v1/image/caption&lt;/code&gt; with &lt;code&gt;style: "detailed"&lt;/code&gt; to get a full sentence or two describing what is in the image. Index that text into your existing search engine — Postgres full-text, Meilisearch, Elastic, whatever you already have. Now "sunset over a mountain lake" finds the user upload that the user titled "IMG_4823.jpg". The user did nothing extra; their content became searchable. You can also display the detailed caption as default alt-text on the image so screen-reader users get a real description, not a filename. One API call, two product wins.&lt;/p&gt;
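&lt;p&gt;To make the indexing step concrete, here is a toy in-memory inverted index standing in for whichever search engine you already run. The &lt;code&gt;CaptionIndex&lt;/code&gt; class and the caption strings are hypothetical; a real deployment would hand the caption text to Postgres full-text, Meilisearch or Elastic instead.&lt;/p&gt;

```python
import re
from collections import defaultdict


class CaptionIndex:
    """Toy inverted index: word -> set of upload ids whose caption has it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, upload_id, caption):
        for token in self._tokens(caption):
            self._postings[token].add(upload_id)

    def search(self, query):
        """Return ids whose caption contains every word in the query."""
        tokens = self._tokens(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result.intersection_update(self._postings.get(token, set()))
        return result


index = CaptionIndex()
# Hypothetical caption text standing in for real API responses.
index.add("IMG_4823.jpg", "A sunset over a mountain lake with pine trees.")
index.add("IMG_4824.jpg", "A dog running on a beach.")
print(index.search("sunset over a mountain lake"))
```

&lt;p&gt;The filename never appears in the query; the match comes entirely from the generated caption.&lt;/p&gt;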

&lt;h3&gt;
  
  
  Build a moderation queue with auto-described preview labels
&lt;/h3&gt;

&lt;p&gt;If you run a moderation queue, you know the bottleneck: a human reviewer has to open each preview, register what is in it, decide, and move on. The "register what is in it" step is where seconds add up across thousands of items. Anything that can be turned into a one-line label &lt;em&gt;before&lt;/em&gt; the reviewer's eyes land on it is a direct throughput win.&lt;/p&gt;

&lt;p&gt;Hit &lt;code&gt;/v1/image/caption&lt;/code&gt; with &lt;code&gt;style: "concise"&lt;/code&gt; on every flagged upload as it enters the queue. The caption becomes a sortable, filterable label next to the thumbnail. Reviewers can now scan the queue text-first and only open previews where the caption is ambiguous or the policy call is genuinely hard. You are not replacing the human — image moderation is a human-judgement job and should stay that way — you are removing the obvious cases from their cognitive load. Reviewers go faster, the queue burns down sooner, and the borderline cases get more attention because the easy ones got triaged on the way in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Flat rate per call. No token accounting on your side, no surprise overages on long detailed captions vs. short alt-text — every call costs the same regardless of which &lt;code&gt;style&lt;/code&gt; you pick.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Credits per call:&lt;/strong&gt; 8&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price (INR):&lt;/strong&gt; ₹0.0054 per call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price (USD):&lt;/strong&gt; $0.00007 per call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To put that in catalogue terms: a 10,000-image alt-text run costs roughly ₹54 / $0.70. Two passes (one &lt;code&gt;concise&lt;/code&gt; for alt-text, one &lt;code&gt;seo&lt;/code&gt; for meta-descriptions) on the same catalogue is roughly ₹108 / $1.40. That is comfortably below the cost of a single hour of an engineer writing alt text by hand, for the entire catalogue.&lt;/p&gt;

&lt;p&gt;A 100k-image media library indexed for natural-language search costs roughly ₹540 / $7.00 as a one-time pass, plus whatever your incremental ingestion rate adds — usually rounding error.&lt;/p&gt;
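&lt;p&gt;The arithmetic behind those figures, as a small estimator you can adapt. The per-call rates are the ones listed above; the &lt;code&gt;estimate&lt;/code&gt; function itself is just an illustrative helper, and it uses &lt;code&gt;Decimal&lt;/code&gt; to avoid floating-point rounding on such small amounts.&lt;/p&gt;

```python
from decimal import Decimal

CREDITS_PER_CALL = 8
INR_PER_CALL = Decimal("0.0054")
USD_PER_CALL = Decimal("0.00007")


def estimate(images, passes=1):
    """Return (credits, INR, USD) for `passes` calls on each image."""
    calls = images * passes
    return calls * CREDITS_PER_CALL, calls * INR_PER_CALL, calls * USD_PER_CALL


for images, passes in [(10_000, 1), (10_000, 2), (100_000, 1)]:
    credits, inr, usd = estimate(images, passes)
    print(f"{images} images x {passes}: {credits} credits, INR {inr:.2f}, USD {usd:.2f}")
```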

&lt;p&gt;There is no minimum, no monthly commitment, no per-seat pricing. Top up credits, make calls, the meter runs down by 8 each request. If a request fails (we return a non-2xx), no credits are deducted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Grab an API key and start captioning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/dashboard" rel="noopener noreferrer"&gt;pixelapi.dev/dashboard&lt;/a&gt; — sign up, generate a key, top up credits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://pixelapi.dev/docs" rel="noopener noreferrer"&gt;pixelapi.dev/docs&lt;/a&gt; — full reference for &lt;code&gt;/v1/image/caption&lt;/code&gt; including all request fields, response shape, error codes, and rate limits.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are migrating off a hand-rolled captioning script or an older general-purpose vision API, the easiest way to evaluate is to run a hundred of your real images through all three styles and eyeball the output. Pick the one that fits your surface, wire it in, ship it. The whole integration is a single HTTP call — there is genuinely nothing else to learn.&lt;/p&gt;

&lt;p&gt;Build something useful with it. We will keep making the captions better underneath; you will not have to change a line of code when we do.&lt;/p&gt;

</description>
      <category>api</category>
      <category>a11y</category>
      <category>seo</category>
      <category>webdev</category>
    </item>
    <item>
      <title>FLUX Schnell vs SDXL: A Practical Comparison for Developers Who Need Reliable Image Generation</title>
      <dc:creator>Om Prakash</dc:creator>
      <pubDate>Thu, 07 May 2026 07:34:28 +0000</pubDate>
      <link>https://forem.com/om_prakash_3311f8a4576605/flux-schnell-vs-sdxl-a-practical-comparison-for-developers-who-need-reliable-image-generation-10kf</link>
      <guid>https://forem.com/om_prakash_3311f8a4576605/flux-schnell-vs-sdxl-a-practical-comparison-for-developers-who-need-reliable-image-generation-10kf</guid>
      <description>&lt;h1&gt;
  
  
  FLUX Schnell vs SDXL: A Practical Comparison for Developers Who Need Reliable Image Generation
&lt;/h1&gt;

&lt;p&gt;Both models generate images from text. Beyond that, they have almost nothing in common.&lt;/p&gt;

&lt;p&gt;Here's what actually matters when you're integrating one into a production application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SDXL&lt;/strong&gt; (Stable Diffusion XL): released 2023, with a 3.5B-parameter base model (6.6B parameters for the full base-plus-refiner pipeline), produces 1024×1024 natively. Massive ecosystem: LoRAs, ControlNet, inpainting, specialized checkpoints for every style. The most customizable text-to-image model available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FLUX Schnell&lt;/strong&gt;: released 2024 by Black Forest Labs, founded by members of the original Stable Diffusion team. A rectified-flow transformer trained with flow matching rather than a classic diffusion objective. 12B parameters. Generates in as few as 4 steps. Dramatically better prompt adherence and text rendering than any previous open model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Head-to-Head
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prompt Adherence
&lt;/h3&gt;

&lt;p&gt;SDXL is good at vibes. Give it "dark academic study with warm candlelight" and it nails the atmosphere. Ask it to put specific text on a book spine and it generates plausible-looking gibberish.&lt;/p&gt;

&lt;p&gt;FLUX Schnell actually reads your prompt. Compositional instructions ("three objects arranged left to right"), text rendering ("a sign that says OPEN"), and complex multi-subject scenes work reliably.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This works in FLUX. In SDXL, the text will be garbled.
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a coffee mug with the word FOCUS printed on it, white background, product photo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Speed
&lt;/h3&gt;

&lt;p&gt;FLUX Schnell: 4 inference steps. ~2-3 seconds on an RTX 4070.&lt;br&gt;
SDXL: 20-30 steps at similar quality. ~5-8 seconds.&lt;/p&gt;

&lt;p&gt;For anything real-time or user-facing, Schnell wins by a wide margin.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customization
&lt;/h3&gt;

&lt;p&gt;SDXL wins here comprehensively. Years of community fine-tuning mean there's a model for photorealism, anime, architecture, fashion, medical illustration, you name it. LoRA support means you can fine-tune on 20 images and get consistent brand characters.&lt;/p&gt;

&lt;p&gt;FLUX has fewer community extensions, though this is changing fast.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# placeholder: substitute your own key
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.pixelapi.dev/v1/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# surface HTTP errors instead of a confusing KeyError
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# FLUX for product/text prompts
&lt;/span&gt;&lt;span class="n"&gt;flux_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a minimal ceramic vase on a marble surface, studio lighting, product photography&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flux-schnell&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;API_KEY&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# SDXL for stylized/artistic content
&lt;/span&gt;&lt;span class="n"&gt;sdxl_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;oil painting, impressionist style, city street in rain, warm lamplight&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sdxl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;API_KEY&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both models are available on the same API endpoint — swap the &lt;code&gt;model&lt;/code&gt; parameter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision Guide
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Winner&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Product photography&lt;/td&gt;
&lt;td&gt;FLUX Schnell&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text in images&lt;/td&gt;
&lt;td&gt;FLUX Schnell&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portraits and people&lt;/td&gt;
&lt;td&gt;FLUX Schnell&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Artistic / stylized&lt;/td&gt;
&lt;td&gt;SDXL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brand-consistent output (fine-tuned)&lt;/td&gt;
&lt;td&gt;SDXL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time generation&lt;/td&gt;
&lt;td&gt;FLUX Schnell&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex scene composition&lt;/td&gt;
&lt;td&gt;FLUX Schnell&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
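&lt;p&gt;If you route requests programmatically, the table reduces to a lookup. The sketch below is one hedged way to encode it: the model ids are the ones used in the code example above, while &lt;code&gt;WINNERS&lt;/code&gt; and &lt;code&gt;choose_model&lt;/code&gt; are hypothetical names, and the fallback to FLUX Schnell is an assumption based on this post's recommendation to start there.&lt;/p&gt;

```python
# Use-case names come from the table; model ids from the code example above.
WINNERS = {
    "product photography": "flux-schnell",
    "text in images": "flux-schnell",
    "portraits and people": "flux-schnell",
    "artistic / stylized": "sdxl",
    "brand-consistent output (fine-tuned)": "sdxl",
    "real-time generation": "flux-schnell",
    "complex scene composition": "flux-schnell",
}


def choose_model(use_case, default="flux-schnell"):
    """Look up the table's pick, falling back to the default model."""
    return WINNERS.get(use_case.strip().lower(), default)


print(choose_model("Text in images"))
print(choose_model("Artistic / stylized"))
```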

&lt;h2&gt;
  
  
  Pricing Reality
&lt;/h2&gt;

&lt;p&gt;On major cloud platforms, FLUX 1.1 Pro runs about $0.04/image and SDXL runs around $0.006/image.&lt;/p&gt;

&lt;p&gt;PixelAPI charges 12 credits/image for Schnell and 25 credits/image for SDXL premium. At Starter plan rates, that's roughly 5-10x cheaper than cloud alternatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verdict
&lt;/h2&gt;

&lt;p&gt;If you're building a product where users generate images, start with FLUX Schnell. It's faster, more predictable, and handles the long tail of weird prompts better. Switch to SDXL when you need stylistic control or fine-tuning.&lt;/p&gt;

&lt;p&gt;If you're not sure: run 20 test generations with each on your actual prompts. The results will tell you more than any benchmark.&lt;/p&gt;
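&lt;p&gt;A small harness for that side-by-side run might look like the sketch below. It is deliberately decoupled from the network: &lt;code&gt;side_by_side&lt;/code&gt; is a hypothetical helper that takes any callable with the shape &lt;code&gt;generate_fn(prompt, model)&lt;/code&gt;, so you can pass a thin wrapper around your real API call or a stub while testing.&lt;/p&gt;

```python
MODELS = ("flux-schnell", "sdxl")  # model ids from the code example above


def side_by_side(prompts, generate_fn, models=MODELS):
    """Run every prompt through every model and collect the outputs."""
    return [
        {"prompt": prompt, **{model: generate_fn(prompt, model) for model in models}}
        for prompt in prompts
    ]


def fake_generate(prompt, model):
    # Offline stand-in so the sketch runs as-is; swap in a real API call.
    return f"{model}: {prompt}"


for row in side_by_side(["a sign that says OPEN"], fake_generate):
    print(row)
```

&lt;p&gt;Each row holds one prompt and one output per model, which is a convenient shape for dumping into a spreadsheet or an HTML page for review.&lt;/p&gt;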

&lt;p&gt;Try both at &lt;a href="https://pixelapi.dev" rel="noopener noreferrer"&gt;pixelapi.dev&lt;/a&gt; — 100 free credits included.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Both FLUX Schnell and SDXL run on dedicated RTX 4070 GPUs via PixelAPI. No cold starts.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>ai</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
