BowerBot: Building an LLM Agent for OpenUSD

Arturo — Fri, 08 May 2026 03:32:40 +0000

I run Binary Core LLC, a small consultancy that builds OpenUSD and AI tooling for visualization, digital twin, and robotics teams. Over the past several months, we've been building BowerBot, an open source LLM agent that assembles structured OpenUSD scenes from natural language.

This post walks through the architectural decisions behind it: why we built it FastAPI-style, how the plugin system works, why validation is core to the agent loop, and the tradeoffs we made along the way.

What problem are we solving

OpenUSD (Pixar's Universal Scene Description) is becoming the universal 3D scene format. Pixar uses it. Apple ships USDZ to Vision Pro. NVIDIA Omniverse is USD-native. Adobe, Autodesk, Siemens, and Trimble all back it through the Alliance for OpenUSD.

But authoring USD scenes correctly is hard. You need to manage:

Asset folder structure (root + geo + materials separated)
ASWF compliance (defaultPrim, metersPerUnit, upAxis)
Material binding (MaterialX or existing materials)
Native USD lighting (sun, dome, point, area)
Reference resolution and validation
Final packaging (USDZ for Vision Pro, USD for Omniverse, etc.)

Most of this is mechanical work that pipeline TDs end up scripting manually for every project. We thought: this is a great fit for an LLM agent.

Architecture: FastAPI-style separation

The codebase is organized like a FastAPI service, even though BowerBot isn't a web service. Why? Because the same separation of concerns that makes web services maintainable also makes agent codebases maintainable.

bowerbot/ ├── schemas/ # Pydantic models for tool inputs and outputs ├── services/ # Core business logic (USD operations) ├── tools/ # Agent-callable tool wrappers ├── utils/ # Shared utilities └── agent/ # The agent loop itself

The schemas layer is critical. Every tool input and output is a Pydantic model. This means:

The LLM gets structured tool definitions automatically (via Pydantic JSON schema)
Tool outputs are typed and validated, making them easy to chain
Tests are easier because we mock at the schema boundary
Adding new tools doesn't require touching the agent core

The plugin system: Python entry points

This was the most contested architectural decision. We considered three options for extensibility:

Monorepo with all skills built in
Plugin loading from a known directory
Python entry points (the setuptools mechanism)

We went with entry points. Here's why:

When someone wants to extend BowerBot with a custom skill (say, a custom asset provider for their internal DAM, or a specialized DCC connector), they create a separate pip-installable package. Their pyproject.toml declares an entry point:

[project.entry-points."bowerbot.skills"]
my_custom_skill = "my_package.skills:MyCustomSkill"

When BowerBot starts, it discovers all installed packages exposing the bowerbot.skills entry point and registers them. No forking. No central registry. No PR queues.

This matters because BowerBot's value compounds when third parties extend it. A studio with internal asset systems can write a skill, ship it as a private package, and BowerBot becomes their tool without any code changes to BowerBot itself.

Validation as a core agent capability

The biggest insight from building this: validation isn't a separate step, it's part of the agent loop.

Here's the pattern:

Agent calls a tool that modifies the USD stage
Agent calls validate_stage to check the result
Validation returns structured ValidationIssue objects (not just pass/fail)
The agent can call fix tools targeting specific issues
Re-validate, iterate

This is a tight feedback loop. The agent doesn't just hope its output is correct, it verifies and corrects. ASWF compliance, reference resolution, material bindings, unit consistency are all checked. Issues come back with enough context for the agent to know what to fix and how.

The validation logic is shared with our packaging step. Before we package a USDZ, we validate the stage. If validation fails, packaging fails with a clear error. No surprise broken USDZ files reaching Vision Pro.

Multi-LLM support via litellm

We didn't want to lock BowerBot to one LLM provider. Studios have different policies (some can't send data to OpenAI, others have Anthropic enterprise contracts, some run local models). We wrapped LLM calls through litellm, which gives us a common interface across OpenAI, Anthropic, Google, Cohere, local Ollama, and more.

This was a small architectural choice with big practical impact. When a prospect says "we use Anthropic exclusively" or "we run on local Llama," BowerBot just works.

What we got wrong (and learned)

A few things didn't work the way we expected.

First: agent loops are expensive. Each iteration is an LLM call. Naively, an agent that "just keeps iterating until done" can rack up surprising token costs. We added explicit step limits and structured the agent to do as much work as possible in each tool call.

Second: tool description quality is everything. Early on, we wrote terse tool descriptions because the function names felt self-explanatory. The LLM struggled to chain tools correctly. Rewriting descriptions to be richer (with examples, edge cases, and explicit input/output behavior) dramatically improved success rates.

Third: the LLM lies about what it did. Sometimes the agent would claim it placed an asset successfully when the underlying tool failed silently. The fix was making sure every tool returns explicit success/failure structures, and the agent's prompt encourages it to verify with validation tools rather than trust its own narration.

What's next

The next major release adds UsdPhysics support: mass, inertia, collision, and articulations. This makes BowerBot useful for simulation pipelines (Isaac Sim, Omniverse PhysX, MuJoCo-Warp) in addition to scene assembly.

Once that ships, BowerBot can output SimReady scenes from natural language: "set up a warehouse with 200 pallets, articulated forklift, and robotic arms at picking stations, with realistic mass distributions." That's a meaningful capability for robotics, manufacturing, deconstruction simulation, and Physical AI training.

Try it

BowerBot is Apache 2.0 on GitHub. Recently added to the ASWF Landscape in the OpenUSD section.

Demo videos:

Always interested in hearing how others are approaching USD scene authoring at scale.

Forem: Arturo