<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tyler</title>
    <description>The latest articles on Forem by Tyler (@tylerilunga).</description>
    <link>https://forem.com/tylerilunga</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760641%2Fb8cfe094-b4b3-4f57-8c2a-59ab8d9cf96d.jpg</url>
      <title>Forem: Tyler</title>
      <link>https://forem.com/tylerilunga</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/tylerilunga"/>
    <language>en</language>
    <item>
      <title>How I Built an AI Product Photography Pipeline with 30+ Models (Next.js + Express + Replicate/FAL)</title>
      <dc:creator>Tyler</dc:creator>
      <pubDate>Sun, 08 Feb 2026 21:11:51 +0000</pubDate>
      <link>https://forem.com/tylerilunga/how-i-built-an-ai-product-photography-pipeline-with-30-models-nextjs-express-replicatefal-bp8</link>
      <guid>https://forem.com/tylerilunga/how-i-built-an-ai-product-photography-pipeline-with-30-models-nextjs-express-replicatefal-bp8</guid>
      <description>&lt;p&gt;I've been building &lt;a href="https://pixelmotion.io" rel="noopener noreferrer"&gt;PixelMotion&lt;/a&gt; — a SaaS that takes product photos and transforms them into enhanced images and AI-generated videos. Here's a deep dive into the technical architecture and the lessons I learned orchestrating 30+ AI models in production.&lt;/p&gt;

&lt;h2&gt;The Problem&lt;/h2&gt;

&lt;p&gt;Ecommerce brands need high-quality product visuals, but professional photography is expensive ($500–$2,000 per product). AI models like Flux, Stable Diffusion, and video generators like Kling, Sora, and Veo can now produce stunning results — but each model has different strengths, APIs, pricing, and failure modes.&lt;/p&gt;

&lt;p&gt;I wanted to build a system where a user uploads a single product photo, and the platform handles everything: background removal, enhancement, upscaling, and even generating marketing videos.&lt;/p&gt;

&lt;h2&gt;Architecture Overview&lt;/h2&gt;

&lt;p&gt;The stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js 15 (App Router) + Tailwind CSS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Express.js + TypeScript + PostgreSQL (Sequelize ORM)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Providers&lt;/strong&gt;: Replicate, FAL AI, OpenAI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt;: Google Cloud Storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payments&lt;/strong&gt;: Stripe (usage-based credits)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Upload → Website Scraper → AI Analysis → Model Selection → Enhancement/Generation → Storage → Delivery
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Key Technical Decisions&lt;/h2&gt;

&lt;h3&gt;1. Multi-Provider Model Orchestration&lt;/h3&gt;

&lt;p&gt;The biggest challenge was abstracting away the differences between AI providers. Replicate uses a prediction-based API with webhooks. FAL uses queue-based async processing. OpenAI has its own patterns.&lt;/p&gt;

&lt;p&gt;I built a unified &lt;code&gt;llmService&lt;/code&gt; that normalizes the interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simplified model orchestration&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llmService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 'replicate' | 'fal' | 'openai'&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;normalizedParams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;webhook&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;callbackUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each provider adapter handles its own retry logic, timeout behavior, and error mapping. This means adding a new model is just a config change — no service code modifications.&lt;/p&gt;
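&lt;p&gt;To make the adapter idea concrete, here's a minimal sketch of the pattern described above. The names (&lt;code&gt;adapters&lt;/code&gt;, &lt;code&gt;GenerateRequest&lt;/code&gt;, the stubbed return values) are illustrative, not PixelMotion's actual code:&lt;/p&gt;

```typescript
// Hypothetical sketch of the provider-adapter pattern; names and return
// shapes are illustrative stand-ins, not the real service code.
type Provider = 'replicate' | 'fal' | 'openai';

interface GenerateRequest {
  provider: Provider;
  model: string;
  input: object;
  webhook?: string;
}

// Each adapter hides one provider's API behind the same async method.
const adapters = {
  replicate: {
    async generate(req: GenerateRequest) {
      // real code would create a Replicate prediction and register the webhook
      return { status: 'pending', jobId: 'replicate:' + req.model };
    },
  },
  fal: {
    async generate(req: GenerateRequest) {
      // real code would submit to FAL's async queue
      return { status: 'pending', jobId: 'fal:' + req.model };
    },
  },
  openai: {
    async generate(req: GenerateRequest) {
      // real code would call the OpenAI API
      return { status: 'pending', jobId: 'openai:' + req.model };
    },
  },
};

// The unified facade: callers never touch provider-specific details.
const llmService = {
  async generate(req: GenerateRequest) {
    const adapter = adapters[req.provider];
    if (!adapter) throw new Error('Unknown provider: ' + req.provider);
    return adapter.generate(req);
  },
};
```

&lt;p&gt;With this shape, registering a new model really is just config: the model entry names its provider, and dispatch falls out of the lookup.&lt;/p&gt;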

&lt;h3&gt;2. Fallback Chains&lt;/h3&gt;

&lt;p&gt;AI models fail. A lot. Rate limits, cold starts, model updates that change output quality. I implemented fallback chains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ENHANCEMENT_CHAIN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flux-pro-v2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;replicate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flux-pro&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fal&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;stable-diffusion-xl&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;replicate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the primary model fails or times out, the system automatically tries the next one. Users don't see errors — they just get results.&lt;/p&gt;
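&lt;p&gt;The runner that consumes a chain like this can be sketched in a few lines. &lt;code&gt;runStep&lt;/code&gt; and the error handling are assumptions about how such a loop would look, not the production implementation:&lt;/p&gt;

```typescript
// Minimal sketch of consuming a fallback chain: try each step in order,
// return the first success, and rethrow the last error if all steps fail.
// runStep is a hypothetical callback that dispatches one model attempt.
type ChainStep = { model: string; provider: string };

async function generateWithFallback(
  chain: ChainStep[],
  runStep: (step: ChainStep) => any,
) {
  let lastError: unknown;
  for (const step of chain) {
    try {
      return await runStep(step);
    } catch (err) {
      lastError = err; // rate limit, timeout, etc. — fall through to next model
    }
  }
  throw lastError;
}
```

&lt;p&gt;A timeout per step (e.g. racing &lt;code&gt;runStep&lt;/code&gt; against a timer) slots naturally into the same loop.&lt;/p&gt;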

&lt;h3&gt;3. Credit-Based Pricing&lt;/h3&gt;

&lt;p&gt;Different models have wildly different costs: per the table below, Sora runs roughly 4x the cost of Kling per generation. Instead of flat subscriptions, I went with a credit system where each model costs a different number of credits:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Credits&lt;/th&gt;
&lt;th&gt;Actual Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Flux 2 Pro (photo)&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;~$0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kling 1.6 (video)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;~$0.13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sora 2 (video)&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Veo 3.1 (video)&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;~$0.65&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This lets users choose based on their budget and quality needs.&lt;/p&gt;
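&lt;p&gt;The check-before-generate flow reduces to a small pure function. The model keys and function name here are hypothetical; the credit values come from the table above:&lt;/p&gt;

```typescript
// Illustrative credit table and charge function; keys and names are
// hypothetical, credit values taken from the pricing table above.
const MODEL_CREDITS: { [model: string]: number } = {
  'flux-2-pro': 2,
  'kling-1.6': 5,
  'sora-2': 20,
  'veo-3.1': 25,
};

// Returns the new balance, or throws before any generation is dispatched.
function chargeCredits(balance: number, model: string): number {
  const cost = MODEL_CREDITS[model];
  if (cost === undefined) throw new Error('Unknown model: ' + model);
  if (balance >= cost) return balance - cost;
  throw new Error('Insufficient credits');
}
```

&lt;p&gt;Doing the check synchronously, before dispatching to a provider, is what keeps a runaway user from burning real API spend.&lt;/p&gt;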

&lt;h3&gt;4. Automated Product Intelligence&lt;/h3&gt;

&lt;p&gt;When a user connects their ecommerce store, the platform scrapes product data and uses GPT-4o to analyze:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product category and type&lt;/li&gt;
&lt;li&gt;Target audience&lt;/li&gt;
&lt;li&gt;Brand aesthetic&lt;/li&gt;
&lt;li&gt;Optimal AI models for that product type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This analysis feeds into prompt generation, so a luxury watch gets different treatment than a kitchen gadget.&lt;/p&gt;
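&lt;p&gt;One way the analysis step could be wired up is a structured prompt builder whose output is sent to GPT-4o and parsed as JSON. Everything below — the field names, the prompt wording — is an assumption for illustration:&lt;/p&gt;

```typescript
// Hypothetical sketch of the product-analysis prompt; the fields and
// wording are illustrative, not the actual PixelMotion prompts.
interface ProductAnalysis {
  category: string;
  audience: string;
  aesthetic: string;
  recommendedModels: string[];
}

// Builds the text sent to the LLM; the response would be parsed
// into ProductAnalysis and fed into downstream prompt generation.
function buildAnalysisPrompt(p: { title: string; description: string }): string {
  return [
    'Analyze this ecommerce product. Respond as JSON with keys:',
    'category, audience, aesthetic, recommendedModels.',
    'Title: ' + p.title,
    'Description: ' + p.description,
  ].join('\n');
}
```

&lt;p&gt;Keeping the analysis as structured JSON is what lets it drive both prompt generation and model selection mechanically.&lt;/p&gt;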

&lt;h3&gt;5. Async Job Queue with Polling&lt;/h3&gt;

&lt;p&gt;AI generation takes 30-120 seconds. I use a job queue pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User submits request → job created with &lt;code&gt;pending&lt;/code&gt; status&lt;/li&gt;
&lt;li&gt;Backend dispatches to AI provider&lt;/li&gt;
&lt;li&gt;Frontend polls every 3 seconds via custom &lt;code&gt;usePolling&lt;/code&gt; hook&lt;/li&gt;
&lt;li&gt;Webhook or poll catches completion → status updated to &lt;code&gt;completed&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Frontend displays result
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Frontend polling hook (simplified)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;useGenerationStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jobId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pending&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;interval&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/jobs/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;jobId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;setStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;jobId&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
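&lt;p&gt;The backend half of steps 1–4 can be sketched as a tiny in-memory job store. A real deployment persists jobs in PostgreSQL; the names and in-memory map here are assumptions for illustration:&lt;/p&gt;

```typescript
// In-memory sketch of the backend job lifecycle (steps 1-4 above);
// the real system persists jobs in PostgreSQL via Sequelize.
type JobStatus = 'pending' | 'processing' | 'completed' | 'failed';

interface Job {
  id: string;
  status: JobStatus;
  resultUrl?: string;
}

const jobs: { [id: string]: Job } = {};

// Step 1: job created as pending before dispatch to the AI provider.
function createJob(id: string): Job {
  const job: Job = { id, status: 'pending' };
  jobs[id] = job;
  return job;
}

// Step 4: called by the provider webhook, or by a poll that catches
// the completion if the webhook never arrives.
function completeJob(id: string, resultUrl: string): Job {
  const job = jobs[id];
  if (!job) throw new Error('Unknown job: ' + id);
  job.status = 'completed';
  job.resultUrl = resultUrl;
  return job;
}
```
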



&lt;h2&gt;Lessons Learned&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Don't trust AI model output blindly.&lt;/strong&gt; Some models occasionally return blank images, corrupted files, or completely wrong outputs. Always validate outputs before serving to users.&lt;/p&gt;
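&lt;p&gt;A cheap first-pass validation catches most of the blank and corrupted outputs. This check is illustrative — the magic numbers are real PNG/JPEG signatures, but the size threshold is an arbitrary example:&lt;/p&gt;

```typescript
// Hedged example of output validation: reject empty payloads, files too
// small to be a real render, and payloads without a known image signature.
// The 1 KB floor is an illustrative threshold, not a recommendation.
function looksValidImage(bytes: Uint8Array): boolean {
  if (bytes.length === 0) return false;   // blank output from the model
  if (1024 > bytes.length) return false;  // suspiciously tiny file
  const isPng = bytes[0] === 0x89 ? bytes[1] === 0x50 : false;  // \x89 'P'
  const isJpeg = bytes[0] === 0xff ? bytes[1] === 0xd8 : false; // JPEG SOI
  return isPng || isJpeg;
}
```

&lt;p&gt;Deeper checks (decoding the image, comparing dimensions against the request) can run behind this fast gate.&lt;/p&gt;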

&lt;p&gt;&lt;strong&gt;2. Cost management is critical.&lt;/strong&gt; Early on, I burned through $200 in a weekend because I didn't have per-user rate limits. Now every generation checks credit balance first, and I have daily cost alerts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Prompt engineering &amp;gt; model selection.&lt;/strong&gt; A well-crafted prompt on a cheaper model often beats a bad prompt on an expensive one. I spent weeks refining prompts and built a versioned prompt system to track changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Webhooks are unreliable.&lt;/strong&gt; Both Replicate and FAL occasionally fail to deliver webhooks. Always implement polling as a fallback.&lt;/p&gt;
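&lt;p&gt;One hedged way to implement that fallback is a periodic reconciliation sweep that re-checks any job stuck in &lt;code&gt;pending&lt;/code&gt; past a deadline. The shape below is an assumption about how such a sweep would look:&lt;/p&gt;

```typescript
// Illustrative reconciliation helper: find jobs whose webhook likely got
// dropped, so a scheduled task can poll the provider for their status.
interface PendingJob {
  id: string;
  createdAt: number; // epoch ms
  status: string;
}

function findStaleJobs(
  jobs: PendingJob[],
  now: number,
  timeoutMs: number,
): PendingJob[] {
  return jobs.filter((j) =>
    j.status === 'pending' ? now - j.createdAt > timeoutMs : false,
  );
}
```

&lt;p&gt;Run on a timer, this makes webhook delivery an optimization rather than a correctness requirement.&lt;/p&gt;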

&lt;p&gt;&lt;strong&gt;5. Users don't care about the model.&lt;/strong&gt; They care about the result. Auto-selecting the best model based on the product type improved satisfaction more than letting users choose manually.&lt;/p&gt;

&lt;h2&gt;What's Next&lt;/h2&gt;

&lt;p&gt;Currently working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-photo video generation (combine multiple product angles into one video)&lt;/li&gt;
&lt;li&gt;UGC-style video generation with AI avatars&lt;/li&gt;
&lt;li&gt;Direct publishing to TikTok/YouTube from the platform&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building with AI APIs and want to chat about multi-provider orchestration, fallback patterns, or credit systems — drop a comment. Happy to go deeper on any of these topics.&lt;/p&gt;

&lt;p&gt;You can check out the live product at &lt;a href="https://pixelmotion.io" rel="noopener noreferrer"&gt;pixelmotion.io&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Next.js 15, Express.js, PostgreSQL, Replicate, FAL AI, and OpenAI.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
