<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Buddy Anderson</title>
    <description>The latest articles on Forem by Buddy Anderson (@lucasdue).</description>
    <link>https://forem.com/lucasdue</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3714333%2F1b1278f2-b300-43f7-b368-f5241a52900e.png</url>
      <title>Forem: Buddy Anderson</title>
      <link>https://forem.com/lucasdue</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lucasdue"/>
    <language>en</language>
    <item>
      <title>How We Got Tired of the AI Subscription Zoo and Combined 30 Neural Networks in One Interface</title>
      <dc:creator>Buddy Anderson</dc:creator>
      <pubDate>Mon, 09 Mar 2026 13:06:27 +0000</pubDate>
      <link>https://forem.com/lucasdue/how-we-got-tired-of-the-ai-subscription-zoo-and-combined-30-neural-networks-in-one-interface-46p</link>
      <guid>https://forem.com/lucasdue/how-we-got-tired-of-the-ai-subscription-zoo-and-combined-30-neural-networks-in-one-interface-46p</guid>
      <description>&lt;p&gt;If you actively use AI for work, you probably know the feeling. You have a ChatGPT tab open, Gemini right next to it (because of Nano Banana), and Perplexity in a third tab for research. Somewhere in your bookmarks, image and video generators are gathering dust, waiting for the once-in-a-couple-of-months occasion you actually need them.&lt;/p&gt;

&lt;p&gt;All this joy costs around $60-100 a month—assuming you even have access. Paying for it from Russia is a quest in itself, involving crypto and virtual card resellers. But the most frustrating part is that if you go on vacation or have a week packed with calls, your subscriptions simply "burn out" at the end of the month.&lt;/p&gt;

&lt;p&gt;Our team realized it was time to bring order to this "zoo." That's how VEGA was born: a unified interface that brings top-tier AI tools under the hood of a convenient chat, without the need for VPNs or pushy subscriptions.&lt;/p&gt;

&lt;p&gt;In this article, we want to share how our project is built, both from a product and an architectural perspective.&lt;/p&gt;

&lt;h2&gt;The Concept: Pay-As-You-Go and No Expiring Months&lt;/h2&gt;

&lt;p&gt;The core principle we built into the product is that the user should only pay for results. We ditched the classic SaaS model with rigid monthly subscriptions and introduced an internal currency (Stars ⭐).&lt;/p&gt;

&lt;p&gt;Need to generate 10 photorealistic images once every six months? You just spend your balance on that specific task. When Anthropic or Google drop a new top-tier model, it appears on our platform on release day, and you don't need to buy yet another $20 subscription just to test it out.&lt;/p&gt;

&lt;p&gt;Furthermore, for basic everyday tasks, we offer 10 requests per day to free models (300+ requests per month), with no compromise on response quality.&lt;/p&gt;
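&lt;p&gt;Conceptually, the free tier is just a per-user daily counter. Here is a hypothetical in-memory sketch of that check (the names and the reset-by-calendar-day scheme are ours for illustration; in production this would live in a database or Redis, not process memory):&lt;/p&gt;

```typescript
// Hypothetical per-user daily free-tier counter (illustration only).
interface QuotaStore {
  [key: string]: number;
}

const DAILY_FREE_LIMIT = 10;
const usage: QuotaStore = {};

function quotaKey(userId: string, date: Date): string {
  // One counter per user per calendar day, e.g. "user42:2026-03-09".
  return userId + ":" + date.toISOString().slice(0, 10);
}

function tryConsumeFreeRequest(userId: string, now: Date): boolean {
  const key = quotaKey(userId, now);
  const used = usage[key] || 0;
  if (used >= DAILY_FREE_LIMIT) {
    return false; // quota exhausted until the next day
  }
  usage[key] = used + 1;
  return true;
}
```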

&lt;h2&gt;What VEGA Can Do&lt;/h2&gt;

&lt;p&gt;To stand out among dozens of classic "OpenAI wrappers," we integrated over 30 different services and models:&lt;/p&gt;

&lt;p&gt;Multi-level Web Search: We didn't just bolt on a Google parser; we integrated two engines at once, Exa and Parallel AI, and they work in real time. Need to research publications, gather a year's worth of trend analytics, or study an open-source library's documentation via a link? The AI pulls in tens or hundreds of relevant sources and summarizes them. All of this works in Chat mode as well: link reading is integrated across all AI models, and the "Web" button uses Exa so the AI grounds its answers in facts from web searches.&lt;/p&gt;

&lt;p&gt;"Autopilot" (Smart Routing): The hardest part of building an all-in-one tool is avoiding UX overload. We don't have a complicated dashboard with 100 buttons. You select "Autopilot" mode, write a prompt, and the system understands your intent.&lt;/p&gt;

&lt;p&gt;If you write, "Create a song about Masha and Misha," the request flies straight to the Suno integration.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Drop a PDF and 10 photos, and it triggers multimodal analysis.&lt;br&gt;
Ask, "Give me an analysis of Apple stock," and it fires up a financial API and parses investment data.&lt;br&gt;
Note: There are far more use cases, of course, as the Autopilot has access to the project's entire feature set (except Transcription).&lt;/p&gt;
&lt;/blockquote&gt;
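&lt;p&gt;The routing idea behind those examples can be sketched with plain keyword heuristics. This is a deliberate simplification, not VEGA's actual code: the real Autopilot combines an LLM intent-classification call with heuristics, and the intents and patterns below are purely illustrative:&lt;/p&gt;

```typescript
// Toy keyword-based intent router (the production system adds an LLM call).
type Intent = "music" | "finance" | "image" | "multimodal" | "chat";

interface UserRequest {
  prompt: string;
  attachmentCount: number;
}

function routeIntent(req: UserRequest): Intent {
  const text = req.prompt.toLowerCase();
  // Cheap keyword checks run first; anything ambiguous would go to the LLM.
  if (/\b(song|music|melody)\b/.test(text)) return "music";
  if (/\b(stock|stocks|ticker|portfolio)\b/.test(text)) return "finance";
  if (/\b(draw|image|picture|photo)\b/.test(text)) return "image";
  // Attachments with no clearer signal trigger multimodal analysis.
  if (req.attachmentCount > 0) return "multimodal";
  return "chat";
}
```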

&lt;p&gt;Transcription in 99+ Languages: Upload a voice memo or video, and the system (powered by AssemblyAI) performs speech-to-text, diarizes speakers, detects emotional tone, and produces a concise summary. Few transcription tools combine all of that in a single pass.&lt;/p&gt;

&lt;p&gt;Video and Photo Generation: Veo, Kling, Grok, Nano Banana—with support for all their features. To keep the UI clean, there's an AI router under the hood that understands exactly what you want: whether it's editing or stitching videos together, or using attached images as the first and last frames. It's highly complex on the backend, but perfectly seamless for you.&lt;/p&gt;

&lt;h2&gt;Under the Hood: How We Pieced It Together Technically&lt;/h2&gt;

&lt;p&gt;Now for the most interesting part—the tech. This isn't just "Next.js + fetch to OpenAI." We've assembled a serious, modern stack from the worlds of Serverless and AI engineering.&lt;/p&gt;

&lt;h2&gt;Core Stack&lt;/h2&gt;

&lt;p&gt;Framework: Next.js 16 (React 19, TypeScript 5.9). We're riding the bleeding edge: Server Components and Server Actions keep our bundle lightweight.&lt;/p&gt;

&lt;p&gt;UI/Styles: Tailwind CSS v4, deep Radix UI integration for accessibility, and Framer Motion for micro-animations. Our design is minimalist (to keep the focus on the content) but highly responsive.&lt;/p&gt;

&lt;h2&gt;AI Orchestration&lt;/h2&gt;

&lt;p&gt;Manually building dozens of integrations is shooting yourself in the foot, so the Vercel AI SDK became the "heart" of our routing. Using @openrouter/ai-sdk-provider, @ai-sdk/openai, @ai-sdk/google, and @ai-sdk/xai, we unified model interactions behind a single interface (a Unified API). For data delivery we rely on streaming protocols, so text prints in real time and photo generation returns status updates as it runs. The Autopilot's logic rests on intent classification: using a fast LLM call or heuristics, the system determines whether Tool Calling is needed, whether it should query a vector database, and which provider is best suited for the task.&lt;/p&gt;
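&lt;p&gt;The "unified interface" idea boils down to adapting every provider to one call shape, so the router only has to pick a key. The sketch below is our own simplified illustration of that pattern; the names and call shape are hypothetical and deliberately do not mirror the actual Vercel AI SDK API:&lt;/p&gt;

```typescript
// Illustrative provider registry: every backend is adapted to one interface,
// so routing a request is just a dictionary lookup.
interface ChatProvider {
  name: string;
  complete(prompt: string): string;
}

const providers: { [modelId: string]: ChatProvider } = {
  // Stub adapters standing in for real OpenRouter/OpenAI/Google/xAI clients.
  "openai/gpt-4o": { name: "openai", complete: (p) => "openai answer to: " + p },
  "google/gemini": { name: "google", complete: (p) => "google answer to: " + p },
};

function completeWithModel(modelId: string, prompt: string): string {
  const provider = providers[modelId];
  if (!provider) {
    throw new Error("unknown model: " + modelId);
  }
  return provider.complete(prompt);
}
```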

&lt;h2&gt;Long-Term Memory&lt;/h2&gt;

&lt;p&gt;We wanted to cure the chat's amnesia between sessions. To do that, we integrated Mem0 and Supermemory. The system analyzes user dialogues, extracts facts (e.g., "user is a frontend developer," "loves coffee," "writes in React"), and stores them. On new requests, these facts are injected into the context, creating the experience of a truly personalized assistant. Users can manage their saved facts in a dedicated tab.&lt;/p&gt;
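&lt;p&gt;The injection step is the simple half of the loop: stored facts get prepended to the system prompt of a new session. A toy sketch of just that step, with a fact shape and prompt layout we made up for illustration (Mem0/Supermemory handle the extraction and storage side):&lt;/p&gt;

```typescript
// Toy long-term-memory injection: prepend saved facts to the system prompt.
interface MemoryFact {
  text: string;
}

function buildSystemPrompt(base: string, facts: MemoryFact[]): string {
  if (facts.length === 0) return base; // nothing remembered yet
  const lines = facts.map((f) => "- " + f.text).join("\n");
  return base + "\n\nKnown facts about the user:\n" + lines;
}
```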

&lt;h2&gt;Databases and Infrastructure&lt;/h2&gt;

&lt;p&gt;We went fully Serverless/Edge:&lt;/p&gt;

&lt;p&gt;Primary DB: Currently using PostgreSQL 17 (haven't migrated to 18 yet :/).&lt;/p&gt;

&lt;p&gt;ORM: We use Drizzle ORM as our data layer. None of Prisma's abstraction overhead: just clean, strictly typed SQL-like syntax in TypeScript.&lt;/p&gt;

&lt;p&gt;Caching and Rate Limits: Upstash Redis and their @upstash/ratelimit library. Since we offer a free tier, strictly controlling abuse and DDoS is vital. Upstash runs on the Edge with single-digit millisecond latency.&lt;/p&gt;
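&lt;p&gt;For the curious, here is a minimal in-memory version of the sliding-window algorithm that @upstash/ratelimit runs against Redis. The window size and limit below are invented for illustration; the real library also handles atomicity and Edge distribution, which this sketch ignores:&lt;/p&gt;

```typescript
// Minimal in-memory sliding-window rate limiter (illustration only; the
// production version lives in Upstash Redis, not process memory).
const WINDOW_MS = 60_000; // 1-minute window (illustrative)
const MAX_REQUESTS = 30;  // illustrative limit
const hits: { [userId: string]: number[] } = {};

function allowRequest(userId: string, nowMs: number): boolean {
  const cutoff = nowMs - WINDOW_MS;
  // Keep only timestamps still inside the window.
  const recent = (hits[userId] || []).filter((t) => t > cutoff);
  if (recent.length >= MAX_REQUESTS) {
    hits[userId] = recent;
    return false; // over the limit for this window
  }
  recent.push(nowMs);
  hits[userId] = recent;
  return true;
}
```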

&lt;p&gt;Authentication: We use the lightweight better-auth.&lt;/p&gt;

&lt;p&gt;Document Parsing: User-uploaded documents are processed by Gemini 2.5 Flash Lite (with a fallback model). To let users upload almost any format, files are converted using headless LibreOffice, pdfjs, mammoth (for Word), and table parsers. For rendering math and diagrams right inside the chat, we use rehype-katex and the powerful mermaid tool.&lt;/p&gt;
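&lt;p&gt;The first step of such a pipeline is usually a format dispatch: pick a converter by file extension before anything reaches the LLM. The tool names below follow the article, but the mapping itself, the fallback choices, and the "gemini-raw" path are our own illustrative assumptions:&lt;/p&gt;

```typescript
// Hypothetical extension-to-converter dispatch for uploaded documents.
const converters: { [ext: string]: string } = {
  pdf:  "pdfjs",        // PDFs go to pdfjs
  docx: "mammoth",      // modern Word files go to mammoth
  doc:  "libreoffice",  // legacy Office formats via headless LibreOffice
  pptx: "libreoffice",
  csv:  "table-parser",
};

function pickConverter(filename: string): string {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return "gemini-raw"; // no extension: hand the bytes to the LLM
  const ext = filename.slice(dot + 1).toLowerCase();
  return converters[ext] || "libreoffice"; // unknown formats fall back to LibreOffice
}
```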

&lt;h2&gt;Invitation to Test&lt;/h2&gt;

&lt;p&gt;We built a tool that we genuinely enjoy using ourselves for code, documentation, and everyday tasks. But we want VEGA to become a go-to assistant for a wider audience, especially developers.&lt;/p&gt;

&lt;p&gt;Upon registration, we immediately credit bonus ⭐ Stars so you can test out VEGA's core chat features. Either way, the 10 basic daily requests to free models will always remain free. No VPN is required, and everything runs blazing fast.&lt;/p&gt;

&lt;p&gt;We'd love to see you at vega.chat, and we'll be even happier to receive constructive criticism, bug reports, and suggestions in the comments. Our "Autopilot" gets smarter every day, but only experienced AI users can give it a true crash test.&lt;/p&gt;

&lt;p&gt;P.S. If you're interested in the technologies we use in VEGA, let us know, and we'll gladly share more detailed case studies.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
  </channel>
</rss>
