Taki

Posted on May 23

Open source AI stack components

#programming #ai #gen #javascript

Here’s a comprehensive and categorized list of open source AI stack components that you can mix and match when building GenAI applications — especially when focusing on modularity, scalability, and performance. This includes components for data processing, model serving, retrieval-augmented generation (RAG), vector search, and orchestration.

🧠 Foundational Model Alternatives

Models you can self-host or fine-tune:

LLMs
- llama.cpp – Inference for LLaMA and derivatives (CPU/GPU).
- mistral – Mistral models.
- Falcon – Powerful open weights.
- GPT-J, GPT-NeoX – From EleutherAI.
- OpenChat – Open fine-tuned chat model.
- WizardLM – Instruction-tuned LLMs.
Multimodal
- llava – Language + vision.
- bakllava – More optimized multimodal variant.
- CLIP – Text-image understanding.
Fine-Tuning
- QLoRA, LoRA, PEFT (via 🤗 Transformers + PEFT)
- Axolotl – Full stack fine-tuning.

📚 RAG (Retrieval-Augmented Generation) Stack

Tools to power knowledge-based Q&A systems:

Embeddings
- sentence-transformers
- Instructor-XL – Instruction-based embeddings.
Vector Databases
- Qdrant
- Weaviate
- Pinecone (closed source but popular)
- Milvus
- Chroma – Python-native vector DB.
- FAISS – Facebook AI Similarity Search.
Document Loaders & Chunking
- LangChain or LlamaIndex
- Haystack – Full RAG pipelines.

🔧 Serving & Orchestration

Serving models with APIs, managing prompts, memory, and chaining tools:

Model Servers
- vLLM – Fast LLM serving with paged attention.
- TGI – HuggingFace’s scalable inference server.
- Triton Inference Server
- LMDeploy – Model optimization & serving.
Agent / Workflow Frameworks
- LangChain
- LlamaIndex
- Haystack
- CrewAI – Multi-agent framework.
- AutoGen
Prompt Management
- PromptLayer
- Langfuse
- Helicone (for logging OpenAI usage)

🖼️ Frontend / Chat UI

For chatbots or multimodal interfaces:

Next.js – UI + SSR/ISR.
ShadCN/ui – Design system for building clean UIs.
Chatbot UI – Open-source ChatGPT-style interface.
Open WebUI – Web UI for LM Studio / Ollama.

🚀 Inference & Runtime Optimization

llm.rs – LLM inference in Rust.
ggml – Quantized models, runs on CPU.
exllama – High-perf quantized inference.

🔒 Security & DevOps (for production)

AuthN/AuthZ: [Auth.js (NextAuth)], [Clerk], [Ory], [ZITADEL]
Logging/Tracing: [Langfuse], [OpenTelemetry], [Sentry]
DevOps: Docker, Kubernetes, GitHub Actions, Terraform

🧱 Full Stack Boilerplates

If you're looking to start fast:

AI Engineer OS – Full-stack open source GenAI stack.
LangChainHub – Reusable chains and prompts.
OpenChatKit – Chatbot framework.
Flowise – Visual LangChain builder.

🧪 Experimental Tools

Ollama – Run and manage LLMs locally.
Modal – Serverless infra for AI.
LiteLLM – Drop-in proxy for OpenAI-compatible APIs.

Deploy with ease. Manage efficiently. Scale faster.

Leave the infrastructure headaches to us, while you focus on pushing boundaries, realizing your vision, and making a lasting impression on your users.

Get Started

Top comments (1)

Duc Nguyen Thanh • May 27

Could you write a new topic about how to use that

ACI.dev: Fully Open-source AI Agent Tool-Use Infra (Composio Alternative)

100% open-source tool-use platform (backend, dev portal, integration library, SDK/MCP) that connects your AI agents to 600+ tools with multi-tenant auth, granular permissions, and access through direct function calling or a unified MCP server.

Check out our GitHub!