RAG Agent with Database Routing
A routed retrieval system that decides which knowledge base should answer a question instead of forcing every query through the same index.
RAG Routing Workflow
Route first, retrieve second, and fall back to the web only when the indexed sources are weak
This project demonstrates a more production-minded RAG pattern: separate corpora for product, support, and finance content, confidence-aware routing across them, and a web-research fallback when the internal collections do not contain strong evidence.
Overview
RAG Agent with Database Routing is built around a simple idea: not every question belongs in the same retrieval bucket. Instead of dumping all documents into one vector store and hoping retrieval sorts it out, the system keeps multiple specialized collections and decides which one should handle the query before answer generation begins.
What The Product Does
- Lets users upload PDF documents into separate domain-specific databases for product, support, and financial knowledge
- Routes each question to the most relevant collection based on similarity scoring across the indexed databases
- Falls back to an LLM-based router when the similarity signal is weak or the ranking between collections is ambiguous
- Uses a web-research fallback when none of the internal databases are a good fit
- Returns context-grounded answers while keeping the retrieval path explicit to the user
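The staged decision described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the project's actual code: the threshold value, the `route` function, and the pre-embedded chunk vectors are all assumptions made for the example.

```python
# Minimal sketch of confidence-aware routing: score each collection by its
# best-matching chunk, answer from vectors when confident, defer to an LLM
# router when weak, and fall back to web research when nothing matches.
# Collection names, threshold, and plain-list "vector stores" are illustrative.
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def route(query_vec, collections, threshold=0.8):
    """collections: {name: [chunk_vec, ...]}.

    Returns one of:
      ("vector", name)       -- confident match, retrieve from this collection
      ("llm_router", name)   -- weak best match, defer to the LLM router
      ("web_fallback", None) -- no internal evidence, go to web research
    """
    scores = {
        name: max((cosine(query_vec, v) for v in vecs), default=0.0)
        for name, vecs in collections.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return ("vector", best)
    if scores[best] > 0.0:
        return ("llm_router", best)
    return ("web_fallback", None)
```

In the real system the per-collection scores would come from Qdrant similarity search rather than an in-memory loop, but the three-way branch is the heart of the routing pattern.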
Implementation Details
- Built as a Streamlit application with a credential and upload flow in the sidebar plus database-specific upload tabs in the main view
- Uses PyPDFLoader and RecursiveCharacterTextSplitter to process uploaded PDFs into retrievable chunks
- Uses OpenAI embeddings with Qdrant collections for the three domain-specific databases
- Uses LangChain retrieval chains to assemble grounded answers from the chosen collection
- Uses an Agno routing agent as a backup router for low-confidence cases
- Uses a LangGraph ReAct-style fallback agent with DuckDuckGo search for out-of-distribution questions
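To make the ingestion step concrete, here is a rough, dependency-free sketch of what a recursive character splitter does, in the spirit of LangChain's RecursiveCharacterTextSplitter: try coarse separators first and recurse to finer ones only for oversized pieces. Separator order and chunk size are illustrative assumptions, and the real library also handles overlap and length functions.

```python
# Simplified recursive splitting: paragraphs first, then lines, then words,
# then a hard cut as a last resort. Keeps each chunk under chunk_size.
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: hard-cut at chunk_size boundaries.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, buf = [], ""
    for piece in text.split(sep):
        candidate = buf + sep + piece if buf else piece
        if len(candidate) <= chunk_size:
            buf = candidate  # piece still fits in the current chunk
        else:
            if buf:
                chunks.append(buf)
            buf = ""
            if len(piece) > chunk_size:
                # Piece itself too big: retry with the next, finer separator.
                chunks.extend(split_text(piece, chunk_size, rest))
            else:
                buf = piece
    if buf.strip():
        chunks.append(buf)
    return chunks
```

Chunks produced this way (plus the PDF text extraction that PyPDFLoader handles) are what get embedded and written into the per-domain Qdrant collections.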
Why It Matters
The value here is architectural rather than cosmetic. It demonstrates that retrieval quality is partly a routing problem: by separating product, support, and finance knowledge and applying a staged routing strategy, the system reduces the chance that documents which are semantically adjacent but operationally wrong end up steering the answer.
Design Decisions
- Multiple collections keep the corpora semantically narrower and easier to reason about
- Vector similarity is used first because it is fast and cheap for obvious queries
- LLM routing only appears when the similarity signals are weak, which keeps the flow efficient
- Web fallback is explicit rather than silent so the user can tell when the system has left the curated internal data
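The last design decision, keeping the fallback explicit, amounts to attaching provenance to every answer. A tiny hypothetical sketch (the `label_answer` helper and label wording are assumptions, not the project's code):

```python
# Tag each answer with where it came from, so a web-research answer is
# never mistaken for one grounded in the curated internal collections.
def label_answer(answer, route, collection=None):
    if route == "web_fallback":
        return f"[web research — outside indexed data] {answer}"
    return f"[{collection} collection via {route}] {answer}"
```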
Role and Focus
Role: Solo builder focused on multi-corpus retrieval architecture, routing logic, and fallback design.
Tech Stack: Streamlit, LangChain, LangGraph, Agno, Qdrant, OpenAI embeddings.
Category: Retrieval-augmented generation, query routing, knowledge systems, search architecture.
Positioning: Think "intent router for RAG" rather than a single-index document chatbot.
Thumbnail Alt Text
RAG routing interface showing multiple specialized document databases and a question-answer flow that chooses the best collection before retrieval.