RAG Agent with Database Routing
A routed retrieval system that decides which knowledge base should answer a question instead of forcing every query through the same index.
RAG Routing Workflow
Route first, retrieve second, and fall back to the web only when the indexed sources are weak
This project demonstrates a more production-minded RAG pattern: separate corpora for product, support, and finance content, confidence-aware routing across them, and a web-research fallback when the internal collections do not contain strong evidence.
Overview
RAG Agent with Database Routing is built around a simple idea: not every question belongs in the same retrieval bucket. Instead of dumping all documents into one vector store and hoping retrieval sorts it out, the system keeps multiple specialized collections and decides which one should handle the query before answer generation begins.
What The Product Does
- Lets users upload PDF documents into separate domain-specific databases for product, support, and financial knowledge
- Routes each question to the most relevant collection based on similarity scoring across the indexed databases
- Falls back to an LLM-based router when the similarity signal is weak or the ranking between collections is ambiguous
- Uses a web-research fallback when none of the internal databases are a good fit
- Returns context-grounded answers while keeping the retrieval path explicit to the user
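The staged decision described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the project's actual code: the threshold value, the `route` function, and the pre-embedded chunk vectors are all assumptions made for the example.

```python
# Minimal sketch of confidence-aware routing: score each collection by its
# best-matching chunk, answer from vectors when confident, defer to an LLM
# router when weak, and fall back to web research when nothing matches.
# Collection names, threshold, and plain-list "vector stores" are illustrative.
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def route(query_vec, collections, threshold=0.8):
    """collections: {name: [chunk_vec, ...]}.

    Returns one of:
      ("vector", name)       -- confident match, retrieve from this collection
      ("llm_router", name)   -- weak best match, defer to the LLM router
      ("web_fallback", None) -- no internal evidence, go to web research
    """
    scores = {
        name: max((cosine(query_vec, v) for v in vecs), default=0.0)
        for name, vecs in collections.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return ("vector", best)
    if scores[best] > 0.0:
        return ("llm_router", best)
    return ("web_fallback", None)
```

In the real system the per-collection scores would come from Qdrant similarity search rather than an in-memory loop, but the three-way branch is the heart of the routing pattern.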
Implementation Details
- Built as a Streamlit application with a credential and upload flow in the sidebar plus database-specific upload tabs in the main view
- Uses PyPDFLoader and RecursiveCharacterTextSplitter to process uploaded PDFs into retrievable chunks
- Uses OpenAI embeddings with Qdrant collections for the three domain-specific databases
- Uses LangChain retrieval chains to assemble grounded answers from the chosen collection
- Uses an Agno routing agent as a backup router for low-confidence cases
- Uses a LangGraph ReAct-style fallback agent with DuckDuckGo search for out-of-distribution questions
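To make the ingestion step concrete, here is a rough, dependency-free sketch of what a recursive character splitter does, in the spirit of LangChain's RecursiveCharacterTextSplitter: try coarse separators first and recurse to finer ones only for oversized pieces. Separator order and chunk size are illustrative assumptions, and the real library also handles overlap and length functions.

```python
# Simplified recursive splitting: paragraphs first, then lines, then words,
# then a hard cut as a last resort. Keeps each chunk under chunk_size.
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: hard-cut at chunk_size boundaries.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, buf = [], ""
    for piece in text.split(sep):
        candidate = buf + sep + piece if buf else piece
        if len(candidate) <= chunk_size:
            buf = candidate  # piece still fits in the current chunk
        else:
            if buf:
                chunks.append(buf)
            buf = ""
            if len(piece) > chunk_size:
                # Piece itself too big: retry with the next, finer separator.
                chunks.extend(split_text(piece, chunk_size, rest))
            else:
                buf = piece
    if buf.strip():
        chunks.append(buf)
    return chunks
```

Chunks produced this way (plus the PDF text extraction that PyPDFLoader handles) are what get embedded and written into the per-domain Qdrant collections.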
Why It Matters
The value here is architectural rather than cosmetic. It demonstrates that retrieval quality is partly a routing problem: by separating product, support, and finance knowledge and applying a staged routing strategy, the system reduces the chance that documents which are semantically adjacent but operationally wrong end up steering the answer.
Design Decisions
- Multiple collections keep the corpora semantically narrower and easier to reason about
- Vector similarity is used first because it is fast and cheap for obvious queries
- LLM routing only appears when the similarity signals are weak, which keeps the flow efficient
- Web fallback is explicit rather than silent so the user can tell when the system has left the curated internal data
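The last design decision, keeping the fallback explicit, amounts to attaching provenance to every answer. A tiny hypothetical sketch (the `label_answer` helper and label wording are assumptions, not the project's code):

```python
# Tag each answer with where it came from, so a web-research answer is
# never mistaken for one grounded in the curated internal collections.
def label_answer(answer, route, collection=None):
    if route == "web_fallback":
        return f"[web research — outside indexed data] {answer}"
    return f"[{collection} collection via {route}] {answer}"
```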
Role and Focus
Role: Solo builder focused on multi-corpus retrieval architecture, routing logic, and fallback design.
Tech Stack: Streamlit, LangChain, LangGraph, Agno, Qdrant, OpenAI embeddings.
Category: Retrieval-augmented generation, query routing, knowledge systems, search architecture.
Positioning: Think "intent router for RAG" rather than a single-index document chatbot.
Thumbnail Alt Text
RAG routing interface showing multiple specialized document databases and a question-answer flow that chooses the best collection before retrieval.