
Document RAG Pipeline

Self-hosted RAG system for capturing, indexing, and querying technical documentation, with a browser extension, an API, and an MCP server.

Cloudflare Workers, Vectorize, D1, Workers AI, MCP


About This Project

A Retrieval-Augmented Generation (RAG) pipeline for building your own searchable knowledge base from technical documentation. The system captures documentation from anywhere on the web, indexes it with vector embeddings, and makes it queryable through multiple interfaces. It suits teams that want to build institutional knowledge without relying on external services.

Key features include:

- Browser extension for one-click documentation capture
- Vector search powered by Cloudflare Vectorize
- RESTful API for programmatic access
- MCP server for AI assistant integration
- Automatic chunking and embedding generation
- Full-text and semantic search
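The page does not say how the chunking step works, so the following is only a minimal sketch of one common approach: splitting captured text into fixed-size character windows that overlap slightly, so a sentence cut at a boundary still appears whole in a neighbouring chunk. The function name `chunkText` and the default size and overlap values are assumptions made for illustration, not the project's actual settings.

```typescript
// Hypothetical helper (not taken from the project): split text into
// fixed-size character windows, each sharing `overlap` characters with
// the previous window.
function chunkText(
  text: string,
  chunkSize: number = 500, // assumed default, for illustration only
  overlap: number = 50     // assumed default, for illustration only
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window already reaches the end of the text,
    // so no trailing chunk made only of overlap is emitted.
    if (start + chunkSize >= text.length) {
      break;
    }
  }
  return chunks;
}

// Usage: each chunk would then be embedded and stored in the vector index.
const exampleChunks: string[] = chunkText("abcdefghij", 4, 1);
console.log(exampleChunks);
```

Character windows are the simplest option to reason about. A real pipeline might instead split on headings, paragraphs, or token counts so that chunks line up with the structure of the documentation.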