RAG
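Retrieval-Augmented Generation
RAG mainly addresses two limitations that large language models (LLMs) run into in practical use: hallucination and stale training data (the knowledge cutoff). RAG combines information retrieval with generation: before any text is generated, relevant material is retrieved from a database and placed into the context, so that the LLM can base its output on correct, up-to-date information.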
RAG advantages:
- Reduces AI hallucinations
- Improves data security
- Reduces the need for model fine-tuning
- Mitigates the knowledge cutoff
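The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: `DOCUMENTS`, `score`, `retrieve`, and `build_prompt` are hypothetical names, the "database" is an in-memory list, and relevance is a toy word-overlap count standing in for embedding similarity. The final call to an LLM is left out; the sketch stops at the augmented prompt that would be sent to it.

```python
from collections import Counter

# Hypothetical in-memory "database"; a real setup would use a vector database.
DOCUMENTS = [
    "Qdrant is an open-source vector search engine.",
    "Weaviate is an open source, AI-native vector database.",
    "PrivateGPT lets you chat with your documents locally.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def score(query, doc):
    """Toy relevance: number of tokens shared by query and document."""
    shared = Counter(tokenize(query)) & Counter(tokenize(doc))
    return sum(shared.values())


def retrieve(query, docs, k=2):
    """Retrieval step: return the k documents with the highest score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]


def build_prompt(query, docs, k=2):
    """Augmentation step: put the retrieved documents into the context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# The resulting prompt is what would be passed to the LLM for generation.
prompt = build_prompt("What is Qdrant?", DOCUMENTS, k=1)
print(prompt)
```

Because the retrieved text is supplied at query time, updating `DOCUMENTS` changes what the model sees without retraining it, which is where the fine-tuning and knowledge-cutoff advantages in the list come from.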
Tutorials
GitHub Projects
Embedding Models
Vector Databases
- Qdrant - an open-source vector search engine designed for high-dimensional data. Includes a GUI management interface.
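To make "vector search" concrete, here is a rough sketch of the core operation such an engine performs: ranking stored vectors by similarity to a query vector. It is not the Qdrant API; `INDEX`, `cosine_similarity`, and `search` are hypothetical names, the three-dimensional vectors are made up for illustration, and the brute-force scan stands in for the approximate nearest-neighbor indexes that real engines use on high-dimensional data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(index, query_vector, limit=2):
    """Brute-force scan: score every stored vector, keep the best `limit`."""
    scored = [
        (cosine_similarity(vector, query_vector), doc_id)
        for doc_id, vector in index.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, similarity) for similarity, doc_id in scored[:limit]]


# Hypothetical index mapping document ids to made-up embedding vectors.
INDEX = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}

results = search(INDEX, [1.0, 0.1, 0.0], limit=2)
for doc_id, similarity in results:
    print(doc_id, round(similarity, 3))
```

In a RAG setup the vectors would come from an embedding model and the returned ids would be mapped back to text chunks for the prompt.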
Verba
Verba is a fully-customizable personal assistant for querying and interacting with your data, either locally or deployed via cloud. Resolve questions around your documents, cross-reference multiple data points or gain insights from existing knowledge bases. Verba combines state-of-the-art RAG techniques with Weaviate's context-aware database. Choose between different RAG frameworks, data types, chunking & retrieving techniques, and LLM providers based on your individual use-case.
- GitHub: Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
- Weaviate is an open source, AI-native vector database
- Video: Open Source RAG with Ollama - YouTube
PrivateGPT
- Introduction – PrivateGPT | Docs
- GitHub: https://github.com/zylon-ai/private-gpt
- Video: PrivateGPT 2.0 - FULLY LOCAL Chat With Docs (PDF, TXT, HTML, PPTX, DOCX, and more) - YouTube
- Video: Installing Private GPT to interact with your own documents!! - YouTube
LLMWare
The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models.
talkd/dialog
Talkd.ai—Optimizing LLMs with easy RAG deployment and management.