RAG

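Retrieval Augmented Generation
 RAG mainly addresses two limitations that large language models (LLMs) face in practical use: hallucination and stale knowledge (the training-data cutoff). RAG combines information retrieval with generation: before generating text, it retrieves relevant material from a data store and places it in the context, so that the LLM can ground its output in accurate, up-to-date information.
 Advantages of RAG:
 
 Reduces AI hallucinations
 Improves data security
 Reduces the need for model fine-tuning
 Improves data freshness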
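
 The retrieve-then-generate flow can be sketched roughly as follows. This is a minimal illustration, not a production design: it assumes a toy bag-of-words cosine similarity in place of a real embedding model and vector database, and it stops at building the prompt instead of calling an LLM. The names `retrieve` and `build_prompt` are hypothetical helpers introduced only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(query, contexts):
    """Place the retrieved passages in the context ahead of the question."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
    )


documents = [
    "Qdrant is an open source vector search engine.",
    "Chroma is a vector database that can run in Docker.",
    "Gradio builds simple web interfaces for models.",
]
query = "Which vector search engine is open source?"
contexts = retrieve(query, documents, top_k=1)
prompt = build_prompt(query, contexts)
print(prompt)  # this string would be sent to the LLM of your choice
```

 In a real pipeline the similarity function would typically be replaced by an embedding model plus a vector database, and the prompt would be passed to an LLM.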
 
 Flow diagram
 
 Introduction 
 
 Introduction to Retrieval Augmented Generation (RAG) | Weaviate 
 
 Tutorials 
 Introduction to RAG 
 
 ollama + Langchain + Gradio RAG code example
 A flexible Q&A-chat-app for your selection of documents with langchain, Streamlit and chatGPT | by syrom | Medium 
 【圖解】4步驟教人資打造AI法律顧問！讓你的ChatGPT不再胡說八道|數位時代 BusinessNext (bnext.com.tw) 
 創建本地PDF Chatbot with Llama3 & RAG技術 #chatbot #chatgpt #llama3 #rag #chatpdf - YouTube 
 Some code examples: https://github.com/Shubhamsaboo/awesome-llm-apps
 Easy AI/Chat For Your Docs with Langchain and OpenAI in Python 
 RAG共学一：16个问题帮你快速入门RAG 
 YT: RAG共学一：16个问题帮你快速入门RAG - YouTube 
 全端 LLM 應用開發-Day26-用 Langchain 來做 PDF 文件問答 - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天 (ithome.com.tw) 
 RAG實作教學，LangChain + Llama2 |創造你的個人LLM | by ChiChieh Huang | Medium 
 Python RAG Tutorial (with Local LLMs): AI For Your PDFs - YouTube 
 Retrieval over vectorized text, tables, and images from PDFs
 
 Embedding/Rerank Models 
 
 Embedding model leaderboard
 Chinese
 
 BCEmbedding 
 
 HuggingFace: https://huggingface.co/maidalun1020/bce-embedding-base_v1   
 
 
 BAAI 
 GTE 
 
 
 API Service
 
 Cohere (Rerank) 
 
 
 
 Vector Databases 
 
 Qdrant - An open-source vector search engine designed for high-dimensional data. It includes a GUI management interface.
 
 什麼是Qdrant？理解這個向量搜索引擎的最終指南 – AI StartUps Product Information, Reviews, Latest Updates 
 
 
 Chroma 
 
 Doc: 🔑 Getting Started | Chroma Docs 
 Chroma向量数据库完全手册. 这里算是做一个汇总，以及对它的细节做补充。 | by Lemooljiang | Medium 
 Chroma with Docker 
 
 
 VectorAdmin - A management interface for vector databases (embedding models: OpenAI only)
 
 GitHub: https://github.com/Mintplex-Labs/vector-admin 
 YT: VectorAdmin | The universal GUI for vector databases - YouTube   
 VectorAdmin in Docker 
 
 
 Pinecone (Cloud)
 
 Introducing Pinecone Inference to streamline your AI workflow | Pinecone 
 multilingual-e5-large - Pinecone Docs 
 
 
 Supabase (Cloud) 
 Astra DB (Cloud)
 
 Doc: Quickstart | Astra DB Serverless | DataStax Docs 
 
 
 FAISS
 
 Vector Search Using Ollama for Retrieval-Augmented Generation (RAG) - PyImageSearch 
 
 
 
 Advanced RAG 
 
 RAG 優化技巧| 7 大挑戰與解決方式 | 增進你的 LLM. 儘管 LLM + RAG 的能力已經令人驚嘆，但我們在使用 RAG 優化… | by ChiChieh Huang | Medium 
 Advanced RAG: MultiQuery and ParentDocument | RAGStack | DataStax Docs 
 Advanced Retrieval With LangChain (ipynb) 
 Advanced RAG Implementation using Hybrid Search, Reranking with Zephyr Alpha LLM | by Nadika Poudel | Medium   
 Advanced RAG: Query Expansion   
 Cohere Cookbooks  
 RAG Techniques: Part 1 of 5— Implementing 5 Effective Methods 
 
 Standard RAG
 Corrective RAG
 Speculative RAG
 Fusion RAG
 Agentic RAG
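
 As an illustration of the fusion idea, the sketch below merges several ranked result lists with reciprocal rank fusion (RRF). This is an assumption made for the example: Fusion RAG is commonly described as retrieving for several query variants and merging the rankings, and RRF is one widely used merge rule, but the resources listed above are not confirmed to use it. The document IDs and the constant `k=60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for three rewrites of the same user question.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

 Documents that rank well across several lists rise to the top, even if no single list ranks them first.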
 
 
 
 ReRank 
 
 RAG 重排序算法（ReRank）的關鍵作用與優化指南 | DataAgent 
 
 
 Chunking/Splitting 
 
 Mastering RAG: Advanced Chunking Techniques for LLM Applications - Galileo (rungalileo.io) 
 5 Levels Of Text Splitting (ipynb) 
 Five Levels of Chunking Strategies in RAG| Notes from Greg’s Video | by Anurag Mishra | Medium   
 [Chinese] Semantic Chunking
 使用繁體中文評測 RAG 的 Chunking 切塊策略 
 Chunking Evaluation 
 Online Tools
 
 Online Text Splitter 
 ChunkViz 
 
 
 15 Chunking Techniques  to Build Exceptional RAGs Systems 
 chonkie - The no-nonsense RAG chunking library 
 Chunkr - Vision infrastructure to turn complex documents into RAG/LLM-ready data 
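
 The simplest strategy covered by the resources above is fixed-size chunking with overlap. The sketch below is a minimal character-based version; the function name `chunk_text` and the default sizes are assumptions chosen for illustration, and real splitters usually also respect sentence, paragraph, or semantic boundaries.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = chunk_text(sample, chunk_size=12, overlap=4)
for c in chunks:
    print(repr(c))
```

 The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.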
 
 RAG Projects 
 
 Dot   
 ragapp   
 RAGFlow 

 
 YT: RAGFlow：知识库终极引擎 - YouTube 
 
 
 R2R 
 Easy-RAG 
 Langchain-Chatchat 
 kotaemon (For end users and developers) 
 Agentic RAG for Dummies - Agentic RAG that is easy to learn from and can be customized for production use.
 
 Danswer 
 Danswer is the AI Assistant connected to your company's docs, apps, and people. Danswer provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. 
 
 GitHub: https://github.com/danswer-ai/danswer   
 Doc: https://docs.danswer.dev/introduction   
 
 Embedchain 
 Embedchain streamlines the creation of personalized LLM applications, offering a seamless process for managing various types of unstructured data. 
 
 Doc: ⚡ Quickstart - Embedchain 
 GitHub: https://github.com/embedchain/embedchain   
 
 GraphRAG 
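 An open-source solution from Microsoft for graph-based retrieval and reasoning augmentation. GraphRAG takes knowledge-graph retrieval and reasoning into account throughout pre-retrieval, post-retrieval, and prompt compression, which gives answer generation a more precise and relevant basis.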
 
 Get Started 
 GitHub: https://github.com/microsoft/graphrag   
 YT: Microsoft GraphRAG | 基于知识图谱的RAG套件，构建更完善的知识库 - YouTube 
 GitHub: GraphRAG Local with Ollama and Gradio UI 
 YT: 颠覆传统RAG！GraphRAG结合本地大模型：Gemma 2+Nomic Embed齐上阵，轻松掌握GraphRAG+Chainlit+Ollama技术栈 #graphrag #ollama #ai - YouTube 
 GitHub: GraphRAG + AutoGen + Ollama + Chainlit UI = Local Multi-Agent RAG Superbot 
 
 neo4j 
 
 Doc: GenAI Ecosystem - Neo4j Labs 
 Chinese: 生成式 AI 的資料救星！GraphRAG 知識圖譜革命，大幅提升 LLM 準確度！ | T客邦 (techbang.com)
 NeoConverse - Graph Database Search with Natural Language - Neo4j Labs 
 LangChain: Enhancing RAG-based application accuracy by constructing and leveraging knowledge graphs (langchain.dev) 
 Build a Question Answering application over a Graph Database | 🦜️🔗 LangChain 
 LangChain: https://neo4j.com/labs/genai-ecosystem/langchain/   
 https://github.com/neo4j-labs/llm-graph-builder   
 ipynb: https://github.com/tomasonjo/blogs/blob/master/llm/enhancing_rag_with_graph.ipynb   
 
 Verba 
 Verba is a fully-customizable personal assistant for querying and interacting with your data, either locally or deployed via cloud. Resolve questions around your documents, cross-reference multiple data points or gain insights from existing knowledge bases. Verba combines state-of-the-art RAG techniques with Weaviate's context-aware database. Choose between different RAG frameworks, data types, chunking & retrieving techniques, and LLM providers based on your individual use-case. 
 
 GitHub: Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
 Weaviate is an open source, AI-native vector database

 
 Doc: Quickstart Tutorial | Weaviate - Vector Database 
 
 
 Video: Open Source RAG with Ollama - YouTube 
 
 PrivateGPT 
 
 Introduction – PrivateGPT | Docs 
 GitHub: https://github.com/zylon-ai/private-gpt 
 Video: PrivateGPT 2.0 - FULLY LOCAL Chat With Docs (PDF, TXT, HTML, PPTX, DOCX, and more) - YouTube 
 Video: Installing Private GPT to interact with your own documents!! - YouTube 
 
 LLMWare 
 The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models. 
 
 Home llmware | llmware (llmware-ai.github.io) 
 GitHub: https://github.com/llmware-ai/llmware   
 
 talkd/dialog 
 Talkd.ai—Optimizing LLMs with easy RAG deployment and management. 
 
 talkd/dialog | dialog 
 GitHub: https://github.com/talkdai/dialog   
 
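 RAG Evaluation
 Generation metrics
 
 Faithfulness: A key metric for the factual reliability of a RAG answer. It measures how consistent the generated answer is with the facts in the given context. A highly faithful answer shows that the model extracted information accurately from the provided context and stayed consistent with it, which is essential to the quality and trustworthiness of the output.
 Answer Relevancy: Measures how well the generated answer matches the user's question. A highly relevant answer requires the model both to understand the question and to respond to it directly. This directly affects user satisfaction and the practical usefulness of the model.
 Answer Correctness: Measures how consistent the generated answer is with a known ground-truth answer. One way to compute it is to compare the text similarity between the generated answer and the ground-truth answer. The approach is similar to that of answer relevancy, but the emphasis is on the accuracy of the answer.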
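
 Two of the generation metrics above can be sketched in a few lines. This is a simplified illustration under stated assumptions: faithfulness is taken as the fraction of answer claims supported by the context, with a naive token-containment check (`naive_is_supported`, a hypothetical helper) standing in for the LLM or NLI judge a real evaluator would use, and answer correctness is approximated by token-level F1 as one possible text-similarity measure.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_is_supported(claim, context):
    """Toy judge: a claim counts as supported if all its tokens appear in the context."""
    context_tokens = set(tokenize(context))
    return all(t in context_tokens for t in tokenize(claim))


def faithfulness(claims, context, is_supported):
    """Fraction of the answer's claims that the context supports."""
    if not claims:
        return 0.0
    supported = sum(1 for c in claims if is_supported(c, context))
    return supported / len(claims)


def token_f1(answer, ground_truth):
    """Token-overlap F1 between the answer and the ground-truth answer."""
    a, g = Counter(tokenize(answer)), Counter(tokenize(ground_truth))
    overlap = sum((a & g).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(a.values())
    recall = overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)


context = "Qdrant is an open source vector search engine with a GUI."
claims = ["Qdrant is open source", "Qdrant is a cloud service"]
score = faithfulness(claims, context, naive_is_supported)
f1 = token_f1("Qdrant is open source", "Qdrant is an open source engine")
print(score, round(f1, 3))
```

 Splitting an answer into claims and judging each one is the hard part in practice; here the claims are supplied by hand.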
 
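 Retrieval metrics
 
 Context Recall: Focuses on whether the retrieval step accurately finds the context relevant to the question. A high-recall system surfaces the most relevant information from a large body of data, which is key to the accuracy and efficiency of a question-answering system.
 Context Precision: Measures how relevant the context used by the RAG system is to the question being answered. It is computed by determining how relevant the selected context is to that specific question. In practice this usually involves comparing the selected context against a predefined set of relevant contexts and weighing how important each one is to generating the answer.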
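
 When a predefined set of relevant contexts is available, both retrieval metrics reduce to set comparisons. The sketch below assumes each context has an ID and uses the plain set-based definitions; evaluation libraries may define these metrics differently (for example with rank weighting or an LLM judge), so treat this as an illustration only. The context IDs are made up.

```python
def context_precision(retrieved_ids, relevant_ids):
    """Share of the retrieved contexts that are actually relevant."""
    if not retrieved_ids:
        return 0.0
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in retrieved_ids if doc_id in relevant)
    return hits / len(retrieved_ids)


def context_recall(retrieved_ids, relevant_ids):
    """Share of the relevant contexts that the retriever found."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    found = relevant & set(retrieved_ids)
    return len(found) / len(relevant)


retrieved = ["c1", "c4", "c2", "c9"]  # what the retriever returned
relevant = ["c1", "c2", "c3"]         # the predefined relevant set
precision = context_precision(retrieved, relevant)
recall = context_recall(retrieved, relevant)
print(precision, round(recall, 3))
```

 Retrieving more contexts tends to raise recall and lower precision, so the two are usually reported together.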
 
 URLs 
 
 Ragas - 🚀 Get Started | Ragas 
 LLM Hallucination Index RAG Special - Galileo - Galileo (rungalileo.io)