RAG

Retrieval Augmented Generation

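RAG mainly addresses two major limitations of large language models (LLMs) in practical use: hallucination and stale knowledge (the training data cutoff). RAG combines information retrieval with generation: before any text is generated, relevant data is retrieved from a database and placed into the context, so that the LLM can base its output on accurate, up-to-date information.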
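The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `DOCUMENTS` corpus, the bag-of-words cosine scoring, the prompt wording, and the `llm` callable are all assumptions made for the example. A real system would typically use embeddings, a vector database, and an actual model client.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would query a vector database.
DOCUMENTS = [
    "Weaviate is an open source vector database.",
    "RAG retrieves relevant documents before the model generates an answer.",
    "Fine-tuning updates model weights on new training data.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Retrieval step: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, contexts):
    """Place the retrieved passages into the context ahead of the question."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )

def answer(query, documents, llm, k=2):
    """Generation step: `llm` is any callable mapping a prompt to text (assumed)."""
    return llm(build_prompt(query, retrieve(query, documents, k)))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM provider; for a dry run, `llm=print` simply shows the assembled prompt.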

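RAG advantages:

Reduces AI hallucination
Improves data security
Reduces the need for model fine-tuning
Mitigates the knowledge-cutoff limitation

Flow diagram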

Introduction

Introduction to Retrieval Augmented Generation (RAG) | Weaviate

Tutorials

Introduction to RAG

Advanced RAG: MultiQuery and ParentDocument | RAGStack | DataStax Docs
Advanced Retrieval With LangChain (ipynb)
Advanced RAG Implementation using Hybrid Search, Reranking with Zephyr Alpha LLM | by Nadika Poudel | Medium
Advanced RAG: Query Expansion
Cohere Cookbooks
RAG Techniques: Part 1 of 5 - Implementing 5 Effective Methods

Standard RAG
Corrective RAG
Speculative RAG
Fusion RAG
Agentic RAG

ReRank

The Key Role of RAG Re-ranking Algorithms (ReRank) and an Optimization Guide | DataAgent (original title: RAG 重排序算法（ReRank）的關鍵作用與優化指南)

Chunking/Splitting

Five Levels of Chunking Strategies in RAG | Notes from Greg's Video | by Anurag Mishra | Medium
[Chinese] Semantic Chunking
Evaluating RAG Chunking Strategies with Traditional Chinese (original title: 使用繁體中文評測 RAG 的 Chunking 切塊策略)
Chunking Evaluation Online Tools

15 Chunking Techniques to Build Exceptional RAGs Systems
chonkie - The no-nonsense RAG chunking library
Chunkr - Vision infrastructure to turn complex documents into RAG/LLM-ready data
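As a baseline for the chunking strategies covered by the links above, here is a minimal sketch of fixed-size chunking with overlap, usually the simplest strategy to start from. The function name `chunk_text` and the default sizes are assumptions for illustration; none of the linked tools is implied to work this way.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary present in both neighbouring chunks, at the cost of some duplicated text in the index. Semantic or structure-aware chunkers replace the fixed window with boundaries derived from the content.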

RAG Projects

Danswer

Danswer is the AI Assistant connected to your company's docs, apps, and people. Danswer provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud.

GitHub: https://github.com/danswer-ai/danswer
Doc: https://docs.danswer.dev/introduction

Embedchain

Embedchain streamlines the creation of personalized LLM applications, offering a seamless process for managing various types of unstructured data.

Doc: ⚡ Quickstart - Embedchain
GitHub: https://github.com/embedchain/embedchain

GraphRAG

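An open-source solution from Microsoft for graph-based retrieval and reasoning augmentation. GraphRAG takes knowledge-graph retrieval and reasoning into account throughout the process, from pre-retrieval and post-retrieval to prompt compression, giving a more precise and relevant approach to answer generation.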

neo4j

Verba

Verba is a fully-customizable personal assistant for querying and interacting with your data, either locally or deployed via cloud. Resolve questions around your documents, cross-reference multiple data points or gain insights from existing knowledge bases. Verba combines state-of-the-art RAG techniques with Weaviate's context-aware database. Choose between different RAG frameworks, data types, chunking & retrieving techniques, and LLM providers based on your individual use-case.

GitHub: Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
Weaviate is an open source, AI-native vector database
- Doc: Quickstart Tutorial | Weaviate - Vector Database
Video: Open Source RAG with Ollama - YouTube

PrivateGPT

LLMWare

The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models.

talkd/dialog

Talkd.ai—Optimizing LLMs with easy RAG deployment and management.

talkd/dialog | dialog
GitHub: https://github.com/talkdai/dialog

RAG Evaluation

Generation Metrics

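Faithfulness
Faithfulness is a key metric for how truthful and reliable a RAG model's generated answers are. It measures the consistency between the generated answer and the facts in the given context. A highly faithful answer means the model extracted information accurately from the given context and produced a response consistent with those facts. This is essential to the quality and trustworthiness of the generated content.
Answer Relevancy
Answer relevancy measures how well the generated answer matches the user's question. A highly relevant answer requires the model not only to understand the question but also to produce a response closely tied to it. This directly affects user satisfaction and the practical usefulness of the model.
Answer Correctness
Answer correctness measures the agreement between the generated answer and a known ground-truth answer. In practice it can be computed by comparing the text similarity between the generated answer and the ground-truth answer. This resembles answer relevancy, but the emphasis is on the accuracy of the answer rather than on how well it fits the question.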
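Two of the generation metrics above can be sketched as follows. Both functions are simplified assumptions rather than the definitions used by any particular evaluation tool: `token_f1` is one possible text-similarity measure for answer correctness (embedding similarity or an LLM judge are common alternatives), and `faithfulness` assumes the answer has already been split into claims and that a hypothetical `is_supported` judge is supplied by the caller.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def token_f1(generated, reference):
    """Answer correctness proxy: token-level F1 against the ground-truth answer."""
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(reference))
    overlap = sum((gen & ref).values())  # tokens shared by both, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def faithfulness(claims, context, is_supported):
    """Fraction of answer claims that the judge finds supported by the context.

    `is_supported(claim, context) -> bool` is a caller-supplied judge
    (hypothetical here; in practice often an LLM or an NLI model).
    """
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if is_supported(claim, context))
    return supported / len(claims)
```

Token overlap rewards shared wording, not shared meaning, so a correct paraphrase can score low. That limitation is the usual reason for moving to semantic similarity or a model-based judge.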

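Retrieval Metrics

Context Recall
Context recall concerns whether the retrieval step finds the context that is relevant to the question. A system with high recall can pick the most relevant information out of a large volume of data, which is key to improving the accuracy and efficiency of a question-answering system.
Context Precision
Context precision measures how relevant the context used by the RAG system is to the question being answered. It is computed by determining how relevant the context selected for a specific question is to that question. In practice this usually involves comparing the context selected by the RAG system against a predefined set of relevant contexts and weighing how much those contexts contribute to the generated answer.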
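With a predefined set of relevant contexts for each question, both retrieval metrics reduce to simple set arithmetic. The sketch below is a set-based simplification that assumes chunks are identified by hashable IDs. Evaluation frameworks often use rank-aware or claim-based variants instead, so the numbers are not directly comparable with theirs.

```python
def context_precision(retrieved, relevant):
    """Share of retrieved chunks that appear in the labelled relevant set."""
    if not retrieved:
        return 0.0
    relevant_set = set(relevant)
    hits = sum(1 for chunk_id in retrieved if chunk_id in relevant_set)
    return hits / len(retrieved)

def context_recall(retrieved, relevant):
    """Share of labelled relevant chunks that the retriever actually returned."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(relevant_set & set(retrieved)) / len(relevant_set)

# Hypothetical chunk IDs for one question.
retrieved_ids = ["c1", "c2", "c5"]
relevant_ids = ["c1", "c2", "c3", "c4"]
precision = context_precision(retrieved_ids, relevant_ids)  # 2 of 3 retrieved are relevant
recall = context_recall(retrieved_ids, relevant_ids)        # 2 of 4 relevant were retrieved
```

The two metrics pull in opposite directions: retrieving more chunks tends to raise recall and lower precision, so they are normally reported together.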