# RAG

Retrieval-Augmented Generation

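RAG mainly addresses two limitations of large language models (LLMs) in practical use: hallucination and the knowledge cutoff of the training data. RAG combines information retrieval with generation: before any text is generated, relevant material is retrieved from a data store and placed in the context, so that the LLM can base its output on accurate, up-to-date information.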

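Advantages of RAG:

- Reduces AI hallucinations
- Improves data security
- Reduces the need for model fine-tuning
- Mitigates the knowledge-cutoff problem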

Flow diagram

[![rag_flow.png](https://osslab.tw/uploads/images/gallery/2024-08/scaled-1680-/rag-flow.png)](https://osslab.tw/uploads/images/gallery/2024-08/rag-flow.png)
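
The retrieve-then-generate flow can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the retriever is a toy bag-of-words cosine similarity rather than an embedding model with a vector database, and the names `DOCS`, `retrieve`, `build_prompt`, and `rag_prompt` are hypothetical helpers, not part of any library. The final call to an LLM is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "knowledge base"; a real system would use a vector database.
DOCS = [
    "Qdrant is an open-source vector search engine.",
    "Chroma is a vector database with a Python client.",
    "Gradio builds simple web interfaces for demos.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Retrieval step: return up to top_k documents that share tokens with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(query, contexts):
    """Augmentation step: place the retrieved passages in the context of the prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
    )


def rag_prompt(query, documents=DOCS, top_k=2):
    """Retrieve, then build the prompt that would be passed to an LLM for generation."""
    return build_prompt(query, retrieve(query, documents, top_k))
```

Calling `rag_prompt("Which vector search engine is open-source?")` returns a prompt whose context contains the Qdrant and Chroma passages but not the unrelated Gradio one. In a real pipeline, the token counts would be replaced by embeddings and the list scan by a vector database query.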

#### Introduction

- [Introduction to Retrieval Augmented Generation (RAG) | Weaviate](https://weaviate.io/blog/introduction-to-rag)

#### Tutorials

Introduction to RAG

- [ollama + Langchain + Gradio RAG 程式碼範例](https://www.youtube.com/watch?v=HtqmEREAPC0)
- [A flexible Q&A-chat-app for your selection of documents with langchain, Streamlit and chatGPT | by syrom | Medium](https://medium.com/@syrom_85473/a-flexible-q-a-chat-app-for-your-selection-of-documents-with-langchain-streamlit-and-chatgpt-8205c403a818)
- [【圖解】4步驟教人資打造AI法律顧問！讓你的ChatGPT不再胡說八道|數位時代 BusinessNext (bnext.com.tw)](https://www.bnext.com.tw/article/79136/chatgpt-hr-law)
- [創建本地PDF Chatbot with Llama3 & RAG技術 #chatbot #chatgpt #llama3 #rag #chatpdf - YouTube](https://www.youtube.com/watch?v=d11L0JynGq4&t=853s)
- Code examples: [https://github.com/Shubhamsaboo/awesome-llm-apps](https://github.com/Shubhamsaboo/awesome-llm-apps)
- [Easy AI/Chat For Your Docs with Langchain and OpenAI in Python](https://morioh.com/a/bb6f97863522/easy-aichat-for-your-docs-with-langchain-and-openai-in-python)
- [RAG共学一：16个问题帮你快速入门RAG](https://techdiylife.github.io/blog/blog.html?category1=c02&blogid=0050)
- YT: [RAG共学一：16个问题帮你快速入门RAG - YouTube](https://www.youtube.com/watch?v=MJ3I7dgyF04)
- [全端 LLM 應用開發-Day26-用 Langchain 來做 PDF 文件問答 - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天 (ithome.com.tw)](https://ithelp.ithome.com.tw/articles/10338349)
- [RAG實作教學，LangChain + Llama2 |創造你的個人LLM | by ChiChieh Huang | Medium](https://medium.com/@cch.chichieh/rag%E5%AF%A6%E4%BD%9C%E6%95%99%E5%AD%B8-langchain-llama2-%E5%89%B5%E9%80%A0%E4%BD%A0%E7%9A%84%E5%80%8B%E4%BA%BAllm-d6838febf8c4)
- [Python RAG Tutorial (with Local LLMs): AI For Your PDFs - YouTube](https://www.youtube.com/watch?v=2TJxpyO3ei4)
- [對 PDF 的文字、表格與圖片向量化進行檢索](https://edge.aif.tw/application-langchain-rag/)

##### Embedding/Rerank Models

- [Embedding model leaderboard (MTEB)](https://huggingface.co/spaces/mteb/leaderboard)
- Chinese
    - [BCEmbedding](https://github.com/netease-youdao/BCEmbedding)
        - HuggingFace: [https://huggingface.co/maidalun1020/bce-embedding-base\_v1](https://huggingface.co/maidalun1020/bce-embedding-base_v1)
    - [BAAI](https://huggingface.co/BAAI)
    - [GTE](https://huggingface.co/thenlper)
- API Service 
    - [Cohere](https://cohere.com/rerank) (Rerank)

##### Vector Databases

- [Qdrant](https://qdrant.tech/) - an open-source vector search engine designed for high-dimensional data; includes a GUI management interface.
    - [什麼是Qdrant？理解這個向量搜索引擎的最終指南 – AI StartUps Product Information, Reviews, Latest Updates](https://cheatsheet.md/zh/vector-database/what-is-qdrant.zh)
- [Chroma](https://www.trychroma.com/)
    - Doc: [🔑 Getting Started | Chroma Docs](https://docs.trychroma.com/getting-started)
    - [Chroma向量数据库完全手册. 这里算是做一个汇总，以及对它的细节做补充。 | by Lemooljiang | Medium](https://medium.com/@lemooljiang/chroma%E5%90%91%E9%87%8F%E6%95%B0%E6%8D%AE%E5%BA%93%E5%AE%8C%E5%85%A8%E6%89%8B%E5%86%8C-4248b15679ea)
    - [Chroma with Docker](https://github.com/chroma-core/chroma/blob/main/docker-compose.yml)
- [VectorAdmin](https://vectoradmin.com/) - a management UI for vector databases (only OpenAI embedding models are supported)
    - GitHub: [https://github.com/Mintplex-Labs/vector-admin](https://github.com/Mintplex-Labs/vector-admin)
    - YT: [VectorAdmin | The universal GUI for vector databases - YouTube](https://www.youtube.com/watch?v=cW8Eohz6pzs)
    - [VectorAdmin in Docker](https://github.com/Mintplex-Labs/vector-admin/blob/master/docker/DOCKER.md)
- [Pinecone](https://www.pinecone.io/) (Cloud)
    - [Introducing Pinecone Inference to streamline your AI workflow | Pinecone](https://www.pinecone.io/blog/pinecone-inference/)
    - [multilingual-e5-large - Pinecone Docs](https://docs.pinecone.io/models/multilingual-e5-large)
- [Supabase](https://supabase.com/) (Cloud)
- [Astra DB](https://www.datastax.com/products/datastax-astra) (Cloud) 
    - Doc: [Quickstart | Astra DB Serverless | DataStax Docs](https://docs.datastax.com/en/astra-db-serverless/get-started/quickstart.html)
- FAISS 
    - [Vector Search Using Ollama for Retrieval-Augmented Generation (RAG) - PyImageSearch](https://pyimagesearch.com/2026/02/23/vector-search-using-ollama-for-retrieval-augmented-generation-rag/)

#### Advanced RAG

- [RAG 優化技巧| 7 大挑戰與解決方式 | 增進你的 LLM. 儘管 LLM + RAG 的能力已經令人驚嘆，但我們在使用 RAG 優化… | by ChiChieh Huang | Medium](https://medium.com/@cch.chichieh/rag-%E5%84%AA%E5%8C%96%E6%8A%80%E5%B7%A7-7-%E5%A4%A7%E6%8C%91%E6%88%B0%E8%88%87%E8%A7%A3%E6%B1%BA%E6%96%B9%E5%BC%8F-%E5%A2%9E%E9%80%B2%E4%BD%A0%E7%9A%84-llm-0e4ac8adc6df)
- [Advanced RAG: MultiQuery and ParentDocument | RAGStack | DataStax Docs](https://docs.datastax.com/en/ragstack/examples/advanced-rag.html)
- [Advanced Retrieval With LangChain](https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Advanced%20Retrieval%20With%20LangChain.ipynb) (ipynb)
- [Advanced RAG Implementation using Hybrid Search, Reranking with Zephyr Alpha LLM | by Nadika Poudel | Medium](https://medium.com/@nadikapoudel16/advanced-rag-implementation-using-hybrid-search-reranking-with-zephyr-alpha-llm-4340b55fef22)
- [Advanced RAG: Query Expansion](https://haystack.deepset.ai/blog/query-expansion)
- [Cohere Cookbooks](https://docs.cohere.com/page/cookbooks#rag)
- [RAG Techniques: Part 1 of 5— Implementing 5 Effective Methods](https://medium.com/ai-in-plain-english/rag-techniques-part-1-of-5-implementing-5-effective-methods-a92c58399875)
    1. Standard RAG
    2. Corrective RAG
    3. Speculative RAG
    4. Fusion RAG
    5. Agentic RAG

##### ReRank

- [RAG 重排序算法（ReRank）的關鍵作用與優化指南 | DataAgent](https://idataagent.com/2024/05/23/the-key-role-and-optimization-guide-of-rag-reranking-algorithm-rerank/)


##### Chunking/Splitting

- [Mastering RAG: Advanced Chunking Techniques for LLM Applications - Galileo (rungalileo.io)](https://www.rungalileo.io/blog/mastering-rag-advanced-chunking-techniques-for-llm-applications)
- [5 Levels Of Text Splitting](https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb) (ipynb)
- [Five Levels of Chunking Strategies in RAG| Notes from Greg’s Video | by Anurag Mishra | Medium](https://medium.com/@anuragmishra_27746/five-levels-of-chunking-strategies-in-rag-notes-from-gregs-video-7b735895694d)
- \[Chinese\] [Semantic Chunking](https://www.cnblogs.com/theseventhson/p/18279980)
- [使用繁體中文評測 RAG 的 Chunking 切塊策略](https://ihower.tw/blog/archives/12373)
- [Chunking Evaluation](https://github.com/brandonstarxel/chunking_evaluation)
- Online Tools 
    - [Online Text Splitter](https://onlinetexttools.com/split-text)
    - [ChunkViz](https://chunkviz.up.railway.app/)
- [15 Chunking Techniques to Build Exceptional RAGs Systems](https://www.analyticsvidhya.com/blog/2024/10/chunking-techniques-to-build-exceptional-rag-systems/)
- [chonkie](https://github.com/bhavnicksm/chonkie) - The no-nonsense RAG chunking library
- [Chunkr](https://github.com/lumina-ai-inc/chunkr) - Vision infrastructure to turn complex documents into RAG/LLM-ready data

#### RAG Projects

- [Dot](https://github.com/alexpinel/Dot)
- [ragapp](https://github.com/ragapp/ragapp)
- [RAGFlow](https://github.com/infiniflow/ragflow)
    - YT: [RAGFlow：知识库终极引擎 - YouTube](https://www.youtube.com/watch?v=9x-9-r2ifig)
- [R2R](https://r2r-docs.sciphi.ai/introduction)
- [Easy-RAG](https://github.com/yuntianhe2014/Easy-RAG)
- [Langchain-Chatchat](https://github.com/chatchat-space/Langchain-Chatchat)
- [kotaemon](https://github.com/Cinnamon/kotaemon) (For end users and developers)
- [Agentic RAG for Dummies](https://github.com/GiovanniPasq/agentic-rag-for-dummies) - agentic RAG; good for learning, and can be customized for production use.

#### Danswer

**[Danswer](https://www.danswer.ai/)** is the AI Assistant connected to your company's docs, apps, and people. Danswer provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud.

- GitHub: [https://github.com/danswer-ai/danswer](https://github.com/danswer-ai/danswer)
- Doc: [https://docs.danswer.dev/introduction](https://docs.danswer.dev/introduction)

#### Embedchain

Embedchain streamlines the creation of personalized LLM applications, offering a seamless process for managing various types of unstructured data.

- Doc: [⚡ Quickstart - Embedchain](https://docs.embedchain.ai/get-started/quickstart)
- GitHub: [https://github.com/embedchain/embedchain](https://github.com/embedchain/embedchain)

#### GraphRAG

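Microsoft has open-sourced a graph-based solution for retrieval and reasoning augmentation. GraphRAG takes knowledge-graph retrieval and reasoning into account across the pre-retrieval, post-retrieval, and prompt-compression stages, offering a more precise and relevant approach to answer generation.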

- [Get Started](https://microsoft.github.io/graphrag/posts/get_started/)
- GitHub: [https://github.com/microsoft/graphrag](https://github.com/microsoft/graphrag)
- YT: [Microsoft GraphRAG | 基于知识图谱的RAG套件，构建更完善的知识库 - YouTube](https://www.youtube.com/watch?v=MRHbQusLgkk)
- GitHub: [GraphRAG Local with Ollama and Gradio UI](https://github.com/severian42/GraphRAG-Local-UI)
- YT: [颠覆传统RAG！GraphRAG结合本地大模型：Gemma 2+Nomic Embed齐上阵，轻松掌握GraphRAG+Chainlit+Ollama技术栈 #graphrag #ollama #ai - YouTube](https://www.youtube.com/watch?v=XiLEZzm7yCk)
- GitHub: [GraphRAG + AutoGen + Ollama + Chainlit UI = Local Multi-Agent RAG Superbot](https://github.com/karthik-codex/Autogen_GraphRAG_Ollama)

[Neo4j](https://neo4j.com/)

- Doc: [GenAI Ecosystem - Neo4j Labs](https://neo4j.com/labs/genai-ecosystem/)
- Chinese: [生成式 AI 的資料救星！GraphRAG 知識圖譜革命，大幅提升 LLM 準確度！ | T客邦 (techbang.com)](https://www.techbang.com/posts/116888-graphraggithub-starai)
- [NeoConverse - Graph Database Search with Natural Language - Neo4j Labs](https://neo4j.com/labs/genai-ecosystem/neoconverse/)
- LangChain: [Enhancing RAG-based application accuracy by constructing and leveraging knowledge graphs (langchain.dev)](https://blog.langchain.dev/enhancing-rag-based-applications-accuracy-by-constructing-and-leveraging-knowledge-graphs/)
- [Build a Question Answering application over a Graph Database | 🦜️🔗 LangChain](https://python.langchain.com/v0.2/docs/tutorials/graph/)
- LangChain: [https://neo4j.com/labs/genai-ecosystem/langchain/](https://neo4j.com/labs/genai-ecosystem/langchain/)
- [https://github.com/neo4j-labs/llm-graph-builder](https://github.com/neo4j-labs/llm-graph-builder)
- ipynb: [https://github.com/tomasonjo/blogs/blob/master/llm/enhancing\_rag\_with\_graph.ipynb](https://github.com/tomasonjo/blogs/blob/master/llm/enhancing_rag_with_graph.ipynb)

#### Verba

Verba is a fully-customizable personal assistant for querying and interacting with your data, either locally or deployed via cloud. Resolve questions around your documents, cross-reference multiple data points or gain insights from existing knowledge bases. Verba combines state-of-the-art RAG techniques with Weaviate's context-aware database. Choose between different RAG frameworks, data types, chunking & retrieving techniques, and LLM providers based on your individual use-case.

- GitHub: [Retrieval Augmented Generation (RAG) chatbot powered by Weaviate](https://github.com/weaviate/verba)
- [Weaviate](https://weaviate.io/) is an open source, AI-native vector database 
    - Doc: [Quickstart Tutorial | Weaviate - Vector Database](https://weaviate.io/developers/weaviate/quickstart)
- Video: [Open Source RAG with Ollama - YouTube](https://www.youtube.com/watch?v=swKKRdLBhas)

#### PrivateGPT

- [Introduction – PrivateGPT | Docs](https://docs.privategpt.dev/overview/welcome/introduction)
- GitHub: [https://github.com/zylon-ai/private-gpt](https://github.com/zylon-ai/private-gpt)
- Video: [PrivateGPT 2.0 - FULLY LOCAL Chat With Docs (PDF, TXT, HTML, PPTX, DOCX, and more) - YouTube](https://www.youtube.com/watch?v=XFiof0V3nhA)
- Video: [Installing Private GPT to interact with your own documents!! - YouTube](https://www.youtube.com/watch?v=rwqPA4qi0H8)

#### LLMWare

The Ultimate Toolkit for Enterprise RAG Pipelines with Small, Specialized Models.

- [Home llmware | llmware (llmware-ai.github.io)](https://llmware-ai.github.io/llmware/)
- GitHub: [https://github.com/llmware-ai/llmware](https://github.com/llmware-ai/llmware)

#### talkd/dialog

Talkd.ai—Optimizing LLMs with easy RAG deployment and management.

- [talkd/dialog | dialog](https://dialog.talkd.ai/)
- GitHub: [https://github.com/talkdai/dialog](https://github.com/talkdai/dialog)

#### RAG Evaluation

Generation metrics

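- Faithfulness  
    Faithfulness is a key metric for assessing how truthful and reliable a RAG system's generated answers are. It measures the factual consistency between the generated answer and the given context. A highly faithful answer means the model extracted information accurately from the context and produced a response consistent with those facts, which is essential to the quality and trustworthiness of the output.
- Answer Relevancy  
    Answer relevancy measures how well the generated answer matches the user's question. A highly relevant answer requires the model both to understand the question and to produce a response that addresses it closely. This directly affects user satisfaction and the practical usefulness of the system.
- Answer Correctness  
    Answer correctness measures how consistent the generated answer is with a known ground-truth answer, that is, how accurate the generated answer is. One way to compute it is to compare the text similarity between the generated answer and the ground truth. The approach resembles answer relevancy but focuses on accuracy.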
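
As a rough illustration of two of the generation metrics above, the sketch below scores answer correctness with a token-overlap F1 (one simple form of text similarity) and faithfulness as the fraction of answer claims supported by the context. Both are assumptions chosen for clarity: evaluation frameworks typically use an LLM or embeddings to judge claims and similarity, and the function names `token_f1` and `faithfulness` are hypothetical, not any library's API.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(generated, ground_truth):
    """Answer correctness proxy: token-overlap F1 between answer and ground truth."""
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(ground_truth))
    overlap = sum((gen & ref).values())  # tokens shared by both, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def faithfulness(claim_supported):
    """Faithfulness: share of the answer's claims that the context supports.

    claim_supported holds one boolean per claim extracted from the answer.
    Producing these verdicts (usually with an LLM judge) is outside this sketch.
    """
    if not claim_supported:
        return 0.0
    return sum(claim_supported) / len(claim_supported)
```

For example, an answer with four claims of which three are supported by the retrieved context gets `faithfulness([True, True, False, True]) == 0.75`. Token overlap ignores word order and paraphrase, so it should be read as a coarse baseline only.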

Retrieval metrics

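- Context Recall  
    Context recall concerns whether the retrieval step accurately finds the context relevant to the question. A system with high recall effectively filters the most relevant information out of a large body of data, which is key to the accuracy and efficiency of a question-answering system.
- Context Precision  
    Context precision measures how relevant the context used by the RAG system is to the question being answered. It is computed by determining the relevance of the contexts the system selected for a given question. In practice, this usually involves comparing the selected contexts against a predefined set of relevant contexts and weighing their importance in generating the answer.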
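
The two retrieval metrics can be sketched as plain set comparisons against a predefined set of relevant contexts. This is a simplified, rank-agnostic version under the assumption that each context has an identifier and that relevance labels already exist; tools such as Ragas define these metrics differently (for instance, with rank-aware or LLM-judged variants). The function names are hypothetical.

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved contexts that are in the predefined relevant set."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


def context_recall(retrieved, relevant):
    """Fraction of the predefined relevant contexts that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)


# Hypothetical example: four contexts retrieved, three labeled relevant.
retrieved_ids = ["d1", "d4", "d2", "d5"]
relevant_ids = {"d1", "d2", "d3"}
precision = context_precision(retrieved_ids, relevant_ids)  # 2 of 4 retrieved are relevant
recall = context_recall(retrieved_ids, relevant_ids)  # 2 of 3 relevant were retrieved
```

Precision penalizes irrelevant passages that crowd the prompt, while recall penalizes relevant passages that were missed, so the two are usually reported together.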

##### URLs

- Ragas - [🚀 Get Started | Ragas](https://docs.ragas.io/en/stable/getstarted/index.html)
- [LLM Hallucination Index RAG Special - Galileo - Galileo (rungalileo.io)](https://www.rungalileo.io/hallucinationindex)