Data Analysis (Chat with CSV)

Chat with Dataset


PandasAI is a Python library that makes it easy to ask questions of your data in natural language. It helps you explore, clean, and analyze your data using generative AI.

Web Scraper

- Crawlee

A web scraping and browser automation library. 

- ScrapeGraphAI

ScrapeGraphAI is an open-source Python web scraping library that uses LLMs and graph logic to build scraping pipelines.

- Crew AI

Crew AI is a framework that lets multiple AI agents work together as a team to accomplish complex tasks efficiently. Each agent has a specific role, much like a team of researchers, writers, and planners.
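The role-based idea can be sketched in plain Python. This is a conceptual illustration only, not Crew AI's API: `Agent`, `run_crew`, and the lambda "agents" are hypothetical names, and each lambda stands in for what would be an LLM call in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    """Hypothetical agent: a role name plus a function that transforms the shared work."""
    role: str
    act: Callable[[str], str]


def run_crew(agents: List[Agent], task: str) -> str:
    """Pass the task through each agent in order; each builds on the previous output."""
    work = task
    for agent in agents:
        work = agent.act(work)
    return work


# Stand-ins for LLM calls: each role simply wraps the text it receives.
crew = [
    Agent("planner", lambda t: f"plan({t})"),
    Agent("researcher", lambda t: f"research({t})"),
    Agent("writer", lambda t: f"draft({t})"),
]

result = run_crew(crew, "web scraping survey")
print(result)
```

A sequential hand-off is the simplest arrangement; a real multi-agent system may also let agents delegate to each other or run in parallel.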



Web UI Development for LLM

- Gradio

Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere!

- Streamlit

Streamlit is the UI powering the LLM movement.
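To illustrate the Gradio entry above, here is a minimal sketch of wrapping a function in a web form. The `greet` function is a hypothetical stand-in for a model call, and `build_demo` is an illustrative helper name; the sketch assumes Gradio is installed (`pip install gradio`) when `build_demo` is called.

```python
def greet(name: str) -> str:
    """Stand-in for a model call: returns a greeting for the given name."""
    return f"Hello, {name}!"


def build_demo():
    """Wrap greet() in a text-in, text-out web form."""
    # Imported here so greet() stays usable even without Gradio installed.
    import gradio as gr

    return gr.Interface(fn=greet, inputs="text", outputs="text")


# To serve the form locally, call:
# build_demo().launch()
```

Swapping `greet` for a function that calls a model is usually the only change needed to turn this into a model demo.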

AI Memory

Structured outputs powered by LLMs. Designed for simplicity, transparency, and control.
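The general idea behind structured outputs can be sketched with the standard library alone: treat the model's reply as JSON and validate it against a typed schema before using it. This is an assumption-laden illustration, not the API of any particular library; `Person`, `parse_person`, and the sample reply are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical target schema for the model's reply."""
    name: str
    age: int


def parse_person(raw: str) -> Person:
    """Parse a JSON reply and check field types; raise ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name, age = data.get("name"), data.get("age")
    if not isinstance(name, str):
        raise ValueError("field 'name' must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("field 'age' must be an integer")
    return Person(name=name, age=age)


# A hard-coded string standing in for an LLM reply.
reply = '{"name": "Ada", "age": 36}'
person = parse_person(reply)
print(person)
```

In practice, a failed validation is often fed back to the model as a retry prompt rather than surfaced to the user.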


Phidata adds memory, knowledge and tools to LLMs.

PDF Extractor

- gptpdf

Uses the OpenAI API to extract PDF content and output it in Markdown format.

- omniparse

PDF to Markdown

- PDF-Extract-Kit
  • Layout Detection: Using the LayoutLMv3 model for region detection, such as images, tables, titles, text, etc.;
  • Formula Detection: Using YOLOv8 for detecting formulas, including inline formulas and isolated formulas;
  • Formula Recognition: Using UniMERNet for formula recognition;
  • Optical Character Recognition: Using PaddleOCR for text recognition.


- Marker

Marker converts PDF to markdown quickly and accurately.

  • Supports a wide range of documents (optimized for books and scientific papers)
  • Supports all languages
  • Removes headers/footers/other artifacts
  • Formats tables and code blocks
  • Extracts and saves images along with the markdown
  • Converts most equations to LaTeX
  • Works on GPU, CPU, or MPS


Uses a vision LLM (such as GPT-4o) to parse PDFs into Markdown.

Gemini API