Voice
Gen Audio
Stability AI
Stable Audio
HF:
https://huggingface.co/stabilityai/stable-audio-open-1.0
Stability AI Launches Open-Source Model to Generate Audio (itsfoss.com)
FunAudioLLM
- Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
GitHub:
https://github.com/FunAudioLLM
Instant voice cloning
OpenVoice
Text to Speech (TTS)
ChatTTS
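6drf21e/ChatTTS_colab: 🚀 One-click deployment (offline bundle included)! Built on ChatTTS, with random voice sampling, long-audio generation, and multi-role narration. Easy to use, no complicated installation required. (github.com)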
MARS 5
GitHub:
https://github.com/Camb-ai/MARS5-TTS
HF:
https://huggingface.co/CAMB-AI/MARS5-TTS
edge-tts - A Python module that lets you use Microsoft Edge's online text-to-speech service from within your Python code, or through the provided edge-tts and edge-playback commands.
GitHub:
https://github.com/rany2/edge-tts
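Both ways of using edge-tts mentioned above (the command and the Python module) can be sketched roughly as follows. This is a minimal illustration, not taken from the original page. The helper names `build_edge_tts_command` and `synthesize`, the default voice `en-US-AriaNeural`, and the output file name are assumptions chosen for the example. The online call needs network access and `pip install edge-tts`, so it only runs when `--run` is passed.

```python
import asyncio
import shlex
import sys


def build_edge_tts_command(text, voice="en-US-AriaNeural", out_path="hello.mp3"):
    """Return the argv list for the edge-tts command-line tool (hypothetical helper)."""
    return ["edge-tts", "--voice", voice, "--text", text, "--write-media", out_path]


async def synthesize(text, voice="en-US-AriaNeural", out_path="hello.mp3"):
    """Synthesize text to an audio file through the Python API (needs network access)."""
    # Imported lazily so the command builder above works without the package installed.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(out_path)
    return out_path


if __name__ == "__main__":
    # Print the equivalent shell command; nothing is sent to the online service here.
    print(shlex.join(build_edge_tts_command("Hello, world!")))
    if "--run" in sys.argv:
        # Assumption: edge-tts is installed and the online service is reachable.
        asyncio.run(synthesize("Hello, world!"))
```

The command builder returns an argument list rather than a string, so it can be passed directly to `subprocess.run` without shell quoting issues.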
fish-speech
GitHub:
https://github.com/fishaudio/fish-speech
TTSMaker
ASR - Automatic Speech Recognition
FrogBase
- OpenAI-based video transcript generation and translation
InstantID
- Text-to-image AI for generating personalized, stylized portraits
WhisperDesktop
- Generates subtitles and transcripts from video; Windows only
[Video]
Portable Whisper: usable without installation | greatly reduced hardware requirements | written in C++, no additional libraries required
OpenAI Whisper
Whisper WebUI
- Web-based user interface
WhisperX
- Up to 70x realtime transcription using whisper large-v2 (batched inference)
Fast Whisper
- Faster than OpenAI Whisper, with lower resource usage
Vosk
- Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Translator
OpenAI Translator
- Translation extension based on the ChatGPT API; works in both Chrome and Edge
Chrome Extension