Voice
Gen Audio
- Stability AI
- FunAudioLLM - Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Instant voice cloning
Text to Speech (TTS)
VibeVoice (Microsoft)
ASR - Automatic Speech Recognition
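- FrogBase - OpenAI-based video transcript generation and translation
- InstantID - Text-to-image AI for generating personalized, stylized avatars
- WhisperDesktop - Generates subtitles and transcripts from video; Windows only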
- OpenAI Whisper
- Whisper WebUI - Web-based interface
- WhisperX - Up to 70x real-time transcription with whisper large-v2
- Fast Whisper - Faster than OpenAI Whisper, with lower resource usage
- Vosk - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
- Handy - A free, open source, and extensible speech-to-text application that works completely offline.