#Computer Science# 🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
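#Large Language Models# ChatTTS is a text-to-speech model designed specifically for dialogue scenarios, such as LLM assistant conversations. It supports English and Chinese.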
Instant voice cloning by MIT and MyShell. Audio foundation model.
🧠 Leon is your open-source personal assistant.
Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.
#Large Language Models# Multilingual large voice generation model with full-stack capabilities for inference, training, and deployment.
#Computer Science# 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
#Computer Science# End-to-End Speech Processing Toolkit
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
An open-source implementation of Microsoft's VALL-E X zero-shot speech synthesis model
#Computer Science# EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
#Computer Science# VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
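#Android# Sherpa-ONNX is a lightweight speech recognition framework built on Kaldi and onnxruntime. It performs speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD) without an internet connection. It supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and WebSocket servers and clients, along with programming languages including C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, and Rust.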
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple