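#Computer Science# 🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
#Large Language Models# ChatTTS is a text-to-speech model designed specifically for dialogue scenarios, such as LLM assistant conversations. It supports both English and Chinese.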
Instant voice cloning by MIT and MyShell. Audio foundation model.
🧠 Leon is your open-source personal assistant.
#Large Language Models# Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.
#Computer Science# 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
#Computer Science# End-to-End Speech Processing Toolkit
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
#Computer Science# EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
An open-source implementation of Microsoft's VALL-E X zero-shot speech synthesis model
#Computer Science# VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
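#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
#Android# Sherpa-ONNX is a lightweight speech recognition framework based on Kaldi and onnxruntime that performs speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD) without an internet connection. It supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and WebSocket servers/clients, as well as C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust, and other programming languages.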
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple