🧠 Leon is your open-source personal assistant.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
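PaddleSpeech is an open-source speech model library built on PaddlePaddle, for developing a range of key speech and audio tasks. Typical applications include speech recognition, speech translation, and speech synthesis.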
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
#Computer Science# A fork of so-vits-svc4.0 (V1) that supports real-time inference and a graphical inference interface, and is compatible with its models.
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
#Computer Science# EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
#Computer Science# VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
#Android# eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
An Open Source text-to-speech system built by inverting Whisper.
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isola...
#Computer Science# Foundational model for human-like, expressive TTS
#Computer Science# Speech To Speech: an effort for an open-sourced and modular GPT-4o