#Computer Science# 🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
#Natural Language Processing# 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Translation - 🤗 Fast, efficient, open datasets and evaluation metrics for natural language processing and more, in PyTorch, TensorFlow, NumPy and Pandas
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
kaldi-asr/kaldi is the official location of the Kaldi project.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
#Computer Science# 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
#Computer Science# EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
#Natural Language Processing# ModelScope: bring the notion of Model-as-a-Service to life.
#Natural Language Processing# Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
💬 Speech recognition for your site
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Translation - Silero Models: pre-trained STT models and benchmarks made embarrassingly simple
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
#Computer Science# Foundational model for human-like, expressive TTS
#Computer Science# Speech To Speech: an effort for an open-sourced and modular GPT4-o
#iOS# Code examples for new APIs of iOS 10.