speech · GitHub Topics

#Computer Science# 🐸💬 - A deep learning library for TTS speech synthesis

Python text-to-speech deep-learning speech PyTorch tts vocoder tacotron glow-tts melgan speaker-encoder hifigan speaker-encodings multi-speaker-tts tts-model speech-synthesis voice-cloning voice-synthesis voice-conversion

Python 39.26 k

8 months ago

babysor / MockingBird

#Computer Science# 🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

artificial-intelligence speech PyTorch deep-learning text-to-speech tts

Python 36.11 k

5 months ago

svc-develop-team / so-vits-svc

#Computer Science# SoftVC VITS Singing Voice Conversion

artificial-intelligence audio-analysis Generative Adversarial Network singing-voice-conversion so-vits-svc sovits variational-inference vc vits voice voice-conversion voiceconversion voice-changer flow deep-learning PyTorch speech

Python 26.89 k

1 year ago

huggingface / datasets

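#Natural Language Processing# 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Translation - 🤗 Fast, efficient, open-access datasets and evaluation metrics for natural language processing and more in PyTorch, TensorFlow, NumPy and Pandas

natural-language-processing datasets PyTorch TensorFlow pandas NumPy computer-vision machine-learning deep-learning speech Hacktoberfest

Python 19.97 k

3 days ago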

IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

open-vocabulary-detection open-vocabulary-segmentation data-generation automatic-labeling-system caption speech image-editing

Jupyter Notebook 16.1 k

7 months ago

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text Whisper

Python 14.93 k

11 hours ago

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

kaldi C++ CUDA Shell speech-recognition speech-to-text speaker-verification speaker-id speech

Shell 14.76 k

2 months ago

AIGC-Audio / AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio gpt music sound speech talking-head

Python 10.13 k

9 months ago

mozilla / TTS

#Computer Science# 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

deep-learning text-to-speech Python PyTorch tacotron tts speaker-encoder dataset-analysis tacotron2 tensorflow2 vocoder melgan glow-tts speech

Jupyter Notebook 9.78 k

1 year ago

netease-youdao / EmotiVoice

#Computer Science# EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

PyTorch speech speech-synthesis tts multi-speaker text-to-speech deep-learning prompt emotivoice artificial-intelligence Python emotion style

Python 7.89 k

8 months ago

modelscope / modelscope

#Natural Language Processing# ModelScope: bring the notion of Model-as-a-Service to life.

natural-language-processing cv speech multi-modal science deep-learning machine-learning Python

Python 7.69 k

2 days ago

PaddlePaddle / models

#Natural Language Processing# Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

paddlepaddle deep-learning neural-network computer-vision natural-language-processing recommendation speech cv models

Python 6.92 k

3 months ago

TalAter / annyang

💬 Speech recognition for your site

speech-recognition speech speech-to-text voice

JavaScript 6.66 k

8 months ago

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-detection voice-recognition voice-commands PyTorch onnx voice-activity-detection voice-control onnx-runtime onnxruntime speech speech-processing vad

Python 5.55 k

19 days ago

snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Translation - Silero Models: pre-trained STT models and benchmarks made embarrassingly simple

speech-recognition speech-to-text stt asr pretrained-models english german spanish stt-benchmark PyTorch colab onnx text-to-speech speech speech-synthesis tts

Jupyter Notebook 5.22 k

1 year ago

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

asr speaker-diarization speech speech-recognition speech-to-text Whisper

Jupyter Notebook 4.37 k

1 month ago

metavoiceio / metavoice-src

#Computer Science# Foundational model for human-like, expressive TTS

text-to-speech artificial-intelligence deep-learning PyTorch speech speech-synthesis tts voice-clone zero-shot-tts

Python 4.08 k

8 months ago

huggingface / speech-to-speech

#Computer Science# Speech To Speech: an effort for an open-sourced and modular GPT4-o

artificial-intelligence assistant language-model machine-learning Python speech speech-synthesis speech-to-text speech-translation

Python 3.96 k

11 days ago

fixie-ai / ultravox

#Large Language Models# A fast multimodal LLM for real-time voice

artificial-intelligence large-language-model slm speech

Python 3.82 k

2 months ago

shu223 / iOS-10-Sampler

#iOS# Code examples for new APIs of iOS 10.

iOS Swift speech metal cnn image-recognition convolutional-neural-networks Demo

Swift 3.31 k

1 year ago