speech-processing · GitHub Topics

#Computer Science#A PyTorch-based Speech Toolkit

speech-recognition speech-toolkit speaker-recognition speech-to-text speech-enhancement speech-separation audio audio-processing speech-processing speechrecognition asr voice-recognition speaker-diarization speaker-verification PyTorch huggingface transformers language-model deep-learning

Python 9.67 k

2 days ago

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

PyTorch speech-processing speaker-diarization voice-activity-detection pretrained-models speaker-recognition speaker-verification

Jupyter Notebook 7.25 k

4 days ago

pliang279 / awesome-multimodal-ml

#Natural Language Processing#Reading list for research topics in multimodal machine learning

multimodal-learning machine-learning representation-learning natural-language-processing computer-vision speech-processing Robotics healthcare reading-list deep-learning reinforcement-learning

6.38 k

8 months ago

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-detection voice-recognition voice-commands PyTorch onnx voice-activity-detection voice-control onnx-runtime onnxruntime speech speech-processing vad

Python 5.55 k

19 days ago

microsoft / torchscale

#Natural Language Processing#Foundation Architecture for (M)LLMs

computer-vision machine-learning multimodal natural-language-processing pretrained-language-model speech-processing transformer translation

Python 3.07 k

1 year ago

linto-ai / whisper-timestamped

#Computer Science#Multilingual Automatic Speech Recognition with word-level timestamps and confidence

deep-learning speech speech-recognition speech-to-text asr machine-learning Python PyTorch attention-is-all-you-need attention-mechanism attention-model speaker-diarization speech-processing transformers Whisper

Python 2.35 k

13 days ago

r9y9 / wavenet_vocoder

WaveNet vocoder

wavenet speech-synthesis speech-processing PyTorch Python neural-vocoder speech

Python 2.35 k

2 years ago

r9y9 / deepvoice3_pytorch

#Computer Science#PyTorch implementation of convolutional neural network-based text-to-speech synthesis models

tts speech-synthesis end-to-end speech-processing machine-learning PyTorch Python multi-speaker

Python 1.98 k

1 year ago

resemble-ai / resemble-enhance

AI-powered speech denoising and enhancement

denoise speech-denoising speech-enhancement speech-processing

Python 1.73 k

4 months ago

wq2012 / awesome-diarization

#Awesome#A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

speaker-diarization Awesome Lists machine-learning speech-recognition speech-processing deep-learning

1.73 k

6 months ago

DigitalPhonetics / IMS-Toucan

#Computer Science#Controllable and fast Text-to-Speech for over 7000 languages!

text-to-speech toolkit speech-synthesis deep-learning speech-processing tts PyTorch speech

Python 1.58 k

5 months ago

coqui-ai / open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

tts stt speech-to-text text-to-speech speech-recognition speech-synthesis speech-processing voice-recognition voice-activity-detection voice-cloning speech-separation

1.32 k

10 months ago

mravanelli / SincNet

#Computer Science#SincNet is a neural architecture for efficiently processing raw audio samples.

deep-learning audio waveform filtering cnn convolutional-neural-networks speaker-recognition speaker-verification speech-recognition asr audio-processing speech-processing digital-signal-processing signal-processing neural-networks artificial-intelligence timit PyTorch Python

Python 1.17 k

4 years ago

haoheliu / voicefixer

General Speech Restoration

speech-processing speech-synthesis speech-enhancement speech-analysis speech tts denoise super-resolution vocoder

Python 1.12 k

2 months ago

midas-research / audino

#Data Warehouse#Open source audio annotation tool for humans

audio-processing speech-processing machine-learning annotation-tool audio-annotation Python dataset

JavaScript 1.09 k

2 months ago

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

speech speech-recognition speech-synthesis speech-to-text speech-translation translation all-in-one machine-translation streaming-audio text-to-speech asr tts voice text-to-audio non-autoregressive speech-enhancement audio-processing speech-processing

Python 1.06 k

8 months ago

Ryuk17 / SpeechAlgorithms

You can find the speech algorithms you want here

speech-processing

C 796

3 months ago

X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

audio-processing large-language-model multimodal-large-language-models peft speech-processing

Python 776

19 hours ago

nanahou / Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

speech-enhancement speech-processing signal-processing deep-neural-networks machine-learning

MATLAB 757

4 years ago

drethage / speech-denoising-wavenet

#Computer Science#A neural network for end-to-end speech denoising

machine-learning deep-learning neural-networks speech-denoising speech wavenet end-to-end speech-processing

Python 690

2 years ago