#Computer Science#A PyTorch-based Speech Toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
#Natural Language Processing#Reading list for research topics in multimodal machine learning
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
#Natural Language Processing#Foundation Architecture for (M)LLMs
WaveNet vocoder
#Computer Science#Multilingual Automatic Speech Recognition with word-level timestamps and confidence
#Computer Science#PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
#Awesome#A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
AI powered speech denoising and enhancement
#Computer Science#Controllable and fast Text-to-Speech for over 7000 languages!
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
#Computer Science#SincNet is a neural architecture for efficiently processing raw audio samples.
General Speech Restoration
#Data Warehouse#Open source audio annotation tool for humans
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Speech, Language, Audio, Music Processing with Large Language Model
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
#Computer Science#A neural network for end-to-end speech denoising