Real-time microphone noise suppression on Linux.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
#Data Warehouse#🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, Lich...
#Computer Science#A Python package to build AI-powered real-time audio applications
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
An audio/acoustic activity detection and audio segmentation tool
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
#Computer Science#🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Runtime Audio Importer plugin for Unreal Engine. Imports audio in various formats at runtime.
#Computer Science#An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
#Computer Science#Voice Activity Detection based on Deep Learning & TensorFlow
#Android#Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, and Yamnet VAD DNN models.
#Computer Science#Automatic transcription tool based on Whisper