A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
faster_whisper GUI with PySide6
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, Lich...
Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM- and ACAM-based VAD. We also provide our directly recorded dataset.
An audio/acoustic activity detection and audio segmentation tool
#Face Recognition# ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
#Computer Science# An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
#Computer Science# Voice Activity Detection based on Deep Learning & TensorFlow
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
#Android# Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
On-device voice activity detection (VAD) powered by deep learning
Statistical model-based Voice Activity Detection
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
Enumerate user-mode shared memory mappings on Windows.