Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Synchronized Translation for Videos. Video dubbing
#Large Language Models# Turnkey self-hosted offline transcription and diarization service with LLM summary
UniSpeech - Large Scale Self-Supervised Learning for Speech
Open source inference code for Rev's model
Gecko - A Tool for Effective Annotation of Human Conversations
#Computer Science# Identify the emotions of multiple speakers in an audio segment
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
Rust bindings to https://github.com/k2-fsa/sherpa-onnx
#Large Language Models# Callytics is an advanced call analytics solution that uses speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.
#Computer Science# A lightweight library to compute Diarization Error Rate (DER).
pyannote audio diarization in Rust
Neural-network-based similarity scoring for diarization (PyTorch implementation of "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization")
Tool for automatic transcription and speaker diarization based on Whisper and pyannote.
#Computer Science# On-device speaker diarization powered by deep learning
#Natural Language Processing# Easy-to-use multi-provider ASR/speech-to-text and NLP engine
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
Convert Kaldi feature extraction and nnet3 models into TensorFlow Lite models. Currently aimed at converting Kaldi's x-vector models and diarization pipelines to TensorFlow models.