A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
NeMo: a toolkit for conversational AI
#Computer Science# A PyTorch-based speech toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
#Computer Science# This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
#Computer Science# SincNet is a neural architecture for efficiently processing raw audio samples.
In defence of metric learning for speaker recognition
An open-source implementation of a sequence-to-sequence based speech processing engine
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. At the sam...
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
#Computer Science# 🔈 Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
#Face Recognition# Angular penalty loss functions in PyTorch (ArcFace, SphereFace, Additive Margin, CosFace)
Speaker diarization with UIS-RNN and speaker embeddings from vgg-speaker-recognition
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
#Computer Science# The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN a...
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
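Voiceprint recognition implemented with TensorFlow
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank.
#Computer Science# A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.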