wavlm · GitHub Topics

#Computer Science# StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

deep-learning PyTorch speaker-adaptation speech-synthesis text-to-speech tts wavlm diffusion-models latent-diffusion latent-diffusion-models Generative Adversarial Network

Python 5.64 k

8 months ago

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

speech-representation mockingjay representation-learning apc tera self-supervised-learning speech-pretraining vq-apc wav2vec hubert wavlm

Python 2.37 k

1 month ago

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

production-ready PyTorch resnet speaker-recognition speaker-verification speaker-diarization repvgg TLS (Transport Layer Security) dino wavlm

Python 875

2 months ago

lucadellalib / focalcodec

#Computer Science# A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

codec deep-learning PyTorch speech-synthesis wavlm

Python 84

2 months ago

mjhydri / Singing-Vocal-Beat-Tracking

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trai...

beat-tracking hubert music music-information-retrieval self-supervised singing-voice wavlm

Python 30

3 years ago

lucadellalib / discrete-wavlm-codec

A neural speech codec based on discrete WavLM representations

clustering codec hifi-gan PyTorch quantization self-supervised-learning speech-synthesis wavlm

Python 23

8 months ago

lucadellalib / audiocodecs

A collection of audio codecs with a standardized API

codec dac PyTorch quantization self-supervised-learning speech-synthesis text-to-speech wavlm

Python 11

2 months ago

Sarasadeghii / Sharif-WavLM

In this repository, the WavLM model is applied to high-quality and poor-quality data for a speaker verification task, and the PyCM library is used for evaluation.

confusion-matrix speaker-verification wavlm

Jupyter Notebook 8

2 years ago

alessandropec / data_driven_ai_voice_cloning

#Computer Science# This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino

artificial-intelligence generative-ai speaker-verification text-to-speech voice-cloning zero-shot-learning deep-learning machine-learning wavlm fastspeech2 tacotron2

Python 8

2 years ago

theolepage / wavlm_ssl_sv

SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.

asr dino PyTorch self-supervised-learning speaker-recognition speaker-verification wavlm

Python 7

2 months ago

bunyaminergen / WavLMMSDD

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia...

diarization embedding Microsoft speaker-diarization speech wavlm

Jupyter Notebook 7

1 month ago

sadPororo / UniPool-SV

Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint

hubert pretrained-models speaker-recognition speaker-verification wavlm

Python 5

7 months ago

zhu00121 / Universal-representation-dynamics-of-deepfake-speech

This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"

deepfake-detection self-supervised wavlm

Python 4

1 year ago

bunyaminergen / WavLMRawNetXSVBase

WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture

audio feature-extraction speaker-verification speech speech-processing wavlm

Python 3

1 month ago