Chinese speech pretrained models
A Fundamental End-to-End Speech Recognition Toolkit and Open-Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
Code for the paper "A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing"
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
Classification of 11 types of audio clips using MFCC features and an LSTM. Pretrained on the Speech Commands dataset with intensive data augmentation.
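The entry above combines MFCC features, an LSTM classifier, and heavy waveform augmentation. As a minimal sketch of two common augmentations such a pipeline might use (random time shift and additive noise at a target SNR), using NumPy; the function names are illustrative, not from the repository:

```python
import numpy as np

def time_shift(wave, max_shift):
    # Roll the waveform by a random offset of up to max_shift samples.
    shift = np.random.randint(-max_shift, max_shift + 1)
    return np.roll(wave, shift)

def add_noise(wave, snr_db):
    # Mix in white noise at the given signal-to-noise ratio (in dB).
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), wave.shape)
    return wave + noise

# Dummy 1-second, 16 kHz "speech" clip standing in for real audio.
wave = np.sin(np.linspace(0, 8 * np.pi, 16000))
augmented = add_noise(time_shift(wave, max_shift=1600), snr_db=20)
```

Augmented copies would then be converted to MFCCs (e.g. with librosa) and fed to the LSTM; the augmentation parameters here are placeholders, not the repository's actual settings.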
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Interspeech 2021 Paper "Improved Personalized Speech Enhancement through SNR-Informed Self-Supervised Pretraining".
Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining
#Computer Science# Large-scale pretraining for dialogue
XLNet: Generalized Autoregressive Pretraining for Language Understanding
PyTorch original implementation of Cross-lingual Language Model Pretraining.
#Natural Language Processing# A large-scale 7B pretrained language model developed by BaiChuan-Inc.
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
ALBERT Model Pretraining and Fine-Tuning using TF 2.0
Variational Methods for Pretraining in Resource-limited Environments
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Speech enhancement / speech separation / sound source localization
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining
ICLR 2022 Paper, SOTA Table Pre-training Model, TAPEX: Table Pre-training via Learning a Neural SQL Executor
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
PITI: Pretraining is All You Need for Image-to-Image Translation
Code accompanying the paper Pretraining Language Models with Human Preferences