Audio-Visual Speech Recognition using Deep Learning
Audio-Visual Speech Recognition using Sequence to Sequence Models
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
Kaldi-based audio-visual speech recognition
ViSpeR: Multilingual Audio-Visual Speech Recognition
Official PyTorch implementation of the paper "Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition" (ACL 2022)
Visual Speech Recognition for Multiple Languages
Python toolkit for Visual Speech Recognition
Web Audio Speech Synthesis / Recognition for p5.js
Wav2Vec for speech recognition, classification, and audio classification
Audio-Visual Speech Separation with Cross-Modal Consistency
A self-supervised learning framework for audio-visual speech
Translation - A self-supervised learning framework for audio-visual speech
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
An HTML widget for speech recognition from audio or video
Simple Python audio transcriber using OpenAI's Whisper speech recognition model
Speech recognition
A large-scale publicly-available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer int...
Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.
Using Convolutional Neural Networks in speech emotion recognition on the RAVDESS Audio Dataset.
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18