A Deep-Learning-Based Chinese Speech Recognition System
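#Computer Science# DeepSpeech is an open-source embedded (offline, on-device) speech recognition engine that can run on hardware as small as a Raspberry Pi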
#Computer Science# A PyTorch-based Speech Toolkit
Speech recognition module for Python, supporting several engines and APIs, online and offline.
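PaddleSpeech is an open-source speech model library built on PaddlePaddle for developing a range of key speech and audio tasks; typical applications include speech recognition, speech translation, and speech synthesis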
#Computer Science# Speech To Speech: an effort for an open-sourced and modular GPT4-o
Whisper is a general-purpose speech recognition model
#Getting Started# Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Speech enhancement, speech separation, and sound source localization
A Conversational Speech Generation Model
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription
General Speech Restoration
#Android# Android speech recognition and text-to-speech made easy
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.