A Deep-Learning-Based Chinese Speech Recognition System
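#Computer Science# DeepSpeech is an open-source embedded (offline, on-device) speech recognition engine that can run on hardware as modest as a Raspberry Pi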
#Computer Science# A PyTorch-based Speech Toolkit
Speech recognition module for Python, supporting several engines and APIs, online and offline.
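PaddleSpeech is an open-source speech model library built on PaddlePaddle for developing a range of key speech and audio tasks; typical applications include speech recognition, speech translation, and speech synthesis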
#Computer Science# Speech To Speech: an effort toward an open-source, modular GPT-4o
Whisper is a general-purpose speech recognition model
#Getting Started# Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Speech enhancement, speech separation, and sound source localization
A Conversational Speech Generation Model
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
#Android# Android speech recognition and text-to-speech made easy
General Speech Restoration
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation, and speech synthesis.
Speech Recognition using DeepSpeech2.
Alibaba speech technology