Whisper is an autoregressive language model developed by OpenAI. It is trained on a large corpus of text using a transformer architecture and is capable of generating high-quality natural language text. Whisper can be used for tasks such as language modeling, text completion, and text generation. It has shown impressive performance on various benchmarks and has been released by OpenAI to encourage research in the field of language modeling. Whisper is not yet available for public use, but it has the potential to transform the field of natural language processing and generate new opportunities for language-based applications.
OpenAI Whisper语音识别模型,C++移植版本。
#计算机科学#Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
PaddleSpeech 是基于飞桨 PaddlePaddle 的语音方向的开源模型库,用于语音和音频中的各种关键任务的开发,典型的应用包括:语音识别、语音翻译、语音合成等
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
#安卓#Open source real-time translation app for Android that runs locally
#大语言模型#Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any...
#计算机科学#JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
#大语言模型#Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR...
#IOS#On-device Speech Recognition for Apple Silicon
Production First and Production Ready End-to-End Speech Recognition Toolkit
翻译 - 生产优先和生产就绪的端到端语音识别工具包
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
#区块链#Framework for serverless Decentralized Applications using Ethereum, IPFS and other platforms
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isola...
#大语言模型#ChatGPT Java SDK支持流式输出、Gpt插件、联网。支持OpenAI官方所有接口。ChatGPT的Java客户端。OpenAI GPT-3.5-Turb GPT-4 Api Client for Java