pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.
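Translation: pycorrector is a toolkit for text error correction. It was developed to make it easy to design, compare, and share deep text error-correction models.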
#Computer Science# State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private, 100% offline, 100% free CLI tool.
#Natural Language Processing# Training an n-gram based Language Model using the KenLM toolkit for Deep Speech 2
CTC+Beam_Search+kenlm is a decoding system for acoustic models that use Chinese characters as the modeling unit
#Natural Language Processing# Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.
wav2vec 2.0 recognition pipeline
Optical Character Recognition + Instance Segmentation for Russian and English
Romanian Automatic Speech Recognition from the ROBIN project
#Natural Language Processing# 🎲 KenLM extension for spaCy 2.0.
#Natural Language Processing# A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries
Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡
#Computer Science# INACTIVE - http://mzl.la/ghe-archive - Generate language models from OSCAR corpora
#Computer Science# Neural Grammatical Error Correction for Romanian using Transformer
#Natural Language Processing# An AI tool that automatically generates captions and transcripts for YouTube videos in 67 languages and summarized text in 133 languages.
Scripts to train n-gram language models on Wikipedia articles
Automatic Speech Recognition using Conformer with Speech Sentiment Analysis & Text Summarizer
#Natural Language Processing# This repo shows how to fine-tune the wav2vec 2.0 model, along with its prerequisites.
Demo of domain corpus bootstrapping using language model perplexity