An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
#Natural Language Processing#A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
#Natural Language Processing#Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
A web-based engine for creating and annotating textual corpora
#Web Crawler#Data resources for Indonesian NLP
#Web Crawler#🕷️ The pipeline for the OSCAR corpus
Kanji usage frequency data collected from various sources
#Natural Language Processing#An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.
#Search#Quran, Hadith, Translations, Tafaseer, Corpus Linguistics. Everything for NLP
An advanced, extensible web front-end for the Manatee-open corpus search engine
#Natural Language Processing#Large silver-standard Russian corpus with NER, morphology and syntax markup
#Natural Language Processing#A textual corpus database for the digital humanities.
SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/
#Natural Language Processing#My solutions to selected exercises from "Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit" by Steven Bird, Ewan Klein, and Edward Loper.
#Natural Language Processing#A set of workflows for corpus building through OCR, post-correction and normalisation