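100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors
pkuseg: a toolkit for multi-domain Chinese word segmentation
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
#Natural Language Processing# Jiagu: a deep-learning NLP toolkit covering knowledge-graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, and text clustering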
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
#Natural Language Processing# Datasets and SOTA results for every field of Chinese NLP
#Natural Language Processing# Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, summary extraction imp...
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
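#Natural Language Processing# The Jieba Chinese word segmentation library, implemented in Rust
High-performance Chinese tokenizer with both GBK and UTF-8 charset support, based on the MMSEG algorithm and developed in ANSI C. Completely based on modular implementation and can be easily embedded in other...
#Natural Language Processing# MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition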
g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese
A PyTorch implementation of a BiLSTM / BERT / RoBERTa (+ BiLSTM + CRF) model for Chinese word segmentation (中文分词).
MicroTokenizer: a lightweight yet full-featured Chinese tokenizer designed for educational and research purposes, helping students understand how tokenizers work. Provides a practical, hands-on approach to understanding NLP concepts, fe...
#Computer Science# Some experiments in machine learning
#Natural Language Processing# Manually curated corpus of medical-industry vocabulary and terminology. Can be used to train NLP models of all kinds, such as speech recognition and dialogue systems.
#Computer Science# Source code for the paper "Neural Networks Incorporating Dictionaries for Chinese Word Segmentation", AAAI 2018
Source code for an ACL 2017 paper on Chinese word segmentation