chinese-nlp · GitHub Topics

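#Web crawler# 📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.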

data scraper chinese-traditional Python 中文 chinese-characters chinese-nlp chinese-language chinese-simplified json-dataset JSON json-data

Python 11.25 k

2 years ago

brightmart / nlp_chinese_corpus

#Natural language processing# Large Scale Chinese Corpus for NLP

chinese-dataset chinese-corpus pretrain word2vec 自然语言处理 bert language-model Wiki news question-answering 中文 corpus chinese-nlp dataset text-classification

9.74 k

1 year ago

LianjiaTech / BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

bloom instruction-set llama open-models gpt-q instruct-gpt gpt-evaluation chinese-nlp lora instruct-finetune

HTML 8.18 k

8 months ago

crownpku / Awesome-Chinese-NLP

#Natural language processing# A curated list of resources for Chinese NLP

自然语言处理 chinese-nlp

7.89 k

2 years ago

lyogavin / airllm

#Large language model# AirLLM: 70B inference with a single 4GB GPU

chinese-nlp finetune generative-ai instruct-gpt instruction-set llama 大语言模型 lora open-models Open Source qlora

Jupyter Notebook 5.81 k

2 months ago

HIT-SCIR / ltp

#Natural language processing# Language Technology Platform

自然语言处理 chinese-nlp 机器学习

Python 5.15 k

1 month ago

IDEA-CCNL / Fengshenbang-LM

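Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA research institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.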

chinese-nlp pretrained-models PyTorch distributed-training transformers aigc multimodal

Python 4.13 k

1 year ago

baidu / lac

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance

word-segmentation part-of-speech-tagger named-entity-recognition chinese-word-segmentation chinese-nlp Parsing Python Java

C++ 3.95 k

4 years ago

esbatmop / MNBVC

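#Natural language processing# MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even Martian script (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki pages, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.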

中文 chinese-language chinese-nlp chinese-simplified corpus-data 自然语言处理

3.89 k

21 days ago

fastnlp / fastNLP

#Natural language processing# fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

自然语言处理深度学习 nlp-library nlp-parsing chinese-nlp text-classification text-processing

Python 3.13 k

2 years ago

CVI-SZU / Linly

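#Natural language processing# Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese conversational model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets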

bert gpt-3 language-model llama 自然语言处理 zero-shot-learning 聊天机器人 ChatGPT 中文 chinese-nlp

Python 3.05 k

1 year ago

crownpku / Information-Extraction-Chinese

#Natural language processing# Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT

自然语言处理 chinese-nlp information-extraction relation-extraction named-entity-recognition

Python 2.26 k

1 year ago

thunlp / THULAC-Python

An Efficient Lexical Analyzer for Chinese

chinese-nlp

Python 2.07 k

3 years ago

didi / ChineseNLP

#Natural language processing# Datasets and SOTA results for every field of Chinese NLP

自然语言处理 chinese-nlp machine-translation chinese-word-segmentation Entity resolution question-answering nlp-tasks

HTML 1.8 k

3 years ago

baidu / DDParser

Baidu's open-source dependency parsing system

dependency-parser chinese-nlp Python dependency-parsing

Python 995

2 years ago

lionsoul2014 / jcseg

#Natural language processing# Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, summary extraction imp...

Java chinese-word-segmentation 自然语言处理 pos-tagging chinese-text-segmentation chinese-nlp

Java 918

2 years ago

OYE93 / Chinese-NLP-Corpus

#Data warehouse# Collections of Chinese NLP corpora

chinese-nlp 数据集 corpus

Python 902

5 years ago

Doragd / Chinese-Chatbot-PyTorch-Implementation

#Computer science# 🍀 Another Chinese chatbot implemented in PyTorch, a sub-module of an intelligent work order processing robot. 👩‍🔧

深度学习聊天机器人 PyTorch pytorch-nlp chinese-nlp

Python 898

1 year ago

thunlp / THULAC

An Efficient Lexical Analyzer for Chinese

chinese-nlp

C++ 810

2 years ago

ECNU-ICALK / EduChat

#Large language model# An open-source Chinese-English educational chat model from ICALK, East China Normal University (general-purpose base model, GPU deployment, data cleaning). Acknowledgments: LLaMA, MOSS, BELLE, Ziya, vLLM

belle chinese-nlp data-cleaning 教学 llama 大语言模型 moss open-models

Jupyter Notebook 802

2 months ago