#Natural Language Processing# Text preprocessing, representation and visualization from zero to hero.
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
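Chinese text analysis toolkit (including text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, and event triple extraction)
#Natural Language Processing# Short text clustering preprocessing module (Short text cluster)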
#Natural Language Processing# Library of state-of-the-art models (PyTorch) for NLP tasks
#Natural Language Processing# TopicGPT integrates the benefits of LLMs into topic modelling
#Natural Language Processing# Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in under 30 minutes.
semantic-sh is a SimHash implementation that detects and groups similar texts by drawing on word vectors and transformer-based language models (BERT).
FastThresholdClustering is an efficient vector clustering algorithm based on FAISS, particularly suitable for large-scale vector data clustering tasks. The algorithm features intuitive and easy-to-sel...
Cross-lingual Language Model (XLM) pretraining and Model-Agnostic Meta-Learning (MAML) for fast adaptation of deep networks
Using word embeddings, TF-IDF and text hashing to cluster and visualise text documents
Code for the ACL conference paper "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering"
Implementation of some algorithms for text clustering
#Web Crawler# Graph clustering and node embeddings with word2vec
#Natural Language Processing# SLS: Neural Information Retrieval (IR)-based semantic search model
#Natural Language Processing# Chapter 3: Text and Speech Basics
Sentence clustering and visualization. Created: 25 Apr 2018