sentence-embeddings · GitHub Topics

#Search# All-in-one embeddings database for semantic search, LLM orchestration, and language model workflows

Python search machine-learning natural-language-processing semantic-search vector-search txtai llm vector-database language-model transformers sentence-embeddings large-language-models information-retrieval search-engine embeddings retrieval-augmented-generation rag artificial-intelligence

Python 10.71 k

10 hours ago

FlagOpen / FlagEmbedding

#Large Language Models# Retrieval and Retrieval-augmented LLMs

embeddings information-retrieval llm sentence-embeddings text-semantic-similarity retrieval-augmented-generation

Python 9.3 k

3 days ago

MaartenGr / BERTopic

#Natural Language Processing# Leveraging BERT and c-TF-IDF to create easily interpretable topics.

bert transformers topic-modeling sentence-embeddings natural-language-processing machine-learning topic ldavis topic-modelling topic-models

Python 6.65 k

4 days ago

shibing624 / text2vec

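#Natural Language Processing# text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.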

Python 4.67 k

3 days ago

princeton-nlp / SimCSE

#Natural Language Processing# [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

natural-language-processing sentence-embeddings

Python 3.54 k

6 months ago

Separius / awesome-sentence-embedding

#Natural Language Processing# A curated list of pretrained sentence and word embedding models

word-embeddings sentence-embeddings natural-language-processing Awesome Lists pretrained-models unsupervised-learning natural-language bert pretrained-language-model language-model

Python 2.26 k

4 years ago

SeanLee97 / xmnlp

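#Natural Language Processing# xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical extraction, sentence representation, and text similarity computation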

pinyin natural-language-processing spell-checker sentiment-analysis ner Parsing segmentation sentence-embeddings sentence-similarity

Python 1.28 k

2 years ago

JohnSnowLabs / nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.

nlu natural-language-understanding text-classification transformers language-detection named-entity-recognition seq2seq t5 lemmatizer spell-checker sentence-embeddings sentiment-analysis Streamlit pandas dependency-parsing Entity resolution

Python 910

2 months ago

Muennighoff / sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

gpt information-retrieval language-model large-language-models retrieval semantic-search sentence-embeddings text-embedding neural-search

Jupyter Notebook 863

1 year ago

wangyuxinwhy / uniem

#Natural Language Processing# unified embedding model

embeddings huggingface natural-language-processing sentence-embeddings sentence-transformers

Python 853

2 years ago

goru001 / inltk

#Natural Language Processing# Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

natural-language-processing deep-learning indic-languages PyTorch data-augmentation sentence-similarity sentence-encoding word-embeddings sentence-embeddings

Python 829

1 year ago

oborchers / Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!

sentence-embeddings sentence-representation sentence-similarity gensim fasttext cython embeddings maxpooling fse

Jupyter Notebook 622

2 years ago

jina-ai / vectordb

#Vector Search Engine# A Python vector database you just need - no more, no less.

Python 602

1 year ago

ncbi-nlp / BioSentVec

#Natural Language Processing# BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

natural-language-processing fasttext sentence-embeddings word-embeddings sentence-similarity

Jupyter Notebook 590

2 years ago

SeanLee97 / AnglE

#Large Language Models# Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

llama llama2 semantic-similarity text-embedding sentence-embeddings text-similarity retrieval-augmented-generation text2vec embeddings sts rag llm information-retrieval

Python 531

1 month ago

kaushalshetty / Structured-Self-Attention

#Computer Science# A Structured Self-attentive Sentence Embedding

attention-mechanism attention-model self-attention PyTorch deep-learning Python attention visualization classification sentence-embeddings

Python 492

6 years ago

sunyilgdx / SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (the Chinese-language version of the code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")

keyphrase-extraction keyword-extraction elmo pre-trained-language-models word-embeddings sentence-embeddings python36

Python 426

5 years ago

JohnGiorgi / DeCLUTR

#Natural Language Processing# The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

contrastive-learning natural-language-processing PyTorch transformers representation-learning sentence-embeddings sentence-similarity semantic-search metric-learning self-supervised-learning

Python 380

2 years ago