retrieval · GitHub Topics

chonkie-ai / chonkie

#Natural Language Processing# 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

artificial-intelligence chunking rag text-processing natural-language-processing Python semantic-segmentation vector-search etl retrieval

Python 2.87 k

20 days ago

embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark

benchmark clustering information-retrieval sentence-transformers sts text-embedding retrieval neural-search semantic-search sbert text-classification reranking

Jupyter Notebook 2.39 k

6 hours ago

apache / lucenenet

Apache Lucene.NET

lucene text search information retrieval analysis index Query (disambiguation) apache Hacktoberfest

C# 2.29 k

24 days ago

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

large-language-model chatbot 4-bits llm-inference llm-cpu chatpdf streamingllm intel-optimized-llamacpp speculative-decoding habana rag retrieval

Python 2.17 k

6 months ago

qdrant / fastembed

#Vector Search Engine# Fast, Accurate, Lightweight Python library to make State of the Art Embedding

embeddings openai rag retrieval retrieval-augmented-generation vector-search

Python 1.94 k

2 days ago

shervinea / mit-15-003-data-science-tools

Study guides for MIT's 15.003 Data Science Tools

study-guide data-science SQL R Git Bash manipulation visualization retrieval

1.83 k

5 years ago

beir-cellar / beir

#Natural Language Processing# A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

natural-language-processing information-retrieval bert benchmark sentence-transformers retrieval elasticsearch sbert dataset colbert deep-learning PyTorch large-language-models rag

Python 1.77 k

2 months ago

parthsarthi03 / raptor

#Large Language Models# The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

rag retrieval retrieval-augmented-generation clustering language-model machine-learning vector-database agents framework large-language-models

Python 1.17 k

7 months ago

xhluca / bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

bm25 lexical-search retrieval search okapi-bm25 rag information-retrieval

Python 1.1 k

5 days ago

memodb-io / memobase

#Large Language Models# Profile-Based Long-Term Memory for AI Applications

ChatGPT llm-application memory rag retrieval ai-memory long-term-memory data-structures data-structures-and-algorithms

Python 1.03 k

1 day ago

superlinked / superlinked

#Natural Language Processing# Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

embeddings etl vector-search data-pipeline deep-learning information-retrieval large-language-models machine-learning mlops natural-language-processing Python retrieval retrieval-augmented-generation semantic-search vectorization vector-database

Jupyter Notebook 1.02 k

2 days ago

tensorlakeai / indexify

#Large Language Models# A realtime serving engine for Data-Intensive Generative AI Applications

large-language-models machine-learning retrieval

Rust 987

2 days ago

ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

multimodal-learning multimodality multimodal search ranking retrieval-model retrieval activitynet clip

Python 933

1 year ago

Muennighoff / sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

gpt information-retrieval language-model large-language-models retrieval semantic-search sentence-embeddings text-embedding neural-search

Jupyter Notebook 863

1 year ago

lucidrains / RETRO-pytorch

#Computer Science# Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

artificial-intelligence deep-learning transformers attention-mechanism retrieval

Python 861

1 year ago

NeumTry / NeumAI

#Large Language Models# Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

artificial-intelligence data embeddings etl large-language-models vector-database ChatGPT data-engineering database pipeline Python retrieval vectors llmops mlops ops rag

Python 855

1 year ago

epsilla-cloud / vectordb

#Search# Epsilla is a high performance Vector Database Management System

artificial-intelligence infrastructure llms ChatGPT data data-science database embeddings machine-learning rag retrieval vector-database embeddings-similarity neural-network neural-search search-engine vector-search

C++ 850

1 month ago

AnswerDotAI / byaldi

#Natural Language Processing# Use late-interaction multi-modal models such as ColPali in just a few lines of code.

colbert natural-language-processing rag reranking retrieval multi-modal

Python 770

2 months ago

michaelthwan / searchGPT

#Natural Language Processing# Grounded search engine (i.e. with source reference) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search etc.

ChatGPT large-language-models natural-language-processing openai Python retrieval retrieval-model artificial-intelligence language-model machine-learning

Python 691

8 months ago

OpenBMB / VisRAG

Parsing-free RAG supported by VLMs

rag retrieval retrieval-augmented-generation vision-language-model multi-modal multi-modality document-retrieval document-understanding

Python 662

2 months ago