#Computer Science#Jina is a deep-learning-based search framework that supports many data types, such as images, video, long text, and PDFs.
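#Search#Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering and with the fault tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.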
#Computer Science#🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Translation - Maps variable-length sentences to fixed-length vectors using a BERT model
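#Search#PaddleNLP 2.0 is the core text-domain library of the PaddlePaddle ecosystem. Its three main features are easy-to-use text-domain APIs, application examples for many scenarios, and high-performance distributed training. It aims to improve developer productivity in the text domain and to provide best practices for NLP tasks built on the PaddlePaddle 2.0 core framework.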
#Computer Science#Represent, send, store and search multimodal data
Translation - The data structure for unstructured data
🌊 A Human-in-the-Loop workflow for creating HD images from text
MTEB: Massive Text Embedding Benchmark
🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
Translation - Fine-tune any DNN for better embeddings on neural search tasks
#Natural Language Processing#The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
SGPT: GPT Sentence Embeddings for Semantic Search
#Search#Epsilla is a high-performance Vector Database Management System
#Computer Science#PostgreSQL vector database extension for building AI applications
#Natural Language Processing#The prime repository for state-of-the-art Multilingual Question Answering research and development.
#Vector Search Engine#A Python vector database you just need - no more, no less.
#Natural Language Processing#Jina examples and demos to help you get started
Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.
#Search#An easy-to-use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
Neural Search