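#Natural Language Processing# RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding.
A distributed, easily extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies in data processing flows so that the scheduler works out of the box for data processing.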
An orchestration platform for the development, production, and observation of data assets.
Translation - A Python library for building data applications: ETL, ML, data pipelines, and more.
#Natural Language Processing# Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
#Data Warehouse# Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop....
Translation - The fastest way to access and manage PyTorch and TensorFlow datasets. Easily build scalable data pipelines. Leading Data 2.0 http://activeloop.ai
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
The best place to learn data engineering. Built and maintained by the data engineering community.
The Feldera Incremental Computation Engine
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
#Natural Language Processing# Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.