#Computer Science# Apache Airflow is a platform for scheduling, orchestrating, and monitoring workflows.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
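A distributed, easily extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies in data processing flows so that the scheduling system works out of the box.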
An orchestration platform for the development, production, and observation of data assets.
A Python library for building data applications: ETL, ML, data pipelines, and more.
#Natural Language Processing# Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
#Computer Science# 🧙 Build, run, and manage data pipelines for integrating and transforming data.
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
#Editor# Build data pipelines, the easy way 🛠️
Orchest is a tool for creating data science pipelines.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
#Large Language Models# A system for agentic LLM-powered data processing and ETL
#Large Language Models# Preswald is a framework for building and deploying interactive data apps, internal tools, and dashboards with Python. With one command, you can launch, share, and deploy locally or in the cloud, turni...
The best place to learn data engineering. Built and maintained by the data engineering community.
MLeap: Deploy ML Pipelines to Production
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
The Feldera Incremental Computation Engine
Concurrent Python made simple
#Computer Science# Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
#Natural Language Processing# Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.