Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases to data warehouses.
An orchestration platform for the development, production, and observation of data assets.
Translation - A Python library for building data applications: ETL, ML, data pipelines, and more.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Flink CDC Connector is a set of data source connectors for Apache Flink.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Par...
Translation - pandas on AWS
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Quadratic | Spreadsheet with Python, SQL, and AI
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
Translation - DevLake: an open-source data lake and dashboard for DevOps tools.
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage.
A lightweight stream processing library for Go
Translation - A stream processing library
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Implementing best practices for PySpark ETL jobs and applications.
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Data Activation
The best place to learn data engineering. Built and maintained by the data engineering community.