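Apache Superset is an enterprise-grade platform for data visualization and data analysis.
Apache Airflow is a platform for scheduling, orchestrating, and monitoring workflows.
Prefect is a modern workflow orchestration tool that lets developers build, observe, and react to data pipelines.
Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases into data warehouses.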
Turns Data and AI algorithms into production-ready web applications in no time.
#Getting Started# Roadmap to becoming a data engineer in 2021
#Computer Science# 🧙 Build, run, and manage data pipelines for integrating and transforming data.
#Computer Science# The Open Source Feature Store for Machine Learning
lakeFS - Data version control for your data lake | Git for data
Translation - An open-source platform that provides resilience and manageability for data lakes built on object storage
SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Par...
A list of useful resources to learn Data Engineering from scratch
#Computer Science# The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
Quadratic | Spreadsheet with Python, SQL, and AI
Compare tables within or across databases
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
Translation - DevLake: an open-source data lake and dashboard for DevOps tools.
#Data Warehouse# Blazing-fast Data-Wrangling toolkit