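An easily extensible, one-stop real-time computing platform, built as a secondary development on Apache Flink, for developing and operating FlinkSQL and SQL jobs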
Postgres-native columnar storage extension
Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and registered trademark of dbt Labs)
An open-source columnar data format designed for fast, real-time analytics on big data.
Free and open source schema versioning and database migration, built natively with .NET 6. New in May 2022: v1.3.15 released.
Time series anomaly detection and root cause analysis on data in SQL data warehouses and databases
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a...
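#Interview#Roadmap for Data Engineering
#Search#Hydra (the nine-headed dragon): a step-by-step guide to building your own cross-platform, TB-to-PB-scale dedicated search engine, your own "eye of God". Hydra is an abstracted distributed operating system oriented toward cloud computing, multi-task scheduling, service communication, data warehousing, and microservices, illustrated by implementing a small crawler-based search engine.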
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
All of my individual learning materials, documents, and notes from the process of getting the Coursera IBM Data Engineer Professional Certificate specialization are stored in this repository.
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
#Awesome#A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
Implementing an end-to-end tweet ETL/analysis pipeline.
AlphaSQL provides integrated type and schema checking and parallelization for sets of SQL files, mainly for BigQuery
End-to-end data engineering project
A library for data warehouse and data integration pattern and architecture documentation.