An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Building ETL Pipelines with Python
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
ETL pipeline using PySpark (Spark with Python)
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Build configuration-driven ETL pipelines on Apache Spark
Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases to data warehouses
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
This solution helps you deploy ETL jobs on a data lake using CDK Pipelines.
Golang framework for streaming ETL, observability data pipeline, and event processing apps
BigQuery ETL
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Go...
LinkedPipes ETL is an RDF based, lightweight ETL tool
Pentaho Data Integration (ETL), a.k.a. Kettle
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Visual crawler and ETL IDE written in C#/WPF
ETL best practices with Airflow, with examples
A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and Delta Lake.
nvpro-pipeline is a research rendering pipeline
Extract, Transform, and Load data with Ruby
Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery https://towardsdatascience.com/how-to-get-any-ethereum-smart-contrac...