Implementing best practices for PySpark ETL jobs and applications.
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Mass data processing with a complete ETL for .NET developers.
A Python PySpark project with Poetry.
An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.
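The ETL pipelines listed above all follow the same extract-transform-load structure. Below is a minimal, hypothetical sketch of that structure in plain Python; it is not code from any listed project, and the function and field names are illustrative stand-ins for the DataFrame reads, transformations, and writes a real PySpark job would perform.

```python
# Minimal ETL sketch (hypothetical): extract, transform, load as pure
# functions, mirroring the stages a PySpark job would implement with
# DataFrames. All names here are illustrative assumptions.

def extract(rows):
    """Extract: copy raw records (stand-in for spark.read)."""
    return [dict(r) for r in rows]

def transform(records):
    """Transform: drop rows with empty titles and normalise casing."""
    return [
        {**r, "title": r["title"].strip().title()}
        for r in records
        if r.get("title")
    ]

def load(records, sink):
    """Load: append cleaned records to a sink (stand-in for df.write)."""
    sink.extend(records)
    return len(records)

raw = [{"title": "  the hobbit "}, {"title": ""}, {"title": "dune"}]
warehouse = []
count = load(transform(extract(raw)), warehouse)
print(count)  # rows loaded after the empty-title row is filtered out
```

In a real PySpark job each stage would operate on a DataFrame rather than a list, but keeping the stages as separate, independently testable functions is the core best practice the first entry refers to.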