Upserts, Deletes And Incremental Processing on Big Data.
Translation: upserts, deletes, and incremental processing on big data.
A collection of Apache Hudi resources
A collection of Apache Hudi demos to help beginners get started with Hudi quickly
A native Rust library for Apache Hudi, with bindings into Python
emr-hudi-example
Apache Hudi Demo
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Hudi documentation in Chinese
EMR Hudi Workshop content
Hudi Demo Notebook
spark hudi demo
spark-hudi-example
Data lake implementation demos, including Iceberg on Flink, Iceberg on Spark, Hudi on Flink, and Hudi on Spark
A library based on Hudi for Spark.
Build a Glue (Spark) streaming pipeline for clickstreams, power a data lake with Apache Hudi, and query it in real time with Athena
Some demos of using Spark to write MySQL and Kafka data to data lake formats such as Delta, Hudi, and Iceberg
LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as Delta Lake, Apache Hudi, and Apache Iceberg.
A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational excellence, reliability and application specific best practices ...
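Learning code for big data components
This project aims to build a management tool and installation packages for big data ecosystem components on domestic Chinese platforms, including the Phytium (飞腾) and Kunpeng (鲲鹏) CPU platforms and the Kylin (麒麟) and UOS operating systems. Planned adaptations include Ambari, HDFS, Yarn, ZooKeeper, MapReduce, Hive, Tez, Spark, Pig, Storm, Flink, Sqoop, Flume, Datax, FlinkX, Filebeat, Canal, Debezium, Presto, Drui...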