SeaweedFS is a distributed storage system for blobs, objects, files, and data lakes that can quickly store and serve billions of files
Ceph is a distributed object, block, and file storage platform
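A cloud file system designed for developers: a distributed file system built for cloud environments and compatible with the POSIX, HDFS, and S3 protocols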
Utils for streaming large files (S3, HDFS, gzip, bz2...)
The Universal Storage Engine
Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration.
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML...
Real Time Analytics and Data Pipelines based on Spark Streaming
Web tool for Kafka Connect
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underl...
Big Data Ecosystem Docker
StorageTapper is a scalable real-time MySQL change data streaming, logical backup, and logical replication service
Fundamentals of Spark with Python (using PySpark), with code examples