#Awesome# A curated list of awesome big data frameworks, resources and other awesomeness.
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Fancy stream processing made operationally mundane
Translation - Declarative stream processing for mundane tasks and data engineering
Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data.
Translation - Streaming data warehouse
#Computer Science# 🌊 Online machine learning in Python
Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, ReadySet caches the ...
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.
A lightweight stream processing library for Go
Pravega - Streaming as a new software defined storage primitive
#Computer Science# Python Stream Processing
#Computer Science# Python stream processing for Kafka
Trill is a single-node query processor for temporal or streaming data.
📐 Pushing the boundaries of simplicity
Superdiff provides a complete and readable diff for both arrays and objects. Plus, it supports stream and file inputs for handling large datasets efficiently, is battle-tested, has zero dependencies, ...
Open-Source Web UI for managing Apache Kafka clusters