map-reduce

Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.

distributed-computing map-reduce Go distributed-systems

Go 3.54 k

2 months ago

numaproj / numaflow

Kubernetes-native platform to run massively parallel data/streaming jobs

Kubernetes stream-processing data-processing pipeline map-reduce Hacktoberfest

Rust 2.26 k

9 hours ago

Qihoo360 / poseidon

#Search# A search engine that can hold 100 trillion lines of log data.

poseidon search-engine Go big-data map-reduce

Go 1.99 k

8 years ago

JuliaFolds / Transducers.jl

Efficient transducers for Julia

Julia transducers parallel high-performance map-reduce distributed-computing iterators

Julia 441

5 days ago

tirthajyoti / Spark-with-Python

#Computer Science# Fundamentals of Spark with Python (using PySpark), code examples

pyspark Apache Spark dataframe machine-learning big-data database map-reduce Python hdfs analytics hadoop distributed-computing parallel-computing SQL apache

Jupyter Notebook 352

3 years ago

tkf / ThreadsX.jl

Parallelized Base functions

Julia high-performance map-reduce transducers sorting-algorithms parallel

Julia 330

16 days ago

phelps-sg / python-bigdata

Data science and Big Data with Python

data-science Python hbase NumPy numerical-methods notebook-jupyter Apache Spark map-reduce

Jupyter Notebook 135

2 years ago

xarray-contrib / flox

Fast & furious GroupBy operations for dask.array

dask xarray map-reduce

Python 133

14 days ago

asavinov / prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

workflow data-processing map-reduce Apache Spark pandas Python feature-engineering data-science data-wrangling data-preprocessing data-preparation business-intelligence olap

Python 91

4 years ago