Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
#Search# A search engine that can hold 100 trillion lines of log data.
Kubernetes-native platform to run massively parallel data/streaming jobs
Efficient transducers for Julia
#Computer Science# Fundamentals of Spark with Python (using PySpark), with code examples
Parallelized Base functions
Data science and Big Data with Python
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Efficient and scalable parallelism using the Message Passing Interface (MPI) to handle big data and computationally intensive problems.
Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl)
The core parallel and shared memory library used by Hack, Flow, and Pyre
Python 2.7 code and learning notes for Spark 2.1.1
Inverted indexer, web crawler, sort, search, and Porter stemmer written in Python for information retrieval.
App Engine Datastore mapper in Go