Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
#Computer Science# MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection ...
C++ LINQ-like library of higher-order functions for data manipulation
Dynatrace hash library for Java
Performant implementations of various streaming algorithms, including Count–min sketch, Top-k, HyperLogLog, and reservoir sampling.
Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering
Federated Principal Component Analysis Revisited!
A Set of Streaming Algorithms in C++, Python, and Go
RiverText is a framework that standardizes the Incremental Word Embeddings proposed in the state of the art. Please feel welcome to open an issue in case you have any questions or a pull request if you wa...
Streaming, Memory-Limited, r-truncated SVD Revisited!
This is the codebase for Faucet, described in our manuscript: https://academic.oup.com/bioinformatics/article/34/1/147/4004871, by Roye Rozov, Gil Goldshlager, Eran Halperin, and Ron Shamir
Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlatio...
#Algorithm Practice# This repository contains all the assignment solutions, starter files, and other materials related to this specialization.
A simple, time-tested family of random hash functions in Python, based on CRC32 and xxHash, affine transformations, and the Mersenne Twister. 🎲
Create MPEG2-TS encapsulated stream-segments.
#Computer Science# Python wrapper for Francesco Parrella's OnlineSVR C++ implementation with a scikit-learn-compatible interface.