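Apache Arrow is a multi-language development platform for in-memory analytics. It includes a standardized columnar memory format that can represent both flat and hierarchical data, enabling efficient analytic operations on modern CPU and GPU hardware.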
Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.
#Data warehouse# Create full-fledged APIs for slowly moving datasets without writing a single line of code.
#Data warehouse# Blazing-fast Data-Wrangling toolkit
#Computer science# The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, a...
A large-scale entity and relation database supporting aggregation of properties
Single-binary Postgres read replica optimized for analytics
Quilt is a data mesh for connecting people with actionable data
Postgres Data Warehouse, built on Iceberg
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
A portable embedded database using Arrow.
Simple Windows desktop application for viewing & querying Apache Parquet files
Fastest open-source tool for replicating Databases to Data Lake in Open Table Formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres,...