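#Database#ClickHouse is a high-performance column-oriented database suited to real-time OLAP analytics, with SQL support.
Presto is a high-performance distributed SQL query engine for big data.
Doris is an MPP database open-sourced by Baidu that supports fast analysis of massive datasets.
StarRocks is a next-generation, high-speed MPP (Massively Parallel Processing) database for all analytics scenarios. Its vision is to make data analysis simpler and more agile for users. Without complex preprocessing, users can rely on StarRocks for fast analytics across many data analysis scenarios.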
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
A modern real-time data processing and analytics DBMS with a cloud-native architecture, built to simplify the data cloud.
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
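ByConity is a cloud-native data warehouse open-sourced by ByteDance, offering read/write separation, elastic scaling, tenant resource isolation, and strong consistency for data reads and writes.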
YTsaurus is a scalable and fault-tolerant open-source big data platform.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Postgres-native Data Warehouse
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
DuckDB-powered data lake analytics from Postgres
Fastest open-source tool for replicating databases to Apache Iceberg or a data lakehouse. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Starting with MongoDB.
Use SQL to build ELT pipelines on a data lakehouse.
#Awesome#A curated list of open source tools used in analytics platforms and data engineering ecosystem
Examples of using Terraform to deploy Databricks resources
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...