Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
A Scala kernel for Jupyter
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Qubole Sparklens tool for performance tuning Apache Spark
🐍 Quick reference guide to common patterns & functions in PySpark.
The Internals of Spark SQL
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Use SQL to build ELT pipelines on a data lakehouse.
Apache Spark™ and Scala Workshops
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apach...
A prototype big data platform project; the source code for the book Big Data Platform Architecture and Prototype
Spark Structured Streaming / Kafka / Cassandra / Elastic
An encrypted data analytics platform
Spark SQL implementations of ItemCF, UserCF, and Swing: recommender systems, recommendation algorithms, collaborative filtering