#Interview# A curated list of awesome System Design (a.k.a. Distributed Systems) resources.
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Life cycle: internal workings of HDFS, Sqoop, Hive, Spark, HBase, and Kafka, with code.
Hadoop 3.2 in single-node or cluster mode with the gotty web terminal, Spark, Jupyter with PySpark, Hive, and other ecosystem components.
Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04
Dockerfile for running Apache Knox (http://knox.apache.org/) in Docker
#Computer Science# The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
Analysis of YouTube data using the Hadoop MapReduce framework in Java.
A large-scale distributed data processing system for streaming analytics, built on the Hadoop ecosystem (Apache Spark and HDFS) in the cloud for real-time spatial analytics.
Helm chart for Apache Knox
Collecting tweets with the Flume service and analyzing them
Spark Streaming & Kafka Quick Start Tutorial
Practice programs in the Hadoop ecosystem, for reference
[BigData] One-year weblog analysis using Pig
Customer big data is stored and analyzed using Hadoop and related tools such as Hive, ZooKeeper, HBase, and Sqoop; all customer details are analyzed and the results are then reported. This result is ...
This project focuses on analyzing movie data using PySpark, tailored for efficient data processing on the Hadoop Distributed File System (HDFS)