hadoop-mapreduce · GitHub Topics

#Computer Science# MapReduce, Spark, Java, and Scala for Data Algorithms Book

hadoop-mapreduce Java distributed-computing Scala mapreduce Python machine-learning pyspark Apache Spark design-patterns

Java 1.07 k

6 months ago

Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.

flink Apache Spark hadoop-mapreduce

Java 255

1 year ago

touero / ctenopharyngodon-idella

#Web Crawler# Uses MapReduce's Java interface to crawl data on Chinese universities in a distributed manner and to learn the basics of HDFS.

FastAPI hadoop hadoop-mapreduce Java mapreduce Maven scraping

Java 140

6 months ago

groda / big_data

Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are s...

big-data bigdata Apache Spark spark-sql Docker mapreduce pyspark hadoop Jupyter Notebook hadoop-hdfs hadoop-mapreduce

Jupyter Notebook 73

3 months ago

vim89 / datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...

Apache Spark spark-sql Python pyspark etl etl-pipeline etl-framework XML xml-parsing datalake big-data hadoop hadoop-mapreduce hadoop-hdfs data-pipeline

Python 53

2 years ago

maniram-yadav / Big_DataHadoop_Projects

Big data projects implemented by Maniram Yadav

Apache Spark pig hadoop hdfs sqoop hive mapreduce big-data-analytics hadoop-mapreduce hadoop-hdfs flume

PigLatin 51

7 years ago

seraogianluca / k-means-mapreduce

K-Means algorithm implementation with Hadoop and Spark for the Cloud Computing course of the MSc AIDE at the University of Pisa.

machine-learning hadoop-mapreduce Apache Spark hadoop

Java 47

4 years ago

caizkun / mapreduce-examples

A collection of MapReduce problems and solutions

mapreduce hadoop-mapreduce

Java 35

8 years ago

anjalysam / Hadoop

Shows how to install Hadoop on Google Colab and how to run MapReduce jobs in Hadoop

hadoop hadoop-mapreduce

Jupyter Notebook 33

5 years ago

absnaik810 / CloudComputing

Projects done in the Cloud Computing course.

hadoop hadoop-mapreduce hbase inverted-index NoSQL hdfs

Java 25

7 years ago

jmaister / wordcount

Hadoop MapReduce word counting with Java

hadoop-mapreduce Java Maven

Java 24

5 years ago

jyzhangchn / FBDP-project2

Chinese text mining | Public opinion analysis | Hadoop | Java | MapReduce

Java knn naive-bayes hadoop-mapreduce

HTML 23

7 years ago

arshdeepbahga / cloud-computing-solutions-architect-book-code

Source code for the examples in the book Cloud Computing Solutions Architect: A Hands-On Approach by Arshdeep Bahga and Vijay Madisetti

cloud-computing Amazon Web Services aws-lambda aws-s3 aws-dynamodb boto3 aws-ec2 aws-iot aws-sqs aws-apigateway aws-iam serverless-architectures hadoop-mapreduce Apache Spark storm flink spark-streaming MongoDB

CSS 22

6 years ago

benedekh / bigdata-projects

Student projects in the Big Data field.

bigdata big-data Apache Spark hadoop hadoop-mapreduce mapreduce

Java 19

2 months ago

MoustafaAMahmoud / BigDataInDepth

Data Engineering Course

hadoop hadoop-mapreduce Apache Spark distributed-systems Scala kafka

TeX 18

10 months ago

QiushiSun / Distributed-Computing-Systems

2021 Spring (Distributed Computing Systems) Distributed Systems and Programming

distributed-systems distributed-computing Apache Spark hadoop-mapreduce flink

Java 15

4 years ago

lucas91batista / twitter-hashtag-graph

Twitter + Flume + Hadoop (HDFS, MapReduce) + Neo4j + Python

Twitter hadoop hadoop-mapreduce hadoop-hdfs Neo4j

JavaScript 15

3 years ago

Keerthivasan13 / CSCI572-Information_Retrieval_And_Web_Search_Engines

#Search# Search Engine projects

information-retrieval scraping-websites crawling pagerank-algorithm hadoop-mapreduce hadoop apache solr lucene tika jsoup networkx search-engine autocomplete spellchecker PHP

Java 14

5 years ago

rajatgarg149 / BigData-Essentials-HDFS-SPARK-RDD

big-data Coursera Apache Spark mapreduce hadoop hadoop-mapreduce distributed-file-system

Jupyter Notebook 14

6 years ago

James-QiuHaoran / distributed-computing-platform-mapreduce

This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It is extended from a course project at UIUC awarded the best Java version implementation ...

mapreduce hadoop hadoop-mapreduce distributed-computing distributed-file-system membership-management distributed-systems cloud-computing

Java 13

4 years ago