GitHub Chinese Community

#pyspark

ibis-project / ibis

the portable Python dataframe library

Python, impala, pandas, database, clickhouse, PostgreSQL, SQLite, MySQL, datafusion, SQL, pyspark, duckdb, BigQuery, sql-server, polars, snowflake, trino
Python 5.92 k
2 days ago
microsoft / SynapseML

#Computer Science# Simple and Distributed Machine Learning

Apache Spark, pyspark, Azure, Scala, Microsoft, machine learning, databricks, cognitive-services, lightgbm, HTTP, model-deployment, deep learning, artificial intelligence, data science, synapse, big-data, onnx, OpenCV
Scala 5.15 k
3 days ago
JohnSnowLabs / spark-nlp

#Natural Language Processing# State of the Art Natural Language Processing

natural language processing, Apache Spark, pyspark, named-entity-recognition, sentiment-analysis, lemmatizer, spell-checker, entity-extraction, part-of-speech-tagger, bert, transformers, Tensorflow, language-detection, machine-translation, text-classification, large language model, question-answering, llamacpp, onnx
Scala 4.01 k
2 days ago
apache / linkis

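Linkis builds a computation middleware layer between upper-layer applications and underlying engines. Through the standard interfaces Linkis provides, such as REST, WebSocket, and JDBC, upper-layer applications can easily connect to and access underlying engines such as Spark, Presto, and Flink, while also gaining cross-engine context sharing and unified governance and orchestration of computation tasks and engines.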

SQL, Apache Spark, hive, pyspark, livy, linkis, engine, storage, resource-manager, application-manager, scriptis, REST API, thrift-server, jdbc, presto, impala
Java 3.38 k
1 day ago
AlexIoannides / pyspark-example-project

Implementing best practices for PySpark ETL jobs and applications.

pyspark, etl-job, Python, data-engineering, Apache Spark, data science, etl, etl-pipeline
Python 1.96 k
3 years ago
uber / petastorm

#Computer Science# Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, a...

Tensorflow, PyTorch, deep learning, machine learning, pyspark, parquet
Python 1.85 k
2 years ago
awesome-spark / awesome-spark

A curated list of awesome Apache Spark packages and resources.

Apache Spark, pyspark, Awesome Lists
Shell 1.81 k
9 months ago
jadianes / spark-py-notebooks

#Computer Science# Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Apache Spark, Python, pyspark, data analysis, Jupyter Notebook, notebook, IPython, data science, machine learning, big-data, bigdata
Jupyter Notebook 1.65 k
1 year ago
hi-primus / optimus

#Computer Science# 🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Apache Spark, pyspark, data-wrangling, bigdata, data science, data-cleansing, data-transformation, machine learning, data-profiling, data-extraction, data-exploration, data analysis, data-preparation, cudf, dask, data-cleaning
Python 1.51 k
7 months ago
ptyadana / SQL-Data-Analysis-and-Visualization-Projects

SQL data analysis & visualization projects using MySQL, PostgreSQL, SQLite, Tableau, Apache Spark and pySpark.

SQL, MySQL, exercises, data analysis, PostgreSQL, SQLite, tableau, challenges, sql-queries, Python, pyspark, Apache Spark
Jupyter Notebook 1.48 k
3 years ago
jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Apache Spark, Kernel, cluster, livy, magic, sql-query, pandas-dataframe, Jupyter Notebook, pyspark, kerberos, notebook
Python 1.36 k
5 days ago
logicalclocks / hopsworks

#Computer Science# Hopsworks - Data-Intensive AI platform with a Feature Store

feature-store, Amazon Web Services, Azure, data science, feature-engineering, feature-management, Google Cloud, governance, kserve, machine learning, mlops, model-serving, pyspark, Python, Serverless
Java 1.24 k
5 months ago
mahmoudparsian / pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark

big-data, big-data-analytics, pyspark, Apache Spark, dataframes
Jupyter Notebook 1.23 k
2 months ago
narwhals-dev / narwhals

Lightweight and extensible compatibility layer between dataframe libraries!

cudf, pandas, polars, dask, duckdb, pyspark
Python 1.17 k
1 day ago
mahmoudparsian / data-algorithms-book

#Computer Science# MapReduce, Spark, Java, and Scala for Data Algorithms Book

hadoop-mapreduce, Java, distributed-computing, Scala, mapreduce, Python, machine learning, pyspark, Apache Spark, design-patterns
Java 1.08 k
9 months ago
graphframes / graphframes

GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs

Apache Spark, big-data, dataframe, dataframes, graphs, networks, pyspark
Scala 1.06 k
9 days ago
h2oai / sparkling-water

#Computer Science# Sparkling Water provides H2O functionality inside a Spark cluster

h2o, Apache Spark, machine learning, integration, big-data, pyspark, Scala
Scala 973
8 months ago
WeBankFinTech / Scriptis

#Editor# Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.

hue, zeppelin, Apache Spark, hive, SQL, pyspark, Scala, ide, hql, linkis
Vue 814
7 months ago
lyhue1991 / eat_pyspark_in_10_days

pyspark🍒🥭 is delicious, just eat it!😋😋

Apache Spark, pyspark
Python 812
3 years ago
lakehq / sail

LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.

arrow, big-data, data, pyspark, Rust, Apache Spark, SQL, datafusion, Python
Rust 811
5 days ago