#Computer Science# 🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
Quizzes & Assignment Solutions for the Google Data Analytics Professional Certificate on Coursera. Also includes a few side resources that I found helpful.
Exploratory data analysis 📊 using Python 🐍 of a used car 🚘 database taken from Kaggle
A domain-specific probabilistic programming language for scalable Bayesian data cleaning
Wrangler Transform: A DMD system for transforming Big Data
#Computer Science# XGBoost, LightGBM, LSTM, Linear Regression, Exploratory Data Analysis
This is a binary classification problem related to Autism Spectrum Disorder (ASD) screening in adult individuals. Given some attributes of a person, my model can predict whether the person would ha...
Java DSL for (online) deduplication
#Computer Science# This repo was created to share the required/discussed files from the Online Internship training program on Data Science Using Python in May 2021
Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.
Quick and dirty data mining made easier in Sublime Text
#Computer Science# Predict if a driver will file an insurance claim next year. (Kaggle Competition)
Data cleansing and clustering with Vector Quantization and Adaptive Resonance Theory
Product Rationalization of Pro Bikes Inc using Power BI
This library contains the file system extensions to Data-Forge that allow it to read and write CSV and JSON files directly in Node.js
Data Structures project in C++11 that uses custom Vector and String structures with move semantics (Rule of Five)