The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Translation - Find label errors in datasets and learn with noisy labels.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...
Translation - An open standard for metadata. A single place to discover, collaborate on, and get your data right.
#Computer Science# The Open Source Feature Store for Machine Learning
Translation - A feature store for machine learning
lakeFS - Data version control for your data lake | Git for data
Translation - An open-source platform that provides resilience and manageability for object-storage-based data lakes
Compare tables within or across databases
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collect...
re_data - fix data issues before your users & CEO discover them 😊
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...