DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.
Prefect is a modern workflow orchestration tool that lets developers build, observe, and react to data pipelines.
Turns Data and AI algorithms into production-ready web applications in no time.
#Data warehouse# The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Translation - Find label errors in datasets and learn with noisy labels.
Fancy stream processing made operationally mundane
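Translation - Declarative stream processing for mundane tasks and data engineering.
#Large language models# Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Translation - Develop, execute, and monitor distributed workflows reliably at scale.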
#Computer science# Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckD...
Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing...
#Computer science# An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collect...
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka-Connect, 20+ connectors
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Collect, aggregate, and visualize a data ecosystem's metadata
Cloud Native DataOps & AIOps Platform | Cloud-native data and intelligent operations platform
Polyglot workflows without leaving the comfort of your technology stack.
Translation - An intelligent process and workflow automation platform based on software agents.
#Data warehouse# 📙 Awesome Data Catalogs and Observability Platforms.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Tenzir is the data pipeline engine for security teams.
DataOps for Microsoft Data Platform technologies. https://aka.ms/dataops-repo