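#Learning and Skill Development#Low-code framework for building custom LLMs, neural networks, and other AI models
Translation - Ludwig is a toolbox built on top of TensorFlow that lets you train and test deep learning models without writing code.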
#Computer Science#Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckD...
#Computer Science#A curated, but incomplete, list of data-centric AI resources.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
#Computer Science#The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.
#Natural Language Processing#DataCLUE: a data-centric NLP benchmark and toolkit
Rust implementation of the Data Distribution Service (DDS)
Simulator framework for analysis of performance, energy consumption, area and cost of multi-node multi-chiplet tile-based manycore designs
[ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"
🔥🔥🔥 KDD2024 Best Student Paper
#Natural Language Processing#A data-centric NER annotation tool for your Named Entity Recognition projects
Vue Form with Laravel Inspired Validation and Simply Enjoyable Error Messages Api. (Form Api, Validator Api, Rules Api, Error Messages Api)
#Large Language Models#A list of data-efficient and data-centric LLM (Large Language Model) papers. Our Survey Paper: Towards Efficient LLM Post Training: A Data-centric Perspective
An observer is a wrapper over JSON data, that provides an interface to know when data is changed, with a focus on performance and memory efficiency.
#Computer Science#Code for a Top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI
#Computer Science#Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
From local functions to cloud deployed pipelines
#Natural Language Processing#Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"
#Computer Science#Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
#Computer Science#The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈