🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...
#Large Language Model#Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
#Large Language Model#A model compilation solution for various hardware
#Computer Science#FedScale is a scalable and extensible open-source federated learning (FL) platform.
#Large Language Model#SpargeAttention: A training-free sparse attention that can accelerate inference for any model.
#Computer Science#Machine Learning Framework for Operating Systems - Brings ML to Linux kernel
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
#Computer Science#A scalable & efficient active learning/data selection system for everyone.
#Large Language Model#A collection of noteworthy MLSys bloggers (Algorithms/Systems)
#Large Language Model#Distributed RL System for LLM Reasoning
#Large Language Model#A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.
#Algorithm Practice#Optimal Sparse Decision Trees
#Natural Language Processing#Materials for my 2021 NYU class on NLP and ML Systems (Master of Engineering).
#Awesome#Federated Learning Systems Paper List
#Computer Science#sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference