Efficient Triton Kernels for LLM Training
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
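The "single line of code" refers to wrapping an existing PyTorch model with Kernl's optimizer entry point. A minimal sketch, assuming Kernl's `optimize_model` API and a Hugging Face BERT model as an example workload:

```python
import torch
from transformers import AutoModel
from kernl.model_optimization import optimize_model  # Kernl's optimization entry point

# Load a standard PyTorch transformer model and move it to the GPU in half precision.
model = AutoModel.from_pretrained("bert-base-uncased").eval().cuda().half()

# The single line: replace eligible modules/attention with fused Triton kernels.
optimize_model(model)

# Inference proceeds exactly as before; inputs must live on the GPU.
inputs = {
    "input_ids": torch.randint(0, 30000, (1, 128), device="cuda"),
    "attention_mask": torch.ones(1, 128, dtype=torch.long, device="cuda"),
}
with torch.inference_mode():
    outputs = model(**inputs)
```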
A service for autodiscovery and configuration of applications running in containers

#LLM# Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Playing with the Tigress software protection: breaking some of its protections and solving its reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis, and LLVM.
#Data Warehouse# 🚀🚀🚀 A collection of some awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), and the related Datasets and Applica...
#Computer Science# A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
FlagGems is an operator library for large language models implemented in the Triton language.
Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series
Automatic ROPChain Generation
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
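For readers new to Triton, the entry above concerns the GPU programming language itself. As a rough illustration (not taken from any of the listed repositories), a minimal element-wise addition kernel in Triton looks like this:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```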
OpenDILab RL HPC OP Lib, including CUDA and Triton kernels
LLVM based static binary analysis framework
#LLM# 🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.
#Computer Science# A performance library for machine learning applications.
#Computer Science# ClearML - Model-Serving Orchestration and Repository Solution
#Computer Science# NVIDIA-accelerated, deep learned model support for image space object detection
(WIP) This deployment framework aims to provide a simple, lightweight, fast-to-integrate, pipelined deployment framework for algorithm services that ensures reliability, high concurrency and scalability of...