CUDA Kernel Benchmarking Library
Fast CUDA Kernels for ResNet Inference.
📚Tensor/CUDA Cores, 📖150+ CUDA Kernels, ⚡️⚡️toy-hgemm library with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS 🎉🎉).
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast i...
CLTune: An automatic OpenCL & CUDA kernel tuner
MWE for using the Eigen library in CUDA kernels
PyTorch custom CUDA kernel for searchsorted
Using custom CUDA kernels with OpenCV Mat objects.
Get down and dirty with FlashAttention 2.0 in PyTorch: plug and play, no complex CUDA kernels
Embree ray tracing kernels repository.
Collections of Apollo Kernels
Ahead of Time compiler for numeric kernels
Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels
CUDA Library Samples
The Vector Optimized Library of Kernels
rtl88x2bu driver updated for current kernels.
Copyleft archives for Xperia kernels