Search results for "cublas"

#Computer Science# Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL

C++ · 546
5 years ago

Wheels for llama-cpp-python compiled with cuBLAS support

HTML · 98
2 years ago

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Cuda · 381
8 months ago

Julia interface to CUBLAS

Julia · 26
6 years ago

Code for testing native float16 matrix multiplication performance on Tesla P100 and V100 GPUs, based on cublasHgemm

Cuda · 34
6 years ago

Code for benchmarking GPU performance based on cublasSgemm and cublasHgemm

Cuda · 32
3 years ago

Elixir · 94
4 years ago

#Large Language Models# 🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

321
1 month ago

Simple port of hpl-2.0 to use NVIDIA GPU acceleration with CUBLAS

C · 28
12 years ago