auto-tuning · GitHub Topics

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

low-precision pruning sparsity auto-tuning knowledge-distillation quantization quantization-aware-training post-training-quantization smoothquant large-language-models gptq int8

Python 2.37 k

1 day ago

oracle / bpftune

bpftune uses BPF to auto-tune Linux systems

auto-tuning bpf eBPF Linux

C 1.58 k

5 days ago

zwang4 / awesome-machine-learning-in-compilers

#Computer Science# Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation

machine-learning compiler optimisation parallel-computing parallel-programming parallelism artificial-intelligence operating-system auto-tuning

1.53 k

11 hours ago

KernelTuner / kernel_tuner

#Computer Science# Kernel Tuner

cuda-kernels Python gpu CUDA opencl C C++ auto-tuning gpu-computing Testing software-engineering optimization machine-learning

Python 326

5 days ago

sbu-fsl / kernel-ml

#Computer Science# Machine Learning Framework for Operating Systems - Brings ML to the Linux kernel

operating-system machine-learning Linux mlsys auto-tuning kernel-module

C 242

3 years ago

ROCm / Tensile

#Computer Science# Stretching GPU performance for GEMMs and tensor contractions.

gemm blas dnn neural-networks machine-learning tensors Python opencl hip auto-tuning amd gpu-computing gpu-acceleration gpu matrix-multiplication Assembly

Python 235

2 days ago

CNugteren / CLTune

CLTune: An automatic OpenCL & CUDA kernel tuner

opencl CUDA tuner auto-tuning

C++ 177

2 years ago

ederwander / PyAutoTune

Autotune Module for Python "PyAutoTune"

audio autotune C pitch pyaudio Python real-time realtime voice auto-tuning dsp fft

C 143

4 years ago

HAL-42 / AlchemyCat

#Computer Science# Alchemy Cat: 🔥 Config System for SOTA

auto-tuning computer-vision configuration deep-learning machine-learning parameter-tuning

Python 115

24 days ago

SUSE / phoebe

#Computer Science# Phoebe

artificial-intelligence auto-tuning self-healing machine-learning Linux systems

C 89

4 years ago

tlc-pack / TLCBench

#Computer Science# Benchmark scripts for TVM

tvm benchmark deep-learning auto-tuning

Python 74

3 years ago

weixingsun / jBProF

eBPF profiler for the JVM

eBPF Java profiler flamegraph breakpoint perf bpf latency auto-tuning jvmti jni

C++ 71

4 years ago

ctuning / ck-crowdtuning

#Computer Science# Collective Knowledge crowd-tuning extension to let users crowdsource their experiments (using portable Collective Knowledge workflows) such as performance benchmarking, auto tuning and machine learnin...

optimization collaboration collective-intelligence auto-tuning machine-learning knowledge-sharing Internet of things

Python 34

4 years ago

addb-swstarlab / K2vTune

#Computer Science# K2vTune (A Workload-aware Configuration Tuning for RocksDB)

artificial-intelligence auto-tuning machine-learning rocksdb

Jupyter Notebook 28

1 year ago

cornell-zhang / uptune

A Generic Distributed Auto-Tuning Infrastructure

heuristics distributed-systems auto-tuning Python C++

Python 22

4 years ago

NTNU-HPC-Lab / BAT

A GPU benchmark suite for autotuners

benchmarking Kernel hpc CUDA bat auto-tuning

Cuda 18

1 year ago

go-playground / backoff

Backoff uses an exponential backoff algorithm to back off between retries, with optional auto-tuning functionality.

retry backoff auto-tuning

Go 12

7 years ago

AutoTuningAssociation / autotuning_methodology

This software package accompanies the paper "A Methodology for Comparing Auto-Tuning Optimization Algorithms" (https://doi.org/10.1016/j.future.2024.05.021), making the guidelines in the methodology e...

auto-tuning methodology optimization-algorithms performance-metrics performance-optimization

Python 6

18 days ago