Fast and memory-efficient exact attention
#NLP# Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Ring attention implementation with flash attention
Flash Attention in ~100 lines of CUDA (forward pass only)
FlashAttention (Metal Port)
Implementation of Flash Attention in Jax
Implementation of FlashAttention in PyTorch
Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS (a minimal sketch of the tiled forward pass appears after this list)
Implementation of fused cosine similarity attention in the same style as Flash Attention
Add Flash-Attention to Huggingface Models (see the usage sketch after this list)
Julia implementation of the Flash Attention algorithm
Attention Meter measures face attention via a webcam. Currently, Attention Meter is available in Python and Flash.
Julia implementation of the flash-attention operation for neural networks.
My own attempt at a long-context genomics model, leveraging recent advances in long-context attention modeling (Flash Attention plus other hierarchical methods)
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
All about attention in neural networks: soft attention, attention maps, local and global attention, and multi-head attention.
Easy flash notifications
ReFlex: Remote Flash == Local Flash
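For readers skimming the tutorials above (e.g., the ~100-line CUDA forward pass), the core idea behind Flash Attention is a tiled forward pass with an online softmax, so the full attention matrix is never materialized. The sketch below is a plain PyTorch illustration under assumed shapes and block sizes; it is not code from any of the listed repositories.

```python
# Minimal sketch of a tiled (online-softmax) attention forward pass, the technique
# Flash Attention implements in fused kernels. Shapes and block size are illustrative.
import torch

def flash_attention_forward(q, k, v, block_size=64):
    """Compute softmax(q @ k^T / sqrt(d)) @ v one key/value block at a time,
    keeping a running max and normalizer instead of the full attention matrix."""
    seq_len, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((seq_len, 1), float("-inf"))
    row_sum = torch.zeros(seq_len, 1)

    for start in range(0, seq_len, block_size):
        k_blk = k[start:start + block_size]           # (B, d)
        v_blk = v[start:start + block_size]           # (B, d)
        scores = (q @ k_blk.T) * scale                # (seq_len, B)

        blk_max = scores.max(dim=-1, keepdim=True).values
        new_max = torch.maximum(row_max, blk_max)
        correction = torch.exp(row_max - new_max)     # rescale old accumulators
        p = torch.exp(scores - new_max)               # un-normalized block weights

        out = out * correction + p @ v_blk
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        row_max = new_max

    return out / row_sum

# Sanity check against dense attention on random data.
q, k, v = (torch.randn(256, 32) for _ in range(3))
ref = torch.softmax((q @ k.T) * 32 ** -0.5, dim=-1) @ v
assert torch.allclose(flash_attention_forward(q, k, v), ref, atol=1e-4)
```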
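For the Huggingface entry above, Flash Attention can typically be enabled through the `attn_implementation` argument in the transformers library. The snippet below is a minimal sketch assuming a recent transformers release, an installed flash-attn package, and a CUDA GPU; the model id is a placeholder, and this is not the listed repository's own code.

```python
# Hedged sketch: loading a Hugging Face model with Flash Attention 2 enabled.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any model with Flash Attention support

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,               # Flash Attention kernels require fp16/bf16
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",                        # requires accelerate
)

inputs = tokenizer(
    "Flash attention keeps memory use linear in sequence length.",
    return_tensors="pt",
).to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```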