#LLM# Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
#LLM# SGLang is a fast serving framework for large language models and vision language models.
Mixture-of-Experts for Large Vision-Language Models
#LLM# MoBA: Mixture of Block Attention for Long-Context LLMs
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538 (see the top-k gating sketch after this list)
#LLM# ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
#LLM# Tutel MoE: An Optimized Mixture-of-Experts Implementation
#Computer Science# Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
#LLM# A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
#NLP# Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
#NLP# MindSpore online courses: Step into LLM
#Android# Official LISTEN.moe Android app
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
#Android# A libGDX cross-platform API for in-app purchasing.
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models.
MoH: Multi-Head Attention as Mixture-of-Head Attention
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
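Several of the entries above revolve around the same core idea from the Shazeer et al. paper linked in the re-implementation item: route each token to only the top-k of many expert feed-forward networks. Below is a minimal, hypothetical PyTorch sketch of that top-k routing, not code from any repository listed here; it omits the noise term and load-balancing loss of the original paper, and names such as `TopKMoE`, `d_hidden`, and `k` are invented for illustration.

```python
# Sketch of sparsely-gated top-k MoE routing (simplified from Shazeer et al., 2017).
# Not taken from the repositories above; for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # Gating projection producing one logit per expert.
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent two-layer feed-forward network.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
             for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model). Keep only the k highest-scoring experts per token.
        logits = self.gate(x)                                   # (batch, num_experts)
        topk_val, topk_idx = logits.topk(self.k, dim=-1)        # (batch, k)
        weights = F.softmax(topk_val, dim=-1)                   # renormalize over selected experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (topk_idx == e)                              # tokens routed to expert e
            token_mask = mask.any(dim=-1)
            if not token_mask.any():
                continue
            gate_w = (weights * mask).sum(dim=-1)[token_mask]   # gate weight for expert e
            out[token_mask] += gate_w.unsqueeze(-1) * expert(x[token_mask])
        return out


if __name__ == "__main__":
    layer = TopKMoE(d_model=16, d_hidden=32, num_experts=4, k=2)
    y = layer(torch.randn(5, 16))
    print(y.shape)  # torch.Size([5, 16])
```

Production MoE frameworks in this list (e.g. Tutel, SGLang) replace the per-expert Python loop with batched dispatch/combine kernels and expert parallelism, but the routing logic is the same.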