#Large Language Models# A high-throughput and memory-efficient inference and serving engine for LLMs
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark performance across instance types and serving stack options.