The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
#Natural Language Processing#A large-scale 7B pretrained language model developed by BaiChuan-Inc.
#Natural Language Processing#A 13B large language model developed by Baichuan Intelligent Technology
#Large Language Models#A series of large language models developed by Baichuan Intelligent Technology
This repository contains the outputs for two runs of SmartGPT on the MMLU benchmark.
Using LLMs to evaluate the MMLU dataset.