Loki: Open-source solution designed to automate the process of verifying factuality
#Awesome#Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
#Large Language Models#✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.
#Large Language Models#[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
#Large Language Models#[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
#Large Language Models#[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
😎 A curated list of awesome papers, methods & resources on LMM hallucinations.
#Large Language Models#Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"
#Natural Language Processing#[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
#Large Language Models#An up-to-date curated list of research work, papers & resources on hallucinations in state-of-the-art large vision-language models
#Natural Language Processing#[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
This is the official repo for Debiasing Large Visual Language Models, including a post-hoc debiasing method and a Visual Debias Decoding strategy.
Code & data for our paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
#Natural Language Processing#Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, in absolute or relative mode, and much more. It contains a list of all the availab...
#Natural Language Processing#"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li, Ye Yuan, and Zehua Zhang
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
#Large Language Models#Official Implementation of 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs