/llm-reliability-and-consistency-evaluation

Evaluating LLMs' factual accuracy, consistency, and robustness to prompt variations using diverse response and question formats.

Primary Language: HTML