2 links tagged with all of: multimodal + model-evaluation
Links
This article outlines a method for training judge models for Vision-Language Models (VLMs) without human annotations. The approach iteratively improves judgment accuracy using self-synthesized training data, yielding notable gains on a range of evaluation benchmarks.
LMEval, an open-source framework developed by Google, simplifies the evaluation of large language models by offering multi-provider compatibility, incremental evaluation, and multimodal support. With features like a self-encrypting results database and an interactive visualization tool called LMEvalboard, it streamlines benchmarking, making it easier for developers and researchers to assess model performance efficiently.