PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
Published in arXiv, 2025
This paper introduces PRISM, a framework that aligns Vision-Language Models (VLMs) through principled reasoning to achieve integrated safety in multimodal settings. The key challenge it addresses is overdefense, which harms utility, and the resulting need to balance safety against performance on benign inputs.