kulasuoit/IGSR

Mitigating Safety Context Amnesia in Multimodal Reasoning Models via Intent-Guided Safety Reasoning

Repository status: we are organizing the public release. Code, evaluation scripts, prompts, and additional assets will be released progressively.

Overview

Multimodal Large Reasoning Models (MLRMs) can correctly perceive risk-relevant visual cues, yet still fail to enforce safety constraints when harmful objectives are embedded in seemingly benign contexts. We term this failure mode Safety Context Amnesia (SCA): during reasoning, the model over-prioritizes contextual coherence and narrative alignment, causing latent risk signals to be suppressed.

Across multiple multimodal safety benchmarks, Intent-Guided Safety Reasoning (IGSR) substantially improves defense success rates while largely preserving utility.

Warning

This project studies multimodal safety failures and defenses. As a result, the paper materials contain unsafe or harmful examples, used strictly for research and evaluation.
