T2ISafety

📢 We are currently organizing the code for T2ISafety. If you are interested in our work, please star ⭐ our project.


🎉 Introduction

Text-to-image (T2I) models have rapidly advanced, enabling the generation of high-quality images from text prompts across various domains. However, these models present notable safety concerns, including the risk of generating harmful, biased, or private content. Research on assessing T2I safety remains in its early stages: while some efforts have evaluated models on specific safety dimensions, many critical risks remain unexplored.

To address this gap, we introduce T2ISafety, a safety benchmark that evaluates T2I models across three key domains: toxicity, fairness, and privacy. We build a detailed hierarchy of 12 tasks and 44 categories based on these three domains, and meticulously collect 70K corresponding prompts. Based on this taxonomy and prompt set, we build a large-scale T2I dataset with 68K manually annotated images and train an evaluator capable of detecting critical risks that previous work failed to identify, including risks that even ultra-large proprietary models such as the GPT series cannot correctly detect. We evaluate 12 prominent diffusion models on T2ISafety and reveal several concerns, including persistent issues with racial fairness, a tendency to generate toxic content, and significant variation in privacy protection across models, even with defense methods such as concept erasing.

🚩 Features

  1. Compact Taxonomy with Hierarchical Levels: the benchmark proposes a structured three-level hierarchy comprising 3 domains, 12 tasks, and 44 categories (see the sketch below).
  2. Advanced Evaluation Framework: a specialized fine-tuned evaluator for images, named ImageGuard.
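
To make the three-level structure concrete, here is a minimal sketch of the domain → task → category hierarchy. The three domain names come from the paper; every task and category name below is an illustrative placeholder, not the benchmark's actual taxonomy.

```python
# Minimal sketch of the three-level hierarchy (domain -> task -> category).
# Domain names come from the paper; all task/category names below are
# illustrative placeholders, NOT the benchmark's actual taxonomy.
TAXONOMY = {
    "toxicity": {
        "illustrative_task_a": ["illustrative_category_1", "illustrative_category_2"],
    },
    "fairness": {
        "illustrative_task_b": ["illustrative_category_3"],
    },
    "privacy": {
        "illustrative_task_c": ["illustrative_category_4"],
    },
}

def iter_categories(taxonomy):
    """Yield (domain, task, category) triples by walking the hierarchy."""
    for domain, tasks in taxonomy.items():
        for task, categories in tasks.items():
            for category in categories:
                yield domain, task, category

# The real benchmark spans 12 tasks and 44 categories across these 3 domains.
for triple in iter_categories(TAXONOMY):
    print(triple)
```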

🚩 Dataset construction

Overview diagram: the creation of the T2ISafety dataset involves three key stages: prompt construction, image generation, and human annotation. The dataset showcases prompt-image pairs across the three main domains of fairness, toxicity, and privacy. T2ISafety is derived from a distinct subset following the prompt construction phase.
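
The three stages can be sketched as a simple pipeline. Every function body below is an illustrative stand-in; none of the names correspond to the repository's actual scripts, prompt sources, or annotation tooling.

```python
# Illustrative sketch of the three-stage construction pipeline described
# above. All functions are stand-ins for the real scripts and tooling.
from typing import Dict, List

def construct_prompts(taxonomy: Dict) -> List[str]:
    # Stage 1: collect prompts for every category (~70K in the paper).
    return ["an illustrative prompt for one category"]

def generate_images(prompts: List[str], model_name: str) -> List[bytes]:
    # Stage 2: run a T2I model on each prompt; placeholder bytes here.
    return [b"<image-bytes>" for _ in prompts]

def annotate_images(images: List[bytes]) -> List[Dict]:
    # Stage 3: human annotators label each image (~68K labeled in the paper).
    return [{"image": image, "label": "safe-or-unsafe"} for image in images]

prompts = construct_prompts({"toxicity": {}, "fairness": {}, "privacy": {}})
images = generate_images(prompts, model_name="some-t2i-model")
dataset = annotate_images(images)
```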

🚩 Model architecture

Overview diagram: network architecture and additional loss of ImageGuard. Visual representations are extracted by a vision encoder, processed through a perceive sampler, and fed into the LLM alongside the tokenized query. Cross-modal attention (CMA) modules in the transformer layers focus on safety-related image regions. A contrastive loss aligns visual features with their captions, enhancing image-text consistency. A gating factor controls how the modalities are merged for robust multimodal understanding.
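
The two components the diagram emphasizes, gated cross-modal attention and the image-text contrastive loss, can be sketched roughly as follows. This is a minimal PyTorch sketch under assumed shapes and module placement, not ImageGuard's actual implementation; the zero-initialized tanh gate and symmetric InfoNCE loss are common choices assumed here for illustration.

```python
# Minimal sketch of a gated cross-modal attention (CMA) block and an
# image-text contrastive loss. Dimensions, gating placement, and loss
# details are assumptions, not ImageGuard's exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossModalAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Learnable gating factor, initialized to 0 so training starts from
        # the text-only pathway and gradually admits visual information.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_tokens, visual_tokens):
        # Text hidden states attend to visual tokens (query=text, key/value=image).
        attended, _ = self.attn(text_tokens, visual_tokens, visual_tokens)
        # tanh-gated residual merge of the two modalities.
        return text_tokens + torch.tanh(self.gate) * attended

def contrastive_loss(img_emb, txt_emb, temperature: float = 0.07):
    # Symmetric InfoNCE: matched image-caption pairs sit on the diagonal.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy shapes: batch of 4, 16 text tokens, 64 visual tokens, width 512.
text = torch.randn(4, 16, 512)
vision = torch.randn(4, 64, 512)
fused = GatedCrossModalAttention(512)(text, vision)      # (4, 16, 512)
loss = contrastive_loss(vision.mean(1), text.mean(1))    # pooled embeddings
```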

Results

🚩 T2I model results

(Figure: evaluation results of 12 prominent diffusion models on T2ISafety.)

📑 Citation

@article{libenchmarking,
  title={Benchmarking Ethics in Text-to-Image Models: A Holistic Dataset and Evaluator for Fairness, Toxicity, and Privacy},
  author={Li, Lijun and Shi, Zhelun and Hu, Xuhao and Dong, Bowen and Qin, Yiran and Liu, Xihui and Sheng, Lu and Shao, Jing}
}

SALAD-Bench Team
