We introduce OpenSDID, a large-scale dataset specifically curated for the OpenSDI challenge. Our dataset design addresses the three core requirements essential for open-world spotting of AI-generated content: user diversity, model innovation, and manipulation scope.
OpenSDID comprises 300,000 images, evenly split between real and fake samples and divided into training and testing sets. Refer to the paper for further dataset details.
How we create the OpenSDID dataset.
Localization results on OpenSDID (pixel-level IoU and F1 per test generator; AVG is the mean across generators):

Method | SD1.5 IoU | SD1.5 F1 | SD2.1 IoU | SD2.1 F1 | SDXL IoU | SDXL F1 | SD3 IoU | SD3 F1 | Flux.1 IoU | Flux.1 F1 | AVG IoU | AVG F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
MVSS-Net | 0.5785 | 0.6533 | 0.4490 | 0.5176 | 0.1467 | 0.1851 | 0.2692 | 0.3271 | 0.0479 | 0.0636 | 0.2983 | 0.3493 |
CAT-Net | 0.6636 | 0.7480 | 0.5458 | 0.6232 | 0.2550 | 0.3074 | 0.3555 | 0.4207 | 0.0497 | 0.0658 | 0.3739 | 0.4330 |
PSCC-Net | 0.5470 | 0.6422 | 0.3667 | 0.4479 | 0.1973 | 0.2605 | 0.2926 | 0.3728 | 0.0816 | 0.1156 | 0.2970 | 0.3678 |
ObjectFormer | 0.5119 | 0.6568 | 0.4739 | 0.4144 | 0.0741 | 0.0984 | 0.0941 | 0.1258 | 0.0529 | 0.0731 | 0.2414 | 0.2737 |
TruFor | 0.6342 | 0.7100 | 0.5467 | 0.6188 | 0.2655 | 0.3185 | 0.3229 | 0.3852 | 0.0760 | 0.0970 | 0.3691 | 0.4259 |
DeCLIP | 0.3718 | 0.4344 | 0.3569 | 0.4187 | 0.1459 | 0.1822 | 0.2734 | 0.3344 | 0.1121 | 0.1429 | 0.2520 | 0.3025 |
IML-ViT | 0.6651 | 0.7362 | 0.4479 | 0.5063 | 0.2149 | 0.2597 | 0.2363 | 0.2835 | 0.0611 | 0.0791 | 0.3251 | 0.3730 |
MaskCLIP | 0.6712 | 0.7563 | 0.5550 | 0.6289 | 0.3098 | 0.3700 | 0.4375 | 0.5121 | 0.1622 | 0.2034 | 0.4271 | 0.4941 |
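The localization scores above are pixel-level metrics computed between a predicted binary mask and the ground-truth manipulation mask. As a minimal sketch (not the official evaluation code, which may differ in averaging and thresholding details), per-image IoU and F1 can be computed as:

```python
import numpy as np

def localization_scores(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """Pixel-level IoU and F1 between binary masks (1 = manipulated pixel)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # correctly flagged pixels
    fp = np.logical_and(pred, ~gt).sum()   # false alarms
    fn = np.logical_and(~pred, gt).sum()   # missed manipulated pixels
    iou = tp / (tp + fp + fn + eps)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return float(iou), float(f1)

# Toy example: the prediction covers half of a 2x2 manipulated region.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:2] = 1
iou, f1 = localization_scores(pred, gt)  # iou = 0.5, f1 ≈ 0.6667
```

Dataset-level numbers like those in the table are then obtained by averaging these per-image scores over each generator's test split.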
Detection results on OpenSDID (image-level F1 and accuracy per test generator; AVG is the mean across generators):

Method | SD1.5 F1 | SD1.5 Acc | SD2.1 F1 | SD2.1 Acc | SDXL F1 | SDXL Acc | SD3 F1 | SD3 Acc | Flux.1 F1 | Flux.1 Acc | AVG F1 | AVG Acc |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CNNDet | 0.8460 | 0.8504 | 0.7156 | 0.7594 | 0.5970 | 0.6872 | 0.5627 | 0.6708 | 0.3572 | 0.5757 | 0.6157 | 0.7087 |
GramNet | 0.8051 | 0.8035 | 0.7401 | 0.7666 | 0.6528 | 0.7076 | 0.6435 | 0.7029 | 0.5200 | 0.6337 | 0.6723 | 0.7229 |
FreqNet | 0.7588 | 0.7770 | 0.6097 | 0.6837 | 0.5315 | 0.6402 | 0.5350 | 0.6437 | 0.3847 | 0.5708 | 0.5639 | 0.6631 |
NPR | 0.7941 | 0.7928 | 0.8167 | 0.8184 | 0.7212 | 0.7428 | 0.7343 | 0.7547 | 0.6762 | 0.7136 | 0.7485 | 0.7645 |
UniFD | 0.7745 | 0.7760 | 0.8062 | 0.8192 | 0.7074 | 0.7483 | 0.7109 | 0.7517 | 0.6110 | 0.6906 | 0.7220 | 0.7572 |
RINE | 0.9108 | 0.9098 | 0.8747 | 0.8812 | 0.7343 | 0.7876 | 0.7205 | 0.7678 | 0.5586 | 0.6702 | 0.7598 | 0.8033 |
MVSS-Net | 0.9347 | 0.9365 | 0.7927 | 0.8233 | 0.5985 | 0.7042 | 0.6280 | 0.7213 | 0.2759 | 0.5678 | 0.6460 | 0.7506 |
CAT-Net | 0.9615 | 0.9615 | 0.7932 | 0.8246 | 0.6476 | 0.7334 | 0.6526 | 0.7361 | 0.2266 | 0.5526 | 0.6563 | 0.7616 |
PSCC-Net | 0.9607 | 0.9614 | 0.7685 | 0.8094 | 0.5570 | 0.6881 | 0.5978 | 0.7089 | 0.5177 | 0.6704 | 0.6803 | 0.7676 |
ObjectFormer | 0.7172 | 0.7522 | 0.6679 | 0.7255 | 0.4919 | 0.6292 | 0.4832 | 0.6254 | 0.3792 | 0.5805 | 0.5479 | 0.6626 |
TruFor | 0.9012 | 0.9773 | 0.3593 | 0.5562 | 0.5804 | 0.6641 | 0.5973 | 0.6751 | 0.4912 | 0.6162 | 0.5859 | 0.6978 |
DeCLIP | 0.8068 | 0.7831 | 0.8402 | 0.8277 | 0.7069 | 0.7055 | 0.6993 | 0.6840 | 0.5177 | 0.6561 | 0.7142 | 0.7313 |
IML-ViT | 0.9447 | 0.7573 | 0.6970 | 0.6119 | 0.4098 | 0.4995 | 0.4469 | 0.5125 | 0.1820 | 0.4362 | 0.5361 | 0.5635 |
MaskCLIP | 0.9264 | 0.9272 | 0.8871 | 0.8945 | 0.7802 | 0.8122 | 0.7307 | 0.7801 | 0.5649 | 0.6850 | 0.7779 | 0.8198 |
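The detection scores are image-level: each test image receives a binary real/fake prediction, and F1 (treating fake as the positive class) and accuracy are reported per generator. A minimal sketch, assuming labels encoded as 1 = fake and 0 = real:

```python
def detection_scores(preds, labels, eps: float = 1e-8):
    """Image-level F1 (fake = positive class) and accuracy over a test split."""
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))  # fakes caught
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))  # reals flagged
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))  # fakes missed
    correct = sum(p == y for p, y in zip(preds, labels))
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    acc = correct / len(labels)
    return f1, acc

# Toy split: 3 fakes and 3 reals, with one miss and one false alarm.
labels = [1, 1, 0, 0, 1, 0]
preds  = [1, 0, 0, 1, 1, 0]
f1, acc = detection_scores(preds, labels)  # f1 ≈ 0.6667, acc ≈ 0.6667
```

In practice a model outputs a fakeness score per image that is thresholded (commonly at 0.5) to obtain the binary predictions above.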
The following qualitative examples compare MaskCLIP with other methods on the OpenSDID dataset, showcasing detection and localization performance on images generated by different diffusion models.
SD1.5 Result Example
SD2 Result Example
SDXL Result Example
SD3 Result Example
Flux.1 Result Example
The OpenSDID dataset and the code for MaskCLIP are open-sourced on GitHub:
If you use the OpenSDID dataset or MaskCLIP model in your research, please cite our paper:
@article{wang2024opensdi,
  title={OpenSDI: Spotting Diffusion-Generated Images in the Open World},
  author={Wang, Yabin and Huang, Zhiwu and Hong, Xiaopeng},
  journal={arXiv preprint arXiv},
  year={2024}
}