license: mit
title: INP-Former -- Zero-shot Anomaly Detection
sdk: gradio
sdk_version: 4.44.1
emoji: 🚀
colorFrom: red
colorTo: purple
pinned: false
app_file: Zero_Shot_App.py
short_description: Zero-shot Anomaly Detection!
INP-Former (CVPR2025)
The official implementation of “Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection”
🚀 Coming Soon!
This repo was created by Wei Luo.
Abstract
Anomaly detection (AD) is essential for industrial inspection, yet existing methods typically rely on "comparing" test images to normal references from a training set. However, variations in appearance and positioning often complicate the alignment of these references with the test image, limiting detection accuracy. We observe that most anomalies manifest as local variations, meaning that even within anomalous images, valuable normal information remains. We argue that this information is useful and may be more aligned with the anomalies since both the anomalies and the normal information originate from the same image. Therefore, rather than relying on external normality from the training set, we propose INP-Former, a novel method that extracts Intrinsic Normal Prototypes (INPs) directly from the test image. Specifically, we introduce the INP Extractor, which linearly combines normal tokens to represent INPs. We further propose an INP Coherence Loss to ensure INPs can faithfully represent normality for the testing image. These INPs then guide the INP-Guided Decoder to reconstruct only normal tokens, with reconstruction errors serving as anomaly scores. Additionally, we propose a Soft Mining Loss to prioritize hard-to-optimize samples during training. INP-Former achieves state-of-the-art performance in single-class, multi-class, and few-shot AD tasks across MVTec-AD, VisA, and Real-IAD, positioning it as a versatile and universal solution for AD. Remarkably, INP-Former also demonstrates some zero-shot AD capability.
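Two ideas from the abstract can be illustrated with a toy example: prototypes formed as linear combinations of an image's own tokens, and per-token reconstruction error used as an anomaly score. The sketch below is not the repository's implementation. It assumes softmax attention weights for the combination and cosine distance for the error, uses plain Python lists instead of tensors, and all function names are hypothetical. The `reconstructed` argument stands in for the output of the INP-Guided Decoder, which is not modeled here.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def extract_prototypes(queries, tokens):
    """Build one prototype per query as a weighted sum of the image tokens.

    Assumption: the weights come from softmax(query . token), so each
    prototype is a convex (hence linear) combination of the tokens.
    """
    dim = len(tokens[0])
    prototypes = []
    for query in queries:
        weights = softmax([dot(query, token) for token in tokens])
        prototypes.append([
            sum(w * token[d] for w, token in zip(weights, tokens))
            for d in range(dim)
        ])
    return prototypes


def anomaly_scores(tokens, reconstructed, eps=1e-8):
    """Per-token cosine distance between original and reconstructed features."""
    scores = []
    for original, rebuilt in zip(tokens, reconstructed):
        norm = math.sqrt(dot(original, original)) * math.sqrt(dot(rebuilt, rebuilt))
        scores.append(1.0 - dot(original, rebuilt) / (norm + eps))
    return scores


if __name__ == "__main__":
    # Two "normal" tokens and one token that points in a different direction.
    tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(extract_prototypes([[1.0, 0.0]], tokens))
    # Pretend the decoder rebuilt every token as the normal pattern.
    print(anomaly_scores(tokens, [[1.0, 0.0]] * 3))
```

In this toy setup the third token differs from its reconstruction, so it receives the highest score, which mirrors the intuition that a decoder restricted to normal patterns reconstructs anomalous regions poorly.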
Overview
Install Environments
Create a new conda environment and install required packages.
conda create -n INP python=3.8.12
conda activate INP
pip install -r requirements.txt
pip install gradio # Optional, for Zero Shot App
Experiments were conducted on an NVIDIA GeForce RTX 4090 (24 GB). The same GPU and package versions are recommended.
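After installing, a quick sanity check of the environment can save a failed run later. The sketch below only reports the Python version and whether a few packages are importable. The package names `torch` and `torchvision` are assumptions (the contents of requirements.txt are not listed here), and the function name is hypothetical.

```python
import importlib.util
import sys


def environment_report(packages=("torch", "torchvision", "gradio")):
    """Return the Python version and, per package, whether it can be imported."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```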
Prepare Datasets
Note that ../ is the parent directory of the INP-Former code. It is where we keep all the datasets by default. You can change this location as needed; just remember to modify data_path in the code.
MVTec AD
Download the MVTec-AD dataset from URL.
Unzip the file to ../mvtec_anomaly_detection.
|-- mvtec_anomaly_detection
    |-- bottle
    |-- cable
    |-- capsule
    |-- ....
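To confirm that the archive was unpacked where the scripts expect it, one option is to list the category folders under the dataset root. This is a minimal sketch using only the standard library; the function name is hypothetical and it checks nothing beyond the top-level folders.

```python
from pathlib import Path


def list_categories(data_path):
    """Return the sorted names of the sub-directories directly under data_path."""
    root = Path(data_path)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset root not found: {root}")
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    try:
        print(list_categories("../mvtec_anomaly_detection"))
    except FileNotFoundError as err:
        print(err)
```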
VisA
Download the VisA dataset from URL.
Unzip the file to ../VisA/. Preprocess the dataset to ../VisA_pytorch/ in 1-class mode using the official splitting code. ../VisA_pytorch will look like:
|-- VisA_pytorch
    |-- 1cls
        |-- candle
            |-- ground_truth
            |-- test
                |-- good
                |-- bad
            |-- train
                |-- good
        |-- capsules
        |-- ....
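Because the VisA layout depends on a preprocessing step, it can help to verify the result before training. The sketch below reports, for each category under the 1cls folder, which of the sub-folders from the layout listed above are missing. The function and constant names are hypothetical.

```python
from pathlib import Path

# Sub-folders each category should contain, per the VisA_pytorch layout.
EXPECTED_SUBDIRS = ("ground_truth", "test/good", "test/bad", "train/good")


def missing_visa_dirs(data_path):
    """Map each category under data_path to the expected sub-folders it lacks."""
    root = Path(data_path)
    problems = {}
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = [sub for sub in EXPECTED_SUBDIRS if not (category / sub).is_dir()]
        if missing:
            problems[category.name] = missing
    return problems
```

Calling `missing_visa_dirs("../VisA_pytorch/1cls")` returns an empty dictionary when every category has the expected sub-folders.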
Real-IAD
Contact the authors of Real-IAD (URL) to get the cloud-drive download link.
Download and unzip realiad_1024 and realiad_jsons in ../Real-IAD. ../Real-IAD will look like:
|-- Real-IAD
    |-- realiad_1024
        |-- audiojack
        |-- bottle_cap
        |-- ....
    |-- realiad_jsons
        |-- realiad_jsons
        |-- realiad_jsons_sv
        |-- realiad_jsons_fuiad_0.0
        |-- ....
Experiments
Checkpoints
We provide the weights trained under multi-class, few-shot, and super multi-class settings. All downloaded weights are named model.pth. Please place them in the corresponding folder, such as: saved_results/INP-Former-Multi-Class_dataset=MVTec-AD_Encoder=dinov2reg_vit_base_14_Resize=448_Crop=392_INP_num=6. You can run the corresponding Python script to generate the appropriate directory.
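As an illustration of where a downloaded model.pth goes, the sketch below composes a folder name following the single example given above and copies the file into it. The naming pattern for settings other than Multi-Class is an assumption inferred from that one example, so running the corresponding script remains the reliable way to create the directory. Both function names are hypothetical.

```python
import shutil
from pathlib import Path


def checkpoint_dir(setting, dataset, encoder="dinov2reg_vit_base_14",
                   resize=448, crop=392, inp_num=6, root="saved_results"):
    """Compose the results folder, mirroring the example folder name."""
    name = (f"INP-Former-{setting}_dataset={dataset}_Encoder={encoder}"
            f"_Resize={resize}_Crop={crop}_INP_num={inp_num}")
    return Path(root) / name


def install_checkpoint(downloaded, target_dir):
    """Copy a downloaded weight file to target_dir/model.pth, creating folders."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(downloaded, target_dir / "model.pth"))


if __name__ == "__main__":
    print(checkpoint_dir("Multi-Class", "MVTec-AD"))
```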
Setting | Input Size | MVTec-AD | VisA | Real-IAD |
---|---|---|---|---|
Multi-Class | R448²-C392² | model | model | model |
Few-shot-4 | R448²-C392² | model | model | model |
Few-shot-2 | R448²-C392² | model | model | model |
Few-shot-1 | R448²-C392² | model | model | model |
Super-Multi-Class | R448²-C392² | model |
Options
dataset: name of the dataset: MVTec-AD, VisA, or Real-IAD
data_path: path to the dataset
encoder: name of the pretrained encoder
input_size: size of the image after resizing
crop_size: size of the image after center cropping
INP_num: number of INPs
total_epochs: number of training epochs
batch_size: batch size
phase: mode, train or test
shot: number of samples per class in the few-shot setting
source_dataset: name of the dataset used for pre-training in the zero-shot setting
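For readers who want to see how such options fit together, here is a minimal argparse sketch. It is not the parser used by the repository's scripts. The defaults for encoder, input_size, crop_size, and INP_num are taken from the example checkpoint folder name; the remaining defaults are placeholders, and total_epochs and batch_size are left unset because their values are not stated here.

```python
import argparse


def build_parser():
    """Illustrative parser covering the options listed above."""
    parser = argparse.ArgumentParser(description="Illustrative INP-Former options")
    parser.add_argument("--dataset", choices=["MVTec-AD", "VisA", "Real-IAD"],
                        default="MVTec-AD")
    parser.add_argument("--data_path", default="../mvtec_anomaly_detection")
    parser.add_argument("--encoder", default="dinov2reg_vit_base_14")
    parser.add_argument("--input_size", type=int, default=448)
    parser.add_argument("--crop_size", type=int, default=392)
    parser.add_argument("--INP_num", type=int, default=6)
    # Values not stated in this README, so no default is guessed.
    parser.add_argument("--total_epochs", type=int, default=None)
    parser.add_argument("--batch_size", type=int, default=None)
    parser.add_argument("--phase", choices=["train", "test"], default="train")
    parser.add_argument("--shot", type=int, default=4)  # placeholder default
    parser.add_argument("--source_dataset", default="Real-IAD")  # placeholder default
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args([]))
```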
Multi-Class Setting
MVTec-AD
Train:
python INP-Former_Multi_Class.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --phase train
Test:
python INP-Former_Multi_Class.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --phase test
VisA
Train:
python INP-Former_Multi_Class.py --dataset VisA --data_path ../VisA_pytorch/1cls --phase train
Test:
python INP-Former_Multi_Class.py --dataset VisA --data_path ../VisA_pytorch/1cls --phase test
Real-IAD
Train:
python INP-Former_Multi_Class.py --dataset Real-IAD --data_path ../Real-IAD --phase train
Test:
python INP-Former_Multi_Class.py --dataset Real-IAD --data_path ../Real-IAD --phase test
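When running all three datasets back to back, the commands above can be driven from a small script. The sketch below builds the same command lines and, by default, only prints them. The `dry_run` flag and the function names are hypothetical; the script name, flags, and data paths are the ones shown above.

```python
import subprocess
import sys

# Default dataset locations used in the commands above.
DATA_PATHS = {
    "MVTec-AD": "../mvtec_anomaly_detection",
    "VisA": "../VisA_pytorch/1cls",
    "Real-IAD": "../Real-IAD",
}


def multi_class_command(dataset, phase, python=sys.executable):
    """Build the argument list for one multi-class train or test run."""
    return [python, "INP-Former_Multi_Class.py", "--dataset", dataset,
            "--data_path", DATA_PATHS[dataset], "--phase", phase]


def run_all(dry_run=True):
    """Train and then test on every dataset; with dry_run, only print commands."""
    commands = [multi_class_command(dataset, phase)
                for dataset in DATA_PATHS for phase in ("train", "test")]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep as soon as one run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```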
Few-Shot Setting
MVTec-AD
Train:
python INP-Former_Few_Shot.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --shot 4 --phase train
Test:
python INP-Former_Few_Shot.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --shot 4 --phase test
VisA
Train:
python INP-Former_Few_Shot.py --dataset VisA --data_path ../VisA_pytorch/1cls --shot 4 --phase train
Test:
python INP-Former_Few_Shot.py --dataset VisA --data_path ../VisA_pytorch/1cls --shot 4 --phase test
Real-IAD
Train:
python INP-Former_Few_Shot.py --dataset Real-IAD --data_path ../Real-IAD --shot 4 --phase train
Test:
python INP-Former_Few_Shot.py --dataset Real-IAD --data_path ../Real-IAD --shot 4 --phase test
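The --shot value is the number of normal samples available per class. As a rough illustration of what a k-shot subset means, the sketch below draws k image paths per category with a fixed seed so the selection is repeatable. How the repository actually selects its few-shot images is not described here, so the function name, the seed, and the input format (a dictionary mapping category names to lists of paths) are all assumptions.

```python
import random


def sample_few_shot(images_by_class, shot, seed=0):
    """Pick `shot` items per class, deterministically for a given seed."""
    rng = random.Random(seed)
    subset = {}
    for name in sorted(images_by_class):
        paths = sorted(images_by_class[name])
        if len(paths) < shot:
            raise ValueError(f"class {name!r} has only {len(paths)} images")
        subset[name] = rng.sample(paths, shot)
    return subset


if __name__ == "__main__":
    demo = {"bottle": [f"bottle/{i:03d}.png" for i in range(10)],
            "cable": [f"cable/{i:03d}.png" for i in range(10)]}
    print(sample_few_shot(demo, shot=4))
```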
Single-Class Setting
MVTec-AD
Train:
python INP-Former_Single_Class.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --phase train
Test:
python INP-Former_Single_Class.py --dataset MVTec-AD --data_path ../mvtec_anomaly_detection --phase test
VisA
Train:
python INP-Former_Single_Class.py --dataset VisA --data_path ../VisA_pytorch/1cls --phase train
Test:
python INP-Former_Single_Class.py --dataset VisA --data_path ../VisA_pytorch/1cls --phase test
Real-IAD
Train:
python INP-Former_Single_Class.py --dataset Real-IAD --data_path ../Real-IAD --phase train
Test:
python INP-Former_Single_Class.py --dataset Real-IAD --data_path ../Real-IAD --phase test
Zero-Shot Setting
Source dataset: Real-IAD; target dataset: MVTec-AD
python INP-Former_Zero_Shot.py --source_dataset Real-IAD --dataset MVTec-AD --data_path ../mvtec_anomaly_detection
Source dataset: VisA; target dataset: MVTec-AD
python INP-Former_Zero_Shot.py --source_dataset VisA --dataset MVTec-AD --data_path ../mvtec_anomaly_detection
Super-Multi-Class Setting
MVTec-AD+VisA+Real-IAD
Train:
python INP-Former_Super_Multi_Class.py --mvtec_data_path ../mvtec_anomaly_detection --visa_data_path ../VisA_pytorch/1cls --real_iad_data_path ../Real-IAD --phase train
Test:
python INP-Former_Super_Multi_Class.py --mvtec_data_path ../mvtec_anomaly_detection --visa_data_path ../VisA_pytorch/1cls --real_iad_data_path ../Real-IAD --phase test
Results
As with Dinomaly, INP-Former could produce slightly inaccurate metrics when using the GT mask, as discussed in this issue. We have now fixed this problem, so the pixel-level AP and F1-max results obtained from the current code may be slightly lower than the metrics reported in the paper.
Multi-Class Setting
Few-Shot Setting
Single-Class Setting
Zero-Shot Setting
Super-Multi-Class Setting
Zero-Shot App
We have also developed an App for zero-shot anomaly detection. The pre-trained weights are trained on the Real-IAD or VisA dataset under a multi-class setting and applied to zero-shot anomaly detection on the MVTec-AD dataset.
Run App
python Zero_Shot_App.py
Acknowledgements
We sincerely appreciate Dinomaly for its concise, effective, and easy-to-follow approach. We also thank Reg-AD, as the data augmentation techniques used in our few-shot setting were inspired by it. We further acknowledge OneNIP for inspiring our super-multi-class experiments. Additionally, we would like to thank AdaCLIP for providing inspiration for our zero-shot App.
Citation
If our work is helpful for your research, please consider citing:
@article{luo2025exploring,
title={Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection},
author={Luo, Wei and Cao, Yunkang and Yao, Haiming and Zhang, Xiaotian and Lou, Jianan and Cheng, Yuqi and Shen, Weiming and Yu, Wenyong},
journal={arXiv preprint arXiv:2503.02424},
year={2025}
}
Contact
If you have any questions about our work, please do not hesitate to contact luow23@mails.tsinghua.edu.cn.