🌌 twins_svt-gravit-b3

🔭 This model is part of GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery

🔗 GitHub Repository: https://github.com/parlange/gravit

πŸ›°οΈ Model Details

  • πŸ€– Model Type: Twins_SVT
  • πŸ§ͺ Experiment: B3 - J24-all-blocks
  • 🌌 Dataset: J24
  • πŸͺ Fine-tuning Strategy: all-blocks

💻 Quick Start

import torch
import timm

# Load the model directly from the Hub
model = timm.create_model(
    'hf-hub:parlange/twins_svt-gravit-b3',
    pretrained=True
)
model.eval()

# Example inference
dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    output = model(dummy_input)
    predictions = torch.softmax(output, dim=1)
print(f"Lens probability: {predictions[0][1]:.4f}")
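
For more than one image, the Quick Start pattern can be wrapped in a small helper. The sketch below is only an illustration: `lens_probabilities` is a hypothetical helper name, the 0.5 threshold is an arbitrary choice, and the stand-in model exists only so the snippet runs without downloading weights. It assumes, as the Quick Start does, that class index 1 is the lens class. Preprocessing (normalization, band selection) is not covered and should match whatever was used in training.

```python
import torch
from torch import nn


def lens_probabilities(model: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Return P(lens) for a batch shaped (N, 3, 224, 224).

    Assumes the model outputs two logits per image and that index 1
    is the lens class, mirroring the Quick Start snippet.
    """
    model.eval()
    with torch.no_grad():
        logits = model(images)
        return torch.softmax(logits, dim=1)[:, 1]


# Stand-in classifier so the sketch runs offline; replace it with the
# timm model loaded in the Quick Start.
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))

batch = torch.randn(4, 3, 224, 224)
probs = lens_probabilities(stand_in, batch)
candidates = probs > 0.5  # hypothetical decision threshold
print(probs.shape, int(candidates.sum()))
```

In practice the threshold would be tuned on a validation set rather than fixed at 0.5.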

⚡️ Training Configuration

Training Dataset: J24 (Jaelani et al. 2024)
Fine-tuning Strategy: all-blocks

🔧 Parameter                    Value
Batch Size                      192
Learning Rate                   AdamW with ReduceLROnPlateau
Epochs                          100
Patience                        10
Optimizer                       AdamW
Scheduler                       ReduceLROnPlateau
Image Size                      224x224
Fine Tune Mode                  all_blocks
Stochastic Depth Probability    0.1
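
The optimizer and scheduler named in the table can be wired together roughly as follows. This is a minimal sketch, not the GraViT training script: the initial learning rate, the reduction factor, the scheduler patience, and the reading of "Patience 10" as an early-stopping window are all assumptions, and a tiny linear model with synthetic data and synthetic validation losses stands in for Twins-SVT and J24.

```python
import torch
from torch import nn

model = nn.Linear(8, 2)  # stand-in for the fine-tuned network
criterion = nn.CrossEntropyLoss()

# Assumed values: the card does not state the initial learning rate,
# the reduction factor, or the scheduler patience.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)

MAX_EPOCHS = 100          # "Epochs" in the table
EARLY_STOP_PATIENCE = 10  # "Patience" in the table, read here as early stopping

# Synthetic validation losses: improve for 5 epochs, then get worse and plateau.
val_losses = [1.0, 0.9, 0.8, 0.7, 0.6] + [0.7] * MAX_EPOCHS

best_loss = float("inf")
epochs_without_improvement = 0
stopped_at = None

for epoch in range(MAX_EPOCHS):
    # One toy optimization step; a real loop would iterate over batches of 192.
    inputs = torch.randn(16, 8)
    targets = torch.randint(0, 2, (16,))
    optimizer.zero_grad()
    criterion(model(inputs), targets).backward()
    optimizer.step()

    val_loss = val_losses[epoch]
    scheduler.step(val_loss)  # lowers the LR once val_loss stops improving

    if val_loss < best_loss:
        best_loss = val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= EARLY_STOP_PATIENCE:
            stopped_at = epoch
            break

final_lr = optimizer.param_groups[0]["lr"]
print(f"stopped at epoch {stopped_at}, final lr {final_lr:.2e}")
```

Monitoring validation loss with `mode="min"` is one reasonable choice; monitoring a score such as AUC-ROC with `mode="max"` would work the same way.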

📈 Training Curves

[Figure: Combined Training Metrics]

🏁 Final Epoch Training Metrics

Metric         Training    Validation
📉 Loss        0.0143      0.0596
🎯 Accuracy    0.9949      0.9871
📊 AUC-ROC     0.9998      0.9980
⚖️ F1 Score    0.9949      0.9871

β˜‘οΈ Evaluation Results

ROC Curves and Confusion Matrices

Performance across all test datasets (a through l) in the Common Test Sample (More et al. 2024):

[Figures: ROC + Confusion Matrix for each of Datasets A through L]

📋 Performance Summary

Average performance across 12 test datasets from the Common Test Sample (More et al. 2024):

Metric                 Value
🎯 Average Accuracy    0.8178
📈 Average AUC-ROC     0.8050
⚖️ Average F1-Score    0.5157

📘 Citation

If you use this model in your research, please cite:

@misc{parlange2025gravit,
      title={GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery}, 
      author={René Parlange and Juan C. Cuevas-Tello and Octavio Valenzuela and Omar de J. Cabrera-Rosas and Tomás Verdugo and Anupreeta More and Anton T. Jaelani},
      year={2025},
      eprint={2509.00226},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.00226}, 
}

Model Card Contact

For questions about this model, please contact the author through: https://github.com/parlange/
