Hi, these are the model weights I used to achieve 2nd place in the FathomNet 2025 Competition.
My approach uses a two-stream hierarchical classifier. EfficientNet-V2 backbones (the M and S variants) extract features from both a detailed Region of Interest (ROI) and the full contextual image. These features are fused using an uncertainty-guided gating mechanism before being fed to a classifier head. I trained with 5-fold cross-validation using PyTorch Lightning, the AdamW optimizer, and a cosine annealing scheduler. Early stopping and model checkpointing handled model selection for each fold, and each model's predictions were enhanced with 2-view rotational Test-Time Augmentation. The final probability was computed using a weighted geometric mean.
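To make the last step concrete, here is a minimal sketch of a weighted geometric mean over per-model class probabilities. It is an illustration, not the code from my repo: the function name `weighted_geometric_mean`, the example weights, the `eps` clamp, and the final renormalization are all assumptions made for this example.

```python
import math


def weighted_geometric_mean(prob_sets, weights, eps=1e-12):
    """Fuse per-model class probabilities with a weighted geometric mean.

    prob_sets: list of probability vectors, one per model (hypothetical input).
    weights:   one non-negative weight per model (hypothetical values).
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(prob_sets[0])
    log_mix = [0.0] * n_classes
    for probs, w in zip(prob_sets, weights):
        for c, p in enumerate(probs):
            # Work in log space; clamp to eps so log(0) never occurs.
            log_mix[c] += (w / total_w) * math.log(max(p, eps))

    fused = [math.exp(v) for v in log_mix]
    # Assumption: renormalize so the fused scores sum to 1 again.
    norm = sum(fused)
    return [v / norm for v in fused]


# Two hypothetical models scoring three classes.
model_a = [0.7, 0.2, 0.1]
model_b = [0.5, 0.4, 0.1]
fused = weighted_geometric_mean([model_a, model_b], weights=[2.0, 1.0])
```

Compared with an arithmetic mean, the geometric mean penalizes a class more strongly when any single model assigns it a low probability.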
For more information and a detailed explanation, please check out my GitHub repo: https://github.com/kidshock/fathomnet-2025-2ndplace
For inference, download the model weights from the kfold_checkpoints folder here, then run inference_only.ipynb from my GitHub repo to generate the submission CSV file.
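If you prefer to fetch the checkpoints from a script, the following is one possible sketch using `snapshot_download` from the `huggingface_hub` package. It assumes the package is installed, that this model page's repository ID is `FathomNet/Fathomnet-2025-2ndplace-ensemble-EffNetv2`, and that the weights sit under `kfold_checkpoints/`. The helper name `download_checkpoints` and the `local_dir` default are hypothetical.

```python
# Assumed repository ID for this model page.
REPO_ID = "FathomNet/Fathomnet-2025-2ndplace-ensemble-EffNetv2"


def download_checkpoints(local_dir="fathomnet_weights"):
    """Download only the kfold_checkpoints folder and return the local path."""
    # Imported here so the helper can be defined without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=["kfold_checkpoints/*"],  # skip everything else in the repo
        local_dir=local_dir,
    )


if __name__ == "__main__":
    path = download_checkpoints()
    print(f"Checkpoints downloaded under: {path}")
```

After the download finishes, point inference_only.ipynb at the resulting kfold_checkpoints folder.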
If you have any questions, please contact me at long.vu2124@gmail.com. I will gladly answer them all.
Model tree for FathomNet/Fathomnet-2025-2ndplace-ensemble-EffNetv2: base model timm/tf_efficientnetv2_m.in21k