Soil temperature representation in AIFS v1 and precipitation results
The AIFS v1 model is trained on two datasets: ERA5 and IFS (operational analysis). The latter is used for fine-tuning. Soil temperature is represented differently in the two: the ERA5 dataset masks the oceans, while the IFS data use interpolation to fill in soil temperature over the oceans. We find that this difference has a large effect on the precipitation results. What should I make of it? Which representation of soil temperature should I use for inference?
Hi! Neither the ERA5 nor the IFS-Operations dataset that AIFS v1 is trained on has the oceans masked for soil temperature. Both datasets carry some notion of sea-surface temperature over the oceans in the soil temperature fields. For inference, if you grab the initial data from the open-data archive, it should have the correct representation (matching what the model was trained on). There is an example of grabbing this data in the notebook here: https://huggingface.co/ecmwf/aifs-single-1.0/blob/main/run_AIFS_v1.ipynb
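If you want to double-check your own initial conditions, one quick sanity test is to count missing values in the soil temperature field before running inference. Here is a minimal sketch, assuming a mask would show up as NaN (or None) once the field has been loaded into a flat sequence of values. The function name `missing_fraction` and the toy values are made up for illustration and are not part of any AIFS tooling.

```python
import math


def missing_fraction(values):
    """Return the fraction of entries that are missing (None or NaN)."""
    values = list(values)
    if not values:
        raise ValueError("field is empty")
    missing = sum(1 for v in values if v is None or math.isnan(v))
    return missing / len(values)


# Hypothetical toy fields in kelvin; real fields would come from your own data.
masked_field = [285.2, float("nan"), float("nan"), 290.1]  # ocean points masked
filled_field = [285.2, 288.7, 289.3, 290.1]                # ocean points filled

print(missing_fraction(masked_field))
print(missing_fraction(filled_field))
```

A non-zero fraction would suggest the field is masked somewhere and does not match a fully filled representation.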
Hope this helps! :)