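# Source reconstructed from the compiled bytecode of callbacks/fix_nans.py.
import logging

from pytorch_lightning.callbacks import Callback
import torch

log = logging.getLogger(__name__)

# Assumption: the threshold constant is not readable in the bytecode dump, so
# this name and its value (5) are hypothetical placeholders.
MAX_CONTINUOUS_NAN_BATCHS = 5


class FixNANinGrad(Callback):
    """Zero out non-finite gradients and stop training on repeated NaN metrics."""

    def __init__(self, monitor):
        super().__init__()
        # Sequence of metric names; the first one found in the logs is checked.
        self.monitor = monitor
        self.continuous_nan_batchs = 0

    def on_before_optimizer_step(self, trainer, pl_module, optimizer) -> None:
        has_nan = []
        is_inf = []
        for name, param in pl_module.named_parameters():
            if param.grad is not None:
                if torch.isnan(param.grad).any():
                    has_nan.append(name)
                if torch.isinf(param.grad).any():
                    is_inf.append(name)
                # Replace NaN and +/-Inf entries with 0, in place.
                torch.nan_to_num(param.grad, nan=0, posinf=0, neginf=0, out=param.grad)
        if len(has_nan) > 0:
            print(f"Found NaN in {has_nan}")
        if len(is_inf) > 0:
            print(f"Found Inf in {is_inf}")

    def on_train_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx
    ) -> None:
        logs = trainer.callback_metrics
        # Find the first monitored metric that was actually logged.
        i = 0
        found_metric = False
        while i < len(self.monitor) and not found_metric:
            if self.monitor[i] in logs.keys():
                current = logs[self.monitor[i]].squeeze()
                found_metric = True
            else:
                i += 1
        if not found_metric:
            raise ValueError("Asked metric not in logs")

        if not torch.isfinite(current):
            self.continuous_nan_batchs += 1
            if self.continuous_nan_batchs >= MAX_CONTINUOUS_NAN_BATCHS:
                trainer.should_stop = True
                # The compiled message held a literal "{self.monitor}" (missing
                # f-prefix); the value is now interpolated.
                log.info("Training interrupted because of NaN in %s", self.monitor)
        else:
            self.continuous_nan_batchs = 0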