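# callbacks/ema.py
# Source reconstructed from the compiled bytecode of this module. Names, call
# order, and defaults come from the bytecode. Three details are not readable
# there and are assumptions: which object `global_step` is read from, the
# `<` / `>` comparisons in on_train_batch_end, and the warm-up decay of 0.9.
from pytorch_lightning import Callback
import copy
import itertools
import torch
import contextlib
from torch.distributed.fsdp import FullyShardedDataParallel


class EMACallback(Callback):
    # Maintains an exponential moving average (EMA) copy of a submodule.
    def __init__(
        self,
        module_attr_name,
        ema_module_attr_name,
        decay=0.999,
        start_ema_step=0,
        init_ema_random=True,
    ):
        super().__init__()
        self.decay = decay
        self.module_attr_name = module_attr_name
        self.ema_module_attr_name = ema_module_attr_name
        self.start_ema_step = start_ema_step
        self.init_ema_random = init_ema_random

    def on_train_start(self, trainer, pl_module):
        # Only set up the EMA module on a fresh run (step 0), not on resume.
        if pl_module.global_step == 0:
            if not hasattr(pl_module, self.module_attr_name):
                raise ValueError(
                    f"Module {pl_module} does not have attribute {self.module_attr_name}"
                )
            if not hasattr(pl_module, self.ema_module_attr_name):
                # Frozen, eval-mode deep copy of the tracked module.
                pl_module.add_module(
                    self.ema_module_attr_name,
                    copy.deepcopy(getattr(pl_module, self.module_attr_name))
                    .eval()
                    .requires_grad_(False),
                )
            self.reset_ema(pl_module)

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        if pl_module.global_step == self.start_ema_step:
            # EMA tracking starts here: re-initialise the EMA weights.
            self.reset_ema(pl_module)
        elif (
            pl_module.global_step < self.start_ema_step
            and pl_module.global_step % 100 == 0
        ):
            # Before the start step, update only every 100 steps.
            self.update_ema(pl_module, decay=0.9)
        elif pl_module.global_step > self.start_ema_step:
            self.update_ema(pl_module, decay=self.decay)

    def update_ema(self, pl_module, decay=0.999):
        ema_module = getattr(pl_module, self.ema_module_attr_name)
        module = getattr(pl_module, self.module_attr_name)
        context_manager = self.get_model_context_manager(module)
        with context_manager:
            with torch.no_grad():
                ema_params = ema_module.state_dict()
                for name, param in itertools.chain(
                    module.named_parameters(), module.named_buffers()
                ):
                    if name in ema_params and param.requires_grad:
                        # Argument order as found in the bytecode:
                        # ema <- ema.lerp(param, decay)
                        ema_params[name].copy_(
                            ema_params[name].detach().lerp(param.detach(), decay)
                        )

    def get_model_context_manager(self, module):
        # Under FSDP, gather the full (unsharded) parameters before reading them.
        fsdp_enabled = is_model_fsdp(module)
        model_context_manager = contextlib.nullcontext()
        if fsdp_enabled:
            model_context_manager = module.summon_full_params(module)
        return model_context_manager

    def reset_ema(self, pl_module):
        ema_module = getattr(pl_module, self.ema_module_attr_name)
        if self.init_ema_random:
            # Random re-initialisation; the module must define init_weights().
            ema_module.init_weights()
        else:
            # Otherwise copy the current weights of the tracked module.
            module = getattr(pl_module, self.module_attr_name)
            context_manager = self.get_model_context_manager(module)
            with context_manager:
                ema_params = ema_module.state_dict()
                for name, param in itertools.chain(
                    module.named_parameters(), module.named_buffers()
                ):
                    if name in ema_params:
                        ema_params[name].copy_(param.detach())


def is_model_fsdp(model: torch.nn.Module) -> bool:
    # True if the model or any direct child is wrapped in FullyShardedDataParallel.
    try:
        if isinstance(model, FullyShardedDataParallel):
            return True
        for _, obj in model.named_children():
            if isinstance(obj, FullyShardedDataParallel):
                return True
        return False
    except ImportError:
        return False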