from abc import ABC, abstractmethod

import numpy as np
import torch as th
import torch.distributed as dist


def create_named_schedule_sampler(name, diffusion):
    """
    Create a ScheduleSampler from a library of pre-defined samplers.

    :param name: the name of the sampler.
    :param diffusion: the diffusion object to sample for.
    """
    if name == "uniform":
        # UniformSampler only needs the number of timesteps.
        return UniformSampler(diffusion.num_timesteps)
    elif name == "loss-second-moment":
        return LossSecondMomentResampler(diffusion)
    else:
        raise NotImplementedError(f"unknown schedule sampler: {name}")


class ScheduleSampler(ABC):
    """
    A distribution over timesteps in the diffusion process, intended to reduce
    variance of the objective.

    By default, samplers perform unbiased importance sampling, in which the
    objective's mean is unchanged.
    However, subclasses may override sample() to change how the resampled
    terms are reweighted, allowing for actual changes in the objective.
    """

    @abstractmethod
    def weights(self):
        """
        Get a numpy array of weights, one per diffusion step.

        The weights needn't be normalized, but must be positive.
        """

    def sample(self, batch_size, device):
        """
        Importance-sample timesteps for a batch.

        :param batch_size: the number of timesteps.
        :param device: the torch device to save to.
        :return: a tuple (timesteps, weights):
                 - timesteps: a tensor of timestep indices.
                 - weights: a tensor of weights to scale the resulting losses.
        """
        w = self.weights()
        p = w / np.sum(w)
        indices_np = np.random.choice(len(p), size=(batch_size,), p=p)
        indices = th.from_numpy(indices_np).long().to(device)
        # Importance weights 1 / (N * p) keep the expected loss unchanged.
        weights_np = 1 / (len(p) * p[indices_np])
        weights = th.from_numpy(weights_np).float().to(device)
        return indices, weights


class UniformSampler:
    """
    Sample timesteps uniformly; every timestep gets unit weight, so no
    importance-sampling correction is needed.
    """

    def __init__(self, num_timesteps):
        self.num_timesteps = num_timesteps
        self._weights = th.ones([num_timesteps])

    def sample(self, batch_size, device, use_fp16=False):
        indices = th.randint(0, self.num_timesteps, (batch_size,), device=device)
        if use_fp16:
            weights = th.ones_like(indices).half()
        else:
            weights = th.ones_like(indices).float()
        return indices, weights


class LossAwareSampler(ScheduleSampler):
    def update_with_local_losses(self, local_ts, local_losses):
        """
        Update the reweighting using losses from a model.

        Call this method from each rank with a batch of timesteps and the
        corresponding losses for each of those timesteps.
        This method will perform synchronization to make sure all of the ranks
        maintain the exact same reweighting.

        :param local_ts: an integer Tensor of timesteps.
        :param local_losses: a 1D Tensor of losses.
        """
        batch_sizes = [
            th.tensor([0], dtype=th.int32, device=local_ts.device)
            for _ in range(dist.get_world_size())
        ]
        dist.all_gather(
            batch_sizes,
            th.tensor([len(local_ts)], dtype=th.int32, device=local_ts.device),
        )

        # Pad all_gather batches to be the maximum batch size.
        batch_sizes = [x.item() for x in batch_sizes]
        max_bs = max(batch_sizes)

        timestep_batches = [th.zeros(max_bs).to(local_ts) for bs in batch_sizes]
        loss_batches = [th.zeros(max_bs).to(local_losses) for bs in batch_sizes]
        dist.all_gather(timestep_batches, local_ts)
        dist.all_gather(loss_batches, local_losses)
        timesteps = [
            x.item() for y, bs in zip(timestep_batches, batch_sizes) for x in y[:bs]
        ]
        losses = [x.item() for y, bs in zip(loss_batches, batch_sizes) for x in y[:bs]]
        self.update_with_all_losses(timesteps, losses)

    @abstractmethod
    def update_with_all_losses(self, ts, losses):
        """
        Update the reweighting using losses from a model.

        Sub-classes should override this method to update the reweighting
        using losses from the model.

        This method directly updates the reweighting without synchronizing
        between workers. It is called by update_with_local_losses from all
        ranks with identical arguments. Thus, it should have deterministic
        behavior to maintain state across workers.

        :param ts: a list of int timesteps.
        :param losses: a list of float losses, one per timestep.
        """
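
# Illustrative fragment (an addition, not part of the original module): a
# sketch of how a distributed training loop typically drives the
# LossAwareSampler interface above. `compute_losses` and `model_losses` are
# hypothetical names standing in for the real per-timestep loss computation.
#
#     t, weights = sampler.sample(batch.shape[0], device)
#     model_losses = compute_losses(model, batch, t)   # 1D: one loss per t
#     if isinstance(sampler, LossAwareSampler):
#         sampler.update_with_local_losses(t, model_losses.detach())
#     loss = (model_losses * weights).mean()
#
# The isinstance() guard mirrors the pattern used in OpenAI's
# improved-diffusion training loop, from which this module appears to derive.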


class LossSecondMomentResampler(LossAwareSampler):
    def __init__(self, diffusion, history_per_term=10, uniform_prob=0.001):
        self.diffusion = diffusion
        self.history_per_term = history_per_term
        self.uniform_prob = uniform_prob
        self._loss_history = np.zeros(
            [diffusion.num_timesteps, history_per_term], dtype=np.float64
        )
        self._loss_counts = np.zeros([diffusion.num_timesteps], dtype=np.int64)

    def weights(self):
        if not self._warmed_up():
            return np.ones([self.diffusion.num_timesteps], dtype=np.float64)
        # Weight each timestep by the square root of its mean squared loss,
        # mixed with a small uniform probability for exploration.
        weights = np.sqrt(np.mean(self._loss_history ** 2, axis=-1))
        weights /= np.sum(weights)
        weights *= 1 - self.uniform_prob
        weights += self.uniform_prob / len(weights)
        return weights

    def update_with_all_losses(self, ts, losses):
        for t, loss in zip(ts, losses):
            if self._loss_counts[t] == self.history_per_term:
                # Shift out the oldest loss term.
                self._loss_history[t, :-1] = self._loss_history[t, 1:]
                self._loss_history[t, -1] = loss
            else:
                self._loss_history[t, self._loss_counts[t]] = loss
                self._loss_counts[t] += 1

    def _warmed_up(self):
        return (self._loss_counts == self.history_per_term).all()
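

# ---------------------------------------------------------------------------
# Minimal usage sketch (an illustrative addition, not part of the original
# module). `_DummyDiffusion` is a hypothetical stand-in for a diffusion object
# exposing `num_timesteps`. In a single process we feed losses back through
# update_with_all_losses(); update_with_local_losses() additionally requires
# an initialized torch.distributed process group.
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    class _DummyDiffusion:
        num_timesteps = 1000

    sampler = create_named_schedule_sampler("loss-second-moment", _DummyDiffusion())

    # Each training step: draw timesteps plus importance weights, compute the
    # per-timestep losses, and report them back so the sampler can reweight.
    timesteps, weights = sampler.sample(batch_size=4, device="cpu")
    losses = th.rand(4)  # stand-in for the real per-timestep training losses
    sampler.update_with_all_losses(timesteps.tolist(), losses.tolist())
    print("timesteps:", timesteps.tolist(), "weights:", weights.tolist())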