B Ckd_>@sNddlZddlmmZddlmZddlmZmZmZddl m Z m Z m Z ddl ZddlmZmZddlTddlZddlmZdZd%d d Zd&d dZGdddejjZGdddejjZGdddejjZGdddejjZGdddejjZGdddejZGdddejZ ddZ!dd Z"d!d"Z#Gd#d$d$e$Z%dS)'N)Conv1dConvTranspose1dConv2d) weight_normremove_weight_norm spectral_norm)Snake SnakeBeta)*) OmegaConfg?{Gz?cCs*|jj}|ddkr&|jj||dS)NZConv) __class____name__findweightdatanormal_)mmeanstd classnamerU/mnt/bn/ailabrenyi/entries/huangjiawei/text_to_audio/整理/vocoder/bigvgan/models.py init_weightssrcCst|||dS)N)int) kernel_sizedilationrrr get_paddingsr!cs.eZdZd fdd ZddZdd ZZS) AMPBlock1rr#Ncstt||_ttt|d|dt||ddtt|d|dt||ddtt|d|dt||ddg|_ |j t ttt|ddt|ddtt|ddt|ddtt|ddt|ddg|_ |j t t |j t |j |_|dkrHtfddt|jD|_n6|dkrvtfd dt|jD|_ntd dS) Nrr)r paddingrsnakecs g|]}ttjddqS))alpha_logscale) activation) Activation1dr snake_logscale).0_)channelshrr ;sz&Block1.__init__.. snakebetacs g|]}ttjddqS))r()r))r*r r+)r,r-)r.r/rrr0AszRactivation incorrectly specified. check the config file and look for 'activation'.)superr"__init__r/nn ModuleListrrr!convs1applyrconvs2len num_layersrange activationsNotImplementedError)selfr/r.rr r))r)r.r/rr3s8      zAMPBlock1.__init__c Csr|jddd|jddd}}xJt|j|j||D]4\}}}}||}||}||}||}||}q6W|S)Nrr)r<zipr6r8) r>xZacts1Zacts2c1c2a1a2xtrrrforwardHs"  zAMPBlock1.forwardcCs4x|jD] }t|qWx|jD] }t|q WdS)N)r6rr8)r>lrrrrSs   zAMPBlock1.remove_weight_norm)r#r$N)r __module__ __qualname__r3rFr __classcell__rr)rrr"s) r"cs.eZdZd fdd ZddZdd ZZS) AMPBlock2r#rr#Ncstt||_ttt|d|dt||ddtt|d|dt||ddg|_ |j t t |j |_ |dkrtfddt|j D|_n4|dkrtfddt|j D|_ntd dS) Nrr)r r&r'cs g|]}ttjddqS))r()r))r*r r+)r,r-)r.r/rrr0ksz&Block2.__init__..r1cs g|]}ttjddqS))r()r))r*r r+)r,r-)r.r/rrr0qszRactivation incorrectly specified. check the config file and look for 'activation'.)r2rKr3r/r4r5rrr!convsr7rr9r:r;r<r=)r>r/r.rr r))r)r.r/rr3[s$    zAMPBlock2.__init__cCs8x2t|j|jD] \}}||}||}||}qW|S)N)r?rMr<)r>r@carErrrrFxs  zAMPBlock2.forwardcCsx|jD] }t|qWdS)N)rMr)r>rGrrrrs zAMPBlock2.remove_weight_norm)r#rLN)rrHrIr3rFrrJrr)rrrKZsrKcs,eZdZfddZddZddZZS)BigVGANc stt|||_t|j|_t|j|_t t |j |j dddd|_ |jdkrVtnt}t|_xhtt|j|jD]R\}\}}|jtt t|j d||j d|d||||ddgqxWt|_xjtt|jD]X}|j d|d}x@tt|j|jD]*\}\}}|j||||||jdqWqW|jdkrlt||jd } t| d|_ n0|jd krt!||jd } t| d|_ nt"d t t |ddddd|_#x(tt|jD]}|j|$t%qW|j#$t%dS) Nrr#)r&1r)r)r')r(r1zRactivation incorrectly specified. check the config file and look for 'activation'.)&r2rPr3r/r9Zresblock_kernel_sizes num_kernelsZupsample_rates num_upsamplesrrZnum_melsZupsample_initial_channelconv_preresblockr"rKr4r5ups enumerater?Zupsample_kernel_sizesappendr resblocksr;Zresblock_dilation_sizesr)r r+r*activation_postr r= conv_postr7r) r>r/rViukchjdr[)rrrr3s8        &  zBigVGAN.__init__cCs||}xt|jD]}x,tt|j|D]}|j|||}q.Wd}xPt|jD]B}|dkr~|j||j||}qX||j||j||7}qXW||j}qW||}||}t |}|S)N) rUr;rTr9rWrSrZr[r\torchtanh)r>r@r]Zi_upxsrarrrrFs     zBigVGAN.forwardcCs^tdx$|jD]}x|D] }t|qWqWx|jD] }|q6Wt|jt|jdS)NzRemoving weight norm...)printrWrrZrUr\)r>rGZl_irrrrs     zBigVGAN.remove_weight_norm)rrHrIr3rFrrJrr)rrrPs .rPcs&eZdZdfdd ZddZZS) DiscriminatorPr%r#Fcshtt|||_|j|_|dkr(tnt}t |t dt d|j|df|dft dddfd|t t d|jt d|j|df|dft dddfd|t t d|jt d|j|df|dft dddfd|t t d|jt d |j|df|dft dddfd|t t d |jt d |j|dfdd dg|_ |t t d |jdd dd d|_dS) NFr r%r)r&ii)rr)r#r)rr)r2rgr3perioddiscriminator_channel_multd_multrrr4r5rrr!rMr\)r>r/rjrstrideuse_spectral_normnorm_f)rrrr3s0:::4zDiscriminatorP.__init__cCsg}|j\}}}||jdkrH|j||j}t|d|fd}||}|||||j|j}x,|jD]"}||}t|t}||qhW| |}||t |dd}||fS)Nrreflectrr) shaperjFpadviewrM leaky_relu LRELU_SLOPErYr\rcflatten)r>r@fmapbrNtZn_padrGrrrrFs     zDiscriminatorP.forward)r%r#F)rrHrIr3rFrJrr)rrrgsrgcs$eZdZfddZddZZS)MultiPeriodDiscriminatorcsJtt|j|_td|jfdd|jD}t||_dS)Nzmpd_reshapes: {}csg|]}t|jdqS))rn)rgrn)r,rs)r/rrr0sz5MultiPeriodDiscriminator.__init__..) r2r{r3Z mpd_reshapesrfformatr4r5discriminators)r>r/r~)r)r/rr3s z!MultiPeriodDiscriminator.__init__c Cstg}g}g}g}xVt|jD]H\}}||\} } ||\} } || || || || qW||||fS)N)rXr~rY) r>yy_haty_d_rsy_d_gsfmap_rsfmap_gsr]rby_d_rfmap_ry_d_gfmap_grrrrFs     z MultiPeriodDiscriminator.forward)rrHrIr3rFrJrr)rrr{s r{cs,eZdZfddZddZddZZS)DiscriminatorRc st||_t|jdks.td|jt|_|jdkrBt nt }t |drrt d|j |j dkrnt nt }|j|_t |drt d|j|j|_t|tjdtd |jd d d |tjtd |jtd |jd d d d|tjtd |jtd |jd d d d|tjtd |jtd |jd d d d|tjtd |jtd |jddd g|_|tjtd |jdddd |_dS)Nr#z*MRD layer requires list with len=3, got {}Fmrd_use_spectral_normz,INFO: overriding MRD use_spectral_norm as {}mrd_channel_multz-INFO: overriding mrd channel multiplier as {}rrh)r# )r)r&)rr)rmr&)r#r#)rr)r2r3 resolutionr9AssertionErrorr}rv lrelu_slopernrrhasattrrfrrkrlrr4r5rrrMr\)r>cfgrro)rrrr3s(   ***0zDiscriminatorR.__init__cCsrg}||}|d}x.|jD]$}||}t||j}||q W||}||t |dd}||fS)Nrr) spectrogram unsqueezerMrrrurrYr\rcrw)r>r@rxrGrrrrF0s     zDiscriminatorR.forwardcCsv|j\}}}tj|t||dt||dfdd}|d}tj||||ddd}t|}tj|ddd }|S) Nrrp)moderFT)n_fft hop_length win_lengthcenterreturn_complexr)pdim) rrrrsrsqueezercstft view_as_realnorm)r>r@rrrZmagrrrr?s ,  zDiscriminatorR.spectrogram)rrHrIr3rFrrJrr)rrrs rcs&eZdZdfdd ZddZZS)MultiResolutionDiscriminatorFcsPtj|_t|jdks0td|jtfdd|jD|_dS)Nr#zSMRD requires list of list with len=3, each element having a list with len=3. got {}csg|]}t|qSr)r)r,r)rrrr0Rsz9MultiResolutionDiscriminator.__init__..) r2r3Z resolutionsr9rr}r4r5r~)r>rdebug)r)rrr3Ks  z%MultiResolutionDiscriminator.__init__c Csxg}g}g}g}xZt|jD]L\}}||d\} } ||d\} } || || || || qW||||fS)N)r@)rXr~rY) r>rrrrrrr]rbrrrrrrrrFUs   z$MultiResolutionDiscriminator.forward)F)rrHrIr3rFrJrr)rrrJs rc CsTd}xFt||D]8\}}x.t||D] \}}|tt||7}q$WqW|dS)Nrr)r?rcrabs)rrlossdrdgrlglrrr feature_lossfs  rc Csvd}g}g}x^t||D]P\}}td|d}t|d}|||7}||||qW|||fS)Nrrr)r?rcrrYitem) Zdisc_real_outputsZdisc_generated_outputsrZr_lossesZg_lossesrrZr_lossZg_lossrrrdiscriminator_lossos rcCsBd}g}x0|D](}td|d}||||7}qW||fS)Nrrr)rcrrY)Z disc_outputsrZ gen_lossesrrGrrrgenerator_loss}s   rc@s&eZdZd ddZddZddZdS) VocoderBigVGANcudacCshtjtj|ddd}ttj|d}t||_|j|d|j ||_ |j |j dS)Nz best_netG.ptcpu) map_locationzargs.yml generator) rcloadospathjoinr rPrload_state_dictevaldeviceto)r>Z ckpt_vocoderrZ vocoder_sdZ vocoder_argsrrrr3s  zVocoderBigVGAN.__init__c CsXtFt|tjr&t|d}|jtj|j d}| | SQRXdS)Nr)dtyper)rcno_grad isinstancenpndarray from_numpyrrfloat32rrrrnumpy)r>specrrrvocodes   zVocoderBigVGAN.vocodecCs ||S)N)r)r>wavrrr__call__szVocoderBigVGAN.__call__N)r)rrHrIr3rrrrrrrs r)r r)r)&rcZtorch.nn.functionalr4 functionalrrtorch.nnrrrZtorch.nn.utilsrrrrrr<r r Zalias_free_torchr omegaconfr rvrr!Moduler"rKrPrgr{rrrrrobjectrrrrrs.    <+S%5