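from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys

import librosa
import numpy as np
import torch
import torch.nn as nn
import torchaudio
from joblib import Parallel, delayed
from pesq import pesq

# Constants
MAX_WAV_VALUE = 32768.0
EPS = 1e-6


def read_and_config_file(input_path, decode=0):
    """Reads input paths from a file or directory and configures them for processing.

    Args:
        input_path (str): Path to the input directory or file.
        decode (int): Flag indicating if decoding should occur
            (1 for decode, 0 for standard read).

    Returns:
        list: A list of processed paths (decode mode) or dictionaries
            containing input and label paths (standard read).
    """
    processed_list = []

    if decode:
        if os.path.isdir(input_path):
            # Collect .wav files first and fall back to .flac if none are found.
            processed_list = librosa.util.find_files(input_path, ext="wav")
            if len(processed_list) == 0:
                processed_list = librosa.util.find_files(input_path, ext="flac")
        else:
            # Treat the input as a list file: the first field of each line is a path.
            with open(input_path) as fid:
                for line in fid:
                    path_s = line.strip().split()
                    processed_list.append(path_s[0])
        return processed_list

    # Standard read: each line is "input label [duration]".
    with open(input_path) as fid:
        for line in fid:
            tmp_paths = line.strip().split()
            if len(tmp_paths) == 3:
                sample = {"inputs": tmp_paths[0], "labels": tmp_paths[1],
                          "duration": float(tmp_paths[2])}
            elif len(tmp_paths) == 2:
                sample = {"inputs": tmp_paths[0], "labels": tmp_paths[1]}
            else:
                # Skip lines that have neither two nor three fields.
                continue
            processed_list.append(sample)
    return processed_list


def load_checkpoint(checkpoint_path, use_cuda):
    """Loads the model checkpoint from the specified path.

    Args:
        checkpoint_path (str): Path to the checkpoint file.
        use_cuda (bool): Flag indicating whether to use CUDA for loading.

    Returns:
        dict: The loaded checkpoint containing model parameters.
    """
    # Tensors are always mapped to CPU storage here, regardless of use_cuda.
    checkpoint = torch.load(checkpoint_path,
                            map_location=lambda storage, loc: storage)
    return checkpoint


def get_learning_rate(optimizer):
    """Retrieves the current learning rate from the optimizer.

    Args:
        optimizer (torch.optim.Optimizer): The optimizer instance.

    Returns:
        float: The learning rate of the first parameter group.
    """
    return optimizer.param_groups[0]["lr"]


def reload_for_eval(model, checkpoint_dir, use_cuda):
    """Reloads a model for evaluation from the specified checkpoint directory.

    Args:
        model (nn.Module): The model to be reloaded.
        checkpoint_dir (str): Directory containing checkpoints.
        use_cuda (bool): Flag indicating whether to use CUDA.

    Returns:
        None
    """
    print("Reloading from: {}".format(checkpoint_dir))
    best_name = os.path.join(checkpoint_dir, "last_best_checkpoint")
    ckpt_name = os.path.join(checkpoint_dir, "last_checkpoint")
    if os.path.isfile(best_name):
        name = best_name
    elif os.path.isfile(ckpt_name):
        name = ckpt_name
    else:
        print("Warning: No existing checkpoint or best_model found!")
        return

    # The pointer file stores the file name of the checkpoint to load.
    with open(name, "r") as f:
        model_name = f.readline().strip()
    checkpoint_path = os.path.join(checkpoint_dir, model_name)
    print("Checkpoint path: {}".format(checkpoint_path))
    checkpoint = load_checkpoint(checkpoint_path, use_cuda)
    if "model" in checkpoint:
        pretrained_model = checkpoint["model"]
    else:
        pretrained_model = checkpoint

    # Copy shape-compatible weights, tolerating a "module." prefix
    # (as added by DataParallel wrappers) on either side.
    state = model.state_dict()
    for key in state.keys():
        stripped = key.replace("module.", "")
        prefixed = "module." + key
        if key in pretrained_model and state[key].shape == pretrained_model[key].shape:
            state[key] = pretrained_model[key]
        elif stripped in pretrained_model and state[key].shape == pretrained_model[stripped].shape:
            state[key] = pretrained_model[stripped]
        elif prefixed in pretrained_model and state[key].shape == pretrained_model[prefixed].shape:
            state[key] = pretrained_model[prefixed]
        else:
            print(f"{key} not loaded")
    model.load_state_dict(state)
    print("=> Reload well-trained model {} for decoding.".format(model_name))


def reload_model(model, optimizer, checkpoint_dir, use_cuda=True, strict=True):
    """Reloads the model and optimizer state from a checkpoint.

    Args:
        model (nn.Module): The model to be reloaded.
        optimizer (torch.optim.Optimizer): The optimizer to be reloaded.
        checkpoint_dir (str): Directory containing checkpoints.
        use_cuda (bool): Flag indicating whether to use CUDA.
        strict (bool): If True, requires keys in state_dict to match exactly.

    Returns:
        tuple: Current epoch and step (both 0 if no checkpoint exists).
    """
    ckpt_name = os.path.join(checkpoint_dir, "checkpoint")
    if os.path.isfile(ckpt_name):
        with open(ckpt_name, "r") as f:
            model_name = f.readline().strip()
        checkpoint_path = os.path.join(checkpoint_dir, model_name)
        checkpoint = load_checkpoint(checkpoint_path, use_cuda)
        model.load_state_dict(checkpoint["model"], strict=strict)
        optimizer.load_state_dict(checkpoint["optimizer"])
        epoch = checkpoint["epoch"]
        step = checkpoint["step"]
        print("=> Reloaded previous model and optimizer.")
    else:
        print("[!] Checkpoint directory is empty. Train a new model ...")
        epoch = 0
        step = 0
    return epoch, step


def save_checkpoint(model, optimizer, epoch, step, checkpoint_dir, mode="checkpoint"):
    """Saves the model and optimizer state to a checkpoint file.

    Args:
        model (nn.Module): The model to be saved.
        optimizer (torch.optim.Optimizer): The optimizer to be saved.
        epoch (int): Current epoch number.
        step (int): Current training step number.
        checkpoint_dir (str): Directory to save the checkpoint.
        mode (str): Name of the pointer file that records the saved
            checkpoint's file name ('checkpoint' or other).

    Returns:
        None
    """
    checkpoint_path = os.path.join(
        checkpoint_dir, "model.ckpt-{}-{}.pt".format(epoch, step))
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "epoch": epoch,
                "step": step}, checkpoint_path)
    # Record the checkpoint file name so the reload functions can find it.
    with open(os.path.join(checkpoint_dir, mode), "w") as f:
        f.write("model.ckpt-{}-{}.pt".format(epoch, step))
    print("=> Saved checkpoint:", checkpoint_path)


def setup_lr(opt, lr):
    """Sets the learning rate for all parameter groups in the optimizer.

    Args:
        opt (torch.optim.Optimizer): The optimizer instance whose learning
            rate needs to be set.
        lr (float): The new learning rate to be assigned.

    Returns:
        None
    """
    for param_group in opt.param_groups:
        param_group["lr"] = lr


def pesq_loss(clean, noisy, sr=16000):
    """Calculates the PESQ (Perceptual Evaluation of Speech Quality) score
    between clean and noisy signals.

    Args:
        clean (ndarray): The clean audio signal.
        noisy (ndarray): The noisy audio signal.
        sr (int): Sample rate of the audio signals (default is 16000 Hz).

    Returns:
        float: The wide-band PESQ score, or -1 in case of an error.
    """
    try:
        pesq_score = pesq(sr, clean, noisy, "wb")
    except Exception:
        # Any failure inside the PESQ computation is reported as -1.
        pesq_score = -1
    return pesq_score


def batch_pesq(clean, noisy):
    """Computes the PESQ scores for batches of clean and noisy audio signals.

    Args:
        clean (list of ndarray): List of clean audio signals.
        noisy (list of ndarray): List of noisy audio signals.

    Returns:
        torch.FloatTensor: A tensor of normalized PESQ scores on the CUDA
            device, or None if any score is -1.
    """
    pesq_score = Parallel(n_jobs=-1)(
        delayed(pesq_loss)(c, n) for c, n in zip(clean, noisy))
    pesq_score = np.array(pesq_score)
    if -1 in pesq_score:
        return None
    # Normalize the scores with a shift of 1 and a scale of 3.5.
    pesq_score = (pesq_score - 1) / 3.5
    return torch.FloatTensor(pesq_score).to("cuda")


def power_compress(x):
    """Compresses the power of a complex spectrogram.

    Args:
        x (torch.Tensor): Input tensor with real and imaginary components
            in the last dimension.

    Returns:
        torch.Tensor: Compressed real and imaginary components, stacked
            along dimension 1.
    """
    real = x[..., 0]
    imag = x[..., 1]
    spec = torch.complex(real, imag)
    mag = torch.abs(spec)
    phase = torch.angle(spec)
    # Compress the magnitude with exponent 0.3 and keep the phase.
    mag = mag ** 0.3
    real_compress = mag * torch.cos(phase)
    imag_compress = mag * torch.sin(phase)
    return torch.stack([real_compress, imag_compress], 1)


def power_uncompress(real, imag):
    """Uncompresses the power of a compressed complex spectrogram.

    Args:
        real (torch.Tensor): Compressed real component.
        imag (torch.Tensor): Compressed imaginary component.

    Returns:
        torch.Tensor: Uncompressed spectrogram with real and imaginary
            components stacked along the last dimension.
    """
    spec = torch.complex(real, imag)
    mag = torch.abs(spec)
    phase = torch.angle(spec)
    # Invert the 0.3 power compression applied by power_compress.
    mag = mag ** (1.0 / 0.3)
    real_uncompress = mag * torch.cos(phase)
    imag_uncompress = mag * torch.sin(phase)
    return torch.stack([real_uncompress, imag_uncompress], -1)


def stft(x, args, center=False, periodic=False, onesided=None):
    """Computes the Short-Time Fourier Transform (STFT) of an audio signal.

    Args:
        x (torch.Tensor): Input audio signal.
        args (Namespace): Configuration arguments containing window type
            and lengths (win_type, win_len, win_inc, fft_len).
        center (bool): Whether to center the window.
        periodic (bool): Whether to build a periodic window.
        onesided (bool, optional): If True, computes only the one-sided transform.

    Returns:
        torch.Tensor: The computed STFT of the input signal as a real tensor
            (return_complex=False), or None if the window type is unsupported.
    """
    win_type = args.win_type
    win_len = args.win_len
    win_inc = args.win_inc
    fft_len = args.fft_len

    if win_type == "hamming":
        window = torch.hamming_window(win_len, periodic=periodic).to(x.device)
    elif win_type == "hanning":
        window = torch.hann_window(win_len, periodic=periodic).to(x.device)
    else:
        print(f"In STFT, {win_type} is not supported!")
        return None
    return torch.stft(x, fft_len, win_inc, win_len, center=center,
                      window=window, onesided=onesided, return_complex=False)


def istft(x, args, slen=None, center=False, normalized=False, periodic=False,
          onesided=None, return_complex=False):
    """Computes the inverse Short-Time Fourier Transform (ISTFT) of a complex spectrogram.

    Args:
        x (torch.Tensor): Input complex spectrogram.
        args (Namespace): Configuration arguments containing window type and lengths.
        slen (int, optional): Length of the output signal.
        center (bool): Whether to center the window.
        normalized (bool): Whether to normalize the output.
        periodic (bool): Whether to build a periodic window.
        onesided (bool, optional): If True, computes only the one-sided transform.
        return_complex (bool): Accepted for symmetry with stft; the calls
            below always pass return_complex=False.

    Returns:
        torch.Tensor: The reconstructed audio signal from the spectrogram,
            or None if the window type is unsupported.
    """
    win_type = args.win_type
    win_len = args.win_len
    win_inc = args.win_inc
    fft_len = args.fft_len

    if win_type == "hamming":
        window = torch.hamming_window(win_len, periodic=periodic).to(x.device)
    elif win_type == "hanning":
        window = torch.hann_window(win_len, periodic=periodic).to(x.device)
    else:
        print(f"In ISTFT, {win_type} is not supported!")
        return None

    try:
        output = torch.istft(x, n_fft=fft_len, hop_length=win_inc,
                             win_length=win_len, window=window, center=center,
                             normalized=normalized, onesided=onesided,
                             length=slen, return_complex=False)
    except Exception:
        # Fall back to converting a real (..., 2) tensor to a complex tensor.
        x_complex = torch.view_as_complex(x)
        output = torch.istft(x_complex, n_fft=fft_len, hop_length=win_inc,
                             win_length=win_len, window=window, center=center,
                             normalized=normalized, onesided=onesided,
                             length=slen, return_complex=False)
    return output


def compute_fbank(audio_in, args):
    """Computes the filter bank features from an audio signal.

    Args:
        audio_in (torch.Tensor): Input audio signal.
        args (Namespace): Configuration arguments containing window length,
            shift, and sampling rate.

    Returns:
        torch.Tensor: Computed filter bank features.
    """
    # Convert window length and shift from samples to milliseconds.
    frame_length = args.win_len / args.sampling_rate * 1000
    frame_shift = args.win_inc / args.sampling_rate * 1000

    return torchaudio.compliance.kaldi.fbank(
        audio_in, dither=1.0, frame_length=frame_length,
        frame_shift=frame_shift, num_mel_bins=args.num_mels,
        sample_frequency=args.sampling_rate, window_type=args.win_type)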