o v_f5@s0dZddlZddlZddlZddlZddlZddlZddlZddlZddl Zddl Zddl m Z ddl mZddlZddlZddlZddlmZddlmZmZddlmZddlmZdd lmZmZm Z d d Z!d d Z"ddZ#ddZ$ddZ%ddZ&de'fddZ(de'de'fddZ)de'de'de'fdd Z*de'de e'fd!d"Z+dEde'd$e'd%e,ddfd&d'Z-  dFde'd(e'd$e e'd)e e'ddf d*d+Z.    ,dGde'd-e'd.e e'd$e e'd)e e'd/e/ddfd0d1Z0de'd2e'de'fd3d4Z1d5d6Z2dHd8d9Z3dId:d;Z4dZ5d?dZ&d@d Z"dAdBZ6dCdDZ7dS)Jz Copyright (c) 2022, salesforce.com, inc. All rights reserved. SPDX-License-Identifier: BSD-3-Clause For full license text, see the LICENSE file in the repo root or https://opensource.org/licenses/BSD-3-Clause N)Optional)urlparse)download) file_lock g_pathmgr)registrytqdm)check_integritydownload_file_from_google_driveextract_archivecCs"ddlm}|dddS)Nrdatetimez %Y%m%d%H%M)rnowstrftimer r@/mnt/petrelfs/wufan/project/UnimerDemo/unimernet/common/utils.pyr#s rcCst|}|jdvS)N)httphttps)rscheme)url_or_filenameparsedrrris_url)s rcCstjtjtd|S)N cache_root)ospath expanduserjoinrget_pathrel_pathrrrget_cache_path.sr"cCstjtd|S)N library_root)rrrrrr rrr get_abs_path2sr$cCs8t|d }t|WdS1swYdS)Nr)openjsonload)filenamefrrr load_json6s $r+cCsFd}zt|s t|d}W|Sty"td|Y|Swz4 Create the directory if it does not exist. FTzError creating directory: )rexistsmkdirs BaseExceptionprintZdir_path is_successrrrmakedir@s   r3urlc Csddl}|>}|j|ddd%}|jr&|jWdWdS|WdWdS1s9wYWddS1sIwYdS)zh Given a URL, returns the URL it redirects to or the original URL in case of no indirection rNTstreamallow_redirects)requestsSessiongethistoryr4)r4r8sessionresponserrrget_redirected_urlNs "r>view_urlreturncCs,|d}|ddks J|d}d|S)a8 Utility function to transform a view URL of google drive to a download URL for google drive Example input: https://drive.google.com/file/d/137RyRjvTBkBiIfeYBNZBtViDHQ6_Ewsp/view Example output: https://drive.google.com/uc?export=download&id=137RyRjvTBkBiIfeYBNZBtViDHQ6_Ewsp /rviewz/https://drive.google.com/uc?export=download&id=)split)r?splitsfile_idrrrto_google_drive_download_url]s  rG output_pathoutput_file_namec Csddl}|}|j|ddd}|jD]\}}|dr&|d|}qWdn1s1wY|j|dddb}t|tj ||}t |j dd} t |d 9} dd l m } | | d } |jtjd D]} | | | t| qlWdn1swYWdn1swYWdn1swYWddSWddS1swYdS) z Download a file from google drive Downloading an URL from google drive requires confirmation when the file of the size is too big (google drive notifies that anti-viral checks cannot be performed on such files) rNTr5Zdownload_warningz &confirm=)r6verifyzContent-lengthwbrtotal) chunk_size)r8r9r:cookiesitems startswithr3rrrintheadersr&r iter_contentioDEFAULT_BUFFER_SIZEwriteupdatelen)r4rHrIr8r<r=kvr total_sizefiler progress_barblockrrrdownload_google_drive_urlls<        "r`cCsBt|}td|jdurdStd|j}|durdS|dS)Nz(drive|docs)[.]google[.]comz/file/d/(?P[^/]*)id)rrematchnetlocrgroup)r4partsrcrrr_get_google_drive_file_ids rgr)rNc st|d`}tjtjj|ddid9tjd#}tfdddD]}|s,n || |q&Wdn1sAwYWdn1sPwYWddSWddS1shwYdS) NrKz User-AgentZvissl)rSrLcs S)N)readrrNr=rrs z_urlretrieve..) r&urllibrequesturlopenRequestr lengthiterrXrW)r4r)rNfhpbarchunkrrjr _urlretrieves$   "rvrootmd5c Cstj|}|stj|}tj||}t|t||r&td|dSt|}t |}|dur9t ||||Sztd|d|t ||Wn6t j jtfy}z&|dddkrt|dd}td |d|t ||n|WYd}~nd}~wwt||std dS) a~Download a file from a url and place it in root. Args: url (str): URL to download file from root (str): Directory to place downloaded file in filename (str, optional): Name to save the file under. If None, use the basename of the URL. md5 (str, optional): MD5 checksum of the download. If None, do not check z$Using downloaded and verified file: N Downloading  to rzhttps:zhttp:z;Failed download. Trying https -> http instead. Downloading zFile not found or corrupted.)rrrbasenamerr3r r0r>rgr rvrmerrorURLErrorIOErrorreplace RuntimeError)r4rwr)rxfpathrFerrr download_urlsF        rF download_root extract_rootremove_finishedcCsdtj|}|dur |}|stj|}t||||tj||}td||t|||dS)NzExtracting {} to {}) rrrr|rrr0formatr )r4rrr)rxrarchiverrrdownload_and_extract_archives  r cache_dircCst|}tj|tj|jd}t||dd}tj||}t|!tj |sCt d|d|dt |||d}Wdn1sMwYt d|d ||S) z This implementation downloads the remote resource and caches it locally. The resource will only be downloaded if not previously requested. rArryrzz ...)r)NzURL z cached in ) rrrrdirnamelstripr3rDrisfilelogginginfor)r4r parsed_urlrr)cachedrrr cache_urls  rc Cs^zt|r t|t||WdSty.}ztd|WYd}~dSd}~ww)z Simply create the symlinks for a given file1 to file2. Useful during model checkpointing to symlinks to the latest successful checkpoint. z!Could NOT create symlink. Error: N)rr-rmsymlink Exceptionrr)Zfile1Zfile2rrrrcreate_file_symlinks  rTcCs|r td|tj|d}|dvr5t|d}t||tj Wdn1s/wYn|dkrVt|d}t ||Wdn1sPwYn|dkr|rt|d}| t j|d d d |Wdn1s}wYnXt|d }| t j|d d d |Wdn1swYn1|d krt|d }t|}| ||Wdn1swYntd|d|rtd|dSdS)a Common i/o utility to handle saving data to various file formats. Supported: .pkl, .pickle, .npy, .json Specifically for .json, users have the option to either append (default) or rewrite by passing in Boolean value to append_to_json. zSaving data to file: z.pklz.picklerKN.npy.jsonaT) sort_keys w.yamlzSaving  is not supported yetzSaved data to file: )rrrrsplitextrr&pickledumpHIGHEST_PROTOCOLnpsaverWr'dumpsflushyamlr)datar)Zappend_to_jsonverbosefile_extfopenrrrr save_filesH     rc Cs|r td|tj|d}|dkr3t|d }|}Wd|S1s,wY|S|dvrWt|d}tj |dd }Wd|S1sPwY|S|d kr|rz$t|d}t j ||d|d }WdW|S1szwYW|St y}z!td |d |dt j ||d|d }tdWYd}~|Sd}~wt ytdt|d}t j ||dd}WdY|S1swYY|Swt|d}t j ||dd}Wd|S1swY|S|dkr"t|d}t |}Wd|S1swY|S|dkrIt|d}tj |tjd}Wd|S1sBwY|S|dkrmt|d}t|}Wd|S1sfwY|St d|d)a Common i/o utility to handle loading data from various file formats. Supported: .pkl, .pickle, .npy, .json For the npy files, we support reading the files in mmap_mode. If the mmap_mode of reading is not successful, we load data without the mmap_mode. zLoading data from file: rz.txtr%Nrrblatin1)encodingr) allow_pickler mmap_modezCould not mmap z: z. Trying without g_pathmgrz%Successfully loaded without g_pathmgrz5Could not mmap without g_pathmgr. Trying without mmap)rrrr)Loaderz.csvz Reading from r)rrrrrrr& readlinesrr(r ValueErrorrr'r FullLoaderpdread_csv)r)rrrrrrrrrr load_file9s   ,, )) $$             r resource_pathcCs(td}||durtj|S|S)zb Make a path absolute, but take into account prefixes like "http://" or "manifold://" z^\w+://N)rbcompilercrrabspath)rregexrrrrvs  rcCsHd}zt|s t|d}W|Sty#td|Y|Swr,)rr-r.r/rrr1rrrr3s   cCstd|tjdu}|S)zV Check if an input string is a url. look for http(s):// and ignoring the case z^(?:http)s?://N)rbrc IGNORECASE)Z input_urlrrrrrcCs:tj|rtd|t|td|dS)z Utility for deleting a directory. Useful for cleaning the storage space that contains various training artifacts like checkpoints, data etc. zDeleting directory: zDeleted contents of directory: N)rrr-rrshutilrmtree)dirrrr cleanup_dirs  rcCstj|td}|S)z2 Given a file, get the size of file in MB i)rrgetsizefloat)r)Z size_in_mbrrr get_file_sizerr)rh)NN)NNNF)TT)NTF)8__doc__rUr'rrrrbrrm urllib.errorurllib.requesttypingr urllib.parsernumpyrpandasrrZiopath.common.downloadrZiopath.common.file_iorrunimernet.common.registryrZtorch.utils.model_zoor torchvision.datasets.utilsr r r rrr"r$r+r3strr>rGr`rgrRrvrboolrrrrrrrrrrrrs      !  9   &=