from __future__ import annotations

import copy
import logging
import os
import re
import warnings
from abc import ABC, abstractmethod
from contextlib import contextmanager
from typing import Any, Optional, Union

import torch
from accelerate.hooks import AlignDevicesHook
from accelerate.utils import named_module_tensors, offload_state_dict
from torch import nn
from transformers import PreTrainedModel
from transformers.pytorch_utils import Conv1D

from peft.utils import INCLUDE_LINEAR_LAYERS_SHORTHAND

from ..config import PeftConfig
from ..utils import ModulesToSaveWrapper, _get_submodules

logger = logging.getLogger(__name__)


@contextmanager
def onload_layer(layer):
    r"""
    A utility for modifying a module containing one or more tuners and a base layer, any of which are offloaded to
    the CPU or disk. Moves a module's sub-modules to the execution device before some action is performed, after that
    the base layer state dictionary is re-assigned (if that layer was offloaded to the disk) and finally the
    parameters are offloaded again.

    If the module has no offloaded sub-modules, this function does nothing.

    Args:
        layer (`torch.nn.Module`):
            layer with tuners to be merged
    """

    offloaded_modules = []
    for name, module in layer.named_modules():
        if name in ["", "base_layer"]:
            continue
        if hasattr(module, "_hf_hook") and isinstance(module._hf_hook, AlignDevicesHook) and module._hf_hook.offload:
            module._hf_hook.pre_forward(module)
            offloaded_modules.append(module)

    base_layer_offload = False
    if hasattr(layer, "base_layer") and (
        hasattr(layer.base_layer, "_hf_hook")
        and isinstance(layer.base_layer._hf_hook, AlignDevicesHook)
        and layer.base_layer._hf_hook.offload
    ):
        # the base layer is disk-offloaded if its original device is "meta" and the weights map carries a dataset
        if torch.device("meta") in layer.base_layer._hf_hook.original_devices.values() and hasattr(
            layer.base_layer._hf_hook.weights_map, "dataset"
        ):
            # find the disk-offload index (maps modules to safetensors files)
            index = layer.base_layer._hf_hook.weights_map.dataset.index
            module_name = list(dict(index).keys())[0]  # any module will do
            file_name = index[module_name]["safetensors_file"]
            base_name_arr = []
            # get effective directory name
            for i in os.path.split(file_name):
                if "--" in i:
                    base_name_arr.append(i)
                    break
                base_name_arr.append(i)
            base_name = os.path.join(*base_name_arr)
            safetensors_filename = base_name + "-merged"
        layer.base_layer._hf_hook.pre_forward(layer.base_layer)
        base_layer_offload = True

    yield

    for module in offloaded_modules:
        module._hf_hook.post_forward(module, torch.tensor([]))

    if base_layer_offload:
        # re-make the weights map (must be on cpu to send params to the disk via memmap if disk-offloaded)
        layer.base_layer._hf_hook.weights_map = {
            name: param.to("cpu") for name, param in named_module_tensors(layer.base_layer)
        }
        # offload the weights map to disk if the original device is the disk
        if torch.device("meta") in layer.base_layer._hf_hook.original_devices.values() and hasattr(
            layer.base_layer._hf_hook.weights_map, "dataset"
        ):
            # rewrite the directory with the merged weights
            offload_state_dict(safetensors_filename, layer.base_layer._hf_hook.weights_map)
        layer.base_layer._hf_hook.post_forward(layer.base_layer, torch.tensor([]))


class BaseTuner(nn.Module, ABC):
    r"""
    A base tuner model that provides the common methods and attributes for all tuners that are injectable into a
    torch.nn.Module.

    For adding a new Tuner class, one needs to overwrite the following methods:

    - **_prepare_adapter_config**:
        A private method to eventually prepare the adapter config, for example in case the field `target_modules` is
        missing.
    - **_create_and_replace**:
        A private method to create and replace the target module with the adapter module.
    - **_check_target_module_exists**:
        A private helper method to check if the passed module's key name matches any of the target modules in the
        adapter_config.

    The easiest is to check what is done in the `peft.tuners.lora.LoraModel` class.

    Attributes:
        model (`torch.nn.Module`):
            The model to which the adapter tuner layers will be attached.
        forward (`Callable`):
            The forward method of the model.
        peft_config (`Union[PeftConfig, dict[str, PeftConfig]]`):
            The adapter configuration object, it should be a dictionary of `str` to `PeftConfig` objects. One can also
            pass a PeftConfig object and a new adapter will be created with the default name `adapter` or create a new
            dictionary with a key `adapter_name` and a value of that peft config.
        config (`dict[str, Any]`):
            The model configuration object, it should be a dictionary of `str` to `Any` objects.
        targeted_module_names (`list[str]`):
            The list of module names that were actually adapted. Can be useful to inspect if you want to quickly
            double-check that the `config.target_modules` were specified correctly.
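
    Example (an illustrative sketch rather than part of the original docstring; it assumes a LoRA adapter and the
    public entry points `LoraConfig` and `get_peft_model` from the top-level `peft` package):

    ```py
    >>> from transformers import AutoModelForCausalLM
    >>> from peft import LoraConfig, get_peft_model

    >>> base_model = AutoModelForCausalLM.from_pretrained("gpt2")
    >>> config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"])
    >>> # get_peft_model instantiates a BaseTuner subclass (here LoraModel), which calls
    >>> # `inject_adapter` to replace every matched module with an adapter layer.
    >>> peft_model = get_peft_model(base_model, config)
    >>> peft_model.print_trainable_parameters()
    ```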
    """

    def __init__(self, model, peft_config: Union[PeftConfig, dict[str, PeftConfig]], adapter_name: str) -> None:
        super().__init__()

        self.model = model
        self.targeted_module_names: list[str] = []

        # For advanced developers, if you want to attach multiple adapters to your model, just add a
        # `peft_config` dict attribute to your model.
        if not hasattr(self, "peft_config"):
            self.peft_config = {adapter_name: peft_config} if isinstance(peft_config, PeftConfig) else peft_config
        else:
            logger.info(
                "Already found a `peft_config` attribute in the model. This will lead to having multiple adapters"
                " in the model. Make sure to know what you are doing!"
            )
            if isinstance(peft_config, PeftConfig):
                self.peft_config[adapter_name] = peft_config
            else:
                # user is adding a dict of PeftConfigs
                self.peft_config.update(peft_config)

        self.active_adapter: str | list[str] = adapter_name
        self._pre_injection_hook(self.model, self.peft_config[adapter_name], adapter_name)
        self.inject_adapter(self.model, adapter_name)

        # Copy the peft_config onto the injected model.
        self.model.peft_config = self.peft_config

    @property
    def active_adapters(self) -> list[str]:
        if isinstance(self.active_adapter, str):
            return [self.active_adapter]
        # is already a list of str
        return self.active_adapter

    def forward(self, *args: Any, **kwargs: Any):
        return self.model.forward(*args, **kwargs)

    def _pre_injection_hook(self, model: nn.Module, config: PeftConfig, adapter_name: str) -> None:
        r"""
        A hook to be called before the adapter is injected into the model. This method can be overridden by child
        classes to perform any pre-injection operations.

        Args:
            model (`nn.Module`):
                The model to be adapted.
            config (`PeftConfig`):
                The adapter config.
            adapter_name (`str`):
                The adapter name.
        """
        pass

    @abstractmethod
    def _prepare_adapter_config(self, peft_config: PeftConfig, model_config: dict) -> PeftConfig:
        r"""
        A private method to eventually prepare the adapter config. For transformers based models, if
        `peft_config.target_modules` is None, we can automatically infer the target modules from the
        `TRANSFORMERS_MODELS_TO_XXX_TARGET_MODULES_MAPPING`. This method can be further refactored in the future to
        automatically infer it for all tuner models.

        Check out `peft.tuners.lora.LoraModel._prepare_adapter_config` for an example.

        Args:
            peft_config (`PeftConfig`):
                The adapter config.
            model_config (`dict`):
                The transformers model config; that config should contain the `model_type` key.
        """
        ...

    def _prepare_model(self, peft_config: PeftConfig, model: nn.Module):
        r"""
        A private method to modify the model structure before the adapter is applied.

        See `peft.tuners.lora.LoraModel._prepare_model` for an example.

        Args:
            peft_config (`PeftConfig`):
                The prepared adapter config.
            model (`nn.Module`):
                The model that is going to be adapted.
        """
        pass

    @staticmethod
    @abstractmethod
    def _check_target_module_exists(peft_config: PeftConfig, key: str) -> bool:
        r"""
        A helper private method to check if the passed module's key name matches any of the target modules in the
        `peft_config.target_modules` list. If it does, return `True`, else return `False`.

        Args:
            peft_config (`PeftConfig`):
                The adapter config.
            key (`str`):
                The module's key name.
        """
        ...

    @abstractmethod
    def _create_and_replace(
        self,
        peft_config: PeftConfig,
        adapter_name: str,
        target: nn.Module,
        target_name: str,
        parent: nn.Module,
        current_key: str,
    ) -> None:
        r"""
        Inplace replacement of the target module with the adapter layer. This method needs to be overridden by all
        the tuner classes.

        Check `peft.tuners.lora.LoraModel._create_and_replace` for an example.

        Args:
            peft_config (`PeftConfig`):
                The adapter config.
            adapter_name (`str`):
                The adapter name.
            target (`nn.Module`):
                The target module.
            target_name (`str`):
                The target module's name.
            parent (`nn.Module`):
                The parent module.
            current_key (`str`):
                The key of the current target being adapted.
        """
        ...

    @abstractmethod
    def _mark_only_adapters_as_trainable(self, model: nn.Module):
        r"""
        A helper method to mark only the adapter layers as trainable (i.e. set `requires_grad = False` on all other
        modules). This needs to be overridden for all tuner classes to match the correct key names.

        Check `peft.tuners.lora.LoraModel._mark_only_adapters_as_trainable` for an example.
        """
        ...

    @abstractmethod
    def disable_adapter_layers(self) -> None:
        r"""
        Disable all adapters in-place.
        """
        ...

    @abstractmethod
    def enable_adapter_layers(self) -> None:
        r"""
        Enable all adapters in-place.
        """
        ...

    def _check_new_adapter_config(self, config: PeftConfig) -> None:
        """
        A helper method to check the config when a new adapter is being added.

        Raise a ValueError if there is something wrong with the config or if it conflicts with existing adapters.
        """
        pass

    def _check_merge_allowed(self):
        """Helper method to check whether the adapter can be merged.

        Raise a ValueError if it is not possible to merge the adapter with the given configuration.
        """
        pass

    def inject_adapter(self, model: nn.Module, adapter_name: str):
        r"""
        Creates adapter layers and replaces the target modules with the adapter layers. This method is called under
        the hood by `peft.mapping.get_peft_model` if a non-prompt tuning adapter class is passed.

        The corresponding PEFT config is directly retrieved from the `peft_config` attribute of the BaseTuner class.

        Args:
            model (`nn.Module`):
                The model to be tuned.
            adapter_name (`str`):
                The adapter name.
        """
        peft_config = self.peft_config[adapter_name]
        # Note: If possible, all checks should be performed *at the start of this method*.
        # This way, we can raise early if something goes wrong, without leaving the model in a bad
        # (half-initialized) state.
        self._check_new_adapter_config(peft_config)

        _check_for_modules_to_save = getattr(peft_config, "modules_to_save", None) is not None
        _has_modules_to_save = False

        model_config = getattr(model, "config", {"model_type": "custom"})
        if hasattr(model_config, "to_dict"):
            model_config = model_config.to_dict()

        peft_config = self._prepare_adapter_config(peft_config, model_config)

        self._prepare_model(peft_config, model)
        is_target_modules_in_base_model = False
        key_list = [key for key, _ in model.named_modules()]

        # update peft_config.target_modules if required
        peft_config = _maybe_include_all_linear_layers(peft_config, model)

        for key in key_list:
            # check whether this key belongs to modules_to_save
            if _check_for_modules_to_save and any(
                key.endswith(f"{module_to_save}") for module_to_save in peft_config.modules_to_save
            ):
                # wrap the module so that a trainable, adapter-specific copy is kept
                parent, target, target_name = _get_submodules(model, key)

                if not isinstance(target, ModulesToSaveWrapper):
                    new_module = ModulesToSaveWrapper(target, adapter_name)
                    setattr(parent, target_name, new_module)
                else:
                    target.update(adapter_name)

                _has_modules_to_save = True
                continue

            if not self._check_target_module_exists(peft_config, key):
                continue

            self.targeted_module_names.append(key)
            is_target_modules_in_base_model = True
            parent, target, target_name = _get_submodules(model, key)
            self._create_and_replace(peft_config, adapter_name, target, target_name, parent, current_key=key)

        if not is_target_modules_in_base_model:
            raise ValueError(
                f"Target modules {peft_config.target_modules} not found in the base model. "
                f"Please check the target modules and try again."
            )

        # It is important to set the adapter here (again), because otherwise adding a second adapter that targets
        # different layers could leave layers active that do not belong to the active adapter.
        self.set_adapter(self.active_adapters)
        self._mark_only_adapters_as_trainable(model)

        if self.peft_config[adapter_name].inference_mode:
            for n, p in model.named_parameters():
                if adapter_name in n:
                    p.requires_grad = False

        if _has_modules_to_save:
            if not hasattr(model, "modules_to_save"):
                model.modules_to_save = set(peft_config.modules_to_save)
            else:
                model.modules_to_save.update(set(peft_config.modules_to_save))

    def merge_adapter(self, adapter_names: Optional[list[str]] = None) -> None:
        """
        This method merges the adapter layers into the base model.

        Merging adapters can lead to a speed up of the forward pass. A copy of the adapter weights is still kept in
        memory, which is required to unmerge the adapters. In order to merge the adapter weights without keeping them
        in memory, please call `merge_and_unload`.

        Args:
            adapter_names (`list[str]`, *optional*):
                The list of adapter names that should be merged. If `None`, all active adapters will be merged.
                Defaults to `None`.
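
        Example (an illustrative sketch, not part of the original docstring; it assumes `peft_model` is a `PeftModel`
        wrapping a LoRA-adapted model, as in the example in the `BaseTuner` docstring, and that `inputs` is a
        tokenized batch):

        ```py
        >>> output = peft_model(**inputs)  # adapter weights are applied on the fly
        >>> peft_model.merge_adapter()  # fold the adapter weights into the base weights
        >>> output = peft_model(**inputs)  # same outputs (up to numerical precision), now a plain forward pass
        >>> peft_model.unmerge_adapter()  # restore the original base weights
        ```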
        """
        self._check_merge_allowed()
        for module in self.model.modules():
            if isinstance(module, BaseTunerLayer):
                with onload_layer(module):
                    module.merge(adapter_names=adapter_names)

    def unmerge_adapter(self):
        """
        This method unmerges all merged adapter layers from the base model.
        """
        for module in self.model.modules():
            if isinstance(module, BaseTunerLayer):
                with onload_layer(module):
                    module.unmerge()

    def _unloading_checks(self, adapter_names: Optional[list[str]]):
        adapters_to_consider = adapter_names or self.active_adapters
        is_modules_to_save_available = any(
            self.peft_config[adapter].modules_to_save for adapter in adapters_to_consider
        )
        if is_modules_to_save_available and len(adapters_to_consider) > 1:
            raise ValueError("Cannot unload multiple adapters that specify `modules_to_save`.")


class BaseTunerLayer(ABC):
    r"""
    A tuner layer mixin that provides the common methods and attributes for all tuners.

    Args:
        is_pluggable (`bool`, *optional*):
            Whether the adapter layer can be plugged to any pytorch module
        active_adapters (Union[List[`str`], `str`], *optional*):
            The name of the active adapter.
    """

    active_adapter = None

    # All names of layers that may contain adapter (trainable) weights
    adapter_layer_names: tuple[str, ...] = ()
    # All names of other parameters that may contain adapter-related parameters
    other_param_names: tuple[str, ...] = ()

    # indicates whether all adapters should be disabled
    _disable_adapters: bool = False

    # the currently active adapter(s)
    _active_adapter: str | list[str] = "default"

    # list of all merged adapters
    merged_adapters: list[str] = []

    def get_base_layer(self) -> nn.Module:
        """
        (Recursively) get the base_layer.

        This is necessary for the case that the tuner layer wraps another tuner layer.
        """
        base_layer = self
        while hasattr(base_layer, "base_layer"):
            base_layer = base_layer.base_layer
        return base_layer

    @property
    def weight(self) -> torch.Tensor:
        # Required for some transformers code paths that access the weight of the wrapped layer directly.
        base_layer = self.get_base_layer()
        if hasattr(base_layer, "qweight"):
            # quantized layers (e.g. QuantLinear)
            weight = base_layer.qweight
        else:
            # other layers
            weight = base_layer.weight
        return weight

    @property
    def bias(self) -> torch.Tensor:
        base_layer = self.get_base_layer()
        return base_layer.bias

    def merge(self, safe_merge: bool = False, adapter_names: Optional[list[str]] = None) -> None:
        raise NotImplementedError

    def unmerge(self) -> None:
        raise NotImplementedError

    @property
    def merged(self) -> bool:
        return bool(self.merged_adapters)

    @property
    def disable_adapters(self) -> bool:
        # use a property to ensure that disable_adapters is not set directly, instead use the enable_adapters method
        return self._disable_adapters

    @property
    def active_adapter(self) -> str | list[str]:
        # use a property to ensure that active_adapter is not set directly, instead use the set_adapter method
        return self._active_adapter

    def _get_available_adapters(self) -> set[str]:
        """Return all adapter names that can be found on this module."""
        adapters = set()
        for layer_name in self.adapter_layer_names:
            module = getattr(self, layer_name)
            if not isinstance(module, (nn.ModuleDict, nn.ParameterDict)):
                continue
            adapters.update(set(module.keys()))
        return adapters

    @property
    def active_adapters(self):
        if isinstance(self.active_adapter, str):
            return [self.active_adapter]
        # is already a list of str
        return self.active_adapter

    def enable_adapters(self, enabled: bool) -> None:
        """Toggle the enabling and disabling of adapters

        Takes care of setting the requires_grad flag for the adapter weights.

        Args:
            enabled (bool): True to enable adapters, False to disable adapters
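
        For example, to switch off every adapter on a model in place (an illustrative sketch, not part of the
        original docstring; `model` is assumed to be a module that already contains tuner layers):

        ```py
        >>> for module in model.modules():
        ...     if isinstance(module, BaseTunerLayer):
        ...         module.enable_adapters(False)
        ```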
        """
        if enabled:
            self.set_adapter(self.active_adapters)
            self._disable_adapters = False
        else:
            # disable grads on all adapter layers
            for layer_name in self.adapter_layer_names:
                layer = getattr(self, layer_name)
                layer.requires_grad_(False)
            self._disable_adapters = True

    def set_adapter(self, adapter_names: str | list[str]) -> None:
        """Set the active adapter(s).

        Additionally, this function will set the specified adapters to trainable (i.e., requires_grad=True). If this
        is not desired, use the following code.

        ```py
        >>> for name, param in model_peft.named_parameters():
        ...     if ...:  # some check on name (ex. if 'lora' in name)
        ...         param.requires_grad = False
        ```

        Args:
            adapter_names (`str` or `List[str]`): Name of the adapter(s) to be activated.
        """
        if isinstance(adapter_names, str):
            adapter_names = [adapter_names]

        # Deactivate grads on the inactive adapters and activate grads on the active adapters
        for layer_name in self.adapter_layer_names:
            module_dict = getattr(self, layer_name)
            for key, layer in module_dict.items():
                if key in adapter_names:
                    # Note: It is possible that not a single layer is called with requires_grad_(True) here. This may
                    # happen if a completely different adapter layer is being activated.
                    layer.requires_grad_(True)
                else:
                    layer.requires_grad_(False)

        self._active_adapter = adapter_names

    def _all_available_adapter_names(self) -> list[str]:
        """Return a sorted list of all available adapter names"""
        adapter_names = set()
        for name in self.adapter_layer_names + self.other_param_names:
            # check each possible attribute; if it behaves like a dict or ModuleDict, its keys are the adapter names
            attr = getattr(self, name)
            if hasattr(attr, "keys"):
                adapter_names.update(attr.keys())
        return sorted(adapter_names)

    def delete_adapter(self, adapter_name: str) -> None:
        """
        Delete an adapter from the layer

        This should be called on all adapter layers, or else we will get an inconsistent state.

        This method will also set a new active adapter if the deleted adapter was an active adapter. It is important
        that the new adapter is chosen in a deterministic way, so that the same adapter is chosen on all layers.

        Args:
            adapter_name (`str`): The name of the adapter to delete
        """
        for attr in self.adapter_layer_names + self.other_param_names:
            if adapter_name in getattr(self, attr):
                del getattr(self, attr)[adapter_name]

        if adapter_name in self.active_adapters:
            # choose a new active adapter
            active_adapters = self.active_adapters[:]
            active_adapters.remove(adapter_name)
            if active_adapters:
                self.set_adapter(active_adapters)
            else:
                # no active adapters left, set a new default adapter
                # here we get the list of all existing adapter names and choose the first one
                remaining_adapters = self._all_available_adapter_names()
                if not remaining_adapters:
                    self.set_adapter([])
                else:
                    new_active_adapter = remaining_adapters[0]
                    warnings.warn(
                        f"Adapter {adapter_name} was active which is now deleted. Setting active adapter to "
                        f"{new_active_adapter}."
                    )
                    self.set_adapter(remaining_adapters[0])


def check_target_module_exists(config, key: str) -> bool | re.Match[str] | None:
    """A helper method to check if the passed module's key name matches any of the target modules in the
    adapter_config.

    Args:
        config (`LoraConfig` | `LycorisConfig`): A config to match target modules from
        key (`str`): A key to search any matches in config

    Returns:
        `bool` | `re.Match[str]` | `None`: True or a match object if key matches any target modules from config,
        False or None if no match found
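
    Example (an illustrative sketch, not part of the original docstring; `LoraConfig` is used as the config class and
    the keys mimic a typical decoder model):

    ```py
    >>> from peft import LoraConfig
    >>> from peft.tuners.tuners_utils import check_target_module_exists

    >>> # a list of target modules is matched as a suffix of the module key
    >>> config = LoraConfig(target_modules=["q_proj", "v_proj"])
    >>> check_target_module_exists(config, "model.layers.0.self_attn.q_proj")
    True
    >>> check_target_module_exists(config, "model.layers.0.mlp.down_proj")
    False

    >>> # a single string is treated as a regex that must match the full key
    >>> config = LoraConfig(target_modules=r".*\.self_attn\.(q|v)_proj")
    >>> bool(check_target_module_exists(config, "model.layers.31.self_attn.v_proj"))
    True
    ```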
    """
    if isinstance(config.target_modules, str):
        target_module_found = re.fullmatch(config.target_modules, key)
    elif key in config.target_modules:
        # this module is specified directly in target_modules
        target_module_found = True
    else:
        target_module_found = any(key.endswith(f".{target_key}") for target_key in config.target_modules)

        layer_indexes = getattr(config, "layers_to_transform", None)
        layers_pattern = getattr(config, "layers_pattern", None)

        is_using_layer_indexes = layer_indexes is not None and (
            len(layer_indexes) != 0 if isinstance(layer_indexes, list) else True
        )
        if is_using_layer_indexes and target_module_found:
            layer_index = None
            if layers_pattern is None or len(layers_pattern) == 0:
                # no layers_pattern given: infer the layer index from a generic `.<block>.<idx>.` pattern
                layer_index = re.match(r".*\.[^.]*\.(\d+)\.", key)
            else:
                layers_pattern = [layers_pattern] if isinstance(layers_pattern, str) else layers_pattern
                for pattern in layers_pattern:
                    layer_index = re.match(rf".*\.{pattern}\.(\d+)\.", key)
                    if layer_index is not None:
                        break

            if layer_index is None:
                target_module_found = False
            else:
                layer_index = int(layer_index.group(1))
                if isinstance(layer_indexes, int):
                    target_module_found = layer_index == layer_indexes
                else:
                    target_module_found = layer_index in layer_indexes

    return target_module_found


def inspect_matched_modules(tuner: BaseTuner, adapter_name: str = "default") -> dict:
    """
    A helper function to inspect the set of matched and unmatched modules for a PEFT model and the given adapter.
    """
    config = tuner.peft_config[adapter_name]
    key_list = [key for key, _ in tuner.model.named_modules()]
    module_dict = {"matched": [], "unmatched": []}
    for key in key_list:
        if tuner._check_target_module_exists(config, key):
            module_dict["matched"].append(key)
        else:
            module_dict["unmatched"].append(key)
    return module_dict


def _maybe_include_all_linear_layers(peft_config: PeftConfig, model: nn.Module) -> PeftConfig:
    """
    Helper function to update `target_modules` to all linear/Conv1D layers if provided as 'all-linear'. Adapted from
    the QLoRA repository: https://github.com/artidoro/qlora/blob/main/qlora.py
    """

    # if `target_modules` is a string, convert to lower case and check if it matches "all-linear"
    if not (
        isinstance(peft_config.target_modules, str)
        and peft_config.target_modules.lower() == INCLUDE_LINEAR_LAYERS_SHORTHAND
    ):
        return peft_config

    if not isinstance(model, PreTrainedModel):
        raise ValueError(
            f"Only instances of PreTrainedModel support `target_modules={INCLUDE_LINEAR_LAYERS_SHORTHAND!r}`"
        )

    linear_classes = (torch.nn.Linear, Conv1D)

    linear_module_names = set()
    for name, module in model.named_modules():
        # match against all linear classes
        if isinstance(module, linear_classes):
            names = name.rsplit(".", 1)[-1]  # get the base name
            linear_module_names.add(names)

    # ignore the last classification head for text generation models
    output_emb = model.get_output_embeddings()
    if output_emb is not None:
        last_module_name = [name for name, module in model.named_modules() if module is output_emb][0]
        linear_module_names -= {last_module_name}
    peft_config.target_modules = linear_module_names
    return peft_config


def check_adapters_to_merge(module: BaseTunerLayer, adapter_names: Optional[list[str]] = None) -> list[str]:
    """
    Helper function to check which adapters should be merged.

    Only return those adapters that are not already merged. Give a warning if some or all of the adapters are already
    merged.
    """
    if adapter_names is None:
        adapter_names = module.active_adapters
    if isinstance(adapter_names, str):
        raise ValueError(f"adapter_names should be a list of strings, got {adapter_names!r}.")

    if module.merged:
        merged_adapters = set(module.merged_adapters)
        adapter_names = [name for name in adapter_names if name not in merged_adapters]

        if adapter_names:
            warnings.warn(
                f"Already following adapters were merged {','.join(module.merged_adapters)}. "
                f"You are now additionally merging {','.join(adapter_names)}."
            )
        else:
            warnings.warn("All adapters are already merged, nothing to do.")

    return adapter_names


def clone_module(module: nn.Module, share_weights=False):
    """Clone a module in a pytorch model.

    Clones a module of a model, optionally sharing all the parameters between the original and the clone. Simplifies
    reusing a module when manipulating the architecture of a model.
    """
    clone = copy.deepcopy(module)

    def _share_weights(src: nn.Module, dst: nn.Module):
        for name, param in src.named_parameters(recurse=False):
            dst.register_parameter(name, param)

    if share_weights:
        for name, submodule in module.named_modules():
            _share_weights(submodule, clone.get_submodule(name))

    return clone


def replicate_layers(model: nn.Module, layer_map: list[tuple[int, int]]):
    """Replicate layers in a transformer model with weight sharing.

    This function looks for a module list attribute at model[(.model)*].layers and replicates the layers in the module
    list according to the layer map. For example the map `[[0, 4], [2, 5]]` will take the set of layers `[0, 1, 2, 3,
    4]` and replace them with a module list containing `[0, 1, 2, 3, 2, 3, 4]`.
    """
    while hasattr(model, "model"):
        model = model.model
    # Some variants of the bert model nest the main model under the bert attribute.
    if hasattr(model, "bert"):
        model = model.bert

    model_type = None
    layers: nn.ModuleList = None
    if hasattr(model, "layers"):
        model_type = "llama"
        layers = model.layers
    elif hasattr(model, "encoder") and hasattr(model.encoder, "layer"):
        model_type = "bert"
        layers = model.encoder.layer
    elif hasattr(model, "h"):
        model_type = "falcon"
        layers = model.h
    if not model_type or not isinstance(layers, nn.ModuleList):
        raise ValueError(
            "Could not locate the layers attribute in the model. "
            "Expected Llama, Bert or Falcon compatible architectures."
        )

    new_layers = []
    for start, end in layer_map:
        for i in range(start, end):
            current_idx = len(new_layers)
            new_layers.append(clone_module(layers[i], share_weights=True))
            # This is a hack needed to work around the layer_idx introduced in HF transformers.
            for submodule in new_layers[-1].modules():
                if hasattr(submodule, "layer_idx"):
                    submodule.layer_idx = current_idx
    layers = nn.ModuleList(new_layers)
    if model_type == "llama":
        model.layers = layers
    elif model_type == "bert":
        model.encoder.layer = layers
    elif model_type == "falcon":
        model.h = layers
    else:
        raise ValueError("Unexpected model type, need to handle post-processing of layers.")
    if hasattr(model.config, "num_hidden_layers"):  # common to Llama, Bert, Falcon
        model.config.num_hidden_layers = len(new_layers)
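

if __name__ == "__main__":
    # Illustrative smoke test for `replicate_layers`, added as part of this reconstruction (not part of the library):
    # it assumes a local transformers installation and exercises the layer map on a small, randomly initialized
    # BERT encoder, so no checkpoint download is needed.
    from transformers import BertConfig, BertModel

    bert = BertModel(BertConfig(num_hidden_layers=4))
    # [(0, 2), (1, 4)] -> layers [0, 1] followed by layers [1, 2, 3]: 5 entries total,
    # with layer 1's weights shared between two of them.
    replicate_layers(bert, [(0, 2), (1, 4)])
    assert len(bert.encoder.layer) == 5
    assert bert.config.num_hidden_layers == 5
    print(f"replicate_layers produced {len(bert.encoder.layer)} layers")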