""" Bring-Your-Own-Blocks Network

A flexible network w/ dataclass based config for stacking those NN blocks.

This model is currently used to implement the following networks:

GPU Efficient (ResNets) - gernet_l/m/s (original versions called genet, but this was already used (by SENet author)).
Paper: `Neural Architecture Design for GPU-Efficient Networks` - https://arxiv.org/abs/2006.14090
Code and weights: https://github.com/idstcv/GPU-Efficient-Networks, licensed Apache 2.0

RepVGG - repvgg_*
Paper: `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
Code and weights: https://github.com/DingXiaoH/RepVGG, licensed MIT

In all cases the models have been modified to fit within the design of ByobNet. I've remapped
the original weights and verified accuracies.

For GPU Efficient nets, I used the original names for the blocks since they were, for the most part,
the same as the original residual blocks in ResNe(X)t, DarkNet, and other existing models. Note also that
some changes introduced in RegNet are present in the stem and bottleneck blocks of this model.

A significant number of different network archs can be implemented here, including variants of the
above nets that include attention.

Hacked together by / copyright Ross Wightman, 2021.
"""
import math
from dataclasses import dataclass, field, replace
from typing import Tuple, List, Dict, Optional, Union, Any, Callable, Sequence
from functools import partial

import torch
import torch.nn as nn

from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
from .helpers import build_model_with_cfg
from .layers import ClassifierHead, ConvBnAct, BatchNormAct2d, DropPath, AvgPool2dSame, \
    create_conv2d, get_act_layer, convert_norm_act, get_attn, make_divisible, to_2tuple
from .registry import register_model

__all__ = ['ByobNet', 'ByoModelCfg', 'ByoBlockCfg', 'create_byob_stem', 'create_block']


def _cfg(url='', **kwargs):
    return {
        'url': url, 'num_classes': 1000, 'input_size': (3, 224, 224), 'pool_size': (7, 7),
        'crop_pct': 0.875, 'interpolation': 'bilinear',
        'mean': IMAGENET_DEFAULT_MEAN, 'std': IMAGENET_DEFAULT_STD,
        'first_conv': 'stem.conv', 'classifier': 'head.fc',
        **kwargs
    }


default_cfgs = {
    # GPU-Efficient (ResNet) weights
    'gernet_s': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-ger-weights/gernet_s-756b4751.pth'),
    'gernet_m': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-ger-weights/gernet_m-0873c53a.pth'),
    'gernet_l': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-ger-weights/gernet_l-f31e2e8d.pth',
        input_size=(3, 256, 256), pool_size=(8, 8)),

    # RepVGG weights
    'repvgg_a2': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_a2-c1ee6d2b.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b0': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b0-80ac3f1b.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b1': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b1-77ca2989.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b1g4': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b1g4-abde5d92.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b2': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b2-25b7494e.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b2g4': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b2g4-165a85f2.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b3': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b3-199bc50d.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),
    'repvgg_b3g4': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-repvgg-weights/repvgg_b3g4-73c370bf.pth',
        first_conv=('stem.conv_kxk.conv', 'stem.conv_1x1.conv')),

    # experimental configs
    'resnet51q': _cfg(
        url='https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/resnet51q_ra2-d47dcc76.pth',
        first_conv='stem.conv1', input_size=(3, 256, 256), pool_size=(8, 8),
        test_input_size=(3, 288, 288), crop_pct=1.0),
    'resnet61q': _cfg(
        first_conv='stem.conv1.conv', input_size=(3, 256, 256), pool_size=(8, 8), interpolation='bicubic'),
    'geresnet50t': _cfg(
        first_conv='stem.conv1.conv', input_size=(3, 256, 256), pool_size=(8, 8), interpolation='bicubic'),
    'gcresnet50t': _cfg(
        first_conv='stem.conv1.conv', input_size=(3, 256, 256), pool_size=(8, 8), interpolation='bicubic'),
    'gcresnext26ts': _cfg(
        first_conv='stem.conv1.conv', input_size=(3, 256, 256), pool_size=(8, 8), interpolation='bicubic'),
    'bat_resnext26ts': _cfg(
        first_conv='stem.conv1.conv', input_size=(3, 256, 256), pool_size=(8, 8), interpolation='bicubic',
        min_input_size=(3, 256, 256)),
}

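# A minimal sketch of how the `_cfg` helper above behaves: per-model overrides are merged over the
# shared defaults, so each entry in `default_cfgs` only spells out what differs. The values used
# here are illustrative, not one of the registered entries.
def _example_cfg_merge():
    example = _cfg(url='', input_size=(3, 256, 256), pool_size=(8, 8))
    assert example['interpolation'] == 'bilinear'   # inherited default
    assert example['input_size'] == (3, 256, 256)   # per-model override
    return example
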
@dataclass
class ByoBlockCfg:
    type: Union[str, nn.Module]
    d: int  # block depth (number of block repeats in stage)
    c: int  # number of output channels for each block in stage
    s: int = 2  # stride of stage (first block)
    gs: Optional[Union[int, Callable]] = None  # group-size of blocks in stage, conv is depthwise if gs == 1
    br: float = 1.  # bottleneck-ratio of blocks in stage

    # NOTE: these config items override the model cfgs that are applied to all blocks by default
    attn_layer: Optional[str] = None
    attn_kwargs: Optional[Dict[str, Any]] = None
    self_attn_layer: Optional[str] = None
    self_attn_kwargs: Optional[Dict[str, Any]] = None
    block_kwargs: Optional[Dict[str, Any]] = None


@dataclass
class ByoModelCfg:
    blocks: Tuple[Union[ByoBlockCfg, Tuple[ByoBlockCfg, ...]], ...]
    downsample: str = 'conv1x1'
    stem_type: str = '3x3'
    stem_pool: Optional[str] = 'maxpool'
    stem_chs: int = 32
    width_factor: float = 1.0
    num_features: int = 0  # num out_channels for final conv, no final conv if 0
    zero_init_last_bn: bool = True
    fixed_input_size: bool = False  # model constrained to a fixed input size, img_size must be provided on creation

    act_layer: str = 'relu'
    norm_layer: str = 'batchnorm'

    # NOTE: these config items will be overridden by the block cfg (per-block) if they are set there
    attn_layer: Optional[str] = None
    attn_kwargs: dict = field(default_factory=lambda: dict())
    self_attn_layer: Optional[str] = None
    self_attn_kwargs: dict = field(default_factory=lambda: dict())
    block_kwargs: Dict[str, Any] = field(default_factory=lambda: dict())


def _rep_vgg_bcfg(d=(4, 6, 16, 1), wf=(1., 1., 1., 1.), groups=0):
    c = (64, 128, 256, 512)
    group_size = 0
    if groups > 0:
        group_size = lambda chs, idx: chs // groups if (idx + 1) % 2 == 0 else 0
    bcfg = tuple([ByoBlockCfg(type='rep', d=d, c=c * wf, gs=group_size) for d, c, wf in zip(d, c, wf)])
    return bcfg


def interleave_blocks(
        types: Tuple[str, str], every: Union[int, List[int]], d, first: bool = False, **kwargs) -> Tuple[ByoBlockCfg]:
    """ interleave 2 block types in stack
    """
    assert len(types) == 2
    if isinstance(every, int):
        # insert the second block type every `every` blocks
        every = list(range(0 if first else every, d, every))
        if not every:
            every = [d - 1]
    every = set(every)
    blocks = []
    for i in range(d):
        block_type = types[1] if i in every else types[0]
        blocks += [ByoBlockCfg(type=block_type, d=1, **kwargs)]
    return tuple(blocks)


# Per-variant architecture definitions. Each entry is a ByoModelCfg whose `blocks` tuple stacks
# per-stage ByoBlockCfg configs: the gernet variants mix 'basic' and 'bottle' stages, the repvgg
# variants stack 'rep' blocks built via _rep_vgg_bcfg on a 'rep' stem, and the experimental
# resnet51q/61q and ge/gc/bat variants combine bottleneck/edge/attention blocks with deep,
# tiered or quad stems.
model_cfgs = dict(
    gernet_l=...,
    gernet_m=...,
    gernet_s=...,
    repvgg_a2=...,
    repvgg_b0=...,
    repvgg_b1=...,
    repvgg_b1g4=...,
    repvgg_b2=...,
    repvgg_b2g4=...,
    repvgg_b3=...,
    repvgg_b3g4=...,
    resnet51q=...,
    resnet61q=...,
    geresnet50t=...,
    gcresnet50t=...,
    gcresnext26ts=...,
    bat_resnext26ts=...,
)


@register_model
def gernet_l(pretrained=False, **kwargs):
    """ GEResNet-Large (GENet-Large from official impl)
    `Neural Architecture Design for GPU-Efficient Networks` - https://arxiv.org/abs/2006.14090
    """
    return _create_byobnet('gernet_l', pretrained=pretrained, **kwargs)

@register_model
def gernet_m(pretrained=False, **kwargs):
    """ GEResNet-Medium (GENet-Normal from official impl)
    `Neural Architecture Design for GPU-Efficient Networks` - https://arxiv.org/abs/2006.14090
    """
    return _create_byobnet('gernet_m', pretrained=pretrained, **kwargs)

@register_model
def gernet_s(pretrained=False, **kwargs):
    """ GEResNet-Small (GENet-Small from official impl)
    `Neural Architecture Design for GPU-Efficient Networks` - https://arxiv.org/abs/2006.14090
    """
    return _create_byobnet('gernet_s', pretrained=pretrained, **kwargs)

@register_model
def repvgg_a2(pretrained=False, **kwargs):
    """ RepVGG-A2
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_a2', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b0(pretrained=False, **kwargs):
    """ RepVGG-B0
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b0', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b1(pretrained=False, **kwargs):
    """ RepVGG-B1
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b1', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b1g4(pretrained=False, **kwargs):
    """ RepVGG-B1g4
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b1g4', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b2(pretrained=False, **kwargs):
    """ RepVGG-B2
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b2', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b2g4(pretrained=False, **kwargs):
    """ RepVGG-B2g4
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b2g4', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b3(pretrained=False, **kwargs):
    """ RepVGG-B3
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b3', pretrained=pretrained, **kwargs)

@register_model
def repvgg_b3g4(pretrained=False, **kwargs):
    """ RepVGG-B3g4
    `Making VGG-style ConvNets Great Again` - https://arxiv.org/abs/2101.03697
    """
    return _create_byobnet('repvgg_b3g4', pretrained=pretrained, **kwargs)

@register_model
def resnet51q(pretrained=False, **kwargs):
    return _create_byobnet('resnet51q', pretrained=pretrained, **kwargs)

@register_model
def resnet61q(pretrained=False, **kwargs):
    return _create_byobnet('resnet61q', pretrained=pretrained, **kwargs)

@register_model
def geresnet50t(pretrained=False, **kwargs):
    return _create_byobnet('geresnet50t', pretrained=pretrained, **kwargs)

@register_model
def gcresnet50t(pretrained=False, **kwargs):
    return _create_byobnet('gcresnet50t', pretrained=pretrained, **kwargs)

@register_model
def gcresnext26ts(pretrained=False, **kwargs):
    return _create_byobnet('gcresnext26ts', pretrained=pretrained, **kwargs)

@register_model
def bat_resnext26ts(pretrained=False, **kwargs):
    return _create_byobnet('bat_resnext26ts', pretrained=pretrained, **kwargs)


def expand_blocks_cfg(stage_blocks_cfg: Union[ByoBlockCfg, Sequence[ByoBlockCfg]]) -> List[ByoBlockCfg]:
    if not isinstance(stage_blocks_cfg, Sequence):
        stage_blocks_cfg = (stage_blocks_cfg,)
    block_cfgs = []
    for i, cfg in enumerate(stage_blocks_cfg):
        block_cfgs += [replace(cfg, d=1) for _ in range(cfg.d)]
    return block_cfgs


def num_groups(group_size, channels):
    if not group_size:  # 0 or None
        return 1  # normal conv with 1 group
    else:
        # NOTE group_size == 1 -> depthwise conv
        assert channels % group_size == 0
        return channels // group_size


@dataclass
class LayerFn:
    conv_norm_act: Callable = ConvBnAct
    norm_act: Callable = BatchNormAct2d
    act: Callable = nn.ReLU
    attn: Optional[Callable] = None
    self_attn: Optional[Callable] = None

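# A minimal sketch of how the config dataclasses above compose. The stage layout is illustrative
# only (hypothetical depths/widths, not one of the registered variants): stage one is two 'basic'
# blocks, stage two interleaves one 'bottle' block into a run of 'dark' blocks via interleave_blocks.
def _example_custom_cfg() -> ByoModelCfg:
    return ByoModelCfg(
        blocks=(
            ByoBlockCfg(type='basic', d=2, c=64, s=2, gs=0, br=1.0),
            interleave_blocks(types=('dark', 'bottle'), every=2, d=4, c=128, s=2, gs=16, br=0.5),
        ),
        stem_chs=32,
        stem_type='3x3',
        stem_pool='maxpool',
        act_layer='relu',
    )
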
class DownsampleAvg(nn.Module):
    """ AvgPool Downsampling as in 'D' ResNet variants."""

    def __init__(self, in_chs, out_chs, stride=1, dilation=1, apply_act=False, layers: LayerFn = None):
        super(DownsampleAvg, self).__init__()
        layers = layers or LayerFn()
        avg_stride = stride if dilation == 1 else 1
        if stride > 1 or dilation > 1:
            avg_pool_fn = AvgPool2dSame if avg_stride == 1 and dilation > 1 else nn.AvgPool2d
            self.pool = avg_pool_fn(2, avg_stride, ceil_mode=True, count_include_pad=False)
        else:
            self.pool = nn.Identity()
        self.conv = layers.conv_norm_act(in_chs, out_chs, 1, apply_act=apply_act)

    def forward(self, x):
        return self.conv(self.pool(x))


def create_downsample(downsample_type, layers: LayerFn, **kwargs):
    if downsample_type == 'avg':
        return DownsampleAvg(**kwargs)
    else:
        return layers.conv_norm_act(kwargs.pop('in_chs'), kwargs.pop('out_chs'), kernel_size=1, **kwargs)


class BasicBlock(nn.Module):
    """ ResNet Basic Block - kxk + kxk
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), group_size=None, bottle_ratio=1.0,
            downsample='avg', attn_last=True, linear_out=False, layers: LayerFn = None, drop_block=None,
            drop_path_rate=0.):
        super(BasicBlock, self).__init__()
        # Builds: shortcut (identity or avg/1x1 downsample), conv1_kxk, grouped conv2_kxk (no act),
        # optional attn / attn_last modules, DropPath, and the final activation.
        ...

    def forward(self, x):
        shortcut = self.shortcut(x)
        x = self.conv1_kxk(x)
        x = self.conv2_kxk(x)
        x = self.attn(x)
        x = self.drop_path(x)
        x = self.act(x + shortcut)
        return x


class BottleneckBlock(nn.Module):
    """ ResNet-like Bottleneck Block - 1x1 - kxk - 1x1
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), bottle_ratio=1., group_size=None,
            downsample='avg', attn_last=False, linear_out=False, extra_conv=False, layers: LayerFn = None,
            drop_block=None, drop_path_rate=0.):
        super(BottleneckBlock, self).__init__()
        # Builds: shortcut, conv1_1x1, grouped conv2_kxk (+ optional conv2b_kxk when extra_conv),
        # attn, conv3_1x1 (no act), attn_last, DropPath, and the final activation.
        ...

    def forward(self, x):
        shortcut = self.shortcut(x)
        x = self.conv1_1x1(x)
        x = self.conv2_kxk(x)
        x = self.conv2b_kxk(x)
        x = self.attn(x)
        x = self.conv3_1x1(x)
        x = self.attn_last(x)
        x = self.drop_path(x)
        x = self.act(x + shortcut)
        return x


class DarkBlock(nn.Module):
    """ DarkNet-like (1x1 + 3x3 w/ stride) block

    The GE-Net impl included a 1x1 + 3x3 block in their search space. It was not used in the feature models.
    This block is pretty much a DarkNet block (also DenseNet) hence the name. Neither DarkNet nor DenseNet
    uses strides within the block (external 3x3 or maxpool downsampling is done in front of the block repeats).

    If one does want to use a lot of these blocks w/ stride, I'd recommend using the EdgeBlock (3x3 /w stride + 1x1)
    for more optimal compute.
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), bottle_ratio=1.0, group_size=None,
            downsample='avg', attn_last=True, linear_out=False, layers: LayerFn = None, drop_block=None,
            drop_path_rate=0.):
        super(DarkBlock, self).__init__()
        # Builds: shortcut, conv1_1x1, attn, grouped conv2_kxk w/ stride (no act), attn_last,
        # DropPath, and the final activation.
        ...

    def forward(self, x):
        shortcut = self.shortcut(x)
        x = self.conv1_1x1(x)
        x = self.attn(x)
        x = self.conv2_kxk(x)
        x = self.attn_last(x)
        x = self.drop_path(x)
        x = self.act(x + shortcut)
        return x


class EdgeBlock(nn.Module):
    """ EdgeResidual-like (3x3 + 1x1) block

    A two layer block like DarkBlock, but with the order of the 3x3 and 1x1 convs reversed.
    Very similar to the EfficientNet Edge-Residual block, but this block ends with activations, is
    intended to be used with either expansion or bottleneck contraction, and can use DW/group/non-grouped convs.

    FIXME is there a more common 3x3 + 1x1 conv block to name this after?
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), bottle_ratio=1.0, group_size=None,
            downsample='avg', attn_last=False, linear_out=False, layers: LayerFn = None, drop_block=None,
            drop_path_rate=0.):
        super(EdgeBlock, self).__init__()
        # Builds: shortcut, grouped conv1_kxk w/ stride, attn, conv2_1x1 (no act), attn_last,
        # DropPath, and the final activation.
        ...

    def forward(self, x):
        shortcut = self.shortcut(x)
        x = self.conv1_kxk(x)
        x = self.attn(x)
        x = self.conv2_1x1(x)
        x = self.attn_last(x)
        x = self.drop_path(x)
        x = self.act(x + shortcut)
        return x

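# A minimal sketch of how LayerFn wires concrete layer implementations into the blocks above:
# one standalone BottleneckBlock with an SE attention module injected via `layers.attn`.
# The call assumes the constructor signature shown above; shapes in the comments are illustrative.
def _example_standalone_block():
    layers = LayerFn(attn=get_attn('se'))  # 'se' resolves to timm's squeeze-and-excite module
    block = BottleneckBlock(in_chs=64, out_chs=128, stride=2, bottle_ratio=0.25, layers=layers)
    return block(torch.randn(2, 64, 56, 56))  # expected output shape: (2, 128, 28, 28)
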
class RepVggBlock(nn.Module):
    """ RepVGG Block.

    Adapted from impl at https://github.com/DingXiaoH/RepVGG

    This version does not currently support the deploy optimization. It is currently fixed in 'train' mode.
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), bottle_ratio=1.0, group_size=None,
            downsample='', layers: LayerFn = None, drop_block=None, drop_path_rate=0.):
        super(RepVggBlock, self).__init__()
        # Builds: optional identity branch (norm only, when shapes allow), grouped conv_kxk branch,
        # conv_1x1 branch, optional attn, DropPath (only used with the identity branch), and the final act.
        ...

    def init_weights(self, zero_init_last_bn: bool = False):
        # NOTE this init overrides the base model init with specific changes for the RepVGG blocks
        for m in self.modules():
            if isinstance(m, nn.BatchNorm2d):
                nn.init.normal_(m.weight, .1, .1)
                nn.init.normal_(m.bias, 0, .1)
        if hasattr(self.attn, 'reset_parameters'):
            self.attn.reset_parameters()

    def forward(self, x):
        if self.identity is None:
            x = self.conv_1x1(x) + self.conv_kxk(x)
        else:
            identity = self.identity(x)
            x = self.conv_1x1(x) + self.conv_kxk(x)
            x = self.drop_path(x)
            x = x + identity
        x = self.attn(x)
        x = self.act(x)
        return x


class SelfAttnBlock(nn.Module):
    """ ResNet-like Bottleneck Block - 1x1 - optional kxk - self attn - 1x1
    """

    def __init__(
            self, in_chs, out_chs, kernel_size=3, stride=1, dilation=(1, 1), bottle_ratio=1., group_size=None,
            downsample='avg', extra_conv=False, linear_out=False, post_attn_na=True, feat_size=None,
            layers: LayerFn = None, drop_block=None, drop_path_rate=0.):
        super(SelfAttnBlock, self).__init__()
        assert layers is not None
        # Builds: shortcut, conv1_1x1, optional grouped conv2_kxk (extra_conv), self_attn (given
        # feat_size when required), post_attn norm-act, conv3_1x1 (no act), DropPath, and the final act.
        ...

    def forward(self, x):
        shortcut = self.shortcut(x)
        x = self.conv1_1x1(x)
        x = self.conv2_kxk(x)
        x = self.self_attn(x)
        x = self.post_attn(x)
        x = self.conv3_1x1(x)
        x = self.drop_path(x)
        x = self.act(x + shortcut)
        return x


_block_registry = dict(
    basic=BasicBlock,
    bottle=BottleneckBlock,
    dark=DarkBlock,
    edge=EdgeBlock,
    rep=RepVggBlock,
    self_attn=SelfAttnBlock,
)


def register_block(block_type: str, block_fn: nn.Module):
    _block_registry[block_type] = block_fn


def create_block(block: Union[str, nn.Module], **kwargs):
    if isinstance(block, (nn.Module, partial)):
        return block(**kwargs)
    assert block in _block_registry, f'Unknown block type ({block})'
    return _block_registry[block](**kwargs)


class Stem(nn.Sequential):

    def __init__(self, in_chs, out_chs, kernel_size=3, stride=4, pool='maxpool',
                 num_rep=3, num_act=None, chs_decay=0.5, layers: LayerFn = None):
        super().__init__()
        assert stride in (2, 4)
        layers = layers or LayerFn()
        # Builds a stack of `num_rep` conv-norm-act layers whose widths decay toward out_chs by
        # `chs_decay`, placing strides (and an optional stride-2 max-pool) so the overall stem stride
        # equals `stride`; per-layer metadata is collected in self.feature_info.
        ...


def create_byob_stem(in_chs, out_chs, stem_type='', pool_type='', feat_prefix='stem', layers: LayerFn = None):
    layers = layers or LayerFn()
    assert stem_type in ('', 'quad', 'quad2', 'tiered', 'deep', 'rep', '7x7', '3x3')
    # Selects between a plain 3x3 or 7x7 conv stem, a deep/tiered/quad multi-conv Stem, or a
    # RepVggBlock stem ('rep'), and returns the stem module together with its feature_info metadata
    # (module names prefixed with `feat_prefix`).
    ...


def reduce_feat_size(feat_size, stride=2):
    return None if feat_size is None else tuple([s // stride for s in feat_size])

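# A minimal sketch of the block registry extension point above: a small custom block (hypothetical,
# not part of the built-in set) registered under a new type name so that a ByoBlockCfg can reference
# it by string exactly like 'basic', 'bottle', 'rep', etc.
class _ExampleConvBlock(nn.Module):
    def __init__(self, in_chs, out_chs, stride=1, layers: LayerFn = None, **_):
        super().__init__()
        layers = layers or LayerFn()
        # two conv-bn-act layers; extra builder kwargs (dilation, bottle_ratio, ...) are ignored via **_
        self.conv1 = layers.conv_norm_act(in_chs, out_chs, 3, stride=stride)
        self.conv2 = layers.conv_norm_act(out_chs, out_chs, 3)

    def forward(self, x):
        return self.conv2(self.conv1(x))


def _example_register_custom_block() -> ByoBlockCfg:
    register_block('example_conv', _ExampleConvBlock)
    # the new type name can now be used anywhere a built-in block type string is accepted
    return ByoBlockCfg(type='example_conv', d=2, c=64, s=2)
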
def override_kwargs(block_kwargs, model_kwargs):
    """ Override model level attn/self-attn/block kwargs w/ block level

    NOTE: kwargs are NOT merged across levels, block_kwargs will fully replace model_kwargs
    for the block if set to anything that isn't None, i.e. an empty block_kwargs dict will remove
    kwargs set at model level for that block.
    """
    out_kwargs = block_kwargs if block_kwargs is not None else model_kwargs
    return out_kwargs or {}  # make sure None isn't returned


def update_block_kwargs(block_kwargs: Dict[str, Any], block_cfg: ByoBlockCfg, model_cfg: ByoModelCfg):
    layer_fns = block_kwargs['layers']

    # override attn layer / args with block local config
    if block_cfg.attn_kwargs is not None or block_cfg.attn_layer is not None:
        if not block_cfg.attn_layer:
            # an empty string for attn_layer type will disable attn for this block
            attn_layer = None
        else:
            attn_kwargs = override_kwargs(block_cfg.attn_kwargs, model_cfg.attn_kwargs)
            attn_layer = block_cfg.attn_layer or model_cfg.attn_layer
            attn_layer = partial(get_attn(attn_layer), **attn_kwargs) if attn_layer is not None else None
        layer_fns = replace(layer_fns, attn=attn_layer)

    # override self-attn layer / args with block local config
    if block_cfg.self_attn_kwargs is not None or block_cfg.self_attn_layer is not None:
        if not block_cfg.self_attn_layer:
            # an empty string for self_attn_layer type will disable self-attn for this block
            self_attn_layer = None
        else:
            self_attn_kwargs = override_kwargs(block_cfg.self_attn_kwargs, model_cfg.self_attn_kwargs)
            self_attn_layer = block_cfg.self_attn_layer or model_cfg.self_attn_layer
            self_attn_layer = partial(get_attn(self_attn_layer), **self_attn_kwargs) \
                if self_attn_layer is not None else None
        layer_fns = replace(layer_fns, self_attn=self_attn_layer)

    block_kwargs['layers'] = layer_fns

    # add additional block_kwargs specified in block_cfg or model_cfg, precedence to block if set
    block_kwargs.update(override_kwargs(block_cfg.block_kwargs, model_cfg.block_kwargs))


def create_byob_stages(
        cfg: ByoModelCfg, drop_path_rate: float, output_stride, stem_feat: Dict[str, Any],
        feat_size=None, layers: Optional[LayerFn] = None, block_kwargs_fn: Callable = update_block_kwargs):
    layers = layers or LayerFn()
    # Expands each stage's block cfgs (expand_blocks_cfg), computes per-block channels / strides /
    # dilations against `output_stride`, spreads stochastic depth rates (torch.linspace over total
    # depth), creates blocks via create_block + block_kwargs_fn, and stacks them into nn.Sequential
    # stages while collecting feature_info; returns (stages, feature_info).
    ...


def get_layer_fns(cfg: ByoModelCfg):
    act = get_act_layer(cfg.act_layer)
    norm_act = convert_norm_act(norm_layer=cfg.norm_layer, act_layer=act)
    conv_norm_act = partial(ConvBnAct, norm_layer=cfg.norm_layer, act_layer=act)
    attn = partial(get_attn(cfg.attn_layer), **cfg.attn_kwargs) if cfg.attn_layer else None
    self_attn = partial(get_attn(cfg.self_attn_layer), **cfg.self_attn_kwargs) if cfg.self_attn_layer else None
    layer_fn = LayerFn(conv_norm_act=conv_norm_act, norm_act=norm_act, act=act, attn=attn, self_attn=self_attn)
    return layer_fn


class ByobNet(nn.Module):
    """ 'Bring-your-own-blocks' Net

    A flexible network backbone that allows building model stem + blocks via
    dataclass cfg definition w/ factory functions for module instantiation.

    Current assumption is that both stem and blocks are in conv-bn-act order (w/ block ending in act).
    """

    def __init__(self, cfg: ByoModelCfg, num_classes=1000, in_chans=3, global_pool='avg', output_stride=32,
                 zero_init_last_bn=True, img_size=None, drop_rate=0., drop_path_rate=0.):
        super().__init__()
        self.num_classes = num_classes
        self.drop_rate = drop_rate
        layers = get_layer_fns(cfg)
        if cfg.fixed_input_size:
            assert img_size is not None, 'img_size argument is required for fixed input size model'
        # Builds: stem via create_byob_stem, stages via create_byob_stages, an optional final 1x1 conv
        # (cfg.num_features), and a ClassifierHead; applies _init_weights and per-block init_weights,
        # collecting self.feature_info along the way.
        ...

    def get_classifier(self):
        return self.head.fc

    def reset_classifier(self, num_classes, global_pool='avg'):
        self.head = ClassifierHead(self.num_features, num_classes, pool_type=global_pool, drop_rate=self.drop_rate)

    def forward_features(self, x):
        x = self.stem(x)
        x = self.stages(x)
        x = self.final_conv(x)
        return x

    def forward(self, x):
        x = self.forward_features(x)
        x = self.head(x)
        return x


def _init_weights(m, n=''):
    if isinstance(m, nn.Conv2d):
        fan_out = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        fan_out //= m.groups
        m.weight.data.normal_(0, math.sqrt(2.0 / fan_out))
        if m.bias is not None:
            m.bias.data.zero_()
    elif isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)


def _create_byobnet(variant, pretrained=False, **kwargs):
    return build_model_with_cfg(
        ByobNet, variant, pretrained,
        default_cfg=default_cfgs[variant],
        model_cfg=model_cfgs[variant],
        feature_cfg=dict(flatten_sequential=True),
        **kwargs)
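
# A minimal end-to-end sketch, assuming the full per-variant model_cfgs definitions: build one of
# the registered variants through its factory function and run a dummy forward pass.
# `pretrained=False` avoids downloading the checkpoint listed in default_cfgs.
def _example_forward_pass():
    model = gernet_s(pretrained=False, num_classes=10)
    logits = model(torch.randn(1, 3, 224, 224))
    return logits.shape  # expected: torch.Size([1, 10])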