import os
import warnings

from torch import Tensor
from torch import nn

# xFormers can be disabled explicitly via the XFORMERS_DISABLED env var;
# otherwise try to import it and fall back gracefully if it is unavailable.
XFORMERS_ENABLED = os.environ.get("XFORMERS_DISABLED") is None
try:
    if XFORMERS_ENABLED:
        from xformers.ops import memory_efficient_attention, unbind

        XFORMERS_AVAILABLE = True
        warnings.warn("xFormers is available (Attention)")
    else:
        warnings.warn("xFormers is disabled (Attention)")
        raise ImportError
except ImportError:
    XFORMERS_AVAILABLE = False
    warnings.warn("xFormers is not available (Attention)")


class Attention(nn.Module):
    """Multi-head self-attention with a fused QKV projection."""

    def __init__(
        self,
        dim: int,
        num_heads: int = 8,
        qkv_bias: bool = False,
        proj_bias: bool = True,
        attn_drop: float = 0.0,
        proj_drop: float = 0.0,
    ) -> None:
        super().__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = head_dim**-0.5

        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        self.proj = nn.Linear(dim, dim, bias=proj_bias)
        self.proj_drop = nn.Dropout(proj_drop)

    def forward(self, x: Tensor) -> Tensor:
        B, N, C = x.shape
        # [B, N, 3C] -> [3, B, num_heads, N, head_dim]
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)

        q, k, v = qkv[0] * self.scale, qkv[1], qkv[2]
        attn = q @ k.transpose(-2, -1)

        attn = attn.softmax(dim=-1)
        attn = self.attn_drop(attn)

        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x


class MemEffAttention(Attention):
    """Self-attention that uses xFormers' memory-efficient kernel when available."""

    def forward(self, x: Tensor, attn_bias=None) -> Tensor:
        if not XFORMERS_AVAILABLE:
            if attn_bias is not None:
                raise AssertionError("xFormers is required for using nested tensors")
            return super().forward(x)

        B, N, C = x.shape
        # xFormers expects [B, N, num_heads, head_dim], so no permute here.
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)

        q, k, v = unbind(qkv, 2)

        x = memory_efficient_attention(q, k, v, attn_bias=attn_bias)
        x = x.reshape([B, N, C])

        x = self.proj(x)
        x = self.proj_drop(x)
        return x


class CrossAttention(nn.Module):
    """Multi-head cross-attention with separate Q, K and V input projections."""

    def __init__(
        self,
        dim: int,
        dim_q: int,
        dim_k: int,
        dim_v: int,
        num_heads: int = 8,
        qkv_bias: bool = False,
        proj_bias: bool = True,
        attn_drop: float = 0.0,
        proj_drop: float = 0.0,
    ) -> None:
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = head_dim**-0.5

        self.to_q = nn.Linear(dim_q, dim, bias=qkv_bias)
        self.to_k = nn.Linear(dim_k, dim, bias=qkv_bias)
        self.to_v = nn.Linear(dim_v, dim, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        self.proj = nn.Linear(dim, dim, bias=proj_bias)
        self.proj_drop = nn.Dropout(proj_drop)

    def forward(self, q: Tensor, k: Tensor, v: Tensor) -> Tensor:
        # q: [B, N, dim_q]; k: [B, M, dim_k]; v: [B, M, dim_v]
        B, N, _ = q.shape
        M = k.shape[1]

        q = self.scale * self.to_q(q).reshape(B, N, self.num_heads, self.dim // self.num_heads).permute(0, 2, 1, 3)  # [B, nh, N, C/nh]
        k = self.to_k(k).reshape(B, M, self.num_heads, self.dim // self.num_heads).permute(0, 2, 1, 3)  # [B, nh, M, C/nh]
        v = self.to_v(v).reshape(B, M, self.num_heads, self.dim // self.num_heads).permute(0, 2, 1, 3)  # [B, nh, M, C/nh]

        attn = q @ k.transpose(-2, -1)  # [B, nh, N, M]

        attn = attn.softmax(dim=-1)
        attn = self.attn_drop(attn)

        x = (attn @ v).transpose(1, 2).reshape(B, N, -1)  # [B, N, C]
        x = self.proj(x)
        x = self.proj_drop(x)
        return x


class MemEffCrossAttention(CrossAttention):
    """Cross-attention that uses xFormers' memory-efficient kernel when available."""

    def forward(self, q: Tensor, k: Tensor, v: Tensor, attn_bias=None) -> Tensor:
        if not XFORMERS_AVAILABLE:
            if attn_bias is not None:
                raise AssertionError("xFormers is required for using nested tensors")
            return super().forward(q, k, v)

        B, N, _ = q.shape
        M = k.shape[1]

        # xFormers expects [B, seq, num_heads, head_dim]; note that
        # memory_efficient_attention applies 1/sqrt(head_dim) scaling
        # internally, so q must not be pre-scaled here.
        q = self.to_q(q).reshape(B, N, self.num_heads, self.dim // self.num_heads)  # [B, N, nh, C/nh]
        k = self.to_k(k).reshape(B, M, self.num_heads, self.dim // self.num_heads)  # [B, M, nh, C/nh]
        v = self.to_v(v).reshape(B, M, self.num_heads, self.dim // self.num_heads)  # [B, M, nh, C/nh]

        x = memory_efficient_attention(q, k, v, attn_bias=attn_bias)
        x = x.reshape(B, N, -1)

        x = self.proj(x)
        x = self.proj_drop(x)
        return x
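

# ---------------------------------------------------------------------------
# Usage sketch (editor's addition, not part of the original module): a minimal
# shape check for the plain attention blocks. The sizes below are illustrative
# assumptions only. The MemEff* variants run the same computation through
# xFormers' kernel when it is installed, and fall back to these forwards
# otherwise.
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    import torch

    B, N, M, C = 2, 16, 32, 64  # batch, query tokens, key/value tokens, width

    self_attn = Attention(dim=C, num_heads=8)
    x = torch.randn(B, N, C)
    assert self_attn(x).shape == (B, N, C)

    cross_attn = CrossAttention(dim=C, dim_q=C, dim_k=48, dim_v=48, num_heads=8)
    q = torch.randn(B, N, C)
    kv = torch.randn(B, M, 48)
    assert cross_attn(q, kv, kv).shape == (B, N, C)

    print("attention shape checks passed")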