Not supported in Turing GPU architecture?

#3
opened by Durgaram

Turing-architecture GPUs lack native bfloat16 (BF16) support, and their FP16 capabilities are limited, particularly for efficient half-precision model loading and inference, which is much better optimized on Ampere and later architectures. So these weights can only be used as-is on Ampere or newer GPUs.
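
If you still want to try loading on Turing, a common workaround is to detect BF16 support at runtime and fall back to FP16. Below is a minimal sketch assuming a PyTorch/Transformers setup; the repo id "model-repo/model-name" is a placeholder, and `device_map="auto"` assumes the `accelerate` package is installed. Note that FP16 has a narrower exponent range than BF16, so a checkpoint trained in BF16 may overflow or produce NaNs when cast down.

```python
import torch
from transformers import AutoModelForCausalLM

# BF16 is natively supported on Ampere (compute capability >= 8.0)
# and newer; Turing (7.5) returns False here.
if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
    dtype = torch.bfloat16
else:
    # Fall back to FP16 on Turing; watch for overflow/NaN issues,
    # since FP16 has a smaller dynamic range than BF16.
    dtype = torch.float16

# "model-repo/model-name" is a hypothetical placeholder checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "model-repo/model-name",
    torch_dtype=dtype,
    device_map="auto",  # requires the accelerate package
)
```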
