Add AWQ Quant?
#1 opened by Foggierlucky
This looks very promising! Thanks for all of the hard work! Would it be possible to release an AWQ version so that we can try to run this on 96 GB of VRAM at home?
I will pass this feedback along to the relevant team. int4 still has some accuracy loss at the moment; FP8 has very little loss.
I appreciate your response! I am a home user and fairly novice. Will zai-org/GLM-4.5-Air-FP8 fit in 96 GB of VRAM? I tried some rough calculations (not sure how accurate) and came up with a little over 120 GB at FP8 with decent context, so I am hoping for a smaller quant without too much loss in accuracy.
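For what it's worth, here is a minimal sketch of the kind of estimate I did, assuming roughly 106B total parameters for GLM-4.5-Air and a rough, hand-picked allowance for KV cache and runtime overhead (both numbers are my assumptions, not from the model card):

```python
# Back-of-the-envelope VRAM estimate for a quantized model.
# Assumptions: ~106B total parameters, ~15 GB flat overhead for
# KV cache, activations, and runtime buffers (rough guess).

def estimate_vram_gb(total_params_b: float, bits_per_weight: float,
                     overhead_gb: float = 15.0) -> float:
    """Weights plus a flat allowance for KV cache and runtime buffers."""
    weight_gb = total_params_b * bits_per_weight / 8  # billions of params * bytes/param ~= GB
    return weight_gb + overhead_gb

for label, bits in [("FP8", 8), ("int4/AWQ", 4)]:
    print(f"{label}: ~{estimate_vram_gb(106, bits):.0f} GB")

# FP8:      ~121 GB -> over a 96 GB budget, matching the ~120 GB figure above
# int4/AWQ: ~68 GB  -> could plausibly fit in 96 GB with room for context
```

That is what makes me hope for an int4/AWQ release if the accuracy loss is acceptable.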
I see.
Thanks for responding, and it's amazing that we home users will have something this powerful. Thank you!