Is MXFP4_MOE more efficient than Q4_K_M? Which one should perform better?
#3, opened by nmkd
Q4_K_M is a bit larger, but I was wondering whether MXFP4 has an efficiency advantage that compensates for that.
Which one would you recommend if VRAM is not an issue for either of them?
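For context on the size difference, here is a rough sketch of the bits-per-weight arithmetic. The block layouts are assumptions based on my understanding of the ggml formats (MXFP4: 32 weights in 17 bytes, i.e. 16 bytes of 4-bit values plus one shared scale byte; Q4_K: 256 weights in 144 bytes), and `bits_per_weight` is just an illustrative helper. Real files mix tensor types, so the file-level average will differ from these per-block numbers.

```python
def bits_per_weight(block_bytes: int, weights_per_block: int) -> float:
    """Average storage cost per weight, in bits, for one quantization block."""
    return block_bytes * 8 / weights_per_block


# Assumed block layouts (not taken from this thread):
#   MXFP4: 32 weights -> 16 bytes of 4-bit values + 1 shared scale byte
#   Q4_K:  256 weights -> 144 bytes including scales and mins
mxfp4_bpw = bits_per_weight(block_bytes=17, weights_per_block=32)
q4_k_bpw = bits_per_weight(block_bytes=144, weights_per_block=256)

print(f"MXFP4: {mxfp4_bpw} bits/weight")
print(f"Q4_K:  {q4_k_bpw} bits/weight")
```

Q4_K_M also keeps some tensors at a higher-precision type, which would push its average above the plain Q4_K figure and is consistent with it being the larger file.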
When I measured the perplexity of each, Q4_K_M's was lower. I haven't tested it extensively, though; both seemed to produce decent output.
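For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of what perplexity means: the exponential of the mean negative log-likelihood per token, where lower is better. The `perplexity` helper and the log-probability lists are hypothetical placeholders, not the measurements mentioned above; in practice the per-token log-probabilities would come from running each quantized model over the same text.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(mean negative natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token log-probabilities for the same text under two quants.
logprobs_quant_a = [-1.0, -2.0, -1.5]
logprobs_quant_b = [-1.1, -2.2, -1.6]

ppl_a = perplexity(logprobs_quant_a)
ppl_b = perplexity(logprobs_quant_b)

# The quant with the lower perplexity predicts the text better.
print("quant A" if ppl_a < ppl_b else "quant B", "has lower perplexity")
```

Perplexity differences between two 4-bit quants are often small, so comparing on the same text with the same context length matters more than the absolute numbers.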