Is MXFP4_MOE more efficient than Q4_K_M? Which one should perform better?

by nmkd - opened Aug 7

nmkd

Aug 7

Q4_K_M is a bit larger, but I was wondering whether MXFP4 has an efficiency advantage that compensates for that.

Which one would you recommend if VRAM is not an issue for either of them?

Owner Aug 7

When I measured the perplexity of each, Q4_K_M's was lower (lower is better). I haven't tested it extensively, though; both seemed to produce decent output.
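For anyone who wants to run the same kind of comparison, here is a minimal sketch of how perplexity is computed from per-token log-probabilities. It is not the exact procedure used above: the function name and the sample numbers are made up for illustration, and in practice the log-probabilities would come from running each quantized model over the same evaluation text.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean(natural-log probability per token))."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical log-probabilities for the same text under two quantizations.
# These values are invented for illustration, not measured results.
logprobs_quant_a = [-1.0, -2.0, -3.0]
logprobs_quant_b = [-1.5, -2.5, -3.5]

ppl_a = perplexity(logprobs_quant_a)
ppl_b = perplexity(logprobs_quant_b)

# The quantization with the lower perplexity predicts the text better.
print(f"A: {ppl_a:.3f}  B: {ppl_b:.3f}")
```

The comparison only means something if both models are scored on the same text with the same context settings.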
