
Parvez Shaikh

pmshaikh

AI & ML interests

Medicine, Medical, Clinical, Mathematics, Math, Coding, Python, R, reasoning, Research

Recent Activity

new activity about 2 months ago
DavidAU/Qwen3-128k-30B-A3B-NEO-MAX-Imatrix-gguf: Strong model for research in math, reasoning, and coding on medical data, or any data in general
upvoted a collection about 2 months ago
Thinking / Reasoning Models - Reg and MOEs.
replied to mlabonne's post about 2 months ago
⚔ AutoQuant

AutoQuant is the evolution of my previous AutoGGUF notebook (https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu). It lets you quantize your models in five different formats:

- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm)
- HQQ: extreme quantization with decent 2-bit and 3-bit models

Once the model is converted, it is automatically uploaded to the Hugging Face Hub. To quantize a 7B model, GGUF needs only a T4 GPU, while the other methods require an A100 GPU.

Here's an example of a model I quantized using HQQ and AutoQuant: https://huggingface.co/mlabonne/AlphaMonarch-7B-2bit-HQQ

I hope you'll enjoy it and quantize lots of models! :)

šŸ’» AutoQuant: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4

Organizations

None yet

Models (0)

None public yet

Datasets (0)

None public yet