
Parvez Shaikh

pmshaikh

AI & ML interests

Medicine, Medical, Clinical, Mathematics, Math, Coding, Python, R, reasoning, Research

Recent Activity

new activity about 2 months ago
DavidAU/Qwen3-128k-30B-A3B-NEO-MAX-Imatrix-gguf: Strong model for research in math, reasoning, and coding on medical data, or any data in general
upvoted a collection about 2 months ago
Thinking / Reasoning Models - Reg and MOEs.
replied to mlabonne's post about 2 months ago
⚔ AutoQuant

AutoQuant is the evolution of my previous AutoGGUF notebook (https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu). It lets you quantize your models in five different formats:

- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm)
- HQQ: extreme quantization with decent 2-bit and 3-bit models

Once the model is converted, it is automatically uploaded to the Hugging Face Hub. To quantize a 7B model, GGUF needs only a T4 GPU, while the other methods require an A100 GPU.

Here's an example of a model I quantized using HQQ and AutoQuant: https://huggingface.co/mlabonne/AlphaMonarch-7B-2bit-HQQ

I hope you'll enjoy it and quantize lots of models! :)

šŸ’» AutoQuant: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4

Organizations

None yet

Models (0)

None public yet

Datasets (0)

None public yet