# MedScholar-1.5B

MedScholar-1.5B is a compact, instruction-aligned medical question-answering model fine-tuned on 1 million randomly selected examples from the MIRIAD-4.4M dataset. It is based on the Qwen/Qwen2.5-1.5B-Instruct model and designed for efficient, in-context clinical knowledge exploration, not diagnosis.
## Model Details
- Base Model: Qwen2.5-1.5B-Instruct-unsloth-bnb-4bit
- Fine-tuning Dataset: MIRIAD-4.4M
- Samples Used: 1,000,000 examples randomly selected from the full set
- Prompt Style: Minimal QA format (see below)
- Training Framework: Unsloth with QLoRA (illustrative sketch below)
- License: Apache-2.0 (inherits from base model); dataset is ODC-By 1.0
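
For reference, the snippet below is a minimal, illustrative sketch of an Unsloth + QLoRA setup on MIRIAD. The LoRA hyperparameters, the shuffling seed, and the MIRIAD column names (`question`, `answer`) are assumptions, not the exact configuration used to train this model.

```python
# Illustrative sketch only -- not the exact training script or hyperparameters.
from unsloth import FastLanguageModel
from datasets import load_dataset

# Load the 4-bit base model (QLoRA keeps the base weights quantized).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-1.5B-Instruct-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of parameters is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Format MIRIAD rows into the minimal QA prompt (column names are assumptions).
def to_prompt(example):
    return {
        "text": f"### Question:\n{example['question']}\n"
                f"### Answer:\n{example['answer']}{tokenizer.eos_token}"
    }

dataset = (
    load_dataset("miriad/miriad-4.4M", split="train")
    .shuffle(seed=42)
    .select(range(1_000_000))
    .map(to_prompt)
)
# The formatted dataset can then be passed to TRL's SFTTrainer on the "text" field.
```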
## Prompt Format

```
### Question:
What is the role of LDL in cardiovascular health?
### Answer:
LDL plays a central role in the development of atherosclerosis by delivering cholesterol to peripheral tissues...
```
- The model expects the prompt to end with `### Answer:` and will generate only the answer text (see the helper sketch below).
- Do not include the answer in the prompt during inference.
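
As a small illustration of this prompt contract, here is a hypothetical helper sketch (the function names `build_prompt` and `extract_answer` are not part of any released code) that builds a prompt in this format and strips the echoed prompt from a generation:

```python
ANSWER_TAG = "### Answer:"

def build_prompt(question: str) -> str:
    """Wrap a question in the minimal QA format used during fine-tuning."""
    return f"### Question:\n{question}\n{ANSWER_TAG}\n"

def extract_answer(generated_text: str) -> str:
    """Return only the text after '### Answer:' (text-generation pipelines echo the prompt by default)."""
    return generated_text.split(ANSWER_TAG, 1)[-1].strip()
```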
## Dataset Consent & License

This model was fine-tuned on 1 million randomly selected examples from the MIRIAD-4.4M dataset, which is released under the ODC-By 1.0 License.
The MIRIAD dataset is intended exclusively for academic research and educational exploration. As stated by its authors:
"The outputs generated by models trained or fine-tuned on this dataset must not be used for medical diagnosis or decision-making involving real individuals."
## ⚠️ Intended Use
This model is for research, educational, and exploration purposes only. It is not a medical device and must not be used to provide clinical advice, diagnosis, or treatment.
## Example Inference (Python)

```python
from transformers import pipeline

# Load the model on the first GPU (use device=-1 for CPU inference).
pipe = pipeline("text-generation", model="yasserrmd/MedScholar-1.5B", device=0)

# The prompt must end with "### Answer:" so the model generates only the answer.
prompt = """### Question:
What are the symptoms of acute pancreatitis?
### Answer:
"""

response = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])
```
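
Note that the pipeline echoes the prompt by default, so `response[0]["generated_text"]` includes the question. Passing the standard `return_full_text=False` pipeline argument returns only the newly generated answer:

```python
response = pipe(prompt, max_new_tokens=256, do_sample=True,
                temperature=0.7, return_full_text=False)
print(response[0]["generated_text"])  # answer text only, without the echoed prompt
```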
## Acknowledgements

- MIRIAD Dataset by Zheng et al. (2025): https://huggingface.co/datasets/miriad/miriad-4.4M
- Qwen2.5 by Alibaba: https://huggingface.co/Qwen
- Training infrastructure: Unsloth
## Citation

```bibtex
@misc{yasser2025medscholar,
  title        = {MedScholar-1.5B: Compact medical QA model fine-tuned on MIRIAD},
  author       = {Mohamed Yasser},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/yasserrmd/MedScholar-1.5B}},
}
```
This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.