	---
	language: en
	license: mit
	tags:
	- text-classification
	- mental-health
	- transformer
	- distilbert
	- depression
	- anxiety
	- clinical-nlp
	- huggingface
	datasets:
	- custom
	library_name: transformers
	pipeline_tag: text-classification
	widget:
	- text: "I feel hopeless and can't sleep properly."
	  example_title: "Depression"
	- text: "I’m anxious all the time and can’t focus."
	  example_title: "Anxiety"
	- text: "Everything’s fine. I’m feeling good."
	  example_title: "Healthy"
	model-index:
	- name: distilbert-mentalhealth-classifier
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Filtered Combined Dataset
	      type: custom
	    metrics:
	    - type: accuracy
	      value: 0.856
	    - type: f1
	      value: 0.854
	---

	# 🧠 DistilBERT Mental Health Classifier

	This model is a fine-tuned version of [`distilbert-base-uncased`](https://huggingface.co/distilbert-base-uncased) for mental health condition classification. It is trained on a custom dataset containing user statements labeled with categories such as depression, anxiety, PTSD, and more.


	# 🧠 Use Case
	This model is designed for:

	- Early detection of mental health symptoms in user conversations
	- Clinical research on NLP-based diagnostic support
	- AI assistants that provide empathetic triage or support

	# 🧪 Performance
	Fine-tuning raises accuracy from about 0.07 to 0.83-0.86 and F1 from about 0.014 to 0.83-0.85, measured at two sample sizes:

	| Sample Size | Accuracy (Before) | F1 Score (Before) | Accuracy (After) | F1 Score (After) |
	| ----------- | ----------------- | ----------------- | ---------------- | ---------------- |
	| 200 Samples | 0.075 | 0.0142 | 0.830 | 0.8267 |
	| 500 Samples | 0.070 | 0.0141 | 0.856 | 0.8544 |


	✅ These results indicate that fine-tuning with a high-quality mental health dataset enables DistilBERT to make informed predictions from free-form user input.

	# 📚 Dataset
	The model was fine-tuned on `Filtered_Combined_Data.csv`, a curated dataset of 42,000+ statements labeled across multiple mental health categories. Each sample includes:

	- `statement`: a natural language user message
	- `label`: a mental health condition such as "Depression", "Anxiety", or "Healthy"
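
As a quick sanity check on a file with this layout, the sketch below reads `statement`/`label` rows and counts examples per label. The two column names come from the description above; the sample rows are made up for illustration and are not taken from `Filtered_Combined_Data.csv`.

```python
import csv
import io
from collections import Counter

# Hypothetical rows that mimic the described layout (not real dataset entries).
SAMPLE_CSV = """statement,label
I feel hopeless and can't sleep properly.,Depression
I'm anxious all the time and can't focus.,Anxiety
Everything's fine. I'm feeling good.,Healthy
I can't stop worrying about tomorrow.,Anxiety
"""


def count_labels(csv_file):
    """Return a Counter of label frequencies from a statement/label CSV."""
    reader = csv.DictReader(csv_file)
    return Counter(row["label"].strip() for row in reader)


label_counts = count_labels(io.StringIO(SAMPLE_CSV))
for label, count in label_counts.most_common():
    print(f"{label}: {count}")
```

To run the same check on the real file, pass an open file handle instead of the in-memory string, for example `open("Filtered_Combined_Data.csv", newline="", encoding="utf-8")` (the encoding is an assumption).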

	# 🏗️ Prompt Format (used during fine-tuning)
	    ### Instruction:
	    Classify the mental health condition in the following statement.

	    Input:
	    {text}

	    Response:
	    {label}

	This instruction format aligns the classifier with instruction-tuned language models.
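
The template above can be filled programmatically. The following is a minimal sketch, not the author's preprocessing code: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, and the exact whitespace used during fine-tuning is an assumption.

```python
# Template text copied from the prompt format above; blank-line placement
# is an assumption about the original layout.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Classify the mental health condition in the following statement.\n"
    "\n"
    "Input:\n"
    "{text}\n"
    "\n"
    "Response:\n"
    "{label}"
)


def build_prompt(text, label=""):
    """Fill the template; leave `label` empty to build an unlabeled prompt."""
    return PROMPT_TEMPLATE.format(text=text.strip(), label=label)


example = build_prompt("I feel hopeless and can't sleep properly.", "Depression")
print(example)
```

Keeping the template in one constant makes it easier to apply the same wrapping to training and inference inputs, if the wrapping is used at inference time at all.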

	---

	# 🧠 Labels Covered

	The model classifies input statements into the following mental health categories (example):

	- Anxiety
	- Depression
	- PTSD
	- OCD
	- Bipolar Disorder
	- ADHD
	- Healthy
	- Others (as labeled in dataset)
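
A sequence classifier works with integer class ids, so the category names need a fixed mapping. The sketch below builds one from the names listed above; the alphabetical ordering is an assumption, and the authoritative mapping is whatever `id2label` holds in the model's `config.json`.

```python
# Label names copied from the list above. "Others" is left out because the
# card does not say which additional labels the dataset contains.
LABELS = ["Anxiety", "Depression", "PTSD", "OCD", "Bipolar Disorder", "ADHD", "Healthy"]

# Assumed convention: sort names so the id assignment is reproducible.
label2id = {label: idx for idx, label in enumerate(sorted(LABELS))}
id2label = {idx: label for label, idx in label2id.items()}

for idx in sorted(id2label):
    print(idx, id2label[idx])
```

Dictionaries of this shape can be passed as the `id2label` and `label2id` arguments of `AutoModelForSequenceClassification.from_pretrained` when fine-tuning, so that predictions are reported by name rather than as `LABEL_0`, `LABEL_1`, and so on.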

	---

	# ⚙️ Training Configuration

	- Base Model: `distilbert-base-uncased`
	- Epochs: 3
	- Total Steps: ~36,500
	- Batch Size: 16
	- Max Length: 512
	- Quantization: None
	- Learning Rate: 2e-5
	- Optimizer: AdamW
	- Evaluation: Accuracy, Weighted F1
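
For reference, the two evaluation metrics can be computed as follows. This is a dependency-free sketch of accuracy and support-weighted F1, not the author's evaluation script; the toy labels are invented for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n_true in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        n_pred = sum(p == label for p in y_pred)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        total += f1 * n_true
    return total / len(y_true)


# Toy labels for illustration only (not model output).
y_true = ["Anxiety", "Anxiety", "Depression", "Healthy"]
y_pred = ["Anxiety", "Depression", "Depression", "Healthy"]
print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"weighted_f1={weighted_f1(y_true, y_pred):.3f}")
```

Weighting by support means frequent classes dominate the score, which matters when the label distribution is imbalanced.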

	---


	# 📂 Model Files

	- `pytorch_model.bin` — fine-tuned model weights
	- `tokenizer_config.json`, `vocab.txt`, etc. — tokenizer files
	- `config.json` — architecture and label mapping
	- `README.md` — this file

	---

	# 📄 License

	This model is licensed under the MIT License — free for personal, academic, and commercial use with attribution.

	---

	# 🙋 Author

	Developed by Dileep Reddy Suram
	📍 For multimodal clinical AI assistant research and PhD preparation
	🔗 [Hugging Face Profile](https://huggingface.co/dsuram)

	---

	# 🚀 Citation

	If you use this model, please cite:

	# 📦 How to Use (Quick Start)

	```python
	from transformers import pipeline

	# Load the fine-tuned classifier from the Hugging Face Hub
	classifier = pipeline("text-classification", model="dsuram/distilbert-mentalhealth-classifier")

	# Returns a list with one dict holding the predicted label and its score
	print(classifier("I feel anxious all the time and can't concentrate."))
	```

	---

	# 🧪 Inference (Advanced)

	You can also use the tokenizer and model directly:

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load model and tokenizer
	model = AutoModelForSequenceClassification.from_pretrained("dsuram/distilbert-mentalhealth-classifier")
	tokenizer = AutoTokenizer.from_pretrained("dsuram/distilbert-mentalhealth-classifier")

	# Input text
	text = "I feel lost, hopeless, and don't see a way out."

	# Tokenize and predict (gradients are not needed at inference time)
	inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
	with torch.no_grad():
	    outputs = model(**inputs)
	logits = outputs.logits
	predicted_class_id = torch.argmax(logits, dim=1).item()

	# Map to label
	label_map = model.config.id2label
	print(f"Predicted label: {label_map[predicted_class_id]}")
	```
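
The `argmax` step returns only the top class. To see how confident the model is across all classes, the logits can be passed through a softmax. The sketch below shows the computation in plain Python; the logits and the three-label mapping are hypothetical values, not output from this model.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits and label mapping, for illustration only.
logits = [2.0, 0.5, -1.0]
id2label = {0: "Depression", 1: "Anxiety", 2: "Healthy"}

probs = softmax(logits)
ranked = sorted(
    ((id2label[i], p) for i, p in enumerate(probs)),
    key=lambda pair: pair[1],
    reverse=True,
)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

With PyTorch tensors, `torch.softmax(logits, dim=-1)` performs the same computation on the model's `logits` directly.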