metadata
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- paraphrase-detection
- sentence-pair-classification
- glue
- mrpc
metrics:
- accuracy
- f1
model-index:
- name: bert_paraphrase
  results:
  - task:
      name: Paraphrase Detection
      type: text-classification
    dataset:
      name: GLUE MRPC
      type: glue
      config: mrpc
      split: validation
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8676
    - name: F1
      type: f1
      value: 0.9078
language:
- en
bert_paraphrase
This model is a fine-tuned version of bert-base-uncased on the Microsoft Research Paraphrase Corpus (MRPC), a subset of the GLUE benchmark.
It is trained to determine whether two sentences are semantically equivalent (paraphrases) or not.
Evaluation Results
- Loss: 0.4042
- Accuracy: 0.8676
- F1: 0.9078
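For readers unfamiliar with the two metrics, the sketch below shows how accuracy and F1 can be computed for a binary task. The label lists are made-up illustrations, not MRPC data, and the positive class is assumed to be 1 (paraphrase).

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; F1 is computed for `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels for illustration only (1 = paraphrase, 0 = not paraphrase).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

Because F1 here is computed for the paraphrase class only, it can differ noticeably from accuracy when the two classes are not balanced.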
Model Description
- Model type: BERT-base (uncased)
- Task: Binary classification (paraphrase vs not paraphrase)
- Languages: English
- Labels:
  - 0 → Not paraphrase
  - 1 → Paraphrase
Intended Uses & Limitations
Intended uses
- Detect whether two sentences convey the same meaning.
- Useful for:
  - Duplicate question detection (e.g., Quora, FAQ bots).
  - Semantic similarity search.
  - Improving information retrieval systems.
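As an illustration of the duplicate-question use case, here is a minimal sketch that compares every pair in a small list. The helper `find_duplicates`, the `min_score` parameter, and the stand-in `toy_detector` are hypothetical names introduced for this example; in practice `detector` would be the pipeline object from the "How to Use" section.

```python
from itertools import combinations


def find_duplicates(questions, detector, min_score=0.5):
    """Return (i, j, score) for question pairs the detector labels as paraphrases.

    `detector` is any callable that accepts a list of
    {"text": ..., "text_pair": ...} dicts and returns one
    {"label": ..., "score": ...} dict per input pair.
    """
    index_pairs = list(combinations(range(len(questions)), 2))
    if not index_pairs:
        return []
    inputs = [
        {"text": questions[i], "text_pair": questions[j]} for i, j in index_pairs
    ]
    results = detector(inputs)
    return [
        (i, j, r["score"])
        for (i, j), r in zip(index_pairs, results)
        if r["label"] == "paraphrase" and r["score"] >= min_score
    ]


def toy_detector(pairs):
    """Stand-in so the sketch runs without downloading the model.

    It calls two texts paraphrases when they contain the same lowercase words.
    """
    out = []
    for p in pairs:
        same = set(p["text"].lower().split()) == set(p["text_pair"].lower().split())
        out.append({"label": "paraphrase" if same else "not_paraphrase", "score": 1.0})
    return out


questions = [
    "How do I reset my password?",
    "how do i reset my password?",
    "Where is my order?",
]
duplicates = find_duplicates(questions, toy_detector)
print(duplicates)
```

Comparing all pairs grows quadratically with the number of questions, so for large collections a cheaper candidate filter before the classifier is usually needed.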
Limitations
- Trained only on English text (the MRPC dataset).
- May not generalize well to other domains (e.g., legal, medical).
- Binary labels only (no "degree of similarity").
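Since the model produces only two labels, the score returned by the pipeline is the classifier's confidence in its predicted label (with the pipeline's default settings), not a graded similarity measure. One possible way to handle borderline cases is to abstain below a confidence threshold, as sketched here. The function name `decide`, the `"uncertain"` outcome, and the 0.9 threshold are arbitrary choices for illustration.

```python
def decide(result, threshold=0.9):
    """Map one pipeline result to 'paraphrase', 'not_paraphrase', or 'uncertain'.

    `result` is a dict like {"label": "paraphrase", "score": 0.98}.
    Predictions whose score is below `threshold` are reported as 'uncertain'.
    """
    if result["score"] < threshold:
        return "uncertain"
    return result["label"]


# The third score is a made-up low-confidence example.
print(decide({"label": "paraphrase", "score": 0.9801}))
print(decide({"label": "not_paraphrase", "score": 0.9302}))
print(decide({"label": "paraphrase", "score": 0.55}))
```

A suitable threshold depends on the application and should be chosen on held-out data rather than taken from this sketch.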
How to Use
You can use this model with the Hugging Face pipeline API for quick inference:
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hub
paraphrase_detector = pipeline(
    "text-classification",
    model="azherali/bert_paraphrase",
    tokenizer="azherali/bert_paraphrase",
)

# Each input is a dict holding the two sentences to compare
single_pair = [
    {"text": "The car is red.", "text_pair": "The automobile is red."},
]
result = paraphrase_detector(single_pair)
print(result)
# Output: [{'label': 'paraphrase', 'score': 0.9801033139228821}]
# Test pairs
pairs = [
    {"text": "The car is red.", "text_pair": "The automobile is red."},
    {"text": "He enjoys playing football.", "text_pair": "She likes cooking."},
]
result = paraphrase_detector(pairs)
print(result)
# Output: [{'label': 'paraphrase', 'score': 0.9801033139228821}, {'label': 'not_paraphrase', 'score': 0.9302119016647339}]
Using AutoModelForSequenceClassification & AutoTokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("azherali/bert_paraphrase")
model = AutoModelForSequenceClassification.from_pretrained("azherali/bert_paraphrase")
# Example sentences
sent1 = "The quick brown fox jumps over the lazy dog."
sent2 = "A fast brown fox leaps over a lazy dog."
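# Encode the two sentences together as a single sentence-pair input
inputs = tokenizer(sent1, sent2, return_tensors="pt")
# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = torch.argmax(logits, dim=1).item()
print("Prediction:", model.config.id2label[predicted_class])
# Output: Prediction: paraphrase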
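The argmax step returns only the winning class. To also obtain a probability for it, the logits can be passed through a softmax. The sketch below does this in plain Python on made-up logits so that it runs without the model; the logit values are hypothetical, and the label names follow the pipeline outputs shown earlier.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one sentence pair:
# index 0 = not paraphrase, index 1 = paraphrase.
example_logits = [-1.9, 2.0]
probs = softmax(example_logits)

id2label = {0: "not_paraphrase", 1: "paraphrase"}
best = max(range(len(probs)), key=probs.__getitem__)
print(id2label[best], round(probs[best], 4))
```

With PyTorch tensors, the equivalent call is `torch.softmax(logits, dim=-1)` applied to `outputs.logits`.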
Training and evaluation data
The model was fine-tuned on the training split of GLUE MRPC and evaluated on its validation split.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
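To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and 690 total steps, the final step count reported in the training results; `linear_lr` is a hypothetical helper, not part of the training code.

```python
def linear_lr(step, base_lr=2e-5, total_steps=690, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each epoch (230 steps per epoch).
for step in (0, 230, 460, 690):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls by one third of its initial value per epoch and reaches zero at the last step.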
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|---------------|-------|------|-----------------|----------|--------|
| No log        | 1.0   | 230  | 0.3894          | 0.8309   | 0.8836 |
| No log        | 2.0   | 460  | 0.3511          | 0.8505   | 0.8964 |
| 0.4061        | 3.0   | 690  | 0.4042          | 0.8676   | 0.9078 |
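The step counts in the table are consistent with one full pass over the MRPC training split per epoch at a batch size of 16. The short check below assumes 3,668 training pairs (the size of the GLUE MRPC training split), a single device, and no gradient accumulation; none of these is stated in this card.

```python
import math

train_examples = 3668  # assumed size of the GLUE MRPC training split
batch_size = 16        # train_batch_size from the hyperparameters
num_epochs = 3         # num_epochs from the hyperparameters

# The last batch of each epoch is partial, so round up.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)
```

The "No log" entries most likely reflect a training-loss logging interval longer than the first two epochs, which would explain why a training loss appears only in the final row.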
Framework versions
- Transformers 4.55.2
- Pytorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.21.4