BERT Chinese SMS Name Classifier

This is a Chinese SMS name classifier fine-tuned from ckiplab/bert-base-chinese. It detects whether an SMS message contains a personal name.

Model description

  • Base model: ckiplab/bert-base-chinese
  • Task: binary classification (name detection)
  • Language: Chinese (Traditional/Simplified)
  • Labels: 0 (no name), 1 (contains a name)

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("renhehuang/bert-chinese-sms-name-classifier")
model = AutoModelForSequenceClassification.from_pretrained("renhehuang/bert-chinese-sms-name-classifier")

# Predict on a single message ("Hello Mr. Wang, your order is ready")
text = "您好,王先生,您的訂單已準備好了"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)

with torch.no_grad():
    outputs = model(**inputs)
    # Convert the two logits into class probabilities
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()
    # Index 1 is the "contains a name" class
    name_probability = predictions[0][1].item()

print(f"Prediction: {'contains a name' if predicted_class == 1 else 'no name'}")
print(f"Name probability: {name_probability:.4f}")
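
The probability printed above comes from a softmax over the model's two logits. As a minimal sketch of that step in plain Python (no model download needed), the block below converts a pair of logits into a name probability and applies a decision threshold. The logit values, the `decide` helper, and the `threshold` parameter are illustrative assumptions, not part of the model's API. With the default threshold of 0.5 the decision matches the `argmax` used in the snippet above.

```python
import math

# Label ids as documented in the model card.
ID2LABEL = {0: "no name", 1: "contains a name"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.5):
    """Return (label_id, name_probability) for one pair of logits.

    `threshold` is a hypothetical knob: raise it to flag fewer messages,
    lower it to flag more.
    """
    name_probability = softmax(logits)[1]
    label_id = 1 if name_probability > threshold else 0
    return label_id, name_probability


# Made-up logits for illustration only; real values come from outputs.logits.
label_id, prob = decide([-1.2, 2.3])
print(ID2LABEL[label_id], f"{prob:.4f}")
```

Raising the threshold trades recall for precision, which may matter if false alarms are costly in the downstream application.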

Training details

  • Maximum sequence length: 256
  • Batch size: 16 (training), 32 (evaluation)
  • Learning rate: 2e-5
  • Optimizer: AdamW
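
For readers who want to reproduce a similar setup, the values above can be collected into a small configuration. This is a sketch only: the card does not say which training loop was used, so the key names below follow the argument names of `transformers.TrainingArguments` as an assumption, and values the card does not list (epochs, warmup, weight decay) are deliberately left out.

```python
# Hyperparameters stated in the model card.
MAX_SEQ_LENGTH = 256  # applied at tokenization time, not by the trainer

# Key names follow transformers.TrainingArguments. Using the Trainer API,
# and training on a single device so that "per device" equals the stated
# batch size, are assumptions, not something the card states.
training_kwargs = {
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 32,
    "learning_rate": 2e-5,
    "optim": "adamw_torch",  # AdamW as implemented in PyTorch
}

tokenizer_kwargs = {
    "truncation": True,
    "max_length": MAX_SEQ_LENGTH,
}

for name, value in {**training_kwargs, **tokenizer_kwargs}.items():
    print(f"{name} = {value}")
```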

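Notes

  1. This model is optimized specifically for Chinese SMS messages.
  2. It works best on texts of up to 256 characters; the usage example truncates longer inputs at max_length=256 tokens.
  3. It performs best on Apple Silicon devices.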
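
Because inputs are truncated at 256 tokens, anything past that point in a longer text is never seen by the model. One possible workaround, sketched below in plain Python, is to split a long text into overlapping windows, classify each window separately, and treat the text as containing a name if any window does. The `split_into_windows` and `any_window_has_name` helpers, the window size of 254 (256 minus the two special tokens BERT adds), and the overlap of 16 are assumptions for illustration. The sketch also treats one character as roughly one token, which is an approximation.

```python
def split_into_windows(text, window=254, overlap=16):
    """Split text into character windows of at most `window` characters.

    Consecutive windows share `overlap` characters so that a short name
    cut by one boundary still appears whole in the neighbouring window.
    """
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    if len(text) <= window:
        return [text]
    step = window - overlap
    windows = []
    for start in range(0, len(text), step):
        windows.append(text[start:start + window])
        if start + window >= len(text):
            break
    return windows


def any_window_has_name(window_probabilities, threshold=0.5):
    """Combine per-window name probabilities into one decision."""
    return max(window_probabilities) > threshold


# 600 characters of filler text, for illustration only.
long_text = "您的訂單已準備好了。" * 60
chunks = split_into_windows(long_text)
print(len(chunks), [len(c) for c in chunks])
```

Each window would then be passed through the tokenizer and model as in the usage example, and the resulting name probabilities fed to `any_window_has_name`.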

Citation

@misc{bert-chinese-sms-name-classifier,
  author = {Your Name},
  title = {BERT Chinese SMS Name Classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/renhehuang/bert-chinese-sms-name-classifier}}
}