CBDC Stance Classifier: A Domain-Specific BERT for CBDC-Related Stance Detection

The CBDC-Stance model classifies Central Bank Digital Currency (CBDC)-related text into three stance categories:

- Pro-CBDC: supportive of CBDC adoption (e.g., highlighting benefits, efficiency, innovation).
- Wait-and-See: neutral or cautious, expressing neither strong support nor strong opposition, often noting the need for further study.
- Anti-CBDC: critical of CBDC adoption (e.g., highlighting risks, concerns, opposition).

Base Model: bilalzafar/CentralBank-BERT, a domain-adapted BERT-base (uncased) model pretrained on 66M+ tokens across 2M+ sentences from central-bank speeches published via the Bank for International Settlements (1996–2024). It is optimized for masked-token prediction in the specialized domains of monetary policy, financial regulation, and macroeconomic communication, enabling better contextual understanding of central-bank discourse and financial narratives.
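Because the base model is a masked language model, it can be probed directly with a fill-mask pipeline. A minimal sketch (the example sentence is illustrative, not drawn from the training corpus):

```python
from transformers import pipeline

# Query the domain-adapted base encoder with a masked central-banking sentence.
fill = pipeline("fill-mask", model="bilalzafar/CentralBank-BERT")
for pred in fill("The central bank raised the [MASK] rate to curb inflation."):
    print(f"{pred['token_str']!r}: {pred['score']:.4f}")
```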

Training Data: The training dataset consisted of 1,647 CBDC-related sentences from BIS speeches, manually annotated into three stance categories: Pro-CBDC (742 sentences), Wait-and-See (694 sentences), and Anti-CBDC (211 sentences).

Intended Uses: The model is designed for classifying stance in speeches, articles, or statements related to CBDCs, supporting research into CBDC discourse analysis, and monitoring stance trends in central banking communications.

Training Details

The model was fine-tuned from the bilalzafar/CentralBank-BERT checkpoint, using a BERT-base architecture with a new three-way softmax classification head and a maximum sequence length of 320 tokens. Training ran for up to 8 epochs with early stopping at epoch 6, a batch size of 16, a learning rate of 2e-5, weight decay of 0.01, and a warmup ratio of 0.06, optimized with AdamW. The loss function was focal loss (γ = 1.0, soft focal, no extra class weights), and a WeightedRandomSampler weighted by the square root of inverse class frequency was used to handle class imbalance. FP16 precision was enabled for efficiency, and the best checkpoint was selected by macro-F1 score. The dataset was split into 80% training, 10% validation, and 10% test sets, stratified by label to preserve class balance.
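For readers who want to replicate the imbalance handling, the sketch below implements a soft focal loss and a square-root inverse-frequency sampler with standard PyTorch APIs; the label values and variable names are illustrative, and the actual training pipeline lives in the companion repository.

```python
import torch
import torch.nn.functional as F
from collections import Counter
from torch.utils.data import WeightedRandomSampler

def focal_loss(logits, targets, gamma=1.0):
    """Soft focal loss (gamma = 1.0, no extra class weights), per the settings above."""
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs, targets, reduction="none")            # per-example cross-entropy
    pt = log_probs.exp().gather(1, targets.unsqueeze(1)).squeeze(1)  # probability of the true class
    return ((1.0 - pt) ** gamma * ce).mean()

# Sampler weights: square root of inverse class frequency (illustrative labels).
labels = [0, 0, 1, 2, 1, 0]
counts = Counter(labels)
weights = [(1.0 / counts[y]) ** 0.5 for y in labels]
sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

# Quick check with random logits for a 3-class head.
print(focal_loss(torch.randn(6, 3), torch.tensor(labels)))
```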

Performance and Metrics

On the test set, the model achieved an accuracy of 0.8485, a macro F1-score of 0.8519, and a weighted F1-score of 0.8484. Class-wise performance was strong across all categories:

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Anti-CBDC | 0.8261 | 0.9048 | 0.8636 |
| Pro-CBDC | 0.8421 | 0.8533 | 0.8477 |
| Wait-and-See | 0.8636 | 0.8261 | 0.8444 |

The best validation checkpoint recorded an accuracy of 0.8303, a macro F1 of 0.7936, and a weighted F1 of 0.8338, with a validation loss of 0.3883. On the final test evaluation, loss rose slightly to 0.4223, while all key metrics improved relative to the validation set.
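The summary metrics follow the standard scikit-learn definitions and can be recomputed from gold and predicted labels, as in this sketch (the label lists are placeholders, not the actual test split):

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Placeholder labels; substitute the gold and predicted labels of the test split.
y_true = ["Pro-CBDC", "Wait-and-See", "Anti-CBDC", "Pro-CBDC"]
y_pred = ["Pro-CBDC", "Wait-and-See", "Wait-and-See", "Pro-CBDC"]

print("Accuracy:   ", accuracy_score(y_true, y_pred))
print("Macro F1:   ", f1_score(y_true, y_pred, average="macro"))
print("Weighted F1:", f1_score(y_true, y_pred, average="weighted"))
print(classification_report(y_true, y_pred, digits=4))
```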


Other CBDC Models

This model is part of the CentralBank-BERT / CBDC model family, a suite of domain-adapted classifiers for analyzing central-bank communication.

| Model | Purpose | Intended Use | Link |
|---|---|---|---|
| bilalzafar/CentralBank-BERT | Domain-adaptive masked LM trained on BIS speeches (1996–2024). | Base encoder for CBDC downstream tasks; fill-mask tasks. | [CentralBank-BERT](https://huggingface.co/bilalzafar/CentralBank-BERT) |
| bilalzafar/CBDC-BERT | Binary classifier: CBDC vs. Non-CBDC. | Flagging CBDC-related discourse in large corpora. | [CBDC-BERT](https://huggingface.co/bilalzafar/CBDC-BERT) |
| bilalzafar/CBDC-Stance | 3-class stance model (Pro, Wait-and-See, Anti). | Research on policy stances and discourse monitoring. | [CBDC-Stance](https://huggingface.co/bilalzafar/CBDC-Stance) |
| bilalzafar/CBDC-Sentiment | 3-class sentiment model (Positive, Neutral, Negative). | Tone analysis in central bank communications. | [CBDC-Sentiment](https://huggingface.co/bilalzafar/CBDC-Sentiment) |
| bilalzafar/CBDC-Type | Classifies Retail, Wholesale, General CBDC mentions. | Distinguishing policy focus (retail vs. wholesale). | [CBDC-Type](https://huggingface.co/bilalzafar/CBDC-Type) |
| bilalzafar/CBDC-Discourse | 3-class discourse classifier (Feature, Process, Risk-Benefit). | Structured categorization of CBDC communications. | [CBDC-Discourse](https://huggingface.co/bilalzafar/CBDC-Discourse) |
| bilalzafar/CentralBank-NER | Named Entity Recognition (NER) model for central banking discourse. | Identifying institutions, persons, and policy entities in speeches. | [CentralBank-NER](https://huggingface.co/bilalzafar/CentralBank-NER) |
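Because the family shares the same text-classification interface, the models can be chained, e.g., flagging CBDC-related sentences with CBDC-BERT before running stance detection. A sketch, assuming CBDC-BERT's positive label string contains "CBDC" (check its model card for the exact label names):

```python
from transformers import pipeline

detector = pipeline("text-classification", model="bilalzafar/CBDC-BERT")
stance = pipeline("text-classification", model="bilalzafar/CBDC-Stance")

corpus = [
    "CBDCs will reduce costs and improve payments.",
    "The committee discussed the inflation outlook.",
]

for sent in corpus:
    hit = detector(sent)[0]
    # Assumed label convention: the positive class mentions CBDC, the negative class "Non-CBDC".
    if "CBDC" in hit["label"] and "Non" not in hit["label"]:
        print(sent, "→", stance(sent)[0]["label"])
```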

Repository and Replication Package

All training pipelines, preprocessing scripts, evaluation notebooks, and result outputs are available in the companion GitHub repository:

🔗 https://github.com/bilalezafar/CentralBank-BERT


Usage

```python
from transformers import pipeline

# Load the stance-classification pipeline
classifier = pipeline("text-classification", model="bilalzafar/CBDC-Stance")

# Example sentences
sentences = [
    "CBDCs will reduce costs and improve payments.",
]

# Predict the stance and confidence score for each sentence
for s in sentences:
    result = classifier(s)[0]  # top prediction: {'label': ..., 'score': ...}
    print(f"{s}\n → {result['label']} (score={result['score']:.4f})\n")

# Example output:
# CBDCs will reduce costs and improve payments.
#  → Pro-CBDC (score=0.9788)
```
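To see the full score distribution over all three labels rather than only the top prediction, ask the pipeline for every label's score (top_k=None on recent transformers releases; older releases use return_all_scores=True instead):

```python
# Returns one {'label', 'score'} dict per stance class, sorted by score.
all_scores = classifier("CBDCs will reduce costs and improve payments.", top_k=None)
print(all_scores)
```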

Citation

If you use this model, please cite as:

Zafar, M. B. (2025). CentralBank-BERT: Machine Learning Evidence on Central Bank Digital Currency Discourse. SSRN. https://papers.ssrn.com/abstract=5404456

```bibtex
@article{zafar2025centralbankbert,
  title={CentralBank-BERT: Machine Learning Evidence on Central Bank Digital Currency Discourse},
  author={Zafar, Muhammad Bilal},
  year={2025},
  journal={SSRN Electronic Journal},
  url={https://papers.ssrn.com/abstract=5404456}
}
```