Model Card for ESG-sentiment-FinBERT
Model Details
Model Description
This model is a fine-tuned version of ProsusAI/finbert for ESG-sentiment text classification. It is designed to classify financial sentences along two dimensions: ESG content and sentiment.
For ESG content, there are 5 labels: "E" for environmental content, "S" for social content, "G" for governance content, "Gen-ESG" for general ESG content not tied to a specific pillar, and "Non-ESG" for non-ESG content. For sentiment, there are 3 labels: "pos" for positive, "neg" for negative, and "neu" for neutral.
For example, the label "E-pos" means the sentence has environmental content and positive sentiment.
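Because some pillar names contain a hyphen themselves ("Gen-ESG", "Non-ESG"), a combined label is easiest to split at its last hyphen. The following is a minimal sketch of that idea, not part of the model's own code; the helper name `split_label` is hypothetical, and the combined label strings are assumed to follow the `<pillar>-<sentiment>` pattern described above.

```python
# Pillar and sentiment names as described in this model card.
PILLARS = {"E", "S", "G", "Gen-ESG", "Non-ESG"}
SENTIMENTS = {"pos", "neg", "neu"}


def split_label(label: str) -> tuple[str, str]:
    """Split a combined label such as 'Non-ESG-neg' into (pillar, sentiment).

    Hypothetical helper: it assumes labels look like '<pillar>-<sentiment>'.
    """
    # Split on the LAST hyphen so that 'Gen-ESG' and 'Non-ESG' stay intact.
    pillar, sentiment = label.rsplit("-", 1)
    if pillar not in PILLARS or sentiment not in SENTIMENTS:
        raise ValueError(f"Unexpected label: {label!r}")
    return pillar, sentiment


print(split_label("E-pos"))
print(split_label("Non-ESG-neg"))
```

Splitting on the first hyphen instead would turn "Non-ESG-neg" into "Non" and "ESG-neg", which is why `rsplit` with a limit of 1 is used here.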
- Developed by: Yue Odile Wu
- Model type: BERT-based text classifier (BertForSequenceClassification)
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: ProsusAI/finbert
Model Sources
- Repository: YueOdileWu/ESG-sentiment-FinBERT
- Paper (for base model): FinBERT: Financial Sentiment Analysis with Pre-trained Language Models
Uses
Direct Use
The model is intended for direct use in classifying English sentences from the financial or corporate domain into combined ESG-sentiment categories. You can use it with the text-classification pipeline.
Downstream Use
It can be used to measure how positively or negatively firms talk about ESG in their financial texts, such as 10-K filings, 485BPOS filings, or earnings call transcripts. Financial news text is also a suitable input. The model can serve as a component in larger systems, such as a monitoring tool that tracks corporate ESG commitments in news and press releases.
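One possible way to turn sentence-level predictions into a document-level measure is to average the sentiment per ESG pillar. The sketch below is only an illustration under stated assumptions: the function name `esg_tone`, the +1/0/-1 sentiment mapping, and the choice to skip "Non-ESG" sentences are all hypothetical, and the sample predictions are hand-written in the format the text-classification pipeline returns.

```python
from collections import defaultdict

# Assumed numeric mapping for sentiment; other weightings are equally possible.
SENTIMENT_VALUE = {"pos": 1, "neu": 0, "neg": -1}


def esg_tone(predictions):
    """Average sentiment per ESG pillar, ignoring Non-ESG sentences."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for pred in predictions:
        # Split on the last hyphen so 'Gen-ESG' and 'Non-ESG' stay intact.
        pillar, sentiment = pred["label"].rsplit("-", 1)
        if pillar == "Non-ESG":
            continue
        totals[pillar] += SENTIMENT_VALUE[sentiment]
        counts[pillar] += 1
    return {pillar: totals[pillar] / counts[pillar] for pillar in counts}


# Hand-written sample predictions (illustrative, not real model output).
example = [
    {"label": "E-neg", "score": 0.86},
    {"label": "G-neu", "score": 0.89},
    {"label": "G-pos", "score": 0.52},
    {"label": "Non-ESG-neu", "score": 0.99},
]
print(esg_tone(example))
```

A score-weighted average, or the share of ESG sentences in a filing, would be natural variations on the same idea.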
Out-of-Scope Use
The model is not suitable for:
- Languages other than English: Both the base model and fine-tuning data are in English.
- Non-finance/corporate texts: Performance may degrade significantly on texts from other domains.
- Tasks other than text classification.
Bias, Risks, and Limitations
- Training biases: The model is trained on a synthetic text dataset. The synthetic sentences simulate real sentences in 10-K filings and may not fully reflect the complexity, nuance, and distribution of real-world financial text.
- Subjectivity of the ESG concept: The definitions of what constitutes "Environmental," "Social," "Governance," and "General ESG" can be subjective and vary across rating standards and methodologies. The model's classifications are based on the definitions implicit in its training data.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model described above.
How to Get Started with the Model
Use the code below to get started with the model. The sentences in the example code were randomly selected from real 10-K files and have been through first-stage parsing.
from transformers import BertForSequenceClassification, BertTokenizerFast, pipeline

# Load the fine-tuned model and tokenizer from the saved directory
# (a local path; the Hub repository ID listed under Model Sources should also work here)
model_path = "./finbert-finetuned_ESGneu_Val"
model = BertForSequenceClassification.from_pretrained(model_path)
tokenizer = BertTokenizerFast.from_pretrained(model_path)
# Wrap the model and tokenizer in a text-classification pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
# Example sentences to categorize
sentences = [
"Our search for a business combination, and any target business with which we ultimately consummate a business combination, may be materially adversely affected by the coronavirus (COVID-19) pandemic and other events, and the status of debt and equity markets.",
"Past performance by our management team and their affiliates may not be indicative of future performance of an investment in the Company.",
"We therefore believe that any necessary provision for creditors will be reduced and should not have a significant impact on our ability to distribute the funds in the trust account to our public stockholders.",
"Since our staggered board may prevent our stockholders from replacing a majority of our board of directors at any given annual meeting, it may entrench management and discourage unsolicited stockholder proposals",
"10.49 Credit Agreement, dated as of October 31, 2019, by and between FuelCell Energy, Inc., the Guarantors from time to time party thereto, the Lenders and Orion Energy Partners Investment Agent, LLC (incorporated by reference to Exhibit 10.2 to the Company s Current Report on Form 8-K filed on November 6, 2019).",
"RISK FACTORS Risks Related to COVID-19 The COVID-19 pandemic has adversely impacted our business and financial results and that of many of our customers, and the ultimate impact will depend on future developments, which are highly uncertain, cannot be predicted and are largely outside of our control, including the scope and duration of the pandemic and actions taken by governmental authorities in response to the pandemic.",
"On July 1, 2019, we entered into a conversion agreement with Australis, whereby Australis has agreed to convert the Debentures on July 1, 2020.",
"We derive a significant portion of our revenues from a small number of customers and licensees, and particularly from their sale of premium tier devices, and we expect this trend to continue in the foreseeable future.",
"Because of its inherent limitations, internal control over financial reporting may not prevent or detect misstatements.",
"In the PEN, ODEQ also alleged violations of major source new source review, CAO and federal hazardous air pollutant control technology requirements and gave notice to the Company that ODEQ had referred the matter to USEPA for review and possible formal enforcement.",
"We also have centralized departments for each segment whose focus is to build and maintain relationships with potential and existing national employers and develop graduate job opportunities and, where possible, relocation assistance, sign-on bonuses, tool packages and tuition reimbursement plans with our manufacturer brand partners and other industry employers."
]
# Get predictions
# single sentence (define the input before passing it to the classifier)
sentence_individual = ["remains uncertainty as to the effect of COVID-19 on our business in both the short and long-term."]
result = classifier(sentence_individual)
print(f"\n{sentence_individual}\n")
print(f"\n{result}\n")
# multiple sentences
results = classifier(sentences)
# Print results
for sentence, result in zip(sentences, results):
    print(f"Sentence: {sentence}")
    print(f"Prediction: {result}\n")
['remains uncertainty as to the effect of COVID-19 on our business in both the short and long-term.']
{'label': 'S-neu', 'score': 0.8255176544189453}
Sentence: Our search for a business combination, and any target business with which we ultimately consummate a business combination, may be materially adversely affected by the coronavirus (COVID-19) pandemic and other events, and the status of debt and equity markets.
Prediction: {'label': 'Non-ESG-neg', 'score': 0.9994741082191467}
Sentence: Past performance by our management team and their affiliates may not be indicative of future performance of an investment in the Company.
Prediction: {'label': 'Non-ESG-neu', 'score': 0.6650416851043701}
Sentence: We therefore believe that any necessary provision for creditors will be reduced and should not have a significant impact on our ability to distribute the funds in the trust account to our public stockholders.
Prediction: {'label': 'Non-ESG-neu', 'score': 0.9919637441635132}
Sentence: Since our staggered board may prevent our stockholders from replacing a majority of our board of directors at any given annual meeting, it may entrench management and discourage unsolicited stockholder proposals
Prediction: {'label': 'G-pos', 'score': 0.5211058259010315}
Sentence: 10.49 Credit Agreement, dated as of October 31, 2019, by and between FuelCell Energy, Inc., the Guarantors from time to time party thereto, the Lenders and Orion Energy Partners Investment Agent, LLC (incorporated by reference to Exhibit 10.2 to the Company s Current Report on Form 8-K filed on November 6, 2019).
Prediction: {'label': 'Non-ESG-neu', 'score': 0.999854564666748}
Sentence: RISK FACTORS Risks Related to COVID-19 The COVID-19 pandemic has adversely impacted our business and financial results and that of many of our customers, and the ultimate impact will depend on future developments, which are highly uncertain, cannot be predicted and are largely outside of our control, including the scope and duration of the pandemic and actions taken by governmental authorities in response to the pandemic.
Prediction: {'label': 'Non-ESG-neg', 'score': 0.9773620963096619}
Sentence: On July 1, 2019, we entered into a conversion agreement with Australis, whereby Australis has agreed to convert the Debentures on July 1, 2020.
Prediction: {'label': 'Non-ESG-neu', 'score': 0.9999104738235474}
Sentence: We derive a significant portion of our revenues from a small number of customers and licensees, and particularly from their sale of premium tier devices, and we expect this trend to continue in the foreseeable future.
Prediction: {'label': 'Non-ESG-neu', 'score': 0.9998589754104614}
Sentence: Because of its inherent limitations, internal control over financial reporting may not prevent or detect misstatements.
Prediction: {'label': 'G-neu', 'score': 0.8928971290588379}
Sentence: In the PEN, ODEQ also alleged violations of major source new source review, CAO and federal hazardous air pollutant control technology requirements and gave notice to the Company that ODEQ had referred the matter to USEPA for review and possible formal enforcement.
Prediction: {'label': 'E-neg', 'score': 0.8567849397659302}
Sentence: We also have centralized departments for each segment whose focus is to build and maintain relationships with potential and existing national employers and develop graduate job opportunities and, where possible, relocation assistance, sign-on bonuses, tool packages and tuition reimbursement plans with our manufacturer brand partners and other industry employers.
Prediction: {'label': 'S-neu', 'score': 0.8255176544189453}
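Some of the predictions above carry fairly low scores (for example, about 0.52 for the "G-pos" prediction), so it may be useful to set such cases aside for manual review. The sketch below shows one way to do that; the function name `split_by_confidence` and the 0.7 threshold are assumptions chosen for illustration, not values recommended by the model author. The sample entries reuse three scores from the output above, rounded to four decimals.

```python
def split_by_confidence(results, threshold=0.7):
    """Separate predictions into confident and uncertain lists.

    `threshold` is an arbitrary illustrative cut-off, not a tuned value.
    """
    confident = [r for r in results if r["score"] >= threshold]
    uncertain = [r for r in results if r["score"] < threshold]
    return confident, uncertain


# Three predictions taken from the output above, with rounded scores.
sample = [
    {"label": "G-pos", "score": 0.5211},
    {"label": "Non-ESG-neu", "score": 0.6650},
    {"label": "E-neg", "score": 0.8568},
]
confident, uncertain = split_by_confidence(sample)
print("Confident:", [r["label"] for r in confident])
print("Uncertain:", [r["label"] for r in uncertain])
```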