Model Card for DoRA-adapted Lite-Oute-1-300M-Instruct

The model was trained with a DoRA adapter to classify the sentiment of Twitter messages as 'positive', 'negative', or 'neutral'. It was trained on the cardiffnlp/tweet_eval dataset. The DoRA-adapted layers are the k_proj and v_proj weight matrices of all attention layers.
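As a minimal sketch, attaching such an adapter with the peft library could look like the following. The rank and alpha values are illustrative assumptions, not the values used for this model, and the base-model id is assumed to be OuteAI/Lite-Oute-1-300M-Instruct:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Assumed base-model id; substitute the checkpoint actually used.
base_model = AutoModelForCausalLM.from_pretrained("OuteAI/Lite-Oute-1-300M-Instruct")

# DoRA is enabled in peft by setting use_dora=True on a LoraConfig.
# r and lora_alpha below are illustrative placeholders.
dora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["k_proj", "v_proj"],  # adapt only the attention K/V projections
    use_dora=True,
)
model = get_peft_model(base_model, dora_config)
model.print_trainable_parameters()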

Model Details

The system prompt for the model is as follows:

You are a helpful assistant that classifies the sentiment of a message. Classify the sentiment of the given message as exactly one word: 'negative', 'neutral', or 'positive'. Be brief, respond with exactly one word.

Inputs for the model should be provided in the following format:

Message: "[text of the message]"

The model is trained to output labels in the following format:

The sentiment of the message is [label].

where [label] is 'positive', 'negative', or 'neutral'.
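Putting the pieces together, an end-to-end inference sketch could look like this. It assumes the adapter repository can be loaded with peft's AutoPeftModelForCausalLM and that it ships the tokenizer and chat template of the base model; adjust the ids and generation settings as needed:

import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Assumed repository id for the adapter; substitute the actual path.
model_id = "X1716/llm-course-hw3-dora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoPeftModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": (
        "You are a helpful assistant that classifies the sentiment of a message. "
        "Classify the sentiment of the given message as exactly one word: "
        "'negative', 'neutral', or 'positive'. Be brief, respond with exactly one word."
    )},
    {"role": "user", "content": 'Message: "I love this!"'},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))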

Labels can be extracted from the model's outputs with the following function:

import re

LABELS = ("positive", "negative", "neutral")

def postprocess_sentiment(output_text: str) -> str:
    """
    Extracts the sentiment classification ("positive", "negative", or "neutral")
    from the model's output text.

    Process:
        1. Splits the output at the first occurrence of the keyword "assistant"
           and processes the text after it.
        2. Uses a regular expression to search for the first occurrence of the
           words "positive", "negative", or "neutral" (ignoring case).
        3. Returns the found sentiment in lowercase. If no match is found,
           returns an empty string.

    Parameters:
        output_text (str): The complete text output from the model, including
            conversation headers.

    Returns:
        str: The sentiment classification or an empty string.
    """
    # Keep only the assistant's reply, i.e. the text after the "assistant" header.
    parts = output_text.split("assistant", 1)
    text_to_process = parts[1] if len(parts) > 1 else output_text
    # Find the first label word, case-insensitively.
    match = re.search(rf"\b({'|'.join(LABELS)})\b", text_to_process, re.IGNORECASE)
    return match.group(1).lower() if match else ""
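For example, on an illustrative raw output string:

# Illustrative raw generation including conversation headers.
raw_output = 'system You are a helpful assistant... user Message: "I love this!" assistant The sentiment of the message is positive.'
print(postprocess_sentiment(raw_output))  # -> "positive"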

Training Details

Only the k_proj and v_proj layers were adapted. The model was trained for 3 epochs with a learning rate of 5e-4 and a batch size of 8. The final cross-entropy loss was 0.0466.
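As a sketch, these hyperparameters map onto transformers TrainingArguments roughly as follows; the output directory and every omitted setting are assumptions, not values from the original training run:

from transformers import TrainingArguments

# Hyperparameters from the card; output_dir and all omitted settings are placeholders.
training_args = TrainingArguments(
    output_dir="lite-oute-dora-sentiment",
    num_train_epochs=3,
    learning_rate=5e-4,
    per_device_train_batch_size=8,
)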

Evaluation

Confusion matrix calculated on the test set is presented below:

[Figure: confusion matrix on the test set (dora_res.png)]

It corresponds to a macro F1-score of 0.5166.
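For reference, the macro F1-score can be computed from extracted labels with scikit-learn; the arrays below are illustrative only, not the actual test data:

from sklearn.metrics import f1_score

# Illustrative labels: in practice y_true comes from tweet_eval and
# y_pred from postprocess_sentiment applied to each generation.
y_true = ["positive", "neutral", "negative", "neutral"]
y_pred = ["positive", "neutral", "neutral", "negative"]
print(f1_score(y_true, y_pred, average="macro"))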

Examples of outputs:

Input (correct label is 'positive'):

Message: "QT @user In the original draft of the 7th book, Remus Lupin survived the Battle of Hogwarts. #HappyBirthdayRemusLupin"

Output:

"The sentiment of the message is positive"

Input (correct label is 'neutral'):

Message: "Sorry bout the stream last night I crashed out but will be on tonight for sure. Then back to Minecraft in pc tomorrow night."

Output:

"The sentiment of the message is neutral"

Input (correct label is 'positive'; the model misclassifies this example as 'neutral'):

Message: "@user Alciato: Bee will invest 150 million in January, another 200 in the Summer and plans to bring Messi by 2017"

Output:

"The sentiment of the message is neutral"
