Model Description

care-chinese-llama3.1-8b is based on meta-llama/Llama-3.1-8B-Instruct and further fine-tuned on our CARE, enhancing cultural awareness especially in Chinese.

Example

>>> from vllm import LLM, SamplingParams
>>> from transformers import AutoTokenizer
>>> import torch

>>> model = LLM(model="geyang627/care-chinese-llama3.1-8b", tensor_parallel_size=torch.cuda.device_count(), dtype="auto", trust_remote_code=True, max_model_len=2048)

>>> tokenizer = AutoTokenizer.from_pretrained("geyang627/care-chinese-llama3.1-8b", use_fast=False, trust_remote_code=True)
>>> if tokenizer.pad_token is None:
>>>     tokenizer.pad_token = tokenizer.eos_token
>>>     tokenizer.pad_token_id = tokenizer.eos_token_id

>>> sampling_params = SamplingParams(temperature=0.7, top_p=1.0, max_tokens=256)
>>> outputs = model.generate(["为什么中国人不喜欢数字4?"], sampling_params)
>>> print(outputs[0].outputs[0].text)
Downloads last month
6
Safetensors
Model size
8.03B params
Tensor type
F16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for geyang627/care-chinese-llama3.1-8b

Finetuned
(1755)
this model
Quantizations
1 model

Collection including geyang627/care-chinese-llama3.1-8b