# Quickstart with Python
AutoTrain is a library that allows you to train state-of-the-art models on Hugging Face Spaces or locally.
It provides a simple, easy-to-use interface for training models on tasks such as LLM finetuning, text classification,
image classification, object detection, and more.
In this quickstart guide, we will show you how to train a model using AutoTrain in Python.
## Getting Started
AutoTrain can be installed using pip:
```bash
$ pip install autotrain-advanced
```
The example code below shows how to finetune an LLM using AutoTrain in Python:
```python
import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    epochs=3,
    batch_size=1,
    lr=1e-5,
    peft=True,
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)

backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()
```
In this example, we are finetuning the `meta-llama/Llama-3.2-1B-Instruct` model on the `HuggingFaceH4/no_robots` dataset.
We train the model for 3 epochs with a batch size of 1 and a learning rate of `1e-5`,
using the `paged_adamw_8bit` optimizer, the `cosine` scheduler, `bf16` mixed precision, and gradient accumulation over 8 steps.
Training uses parameter-efficient finetuning (PEFT) with int4 quantization, and the trained adapter is merged back into the base model (`merge_adapter=True`).
The final model will be pushed to the Hugging Face Hub after training.
To train the model, save the code above to a file (for example, `train.py`), set your Hugging Face credentials, and run it:
```bash
$ export HF_USERNAME=<your-hf-username>
$ export HF_TOKEN=<your-hf-write-token>
$ python train.py
```
This will create a new project directory named `autotrain-llama32-1b-finetune` and start the training process.
Once training is complete, the model will be pushed to the Hugging Face Hub.
`HF_TOKEN` and `HF_USERNAME` are only required if you want to push the model to the Hub or if you are accessing a gated model or dataset (note that `meta-llama/Llama-3.2-1B-Instruct` is gated, so a token is needed for this example either way).
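
If you only want to experiment locally without pushing anything to the Hub, you can drop the credentials entirely. The sketch below is a minimal variant of the quickstart under that assumption; the small, non-gated base model (`HuggingFaceTB/SmolLM2-135M-Instruct`) is just an example choice, and `push_to_hub=False` keeps everything on your machine:

```python
from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

# A minimal local-only sketch: no Hub push, so no username/token needed
# (assuming the base model and dataset are not gated).
params = LLMTrainingParams(
    model="HuggingFaceTB/SmolLM2-135M-Instruct",  # assumption: any small, non-gated instruct model works
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    epochs=1,
    batch_size=1,
    lr=1e-5,
    peft=True,
    project_name="autotrain-local-sft",
    log="tensorboard",
    push_to_hub=False,  # keep the trained model local
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```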
## AutoTrainProject Class
[[autodoc]] project.AutoTrainProject
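
`AutoTrainProject` pairs a parameter object with a backend. Besides `local`, AutoTrain can launch training on managed hardware such as Hugging Face Spaces. The sketch below is illustrative only: the backend identifier `spaces-a10g-large` is an assumption about typical Spaces hardware names, so verify the backends supported by your AutoTrain version before using it.

```python
import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

# Same parameters as the quickstart above; only the backend changes.
params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    project_name="autotrain-llama32-1b-finetune",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)

# "spaces-a10g-large" is an assumed example of a Spaces hardware
# identifier; check the available backends for your AutoTrain version.
project = AutoTrainProject(params=params, backend="spaces-a10g-large", process=True)
project.create()
```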
## Parameters
### Text Tasks
[[autodoc]] trainers.clm.params.LLMTrainingParams
[[autodoc]] trainers.sent_transformers.params.SentenceTransformersParams
[[autodoc]] trainers.seq2seq.params.Seq2SeqParams
[[autodoc]] trainers.token_classification.params.TokenClassificationParams
[[autodoc]] trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams
[[autodoc]] trainers.text_classification.params.TextClassificationParams
[[autodoc]] trainers.text_regression.params.TextRegressionParams
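
Each of these parameter classes plugs into the same `AutoTrainProject` flow shown in the quickstart. As a hedged sketch for text classification, assuming `TextClassificationParams` is importable from `autotrain.params` like `LLMTrainingParams`, and using the `stanfordnlp/imdb` dataset (whose columns are `text` and `label`):

```python
from autotrain.params import TextClassificationParams
from autotrain.project import AutoTrainProject

# A minimal sketch; the dataset and column names are assumptions
# (stanfordnlp/imdb provides "text" and "label" columns).
params = TextClassificationParams(
    model="google-bert/bert-base-uncased",
    data_path="stanfordnlp/imdb",
    text_column="text",
    target_column="label",
    train_split="train",
    valid_split="test",
    epochs=3,
    batch_size=8,
    lr=2e-5,
    max_seq_length=256,
    project_name="autotrain-bert-imdb",
    log="tensorboard",
    push_to_hub=False,
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```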
### Image Tasks
[[autodoc]] trainers.image_classification.params.ImageClassificationParams
[[autodoc]] trainers.image_regression.params.ImageRegressionParams
[[autodoc]] trainers.object_detection.params.ObjectDetectionParams
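
Image tasks follow the same pattern. Below is a minimal image classification sketch; the dataset choice and its `img`/`label` column names are assumptions, so swap in your own dataset and columns as needed:

```python
from autotrain.params import ImageClassificationParams
from autotrain.project import AutoTrainProject

# A minimal sketch; dataset and column names are assumptions
# (uoft-cs/cifar10 stores images under "img" and labels under "label").
params = ImageClassificationParams(
    model="google/vit-base-patch16-224",
    data_path="uoft-cs/cifar10",
    image_column="img",
    target_column="label",
    train_split="train",
    valid_split="test",
    epochs=3,
    batch_size=16,
    lr=5e-5,
    project_name="autotrain-vit-cifar10",
    log="tensorboard",
    push_to_hub=False,
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```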
### Tabular Tasks
[[autodoc]] trainers.tabular.params.TabularParams
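
Tabular training works the same way, with the model name selecting a classical learner rather than a Hub checkpoint. The sketch below is illustrative: the dataset path, column names, and the `xgboost` model choice are assumptions to adapt to your own data:

```python
from autotrain.params import TabularParams
from autotrain.project import AutoTrainProject

# A minimal sketch; the dataset path and column names are assumptions.
# Substitute your own tabular dataset with an id column, feature columns,
# and one or more target columns.
params = TabularParams(
    model="xgboost",
    data_path="path/to/tabular-dataset",  # assumption: local data folder or Hub dataset
    id_column="id",
    target_columns=["income"],
    train_split="train",
    task="classification",
    num_trials=100,
    project_name="autotrain-tabular-classification",
    push_to_hub=False,
)

project = AutoTrainProject(params=params, backend="local", process=True)
project.create()
```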