todorristov committed on
Commit
f41a66e
·
1 Parent(s): fcee07d

Revert "fixed README.md file"

This reverts commit fcee07db5df946ec0de33aefeb928233cd7d89f0.

undid the README.md

README.md CHANGED
@@ -1,230 +1,14 @@
- # CAR CLASSIFICATION - Brand, Model & Model Year
-
- This project is a deep learning pipeline that classifies car **brand**, **model**, and **model year** from a single image using a fine-tuned ConvNeXt model. It uses the [Stanford Cars dataset](https://huggingface.co/datasets/tanganke/stanford_cars) and leverages **transfer learning** with `facebook/convnext-large-224`. Built in **PyTorch**, this modular and scalable pipeline supports training, evaluation, and inference.
-
- ---
-
- ## 🔍 Key Features
-
- - Download and preprocess image data from Hugging Face
- - Fine-tune pretrained ConvNeXt models (modern ConvNets inspired by transformers)
- - Track training metrics and model checkpoints
- - Predict the class of custom input images using saved models
- - Modular design for training, evaluation, and inference
-
-
- ---
-
- ## 🧰 Installation
-
-
- ## 🔧 Setup Instructions
-
- 1. Clone the repo from GitHub
- ``` bash
- git clone https://github.com/Brainster-Data-Science-Academy/CarClassificationTeam1
-
- cd CarClassificationTeam1
- ```
-
- 2. Create and activate a virtual environment
- ``` bash
- python -m venv venv
- source venv/bin/activate # On Windows: venv\Scripts\activate
- ```
-
- 3. Install dependencies
- ``` bash
- pip install -r requirements.txt
- ```
-
- 4. Download the dataset
- ``` bash
- python -m src.data.datadownloader
- ```
-
- ---
-
- ## 📝 Requirements
-
- - Python 3.8+
- - PyTorch 2.3.0+cu126
- - torchvision 0.18.0+cu126
- - torchaudio 2.3.0+cu126
- - transformers
- - datasets
- - Other dependencies as listed in requirements.txt
-
- ---
-
- ## 🧠 Model Architecture
-
- ![ConvNeXt Architecture](images/convnext_architecture.png)
-
- We fine-tuned a pretrained [ConvNeXt](https://huggingface.co/facebook/convnext-base-224) model, a modern convolutional network rather than a vision transformer:
-
- - **Model**: ConvNeXt-Base (224x224 resolution)
- - **Pretrained on**: ImageNet-1k
- - **Fine-tuned on**: Stanford Cars (196 classes)
- - **Transfer Learning**: Only the last **two ConvNeXt stages** and the **classification head** were trained
-
-
- Since the **Stanford Cars** dataset contains a relatively small number of examples (~8,100 training and ~8,000 validation images), we adopted a **transfer learning** strategy. The ConvNeXt model was initialized with pretrained ImageNet-1k weights, and the classification head was replaced with a randomly initialized one for our 196 target classes.
-
- To balance generalization and training efficiency, we unfroze and trained only the last two stages of the ConvNeXt backbone (Stages 3 and 4), along with the classification head. Earlier layers remained frozen to preserve robust pretrained features.
-
-
- **Data Augmentation**:
-
- ```python
- transforms.Compose([
-     transforms.RandomResizedCrop(image_size, scale=(0.8, 1.0), ratio=(0.75, 1.33)),
-     transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
-     transforms.RandomHorizontalFlip(),
-     transforms.RandomRotation(degrees=15),
-     transforms.RandomGrayscale(p=0.1),
-     transforms.ToTensor(),
-     transforms.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 5)),
-     transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3)),
-     transforms.Normalize(mean=mean, std=std),
- ])
- ```
-
- ---
-
- ## 📊 Performance
-
- - **Train Accuracy**: `98.62%`
- - **Validation Accuracy**: `92.30%`
- - **Train Loss (Cross Entropy)**: `0.9010`
- - **Validation Loss (Cross Entropy)**: `1.1231`
-
- ---
-
- ## 🚀 Usage (Example)
-
- ```python
- from PIL import Image
- from transformers import AutoImageProcessor, ConvNextForImageClassification
- import torch
-
- # Load model and processor
- model = ConvNextForImageClassification.from_pretrained("todorristov/car_classification_model")
- processor = AutoImageProcessor.from_pretrained("todorristov/car_classification_model")
-
- # Load and preprocess image
- image = Image.open("example.jpg").convert("RGB")
- inputs = processor(images=image, return_tensors="pt")
-
- # Predict
- with torch.no_grad():
-     logits = model(**inputs).logits
- predicted_class = logits.argmax(-1).item()
-
- print(f"Predicted class ID: {predicted_class}")
- ```
-
- ---
-
- ## 🏋️ Training Details
-
- - **Framework**: PyTorch
- - **Hardware**: NVIDIA RTX 4060
- - **Epochs**: 32 planned (early stopping ended training after 28)
- - **Batch Size**: 32
- - **Optimizer**: AdamW (lr=1e-4, weight_decay=1e-4)
- - **Loss Function**: Cross Entropy (label_smoothing=0.1)
- - **Scheduler**: ReduceLROnPlateau (factor=0.5, patience=2, min_lr=1e-6)
-
- ![Training and Validation Metrics](reports/figures/Training_Validation-Loss_Accuracy.png)
-
- This result demonstrates the effectiveness of fine-tuning high-capacity pretrained models on medium-sized, domain-specific datasets. The model generalizes well despite visual similarities between different car models and years.
-
- ---
-
- ## ⚠️ Limitations
-
- - Trained only on 196 classes from Stanford Cars (mostly 1990–2012 U.S. models)
- - Poor performance on:
-   - Damaged or modified vehicles
-   - Non-standard angles or lighting
- - Not suitable for unseen/new car models; retraining needed
-
- ---
-
- ## 🛠 Project Details
-
- - **Developed by**: Todor Ristov, Goran Nikoloski, Milana Sokolova
- - **For**: TwinCar Project, Sols (Skopje, North Macedonia)
- - **Language**: Python
- - **Framework**: PyTorch
- - **License**: [MIT](LICENSE)
-
- ---
-
- ## 🔗 Resources
-
- - 📚 Stanford Cars Dataset: [https://huggingface.co/datasets/tanganke/stanford\_cars](https://huggingface.co/datasets/tanganke/stanford_cars)
- - 🤗 Model Card: [https://huggingface.co/sols/car-classification-convnext](https://huggingface.co/sols/car-classification-convnext)
- - 🌐 GitHub Repository: [https://github.com/Brainster-Data-Science-Academy/CarClassificationTeam1](https://github.com/Brainster-Data-Science-Academy/CarClassificationTeam1)
- - 🌟 Demo Space: [https://huggingface.co/spaces/todorristov/car-classification-convnext](https://huggingface.co/spaces/todorristov/car-classification-convnext)
-
  ---
- ## 🤝 Contributing
-
- Contributions are welcome! Please open an issue or submit a pull request. Make sure to update tests and documentation as needed.
-
  ---
- ## 📂 Project Structure
-
- ```
- project_root/
- │
- ├── images/                     # Model architecture visualizations
- │
- ├── models/                     # Stores trained model checkpoints (e.g., best_model.pt)
- │   └── best_model.pt
- │
- ├── notebooks/                  # Jupyter notebooks for model exploration and experiments
- │
- ├── reports/                    # Training logs (loss, accuracy, LR, time, etc.)
- │
- ├── src/                        # Source code
- │   ├── data/                   # Data-related scripts
- │   │   ├── datadownloader.py   # Downloads and saves dataset to local folders
- │   │   └── datatransforms.py   # Data augmentation and preprocessing transforms
- │   │
- │   ├── models/                 # Model utilities
- │   │   └── load_model.py       # Loads model, processor, and device
- │   │
- │   ├── utils/                  # Utility scripts
- │   │   └── save_label_map.py   # Saves class label map
- │   │
- │   ├── evaluate.py             # Evaluation logic per epoch
- │   ├── inference.py            # Inference script for classifying new images
- │   ├── train_utils.py          # Training helper functions (e.g., metric calc, logging)
- │   ├── train.py                # Main training script
- │   └── visualize.py            # Visualizations (e.g., confusion matrix, sample predictions)
- │
- ├── README.md                   # Project documentation
- └── requirements.txt            # Project dependencies
- ```
-
- ---
-
- ## 💬 Citation
-
- ```
- @misc{twin-car-classification,
-   title={Car Classification - Brand, Model & Model Year},
-   author={Todor Ristov},
-   year={2025},
-   howpublished={\url{https://huggingface.co/todorristov/car_classification_model}},
-   note={A deep learning pipeline for vehicle recognition.}
- }
- ```
-
- ---
-
- Feel free to ⭐ the repo and share your feedback!
 
  ---
+ title: Car Classification Convnext
+ emoji: πŸ‘
+ colorFrom: yellow
+ colorTo: blue
+ sdk: gradio
+ sdk_version: 5.34.2
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Car classification model trained on Stanford Cars
  ---
 
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
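
A note on the Performance numbers in the removed README: it reports a training cross-entropy of `0.9010` alongside `98.62%` training accuracy, with `label_smoothing=0.1` over 196 classes. Label smoothing gives the loss a nonzero floor, which probably explains most of that gap. The sketch below is not code from the repository. It assumes the smoothing scheme used by PyTorch's `CrossEntropyLoss`, where the target is `(1 - smoothing) * one_hot + smoothing / num_classes`, and the function name `smoothed_ce_floor` is made up for illustration.

```python
import math


def smoothed_ce_floor(num_classes: int, smoothing: float) -> float:
    """Lowest cross-entropy reachable under label smoothing.

    Assumes targets of the form
    (1 - smoothing) * one_hot + smoothing / num_classes,
    so the minimum equals the entropy of that target distribution.
    """
    if smoothing == 0.0:
        return 0.0  # hard labels: a perfect prediction reaches zero loss
    off = smoothing / num_classes   # probability mass on each wrong class
    on = 1.0 - smoothing + off      # probability mass on the true class
    return -(on * math.log(on) + (num_classes - 1) * off * math.log(off))


# Values taken from the README's Training Details (196 classes, smoothing 0.1).
floor = smoothed_ce_floor(num_classes=196, smoothing=0.1)
print(f"loss floor: {floor:.4f}")
```

Under these assumptions the floor comes out near 0.85, so a training loss around 0.90 sits close to the minimum and does not by itself indicate an underfit model.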
images/convnext_architecture.png DELETED
Binary file (52.9 kB)
 
reports/.gitkeep DELETED
File without changes
reports/figures/.gitkeep DELETED
File without changes
reports/figures/Training_Validation-Loss_Accuracy.png DELETED
Binary file (58.4 kB)
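
The Training Details in the removed README list `ReduceLROnPlateau (factor=0.5, patience=2, min_lr=1e-6)` with a starting learning rate of `1e-4`. As a rough illustration of what that schedule does, here is a simplified pure-Python re-implementation. It is not the PyTorch class and leaves out its `threshold` and `cooldown` options; the helper name `plateau_schedule` and the loss values are hypothetical.

```python
def plateau_schedule(val_losses, lr=1e-4, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        # Reduce only once the count of non-improving epochs exceeds patience.
        if bad_epochs > patience:
            lr = max(lr * factor, min_lr)
            bad_epochs = 0
        history.append(lr)
    return history


# Hypothetical validation losses, not taken from the project's logs.
lrs = plateau_schedule([1.0, 0.9, 0.95, 0.96, 0.97, 0.8])
print(lrs)
```

With `patience=2`, the rate is halved on the third consecutive epoch without improvement. In PyTorch the corresponding call would be `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=2, min_lr=1e-6)` followed by `scheduler.step(val_loss)` once per epoch; `mode="min"` assumes the monitored metric is the validation loss, which the README does not state.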