---
title: LLaVA Chat
emoji: 🖼️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.19.2
app_file: app.py
pinned: false
license: mit
---
# LLaVA Chat
A lightweight implementation of LLaVA (Large Language and Vision Assistant) optimized for Hugging Face Spaces deployment.
## Features
- Efficient model loading with 8-bit quantization
- Memory-optimized inference
- FastAPI backend with Gradio interface
- Support for image understanding and visual conversations
- Optimized for deployment on Hugging Face Spaces
## Quick Start
1. Visit the [Hugging Face Space](https://huggingface.co/spaces/Prashant26am/llava-chat)
2. Upload an image
3. Ask questions about the image
4. Get AI-powered responses
## Local Development
1. Clone the repository:
```bash
git clone https://github.com/Prashant-ambati/llava-implementation.git
cd llava-implementation
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the application:
```bash
python app.py
```
## Model Architecture
- Vision Model: CLIP ViT-Base
- Language Model: TinyLlama-1.1B-Chat
- Projection Layer: MLP with configurable hidden dimensions
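The projection layer is what bridges the two models: it maps CLIP's patch features into the language model's embedding space. Below is a minimal sketch, assuming CLIP ViT-Base's 768-dimensional outputs and TinyLlama-1.1B's 2048-dimensional hidden size; the `VisionProjector` name and two-layer shape are illustrative, not taken from this repo.

```python
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    """Hypothetical projection module; the repo's class name may differ."""

    def __init__(self, vision_dim: int = 768, hidden_dim: int = 2048, llm_dim: int = 2048):
        super().__init__()
        # Two-layer MLP with a nonlinearity, LLaVA-1.5 style
        self.mlp = nn.Sequential(
            nn.Linear(vision_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, llm_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from CLIP ViT-Base
        # returns:        (batch, num_patches, llm_dim) token-like embeddings
        return self.mlp(image_features)
```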
## Memory Optimization
The implementation includes several memory optimization techniques:
- 8-bit quantization for language model
- Efficient image processing
- Gradient checkpointing
- Memory-efficient attention
- Automatic mixed precision
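As a rough illustration of how 8-bit quantization, gradient checkpointing, and mixed precision fit together, here is a minimal sketch of loading the language model with Hugging Face Transformers; the model ID, option values, and generation call are assumptions, not the exact code in `app.py`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed model ID; 8-bit loading requires the `bitsandbytes` package.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit weights
    device_map="auto",          # spread layers across available devices
    torch_dtype=torch.float16,  # half precision for non-quantized parts
)
model.gradient_checkpointing_enable()  # recompute activations to save memory

# Inference under automatic mixed precision (assumes a CUDA GPU)
with torch.autocast(device_type="cuda", dtype=torch.float16):
    inputs = tokenizer("Describe the image.", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
```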
## API Endpoints
- `POST /process_image`: Process an image with a prompt
- `GET /status`: Check model and application status
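A hypothetical client for these endpoints using `requests`; the port and the multipart field names (`image`, `prompt`) are assumptions, since the README does not document the request schema.

```python
import requests

BASE_URL = "http://localhost:7860"  # assumed local port; adjust for your deployment

# Check model and application status
print(requests.get(f"{BASE_URL}/status").json())

# Process an image with a prompt
with open("example.jpg", "rb") as image_file:
    response = requests.post(
        f"{BASE_URL}/process_image",
        files={"image": image_file},
        data={"prompt": "What is happening in this image?"},
    )
print(response.json())
```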
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Based on the paper "Visual Instruction Tuning" (NeurIPS 2023)
- Uses models from Hugging Face Transformers
- Built with FastAPI and Gradio