---
base_model: EleutherAI/gpt-j-6b
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: furiosa-llm
tags:
- furiosa-ai
---
# Model Overview
- **Model Architecture:** GPT-J
- **Input:** Text
- **Output:** Text
- **Model Optimizations:**
  - Beam search optimization (beam width 4) for MLPerf; this model does not support greedy search, top-k, or top-p sampling
- **Maximum Context Length:** 2k tokens
- Maximum Prompt Length: 1920 tokens
- Maximum Generation Length: 2048 tokens
- **Intended Use Cases:** Intended for commercial and non-commercial use. Like [EleutherAI/gpt-j-6b](https://huggingface.co/EleutherAI/gpt-j-6b), this model is intended for text summarization.
- **Release Date:** 04/12/2025
- **Version:** v2025.2
- **License(s):** [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
- **Supported Inference Engine(s):** Furiosa LLM
- **Supported Hardware Compatibility:** FuriosaAI RNGD
- **Preferred Operating System(s):** Linux
- **Fine-tunes:** This model is fine-tuned for text summarization. More details can be found at [Datasets & Models in mlcommons/inference/language/gpt-j/README.md](https://github.com/mlcommons/inference/blob/7bf59976b5f4eb7c5b8f30a88af832e028028446/language/gpt-j/README.md#datasets--models)
- **Quantization:**
- Tool: Furiosa Model Compressor v0.6.2, included in Furiosa SDK 2025.2
- Weight: float8, Activation: float8, KV cache: float8
- Calibration: [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) ([instruction](https://github.com/mlcommons/inference/blob/7bf59976b5f4eb7c5b8f30a88af832e028028446/language/gpt-j/README.md#download--process-dataset))
## Description
This is a pre-compiled model of a fine-tuned and quantized version of [EleutherAI/gpt-j-6b](https://huggingface.co/EleutherAI/gpt-j-6b). The model was fine-tuned for text summarization, and [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) was used for quantization calibration. Details about how this model was fine-tuned and calibrated can be found in [mlcommons/inference/language/gpt-j/README.md](https://github.com/mlcommons/inference/blob/7bf59976b5f4eb7c5b8f30a88af832e028028446/language/gpt-j/README.md).
Use the following prompt when calling this model, replacing the {INPUTS} placeholder with the article to summarize:
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Summarize the following news article:
### Input:
{INPUTS}
### Response:
```
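The template above can be filled in programmatically before sending a request. A minimal sketch in Python (the `build_prompt` helper is illustrative, not part of any Furiosa SDK):

```python
# Fixed summarization prompt template used by this model.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n"
    "### Instruction:\n"
    "Summarize the following news article:\n"
    "### Input:\n"
    "{INPUTS}\n"
    "### Response:\n"
)


def build_prompt(article: str) -> str:
    """Insert the article text into the fixed summarization template."""
    return PROMPT_TEMPLATE.replace("{INPUTS}", article)


prompt = build_prompt("The quick brown fox jumped over the lazy dog.")
print(prompt)
```

Remember that the prompt plus article must stay within the 1920-token maximum prompt length noted above.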
## Usage
### Furiosa-LLM
Follow the example command below after [installing Furiosa-LLM and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_llm.html#installing-furiosa-llm).
```sh
furiosa-llm serve furiosa-ai/gpt-j-6b-FP8-MLPerf
```
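Once the server is running, it can be queried over its OpenAI-compatible HTTP API. A hedged sketch of building a completion request payload in Python (the endpoint path `/v1/completions`, the default port 8000, and the parameter values shown are assumptions based on typical OpenAI-compatible servers, not confirmed specifics of Furiosa-LLM):

```python
import json

# Completion request payload for an OpenAI-compatible endpoint, assumed to be
# served at http://localhost:8000/v1/completions. The prompt string here is a
# stand-in; in practice, use the full summarization template from this card.
payload = {
    "model": "furiosa-ai/gpt-j-6b-FP8-MLPerf",
    "prompt": "Summarize the following news article: ...",  # placeholder
    "max_tokens": 128,
    # Note: this checkpoint is compiled for beam search (beam width 4);
    # greedy, top-k, and top-p sampling are not supported, so sampling
    # parameters such as temperature or top_p should not be relied upon.
}
body = json.dumps(payload)
print(body)
```

The serialized `body` could then be POSTed with any HTTP client (e.g. `curl -d @- http://localhost:8000/v1/completions`).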
### MLPerf Benchmark using RNGD
Follow the example command below after [installing furiosa-mlperf and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_mlperf.html).
```sh
furiosa-mlperf gpt-j-offline furiosa-ai/gpt-j-6b-FP8-MLPerf ./mlperf-result
```