bosonai/higgs-audio-v2-generation-3B-base

xingjian-bosonai committed 21 days ago

Commit 096073c · verified · 1 Parent(s): 0eda647

Update README.md

Files changed (1)

README.md +39 -11

@@ -127,19 +127,47 @@ pip install -r requirements.txt
 pip install -e .
 ```
-Afterwards, you can launch generation examples via the `generation.py` provided in the repository.
-```bash
-python3 examples/generation.py \
---transcript examples/transcript/single_speaker/en_basic.txt \
---ref_audio belinda \
---seed 12345
 ```
-Alternatively, here's a python script that you can try to convert text to speech after you have installed higgs-audio.
 ## License
 TBA

 pip install -e .
 ```
+Afterwards, try to run the following python code snippet to convert text to speech.
+```python
+from boson_multimodal.serve.serve_engine import HiggsAudioServeEngine, HiggsAudioResponse
+from boson_multimodal.data_types import ChatMLSample, Message, AudioContent
+import torch
+import torchaudio
+import time
+import click
+MODEL_PATH = "bosonai/higgs-audio-v2-generation-3B-staging"
+AUDIO_TOKENIZER_PATH = "bosonai/higgs-audio-v2-tokenizer-staging"
+messages = [
+    Message(
+        role="system",
+        content="Generate audio following instruction.\n\n<|scene_desc_start|>\nSPEAKER0: british accent\n<|scene_desc_end|>",
+    ),
+    Message(
+        role="user",
+        content="The sun rises in the east and sets in the west. This simple fact has been observed by humans for thousands of years.",
+    ),
+]
+device = "cuda" if torch.cuda.is_available() else "cpu"
+serve_engine = HiggsAudioServeEngine(MODEL_PATH, AUDIO_TOKENIZER_PATH, device=device)
+output: HiggsAudioResponse = serve_engine.generate(
+    chat_ml_sample=ChatMLSample(messages=messages),
+    max_new_tokens=1024,
+    temperature=1.0,
+    top_p=0.95,
+    top_k=50,
+    stop_strings=["<|end_of_text|>", "<|eot_id|>"],
+    seed=12345,
+)
+torchaudio.save(f"output.wav", torch.from_numpy(output.audio)[None, :], output.sampling_rate)
 ```
+You can also check https://github.com/boson-ai/higgs-audio/examples for more example scripts.
 ## License
 TBA
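
Note on the added snippet: the system message embeds a speaker description between `<|scene_desc_start|>` and `<|scene_desc_end|>`. The sketch below shows one way that string could be assembled programmatically. `build_system_prompt` is a hypothetical helper, not part of higgs-audio, and the one-line-per-speaker layout for multiple speakers is an assumption extrapolated from the single `SPEAKER0` line in the diff.

```python
def build_system_prompt(speaker_descriptions):
    """Build a system prompt in the format used by the README snippet.

    Hypothetical helper for illustration only; it is not provided by
    the higgs-audio package. `speaker_descriptions` maps a speaker tag
    (e.g. "SPEAKER0") to a free-text description (e.g. "british accent").
    """
    # Assumption: each speaker gets its own "TAG: description" line.
    lines = [f"{tag}: {desc}" for tag, desc in speaker_descriptions.items()]
    scene = "\n".join(lines)
    return (
        "Generate audio following instruction.\n\n"
        f"<|scene_desc_start|>\n{scene}\n<|scene_desc_end|>"
    )


# Reproduces the system message content from the README snippet.
prompt = build_system_prompt({"SPEAKER0": "british accent"})
print(prompt)
```

The returned string can be passed as the `content` of the `Message(role="system", ...)` entry in the snippet above.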