This is the SATE MVP, integrate the all the pipelines into one framework. Contain the main entrance, build for docker.
main.py: Input: Entire audio file Output: Transcription with annotation
Preprocess: Segmentation + speaker diarization -> crisper whisper transcriptions for each segmentation
P.S. Should keep transcript consist in each pipelines.
IMAGE CREATION:
docker build -t sate_0.11 .
(New) HOW TO USE after image created:
docker run --gpus all -it --rm
-v /home/easgrad/shuweiho/workspace/volen/SATE_docker_test/input:/sate/input
-v /home/easgrad/shuweiho/workspace/volen/SATE_docker_test/session_data:/sate/session_data
-p 7860:7860
sate_0.11
curl -X POST http://localhost:7860/process
-F "audio_file=@/home/easgrad/shuweiho/workspace/volen/SATE_docker_test/input/454.mp3"
-F "device=cuda"
-F "pause_threshold=0.25"
(Old - don't follow it) HOW TO USE after image created:
docker run --gpus all -it --rm
-v /home/easgrad/shuweiho/workspace/volen/SATE_docker_test/input:/sate/input
-v /home/easgrad/shuweiho/workspace/volen/SATE_docker_test/session_data:/sate/session_data
-p 5000:5000
sate_0.10
curl -X POST http://localhost:5000/process
-H "Content-Type: application/json"
-d '{
"input_audio_file": "/sate/input/454.mp3",
"device": "cuda",
"pause_threshold": 0.5
}'
Test on HF space:
curl -X POST https://Sven33-SATE.hf.space/process -F "audio_file=@/home/easgrad/shuweiho/workspace/volen/SATE_docker_test/input/454.mp3" -F "device=cuda" -F "pause_threshold=0.25"