metadata

license: mit
title: 🎙️ PodXplainClone
sdk: gradio
emoji: 📚
colorFrom: red
colorTo: blue
pinned: true
short_description: 'A CPU-friendly AI podcast generator using SpeechT5.'

🎙️ PodXplainClone

From script to story — voice it like never before, even on CPU.

PodXplainClone is an experimental Hugging Face Space that demonstrates the core functionality of an AI-powered podcast generator and is built to run on CPU hardware. It turns written dialogue or narrative text into an audio podcast with multiple distinct voices.

This Space serves as a CPU-friendly alternative to, and development sandbox for, the main PodXplain project (which is awaiting GPU resources for a more advanced model).

✨ Features

📝 Long-form Support: Handle up to 50,000 characters of text
🎭 Multi-speaker Audio: Automatic speaker detection and assignment with distinct voices
🔄 Smart Segmentation: Intelligent text splitting with progress tracking
🎵 Compact Output: MP3 format for small file size and broad player compatibility
🚀 Real-time Progress: Live updates during generation
🎨 Modern UI: Clean, intuitive Gradio interface
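
As an illustration of the segmentation feature, here is a minimal sketch of how long input could be split into TTS-sized chunks at sentence boundaries. It is not taken from this Space's `app.py`: the function name `split_into_segments` and the 400-character per-segment budget are assumptions made for the example; only the 50,000-character input limit comes from the feature list above.

```python
import re

MAX_INPUT_CHARS = 50_000   # input limit stated in the feature list
MAX_SEGMENT_CHARS = 400    # assumed per-segment budget, not taken from the Space


def split_into_segments(text: str, max_chars: int = MAX_SEGMENT_CHARS) -> list[str]:
    """Greedily pack whole sentences into segments of at most max_chars."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"Input exceeds {MAX_INPUT_CHARS} characters")
    # Split after '.', '!' or '?' when followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    segments, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the budget
        while len(sentence) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            segments.append(current)
            current = sentence
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    segments = split_into_segments("Welcome to the show. Today we talk about TTS!", max_chars=25)
    for i, segment in enumerate(segments, start=1):
        # A simple stand-in for progress tracking
        print(f"[{i}/{len(segments)}] {segment}")
```

Splitting on sentence boundaries keeps each chunk short enough for the TTS model while avoiding cuts in the middle of a sentence, and the segment count gives a natural unit for progress updates.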

🛠️ Tech Stack

Frontend: Gradio for interactive web interface
TTS Engine: microsoft/speecht5_tts for multi-speaker voice synthesis (small enough to run on CPU)
Audio Processing: pydub for audio manipulation and MP3 conversion
Hosting: Hugging Face Spaces (currently configured for CPU Basic tier)
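
To show how the pieces of this stack could fit together, the sketch below synthesizes each speaker turn with SpeechT5 and joins the clips into one MP3 with pydub. It is an assumption-laden illustration, not the Space's actual code: `pcm16_bytes` and `synthesize_to_mp3` are hypothetical helpers, the caller is assumed to supply one speaker embedding tensor of shape (1, 512) per speaker label (the SpeechT5 model card uses x-vectors from the `Matthijs/cmu-arctic-xvectors` dataset for this), and MP3 export through pydub needs ffmpeg installed.

```python
import struct

SAMPLE_RATE = 16_000  # SpeechT5 with the HiFi-GAN vocoder produces 16 kHz mono audio


def pcm16_bytes(samples) -> bytes:
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))


def synthesize_to_mp3(turns, speaker_embeddings, out_path="podcast.mp3", pause_ms=300):
    """Hypothetical helper: render (speaker, text) turns into a single MP3.

    turns: list of (speaker_label, text) pairs.
    speaker_embeddings: dict mapping speaker_label to a torch tensor of shape (1, 512).
    """
    # Heavy imports stay local so pcm16_bytes works without them installed
    from pydub import AudioSegment
    from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

    processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
    model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    podcast = AudioSegment.empty()
    for speaker, text in turns:
        inputs = processor(text=text, return_tensors="pt")
        # Returns a 1-D float waveform tensor when a vocoder is passed
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embeddings[speaker], vocoder=vocoder
        )
        clip = AudioSegment(
            data=pcm16_bytes(speech.tolist()),
            sample_width=2,          # 16-bit samples
            frame_rate=SAMPLE_RATE,
            channels=1,
        )
        # Short silence between turns (pause length is an assumed value)
        podcast += clip + AudioSegment.silent(duration=pause_ms, frame_rate=SAMPLE_RATE)

    podcast.export(out_path, format="mp3")  # requires ffmpeg
    return out_path
```

Loading the models once and reusing them for every turn matters on a CPU-only tier, where model initialization is a noticeable share of the total run time.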

📋 How to Use

Input Text: Paste or type your podcast script (up to 50,000 characters)
Choose Mode: Select speaker detection mode:
- Auto: Smart detection based on content structure
- Paragraph: Speaker changes at paragraph breaks
- Dialogue: Detection based on dialogue markers
Generate: Click "Generate Podcast" and watch the progress
Download: Get your MP3 file and listen to your podcast!
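
The three speaker detection modes can be pictured with a small sketch like the one below. The exact rules this Space applies are not documented here, so everything in it is an assumption for illustration: the function name `detect_speakers`, the `Name: text` marker format for Dialogue mode, the "at least half the lines are marked" heuristic for Auto mode, and the round-robin `Speaker N` labels for Paragraph mode.

```python
import re

# Assumed marker format for dialogue mode: "Name: spoken text"
DIALOGUE_MARKER = re.compile(r"^\s*([A-Za-z][\w ]{0,30}):\s*(.+)$")


def detect_speakers(text: str, mode: str = "auto", num_voices: int = 2) -> list[tuple[str, str]]:
    """Return (speaker_label, text) pairs for the chosen detection mode."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    if mode == "auto":
        # Assumed heuristic: use dialogue mode when at least half the lines carry a marker
        marked = sum(1 for ln in lines if DIALOGUE_MARKER.match(ln))
        mode = "dialogue" if lines and marked * 2 >= len(lines) else "paragraph"

    if mode == "paragraph":
        # Rotate through the available voices at each paragraph break
        return [(f"Speaker {i % num_voices + 1}", p) for i, p in enumerate(paragraphs)]

    if mode == "dialogue":
        turns: list[tuple[str, str]] = []
        for ln in lines:
            match = DIALOGUE_MARKER.match(ln)
            if match:
                turns.append((match.group(1).strip(), match.group(2).strip()))
            elif turns:
                # An unmarked line continues the previous speaker's turn
                speaker, spoken = turns[-1]
                turns[-1] = (speaker, f"{spoken} {ln}")
            else:
                turns.append(("Narrator", ln))
        return turns

    raise ValueError(f"Unknown mode: {mode!r}")


if __name__ == "__main__":
    script = "Alice: Hello there.\nBob: Hi Alice!\nHow are you?"
    for speaker, spoken in detect_speakers(script):
        print(f"{speaker} -> {spoken}")
```

A script written with explicit markers is best served by Dialogue mode, while plain narrative text falls back to alternating voices per paragraph.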

🚀 Quick Start

Local Development

# Clone the repository
git clone https://huggingface.co/spaces/Nick021402/PodXplainClone
cd PodXplainClone

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

Note: PodXplainClone is a separate effort that demonstrates the core features of the PodXplain project. It was created to use Text-to-Speech models that do not require paid GPU hardware, so that it stays usable within Hugging Face's free CPU tier. The original PodXplain project aims to integrate larger, more advanced TTS models such as Nari Labs' Dia 1.6B once dedicated GPU access becomes available.
Developed by: Nick021402