arXiv:2508.16232

Hybrid Pruning: In-Situ Compression of Self-Supervised Speech Models for Speaker Verification and Anti-Spoofing

Published on Aug 22, 2025

AI-generated summary

A unified framework integrates structured pruning with task-specific fine-tuning for speech processing models, achieving significant parameter reduction with minimal performance loss and improved generalization.

Abstract

Although large-scale self-supervised learning (SSL) models like WavLM have achieved state-of-the-art performance in speech processing, their significant size impedes deployment on resource-constrained devices. While structured pruning is a key technique for model compression, existing methods typically separate it from task-specific fine-tuning. This multi-stage approach struggles to create optimal architectures tailored for diverse downstream tasks. In this work, we introduce a unified framework that integrates structured pruning into the downstream fine-tuning process. Our framework unifies these steps, jointly optimizing for task performance and model sparsity in a single stage. This allows the model to learn a compressed architecture specifically for the end task, eliminating the need for complex multi-stage pipelines and knowledge distillation. Our pruned models achieve up to a 70% parameter reduction with negligible performance degradation on large-scale datasets, reaching equal error rates of 0.7%, 0.8%, and 1.6% on Vox1-O, -E, and -H, respectively. Furthermore, our approach demonstrates improved generalization in low-resource scenarios, reducing overfitting and achieving a state-of-the-art 3.7% EER on ASVspoof5.
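The core idea described in the abstract, prunable units gated by learnable parameters and trained against a joint task-plus-sparsity objective, can be illustrated with a small sketch. The PyTorch code below is purely illustrative: the toy GatedSelfAttention encoder, the sigmoid gates, the L1-style penalty, and the sparsity_weight value are assumptions made for exposition, not the paper's actual WavLM architecture, gating parameterization, or loss.

```python
# Minimal sketch of single-stage "prune while fine-tuning": each prunable unit
# (here, an attention head of a toy encoder) gets a learnable gate, and the
# training loss combines the downstream task loss with a penalty that pushes
# gates toward zero. All names and the toy encoder are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedSelfAttention(nn.Module):
    """Multi-head self-attention whose heads can be soft-pruned by gates."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # One learnable logit per head; sigmoid(logit) acts as a soft gate.
        self.gate_logits = nn.Parameter(torch.zeros(num_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each of q, k, v to (batch, heads, time, head_dim).
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        attn = F.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = attn @ v  # (batch, heads, time, head_dim)
        gates = torch.sigmoid(self.gate_logits).view(1, -1, 1, 1)
        out = out * gates  # heads with gates near 0 contribute almost nothing
        out = out.transpose(1, 2).reshape(b, t, d)
        return self.proj(out)

    def sparsity_penalty(self) -> torch.Tensor:
        # Mean gate value, i.e. the expected fraction of heads kept;
        # minimizing it encourages pruning.
        return torch.sigmoid(self.gate_logits).mean()


def training_step(encoder, classifier, x, labels, sparsity_weight=0.1):
    """Joint objective: task loss plus a weighted sparsity penalty.

    Speaker classification stands in for the verification / anti-spoofing
    heads purely for illustration.
    """
    feats = encoder(x).mean(dim=1)  # mean-pool over time
    task_loss = F.cross_entropy(classifier(feats), labels)
    return task_loss + sparsity_weight * encoder.sparsity_penalty()


if __name__ == "__main__":
    dim, heads, num_speakers = 64, 8, 10
    encoder = GatedSelfAttention(dim, heads)
    classifier = nn.Linear(dim, num_speakers)
    params = list(encoder.parameters()) + list(classifier.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)

    x = torch.randn(4, 50, dim)                       # (batch, frames, features)
    labels = torch.randint(0, num_speakers, (4,))
    loss = training_step(encoder, classifier, x, labels)
    loss.backward()
    opt.step()
    print(f"joint loss: {loss.item():.4f}")
```

In a setup like this, structured units whose gates collapse toward zero during fine-tuning would afterwards be removed outright, which is how a single training stage can yield a task-specific compressed architecture without a separate pruning pipeline or distillation step.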
