arxiv:2110.11144

RCT: Random Consistency Training for Semi-supervised Sound Event Detection

Published on Oct 21, 2021

Authors:

Abstract

A semi-supervised learning strategy with random consistency training, self-consistency loss, and hard mixup data augmentation improves sound event detection in data-deficient scenarios.

AI-generated summary

Sound event detection (SED), as a core module of acoustic environmental analysis, suffers from the problem of data deficiency. The integration of semi-supervised learning (SSL) largely mitigates such problem while bringing no extra annotation budget. This paper researches on several core modules of SSL, and introduces a random consistency training (RCT) strategy. First, a self-consistency loss is proposed to fuse with the teacher-student model to stabilize the training. Second, a hard mixup data augmentation is proposed to account for the additive property of sounds. Third, a random augmentation scheme is applied to flexibly combine different types of data augmentations. Experiments show that the proposed strategy outperform other widely-used strategies.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2110.11144 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2110.11144 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2110.11144 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.