arxiv:2507.15339

LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators

Published on Jul 21
Abstract

Modern moderation systems increasingly support multiple languages, but often fail to address localisation and low-resource variants, creating safety gaps in real-world deployments. Small models offer a potential alternative to large LLMs, yet still demand considerable data and compute. We present LionGuard 2, a lightweight, multilingual moderation classifier tailored to the Singapore context, supporting English, Chinese, Malay, and partial Tamil. Built on pre-trained OpenAI embeddings and a multi-head ordinal classifier, LionGuard 2 outperforms several commercial and open-source systems across 17 benchmarks, including both Singapore-specific and public English datasets. The system is actively deployed within the Singapore Government, demonstrating practical efficacy at scale. Our findings show that high-quality local data and robust multilingual embeddings can achieve strong moderation performance without fine-tuning large models. We release our model weights and part of our training data to support future work on LLM safety.

AI-generated summary

LionGuard 2, a lightweight multilingual moderation classifier using pre-trained embeddings and a multi-head ordinal classifier, outperforms commercial and open-source systems across benchmarks without fine-tuning large models.
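The architecture pairs a frozen pre-trained embedding with multiple ordinal classification heads, one per harm category. A minimal sketch of one such head is below; the cumulative-threshold decoding, the toy 4-dimensional "embedding", and all weights are illustrative assumptions, not the paper's actual parameters or dimensions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ordinal_head(embedding, weights, biases):
    """One moderation head over a frozen embedding vector.

    Emits K-1 cumulative binary probabilities ("severity >= level k");
    the predicted severity level is the count of thresholds crossed.
    All parameters here are hypothetical, for illustration only.
    """
    probs = [
        sigmoid(sum(w * e for w, e in zip(row, embedding)) + b)
        for row, b in zip(weights, biases)
    ]
    return sum(p > 0.5 for p in probs)

# Toy example: 4-dim "embedding", one head with 3 severity levels.
embedding = [0.9, -0.2, 0.4, 0.1]
weights = [
    [1.0, 0.0, 0.5, 0.0],  # logit for "severity >= 1"
    [0.8, 0.0, 0.3, 0.0],  # logit for "severity >= 2"
    [0.2, 0.0, 0.1, 0.0],  # logit for "severity >= 3"
]
biases = [-0.5, -0.9, -1.5]

level = ordinal_head(embedding, weights, biases)
print(level)  # → 1
```

Because the embedding model stays frozen, only these small per-category heads are trained, which is what keeps the approach lightweight and data-efficient relative to fine-tuning a large model.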
