ZeroNER: Fueling Zero-Shot Named Entity Recognition via Entity Type Descriptions
ZeroNER is a description-driven Named Entity Recognition (NER) model designed to generalize to unseen entity types in zero-shot settings, where no labeled examples are available for the target classes.
📄 Paper: ZeroNER: Fueling Zero-Shot Named Entity Recognition via Entity Type Descriptions (Findings of ACL 2025)
🔧 Code: Available soon!
📌 What is ZeroNER?
ZeroNER is a BERT-based cross-encoder fine-tuned on a silver dataset generated under LLM supervision. Unlike previous zero-shot methods that rely solely on entity type names, ZeroNER leverages natural language descriptions of entity types to disambiguate mentions and generalize better across domains.
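For intuition, here is a minimal sketch of what a cross-encoder input looks like, assuming SMXM-style packing where the type description and the sentence share a single BERT input (the exact preprocessing of the released model may differ):

from transformers import AutoTokenizer

# Sketch only (an assumption, not the exact ZeroNER preprocessing):
# one forward pass per entity type, with the type description and the
# sentence packed into one cross-encoder input so every sentence token
# can attend to the definition it is being classified against.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")

description = "Names of geographical locations other than GPEs."
sentence = "Traffic was rerouted near the East Third Ring Road."

inputs = tokenizer(description, sentence, return_tensors="pt", truncation=True)
print(tokenizer.decode(inputs["input_ids"][0]))
# [CLS] Names of geographical locations ... [SEP] Traffic was rerouted ... [SEP]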
It was built to address key limitations in real-world NER:
- LLM Supervision: We use a frozen LLM to generate a domain-diverse distillation dataset using type descriptions.
- Self-correction: A second round of LLM filtering ensures the silver dataset remains high quality.
- Student Training: A compact BERT model is trained using both the entity mention and the entity type description, forming a cross-encoder for robust generalization.
- Hard Zero-Shot Evaluation: We enforce strict zero-shot constraints: no overlap in type names or descriptions between the train/dev/test splits (see the sketch below).
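To make the last point concrete, here is a toy check of the hard zero-shot constraint; the type inventories are invented for the example:

# Hypothetical entity type inventories per split (invented for illustration).
train_types = {"PERSON", "ORG", "DATE"}
test_types = {"FAC", "LOC", "WORK_OF_ART"}

# Hard zero-shot: no type name (or description) may appear in both splits.
assert train_types.isdisjoint(test_types), "type leakage across splits"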
How to use it
We integrated our model into the IBM Zshot library, enabling users to quickly and easily deploy ZeroNER in their workflows.
!pip install -U zshot==0.0.11 gliner datasets
import spacy
from zshot import PipelineConfig, displacy
from zshot.linker import LinkerSMXM
from zshot.utils.data_models import Entity

# Define the target entity types. ZeroNER conditions on the natural
# language descriptions, not just the type names.
entities = [
    Entity(name='FAC', description='Names of man-made structures: infrastructure (streets, bridges), buildings, monuments, etc. belong to this type. Buildings that are referred to using the name of the company or organization that uses them should be marked as FAC when they refer to the physical structure of the building itself, usually in a locative way: "I\'m reporting live from right outside [Massachusetts General Hospital]"', vocabulary=None),
    Entity(name='LOC', description='Names of geographical locations other than GPEs. These include mountain ranges, coasts, borders, planets, geo-coordinates, bodies of water. Also included in this category are named regions such as the Middle East, areas, neighborhoods, continents and regions of continents. Do NOT mark deictics or other non-proper nouns: here, there, everywhere, etc. As with GPEs, directional modifiers such as "southern" are only marked when they are part of the location name itself.', vocabulary=None),
    Entity(name='WORK_OF_ART', description='Titles of books, songs, television programs and other creations. Also includes awards. These are usually surrounded by quotation marks in the article (though the quotations are not included in the annotation). Newspaper headlines should only be marked if they are referential. In other words the headline of the article being annotated should not be marked, but if in the body of the text there is a reference to an article, then it is markable as a work of art.', vocabulary=None)
]

# Build a blank spaCy pipeline and plug ZeroNER in as the zshot linker.
nlp = spacy.blank("en")
nlp_config = PipelineConfig(
    linker=LinkerSMXM(model_name="disi-unibo-nlp/zeroner-base"),
    entities=entities,
    device='cuda'  # switch to 'cpu' if no GPU is available
)
nlp.add_pipe("zshot", config=nlp_config, last=True)

text = """
I remember the SMS was written like this at that time , saying that , ah , there was a sewage pipe leakage accident on the side road at the southeast corner of Jingguang Bridge at East Third Ring Road , and , well , traffic supervision was implemented near Chaoyang Road , Jingguang Bridge , and East Third Ring Road , and requesting cars to make a detour .
"""

doc = nlp(text)
displacy.serve(doc, style="ent")
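displacy.serve starts a local web server for visualization; to inspect the predictions programmatically instead, you can read the standard spaCy annotations that the zshot pipeline writes into doc.ents:

# Print each predicted mention with its entity type.
for ent in doc.ents:
    print(ent.text, ent.label_)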
We also provide a free Google Colab notebook to help you explore the library and customize it for your specific use case.
🔥 Training Data
The model is trained on synthetic annotations generated by LLaMA-3.1-8B-Instruct over the Pile Uncopyrighted dataset.
The resulting automatically annotated dataset, PileUncopyrighted-NER-BIO, follows the BIO format and was used as the training source for this model.
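For reference, BIO tagging marks the beginning (B-) and inside (I-) tokens of each mention, with O for everything else. An invented example (not drawn from the actual dataset):

# Illustrative BIO-tagged sentence (invented for this example):
tokens = ["Jingguang", "Bridge", "was", "closed", "."]
tags   = ["B-FAC",     "I-FAC",  "O",   "O",      "O"]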
📊 Performance
ZeroNER outperforms both:
- LLMs up to 8B parameters (e.g., LLaMA-3.1, Granite-3.0, Qwen2.5)
- Contaminated small encoder models (e.g., GLiNER) that leak information across splits
More details are provided in our paper.
🤝 Citation
If you use ZeroNER in your research, please cite:
@inproceedings{cocchieri-etal-2025-zeroner,
    title = "{Z}ero{NER}: Fueling Zero-Shot Named Entity Recognition via Entity Type Descriptions",
    author = "Cocchieri, Alessio and
      Mart{\'i}nez Galindo, Marcos and
      Frisoni, Giacomo and
      Moro, Gianluca and
      Sartori, Claudio and
      Tagliavini, Giuseppe",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.805/",
    doi = "10.18653/v1/2025.findings-acl.805",
    pages = "15594--15616",
    ISBN = "979-8-89176-256-5",
    abstract = "What happens when a named entity recognition (NER) system encounters entities it has never seen before? In practical applications, models must generalize to unseen entity types where labeled training data is either unavailable or severely limited{---}a challenge that demands zero-shot learning capabilities. While large language models (LLMs) offer extensive parametric knowledge, they fall short in cost-effectiveness compared to specialized small encoders. Existing zero-shot methods predominantly adopt a relaxed definition of the term with potential leakage issues and rely on entity type names for generalization, overlooking the value of richer descriptions for disambiguation. In this work, we introduce ZeroNER, a description-driven framework that enhances hard zero-shot NER in low-resource settings. By leveraging general-domain annotations and entity type descriptions with LLM supervision, ZeroNER enables a BERT-based student model to successfully identify unseen entity types. Evaluated on three real-world benchmarks, ZeroNER consistently outperforms LLMs by up to 16{\%} in F1 score, and surpasses lightweight baselines that use type names alone. Our analysis further reveals that LLMs derive significant benefits from incorporating type descriptions in the prompts."
}
Base model: google-bert/bert-base-cased