---
license: mit
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: token-classification
tags:
- token classification
- hallucination detection
- transformers
---

# LettuceDetect: Hallucination Detection Model

**Model Name:** lettucedect-base-modernbert-en-v1  
**Organization:** KRLabsOrg  

## Overview

LettuceDetect is a transformer-based model for hallucination detection on context and answer pairs, designed for Retrieval-Augmented Generation (RAG) applications. This model is built on **ModernBERT**, which has been specifically chosen and trained becasue of its extended context support (up to **8192 tokens**). This long-context capability is critical for tasks where detailed and extensive documents need to be processed to accurately determine if an answer is supported by the provided context.

## Model Details

- **Architecture:** ModernBERT (Base) with extended context support (up to 8192 tokens)
- **Task:** Token Classification / Hallucination Detection
- **Training Dataset:** RagTruth (with potential extensions to biomedical datasets)
- **Language:** English

## How It Works

The model is trained to identify tokens in the answer text that are not supported by the given context. During inference, the model returns token-level predictions which are then aggregated into spans. This allows users to see exactly which parts of the answer are considered hallucinated.

## Usage

### Installation

Install the 'lettucedetect' repository

```bash
pip install lettucedetect
```

### Using the model

```python
from lettucedetect.models.inference import HallucinationDetector

# Initialize the detector with the transformer-based approach.
detector = HallucinationDetector(method="transformer", model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1")


context = """Briefly answer the following question:
What is the capital of France? What is the population of France?

Bear in mind that your response should be strictly based on the following three passages:
passage 1: France is a country in Europe. The capital of France is Paris. The population of France is 67 million.

output:
"""

answer = "The capital of France is Paris. The population of France is 69 million."

# Get span-level predictions indicating which parts of the answer are considered hallucinated.
predictions = detector.predict(context, answer, output_format="spans")
print("Predictions:", predictions)

# Predictions: [{'start': 31, 'end': 71, 'confidence': 0.9944414496421814, 'text': ' The population of France is 69 million.'}]
```