---
title: Embedding Playground
emoji: 🌐
colorFrom: blue
colorTo: yellow
sdk: static
pinned: false
license: apache-2.0
short_description: 'Exploring text embeddings and group similarity'
models:
  - onnx-community/Qwen3-0.6B-ONNX
  - onnx-community/Qwen3-Embedding-0.6B-ONNX
---

# Embedding WebGPU Playground

This is a browser-based playground for exploring text embeddings, group similarity, and clustering using WebGPU and ONNX models.

## Features
- **Text search**: Use your browser's search (Ctrl+F) to quickly find and highlight text within the textarea or results.
- **Text input**: Enter text in the textarea. Use single newlines (`\n`) to separate lines within a group, and three consecutive newlines (`\n\n\n`, i.e. two blank lines) to separate groups.
- **Group similarity heatmap**: Click **Show Similarity Heatmap** to compute and visualize cosine similarity between group embeddings as a heatmap.
- **Search cluster reordering**: If a group header contains the word `search`, you can control how other groups and lines are ordered relative to the search group using the **Search Cluster Sort Mode** dropdown:
  - **By Group Similarity**: Orders groups by similarity to the search group, and lines within each group by similarity to the search group embedding.
  - **By Max Search Line**: Orders lines within each group by their maximum similarity to any line in the search group.
- **K-Means & Balanced K-Means clustering**: Set the number of clusters and clustering type, then click **Clustering** to group all lines into clusters. The textarea is updated to reflect the new clusters.
- **UMAP scatter plot**: Click **Cluster Plot** to visualize clusters in 2D using UMAP. Cluster names are shown in the legend.
- **Cluster naming**: Click **Naming Cluster** to generate descriptive names for each cluster using a text generation model. Names are updated in both the textarea and the scatter plot legend.
- **Progress bar**: All major actions display a progress bar during processing.
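
The group format and the heatmap computation described above can be sketched in plain JavaScript. This is a minimal illustration, not the playground's actual code: `parseGroups`, `meanVector`, `cosineSimilarity`, and `similarityMatrix` are hypothetical helper names, and averaging line embeddings into a group embedding is an assumption about how group embeddings are formed.

```javascript
// Split raw textarea content into groups of lines.
// Groups are separated by three or more consecutive newlines, lines by a single newline.
function parseGroups(text) {
  return text
    .replace(/\r\n/g, "\n") // normalize Windows line endings
    .split(/\n{3,}/)
    .map((group) => group.split("\n").map((line) => line.trim()).filter(Boolean))
    .filter((lines) => lines.length > 0);
}

// Average equal-length vectors (assumed way to derive one embedding per group).
function meanVector(vectors) {
  const mean = new Array(vectors[0].length).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < v.length; i++) mean[i] += v[i] / vectors.length;
  }
  return mean;
}

// Cosine similarity of two vectors; returns 0 if either vector has zero length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Pairwise similarity between all group embeddings (rows and columns are groups).
function similarityMatrix(groupEmbeddings) {
  return groupEmbeddings.map((a) => groupEmbeddings.map((b) => cosineSimilarity(a, b)));
}
```

The resulting matrix has the row-by-column shape that a Plotly.js `heatmap` trace accepts as its `z` value.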

## Tech stack
- [@huggingface/transformers](https://www.npmjs.com/package/@huggingface/transformers) (ESM, WebGPU)
- [Qwen3-Embedding-0.6B-ONNX](https://huggingface.co/onnx-community/Qwen3-Embedding-0.6B-ONNX) (ONNX embedding model)
- [Plotly.js](https://plotly.com/javascript/) (UMD)
- [umap-js](https://github.com/PAIR-code/umap-js) (for 2D projection)

## Usage
1. Enter or paste your text in the textarea.
2. Separate groups with three consecutive newlines (two blank lines) if you want to compare group similarity.
3. (Optional) Use the **Search Cluster Sort Mode** dropdown to control how the search cluster reorders groups/lines.
4. Click **Show Similarity Heatmap** to compute and visualize group similarities.
5. To cluster all lines, set the number of clusters and click **Clustering**. The textarea and heatmap will update to reflect the new clusters.
6. Click **Cluster Plot** to visualize clusters in 2D.
7. Click **Naming Cluster** to generate descriptive names for each cluster.

---