# Qwen3-Embedding-0.6B (ONNX Standard / FP32)

This repository contains the unquantized (FP32) ONNX export of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B).
It preserves full precision and is compatible with Hugging Face's Text Embeddings Inference (TEI) and the `optimum` library.
## Model Details
| Attribute | Detail |
|---|---|
| Base Model | Qwen/Qwen3-Embedding-0.6B |
| Format | ONNX (Opset 17) |
| Quantization | None (FP32 / Standard Precision) |
| Task | Feature Extraction / Semantic Embedding |
| File Size | ~2.4 GB |
## Usage with Text Embeddings Inference (TEI)
This model is pre-configured for TEI.

Note: `--auto-truncate` is recommended because the model supports a 32k-token context while TEI's default token limits are far smaller; without it, requests that exceed the limit are rejected rather than truncated.
### Option A: Docker CLI
```bash
docker run --rm -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id Svenni551/Qwen3-Embedding-0.6B-ONNX \
  --pooling mean \
  --auto-truncate
```
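Once the container is up, you can request embeddings over TEI's `POST /embed` endpoint, which accepts a string or a list of strings under `inputs`. A minimal sketch using `requests`, assuming the server above is listening on `localhost:8080`:

```python
import requests

# TEI returns one embedding vector per input string.
response = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": ["Hello World", "What is the capital of France?"]},
)
response.raise_for_status()

embeddings = response.json()  # list of float vectors, one per input
print(len(embeddings), len(embeddings[0]))
```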
### Option B: Docker Compose
```yaml
services:
  embedding-service:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-latest
    environment:
      - MODEL_ID=Svenni551/Qwen3-Embedding-0.6B-ONNX
      - POOLING=mean
      - MAX_CLIENT_BATCH_SIZE=8
      - MAX_BATCH_TOKENS=2048
      - AUTO_TRUNCATE=true
    volumes:
      - ./data:/data
    ports:
      - "8080:80"
```
## Usage with Python (Optimum)
```bash
pip install "optimum[onnxruntime]" transformers
```
```python
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer
import torch

model_id = "Svenni551/Qwen3-Embedding-0.6B-ONNX"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ORTModelForFeatureExtraction.from_pretrained(model_id)

inputs = tokenizer("Hello World", padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean pooling: average the token embeddings, ignoring padding positions.
attention_mask = inputs["attention_mask"]
token_embeddings = outputs.last_hidden_state
input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
embeddings = torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

print(embeddings.shape)  # (batch_size, hidden_size)
```
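Since the embeddings are meant for semantic comparison, a common follow-up is to L2-normalize them and compare pairs via cosine similarity. A short sketch building on the code above; the `embed` helper is illustrative, not part of this repo:

```python
import torch
import torch.nn.functional as F

def embed(texts: list[str]) -> torch.Tensor:
    """Tokenize, run the ONNX model, mean-pool, and L2-normalize."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    pooled = (out.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
    return F.normalize(pooled, p=2, dim=1)

query, doc = embed(["What is the capital of France?", "Paris is the capital of France."])
print(float(query @ doc))  # cosine similarity; higher = more similar
```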