Embed This
🏆
Generate 1024‑dim embeddings for each input line
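A minimal sketch of the task. The hashing-based `embed_line` below is a self-contained placeholder, not a real embedding model; in practice you'd swap it for an actual encoder (e.g. a sentence-transformers model that outputs 1024-dim vectors), which is a choice left open here:

```python
import hashlib
import math

DIM = 1024  # target embedding dimensionality from the task

def embed_line(line: str, dim: int = DIM) -> list[float]:
    """Placeholder: deterministic character-trigram hashing into `dim` buckets.
    A real pipeline would call an embedding model here instead."""
    vec = [0.0] * dim
    text = f"  {line}  "  # pad so short lines still yield trigrams
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        # hash each trigram to a bucket index
        h = int.from_bytes(hashlib.sha256(gram.encode()).digest()[:8], "big")
        vec[h % dim] += 1.0
    # L2-normalize so vectors are comparable by cosine similarity
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

lines = ["hello world", "embed this"]
embeddings = [embed_line(line) for line in lines]  # one 1024-dim vector per input line
```

Each input line maps to one unit-length 1024-dim vector, so downstream similarity search works the same way it would with a learned model.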
You can use a VLM if you really want an API. That said, there are plenty of OCR models that can run on CPU (albeit slowly).
50B per token isn't very efficient... Wonder if we could make this 4: https://huggingface.co/inclusionAI/Ling-1T/blob/main/config.json#L22