
GOT-OCR2 ONNX Export (stepfun-ai/GOT-OCR2_0)

This directory contains ONNX exports produced from stepfun-ai/GOT-OCR2_0 using BaofengZan/GOT-OCRv2-onnx (llm-export).

Contents

  • got_ocr2_vision_encoder.onnx (+ .onnx.data): Vision encoder. Input: images; output: visual features. Ready for use with TranslateBlue's GOT-OCR2 ONNX integration.
  • got_ocr2_decoder.onnx (optional): Single decoder ONNX; when present, full OCR runs: image → vision encoder → encoder_hidden_states → decoder (BOS then autoregressive) → logits → text. Produce it with scripts/export_got_ocr2_decoder_onnx.py.
  • tokenizer.json, tokenizer_config.json: Tokenizer for decoding output token ids to text. Generated from the Hugging Face model vocab.
  • Split decoder (for reference; the app does not load these as a single decoder):
    • embedding.onnx, block_0.onnx .. block_23.onnx, lm.onnx, norm.onnx, mm_projector_vary.onnx (each with .onnx.data where applicable).

App compatibility

TranslateBlue's GOTOCR2OCRService expects two ONNX files:

  1. got_ocr2_vision_encoder.onnx – present; you can run the vision encoder in the app.
  2. got_ocr2_decoder.onnx – a single decoder ONNX that takes decoder_input_ids and encoder_hidden_states and outputs logits. When this file is present (e.g. from scripts/export_got_ocr2_decoder_onnx.py), full OCR runs: vision encoder → encoder_hidden_states → decoder (with BOS then autoregressive tokens) → logits → decoded text.

See Docs/GOT_OCR2_ONNX_Export.md for I/O names and export options.
