How do you calculate inference speed on OlmOCR-Bench?

#26
by Piperino - opened

Is the 5.71 pages/sec figure measured sequentially, processing one page at a time, or do you batch requests to fill the entire H100 GPU?

Bump. Could you also share how to correctly deploy the model on an H100 to reach that speed?
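
In case it helps others following this thread, here is a minimal sketch of how a pages/sec number could be measured with batched inference, assuming the model is served through a vLLM OpenAI-compatible endpoint (e.g. started with `vllm serve allenai/olmOCR-7B-0225-preview`). The endpoint URL, model id, prompt text, and the `pages/` directory of pre-rendered page images are all assumptions for illustration, not the benchmark's actual harness:

```python
# Hypothetical throughput measurement: send many page requests concurrently to a
# vLLM OpenAI-compatible server and divide the page count by wall-clock time.
# Model id, endpoint, and prompt are assumptions, not the official benchmark code.
import asyncio
import base64
import time
from pathlib import Path

from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")


async def ocr_page(image_path: Path) -> str:
    # Encode one rendered PDF page as base64 and ask the model to transcribe it.
    b64 = base64.b64encode(image_path.read_bytes()).decode()
    resp = await client.chat.completions.create(
        model="allenai/olmOCR-7B-0225-preview",  # assumed model id
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this page to plain text."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        max_tokens=2048,
    )
    return resp.choices[0].message.content


async def main() -> None:
    pages = sorted(Path("pages").glob("*.png"))  # pre-rendered page images
    start = time.perf_counter()
    # Submit all pages at once so the server can batch them on the GPU;
    # awaiting each request one after another would measure latency instead.
    await asyncio.gather(*(ocr_page(p) for p in pages))
    elapsed = time.perf_counter() - start
    print(f"{len(pages) / elapsed:.2f} pages/sec over {len(pages)} pages")


if __name__ == "__main__":
    asyncio.run(main())
```

The key distinction is in the timing loop: submitting all requests concurrently lets the serving engine batch them on the GPU, so the resulting pages/sec reflects batched throughput rather than single-page sequential latency.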
