vision.ocr
Extracts text from images and returns it together with per-block bounding boxes and confidence scores.
Input
Accepted formats: JPG, PNG, WEBP (max 20 MB)
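The input limits above can be checked on the client before uploading. The following is a minimal sketch, not part of the tool itself: the helper name `check_upload` is hypothetical, and it assumes that `.jpeg` is accepted alongside `.jpg` and that the 20 MB limit means 20 × 1024 × 1024 bytes.

```python
import os

# Assumption: ".jpeg" is treated the same as ".jpg".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: the 20 MB limit is interpreted as 20 * 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Compare the extension case-insensitively.
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a file on disk, the size could be taken from `os.path.getsize(path)`. The check only looks at the file name and size; it does not verify that the bytes are a valid image.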
Example Output
{
  "text": "Hello World\nLine 2",
  "blocks": [
    { "text": "Hello World", "bbox": [10, 20, 200, 50], "confidence": 0.98 }
  ],
  "model": "qwen-vl-max",
  "usage": { "input_tokens": 512, "output_tokens": 64 }
}
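A response shaped like the example output above could be consumed as in the sketch below. The helper `confident_blocks` and the 0.9 threshold are illustrative assumptions, not part of the tool. The `bbox` values are left uninterpreted, because the example does not say whether they are corner coordinates or an origin plus width and height.

```python
import json

# Sample response copied from the example output. A raw string keeps the
# "\n" escape intact so that json.loads decodes it into a newline.
raw = r'''{
  "text": "Hello World\nLine 2",
  "blocks": [
    { "text": "Hello World", "bbox": [10, 20, 200, 50], "confidence": 0.98 }
  ],
  "model": "qwen-vl-max",
  "usage": { "input_tokens": 512, "output_tokens": 64 }
}'''


def confident_blocks(response: dict, min_confidence: float = 0.9) -> list[dict]:
    """Keep only blocks whose confidence meets the (hypothetical) threshold."""
    return [
        block
        for block in response.get("blocks", [])
        if block.get("confidence", 0.0) >= min_confidence
    ]


response = json.loads(raw)
lines = response["text"].splitlines()  # full recognized text, one entry per line
kept = confident_blocks(response)

print(lines)                       # ['Hello World', 'Line 2']
print([b["text"] for b in kept])   # ['Hello World']
```

In this example, `text` contains two lines while `blocks` lists only one entry, so the sketch treats `text` as the complete result and `blocks` as positional detail.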