vision.describe
Generates a detailed text description of visual content in images.
Input
One image file in JPG, PNG, or WEBP format, up to 20 MB.
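The input limits can be checked on the client side before uploading. The following is a minimal sketch, not part of the tool itself: the function name `check_upload` is hypothetical, reading "20MB" as 20 * 1024 * 1024 bytes is an assumption, and so is accepting `.jpeg` as an alias for JPG.

```python
from pathlib import Path

# Limits taken from the input description above.
# Assumption: "20MB" means 20 * 1024 * 1024 bytes; the page does not define the unit.
# Assumption: ".jpeg" is accepted alongside ".jpg".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_BYTES = 20 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive extension check
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

This only inspects the file name and size; it does not verify that the file contents actually match the declared format.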
Output
A JSON object containing the generated description, the model name, and token usage.
Example Output
{
  "description": "A sunlit outdoor café scene with several patrons seated under yellow parasols along a cobblestone street.",
  "model": "qwen-vl-max",
  "usage": { "input_tokens": 1024, "output_tokens": 128 }
}
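A caller might read the fields of this output as sketched below. The field names come from the example output; the `DescribeResult` class and `parse_result` function are hypothetical helpers for illustration, and the sketch assumes every response carries all of the fields shown.

```python
import json
from dataclasses import dataclass


@dataclass
class DescribeResult:
    """Hypothetical container mirroring the fields in the example output."""
    description: str
    model: str
    input_tokens: int
    output_tokens: int


def parse_result(raw: str) -> DescribeResult:
    # Assumes all fields from the example are present; raises KeyError otherwise.
    data = json.loads(raw)
    usage = data["usage"]
    return DescribeResult(
        description=data["description"],
        model=data["model"],
        input_tokens=usage["input_tokens"],
        output_tokens=usage["output_tokens"],
    )


# The example output from this page, as a single JSON string.
raw = """
{
  "description": "A sunlit outdoor café scene with several patrons seated under yellow parasols along a cobblestone street.",
  "model": "qwen-vl-max",
  "usage": { "input_tokens": 1024, "output_tokens": 128 }
}
"""

result = parse_result(raw)
print(result.model, result.input_tokens + result.output_tokens)
# prints: qwen-vl-max 1152
```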