调用魔搭社区(ModelScope)Qwen3-VL 多模态 API 进行视觉解析。使用 OpenAI SDK 兼容方式调用,支持图片内容描述、OCR 文字提取、视觉问答、对象检测等功能。用户提到"魔搭"、"ModelScope"、"Qwen-VL"、"多模态视觉"、"解析图片"等关键词时应触发。
- Initial release of ms-qwen-vl skill for multi-modal visual analysis via ModelScope Qwen3-VL API. - Supports image description, OCR text extraction, visual question answering, object detection, and chart analysis. - Compatible with OpenAI SDK, with sample Python and CLI usage provided. - Handles both local images (auto-converted to base64) and online image URLs. - Offers two model modes: fast (30B) and precise (235B). - Detailed task options and usage instructions included in the documentation.