Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screensh...
**This version clarifies safety rules, adds feature instructions, and highlights desktop-focused use cases.**

- Added a warning: skill takes over the real mouse/keyboard; use Browser Automation for web apps instead.
- New rule: only minimize windows, never close them unless specifically requested by the user.
- Expanded documentation on what actions `act` can perform in a single step.
- Documented the new `assert` command for natural language screen state checks.
- Explained how to use `tap --locate` with reference images for precise visual matching.
- Added instructions for converting and consuming detailed HTML automation reports.
- Updated command examples to use the `-y` flag for non-interactive CLI use.