GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
Bin Xie, Rui Shao, Gongwei Chen, Kaiwen Zhou, Yinchuan Li, Jie Liu, Min Zhang, Liqiang Nie · May 22, 2025
Summary
GUI-explorer is a GUI agent that autonomously explores dynamic environments and mines transition-aware knowledge from what it observes. It combines exploration with unsupervised knowledge extraction to improve the agent's understanding of GUI elements: it identifies inconsistencies, constructs diverse function-aware trajectories, and extracts precise operation logic from them. According to the authors, it surpasses state-of-the-art agents in task success rate without any parameter updates, and it reduces errors in the agent's prior knowledge.
Introduction
  Background
    Overview of GUI exploration challenges
    Importance of dynamic environment navigation
  Objective
    Enhancing AI's understanding of GUI elements
    Improving task success rates without parameter adjustments
Method
  Exploration Strategy
    Autonomous navigation techniques
    Integration of exploration and unsupervised knowledge extraction
  Data Collection
    Methods for gathering GUI data
  Data Preprocessing
    Techniques for preparing data for analysis
  Inconsistency Identification
    Algorithms for detecting inconsistencies in GUI elements
  Function-Aware Trajectory Construction
    Methods for creating diverse, function-aware paths
  Precise Operation Logic Extraction
    Approaches for identifying and extracting operation logic
Results
  Task Success Rates
    Comparison with state-of-the-art agents
  Prior Knowledge Errors
    Reduction in errors through autonomous learning
  Performance Metrics
    Quantitative analysis of improvements
Conclusion
  Future Directions
    Potential advancements in GUI exploration
  Impact on AI and GUI Interaction
    Enhanced capabilities in dynamic environments