<aside> <img src="/icons/verified_green.svg" alt="/icons/verified_green.svg" width="40px" /> About PAPER
This channel is dedicated to building and expanding a comprehensive paper database focused on the Web Agent field and the broader GUI agent field. Let’s collaborate to enrich this database and advance research in the exciting world of web agents!
</aside>
🔥 Newly updated (May)
New~ May 30 |ZeroGUI: Automating Online GUI Learning at Zero Human Cost
New~ May 29 |Grounded Reinforcement Learning for Visual Reasoning
New~ May 28 |UI-Evol: Automatic Knowledge Evolving for Computer Use Agents
New~ May 28 |RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
New~ May 28 |WorkForceAgent-R1: Incentivizing Reasoning Capability in LLM-based Web Agents via Reinforcement Learning
New~ May 27 |BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
New~ May 27 |AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery
New~ May 27 |UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
New~ May 26 |WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback
New~ May 26 |ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
May 23 |Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
May 23 |ProgRM: Build Better GUI Agents with Progress Rewards
May 22 |GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
May 22 |ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
May 22 |WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
May 22 |GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
May 21 |ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search
May 20 |Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
May 20 |From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents
May 20 |Efficient Agent Training for Computer Use
May 20 |Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents
May 20 |EVA: Red-Teaming GUI Agents via Evolving Indirect Prompt Injection
May 19 |Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
May 19 |GEM: Gaussian Embedding Modeling for Out-of-Distribution Detection in GUI Agents
May 19 |Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
May 18 |Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
May 18 |Enhance Mobile Agents Thinking Process Via Iterative Preference Learning
May 18 |UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning
May 17 |Mobile-Bench-v2: A More Realistic and Comprehensive Benchmark for VLM-based Mobile Agents
May 16 |Group-in-Group Policy Optimization for LLM Agent Training
May 16 |A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
May 15 |LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps
May 15 |AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
May 12 |Web-Bench: A LLM Code Benchmark Based on Web Standards and Frameworks
May 11 |Seed1.5-VL Technical Report
May 11 |Web Page Classification using LLMs for Crawling Support
May 06 |OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents
May 01 |ScaleTrack: Scaling and back-tracking Automated GUI Agents
May 01 |[Visual Test-time Scaling for GUI Agent Grounding](https://webagentlab.notion.site/Visual-Test-time-Scaling-for-GUI-Agent-Grounding-1ebcc62ec9f1805bb890cdc5711cbc25)
May 01 |Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning