<aside> <img src="/icons/verified_green.svg" alt="/icons/verified_green.svg" width="40px" /> About PAPER
This channel is dedicated to building and expanding a comprehensive paper database focused on the web agent field and the broader GUI agent field. Let’s collaborate to enrich this database and advance research in the exciting world of web agents!
</aside>
🔥 Newly updated (Jul)
New~Jul 15 |Let’s Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification
New~Jul 14 |NeuralOS: Towards Simulating Operating Systems via Neural Generative Models
New~Jul 13 |LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents
Jul 09 |VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation
Jul 08 |MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment
Jul 08 |GTA1: GUI Test-time Scaling Agent
Jul 08 |R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
Jul 06 |Hijacking JARVIS: Benchmarking Mobile GUI Agents against Unprivileged Third Parties
Jul 06 |WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis
Jul 05 |How to Train Your LLM Web Agent: A Statistical Diagnosis
Jul 04 |Less is More: Empowering GUI Agent with Context-Aware Simplification
Jul 04 |WebSailor: Navigating Super-human Reasoning for Web Agent
Jul 03 |Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
Jul 03 |WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
Jul 01 |SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents
Jul 01 |GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Jul 01 |Qwen-GUI-3B: A Lightweight Vision-Language Model for Cross-Resolution GUI Grounding
Jul 01 |From Prompt Injections to Protocol Exploits: Threats in LLM-Powered AI Agents Workflows
Jul 01 |LineRetriever: Planning-Aware Observation Reduction for Web Agents