Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents
Similar articles
arXiv – cs.AI • Meta-Policy Reflexion: Reusable Reflective Memory and Rule Admissibility for Resource-Efficient LLM Agent
arXiv – cs.LG • Tool Zero: Training Tool-Augmented LLMs via Pure RL from Scratch
arXiv – cs.LG • AI Agents in Drug Discovery
arXiv – cs.AI • DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains
arXiv – cs.AI • BMGQ: A Bottom-up Method for Generating Complex Multi-hop Reasoning Questions from Semi-structured Data
arXiv – cs.AI • DeepAgent: A General Reasoning Agent with Scalable Toolsets